c - I'm getting "segmentation error: core dumped" when loading a list of strings, especially for long lists. There is a memory leak


I'm working on code that is supposed to load a dictionary and check whether the words in a text are spelled correctly.

The code compiles and works with small lists (as the dictionary), yet I get 'segmentation error: core dumped' when I run it with large lists.

I checked for a memory leak with valgrind, and apparently there is one on the line where I defined tmp (node *tmp = malloc(sizeof(node))).

I tried to fix the error by freeing the memory for node *tmp, adding free(tmp) at different points in the code, but the end result is mostly the dictionary loading incorrectly.

Below is the code.

// Implements a dictionary's functionality

#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cs50.h>
#include <strings.h>
#include "dictionary.h"

// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
} node;

// TODO: Choose number of buckets in hash table
const unsigned int N = 26;

// Hash table
node *table[N];

// Returns true if word is in dictionary, else false
bool check(const char *word) // this is from another file which works fine.
{
    int hash_value = hash(word);
    node *table_tmp = table[hash_value];
    for(node *ptr = table_tmp; ptr != NULL; ptr = ptr->next)
    {
        if (strcasecmp(ptr->word, word))
        {
            return true;
        }
    }

    return false;
}

// Hashes word to a number
unsigned int hash(const char word[LENGTH + 1])
{
    return toupper(word[0]) - 'A';
}

// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
    // Opens the dictionary
    FILE *source = fopen(dictionary, "r");

    if (source == NULL)
    {
        fclose(source);
        return 0;
    }

    char word_read[LENGTH + 1];

    // Reads data
    while (fgets(word_read, sizeof(word_read), source) != NULL)
    {
        unsigned int hash_value = hash(word_read);

        node *tmp = malloc(sizeof(node));
        if (tmp == NULL)
        {
            return 1;
        }

        strcpy(tmp->word, word_read);
        tmp->next = NULL;

        // Appends with tmp if table is empty
        if (table[hash_value] == NULL)
        {
            table[hash_value] = tmp;
        }

        // Appends with tmp iteratively using ptr pointer
        else
        {
            for(node *ptr = table[hash_value]; ptr != NULL; ptr = ptr->next)
            {
                if (ptr->next == NULL)
                {
                    ptr->next = tmp;
                    break;
                }
            }
        }

        for(node *ptr = tmp; ptr != NULL; ptr = ptr->next)
        {
            if (ptr->next == NULL)
            {
                break;
            }

            free(ptr);
            ptr = NULL;
        }
    }

    if (size() != 0)
    {
        return true;
    }

    else
    {
        return false;
    }
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
    int counter = 0;
    for(int i = 0; i < N; i++)
    {
        for(node *ptr = table[i]; ptr != NULL; ptr = ptr->next)
        {
            counter++;
            if (ptr->next == NULL)
            {
                break;
            }
        }
    }

    if (counter == 0)
    {
        return 0;
    }

    return counter;
}

// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{

    for(int i = 0; i < N; i++)
    {
        for(node *ptr = table[i]; ptr != NULL; ptr = ptr->next)
        {
            if (ptr->next == NULL)
            {
                break;
            }

            free(ptr);
        }
    }

    for(int i = 0; i < N; i++)
    {
        if (table[i] != NULL)
        {
            table[i] = NULL;
        }
    }

    for(int i = 0; i < N; i++)
    {
        if (table[i] != NULL)
        {
            return false;
        }
    }

    return true;
}

When I run a valgrind check, below is the output.

It's apparent that the memory leak is coming from not freeing node *tmp, but I need help on how I should do that.

Also, I'd appreciate any feedback on the implementation of the for loops (maybe that's the underlying reason behind the memory issue).

`==10230== 
==10230== HEAP SUMMARY:
==10230==     in use at exit: 696 bytes in 5 blocks
==10230==   total heap usage: 12 allocs, 7 frees, 10,552 bytes allocated
==10230== 
==10230== 224 bytes in 4 blocks are definitely lost in loss record 1 of 2
==10230==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==10230==    by 0x109A36: load (dictionary.c:67)
==10230==    by 0x1092CB: main (speller.c:40)
==10230== 
==10230== 472 bytes in 1 blocks are still reachable in loss record 2 of 2
==10230==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==10230==    by 0x49CEE6E: __fopen_internal (iofopen.c:65)
==10230==    by 0x49CEE6E: fopen@@GLIBC_2.2.5 (iofopen.c:86)
==10230==    by 0x1099DE: load (dictionary.c:52)
==10230==    by 0x1092CB: main (speller.c:40)
==10230== 
==10230== LEAK SUMMARY:
==10230==    definitely lost: 224 bytes in 4 blocks
==10230==    indirectly lost: 0 bytes in 0 blocks
==10230==      possibly lost: 0 bytes in 0 blocks
==10230==    still reachable: 472 bytes in 1 blocks
==10230==         suppressed: 0 bytes in 0 blocks
==10230== 
==10230== For lists of detected and suppressed errors, rerun with: -s
==10230== ERROR SUMMARY: 4 errors from 2 contexts (suppressed: 0 from 0)`

  • Impossible to tell without knowing how word_read and node are defined. Please provide a Minimal Complete Reproducible Example. – David C. Rankin
  • In the line strcpy(tmp->word, word_read);, you have just allocated tmp via malloc, but you never initialised the pointer word. – OldBoy
  • @OldBoy OP updated the code to show that word is not a pointer, but an array. Still a good observation. – chux
  • With Valgrind, always start with the FIRST error reported. You've shown the leaks; these are usually the LEAST important issue, and are more or less meaningless if your app is crashing. – Paul Floyd
  • You may also want to read Speller. – David C. Rankin

1 Answer


The code has at least these problems. Fixing all of them may or may not solve OP's woes.

Bad pointer

In unload(), the code attempts to access ptr->next after free(ptr):

        //                                           vvvvvvvvv ---> bad
        for(node *ptr = table[i]; ptr != NULL; ptr = ptr->next) {
            if (ptr->next == NULL) {
                break;
            }
            free(ptr);
        }

Get the next pointer before free(ptr).

Something like:

        node *next;
        for(node *ptr = table[i]; ptr != NULL; ptr = next) {
            next = ptr->next;
            free(ptr);
        }
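For reference, a minimal sketch of a complete unload() built around that pattern, assuming the same node/table definitions as in the question (it also resets each bucket head, replacing the three separate loops):

bool unload(void)
{
    for (int i = 0; i < N; i++)
    {
        node *ptr = table[i];
        while (ptr != NULL)
        {
            node *next = ptr->next; // save the link before freeing the node
            free(ptr);
            ptr = next;
        }
        table[i] = NULL;            // bucket is now empty
    }
    return true;
}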

Why keep '\n'?

fgets() keeps the trailing '\n', so the stored word ends up as "word\n". Better to lop off the potential '\n' at the end of word_read[]:

word_read[strcspn(word_read, "\n")] = '\0';
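For example, right after the fgets() call in load() (a sketch; the rest of the loop body is unchanged):

    while (fgets(word_read, sizeof(word_read), source) != NULL)
    {
        // fgets() keeps the '\n', so trim it before hashing and copying
        word_read[strcspn(word_read, "\n")] = '\0';
        unsigned int hash_value = hash(word_read);
        // ... rest of the loop as before
    }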

Potential invalid index

toupper(word[0]) - 'A' risks returning a value outside [0...N). Better as

(toupper(word[0]) - 'A')%((unsigned)N)
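Put together, a hash() along those lines stays inside the table; a sketch (it also folds non-letter first characters into a valid bucket):

unsigned int hash(const char word[LENGTH + 1])
{
    // Because N is unsigned, a negative difference (e.g. a non-letter
    // first character) wraps around and the % keeps the index in [0, N)
    return (toupper((unsigned char) word[0]) - 'A') % N;
}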

LENGTH unknown and may be too small

LENGTH is presumably defined in dictionary.h, which is not shown. If it is not large enough for the longest dictionary line (including its '\n'), fgets() will split that line across two reads and store it as two bogus "words".

Stray loop in load()

for(node *ptr = tmp; ptr != NULL; ptr = ptr->next)

What is this loop attempting? tmp->next was just set to NULL, so if (ptr->next == NULL) is true on the first pass and the loop breaks immediately, doing nothing. It looks like stray code.
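Whatever the intent was, tmp has just been linked into the table, so it must not be freed here; the loop can simply be removed. A common alternative (a sketch, not necessarily what OP intended) is to prepend at the bucket head, which makes both the tail walk and this extra loop unnecessary:

        node *tmp = malloc(sizeof(node));
        if (tmp == NULL)
        {
            fclose(source);
            return false;
        }
        strcpy(tmp->word, word_read);

        // Prepend: the new node points at the current head and becomes the
        // new head. Every node stays reachable until unload() frees it.
        tmp->next = table[hash_value];
        table[hash_value] = tmp;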

Minor: inconsistent types

Why int counter, yet function returns an unsigned?

unsigned int size(void) {
    int counter = 0;
    ...
    return counter;
}