Trying to split by two delimiters and it doesn't work - C

开发者 https://www.devze.com 2022-12-11 21:24 Source: Internet
I wrote the code below to read input line by line from stdin, e.g.

city=Boston;city=New York;city=Chicago\n

and then split each line on the ';' delimiter and print each record. Then, in a nested loop, I try to split each record on the '=' delimiter to get to the actual values. But for some reason the main (outer) loop then doesn't get beyond its first iteration. Why?

char*   del1 = ";";
char*   del2 = "=";
char    input[BUFLEN];
while(fgets(input, BUFLEN, fp)) {

        input[strlen(input)-1]='\0';
        char* record = strtok(input, &del1);

        while(record) {
                printf("Record: %s\n",record);

                char*  field = strtok(record, &del2);
                while(field) {
                     printf("Field: %s\n",field);
                     field = strtok(NULL, &del2);
                }

                record = strtok(NULL, &del1);
        }
}


Two things. First, this line is risky:

 input[strlen(input)-1]='\0';

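fgets() always terminates the buffer with '\0', so this line is not needed for termination. What it actually does is overwrite the last character, which gives wrong results whenever the input doesn't end exactly in '\n' (for example, the last line of a file without a trailing newline, or a line longer than the buffer).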

Second, two strtok() scans cannot be interleaved, because the function keeps only one saved position. For that, use strtok_r(), which receives a char** as a third argument in which to store the state.


Looks like you need to use the re-entrant form of strtok, strtok_r, because calling strtok inside your loop overwrites the state it was keeping for the outer loop. When you then call record = strtok(NULL, ...), it carries on parsing the inner string instead of the original line.


You cannot nest two strtok loops, because strtok stores its position in your string in a single global pointer.

You would have to use one loop to split on ';' and store those results, and then a second loop to split each stored result on '='.

char*   del1 = ";";
char*   del2 = "=";
char    input[BUFLEN];
char*   tokens[255]; // at most 255 records per line
int     i, v;
while(fgets(input, BUFLEN, fp)) {

    // fgets adds the '\0' itself; this only drops a trailing '\n', if any
    input[strcspn(input, "\n")] = '\0';
    char* record = strtok(input, del1);
    i = 0;
    while(record && i < 255) {   // stop before overflowing tokens[]

            tokens[i++] = strdup(record);  // private copy of this record
            record = strtok(NULL, del1);
    }
    for(v = 0; v < i; v++){
      if(!tokens[v]) continue;   // strdup failed, skip this record
      printf("Record: %s\n",tokens[v]);  // print before strtok cuts it up
      char*  field = strtok(tokens[v], del2);
      while(field) {
          printf("Field: %s\n",field);
          field = strtok(NULL, del2);
      }
      free(tokens[v]);           // release the copy made by strdup
    }
}

Please note that every string returned by strdup has to be freed once you are done with it, or you will leak memory.

Also note that the signature of strtok is

char * strtok ( char * str, const char * delimiters );

so you don't need &del1 or &del2, since del1 and del2 are already of type char *.


Here is what is happening:

  • user input: a=1;b=2;c=3\n
  • after fgets: a=1;b=2;c=3\0
  • 1st call to strtok(non-NULL, ...): a=1\0b=2;c=3\0
  • 2nd call to strtok(non-NULL, ...): a\01\0b=2;c=3\0

strtok keeps a bit of state when it is first invoked with a non-NULL str argument, so that it can remember where in the string it has got to. It needs this because it mangles the string as it goes, replacing delimiters with NUL characters ('\0').

When you call strtok again with a non-NULL argument, that single place where the state is kept is overwritten with what strtok perceives to be a new string of 3 characters and a NUL (a=1\0), and strtok can no longer remember anything about the original input. So record is set to NULL at the end of the first iteration of the outer loop, because strtok thinks it has reached the end of its (much shorter than intended!) input string.

Check out strtok_r.
