I need to split a C string into tokens. I thought strtok would be my best bet, but I'm getting very strange results.
Here is my test program. In this example I expect 3 tokens separated by "##", but when I try to work with the copies I made, only the third one is shown correctly; the other two look corrupted.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define TAM 3 //elements
char** aTokens(char* str, char* delimitador)
{
char* pch;
char** tokens;
int i = 0;
tokens = (char**)malloc(sizeof(char*)*TAM);
pch = strtok(str, delimitador);
while(pch != NULL)
{
tokens[i] = (char*)malloc((sizeof(strlen(pch))+1) * sizeof(char));
strcpy(tokens[i], pch);
pch = strtok(NULL, delimitador);
i++;
}
return tokens;
}
int main ()
{
char str[] = "30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,##30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,##30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,";
char** tokens;
int i;
tokens = aTokens(str, "##");
for(i = 0; i<TAM; i++)
printf("%d -- %s\n", strlen(tokens[i]), tokens[i]);
//Clean
//for(i = 0; i<TAM; i++)
//free(tokens[i]);
//free(tokens);
return 0;
}
Output with GCC on Linux:
13 -- 30117700,1,T <---- ?
13 -- 30117700,1,T <----- ?
115 -- 30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,
I have commented out the "clean" section because it causes lots of runtime errors too... :(
Help please!!
I think you are slightly confused about how strtok works.
For the most part, you've got it right. However, the separator string given to strtok is not used as a string per se; it is treated as a set of characters, and strtok only cares about those individual characters. So calling strtok with the string "#" is exactly the same as giving it "##", and a lone "#" inside a field would split that field too. In order to tokenize your string correctly, you need to decide on a single separator character to use, or use a different (perhaps custom) tokenizer function that can handle multi-character separators.
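If you do need "##" to count as one separator and a lone "#" to stay inside a token, a custom tokenizer built on strstr is one way to go. This is only a rough sketch: split_on_substring is a name I made up (it is not a library function), it modifies the input in place like strtok does, and unlike strtok it keeps empty pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * split_on_substring: hypothetical helper, not part of the C library.
 * Cuts str in place at every occurrence of the whole string sep and
 * stores pointers to the pieces in parts (at most max_parts of them).
 * Returns the number of pieces stored; pieces beyond max_parts are dropped.
 */
size_t split_on_substring(char *str, const char *sep,
                          char **parts, size_t max_parts)
{
    size_t sep_len = strlen(sep);
    size_t count = 0;

    assert(sep_len > 0); /* an empty separator would never advance */

    while (count < max_parts) {
        char *hit = strstr(str, sep); /* next full match of sep, or NULL */
        parts[count++] = str;
        if (hit == NULL)
            break;                    /* last piece, nothing left to cut */
        *hit = '\0';                  /* terminate the current piece */
        str = hit + sep_len;          /* continue after the separator */
    }
    return count;
}
```

With the input "a#b##c" and the separator "##" this gives the two pieces "a#b" and "c", whereas strtok with the same arguments would give "a", "b" and "c".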
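The following line isn't correct. sizeof(strlen(..)) is the size of the type that strlen returns (size_t), so it will be 4 in a typical 32-bit app (8 in a typical 64-bit one) regardless of the length of the string. The buffer is therefore far too small, and strcpy writes past its end.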
tokens[i] = (char*)malloc((sizeof(strlen(pch))+1) * sizeof(char));
It should probably be:
tokens[i] = (char*)malloc((strlen(pch)+1) * sizeof(char));
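For reference, here is roughly how the whole function could look with that fix applied. Treat it as a sketch under a few assumptions of mine: the i < TAM bound, the NULL-initialised slots and the free_tokens helper are additions that were not in the question. The crashes in the commented-out cleanup were most likely a symptom of the same overflow, so freeing should work once the buffers have the right size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TAM 3 /* maximum number of tokens, as in the question */

/* Copies up to TAM tokens out of str; unused slots are left NULL.
   Note that strtok modifies str. */
char **aTokens(char *str, const char *delimitador)
{
    char **tokens;
    char *pch;
    int i;

    assert(str != NULL && delimitador != NULL);

    tokens = malloc(TAM * sizeof *tokens);
    if (tokens == NULL)
        return NULL;
    for (i = 0; i < TAM; i++)
        tokens[i] = NULL;

    i = 0;
    pch = strtok(str, delimitador);
    while (pch != NULL && i < TAM) {          /* never write past tokens[TAM-1] */
        tokens[i] = malloc(strlen(pch) + 1);  /* length of the text + '\0' */
        if (tokens[i] == NULL)
            break;
        strcpy(tokens[i], pch);
        pch = strtok(NULL, delimitador);
        i++;
    }
    return tokens;
}

/* free_tokens: hypothetical helper that releases what aTokens allocated. */
void free_tokens(char **tokens)
{
    int i;

    if (tokens == NULL)
        return;
    for (i = 0; i < TAM; i++)
        free(tokens[i]); /* free(NULL) is a no-op, so unused slots are fine */
    free(tokens);
}
```

In main you would then call free_tokens(tokens) instead of the commented-out loop, and skip NULL entries when printing in case fewer than TAM tokens were found.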
Standard usage of strtok:
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,##30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,##30117700,1,TITULAR,SIGQAA070,1977/11/30,M,1,14000,0.00,6600.00,10.00,2011/09/01,2012/09/01,0|17,0.00,NO,0,0,0.00, ,";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str,"#");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, "#");
}
return 0;
}