
Why do I get a 'ÿ' char after every include that is extracted by my parser? - C

Developer https://www.devze.com 2023-02-08 22:34 Source: web

I have this function:

/*This func runs *.c1 file, and replace every include file with its content
It will save those changes to *.c2 file*/
void includes_extractor(FILE *c1_fp, char *c1_file_name ,int c1_file_str_len )
{
    int i=0;
    FILE *c2_fp , *header_fp;
    char ch, *c2_file_name,header_name[80]; /* we can assume line length 80 chars MAX*/
    char inc_name[]="include"; 
    char inc_chk[INCLUDE_LEN+1]; /*INCLUDE_LEN is defined | +1 for null*/

    /* making the c2 file name */

    c2_file_name=(char *) malloc ((c1_file_str_len)*sizeof(char));
    if (c2_file_name == NULL)
    {
     printf("Out of memory !\n");
     exit(0);
    } 

    strcpy(c2_file_name , c1_file_name); 
    c2_file_name[c1_file_str_len-1] = '\0'; 
    c2_file_name[c1_file_str_len-2] = '2';

/*Open source & destination files + ERR check */

    if( !(c1_fp = fopen (c1_file_name,"r") ) )
    {
     fprintf(stderr,"\ncannot open *.c1 file !\n");
     exit(0);
    }

    if( !(c2_fp = fopen (c2_file_name,"w+") ) )
    {
     fprintf(stderr,"\ncannot open *.c2 file !\n");
     exit(0);
    }

/*next code lines are copy char by char from c1 to c2,
  but if meet header file, copy its content */

    ch=fgetc(c1_fp);
    while (!feof(c1_fp))
    {
        i=0;    /*zero i */ 
        if (ch == '#') /*potential #include case*/
        {
             fgets(inc_chk, INCLUDE_LEN+1, c1_fp); /*8 places for "include" + null*/
         if(strcmp(inc_chk,inc_name)==0) /*case #include*/
         {
          ch=fgetc(c1_fp);
          while(ch==' ') /* stop when head with a '<' or '"' */
          {
           ch=fgetc(c1_fp);
          } /*while(2)*/

          ch=fgetc(c1_fp); /*start read header file name*/

          while((ch!='"') && (ch!='>')) /*until we get the end of header name*/
          {
           header_name[i] = ch;
           i++;
           ch=fgetc(c1_fp);
          }/*while(3)*/
          header_name[i]='\0';  /*close the header_name array*/


          if( !(header_fp = fopen (header_name,"r") ) ) /*open *.h for read + ERR chk*/
          {
               fprintf(stderr,"cannot open header file !\n");
           exit(0);
              }

          while (!feof(header_fp)) /*copy header file content to *.c2 file*/
          {
           ch=fgetc(header_fp);
           fputc(ch,c2_fp);
          }/*while(4)*/
          fclose(header_fp);
         }
                }/*frst if*/
        else
        {
         fputc(ch,c2_fp);
        }
     ch=fgetc(c1_fp);
    }/*while(1)*/ 

fclose(c1_fp);
fclose(c2_fp);
free (c2_file_name);    
}

This function reads a single *.c1 file and saves a copy of it to *.c2 file, but all the include files from *.c1 file are extracted and their contents expanded in *.c2.

After every include file that is extracted, I get a 'ÿ' character in the output.

The include file can contain 1 line or 1000 lines, but the 'ÿ' character appears only once after each include that is extracted.

I can't find out why.


"ÿ" corresponds to the code point 0xFF. fgetc returns EOF when the end of the file is reached, and EOF is a negative int (usually defined as -1). Store -1 in a char, pass it to fputc, and the byte written to the file is 0xFF. You must check for EOF between calling fgetc and fputc.

int ch;
...
/*copy header file content to *.c2 file*/
for (ch=fgetc(header_fp); ch != EOF; ch=fgetc(header_fp)) {
   fputc(ch,c2_fp);
}
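
The 0xFF value can be checked directly. The following is a minimal sketch rather than part of the original code: the helper name byte_written_for_eof is made up for illustration, and the result assumes the usual case where EOF is -1.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the byte that fputc() would write if EOF
   were first stored in a plain char, as the question's inner loop does. */
unsigned char byte_written_for_eof(void)
{
    char ch;

    assert(EOF < 0);            /* the C standard only promises a negative int */
    ch = (char)EOF;             /* the int value is narrowed to fit a char */
    return (unsigned char)ch;   /* fputc() converts its argument to unsigned char */
}
```

On common platforms where EOF is -1, this returns 0xFF, which a Latin-1 viewer displays as 'ÿ'.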

Instead of reading characters one at a time, you could use fgets to read up to a whole line (or a buffer's worth of characters) per call.

#ifndef BUFSIZE
#  define BUFSIZE 1024
#endif
char buf[BUFSIZE], *read;
...
/*copy header file content to *.c2 file*/
while ((read = fgets(buf, BUFSIZE, header_fp))) {
    fputs(buf, c2_fp);
}
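
fgets stops at each newline and fputs treats the buffer as a string, so a header containing a NUL byte would be cut short. If that matters, one possible alternative (a different technique from the fgets loop above, shown only as a sketch) is to copy fixed-size blocks with fread and fwrite. The function name copy_blocks and the 1024-byte buffer are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies src to dst in blocks of up to 1024 bytes.
   Returns the number of bytes successfully written to dst. */
size_t copy_blocks(FILE *src, FILE *dst)
{
    char buf[1024];
    size_t n, total = 0;

    assert(src != NULL && dst != NULL);
    /* fread() returns the number of bytes actually read; 0 means
       end of file or an error, so no sentinel byte is ever written. */
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n)
            break;              /* short write: stop and report what was copied */
        total += n;
    }
    return total;
}
```

A caller that needs to tell a clean end of file from a failure can check ferror(src) and ferror(dst) afterwards.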


Your major problem is this loop.

  while (!feof(header_fp)) /*copy header file content to *.c2 file*/
  {
   ch=fgetc(header_fp);
   fputc(ch,c2_fp);
  }/*while(4)*/

When fgetc encounters the end of the file, it returns EOF, which is a negative integer. You store this in a char and then write it to the other file without checking it.

feof is very rarely useful as a loop condition. Most of the time it is better to check the return value of a read function.
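
The reason is that the end-of-file indicator is only set after a read has already failed, so a loop guarded by feof runs its body one extra time. The sketch below counts those iterations. The helper name feof_loop_iterations is hypothetical, and the ferror check is an addition so that the demonstration cannot spin forever on a read error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many times the body of a
   while (!feof(fp)) loop runs, using the same shape as the question's loop. */
long feof_loop_iterations(FILE *fp)
{
    long iterations = 0;

    assert(fp != NULL);
    while (!feof(fp) && !ferror(fp)) {
        (void)fgetc(fp);   /* the final call fails and returns EOF ... */
        iterations++;      /* ... but the body still runs to completion */
    }
    return iterations;
}
```

For a file holding the three characters "abc", the body runs four times. In the question's loop, that fourth pass is the one that writes the stray byte.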

You should always store the return value of fgetc in an int so that you can check it for EOF (which signals either an end-of-file condition or some other error). fputc takes an int, in any case.

A better way to construct the loop would be as follows.

int ch_hdr;
while((ch_hdr = fgetc(header_fp)) != EOF)
{
    fputc(ch_hdr, c2_fp);
}
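
Because EOF can mean either end of file or a read error, a slightly fuller version of the same loop might also report failures. This is only a sketch: the function name copy_stream and the convention of returning -1 on error are assumptions, not something taken from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies src to dst one character at a time.
   Returns the number of characters copied, or -1 on a read or write error. */
long copy_stream(FILE *src, FILE *dst)
{
    int ch;            /* int, so EOF stays distinct from every valid byte */
    long copied = 0;

    assert(src != NULL && dst != NULL);
    while ((ch = fgetc(src)) != EOF) {
        if (fputc(ch, dst) == EOF)
            return -1;                  /* write failed */
        copied++;
    }
    return ferror(src) ? -1 : copied;   /* EOF can also mean a read error */
}
```

In the question's function, the inner loop could then be replaced by a call such as copy_stream(header_fp, c2_fp), with a negative return value treated as a failure.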


If you look at your code, you have two places where you write to the target file.

If I were you I would set a break point at

        }/*frst if*/
        else
        {
        fputc(ch,c2_fp); // brk point here
        }

to check what you are actually writing there.
