I use getc() in a C exercise, and after looking back on the program I noticed something weird. I assumed that the file given in the command-line arguments contains at least one byte. (It calls getc() twice in a row without checking for EOF.) After trying it on an empty file, it still worked smoothly. My question is: is the behaviour of getc() on a file pointer that's been exhausted (EOF has been reached and the stream has not been rewound) undefined, or will it always continue to return EOF?
I think I could expand this question to all the I/O functions in the C standard library; please clarify this in your answer too.
Here is the code for the program. The program is supposed to strip all comments from a C/C++ source file (and it works perfectly).
#include <stdio.h>

int main(int argc, char *argv[]) {
    int state = 0; // state: 0 = normal, 1 = in string, 2 = in comment, 3 = in block comment
    int ignchar = 0; // number of characters to ignore
    int cur, next; // current character and next one
    FILE *fp; // input file

    if (argc == 1) {
        fprintf(stderr, "Usage: %s file.c\n", argv[0]);
        return 1;
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Error opening file.\n");
        return 2;
    }

    cur = getc(fp); // initialise cur, assumes that the file contains at least one byte
    while ((next = getc(fp)) != EOF) {
        switch (next) {
            case '/':
                if (!state && cur == '/') {
                    state = 2; // start of comment
                    ignchar = 2; // don't print this nor next char (//)
                } else if (state == 3 && cur == '*') {
                    state = 0; // end of block comment
                    ignchar = 2; // don't print this nor next char (*/)
                }
                break;
            case '*':
                if (!state && cur == '/') {
                    state = 3; // start of block comment
                    ignchar = 2; // don't print this nor next char (/*)
                }
                break;
            case '\n':
                if (state == 2) {
                    state = 0;
                    ignchar = 1; // don't print the current char (cur is still in comment)
                }
                break;
            case '"':
                if (state == 0) {
                    state = 1;
                } else if (state == 1) {
                    state = 0;
                }
        }
        if (state <= 1 && !ignchar) putchar(cur);
        if (ignchar) ignchar--;
        cur = next;
    }
    return 0;
}
Stdio files keep an "eof" flag that's set the first time end-of-file is reached. It is reset only by calling clearerr, by a successful repositioning call such as fseek, fsetpos or rewind, or by a successful ungetc. Thus, once getc returns EOF because end-of-file was reached, it will keep returning EOF, even if new data becomes available, unless you use one of the aforementioned methods for clearing the eof flag.
Some non-conformant implementations may immediately make new data available. This behavior is harmful and can break conformant applications.
If the EOF flag on the stream is set, getc should return EOF (and if you keep calling getc, it should keep returning EOF).
Logically, I think it should return EOF forever.
getc is defined in terms of fgetc.
The getc() function shall be equivalent to fgetc(), except that if it is implemented as a macro it may evaluate stream more than once, so the argument should never be an expression with side effects.
The documentation for fgetc says:
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgetc() shall return EOF.
And "is at end-of-file" can be determined by calling feof.
The documentation for feof says:
The feof() function shall return non-zero if and only if the end-of-file indicator is set for stream.
So unless something happens to clear the end-of-file indicator, it should continue returning EOF forever.