Can read(2) return zero when not at EOF?

Developer https://www.devze.com 2023-01-03 20:11 Source: Internet
According to the man page for read(2), it only returns zero when EOF is reached.

However, it appears that this is incorrect and that it may sometimes return zero, perhaps because the file is not ready to be read yet. Should I call select() to see if it is ready before reading a file from disk?

Note that nBytes is: 1,445,888

Some sample code:

fd_set readFdSet;
timeval timeOutTv;

timeOutTv.tv_sec = 0;
timeOutTv.tv_usec = 0;

// Let's see if we'll block on the read.
FD_ZERO(&readFdSet);
FD_SET(fd, &readFdSet);

int selectReturn = ::select(fd + 1, &readFdSet, NULL, NULL, &timeOutTv);

if (selectReturn == 0) {
  // There is still more to read.
  return false; // But return early.
} else if (selectReturn < 0) {
  clog << "Error: select failure: " << strerror(errno) << endl;
  abort();
} else {
  assert(FD_ISSET(fd, &readFdSet));

  try {
    const int bufferSizeAvailable = _bufferSize - _availableIn;

    if (_availableIn) {
      assert(_availableIn <= _bufferSize);

      memmove(_buffer, _buffer + bufferSizeAvailable, _availableIn);
    }

    ssize_t got = ::read(fd, _buffer + _availableIn, bufferSizeAvailable);

    clog << " available: " << bufferSizeAvailable << " availableIn: "
         << _availableIn << " bufferSize: " << _bufferSize << " got: "
         << got << endl;

    return got == 0;
  } catch (Err &err) {
    err.append("During load from file.");
    throw;
  }
}

The output reads (when it fails with no data read):

available: 1445888 availableIn: 0 bufferSize: 1445888 got: 0

This is running on CentOS 4 32-bit as a virtual machine using VMware Server 1.0.10. The file system being read is local to the virtual machine. The host machine is Windows Server 2008 32-bit.

The uname -a says:

Linux q-centos4x32 2.6.9-89.0.25.ELsmp #1 SMP Thu May 6 12:28:03 EDT 2010 i686 i686 i386 GNU/Linux

I notice that the link http://opengroup.org/onlinepubs/007908775/xsh/read.html given below states:

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal...

If a read() is interrupted by a signal before it reads any data, it will return -1 with errno set to [EINTR].

If a read() is interrupted by a signal after it has successfully read some data, it will return the number of bytes read. 

So, perhaps I am getting a signal interrupting the read and thus the value returned is zero because of either a bug or it thinks zero bytes were read?


After some research, there actually are some circumstances under which it will return 0 that you might not think of as being "EOF".

For the gritty details, see the POSIX definition for read(): http://opengroup.org/onlinepubs/007908775/xsh/read.html

Some notable ones are asking it to read 0 bytes (double-check that you're not accidentally passing 0 to it) and reading past the end of the "written" portion of the file. You can actually seek past the end of the file, and writing there "extends" the file with zeroes, but until you do, "EOF" is still at the end of the already-written portion.
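
To make those two cases concrete, here is a minimal sketch in C. The helper names open_temp_with and read_at are hypothetical, and the code assumes a writable /tmp on a POSIX system. It reads from a 5-byte temporary file at a chosen offset with a chosen byte count:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: creates a temporary file containing `len` bytes of
   `data` and returns a read/write descriptor. The path is unlinked right
   away, so the file vanishes when the descriptor is closed. */
static int open_temp_with(const char *data, size_t len)
{
    char path[] = "/tmp/read_zero_XXXXXX";
    int fd = mkstemp(path);
    assert(fd >= 0);
    unlink(path);

    ssize_t written = write(fd, data, len);
    assert(written == (ssize_t)len);
    return fd;
}

/* Hypothetical helper: seeks to `offset` in a 5-byte file, asks read()
   for `nbyte` bytes (at most 16), and returns what read() returned. */
static ssize_t read_at(off_t offset, size_t nbyte)
{
    char buf[16];
    assert(nbyte <= sizeof buf);

    int fd = open_temp_with("hello", 5);
    lseek(fd, offset, SEEK_SET);
    ssize_t got = read(fd, buf, nbyte);
    close(fd);
    return got;
}
```

With these helpers, read_at(0, 16) returns 5, while read_at(0, 0), read_at(5, 16) and read_at(100, 16) all return 0: the first because nothing was requested, the other two because the offset sits at or beyond the last written byte.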

My best guess is that you're getting into a timing problem somewhere. Some questions you need to ask are "How are these files being written?" and "Am I sure they're not zero-length when I try to read them?". For the second one, you could try running a stat() on the file before reading it to see what its current size is.
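
A rough sketch of that check in C follows. The helper names file_size and bytes_remaining are made up for illustration. Note that the size can change between the check and the read, so this is a diagnostic aid rather than a guarantee:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: current size in bytes of the file at `path`,
   or -1 if stat() fails. */
static long long file_size(const char *path)
{
    struct stat st;
    assert(path != NULL);
    return stat(path, &st) == 0 ? (long long)st.st_size : -1;
}

/* Hypothetical helper: number of bytes between the descriptor's current
   offset and the file's current size, or -1 on error. A result of 0
   means the next read() on a regular file would return 0. */
static long long bytes_remaining(int fd)
{
    struct stat st;
    off_t pos = lseek(fd, 0, SEEK_CUR);

    if (pos == (off_t)-1 || fstat(fd, &st) != 0)
        return -1;
    return st.st_size > pos ? (long long)(st.st_size - pos) : 0;
}
```

Logging bytes_remaining(fd) immediately before the failing read() would show whether the offset has already reached (or passed) the end of the file, which is the situation an unintended seek to the end would produce.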


The only other case that I can think of read() returning 0 is if you pass in nbytes as 0; sometimes that can happen if you're passing in the size of something or other as a parameter. Could that be what's happening right now?

If the file is not ready to be read, what should happen is that read() returns -1 and errno is set to EAGAIN. That applies to descriptors opened with O_NONBLOCK; a blocking read() simply waits.
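
As an illustration of the difference between "not ready" and EOF, here is a small C sketch using a pipe whose read end is non-blocking. The enum and the helpers classify_read and make_nonblocking_pipe are hypothetical names, not part of any library:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_state { READ_DATA, READ_NOT_READY, READ_EOF, READ_ERROR };

/* Hypothetical helper: performs one read() and classifies the result. */
static enum read_state classify_read(int fd, char *buf, size_t cap)
{
    /* cap == 0 would make a zero return ambiguous, so rule it out. */
    assert(buf != NULL && cap > 0);

    ssize_t got = read(fd, buf, cap);
    if (got > 0)
        return READ_DATA;
    if (got == 0)
        return READ_EOF;
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return READ_NOT_READY;      /* try again later */
    return READ_ERROR;
}

/* Hypothetical helper: creates a pipe and sets O_NONBLOCK on its read
   end (fds[0]). Returns 0 on success, -1 on failure. */
static int make_nonblocking_pipe(int fds[2])
{
    if (pipe(fds) != 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

On an empty pipe whose write end is still open, classify_read reports READ_NOT_READY (read() returned -1 with EAGAIN); only after the write end is closed does it report READ_EOF.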


Figured it out! I had an Uninitialized Memory Read (UMR) and was incorrectly seeking to the end of the file.


I have dealt with this many times. Since the app does not know whether an fd is attached to a flat file, a network socket, a pipe, etc., some process may send a zero-length message of some priority or other and trigger this. I take a wait-and-see approach and check whether EOF sticks:

#include <errno.h>
#include <unistd.h>
#include <poll.h>

 .
 .
 .
int eofct = 0 ;
 .
 .
 .  
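for (;;) {
    /* A plain loop is used because "continue" inside do/while jumps to
       the loop condition, which would end the loop on the first zero read. */
    readcount = read( fd, buff + currentsize, bufSize );

    if ( readcount == -1 ){
        if ( errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR ){
            continue ;   /* transient condition: retry */
        }

        perror( "read()" );
        return -1 ;
    }

    if ( readcount == 0 ){
        if ( eofct++ < 100 ){
            poll( 0, 0, 1 );   /* sleep 1 ms, then see if EOF sticks */
            continue ;
        }

        break ;   /* 100 consecutive zero reads: treat as EOF */
    }

    eofct = 0 ;
    currentsize += readcount ;

    /* Grow through a temporary so the old block is not lost on failure. */
    void *grown = realloc( buff, currentsize + bufSize );
    if ( NULL == grown ){
        perror( "realloc()" );
        return -1 ;
    }
    buff = grown ;
}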


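If open or fcntl sets O_NONBLOCK, then per POSIX a read should return -1 with errno set to EAGAIN until there is data ready, not 0. (The older System V O_NDELAY flag did make read return 0 in that case.)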


I just ran into this in Go. It seems very dangerous to make something that behaves like read (i.e. an io.Reader in Go) return zero length with no io.EOF. A caller that you did not write may break, because it assumes the call will block until at least 1 byte is available. But if you know that the caller handles it, then you can do it.
