unistd.h read() function: How to read a file line by line?

Developer https://www.devze.com 2022-12-21 22:38 Source: Internet
What I need to do is use the read function from unistd.h to read a file line by line. I have this at the moment:

n = read(fd, str, size);

However, this reads to the end of the file, or up to size number of bytes. Is there a way that I can make it read one line at a time, stopping at a newline? The lines are all of variable length.

I am allowed only these two header files:

#include <unistd.h>
#include <fcntl.h>

The point of the exercise is to read in a file line by line, and output each line as it's read in. Basically, to mimic the fgets() and fputs() functions.


You can read character by character into a buffer and check for the linebreak symbols (\r\n for Windows and \n for Unix systems).
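A minimal sketch of that character-by-character approach, using only the two permitted headers. The function name read_line_bytewise and its return convention are assumptions made for illustration; they are not part of any library. It issues one read() call per byte, so it is slow, but it never consumes data past the newline.

```c
#include <unistd.h>
#include <fcntl.h>

/* Hypothetical helper: reads one line from fd into buf (capacity cap),
 * one byte per read() call. Returns the number of bytes stored (the
 * newline is not stored), or -1 on a read error or when end of file is
 * reached before any byte was read. A trailing '\r' is dropped so that
 * Windows "\r\n" endings are handled. Lines longer than cap - 1 bytes
 * are truncated; the excess bytes are consumed and discarded. */
ssize_t read_line_bytewise(int fd, char *buf, size_t cap)
{
    size_t len = 0;
    int seen = 0;               /* set once at least one byte was read */
    ssize_t n;
    char c;

    if (cap == 0)
        return -1;

    while ((n = read(fd, &c, 1)) == 1) {
        seen = 1;
        if (c == '\n')
            break;
        if (len + 1 < cap)      /* keep room for the terminating '\0' */
            buf[len++] = c;
    }
    if (n < 0 || !seen)
        return -1;

    if (len > 0 && buf[len - 1] == '\r')
        len--;
    buf[len] = '\0';
    return (ssize_t)len;
}
```

To echo each line as it is read, a caller could loop on this function and pass buf and the returned length to write(STDOUT_FILENO, ...), followed by a one-byte write of "\n". The sketch does not retry a read() that was interrupted by a signal.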


You'll want to create a buffer twice the length of your longest line you'll support, and you'll need to keep track of your buffer state.

Basically, each time you're called for a new line you'll scan from your current buffer position looking for an end-of-line marker. If you find one, good, that's your line. Update your buffer pointers and return.

If you hit your maxlength then you return a truncated line and change your state to discard. Next time you're called you need to discard up to the next end of line, and then enter your normal read state.

If you reach the end of what you've read in without finding an end-of-line marker, then you need to read in another maxline chars, wrapping to the start of the buffer if you hit the end of it (i.e., you may need to make two read calls), and then continue scanning.

All of the above assumes you can set a max line length. If you can't then you have to work with dynamic memory and worry about what happens if a buffer malloc fails. Also, you'll need to always check the results of the read in case you've hit the end of the file while reading into your buffer.
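The scheme above might be sketched as follows, under some simplifying assumptions. MAXLINE, struct line_reader, lr_init, and lr_next are hypothetical names chosen for this example. Instead of the wrap-around buffer described above, the sketch uses a linear buffer of twice the maximum line length and slides unread bytes to the front before each read(), which avoids the two-call case at the cost of some copying. Only '\n' is treated as the end-of-line marker.

```c
#include <unistd.h>
#include <fcntl.h>

#define MAXLINE 64                  /* assumed maximum supported line length */

struct line_reader {
    int fd;
    size_t start;                   /* first unconsumed byte in buf */
    size_t end;                     /* one past the last valid byte in buf */
    int discarding;                 /* 1 while skipping the tail of an overlong line */
    char buf[2 * MAXLINE];
};

void lr_init(struct line_reader *lr, int fd)
{
    lr->fd = fd;
    lr->start = lr->end = 0;
    lr->discarding = 0;
}

/* Copy n bytes to out and terminate; returns n as the line length. */
static ssize_t lr_copy(char *out, const char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = src[i];
    out[n] = '\0';
    return (ssize_t)n;
}

/* Slide unread bytes to the front, then read more. Returns read()'s result. */
static ssize_t lr_fill(struct line_reader *lr)
{
    size_t i, n = lr->end - lr->start;
    ssize_t got;
    for (i = 0; i < n; i++)
        lr->buf[i] = lr->buf[lr->start + i];
    lr->start = 0;
    lr->end = n;
    got = read(lr->fd, lr->buf + lr->end, sizeof lr->buf - lr->end);
    if (got > 0)
        lr->end += (size_t)got;
    return got;
}

/* Stores the next line (without '\n') in out, which must hold MAXLINE + 1
 * bytes. Returns its length, or -1 at end of file or on error. A line longer
 * than MAXLINE is returned truncated and its remainder is discarded. */
ssize_t lr_next(struct line_reader *lr, char *out)
{
    for (;;) {
        size_t i, len;
        ssize_t got;

        for (i = lr->start; i < lr->end && lr->buf[i] != '\n'; i++)
            ;
        len = i - lr->start;
        if (i < lr->end) {                      /* found an end of line */
            const char *line = lr->buf + lr->start;
            lr->start = i + 1;
            if (lr->discarding) {               /* tail of an overlong line */
                lr->discarding = 0;
                continue;
            }
            return lr_copy(out, line, len > MAXLINE ? MAXLINE : len);
        }
        if (lr->discarding) {
            lr->start = lr->end;                /* drop everything buffered */
        } else if (len >= MAXLINE) {            /* hit the maximum length */
            const char *line = lr->buf + lr->start;
            lr->start = lr->end;
            lr->discarding = 1;
            return lr_copy(out, line, MAXLINE);
        }
        got = lr_fill(lr);
        if (got < 0)
            return -1;
        if (got == 0) {                         /* end of file */
            len = lr->end - lr->start;
            if (lr->discarding || len == 0)
                return -1;
            lr->start = lr->end;
            return lr_copy(out, lr->buf, len);  /* final line without '\n' */
        }
    }
}
```

Because a refill only happens while fewer than MAXLINE unread bytes remain, there is always at least MAXLINE bytes of free space for read(), so a zero return can safely be taken to mean end of file.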


Unfortunately the read function isn't really suitable for this sort of input. Assuming this is some sort of artificial requirement from interview/homework/exercise, you can attempt to simulate line-based input by reading the file in chunks and splitting it on the newline character yourself, maintaining state in some way between calls. You can get away with a static position indicator if you carefully document the function's use.


This is a good question, but allowing only the read function doesn't help! :P

Call read in a loop to fetch a fixed number of bytes, search for the '\n' character, return the part of the string up to the '\n', and store the rest (excluding the '\n') to prepend to the next chunk read from the file.

Use dynamic memory.

The larger the buffer, the fewer read calls you need (read is a system call, so it is not cheap, although nowadays there are preemptive kernels).
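As a rough illustration of chunked reads combined with dynamic memory, here is one possible sketch. The name next_line and the CHUNK size are assumptions for this example. It needs stdlib.h for malloc, realloc, and free, which goes beyond the two headers the question allows. Rather than prepending leftover bytes to the next chunk, it keeps the chunk and a position indicator in static variables, so the leftover bytes simply stay in place until the next call; as a consequence it can serve only one file descriptor at a time.

```c
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>     /* malloc, realloc, free: not among the two allowed headers */

#define CHUNK 4096      /* assumed chunk size; larger means fewer read() calls */

/* Hypothetical helper: returns the next line from fd as a heap-allocated,
 * '\0'-terminated string without the '\n', or NULL at end of file, on a
 * read error, or if allocation fails. The caller frees the result.
 * State is static, so use it with a single descriptor only. */
char *next_line(int fd)
{
    static char chunk[CHUNK];
    static size_t pos = 0, filled = 0;  /* unread bytes are chunk[pos..filled) */
    size_t len = 0, cap = 64;
    char *line = malloc(cap);

    if (line == NULL)
        return NULL;

    for (;;) {
        char c;

        if (pos == filled) {            /* chunk exhausted: fetch another */
            ssize_t got = read(fd, chunk, sizeof chunk);
            if (got < 0 || (got == 0 && len == 0)) {
                free(line);
                return NULL;
            }
            if (got == 0)
                break;                  /* final line without a '\n' */
            pos = 0;
            filled = (size_t)got;
        }
        c = chunk[pos++];
        if (c == '\n')
            break;
        if (len + 1 == cap) {           /* grow, keeping room for '\0' */
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
        line[len++] = c;
    }
    line[len] = '\0';
    return line;
}
```

A caller would loop until next_line returns NULL, writing each line out and freeing it before asking for the next one.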

...

Or simply fix a maximum line length, and use fgets, if you need to be quick...


If you need to read exactly 1 line (and not overstep) using read(), the only generally-applicable way to do that is by reading 1 byte at a time and looping until you get a newline byte. However, if your file descriptor refers to a terminal and it's in the default (canonical) mode, read will wait for a newline and return less than the requested size as soon as a line is available. It may however return more than one line, if data arrives very quickly, or less than 1 line if your program's buffer or the internal terminal buffer is shorter than the line length.

Unless you really need to avoid overstep (which is sometimes important, if you want another process/program to inherit the file descriptor and be able to pick up reading where you left off), I would suggest using stdio functions or your own buffering system. Using read for line-based or byte-by-byte IO is very painful and hard to get right.


Well, it will read line-by-line from a terminal.

Some choices you have are:

  • Write a function that uses read when it runs out of data but only returns one line at a time to the caller
  • Use the function in the library that does exactly that: fgets().
  • Read only one byte at a time, so you don't go too far.


If you open the file in text mode then Windows "\r\n" will be silently translated to "\n" as the file is read.

If you are on Unix you can use the non-standard [1] GNU getline() function.


[1] The getline() function is standard in POSIX 2008.


Convert file descriptor to FILE pointer.

FILE* fp = fdopen(fd, "r");

Then you can use getline().
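A short sketch of how that could look, assuming stdio.h and stdlib.h are permitted (which the original exercise rules out). The function name print_lines is hypothetical. Note that fclose() on the stream also closes the underlying file descriptor.

```c
#define _POSIX_C_SOURCE 200809L     /* request the POSIX.1-2008 getline() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: wraps fd in a FILE stream, echoes every line to
 * stdout, and returns the number of lines read, or -1 if fdopen() fails. */
long print_lines(int fd)
{
    FILE *fp = fdopen(fd, "r");
    char *line = NULL;      /* getline() allocates and grows this buffer */
    size_t cap = 0;
    ssize_t len;
    long count = 0;

    if (fp == NULL)
        return -1;

    while ((len = getline(&line, &cap, fp)) != -1) {
        fwrite(line, 1, (size_t)len, stdout);   /* line keeps its '\n', if any */
        count++;
    }
    free(line);
    fclose(fp);             /* this closes fd as well */
    return count;
}
```

Because the stream buffers its input, this approach reads ahead of the current line, so it is not suitable when another process must continue reading from the same descriptor.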
