I want to send 'packets' of data (i.e. discrete messages) between two programs through named pipes. Given that I have to supply a buffer and a buffer size to read, and given that the read command is blocking (I believe), I either have to have a buffer size that guarantees I never get an under-run, or to know the size of the message up-front. I don't want the sending program to have to know the size of the buffer and pad it out.
As I see it, there are three ways to do this.
- Prepend each packet with the size of the message being sent so the listening program can read that many bytes.
- Read from the pipe a byte at a time and listen for a special end-of-stream value.
- A better way
In the first case I would be able to create a buffer of known size and read into it at once. In the second case I would have to read with a one-byte buffer. This might either be perfectly OK or a massively inefficient travesty.
The only reason I would go for the second approach would be for more flexible input (for example, manual interaction if I wanted it).
Which is the best way to go?
With named pipes, writes are (or can be) atomic. Within limits (POSIX guarantees it for writes of at most PIPE_BUF bytes, which is at least 512 and is 4096 on Linux), if you write, say, 1024 bytes to the pipe, those bytes go in as one contiguous unit and are not interleaved with data from other writers. The pipe itself does not keep message boundaries, though: a read returns whatever is available, up to the size requested, so if more data is already in the pipe a large read can return more than one message. Further, and always, if there are just 1024 bytes in the named pipe and a read requests 4096 bytes, it will get the 1024 bytes on the first attempt, and only block on a subsequent attempt.
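The short-read behaviour described above can be checked with a small experiment. This is only a sketch: it uses an anonymous pipe from pipe() instead of a named one (the read semantics should be the same), and the helper name read_after_write is made up for illustration.

```c
#include <unistd.h>

/* Write `avail` bytes into a fresh pipe, then ask read() for `want` bytes.
 * Returns what read() reported, or -1 on error.
 * avail is kept small (1..4096) so the write cannot fill the pipe and block,
 * and non-zero so the read always has something to return. */
ssize_t read_after_write(size_t avail, size_t want)
{
    static char out[4096], in[4096];
    int fds[2];
    ssize_t n = -1;

    if (avail == 0 || avail > sizeof out || want > sizeof in)
        return -1;
    if (pipe(fds) == -1)
        return -1;

    if (write(fds[1], out, avail) == (ssize_t)avail)
        n = read(fds[0], in, want);   /* returns as soon as any data is there */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling read_after_write(1024, 4096) should report 1024: the read hands back what is in the pipe instead of waiting for the buffer to fill.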
You say:
"Given that I have to supply a buffer and a buffer size to read,"
You do...
"and given that the read command is blocking (I believe),"
It is, unless you set O_NONBLOCK on the file descriptor...
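For reference, a minimal sketch of switching a descriptor to non-blocking mode with fcntl. The helper names set_nonblocking and try_read, and the -2 return convention, are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch an open descriptor to non-blocking mode, keeping its other flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns the number of bytes read, 0 at end of file, -1 on error,
 * or -2 when the descriptor is non-blocking and no data is available yet. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

With a FIFO you can also pass O_NONBLOCK directly to open(); either way the reader then has to cope with "nothing yet" as a normal result.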
"I either have to have a buffer size that guarantees I never get an under-run,"
What sort of messages are you sending? What size are you dealing with? Kilobytes, megabytes, bigger?
"or to know the size of the message up-front."
There is no particular problem with having, say, a 4KB buffer in the reader, and reading the message in chunks. The issue is knowing when you reach the end of the message. By far the majority of protocols require the length up front, because it makes it easy to write the reader code reliably.
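A minimal sketch of the length-up-front approach, assuming a single writer, a 4-byte length header in host byte order (fine between two programs on one machine), and made-up function names send_msg and recv_msg. The reader loops over short reads until it has the whole header, then the whole payload.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Read exactly len bytes, retrying after short reads.
 * Returns 0 on success, -1 on error or if the writer closed early. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;              /* end of file in the middle of a message */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes, retrying after short writes. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: the length header first, then the payload. */
int send_msg(int fd, const void *data, uint32_t len)
{
    if (write_full(fd, &len, sizeof len) == -1)
        return -1;
    return write_full(fd, data, len);
}

/* Receive one message into a freshly allocated buffer (caller frees it).
 * Stores the payload length in *len. Returns NULL on error or if the
 * announced length exceeds max_len; in that case the payload is left
 * unread, so the stream should be treated as broken. */
void *recv_msg(int fd, uint32_t *len, uint32_t max_len)
{
    void *buf;

    if (read_full(fd, len, sizeof *len) == -1 || *len > max_len)
        return NULL;
    buf = malloc(*len ? *len : 1);  /* avoid malloc(0) for empty messages */
    if (buf != NULL && read_full(fd, buf, *len) == -1) {
        free(buf);
        buf = NULL;
    }
    return buf;
}
```

With several writers on the same pipe, the header and payload would need to go out in a single write of at most PIPE_BUF bytes to stay together; this sketch does not attempt that.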
If you are going to do an 'end of stream' (EOS) marker, you are doing 'in-band signalling'. And that causes trouble. What character are you going to use? What happens when that character appears in the data? You need an escape mechanism, such as a character that means 'the next character is not the EOS marker'. For example, in text related to programming, the backslash is used for this. At a terminal, control-V often serves the purpose.
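To make the cost of in-band signalling concrete, here is a sketch of the escaping it forces on both sides. The choice of newline as the EOS marker and backslash as the escape byte is an assumption for illustration, as are the names frame_encode and frame_decode.

```c
#include <stddef.h>

enum { EOS = '\n', ESC = '\\' };   /* hypothetical marker and escape bytes */

/* Escape src into dst and append an unescaped EOS marker.
 * dst must have room for 2 * n + 1 bytes. Returns the bytes written. */
size_t frame_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == EOS || src[i] == ESC)
            dst[out++] = ESC;       /* next byte is data, not a marker */
        dst[out++] = src[i];
    }
    dst[out++] = EOS;
    return out;
}

/* Decode one frame from the n bytes at src into dst (room for n bytes).
 * On success returns the payload length and stores in *used how many
 * input bytes were consumed, marker included. Returns (size_t)-1 if the
 * input does not yet contain a complete frame. */
size_t frame_decode(const unsigned char *src, size_t n,
                    unsigned char *dst, size_t *used)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == ESC) {
            if (++i == n)
                break;              /* escape byte with nothing after it */
            dst[out++] = src[i];
        } else if (src[i] == EOS) {
            *used = i + 1;
            return out;
        } else {
            dst[out++] = src[i];
        }
    }
    return (size_t)-1;
}
```

Every payload byte has to be inspected on both sides, and the encoded size depends on the content, which is part of why length-prefixed protocols are usually easier to get right.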
"I don't want the sending program to have to know the size of the buffer and pad it out."
Why is it hard for the sender to know the size of the buffer? And why would it need to 'pad it out'?
If you are dealing with large amounts of data (from say kilobytes upwards), the single-character solution is unlikely to yield acceptable performance. I think you would be best off having the sender able to determine the size of packet and telling the reader, or designing the protocol so that there are limits on the size of a packet. If you need to convey arbitrary amounts of data, have a protocol which says:
- Large quantity of data of unknown total size coming.
- For each sub-packet, the message says 'this is a sub-packet of size NN KB'.
- For the last sub-packet, the size might be shorter - that's OK and could indicate 'end of large quantity of data'.
- If the last sub-packet is 'full size', you might send an empty last packet to indicate the EOS.
- Alternatively, if the sub-packets can be of variable size, you can always send an explicit EOS packet.
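One way the sub-packet scheme above might look in code, using the last variant: variable-size sub-packets followed by an explicit empty packet as EOS. This is a sketch under assumptions: a 4-byte size header in host byte order, a 4 KB maximum sub-packet size, a single writer, and made-up names send_stream and recv_stream.

```c
#include <stdint.h>
#include <unistd.h>

enum { CHUNK = 4096 };   /* hypothetical maximum sub-packet size */

/* Transfer exactly len bytes; 0 on success, -1 on error or early EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* One sub-packet: size header, then that many bytes (possibly none). */
static int send_chunk(int fd, const void *data, uint32_t len)
{
    if (write_full(fd, &len, sizeof len) == -1)
        return -1;
    return write_full(fd, data, len);
}

/* Forward everything readable from src, whose total size is not known in
 * advance, to dst as sub-packets. The final empty sub-packet marks EOS. */
int send_stream(int src, int dst)
{
    char buf[CHUNK];
    for (;;) {
        ssize_t n = read(src, buf, sizeof buf);
        if (n < 0)
            return -1;
        if (send_chunk(dst, buf, (uint32_t)n) == -1)
            return -1;
        if (n == 0)
            return 0;       /* the empty packet just sent was the EOS */
    }
}

/* Collect sub-packets from src into out (capacity cap) until the empty
 * EOS packet arrives. Returns the total payload size, or -1 on error. */
ssize_t recv_stream(int src, char *out, size_t cap)
{
    size_t total = 0;
    for (;;) {
        uint32_t len;
        if (read_full(src, &len, sizeof len) == -1)
            return -1;
        if (len == 0)
            return (ssize_t)total;
        if (len > CHUNK || len > cap - total)
            return -1;      /* oversized sub-packet or buffer full */
        if (read_full(src, out + total, len) == -1)
            return -1;
        total += len;
    }
}
```

The receiver never needs a buffer larger than one sub-packet at a time if it processes the data as it arrives; collecting into a single buffer here just keeps the example short.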
Also consider what will happen in future if, instead of using named pipes, you want to upgrade your system to work over a socket connection to another machine.
I think you should design your system with packets where the packet headers include the size of the data (the way most networking protocols, such as TCP/IP, do things). And if there's a higher level flow of data of unknown size, handle it along the lines outlined above. But even there, it is better if you can tell the overall size ahead of time.
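One simple way would be to send a small discrete packet through the pipe that carries a System V IPC key obtained from ftok (based on the named pipe), and to put the message itself, as a null-terminated string, in a shared memory segment created with that key. All other discrete information can be passed within the packet struct. A raw pointer would mean nothing in the receiver's address space, so each side attaches the segment itself.
sender:
key_t key = ftok("./mynamedpipe", 'M');  /* 'M' is an arbitrary project id chosen for illustration; use the same one on both sides */
int shmid = shmget(key, strlen(message) + 1, IPC_CREAT | IPC_EXCL | 0600);
char *shared = shmat(shmid, NULL, 0);
strcpy(shared, message);
packet.ident = key;  /* then write the packet to the pipe */
receiver:
int shmid = shmget(packet.ident, 0, 0);  /* look up the existing segment */
char *message = shmat(shmid, NULL, 0);
Note that no address is passed to shmat (the second argument is NULL), so the system chooses where to attach the segment; this prevents remapping existing memory within the receiver process.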