I am wondering what the differences are between binary and text-based protocols. I read that binary protocols are more compact and faster to process. How does that work out, since you have to send the same amount of data either way? Or don't you?
E.g., how would the string "hello" differ in size in binary format?
If all you are doing is transmitting text, then yes, the difference between the two isn't very significant. But consider trying to transmit things like:
- Numbers - do you use a string representation of a number, or the binary? Especially for large numbers, the binary will be more compact.
- Data Structures - How do you denote the beginning and ending of a field in a text protocol? Sometimes a binary protocol with fixed length fields is more compact.
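The two points above can be checked with a small sketch. The sample values and the field layout (a 16-bit id, a 32-bit timestamp, an 8-bit flag) are made up for illustration, not taken from any real protocol:

```python
import struct

# A large number: text needs one byte per digit, binary a fixed 4 bytes.
value = 4_000_000_000                      # still fits in an unsigned 32-bit int
as_text = str(value).encode("ascii")       # b"4000000000"
as_binary = struct.pack("!I", value)       # big-endian unsigned 32-bit
print(len(as_text), len(as_binary))        # 10 4

# A record: text needs delimiters, fixed-length binary fields do not.
user_id, timestamp, flag = 513, 1_700_000_000, 1   # made-up sample values
text_record = f"{user_id},{timestamp},{flag}\n".encode("ascii")
binary_record = struct.pack("!HIB", user_id, timestamp, flag)  # 2 + 4 + 1 bytes
print(len(text_record), len(binary_record))  # 17 7

# The receiver unpacks by position instead of searching for delimiters.
print(struct.unpack("!HIB", binary_record))  # (513, 1700000000, 1)
```

Note that it can go the other way for small values: the number 7 is one byte as text but still four bytes in a fixed 32-bit field.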
Text protocols are better in terms of readability, ease of reimplementing, and ease of debugging. Binary protocols are more compact.
However, you can compress your text using a library like LZO or Zlib, and this is almost as compact as binary (with very little performance hit for compression/decompression.)
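As a rough illustration using Python's zlib module (the message content here is invented, and how much you save depends heavily on the input; very short messages may not shrink at all):

```python
import zlib

# Invented, repetitive text payload; real savings depend on the data.
message = ("temperature=21.5;humidity=40;status=OK\n" * 50).encode("ascii")

compressed = zlib.compress(message)
restored = zlib.decompress(compressed)

print(len(message), len(compressed))   # compressed is much smaller here
print(restored == message)             # round trip is lossless
```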
You can read more info on the subject here:
http://www.faqs.org/docs/artu/ch05s01.html
The string "hello" itself wouldn't differ in size. The size/performance difference is in the additional information that serialization introduces (serialization is how the program represents the data to be transferred so that it can be reconstructed once it gets to the other end of the pipe).
For example, when serializing the following in .NET using XML (one of the text serialization methods):
string helloWorld = "Hello World!";
You might get something like (I know this isn't exact):
<helloWorld type="String">Hello World!</helloWorld>
Whereas Binary Serialization would be able to represent that data natively in binary without all the extra markup.
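To put rough numbers on it, here is a sketch in Python rather than .NET. The binary layout (a one-byte type tag plus a two-byte length prefix) is a hypothetical one chosen for illustration; it is not what .NET's binary serializer actually emits, and real formats add their own headers:

```python
import struct

value = "Hello World!"

# Text form: the markup from the example above.
xml_form = f'<helloWorld type="String">{value}</helloWorld>'.encode("utf-8")

# Hypothetical binary form: 1-byte type tag, 2-byte length, then the raw bytes.
TYPE_STRING = 0x01  # assumed tag value, not from any real format
payload = value.encode("utf-8")
binary_form = struct.pack("!BH", TYPE_STRING, len(payload)) + payload

print(len(payload), len(xml_form), len(binary_form))  # 12 51 15
```

The string itself is 12 bytes in both cases; only the overhead around it differs.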
Binary protocols are better if you are using control bits/bytes.
I.e., instead of sending msg:Hello, in binary it can be 0x01 followed by your message (assuming 0x01 is a control byte which stands for msg).
So in a text protocol you send msg:hello\0, which takes 10 bytes, whereas in a binary protocol it would be 0x01Hello\0, which takes 7 bytes.
As another example, suppose you want to send a number, say 255: in text it's 3 bytes, whereas in binary it's 1 byte, i.e. 0xFF.
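The byte counts can be checked quickly. This is only a sketch of the framing described here; the control byte value 0x01 is the one assumed in this answer, and `parse_binary` is a made-up helper name:

```python
MSG = 0x01  # control byte meaning "msg", as assumed in the answer

text_frame = b"msg:hello\x00"
binary_frame = bytes([MSG]) + b"Hello\x00"
print(len(text_frame), len(binary_frame))    # 10 7

def parse_binary(frame):
    """Split a frame into its control byte and NUL-terminated payload."""
    kind = frame[0]
    payload = frame[1:frame.index(0, 1)]     # up to the terminating NUL
    return kind, payload

print(parse_binary(binary_frame))            # (1, b'Hello')

text_number = str(255).encode("ascii")       # b"255"
binary_number = bytes([255])                 # b"\xff"
print(len(text_number), len(binary_number))  # 3 1
```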
You need to be clear as to what is part of the protocol and what is part of the data. Text protocols can send binary data and binary protocols can send text data.
The protocol is the part of the message that states "Hi, can I connect? I've got some data, where should I put it? You've got a reply for me? Great! Thanks, bye!"
Each bit of the conversation is (probably) much smaller in a binary protocol. Take HTTP for example (which is text based):
if you had an encoding standard, I bet you could come up with a sequence of characters smaller than the 4 bytes needed for the word 'POST'.
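For instance, a made-up encoding could map each HTTP method name, such as GET or POST, to a single byte. The opcode table below is purely hypothetical; HTTP/1.1 itself sends the method name as text:

```python
# Hypothetical one-byte opcodes; real HTTP/1.1 sends the method name as text.
OPCODES = {"GET": 0x01, "POST": 0x02, "PUT": 0x03, "DELETE": 0x04}
NAMES = {code: name for name, code in OPCODES.items()}

def encode_method(name):
    """Return the one-byte code for a method name."""
    return bytes([OPCODES[name]])

def decode_method(data):
    """Look the method name back up from its one-byte code."""
    return NAMES[data[0]]

text_form = b"POST"
binary_form = encode_method("POST")
print(len(text_form), len(binary_form))   # 4 1
print(decode_method(binary_form))         # POST
```

The catch is that both ends need the same table, and nobody can read the byte 0x02 off the wire without it.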
Some say that binary protocols are more secure; see, for example, Mike Hearn in "What should follow the web?".
I wouldn't say that binary formats are faster to process. If you have a look at CSV or a fixed-field-length textual format, it can still be processed fast.
I would say everything depends on who the consumer is. If a human being is at the end (as with HTTP or RSS), then there is no need to compact the data, except maybe by compressing it.
Binary protocols need parsers/converters, and they are difficult to extend while keeping backward compatibility. The higher you go in the protocol stack, the more human-oriented the protocols are (TCP is binary, as packets have to be processed by routers at high speed, but XML is more human-friendly).
I think size variations do not matter a lot today. For your example, hello will take the same amount of space in binary format as in text format, because a text format is also "binary" for the computer; only the way we interpret the data matters.
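A quick way to see this, sketched in Python (assuming ASCII encoding):

```python
# "hello" as text and as explicitly listed byte values.
text_bytes = "hello".encode("ascii")
raw_bytes = bytes([0x68, 0x65, 0x6C, 0x6C, 0x6F])

print(text_bytes == raw_bytes)   # True
print(text_bytes.hex())          # 68656c6c6f

# The same five bytes read as one big-endian integer instead of characters,
# then turned back into text: nothing changed except the interpretation.
as_number = int.from_bytes(raw_bytes, "big")
print(as_number.to_bytes(5, "big").decode("ascii"))  # hello
```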