I have a byte array (a UTF-8 encoded string sent as a byte array from the client). The message should have the following format:
'number' 'timestamp' 'str1' 'str2'
E.g
1 2000-01-31T20:00.00 the 1st str the 2nd str
It is clear that the 'number' and 'timestamp' are easily read from the byte array. The start position of 'str1' can also be figured out. Considering that 'str1' and 'str2' can have any content (any length), what type of delimiter can be used to know when 'str1' ends and 'str2' starts? Or are there any other tricks for parsing something like this?
note1: the message format is provided by me, so any solution with a different format/order will do as long as all 4 pieces of info are in the byte array.
note2: I know I could encode str1 so that it doesn't contain my custom delimiter but I would like to avoid the overhead of encoding/decoding the data.
note3: One solution I could think of was to write the length of str1 in front of it when sending the data from client side. E.g 'number' 'timestamp' 'str1length' 'str1' 'str2'
are there any other tricks you can think of?
thanks
I recommend you do the 3rd option you listed:
number timestamp length_of_string1 string1 length_of_string_two string2
It's probably a bad idea to stick a delimiter like "|" or "^]" between string1 and string2, because then you can no longer have the delimiter in your strings...
Also note that if you split the message on whitespace, any string that contains spaces is going to be split up as well. The way to solve this is to escape the string, surround it with double quotes, and do a quotation-aware string split.
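The length-prefixed layout recommended above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name LengthPrefixedMessage, the 4-byte big-endian length fields, and the choice to length-prefix the timestamp as well are not from the question. Note that each length counts UTF-8 bytes, not characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
// Assumed layout: int number, then timestamp, str1 and str2,
// each written as a 4-byte length (in bytes) followed by its UTF-8 bytes.
final class LengthPrefixedMessage {

    static byte[] encode(int number, String timestamp, String str1, String str2) {
        byte[][] fields = {
            timestamp.getBytes(StandardCharsets.UTF_8),
            str1.getBytes(StandardCharsets.UTF_8),
            str2.getBytes(StandardCharsets.UTF_8)
        };
        int size = Integer.BYTES;
        for (byte[] field : fields) {
            size += Integer.BYTES + field.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(number);
        for (byte[] field : fields) {
            buf.putInt(field.length); // length prefix in bytes
            buf.put(field);
        }
        return buf.array();
    }

    // Returns {number, timestamp, str1, str2} as strings.
    static String[] decode(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        String[] out = new String[4];
        out[0] = Integer.toString(buf.getInt());
        for (int i = 1; i < out.length; i++) {
            byte[] field = new byte[buf.getInt()];
            buf.get(field);
            out[i] = new String(field, StandardCharsets.UTF_8);
        }
        return out;
    }
}
```

Because the parser never scans the content for a delimiter, str1 and str2 may contain any characters at all, including spaces and "|".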
If I had freedom to choose the syntax, I would do one of the following:
If there is some Unicode character that is never going to appear in str1 and str2 (call it '|' for the sake of argument), I would concatenate the 4 components with '|' as the separator. Then I would "parse" the string using String.split("\\|");
If I couldn't be certain that any character I picked was not going to be used in str1 or str2, I'd pick a separator character and an escape character (say '|' and '\\') and use the escape character to escape a literal separator and a literal escape character. Building the message and then parsing it is more effort to code, but it will definitely work.
As a third alternative, if both ends were Java I'd consider using Java data streams to encode and decode the data.
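The separator-plus-escape scheme described above might look something like the sketch below. The class name EscapedFields and its join/split methods are hypothetical, chosen here for illustration; the separator '|' and the backslash escape are the ones suggested in the answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class EscapedFields {
    static final char SEP = '|';
    static final char ESC = '\\';

    // Joins fields with SEP, prefixing any literal SEP or ESC with ESC.
    static String join(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(SEP);
            }
            for (char c : fields[i].toCharArray()) {
                if (c == SEP || c == ESC) {
                    sb.append(ESC);
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Splits on unescaped SEP and removes the escape characters.
    // A dangling ESC at the very end of the input is silently dropped.
    static List<String> split(String message) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (char c : message.toCharArray()) {
            if (escaped) {
                current.append(c);
                escaped = false;
            } else if (c == ESC) {
                escaped = true;
            } else if (c == SEP) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

A plain String.split would not work for the parsing side here, since it cannot tell an escaped separator from a real one; hence the character-by-character loop.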