XML or CSV for "Tabular Data"

Developer https://www.devze.com 2023-03-02 22:32 Source: the web
I have "Tabular Data" to be sent from server to client, and I am deciding whether to go with a CSV-like format or XML.

The data I send can run to megabytes. The server will stream it, and the client will read it line by line and start parsing the output as it arrives (the client can't wait for all the data to come).

My present thinking is that CSV would be good: it will reduce the data size and can be parsed faster.

XML is a standard, but I am concerned about parsing data as it arrives (live parsing) and about data size.

What would be the best solution?

Thanks for all valuable suggestions.


If it is "Tabular data" and the table is relatively fixed and regular, I would go for a CSV format, especially if there is one server and one client.

XML has some advantage if you have multiple clients and want to validate the file format before using the data. On the other hand, XML has cornered the market for "code bloat", so the amount transferred will be much larger.


I would use CSV, with a header which indicates the id of each field.

id, surname, givenname, phone-number
0, Doe, John, 555-937-911
1, Doe, Jane, 555-937-911

As long as you do not forget the header, you should be fine if the data format ever changes. Of course, the client needs to be updated before the server starts sending new streams.
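To illustrate why the header helps, here is a minimal sketch in Python of a client that reads rows as they arrive and looks fields up by header name rather than by position. The `iter_rows` helper and the in-memory `stream` are assumptions made for the example; in practice the lines would come from a socket or an HTTP response.

```python
import csv
import io

def iter_rows(lines):
    """Yield one dict per CSV row, keyed by header name, as lines arrive."""
    # skipinitialspace drops the blank after each comma in "0, Doe, John"
    reader = csv.DictReader(lines, skipinitialspace=True)
    for row in reader:
        yield row

# Stand-in for the network stream (hypothetical); any iterable of text lines works
stream = io.StringIO(
    "id, surname, givenname, phone-number\n"
    "0, Doe, John, 555-937-911\n"
    "1, Doe, Jane, 555-937-911\n"
)

rows = list(iter_rows(stream))
for row in rows:
    # Access by name keeps working if the server reorders or appends columns
    print(row["id"], row["givenname"], row["surname"])
```

Because the client asks for `row["surname"]` instead of column 1, a server that adds a column at the end does not break it.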

If not all clients can be updated easily, then you need a more lenient messaging system.

Google Protocol Buffers was designed for this kind of backward/forward compatibility issue, and combines it with an excellent (fast & compact) binary encoding to reduce message sizes.

If you go with this, then the idea is simple: each message represents a line. If you want to stream them, you need a simple "message size | message blob" structure.
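A rough sketch of that "message size | message blob" framing in Python follows. The 4-byte big-endian length prefix is an assumption chosen for simplicity, and the payloads are plain bytes standing in for serialized messages; the helper names `write_message`, `read_exact`, and `read_messages` are made up for the example.

```python
import io
import struct

# Assumed framing: 4-byte big-endian unsigned length, then the payload
HEADER = struct.Struct(">I")

def write_message(out, payload: bytes) -> None:
    """Write one length-prefixed message to a binary stream."""
    out.write(HEADER.pack(len(payload)))
    out.write(payload)

def read_exact(src, n: int) -> bytes:
    """Read up to n bytes, looping because a stream may return short reads."""
    chunks = []
    remaining = n
    while remaining:
        chunk = src.read(remaining)
        if not chunk:
            break  # stream ended early; the caller checks the length
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_messages(src):
    """Yield each payload as soon as it has been fully received."""
    while True:
        header = read_exact(src, HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise EOFError("stream ended inside a length header")
        (size,) = HEADER.unpack(header)
        payload = read_exact(src, size)
        if len(payload) < size:
            raise EOFError("stream ended inside a message body")
        yield payload

# In-memory stand-in for the connection (hypothetical)
buffer = io.BytesIO()
for line in (b"0|Doe|John", b"1|Doe|Jane"):
    write_message(buffer, line)
buffer.seek(0)
messages = list(read_messages(buffer))
```

The client can hand each payload to the decoder as it arrives, so it never has to wait for the whole table.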

Personally, I have always considered XML bloated by design. If you ever go with human-readable formats, then at least select JSON; you'll cut the tag overhead roughly in half.


I would suggest you go for XML. There are plenty of libraries available for parsing. Moreover, if the data format changes later, the parsing logic for XML won't change; only the business logic may need to change. With CSV, the parsing logic itself might need to change.
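On the live-parsing concern, XML does not have to be read whole before use. Below is a small sketch using `XMLPullParser` from Python's standard library, which accepts data in arbitrary chunks and reports elements as they close. The `<table>`/`<row>` layout and the `parse_rows` helper are assumptions for illustration, not a format the question specifies.

```python
import xml.etree.ElementTree as ET

def parse_rows(chunks):
    """Yield each <row> as a dict as soon as its closing tag has arrived."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "row":
                yield {child.tag: child.text for child in elem}
                elem.clear()  # drop the row's children once it has been emitted
    parser.close()

# Chunks split at arbitrary points, as they might arrive from a socket
chunks = [
    "<table><row><id>0</id><surname>Doe</surname>",
    "<givenname>John</givenname></row><row><id>1</id>",
    "<surname>Doe</surname><givenname>Jane</givenname></row></table>",
]
xml_rows = list(parse_rows(chunks))
```

The first row is available after the second chunk, before the document is complete, so streaming is possible with either format; the difference is mostly in size and parsing cost.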


The CSV format will be smaller, since you only have to declare the headers on the first row; the rows of data below add only the commas in between as extra characters to the stream size.
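A quick way to check the size argument on your own data is to serialize the same rows each way and compare lengths. This is only a sketch on the two sample rows from above; the `<table>`/`<row>` XML layout is an assumed one, and the ratios will vary with real field names and values.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

fields = ["id", "surname", "givenname", "phone-number"]
records = [
    ["0", "Doe", "John", "555-937-911"],
    ["1", "Doe", "Jane", "555-937-911"],
]

def as_csv() -> str:
    """Header once, then one comma-separated line per record."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(records)
    return buf.getvalue()

def as_json() -> str:
    """Compact JSON; every record repeats the field names once."""
    objects = [dict(zip(fields, rec)) for rec in records]
    return json.dumps(objects, separators=(",", ":"))

def as_xml() -> str:
    """Element-per-field XML; every value carries an open and a close tag."""
    root = ET.Element("table")
    for rec in records:
        row = ET.SubElement(root, "row")
        for name, value in zip(fields, rec):
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")

sizes = {"csv": len(as_csv()), "json": len(as_json()), "xml": len(as_xml())}
for name, size in sizes.items():
    print(f"{name}: {size} characters")
```

On this sample CSV comes out smallest and XML largest, with JSON in between; compression on the wire would narrow the gap.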
