What is the best file format to parse?

devze.com (https://www.devze.com), 2022-12-29 23:43. Source: the web
Scenario: I'm working on a Rails app that will take data entry in the form of uploaded text-based files. I need to parse these files before importing the data. I can choose the file type uploaded to the app; the software (Microsoft Access) used by those uploading has several export options regarding file type.

While it may be insignificant, I was wondering if there is a specific file type that is most efficiently parsed. This question can be viewed as language-independent, I believe.

(While XML is commonly parsed, it is not a feasible file type for the sake of this project.)


If the data is exported from Access, the easiest choice would be CSV, particularly since Ruby includes a CSV parser in its standard library. You will have to do some work to determine the CSV dialect (which delimiter it uses, how it handles quotes). I don't know how robust the Ruby parser is with those issues, but you should also have some control over them from Microsoft Access.
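As a rough sketch of handling the dialect question, Ruby's `CSV.parse` accepts `col_sep` and `quote_char` options, so the delimiter and quoting style can be matched to whatever Access was told to export. The sample data and the `parse_export` helper below are hypothetical, made up for illustration. On recent Ruby versions `csv` ships as a bundled gem, so a Bundler-managed app may need it listed in the Gemfile.

```ruby
require "csv"

# Hypothetical sample that mimics a delimited text export: semicolon-separated,
# text fields wrapped in double quotes, header row first.
SAMPLE_EXPORT = <<~DATA
  id;name;notes
  1;"Smith; John";"said ""hello"""
  2;"Doe, Jane";
DATA

# Hypothetical helper: parse delimited text into an array of hashes keyed by
# the header row. The dialect (delimiter and quote character) is passed in so
# it can be adjusted to match the export settings.
def parse_export(text, col_sep: ";", quote_char: '"')
  CSV.parse(text, headers: true, col_sep: col_sep, quote_char: quote_char).map(&:to_h)
end

rows = parse_export(SAMPLE_EXPORT)
rows.each { |row| puts row.inspect }
```

Quoted fields may contain the delimiter itself, and a doubled quote inside a quoted field is read as one literal quote. An empty unquoted field comes back as `nil`, which is worth handling before importing.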


You might want to take a look at JSON. It's a lightweight format, and in contrast to XML it's really easy and clean to parse without requiring a huge library on the backend.

It can represent types like strings, numbers, associative arrays (objects), and lists of these.
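To illustrate the types mentioned above, here is a minimal sketch using Ruby's `json` library. The record and its field names are invented for the example. `JSON.parse` maps JSON objects to `Hash`, lists to `Array`, and numbers to `Integer` or `Float`.

```ruby
require "json"

# Hypothetical record showing each JSON type mentioned above:
# a string, numbers, a list, and a nested object.
SAMPLE_RECORD = <<~DATA
  {"id": 1, "name": "Widget", "price": 9.5, "tags": ["a", "b"], "dims": {"w": 2, "h": 3}}
DATA

record = JSON.parse(SAMPLE_RECORD)

puts record["name"]        # a String
puts record["price"].class # Float
puts record["tags"].class  # Array
puts record["dims"].class  # Hash
```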


I would suggest n-SV (where n is some character) for data that does not include n. That will make lexing the files a matter of a split.
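A sketch of what "a matter of a split" could look like in Ruby, assuming a pipe as the character n and data that never contains it. The sample lines and the `split_records` helper are hypothetical.

```ruby
# Hypothetical pipe-separated data; the delimiter never appears inside a field.
SAMPLE_LINES = "1|Widget|9.50\n2|Gadget|\n"

# Hypothetical helper: split each line on the delimiter. The -1 limit keeps
# trailing empty fields, which String#split would otherwise drop.
def split_records(text, delimiter = "|")
  text.each_line(chomp: true).map { |line| line.split(delimiter, -1) }
end

records = split_records(SAMPLE_LINES)
records.each { |fields| puts fields.inspect }
```

This only works while the delimiter truly never occurs in the data; once fields can contain it, quoting rules are needed and a real CSV parser is the safer choice.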

If you have more flexible data, I would suggest JSON.


If you HAVE to roll your own parser, I would suggest CSV or some other delimiter-separated format.

If you are able to use other libraries, there are plenty of options. JSON looks quite fascinating.
