How to handle parsing a big xml file and saving it in a database

devze.com (https://www.devze.com), 2023-02-15 14:50. Source: the web
I have a fairly large XML file (over 2 MB) that I'm parsing and storing in an SQLite database. The initial parse and store work fine. My question is about updating the database when I parse the XML file again to pick up changes, additions, or deletions. My first thought is to wipe the data in the database and insert everything again, rather than parsing the data, checking whether each item is already in the database, and updating it if so. Is one approach better than the other? Would there be a performance hit one way or the other? I'd appreciate any thoughts on the matter.


Yes, re-inserting is probably a bad idea. How complicated is the XML structure, and how many tables would you have to query to check whether a single item from that structure already exists?

If it's complex, you might be able to compute a checksum of each entry, or a hash of the attributes and values that identify a record uniquely, and store that hash/checksum in an extra table in the database. When you look for modified entries, you just compute the hash/checksum and look it up in that one table. That might even make the querying faster, depending on how expensive the hash calculation is.
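A minimal sketch of the hash idea, in Python with the standard-library sqlite3 module. Everything specific here is an assumption made for illustration: the `<item id="...">` element shape, the `items` table, and the `record_hash` and `sync` helpers are hypothetical names, and the hash is kept in a column next to the row rather than in a separate table to keep the example short.

```python
import hashlib
import sqlite3
import xml.etree.ElementTree as ET


def record_hash(elem):
    """Hash the attributes and text of one record (hypothetical layout)."""
    parts = [f"{k}={v}" for k, v in sorted(elem.attrib.items())]
    parts.append((elem.text or "").strip())
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def sync(conn, xml_text):
    """Insert new items, update changed ones, delete ones no longer in the XML."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id TEXT PRIMARY KEY, body TEXT, hash TEXT)"
    )
    # One query loads every stored hash, so each record is checked in memory.
    stored = dict(conn.execute("SELECT id, hash FROM items"))
    seen = set()
    stats = {"inserted": 0, "updated": 0, "deleted": 0, "unchanged": 0}

    with conn:  # commits on success, rolls back if anything raises
        for elem in ET.fromstring(xml_text).iter("item"):
            item_id = elem.get("id")
            body = (elem.text or "").strip()
            digest = record_hash(elem)
            seen.add(item_id)
            if item_id not in stored:
                conn.execute(
                    "INSERT INTO items (id, body, hash) VALUES (?, ?, ?)",
                    (item_id, body, digest),
                )
                stats["inserted"] += 1
            elif stored[item_id] != digest:
                conn.execute(
                    "UPDATE items SET body = ?, hash = ? WHERE id = ?",
                    (body, digest, item_id),
                )
                stats["updated"] += 1
            else:
                stats["unchanged"] += 1
        # Anything stored but not seen in this parse was removed from the XML.
        for item_id in set(stored) - seen:
            conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
            stats["deleted"] += 1
    return stats
```

Calling `sync(conn, xml_text)` after each parse returns a count of what changed. Unchanged records cost one hash and one dictionary lookup, with no write to the database.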


Inserting only what has changed is likely to be quicker than dumping the entire database and re-inserting everything. At least that's my thinking.

I suppose it depends on how complex the information you are checking against is, and on how efficient your checking code is. If you aren't comfortable doing verification like that, then dumping and re-inserting would be the safer option.
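For comparison, here is a rough sketch of the dump-and-re-insert route, again in Python with sqlite3. The `items` table, the `<item id="...">` element shape, and the `rebuild` helper are assumptions for illustration only. The delete and the inserts run in one transaction, so a failure partway through leaves the old data in place.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rebuild(conn, xml_text):
    """Replace the whole table with the contents of the XML, atomically."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id TEXT PRIMARY KEY, body TEXT)"
    )
    # Parse before touching the database: malformed XML raises here,
    # before any row has been deleted.
    rows = [
        (elem.get("id"), (elem.text or "").strip())
        for elem in ET.fromstring(xml_text).iter("item")
    ]
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM items")
        conn.executemany("INSERT INTO items (id, body) VALUES (?, ?)", rows)
    return len(rows)
```

Batching all the inserts in a single transaction also tends to matter a lot for speed in SQLite, since committing after every row forces a separate write to disk each time.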
