What is the best way to cache XML feeds locally?

I have an XML feed which contains 1000+ property records (rent, sale).

Currently I am calling this feed 16 times on the homepage, each call returning only a few properties for specific criteria: 3 new houses, 3 new flats, 5 recommended houses, 5 recommended flats, and so on.

This scenario worked well for 7 months, while there were 200+ properties and only 100-200 views a day. It is now getting to the stage where I have 700+ visits a day and over 1000 properties, and downloading 16 feeds separately just to render the homepage is getting slower while the traffic load grows massively.

Therefore I would like to cache these streams: only my 'robot' would download the streams directly from the source, and all visitors would use my local copy, which would make things much quicker and decrease the traffic load massively.

I don't have a problem downloading the XML locally and reading the local files to show the data. But I would like to know how to solve possible issues like:

  • not showing data to clients while the robot is updating the XML files, because the original file would be overwritten and empty while the new data is loading
  • using the XML files as a local backup, so that if the source server is offline the homepage would still work and load
  • making sure that I don't lock the data for clients in such a way that the robot would be unable to update the files

My first thought would be to work with two XML files for every stream: one shown to clients and one being downloaded. If the download succeeds, the downloaded XML becomes the live data and the other file is deleted. Some kind of incremental marking could also work, with one file holding the name of the file containing the actual data.
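For illustration, a minimal sketch of that swap idea in C#, assuming each stream is cached in a single file; the URL and paths are placeholders, and a real robot would probably validate the XML by parsing it before the swap:

```csharp
using System.IO;
using System.Net;

class FeedDownloader
{
    // Download to a temp file first, then swap it in, so readers
    // never see a half-written or empty feed file.
    public static void Refresh(string feedUrl, string livePath)
    {
        string tempPath = livePath + ".tmp";
        using (var client = new WebClient())
        {
            client.DownloadFile(feedUrl, tempPath);
        }

        // Basic sanity check: keep serving the old copy if the
        // download produced nothing usable.
        if (new FileInfo(tempPath).Length == 0)
        {
            File.Delete(tempPath);
            return;
        }

        if (File.Exists(livePath))
            File.Replace(tempPath, livePath, null); // swap in the new copy
        else
            File.Move(tempPath, livePath);
    }
}
```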

Is there any way to cache these XML files along those lines? The main issue is to have a bulletproof solution, so clients won't see error pages or empty results.

Thanks.


Use the caching options built into the HTTP stack, set via the CachePolicy property on HttpWebRequest. This lets you programmatically choose between obtaining straight from the cache (ignoring freshness), ignoring the cache, forcing the cache to be refreshed, forcing the cache to be revalidated, and the normal behaviour of using the cache if it's considered fresh according to the original response's age information, and otherwise revalidating it.
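A minimal sketch, with the feed URL as a placeholder; HttpRequestCacheLevel.Default gives the last behaviour described above, and other values such as CacheOnly, BypassCache, Revalidate and Refresh map onto the rest:

```csharp
using System.IO;
using System.Net;
using System.Net.Cache;

class FeedClient
{
    // Fetch a feed, letting the HTTP cache serve it while it is still
    // fresh and revalidating with the server once it goes stale.
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CachePolicy =
            new HttpRequestCachePolicy(HttpRequestCacheLevel.Default);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```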

Even if you have really specific caching requirements that go beyond that, build them on top of doing HTTP caching properly, rather than as a complete replacement.

If you do need to manage your own cache of the XML streams, then normal file locking and, if really necessary, .NET's ReaderWriterLockSlim should suffice to keep different threads from messing each other up. One way to remove the risk of excessive contention is to default to direct access whenever the cache is contended. Consider that caching is ultimately an optimisation: conceptually you are getting the file "from the server", and caching just makes this happen in a more efficient manner. Hence, if you fail to quickly obtain a read lock, you can revert to downloading directly. This in turn reduces the wait for the write lock, because pending read locks won't stack up while a write lock is requested. In practice this probably won't happen very often, but it saves you from the risk of unacceptable contention building up around one file and bringing the whole system down.
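A minimal sketch of that fallback idea, assuming one cache file per stream; the 50 ms timeout and the downloadDirect delegate are illustrative choices, not fixed values:

```csharp
using System;
using System.IO;
using System.Threading;

class CachedStream
{
    private static readonly ReaderWriterLockSlim Lock =
        new ReaderWriterLockSlim();

    // Readers give up on the lock quickly and fall back to downloading
    // straight from the source, so waiters never stack up behind a writer.
    public static string Read(string cachePath, Func<string> downloadDirect)
    {
        if (Lock.TryEnterReadLock(TimeSpan.FromMilliseconds(50)))
        {
            try { return File.ReadAllText(cachePath); }
            finally { Lock.ExitReadLock(); }
        }
        return downloadDirect(); // the cache is just an optimisation
    }

    // The robot holds the write lock only for the brief rewrite.
    public static void Update(string cachePath, string freshXml)
    {
        Lock.EnterWriteLock();
        try { File.WriteAllText(cachePath, freshXml); }
        finally { Lock.ExitWriteLock(); }
    }
}
```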


I'm going to start by assuming that you don't own the code that produces the source XML feed. If you do, I'd look at adding specific support for the queries you want to run.

I had a similar issue with a third-party feed and built a job that runs a few times a day, downloads the feed, parses it, and stores the results locally in a database.

You need to do a bit of comparison each time you update the database, only adding new records and deleting stale ones, but this ensures that you always have data to serve to your clients, and the database works around simple issues like file locking.
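As an illustration, the sync job could look something like this; the element names, attribute names and the IPropertyStore interface are assumptions, not the real feed schema:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class PropertyRecord
{
    public string Id;
    public string Title;
    public string Type;
}

interface IPropertyStore
{
    IEnumerable<string> AllIds();
    void Delete(string id);
    void Upsert(PropertyRecord record);
}

class FeedSyncJob
{
    // Runs a few times a day: pulls the feed, then reconciles it
    // against the local database.
    public static void Run(string feedUrl, IPropertyStore store)
    {
        var incoming = XDocument.Load(feedUrl)
            .Descendants("property")               // assumed element name
            .Select(p => new PropertyRecord
            {
                Id = (string)p.Attribute("id"),
                Title = (string)p.Element("title"),
                Type = (string)p.Element("type")
            })
            .ToDictionary(r => r.Id);

        // Delete records that have disappeared from the feed...
        foreach (var staleId in store.AllIds()
                                     .Where(id => !incoming.ContainsKey(id))
                                     .ToList())
        {
            store.Delete(staleId);
        }

        // ...and insert or update everything else.
        foreach (var record in incoming.Values)
        {
            store.Upsert(record);
        }
    }
}
```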

Then I'd look at a simple service layer to expose the data in your local store.

