Prevent duplicates when importing RSS feed to Core Data

Developer https://www.devze.com 2023-01-06 15:29 Source: Internet
I'm trying to import an RSS feed into Core Data. Once the items are imported, how do I most efficiently prevent duplicates when updating the feed again afterwards? Right now it checks every item against the data store during parsing, which is not very efficient.

I looked into the Top Songs sample from Apple. It uses a least recently used cache for categories. But when every item is different the cache doesn't help at all.

EDIT: To clarify, I can already identify each item uniquely in the feed with guid. The issue is the performance of comparing hundreds of items against the database every time, when most of them are duplicates.


When you are importing a new row, you can run a query against the existing rows to see if it is already in place. To do this, create an NSFetchRequest against your entity, set the predicate to match the guid property, and set the fetch limit to 1.

I would recommend keeping this NSFetchRequest around so that you can reuse it for every item in the import. If the NSFetchRequest returns a row, you can update that row. If it does not return a row, you can insert a new one.

When done correctly you will find the performance more than acceptable.
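
As a rough illustration of the approach above, here is a minimal Swift sketch. The entity name `FeedItem`, its `guid` and `title` string attributes, and the `FeedImporter` class are all assumptions made for the example; they are not from the question. It keeps one fetch request alive for the whole import and only swaps the predicate per item.

```swift
import CoreData

// Assumption: the model has an entity named "FeedItem" with String
// attributes "guid" and "title". Both names are hypothetical.
final class FeedImporter {
    private let context: NSManagedObjectContext
    // One request, created once and reused for every item in the import.
    private let request: NSFetchRequest<NSManagedObject>

    init(context: NSManagedObjectContext) {
        self.context = context
        request = NSFetchRequest<NSManagedObject>(entityName: "FeedItem")
        request.fetchLimit = 1  // at most one row can match a unique guid
    }

    // Updates the existing row for `guid`, or inserts a new one if none exists.
    @discardableResult
    func upsert(guid: String, title: String) throws -> NSManagedObject {
        request.predicate = NSPredicate(format: "guid == %@", guid)
        let item = try context.fetch(request).first
            ?? NSEntityDescription.insertNewObject(forEntityName: "FeedItem",
                                                   into: context)
        item.setValue(guid, forKey: "guid")
        item.setValue(title, forKey: "title")
        return item
    }
}
```

Calling `upsert` once per parsed item and saving the context after the whole batch, rather than after each item, keeps the number of writes down. Marking `guid` as indexed in the model should also help the lookup, though how much depends on the store and data size.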


Can you modify your Core Data model?

If you can, I would add a "hash" property to each feed entry to uniquely identify it. Then you could efficiently detect whether a specific entry is already in your database or not.
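
The answer does not say how the hash should be computed, so the following is only one possible sketch. It uses 64-bit FNV-1a over whichever fields define an entry's identity; the function name `stableHash` and the choice of fields are assumptions. Swift's built-in `Hasher` is deliberately avoided because it is seeded randomly per process, so its values are not stable enough to persist.

```swift
// 64-bit FNV-1a over the UTF-8 bytes of each field.
// Assumption: the caller decides which fields identify an entry
// (for example guid, or link plus publication date).
func stableHash(_ fields: [String]) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325      // FNV offset basis
    let prime: UInt64 = 0x100000001b3          // FNV prime
    for field in fields {
        for byte in field.utf8 {
            hash ^= UInt64(byte)
            hash = hash &* prime               // wrapping multiply
        }
        // 0xFF never occurs in valid UTF-8, so it works as a field
        // separator: ["ab", "c"] and ["a", "bc"] feed different bytes.
        hash ^= 0xff
        hash = hash &* prime
    }
    return hash
}
```

Core Data's Integer 64 attribute is signed, so the value could be stored as `Int64(bitPattern: stableHash(fields))`. A 64-bit hash can still collide in principle, so comparing the guid as well when the hashes match is the safer choice.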
