Is there a way to cache the fetch output?

I'm working on a closed system running in the cloud. What I need is a search function that uses a user-typed-in regexp to filter the rows in a dataset:

phrase = re.compile(request.get("query"))
data = Entry.all().fetch(50000)  # this takes around 10s when there are 6000 records
result = [x for x in data if phrase.search(x.title)]

Now, the database itself won't change too much, and there will be no more than 200-300 searches a day.

Is there a way to somehow cache all the Entries (I expect there will be no more than 50,000 of them, each no bigger than 500 bytes), so retrieving them won't take more than 10 seconds? Or perhaps to parallelize it? I don't mind 10 CPU seconds, but I do mind the 10 seconds the user has to wait.

To address any answers like "index it and use .filter()": the query is a regexp, and I don't know of any indexing mechanism that would allow using a regexp.


You can also use cachepy or performance engine (shameless plug) to store the data on App Engine's local instances, so you get faster access to all entities without being limited by memcache boundaries or datastore latency.

Hint: a local instance gets killed if it surpasses about 185 MB of memory, so you can actually store quite a lot of data in it if you know what you're doing.
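
A minimal sketch of the same idea without a library, assuming a plain module-level cache is acceptable for your case (each instance warms its own copy, and the data stays as-is until the instance is recycled or you invalidate it yourself):

import re

# Module-level cache: it lives as long as this instance does.
# Assumption: the (title, key) pairs (about 20 MB at most) fit comfortably under the instance memory cap.
_entry_cache = None

def get_entries():
    global _entry_cache
    if _entry_cache is None:
        # The slow datastore fetch is paid once per instance, not once per request.
        _entry_cache = [(e.title, str(e.key())) for e in Entry.all()]
    return _entry_cache

def search(pattern):
    phrase = re.compile(pattern)
    return [title for (title, key) in get_entries() if phrase.search(title)]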


Since there is a bounded number of entries, you can memcache all of them and then do the filtering in memory as you've outlined. Note, however, that each memcache entry cannot exceed 1 MB, while you can fetch up to 32 MB of memcache entries in parallel.

So split the entries into subsets, memcache the subsets, and then read them back in parallel by precomputing the memcache keys; a sketch follows the link below.

More here:

http://code.google.com/appengine/docs/python/memcache/functions.html
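
A hedged sketch of that chunking approach; the chunk count, key pattern, and helper names are assumptions, and only memcache.set_multi/get_multi and the Entry model from the question are taken as given:

import re
from google.appengine.api import memcache

CHUNK_COUNT = 32          # assumption: enough chunks to keep each value well under the ~1 MB limit
CHUNK_KEY = 'entries-%d'  # hypothetical key pattern, precomputable without touching the datastore

def cache_entries():
    # Fetch everything once and store (title, key) pairs in fixed-size chunks.
    entries = [(e.title, str(e.key())) for e in Entry.all()]
    chunks = dict((CHUNK_KEY % i, entries[i::CHUNK_COUNT]) for i in range(CHUNK_COUNT))
    memcache.set_multi(chunks, time=3600)

def search(pattern):
    phrase = re.compile(pattern)
    keys = [CHUNK_KEY % i for i in range(CHUNK_COUNT)]
    cached = memcache.get_multi(keys)   # one batched call for all chunks
    if len(cached) < CHUNK_COUNT:       # some chunks were evicted: rebuild the cache
        cache_entries()
        cached = memcache.get_multi(keys)
    return [title for chunk in cached.values()
                  for (title, key) in chunk
                  if phrase.search(title)]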


Since your data is on the order of 20 MB, you may be able to load it entirely into local instance memory, which is as fast as you can get. Alternatively, you could store it as a data file alongside your app, which will be faster to read than accessing the datastore.
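
One possible shape for the data-file variant, assuming a file of JSON lines exported offline (or by a cron job) and deployed with the app; the filename and format are assumptions, not something the question specifies:

import json
import re

# Assumption: entries.json ships with the app, one {"title": ..., "key": ...} object per line.
# It is loaded once at import time, so each request only pays for the regexp scan.
with open('entries.json') as f:
    ENTRIES = [json.loads(line) for line in f]

def search(pattern):
    phrase = re.compile(pattern)
    return [e['title'] for e in ENTRIES if phrase.search(e['title'])]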
