I've been working with Google App Engine and I'm running into some slow performance with some of my data queries. I've read that designing an App Engine datastore is a different mindset from working with SQL databases and I'm not sure I'm doing this the best way. I have two questions to try to get on the right track:
Specifically:
I have a Foo开发者_如何转开发
type and a UserFoo
type. Each UserFoo
is an "instance" of a corresponding Foo
and holds data specific to that instance. My Foo
type has a fooCode
property which is a unique identifier and I map each UserFoo
with each Foo
by using their fooCode
properties. I then operate on each Foo with code like so:
foos = Foo.all().filter('bar =', bar)
for foo in foos:
userFoo = UserFoo.all().filter('userKey =', user).filter('fooCode =', foo.fooCode)
Note: I'm using fooCode
over a reference key so that we can easily delete and re-add new Foo
s and not have to then remap all the corresponding UserFoo
s.
In General:
What are typical approaches to designing GAE datastore tables and best-practices for using them?
This is the staircase of gets anti-pattern. The solution is ReferenceProperty pre-fetching.
The hitch is that you've decided not to use a ReferenceProperty. I would advise you to reconsider this choice.
Note: I'm using fooCode over a reference key so that we can easily delete and re-add new Foos and not have to then remap all the corresponding UserFoos.
Remember that an entity key is just an encoded representation of its path: the kind and name or ID of the entity and any of its ancestors. If you deleted and then re-created a Foo
, it would only have a different key if it was given a different name or ID. If you have some way of giving the old and new entity the same fooCode
, you could just as easily use the fooCode
as the key name, which would allow a deleted and then re-added Foo
to retain its original key.
In General:
Denormalize as much as possible.
Refer to entities via their Key whenever possible; it is the fastest way to get data out of the datastore.
Specifically, you could likely get a dramatic increase in your performance if you used ReferenceProperty to establish relationships rather than codes that go into filters. I am going to guess that querying for a UserFoo's Foo happens a good deal more often than removing and remapping Foo's, yeah? In that case, the pragmatic and datastore-wise thing to do is to use Reference Properties.
Also, if the Foo-UserFoo relationship can be denormalized into a single entity, you remove the need for an entire series of queries altogether.
I would suggest the following changes:
- Use a ReferenceProperty in UserFoo to refer to its Foo, or make it a child entity if appropriate. I don't understand your comment about remapping existing entities - that should never be necessary.
- Add a copy of the 'bar' property to each UserFoo
- Perform a single query, for UserFoo.all().filter('bar =', bar).order('userKey'). Results will be filtered by bar and grouped by user, and only require a single query, instead of one per user.
- Fetch the results at once by calling .fetch() on the query, rather than iterating over them. This is much more efficient.
- Use ReferenceProperty prefetching to retrieve the Foo object for each UserFoo, if you need it.
精彩评论