开发者

Modeling hierarchical location data

开发者 https://www.devze.com 2023-03-18 22:33 出处:网络
I\'ve struggled with this problem for some time now, researched dozens of places, tried multiple methods but I\'m still struggling with an elegant and efficient solution.I\'m certain some model master

I've struggled with this problem for some time now, researched dozens of places, tried multiple methods but I'm still struggling with an elegant and efficient solution. I'm certain some model mastermind can quickly get me on track.

I'm trying to model location data in using Django 1.3. Example, states have cities, cities have buildings, buildings, have rooms and rooms have devices/hardware. I have a working solution but I'm running into the (n+1) number of query problem with traversing the relationships.

In some views, I want to specify a state and then increment through all of the cities in a state, building, rooms and devices. The number of queries increase quickly with the number of buildings, rooms, devices. Currently, if I want to see all of the buildings, rooms, devices in a state, I perform a query for all cities in the state and then use the _set method to follow the foriegn keys backwards, looping city, looping buildings, looping rooms etc.

models.py

class City(models.Model):                                                                                             
    city = models.CharField(max_length=128)                                                                           
    city_code = models.CharField(max_length=6)                                                                                                          
    state = USStateField(_('state'))                                                                                  

class Address(models.Model):                                                                                          
    address1 = models.CharField(max_length=128, blank=True)                                                           
    address2 = models.CharField(max_length=128, blank=True)                                                           
    city = models.ForeignKey(City)                                                                                    
    zipcode= models.CharField(max_length=5, blank=True)                                                               
    zip_plus4=models.CharField(max_length=4, blank=True)                                                              

class Building(Address, CommonModel):                                                                                 
    building = models.CharField(max_length=128, help_text='Building Name')
    building_code = models.CharField(max_length=2 )            

class Roomcode(models.Model):                                                                                         
    roomcode = models.CharField(max_length=2, \                                                                       
    roomcode_name = models.CharField(max_length=64, help_text='Description or name for code')                         

class Room(models.Model):                                                                                             
    building = models.ForeignKey(Building)                                                                            
    name = models.CharField(max_length=128)                                                                           
    roomcode = models.ForeignKey(Roomcode)                                                                            

class Device(models.Model):                                                                                           
    name = models.CharField(max_length=64, unique=True)   
    description = models.CharField(max_length=64, blank=True)                      
    room = models.ForeignKey(Room)   

Currently to get all City, Building, Room, Device of a state I do the following,

views.py

@render_to('place/index.html')                                                                                        
def index(request):                                                                                                   
    sites = City.objects.distinct().order_by('state')                                                

    return { 'sites': sites, }

index.html, something like...

{% for site in sites %}

    {% for building in site.building_set.all %}
        print building details
        {% for room in building.room_set.all %}       
            print room details
            {% for device in room.device_set.all %} 
                print device details

Lots of loops and lots of queries!!

Going the other direction is simple, if I want to get all devices and then have their location information I can query using select_related

devices = Device.objects.select_related().all()

Questions:

  1. What can I do to better model this data?
  2. How can I effeciently traverse the relationships with having an (n+1) number of queries?
  3. MPTT - trees?
  4. Generic relationships?
  5. Raw SQL?
  6. Something else?

update

The main problem I'm trying to solve is the linear increase of sql queries vs the number of items in the database. For instance, I currently have a small amount of devices in a databse, 32 devices, approx 20 cities, 30 buildings, and 35 rooms. Querying all devices using select_related uses only 4 queries and takes about 300 ms sql time, since it can follow foreign keys.

Going the other direction, displaying all Sta开发者_如何转开发tes, Cities, Buildings, Rooms and Devices, ordered by state requires 76 queries and takes 3231 ms sql time. Adding one room, adds one query, a couple devices in dozens of rooms will bring my page to a stand still.

select_related follows foreign keys and works great if I need to know which city/building/room that a device is in.

device = Device.objects.get(id=1)
room = device.room    # hits database
buidling = device.room.building # hits database again.

device = Device.objects.select_related().get(id=1)
room = device.room     # doesn't hit
building = device.building      # doesn't hit

The problem occurs the other direction, as stated in my index.html example above, after retrieving the city, I have to loop through each respective, building, room and device to see all devices, hitting the database for every city, building, room, device. Benchmarking shows the number of sql queries getting out of hand very quickly and greatly increase page load time.


Since your objects are very hierarchical, MPTT and Generic Relationships doesn't make sense. You wouldn't put a Building into a Room, for example. Some of these relationships look permanent, some do not, i.e. a room never leaves a building, but a device may. I don't see a problem with your current modelling scheme.

Since you're concerned with the number of queries, take a look at the select_related feature of querysets. That lets you get deep foreign relationships in a single SQL call, and cache them for the duration of a request cycle. That may alleviate the weight of database transactions.

But, simple question for you: have you benchmarked the number of queries? Is it really that deadly to the performance of your application?

On the other hand, you've impressed a lot of individuality into a collection that really describes little more than: [name, description, parent]. If you re-wrote your classes to reflect that structure, a subclass of MPTT could handle the relationships correctly (never letting a building be in a room, for example), and you could model the entire thing in a single tree. You could easily design a generic "Details and attributes" model to then contain the extra items needed by Cities and Addresses and the like.

0

精彩评论

暂无评论...
验证码 换一张
取 消