Why is my Django bulk database population so slow and frequently failing?

Developer https://www.devze.com 2023-02-04 12:27 Source: web
I decided I'd like to use Django's model system rather than coding raw SQL to interface with my database, but I've run into a problem that is surely avoidable.

My models.py contains:

from django.db import models

class Student(models.Model):
    student_id = models.IntegerField(unique=True)
    form = models.CharField(max_length=10)
    preferred = models.CharField(max_length=70)
    surname = models.CharField(max_length=70)

and I'm populating it by looping through a list as follows:

from models import Student

# Each save() issues its own INSERT and its own commit.
for id, frm, pref, sname in large_list_of_data:
    s = Student(student_id=id, form=frm, preferred=pref, surname=sname)
    s.save()

I don't really want to be saving this to the database each time, but I don't know another way to get Django to not forget about it (I'd rather add all the rows and then do a single commit).

There are two problems with the code as it stands.

  1. It's slow -- about 20 students get updated each second.

  2. It doesn't even make it through large_list_of_data, instead throwing a DatabaseError saying "unable to open database file". (Possibly because I'm using sqlite3.)

My question is: how can I stop these two things from happening? I'm guessing that the root of both problems is the s.save() call inside the loop, but I don't see an easy way to batch the students up and then save them to the database in one commit.


So it seems I should have looked harder before posing the question.

Some solutions are described in this Stack Overflow question (the winning answer is to use django.db.transaction.commit_manually) and also in this one on aggregating saves.

Other ideas for speeding up this type of operation are listed in this Stack Overflow question.
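
If the slowdown does come from committing after every row, the effect can also be seen without the ORM. The sketch below is an illustration using Python's standard sqlite3 module, and not necessarily one of the ideas from the linked question. The table name student, its schema, the sample rows, and the insert_students helper are hypothetical.

```python
import sqlite3


def insert_students(conn, rows):
    """Insert all rows inside one transaction and commit once at the end."""
    # Using the connection as a context manager commits on success and
    # rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO student (student_id, form, preferred, surname) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, "
    "student_id INTEGER UNIQUE, "
    "form TEXT, preferred TEXT, surname TEXT)"
)

# Made-up sample data in the same shape as large_list_of_data.
rows = [(i, "10A", "Pref%d" % i, "Surname%d" % i) for i in range(1000)]
insert_students(conn, rows)
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Because the whole batch shares one transaction, a failure partway through (for example a duplicate student_id) leaves the table as it was before the call.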
