This loop checks if a record is in the sqlite database and builds a list of dictionaries for those records that are missing and then executes a multiple insert statement with the list. This works but it is very slow (at least i think it is slow) as it takes 5 minutes to loop over 3500 queries. I am a complete newbie in python, sqlite and sqlalchemy so I wonder if there is a faster way of doing this.
list_dict = []
session = Session()
for data in data_list:
if session.query(Class_object).filter(Class_object.column_name_01 == data[2]).filter(Class_object.column_name_00 == an_id).count() == 0:
list_dict.append({'column_name_00':a_id,
开发者_开发问答 'column_name_01':data[2]})
conn = engine.connect()
conn.execute(prices.insert(),list_dict)
conn.close()
session.close()
edit: I moved session = Session()
outside the loop. Did not make a difference.
SOLUTION:
thanks to mcabral answer I modified the code as:
existing_record_list = []
list_dict = []
conn = engine.connect()
s = select([prices.c.column_name_01], prices.c.column_name_00==a_id)
result = conn.execute(s)
for row in result:
existing_record_list.append(row[0])
for data in raw_data['data']:
if data[2] not in existing_record_list:
list_dict.append({'column_name_00':a_id,
'column_name_01':data[2]}
conn = engine.connect()
conn.execute(prices.insert(),list_dict)
conn.close()
This now takes 6 seconds. That is some improvement!!
3500 queries seems a big number,
Have you consider fetching all entities in one query? Then you will be iterating over a list in memory, and not querying the database for each item.
Glad you found something that works, as an extra 2 cents:
I agree with mcabral. As a general rule, if you are putting a query inside of a loop, you are asking for trouble. Popular SQL DBs are generally optimized for data acquisition. Looping through a query generally indicates that you are procedurally doing something that should/could be done with either a single query or a string a queries that put data into each other.
There are exceptions to this, but from my experience they are usually few and far between... Every time I ran a query through a loop, I regretted it later.
精彩评论