Due to the nature of my application, I need to support fast inserts of large volumes of data into the database. Using executemany() increases performance, but there's a caveat. For example, MySQL has a configuration parameter called max_allowed_packet, and if the total size of my insert queries exceeds its value, MySQL throws an error.
Question #1: Is there a way to tell SQLAlchemy to split the packet into several smaller ones?
Question #2: If other RDBMSs have similar constraints, how should I work around them as well?

P.S. I had posted this question earlier but deleted it when I wrongly assumed that I would likely not encounter this problem after all. Sadly, that's not the case.

I had a similar problem recently and used this not-very-elegant workaround:
- First I parsed my.cnf for a value of max_allowed_packet; if I can't find it, the maximum is set to a default value.
- All data items are stored in a list.
- Next, for each data item I count the approximate byte length (for strings, it's the length of the string in bytes; for other data types I take the maximum bytes used, to be safe).
- I add these sizes up, committing after I have reached approximately 75% of max_allowed_packet (since the SQL query text will take up space as well, just to be on the safe side).
This approach is not really beautiful, but it worked flawlessly for me.
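A minimal sketch of the batching logic described above. The size estimates and the 75% safety factor are assumptions taken from the workaround; the row format (dicts of column values) and the 8-byte upper bound for non-string values are illustrative, not part of any library API:

```python
def approx_row_size(row):
    """Rough byte estimate for one row: UTF-8 length for strings,
    a fixed safe maximum (assumed 8 bytes here) for other values."""
    total = 0
    for value in row.values():
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        else:
            total += 8  # assumed upper bound for numbers, dates, etc.
    return total

def batches_by_size(rows, max_allowed_packet, safety=0.75):
    """Yield lists of rows whose combined estimated size stays under
    safety * max_allowed_packet, leaving headroom for the SQL text."""
    budget = int(max_allowed_packet * safety)
    batch, size = [], 0
    for row in rows:
        row_size = approx_row_size(row)
        # Flush the current batch before it would exceed the budget.
        if batch and size + row_size > budget:
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_size
    if batch:
        yield batch
```

Each yielded batch can then be passed to a single executemany()-style call, e.g. `conn.execute(table.insert(), batch)` in SQLAlchemy, followed by a commit. Instead of parsing my.cnf, the limit can also be read from the server itself with `SHOW VARIABLES LIKE 'max_allowed_packet'`.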