How to iterate through a MySQL table with Python?

Developer https://www.devze.com 2023-01-04 20:11 Source: web
I have a Python script which uses the MySQLdb interface to load various CSV files into MySQL tables.

In my code, I use Python's standard CSV library to read the CSV, then I insert each row into the table one at a time, using an INSERT query. I do this rather than using LOAD DATA so that I can convert null values and do other minor clean-ups on a per-field basis.
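
For illustration, the load step might look roughly like the sketch below. The table name `readings` and the helpers `clean_field` and `load_rows` are hypothetical, and `cursor` is assumed to be any DB-API cursor that uses `%s` placeholders, as MySQLdb's does.

```python
import csv

COLUMNS = ("id_number", "iteration", "date", "value")

def clean_field(raw):
    """Map empty or 'NULL' CSV cells to None; strip whitespace otherwise."""
    text = raw.strip()
    if text == "" or text.upper() == "NULL":
        return None
    return text

def load_rows(cursor, csv_lines, table="readings"):
    """Insert each cleaned CSV row through a DB-API cursor; return the row count."""
    # `table` is a hypothetical name. It is concatenated into the SQL text,
    # so it must come from trusted code, never from user input.
    sql = (
        "INSERT INTO " + table + " (id_number, iteration, date, value) "
        "VALUES (%s, %s, %s, %s)"
    )
    count = 0
    for row in csv.DictReader(csv_lines):
        params = tuple(clean_field(row[col]) for col in COLUMNS)
        cursor.execute(sql, params)  # a None parameter is sent as SQL NULL
        count += 1
    return count
```

Passing the values as parameters rather than formatting them into the query string leaves quoting and NULL handling to the driver.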

Example table format:

`id_number` | `iteration` | `date`     | `value`
102         | 1           | 2010-01-01 | 63
102         | 2           | 2010-01-02 | NULL
102         | 3           | 2010-01-03 | 65

The null value in the second iteration of id_number = 102 represents a case where value hasn't changed from the previous day, i.e. value remains 63.

Basically, I need to convert these null values to their correct values. I can imagine 4 ways of doing this:

  1. Once everything is inserted into the table, run a MySQL query that does the iterating and replacing all by itself.

  2. Once everything is inserted into the table, run a MySQL query to send some data back to Python, process in Python then run a MySQL query to update the correct values.

  3. Do the processing in Python on a per-field basis before each insert.

  4. Insert into a temporary table and use SQL to insert into the main table.

I could probably work out how to do #2, and maybe #3, but have no idea how to do #1 or #4, which I think are the best methods, as they would require no fundamental changes to the Python code.

My question is A) which of the above methods is "best" and "cleanest"? (Speed not really an issue.) and B) how would I achieve #1 or #4?

Thanks in advance :)


I think you would have the most control and the least amount of work with your #3 option. Especially if you want to keep existing values over null values, I think you risk overwriting those with #1.
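
A minimal sketch of option #3, assuming each CSV row has already been parsed into a dict and that rows for a given id_number arrive in iteration order. The function name `fill_forward` is made up for this example.

```python
def fill_forward(rows):
    """Replace a missing `value` with the last value seen for the same id_number.

    `rows` is an iterable of dicts; rows sharing an id_number are assumed
    to arrive in iteration order.
    """
    last_seen = {}  # id_number -> most recent non-null value
    for row in rows:
        key = row["id_number"]
        if row["value"] is None:
            # A leading NULL with no earlier value stays None.
            row = dict(row, value=last_seen.get(key))
        else:
            last_seen[key] = row["value"]
        yield row


rows = [
    {"id_number": 102, "iteration": 1, "date": "2010-01-01", "value": 63},
    {"id_number": 102, "iteration": 2, "date": "2010-01-02", "value": None},
    {"id_number": 102, "iteration": 3, "date": "2010-01-03", "value": 65},
]
filled = list(fill_forward(rows))
print(filled[1]["value"])  # 63
```

Because it is a generator, it can sit between the CSV reader and the INSERT loop without holding the whole file in memory.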

If speed is not an issue, for every record in your CSV, compare it to the existing record, and update or insert your record with your preferred values.
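
The compare-then-write step could be sketched as follows. This uses the standard library's `sqlite3` purely as a stand-in so the example runs without a server; with MySQLdb the placeholders would be `%s` instead of `?`. The table name `readings` and the function `upsert_keep_existing` are assumptions for illustration.

```python
import sqlite3

def upsert_keep_existing(conn, row):
    """Insert `row`, or update it without ever replacing a stored value with NULL."""
    id_number, iteration, date, value = row
    existing = conn.execute(
        "SELECT value FROM readings WHERE id_number = ? AND iteration = ?",
        (id_number, iteration),
    ).fetchone()
    if existing is None:
        conn.execute(
            "INSERT INTO readings (id_number, iteration, date, value) "
            "VALUES (?, ?, ?, ?)",
            row,
        )
    else:
        # Prefer the incoming value unless it is NULL; then keep what is stored.
        preferred = value if value is not None else existing[0]
        conn.execute(
            "UPDATE readings SET date = ?, value = ? "
            "WHERE id_number = ? AND iteration = ?",
            (date, preferred, id_number, iteration),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings "
    "(id_number INTEGER, iteration INTEGER, date TEXT, value INTEGER)"
)
upsert_keep_existing(conn, (102, 2, "2010-01-02", 63))
upsert_keep_existing(conn, (102, 2, "2010-01-02", None))  # NULL does not overwrite 63
print(conn.execute("SELECT value FROM readings").fetchall())  # [(63,)]
```

This costs one extra SELECT per row, which is acceptable here since speed is not a concern.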