I have two tables, 'user' and 'account', with a 1:1 relationship. account has a foreign key user_id referencing user, and account_number is unique.
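For reference, a minimal sketch of the schema as described; trans_rate and user_name are placeholders for whatever the real columns are:

    CREATE TABLE user (
        user_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
        user_name VARCHAR(64)  NOT NULL,
        PRIMARY KEY (user_id),
        UNIQUE KEY uq_user_name (user_name)  -- assumed candidate key for users
    );

    CREATE TABLE account (
        account_id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
        account_number VARCHAR(32)  NOT NULL,
        user_id        INT UNSIGNED NOT NULL,
        trans_rate     DECIMAL(10,4),
        PRIMARY KEY (account_id),
        UNIQUE KEY uq_account_number (account_number),
        FOREIGN KEY (user_id) REFERENCES user (user_id)
    );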
I have to take a CSV upload with up to 50k rows and for each row either:
- insert the data as a new row (into both account and user) if account.account_number is new
- update the existing account row if account.account_number already exists
- remove the account and user rows if an existing account.account_number doesn't appear in the CSV.
So far I'm dealing with the account table, as this has the account.account_number column to check for uniqueness, but inserting a new row into this table relies on having the user_id from a new row already added to the user table.
    INSERT INTO account (account_number, trans_rate, ...)
    VALUES (...), (...), (...)
    ON DUPLICATE KEY UPDATE ...
So that deals with the account table, doing updates / inserts as required. But of course, the user_id column is left NULL as the user records haven't been created.
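For illustration, a minimal sketch of that statement with concrete values, where trans_rate stands in for the real columns; this relies on account_number having a UNIQUE index so the duplicate-key clause matches on it:

    INSERT INTO account (account_number, trans_rate)
    VALUES ('ACC-001', 1.25),
           ('ACC-002', 2.50)
    ON DUPLICATE KEY UPDATE
        trans_rate = VALUES(trans_rate);  -- existing rows get the new rate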
I'm wondering if I can embed the insert statement for a new user row within the above statement, but I don't know where to start, and a bit of googling hasn't led me to believe it's actually possible.
Or any other ideas?
I can see how to do this with several queries per row, but I'm concerned about how long the data import will take if I run so many separate queries.
Upload the data into a staging table, then perform two upserts: first into the user table, and second into the account table. The query for the second joins the staging table to the user table on e.g. user name or SSN (whatever the candidate key for users is), as in the sketch below.
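A minimal sketch of that flow, assuming MySQL, a CSV at /tmp/upload.csv, and user_name as the candidate key; every name beyond user_id and account_number is a placeholder for the real schema:

    -- 1. Load the CSV into a staging table whose columns mirror the file.
    CREATE TEMPORARY TABLE staging (
        account_number VARCHAR(32) NOT NULL,
        user_name      VARCHAR(64) NOT NULL,
        trans_rate     DECIMAL(10,4),
        KEY idx_acct (account_number)
    );

    LOAD DATA LOCAL INFILE '/tmp/upload.csv'
    INTO TABLE staging
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
    (account_number, user_name, trans_rate);

    -- 2. Upsert users (needs the UNIQUE index on user.user_name;
    --    the update clause is a deliberate no-op for existing users).
    INSERT INTO user (user_name)
    SELECT DISTINCT user_name FROM staging
    ON DUPLICATE KEY UPDATE user_name = user.user_name;

    -- 3. Upsert accounts, joining staging to user to resolve user_id.
    INSERT INTO account (account_number, user_id, trans_rate)
    SELECT s.account_number, u.user_id, s.trans_rate
    FROM staging s
    JOIN user u ON u.user_name = s.user_name
    ON DUPLICATE KEY UPDATE
        user_id    = VALUES(user_id),
        trans_rate = VALUES(trans_rate);

    -- 4. Remove accounts absent from the CSV, then their now-orphaned
    --    users (assumes a strict 1:1, i.e. every user owns an account).
    DELETE a
    FROM account a
    LEFT JOIN staging s ON s.account_number = a.account_number
    WHERE s.account_number IS NULL;

    DELETE u
    FROM user u
    LEFT JOIN account a ON a.user_id = u.user_id
    WHERE a.user_id IS NULL;

Run inside a single transaction, this keeps the whole 50k-row import to a handful of statements instead of several per row.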