UPDATE: added an example to clarify the format of the data.
Considering a CSV with each line formatted like this:
tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,[tbl2.col1:tbl2.col2]+
where [tbl2.col1:tbl2.col2]+ means that there could be any number of these pairs repeated
ex:
tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2
The tables would relate to eachother using the line number as a key which would have to be 开发者_如何学运维created in addition to any columns mentioned above.
- Is there a way to use
mysql load data infile
to load the data into two separate tables? - If not, what Unix command line tools would be best suited for this?
no, not directly. load data can only insert into one table or partitioned table.
what you can do is load the data into a staging table, then use insert into
to select the individual columns into the 2 final tables. you may also need substring_index
if you're using different delimiters for tbl2's values. the line number is handled by an auto incrementing column in the staging table (the easiest way is to make the auto column last in the staging table definition).
the format is not exactly clear, and is best done w/perl/php/python, but if you really want to use shell tools:
cut -d , -f 1-5 file | awk -F, '{print NR "," $0}' > table1
cut -d , -f 6- file | sed 's,\:,\,,g' | \
awk -F, '{i=1; while (i<=NF) {print NR "," $(i) "," $(i+1); i+=2;}}' > table2
this creates table1 and table 2 files with these contents:
1,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
2,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
3,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
and
1,tbl2.col1,tbl2.col2
1,tbl2.col1,tbl2.col2
2,tbl2.col1,tbl2.col2
2,tbl2.col1,tbl2.col2
3,tbl2.col1,tbl2.col2
3,tbl2.col1,tbl2.col2
As you say, the problematic part is the unknown number of [tbl2.col1:tbl2.col2] pairs declared in each line. I would tempted to solve this through sed
: split the one file into two files, one for each table. Then you can use load data infile
to load each file into its corresponding table.
精彩评论