I've just downloaded a bunch of text files from data.gov, and there are fields in the text file that I really don't need.
Is there a way to import columns [1,3] and leave the rest?
I figure I'll import using 'load data in file', but didn't see anything on the mysql page as to how to only import certain columns. http://dev.mysql.com/doc/refman/5.0/en/load-data.html
The fields are deli开发者_开发问答mited by ^. Just so I'm clear, if a line in the txt file is
00111^first column entry^second column entry^this would be the 3rd column
I am trying to get my mysql table to contain
first column entry | this would be the 3rd column
You can import the specific columns with:
LOAD DATA LOCAL INFILE 'yourFile' INTO TABLE table_name
FIELDS TERMINATED BY '^' (column1, @dummy, column3, @dummy);
Put all columns which you don't need in @dummy.
You could always create a table with a dummy column(s) which you drop after loading the file (assuming you don't have to load the file very often).
Something like this:
LOAD DATA LOCAL INFILE '/path/to/file' INTO TABLE table_name
FIELDS TERMINATED BY '^' (dummy_column1, column1, dummy_column2, column2);
ALTER TABLE table_name DROP dummy_column1;
ALTER TABLE table_name DROP dummy_column2;
Assuming a Unix platform, you could filter the fields upstream.
cut -d^ -f2,4 mygovfile.dat > mytable.txt
To filter the first and third column, then import using your preferred method. For instance
mysqlimport --local -uxxx -pyyy mydb --fields-terminated-by="^" mytable.txt ....
The two most common ways of dealing with this:
- Import the data just as it is into a staging table, move what you need into your "real" tables, then truncate the staging table.
- Use a text utility to snip out just what you need.
My text utility of choice is awk. A minimal awk script--which probably won't work for you without some tweaking--would look like this.
$ awk 'BEGIN { FS="^";OFS=",";}{print $2, $4}' test.dat
first column entry,this would be the 3rd column
What kind of tweaking? It usually involves taking care of embedded commas, single quotes, and double quotes.
This part
BEGIN { FS="^";OFS=",";}{print $2, $4}
is the whole awk program.
awk rocks.
精彩评论