I have a problem importing a CSV file. I am using PostgreSQL's COPY FROM command to copy a CSV file into a two-column table.
The CSV file has the following format:
"1";"A"
"2";"B"
"3";"C";开发者_运维知识库"CAD450"
"4";"D";"ABX123"
I want to import all of these lines into the table, but skip any extra columns.
Currently I am skipping any lines that contain extra columns; for example, the lines "3";"C";"CAD450" and "4";"D";"ABX123" are skipped, and only the two-column lines are imported. But I want to copy all four lines into my table. Is there a way to ignore the extra column and import all four lines, like this:
"1";"A"
"1";"B"
"1";"C"
"1";"D"
Preprocess the file with awk to strip the extra columns:
awk -F';' '{print $1 ";" $2}' input.csv > new_file.csv
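Then load the stripped file; using the csv format makes COPY drop the surrounding double quotes (table and column names are just examples):

\copy mytable (col1, col2) FROM 'new_file.csv' WITH (FORMAT csv, DELIMITER ';')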
Piping it through cut or awk (as suggested above) is easier than using Python/psycopg.
cat csv_file.csv | cut -d';' -f1,2 | psql -U USER DATABASE -c "COPY table FROM STDIN WITH (FORMAT csv, DELIMITER ';');"
with open("file.csv","r") as f:
t=[line.strip().split(";")[:2] for line in f]
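If you do want to stay in Python, one way to finish that snippet is to rebuild a two-column buffer and feed it to COPY via psycopg2; a rough sketch, with the connection settings and table name as placeholders:

import io
import psycopg2

with open("file.csv", "r") as f:
    # keep only the first two fields of every row (the double quotes stay in place)
    t = [line.strip().split(";")[:2] for line in f]

buf = io.StringIO("\n".join(";".join(row) for row in t))

cxn = psycopg2.connect("host=127.0.0.1 dbname=mydb user=me password=pw")  # placeholder
cur = cxn.cursor()
# copy_expert lets us pass CSV options, so the quotes are stripped on load
cur.copy_expert("COPY mytable (col1, col2) FROM STDIN WITH (FORMAT csv, DELIMITER ';')", buf)
cxn.commit()
cur.close()
cxn.close()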
Myriad ways to handle the problem.
I'd probably do something like this:
import csv
import psycopg2

# name only the columns we care about; any extra fields in a row are collected
# under the None key and simply ignored by the INSERT below
dr = csv.DictReader(open('test.csv', newline=''),
                    delimiter=';',
                    quotechar='"',
                    fieldnames=['col1', 'col2'])

CONNSTR = """
host=127.0.0.1
dbname=mydb
user=me
password=pw
port=5432"""

cxn = psycopg2.connect(CONNSTR)
cur = cxn.cursor()

cur.execute("""CREATE TABLE from_csv (
    id serial NOT NULL,
    col1 character varying,
    col2 character varying,
    CONSTRAINT from_csv_pkey PRIMARY KEY (id));""")

# DictReader rows are dicts, so they can be passed straight to executemany
cur.executemany("""INSERT INTO from_csv (col1, col2)
    VALUES (%(col1)s, %(col2)s);""", dr)

cxn.commit()
cur.close()
cxn.close()