
How to skip pre header lines with csv.DictReader?

Developer https://www.devze.com 2023-04-08 21:29 Source: Internet
I want csv.DictReader to deduce the field names from the file. The docs say "If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the fieldnames.", but in my case the first row contains a title and it is the second row that contains the names.

I can't apply next(reader) as suggested in "Python 3.2 skip a line in csv.DictReader", because the field names are assigned when the reader is initialized (or I'm doing it wrong).
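A small sketch of why next(reader) does not help, written for Python 3 with made-up in-memory data (the sample string and variable names are illustrative, not taken from the real file). In CPython, DictReader appears to read the field names on first use, so the first next(reader) call consumes the title line as the header and returns the real header row as data:

```python
import csv
import io

# Illustrative data: a title line, then the real header, then one record.
sample = "Report title\nname,age\nAnn,30\n"

reader = csv.DictReader(io.StringIO(sample))

# The title line is taken as the header here; the real header row comes back as data.
first = next(reader)

print(reader.fieldnames)
print(first)
```

Either way, the title line ends up as the field names, which matches the result shown further down.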

The csvfile (exported from Excel 2010, original source):

CanVec v1.1.0,,,,,,,,,^M
Entity,Attributes combination,"Specification Code
Point","Specification Code
Line","Specification Code
Area",Generic Code,Theme,"GML - Entity name
Shape - File name
Point","GML - Entity name
Shape - File name
Line","GML - Entity name
Shape - File name
Area"^M
Amusement park,Amusement park,,,2260012,2260009,LX,,,LX_2260009_2^M
Auto wrecker,Auto wrecker,,,2360012,2360009,IC,,,IC_2360009_2^M

My code:

f = open(entities_table,'rb')
try:
    dialect = csv.Sniffer().sniff(f.read(1024))
    f.seek(0)

    reader = csv.DictReader(f, dialect=dialect)
    print 'I think the field names are:\n%s\n' % (reader.fieldnames)

    i = 0
    for row in reader:
        if i < 20:
            print row
            i = i + 1

finally:
    f.close()

Current results:

I think the field names are:
['CanVec v1.1.0', '', '', '', '', '', '', '', '', '']

Desired result:

I think the field names are:
['Entity','Attributes combination','"Specification Code Point"',...snip]

I realize it would be expedient to simply delete the first row and carry on, but I'm trying to get as close to just reading the data in situ as I can and minimize manual intervention.


After f.seek(0), insert:

next(f)

to advance the file pointer to the second line before initializing the DictReader.
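For illustration, here is a minimal Python 3 sketch of the same idea. It assumes a shortened in-memory stand-in for the exported file (the `sample` string is made up from the first few columns) and leaves out the Sniffer step to stay short:

```python
import csv
import io

# Shortened stand-in for the exported file: a title line, then the header row
# (one quoted name contains a line break), then one record.
sample = (
    'CanVec v1.1.0,,,\r\n'
    'Entity,Attributes combination,"Specification Code\nPoint",Theme\r\n'
    'Amusement park,Amusement park,,LX\r\n'
)

f = io.StringIO(sample, newline="")  # with a real file: open(path, newline="")
next(f)                              # discard the title line
reader = csv.DictReader(f)           # field names now come from the second line
rows = list(reader)

print(reader.fieldnames)
print(rows[0]["Theme"])
```

Because only one physical line is skipped, the quoted header names that span several lines are still parsed by the csv module as single fields.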


I used islice from itertools. My header was on the last line of a long preamble, so I skipped the preamble and used the header line for the fieldnames:

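import csv
from itertools import islice

# Skip the preamble: count lines up to and including the header line
n = 0
with open(file, "r") as f:
    for line in f:
        n += 1
        if 'same_field_name' in line:  # line with field names was found
            h = line.rstrip('\r\n').split(',')
            break

# Reopen the file and start reading at the first data row
with open(file, "r") as f:
    reader = csv.DictReader(islice(f, n, None), fieldnames=h)
    for row in reader:
        print(row)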
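A possible single-pass variation, sketched here under assumptions: the helper name `dict_reader_from_marker` and the sample data are hypothetical, and this is not the code from the answer above. Instead of splitting the header by hand, it puts the header line back in front of the remaining lines with itertools.chain and lets DictReader deduce the field names itself, so quoted names are parsed by the csv module:

```python
import csv
import io
from itertools import chain


def dict_reader_from_marker(f, marker):
    """Return a DictReader that starts at the first line containing `marker`."""
    for line in f:
        if marker in line:
            # Put the header line back in front of the lines not yet read.
            return csv.DictReader(chain([line], f))
    raise ValueError("no line containing %r was found" % marker)


# Made-up data: two preamble lines, a header line, one record.
sample = "some preamble\nmore preamble\nEntity,Theme\nAmusement park,LX\n"

reader = dict_reader_from_marker(io.StringIO(sample), "Entity")
rows = list(reader)

print(reader.fieldnames)
print(rows)
```

This avoids opening the file a second time, at the cost of needing a marker string that appears in the header line and nowhere in the preamble.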