How can I store data in a Python dictionary when the headings are in a mixed-up order?

Developer https://www.devze.com 2023-03-10 23:31 Source: Internet
I'd like to store the following data in a data dictionary so that I can easily export it to a CSV file. The problem is that the columns for each school id are not always in the same order:

text = """
school id= 28392
name|year|degree|age|race
Susan A. Smith|2007|PhD|27|white
Fred Collins|2006|PhD|26|hispanic
Amber Real|2007|MBA|28|white
Mike Lee|2003|PhD|27|white

school id= 273533123
name|year|age|race|degree
John B. Black|2003|27|hispanic|MBA
Steven Smith|2005|28|black|PhD
Jacob Waters|2003|25|hispanic|MBA

school id = 3452332
name|year|race|age|degree
Peter Hintze|2002|white|27|Bachelors
Ann Graden|2004|black|25|MBA
Bryan Stewart|2004|white|28|PhD
"""

I'd like to be able to eventually output all data to a CSV file with the following headings:

school id|year|name|age|race|degree

Can I do this in Python?


This actually seems pretty easy. Parse the text into a data structure keyed by school id, then export it to a CSV file.

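school = None   # current school id, set by each "school id" line
headers = None  # column names for the current school's block
data = {}       # maps school id -> list of row dicts
for line in text.splitlines():
    if line.startswith("school id"):
        # A new block starts: remember the id and wait for its header line.
        school = line.split('=')[1].strip()
        headers = None
        continue
    if school is not None and headers is None:
        # The first line after "school id" names the columns in this block's order.
        headers = line.split('|')
        continue

    if school is not None and headers is not None and line:
        # Pair each value with its header, so the column order no longer matters.
        datum = dict(zip(headers, line.split('|')))
        data.setdefault(school, []).append(datum)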

In [29]: data
Out[29]: 
{'273533123': [{'age': '27',
                'degree': 'MBA',
                'name': 'John B. Black',
                'race': 'hispanic',
                'year': '2003'},
               {'age': '28',
                'degree': 'PhD',
                'name': 'Steven Smith',
                'race': 'black',
                'year': '2005'},
               {'age': '25',
                'degree': 'MBA',
                'name': 'Jacob Waters',
                'race': 'hispanic',
                'year': '2003'}],
 '28392': [{'age': '27',
            'degree': 'PhD',
            'name': 'Susan A. Smith',
            'race': 'white',
            'year': '2007'},
           {'age': '26',
            'degree': 'PhD',
            'name': 'Fred Collins',
            'race': 'hispanic',
            'year': '2006'},
           {'age': '28',
            'degree': 'MBA',
            'name': 'Amber Real',
            'race': 'white',
            'year': '2007'},
           {'age': '27',
            'degree': 'PhD',
            'name': 'Mike Lee',
            'race': 'white',
            'year': '2003'}],
 '3452332': [{'age': '27',
              'degree': 'Bachelors',
              'name': 'Peter Hintze',
              'race': 'white',
              'year': '2002'},
             {'age': '25',
              'degree': 'MBA',
              'name': 'Ann Graden',
              'race': 'black',
              'year': '2004'},
             {'age': '28',
              'degree': 'PhD',
              'name': 'Bryan Stewart',
              'race': 'white',
              'year': '2004'}]}    
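
The export step is not shown above, so here is a minimal sketch of one way to do it with the standard library's csv.DictWriter, using the column order from the question. The function name write_csv and the two-row sample are only illustrative assumptions. The sample has the same shape as the data dictionary built above. Because each row is a dict, DictWriter places every value under the right heading regardless of the order in which the block listed its columns.

```python
import csv
import io

# Column order requested in the question.
FIELDNAMES = ["school id", "year", "name", "age", "race", "degree"]


def write_csv(data, out):
    """Write a {school_id: [row_dict, ...]} mapping to a file-like object.

    write_csv is a hypothetical helper name, not part of the original answer.
    """
    writer = csv.DictWriter(
        out, fieldnames=FIELDNAMES, delimiter="|", lineterminator="\n"
    )
    writer.writeheader()
    for school, rows in data.items():
        for row in rows:
            # Add the school id to each row; DictWriter orders the values
            # according to FIELDNAMES, not according to the dict's own order.
            writer.writerow({"school id": school, **row})


# Small illustrative sample in the same shape as the parsed result.
sample = {
    "28392": [
        {"name": "Susan A. Smith", "year": "2007", "degree": "PhD",
         "age": "27", "race": "white"},
    ],
    "273533123": [
        {"name": "John B. Black", "year": "2003", "age": "27",
         "race": "hispanic", "degree": "MBA"},
    ],
}

buffer = io.StringIO()
write_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with newline="" (for example open("schools.csv", "w", newline=""), where the file name is just a placeholder) and call write_csv(data, f) with the full dictionary.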