I have a file in tab-delimited format with trailing newline characters, e.g.,
123 abc
456 def
789 ghi
I wish to write a function to convert the contents of the file into a nested list. To date I have tried:
def ls_platform_ann():
    keyword = []
    for line in open( "file", "r" ).readlines():
        for value in line.split():
            keyword.append(value)
and
def nested_list_input():
    nested_list = []
    for line in open("file", "r").readlines():
        for entry in line.strip().split():
            nested_list.append(entry)
            print nested_list
The former creates a nested list but includes \n and \t characters. The latter does not make a nested list but rather lots of equivalent lists without \n and \t characters.
Can anyone help?
Regards, S ;-)
You want the csv module.
import csv
source = "123\tabc\n456\tdef\n789\tghi"
lines = source.split("\n")
reader = csv.reader(lines, delimiter='\t')
print(list(reader))  # each row is already a list of strings, so no extra comprehension is needed
Output:
[['123', 'abc'], ['456', 'def'], ['789', 'ghi']]
In the code above I've put the content of the file right in there for easy testing. If you're reading from a file on disk you can do this as well (which might be considered cleaner):
import csv
reader = csv.reader(open("source.csv"), delimiter='\t')
print(list(reader))  # each row is already a list of strings
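For reference, the same idea can be wrapped in a function that closes the file when it is done. This is only a sketch for Python 3: the name read_tsv is made up for illustration, and passing newline="" to open() follows the csv module's recommendation for files handed to csv.reader.

```python
import csv


def read_tsv(path):
    """Return the rows of a tab-delimited file as a list of lists of strings.

    read_tsv is a hypothetical helper name, not part of the csv module.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

Calling read_tsv("source.csv") on the sample data from the question should give one inner list per line of the file.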
Another option that doesn't involve the csv module is:
data = [[item.strip() for item in line.rstrip('\r\n').split('\t')] for line in open('input.txt')]
As a multiple line statement it would look like this:
data = []
for line in open('input.txt'):
    items = line.rstrip('\r\n').split('\t')   # strip new-line characters and split on column delimiter
    items = [item.strip() for item in items]  # strip extra whitespace off data items
    data.append(items)
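The key difference from the attempts in the question is that each line gets its own inner list, which is appended once per line, rather than every value being appended to one shared list. A minimal Python 3 sketch of that loop as a reusable function follows; the name rows_from_lines and the choice to skip empty lines are assumptions added for illustration.

```python
def rows_from_lines(lines, delimiter="\t"):
    """Build a nested list from an iterable of delimited text lines.

    rows_from_lines is a hypothetical helper; skipping empty lines is an
    assumption, made so a trailing blank line does not produce [''].
    """
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")  # drop the line ending only
        if not line:
            continue  # ignore completely empty lines
        # One inner list per line, appended once per line.
        rows.append([item.strip() for item in line.split(delimiter)])
    return rows


sample = ["123\tabc\n", "456\tdef\n", "\n", "789\tghi\n"]
print(rows_from_lines(sample))
# prints [['123', 'abc'], ['456', 'def'], ['789', 'ghi']]
```

Because a file object is itself an iterable of lines, an open file can be passed in place of the sample list.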
First off, have a look at the csv module; it handles the tab splitting and the line endings for you. You may also want to call strip() on each value/entry.
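To illustrate how the two suggestions combine: csv.reader removes the line endings and splits on tabs, but spaces around a value are kept, so strip() is still useful. A small Python 3 sketch, using an in-memory string with made-up padding around one value instead of a real file:

```python
import csv
import io

# Sample data invented for illustration; note the spaces around "abc".
source = io.StringIO("123\t abc \n456\tdef\n", newline="")

# csv.reader splits on tabs and drops line endings; strip() trims the rest.
rows = [[value.strip() for value in row]
        for row in csv.reader(source, delimiter="\t")]
print(rows)
# prints [['123', 'abc'], ['456', 'def']]
```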