Removing tab delimited spaces from a text file using for loop

Developer https://www.devze.com 2023-02-17 20:42 Source: web
For my Python class, I am working on opening a .tsv file, taking 15 rows of data split into 4 columns, and turning each line into a list. To do this, I must remove the tabs between the columns.

I've been advised to use a for loop and loop through each line. This makes sense, but I can't figure out how to remove the tabs.

Any help?


To read lines from a file, and split each line on the tab delimiter, you can do this:

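rows = []
with open('file.tsv') as f:  # text mode, so each line is a str (Python 3)
    for line in f:
        # rstrip('\n') drops only the newline; strip() would also eat empty edge columns
        rows.append(line.rstrip('\n').split('\t'))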


Properly, this should be done using the Python CSV module (as mentioned in another answer) as this will handle escaped separators, quoted values etc.
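
To illustrate the difference, here is a minimal sketch that parses the same in-memory sample both ways. The sample data and the names `sample`, `naive` and `parsed` are made up for this example; a real file may not contain quoted fields at all.

```python
import csv
import io

# Hypothetical sample: the second field of the first row is quoted and contains a tab.
sample = 'id\t"first\tlast"\tage\n1\tAnn\t30\n'

# A plain split treats the tab inside the quotes as a separator.
naive = [line.rstrip('\n').split('\t') for line in io.StringIO(sample)]

# csv.reader honours the quotes and keeps the field whole.
parsed = list(csv.reader(io.StringIO(sample, newline=''), delimiter='\t'))

print(naive[0])   # ['id', '"first', 'last"', 'age']
print(parsed[0])  # ['id', 'first\tlast', 'age']
```

If the data never contains quoted or escaped tabs, both approaches give the same rows.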

In the more general sense, this can be done with a list comprehension:

rows = [line.rstrip('\n').split('\t') for line in file]

And, as suggested in the comments, in some cases a generator expression would be a better choice:

rows = (line.rstrip('\n').split('\t') for line in file)

See Generator Expressions vs. List Comprehensions for some discussion on when to use each.
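
As a rough sketch of when the generator form pays off, the example below takes only the first 15 rows from a larger input. The in-memory `data` object stands in for an open file and is purely illustrative, as are the names `rows` and `first_15`.

```python
import io
from itertools import islice

# Hypothetical in-memory stand-in for a large .tsv file (100 rows, 4 columns).
data = io.StringIO(''.join(f'{i}\ta\tb\tc\n' for i in range(100)))

# Generator expression: nothing is read or split yet.
rows = (line.rstrip('\n').split('\t') for line in data)

# Pull only the first 15 rows; the remaining lines are not split.
first_15 = list(islice(rows, 15))

print(len(first_15))  # 15
print(first_15[0])    # ['0', 'a', 'b', 'c']
```

A list comprehension would have split all 100 lines up front, which is fine for small files but wasteful when only a few rows are needed.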


You should use Python's stdlib csv module, particularly the csv.reader function.

import csv  # on Python 3, open the file in text mode with newline=''
rows = list(csv.reader(open('yourfile.tsv', newline=''), delimiter='\t'))

There's also a dialect parameter that can take 'excel-tab' to conform to Microsoft Excel's tab-delimited format.
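
A small sketch of the dialect form, using an in-memory sample instead of a real file. The sample text and the names `sample` and `rows` are assumptions made for this example.

```python
import csv
import io

# Hypothetical Excel-style export: tab-separated fields, \r\n line endings.
sample = 'name\tscore\r\nAnn\t90\r\n'

# dialect='excel-tab' sets the tab delimiter, so no delimiter argument is needed.
rows = list(csv.reader(io.StringIO(sample, newline=''), dialect='excel-tab'))

print(rows)  # [['name', 'score'], ['Ann', '90']]
```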


Check out the built-in string functions. split() should do the job.

>>> line = 'word1\tword2\tword3'
>>> line.split('\t')
['word1', 'word2', 'word3']
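
One thing worth checking is the difference between split('\t') and split() with no argument. The sketch below uses a made-up line with an empty middle column and a space inside the last column; the variable names are illustrative only.

```python
# Hypothetical line: empty second column, and a space inside the third column.
line = 'word1\t\tword3 with space\n'

# Splitting on the tab keeps empty columns and leaves spaces alone.
by_tab = line.rstrip('\n').split('\t')

# split() with no argument splits on any run of whitespace.
by_whitespace = line.split()

print(by_tab)         # ['word1', '', 'word3 with space']
print(by_whitespace)  # ['word1', 'word3', 'with', 'space']
```

For tab-delimited data, passing '\t' explicitly is usually the safer choice because the column count stays stable.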