AttributeError: trying to match list of string identifiers from file1 in file2

Here is a brief summary of my aims. I have a data text file containing a list of names or identifiers. The names are all on one line and separated by spaces. I want to put each one on a separate line. If a name from the data text file is also present in the big file, I want that line of the big file, i.e. the name and some additional information all on the same line, written to a smaller data file.

This is the program that I have started to attempt such a feat. Perhaps this is pushing the limits of my skills but I hope to be able to complete this.

datafile = open ('C:\\datatext.txt', 'r')

line = [item for item in open('C:\\datatext.txt', 'r').read().split(' ') 
                  if item.startswith("name") or item.startswith("name2")]

line_list = line.split(" ")

completedataset = open('C:\\bigfile.txt', 'r')
smallerdataset = open('C:\\smallerdataset.txt', 'w')

trials = [ line_list ]


for line in completedataset:
    for t in trials: 
       if t in line:
           smallerdataset.write(line)

completedataset.close()
smallerdataset.close()

Here is the error that I receive when I run the program in Python:

Traceback (most recent call last):
  File "C:/program3.py", line 7, in <module>
    line_list = line.split(" ")
AttributeError: 'list' object has no attribute 'split'

I have tried to be very thorough and look forward to your comments. If you have additional questions I will promptly elaborate as needed. All the best and enjoy the rainy weather.

EDIT:

I have made some changes to the program based on suggestions. I have this as my program now:

with open('C:\\datatext.txt', 'r') as datafile:
  lines = datafile.read().split(' ')
matchedLines = [item for item in lines if item.startswith("name1") or item.startswith("othername")]


completedataset = open('C:\\bigfile.txt', 'r')
smallerdataset = open('C:\\smallerdataset.txt', 'w')

trials = [ matchedLines ]


for line in completedataset:
    for t in trials: 
       if t in line:
           smallerdataset.write(line)

completedataset.close()
smallerdataset.close()

and I'm getting this error now:

Traceback (most recent call last):
  File "C:/program5.py", line 17, in <module>
    if t in line:
TypeError: 'in <string>' requires string as left operand, not list
>>>

Thank you for your continued help in this matter.

EDIT 2:

I have made several changes and now I'm getting this error:

Traceback (most recent call last):
  File "C:/program6.py", line 9, in <module>
    open('C:\\smallerdataset.txt', 'w')) as (completedataset, smallerdataset):
AttributeError: 'tuple' object has no attribute '__exit__'

Here is my program as it stands now:

with open('C:\\datatext.txt', 'r') as datafile:
  lines = datafile.read().split(' ')
matchedLines = [item for item in lines if item.startswith("nam1") or item.startswith("ndname")]


with (open('C:\\bigfile.txt', 'r'),
      open('C:\\smallerdataset.txt', 'w')) as (completedataset, smallerdataset):
  for line in completedataset:
    for t in matchedLines:
      if t in line:
        smallerdataset.write(line)

completedataset.close()
smallerdataset.close()

How can I get around this hurdle?

line = [item for item in open('C:\chiptext.txt', 'r').read().split(' ')
          if item.startswith("SNP") or item.startswith("AFFY")]

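This makes line a list of strings, because the comprehension collects the items that split(' ') produced. A list object does not have a split method, which is exactly what the AttributeError reports.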
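A small sketch of the difference, using a made-up string in place of the file contents (the identifiers shown are invented for illustration). The string is split once, the comprehension filters the resulting list, and the result has nothing left to split:

```python
# Hypothetical stand-in for the single line of names read from the file.
text = "SNP_A-1 AFFY-2 other SNP_A-3"

items = text.split(" ")          # str.split returns a list of strings
line = [item for item in items
        if item.startswith("SNP") or item.startswith("AFFY")]

print(type(line).__name__)       # list
print(hasattr(line, "split"))    # False
```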

It looks like you want a list of all the names in datatext and a subset of that list for names that match some predicate. The best way to do that is the following.

with open('C:\\datatext.txt', 'r') as datafile:
  lines = datafile.read().split(' ')
matchedLines = [item for item in lines if (PREDICATE)]

As a general comment, try not to get too carried away with one-lining code. Your list comprehension line is leaving the file object open.
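To illustrate the point about the open file, here is a minimal sketch that reads names inside a with block and then checks that the file object was closed. The temporary file and its contents are assumptions made so the example can run anywhere; they stand in for datatext.txt.

```python
import os
import tempfile

# Hypothetical temporary file standing in for datatext.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
    f.write("name1_a other name1_b")

with open(path, "r") as datafile:
    names = datafile.read().split()

# The with block closes the file as soon as it is left.
closed_after_with = datafile.closed
print(closed_after_with)

os.remove(path)
```

With the one-liner, no name refers to the file object, so there is nothing to call close() on and the file stays open until the interpreter cleans it up.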

Edit for new edit: matchedLines is already a list, so I'm not sure why you are wrapping it in another list when you make trials. Below is a simple example of what you are doing.

l = [1,2,3]
ll = [l]
print(ll)  # [[1, 2, 3]]

When you get errors that don't make sense based on what you expect the value of a variable to be, you should add in print statements so you can confirm that the values are correct.
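As a sketch of that kind of check, the snippet below prints the type and value of the loop variable before using it. The names and the sample line are invented; the structure mirrors the trials loop from the question, and the try/except is there only so the example runs to completion.

```python
matchedLines = ["name1_a", "name1_b"]   # hypothetical values
trials = [matchedLines]                 # a list containing one list
line = "name1_a 0.5 0.7\n"              # hypothetical line from the big file

error = None
for t in trials:
    # Shows that t is the whole inner list, not a single name.
    print(type(t).__name__, repr(t))
    try:
        found = t in line
    except TypeError as exc:
        error = str(exc)
        print("TypeError: " + error)
```

Seeing that t is a list rather than a string points directly at the extra pair of brackets around matchedLines.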

This is likely what you need:

with open('C:\\datatext.txt', 'r') as datafile:
  lines = datafile.read().split(' ')
matchedLines = [item for item in lines if item.startswith("name1") or item.startswith("othername")]

# Backslashes are doubled: in a plain string literal '\b' is a backspace character
with open('C:\\bigfile.txt', 'r') as completedataset:
  with open('C:\\smallerdataset.txt', 'w') as smallerdataset:
    for line in completedataset:
      for t in matchedLines:
        if t in line:
          smallerdataset.write(line)
          break  # do not write the same line twice if several names match
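For reference, the same idea as a self-contained sketch. It makes two assumptions that are not stated in the question: that each line of the big file begins with its identifier, and that an exact match on that first field is wanted rather than a substring match (with a substring test, name1_a would also match a line for name1_ab). The function name filter_lines and the temporary files are made up for the example. It also uses the comma form of with (Python 2.7 and later) to open both files in one statement; wrapping the two open calls in parentheses, as in EDIT 2, builds a tuple, and a tuple is not a context manager, hence the __exit__ error.

```python
import os
import tempfile


def filter_lines(names_path, big_path, out_path, prefixes):
    """Copy lines of big_path whose first field is one of the wanted names."""
    with open(names_path, "r") as datafile:
        # split() with no argument also discards the trailing newline
        names = datafile.read().split()
    # str.startswith accepts a tuple of prefixes
    wanted = set(n for n in names if n.startswith(tuple(prefixes)))

    with open(big_path, "r") as big, open(out_path, "w") as small:
        for line in big:
            fields = line.split()
            if fields and fields[0] in wanted:
                small.write(line)


# Usage with throwaway files (all contents are invented).
tmp = tempfile.mkdtemp()
names_path = os.path.join(tmp, "datatext.txt")
big_path = os.path.join(tmp, "bigfile.txt")
out_path = os.path.join(tmp, "smallerdataset.txt")

with open(names_path, "w") as f:
    f.write("name1_a zzz othername_b\n")
with open(big_path, "w") as f:
    f.write("name1_a 1 2\nname1_ab 3 4\nothername_b 5 6\nzzz 7 8\n")

filter_lines(names_path, big_path, out_path, ["name1", "othername"])

with open(out_path, "r") as f:
    result = f.read()
print(result)
```

Keeping the wanted names in a set makes each lookup cheap, which matters if the big file has many lines. If the identifier can appear anywhere in the line, the substring loop shown above is the one to use instead.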