filtered = []
text = "any.pdf"
if "doc" and "pdf" and "xls" and "jpg" not in text:
    filtered.append(text)
print(filtered)
This is my first post on Stack Overflow, so please excuse anything annoying in the question. The code is supposed to append text if text doesn't include any of these words: doc, pdf, xls, jpg. It works fine if it is written like this:
if "doc" in text:
    pass
elif "jpg" in text:
    pass
elif "pdf" in text:
    pass
elif "xls" in text:
    pass
else:
    filtered.append(text)
If you open up a Python interpreter, you'll find that "doc" and "pdf" and "xls" and "jpg" is the same thing as 'jpg':
>>> "doc" and "pdf" and "xls" and "jpg"
'jpg'
So rather than testing against all the strings, your first attempt only tests against 'jpg'.
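To see why, it may help to note that non-empty strings are truthy and that not in binds more tightly than and, so the condition groups as "doc" and "pdf" and "xls" and ("jpg" not in text). The sketch below compares the question's condition with the intended one; the function names and sample file names are made up for illustration.

```python
def buggy_check(text):
    # Same condition as in the question. The first three operands are
    # non-empty strings (always truthy), so only `"jpg" not in text` matters.
    return bool("doc" and "pdf" and "xls" and "jpg" not in text)


def intended_check(text):
    # What the question meant: none of the substrings occurs in text.
    return not any(s in text for s in ("doc", "pdf", "xls", "jpg"))


for name in ("any.pdf", "photo.jpg", "notes.txt"):
    print(name, buggy_check(name), intended_check(name))
```

The two functions disagree on "any.pdf", which is exactly the case from the question.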
There are a number of ways to do what you want. The below isn't the most obvious, but it's useful:
if not any(test_string in text for test_string in ["doc", "pdf", "xls", "jpg"]):
    filtered.append(text)
Another approach would be to use a for loop in conjunction with an else statement:
for test_string in ["doc", "pdf", "xls", "jpg"]:
    if test_string in text:
        break
else:
    filtered.append(text)
Finally, you could use a pure list comprehension:
tofilter = ["one.pdf", "two.txt", "three.jpg", "four.png"]
test_strings = ["doc", "pdf", "xls", "jpg"]
filtered = [s for s in tofilter if not any(t in s for t in test_strings)]
EDIT:
If you want to filter both words and extensions, I would recommend the following:
text_list = generate_text_list() # or whatever you do to get a text sequence
extensions = ['.doc', '.pdf', '.xls', '.jpg']
words = ['some', 'words', 'to', 'filter']
text_list = [text for text in text_list if not text.endswith(tuple(extensions))]
text_list = [text for text in text_list if not any(word in text for word in words)]
This could still lead to some mismatches; the above also filters "Do something", "He's a wordsmith", etc. If that's a problem then you may need a more complex solution.
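One possible direction for such a "more complex solution" is to match whole words only, using a regular expression with \b word boundaries. This is only a sketch: the helper name, the sample strings, and the choice to ignore case are assumptions, not part of the original answer.

```python
import re

words = ["some", "words", "to", "filter"]

# \b matches only at word boundaries, so "some" no longer matches inside
# "something". re.escape guards against words containing regex metacharacters.
# Ignoring case is an assumption made for this example.
word_pattern = re.compile(
    r"\b(?:" + "|".join(map(re.escape, words)) + r")\b",
    re.IGNORECASE,
)


def contains_filtered_word(text):
    """Return True if text contains any of the words as a whole word."""
    return word_pattern.search(text) is not None


samples = ["Do something", "He's a wordsmith", "some text", "Nothing to see"]
kept = [s for s in samples if not contains_filtered_word(s)]
print(kept)
```

With this pattern, "Do something" and "He's a wordsmith" are kept, while "some text" and "Nothing to see" are filtered out.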
If those extensions are always at the end, you can use .endswith, which accepts a tuple of suffixes.
if not text.endswith(("doc", "pdf", "xls", "jpg")):
    filtered.append(text)
import os

basename, ext = os.path.splitext(some_filename)
if ext not in ('.pdf', '.png'):
    filtered.append(some_filename)
....
Try the following:
if all(substring not in text for substring in ['doc', 'pdf', 'xls', 'jpg']):
    filtered.append(text)
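This all(... not in ...) form is the De Morgan counterpart of not any(... in ...), so both should give the same answer for every input. A quick check, with helper names and sample strings invented for the example:

```python
substrings = ["doc", "pdf", "xls", "jpg"]


def keep_with_all(text):
    # Keep text only if every substring is absent.
    return all(substring not in text for substring in substrings)


def keep_with_any(text):
    # Keep text only if no substring is present.
    return not any(substring in text for substring in substrings)


for text in ["any.pdf", "notes.txt", "whatsupdoc", ""]:
    # The two formulations agree on every sample.
    assert keep_with_all(text) == keep_with_any(text)
    print(repr(text), keep_with_all(text))
```

Which of the two to use is mostly a matter of which reads more naturally.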
The currently selected answer is very good as far as explaining the syntactically correct ways to do what you want to do. However, it's obvious that you are dealing with file extensions, which appear at the end [fail: doctor_no.py, whatsupdoc], and probable that you are using Windows, where case distinctions in file paths don't exist [fail: FUBAR.DOC].
To cover those bases:
# setup
import os.path
interesting_extensions = set("." + x for x in "doc pdf xls jpg".split())
# each time around
basename, ext = os.path.splitext(text)
if ext.lower() not in interesting_extensions:
    filtered.append(text)
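Wrapped in a function, the same idea can be checked against the failure cases mentioned above. This is a minimal sketch; the function name drop_interesting and the sample list are made up for illustration.

```python
import os.path

interesting_extensions = {".doc", ".pdf", ".xls", ".jpg"}


def drop_interesting(names):
    """Return the names whose extension (compared case-insensitively)
    is not one of the interesting extensions."""
    kept = []
    for name in names:
        _, ext = os.path.splitext(name)
        if ext.lower() not in interesting_extensions:
            kept.append(name)
    return kept


names = ["doctor_no.py", "whatsupdoc", "FUBAR.DOC", "any.pdf"]
print(drop_interesting(names))
```

Here doctor_no.py and whatsupdoc are kept because "doc" is not their extension, while FUBAR.DOC is dropped despite the upper-case extension.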