开发者

How to ignore numbers when doing string.startswith() in python?

开发者 https://www.devze.com 2023-03-12 05:43 Source: Internet

I have a directory with a large number of files. The file names are similar to the following: the(number)one(number), where (number) can be any number. There are also files with the name: the(number), where (number) can be any number. I was wondering how I can count the number of files with the additional "one(number)" at the end of their file name.

Say I have the list of file names. I was thinking of doing something like:

for n in list:
    if n.startswith(the(number)one):
        add one to a counter

Is there any way for it to accept any number in the (number) position when doing a startswith?

Example: the34one5 the37one2 the444one3 the87one8 the34 the32

This should return 4.


Use the re module with a regex that matches 'one\d+':

import re

count = 0
for n in names:  # names: the list of file names
    if re.search(r"one\d+", n):
        count += 1

If you want to make it very accurate, you can even do:

count = 0
for n in names:  # names: the list of file names
    if re.search(r"^the\d+one\d+$", n):
        count += 1

This also rejects names that have non-digit characters between "the" and "one", and it does not allow anything before "the" or after the last digit.
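As a quick check of the stricter pattern against the example from the question, here is a minimal sketch. The variable names are my own, and re.fullmatch is swapped in for the ^...$ anchors, since it requires the whole name to match:

```python
import re

# Example names taken from the question.
names = ["the34one5", "the37one2", "the444one3", "the87one8", "the34", "the32"]

# Compile once; fullmatch only succeeds if the entire name fits the pattern.
pattern = re.compile(r"the\d+one\d+")

count = sum(1 for name in names if pattern.fullmatch(name))
print(count)  # 4
```

Compiling the pattern once is just a convenience here; for a few thousand names the difference is unlikely to matter.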

You should start learning regexp now:

  • they let you do complex text analysis in a few lines that would be hard to code manually
  • they work almost the same from one language to another, which makes you more flexible
  • if you run into code that uses them and you have not learned them, you will be puzzled, because the syntax is not something you can guess
  • the sooner you know them, the sooner you'll learn when NOT (hint) to use them, which is eventually as important as knowing them.
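To illustrate the last point, this particular check can also be written without a regex. The following is only a sketch: has_one_suffix is a hypothetical helper name, and it assumes plain ASCII file names shaped like the ones in the question:

```python
def has_one_suffix(name):
    """Return True for names shaped like the<digits>one<digits>."""
    if not name.startswith("the"):
        return False
    # Split whatever follows "the" around the first "one".
    first, sep, second = name[3:].partition("one")
    # sep is empty when "one" is absent; both parts must be all digits.
    return sep == "one" and first.isdigit() and second.isdigit()

# Example names taken from the question.
names = ["the34one5", "the37one2", "the444one3", "the87one8", "the34", "the32"]
count = sum(1 for name in names if has_one_suffix(name))
print(count)  # 4
```

Whether this reads better than the regex is a matter of taste; for a pattern this small the regex is arguably clearer.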


The easiest way to do this probably is glob.glob():

import glob
number = len(glob.glob("/path/to/files/the*one*"))

Note that * here will match any string, not just numbers.
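If the looseness of * is a concern, one option is to let glob narrow the candidates and then apply the strict regex to each base name. This is a sketch under assumptions: count_matching is a hypothetical helper, and the demonstration creates its own throwaway directory instead of using a real path:

```python
import glob
import os
import re
import tempfile

def count_matching(directory):
    """Count files named the<digits>one<digits> in directory."""
    # [0-9] matches exactly one digit; * still matches any string.
    # glob.escape protects special characters in the directory path itself.
    loose = os.path.join(glob.escape(directory), "the[0-9]*one[0-9]*")
    strict = re.compile(r"the\d+one\d+")
    return sum(
        1 for path in glob.glob(loose)
        if strict.fullmatch(os.path.basename(path))
    )

# Small demonstration with empty files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["the34one5", "the37one2", "the34", "theXone1", "the1xone2y"]:
        open(os.path.join(tmp, name), "w").close()
    demo_count = count_matching(tmp)

print(demo_count)  # 2
```

Here "the1xone2y" passes the glob pattern but is rejected by the regex, which is why the second check is kept.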


The same as a one-liner, which also matches the leading 'the' as the question asks:

import re
# names: the list of file names
count = len([name for name in names if re.match(r'the\d+one', name)])