How can I tidy up this file name cleaner?

Developer https://www.devze.com 2023-03-23 02:34 Source: the web

I know there's a better way to do this, but I don't know what it is. I'm sorting through a list of files, and I would like to remove 'the usual suspects' so I can compare one list to another.

From what I understand, name.replace() looks at each and every item in listToClean for the phrases I picked, and replaces them if present. There has to be a better way to do this...

def cleanLists(listToClean, extList):
    cleanFileList = []
    for filename in listToClean:
        name = os.path.split(filename)[1]
        ext = os.path.splitext(name)
        if ext[1] in extList:
            name = name.replace(ext[1], '') 
            name = name.replace('1080p', '')
            name = name.replace('1080P', '')
            name = name.replace('720p', '')
            name = name.replace('720P', '')
            name = name.replace('HD', '')
            name = name.replace('(', ' ')
            name = name.replace(')', '')
            name = name.replace('.', ' ')
            cleanFileList.append(name)
    cleanFileList.sort(key=lambda x: x.lower())
    return cleanFileList

bad_names = ['1080p', '720p'] # and so on
for bad_name in bad_names:
    name = name.replace(bad_name, '')

Obviously, your declaration of words to clean from each name would happen at the top of the function, not for each iteration over the list of file names.
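
To make that concrete, here is a minimal sketch of the whole function with the list declared once, outside the loop over file names. The names `BAD_NAMES` and `clean_lists` are made up for illustration, and taking the stem from `os.path.splitext` instead of calling `name.replace(ext[1], '')` is my own substitution rather than something from the question.

```python
import os

# Hypothetical list of unwanted tokens, declared once; extend as needed.
BAD_NAMES = ['1080p', '1080P', '720p', '720P', 'HD']


def clean_lists(list_to_clean, ext_list):
    """Return the cleaned base names, sorted case-insensitively."""
    clean_file_list = []
    for filename in list_to_clean:
        # Split off the directory, then separate stem and extension.
        stem, ext = os.path.splitext(os.path.basename(filename))
        if ext in ext_list:
            for bad_name in BAD_NAMES:
                stem = stem.replace(bad_name, '')
            # Same punctuation handling as the original function.
            stem = stem.replace('(', ' ').replace(')', '').replace('.', ' ')
            clean_file_list.append(stem)
    clean_file_list.sort(key=str.lower)
    return clean_file_list
```

Something like `clean_lists(paths, ['.mkv', '.avi'])` would then skip any file whose extension is not in the list. Leftover whitespace is not stripped here, to stay close to what the original code does.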

# do this once
import re
bad_strings = ['1080p', '720p'] # etc
regex = '|'.join(re.escape(x) for x in bad_strings)
subber = re.compile(regex, re.IGNORECASE).sub

# do this once for each name
name = name.replace(ext[1], '')
# OR maybe better: name = ext[0] # see below
cleanFileList.append(subber('', name))
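
Put together, the two fragments above might look roughly like this. It is only a sketch: `BAD_STRINGS`, `_subber` and `clean_lists_regex` are hypothetical names, the token list is an example, and it uses the `ext[0]` variant (the stem from `os.path.splitext`) rather than `name.replace(ext[1], '')`.

```python
import os
import re

# Do this once: build a single case-insensitive pattern from the tokens.
# Hypothetical token list; one entry covers both '1080p' and '1080P'.
BAD_STRINGS = ['1080p', '720p', 'HD']
_subber = re.compile('|'.join(re.escape(s) for s in BAD_STRINGS),
                     re.IGNORECASE).sub


def clean_lists_regex(list_to_clean, ext_list):
    """Return cleaned base names, sorted case-insensitively."""
    cleaned = []
    for filename in list_to_clean:
        stem, ext = os.path.splitext(os.path.basename(filename))
        if ext in ext_list:
            # Do this once per name: remove every token in a single pass.
            cleaned.append(_subber('', stem))
    return sorted(cleaned, key=str.lower)
```

One thing to keep in mind with `re.IGNORECASE`: a short token such as 'HD' will also match a lowercase 'hd' anywhere in a name, so short tokens may need word boundaries or a stricter pattern.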

Consider what happens when 'csv' is in your list of extensions and you have a file named 'summary_of_csv_files.csv' ...
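
A small illustration of the difference, using a made-up file name. Since `os.path.splitext` returns the extension with its leading dot, the problem shows up when that dotted extension also occurs earlier in the name: `str.replace` removes every occurrence, while the stem from `splitext` only drops the final one.

```python
import os

name = 'data.csv.old.csv'  # hypothetical file name, for illustration only

stem, ext = os.path.splitext(name)

# str.replace removes every occurrence of the extension text.
replaced = name.replace(ext, '')

print(repr(ext))       # the extension, including the leading dot
print(repr(replaced))  # both '.csv' occurrences are gone
print(repr(stem))      # only the trailing extension is gone
```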