Getting rid of multiple periods in a filename using RegEx

Developer https://www.devze.com 2023-01-07 11:35 Source: web

I have an application that requires me to "clean" "dirty" filenames.

I was wondering if anybody knew how to handle files that are named like:

1.0.1.21 -- Confidential...doc or Accounting.Files.doc

Basically, there's no guarantee that the periods will be in the same place in every filename. I was hoping to recurse through a drive, search for periods in the filename itself (excluding the extension), remove them, and then append the extension again.

Does anybody know either a better way to do this or how to perform what I'm hoping to do? As a note, RegEx is a REQUIREMENT for this project.

EDIT: Instead of seeing 1.0.1.21 -- Confidential...doc, I'd like to see: 10121 -- Confidential.doc

For the other filename, instead of Accounting.Files.doc, I'd like to see AccountingFiles.doc


You could do it with a regular expression:

string s = "1.0.1.21 -- Confidential...doc";
s = Regex.Replace(s, @"\.(?=.*\.)", "");
Console.WriteLine(s);

Result:

10121 -- Confidential.doc

The regular expression can be broken down as follows:

\.    match a literal dot
(?=   start a lookahead 
.*    any characters
\.    another dot
)     close the lookahead

Or in plain English: remove every dot that has at least one dot after it.
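As a quick check of that rule, here is a minimal sketch that wraps the same pattern in a helper and runs it on both filenames from the question. The class and method names (FileNameCleaner, RemoveExtraDots) are made up for illustration, and the helper assumes it is given a bare file name rather than a full path.

```csharp
using System;
using System.Text.RegularExpressions;

static class FileNameCleaner
{
    // Same pattern as above: a dot that has at least one more dot after it.
    private static readonly Regex ExtraDots = new Regex(@"\.(?=.*\.)");

    // Hypothetical helper name. Assumes a bare file name, not a full path;
    // on a full path it would also strip dots from directory names.
    public static string RemoveExtraDots(string fileName)
    {
        return ExtraDots.Replace(fileName, "");
    }

    static void Main()
    {
        Console.WriteLine(RemoveExtraDots("1.0.1.21 -- Confidential...doc")); // 10121 -- Confidential.doc
        Console.WriteLine(RemoveExtraDots("Accounting.Files.doc"));           // AccountingFiles.doc
        Console.WriteLine(RemoveExtraDots("README"));                         // README (no dots, unchanged)
    }
}
```

Because only the last dot survives, a name with a single dot (or none) comes back unchanged.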

It would be cleaner to use the built-in methods for handling file names and extensions. If you could somehow drop the requirement that it must use regular expressions, I think the solution would be even better.
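For the "recurse through a drive" part of the question, the two ideas can be combined: let the built-in Path methods isolate the file name, and apply the regular expression only to that part so dots in directory names are left alone. This is a rough sketch rather than a finished tool. The names DottedFileRenamer and RenameAll are invented for illustration, and it assumes that skipping a file is acceptable when the cleaned name already exists.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class DottedFileRenamer
{
    // A dot that has at least one more dot after it.
    private static readonly Regex ExtraDots = new Regex(@"\.(?=.*\.)");

    // Hypothetical helper: renames files under root and returns how many were renamed.
    public static int RenameAll(string root)
    {
        int renamed = 0;

        // Take a snapshot of the file list first so renaming does not disturb the enumeration.
        var files = Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories).ToList();

        foreach (string path in files)
        {
            string dir = Path.GetDirectoryName(path) ?? root;
            string oldName = Path.GetFileName(path);

            // Apply the regex to the file name only, never to the directory part.
            string newName = ExtraDots.Replace(oldName, "");
            if (newName == oldName) continue;

            string target = Path.Combine(dir, newName);
            if (File.Exists(target)) continue; // assumption: skip rather than overwrite

            File.Move(path, target);
            renamed++;
        }
        return renamed;
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the root directory to clean as the first argument.");
            return;
        }
        Console.WriteLine("Renamed " + RenameAll(args[0]) + " file(s).");
    }
}
```

Note that enumerating with SearchOption.AllDirectories throws if it reaches a folder the process cannot read, so a real drive-wide run would probably need to walk directories one at a time and catch UnauthorizedAccessException.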


Here is an alternate solution that doesn't use regular expressions -- perhaps it is more readable:

string s = "1.0.1.21 -- Confidential...doc";
int extensionPoint = s.LastIndexOf(".");
if (extensionPoint < 0) {
    extensionPoint = s.Length;
}
string nameWithoutDots = s.Substring(0, extensionPoint).Replace(".", "");
string extension = s.Substring(extensionPoint);
Console.WriteLine(nameWithoutDots + extension);


I'd do this without regular expressions*. (Disclaimer: I'm not good with regular expressions, so that might be why.)

Consider this option.

string RemovePeriodsFromFilename(string fullPath)
{
    string dir = Path.GetDirectoryName(fullPath);
    string filename = Path.GetFileNameWithoutExtension(fullPath);
    string sanitized = filename.Replace(".", string.Empty);
    string ext = Path.GetExtension(fullPath);

    return Path.Combine(dir, sanitized + ext);
}

* Whoops, looks like you said using regular expressions was a requirement. Never mind! (Though I have to ask: why?)
