
Invalid-character filter for file/folder names? (Ruby)

Developer https://www.devze.com 2022-12-20 02:05 Source: Internet
My script downloads files from the net and saves them under the names supplied by the web server. I need a filter/remover of invalid characters for file/folder names under Windows NTFS.

A cross-platform filter would be welcome too.

NOTE: something like htmlentities would be great....


Like Geo said, by using gsub you can easily convert all invalid characters to a valid character. For example:

file_names.map! do |f|
  f.gsub(/[<invalid characters>]/, '_')
end

You need to replace <invalid characters> with all the possible characters that your file names might have in them that are not allowed on your file system. In the above code each invalid character is replaced with a _.

Wikipedia tells us that the following characters are not allowed on NTFS:

  • U+0000 (NUL)
  • / (slash)
  • \ (backslash)
  • : (colon)
  • * (asterisk)
  • ? (question mark)
  • " (quote)
  • < (less than)
  • > (greater than)
  • | (pipe)

So your gsub call could be something like this:

file_names.map! { |f| f.gsub(/[\x00\/\\:\*\?\"<>\|]/, '_') }

which replaces all the invalid characters with an underscore.
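
As a minimal sketch of the same idea, the character class can be kept in a constant and wrapped in a small helper. The names NTFS_INVALID and sanitize_filename are illustrative assumptions, not part of the original answer. The sketch covers only the characters in the list above; it does not handle Windows' reserved device names (such as CON or NUL) or trailing dots and spaces.

```ruby
# Characters the list above marks as invalid on NTFS, as one character class.
# Inside [...] only the slash and the backslash need escaping.
NTFS_INVALID = /[\x00\/\\:*?"<>|]/

# Hypothetical helper (the name is illustrative, not from the original answer).
# Replaces every invalid character with `replacement`.
def sanitize_filename(name, replacement = '_')
  name.gsub(NTFS_INVALID, replacement)
end

file_names = ['report: draft?.txt', "a/b\\c.png", 'plain.txt']
file_names.map! { |f| sanitize_filename(f) }
puts file_names
```

Note that two different server names can collapse to the same sanitized name, so the caller may still need to check for collisions before writing.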


filename_string.gsub(/[^\w\.]/, '_')

Explanation: this replaces everything except word characters (letters, digits, underscore) and dots.
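
A short sketch of how this whitelist behaves in practice. In Ruby, \w matches only ASCII word characters, so accented or non-Latin letters are replaced as well. The second helper, which uses the Unicode-aware \p{Word} property, is an assumed variant and not part of the original answer; both helper names are hypothetical.

```ruby
# Whitelist from the answer above: keep ASCII word characters and dots only.
def whitelist_ascii(name)
  name.gsub(/[^\w.]/, '_')
end

# Assumed variant (not from the original answer): \p{Word} is Unicode-aware,
# so accented and non-Latin letters are kept.
def whitelist_unicode(name)
  name.gsub(/[^\p{Word}.]/, '_')
end

puts whitelist_ascii('my file (1).txt')
puts whitelist_unicode('résumé v2.pdf')
```

The whitelist is stricter than the NTFS blacklist: it also replaces spaces, parentheses, and hyphens, which are valid in file names on Windows.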


I think your best bet would be gsub on the filename. One of the things I know you'll need to delete/replace is :.


I don't know how you plan to use those files later, but probably the most reliable solution is to keep the original filenames in a database table (or a serialized hash) and name the physical files after a unique ID that you (or the database) generate.

PS Another advantage of this approach is that you don't have to worry about the files with the same names (or different names that filter to same names).
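
A minimal sketch of this approach, assuming a JSON file serves as the serialized hash and SecureRandom.uuid supplies the unique ID. The DownloadStore class, its method names, and the manifest.json file name are all hypothetical; a real script might use a database table instead.

```ruby
require 'securerandom'
require 'json'
require 'tmpdir'

# Hypothetical store: each file is written under a generated ID, and a
# manifest hash maps the ID back to the original, unfiltered name.
class DownloadStore
  def initialize(dir)
    @dir = dir
    @manifest_path = File.join(dir, 'manifest.json')
    @manifest = File.exist?(@manifest_path) ? JSON.parse(File.read(@manifest_path)) : {}
  end

  # Writes the data and returns the ID used as the physical file name.
  def save(original_name, data)
    id = SecureRandom.uuid # hex digits and hyphens only
    File.binwrite(File.join(@dir, id), data)
    @manifest[id] = original_name
    File.write(@manifest_path, JSON.pretty_generate(@manifest))
    id
  end

  # Looks up the name the web server supplied for this ID.
  def original_name(id)
    @manifest[id]
  end

  # Returns the stored bytes for this ID.
  def read(id)
    File.binread(File.join(@dir, id))
  end
end

Dir.mktmpdir do |dir|
  store = DownloadStore.new(dir)
  id = store.save('what: is/this?.txt', 'hello')
  puts "#{id} -> #{store.original_name(id)}"
end
```

Because the ID never depends on the server-supplied name, two downloads with the same name get separate files, and no character filtering is needed for the physical file.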
