I have a job application form where people fill in their name and contact info and attach a resume.
The contact info gets emailed with the resume attached.
I would like to change the name of the file so that it is a combination of the competition number and their name.
How can I clean up my generated filename so that I can guarantee it has no invalid characters in it? So far I can remove all the spaces and lowercase the string.
I'd also like to remove any punctuation (like apostrophes) and non-alphabetical characters (like accented letters).
For example, if "André O'Hara" submitted his resume for job 555 using this form, I would be happy if all the questionable characters were removed and I ended up with a file name like:
555-andr-ohara-resume.doc
What regex can I use to remove all non-alphabetical characters?
Here is my code so far:
# Create a cleaned-up version of competition number + first name + last name to name the file
my $hr_generated_filename = $cgi->param("competition") . "-" . $cgi->param("first") . "-" . $cgi->param("last");
# change to all lowercase
$hr_generated_filename = lc( $hr_generated_filename );
# remove all whitespace
$hr_generated_filename =~ s/\s+//g;
push @{ $msg->{attach} }, {
    Type => 'application/octet-stream',
    Filename => $hr_generated_filename.".$file-extension",
    Data => $data,
    Disposition => 'attachment',
    Encoding => 'base64',
};
If you are trying to "white-list" characters, your basic approach should be to use a character class complement:
[...] defines a character class in Perl regexes, which will match any of the characters listed inside (including ranges such as a-z). If you add a ^ as the first character, it becomes a complement, so it matches any character not listed inside the brackets.
$hr_generated_filename =~ s/[^A-Za-z0-9\-\.]//g;
That will remove anything that is not an un-accented Latin letter, a number, a dash, or a dot. To add to your white-list, just add characters inside the [^...].
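To see the substitution at work on the example from the question, here is a minimal sketch. The helper name clean_filename and the hard-coded values are assumptions for illustration only; in the real form the three parts would come from $cgi->param as in the question.

```perl
use strict;
use warnings;
use utf8;    # this source file contains a literal accented character

# Hypothetical helper (not part of the original code): joins the parts,
# lowercases them, and strips everything outside the white-list.
sub clean_filename {
    my ($competition, $first, $last) = @_;

    my $name = lc join '-', $competition, $first, $last;

    # Same complemented character class as above: keep un-accented
    # letters, digits, dashes and dots; delete everything else.
    $name =~ s/[^A-Za-z0-9\-\.]//g;

    return $name;
}

# Example values standing in for the submitted form fields.
my $base = clean_filename('555', 'André', "O'Hara");
print "$base-resume.doc\n";    # prints 555-andr-ohara-resume.doc
```

Because whitespace is not in the white-list, this one substitution also removes spaces, so the separate s/\s+//g step is no longer needed.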