How to validate names in ASP.NET MVC so accents are allowed (é, á, ...)?

Developer https://www.devze.com 2023-03-27 09:02 Source: Internet
I need to validate form 'name' fields that may contain accented characters such as á, é, etc.

I tried applying the regular expression in the following attribute, as suggested in another SO question (apologies, I can't find it now). It correctly rejects most of the characters I don't want (e.g. *, ^, ?), but it also marks accented characters as invalid.

Is this because I'm validating on the client side not the server?

Any advice would be appreciated.

[ValidateRegExp(@"^\w*$", "Invalid characters in surname")]


I won't give you the regular expression because you really shouldn't validate people's names.
This is one of those things that are so easy to get wrong and offend your users.

What are the benefits to this, anyway?
The rudest thing you can probably do is to say something akin to "Invalid characters in surname". My surname is Абрамов and the characters are completely valid; it's just that your system isn't smart enough. The user interface shouldn't blame me for your fault.

If accepting such characters really would destroy your database, please respond at least with "We are very sorry but our system doesn't accept anything but English letters."

Take a look at this wonderful post by Patrick:

Falsehoods Programmers Believe about Names

  1. People have exactly one canonical full name.
  2. People have exactly one full name which they go by.
  3. People have, at this point in time, exactly one canonical full name.
  4. People have, at this point in time, one full name which they go by.
  5. People have exactly N names, for any value of N.
  6. People’s names fit within a certain defined amount of space.
  7. People’s names do not change.
  8. People’s names change, but only at a certain enumerated set of events.
  9. People’s names are written in ASCII.
  10. People’s names are written in any single character set.
  11. People’s names are all mapped in Unicode code points.
  12. People’s names are case sensitive.
  13. People’s names are case insensitive.
  14. People’s names sometimes have prefixes or suffixes, but you can safely ignore those.
  15. People’s names do not contain numbers.
  16. People’s names are not written in ALL CAPS.
  17. People’s names are not written in all lower case letters.
  18. People’s names have an order to them. Picking any ordering scheme will automatically result in consistent ordering among all systems, as long as both use the same ordering scheme for the same name.
  19. People’s first names and last names are, by necessity, different.
  20. People have last names, family names, or anything else which is shared by folks recognized as their relatives.
  21. People’s names are globally unique.
  22. People’s names are almost globally unique.
  23. Alright alright but surely people’s names are diverse enough such that no million people share the same name.
  24. My system will never have to deal with names from China.
  25. Or Japan.
  26. Or Korea.
  27. Or Ireland, the United Kingdom, the United States, Spain, Mexico, Brazil, Peru, Russia, Sweden, Botswana, South Africa, Trinidad, Haiti, France, or the Klingon Empire, all of which have “weird” naming schemes in common use.
  28. That Klingon Empire thing was a joke, right?
  29. Confound your cultural relativism! People in my society, at least, agree on one commonly accepted standard for names.
  30. There exists an algorithm which transforms names and can be reversed losslessly. (Yes, yes, you can do it if your algorithm returns the input. You get a gold star.)
  31. I can safely assume that this dictionary of bad words contains no people’s names in it.
  32. People’s names are assigned at birth.
  33. OK, maybe not at birth, but at least pretty close to birth.
  34. Alright, alright, within a year or so of birth.
  35. Five years?
  36. You’re kidding me, right?
  37. Two different systems containing data about the same person will use the same name for that person.
  38. Two different data entry operators, given a person’s name, will by necessity enter bitwise equivalent strings on any single system, if the system is well-designed.
  39. People whose names break my system are weird outliers. They should have had solid, acceptable names, like 田中太郎.
  40. People have names.


  • \p{L} or \p{Letter} = Any Unicode character recognized as a letter.
  • \p{N} or \p{Number} = Any Unicode character recognized as a number.

More variants: http://www.regular-expressions.info/unicode.html#prop

[ValidateRegExp(@"^[\p{L}\p{N}]*$", "Disallowed characters in surname")]
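As a rough sketch of how this pattern behaves on the server, the snippet below runs it through System.Text.RegularExpressions.Regex directly rather than through the ValidateRegExp attribute from the question, whose implementation is not shown here. The class and method names (NamePatternDemo, IsAllowed) are hypothetical and exist only for illustration. It also runs the original ^\w*$ pattern, because .NET's \w already matches Unicode letters, while JavaScript's \w covers only [A-Za-z0-9_]. If the attribute emits client-side validation, that difference would explain the rejected accents.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamePatternDemo
{
    // Same pattern as in the attribute above: zero or more Unicode letters or numbers.
    private static readonly Regex Allowed = new Regex(@"^[\p{L}\p{N}]*$");

    public static bool IsAllowed(string name) => Allowed.IsMatch(name);

    public static void Main()
    {
        // "Jos\u00E9" is "José" with a precomposed é.
        Console.WriteLine(IsAllowed("Jos\u00E9"));               // True: é is a letter
        Console.WriteLine(IsAllowed("Абрамов"));                 // True: Cyrillic letters
        Console.WriteLine(IsAllowed("O'Brien"));                 // False: apostrophe is neither letter nor number

        // The pattern from the question, evaluated by .NET rather than by a browser.
        Console.WriteLine(Regex.IsMatch("Jos\u00E9", @"^\w*$")); // True in .NET
    }
}
```

Note that the pattern still rejects apostrophes, hyphens, and spaces, so names such as O'Brien or Jean-Luc would fail unless those characters are added to the class.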


How about avoiding regex entirely and using something like:

s.All(c => char.IsLetter(c))

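You should also normalize the string first (for example to Form C, which is what s.Normalize() uses by default), so that you can handle accented characters where the base character and the accent are stored as separate chars. Without that step, char.IsLetter returns false for the standalone combining accent.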
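A minimal sketch of that idea, assuming Form C normalization is what is wanted; the names LetterCheckDemo and IsAllLetters are hypothetical. It compares the plain check with the normalized one on a decomposed "José".

```csharp
using System;
using System.Linq;
using System.Text;

public static class LetterCheckDemo
{
    // Compose base letters with their combining accents (Form C),
    // then require every resulting char to be a letter.
    public static bool IsAllLetters(string s) =>
        s.Normalize(NormalizationForm.FormC).All(char.IsLetter);

    public static void Main()
    {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT (two chars, one visible letter).
        string decomposed = "Jose\u0301";

        // Without normalization the combining mark fails the letter check.
        Console.WriteLine(decomposed.All(char.IsLetter));

        // After composition to U+00E9 every char is a letter.
        Console.WriteLine(IsAllLetters(decomposed));
    }
}
```

This is still a strict filter: letter and accent combinations that have no precomposed form, as well as spaces, hyphens, and apostrophes, are rejected.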


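Try this character set: [a-zA-ZÀ-ÿ0-9]. The third range, À-ÿ (U+00C0 to U+00FF), matches the accented letters of the Latin-1 Supplement block. It does not cover accented letters outside that block (such as ő or č), and it also lets through the × and ÷ signs, which sit inside the range.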

[ValidateRegExp(@"^[a-zA-ZÀ-ÿ0-9]*$", "Invalid characters in surname")]
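To see what the À-ÿ range does and does not cover, here is a small sketch that evaluates the same pattern with .NET's Regex class. The names RangeDemo and IsAllowed are hypothetical, and the test strings use \u escapes so the code points are unambiguous.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Same character set as in the attribute above.
    private static readonly Regex Latin1Names = new Regex(@"^[a-zA-ZÀ-ÿ0-9]*$");

    public static bool IsAllowed(string name) => Latin1Names.IsMatch(name);

    public static void Main()
    {
        Console.WriteLine(IsAllowed("Jos\u00E9"));  // True: é (U+00E9) lies inside À-ÿ
        Console.WriteLine(IsAllowed("Erd\u0151s")); // False: ő (U+0151) lies outside the range
        Console.WriteLine(IsAllowed("a\u00D7b"));   // True: × (U+00D7) lies inside the range
    }
}
```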