I am trying to validate 'words' with Ruby 1.8.7.
My regex to catch a word is currently:
/[a-zA-Z]\'*\-*/
This only catches English words. Is there a way to match non-English UTF-8 characters as well?
Even the 1.8.x regex engine is UTF-8 aware. You just need to use the right expression, and it takes slightly more than a bare /\w/, namely the /u flag:
s = "résumé and some other words"
puts s[/[a-z]+/u]
puts s[/\w+/u]
and you get:
r
résumé
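If you later move off 1.8.7, note that on Ruby 1.9 and newer \w matches only ASCII word characters, so the usual approach there is the Unicode property \p{L} instead. Here is a rough sketch of a whole-word check along those lines. It assumes a "word" is a run of letters optionally joined by apostrophes or hyphens, and the names WORD and valid_word? are made up for illustration, not taken from the answer above. It will not run on 1.8.7.

```ruby
# -*- coding: utf-8 -*-
# Sketch for Ruby 1.9+ only: \p{L} matches any Unicode letter.
# WORD and valid_word? are hypothetical names used for illustration.
# Assumption: a word is letters, optionally joined by ' or - (no digits).
WORD = /\A\p{L}+(?:['-]\p{L}+)*\z/

def valid_word?(str)
  # =~ returns an index or nil; !! turns that into true/false
  !!(str =~ WORD)
end

puts valid_word?("résumé")        # accented letters are accepted
puts valid_word?("don't")         # apostrophe between letters
puts valid_word?("mother-in-law") # hyphens between letters
puts valid_word?("abc123")        # digits are rejected
```

Anchoring with \A and \z makes this validate the whole string rather than just find the first word inside it, which seems closer to what the question asks for.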