I need a regular expression that matches only words meeting the following conditions. I am using it in my C# program.
- Can be any case
- Should not have any numbers
- May contain - and ' characters, but they are optional
- Should start with a letter
I have tried the expression ^([a-zA-Z][\'\-]?)+$
but it doesn't work.
Here is a list of a few words that are acceptable:
- London (case-insensitive)
- Jackson's
- non-profit
Here is a list of a few words that are not acceptable:
- 12london (contains a number and does not start with a letter)
- -to (does not start with a letter)
- to: (contains the : character; no special character other than - and ' is allowed)
^[a-zA-Z][-'a-zA-Z]*$
This matches any word that starts with an alphabetical character, followed by any number of alphabetical characters, - or '.
Note that you don't need to escape the - and ' inside a [] character class, as long as the dash is either the first or the last character in the class.
Note also that I've removed the round brackets from your example - if you don't want to capture the input, you'll get better performance by leaving them out.
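To see how this pattern behaves against the words from the question, here is a minimal C# sketch. The class and method names (WordCheck, IsValidWord) are made up for illustration and are not from the question. The pattern is written as a verbatim string (@"...") so that C# does not try to interpret any backslashes itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCheck
{
    // Hypothetical helper for illustration. The dash comes first inside
    // the character class, so it is treated as a literal dash.
    private static readonly Regex WordPattern = new Regex(@"^[a-zA-Z][-'a-zA-Z]*$");

    public static bool IsValidWord(string input)
    {
        return WordPattern.IsMatch(input);
    }

    public static void Main()
    {
        // Sample words taken from the question.
        string[] samples = { "London", "Jackson's", "non-profit", "12london", "-to", "to:" };
        foreach (string sample in samples)
        {
            Console.WriteLine(sample + " => " + IsValidWord(sample));
        }
    }
}
```

The first three samples should print True and the last three False. One caveat: in .NET, $ also matches just before a final newline, so if the input may end with "\n" and that should be rejected, \z can be used in place of $.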
Try this one:
^[A-Za-z]+[A-Za-z'-]*$
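First of all, try your regexes in a tool such as http://www.regextester.com/
Your pattern is anchored at both ends (^ means start of line, $ means end of line), so it only matches when the whole string is a single word. It leaves out all of the words that sit between two spaces in a longer text.
To match words inside a longer text, use \b (word boundary) or \B (non-word boundary) instead.
Instead of looking for [a-zA-Z] you can use character classes such as '\D' (not a digit). Be aware that \D is broader than [a-zA-Z]: it also matches punctuation and whitespace.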
Let me know if the above is working in your scenario.
\b\D[^\c][a-zA-Z]+[^\c]
It reads: a word boundary, then a non-digit character, then a non-control character, then one or more lowercase or uppercase letters, followed by a non-control character.
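If the goal is to pick acceptable words out of a longer text, one alternative to word boundaries (not what this answer proposes, just a sketch under that assumption) is to split the text on whitespace and validate each token with an anchored pattern. The names WordExtractor and ValidWords are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordExtractor
{
    // Anchored pattern: a letter first, then letters, ' or - in any order.
    private static readonly Regex WordPattern = new Regex(@"^[a-zA-Z][a-zA-Z'-]*$");

    public static List<string> ValidWords(string text)
    {
        var result = new List<string>();
        // A null separator array makes Split break on whitespace;
        // RemoveEmptyEntries drops the empty tokens between repeated spaces.
        string[] tokens = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            if (WordPattern.IsMatch(token))
            {
                result.Add(token);
            }
        }
        return result;
    }

    public static void Main()
    {
        List<string> words = ValidWords("London 12london -to non-profit to: Jackson's");
        Console.WriteLine(string.Join(", ", words));
    }
}
```

With the sample sentence above, only London, non-profit and Jackson's pass the check. Splitting first avoids a pitfall of \b here: a boundary-based pattern would happily match the "to" inside "-to" or "to:", which the question wants rejected.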