Regular Expression: single word

Developer https://www.devze.com 2023-04-11 19:03 Source: Internet
I want to check in a C# program whether a user input is a single word. The word may only contain the characters A-Z and a-z. No spaces or other characters. I tried [A-Za-z]*, but this doesn't work. What is wrong with this expression?

Regex regex = new Regex("[A-Za-z]*");
if (!regex.IsMatch(userinput))
{
  ...
}

Can you recommend a website with a comprehensive list of regex examples?


It probably works, but you aren't anchoring the regular expression. You need to use ^ and $ to anchor the expression to the beginning and end of the string, respectively:

Regex regex = new Regex("^[A-Za-z]+$");

I've also changed * to + because * will match 0 or more times while + will match 1 or more times.
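To make the difference concrete, here is a minimal sketch that puts the original pattern next to the anchored one. The class and method names (SingleWordCheck, MatchesUnanchored, IsSingleWord) are made up for illustration; only Regex and IsMatch come from .NET.

```csharp
using System.Text.RegularExpressions;

public static class SingleWordCheck
{
    // The pattern from the question: unanchored, zero or more letters.
    // It can match the empty string at position 0, so IsMatch is true for any input.
    private static readonly Regex Unanchored = new Regex("[A-Za-z]*");

    // Anchored, one or more letters: the whole input must consist of A-Z / a-z.
    private static readonly Regex Anchored = new Regex("^[A-Za-z]+$");

    // Hypothetical helper: shows what the original expression reports.
    public static bool MatchesUnanchored(string input) => Unanchored.IsMatch(input);

    // Hypothetical helper: the check the question is actually after.
    public static bool IsSingleWord(string input) => Anchored.IsMatch(input);
}
```

With this, IsSingleWord("Hello") is true, while IsSingleWord("Hello World"), IsSingleWord("abc123") and IsSingleWord("") are all false. MatchesUnanchored returns true for every one of those inputs, which is why the original check never fired. One caveat: in .NET, $ also matches just before a final newline, so "Hello\n" passes the anchored pattern; if that matters, \z in place of $ should be stricter.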


You should add anchors for start and end of string: ^[A-Za-z]+$


Regarding the question of regex examples, have a look at http://regexlib.com/.

For the regex, have a look at the special characters ^ and $, which mark the start and end of the string. This site can come in handy when constructing regexes in the future.


The asterisk character in regex specifies "zero or more of the preceding character class".

This explains why your expression is failing: it succeeds whenever the string contains zero or more letters, which is true of every string, including an empty one.

What you probably intended was to have one or more letters, in which case you should use the plus sign instead of the asterisk.

Having made that change, now it will fail if you enter a string that doesn't contain any letters, as you intended.

However, this still won't work for you entirely, because it will allow other characters in the string. If you want to restrict it to only letters, and nothing else, then you need to provide the start and end anchors (^ and $) in your regex to make the expression check that the 'one or more letters' is attached to the start and end of the string.

^[a-zA-Z]+$

This should work as intended.

Hope that helps.

For more information on regex, I recommend http://www.regular-expressions.info/reference.html as a good reference site.


I don't know what C#'s regex syntax is, but try [A-Za-z]+.


Try ^[A-Za-z]+$. If you don't include the ^ and $, it will match any part of the string that has alphabetic characters in it.


I know the question is only about strictly alphabetic input, but here's an interesting way of solving this which does not break on accented letters and other such special characters.

The regex "^\b.+?\b" will match the first word on the start of a string, but only if the string actually starts with a valid word character. Using that, you can simply check if A) the string matches, and B) the length of the matched string equals your full string's length:

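using System.Text.RegularExpressions;

// Matches the first word at the start of the string, then checks
// that this first word spans the whole input.
public Boolean IsSingleWord(String userInput)
{
    Regex firstWordRegex = new Regex("^\\b.+?\\b");
    Match firstWordMatch = firstWordRegex.Match(userInput);
    return firstWordMatch.Success && firstWordMatch.Length == userInput.Length;
}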


The others have written how to solve the problem you know about. Now I'll talk about the problem you perhaps don't know about: diacritics :-) Your solution doesn't support àèéìòù and many other letters. A correct solution would be:

^(\p{L}\p{M}*)+$

where \p{L} is any letter and \p{M}* is zero or more diacritic marks. (In Unicode, diacritics can be "separated" from base letters, so you can have something like a + ` = à, or you can have precomposed characters like the standard à.)
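As a rough sketch of how that pattern could be used from C# (the class name UnicodeWordCheck and the method IsSingleWord are invented for this example; \p{L} and \p{M} are the Unicode category escapes that .NET regexes support):

```csharp
using System.Text.RegularExpressions;

public static class UnicodeWordCheck
{
    // One or more "letter followed by any number of combining marks".
    // A verbatim string (@"...") avoids doubling the backslashes.
    private static readonly Regex LettersWithMarks = new Regex(@"^(\p{L}\p{M}*)+$");

    // Hypothetical helper: true when the input is a single word made of
    // Unicode letters, whether accents are precomposed or combining.
    public static bool IsSingleWord(string input) => LettersWithMarks.IsMatch(input);
}
```

Both spellings of "café" should pass here: the precomposed "caf\u00E9" and the decomposed "cafe\u0301" (e followed by a combining acute accent). Inputs such as "abc1", "two words" or an empty string are rejected.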


If you just need the characters a-z and A-Z, you could simply iterate over the characters and check whether each one is inside your range.

For example, for each character c: ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')

This could improve performance.
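A minimal sketch of that loop in C#, assuming an empty or null input should count as "not a word" (the answer above does not say; the class name AsciiWordCheck is made up for the example):

```csharp
public static class AsciiWordCheck
{
    // Hypothetical helper: true when every character is in a-z or A-Z.
    public static bool IsSingleWord(string input)
    {
        // Assumption: null and "" are rejected, mirroring the + quantifier.
        if (string.IsNullOrEmpty(input))
        {
            return false;
        }

        foreach (char c in input)
        {
            bool isAsciiLetter = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
            if (!isAsciiLetter)
            {
                return false; // stop at the first character outside the range
            }
        }

        return true;
    }
}
```

The explicit range comparison is deliberate: char.IsLetter would also accept non-ASCII letters such as é, which is a different check. Whether the loop is actually faster than a regex for your inputs is something you would have to measure.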
