How can I write a regex that matches only letters?
Use a character set: [a-zA-Z] matches one letter from A–Z in lowercase or uppercase. [a-zA-Z]+ matches one or more letters, and ^[a-zA-Z]+$ matches only strings that consist entirely of one or more letters (^ and $ mark the beginning and end of a string respectively).
If you want to match letters other than A–Z, you can either add them to the character set, as in [a-zA-ZäöüßÄÖÜ], or use a predefined character class such as the Unicode property class \p{L}, which describes the Unicode characters that are letters (in regex flavors that support it).
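As a rough illustration of the difference between the unanchored and anchored forms, here is a minimal Python sketch. The helper name `is_letters_only` and the sample strings are made up for this example. Python's built-in `re` module does not support `\p{L}`, so only the character-set variants are shown.

```python
import re

# ASCII letters only
ASCII_LETTERS = re.compile(r"[a-zA-Z]+")
# Same set extended with the German letters mentioned above
GERMAN_LETTERS = re.compile(r"[a-zA-ZäöüßÄÖÜ]+")


def is_letters_only(text, pattern=ASCII_LETTERS):
    """Return True if the whole string consists of characters from the set."""
    # fullmatch() requires the pattern to cover the entire string,
    # which is roughly what wrapping it in ^...$ does
    return pattern.fullmatch(text) is not None


print(is_letters_only("Hello"))                   # True
print(is_letters_only("Hello1"))                  # False
print(is_letters_only("Straße"))                  # False: ß is not in [a-zA-Z]
print(is_letters_only("Straße", GERMAN_LETTERS))  # True

# search() without anchors only needs one letter somewhere in the string
print(ASCII_LETTERS.search("123abc") is not None)  # True
```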
\p{L} matches anything that is a Unicode letter, which is useful if you're interested in alphabets beyond the Latin one.
Depending on your meaning of "character":
[A-Za-z]
- all letters (uppercase and lowercase)
[^0-9]
- all non-digit characters
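To make the difference between the two readings of "character" concrete, here is a small Python sketch. The sample string is invented for illustration.

```python
import re

sample = "ab-c 12!"

# [A-Za-z] keeps only ASCII letters
letters = re.findall(r"[A-Za-z]", sample)
print(letters)     # ['a', 'b', 'c']

# [^0-9] keeps everything that is not a digit, punctuation and spaces included
non_digits = re.findall(r"[^0-9]", sample)
print(non_digits)  # ['a', 'b', '-', 'c', ' ', '!']
```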
The closest option available is [\u\l]+, which matches a sequence of uppercase and lowercase letters. However, it is not supported by all editors/languages, so it is probably safer to use [a-zA-Z]+, as other users suggest.
You would use
/[a-z]/gi
[] - matches any one character from the set inside the brackets
a-z - covers the entire (ASCII) alphabet
g - searches globally throughout the whole string
i - ignores case, so both upper and lowercase letters match
Java:
String s= "abcdef";
if(s.matches("[a-zA-Z]+")){
System.out.println("string only contains letters");
}
The regular expression "/^[a-zA-Z]$/i" that a few people have posted is not quite right: without a quantifier it matches only a string made of a single letter, and the /i (case-insensitive) flag is redundant because the set already lists both cases. If you want to find every run of letters in a string rather than validate the whole string, use /g (global) instead of /i and drop the ^ and $ anchors for start and end.
/[a-zA-Z]+/g
- [a-zA-Z] matches a single character present in the list below
- Quantifier: + between one and unlimited times, as many times as possible, giving back as needed
- a-z a single character in the range between a and z (case sensitive)
- A-Z a single character in the range between A and Z (case sensitive)
- g modifier: global. All matches (don't return on first match)
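For comparison, a rough Python counterpart of the global search: `re.findall` returns every non-overlapping match, so no separate flag is needed. The sample text is an assumption made up for this sketch.

```python
import re

text = "abc 123 def_GHI"

# With the + quantifier each match is a whole run of letters
runs = re.findall(r"[a-zA-Z]+", text)
print(runs)     # ['abc', 'def', 'GHI']

# Without the quantifier each match is a single letter
singles = re.findall(r"[a-zA-Z]", text)
print(singles)  # ['a', 'b', 'c', 'd', 'e', 'f', 'G', 'H', 'I']
```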
In Python, I have found the following to work:
[^\W\d_]
This works because we are creating a new character class (the []) which excludes (^) any character from the class \W (everything NOT in [a-zA-Z0-9_]), and also excludes any digit (\d) and the underscore (_).
That is, we have taken the character class [a-zA-Z0-9_] and removed the 0-9 and _ bits. You might ask, wouldn't it just be easier to write [a-zA-Z] instead of [^\W\d_]? You would be correct if dealing only with ASCII text, but when dealing with Unicode text:
\W
Matches any character which is not a word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_].
(from the Python re module documentation)
That is, we are taking everything considered to be a word character in Unicode, removing everything considered to be a digit character in Unicode, and also removing the underscore.
For example, the following code snippet
import re
regex = r"[^\W\d_]"  # raw string so the backslashes reach the regex engine unchanged
test_string = "A;,./>>?()*)&^*&^%&^#Bsfa1 203974"
re.findall(regex, test_string)
Returns
['A', 'B', 's', 'f', 'a']
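The effect of the ASCII flag described in the quoted documentation can be sketched as follows. The sample text is invented for this example, and the + quantifier is added here only to group letters into runs.

```python
import re

text = "Ünïcödé_123 Привет"

# Default (Unicode) mode: \W and \d follow the Unicode definitions
unicode_letters = re.findall(r"[^\W\d_]+", text)
print(unicode_letters)  # ['Ünïcödé', 'Привет']

# With re.ASCII, \w shrinks to [a-zA-Z0-9_], so non-ASCII letters drop out
ascii_letters = re.findall(r"[^\W\d_]+", text, flags=re.ASCII)
print(ascii_letters)    # ['n', 'c', 'd']
```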
/[a-zA-Z]+/
Super simple example. Regular expressions are extremely easy to find online.
http://www.regular-expressions.info/reference.html
For PHP, the following will work fine:
'/^[a-zA-Z]+$/'
Use character classes
\D
Matches any character except the digits 0-9. Note that this also accepts spaces and punctuation, so it only fits if "letters" means "anything but digits".
^\D+$
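Just use \w or [[:alpha:]]. \w is an escape sequence that matches characters which may appear in words; note that it also matches digits and the underscore, whereas the POSIX class [[:alpha:]] (in flavors that support it) matches letters only.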
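To check what \w accepts, here is a short Python sketch with a made-up sample string. Python's `re` module is used here only for the \w part; it does not support POSIX classes such as [:alpha:].

```python
import re

sample = "abc_123"

# \w accepts letters, digits and the underscore
word_chars = re.findall(r"\w", sample)
print(word_chars)  # ['a', 'b', 'c', '_', '1', '2', '3']

# so a string containing digits still passes a \w-only check
passes_word_check = re.fullmatch(r"\w+", sample) is not None
passes_letter_check = re.fullmatch(r"[a-zA-Z]+", sample) is not None
print(passes_word_check)    # True
print(passes_letter_check)  # False
```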
If you mean any letters in any character encoding, then a good approach might be to delete non-letters such as spaces \s, digits \d, and other special characters like:
[!@#\$%\^&\*\(\)\[\]:;'",\. ...more special chars... ]
Or negate the classes above to describe letters directly:
\S \D and [^ ..special chars..]
Pros:
- Works with all regex flavors.
- Easy to write, and sometimes saves a lot of time.
Cons:
- Long, sometimes not perfect, but character encoding can be broken as well.
You can try this regular expression: [^\W\d_] or [a-zA-Z].
So, I've been reading a lot of the answers, and most of them don't take exceptions into account, like letters with accents or diaeresis (á, à, ä, etc.).
I made a function in TypeScript for my own use case that should carry over to pretty much any language with RegExp support. What I basically did is add a range of letters for each kind of accent that I wanted to allow. I also convert the char to upper case before applying the RegExp, which saves me from listing the lowercase ranges.
function isLetter(char: string): boolean {
  // Anchored so that every character must fall inside one of the ranges
  return /^[A-ZÀ-ÚÄ-Ü]+$/.test(char.toUpperCase());
}
If you want to add another range of letters with another kind of accent, just add it to the regex. Same goes for special symbols.
I implemented this function with TDD and I can confirm this works with, at least, the following cases:
character | isLetter
${'A'} | ${true}
${'e'} | ${true}
${'Á'} | ${true}
${'ü'} | ${true}
${'ù'} | ${true}
${'û'} | ${true}
${'('} | ${false}
${'^'} | ${false}
${"'"} | ${false}
${'`'} | ${false}
${' '} | ${false}
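For comparison, a different technique from the hand-written ranges above is to ask the Unicode database for the character's category. The sketch below is in Python and swaps in the standard `unicodedata` module; the function name `is_letter` simply mirrors the TypeScript one, and the test characters are the ones from the table.

```python
import unicodedata


def is_letter(char):
    """Return True if char is one character in a Unicode letter category."""
    # The letter categories are Lu, Ll, Lt, Lm and Lo; all start with "L"
    return len(char) == 1 and unicodedata.category(char).startswith("L")


cases = ["A", "e", "Á", "ü", "ù", "û", "(", "^", "'", "`", " "]
for c in cases:
    print(repr(c), is_letter(c))
```

This accepts letters from any script without listing ranges, which may or may not be what a given form or validator wants.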
Lately I have used this pattern in my forms to validate people's names, which may contain letters, blanks and accented characters. The letter range is written as A-Za-z because A-z would also let through [ \ ] ^ _ and `.
pattern="[A-Za-zÀ-ú\s]+"
JavaScript
If you want to return matched letters:
('Example 123').match(/[A-Z]/gi)
// Result: ["E", "x", "a", "m", "p", "l", "e"]
If you want to replace matched letters with stars ('*') for example:
('Example 123').replace(/[A-Z]/gi, '*')
// Result: "******* 123"
// [A-Za-z] rather than [A-z]: the range A-z also includes [ \ ] ^ _ `
/^[A-Za-z]+$/.test('asd')
// true
/^[A-Za-z]+$/.test('asd0')
// false
/^[A-Za-z]+$/.test('0asd')
// false
pattern = /[a-zA-Z]/
puts "[a-zA-Z]: #{pattern.match('mine blossom')}" # OK
puts "[a-zA-Z]: #{pattern.match('456')}"
puts "[a-zA-Z]: #{pattern.match('')}"
puts "[a-zA-Z]: #{pattern.match('#$%^&*')}"
puts "[a-zA-Z]: #{pattern.match('#$%^&*A')}" # OK
Pattern pattern = Pattern.compile("^[a-zA-Z]+$");
if (pattern.matcher("a").find()) {
    // ...do something ......
}