quick question for regex

Developer https://www.devze.com 2022-12-20 05:09 Source: Internet

Related topics: grep regex

I have a word list, but it has some words like East's.

I need to find the words that contain only a-z and A-Z in the word list. How do I do that?

I am using grep. What should I put after grep?

grep *** myfile.txt

Thanks!

The regexp you want is ^[a-zA-Z]+$

For grep:

vinko@parrot:~$ more a.txt
Hi
Hi Dude
Hi's

vinko@parrot:~$ egrep '^[a-zA-Z]+$' a.txt
Hi

In pseudocode:

 regexp = "^[a-zA-Z]+$";
 foreach word in list
      if regexp.matches(word)
          do_something_with(word)
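
The pseudocode above can be sketched as a plain shell loop. This is only an illustration: the sample words are made up for the example, and appending to the `matched` variable is a stand-in for do_something_with(word).

```shell
# Same pattern as in the answer; grep -E enables extended regexps.
regexp='^[a-zA-Z]+$'
matched=""

# Read the sample word list line by line (hypothetical data).
while IFS= read -r word; do
    # -q suppresses output; the exit status says whether the word matched.
    if printf '%s\n' "$word" | grep -Eq "$regexp"; then
        matched="$matched$word,"   # stand-in for do_something_with(word)
    fi
done <<'EOF'
Hi
Hi Dude
Hi's
East
EOF

echo "$matched"
```

For a whole file, running grep once on the file is simpler and faster than starting one grep per word; the loop is only useful when each word needs further processing.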

The grep syntax is:

grep '^[[:alpha:]]\+$' input.txt

Documentation for grep's pattern syntax is here.
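
As a rough comparison, the same filter can be spelled three ways. The `\+` form relies on a GNU extension to basic regular expressions, so the sketch below also shows a form that avoids it and the extended-regexp form with -E. The sample words and variable names are made up for the example.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' "Hi" "Hi Dude" "Hi's")

# Basic regexp with \+ (a GNU extension, not guaranteed everywhere).
bre_gnu=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]\+$')

# Basic regexp using only standard syntax: one letter, then zero or more.
bre_plain=$(printf '%s\n' "$sample" | grep '^[[:alpha:]][[:alpha:]]*$')

# Extended regexp: + needs no backslash with -E.
ere=$(printf '%s\n' "$sample" | grep -E '^[[:alpha:]]+$')

echo "$bre_gnu / $bre_plain / $ere"
```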

Use fgrep if you want to match against a word list. The -f option tells it to read the patterns from a file:

fgrep -f word_list_file myfile.txt
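
A minimal sketch of matching against a list, assuming the patterns are passed with -f (grep only reads patterns from a file when it is given that way) and using grep -F, the modern spelling of fgrep. The file contents are invented for the example; adding -x is my assumption about what "match against a word list" should mean, since without it substrings match too.

```shell
# Hypothetical files created in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' "East" "West" > "$tmpdir/word_list_file"
printf '%s\n' "East" "East's" "North" "West" > "$tmpdir/myfile.txt"

# -F: fixed strings, -f: read patterns from a file, -x: whole-line matches only.
exact=$(grep -F -x -f "$tmpdir/word_list_file" "$tmpdir/myfile.txt")

# Without -x, any line containing a listed word matches, including East's.
loose=$(grep -F -f "$tmpdir/word_list_file" "$tmpdir/myfile.txt")

rm -rf "$tmpdir"
printf 'exact:\n%s\nloose:\n%s\n' "$exact" "$loose"
```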

[a-z]+

using the case insensitive option, or

[A-Za-z]+

without the case insensitive option.

Post the data and the language for more help.

For grep:

egrep -i '^[a-z]+$' wordlist.dat

I can't remember which metacharacters need escaping and which don't. If it doesn't work with plain grep, try escaping the + as in '^[a-z]\+$' (the brackets themselves stay unescaped).

GNU grep

grep -wEo "[[:alpha:]]+" file

Or filter out all words that contain funnies

grep -v '[^a-zA-Z]'
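
One difference between the inverted filter and an anchored positive match is worth a quick sketch: an empty line contains no disallowed character, so grep -v keeps it, while a pattern that requires at least one letter drops it. The sample input is made up for the example.

```shell
# Hypothetical input, including one empty line.
input=$(printf '%s\n' "Hi" "Hi Dude" "Hi's" "" "East")

# Keep lines that contain no character outside a-zA-Z (empty lines qualify).
inverted=$(printf '%s\n' "$input" | grep -v '[^a-zA-Z]')

# Keep lines made of one or more letters only (empty lines do not qualify).
anchored=$(printf '%s\n' "$input" | grep -E '^[a-zA-Z]+$')

printf 'inverted:\n%s\nanchored:\n%s\n' "$inverted" "$anchored"
```

If the word list has no blank lines, the two approaches select the same words.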

Is there a prize for the shortest answer? :)

Note that there are portability differences between [[:alpha:]] and [A-Za-z]. [A-Za-z] works in more versions of grep, but [[:alpha:]] takes account of wide character environments and internationalization (accented characters for example when they are included in the locale).
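
To get the same result everywhere, one option is to pin the locale for the grep call. The sketch below assumes ASCII-only output is wanted; with LC_ALL=C both forms should accept only the unaccented letters. In a UTF-8 locale, [[:alpha:]] would typically also accept a word like café, but that depends on the locales installed, so it is not shown. The sample words are made up for the example.

```shell
# Hypothetical sample input with one accented word.
words=$(printf '%s\n' "East" "East's" "café")

# LC_ALL=C applies only to the grep invocation it prefixes.
class_c=$(printf '%s\n' "$words" | LC_ALL=C grep -E '^[[:alpha:]]+$')
range_c=$(printf '%s\n' "$words" | LC_ALL=C grep -E '^[A-Za-z]+$')

echo "class: $class_c"
echo "range: $range_c"
```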