I know I could write my own while loop with a regex to count the words in a line. But I am processing about 1000 lines and I don't want to run this loop each and every time. So I was wondering: is there any way to count the words in a line in Perl?
1000 times is not a significant number to a modern computer. In general, write the code that makes sense to you, and then, if there is a performance problem, worry about optimization.
To count words, first you need to decide what is a word. One approach is to match groups of consecutive word characters, but that counts "it's" as two words. Another is to match groups of consecutive non-whitespace, but that counts "phrase - phrase" as three words. Once you have a regex that matches a word, you can count words like this (using consecutive word characters for this example):
scalar( () = $line =~ /\w+/g )
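As a rough sketch of how the choice of pattern changes the result, the snippet below counts matches under both definitions. The helper name count_matches and the sample strings are made up for illustration; only the counting idiom comes from the expression above.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times $re matches in $line.
# Assigning the list of matches to an empty list, then evaluating that
# assignment in scalar context, yields the number of matches.
sub count_matches {
    my ($line, $re) = @_;
    my $count = () = $line =~ /$re/g;
    return $count;
}

# Sample inputs chosen to show where the two definitions disagree.
for my $line ("it's a test", "phrase - phrase") {
    printf "%-18s \\w+ => %d, \\S+ => %d\n",
        qq("$line"),
        count_matches($line, qr/\w+/),
        count_matches($line, qr/\S+/);
}
```

With these inputs, \w+ reports 4 and 2 words, while \S+ reports 3 and 3, so neither pattern is right for every kind of text.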
How about splitting the line on one or more non-word characters and counting the size of the resulting array?
$ echo "one, two, three" | perl -nE "say scalar split /\W+/"
3
As a sub that would be:
# say count_words 'foo bar' => 2
sub count_words { scalar split /\W+/, shift }
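One caveat with the split approach, shown here as a small sketch with a made-up input string: when the line starts with a non-word character, split returns an empty first field, so the count comes out one too high.

```perl
use strict;
use warnings;

# Illustrative input with a leading space.
my @fields = split /\W+/, " one, two, three";

# Show each field in brackets so the empty leading field is visible.
print join(", ", map { "[$_]" } @fields), "\n";

# Prints 4 rather than 3, because of the empty first field.
print scalar(@fields), "\n";
```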
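To get rid of the leading space problem spotted by ysth (a line that starts with a non-word character produces an empty first field, which inflates the count by one), you can filter out the empty segments: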
$ echo " one, two, three" | perl -nE 'say scalar grep {length $_} split /\W+/'
3
…or strip the leading non-word characters from the input string first:
$ echo " one, two, three" | perl -nE 's/^\W+//; say scalar split /\W+/'
3
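Wrapped up as a sub, the filtering variant might look like the sketch below. The name count_words_filtered is hypothetical, and it keeps the same definition of a word (a run of word characters) as the one-liners above.

```perl
use strict;
use warnings;

# Hypothetical sub: split on runs of non-word characters, discard
# empty fields, and return how many fields remain.
sub count_words_filtered {
    my ($line) = @_;
    return scalar grep { length $_ } split /\W+/, $line;
}

print count_words_filtered(" one, two, three"), "\n";
print count_words_filtered("foo bar"), "\n";
```

Because empty fields are dropped, an empty line or a line made up only of punctuation counts as zero words.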