How can I find the last word in a sentence with a regular expression?
If you need to find the last word in a string, then do this:
m/
    (\w+)      (?# Match a word, store its value into pattern memory)
    [.!?]?     (?# Some strings might hold a sentence. If so, this)
               (?# component will match zero or one punctuation)
               (?# character)
    \s*        (?# Match trailing whitespace using the * because there)
               (?# might not be any)
    $          (?# Anchor the match to the end of the string)
/x;
After a successful match, $1 will hold the last word in the string. You may need to expand the character class, [.!?], by adding more punctuation.
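To see how the pattern behaves on a few inputs, here is a minimal sketch that wraps it in a small subroutine. The name last_word is hypothetical (it is not part of the answer above), and the sketch assumes you want undef back when nothing matches, since $1 is only meaningful after a successful match.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns the last
# word of a string, or undef when the pattern does not match.
sub last_word {
    my ($text) = @_;
    return $text =~ m/
        (\w+)      # capture the last run of word characters
        [.!?]?     # optional closing punctuation character
        \s*        # optional trailing whitespace
        $          # end of the string
    /x ? $1 : undef;
}

print last_word("Do you want to talk to him?  "), "\n";           # him
print last_word("MiloCold is Neat"), "\n";                        # Neat
print defined last_word("...") ? "matched" : "no match", "\n";    # no match
```

Note that \w does not match apostrophes or hyphens, so a string ending in "don't" would yield "t". Whether that matters depends on your input.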
In PHP:
<?php
$str = 'MiloCold is Neat';
$str_Pattern = '/[^ ]*$/';
preg_match($str_Pattern, $str, $results);
// Prints "Neat", but you can just assign it to a variable.
print $results[0];
?>
In general, you can't correctly parse English text with regular expressions.
The best you can do is look for punctuation that usually terminates a sentence, but unfortunately this is not a guarantee. For example, the text "Mr. Bloggs is here. Do you want to talk to him?" contains two periods with different meanings: the first marks an abbreviation and the second ends a sentence. A regular expression that looks only at the punctuation has no way to distinguish between the two uses of the period.
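As a rough illustration of the problem, the sketch below applies a naive rule to that text: a sentence ends at '.', '!' or '?' followed by whitespace. The rule itself is an assumption made up for this demonstration, not a recommended approach.

```perl
use strict;
use warnings;

my $text = "Mr. Bloggs is here. Do you want to talk to him?";

# Naive rule (an assumption for this demo): split on whitespace that
# directly follows '.', '!' or '?'.
my @sentences = split /(?<=[.!?])\s+/, $text;

print scalar(@sentences), " pieces\n";   # 3 pieces
print "[$_]\n" for @sentences;
# [Mr.]
# [Bloggs is here.]
# [Do you want to talk to him?]
```

The text has two sentences, but the rule produces three pieces because it treats the period in "Mr." as a sentence boundary.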
I'd suggest instead that you look at a natural language parsing library. For example, the Stanford Parser has no trouble at all correctly parsing the above text into the two sentences:
Mr./NNP Bloggs/NNP is/VBZ here/RB ./. Do/VBP you/PRP want/VB to/TO talk/VB to/TO him/PRP ?/.
There are lots of other freely available NLP libraries that you could use too. I'm not endorsing that one product in particular; it's just an example to demonstrate that it is possible to parse text into sentences with fairly high reliability. Note, though, that even a natural language parsing library will still occasionally make a mistake: parsing human languages correctly is hard.