
Regular expression to find last word in sentence

开发者 (https://www.devze.com), 2023-01-15 02:47. Source: the web
How can I find the last word in a sentence with a regular expression?


If you need to find the last word in a string, then do this:

m/
    (\w+)      (?# Match a word, store its value into pattern memory)

    [.!?]?     (?# Some strings might hold a sentence. If so, this)
               (?# component will match zero or one punctuation)
               (?# characters)

    \s*        (?# Match trailing whitespace using the * because there)
               (?# might not be any)

    $          (?# Anchor the match to the end of the string)
/x;

After this statement, $1 will hold the last word in the string. You may need to expand the character class, [.!?], by adding more punctuation.
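
For comparison, here is a rough Python port of the same pattern. It is only a sketch and not part of the original answer: the helper name last_word is made up for illustration, and re.VERBOSE stands in for Perl's /x flag so the pattern can carry comments.

```python
import re

# Rough Python port of the Perl pattern above.
# `LAST_WORD` and `last_word` are hypothetical names chosen for this sketch.
LAST_WORD = re.compile(
    r"""
    (\w+)     # the last word, captured as group 1
    [.!?]?    # zero or one closing punctuation character
    \s*       # optional trailing whitespace
    $         # anchor the match to the end of the string
    """,
    re.VERBOSE,
)


def last_word(text):
    """Return the last word in text, or None if the pattern does not match."""
    match = LAST_WORD.search(text)
    return match.group(1) if match else None


print(last_word("MiloCold is Neat"))
print(last_word("Do you want to talk to him?  "))
```

Here match.group(1) plays the role of Perl's $1. As with the Perl version, a string that ends in punctuation outside [.!?] will not match, so the function returns None in that case.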

In PHP:

<?php

$str = 'MiloCold is Neat';
$str_Pattern = '/[^ ]*$/';

preg_match($str_Pattern, $str, $results);

// Prints "Neat", but you can just assign it to a variable.
print $results[0];

?> 


In general you can't correctly parse English text with regular expressions.

The best you can do is to look for punctuation that usually terminates a sentence, but unfortunately this is not a guarantee. For example, the text "Mr. Bloggs is here. Do you want to talk to him?" contains two periods with different meanings: the first marks an abbreviation and the second ends a sentence. A regular expression that looks only at the punctuation cannot distinguish between the two uses.
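
To make the problem concrete, here is a small Python sketch (an illustration, not something from the original answer). It assumes the naive rule that a sentence ends at ., ! or ? followed by whitespace; the variable names are arbitrary.

```python
import re

text = "Mr. Bloggs is here. Do you want to talk to him?"

# Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# The lookbehind keeps the punctuation attached to the preceding piece.
naive_sentences = re.split(r"(?<=[.!?])\s+", text)

for sentence in naive_sentences:
    print(repr(sentence))
```

The split produces three pieces rather than two, because the period in "Mr." is treated as the end of a sentence.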

I'd suggest that you look at a natural language parsing library instead. For example, the Stanford Parser has no trouble correctly parsing the above text into two sentences:

Mr./NNP Bloggs/NNP is/VBZ here/RB ./.
Do/VBP you/PRP want/VB to/TO talk/VB to/TO him/PRP ?/.

There are lots of other freely available NLP libraries that you could use too. I'm not endorsing that one product in particular; it's just an example to demonstrate that it is possible to split text into sentences with fairly high reliability. Note, though, that even a natural language parsing library will still occasionally make a mistake, because parsing human languages correctly is hard.
