I have a phrase like this:
Computer, Eddie is gone to the market.
I want to get the word Eddie and ignore all of the other words since other words are constant, and the word Eddie could be anything.
How can I do this with a regular expression?
Edit:
Sorry I'm using .NET regex :)
You can use this pattern:
Computer, (\w+) is gone to the market\.
This uses parentheses to capture whatever \w+ matches in group 1.
Note that the period at the end has been escaped with a \ because . is a regex metacharacter.
Given the input:
LOL! Computer, Eddie is gone to the market. Blah blah
blah. Computer, Alice is gone to the market... perhaps...
Computer, James Bond is gone to the market.
Then there are two matches (as seen on rubular.com). In the first match, group 1 captured Eddie. In the second match, group 1 captured Alice.
Note that \w+ doesn't match James Bond, because \w+ is a sequence of one or more word characters, and a space is not a word character. If you need to match names that are not a single word, replace \w+ with a regex that matches those names.
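As one possible replacement, here is a minimal sketch that assumes a name is one or more words separated by single spaces. That assumption is for illustration only and is not a general rule for names; the class NameCapture and the helper ExtractNames are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NameCapture
{
    // Assumption: a name is one or more \w+ words separated by single spaces.
    // (?: ...) is a non-capturing group, so the whole name stays in group 1.
    private static readonly Regex Pattern =
        new Regex(@"Computer, (\w+(?: \w+)*) is gone to the market\.");

    // Hypothetical helper: returns what group 1 captured in every match.
    public static List<string> ExtractNames(string text)
    {
        var names = new List<string>();
        foreach (Match m in Pattern.Matches(text))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    public static void Main()
    {
        string text =
            "LOL! Computer, Eddie is gone to the market. Blah blah\n" +
            "blah. Computer, Alice is gone to the market... perhaps...\n" +
            "Computer, James Bond is gone to the market.";

        // Prints Eddie, Alice and James Bond, one per line.
        foreach (string name in ExtractNames(text))
        {
            Console.WriteLine(name);
        }
    }
}
```

The rest of the pattern is unchanged, so the constant text around the name still has to match exactly.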
References
- regular-expressions.info/Capturing Groups and The Dot
General technique
Given this test string:
i have 35 dogs, 16 cats and 10 elephants
Then (\d+) (cats|dogs) yields 2 match results (see on rubular.com):
- Result 1: 35 dogs
  - Group 1 captures 35
  - Group 2 captures dogs
- Result 2: 16 cats
  - Group 1 captures 16
  - Group 2 captures cats
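The same results can be reproduced in .NET. This is a small sketch rather than a definitive implementation; the class GroupDemo and the helper Describe are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: one description line per match,
    // showing the whole match and both captured groups.
    public static List<string> Describe(string text)
    {
        var lines = new List<string>();
        foreach (Match m in Regex.Matches(text, @"(\d+) (cats|dogs)"))
        {
            lines.Add(m.Value + ": group 1 = " + m.Groups[1].Value +
                      ", group 2 = " + m.Groups[2].Value);
        }
        return lines;
    }

    public static void Main()
    {
        // Prints:
        // 35 dogs: group 1 = 35, group 2 = dogs
        // 16 cats: group 1 = 16, group 2 = cats
        foreach (string line in Describe("i have 35 dogs, 16 cats and 10 elephants"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Groups[0] (the same text as m.Value) is always the whole match; the numbered groups start at 1, in the order of their opening parentheses.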
Related questions
- Saving substrings using Regular Expressions
C# snippet
Here's a simple example of capturing groups usage:
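using System;
using System.Text.RegularExpressions;

var text = @"
LOL! Computer, Eddie is gone to the market. Blah blah
blah. Computer, Alice is gone to the market... perhaps...
Computer, James Bond is gone to the market.
";
// Group 1 captures the single word between the two constant parts.
Regex r = new Regex(@"Computer, (\w+) is gone to the market\.");
foreach (Match m in r.Matches(text)) {
    // Print only what group 1 captured, not the whole match.
    Console.WriteLine(m.Groups[1].Value);
}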
The above prints (as seen on ideone.com):
Eddie
Alice
API references
- System.Text.RegularExpressions namespace
On specification
As noted, \w+ does not match "James Bond". It does, however, match "o_o", "giggles2000", etc. (as seen on rubular.com). As much as reasonably practical, you should try to make your patterns as specific as possible.
Similarly, (\d+) (cats|dogs) will match 100 cats in $100 catsup (as seen on rubular.com).
These are issues with the patterns themselves, and are not directly related to capturing groups.
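One possible way to tighten the second pattern is to add \b word boundaries, sketched here under the assumption that the animal name must be a whole word. The class SpecificPattern and the helper CountMatches are hypothetical. Note that this only rules out catsup; the stricter pattern would still match 100 cats in "$100 cats".

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecificPattern
{
    public const string Loose = @"(\d+) (cats|dogs)";
    // Assumption: the number and the animal name must each be whole words.
    public const string Strict = @"\b(\d+) (cats|dogs)\b";

    // Hypothetical helper: how many times the pattern matches in the text.
    public static int CountMatches(string pattern, string text)
    {
        return Regex.Matches(text, pattern).Count;
    }

    public static void Main()
    {
        // The loose pattern finds "100 cats" inside "$100 catsup".
        Console.WriteLine(CountMatches(Loose, "$100 catsup"));   // 1
        // The strict pattern rejects it: there is no word boundary
        // between "cats" and "up".
        Console.WriteLine(CountMatches(Strict, "$100 catsup"));  // 0
        // The intended matches are still found.
        Console.WriteLine(CountMatches(Strict, "i have 35 dogs, 16 cats and 10 elephants")); // 2
    }
}
```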
/^Computer, \b(.+)\b is gone to the market\.$/
Eddie would be in the first captured string, $1. If you specify the language, we can tell you how to extract it.
Edit: C#:
Match match = Regex.Match(input, @"^Computer, \b(.+)\b is gone to the market\.$");
Console.WriteLine(match.Groups[1].Value);
Remove ^ and $ from the regex if the phrase can appear inside a longer string. By default in .NET they match the start and end of the input; with RegexOptions.Multiline they match the start and end of each line.
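The effect of the anchors can be seen in a short sketch. The class AnchorDemo and the helper Capture are hypothetical names, and the "(no match)" marker is just a convention of this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    public const string Anchored = @"^Computer, \b(.+)\b is gone to the market\.$";
    public const string Unanchored = @"Computer, \b(.+)\b is gone to the market\.";

    // Hypothetical helper: group 1 of the first match, or a marker if none.
    public static string Capture(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Value : "(no match)";
    }

    public static void Main()
    {
        string whole = "Computer, James Bond is gone to the market.";
        string embedded = "LOL! Computer, Eddie is gone to the market. Blah";

        // The phrase is the entire input, so the anchored pattern matches.
        Console.WriteLine(Capture(Anchored, whole));      // James Bond
        // The phrase is surrounded by other text, so the anchors fail.
        Console.WriteLine(Capture(Anchored, embedded));   // (no match)
        // Without anchors the phrase is found inside the longer string.
        Console.WriteLine(Capture(Unanchored, embedded)); // Eddie
    }
}
```

Because .+ is greedy and matches spaces, it captures multi-word names such as James Bond. For the same reason, without the anchors it can capture too much when the phrase occurs more than once on the same line.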