I'm trying to create a pattern that would identify a monetary amount in a string. My expression so far is:
(\d{1,3}[\.,\s]{0,2})*\d{3}[\.,\s]{0,2}\d{0,2}[\s]{0,2}[zl|zł|zlotych|złotych|pln|PLN]{0,1}
and my main problem is with the last part: [zl|zł|zlotych|złotych|pln|PLN], which should find one of the national notations for a money value (something like $ or usd or dollars). I'm doing it wrong, though, since it also matches something like '108.1 z'.
Is it possible to change the last part so that it matches only whole words like 'zl', 'pln' and so on, and not single letters?
Yes, don't use [], which defines a character class, but instead use () to group your words.
(\d{1,3}[\.,\s]{0,2})*\d{3}[\.,\s]{0,2}\d{0,2}[\s]{0,2}(zl|zł|zlotych|złotych|pln|PLN)?
As you had it written, [zl|zł|zlotych|złotych|pln|PLN] means "match any one of the characters contained in the []", or the equivalent of [zl|łotychpnPLN] (duplicates removed).
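A small sketch of the difference, assuming Python's re module (the question doesn't name a regex flavor, but character classes and groups behave this way in most engines):

```python
import re

# Character class: matches exactly ONE character out of the listed set.
char_class = re.compile(r"[zl|zł|zlotych|złotych|pln|PLN]")
# Group with alternation: matches one WHOLE alternative.
group = re.compile(r"(zl|zł|zlotych|złotych|pln|PLN)")

class_matches_z = char_class.fullmatch("z") is not None      # a lone letter is accepted
class_matches_pipe = char_class.fullmatch("|") is not None   # even "|" is just a member
class_matches_pln = char_class.fullmatch("pln") is not None  # three characters are too many
group_matches_z = group.fullmatch("z") is not None           # a lone letter is rejected
group_matches_pln = group.fullmatch("pln") is not None       # a whole word is accepted

print(class_matches_z, class_matches_pipe, class_matches_pln)
print(group_matches_z, group_matches_pln)
```

Note that inside the class the | is not an "or" at all; it is simply one more character in the set.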
If you don't want the money symbol captured, then start the group with ?:, i.e.:
(\d{1,3}[\.,\s]{0,2})*\d{3}[\.,\s]{0,2}\d{0,2}[\s]{0,2}(?:zl|zł|zlotych|złotych|pln|PLN)?
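To see what the ?: changes, here is a rough comparison, again assuming Python's re module. The input string "1 234,50 zł" is a made-up example, not one from the question:

```python
import re

# Shared numeric part, copied from the pattern above.
NUMBER = r"(\d{1,3}[\.,\s]{0,2})*\d{3}[\.,\s]{0,2}\d{0,2}[\s]{0,2}"

capturing = re.compile(NUMBER + r"(zl|zł|zlotych|złotych|pln|PLN)?")
non_capturing = re.compile(NUMBER + r"(?:zl|zł|zlotych|złotych|pln|PLN)?")

text = "1 234,50 zł"  # hypothetical input
m_cap = capturing.search(text)
m_non = non_capturing.search(text)

# Both patterns match exactly the same text...
print(m_cap.group(0) == m_non.group(0))
# ...but only the capturing version stores the currency as its own group.
print(capturing.groups, non_capturing.groups)
print(m_cap.group(2))
```

The overall match is identical either way; the only difference is whether the currency word is available afterwards as a separate group.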
Use parentheses (which delimit groups) rather than square brackets (which delimit character classes) around that last group.
As a matter of style, use ? instead of {0,1}.
(\d{1,3}[\.,\s]{0,2})*\d{3}[\.,\s]{0,2}\d{0,2}[\s]{0,2}(zl|zł|zlotych|złotych|pln|PLN)?
You have a few problems here. First off, inside [] most characters, including ., are taken as literals, so the escape is unnecessary and the first two [] blocks can be simplified to [.,\s].
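A quick check of that claim, sketched with Python's re module (an assumption; the sample characters are arbitrary):

```python
import re

escaped = re.compile(r"[\.,\s]")
unescaped = re.compile(r"[.,\s]")

samples = [".", ",", " ", "\t", "a", "5"]

# For each sample, record whether each class accepts it.
escaped_results = [escaped.fullmatch(s) is not None for s in samples]
unescaped_results = [unescaped.fullmatch(s) is not None for s in samples]

print(escaped_results == unescaped_results)  # the two classes agree on every sample
print(unescaped.fullmatch("a") is None)      # "." in a class is not "any character"
```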
Next (as the other answers say), the last [] block needs to be a group, not a character class, so replace the [] with ().
Finally, at the end you can replace {0,1} with ?. It won't make a difference, but it's neater.
The regex should look like this:
(\d{1,3}[.,\s]{0,2})*\d{3}[.,\s]{0,2}\d{0,2}[\s]{0,2}(zl|zł|zlotych|złotych|pln|PLN)?
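As a rough check, assuming Python's re module, the sketch below runs the original and the corrected pattern on the '108.1 z' example from the question. It also illustrates one extra observation that is not part of the answers above: in engines that try alternatives left to right (Python is one), zl wins before zlotych is ever tried, so listing the longer words first may be preferable. The reordered pattern and the "100 zlotych" input are illustrative assumptions.

```python
import re

NUMBER = r"(\d{1,3}[.,\s]{0,2})*\d{3}[.,\s]{0,2}\d{0,2}[\s]{0,2}"

# The asker's pattern: the currency part is a character class.
original = re.compile(NUMBER + r"[zl|zł|zlotych|złotych|pln|PLN]{0,1}")
# The corrected pattern from the answer above.
fixed = re.compile(NUMBER + r"(zl|zł|zlotych|złotych|pln|PLN)?")
# Hypothetical variant: same alternatives, longest first.
reordered = re.compile(NUMBER + r"(zlotych|złotych|zl|zł|pln|PLN)?")

# The class swallows the stray "z"; the group does not.
print(repr(original.search("108.1 z").group(0)))
print(repr(fixed.search("108.1 z").group(0)))

# Alternation order matters when one alternative is a prefix of another.
print(repr(fixed.search("100 zlotych").group(0)))
print(repr(reordered.search("100 zlotych").group(0)))
```

Because the currency group is optional, the corrected pattern still matches the number (plus the trailing space) in '108.1 z'; it just no longer treats the lone z as a currency.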
For the future, for regex questions it's really helpful if you post a typical input string and desired match along with your question!