regex to remove ordinals

Developer https://www.devze.com 2023-03-09 20:53 Source: web
I need to remove ordinals via regex, but my regex skills are quite lacking. The following locates the ordinals, but includes the digit just prior in the return value. I need to isolate and remove just the ordinal.

[0-9](?:st|nd|rd|th)


You need to use a look-behind assertion so that only st|nd|rd|th preceded by a [0-9] are matched, but the [0-9] isn't included in the match. i.e.:

(?<=[0-9])(?:st|nd|rd|th)

I've linked to the Perl-compatible syntax, but if you're using POSIX, POSIX extended, vi or one of the many other regex syntaxes, you'll need to look up the equivalent syntax (not every dialect supports look-behind).
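
As a rough illustration of the look-behind approach, here is a minimal Python sketch. The function name `strip_ordinal_suffix` and the sample sentence are made up for this example; Python's `re` module is used only because it supports fixed-width look-behind.

```python
import re

# The look-behind (?<=[0-9]) requires a digit right before the suffix,
# but the digit itself is not part of the match, so it is not removed.
ORDINAL_SUFFIX = re.compile(r"(?<=[0-9])(?:st|nd|rd|th)")


def strip_ordinal_suffix(text: str) -> str:
    """Remove st/nd/rd/th when they directly follow a digit."""
    return ORDINAL_SUFFIX.sub("", text)


print(strip_ordinal_suffix("1st, 22nd, 3rd and 14th of the month"))
# prints: 1, 22, 3 and 14 of the month
```

Note that "month" and "the" are untouched, because their "th" is not preceded by a digit.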


In perl:

$var =~ s{\b(\d+)(?:st|nd|rd|th)\b}{$1};

In PHP:

$var = preg_replace('/\\b(\d+)(?:st|nd|rd|th)\\b/', '$1', $var);

In .NET:

var = Regex.Replace(var, @"\b(\d+)(?:st|nd|rd|th)\b", "$1");
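
For comparison, the same capture-group idea can be sketched in Python. This is only an illustration: `strip_ordinals` is a hypothetical helper name, and unlike the Perl line above (which has no `g` flag), `re.sub` replaces every occurrence by default.

```python
import re

# \b(\d+) captures the whole number; the suffix is matched but not captured,
# so replacing the match with group 1 keeps only the digits.
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b")


def strip_ordinals(text: str) -> str:
    """Replace e.g. '2nd' with '2', leaving other text alone."""
    return ORDINAL.sub(r"\1", text)


print(strip_ordinals("the 2nd of 3 items, 4th floor"))
# prints: the 2 of 3 items, 4 floor
```

The trailing `\b` is what keeps something like "5things" from being treated as an ordinal: there is no word boundary between "th" and "ings", so the pattern does not match there.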


If you also want to remove the numbers themselves along with their ordinal suffixes, you could use this one:

[0-9]+(?:st| st|nd| nd|rd| rd|th| th)

So for a given text: "The 3rd person is missing but the 2 nd and the 1st is here" you'll have this output: "The person is missing but the and the is here" (apart from doubled spaces left where the numbers were).
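
A minimal Python sketch of this variant, assuming you also want to collapse the doubled spaces that the removal leaves behind. The helper name `drop_ordinal_numbers` and the space-collapsing step are additions for illustration, not part of the answer above.

```python
import re

# The pattern from the answer: digits followed by a suffix, with an optional
# single space before the suffix (so both "2nd" and "2 nd" match).
NUMBER_WITH_ORDINAL = re.compile(r"[0-9]+(?:st| st|nd| nd|rd| rd|th| th)")


def drop_ordinal_numbers(text: str) -> str:
    """Remove numbers together with their ordinal suffix."""
    without_numbers = NUMBER_WITH_ORDINAL.sub("", text)
    # Removing a word leaves two adjacent spaces; collapse them to one.
    return re.sub(r" {2,}", " ", without_numbers).strip()


sample = "The 3rd person is missing but the 2 nd and the 1st is here"
print(drop_ordinal_numbers(sample))
# prints: The person is missing but the and the is here
```

One caveat worth checking against your own data: because the pattern has no word boundary and allows a space before the suffix, it also matches the start of ordinary words, so "5 things" becomes "ings".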


Try a positive lookbehind:

(?<=[0-9])(?:st|nd|rd|th)

assuming the dialect of regex supports it.


I came across this question because I needed to remove ordinal numbers written with a dot, i.e. 1., 2., 4., etc., from the start of a string.

Here is the solution for this problem (in PHP):

$entry = preg_replace('/^\d+\. /', '', $entry);

Test: https://regex101.com/r/xLB6Ov/1
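
The same pattern carries over to other languages more or less unchanged. A small Python sketch, where the function name `strip_leading_number` is hypothetical:

```python
import re


def strip_leading_number(entry: str) -> str:
    """Remove a leading 'N. ' prefix (digits, a dot, one space) if present."""
    # ^ anchors at the start of the string, so numbers later in the
    # text are left untouched.
    return re.sub(r"^\d+\. ", "", entry)


print(strip_leading_number("12. Buy milk"))
# prints: Buy milk
```

A string such as "Version 2. released" is returned as-is, since the number is not at the start.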
