Preg_replace regex in PHP gives unexpected empty result

Developer https://www.devze.com 2023-03-21 10:24 Source: Internet
I am using a regex to replace all email addresses in a string with an <a> link to make them clickable. This works perfectly, except when the email address is preceded by two words of a certain minimum length joined by a dash. Only in that case do I get an empty string as the result.

<?php

$search = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.?))+@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
$replace = '\\1<a href="mailto:\\2">\\2</a>';

$string = "tttteeee-sssstttt mail@test.nl";
echo preg_replace($search, $replace, $string);
// Output: "" (empty)

$string = "te-st mail@test.nl";
echo preg_replace($search, $replace, $string);
// Output: "te-st <a href="mailto:mail@test.nl">mail@test.nl</a>" (as expected)

$string = "mail@test.nl tttteeee-sssstttt";
echo preg_replace($search, $replace, $string);
// Output: "<a href="mailto:mail@test.nl">mail@test.nl</a> tttteeee-sssstttt" (as expected)

?>

I have tried everything, but I really can't find the problem. One workaround would be to remove the first dash in the regex (before the @ sign), but then email addresses with a dash before the @ would no longer be highlighted.


OK, minimum use case: #([a-z-]+\.?)+@#, which hits the backtrack limit (check with preg_last_error()). Because the \. is optional, the engine cannot tell where one repetition ends and the next begins, so it has to try a huge number of ways to split the text between the inner and the outer +. When the limit is hit, preg_replace() returns NULL, which echo prints as an empty string. The default pcre.backtrack_limit of 100000 is not enough; setting it to 1000000 works.
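The check mentioned above can be sketched roughly as follows. The helper name replace_or_report() is made up for this illustration and is not part of the original post. Whether the long input actually fails depends on the PHP version, the pcre.backtrack_limit setting and whether the PCRE JIT is enabled, so no output is claimed for it here.

```php
<?php
// Hypothetical helper (not from the original post): runs preg_replace()
// and reports a PCRE failure instead of silently returning null.
function replace_or_report($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result !== null) {
        return $result;
    }
    // preg_replace() returns null on error; echo would print that as "".
    if (preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
        return '[PCRE backtrack limit reached]';
    }
    return '[PCRE error code ' . preg_last_error() . ']';
}

// Pattern and replacement exactly as in the question.
$search  = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.?))+@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
$replace = '\\1<a href="mailto:\\2">\\2</a>';

// The problematic input: prints either the linked text or an error marker,
// depending on the environment.
echo replace_or_report($search, $replace, "tttteeee-sssstttt mail@test.nl"), "\n";
```

If the error marker shows up, raising the limit with ini_set('pcre.backtrack_limit', '1000000') before the call is the quick workaround described above, although rewriting the pattern is the more robust fix.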

To solve this, make it easier on the regex engine: the first ([a-z0-9\-_]+(\.?))+ should become ([a-z0-9\-_]+(\.[a-z0-9\-_]+)*), which is a lot easier to resolve internally, because every dot must now be followed by at least one more character and there is only one way to split the text. As a bonus, unlike the accepted answer, this still doesn't allow consecutive dots.
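Put together, the rewritten pattern might look like the sketch below. The function name linkify_emails() is an assumption for illustration. Only the part before the @ is changed; the domain part and the replacement string are left as in the question.

```php
<?php
// Hypothetical helper name. The pattern is the question's regex with only
// the local part (before the @) rewritten as suggested above.
function linkify_emails($text)
{
    $search  = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.[a-z0-9\-_]+)*)@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
    $replace = '\\1<a href="mailto:\\2">\\2</a>';
    return preg_replace($search, $replace, $text);
}

// The input that previously produced an empty result.
echo linkify_emails("tttteeee-sssstttt mail@test.nl"), "\n";
// A dash before the @ is still accepted.
echo linkify_emails("my-name@test.nl"), "\n";
// Consecutive dots do not match, so this string is left unchanged.
echo linkify_emails("a..b@test.nl"), "\n";
```

The extra group around the local part does not shift \1 or \2, because both of those groups open before it, so the replacement string can stay as it is.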


Try using this for your search string instead:

$search = '#(^|\b)([A-Z0-9_\-.]+@[A-Z0-9_\-.]+\.[A-Z]{2,5})($|\b)#i';
