Regular Expressions: get what is outside of the brackets

https://www.devze.com 2023-01-16 02:57 Source: web
I'm using PHP and I have text like:

first [abc] middle [xyz] last

I need to get what's inside and outside of the brackets. Searching on StackOverflow, I found a pattern to get what's inside:

preg_match_all('/\[.*?\]/', $m, $s)

Now I'd like to know the pattern to get what's outside.

Regards!


You can use preg_split for this:

$input ='first [abc] middle [xyz] last';
$arr = preg_split('/\[.*?\]/',$input);
print_r($arr);

Output:

Array
(
    [0] => first 
    [1] =>  middle 
    [2] =>  last
)

This leaves the surrounding spaces in the output. If you don't want them, you can use:

$arr = preg_split('/\s*\[.*?\]\s*/',$input);

preg_split splits a string on a pattern. The pattern here is a [ followed by anything followed by a ]. The regex for "anything" is .*. Because [ and ] are regex metacharacters that delimit a character class, they must be escaped to match literally, which gives \[.*\]. By default .* is greedy and matches as much as possible, so here it would match abc] middle [xyz. Appending a ? makes it non-greedy, giving \[.*?\]. Since "anything" here really means "anything other than ]", you can also use \[[^]]*\], in which case the lazy ? is no longer needed.
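
The difference between the greedy, lazy and negated-class forms can be checked directly. This is a minimal sketch using the sample string from the question; the variable names are illustrative only.

```php
<?php
$input = 'first [abc] middle [xyz] last';

// Greedy: .* runs to the last ], so the whole bracketed span is one delimiter.
$greedy = preg_split('/\[.*\]/', $input);

// Lazy: .*? stops at the first ], so each bracket pair is its own delimiter.
$lazy = preg_split('/\[.*?\]/', $input);

// Negated class: [^\]]* cannot cross a ], so no lazy quantifier is required.
$negated = preg_split('/\[[^\]]*\]/', $input);

print_r($greedy);  // two pieces: "first " and " last"
print_r($lazy);    // three pieces: "first ", " middle ", " last"
var_dump($lazy === $negated);
```

The negated-class form is usually the safer choice, since it does not depend on backtracking to find the closing bracket.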

EDIT:

If you want to extract words that are both inside and outside the [], you can use:

$arr = preg_split('/\[|\]/',$input);

which splits the string on either a [ or a ].
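
One caveat with splitting on the brackets themselves: when the string starts or ends with a bracket, preg_split returns empty strings at the edges. A possible way around this, sketched below with a made-up input, is the PREG_SPLIT_NO_EMPTY flag; the \s* parts are an optional addition that trims spaces next to the brackets.

```php
<?php
// Hypothetical input that starts and ends with a bracket.
$input = '[abc] middle [xyz]';

// Without flags, the leading [ and trailing ] produce empty strings.
$raw = preg_split('/\[|\]/', $input);

// PREG_SPLIT_NO_EMPTY drops empty pieces; \s* swallows adjacent whitespace.
$clean = preg_split('/\s*[\[\]]\s*/', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($raw);
print_r($clean);
```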


$inside = '\[.+?\]';
$outside = '[^\[\]]+';
$or = '|';

preg_match_all(
    "~ $inside $or $outside~x", 
    "first [abc] middle [xyz] last", 
    $m);
print_r($m);

Or, less verbosely:

  preg_match_all("~\[.+?\]|[^\[\]]+~", $str, $matches)
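
The alternation returns inside and outside pieces in one flat list, with the brackets still attached to the inside ones. If the two kinds need to be told apart, one option is to add a capture group per alternative. The sketch below assumes that approach; the $inside and $outside array names are illustrative.

```php
<?php
$str = 'first [abc] middle [xyz] last';

// Group 1 captures text inside brackets, group 2 captures text outside them.
preg_match_all('~\[(.+?)\]|([^\[\]]+)~', $str, $matches, PREG_SET_ORDER);

$inside = [];
$outside = [];
foreach ($matches as $m) {
    if (($m[2] ?? '') !== '') {
        // Group 2 is non-empty only when the "outside" alternative matched.
        $outside[] = trim($m[2]);
    } else {
        $inside[] = $m[1];
    }
}

print_r($inside);
print_r($outside);
```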


Use preg_split instead of preg_match.

preg_split('/\[.*?\]/', 'first [abc] middle [xyz] last');

Result:

array(3) {
  [0]=>
  string(6) "first "
  [1]=>
  string(8) " middle "
  [2]=>
  string(5) " last"
}


Everyone says you should use preg_split, but only one answer gives an expression that meets your needs, and I found it a little verbose (it has since been updated to address that).

This is the expression most of the replies suggest:

/\[.*?\]/

But that only returns the text outside the brackets:

Array
(
    [0] => first 
    [1] =>  middle 
    [2] =>  last
)

You said you wanted what's inside and outside the brackets, so an updated version splits on either bracket by using a character class that contains only [ and ]:

/[\[\]]/

This gives you:

Array
(
    [0] => first 
    [1] => abc
    [2] =>  middle 
    [3] => xyz
    [4] =>  last
)

As you can see, the pieces still include the surrounding whitespace, so let's go a step further and strip it:

/\s*[\[\]]\s*/

This gives the desired result:

Array
(
    [0] => first
    [1] => abc
    [2] => middle
    [3] => xyz
    [4] => last
)

I think this is the expression you're looking for.
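
A point worth checking when writing such a class: inside [...], the characters ., * and ? lose their special meaning and are matched literally. A class such as [\[.*?\]] therefore also splits on dots, asterisks and question marks. The sketch below compares it with a brackets-only class; the input string is made up for illustration.

```php
<?php
// Hypothetical input containing "?" and "." outside the brackets.
$input = 'why? [abc] because. [xyz] done';

// ., * and ? are literals inside a class, so this also splits on "?" and ".".
$loose = preg_split('/[\[.*?\]]/', $input);

// A class holding only the two brackets splits on brackets alone.
$strict = preg_split('/\s*[\[\]]\s*/', $input);

print_r($loose);   // seven pieces, punctuation lost
print_r($strict);  // five pieces, punctuation kept
```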

