find all text before using regex

Developer https://www.devze.com 2023-01-04 11:56 Source: web
How can I use regex to find all text before the text "All text before this line will be included"?

I have included some sample text below as an example:

This can include deleting, updating, or adding records to your database, which would then be reflex.

All text before this line will be included

You can make this a bit more sophisticated by encrypting the random number and then verifying that it is still a number when it is decrypted. Alternatively, you can pass a value and a key instead.


Starting with an explanation... skip to end for quick answers

To match up to a specific piece of text, and confirm it's there without including it in the match, you can use a positive lookahead, written (?=regex)

This asserts that 'regex' matches at that position, but the lookahead itself is zero-width: it consumes no characters, so the text it checks is not part of the match.

So, this gives us the expression:

.*?(?=All text before this line will be included)

Where . is any character, and *? is a lazy quantifier (it consumes as little as possible, whereas a regular * consumes as much as possible).
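To make the lazy/greedy difference concrete, here is a minimal sketch in Python (my choice of language for illustration; the question doesn't name one). The input string and the short marker END are made up so that the marker appears twice on a single line.

```python
import re

# Hypothetical single-line input in which the marker "END" appears twice.
text = "alpha END beta END gamma"

# Lazy: expands one character at a time and stops at the first
# position where the lookahead succeeds.
lazy = re.match(r".*?(?=END)", text).group(0)

# Greedy: consumes the whole line, then backtracks until the
# lookahead succeeds, which happens at the last "END".
greedy = re.match(r".*(?=END)", text).group(0)

print(repr(lazy))    # 'alpha '
print(repr(greedy))  # 'alpha END beta '
```

In both cases the marker itself is left out of the match, because the lookahead only checks that it is there.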

However, in almost all regex flavours . excludes newlines by default, so we need to explicitly use a flag to include them. The flag to use is s (which stands for "single-line mode", although it is also referred to as "DOTALL" mode in some flavours).

And this can be implemented in various ways, including...

Globally, for /-based regexes:

/regex/s

Inline, global for the regex:

(?s)regex

Inline, applies only to bracketed part:

(?s:reg)ex

And as a function argument (depends on which language you're doing the regex with).
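As a rough illustration of those options, here is how they look in Python's re module (an assumption on my part, since the question doesn't say which language is in use). Python has no /regex/s literal syntax, so only the function-argument and inline forms apply; the scoped inline form needs Python 3.6 or later. The two-line input is made up.

```python
import re

# Hypothetical two-line input used only for this illustration.
text = "a\nb"

# Without the flag, "." stops at the newline.
default = re.match(r".*", text).group(0)

# Flag passed as a function argument.
as_argument = re.match(r".*", text, re.DOTALL).group(0)

# Inline flag that applies to the whole pattern.
inline_global = re.match(r"(?s).*", text).group(0)

# Inline flag scoped to one group: only the "." inside the
# group is allowed to match a newline.
inline_scoped = re.match(r"(?s:a.)b", text).group(0)

print(repr(default))        # 'a'
print(repr(as_argument))    # 'a\nb'
print(repr(inline_global))  # 'a\nb'
print(repr(inline_scoped))  # 'a\nb'
```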

So, probably the regex you want is this:

(?s).*?(?=All text before this line will be included)
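Here is a quick check of that expression, sketched in Python (my assumption; any flavour with lookaheads and the s flag should behave similarly). The sample text is a shortened, made-up version of the text in the question.

```python
import re

# Shortened stand-in for the sample text in the question.
text = (
    "This can include deleting, updating, or adding records.\n"
    "\n"
    "All text before this line will be included\n"
    "\n"
    "You can make this a bit more sophisticated.\n"
)

pattern = r"(?s).*?(?=All text before this line will be included)"
before = re.match(pattern, text).group(0)

# Same pattern without (?s): "." cannot cross the first newline,
# so a match anchored at the start of the text fails.
without_flag = re.match(
    r".*?(?=All text before this line will be included)", text
)

print(repr(before))
# 'This can include deleting, updating, or adding records.\n\n'
print(without_flag)  # None
```

Note that the result keeps the blank line before the marker; strip trailing whitespace afterwards if you don't want it.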


However, there are some caveats:

Firstly, not all regex flavours support lazy quantifiers - you might have to use just .*, (or potentially use more complex logic depending on precise requirements if "All text before..." can appear multiple times).

Secondly, not all regex flavours support lookaheads, so you will instead need to use captured groups to get the text you want to match.

Finally, you can't always specify flags, such as the s above, so you may need to match either "anything or newline" (.|\n) or maybe [\s\S] (whitespace and not whitespace) to get the equivalent matching.
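To show that these flag-free alternatives behave like the s flag, here is a small comparison in Python (used here purely for illustration). The input and the short marker MARKER are made up, and a capturing group is used so the three results can be compared directly.

```python
import re

# Hypothetical multi-line input with a short stand-in marker.
text = "line one\nline two\nMARKER tail"

# Reference: "." with the s flag.
with_flag = re.match(r"(?s)(.*?)MARKER", text).group(1)

# "anything or newline", written with a non-capturing group.
alternation = re.match(r"((?:.|\n)*?)MARKER", text).group(1)

# Whitespace or non-whitespace, i.e. any character at all.
char_class = re.match(r"([\s\S]*?)MARKER", text).group(1)

print(repr(with_flag))                          # 'line one\nline two\n'
print(with_flag == alternation == char_class)   # True
```

The character class is usually the safer of the two, since alternation inside a repeated group tends to backtrack more in many engines.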

If you're limited by all of these (I think the XML implementation is), then you'll have to do:

([\s\S]*)All text before this line will be included

And then extract the first sub-group from the match result.
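A minimal sketch of that fallback, again in Python for illustration only; it uses no flags, no lazy quantifier and no lookahead. The input and the marker MARKER are made up, with the marker appearing twice to show what the greedy quantifier does.

```python
import re

# Hypothetical input in which the marker appears twice.
text = "first part\nMARKER\nsecond part\nMARKER\nrest"

match = re.search(r"([\s\S]*)MARKER", text)

# The first sub-group holds everything before the marker. Because
# the quantifier is greedy, that means the LAST occurrence.
captured = match.group(1)

print(repr(captured))  # 'first part\nMARKER\nsecond part\n'
```

If the marker can occur more than once and you want the text before the first occurrence, this form alone isn't enough, which is the "more complex logic" mentioned in the first caveat.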


(.*?)All text before this line will be included

Depending on what particular regular expression framework you're using, you may need to include a flag to indicate that . can match newline characters as well.

The first (and only) subgroup will include the matched text. How you extract that will again depend on what language and regular expression framework you're using.

If you want to include the "All text before this line..." text, then the entire match is what you want.
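For example, in Python (one possible framework, assumed here for illustration) the flag is passed as re.DOTALL and the two results come from group(1) and group(0). The input and the marker MARKER are made up.

```python
import re

# Hypothetical input with a short stand-in marker.
text = "intro line\nMARKER\noutro line"

match = re.search(r"(.*?)MARKER", text, re.DOTALL)

before_only = match.group(1)  # text before the marker
with_marker = match.group(0)  # entire match, marker included

print(repr(before_only))  # 'intro line\n'
print(repr(with_marker))  # 'intro line\nMARKER'
```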


This should do it:

<?php
$str = "This can include deleting, updating, or adding records to your database, which would then be reflex.

All text before this line will be included

You can make this a bit more sophisticated by encrypting the random number and then verifying that it is still a number when it is decrypted. Alternatively, you can pass a value and a key instead.";

// The trailing .* (with the s flag) swallows the marker's remainder, so only group 1 is left.
echo preg_filter("/(.*?)All text before this line will be included.*/s","\\1",$str);
?>

Returns:

This can include deleting, updating, or adding records to your database, which would then be reflex.