Hello, I am trying to split some input by line and use trim() on each line. But I would like to do it without using trim, just with regex.
The issue I am having is that whitespace at the end of the line is not trimmed away. I guess my group [^$\s] (whitespace, but no line break) does not work.
So the question is how to solve my problem, and how to define a group in preg regex that explicitly says "ignore line breaks". At the moment I am thinking my approach is still wrong. The problem is that if I write \s* instead of this weird group, .+ eats everything. If I write .+? I do not get back complete strings that include spaces.
preg_match_all("/^\s*+(.+)[^$\s]*+$/m", $_POST['input'], $matches, PREG_SET_ORDER );
Okay, I'm usually all for using regular expressions. But the trim approach would be simpler here. I assume you avoided it because it usually requires an extra loop, but in this instance you could compact it to:
$lines = array_map("trim", explode("\n", $_POST["input"]));
// quite a handy utility function, so just wanted to note that here
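To make the one-liner concrete, here is a minimal sketch run against a made-up string. The variable $input is a hypothetical stand-in for $_POST["input"], and the sample text is invented for illustration.

```php
<?php
// Hypothetical sample input standing in for $_POST["input"].
$input = "  first line  \n\tsecond line\t\r\n third ";

// explode() cuts the string at every "\n"; trim() then strips spaces,
// tabs, "\r" and other default whitespace from both ends of each piece.
$lines = array_map("trim", explode("\n", $input));

print_r($lines); // three entries: "first line", "second line", "third"
```

Note that blank lines in the input would survive as empty strings, so an array_filter() pass may be wanted if those should be dropped.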
As an alternative to the solution you found, you could also have used:
preg_split('/((?!\n)\p{Z})*\n((?!\n)\p{Z})*/u', "...\n...");
A bit hackish now. I swapped out the ^ and $ anchors for just \n, and used assertions to exclude newlines elsewhere. But \p{Z} is a nice alternative to catch all Unicode space character variations, including NBSP and other ninja placeholders.
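A small sketch of that split in action, using an invented string that mixes ordinary spaces with NBSP (U+00A0). The variable names and sample text are assumptions made for illustration only.

```php
<?php
// Hypothetical input: plain spaces and NBSP (\u{00A0}) around the line breaks.
$input = "alpha beta \u{00A0}\n\u{00A0} gamma  \n delta";

// The delimiter is a newline plus any Unicode separators on either side of it,
// so the whitespace next to each line break is discarded with the newline.
$lines = preg_split('/((?!\n)\p{Z})*\n((?!\n)\p{Z})*/u', $input);

print_r($lines); // three entries: "alpha beta", "gamma", "delta"
```

Two caveats with this approach: because the pattern only fires at a newline, whitespace before the first line and after the last line is left alone, and \p{Z} covers the Unicode separator categories only, so tabs (which are control characters) are not stripped.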
preg_match_all("/\s*(.*\S)/", $_POST['input'], $matches, PREG_SET_ORDER );
You need something to eat leading whitespace before your capture group, including whole blank lines. \s* does that. You don't need to force it to start at the beginning of a line, since you're not saving it anyway; its only purpose is to match up to just before a non-whitespace character.
So now you know that you're looking at non-whitespace, and need to capture up to the last non-whitespace on the same line. Since . won't match newline, .*\S does just that.
One difference from your version is that the initial \s* of the next match gets to eat the trailing whitespace on the line you just matched. Since we no longer care about line endings, the /m modifier is no longer necessary.
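Here is a minimal sketch of how this pattern behaves on a made-up input that includes a blank line. The $input string is hypothetical and stands in for $_POST['input']; array_column() is used only to pull the first capture group out of the PREG_SET_ORDER result.

```php
<?php
// Hypothetical input with leading/trailing whitespace and one blank line.
$input = "  first line  \n\n\t second line\t\n third ";

// \s* skips whitespace (including the blank line); (.*\S) captures from the
// first non-whitespace character to the last one on the same line.
preg_match_all("/\s*(.*\S)/", $input, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each match is its own array; index 1 is the capture.
$lines = array_column($matches, 1);

print_r($lines); // three entries: "first line", "second line", "third"
```

Unlike the explode() approach, blank lines never show up in the result, because \s* swallows them on the way to the next non-whitespace character.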
You could make the first star possessive (\s*+); that won't change what it matches, but it will make it fail marginally faster at the end of the file if there's a long empty tail.
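As a quick sanity check of that claim, the sketch below runs both variants over the same invented input, which ends in a run of whitespace-only lines. The helper name captureLines is hypothetical, and the tail is kept short on purpose; this only compares results and does not measure speed.

```php
<?php
// Hypothetical helper: returns the first capture group of every match.
function captureLines(string $pattern, string $input): array
{
    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);
    return array_column($matches, 1);
}

// Two real lines followed by an "empty tail" of whitespace-only lines.
$input = "  a b  \n c \n" . str_repeat(" \n", 50);

$greedy     = captureLines('/\s*(.*\S)/', $input);  // ordinary greedy star
$possessive = captureLines('/\s*+(.*\S)/', $input); // possessive star

var_dump($greedy === $possessive); // both give "a b" and "c"
```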