I need to remove some data from an RSS feed.
I want to strip everything that appears before a : and, preferably, the space that appears just after the : as well.
Example:
Hello : Charlie wants to know how to delete everything behind him from behind the colon and one space in front. I will always have this question mark on the end?
Where "Hello", the : and the space after it would be matched, but not the text that follows.
Thanks to all who have this wonderful knowledge and take time to reply.
Use
^[^:]+:\s*
instead of
^.+:\s*
This is an example of it working:
perl -le 'my $string = q{Foo : bar baz}; $string =~ s{^[^:]+:\s*}{}; print $string;'
I recommend the first one over the second to avoid greediness issues when the remaining text contains another colon. This still prints "bar: baz":
perl -le 'my $string = q{Foo : bar: baz}; $string =~ s{^[^:]+:\s*}{}; print $string;'
To see the greediness issues I mentioned (the first command prints "bar baz", but the second prints only "baz"):
perl -le 'my $string = q{Foo : bar baz}; $string =~ s{^.+:\s*}{}; print $string;'
perl -le 'my $string = q{Foo : bar: baz}; $string =~ s{^.+:\s*}{}; print $string;'
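The same comparison can be sketched in Python, in case the feed is not being processed with Perl. This is only an illustration: the helper name strip_prefix and the constant names are made up for this example, and the two patterns are the ones quoted above.

```python
import re

# Patterns from the answer above: negated character class vs. greedy dot.
NEGATED = re.compile(r"^[^:]+:\s*")
GREEDY = re.compile(r"^.+:\s*")

def strip_prefix(pattern, text):
    """Remove the first match of pattern from text (at most one substitution)."""
    return pattern.sub("", text, count=1)

one_colon = "Foo : bar baz"
two_colons = "Foo : bar: baz"

print(strip_prefix(NEGATED, one_colon))   # bar baz
print(strip_prefix(NEGATED, two_colons))  # bar: baz
print(strip_prefix(GREEDY, one_colon))    # bar baz
print(strip_prefix(GREEDY, two_colons))   # baz
```

The negated class stops at the first colon, so a colon later in the text survives. The greedy .+ runs to the last colon on the line and removes more than intended.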
Try this:
^[^:]+:\s?
The trailing \s? will match a single whitespace character following the colon, but not require it.
I agree with @gpojd; you should use a negated character class to avoid greediness issues if there are colons in the payload.
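To make the difference between \s? and \s* concrete, here is a small Python sketch. The function names and the input strings are invented for illustration; only the two patterns come from the answers in this thread.

```python
import re

def strip_one_space(text):
    """Drop the prefix up to the first colon plus at most one whitespace character."""
    return re.sub(r"^[^:]+:\s?", "", text, count=1)

def strip_all_space(text):
    """Drop the prefix up to the first colon plus all following whitespace."""
    return re.sub(r"^[^:]+:\s*", "", text, count=1)

padded = "Hello :   Charlie"  # three spaces after the colon (made-up input)

print(repr(strip_one_space(padded)))           # '  Charlie'
print(repr(strip_all_space(padded)))           # 'Charlie'
print(repr(strip_one_space("Hello:Charlie")))  # 'Charlie'
```

If the feed always has exactly one space after the colon, both variants behave the same. \s* is the more forgiving choice when the spacing may vary.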
You can use just:
^.*:
This matches the leading "Hello :" (but not the space after it) in:
Hello : Charlie wants to know how to delete everything behind him from behind the colon and one space in front. I will always have this question mark on the end?
Something like ^.*: * should work well. Because .* is greedy, this matches from the beginning of the line up to the last colon and any spaces after it.
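As a quick check of the two dot-based patterns, the following Python sketch applies ^.*: and ^.*: * to a shortened version of the question's example. The function names and sample strings are assumptions made for this illustration.

```python
import re

def drop_through_colon(text):
    """Remove everything up to and including the last colon on the line."""
    return re.sub(r"^.*:", "", text, count=1)

def drop_through_colon_and_spaces(text):
    """Same as above, but also remove any spaces right after that colon."""
    return re.sub(r"^.*: *", "", text, count=1)

sample = "Hello : Charlie wants to know"  # shortened version of the question's example

print(repr(drop_through_colon(sample)))             # ' Charlie wants to know'
print(repr(drop_through_colon_and_spaces(sample)))  # 'Charlie wants to know'

# A line without a colon does not match and is returned unchanged.
print(repr(drop_through_colon_and_spaces("No colon here")))  # 'No colon here'
```

Without the trailing " *", the space after the colon is left at the start of the result.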