I thought that $ matched the end of the string. However, the following code prints "testbbbccc", which surprised me: it means that $ actually matches the end of a line, not the end of the whole string.
#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;
int main()
{
tr1::regex r("aaa([^]*?)(ogr|$)");
string test("bbbaaatestbbbccc\nddd");
vector<int> captures;
captures.push_back(1);
const std::tr1::sregex_token_iterator end;
for (std::tr1::sregex_token_iterator iter(test.begin(), test.end(), r, captures); iter != end; )
{
const string t1 = iter->str(); // copy: str() returns a temporary, so a non-const reference must not bind to it
iter++;
cout << t1;
}
}
I have been trying to find a "multiline" switch (which is easy to find in PCRE), but without success. Can someone point me in the right direction?
Regards, R.P.
As TR1's regex library was based on Boost.Regex, try the following.
From the Boost.Regex documentation:
Anchors:
A '^' character shall match the start of a line when used as the first character of an expression, or the first character of a sub-expression.
A '$' character shall match the end of a line when used as the last character of an expression, or the last character of a sub-expression.
So the behavior you observed is correct.
Also from the Boost.Regex documentation:
\A  Matches at the start of a buffer only (the same as \`).
\z  Matches at the end of a buffer only (the same as \').
\Z  Matches an optional sequence of newlines at the end of a buffer: equivalent to the regular expression \n*\z
I hope that helps.
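If your implementation rejects \z (as far as I know the ECMAScript grammar used by std::regex does not define \A or \z, they are Boost escapes), a negative lookahead can stand in for "end of buffer". Here is a minimal sketch, assuming C++11 <regex> rather than tr1; the function name capture_after_aaa is made up for illustration and the pattern is adapted from the question:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text after "aaa" up to "ogr" or, failing
// that, up to the true end of the input (not the end of the first line).
std::string capture_after_aaa(const std::string& text)
{
    // [\s\S] matches any character, newlines included.
    // (?![\s\S]) succeeds only where no character follows, i.e. at the end
    // of the whole buffer, which is the job \z does in Boost.Regex.
    static const std::regex r("aaa([\\s\\S]*?)(?:ogr|(?![\\s\\S]))");

    std::smatch m;
    if (std::regex_search(text, m, r))
        return m[1].str();
    return std::string(); // no "aaa" found
}
```

Called on the string from the question, capture_after_aaa("bbbaaatestbbbccc\nddd") should return the capture including the part after the newline, because the lookahead does not treat a line break as an end.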
There is no multiline switch in TR1 regexes. It's not exactly the same, but you can get the same functionality by matching every character explicitly:
(.|\r|\n)*?
This non-greedily matches any character, including newline and carriage return.
Note: Remember to escape each backslash ('\' becomes '\\') if your pattern is a C++ string literal in code.
Note 2: If you don't want to capture the matched contents, put '?:' right after the opening parenthesis:
(?:.|\r|\n)*?
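To make the two notes concrete, here is a small sketch, assuming C++11 <regex> (std:: rather than std::tr1). The helper names are hypothetical, and "aaa" and "ogr" are just the markers from the question. The first function doubles the backslashes in an ordinary literal; the second writes the same pattern as a raw string literal, which needs no doubling:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns whatever sits between "aaa" and "ogr",
// even when the two markers are on different lines.
std::string between_markers(const std::string& text)
{
    // Ordinary literal: every regex backslash is doubled for the compiler.
    // The inner group is non-capturing, so group 1 is the whole span.
    static const std::regex escaped("aaa((?:.|\\r|\\n)*?)ogr");

    std::smatch m;
    if (std::regex_search(text, m, escaped))
        return m[1].str();
    return std::string(); // markers not found
}

// Same pattern written as a C++11 raw string literal.
std::string between_markers_raw(const std::string& text)
{
    static const std::regex raw(R"(aaa((?:.|\r|\n)*?)ogr)");

    std::smatch m;
    if (std::regex_search(text, m, raw))
        return m[1].str();
    return std::string(); // markers not found
}
```

Both functions should behave identically; the raw form is just easier to read once the pattern has more than a couple of escapes.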