I have some text in C#, and I need to match it with a regular expression and extract a value from it.
Sample texts:
var asunto1 = "ID P20101125_0003 -- Pendiente de autorización --";
var asunto2 = "ID P20101125_0003 any text any text";
var asunto3 = "ID_P20101125_0003 any text any text";
I need to get the value:
var peticion = "P20101125_0003";
I have this regular expression, but it fails for me:
//ID P20101125_0003 -- Pendiente de autorización --
patternPeticionEV.Append(@"^");
patternPeticionEV.Append(@"ID P");
patternPeticionEV.Append(@"(20[0-9][0-9])"); // yyyy
patternPeticionEV.Append(@"(0[1-9]|1[012])"); // MM
patternPeticionEV.Append(@"(0[1-9]|[12][0-9]|3[01])"); // dd
patternPeticionEV.Append(@"(_)");
patternPeticionEV.Append(@"\d{4}");
//patternPeticionEV.Append(@"*");
patternPeticionEV.Append(@"$");
if (System.Text.RegularExpressions.Regex.IsMatch(asuntoPeticionEV, exprRegular, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
{
    var match = System.Text.RegularExpressions.Regex.Match(asuntoPeticionEV, exprRegular, System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    //...
}
Your regular expression ends with "$", which says "the line/text has to end here". You don't want that, because the value is followed by more text. Just get rid of this line:
patternPeticionEV.Append(@"$");
and it will mostly work immediately. You then just need to add a capturing group to isolate the bit of text that you want.
I'd also recommend adding using System.Text.RegularExpressions; so that you don't have to fully qualify Regex each time. You can also call Match and then check for success, to avoid matching it twice.
Sample code:
using System;
using System.Text.RegularExpressions;

class Test
{
    static void Main()
    {
        DisplayMatch("ID P20101125_0003 -- Pendiente de autorización --");
        // No match due to _
        DisplayMatch("ID_P20101125_0003 any text any text");
    }

    static readonly Regex Pattern = new Regex
        ("^" +                         // Start of string
         "ID " +
         "(" +                         // Start of capturing group
         "P" +
         "(20[0-9][0-9])" +            // yyyy
         "(0[1-9]|1[012])" +           // MM
         "(0[1-9]|[12][0-9]|3[01])" +  // dd
         @"_\d{4}" +
         ")"                           // End of capturing group
        );

    static void DisplayMatch(string input)
    {
        Match match = Pattern.Match(input);
        if (match.Success)
        {
            Console.WriteLine("Matched: {0}", match.Groups[1]);
        }
        else
        {
            Console.WriteLine("No match");
        }
    }
}
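If the "ID_P..." form from the question should match as well, a character class for the separator is one way to allow it. The following is only a sketch under that assumption: the class name SeparatorSketch, the Extract helper and the group name "peticion" are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Text.RegularExpressions;

class SeparatorSketch
{
    // Hypothetical variant of the pattern: [ _] accepts either separator,
    // and the named group "peticion" holds the value to extract.
    static readonly Regex Pattern = new Regex(
        @"^ID[ _](?<peticion>P20[0-9]{2}(0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])_\d{4})");

    // Returns the captured value, or null when the input does not match.
    public static string Extract(string input)
    {
        Match match = Pattern.Match(input);
        return match.Success ? match.Groups["peticion"].Value : null;
    }

    static void Main()
    {
        // Both calls print P20101125_0003
        Console.WriteLine(Extract("ID P20101125_0003 -- Pendiente de autorización --"));
        Console.WriteLine(Extract("ID_P20101125_0003 any text any text"));
    }
}
```

A named group keeps the lookup readable if more groups are added to the pattern later, since the numeric index of the group would otherwise shift.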
This might be just me, but for things like parsing strings into meaningful values I prefer to do something more verbose, like this:
private bool TryParseContent(string text, out DateTime date, out int index)
{
    date = DateTime.MinValue;
    index = -1;
    // "ID P" (4) + yyyyMMdd (8) + "_" (1) + index (4) = 17 characters
    if (text.Length < 17)
        return false;
    string idPart = text.Substring(0, 4);
    if (idPart != "ID_P" && idPart != "ID P")
        return false;
    string datePart = text.Substring(4, 8);
    if (!DateTime.TryParseExact(datePart, "yyyyMMdd", System.Globalization.DateTimeFormatInfo.InvariantInfo, System.Globalization.DateTimeStyles.None, out date))
        return false;
    // TODO: do additional validation of the date
    // The date and the index must be separated by an underscore
    if (text[12] != '_')
        return false;
    string indexPart = text.Substring(13, 4);
    if (!int.TryParse(indexPart, out index))
        return false;
    return true;
}
Why not use Substring, like below?
var asunto1 = "ID P20101125_0003 -- Pendiente de autorización --";
var asunto2 = "ID P20101125_0003 any text any text";
var asunto3 = "ID_P20101125_0003 any text any text";
var peticion = asunto1.Substring(3,14); //gets P20101125_0003
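Note that Substring(3, 14) throws an ArgumentOutOfRangeException when the subject is shorter than 17 characters. A guarded version could look like the sketch below; the class SubstringSketch and the helper TryGetPeticion are hypothetical names, and the helper only checks the length and the prefix, not the format of the value itself.

```csharp
using System;

class SubstringSketch
{
    // Hypothetical helper: returns false instead of throwing when the subject
    // is too short or does not start with "ID " or "ID_".
    public static bool TryGetPeticion(string asunto, out string peticion)
    {
        peticion = null;
        if (asunto == null || asunto.Length < 17)
            return false;
        if (!asunto.StartsWith("ID ", StringComparison.Ordinal) &&
            !asunto.StartsWith("ID_", StringComparison.Ordinal))
            return false;
        peticion = asunto.Substring(3, 14);
        return true;
    }

    static void Main()
    {
        string peticion;
        if (TryGetPeticion("ID_P20101125_0003 any text any text", out peticion))
            Console.WriteLine(peticion); // prints P20101125_0003
    }
}
```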
This regex will give you the desired string in capturing group 1:
^ID[_ ](P[0-9_]+)
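One thing to watch with a pattern like this is the quantifier. A lazy +? at the very end of a pattern matches as little as possible, so it stops after one character, and without a capturing group the match also includes the "ID" prefix. The sketch below compares the two forms; the class QuantifierSketch and its Lazy and Greedy helpers are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

class QuantifierSketch
{
    // Lazy +? at the end of the pattern: stops after a single character.
    public static string Lazy(string asunto)
    {
        return Regex.Match(asunto, @"^ID[_ ]P[0-9_]+?").Value;
    }

    // Greedy + inside a capturing group: takes the whole run of digits and underscores.
    public static string Greedy(string asunto)
    {
        return Regex.Match(asunto, @"^ID[_ ](P[0-9_]+)").Groups[1].Value;
    }

    static void Main()
    {
        string asunto = "ID_P20101125_0003 any text any text";
        Console.WriteLine(Lazy(asunto));   // prints ID_P2
        Console.WriteLine(Greedy(asunto)); // prints P20101125_0003
    }
}
```

This looser pattern does not validate the date the way the stricter patterns in the other answers do; it accepts any run of digits and underscores after the P.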