I need to parse a text file with about 10,000 groupings like this:
group "C_BatTemp" -- block-group
{
block: "Constant"
flags: BLOCK|COLLAPSED
}
-- Skipping output Out1
p_untitled_P_real_T_0[1]
{
type: flt(64,IEEE)*
alias: "Value"
flags: PARAM
}
endgroup -- block-group "C_BatTemp"
The objects I expect the parser to fill look like this:
string Varname = "C_BatTemp";
string GroupType = "Constant";
string BaseAdressName = "p_untitled_P_real_T_0";
int AdressOffset = 1; // number in square brackets: p_untitled_P_real_T_0[1]
string VarType = "flt(64, IEEE)";
bool IsPointer = true; // true if VarType is "flt(64, IEEE)*" ,
//false if "flt(64, IEEE)"
string VarAlias = "Value";
What is the best way to parse this?
How should I start?
One solution is a regular expression. I quickly put one together; it works for your example but may fail for other inputs and will probably need some tuning, because it is tailored closely to the sample, especially with respect to line breaks and comments.
CODE
using System;
using System.Text.RegularExpressions;

String input =
@"group ""C_BatTemp"" -- block-group
{
block: ""Constant""
flags: BLOCK|COLLAPSED
}
-- Skipping output Out1
p_untitled_P_real_T_0[1]
{
type: flt(64,IEEE)*
alias: ""Value""
flags: PARAM
}
endgroup -- block-group ""C_BatTemp""";
String pattern = @"^group\W*""(?<varname>[^""]*)""[^{]*{\W*block:\W*""(?<grouptype>[^""]*)""[^}]*}$(\W*--.*$)*\W*(?<baseaddressname>[^[]*)\[(?<addressoffset>[^\]]*)][^{]*{\W*type:\W*(?<vartype>.*)$\W*alias:\W*""(?<alias>[^""]*)""[^}]*}\W*endgroup.*$";
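// Normalize Windows line endings first so that "$" anchors cleanly at line ends
// and the captured values carry no trailing "\r".
foreach (Match match in Regex.Matches(input.Replace("\r\n", "\n"), pattern, RegexOptions.Multiline))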
{
Console.WriteLine(match.Groups["varname"].Value);
Console.WriteLine(match.Groups["grouptype"].Value);
Console.WriteLine(match.Groups["baseaddressname"].Value);
Console.WriteLine(match.Groups["addressoffset"].Value);
Console.WriteLine(match.Groups["vartype"].Value);
Console.WriteLine(match.Groups["vartype"].Value.EndsWith("*"));
Console.WriteLine(match.Groups["alias"].Value);
}
OUTPUT
C_BatTemp
Constant
p_untitled_P_real_T_0
1
flt(64,IEEE)*
True
Value
I had to do something similar recently.
Break the data into records (would these be your 'groups'?), then extract each element you need from each record using regular expressions.
Without a clearer idea of the data I can't elaborate.
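For what it's worth, here is a minimal sketch of that record-based approach in C#, going only by the sample in the question. It assumes every record starts with a line beginning with group "..." and ends at the next line beginning with endgroup, and that each record contains exactly one name[offset] line. GroupInfo, GroupParser and the small patterns are hypothetical names made up for illustration; the field names are copied from the question. Storing VarType without the trailing * is also an assumption about what the question wants.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container; field names follow the question.
public class GroupInfo
{
    public string Varname = "";
    public string GroupType = "";
    public string BaseAdressName = "";
    public int AdressOffset;
    public string VarType = "";
    public bool IsPointer;
    public string VarAlias = "";
}

public static class GroupParser
{
    // One record: from a line starting with group "..." up to the next
    // line starting with endgroup (assumed layout, based on the sample).
    static readonly Regex Record = new Regex(
        @"^group\s+""(?<name>[^""]*)""(?<body>.*?)^endgroup\b",
        RegexOptions.Multiline | RegexOptions.Singleline);

    // Small, independent patterns, one per field.
    static readonly Regex Block = new Regex(@"block:\s*""(?<v>[^""]*)""");
    static readonly Regex Address = new Regex(
        @"^\s*(?<base>\w+)\[(?<off>\d+)\]\s*$", RegexOptions.Multiline);
    static readonly Regex TypeLine = new Regex(@"type:\s*(?<v>[^\r\n]*)");
    static readonly Regex Alias = new Regex(@"alias:\s*""(?<v>[^""]*)""");

    public static List<GroupInfo> Parse(string text)
    {
        var result = new List<GroupInfo>();
        foreach (Match rec in Record.Matches(text))
        {
            string body = rec.Groups["body"].Value;
            Match addr = Address.Match(body);
            string type = TypeLine.Match(body).Groups["v"].Value.Trim();
            bool isPointer = type.EndsWith("*", StringComparison.Ordinal);

            result.Add(new GroupInfo
            {
                Varname = rec.Groups["name"].Value,
                GroupType = Block.Match(body).Groups["v"].Value,
                BaseAdressName = addr.Groups["base"].Value,
                // Falls back to 0 when no name[offset] line is found.
                AdressOffset = addr.Success ? int.Parse(addr.Groups["off"].Value) : 0,
                // Assumption: the pointer marker is kept only in IsPointer.
                VarType = isPointer ? type.TrimEnd('*') : type,
                IsPointer = isPointer,
                VarAlias = Alias.Match(body).Groups["v"].Value
            });
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        string sample = @"group ""C_BatTemp"" -- block-group
{
block: ""Constant""
flags: BLOCK|COLLAPSED
}
-- Skipping output Out1
p_untitled_P_real_T_0[1]
{
type: flt(64,IEEE)*
alias: ""Value""
flags: PARAM
}
endgroup -- block-group ""C_BatTemp""";

        foreach (GroupInfo g in GroupParser.Parse(sample))
        {
            Console.WriteLine(g.Varname + " " + g.GroupType + " " + g.BaseAdressName
                + "[" + g.AdressOffset + "] " + g.VarType + " " + g.IsPointer + " " + g.VarAlias);
        }
    }
}
```

Splitting first and then using one small pattern per field means that a record with a missing or reordered field still yields the fields that are present (the others stay empty or 0), and each pattern can be adjusted without touching the rest.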