I have a string with nested groups like this (the 'bla' lines are arbitrary text within the string that must be ignored):
string Stream1 = @"group ""Main""
bla
bla
group ""Sub1"" -- block-group
var1
var2
endgroup -- block-group ""Sub1""
bla
bla
group ""Sub2"" -- block-group
var1
endgroup -- block-group ""Sub2""
bla
group ""Sub3"" -- block-group
var1
var2
var3
group ""SubSub31"" -- block-group
var10
var20
endgroup -- block-group ""SubSub31""
endgroup -- block-group ""Sub3""
endgroup";
The expected output is a list of GroupObjects like this:
public class GroupObject
{
public string GroupName = ""; // Example: SubSub31
public string GroupPath = ""; // Example: Main/Sub3/SubSub31
public List<VarBlock> LocalVar = new List<VarBlock>(); // Example: var10, var20
}
I guess some recursive regex would solve this, but I can't figure out how to write it.
Can someone give me a hint?
Sample code would be highly appreciated.
A recursive regular expression might solve the problem - but the complexity of it may be too high to easily maintain (and I speak as someone who once implemented and sold a Regular Expression engine).
I'm not going to give you a complete solution - but here's one way to solve the problem.
Your output object needs to change to allow for the nested groups, something like this:
public class Group
{
public string Name { get; set; }
public string GroupPath { get; set; }
public IEnumerable<VarBlock> Variables { get; }
public IEnumerable<Group> NestedGroups { get; }
}
(Note the use of properties instead of public fields.)
Assuming your input stream is a line-based format, create a function that divides the string into lines:
public Queue<string> GetLines(string definition) { ... }
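One possible body for this function, as a minimal sketch rather than the answer's actual implementation. The class name `DefinitionReader` is made up for illustration, and dropping blank lines and trimming whitespace are assumptions about what the parser will want:

```csharp
using System;
using System.Collections.Generic;

public static class DefinitionReader
{
    // Splits the definition into trimmed, non-empty lines, preserving order.
    // Assumption: blank lines carry no meaning in this format.
    public static Queue<string> GetLines(string definition)
    {
        var lines = new Queue<string>();
        foreach (var raw in definition.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None))
        {
            var line = raw.Trim();
            if (line.Length > 0)
                lines.Enqueue(line);
        }
        return lines;
    }
}
```

A `Queue<string>` suits this because the parser only ever looks at the next unread line and consumes it.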
Then, create a routine to parse a group:
public Group ParseGroup(Queue<string> lines) { ... }
- When this routine encounters the start of a group, it should recursively call itself to parse the nested group and then add the result to NestedGroups.
- When this routine encounters the end of a group, it should finish assembling the block, and return the object.
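The two rules above could be sketched roughly as follows. This is an illustration under several assumptions, not a finished parser: variables are stored as plain strings because `VarBlock` is never defined in the thread, a variable line is assumed to look like `var1` (everything else, such as `bla`, is skipped), and the `parentPath` parameter and the `GroupParser` class name are additions made here to build `GroupPath`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Group
{
    public string Name { get; set; }
    public string GroupPath { get; set; }
    // Assumption: plain strings stand in for the undefined VarBlock type.
    public List<string> Variables { get; } = new List<string>();
    public List<Group> NestedGroups { get; } = new List<Group>();
}

public static class GroupParser
{
    // Matches e.g.: group "Sub1" -- block-group
    static readonly Regex GroupStart = new Regex("^group\\s+\"([^\"]+)\"");
    // Assumption: variable lines look like "var1"; other lines are ignored.
    static readonly Regex Variable = new Regex("^var\\d+$");

    // Expects the next line in the queue to be a group header.
    public static Group ParseGroup(Queue<string> lines, string parentPath = "")
    {
        var header = GroupStart.Match(lines.Dequeue().Trim());
        if (!header.Success)
            throw new FormatException("Expected a group header.");

        var name = header.Groups[1].Value;
        var group = new Group
        {
            Name = name,
            GroupPath = parentPath.Length == 0 ? name : parentPath + "/" + name
        };

        while (lines.Count > 0)
        {
            var line = lines.Peek().Trim();
            if (line.StartsWith("endgroup", StringComparison.Ordinal))
            {
                lines.Dequeue();          // consume the terminator
                return group;             // this group is complete
            }
            if (GroupStart.IsMatch(line))
            {
                // Leave the header in the queue; the recursive call consumes it.
                group.NestedGroups.Add(ParseGroup(lines, group.GroupPath));
                continue;
            }
            lines.Dequeue();
            if (Variable.IsMatch(line))
                group.Variables.Add(line);
        }
        throw new FormatException("Missing endgroup for " + group.GroupPath);
    }
}
```

Called on the lines of the sample string from the question, this should return a `Main` group whose `NestedGroups` are `Sub1`, `Sub2` and `Sub3`, with `SubSub31` nested under `Sub3` at path `Main/Sub3/SubSub31`. If a flat list is needed, as in the question, the tree can be flattened afterwards with a simple recursive walk.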
Hope this is helpful.
I recommend ANTLR (http://www.antlr.org/) which has been developed for parsing a wide range of semi-structured documents. There's a book (The Definitive ANTLR Reference) which will get you off the ground. It's capable of providing complete parsers for languages such as Java and C#. You can include (Java) code in the parser which will allow you to process the results into the data structures you require.