I'm trying to parse a txt file that represents a grammar to be used in a recursive descent parser. The txt file would look something like this:
SPRIME ::= Expr eof
Expr ::= Term Expr'
Expr' ::= + Term Expr' | - Term Expr' | e
To isolate the left-hand side and split the right-hand side into separate production rules, I take each line and call:
String[] firstSplit = line.split("::=");
String LHS = firstSplit[0];
String[] productionRules = firstSplit[1].split("|");
However, when I call the second split method, I don't get back an array of the Strings separated by the "|" character, but an array of each individual character on the right-hand side, including "|". So for instance, if I were parsing the Expr' rule and printed the productionRules array, it would look like this:
"+"
"Term" "Expr'" "" "|"
When what I really want should look like this:
- Term Expr'
Anyone have any ideas what I'm doing wrong?
The parameter to String.split() is a regular expression, and the vertical bar character is special.
Try escaping it with a backslash:
String[] productionRules = firstSplit[1].split("\\|");
NB: two backslashes are required, since the backslash character itself is special within string literals.
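To make the difference concrete, here is a minimal sketch that splits the same right-hand side with the unescaped and the escaped pattern. The class name SplitDemo, its method names, and the sample string are made up for this illustration. The exact shape of the unescaped result can differ slightly between Java versions, so no output is shown for it.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the asker's parser.
class SplitDemo {
    // As a regex, "|" is an alternation of two empty patterns, so it matches
    // the empty string between characters and the input is cut apart there.
    static String[] splitUnescaped(String rhs) {
        return rhs.split("|");
    }

    // "\\|" in Java source is the regex \| which matches a literal pipe only.
    static String[] splitEscaped(String rhs) {
        return rhs.split("\\|");
    }

    public static void main(String[] args) {
        String rhs = "+ Term Expr' | - Term Expr' | e";
        System.out.println(Arrays.toString(splitUnescaped(rhs)));
        System.out.println(Arrays.toString(splitEscaped(rhs)));
    }
}
```

With the escaped pattern the sample string yields three pieces, each still carrying the spaces that surrounded the pipe, so a trim() on every piece is probably wanted afterwards.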
Since split takes a regex as its argument, you have to escape any regex metacharacter that you want matched literally.
You need to escape the pipe (|) symbol, which is the regex OR operator.
String[] productionRules = firstSplit[1].split("\\|");
or
String[] productionRules = firstSplit[1].split(Pattern.quote("|"));
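As a small sketch of the Pattern.quote route: it wraps the delimiter so the regex engine treats every character in it literally, which is handy when the delimiter is not a fixed string. The class QuoteDemo and the helper splitOnLiteral are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; only String.split and Pattern.quote are real API.
class QuoteDemo {
    // Splits text on the delimiter taken literally, whatever regex
    // metacharacters the delimiter happens to contain.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String rhs = "+ Term Expr' | - Term Expr' | e";
        String[] quoted = splitOnLiteral(rhs, "|");
        String[] escaped = rhs.split("\\|");
        // Both approaches should give the same pieces.
        System.out.println(Arrays.equals(quoted, escaped));
    }
}
```

For a single fixed character such as the pipe, the hand-written "\\|" is shorter. Pattern.quote mainly saves you from having to remember which characters need escaping.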
The pipe character is the regex operator for "or". What you want is
String[] productionRules = firstSplit[1].split("\\|");
which tells it to look for an actual pipe character.
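Putting the pieces together, one possible way to turn a grammar line into a left-hand side and a list of trimmed alternatives is sketched below. The class ProductionParser and its two methods are hypothetical, and the sketch assumes every line contains exactly one "::=" separator; a line without it would throw an ArrayIndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, assuming each input line has the form "LHS ::= alt | alt | ...".
class ProductionParser {
    // Returns the text before "::=", with surrounding whitespace removed.
    static String leftHandSide(String line) {
        return line.split("::=", 2)[0].trim();
    }

    // Returns the alternatives after "::=", split on literal pipes and trimmed.
    static List<String> alternatives(String line) {
        String rhs = line.split("::=", 2)[1];
        List<String> result = new ArrayList<>();
        for (String alternative : rhs.split("\\|")) {
            result.add(alternative.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "Expr' ::= + Term Expr' | - Term Expr' | e";
        System.out.println(leftHandSide(line));
        for (String alternative : alternatives(line)) {
            System.out.println(alternative);
        }
    }
}
```

For the Expr' line this gives the left-hand side Expr' and the three alternatives "+ Term Expr'", "- Term Expr'" and "e".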