Java String split not returning the right values

Developer https://www.devze.com 2023-02-25 05:49 Source: Internet
I'm trying to parse a txt file that represents a grammar to be used in a recursive descent parser. The txt file would look something like this:

SPRIME ::= Expr eof

Expr ::= Term Expr'

Expr' ::= + Term Expr' | - Term Expr' | e

To isolate the left-hand side and split the right-hand side into separate production rules, I take each line and call:

String[] firstSplit = line.split("::=");
String LHS = firstSplit[0];
String[] productionRules = firstSplit[1].split("|");

However, when I call the second split method, I don't get back an array of the Strings separated by the "|" character, but an array of each individual character on the right-hand side, including "|". So for instance, if I was parsing the Expr' rule and printed the productionRules array, it would look like this:

"+"

"Term"

"Expr'"

""

"|"

What I really want should look like this:

"+ Term Expr'"

"- Term Expr'"

"e"

Anyone have any ideas what I'm doing wrong?


The parameter to String.split() is a regular expression, and the vertical bar character is special.

Try escaping it with a backslash:

String[] productionRules = firstSplit[1].split("\\|");

NB: two backslashes are required, since the backslash character itself is special within string literals.
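To see the difference side by side, here is a minimal sketch. The class and method names (SplitPipeDemo, splitUnescaped, splitEscaped) are made up for illustration and are not from the original post. An unescaped "|" is an alternation between two empty patterns, so it matches the empty string between every pair of characters, which is why the input falls apart into single characters.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class SplitPipeDemo {

    // The regex "|" means "empty OR empty", so it matches at every position.
    static String[] splitUnescaped(String s) {
        return s.split("|");
    }

    // The regex "\\|" (backslash + pipe once the string literal is decoded)
    // matches a literal pipe character only.
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    public static void main(String[] args) {
        String rhs = "+ Term Expr' | - Term Expr' | e";

        // Prints one array element per character of rhs.
        System.out.println(Arrays.toString(splitUnescaped(rhs)));

        // Prints three elements, one per alternative (surrounding spaces kept).
        System.out.println(Arrays.toString(splitEscaped(rhs)));
    }
}
```

Note that the escaped version still keeps the spaces around each alternative; call trim() on each element if you don't want them.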


Since split takes a regex as its argument, you have to escape every regex metacharacter that you want matched literally.


You need to escape the pipe (|) symbol, which is the regex OR operator.

String[] productionRules = firstSplit[1].split("\\|");

or

String[] productionRules = firstSplit[1].split(Pattern.quote("|")); // needs import java.util.regex.Pattern
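Pattern.quote is handy when the delimiter is not a fixed literal, because it makes any string safe to use as a regex. A small sketch, assuming a helper named splitLiteral (a hypothetical name, not a JDK method):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
public class LiteralSplit {

    // Splits input on delimiter, treating the delimiter as plain text.
    // Pattern.quote wraps it in \Q...\E so no character in it is special.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "|" and "." are both regex metacharacters; quoting handles either.
        System.out.println(Arrays.toString(splitLiteral("x|y|z", "|")));
        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));
    }
}
```

For a single known character like the pipe, "\\|" and Pattern.quote("|") behave the same; quoting mainly pays off when the delimiter comes from a variable.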


The pipe character is the regex operator for "or". What you want is

String[] productionRules = firstSplit[1].split("\\|");

which tells it to look for an actual pipe character.
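Putting it together for one grammar line from the question, a rough sketch might look like the following. ProductionParser, leftHandSide and alternatives are names invented here, and the code assumes each line contains exactly one "::=" separator; "::=" itself needs no escaping because ':' and '=' are not regex metacharacters in this position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser sketch; assumes lines shaped like "LHS ::= alt | alt | ...".
public class ProductionParser {

    // Returns the trimmed text before "::=".
    static String leftHandSide(String line) {
        return line.split("::=", 2)[0].trim();
    }

    // Returns the trimmed alternatives found after "::=".
    static List<String> alternatives(String line) {
        String[] parts = line.split("::=", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No ::= in line: " + line);
        }
        List<String> result = new ArrayList<>();
        // Escaped pipe: split on the literal '|' between alternatives.
        for (String alternative : parts[1].split("\\|")) {
            result.add(alternative.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "Expr' ::= + Term Expr' | - Term Expr' | e";
        System.out.println(leftHandSide(line));
        System.out.println(alternatives(line));
    }
}
```

Trimming each piece removes the spaces that surround "::=" and "|" in the grammar file, so the alternatives come out as "+ Term Expr'", "- Term Expr'" and "e".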
