I am learning RegEx and am a complete newbie :P
I want to extract the numbers from the data below, where values are separated by commas only:
test
t,b
45,49
31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,
,
.
.,/;,jm.m.,,n ,sdsd, 3,2m54,2 4,2m,ar ,SSD A,,B,4D,CE,S4,D,2343ES,SD
Suppose I am getting the above data from a form text field. Now I want to read only the values that are numbers separated by commas.
The solution should be [string]:
45,49,31,34,38,34,56,23,3,23,23653,3875
All other data should be skipped. I tried something like this: `^[0-9]+\,$`
But it also selects the 7 from 3.7, the 5 from 8.5, etc.
Can anyone help me out in solving this?
Assuming you are already splitting at commas and checking whether the elements you get are numbers, use this expression: `^\d+(?:\.\d+)?$`, which means: "must begin with digits, optionally followed by a dot and at least one more digit".
This would match `31` as well as `7.8`, but not `2.`, ` 6 6 6 6 ` or `2m54`.
Here's a part by part explanation of that expression:
`^` means: matches must start at the first character
`$` means: matches must end at the last character, so both together mean the entire string must match
`\d+` means: one or more digits
`(?: ... )` is a non-capturing group that allows applying the `?` quantifier
`\.` means: the literal dot
`(?:\.\d+)?` thus means: zero or one occurrences of a dot followed by at least one digit
Edit: if you only want integer numbers, just remove the group: `^\d+$` -> the entire input must be one or more digits.
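A minimal sketch of the split-and-check approach, to make the difference between the two expressions concrete. The class and method names are made up for illustration, and splitting on line breaks as well as commas is an assumption about how the form input arrives. Elements are deliberately not trimmed, so ` 3` (with a leading space) is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SplitAndFilter {
    // Whole-element patterns from the answer above.
    static final Pattern INTEGER = Pattern.compile("^\\d+$");
    static final Pattern DECIMAL = Pattern.compile("^\\d+(?:\\.\\d+)?$");

    // Splits on commas and line breaks, keeps only elements the pattern fully matches.
    static List<String> keepMatching(String input, Pattern elementPattern) {
        List<String> result = new ArrayList<>();
        for (String element : input.split("[,\\n]")) {
            if (elementPattern.matcher(element).matches()) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "45,49\n31,,3.7,2., 6 6 ,2m54, 3";
        System.out.println(String.join(",", keepMatching(input, INTEGER))); // 45,49,31
        System.out.println(String.join(",", keepMatching(input, DECIMAL))); // 45,49,31,3.7
    }
}
```

If values with surrounding spaces should count as numbers, calling `trim()` on each element before matching would be one way to allow that.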
Edit 2: If you can prepend and append a comma to the input string (see Edit 4), you should be able to use this regex for getting all numbers: `(?<=,)\s*(\d+(?:\.\d+)?)\s*(?=,)` (integers only would require you to remove the `(?:\.\d+)?` part).
That expression gets all numbers between two commas, with possible whitespace between the commas and the number, and captures the number into a group. This should prevent matches of `6 6 6 6` or `2m54`. Then just iterate over the matches to get all the groups.
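As a rough illustration of the comma-padding idea for a single line of input, something like the following should work. `PaddedCommaMatch` and `numbersIn` are hypothetical names, and the sample line is a shortened version of the data in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PaddedCommaMatch {
    // Regex from Edit 2: a number with a comma on each side, optional whitespace around it.
    static final Pattern BETWEEN_COMMAS =
            Pattern.compile("(?<=,)\\s*(\\d+(?:\\.\\d+)?)\\s*(?=,)");

    static List<String> numbersIn(String line) {
        // Pad with commas so the first and last elements also sit between two commas.
        Matcher m = BETWEEN_COMMAS.matcher("," + line + ",");
        List<String> numbers = new ArrayList<>();
        while (m.find()) {
            numbers.add(m.group(1)); // group 1 is the number without surrounding whitespace
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(numbersIn("31,3.7,2., 6 6 ,2m54, 3")); // [31, 3.7, 3]
    }
}
```

Note that ` 3` is accepted here because the expression tolerates whitespace between the comma and the number.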
Edit 3: Here's an example with your input string.
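// imports needed by the snippet below
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "test\n" +
    "t,b\n" +
    "45,49\n" +
    "31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,\n" +
    ",\n" +
    ".\n" +
    ".,/;,jm.m.,,n ,sdsd, 3,2m54,2 4,2m,ar ,SSD A,,B,4D,CE,S4,D,2343ES,SD\n";
// a number must directly follow a comma or line break and be followed by one;
// whitespace around it is allowed, and group 1 captures the number itself
Pattern p = Pattern.compile( "(?<=,|\\n)\\s*(\\d+(?:\\.\\d+)?)\\s*(?=,|\\n)" );
Matcher m = p.matcher( input );
List<String> numbers = new ArrayList<String>();
while(m.find())
{
    numbers.add( m.group( 1 ) );
}
System.out.println(Arrays.toString( numbers.toArray() ));
//prints: [45, 49, 31, 34, 38, 34, 56, 23, 3, 23, 23653, 3875, 3.7, 8.5, 2.5, 7.8, 3]
//removing the fraction group: [45, 49, 31, 34, 38, 34, 56, 23, 3, 23, 23653, 3875, 3]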
Edit 4: actually, you don't need to add commas, just use this expression:
`(?<=,|\n|^)\s*(\d+)\s*(?=,|\n|$)`
The groups at the start and end mean the match must follow the start of the input, a comma or a line break and be followed by the end of the input, a comma or a line break.
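A small sketch of that last expression applied to input that has no leading or trailing comma. The class and method names are only illustrative, and the sample input is made up to exercise the start-of-input, line-break and end-of-input cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineAwareMatch {
    // Regex from Edit 4: integers delimited by start of input, comma, line break or end of input.
    static final Pattern INTEGER =
            Pattern.compile("(?<=,|\\n|^)\\s*(\\d+)\\s*(?=,|\\n|$)");

    static List<String> integersIn(String input) {
        Matcher m = INTEGER.matcher(input);
        List<String> numbers = new ArrayList<>();
        while (m.find()) {
            numbers.add(m.group(1)); // the digits only, whitespace excluded
        }
        return numbers;
    }

    public static void main(String[] args) {
        // 45 follows the start of input, 49 precedes a line break, 7 precedes the end of input.
        System.out.println(integersIn("45,49\n3.7, 3,2m54,7")); // [45, 49, 3, 7]
    }
}
```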
The shortest solution I could come up with would be to strip out anything that isn't a set of numbers separated by commas. So you could do `s.replaceAll("[^0-9]*,", ",")`, which collapses every run of non-digit characters ending in a comma into a single comma.
If you have random newlines in there, you will probably want to add in a `s.replaceAll("\n", ",")`. Then after those transformations, you can just do as suggested and split on commas.
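Here is a rough sketch of those two replacements followed by the split. The names are hypothetical, and replacing newlines first is an assumption about the intended order. The replacements leave pieces such as `3.7` or `2m` in place, so the sketch also checks each piece for digits only, as the other answer suggests; that last check is an addition, not part of the replacements themselves.

```java
import java.util.ArrayList;
import java.util.List;

class NormalizeThenSplit {
    // Turns line breaks into commas, then collapses "non-digits followed by a comma" into one comma.
    static String normalize(String s) {
        return s.replaceAll("\n", ",").replaceAll("[^0-9]*,", ",");
    }

    // Splits the normalized string and keeps only the all-digit pieces.
    static List<String> integersIn(String s) {
        List<String> result = new ArrayList<>();
        for (String piece : normalize(s).split(",")) {
            if (piece.matches("\\d+")) {
                result.add(piece);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(normalize("45,49\nab,,12"));         // 45,49,12
        System.out.println(integersIn("45,49\nab,,12,3.7,2m")); // [45, 49, 12]
    }
}
```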
This expression will give you all the numbers you need (only numbers, no commas):
"^\d+|(?<=,)\d+$|(?<=,)\d+(?=,)"
See the grep example:
kent$ echo "31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,
"|grep -oP "^\d+|(?<=,)\d+$|(?<=,)\d+(?=,)"
31
34
38
34
56
23
3
23
23653
3875
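For comparison, the same expression can be tried in Java. This is only a sketch with made-up names; it assumes `Pattern.MULTILINE` is the right way to mimic grep's line-by-line handling of `^` and `$`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GrepLikeMatch {
    // Same three alternatives as the grep call; MULTILINE makes ^ and $ apply per line.
    static final Pattern NUMBER = Pattern.compile(
            "^\\d+|(?<=,)\\d+$|(?<=,)\\d+(?=,)", Pattern.MULTILINE);

    static List<String> matchesIn(String text) {
        Matcher m = NUMBER.matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group()); // the whole match is already just the digits
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matchesIn("31,,3,3.7,2., 6 6 ,42")); // [31, 3, 42]
    }
}
```

One thing to keep in mind: the first alternative, `^\d+`, does not look at what follows the digits, so a line starting with `2m54` would still yield `2`.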