Regex to read fixed width numeric fields

Developer https://www.devze.com 2023-02-13 14:56 Source: web
I would like regex(es) that can parse right-justified numeric values in a fixed length field with optional leading whitespace. (This is essentially FORTRAN output but there are many other tools that do this). I know the width of the field.

Assume the field is an integer of width 5 (I5). Then the following are all conformant numeric values:

"  123"
"12345"
"-1234"
"   -1"

I can make no assumption about the previous and following fields. Thus the following is valid for I3,I5,I2:

"-121234512"

and yields the values -12, 12345 and 12.

There should be no additional code associated with the regex. I am using Java regex but I would like this to be fairly general (at least conformant with C#).

If this can be done for integers, I would also like the regex(es) for real numbers which include a decimal point, e.g. F10.3

"   -12.123"


The regex:

(?=[ ]*-?\d+)[ -\d]{5}

matches all of your examples:

"  123"
"12345"
"-1234"
"   -1"

And chaining them in groups:

((?=[ ]*-?\d+)[ -\d]{3})((?=[ ]*-?\d+)[ -\d]{5})((?=[ ]*-?\d+)[ -\d]{2})

on the input:

-121234512

matches:

$1 = -12
$2 = 12345
$3 = 12
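Since the question mentions Java, the chained groups can be tried with a minimal sketch like the one below. The class name `FixedWidthFields` and the helpers `field` and `split` are hypothetical names chosen for illustration. The hyphen is written last in the character class (`[ \d-]`) so that it is always read as a literal rather than as a range operator; that is a small departure from the pattern as written above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedWidthFields {
    // One capturing group for a field of the given width. The lookahead
    // requires optional spaces, an optional minus sign, then at least one digit.
    // The hyphen is last in the class so it is always a literal character.
    static String field(int width) {
        return "((?=[ ]*-?\\d+)[ \\d-]{" + width + "})";
    }

    // Splits a record laid out as I3,I5,I2; returns null if it does not match.
    static String[] split(String record) {
        Pattern p = Pattern.compile(field(3) + field(5) + field(2));
        Matcher m = p.matcher(record);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("-121234512");
        System.out.println(String.join("|", parts)); // prints -12|12345|12
    }
}
```

The groups still contain any leading spaces, so a caller that needs numbers would trim each group before converting it.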

A short explanation:

(?=        # start positive look ahead
  [ ]*     #   zero or more space
  -?       #   an optional minus sign
  \d+      #   one or more digits
)          # end positive look ahead
[ -\d]{5}  # spaces, minus sign or digits, exactly 5 times

As you can see, the lookahead forces the order of the characters (spaces before digits and/or minus sign, minus sign before digits).
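To see what the lookahead accepts and rejects, here is a small Java sketch that runs the I5 pattern against a few strings. The class name `LookaheadOrderDemo` and the method `isI5` are hypothetical, and the hyphen has been moved to the end of the character class so it is unambiguously a literal.

```java
import java.util.regex.Pattern;

public class LookaheadOrderDemo {
    // I5 pattern from the explanation above, with the hyphen placed last.
    static final Pattern I5 = Pattern.compile("(?=[ ]*-?\\d+)[ \\d-]{5}");

    // True when the whole string is accepted as a 5-character integer field.
    static boolean isI5(String s) {
        return I5.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "  123", "12345", "-1234", "   -1",
                             "    -", " - 12", "-  12" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isI5(s));
        }
    }
}
```

The first four samples are accepted and the last three are rejected, because no digit directly follows the optional minus sign. Note that the lookahead only inspects the leading characters: a string such as "12 -3" still passes, so this pattern is best suited to data that is already known to be well formed.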

And a version for your float example might look like:

(?=[ ]*-?\d+(\.\d+)?)[ -\d.]{10}
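A possible way to exercise the float pattern from Java is sketched below. `FixedWidthReal` and `parse` are hypothetical names, the hyphen is again written last in the character class, and the conversion with `trim()` and `Double.parseDouble` is extra code added only to show the matched value; it is not part of the regex itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedWidthReal {
    // F10.3-style field: same lookahead idea, with an optional fractional part.
    static final Pattern F10 =
        Pattern.compile("(?=[ ]*-?\\d+(\\.\\d+)?)[ \\d.-]{10}");

    // Returns the parsed value, or NaN when the text is not a usable field.
    static double parse(String field) {
        Matcher m = F10.matcher(field);
        if (!m.matches()) {
            return Double.NaN;
        }
        try {
            return Double.parseDouble(m.group().trim());
        } catch (NumberFormatException e) {
            // The character class is permissive, so the match may not be a number.
            return Double.NaN;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("   -12.123")); // prints -12.123
    }
}
```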


You can use the regex:

^(?= *-?[0-9]*$).{5}
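A quick Java check of this anchored pattern is sketched below; `AnchoredFieldDemo` and `isField` are hypothetical names. Because of the `^` and `$` anchors, the pattern applies to a field that has already been isolated, not to several fields chained in one record.

```java
import java.util.regex.Pattern;

public class AnchoredFieldDemo {
    // The anchored pattern above, unchanged.
    static final Pattern FIELD5 = Pattern.compile("^(?= *-?[0-9]*$).{5}");

    // True when the whole string is accepted as a standalone 5-character field.
    static boolean isField(String s) {
        return FIELD5.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "  123", "12345", "-1234", "   -1", "     ", "1 234" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isField(s));
        }
    }
}
```

Here the lookahead spans the whole field, so an embedded space as in "1 234" is rejected. An all-blank field is accepted, since `[0-9]*` allows zero digits.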
