IPv6 address into compressed form in Java

Developer https://www.devze.com 2023-03-28 06:22 Source: Internet
I used Inet6Address.getByName("2001:db8:0:0:0:0:2:1").toString() to compress an IPv6 address, but the output is 2001:db8:0:0:0:0:2:1, whereas I need 2001:db8::2:1. Basically, the compressed output should follow the RFC 5952 standard, that is:

  1. Shorten as Much as Possible: For example, 2001:db8:0:0:0:0:2:1 must be shortened to 2001:db8::2:1. Likewise, 2001:db8::0:1 is not acceptable, because the symbol "::" could have been used to produce a shorter representation, 2001:db8::1.

  2. Handling One 16-Bit 0 Field: The symbol "::" MUST NOT be used to shorten just one 16-bit 0 field. For example, the representation 2001:db8:0:1:1:1:1:1 is correct, but 2001:db8::1:1:1:1:1 is not correct.

  3. Choice in Placement of "::": When there is an alternative choice in the placement of a "::", the longest run of consecutive 16-bit 0 fields MUST be shortened (i.e., the sequence with three consecutive zero fields is shortened in 2001:0:0:1:0:0:0:1). When the lengths of the runs of consecutive 16-bit 0 fields are equal (i.e., 2001:db8:0:0:1:0:0:1), the first sequence of zero bits MUST be shortened. For example, 2001:db8::1:0:0:1 is the correct representation.

I have also checked another post on Stack Overflow, but it did not cover these conditions (for example, the choice in placement of "::").

Is there any Java library to handle this? Could anyone please help me?

Thanks in advance.


How about this?

String resultString = subjectString.replaceAll("((?::0\\b){2,}):?(?!\\S*\\b\\1:0\\b)(\\S*)", "::$2").replaceFirst("^0::","::");

Explanation without Java double-backslash hell:

(       # Match and capture in backreference 1:
 (?:    #  Match this group:
  :0    #  :0
  \b    #  word boundary
 ){2,}  # twice or more
)       # End of capturing group 1
:?      # Match a : if present (not at the end of the address)
(?!     # Now assert that we can't match the following here:
 \S*    #  Any non-space character sequence
 \b     #  word boundary
 \1     #  the previous match
 :0     #  followed by another :0
 \b     #  word boundary
)       # End of lookahead. This ensures that there is not a longer
        # sequence of ":0"s in this address.
(\S*)   # Capture the rest of the address in backreference 2.
        # This is necessary to jump over any sequences of ":0"s
        # that are of the same length as the first one.

Input:

2001:db8:0:0:0:0:2:1
2001:db8:0:1:1:1:1:1
2001:0:0:1:0:0:0:1
2001:db8:0:0:1:0:0:1
2001:db8:0:0:1:0:0:0

Output:

2001:db8::2:1
2001:db8:0:1:1:1:1:1
2001:0:0:1::1
2001:db8::1:0:0:1
2001:db8:0:0:1::

(I hope the last example is correct - or is there another rule if the address ends in 0?)
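
To try the one-liner above on the five sample inputs, it can be wrapped in a small harness. This is only a sketch: the class name RegexCompressDemo and the method name compress are made up for illustration, the pattern itself is copied unchanged from the answer, and the input is assumed to be a single address with leading zeros already stripped from each field.

```java
final class RegexCompressDemo {

    // Pattern copied verbatim from the answer: a run of two or more ":0"
    // fields that is not followed later by a longer run.
    private static final String ZERO_RUN =
            "((?::0\\b){2,}):?(?!\\S*\\b\\1:0\\b)(\\S*)";

    /** Applies the answer's two replacements to one address string. */
    static String compress(String address) {
        return address
                .replaceAll(ZERO_RUN, "::$2")
                .replaceFirst("^0::", "::");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "2001:db8:0:0:0:0:2:1",
            "2001:db8:0:1:1:1:1:1",
            "2001:0:0:1:0:0:0:1",
            "2001:db8:0:0:1:0:0:1",
            "2001:db8:0:0:1:0:0:0",
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + compress(input));
        }
    }
}
```

Running main prints each input next to its compressed form, which can be compared against the Output list above.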


I recently ran into the same problem and would like to (very slightly) improve on Tim's answer.

The following regular expression offers two advantages:

((?:(?:^|:)0+\\b){2,}):?(?!\\S*\\b\\1:0+\\b)(\\S*)

Firstly, it incorporates the change to match multiple zeroes. Secondly, it also correctly matches addresses where the longest chain of zeroes is at the beginning of the address (such as 0:0:0:0:0:0:0:1).
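
A quick way to see both changes in action is a small check like the one below. It is a sketch under assumptions: the class name LeadingRunDemo and the method name compress are hypothetical, and only the pattern comes from this answer. Note that the pattern collapses zero fields written with several digits, but it does not strip leading zeros from non-zero fields, so those still need to be normalized separately.

```java
final class LeadingRunDemo {

    // Pattern from this answer: zero fields may have several digits ("0+"),
    // and a run may start at the very beginning of the address ("^").
    private static final String ZERO_RUN =
            "((?:(?:^|:)0+\\b){2,}):?(?!\\S*\\b\\1:0+\\b)(\\S*)";

    /** Collapses the first longest run of zero fields into "::". */
    static String compress(String address) {
        return address.replaceAll(ZERO_RUN, "::$2");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "0:0:0:0:0:0:0:1",                  // run at the start of the address
            "2001:db8:0000:0000:0000:0000:2:1", // zero fields with several digits
            "0:0:1:0:0:0:1:1",                  // short leading run, longer run later
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + compress(input));
        }
    }
}
```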


Guava's InetAddresses class has toAddrString(), which formats an address according to RFC 5952.
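
If adding a dependency is not an option, the three rules quoted in the question can also be applied by hand with only the JDK. The following is a minimal sketch and not Guava's implementation: the class name Rfc5952Sketch and its compress methods are made up for illustration, and it assumes a plain 16-byte IPv6 address (IPv4-mapped literals, which the JDK parses into 4-byte addresses, are rejected, and mixed dotted notation is not produced).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class Rfc5952Sketch {

    /** Parses an IP literal with the JDK, then formats it with the rules from the question. */
    static String compress(String literal) {
        try {
            // For a literal address, getByName only validates the format; no DNS lookup is made.
            return compress(InetAddress.getByName(literal).getAddress());
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP literal: " + literal, e);
        }
    }

    /** Formats a raw 16-byte IPv6 address. */
    static String compress(byte[] addr) {
        if (addr.length != 16) {
            throw new IllegalArgumentException("Expected a 16-byte IPv6 address");
        }
        // Combine byte pairs into the eight 16-bit fields.
        int[] fields = new int[8];
        for (int i = 0; i < 8; i++) {
            fields[i] = ((addr[2 * i] & 0xff) << 8) | (addr[2 * i + 1] & 0xff);
        }
        // Rule 3: find the longest run of zero fields; the strict ">" keeps the first on ties.
        int bestStart = -1;
        int bestLen = 0;
        for (int i = 0; i < 8; ) {
            if (fields[i] != 0) {
                i++;
                continue;
            }
            int end = i;
            while (end < 8 && fields[end] == 0) {
                end++;
            }
            if (end - i > bestLen) {
                bestStart = i;
                bestLen = end - i;
            }
            i = end;
        }
        // Rule 2: a single zero field must not be replaced by "::".
        if (bestLen < 2) {
            bestStart = -1;
        }
        // Rule 1: emit lowercase hex without leading zeros, collapsing the chosen run.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < 8; ) {
            if (i == bestStart) {
                out.append("::");
                i += bestLen;
                continue;
            }
            if (out.length() > 0 && out.charAt(out.length() - 1) != ':') {
                out.append(':');
            }
            out.append(Integer.toHexString(fields[i]));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("2001:db8:0:0:0:0:2:1"));
    }
}
```

Working on the parsed bytes rather than on the text means the input may be in any valid notation (expanded, partly compressed, upper case) before it is reformatted.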


java-ipv6 is almost what you want. As of version 0.10 it does not check for the longest run of zeroes to shorten with :: - for instance 0:0:1:: is shortened to ::1:0:0:0:0:0. It is a very decent library for the handling of IPv6 addresses, though, and this problem should be fixed with version 0.11, such that the library is RFC 5952 compliant.


The open-source IPAddress Java library can do as described. It provides numerous ways of producing strings for IPv4 and/or IPv6, including the canonical string, which for IPv6 matches RFC 5952. Disclaimer: I am the project manager of that library.

Using the examples you list, sample code is:

    IPAddress addr = new IPAddressString("2001:db8:0:0:0:0:2:1").getAddress();
    System.out.println(addr.toCanonicalString());
    // 2001:db8::2:1
    addr = new IPAddressString("2001:db8:0:1:1:1:1:1").getAddress();
    System.out.println(addr.toCanonicalString());
    // 2001:db8:0:1:1:1:1:1
    addr = new IPAddressString("2001:0:0:1:0:0:0:1").getAddress();
    System.out.println(addr.toCanonicalString());
    // 2001:0:0:1::1
    addr = new IPAddressString("2001:db8:0:0:1:0:0:1").getAddress();
    System.out.println(addr.toCanonicalString());
    // 2001:db8::1:0:0:1


Not quite elegant, but this is my proposal (based on chrixm's work):

public static String shortIpv6Form(String fullIP) {
    // Strip leading zeros from each field (expects the fully expanded, 4-digit form).
    fullIP = fullIP.replaceAll("^0{1,3}", "");
    fullIP = fullIP.replaceAll("(:0{1,3})", ":");
    fullIP = fullIP.replaceAll("(0{4}:)", "0:");
    // Now we have the full form without unnecessary zeros.
    // Examples:
    // 0000:1200:0000:0000:0000:0000:0000:0000 -> 0:1200:0:0:0:0:0:0
    // 0000:0000:0000:1200:0000:0000:0000:8351 -> 0:0:0:1200:0:0:0:8351
    // 0000:125f:0000:94dd:e53f:0000:61a9:0000 -> 0:125f:0:94dd:e53f:0:61a9:0
    // 0000:005f:0000:94dd:0000:cfe7:0000:8351 -> 0:5f:0:94dd:0:cfe7:0:8351

    // Compress to short notation.
    fullIP = fullIP.replaceAll("((?:(?:^|:)0+\\b){2,}):?(?!\\S*\\b\\1:0+\\b)(\\S*)", "::$2");

    return fullIP;
}
Results:

    7469:125f:8eb6:94dd:e53f:cfe7:61a9:8351 -> 7469:125f:8eb6:94dd:e53f:cfe7:61a9:8351
    7469:125f:0000:0000:e53f:cfe7:0000:0000 -> 7469:125f::e53f:cfe7:0:0
    7469:125f:0000:0000:000f:c000:0000:0000 -> 7469:125f::f:c000:0:0
    7469:125f:0000:0000:000f:c000:0000:0000 -> 7469:125f::f:c000:0:0
    7469:0000:0000:94dd:0000:0000:0000:8351 -> 7469:0:0:94dd::8351
    0469:125f:8eb6:94dd:0000:cfe7:61a9:8351 -> 469:125f:8eb6:94dd:0:cfe7:61a9:8351
    0069:125f:8eb6:94dd:0000:cfe7:61a9:8351 -> 69:125f:8eb6:94dd:0:cfe7:61a9:8351
    0009:125f:8eb6:94dd:0000:cfe7:61a9:8351 -> 9:125f:8eb6:94dd:0:cfe7:61a9:8351
    0000:0000:8eb6:94dd:e53f:0007:6009:8350 -> ::8eb6:94dd:e53f:7:6009:8350
    0000:0000:8eb6:94dd:e53f:0007:6009:8300 -> ::8eb6:94dd:e53f:7:6009:8300
    0000:0000:8eb6:94dd:e53f:0007:6009:8000 -> ::8eb6:94dd:e53f:7:6009:8000
    7469:0000:0000:0000:e53f:0000:0000:8300 -> 7469::e53f:0:0:8300
    7009:100f:8eb6:94dd:e000:cfe7:6009:8351 -> 7009:100f:8eb6:94dd:e000:cfe7:6009:8351
    7469:100f:8006:900d:e53f:cfe7:61a9:8351 -> 7469:100f:8006:900d:e53f:cfe7:61a9:8351
    7000:1200:8e00:94dd:e53f:cfe7:0000:0001 -> 7000:1200:8e00:94dd:e53f:cfe7:0:1
    0000:0000:0000:0000:0000:0000:0000:0000 -> ::
    0000:0000:0000:94dd:0000:0000:0000:0000 -> 0:0:0:94dd::
    0000:1200:0000:0000:0000:0000:0000:0000 -> 0:1200::
    0000:0000:0000:1200:0000:0000:0000:8351 -> ::1200:0:0:0:8351
    0000:125f:0000:94dd:e53f:0000:61a9:0000 -> 0:125f:0:94dd:e53f:0:61a9:0
    7469:0000:8eb6:0000:e53f:0000:61a9:0000 -> 7469:0:8eb6:0:e53f:0:61a9:0
    0000:125f:0000:94dd:0000:cfe7:0000:8351 -> 0:125f:0:94dd:0:cfe7:0:8351
    0000:025f:0000:94dd:0000:cfe7:0000:8351 -> 0:25f:0:94dd:0:cfe7:0:8351
    0000:005f:0000:94dd:0000:cfe7:0000:8351 -> 0:5f:0:94dd:0:cfe7:0:8351
    0000:000f:0000:94dd:0000:cfe7:0000:8351 -> 0:f:0:94dd:0:cfe7:0:8351
    0000:0000:0000:0000:0000:0000:0000:0001 -> ::1


After performing some tests, I think the following captures all the different IPv6 scenarios:

"((?:(?::0|0:0?)\\b){2,}):?(?!\\S*\\b\\1:0\\b)(\\S*)" -> "::$2"