开发者

How to generate a random unique string with more than 2^30 combination. I also wanted to reverse the process. Is this possible?

开发者 https://www.devze.com 2023-02-16 19:26 出处:网络
I have a string which contains 3 elements: a 3 digit code (example: SIN, ABD, SMS, etc) a 1 digit code type (example: 1, 2, 3, etc)

I have a string which contains 3 elements:

  • a 3 digit code (example: SIN, ABD, SMS, etc)
  • a 1 digit code type (example: 1, 2, 3, etc)
  • a 3 digit number (example: 500, 123, 345)

Example string: SIN1500, ABD2123, SMS3345, etc..

I wanted to generate a UNIQUE 10 digit alphanumeric and case sensitive string (only 0-9/a-z/A-Z is allowed), with more than 2^30 (about 1 billion) unique combination per string supplied. The generated code must have a particular algorithm so that I can reverse the process. For example:

public static void main(String[] args) {
    String test = "ABD2123";
    String result = generateData(test);
    System.out.println(generateOutput(test)); //for example, the output of this is: 1jS8g4GDn0
    System.out.println(generateOutput(result)); //the output of this will be ABD2123 (the original string supplied)
}

What I wanted to ask is is there any ideas/examples/libraries in java that can do this? Or at least any hint on what keyword should I put on Google?

I tried googling using the keyword java checksum, rng, security, random number, etc and also tried looking at some random number solution (java SecureRandom, xorshift RNG, java.util.zip's checksum, etc) but I can't seem to find one?

Thanks!

EDIT:

My use case for this program is to generate some kind of unique voucher number to be used by specific customers.

The string supplied will contains 3 digit code for company ID, 1 digit code for voucher type, and a 3 digit number for the voucher nominal. I also tried adding 3 random alphanumeric (so the final digit is 7 + 3 digit = 10 digit). This is what I've done so far, but the result is not very good (only about 100 thousand combination):

public static String in ="somerandomstrings";
public static String out="someotherrandomstrings";

public static String encrypt(String kata)
throws Exception {
    String result="";
    String ina=in;
    String outa=out;
    Random ran = new Random();
    Integer modulus=in.length();
    Integer offset= ((Integer.parseInt(Utils.convertDateToString(new Date(), "SS")))+ran.nextInt(60))/2%modulus;

    result=ina.substring(offset, offset+1);
    ina=ina+ina;
    ina=ina.substring(offset, offset+modulus);
    result=result+translate(kata, ina, outa);

    return result;
}

EDIT:

I'm sorry I forgot to put the "translate" function :

    public static String translate(String kata,String  seq1, String  seq2){
    String result="";
    if(kata!=null&seq1!=null&seq2!=null){
        String[] a=kata.split("");
        for (int j = 1; j < a.length; j++) {
            String b=a[j];                      
            String[]seq1split=seq1.split("");
            String[]seq2split=seq2.split("");
            int hint=seq1.indexOf(b)+1;

            String sq="";
            if(seq1split.length>hint)
                sq=seq1split[hint];
            String sq1="";
            if(seq2split.length>hint)
                sq1=seq2split[hint];

            b=b.replace(sq, sq1);

            result=result+b;
        }
    }
    return result;
}

EDIT:

Okay, after a few days I'm currently struggling to convert the Ruby code provided by @sarnold, this is the code I've come up with :

public class Test {

static char START_A = "A".charAt(0);
static char START_a = "a".charAt(0);
static char START_0 = "0".charAt(0);
static int CODEMASK = ((1 << 28) - 1); //turn on lower 28 bits
static int RANDOMMASK = ((1 << 60) - 1) & ~ CODEMASK; //turn on upper 32 bits

public static void main(String[] args) {

    String[] test = new String[]{
            //"AAA0000", "SIN1500", "ABD2123", "SMS3345", "ZZZ9999",
            //"ABD2123", "ABD2123", "ABD2123", "ABD2123", "ABD2123"
            "ABD2123"
            };

    for(String t : test){
        long c = compress(t);
        long a = add_random(c);
        String output = to_output(a);
        long input = from_output(output);

        String[] new_c_r = remove_random(input);
        String u = uncompress(Long.valueOf(new_c_r[0]));

        System.out.println("Original input: " + t);
        System.out.println("    compressed: " + c);
        System.out.println("    after adding random amount: " + a);
        System.out.println("    *output: " + output);
        System.out.println("    *input: " + input);
        System.out.println("    random amount added: " + new_c_r[1]);
        System.out.println("    after removing random amount: " + new_c_r[0]);
        System.out.println("    uncompressed: " + u);
        System.out.println("-----------------------------------------------------------------");
    }

}

public static long compress(String line){ //7 character
    char a = line.charAt(0);
    char b = line.charAt(1);
    char c = line.charAt(2);
    char h = line.charAt(3);
    char i = line.charAt(4);
    char j = line.charAt(5);
    char k = line.charAt(6);

    long small_a = (long) a - START_A;
    long small_b = (long) b - START_A;
    long small_c = (long) c - START_A;
    long letters = (small_a * 26 * 26) + (small_b * 26) + small_c;
    long numbers = letters * 10000 + h * 1000 + i*100 + j*10 + k;
    return numbers;
}

public static String uncompress(long number){
    long k = number % 10;
    number /= 10;
    long j = number % 10;
    number /= 10;
    long i = number % 10;
    number /= 10;
    long h = number % 10;
    number /= 10;
    long small_c = number % 26;
    number /= 26;
    long small_b = number % 26;
    number /= 26;
    long small_a = number % 26;
    number /= 26;

    if (number != 0) throw new RuntimeException("input wasn't generated with compress()");

    long a = small_a + START_A;
    long b = small_b + START_A;
    long c = small_c + START_A;

    StringBuffer result = new StringBuffer();
    result.append((char) a).append((char) b).append((char) c).append(h).append(i).append(j).append(k);

    return result.toString();
}

public static long add_random(long number){
    return (((long) (Math.random()* Math.pow(2, 31))) << 28) + number;
}

public static String[] remove_random(long number){
    return new String[]{String.valueOf(number & CODEMASK), String.valueOf(number & RANDOMMASK)};
}

public static String to_output(long number){
    List<Character> a = new ArrayList<Character>();
    do{
        a.add(transform_out(number % 62));
        number /= 62;
    }while(number > 0);

    Collections.reverse(a);

    StringBuffer result = new StringBuffer();
    for(int i=0; i<a.size(); i++){
        Character s = (Character) a.get(i);
        result.append(s);
    }

    return result.toString();
}

public static long from_output(String string){
    long num = 0;
    for(char c : string.toCharArray()){
        num *= 62;
        num += transform_in(c);
    }
    return num;
}

public static char transform_out(long small){
    long out;

    if (small < 0 || small > 61){
        throw new RuntimeException("small should be between 0 and 61, inclusive");
    }
    if(small < 26){
        out = START_A + small;
    }else if(small < 52){
        out = START_a + (small-26);
    }else{
        out = START_0 + (small-52);
    }
    return (char) out;
}


public static long transform_in(char c){
    if(!String.valueOf(c).matches("[a-zA-Z0-9]")){ 
        throw new RuntimeException("char should be A-Z, a-z, or 0-9, inclusive");
    }
    long num = (long) c;

    long out;
    if(num >= START_A && num <= START_A+26) out = num-START_A;
    else if(num >= START_a && num <= START_a+26) out = (num-START_a) + 26;
    else if(num >= START_0 && num <= START_0+10) out = (num-START_0) + 52;
    else throw new RuntimeException("Salah, bego!");

    return out;
}}

but I can't seem to make this code right, the result is like this :

Original input: ABD2123
compressed: 345451
after adding random amount: 62781252268541291
*output: EnhZJdRFaj
*input: 62781252268541291
random amount added: 0
after removing random amount: 345451
uncompressed: ABI5451
开发者_如何学C

as you've probably noticed the "compressed" and the "uncompressed" field didn't show the same amount. The "random amount added" field is also always returning 0 value. Is there anyone who can help see where I'm wrong in this code?

I'm sorry if it's a mistake to put this code in this thread I will create another thread.

Thank You, Yusuf


Your requirements are very unclear. So I'm going to restrict my answer to generalities:

There are functions called cryptographic hash functions that will map from an arbitrary sequence of bytes to a "hash", with the property that the probability of any two related inputs giving the same output is vanishingly small. However, cryptographic hash functions are (by design) not reversible ... by design.

A mapping from one "space" of Strings to another that is reversible, can be implemented in two ways:

  • You can generate arbitrary Strings as the mapped Strings, and use data structures such as a hash tables to store the forward and reverse mappings.

  • You can use a cryptographic hash function to generate the mapped Strings, and use data structures such as a hash tables to store the reverse mappings.

  • You can use a reversible function to transform between the original and mapped Strings. This has the problem that the mapping is likely to be be easy to reverse engineer.

An example of the latter might be to turn the original String into bytes and then base64 encode it. Or even more trivially, you could insert a random character between each character in the input string. (Obviously, transformations like these can be reverse engineered in a few minutes ... given enough examples. So one has to doubt the wisdom of this approach.)

Without better requirements, it is unclear which (if any) of these approaches is what you need.


I've written a Ruby tool that does what you need. (Sorry it isn't Java, but my Java is over a decade old by now. But the general idea should port to Java without hassle.)

In short, the given input string (7 bytes) is compressed into a number between 0 and 175_760_000 (28 bits). A 32 bit random number is prepended, making a 60 bit integer, which is encoded into a 10-character output string.

My earlier math was wrong; since your input is only 28 bits long and your desired output is 60 bits long, that leaves 32 bits for adding roughly two billion random permutations. I mistyped when performing my calculations.

#!/usr/bin/ruby -w
START_A = "A"[0]
START_a = "a"[0]
START_0 = "0"[0]
CODEMASK = ((1 << 28) - 1) # turn on lower 28 bits
RANDOMMASK = ((1 << 60) - 1) & ~ CODEMASK # turn on upper 32 bits


def compress(line)
    a, b, c, h, i, j, k = line.chomp().each_char().entries()
    a, b, c = a[0], b[0], c[0]
    small_a, small_b, small_c = a - START_A, b - START_A, c - START_A
    letters = (small_a * 26**2) + (small_b * 26) + small_c
    h, i, j, k = Integer(h), Integer(i), Integer(j), Integer(k)
    number = letters * 10_000 + h*1000 + i*100 + j*10 + k
end

def uncompress(number)
    k = number % 10
    number /= 10
    j = number % 10
    number /= 10
    i = number % 10
    number /= 10
    h = number % 10
    number /= 10
    small_c = number % 26
    number /= 26
    small_b = number % 26
    number /= 26
    small_a = number % 26
    number /= 26
    if (number != 0)
            raise "input wasn't generated with compress()"
    end
    a, b, c = small_a + START_A, small_b + START_A, small_c + START_A
    [a.chr(), b.chr(), c.chr(), h.to_s(), i.to_s(), j.to_s(), k.to_s()].join()
end

def add_random(number)
    (rand(2**31) << 28) + number
end 

def remove_random(number)
    [number & CODEMASK, number & RANDOMMASK]
end

def to_output(number)
    a = []
    begin
            a << transform_out(number % 62)
            number /= 62
    end while(number > 0)
    a.reverse().join()
end

def from_output(string)
    num = 0
    string.each_char() do |c|
            num *= 62
            num += transform_in(c)
    end     
    num
end

def transform_out(small)
    if (small < 0 || small > 61)
            raise "small should be between 0 and 61, inclusive"
    end
    if (small < 26)
            out = START_A+small
    elsif (small < 52)
            out = START_a+(small-26)
    else
            out = START_0+(small-52)
    end
    out.chr()
end

def transform_in(char)
    if (/^[A-Za-z0-9]$/ !~ char)
            raise "char should be A-Z, a-z, or 0-9, inclusive"
    end
    num = char[0]
    out = case num
    when START_A .. START_A+26 then num - START_A
    when START_a .. START_a+26 then (num - START_a) + 26
    when START_0 .. START_0+10 then (num - START_0) + 52
    end

    out
end


begin
    while(line = DATA.readline()) do
            line.chomp!()
            c = compress(line)
            a = add_random(c)
            output = to_output(a)
            input = from_output(output)
            new_c, r = remove_random(input)
            u = uncompress(new_c)

            printf("original input: %s\n compressed: %d\n after adding random amount: %d\n *output: %s\n *input: %s\n random amount added: %d\n after removing random amount: %d\nuncompressed: %s\n", line, c, a, output, input, r, new_c, u)
    end
rescue EOFError => e
end


__END__
AAA0000
SIN1500
ABD2123
SMS3345
ZZZ9999

Running the program with the five inputs given at the bottom results in this output:

$ ./compress.rb
original input: AAA0000
 compressed: 0
 after adding random amount: 508360097408221184
 *output: liSQkzXL1G
 *input: 508360097408221184
 random amount added: 508360097408221184
 after removing random amount: 0
uncompressed: AAA0000
original input: SIN1500
 compressed: 123891500
 after adding random amount: 421470683267231532
 *output: fIVFtX9at2
 *input: 421470683267231532
 random amount added: 421470683143340032
 after removing random amount: 123891500
uncompressed: SIN1500
original input: ABD2123
 compressed: 292123
 after adding random amount: 414507907112269083
 *output: emb6JfDhUH
 *input: 414507907112269083
 random amount added: 414507907111976960
 after removing random amount: 292123
uncompressed: ABD2123
original input: SMS3345
 compressed: 124983345
 after adding random amount: 383242064398325809
 *output: cTPpccLAvn
 *input: 383242064398325809
 random amount added: 383242064273342464
 after removing random amount: 124983345
uncompressed: SMS3345
original input: ZZZ9999
 compressed: 175759999
 after adding random amount: 27149937199932031
 *output: CAVf14tiRh
 *input: 27149937199932031
 random amount added: 27149937024172032
 after removing random amount: 175759999
uncompressed: ZZZ9999

The lines you're really looking for are original input, *output, and uncompressed. Your client has the original input lines, after using the to_output(add_random(compress(input))) you will get the ten-character A-Za-z0-9 output in the *output line. You hand that to users, and it's a magic token of some sort. Then when it is time to validate them, you use remove_random(from_output(user_string)) to discover both the random value added to the string and an integer that you can hand to uncompress().

One very important note: The input AAA0000 is stored in plaintext in the lower 28 bits. The random number is stored in plaintext in the upper 32 bits. This is simply an obfuscation of the original inputs, it wouldn't be hard for someone to discover the pattern if they have two inputs and outputs. Heck, they might even correctly guess the algorithm given only a single input and output.

If you need this to be cryptographically strong, then you've still got some work ahead of you :) but that might be as easy as XORing the intermediate 60 bit number with rc4 output of the username or something like that.

Short Explanation

Your input strings can be interpreted as integers, but with a twist: the first three 'digits' are in base 26, the next four digits are in base 10. The number will be less than 175_760_000. Uniquely storing numbers between 0 and 175_760_000 requires 28 bits. Your output strings are also a number, ten digits long, with each 'digit' in base 62. (Think base64 but without -, /, or = (for padding).) 62 possible values and ten positions gives you a maximum value of 839_299_365_868_340_224, which can be represented in 60 bits.

The input string only takes 28 bits to represent, your output string has 60 bits available, and that leaves 32 bits available for storing a randomly-generated number. If we multiply a 32-bit number by 2^28 (the same as a left-shift by 28: 1 << 28) then the lower 28 bits will be free for the originally input number.

Once we've calculated the 60 bit number, we output it in our base 62 notation for human consumption.

To reverse the process, you decode the base 62 number to get our 60 bit number; split the 60 bit number into the lower 28 bit input number and upper 32 bit random number, and then output the 28 bit number in the original format: three base 26 digits followed by four base 10 digits.

FINAL UPDATE

Yusuf, excellent work converting my Ruby to Java. I'm very impressed, especially given how good your Java version looks: your version is more legible. I'm jealous and impressed. :)

I found the two small bugs that remained in your program: RANDOMMASK was accidentally initialized to 0, because Java doesn't promote all operands into the final data type when executing arithmetic shift statements. Changing 1 to 1L forces the result of 1L << 60 to be a long; without the L, the result of 1 << 60 is an int, and isn't large enough to hold the full number.

Further, the digits weren't being compressed correctly; my Ruby code parsed the characters as an integer, and your Java code interpreted the characters as an integer. (Yours used the character's value; mine converted the character to an integer based on the ascii meaning of the character. Mine wasn't really parsing, since it is just doing a subtraction, but if it were a string, String.toInteger(character) would do the same thing, so it is a lot like parsing.)

But your uncompress logic was correct, and because of the mismatch, the output was incorrect. So I changed your code to parse the digits into integers (and changed from char to int to silence a pointless warning).

Here's a diff of what I had to change in your program to make it work:

--- Compress.java.orig  2011-03-25 16:57:47.000000000 -0700
+++ Compress.java   2011-03-25 17:09:42.000000000 -0700
@@ -1,12 +1,12 @@
-import java.util.*
+import java.util.*;

 public class Compress {

 static char START_A = "A".charAt(0);
 static char START_a = "a".charAt(0);
 static char START_0 = "0".charAt(0);
-static int CODEMASK = ((1 << 28) - 1); //turn on lower 28 bits
-static int RANDOMMASK = ((1 << 60) - 1) & ~ CODEMASK; //turn on upper 32 bits
+static long CODEMASK = ((1 << 28) - 1); //turn on lower 28 bits
+static long RANDOMMASK = ((1L << 60) - 1) & ~ CODEMASK; //turn on upper 32 bits

 public static void main(String[] args) {

@@ -42,10 +42,10 @@
     char a = line.charAt(0);
     char b = line.charAt(1);
     char c = line.charAt(2);
-    char h = line.charAt(3);
-    char i = line.charAt(4);
-    char j = line.charAt(5);
-    char k = line.charAt(6);
+    int h = line.charAt(3) - START_0;
+    int i = line.charAt(4) - START_0;
+    int j = line.charAt(5) - START_0;
+    int k = line.charAt(6) - START_0;

     long small_a = (long) a - START_A;
     long small_b = (long) b - START_A;

And now the full source, just in case that's easier :)

import java.util.*;

public class Compress {

static char START_A = "A".charAt(0);
static char START_a = "a".charAt(0);
static char START_0 = "0".charAt(0);
static long CODEMASK = ((1 << 28) - 1); //turn on lower 28 bits
static long RANDOMMASK = ((1L << 60) - 1) & ~ CODEMASK; //turn on upper 32 bits

public static void main(String[] args) {

    String[] test = new String[]{
            //"AAA0000", "SIN1500", "ABD2123", "SMS3345", "ZZZ9999",
            //"ABD2123", "ABD2123", "ABD2123", "ABD2123", "ABD2123"
            "ABD2123"
            };

    for(String t : test){
        long c = compress(t);
        long a = add_random(c);
        String output = to_output(a);
        long input = from_output(output);

        String[] new_c_r = remove_random(input);
        String u = uncompress(Long.valueOf(new_c_r[0]));

        System.out.println("Original input: " + t);
        System.out.println("    compressed: " + c);
        System.out.println("    after adding random amount: " + a);
        System.out.println("    *output: " + output);
        System.out.println("    *input: " + input);
        System.out.println("    random amount added: " + new_c_r[1]);
        System.out.println("    after removing random amount: " + new_c_r[0]);
        System.out.println("    uncompressed: " + u);
        System.out.println("-----------------------------------------------------------------");
    }

}

public static long compress(String line){ //7 character
    char a = line.charAt(0);
    char b = line.charAt(1);
    char c = line.charAt(2);
    int h = line.charAt(3) - START_0;
    int i = line.charAt(4) - START_0;
    int j = line.charAt(5) - START_0;
    int k = line.charAt(6) - START_0;

    long small_a = (long) a - START_A;
    long small_b = (long) b - START_A;
    long small_c = (long) c - START_A;
    long letters = (small_a * 26 * 26) + (small_b * 26) + small_c;
    long numbers = letters * 10000 + h * 1000 + i*100 + j*10 + k;
    return numbers;
}

public static String uncompress(long number){
    long k = number % 10;
    number /= 10;
    long j = number % 10;
    number /= 10;
    long i = number % 10;
    number /= 10;
    long h = number % 10;
    number /= 10;
    long small_c = number % 26;
    number /= 26;
    long small_b = number % 26;
    number /= 26;
    long small_a = number % 26;
    number /= 26;

    if (number != 0) throw new RuntimeException("input wasn't generated with compress()");

    long a = small_a + START_A;
    long b = small_b + START_A;
    long c = small_c + START_A;

    StringBuffer result = new StringBuffer();
    result.append((char) a).append((char) b).append((char) c).append(h).append(i).append(j).append(k);

    return result.toString();
}

public static long add_random(long number){
    return (((long) (Math.random()* Math.pow(2, 31))) << 28) + number;
}

public static String[] remove_random(long number){
    return new String[]{String.valueOf(number & CODEMASK), String.valueOf(number & RANDOMMASK)};
}

public static String to_output(long number){
    List<Character> a = new ArrayList<Character>();
    do{
        a.add(transform_out(number % 62));
        number /= 62;
    }while(number > 0);

    Collections.reverse(a);

    StringBuffer result = new StringBuffer();
    for(int i=0; i<a.size(); i++){
        Character s = (Character) a.get(i);
        result.append(s);
    }

    return result.toString();
}

public static long from_output(String string){
    long num = 0;
    for(char c : string.toCharArray()){
        num *= 62;
        num += transform_in(c);
    }
    return num;
}

public static char transform_out(long small){
    long out;

    if (small < 0 || small > 61){
        throw new RuntimeException("small should be between 0 and 61, inclusive");
    }
    if(small < 26){
        out = START_A + small;
    }else if(small < 52){
        out = START_a + (small-26);
    }else{
        out = START_0 + (small-52);
    }
    return (char) out;
}


public static long transform_in(char c){
    if(!String.valueOf(c).matches("[a-zA-Z0-9]")){ 
        throw new RuntimeException("char should be A-Z, a-z, or 0-9, inclusive");
    }
    long num = (long) c;

    long out;
    if(num >= START_A && num <= START_A+26) out = num-START_A;
    else if(num >= START_a && num <= START_a+26) out = (num-START_a) + 26;
    else if(num >= START_0 && num <= START_0+10) out = (num-START_0) + 52;
    else throw new RuntimeException("Salah, bego!");

    return out;
}}
0

精彩评论

暂无评论...
验证码 换一张
取 消