Ruby: Fuzzing through all unicode characters (UTF8/Encoding/String Manipulation)

Developer https://www.devze.com 2023-01-31 06:03 Source: web

I can't iterate over the entire range of unicode characters.

I searched everywhere...

I am building a fuzzer and want to embed every Unicode character into a URL, one at a time.

For example: http://www.example.com?a=\uff1c

I know there are some existing tools, but I need more flexibility.

If I could do something like "\u" + "ff1c", that would be great.

This is the closest I got:

char = "\u0000"
...
# within iteration
char.succ!
...

But after the character "\u0039", which is the digit 9, I get "10" instead of ":".
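
For reference, the jump from "9" to "10" happens because String#succ increments digits and letters like a counter instead of moving to the next codepoint. Below is a minimal sketch of stepping by codepoint, assuming Ruby 1.9 or later. The helper names next_char and char_from_hex are made up for illustration and are not part of the original question.

```ruby
# String#succ rolls digits over like a counter, so "9" becomes "10".
puts "9".succ

# Hypothetical helper: step to the next codepoint instead.
# Integer#chr raises RangeError if the result is not valid in the target encoding.
def next_char(char)
  (char.ord + 1).chr(Encoding::UTF_8)
end

# Hypothetical helper: the runtime equivalent of "\u" + "ff1c".
# Parses the hex digits, then encodes that codepoint as UTF-8.
def char_from_hex(hex)
  hex.hex.chr(Encoding::UTF_8)
end

puts next_char("9")
puts char_from_hex("ff1c")
```

Note that "\u" + "ff1c" cannot work as written, because the \u escape is resolved when the string literal is parsed, not when the strings are joined.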

You could use Array#pack to convert numbers to UTF-8 characters, but I'm not sure whether this solves your problem.

You can either create an array with the numeric values of all the characters and call pack once to get a UTF-8 string, or loop from 0 to whatever upper bound you need and call pack inside the loop.

I've written a small example to explain myself. The code below prints out the hex value of each character followed by the character itself.

0.upto(100) do |i|
    puts "%04x" % i + ": " + [i].pack("U*")
end
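
To tie this back to the fuzzer, here is a rough sketch of how the same pack call could produce one URL per character. It assumes Ruby 1.9 or later and the example URL from the question. BASE_URL, SURROGATES and fuzz_url are names invented for this sketch, and percent-encoding the character with URI.encode_www_form_component is my assumption about what the target expects.

```ruby
require "uri"

# Assumed base URL, taken from the example in the question.
BASE_URL = "http://www.example.com"

# UTF-16 surrogates are not valid standalone characters in UTF-8, so skip them.
SURROGATES = (0xD800..0xDFFF)

# Hypothetical helper: builds the URL for a single codepoint.
def fuzz_url(codepoint)
  char = [codepoint].pack("U") # one UTF-8 encoded character
  "#{BASE_URL}?a=#{URI.encode_www_form_component(char)}"
end

# A small slice for demonstration; use 0..0x10FFFF for the full Unicode range.
(0xFF1A..0xFF1C).each do |cp|
  next if SURROGATES.cover?(cp)

  puts format("U+%04X %s", cp, fuzz_url(cp))
end
```

If the fuzzer needs the raw, unescaped character in the URL, drop the encoding call and interpolate char directly.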

Here's some simpler code, albeit slightly obfuscated, that takes advantage of the fact that Ruby converts an integer on the right-hand side of the << operator to a codepoint. In Ruby 1.8 this only works for integer values <= 255; in 1.9 it also works for values greater than 255.
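
The answer's own code did not survive on this page, so the following is only a sketch of the << behavior it describes, not the original snippet. It assumes Ruby 2.3 or later (for the encoding: keyword of String.new), and the method name codepoints_to_string is made up.

```ruby
# Hypothetical helper: appends each integer codepoint to a string with <<.
def codepoints_to_string(codepoints)
  # Start from a mutable, explicitly UTF-8 string so values above 255 are accepted.
  buffer = String.new(encoding: Encoding::UTF_8)
  # String#<< with an Integer argument appends the character for that codepoint.
  codepoints.each { |cp| buffer << cp }
  buffer
end

puts codepoints_to_string([0x39, 0x3A, 0xFF1C])
```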