Slice a binary number into groups of five digits

Developer https://www.devze.com 2023-01-21 22:18 Source: Internet
Is there any neat trick to slice a binary number into groups of five digits in python?

'00010100011011101101110100010111' => ['00010', '10001', '10111', ... ]

Edit: I want to write a cipher/encoder in order to generate "easy to read over the phone" tokens. The standard base32 encoding has the following disadvantages:

  • Potential to generate accidental f*words
  • Uses confusing characters like 'I', 'L', 'O' (may be confused with 0 and 1)
  • Easy to guess sequences ("AAAA", "AAAB", ...)

I was able to roll my own in 20 lines of python, thanks everybody. My encoder leaves off 'I', 'L', 'O' and 'U', and the resulting sequences are hard to guess.
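The asker's 20-line encoder is not shown. As a rough sketch of the idea only, the following maps each 5-bit group to a symbol from a 32-character alphabet that omits 'I', 'L', 'O' and 'U'. The names `ALPHABET` and `encode_bits` are hypothetical, the right-padding with zeros is an assumption, and nothing here makes the sequences hard to guess.

```python
# Hypothetical alphabet: digits plus uppercase letters without I, L, O, U (32 symbols).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_bits(bits, alphabet=ALPHABET):
    """Map each 5-bit group of a '0'/'1' string to one alphabet symbol."""
    # Assumption: pad on the right with zeros up to a multiple of 5 bits.
    padded = bits + "0" * (-len(bits) % 5)
    groups = [padded[i:i + 5] for i in range(0, len(padded), 5)]
    # int(g, 2) turns a group such as '10001' into the index 17.
    return "".join(alphabet[int(g, 2)] for g in groups)

token = encode_bits('00010100011011101101110100010111')
print(token)  # 2HQDT5R
```

A decoder would need to know the original bit length (or a padding convention) to strip the padding bits again.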


>>> a='00010100011011101101110100010111'
>>> [a[i:i+5] for i in range(0, len(a), 5)]
['00010', '10001', '10111', '01101', '11010', '00101', '11']


>>> s = '00010100011011101101110100010111'
>>> [''.join(each) for each in zip(*[iter(s)]*5)]
['00010', '10001', '10111', '01101', '11010', '00101']

or:

>>> list(map(''.join, zip(*[iter(s)]*5)))  # list() is needed on Python 3
['00010', '10001', '10111', '01101', '11010', '00101']

[EDIT]

Greg Hewgill raised the question of what to do with the two trailing bits. Here are some possibilities:

>>> from itertools import zip_longest  # named izip_longest on Python 2
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue='')))
['00010', '10001', '10111', '01101', '11010', '00101', '11']
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue=' ')))
['00010', '10001', '10111', '01101', '11010', '00101', '11   ']
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue='0')))
['00010', '10001', '10111', '01101', '11010', '00101', '11000']


Another way to group iterables, from the itertools examples:

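from itertools import zip_longest  # named izip_longest on Python 2

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    # n references to the same iterator, so each tuple takes n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)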
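Applied to the string from the question, the recipe could be used roughly like this. This is a Python 3 sketch (so `zip_longest` rather than `izip_longest`), and filling the last group with '0' is just one of the options discussed above.

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    # The same iterator repeated n times: each output tuple takes n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

s = '00010100011011101101110100010111'
# Each group is a tuple of single characters, so join it back into a string.
groups = [''.join(g) for g in grouper(5, s, fillvalue='0')]
print(groups)
# ['00010', '10001', '10111', '01101', '11010', '00101', '11000']
```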


Per your comments, you actually want base 32 strings.

>>> import base64
>>> base64.b32encode(b"good stuff")  # Python 3 requires bytes
b'M5XW6ZBAON2HKZTG'
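If the starting point is a bit string rather than text, one possible route (a sketch, assuming the bit string's length is a multiple of 8) is to turn it into bytes first and then hand those to `base64.b32encode`. The helper name `bits_to_b32` is made up for this example.

```python
import base64

def bits_to_b32(bits):
    """Encode a '0'/'1' string as standard base32 (assumes len(bits) % 8 == 0)."""
    # Interpret the whole string as one big-endian integer, then split into bytes.
    raw = int(bits, 2).to_bytes(len(bits) // 8, 'big')
    return base64.b32encode(raw)

encoded = bits_to_b32('00010100011011101101110100010111')
print(encoded)  # b'CRXN2FY='
```

Note that the standard alphabet still contains 'I', 'L' and 'O', which is the disadvantage the question mentions.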


How about using a regular expression?

>>> import re
>>> re.findall('.{1,5}', '00010100011011101101110100010111')
['00010', '10001', '10111', '01101', '11010', '00101', '11']

Note that '.' does not match newlines by default, so this breaks if your input string contains newlines that you want included in the groups; passing re.DOTALL avoids that.
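A small illustration of the newline caveat, using a made-up input string: without a flag the newline is skipped and splits a group early, while `re.DOTALL` lets `.` match it.

```python
import re

text = '0001\n100011'  # hypothetical input containing a newline

# Default: '.' stops at the newline, so the first group is cut short.
plain = re.findall('.{1,5}', text)
print(plain)   # ['0001', '10001', '1']

# With re.DOTALL the newline counts as an ordinary character in the group.
dotall = re.findall('.{1,5}', text, re.DOTALL)
print(dotall)  # ['0001\n', '10001', '1']
```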


My question was marked as a duplicate of this one, so I will answer it here.

I have a more general and memory-efficient answer for this kind of question, using generators:

from itertools import islice

def slice_generator(an_iter, num):
    an_iter = iter(an_iter)
    while True:
        # take the next num items; the tuple is shorter (or empty) near the end
        result = tuple(islice(an_iter, num))
        if not result:
            return
        yield result

So for this question, we can do:

>>> l = '00010100011011101101110100010111'
>>> [''.join(x) for x in slice_generator(l,5)]
['00010', '10001', '10111', '01101', '11010', '00101', '11']


>>> l = '00010100011011101101110100010111'
>>> def splitSize(s, size):
...     return [''.join(x) for x in zip(*[list(s[t::size]) for t in range(size)])]
...  
>>> splitSize(l, 5)
['00010', '10001', '10111', '01101', '11010', '00101']