Just for fun: binary to ascii in Python

0110100001110100011101000111000000111010001011110010111101111000011010110110001101100100001011100110001101101111011011010010111100110101001100010011100000101111

Seriously, you really want to know what’s behind this!

BuHa.info’s New Year’s Challenge made me feel I had to quickly write a Python script that decodes ones and zeros to text using ASCII :-)

    import sys

    # define file pointers and byte length
    file_in = open("bin2txt_in.txt")
    file_out = open("bin2txt_out.txt","wb")
    bl = 8

    # read input file; remove whitespace
    # make list of bits (ints) from input
    bitlist = map(int,''.join(file_in.read().split()))
    if len(bitlist)%bl != 0:
        sys.exit("length of bitlist not integer multiple of %s" % bl)

    # assemble result string:
    # - make list of bytes from `bitlist`
    # - evaluate each byte -> int -> ascii char
    #   (the bit at position idx is shifted left by bl-1-idx, i.e. MSB first)
    rs = ''.join([chr(sum(bit<<(bl-1-idx) for idx,bit in enumerate(y)))
                  for y in zip(*[bitlist[x::bl] for x in range(bl)])
                 ])

    # write output
    file_out.write(rs)
    file_out.close()

Put the data above in bin2txt_in.txt, run the script(*) in the same directory and read the content of bin2txt_out.txt. You won’t regret it.
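The one-line decoding expression in the script is rather dense, so here is a minimal sketch of the same idea written as an explicit loop. The function name `bits_to_text` and its `byte_length` parameter are made up for this illustration and are not part of the script; the sketch works on a string instead of files and is meant to run under both Python 2 and Python 3.

```python
def bits_to_text(bits, byte_length=8):
    """Decode a string of '0'/'1' characters into text, one character per byte.

    Hypothetical helper for illustration; not part of the original script.
    """
    bits = ''.join(bits.split())  # drop all whitespace, as the script does
    if len(bits) % byte_length != 0:
        raise ValueError("length of bit string is not a multiple of %d" % byte_length)

    chars = []
    for start in range(0, len(bits), byte_length):
        value = 0
        for bit in bits[start:start + byte_length]:
            # shift what we have so far one place left and append the next bit,
            # so the first bit of each chunk ends up as the most significant one
            value = (value << 1) | int(bit)
        chars.append(chr(value))
    return ''.join(chars)


print(bits_to_text("01101000 01101001"))  # prints: hi
```

The inner loop builds each byte value most significant bit first, which gives the same result as summing `bit << (7 - idx)` over the eight bits of a chunk.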


(*) For Windows users who don’t know how to run the script:
– Download Python and install it.
– Copy the script above into a plain text file with the extension .py and double-click it to run.

And, yeah, you’re right, this is senseless (except for the programming exercise). And.. true, you can decode this using online converters like this one here.

  • for i in range(0,len(s),8): print chr(eval('0b'+s[i:i+8])),

    • A quick benchmark to compare both methods:

      #!/usr/bin/python
       
      import random
      import time
      import hashlib
       
      bitlength = 8
      N = 2**21
      s = N * [0]
      s.extend(N * [1])
      random.shuffle(s) # now s is a list of ints 0, 1 (randomly ordered)
      ss = ''.join(map(str, s)) # a string from the same list
       
      t1 = time.time()
      rs = ''.join([chr(sum(bit<<abs(idx-bitlength)-1 for idx,bit in enumerate(y)))
          for y in zip(*[s[x::bitlength] for x in range(bitlength)])])
      t2 = time.time()
      dt = t2-t1
      print "hash method 1: %s" % hashlib.md5(rs).hexdigest()
      print "time method 1: %s" % dt
       
      t1 = time.time()
      rs = ''.join([chr(eval('0b'+ss[i:i+bitlength])) for i in range(0, len(ss), bitlength)])
      t2 = time.time()
      dt = t2-t1
      print "hash method 2: %s" % hashlib.md5(rs).hexdigest()
      print "time method 2: %s" % dt

      Result:

      $ ./bench.py
      hash method 1: 519e71070a5eb53fa4294b39845fb6a6
      time method 1: 2.73728394508
      hash method 2: 519e71070a5eb53fa4294b39845fb6a6
      time method 2: 4.78630614281

      So your method is a bit more than half as fast (about 4.8 s versus 2.7 s). From more tests it looks like both methods scale the same way with increasing N. I guess the eval() call is the slow part.

      Thanks for your comment anyway :)

      JP
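If eval() really is the slow part of the second method, one thing to try is the built-in `int(chunk, 2)`, which parses a base-2 string directly without evaluating it as Python code. Below is a minimal sketch under that assumption; the function name `decode_with_int` is hypothetical, and it was not part of the benchmark above, so no timing is claimed for it.

```python
def decode_with_int(bitstring, bitlength=8):
    """Decode a '0'/'1' string by parsing each chunk as a base-2 integer.

    Hypothetical variant of the eval() one-liner; not measured in the benchmark.
    """
    return ''.join(
        chr(int(bitstring[i:i + bitlength], 2))
        for i in range(0, len(bitstring), bitlength)
    )


print(decode_with_int("0110100001101001"))  # prints: hi
```

It takes the same kind of input as `ss` in the benchmark, so it could be added there as a third method and compared by hash and time like the other two.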