Converting string to binary representation in python

Question

I want to create binary values for words based on their content of vowels and consonants, where vowels receive a value of '0' and consonants get a value of '1'.

For example, 'haha' would be represented as 1010 and 'hahaha' as 101010.

common_words = ['haha', 'hahaha', 'aardvark', etc...]

dictify = {}

binary_value = []

#doesn't work
for word in common_words: 
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value

With this I am getting too many binary digits in the resulting dictionary:

>>>dictify
{'aardvark': [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,...}

desired output:

>>>dictify
{'haha': 1010,'hahaha': 101010, 'aardvark': 00111011}

I am thinking of a solution that doesn't involve a loop within a loop...

dictify = {w:"".join('0' if c in 'aeiouAEIOU' else '1' for c in w) for w in common_words} - mshsayem, Commented Feb 17, 2014 at 2:40
Your desired output isn't really possible: 00111011 won't work as an integer because there's no way to preserve the leading zeroes. You could use a string or a list. - DSM, Commented Feb 17, 2014 at 2:41
Please post your actual code. The code you posted can't work: each and binary_value are never set. - Chris Johnson, Commented Feb 17, 2014 at 2:46

user2357112 · Accepted Answer · 2014-02-17 02:41:37Z

The code you've posted doesn't work because all words share the same binary_value list. (It also doesn't work because number_value and each are never defined, but we'll pretend those variables said binary_value and word instead.) Define a new list for each word:

for word in common_words:
    binary_value = []
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value
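
To see why the shared list causes the problem, here is a minimal sketch of the original behaviour. The two-word input and the name shared are made up for illustration; they are not from the question.

```python
# Hypothetical two-word input, chosen only to keep the output short.
common_words = ['haha', 'hi']

shared = []          # one list, created once outside the loop
dictify = {}
for word in common_words:
    for x in word:
        shared.append(0 if x in 'aeiou' else 1)
    dictify[word] = shared   # every key refers to the same list object

print(dictify['haha'] is dictify['hi'])  # True: one object under both keys
print(dictify['haha'])                   # [1, 0, 1, 0, 1, 0]
```

Because every key refers to the same list, each word's digits are appended to the digits of all the words before it.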

If you want the output to look like 00111011 rather than a list, you'll need to make a string. (You could make an int, but then it would look like 59 instead of 00111011. Python doesn't distinguish "this int is base 2" or "this int has 2 leading zeros".)

for word in common_words:
    binary_value = []
    for x in word:
        if x.lower() in 'aeiou':
            binary_value.append('0')
        else:
            binary_value.append('1')
    dictify[word] = ''.join(binary_value)
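
The point about ints dropping leading zeros can be checked directly. This is only a small sketch; the variable names are illustrative, and the padding width is taken from the original string on the assumption that you still have it around.

```python
bits = '00111011'            # pattern for 'aardvark'

as_int = int(bits, 2)        # parse the digits as a base-2 number
print(as_int)                # 59: the leading zeros are gone

# Zero-padding has to be requested explicitly, using the known width.
restored = format(as_int, '0{}b'.format(len(bits)))
print(restored)              # 00111011
```

If you only have the int, the original width is lost, which is why a string (or list) is the safer thing to store.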

mshsayem · Accepted Answer · 2014-02-17 02:48:54Z


user2357112 explains your code. Here is just another way:

>>> common_words = ['haha', 'hahaha', 'aardvark']
>>> def binfy(w):
...     return "".join('0' if c in 'aeiouAEIOU' else '1' for c in w)
...
>>> dictify = {w:binfy(w) for w in common_words}
>>> dictify
{'aardvark': '00111011', 'haha': '1010', 'hahaha': '101010'}

edited Feb 17, 2014 at 2:48

answered Feb 17, 2014 at 2:43


kojiro · Accepted Answer · 2014-02-17 03:14:24Z

This seems like a job for translation tables. Assuming your input strings are all ASCII (which seems likely; otherwise the definition of a vowel gets fuzzy), you can define a translation table this way*:

# For simplicity's sake, I'm only using lowercase letters
from string import lowercase, maketrans
tt = maketrans(lowercase, '01110111011111011111011111')

With the above table, the problem becomes trivial:

>>> 'haha'.translate(tt)
'1010'
>>> 'hahaha'.translate(tt)
'101010'
>>> 'aardvark'.translate(tt)
'00111011'

Given this solution, you can build dictify very simply with a comprehension:

dictify = {word:word.translate(tt) for word in common_words} #python2.7
dictify = dict((word, word.translate(tt)) for word in common_words) # python 2.6 and earlier

*This can also be done with Python 3, where string.maketrans no longer exists. Use the str.maketrans static method for strings or, as here, bytes.maketrans for bytes:

from string import ascii_lowercase
tt = b''.maketrans(bytes(ascii_lowercase, 'ascii'), b'01110111011111011111011111')
b'haha'.translate(tt)
...
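
For Python 3 strings, a minimal sketch of the same translation-table idea could look like the following. It assumes ASCII input; the name PATTERN, the uppercase handling, and the extra 'Aardvark' entry are additions for illustration and are not part of the answer above.

```python
from string import ascii_lowercase, ascii_uppercase

# 26-character pattern from the answer: '0' for vowels, '1' for consonants.
PATTERN = '01110111011111011111011111'

# str.maketrans maps each character of the first argument to the character
# at the same position in the second; both must have the same length.
tt = str.maketrans(ascii_lowercase + ascii_uppercase, PATTERN * 2)

# 'Aardvark' is an added case to show that uppercase letters are covered.
common_words = ['haha', 'hahaha', 'aardvark', 'Aardvark']
dictify = {word: word.translate(tt) for word in common_words}
print(dictify)
```

Characters that are not in the table (digits, punctuation, non-ASCII letters) pass through translate unchanged, so you may want to filter or validate the input first.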
