I want to create binary values for words based on their content of vowels and consonants, where vowels receive a value of '0' and consonants get a value of '1'.

For example, 'haha' would be represented as 1010 and 'hahaha' as 101010.

common_words = ['haha', 'hahaha', 'aardvark', etc...]

dictify = {}

binary_value = []

#doesn't work
for word in common_words: 
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value

With this I am getting too many binary digits in the resulting dictionary:

>>>dictify
{'aardvark': [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,...}

desired output:

>>>dictify
{'haha': 1010,'hahaha': 101010, 'aardvark': 00111011}

I am thinking of a solution that doesn't involve a loop within a loop...

  • Where does each or number_value come from? Commented Feb 17, 2014 at 2:33
  • There is no solution that doesn't use two loops. Commented Feb 17, 2014 at 2:36
  • dictify = {w:"".join('0' if c in 'aeiouAEIOU' else '1' for c in w) for w in common_words}
    – mshsayem
    Commented Feb 17, 2014 at 2:40
  • Your desired output isn't really possible-- 00111011 won't work as an integer because there's no way to preserve the initial zeroes. You could use a string or a list.
    – DSM
    Commented Feb 17, 2014 at 2:41
  • Please post your actual code. The code you posted can't work: each and binary_value are never set. Commented Feb 17, 2014 at 2:46

3 Answers

The code you've posted doesn't work because all words share the same binary_value list. (It also doesn't work because number_value and each are never defined, but we'll pretend those variables said binary_value and word instead.) Define a new list for each word:

for word in common_words:
    binary_value = []
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value
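
To see the sharing concretely, here is a minimal sketch in Python 3. The two-word list and the names words, shared, aliased and fresh are made up for illustration and are not from the question:

```python
# Hypothetical two-word input, chosen only to make the aliasing visible.
words = ['ha', 'ah']

shared = []                  # one list, created once outside the loop
aliased = {}
for word in words:
    for ch in word:
        shared.append(0 if ch in 'aeiou' else 1)
    aliased[word] = shared   # every key points at the same list object

fresh = {}
for word in words:
    bits = []                # a new list for each word
    for ch in word:
        bits.append(0 if ch in 'aeiou' else 1)
    fresh[word] = bits

print(aliased['ha'] is aliased['ah'])  # True: one list under two keys
print(fresh)
```

With the shared list, every value in the dictionary keeps growing as later words are processed, which matches the overlong output in the question.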

If you want the output to look like 00111011 rather than a list, you'll need to make a string. (You could make an int, but then it would look like 59 instead of 00111011. Python doesn't distinguish "this int is base 2" or "this int has 2 leading zeros".)

for word in common_words:
    binary_value = []
    for x in word:
        if x.lower() in 'aeiou':
            binary_value.append('0')
        else:
            binary_value.append('1')
    dictify[word] = ''.join(binary_value)
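
As a rough illustration of the int-versus-string point, this Python 3 sketch round-trips the 'aardvark' pattern. The names bits, as_int and padded are assumptions for the example, and padding to len(bits) is just one way to get the leading zeros back:

```python
bits = '00111011'                 # the string built for 'aardvark'

as_int = int(bits, 2)             # parse the digits as a base-2 number
print(as_int)                     # the leading zeros are gone

# Formatting the int back needs the original width to restore the zeros.
padded = format(as_int, '0{}b'.format(len(bits)))
print(padded)
```

So an int can carry the value, but the word length has to be stored somewhere else if the leading zeros matter.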

user2357112 explains your code. Here is just another way:

>>> common_words = ['haha', 'hahaha', 'aardvark']
>>> def binfy(w):
        return "".join('0' if c in 'aeiouAEIOU' else '1' for c in w)

>>> dictify = {w:binfy(w) for w in common_words}
>>> dictify
{'aardvark': '00111011', 'haha': '1010', 'hahaha': '101010'}

This seems like a job for translation tables. Assuming your input strings are all ASCII (which seems likely; otherwise the definition of a vowel gets fuzzy), you can define a translation table this way*:

# For simplicity's sake, I'm only using lowercase letters
from string import lowercase, maketrans
tt = maketrans(lowercase, '01110111011111011111011111')

With the above table, the problem becomes trivial:

>>> 'haha'.translate(tt)
'1010'
>>> 'hahaha'.translate(tt)
'101010'
>>> 'aardvark'.translate(tt)
'00111011'

Given this solution, you can build dictify very simply with a comprehension:

dictify = {word: word.translate(tt) for word in common_words}  # Python 2.7
dictify = dict((word, word.translate(tt)) for word in common_words)  # Python 2.6 and earlier

*This can also be done with Python 3. One option is to use bytes instead of strings (for ordinary text strings, str.maketrans builds an equivalent table):

from string import ascii_lowercase
tt = b''.maketrans(bytes(ascii_lowercase, 'ascii'), b'01110111011111011111011111')
b'haha'.translate(tt)
...
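
For ordinary Python 3 text strings, a table can also be built with str.maketrans. This is only a sketch under the same lowercase-ASCII assumption; VOWELS and pattern are illustrative names, and the pattern is generated here instead of typed out by hand:

```python
from string import ascii_lowercase

VOWELS = 'aeiou'  # assumption: 'y' counts as a consonant, as in the table above

# One output digit per lowercase letter: '0' for vowels, '1' otherwise.
pattern = ''.join('0' if c in VOWELS else '1' for c in ascii_lowercase)
tt = str.maketrans(ascii_lowercase, pattern)

common_words = ['haha', 'hahaha', 'aardvark']
dictify = {word: word.translate(tt) for word in common_words}
print(dictify)
```

Characters outside a-z (uppercase letters, digits, punctuation) pass through unchanged with this table, so mixed-case input would need word.lower() first.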
