I have written a numbers to text challenge for some people in my organisation to practice their Python skills. I am an economist working with pandas, but I am trying to teach them some stuff about general programming as well. I am not a trained computer scientist or anything, so what I know is just what I have picked up along the way.
I have written a solution to the numbers to text problem and would like feedback on the following:
- Any general comments about the functions. I find the tuple_to_text function a bit scary to look at.
- Any suggestions as to how to write helpful comments. Squeezing them in next to lines of code themselves seems to confuse the reading, but creating new lines makes the function look massive and it can be hard to read - suggestions welcome.
- Any other possible systems that are simple to implement.
Incidentally, solutions deliberately do not make use of user defined classes. A full description of the problem and the method I used to solve it is provided:
#####********************************************************************************#####
##### DESCRIPTION #####
#####********************************************************************************#####
""" The task is to write a number to text program that can convert any number between
minus 2 billion and plus 2 billion. It sounds deceptively simple...
Examples:
10 -> ten
121 -> one hundred and twenty one
1032 -> one thousand and thirty two
11143 -> eleven thousand one hundred and forty three
1200011 -> one million two hundred thousand and eleven
Note: if you ever want to turn a nested list into a single list containing all the
elements of the nested lists, use itertools.chain()
e.g.
In [1]: l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In [2]: l2 = list(itertools.chain(*l))
In [3]: l2
Out[3]: [1, 2, 3, 4, 5, 6, 7, 8, 9]
The solution provided here is based upon the following:
+ If the length of the string representation of a number is divisible by three, the
number can be split into tuples of three consecutive numerical characters. By looking
at each character in a tuple and its index position in the tuple, it is possible to
map the character to an English word. For example:
number: '983120'
tuples: ('9', '8', '3') ('1', '2', '0')
Notice that the characters in the 0th and 2nd index positions in each tuple
will always map to the English words for those particular numbers (except '0',
which maps to no English word as the word 'zero' is not used when saying
a number):
'9' : 'nine'
'3' : 'three'
'1' : 'one'
'0' : ''
However, the character in the 1st index position in each tuple will always map to
the English word that represents the number in a unit of tens, except '0',
which is not used in speech in the same way:
'8' : 'eighty'
'2' : 'twenty'
'0' : ''
Of course the character in the 0th index position in each tuple always represents the
number in a unit of hundreds:
'9' : 'nine' + 'hundred'
'1' : 'one' + 'hundred'
The exception to this rule is when the character in the 0th index position is '0'. In
this case, there are no hundred units:
'0' : ''
A further exception of importance is when the character in the 1st index position is
'1'. Unfortunately the numbers from 10 to 19 do not behave in the same way as the
numbers from 20 to 29, 30 to 39 etc. To obtain the English words
for the tuple:
tuple: ('1', '2', '3')
we can follow the basic rules relating to index positions as above to get:
'1' : 'one' + 'hundred'
'2' : 'twenty'
'3' : 'three'
The sum of the above will yield:
sum: 'one hundred twenty three'
But this will not work with the following tuple:
tuple : ('4', '1', '2')
because applying the above rules relating to index positions we would get:
'4' : 'four' + 'hundred'
'1' : 'ten'
'2' : 'two'
sum : 'four hundred ten two'
when in fact we want 'four hundred twelve'.
Therefore when a '1' character is found in the 1st index position a different method
is needed to translate that character into an English word. The simplest method, it
seems to me, is to use the character in the 2nd index position to determine the
English word whenever '1' is observed in the 1st index position. When the character
at index position 1 is '1', the character at index position 2 maps as follows:
'0' : 'ten'
'1' : 'eleven'
'2' : 'twelve'
The beauty of this method is that whenever a string of numerical characters can be
divided into tuples of three elements each, the above rules apply in every case. Of
course, we also need to specify the unit level of the tuple itself to get the full
story:
number: 312030967
tuples: ('3', '1', '2') ('0', '3', '0') ('9', '6', '7')
units: 'million', 'thousand', ''
The last tuple does not have a unit as the 'hundred' is already implied by the three
elements in the tuple. The relevant unit can just be added at the end of evaluating
each character in each tuple. Combining the above facts we see that in reference to
the above system:
'3' : 'three' + 'hundred'
'1' : ''
'2' : 'twelve'
unit : 'million'
'0' : ''
'3' : 'thirty'
'0' : ''
unit : 'thousand'
'9' : 'nine' + 'hundred'
'6' : 'sixty'
'7' : 'seven'
Naturally the above only works if the number entered has a length that gives no
remainder when divided by 3. Therefore the first thing the program does is make
sure that the length of whatever number the user enters is divisible by 3
without remainder. To achieve this the number entered is padded with '0'
characters until the length of the resulting string is 12. Why 12? Well, the problem
specifies that the program must work for numbers between -2bn and 2bn. A string
representation of the number 2bn is 10 characters long. Therefore to make its length
divisible by 3 without remainder, 2 '0' characters are padded to the beginning. The
number of '0' pads will obviously depend on the number entered by the user. When a
negative number is entered the '-' symbol is stripped out before the zero padding.
The result is then split into a system of tuples each with three elements, and these
are evaluated with reference to the system described above.
In order to make the evaluation a dictionary is created. The keys of the dictionary
are the characters '1' to '9', and the values are themselves dictionaries (the value
for key '1' also contains a further nested dictionary, described below). These
sub-dictionaries have keys that are tuples containing integer values designed to
mirror the index positions of the tuples of characters that are being evaluated.
For example:
dict extract : {'2' : {(0, 2) : 'two', (1,) : 'twenty'}}
number : '222'
tuple : ('2', '2', '2')
When evaluating the tuple, the program will first identify which character is found
at index 0. This is '2'. Then it will access the dict with key '2'. This will give
access to the sub dictionary. Now if the index position (still 0) is found in the
tuple key (0, 2) of the sub-dictionary, then the value associated with that key is
returned : 'two'. And so on.
In the case of the character '1', there is a further nested sub-dictionary which will
use the character found in the 2nd index position of the tuple being evaluated when
the character '1' is in index position 1, in order to get the proper English
representation of two-character numbers beginning with '1'.
There is a system of adding the word 'and' at various points, but this is not
discussed in detail here.
"""
#####********************************************************************************#####
##### Functions and Dictionaries #####
#####********************************************************************************#####
import itertools
nums_dict = {
    '1' : {(0, 2) : 'one',
           (1,) : {'0' : 'ten',
                   '1' : 'eleven',
                   '2' : 'twelve',
                   '3' : 'thirteen',
                   '4' : 'fourteen',
                   '5' : 'fifteen',
                   '6' : 'sixteen',
                   '7' : 'seventeen',
                   '8' : 'eighteen',
                   '9' : 'nineteen'}},
    '2' : {(0, 2) : 'two', (1,) : 'twenty'},
    '3' : {(0, 2) : 'three', (1,) : 'thirty'},
    '4' : {(0, 2) : 'four', (1,) : 'forty'},
    '5' : {(0, 2) : 'five', (1,) : 'fifty'},
    '6' : {(0, 2) : 'six', (1,) : 'sixty'},
    '7' : {(0, 2) : 'seven', (1,) : 'seventy'},
    '8' : {(0, 2) : 'eight', (1,) : 'eighty'},
    '9' : {(0, 2) : 'nine', (1,) : 'ninety'}
}
def get_number(message1, message2):
    """
    message1 : string
    message2 : string
    return : string
    Returns the user input string if it can be cast to an integer between minus
    2 billion and plus 2 billion; otherwise prompts the user again.
    """
    user_input = raw_input(message1)
    try:
        if int(user_input) < -2000000000 or int(user_input) > 2000000000:
            print message2
            return get_number(message1, message2)
    except ValueError:
        print 'That was not valid input'
        return get_number(message1, message2)
    return user_input
def zero_padding(user_input):
    """
    user_input : string
    return : string
    Returns user input stripped of a minus sign (if present) and padded to the extent
    necessary with zeros to ensure that the returned string is 12 characters in length.
    """
    if user_input[0] == '-':
        user_input = user_input[1:]
    modified_input = ('0'*(12 - len(user_input))) + user_input
    return modified_input
def convert_to_tuple_list(modified_input):
    """
    modified_input : string
    return : tuple
    Returns tuple with four elements, each a tuple with three elements.
    Assumes modified_input has length 12.
    """
    tuple_list = tuple((tuple(modified_input[x:x+3]) for x in xrange(0, 10, 3)))
    return tuple_list
def tuple_to_text(single_tuple, unit_string, nums_dict = nums_dict):
    """
    single_tuple : tuple
    unit_string : string
    nums_dict : dict
    return : list
    Returns list of alpha strings that represent text of numerical string characters
    found in single_tuple. The final element of the list is the unit_string.
    """
    word_list = [[],[]]
    if ''.join(single_tuple) == '000': # if all characters are '0' return empty list
        return list(itertools.chain(*word_list))
    if single_tuple[0] != '0': # if the first element of the tuple is not '0'
        word_list[0].extend([nums_dict[single_tuple[0]][(0, 2)], 'hundred'])
    if single_tuple[1] != '0': # if the second element of the tuple is not '0'
        if single_tuple[1] == '1': # special case where second character is '1'
            word_list[1].extend(['and', nums_dict['1'][(1,)][single_tuple[2]], unit_string])
        else:
            try:
                # If the third element is '0' the lookup below raises a KeyError,
                # as '0' is not a key in nums_dict.
                word_list[1].extend(['and', nums_dict[single_tuple[1]][(1,)],
                                     nums_dict[single_tuple[2]][(0, 2)], unit_string])
            except KeyError:
                word_list[1].extend(['and', nums_dict[single_tuple[1]][(1,)], unit_string])
    else:
        if single_tuple[2] != '0': # if the second element is '0' but the third is not
            word_list[1].extend(['and', nums_dict[single_tuple[2]][(0, 2)], unit_string])
        else:
            word_list[1].append(unit_string)
    if len(word_list[0]) == 0: # if no 'hundreds' then remove 'and'
        word_list[1].remove('and')
    return list(itertools.chain(*word_list))
def create_text_representation(tuple_list):
    """
    tuple_list : tuple
    return : string
    Returns string of words found in each list created by calling the tuple_to_text
    function, or 'zero' if every list is empty.
    """
    list1 = tuple_to_text(tuple_list[0], 'billion')
    list2 = tuple_to_text(tuple_list[1], 'million')
    list3 = tuple_to_text(tuple_list[2], 'thousand')
    list4 = tuple_to_text(tuple_list[3], '')
    # If any of the lists 1/2/3 are not empty, list4 contains no 'hundred', and
    # tuple_list[3] is not made up entirely of '0' characters, insert an 'and'
    # at the start of list4.
    if any([len(list1) != 0, len(list2) != 0, len(list3) != 0])\
            and 'hundred' not in list4 and ''.join(tuple_list[3]) != "000":
        list4.insert(0, 'and')
    complete_list = itertools.chain(*[list1, list2, list3, list4])
    # Drop the empty unit string contributed by the final tuple.
    complete_list = [elem for elem in complete_list if elem]
    # An input of 0 produces no words at all, so fall back to 'zero'.
    return " ".join(complete_list) or 'zero'
def message(user_input, text_representation):
    """
    user_input : string of numerical characters (possibly including the minus sign)
    text_representation : string of alphas
    return : formatted string
    Returns string formatted to include 'minus' where necessary, the original number
    provided, and the textual representation of that number.
    """
    # Named 'template' so that it does not shadow the function name.
    template = "\nThe number {0} written as text is :\n{1}{2}\n"
    if user_input[0] == '-':
        return template.format(user_input, 'minus ', text_representation)
    return template.format(user_input, '', text_representation)
#####********************************************************************************#####
##### Run Method #####
#####********************************************************************************#####
user_input = get_number("Please enter a number between -2 billion and 2 billion: ",
"That number is out of range")
modified_input = zero_padding(user_input)
tuple_list = convert_to_tuple_list(modified_input)
text_representation = create_text_representation(tuple_list)
print message(user_input, text_representation)
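For comparison, here is a rough sketch of the same three-digit grouping idea using integer arithmetic (divmod) instead of string indexing and zero padding. It is not the solution above: the names ONES, TENS, UNITS, group_to_words and number_to_words are made up for this illustration, it assumes the same placement of 'and' as in the examples in the description, and it is written so that it should run on Python 3 as well as Python 2.

```python
# Illustrative sketch only; all names here are hypothetical.
ONES = ['', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
        'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
        'sixteen', 'seventeen', 'eighteen', 'nineteen']
TENS = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
        'eighty', 'ninety']
UNITS = ['', 'thousand', 'million', 'billion']


def group_to_words(n):
    """Return the words for 0 <= n <= 999 as a list (empty list for 0)."""
    hundreds, rest = divmod(n, 100)
    words = []
    if hundreds:
        words += [ONES[hundreds], 'hundred']
    if rest:
        if hundreds:
            words.append('and')
        if rest < 20:
            # 1 to 19 are looked up directly, which covers the 'teens'
            words.append(ONES[rest])
        else:
            tens, ones = divmod(rest, 10)
            words.append(TENS[tens])
            if ones:
                words.append(ONES[ones])
    return words


def number_to_words(n):
    """Return the English words for an integer with abs(n) < 10**12."""
    if n == 0:
        return 'zero'
    words = ['minus'] if n < 0 else []
    n = abs(n)
    # Split into groups of three digits, least significant group first.
    groups = []
    while n:
        n, group = divmod(n, 1000)
        groups.append(group)
    for index in reversed(range(len(groups))):
        group = groups[index]
        if not group:
            continue
        # A final group below 100 that follows a larger group gets an 'and'.
        if index == 0 and group < 100 and len(groups) > 1:
            words.append('and')
        words += group_to_words(group)
        if UNITS[index]:
            words.append(UNITS[index])
    return ' '.join(words)
```

For example, number_to_words(1200011) gives 'one million two hundred thousand and eleven' and number_to_words(-412) gives 'minus four hundred and twelve'. Working on integers removes the need for the 12-character padding, at the cost of no longer mirroring the index-position rules set out in the description.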