Numbers to Text Program - Python Training

Question

I have written a numbers to text challenge for some people in my organisation to practice their Python skills. I am an economist working with pandas, but I am trying to teach them some stuff about general programming as well. I am not a trained computer scientist or anything, so what I know is just what I have picked up along the way.

I have written a solution to the numbers to text problem and would like feedback on the following:

Any general comments about the functions. I find the tuple to text function to be a bit scary to look at.
Any suggestions as to how to write helpful comments. Squeezing them in next to lines of code themselves seems to confuse the reading, but creating new lines makes the function look massive and it can be hard to read - suggestions welcome.
Any other possible systems that are simple to implement.

Incidentally, solutions deliberately do not make use of user defined classes. A full description of the problem and the method I used to solve it is provided:

#####********************************************************************************#####
#####                               DESCRIPTION                                      #####
#####********************************************************************************#####
""" The task is to write a number to text program that can convert any number between
minus 2 billion and plus 2 billion. It sounds deceptively simple...

Examples:
10 -> ten
121 -> one hundred and twenty one
1032 -> one thousand and thirty two
11143 -> eleven thousand one hundred and forty three
1200011 -> one million two hundred thousand and eleven

Note: if you ever want to turn a nested list into a single list containing all the 
elements of the nested lists, use itertools.chain()

e.g.
In [1]: l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In [2]: l2 = list(itertools.chain(*l))
In [3]: l2
Out[3]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

The solution provided here is based upon the following:

    + If the length of a string representation of a number is divisible by three, then
      if that number is split into tuples of three consecutive numerical characters, then
      by looking at the character in the tuple and its index position in the tuple, it
      is possible to map the character to an English word. For example:

      number: '983120'
      tuples: ('9', '8', '3') ('1', '2', '0')

      Notice that the characters in the 0th and 2nd index positions in each tuple
      will always map to the English words for those particular numbers (except '0',
      which maps to no English word, as the word 'zero' is not used when enunciating
      a number):

        '9' : 'nine'
        '3' : 'three'
        '1' : 'one'
        '0' : ''

    However, the character in the 1st index position in each tuple will always map to
    an English word that represents the number in a unit of tens, except '0',
    which is not used in speech in the same way:

        '8' : 'eighty'
        '2' : 'twenty'
        '0' : ''

    Of course the character in the 0th index position in each tuple always represents the
    number in a unit of hundreds:

        '9' : 'nine' + 'hundred'
        '1' : 'one'  + 'hundred'

    The exception to this rule is when the character in the 0th index position is '0'. In
    this case, there are no hundred units:

        '0' : ''

    A further exception of importance is when the character in the 1st index position is
    '1'. Unfortunately the numbers from 10 to 19 do not behave in the same way as
    numbers from 20 to 29, 30 to 39 etc. To see why, consider obtaining the English
    words for the tuple:

        tuple: ('1', '2', '3')

    we can follow the basic rules relating to index positions as above to get:

        '1' : 'one' + 'hundred'
        '2' : 'twenty'
        '3' : 'three'

    The sum of the above will yield:

        sum: 'one hundred twenty three'

    But this will not work with the following tuple:

        tuple : ('4', '1', '2')

    because applying the above rules relating to index positions we would get:

        '4' : 'four' + 'hundred'
        '1' : 'ten'
        '2' : 'two'
        sum : 'four hundred ten two'

    when in fact we want 'four hundred twelve'.

    Therefore when a '1' character is found in the 1st index position a new method is
    needed to determine how to translate that character into an English word. The simplest
    method it seems to me is to use the character in the 2nd index position to determine
    the English word when '1' is observed in the 1st index position. When the following
    characters are observed in index position 2, and the character at index position 1
    is '1', then:

        '0' : 'ten'
        '1' : 'eleven'
        '2' : 'twelve'

    The beauty of this method is that whenever a string of numerical characters can be 
    divided into tuples of three elements each, the above rules apply in every case. Of
    course, we also need to specify the unit level of the tuple itself to get the full
    story:

        number: 312030967
        tuples: ('3', '1', '2') ('0', '3', '0') ('9', '6', '7')
        units:  'million', 'thousand', ''

    The last tuple does not have a unit as the 'hundred' is already implied by the three
    elements in the tuple. The relevant unit can just be added at the end of evaluating 
    each character in each tuple. Combining the above facts we see that in reference to 
    the above system:

        '3'     : 'three' + 'hundred'
        '1'     : ''
        '2'     : 'twelve'
        unit    : 'million'
        '0'     : ''
        '3'     : 'thirty'
        '0'     : ''
        unit    : 'thousand'
        '9'     : 'nine' + 'hundred'
        '6'     : 'sixty'
        '7'     : 'seven'

    Naturally the above only works if the number entered has a length that gives no
    remainder when divided by 3. Therefore the first thing the program will do is
    to make sure that whatever number the user enters its length will be divisible by 3
    without remainder. In order to achieve this the number entered is padded with '0'
    characters until the length of the resulting string is 12. Why 12? Well, the problem
    specifies that the program must work for numbers between -2bn and 2bn. A string
    representation of the number 2bn is 10 characters long. Therefore to make the string
    divisible by 3 without remainder, 2 '0' characters are padded to the beginning. The 
    number of '0' pads will obviously depend on the number entered by the user. When a 
    negative number is entered the '-' symbol is stripped out before the zero padding. 

    The result is then split into a system of tuples each with three elements, and these
    are evaluated with reference to the above described system.

    In order to make the evaluation a dictionary is created. The keys of the dictionary
    are the characters '1' to '9', and the values are themselves dictionaries (the
    value for key '1' contains a further nested dictionary, described below).
    These sub-dictionaries have keys that are tuples that contain
    integer values that are designed to mirror the index values of the tuples of
    characters that are being evaluated. For example:

        dict extract : {'2' : {(0, 2) : 'two',    (1,) : 'twenty'}}
        number       : '222'
        tuple        : ('2', '2', '2')

    When evaluating the tuple, the program will first identify which character is found
    at index 0. This is '2'. Then it will access the dict with key '2'. This will give
    access to the sub dictionary. Now if the index position (still 0) is found in the
    tuple key (0, 2) of the sub-dictionary, then the value associated with that key is 
    returned : 'two'. And so on. 

    In the case of the character '1', there is a further nested sub-dictionary which will
    use the character found in the 2nd index position of the tuple being evaluated when the
    character '1' is in index position 1, in order to get the proper English
    representation of two-character numbers beginning with '1'.

    There is a system of adding the word 'and' at various points, but this is not 
    discussed in detail here.   
"""

#####********************************************************************************#####
#####                       Functions and Dictionaries                               #####
#####********************************************************************************#####
import itertools
nums_dict = { '1' : {(0, 2) : 'one',      (1,) :                {'0' : 'ten',
                                                                 '1' : 'eleven',
                                                                 '2' : 'twelve',
                                                                 '3' : 'thirteen',
                                                                 '4' : 'fourteen',
                                                                 '5' : 'fifteen',
                                                                 '6' : 'sixteen',
                                                                 '7' : 'seventeen',
                                                                 '8' : 'eighteen',
                                                                 '9' : 'nineteen',}},
              '2' : {(0, 2) : 'two',      (1,) : 'twenty'},
              '3' : {(0, 2) : 'three',    (1,) : 'thirty'},
              '4' : {(0, 2) : 'four',     (1,) : 'forty'},
              '5' : {(0, 2) : 'five',     (1,) : 'fifty'},
              '6' : {(0, 2) : 'six',      (1,) : 'sixty'},
              '7' : {(0, 2) : 'seven',    (1,) : 'seventy'},
              '8' : {(0, 2) : 'eight',    (1,) : 'eighty'},
              '9' : {(0, 2) : 'nine',     (1,) : 'ninety'}
            }

def get_number(message1, message2):
    """
    message1 : string
    message2 : string
    return   : string

    Returns user input string if capable of being cast to integer, and between minus
    2 billion and positive 2 billion; otherwise prompts again.
    """
    user_input = raw_input(message1)
    try:
        if int(user_input) < -2000000000 or int(user_input) > 2000000000:
            print message2
            return get_number(message1, message2)
    except ValueError:
        print 'That was not valid input'
        return get_number(message1, message2)   
    return user_input

def zero_padding(user_input):
    """
    user_input : string
    return     : string

    Returns user input stripped of a minus sign (if present) and padded to the extent
    necessary with zeros to ensure that the returned string is 12 characters in length.
    """
    if user_input[0] == '-':
        user_input = user_input[1:]
    modified_input = ('0'*(12 - len(user_input))) + user_input
    return modified_input   

def convert_to_tuple_list(modified_input):
    """
    modified_input : string
    return         : tuple

    Returns tuple with four elements, each a tuple with three elements.
    Assumes modified_input has length 12.
    """
    tuple_list = tuple((tuple(modified_input[x:x+3]) for x in xrange(0, 10, 3)))
    return tuple_list

def tuple_to_text(single_tuple, unit_string, nums_dict = nums_dict):
    """
    single_tuple  : tuple 
    unit_string   : string
    nums_dict     : dict
    return        : list

    Returns list of alpha strings that represent text of numerical string characters found
    in single_tuple. The final element of the list is the unit_string.
    """
    word_list = [[],[]]
    if ''.join(single_tuple) == '000': # if all characters are '0' return empty list
        return list(itertools.chain(*word_list))
    if single_tuple[0] != '0': # if the first element of the tuple is not '0'
        word_list[0].extend([nums_dict[single_tuple[0]][(0, 2)], 'hundred'])
    if single_tuple[1] != '0': # if the second element of the tuple is not '0'
        if single_tuple[1] == '1': # Special case where second character is '1'
            word_list[1].extend(['and', nums_dict['1'][(1,)][single_tuple[2]], unit_string])
        else:
            try: #if third element is zero then this will generate an error below as zero
                 #is not in the nums_dict. 
                word_list[1].extend(['and', nums_dict[single_tuple[1]][(1,)], 
                                     nums_dict[single_tuple[2]][(0, 2)], unit_string])
            except KeyError: 
                word_list[1].extend(['and', nums_dict[single_tuple[1]][(1,)], unit_string])             
    else:
        if single_tuple[2] != '0': # if second element of tuple is zero but the third is not
            word_list[1].extend(['and', nums_dict[single_tuple[2]][(0, 2)], unit_string])
        else:
            word_list[1].append(unit_string)

    if len(word_list[0]) == 0: # if no 'hundreds' then remove 'and'
        word_list[1].remove('and')
    return list(itertools.chain(*word_list))

def create_text_representation(tuple_list):
    """
    tuple_list : tuple
    return     : string

    Returns string of words found in each list created by calling the tuple_to_text
    function.
    """ 
    list1 = tuple_to_text(tuple_list[0], 'billion')
    list2 = tuple_to_text(tuple_list[1], 'million')
    list3 = tuple_to_text(tuple_list[2], 'thousand')
    list4 = tuple_to_text(tuple_list[3], '')

    #If any of the lists 1/2/3 are not empty, but list4 contains no hundred value, 
    #insert an 'and' into list4 at index position 1 if tuple_list[3] does not contain
    #elements all of which are equal to '0'
    if any([len(list1) != 0, len(list2) != 0, len(list3) != 0])\
    and 'hundred' not in list4 and ''.join(tuple_list[3]) != "000":
        list4.insert(0, 'and')

    complete_list = itertools.chain(*[list1, list2, list3, list4])
    complete_list = [elem for elem in complete_list if not type(elem) is list]
    return " ".join(complete_list)

def message(user_input, text_representation):
    """
    user_input          : string of numerical characters (possibly including the minus sign)
    text_representation : string of alphas
    return              : formatted string

    Returns string formatted to include 'minus' where necessary, the original number
    provided, and the textual representation of that number.
    """
    message = \
    """
    The number {0} written as text is : 
    {1}{2}
    """
    if user_input[0] == '-':
        return message.format(user_input, 'minus ', text_representation)
    return message.format(user_input, '', text_representation)

#####********************************************************************************#####
#####                                Run Method                                      #####
#####********************************************************************************#####  
user_input = get_number("Please enter a number between -2 billion and 2 billion: ",
                        "That number is out of range")
modified_input = zero_padding(user_input)
tuple_list = convert_to_tuple_list(modified_input)
text_representation = create_text_representation(tuple_list)
print message(user_input, text_representation)
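
The nested lookup described in the docstring can also be expressed with three flat lists indexed by digit, which may be easier to read than dictionaries keyed by index position. The following is a minimal sketch under that assumption, written for Python 3 and not taken from the code above: `ONES`, `TENS`, `TEENS` and `triple_to_words` are hypothetical names, and the 'and' and unit handling are deliberately left out.

```python
# Hypothetical lookup tables: the list index is the digit itself.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]


def triple_to_words(triple):
    """Return the words for one tuple of three digit characters (no 'and', no unit)."""
    hundreds, tens, units = (int(ch) for ch in triple)
    words = []
    if hundreds:
        words += [ONES[hundreds], "hundred"]
    if tens == 1:
        # 10-19 are looked up by the units digit, as in the docstring.
        words.append(TEENS[units])
    else:
        words += [TENS[tens], ONES[units]]
    # Drop the empty strings that stand in for "no word".
    return [w for w in words if w]


print(triple_to_words(("4", "1", "2")))  # ['four', 'hundred', 'twelve']
```

Because '0' simply maps to an empty string, no `try`/`except KeyError` is needed for a zero in the units position.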

The amount of documentation in your code is impressive. I have no time to have a look right now but a pretty similar question was asked recently, maybe you'll find something relevant for you on codereview.stackexchange.com/questions/43744/… . — Josay, Apr 4 '14 at 13:56
Thanks, yes it's a simple problem, but for training purposes I try to explain the solutions as best as possible. Actually, it helps me to then write the code better... — Woody Pride, Apr 4 '14 at 14:02
@Josay's referring to my thread, which I saw you've commented on as well. The way to do this best is to split into lists almost every word you're going to be using and create specific functions for less than a thousand and for more than a thousand, as well as a function for splitting by thousands as well. :) The documentation on this really is impressive, and I hope you get it working for you. :) — The Laughing Man, Apr 4 '14 at 14:09
It seems to work perfectly, I was just wondering if this was a good way to tackle the problem or not really, before I teach my staff how to solve problems in Python. The issue is I am not an expert at all, so I am always worried I am giving them bad info. We use programming in a very ad hoc way, i.e. to achieve data management tasks on the fly, so none of us have ever really become real programmers. I'm going to check out your solution for sure... — Woody Pride, Apr 4 '14 at 14:17
Concerning the amount of documentation, wouldn't it be better to have a minimal but sufficient documentation in the code and to put the details in a Sphinx documentation or something like that? — Morwenn, Apr 4 '14 at 14:22
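
One way to keep in-code documentation short without losing the worked examples is to put them in the docstring as doctests, so the examples are also checked when the file is run. This is only a sketch of that style, written for Python 3; `sign_prefix` is a hypothetical helper used for illustration and is not part of the program above.

```python
import doctest


def sign_prefix(user_input):
    """Return 'minus ' if the numeric string is negative, else ''.

    >>> sign_prefix('-42')
    'minus '
    >>> sign_prefix('42')
    ''
    """
    return 'minus ' if user_input.startswith('-') else ''


if __name__ == "__main__":
    # Runs the examples in the docstrings and reports any mismatch.
    doctest.testmod()
```

The longer explanation of the method can then live in separate documentation, with the docstring carrying only the contract and a couple of examples.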

Calpratt · Answer 1 · 2014-04-04 21:44:38Z

I was bored, so I decided to write my own solution. Might be helpful- so I'll post it.

It's fairly similar to your solution. I start by defining some strings:

singles = ["","one","two","three","four","five","six","seven","eight","nine"]
tens = ["","","twenty","thirty","forty","fifty","sixty","seventy","eighty","ninety"]
teens = ["ten","eleven","twelve","thirteen","fourteen","fifteen","sixteen","seventeen","eighteen","nineteen"]
suffix = ["","thousand","million","billion","trillion","quadrillion","zillion"]

My idea here is that if I use a "" entry, I'll remove it before I construct the sentence.

def name_number(number):
    # The simplest case
    if number == 0:
        return "zero"

    # container for the evaluated digits
    name_list = []

    # add negative?
    if number < 0:
        name_list.append("negative")
        number = -number

    # Pad such that it can be broken into threes
    number = str(number)
    while not len(number) % 3 == 0:
        number = '0' + number

    # break number into chunks
    sections = [ number[ii:ii+3] for ii in range(0, len(number), 3)]

    # name the chunks, and add a suffix
    for ii in range(len(sections)):
        # if the section is zero, it can be skipped
        if not int(sections[ii]) == 0:
            # if it is the last chunk, you have to add "and"
            name_list.extend(name_chunk(sections[ii], ii+1 == len(sections) ))
            # limited by the number of suffixes defined
            name_list.append(suffix[len(sections)-(ii+1)])

    # remove undefined numbers
    while "" in name_list:
        name_list.remove("")

    # return that stuff
    return " ".join(name_list)

Because I remove the empty entries, I can be lazy while evaluating chunks

def name_chunk(chunk, add_and = False):
    # container for the evaluated digits
    name_list = []
    # have to check for 0 on this digit, seeing as "" hundred will make no sense
    if not int(chunk[0]) == 0:
        name_list.append(singles[int(chunk[0])])
        name_list.append("hundred")
    # check if in teens before moving on
    if int(chunk[1]) == 1:
        if add_and:
            name_list.append("and")
        name_list.append(teens[int(chunk[2])])
    else:
        # have to check for 0 here due to same reason as before
        if add_and and not int(chunk[1:]) == 0:
            name_list.append("and")
        name_list.append(tens[int(chunk[1])])
        name_list.append(singles[int(chunk[2])])
    return name_list

giving me the result:

>>> name_number(341234123412341200)
'three hundred forty one quadrillion two hundred thirty four trillion one hundred twenty three billion four hundred twelve million three hundred forty one thousand two hundred'
>>> name_number(341234123412341201)
'three hundred forty one quadrillion two hundred thirty four trillion one hundred twenty three billion four hundred twelve million three hundred forty one thousand two hundred and one'
>>>
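
One edge case worth checking: the final chunk is always passed `add_and=True`, so tracing `name_number(10)` through the code above gives the list `['and', 'ten']` and the result 'and ten'. A minimal sketch of a guard that could be applied to `name_list` just before joining is shown below; `drop_leading_and` is a hypothetical helper, not part of the answer, and it assumes the only possible sign word is "negative".

```python
def drop_leading_and(words):
    """Return words without an 'and' that has no number word before it."""
    # Keep an optional sign word in place and inspect what follows it.
    sign = words[:1] if words[:1] == ["negative"] else []
    rest = words[len(sign):]
    if rest[:1] == ["and"]:
        rest = rest[1:]
    return sign + rest


print(" ".join(drop_leading_and(["and", "ten"])))  # ten
```

An 'and' that follows "hundred" or a suffix such as "thousand" is left alone, so 'two hundred and one' is unaffected.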

I particularly like the idea of padding until len(string) % 3 == 0. This means fewer unnecessary tuples of '0's when the number is short, and you don't have to specify how many '0's should be added, which makes the function more flexible as to input... Thanks! — Woody Pride, Apr 5 '14 at 2:45
@woody padding works well for immutable objects like strings. just remember if you're padding something like a list, to remove the padding after — Calpratt, Apr 5 '14 at 20:44
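
The padding discussed in these comments can also be done without a loop by rounding the length up to the next multiple of three and calling `str.zfill`. This is a sketch only, assuming the minus sign has already been stripped; `split_into_groups` and its `size` parameter are hypothetical names.

```python
def split_into_groups(digits, size=3):
    """Left-pad a digit string with zeros and split it into groups of `size`."""
    # Round the length up to the next multiple of `size`.
    width = -(-len(digits) // size) * size
    padded = digits.zfill(width)
    return [padded[i:i + size] for i in range(0, width, size)]


print(split_into_groups("1032"))  # ['001', '032']
```

Because the width is derived from the input, nothing needs to change if the accepted range of numbers grows.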

skyjur · Answer 2 · 2014-04-04 14:28:18Z

I will review a single function, get_number().

From what I have seen in your code, there is no benefit at all in passing the messages as variables message1 and message2. It is easier to read code which looks like this:
```
user_input = raw_input("Please enter a number between -2 billion and 2 billion: ")
```
If you still decide to pass it as a variable, you should give your message variable a descriptive name:
```
user_input = raw_input(msg_ask_to_enter_value)
```

The recursion will eventually hit Python's recursion limit and raise an exception if I keep entering wrong values. It's not a serious issue, as I would need to do that about 1000 times (the default limit), but it's still better not to leave this bug in, especially when it is so easy to avoid. Instead of recursion use a loop:

def get_number():
    boundary_low = -2000000000
    boundary_high = 2000000000

    while True:
        user_input = raw_input('Please enter a number in range of %s to %s: ' % (
                               boundary_low, boundary_high))
        try:
            value = int(user_input)
        except ValueError:
            # int() raises ValueError when the text is not a whole number
            print 'That was not a number'
        else:
            if boundary_low <= value <= boundary_high:
                return value
            else:
                print 'Number out of boundaries'
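
The loop above can be made easier to test by separating the check from the keyboard input. The following is a sketch of that idea written for Python 3, not the answer's code: `parse_in_range`, `ask_until_valid`, `read_line` and `report` are hypothetical names, and the messages are copied from the loop above.

```python
def parse_in_range(text, low=-2000000000, high=2000000000):
    """Return (value, None) on success or (None, error_message) on failure."""
    try:
        value = int(text)
    except ValueError:
        return None, "That was not a number"
    if not low <= value <= high:
        return None, "Number out of boundaries"
    return value, None


def ask_until_valid(read_line, report=print):
    """Keep calling read_line() until it yields a value in range."""
    while True:
        value, error = parse_in_range(read_line())
        if error is None:
            return value
        report(error)


# Usage with canned input instead of a keyboard:
answers = iter(["abc", "3000000000", "42"])
print(ask_until_valid(lambda: next(answers), report=lambda msg: None))  # 42
```

In the real program `read_line` would wrap `input` (or `raw_input` on Python 2) with the prompt text.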

The point about the messages is well taken. I suppose if I were varying the range of accepted inputs, it could make sense to have different messages, but actually I could just use string formatting as you have done. Also, if I did want to vary the possible range of numbers I would have to update the zero padding function to accept an argument to tell it how many zeros to pad etc. That might actually be nice, just to show how to make the functions as flexible as possible, i.e. not needing edits when changing the parameters of the problem. Thanks — Woody Pride, Apr 4 '14 at 14:29

