I'm looking for the easiest way to convert all non-numeric data (including blanks) in Python to zeros. Taking the following for example:

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]

I would like the output to be as follows:

desiredData = [[1.0,4,7,-50],[0,0,0,12.5644]]

So '7' should be 7, but '8 bananas' should be converted to 0.

    
And for numeric types, I take it you do not want the type to change, i.e. an int converted to a float or vice versa? It would be easier if you were aiming for a single type (rather than several numeric types). –  Anand S Kumar yesterday

9 Answers

Accepted answer (score 9):
import numbers
def mapped(x):
    if isinstance(x,numbers.Number):
        return x
    for tpe in (int, float):
        try:
            return tpe(x)
        except ValueError:
            continue
    return 0
for sub in someData:
    sub[:] = map(mapped,sub)

print(someData)
[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

It will work for different numeric types:

In [4]: from decimal import Decimal

In [5]: someData = [[1.0,4,'7',-50 ,"99", Decimal("1.5")],["foobar",'8 bananas','text','',12.5644]]

In [6]: for sub in someData:
   ...:         sub[:] = map(mapped,sub)
   ...:     

In [7]: someData
Out[7]: [[1.0, 4, 7, -50, 99, Decimal('1.5')], [0, 0, 0, 0, 12.5644]]

The isinstance(x, numbers.Number) check catches sub-elements that are already numeric (floats, ints, etc.) and returns them unchanged. If the element is not a numeric type, we first try casting it to int and then to float; if neither succeeds, we simply return 0.
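
The same try-int-then-float idea can also be written so that it builds a new nested list instead of modifying someData in place. This is only a sketch: the names to_number, some_data and cleaned are made up for illustration, and catching TypeError as well (so that values such as None become 0) is an assumption about the desired behaviour, not something the answer above states.

```python
import numbers

def to_number(x):
    """Return x unchanged if it is numeric, else parse it as int, then float, else 0."""
    if isinstance(x, numbers.Number):
        return x
    for convert in (int, float):
        try:
            return convert(x)
        except (ValueError, TypeError):
            # ValueError: string that does not parse; TypeError: e.g. None
            continue
    return 0

some_data = [[1.0, 4, '7', -50], ['8 bananas', 'text', '', 12.5644, None, '2.5']]
# Build a new nested list; some_data itself is left untouched
cleaned = [[to_number(x) for x in row] for row in some_data]
print(cleaned)
```

Whether to mutate the original list or return a copy is a design choice; the copy is easier to reason about when the raw data is needed again later.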


Another solution using regular expressions

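import re

def toNumber(e):
    # Leave anything that is not a string untouched
    if type(e) != str:
        return e
    # Raw strings keep the backslashes in the patterns intact
    if re.match(r"^-?\d+\.\d+$", e):
        return float(e)
    if re.match(r"^-?\d+$", e):
        return int(e)
    return 0

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]
# A list comprehension works on Python 2 and 3 (map() returns an iterator on Python 3)
someData = [[toNumber(e) for e in row] for row in someData]
print(someData)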

you get:

[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

Note: this does not work for numbers in scientific notation (for example '1e3'), which are converted to 0.
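
For what it's worth, the built-in float() already parses scientific notation, so a try/except version covers that case without a longer pattern. A rough sketch follows; the helper name to_number_sci is hypothetical, and rejecting 'nan' and 'inf' via math.isfinite is an assumption about what should count as a number.

```python
import math

def to_number_sci(e):
    """Convert strings to int or float, including forms like '1e3'; other strings become 0."""
    if not isinstance(e, str):
        return e
    try:
        return int(e)
    except ValueError:
        pass
    try:
        value = float(e)
    except ValueError:
        return 0
    # float() also accepts 'nan' and 'inf'; treat those as non-numeric here
    return value if math.isfinite(value) else 0

print([to_number_sci(s) for s in ['7', '-2.5', '1e3', 'nan', '8 bananas', '']])
```

With this version '1e3' becomes the float 1000.0, while '8 bananas' and the empty string still become 0.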


Considering you need both int and float data types, you should try the following code:

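import ast
import numbers

desired_data = []
for sub_list in someData:
    desired_sublist = []
    for element in sub_list:
        if isinstance(element, numbers.Number):
            # Already numeric: keep as-is (eval() only accepts strings)
            desired_sublist.append(element)
            continue
        try:
            # literal_eval only parses literals; plain eval() would execute arbitrary code
            some_element = ast.literal_eval(element)
        except (ValueError, SyntaxError):
            some_element = 0
        # Literals that are not numbers (e.g. "'abc'") also become 0
        desired_sublist.append(some_element if isinstance(some_element, numbers.Number) else 0)
    desired_data.append(desired_sublist)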

This might not be the optimal way to do it, but still it does the job that you asked for.

lists = [[1.0,4,'7',-50], ['1', 4.0, 'banana', 3, "12.6432"]]
nlists = []
for lst in lists:
    nlst = []
    for e in lst:
        # Check if number can be a float
        if '.' in str(e):
            try:
                n = float(e)
            except ValueError:
                n = 0
        else:
            try:
                n = int(e)
            except ValueError:
                n = 0

        nlst.append(n)
    nlists.append(nlst)

print(nlists)

Not surprisingly, Python has a way to check if something is a number:

import collections.abc
import numbers

def num(x):
    # Values that are already numeric pass through unchanged (int(12.5644) would truncate to 12)
    if isinstance(x, numbers.Number):
        return x
    try:
        return int(x)
    except ValueError:
        try:
            return float(x)
        except ValueError:
            return 0

def zeronize(data):
    # Recurse into nested sequences, but treat strings as scalars
    return [zeronize(x) if isinstance(x, collections.abc.Sequence) and not isinstance(x, str) else num(x) for x in data]

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]
desiredData = zeronize(someData)


desiredData = `[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]`

A function is defined in case you have nested lists of arbitrary depth. If using Python 2.x, replace collections.abc.Sequence with collections.Sequence and str with basestring.

This this and this question may be relevant. Also, this and this.


As an alternative, you can use the decimal module within a nested list comprehension:

>>> from decimal import Decimal
>>> [[Decimal(i) if (isinstance(i,str) and i.isdigit()) or isinstance(i,(int,float)) else 0 for i in j] for j in someData]
[[Decimal('1'), Decimal('4'), Decimal('7'), Decimal('-50')], [0, 0, 0, Decimal('12.56439999999999912461134954')]]

Note that the advantage of Decimal is that a single constructor covers every case the condition accepts: it produces a decimal value from a digit string, from a float, and from an int, and the result still supports ordinary arithmetic:

>>> Decimal('7')+3
Decimal('10')
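
If a single numeric type is acceptable, Decimal can also do the parsing itself, which additionally covers negative and fractional strings that isdigit() rejects. This is a sketch under assumptions: to_decimal is a hypothetical helper, floats are passed through str() first so that 12.5644 stays Decimal('12.5644'), and non-finite values such as 'NaN' are mapped to 0.

```python
from decimal import Decimal, InvalidOperation

def to_decimal(x):
    """Parse x as a Decimal; anything that does not parse becomes Decimal(0)."""
    try:
        # str() keeps the short repr of floats, e.g. 12.5644 -> '12.5644'
        value = Decimal(str(x))
    except InvalidOperation:
        return Decimal(0)
    # Decimal also accepts 'NaN' and 'Infinity'; treat them as non-numeric here
    return value if value.is_finite() else Decimal(0)

some_data = [[1.0, 4, '7', -50], ['8 bananas', 'text', '', 12.5644]]
print([[to_decimal(x) for x in row] for row in some_data])
```

The trade-off is that every element ends up as a Decimal, so the original int/float distinction is lost.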

Integers, floats, and negative numbers in quotes are fine:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

someData = [[1.0,4,'7',-50, '12.333', '-90'],['-333.90','8 bananas','text','',12.5644]]

for l in someData:
    for i, el in enumerate(l):
        if isinstance(el, str) and not is_number(el):
            l[i] = 0
        elif isinstance(el, str) and is_int(el):
            l[i] = int(el)
        elif isinstance(el, str) and is_number(el):
            l[i] = float(el)

print(someData)

Output:

[[1.0, 4, 7, -50, 12.333, -90], [-333.9, 0, 0, 0, 12.5644]]
I like the simplicity of this approach, but it converts '7' to 0 instead of 7. –  user1882017 yesterday
    
@user1882017, thanks i missed that '7... added isdigit(0) check –  LetzerWille yesterday

A one-liner:

import re
result = [[0 if not re.match(r"^-?(\d+(\.\d*)?|\.\d+)$", str(s)) else int(str(s)) if str(s).lstrip("-").isdigit() else float(str(s)) for s in xs] for xs in someData]
>>> result
[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

I assume the blanks you are referring to are empty strings. If you want to convert all strings, regardless of whether they contain characters or not, we can simply check whether the type of an object is a string and, if it is, replace it with the integer 0. Note that this also turns numeric strings such as '7' into 0 and flattens the nested lists into a single list.

cleaned_data = []
for array in someData:
    for item in array:
        cleaned_data.append(0 if type(item) == str else item)

>>> cleaned_data
[1.0, 4, 0, -50, 0, 0, 0, 12.5644]
