Pythonic type coercion from input

Question

My context is a very simple filter-like program that streams lines from an input file into an output file, filtering out lines the user (well, me) doesn't want. The filter is rather easy, simply requiring a particular value for some 'column' in the input file. Options are easily expressed with argparse:

parser.add_argument('-e', '--example', type=int, metavar='EXAMPLE', 
    help='require a value of %(metavar)s for column "example"')

There are a few of these, all typed. When the actual filter receives a line, deciding whether to include it is simple: split the line and check all the required filters:

c1, c2, c3, *_ = line.split('\t')
c1, c2, c3 = int(c1), int(c2), int(c3) # ← this line bugs me

if args.c1 and args.c1 != c1:
    return False

The second line bugs me a bit: the values start out as strings and I need them to be something else, so the types have to be changed. Although it's short, I'm not entirely convinced this is the best solution. Some other options:

create a separate function to hide the thing that bugs me;
remove the type= declarations from the options (also removes automatic user input validation);
coerce the arguments back to strs and do the comparison with strs (would lead to about the same as what I've got).

Which of the options available would be the 'best', 'most pythonic', 'prettiest'? Or am I just overthinking this...

int() skips leading and trailing whitespace and leading zeros. So, a string comparison is not quite the same. Which do you need? — Janne Karila, Mar 6 '13 at 13:59
I need the numeric value. My data doesn't contain extraneous whitespace or zeroes, though that doesn't much matter for the issue at hand. — akaIDIOT, Mar 6 '13 at 14:34
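
To make the difference raised in the comment concrete, here is a small check of how int() normalises input that a plain string comparison would treat as distinct. The sample values are made up for illustration and do not come from the asker's data.

```python
# Illustrative values only; none of these come from the original data.
samples = ["7", " 7", "7\n", "007", "+7"]

# int() accepts surrounding whitespace, leading zeros and a sign.
as_ints = [int(s) for s in samples]

# A string comparison against "7" only matches the exact text.
as_str_matches = [s == "7" for s in samples]
```

So comparing as strings is only equivalent to comparing as ints when the input is already in canonical form.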

Gareth Rees · Accepted Answer · 2013-03-07 10:00:01Z

1. Possible bugs

I can't be certain that these are bugs, but they look dodgy to me:

You can't filter on the number 0, because if args.c1 is 0, then it will test false and so you'll never compare it to c1. Is this a problem? If it is, you ought to compare args.c1 explicitly against None.
If the string c1 cannot be converted to an integer, then int(c1) will raise ValueError. Is this what you want? Would it be better to treat this case as "not matching the filter"?
If line has fewer than three fields, the assignment to c1, c2, c3, *_ will raise ValueError. Is this what you want? Would it be better to treat this case as "not matching the filter"?
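
The three cases above can be reproduced with a short sketch. naive_match is a hypothetical helper that mirrors the question's check for a single column; the sample lines are invented.

```python
def naive_match(required, line):
    """Mirror of the question's check for one column (hypothetical helper)."""
    c1, *_ = line.split('\t')
    c1 = int(c1)
    if required and required != c1:
        return False
    return True


# 1. A required value of 0 is falsy, so the filter is silently skipped:
#    the line is accepted although its first field is 5, not 0.
zero_filter_ignored = naive_match(0, "5\tx")

# 2. A non-numeric field makes int() raise ValueError.
try:
    naive_match(1, "abc\tx")
    bad_field_raises = False
except ValueError:
    bad_field_raises = True

# 3. Too few fields make the starred unpacking raise ValueError.
try:
    c1, c2, c3, *_ = "1\t2".split('\t')
    short_line_raises = False
except ValueError:
    short_line_raises = True
```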

2. Suggested improvement

Fixing the possible bugs listed above:

def filter_field(required, field):
    """
    Return True if the string `field` passes the filter `required`,
    which is either None (meaning no filter), or an int (meaning that
    `int(field)` must equal `required`).
    """
    # The parameter is called `required` so it does not shadow the
    # built-in filter().
    try:
        return required is None or required == int(field)
    except ValueError:
        return False

and then:

filters = [args.c1, args.c2, args.c3]
fields = line.split('\t')
if len(fields) < len(filters):
    return False
if not all(filter_field(a, b) for a, b in zip(filters, fields)):
    return False

1: figured that as well, though my current context will never 'require' a falsy value. 2: valid point, crashing isn't good, even though I'd expect only integers. 3: also a valid point, though my lines contain far more than 3 values in reality. Furthermore, I like your idea of using all and zip; does feel prettier to me :) — akaIDIOT, Mar 7 '13 at 11:21

asked	1 year ago
viewed	78 times
active	1 year ago
