Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I have a little function that parse a nested list of coordinates into a flat list of compressed coordinates. By compressed coordinated, I mean that only the delta (distance) between each coordinates are stored in the list and the float coordinates are transformed in integer.

input = [[-8081441,5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [-8081441, 5685214, 5, -2, -4, -3, -2, 8, 1, -3]

def parseCoords(coords):
    #keep the first x,y coordinates
    parsed = [int(coords[0][0]), int(coords[0][1])]
    for i in xrange(1, len(coords)):
        parsed.extend([int(coords[i-1][0]) - int(coords[i][0]), int(coords[i-1][1]) - int(coords[i][1])])
    return parsed

parsedCoords = parseCoords(input)

As the input list is really big, is there a way to improve the function, maybe by using generators or list comprehension?

share|improve this question
up vote 4 down vote accepted

for i in xrange(1, len(coords)) is a red flag, since it is preferred in python to iterate directly over the elements rather than the indices. If you trully need the indices, you can still use enumerate.

Here it would look like

for i, coord in enumerate(coords[1:]):
    previous = coords[i]
    parsed.extend([int(previous[0] - coord[0]), int(previous[1] - coord[1])])

which seems worse as it

  1. creates a copy of coords when slicing it;
  2. still uses an index to retrieve the previous element.

Instead, it seems better to convert the list into an iterator an manually handle the current/previous coordinates. Something like:

def parse_coordinates(coords):
    iterator = iter(coords)
    previous_x, previous_y = iterator.next()
    parsed = [int(previous_x), int(previous_y)]
    for current_x, current_y in iterator:
        parsed.append(int(previous_x - current_x))
        parsed.append(int(previous_y - current_y))
        previous_x, previous_y = current_x, current_y
    return parsed

You can note the use of append instead of extend that will avoid building a temporary list. append try to be smart when resizing the underlying array so that two consecutive appends should not have more performance hit than extend.

But all in all, using append or extend in a for loop is often better written using a list-comprehension or a generator. You can easily turn this function into a generator by turning these append into yields:

def parse_coordinates(coords):
    iterator = iter(coords)
    previous_x, previous_y = iterator.next()
    yield int(previous_x)
    yield int(previous_y)
    for current_x, current_y in iterator:
        yield int(previous_x - current_x)
        yield int(previous_y - current_y)
        previous_x, previous_y = current_x, current_y

There is an other advantage to this approach: if the coords parameter is empty, your approach using coords[0] and the first one building a list using iterator.next() will crash raising either an IndexError or a StopIteration.

This generator can easily be fed to the list constructor or a for loop and won't crash; producing either an empty list or not entering the loop.


Lastly, you could drop manually managing the previous/current thing by using itertools.tee which is the key feature of the pairwise recipe:

from itertools import tee, izip


def parse_coordinates(coords):
    prev, cur = tee(coords)
    x, y = cur.next()
    yield int(x)
    yield int(y)
    for (previous_x, previous_y), (current_x, current_y) in izip(prev, cur):
        yield int(previous_x - current_x)
        yield int(previous_y - current_y)
share|improve this answer
    
thank you for your great answer! – Below the Radar 3 hours ago
    
just to tell you there is a typo in last code block line 9 – Below the Radar 3 hours ago
    
@Graipher Damn, even re reading the line thrice I couldn't see it... – Mathias Ettinger 44 mins ago

1.Python functions are using underscore as naming separator

So your function parseCoords should be named as parse_coordinates, please read PEP8

2.Your current code can be simplified to

from itertools import chain
input = [[-8081441, 5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [-8081441, 5685214, 5, -2, -4, -3, -2, 8, 1, -3]


def get_diff(a, b):
    return int(a[0] - b[0]), int(a[1] - b[1])


def parse_coordinates(coords):
    # keep the first x,y coordinates
    parsed = chain.from_iterable(map(get_diff, zip(chain([[0, 0]], coords), 
                                                   coords)))
    return list(parsed)

3. If you really thing you code list is going to be huge, you might convert this function into a generator so you will use only O(1) memory instead of O(n). To do this with code above using Using python 2

from itertools import chain, imap, izip
input = [[-8081441, 5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [-8081441, 5685214, 5, -2, -4, -3, -2, 8, 1, -3]


def get_diff(a, b):
    return int(a[0] - b[0]), int(a[1] - b[1])


def parse_coordinates(coords):
    # keep the first x,y coordinates
    parsed = chain.from_iterable(imap(get_diff, izip(chain([[0, 0]], coords),
                                                     coords)))
    for item in parsed:
        yield item

Using python 3

from itertools import chain
input = [[-8081441, 5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [-8081441, 5685214, 5, -2, -4, -3, -2, 8, 1, -3]


def get_diff(a, b):
    return int(a[0] - b[0]), int(a[1] - b[1])


def parse_coordinates(coords):
    # keep the first x,y coordinates
    parsed = chain.from_iterable(map(get_diff, zip(chain([[0, 0]], coords),
                                                   coords)))
    yield from parsed
share|improve this answer
    
@MathiasEttinger I didn't get question right, thanks. I've edited my code – Alex 5 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.