Resolving multiple “paths” in nested attributes

Question

I need to resolve "multiple paths" specified like so:

p1 = 'item.element.[compact|fullsize].documents.[note|invoice]'

A list like this is expected:

['item.element.compact.documents.note', 'item.element.fullsize.documents.note',
 'item.element.compact.documents.invoice', 'item.element.fullsize.documents.invoice']

Code:

def resolve_paths(path):
    parts = path.split('.')
    depth = len(parts)
    new_paths = []
    for level, part in enumerate(parts):
        mult_branches = re.findall(r'\[(\w+)(?:\|(\w+))*\]', part)
        if mult_branches:
            mult_branches = flatten_iterable(mult_branches)
            for branch in mult_branches:
                interm_path = '.'.join(parts[:level] + [branch] + parts[level+1:])
                new_paths.extend(resolve_paths(interm_path))
            return new_paths
        elif level == depth - 1:
            new_paths.append(path)
    return new_paths

Several tests I wrote for this function pass, but I'm not entirely happy with this solution, it's somewhat convoluted. Better solutions? Simplifications?

janos · Answer 1 · 2015-07-31 19:22:30Z

What do you think this will return for part = '[compact|fullsize|x|y]'

re.findall(r'\[(\w+)(?:\|(\w+))*\]', part)

It will give:

[('compact', 'y')]

Because, when a group matches multiple times, the last match overwrites previous matches, as per the docs. If you want to support multiple patterns, I'm afraid you will have to do in two steps: matching and then splitting.

The part I don't like about this is that you split path to parts, then when you need to recurse, you re-join the parts again, and in the recursion step it will be split again, and so on. That's a lot of splitting and joining and iterations. The recursive logic can be also tricky to understand.

Here's an alternative, without using recursion and unnecessary splitting, joining, iteration:

from collections import deque


def resolve_paths(path):
    parts = deque(path.split('.'))
    paths = [[]]

    while parts:
        part = parts.popleft()
        branches = re.findall(r'\[(\w+)(?:\|(\w+))*\]', part)
        if branches:
            orig_paths = paths[:]
            paths = []
            for branch in branches[0]:
                for path in orig_paths:
                    paths.append(path + [branch])
        else:
            for path in paths:
                path.append(part)

    return ['.'.join(path) for path in paths]

Malachi · Answer 2 · 2015-07-31 21:45:40Z

up vote 3 down vote

This combinatorial problems typically have a compact and elegant solution using itertools. In this case, it is itertools.product you want to use:

from itertools import product

def resolve_paths(path):
    subpaths = path.split('.')
    for idx, subpath in enumerate(subpaths):
        if subpath[0] == '[' and subpath[-1] == ']':
            subpaths[idx] = subpath[1:-1].split('|')
        else:
            subpaths[idx] = [subpath]
    for path in product(*subpaths):
        yield '.'.join(path)

I have made it an iterator, which I like better for this type of problems.

>>> path = 'item.element.[compact|fullsize].documents.[note|invoice]'
>>> list(resolve_paths(path))
['item.element.compact.documents.note',
 'item.element.compact.documents.invoice',
 'item.element.fullsize.documents.note',
 'item.element.fullsize.documents.invoice']

edited yesterday

Malachi
18.8k349128

answered yesterday

Jaime
2,572515

I feel like this is the best solution presented so far. It avoids recursion, is the most compact, and doesn't even use any regular expressions. – Curt F. yesterday

+1 I agree, awesome solution – mjgpy3 19 hours ago

Elegant, best solution +1. It also treats the issue I pointed out in my first point. – janos 17 hours ago

add a comment |

mjgpy3 · Answer 3 · 2015-07-31 19:53:34Z

The question's not too specific, so here's a "competing" solution to compare:

def resolve_paths(components, results = []):
  if not components:
    return results

  current = components[0]
  branches = split_branches(current) if has_branches(current) else [current]

  return resolve_paths(components[1:], list(append_all(branches, results)) or branches)

def append_component(acc, component):
  return acc + '.' + component

def split_branches(component):
  return component[1:-1].split('|')

def split_components(path):
  return path.split('.')

def has_branches(component):
  return component.startswith('[')

def append_all(branches, results):
  for branch in branches:
    for result in results:
      yield append_component(result, branch)


p1 = 'item.element.[compact|fullsize].documents.[note|invoice]'

print resolve_paths(split_components(p1))

Positives

resolve_paths is fairly simple and straight-forward. It's using an recursion and accumulator pattern which clears up a lot of bookkeeping.
Overall the code's pretty explicit (I believe). For example segments like split_branches(current) if has_branches(current) else [current] are usually pretty nasty, arguably, however, the named helper methods enhance explicitness.
I made a lot of assumptions about parsing that simplified out the regular expression. If these assumptions are wrong, that's okay, they're pretty easily changeable as they are in their own methods.

Negatives

Still recursive, so it could break in extreme edge-cases
Some expressions are overloaded (e.g. list(append_all(branches, results)) or branches)
The list(list(append_all(branches, results)) or branches)/yield pattern is kind of gross. I did it to alleviate the need for an accumulator list in append_all.

asked	yesterday
viewed	91 times
active	yesterday

current community

your communities

more stack exchange communities

Resolving multiple “paths” in nested attributes

3 Answers 3

Positives

Negatives

Your Answer

Not the answer you're looking for? Browse other questions tagged python recursion or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Resolving multiple “paths” in nested attributes

3 Answers 3

Positives

Negatives

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python recursion or ask your own question.

Related

Hot Network Questions