Depth-first search in Python

Question

I wrote this DFS in Python and I was wondering if it correct. Somehow it seems too simple to me.

Each node is a class with attributes:

visited: Boolean
sons: list of nodes linked to the current node.

This code should work both for directed and undirected graphs.

def DFS(node):

    node.visited = True

    if node == None:
        return

    // perform some operation on the node
    do_something(node)

    to_visit = [son in node.sons if not son.visited]
    for son in to_visit:
        DFS(node)

Is this pseudo-code? This bit won't compile in Python: [son in node.sons if !son.visited] — janos, yesterday

f.rodrigues · Answer 1 · 2015-01-26 04:53:35Z

up vote 4 down vote

DFS should keep track of all the nodes visited. Not the node.

The node only properties is it self, and it's children.

Check this amazing implementation:

graph = {'A': set(['B', 'C']),
         'B': set(['A', 'D', 'E']),
         'C': set(['A', 'F']),
         'D': set(['B']),
         'E': set(['B', 'F']),
         'F': set(['C', 'E'])}

def dfs(graph, start):
    visited, stack = set(), [start]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.add(vertex)
            stack.extend(graph[vertex] - visited)
    return visited

dfs(graph, 'A') # {'E', 'D', 'F', 'A', 'C', 'B'}

As you can see DFS is called just once and it keeps track of visted nodes all the way, it will use pop(-1) so it will Depth-search, you could either change to pop(0), so it would Breath-Search. It's the same concept, the only difference is which node is visited in which order.

Your implementation will have some problems if the graph is too deep, because it could raise a Maximum recursion depth exceeded.

answered yesterday

f.rodrigues
26010

Unless I'm missing something, if you only ever add unvisited nodes to the stack, you shouldn't need the if vertex not in visited check (presuming a node cannot have two links to the same child). Which you want would depend on whether set subtraction or contains is faster (I would guess the latter). – sapi 23 hours ago

Makes sense. Seems redundant. The code ain't mine tho, the link provide may have some info on that. – f.rodrigues 23 hours ago

@sapi A node would be pushed multiple times to stack if it can be reached from multiple nodes that happen to be visited before it. – Janne Karila 22 hours ago

@JanneKarila Good point; I was picturing a top-down tree (where that won't occur for a DFS), but of course that's a very specific case. – sapi 20 hours ago

add a comment |

ILoveCoding · Answer 2 · 2015-01-26 00:00:56Z

Nope, it is not correct. The problem is that the to_visit list can change after visiting a child of the currect node(for example, on a full graph with 3 vertices your implementation will visit the third node twice(if you run the depth-first search from the first vertex)). However, it is easy to fix it. You can just iterate over all children of the current node and run a depth-first from a child if it has not been visited yet(using an if statement inside the for loop).

ben rudgers · Answer 3 · 2015-01-26 01:16:49Z

Having visited as a property of a node feels a bit like a leaky abstraction because it embeds the needs of a particular tree operation into the more general abstraction of graphs and nodes.

To me, a node is a data structure that has some pointers to zero or more children and perhaps stores a value - depending on the semantics of a particular graph. Consider an acyclic graph with all values at stored at leaves versus various cyclic graphs with values stored at each node.

In the cyclic graph case depth first search is not directly applicable until an appropriate starting node has been selected and though the concept of visited may make sense for some cyclic graph algorithms, it wouldn't in the case of other algorithms such as Dijkstra’s shortest path.

To separate concerns, a depth first search algorithm might create it's own object consisting of a generalized node object and a visited field...or use any of many other data structure approaches.

Without isolation algorithms operating on the graph must walk it a second time to clear the visited field. If there is concurrency, then graph algorithms must lock the entire data structure and the depth first search should have transaction semantics.

Depth first search is actually applicable for cyclic graphs. — ILoveCoding, yesterday
@ILoveCoding Not productively before some meaningful selection of starting node. For utility some sort of appropriate acyclic element must be found or created via transformation. — ben rudgers, yesterday

200_success · Answer 4 · 2015-01-26 04:47:01Z

up vote 2 down vote

Checking if node == None makes no sense, immediately after having set node.visited = True. It will either crash or never match.

In any case, if mode is None is the standard way to check for None, and storing the visited flag in the graph modes themselves ruins the graph for future traversals.

answered yesterday

200_success♦
53.8k463213

By "standard way", it might be good to clarify it's PEP8. Also, an interesting post about == None vs is None: jaredgrubb.blogspot.fr/2009/04/python-is-none-vs-none.html – janos yesterday

add a comment |

asked	yesterday
viewed	141 times
active	today

current community

your communities

more stack exchange communities

Depth-first search in Python

4 Answers 4

Your Answer

Not the answer you're looking for? Browse other questions tagged python graph depth-first-search or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Depth-first search in Python

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python graph depth-first-search or ask your own question.

Related

Hot Network Questions