Suppose you have a directed graph G with a source node s and a sink node t. You wish to count the number of different s-t paths (two paths differ as long as they are not the exact same sequence of edges). G may have cycles, so this number is possibly infinite; you wish to detect this.
If you suppose G is acyclic, a natural approach might be to do a depth first search from s. Each time you reach t, you have found another path.
class Node def dfs_count_paths(target) @@count += 1 if self == target @visited = true for child in @children child.dfs_count_paths(target) end @visited = false end end
Besides not working in the acyclic case, this has the potential to be very slow, because it will enumerate every single path in the graph. For example, if w is another node, there could be 100 s->w paths and 100 w->t paths; all 10,000 combinations of these two are valid s->t paths. Enumerating them all is expensive. However, if we knew both the number of s->w paths and w->t paths, we could compute this directly. With some thought, this idea can become a better approach. In a directed acyclic graph, we can order the vertices so that edges only point forward. In other words, if u is a descendent of v, then u will come after v in the sorted list of nodes. This is called a topological sort. Using this ordering, call s vertex 1, the next vertex 2, and so on, up to t which we may call vertex n. The number of paths from 1 to 2 is just the number of edges from 1 to 2. The number of paths from 1 to 3 is the number of (direct) edges from 1 to 3, as well as paths that use 2 as an intermediate vertex. More generally, let d(j) be the number of paths from 1 to j, then we have that:
This gives a dynamic programming algorithm: for j from 2 to n, compute d(j). d(n) gives the number of paths from 1 to n. This is much more efficient: for each node, computing d takes time proportional to the in-degree of the node, and overall it will take O(E) time, where E is the number of edges in the graph.
How do we topologically sort the graph? A depth-first search will visit all children of a node, and then finish visiting the node itself. This sounds a lot like the property we want for our topological sort: that the descendents of the node come after the node itself. We can express this as the following algorithm:
class Node def topological_sort @visited = true sorted_nodes =  for child in @children if not child.visited sorted_nodes = child.topological_sort + sorted_nodes end end return [self] + sorted_nodes end end
The combination of a topological sort with dynamic programming gives us a good algorithm for counting paths in an acyclic graph. What about when cycles are allowed? After the topological sort, if u comes after v, then u is a descendent of v, so there is a path from v to u. If there is also an edge from u to v, then there is a cycle. This edge is called a back-edge, because it goes against the direction of the topological sort.
So, we may detect cycles by looking for back edges. However, such a cycle doesn’t necessarily indicate that there are an infinite number of paths from s to t, because the cycle may not lie on a path from s to t at all. This is because there are “dead ends” that cannot reach t. We only wish to consider cycles involving nodes that can reach t.
To detect this, we again turn to our trusty tool the depth first search. The nodes that can reach t are the same as the nodes that we can see when exploring backwards along edges out from t. In other words, we will follow in-edges, not out-edges:
def prune_unreachable_nodes @visit = CAN_REACH_TARGET for node in @parents if node.visit == UNVISITED node.prune_unreachable_nodes end end end
This marks only the nodes that can reach t; the other (unreachable) nodes can be ignored.
Putting It Together
Now, to put it all together, the algorithm is as follows:
- Mark the nodes that can reach t using a DFS.
- Topologically sort these nodes.
- Starting from the position of node s in the topological sort, perform the dynamic programming algorithm to compute d for each node.
- If at any point a back edge is encountered, then there are an infinite number of paths.
- d(t) indicates the number of paths from s to t.
The following code gives the dynamic programming part of the approach:
class Graph def count_paths @t.prune if not @s.visit == Node::CAN_REACH_TARGET return 0 end sorted_nodes = @s.topological_sort for i in 0...sorted_nodes.length sorted_nodes[i].position(i) if sorted_nodes[i] == @s start_pos = i end end init = sorted_nodes[start_pos] init.paths(1) for i in start_pos...sorted_nodes.length cur = sorted_nodes[i] for child,count in cur.children if child.visit != Node::UNVISITED and child.position <= i return "INFINITE PATHS" end end #count the paths from init to node i, and save that number in node i paths = 0 if init.children[cur] paths = init.children[cur] end for prev,count in cur.parents paths += prev.paths*count end cur.paths(paths) end @t.paths end end