Topological Sort Algorithm

A topological sort is an ordering of the nodes of a directed graph such that if there is a path from node a to node b, then node a appears before node b in the ordering.

Graph Recursion and Processing state

The idea is to go through the nodes of the graph and always begin a depth-first search at the current node if it has not been processed yet. During the searches, the nodes have three possible states:

  • state 0: the node has not been processed (white)
  • state 1: the node is under processing (light gray)
  • state 2: the node has been processed (dark gray)

Initially, the state of each node is 0. When a search reaches a node for the first time, its state becomes 1. This is our pre-processing step for the node

If the graph contains a cycle, we will find this out during the search, because sooner or later we will arrive at a node whose state is 1. In this case, it is not possible to construct a topological sort. This is the pre-processing step for the edge.

If the graph does not contain a cycle, we can construct a topological sort by adding each node to a list when the state of the node becomes 2. This is our post-processing step for the node. This list in reverse order is a topological sort

As a result we can implement the DFSearch trait in the following way in relation to the above pre-processing & post-processing steps

/// Graph state that we need to maintain
/// for the topological sort algorithm
struct TState {
    tracker: Tracker,
    path: Vec<Node>,
    abort: bool
}

impl TState {
    /// Construct a new `GraphState` given a `Graph`
    fn new(g: &Graph) -> TState {
        TState {
            tracker: g.get_tracker(Undiscovered, 0, None),
            path: Vec::new(),
            abort: false
        }
    }
}

/// Topological sort implementation of the TState
/// There is no need for exit/entry time or tracking parent node.
/// Here we only need to save the `node` in the `tracker.path` following its full processing
impl DFSearch for TState {
    type Output = Vec<Node>;

    /// mark node as visited but not processed
    fn pre_process_node(&mut self, node: Node) -> &mut Self {
        self.tracker[node].visited(Discovered);
        self
    }

    /// Important we store the node in the path following node processing complete
    fn post_process_node(&mut self, node: Node) -> &mut Self {
        self.tracker[node].visited(Processed);
        self.path.push(node);
        self
    }

    /// before we jump into the edge for further exploration
    /// we check if the edge is actually a node already `Discovered` but not `Processed`
    /// if that is the case, we set the abort flag to `True`
    fn pre_process_edge(&mut self, edge: Edge) -> &mut Self {
        let Edge(_,dst) = edge;
        if self.tracker[dst.into()].visited == Discovered {
            self.abort = true;
        }
        self
    }

    /// Implement the abort fn() so we can stop the path search recursion
    fn abort(&self) -> bool {
        self.abort
    }

    /// extract the aggregate path stored
    fn path(&self) -> &Self::Output {
        &self.path
    }

    /// return true if node is either `Discovered` or `Processed`
    fn is_discovered(&self, node: Node) -> bool {
        self.tracker[node].is_discovered()
    }
}

Implementation

When the search has completed and has exhausted all paths the path member of the Tracker structure will now contain the order by which the nodes have been visited. As a result we only have to reverse such order and return it.

/// Topological Sort trait
pub trait TopologicalSort {
    fn topological_sort(&self) -> Option<Vec<Node>>;
}
/// Graph implementation of Topological Sort
impl TopologicalSort for Graph {
    /// Implementation of topological sort for Graph
    fn topological_sort(&self) -> Option<Vec<Node>> {
        // initiate the run state structure for calculating the topological sort of the graph
        let mut ts = TState::new(self);

        // Construct a path aggregate, that is, each path found is joined up together
        // achieved by appending the path of each iteration onto tracker.path
        // see post_processing() of TState implementation of DFSearch
        for &node in &self.nodes {
            // if node is not yet visited && search hasn't thrown a NONE, that is, we've found a circle
            if !ts.is_discovered(node)
                && ts.path_search(self, node).is_none() {
                return None
            }
        }

        // Extract & reverse path from tracker so we extract the topological sort
        ts.path.reverse();
        Some(ts.path)
    }
}