Kosaraju’s algorithm

Kosaraju’s algorithm is an efficient method for finding the strongly connected components of a directed graph.

Approach

The algorithm performs two depth-first searches

  1. the first search constructs an ordered node list of nodes according to the structure of the graph.
  2. the second search applies the ordered node list against the reversed edges of the graph in order to find the strongly connected components.

Graph Recursion and Processing State

In the first Depth First Search we need to calculate per node

  • the exit time, that is, the time in which the node has been Processed, that is, there is nothing left to be found.
  • the node state in relation to any of the states, Undiscovered, Discovered or Processed

Recursion is a key implementation approach that will enable us to perform

  1. Node pre-processing, e.g. capture/log the entry time and before search any deeper
  2. Node post-processing, e.g. capture/log the exit time after there is no path remaining to be found from this node

As a result to measure time across recursions and without the use of a global variable, we resort to the GraphState struct that

  • implements the DFSearch trait that provides the recursive function
  • holds the recursion state for time, path at node, node state & ordered list

In addition, GraphState provide us with the Tracker structure that simplifies handling of the node processing state while we are search the graph.

/// GraphState struct enable us to maintain the processing state of the graph
/// and while we apply a recursive approach in searching the graph
struct GraphState {
    tracker: Tracker,
    queue: BinaryHeap<NodeType>,
    time: Cost,
    path: Vec<Node>
}

impl GraphState {
    /// Construct a new `GraphState` given a `Graph`
    fn new(g: &Graph) -> GraphState {
        GraphState {
            tracker: g.get_tracker(Undiscovered, 0, None),
            queue: BinaryHeap::new(),
            time: 0,
            path: Vec::new()
        }
    }

    /// Extract from `BinaryHeap` the exit times per ordered from max -> min
    fn get_timings(&self) -> Vec<(Node, Cost)> {
        self.queue.iter().rev().map(|&s| {
            let NC(n, c) = s else { panic!("get_timings(): node type is not NodeType::NC") };
            (n,c)
        }  ).collect::<Vec<_>>()
    }
}

/// Graph State implements DFSearch trait and particularly provides specific implementation for
/// the calculation of the strongly connected components, in terms of node post/pre processing fn(),
/// path return fn() and node state fn()
impl DFSearch for GraphState {
    type Output = Vec<Node>;

    /// capture time of entry and set node state to visited,
    /// given the node's edges have yet be visited
    fn pre_process_node(&mut self, node: Node) -> &mut Self {
        // Entering the node at time tick()
        self.time += 1;
        self.tracker[node].visited(Discovered).distance(self.time);
        self
    }

    /// capture time of exit and set node state to processed,
    /// given all edges have also been processed
    fn post_process_node(&mut self, node: Node) -> &mut Self {
        // Exiting the node at time tick()
        self.time += 1;
        self.tracker[node].visited(Processed).distance(self.time);
        self.queue.push(NC(node, self.time));
        self.path.push(node);
        self
    }

    /// Return the path as it was calculated by the post processing step
    fn path(&self) -> &Self::Output {
        &self.path
    }

    /// return the state of the node
    fn is_discovered(&self, node: Node) -> bool {
        self.tracker[node].is_discovered()
    }
}

Transpose the graph

The GraphState will help us capture the node order by which we will run search on the second pass. However, the second pass must run against the transposed graph, that is, the graph with all edges reversed.

impl Graph {
    pub fn transpose(&self) -> Graph {
        self.nodes.iter()
            .fold(Graph::new(), |mut g, &node| {
                g.nodes.insert(node);
                // reverse the edges for this node, if any
                if let Some(edges) = self.edges.get(&node) {
                    edges.iter()
                        .for_each(|&e|{
                            g.nodes.insert(e.into());
                            g.edges.entry(e.into()).or_default().insert(node.into());
                        });
                }
                g
            })
    }
}

Final implementation

With all of the above elements in place, The below function provides an implementation approach to the algorithm

pub trait ConnectedComponents {
    fn strongly_connected(&self) -> Vec<Vec<Node>>;
}

impl ConnectedComponents for Graph {
    fn strongly_connected(&self) -> Vec<Vec<Node>> {

        // initiate the run state structure for calculating the scc of the graph
        // and in order to enable recursive searching in rust
        let mut gs = GraphState::new(self);

        // Pass 1: Find all paths and calculate entry and exit times per node
        self.nodes.iter()
            .for_each(|&start| {
                // println!("Start >> {start}");
                if !gs.tracker[start].is_discovered() {
                    let path = gs.path_search(self, start);
                    println!("Pass 1: Path {:?}",path);
                    gs.path.clear();
                }
            });

        // Extract node sequence ordered by highest exit times
        let v = gs.get_timings();
        println!("Timings: {:?}",v);
        // reverse the graph edges
        let tg = self.transpose();
        // reset run state
        gs = GraphState::new( &tg);

        // Pass 2: Identify and store each strongly connected component identified
        v.into_iter()
            .fold(Vec::new(),|mut components, (node, _)| {
                if !gs.is_discovered(node) {
                    // extract new component
                    let component = gs.path_search(&tg, node ).unwrap();
                    println!("Pass 2: Component [{}]{:?}", component.len(), component);
                    // store component found
                    components.push(component.clone() );
                    // reset path so to remove last found component
                    gs.path.clear();
                }
                components
            })
    }
}