0.4.5.2 Kruskal's Algorithm
Kruskal's algorithm for finding a minimal spanning tree in
a connected graph is a greedy algorithm; that is, given a
choice, it always processes the edge with the least weight.
This algorithm operates by considering edges in the graph in order of
weight from the least weighted edge up to the most while keeping track
of which nodes in the graph have been added to the spanning tree. If
an edge being considered joins either two nodes not in the spanning
tree, or joins a node in the spanning tree to one not in the spanning
tree, the edge and its endpoints are added to the spanning tree.
After considering one edge the algorithm continues to consider the
next higher weighted edge. In the event that a graph contains equally
weighted edges the order in which these edges are considered does not
matter. The algorithm stops when all nodes have been added to the
spanning tree.
Note that, while the spanning tree produced will be
connected at the end of the algorithm, in intermediate steps
Kruskal can be working on many independent, nonconnected
sections of the tree. These sections will be joined before
the algorithm completes.
Often this algorithm is implemented using parent pointers
and equivalence classes. At the start of the
processing, each vertex in the graph is an independent
equivalence class. Looping through the edges in order
of weight, the algorithm groups the vertices together
into one or more equivalence classes to denote that these
nodes have been added to the solution minimal spanning tree.
It is a good idea to process the edges by putting them into
a minheap. This is usually much faster than sorting the
edges by weight since, in most cases, not all the edges will
be added to the minimal spanning tree. See the section on
the heapsort and the heap data structure
for more information about minheaps.
