Computer scienceProgramming languagesJavaInterview preparationAlgorithms and problem solving techniques

Minimum spanning tree problem - Prim and Kruskal algorithms

10 minutes read

Introduction

In this topic, we are going to compare two approaches to solve the problem of minimum spanning tree in a connected undirected weighted graph.

Recall that to solve the minimum spanning tree problem, you should find a subset of the graph edges that form a tree connecting all the nodes and has the minimum possible sum of edge weights.

MST problem attributes

Here are common attributes to identify MST problem:

  1. You have an entity that can be represented as a weighted or unweighted graph. For example, a map with cities' roads.

  2. Edges of the graph have unique or equal values. In the cities' map, you have a distance between cities.

  3. A problem is to find a tree that connects all edges, does not produce a cycle, and the sum of the edges is minimum. In the example of the cities' map, it can be a request to find the shortest distance between all cities.

  4. The graph can be dense or sparse.

There are two types of graph distribution: dense and sparse.

A dense graph has a high number of edges compared to the number of nodes in the graph. A sparse graph has few edges relative to the number of nodes. In a sparse graph, many pairs of nodes are not connected by an edge.

Prim's vs Kruskal's algorithms

There are two widely used algorithms to find a minimum spanning tree for weighted graphs: Prim's and Kruskal's. Both of them are greedy and guarantee to solve the MST problem.

The main difference is the approach. Let's briefly recall what is the key point of each algorithm:

Prim's algorithm:

  1. Start by selecting a node and building a tree from that node

  2. Add iteratively edges to the tree that connect the tree to a non-tree edge with the smallest weight.

  3. Repeat step two until all nodes are included in the tree.

Kruskal's algorithm:

  1. Start by sorting all edges in the graph by weight.

  2. Add edges to the minimum spanning tree one at a time in ascending order of weight.

  3. Check whether adding the next edge to the tree would create a cycle. If not, the edge is added to the tree.

  4. Repeat steps 2 and 3 until all nodes will be included in the tree.

Now it's time to compare them. Through the table, we can easily feel the difference.

Prim's Algorithm

Kruskal's Algorithm

Initiate point

Starts from a random node and builds the tree outwards

Starts with all disconnected nodes and connects them gradually

Method

Add the minimum weight edge to the tree

Sort all edges by weight. Add the edge with minimum weight. The next edge should not create a cycle.

Efficiency

Better for dense graphs

Better for sparse graphs

Memory Complexity

More memory-intensive: it requires a priority queue to store nodes

Less memory-intensive: it only requires a list to store edges

Time Complexity
(N - number of nodes, M - number of edges)

O(M log N) with a binary heap

O(M log M) with a sorting algorithm

Next, we discuss distinctions.

When is it better to use Prim's and Kruskal's?

In general, a choice between these algorithms depends on graph specifics and requirements for a solution.

Prim's approach is better to use in the following cases:

  1. A graph is dense. This is because the algorithm works step-by-step, building a minimum spanning tree from one vertex node.

  2. You work with a disconnected graph. In this case, Prim's algorithm can start building a minimum spanning tree from multiple vertex nodes, which can be useful for solving some problems.

Here are the conditions when it is better to use Kruskal's:

  1. A graph is sparse. The algorithm works with a sorted list of edges, which reduces execution time.

  2. A key constraint is memory efficiency. Kruskal's algorithm requires less memory than Prim's algorithm, as it only needs to store a set of nodes and a set of edges.

Complexity

In general, the time complexity of Prim's is O(M log N) and Kruskal's — O(M log M) due to the approach to how they work. Here are M — the number of edges and N — the number of graph nodes.

Briefly, for Prim's algorithm, we have this case. The complexity to update and find the minimum edge will be O(log N) if a priority queue is implemented via a binary heap. Therefore, the total time complexity for M edges is O(M log N).

In Kruskal's algorithm case, we start with sorting all the edges, that require O(M log M) operations using quicksort. In the next step adding each edge to the disjoint set takes O(log N) time in the worst case. In total, it's required to add N-1 edges to the minimum spanning tree, so the execution time is O(M log M + N log N). Usually, the number of nodes is less than the number of edges and we can simplify it to O(M log M).

In terms of memory complexity, Kruskal's algorithm is generally more memory-efficient than Prim's algorithm. There are the following reasons:

  • for Kruskal's algorithm only needs to store the edges of the graph in a list

  • Prim's algorithm requires a heap data structure that can keep the minimum-weight edges.

Real-life problem examples

Here are two issue examples when we can apply different approaches.

Let's imagine that you are a state major. You have a budget to repair the roads, but there is no guarantee that this budget will be enough to repair all possible roads between all cities. So you want to repair at least one full path that connects all the cities in your area. To achieve it, you need to find a minimum distance between all cities that you repair first.

Prim's algorithm suits this example to find the minimum spanning tree that connects all the cities.

Here is another example for Kruskal's algorithm. You are an engineer at a water supply station. A new country house area was built, and your task is to supply water to each house. Usually, a house does not have a pipe connection with another house, but you may have some water reservoirs that connect to a limited number of houses.

For this case, we have a typical sparse graph, so it is better to Kruskal's algorithm.

Conclusion

In this topic, we compared two approaches to solve the problem of minimum spanning tree in a connected undirected weighted graph: Prim's algorithm and Kruskal's algorithm.

Both algorithms have advantages and disadvantages. For example, Kruskal's algorithm is generally faster when the graph is sparse, while Prim's algorithm is faster for dense graphs. Kruskal's algorithm is also more memory-efficient than Prim's algorithm, as it only needs to store a set of nodes and a set of edges.

One way or another, both algorithms are effective for solving the MST problem, and the choice between them depends on the specific requirements and constraints of the problem.

12 learners liked this piece of theory. 0 didn't like it. What about you?
Report a typo