When working with graphs, sometimes we need to know how far some two nodes are from each other. For example, recall our graph with edges representing roads in a city and nodes corresponding to the roads' intersections. If we need to get from one point of the city to another, we probably want to go for the shortest route. Another example is a graph with each node representing an airport and edges corresponding to flights between two airports. If we have to get from one airport to another and there is no direct flight between them, then we'd prefer to use the route with fewer transfers.
In terms of graphs, the described problems can be formulated as follows: given two nodes of a graph named source and target, find a path between them consisting of the smallest possible number of intermediate nodes.
One of the algorithms that allows us to solve this problem is the breadth-first search. It is a simple and efficient algorithm that can be used either as a standalone procedure or a subroutine for other graph processing algorithms. In this topic, we will learn the procedure in more details and see how it works on an example.
Breadth-first search procedure
The breadth-first search (BFS) traversal begins with a single node of a graph. Then, the algorithm visits all its neighbors. After that, it considers all the neighbors of the nodes visited at the previous step and so on until all the nodes are processed. In other words, BFS gradually visits all nodes of a graph in the order of increasing distance from the start node.
The algorithm goes like this:
-
Choose an initial node of a graph, mark it as visited and set its distance to .
-
Consider all unvisited neighbors of the nodes visited at the previous step and mark all of them as visited. Set the distance to each of these nodes to , where is the distance to the nodes processed at the previous step.
-
Repeat step 2 until all the nodes are visited.
After the procedure is completed, the distance to each node will correspond to the shortest distance between the initial node and .
To implement BFS using a particular programming language, you will need to use the queue data structure. The idea is to put a starting node to the queue, then at each step extract the current node from it, consider all the neighbors and put them to the queue, and so on. This allows us to traverse the nodes of a graph exactly in the order described above.
Pseudocode
Let's explore the structure of a graph systematically and find the shortest paths using the Breadth-First Search (BFS) algorithm. This algorithm efficiently finds the shortest paths from a starting node to all other nodes in the graph, providing a foundation for various graph processing tasks. The following pseudocode demonstrates the step-by-step procedure for BFS traversal.
// Function to perform BFS on a graph
function BFS(graph, start):
// Initialize a queue for BFS
queue = Queue()
// Mark the start node as visited and enqueue it
visited = set()
visited.add(start)
queue.enqueue(start)
// Set the distance of the start node to 0
distances = {start: 0}
// BFS traversal
while not queue.isEmpty():
// Dequeue a node from the queue
current = queue.dequeue()
// Process the current node (customize as needed)
processNode(current)
// Visit all unvisited neighbors of the current node
for neighbor in graph[current]:
if neighbor not in visited:
// Mark neighbor as visited, set its distance, and enqueue
visited.add(neighbor)
distances[neighbor] = distances[current] + 1
queue.enqueue(neighbor)
// Function to process a node during BFS (customize as needed)
function processNode(node):
// Print or perform any desired operation on the node
print("Visiting node:", node)An example
Let's apply the algorithm for the following undirected graph using as a starting node:
The figures below illustrate the procedure step by step. The visited nodes are shown in blue, and the unvisited neighbors of the visited nodes are green. The remaining nodes are shown in black. The label of each visited node corresponds to the shortest distance between the initial node and .
We begin the procedure with the node setting its label to . Then, we consider all its unvisited neighbors. There are five of them shown in green in the first figure. We set the distances to each of the nodes to and mark them as visited. At the second step, we consider all the unvisited neighbors of the nodes labeled . There are four of them shown in green in the second figure. We set the distance to each of the nodes to and move to the next unvisited neighbors. After all the nodes are processed, we get the following graph:
The label of each node of that graph corresponds to the shortest distance between the starting node and . The starting node itself has the label . As you can see, the blue nodes and edges of the graph form a tree. Such a tree is called a spanning tree since it connects (spans) all the nodes of the graph. So, finding a spanning tree is one of the applications of the BFS procedure.
Although the graph above is undirected, BFS works the same way for directed graphs.
If you want to see another example of how the algorithm works, you may take a look at a visualization of the process.
Complexity analysis
While applying the BFS procedure for a graph , each node is added to a queue and extracted from it exactly once. Each edge is considered exactly twice: while visiting and then while visiting (or vice versa). So, the overall running time of the algorithm is .