Imagine that there are roads running from city A to city B. Cars are driving on the roads, but there are always traffic jams on some sections of roads, and the rest are empty. You need to understand whether you need new roads or you can distribute the flow of the vehicles in such a way as to make better use of the existing routes. What is the maximum number of cars that can travel from A to B now?
Important terms
First off, let's define the problem properly. The maximum flow problem, as you may have understood, is all about finding such a flow in a network, that the flow from the source to the sink (you remember what these are, don't you?) is as big as possible.
Additionally, you need to understand some other terms in order to go on with this topic: in particular, different types of graphs. While reading about them and looking at examples, you sometimes may come across edges that are not present in the initial flow network. Remember that these are oppositely directed edges that appear due to skew symmetry.
Now, let's talk about residual capacities. Formally, a residual capacity is the difference between the capacity of an edge and its flow. In other words, it is the maximum flow that can be pushed through a certain edge at the moment. For instance, if an edge AB has a flow of 5 and a capacity of 10, the residual capacity for it will be . Mind that pushing the flow through/along the edge means adding some more flow to the current one. For example, if you have an edge AB with a flow of 3 and a capacity of 7, you may push a flow of 4 through it. By doing so, you will increase the flow of the edge to 7.
A definition strongly connected to residual capacity is the residual graph – a graph, each edge of which has a residual capacity assigned to it.
For example, let's create a residual graph for the following network:
It will look like the one below. Let's name this graph .
You may wonder why the edge 4-2 has the residual capacity of 2. Well, it can be explained like this: now you know that there is a flow of two cars going from 2 to 4. But maybe you don't need to send them there for optimal flow. Therefore, you put down the residual capacity of 2, meaning these two cars may go back from 4 to 2 if needed.
In a residual graph, you can find an augmenting path. It is a path from the source to the sink that consists only of the edges with the positive residual capacity. Therefore, by pushing a flow along this path, the total flow of the network is increased. An important property is connected to this path: the flow is maximal if and only if there is no augmenting path in the residual network. Indeed, it means that there is no path, along which a flow may be pushed. Consequently, the total flow of the network can't be increased in any way.
For example, in the above-mentioned residual graph, you can find an augmenting path 1-3-5. After you push the flow of 1 through it, there are no more augmenting paths left, and the residual graph is the following:
Last but not least is the level graph. Each node of this graph stores the length of the shortest path from the source to it. Recall that the length of a path is simply the number of edges in it.
For instance, for , the level graph will look like the one below. Remember that the numbers in the nodes are the lengths from the source to the node.
You may wonder why you need to learn about all these various types of networks, but they are used in different algorithms that solve the maximum flow problem. Therefore, it is essential to know and understand all the above-mentioned theory.
Additional examples
The example with the roads and vehicles is not the only application of the maximum flow problem.
For instance, you may come across it while engineering different networks (e.g. an Internet network). Unfortunately, wires can pass only a limited number of bits per second. Therefore, it is crucial to make the most efficient use of them and maximize the traffic. In this situation, methods used to solve the maximum flow problem may come in handy.
Another example is the pipe system. Nowadays, multistory buildings are very common. In order to provide all residents with sufficient amount of water without creating too many pipes, the existing ones should be used in the most efficient way. It is exactly the issue the maximum flow problem can address! It may be applied to electricity, gas, and other similar systems in the same way.
There are many more applications of the maximum flow problem:
-
Baseball elimination. Sometimes sports analysts want to find out whether a certain team may get the most wins compared to the other ones. One approach to solving this is by narrowing it down to a maximum flow problem.
-
Airline scheduling. Airline companies find it crucial to create the most efficient schedule for flights. One popular solution is to formulate this task as a maximum flow problem. Thanks to such an approach, you can easily maximize the number of flights the given number of crews can perform or, vice versa, find the minimum number of crews needed to perform a certain number of flights.
-
Image segmentation. It is a common task nowadays to try to distinguish the foreground and the background of a picture. In order to do so, you can break it down into pixels and create a flow network. In this network, each node is a pixel and it is connected to the neighboring pixels. Additionally, there are a source and a sink. In the beginning of the algorithm, you assign the following values: likelihood of a pixel being a part of the foreground/background ( and , respectively) and penalties for the fact that two adjacent pixels i and j are classified differently. Then you maximize the following: .
Solving the problem
As you can see from the previous section, the maximum flow problem is common in many fields. As a result, there are plenty of algorithms that aim to solve it. Below you will see the 3 most common of them.
-
Ford-Fulkerson algorithm. The idea behind this algorithm is the following: the initial flow equals 0. Then you look for augmenting paths in the residual graph and increase the total flow with the size of the flow that was pushed along the path. If there is no such path left, the maximal flow has been found and the algorithm is stopped. The time complexity of this algorithm is a bit unusual and equals , where is the number of edges and is the maximum flow.
-
Edmonds-Karp algorithm. Just like in the previous algorithm, the flow equals 0 in the beginning. Then, on every new step, a residual graph is created. After that you look for the shortest path from the source to the sink in this graph and the total flow is increased by the maximal flow that can be pushed along this path. The algorithm is repeated while a path from the source to the sink may be found. As for time complexity, it is , where V is the number of vertices in the graph.
-
Dinic's algorithm. In this algorithm, both residual and level graphs are used. To begin with, you create a level graph. After checking whether the path from the source to the sink exists, you push as many flows as possible, until no augmenting paths are left. These two steps are repeated while a path from the source to the sink exists. In general, time complexity equals , but it can be reduced to if you use a data structure called dynamic trees.
Maximum flow of minimum cost
It is important to understand that there may be several maximum flows in a network.
In some cases, you can choose whichever you want. However, frequently there are some other values that may depend on the flow that you pick. For instance, regarding the example with the cars, you may want to minimize the total length of all routes, as it will keep the amount of pollution produced by cars as little as possible. This value that is minimized (e.g. price, time, pollution level) is assigned to each edge and is called the cost of going through a certain edge.
For example, look at the different flows pushed through the same network:
As you can see, the total flow is the same. Additionally, it is the maximum flow possible for this network. But let's have a look at the costs of going through the edges:
The total cost is computed by multiplying the flow by the cost and adding the numbers together. Now, if you count the total costs for both cases, you will get 215 and 158, respectively. Therefore, if you had to minimize the cost of the flow, you would definitely choose the second one.
Conclusion
To sum up, let's refresh the main points about the maximum flow problem:
-
Maximum flow problem is a problem of finding the biggest total possible flow in a network.
-
Residual capacity is the difference between the capacity of an edge and its flow. A residual graph is a graph that has residual capacities assigned to its edges.
-
The augmenting path goes from the source to the sink and consists only of the edges with the positive residual capacity.
-
A level graph is a graph whose nodes store the lengths of the shortest path from the source to these nodes.
-
The Ford-Fulkerson, Edmonds-Karp, and Dinic's algorithms are the most common ones for solving the maximum flow problem.
-
A variation of the maximum flow problem where you try to minimize the cost of the total flow is called the maximum flow of the minimum cost problem.