BA .

Max-Flow/Min-Cut Duality: Formulation & Theory

In the realm of algorithms and graph theory, two of the most famous optimization problems are Max-Flow and Min-Cut. The Max-Flow problem involves finding the optimal flow in a network, much like water through pipes or traffic on roads. The Min-Cut problem is about finding the minimum capacity of edges that, if removed from a network, would disconnect its source from its sink, effectively reducing the flow between them to zero. Both of these problems are formulatable as a linear program (LP) and, in fact, are intricately linked by a strong dual relationship. In this post, we introduce the Max-Flow problem and its rudimentary LP formulation. We then motivate an equivalent but more consistent formulation based on an augmented graph. Using a matrix/vector representation, we present the primal and dual LPs and walk through the interpretation of the dual LP as the Min-Cut problem.

This post is part 1 of a 2-part series. This part focuses on the linear programming formulations of the Max-Flow problem, matrix/vector representations of the problem based on the graph’s node/edge adjacency matrix, and the dual relationship between Max-Flow and Min-Cut. In the second part, we develop implementations of these two graph optimization problems in gurobipy and visualize the solutions by leveraging both primal and dual variable values.

Max-Flow Primal LP

The Max-Flow problem is defined on a directed graph G=(V,E)G = (V, E) where VV and EE, repsectively, denote the sets of vertices and edges. The term “directed” simply means that each edge is like a one-way street – flow is unidirectional. The goal of the Max-Flow problem is to maximize the flow from the source node ss to the sink node tt subject to limits on the flow allowed across each edge in the graph.

To formulate the problem, let Vi+={j:(j,i)E}, {V_i^+ = \{j : (j, i) \in E\}}, and Vi={j:(i,j)E}. {V_i^- = \{j : (i, j) \in E\}}. The set Vi+V_i^+ represents the vertices neighboring ii via entering edges and ViV_i^- the set neighboring ii via exiting edges. For (i,j)E(i, j) \in E, let xijR+x_{ij} \in \mathbb{R}_+ be a variable representing the flow on edge (i,j)(i, j), and let cijc_{ij} be its upper bound. Then the Max-Flow problem may be formulated as

zp=max  jVt+xjts.t.  jVi+xjijVixij=0,iV{s,t},xijcij,(i,j)E,xijR+,(i,j)E,xtsR+. \begin{aligned} z_p^* = \max~~ & \textcolor{#4566E0}{\sum_{j \in V_t^+} x_{jt}} \\ \text{s.t.}~~ & \textcolor{#FE3F1C}{\sum_{j \in V_i^+} x_{ji}} - \textcolor{#49AF64}{\sum_{j \in V_i^-} x_{ij}} = 0, && \forall i \in V \setminus \{s, t\}, \\ & x_{ij} \le c_{ij}, && \forall (i, j) \in E, \\ & x_{ij} \in \mathbb{R}_+, && \forall (i, j) \in E, \\ & x_{ts} \in \mathbb{R}_+. && \end{aligned}

In the image below, we link the formulation to a specific graph.Edges entering the sink node are colored blue to match the corresponding sum in the objective function of the formulation. At node cc, entering edges are colored red and leaving edges are colored green to illustrate the flow balance constraints at that node. maxflow-no-cycle In this formulation, the objective is to maximize the sum of flows on all the edges entering the target vertex. For the sake of simplicity, we suppose that cij>0c_{ij} > 0 for all (i,j)E(i, j) \in E and that there exists at least one path from ss to tt such that the solution is guaranteed to be non-trivial (i.e., zp>0z_p^* > 0). Flow balance is imposed at all vertices except at the sink tt where there is ideally a net surplus and at the sink ss where there is a corresponding deficit. The flow on each edge is required to be between zero the edge’s capacity.

Intuitively, the total flow entering tt ought to be equal to the flow exiting ss. Indeed, this condition is guaranteed by the flow balance constraints. In summing over those constraints, most flow variables appear exactly once with a positive sign and once with a negative sign. The exceptions are the variables for flows entering tt which appear only once with a positive sign and those exiting ss which appear only once with a negative sign. Ergo,

iV{s,t}(jVi+xjijVixij)=0    jVi+xjtjVixsj=0. \begin{aligned} & \sum_{i \in V \setminus \{s, t\}} \left( \sum_{j \in V_i^+} x_{ji} -\sum_{j \in V_i^-} x_{ij} \right) = 0 \implies & \sum_{j \in V_i^+} x_{jt} - \sum_{j \in V_i^-} x_{sj} = 0. \end{aligned}

Though this condition is superfluous, it is the foundation of a convenient reformulation. Defining a new variable

xts=jVi+xjt, x_{ts} = \sum_{j \in V_i^+} x_{jt},
we observe that
jVi+xjtxts=0,xtsjVixsj=0. \begin{gather*} \sum_{j \in V_i^+} x_{jt} - x_{ts} = 0, \\ x_{ts} - \sum_{j \in V_i^-} x_{sj} = 0. \end{gather*}

The reformulation is achieved by considering the graph G^=(V,E^){\hat{G} = (V, \hat{E})} with augmented edge set E^=E{(t,s)}.{\hat{E} = E \cup \{(t, s)\}}. Incorporating such an edge (with unlimited capacity) allows all of the flow entering tt to recirculate back to ss in a cycle thus making flow balance applicable at every node including ss and tt. Letting V^i+={j:(j,i)E^}, \hat{V}_i^+ = \{j : (j, i) \in \hat{E}\}, and V^i={j:(i,j)E^}, \hat{V}_i^- = \{j : (i, j) \in \hat{E}\}, the reformulated problem is

zp=max  xtss.t.  jV^i+xjijV^ixij=0,iV,xijcij,(i,j)E^,xijR+,(i,j)E. \begin{aligned} z_p^* = \max~~ & \textcolor{#4566E0}{x_{ts}} && \\ \text{s.t.}~~ & \textcolor{#FE3F1C}{\sum_{j \in \hat{V}_i^+} x_{ji}} - \textcolor{#49AF64}{\sum_{j \in \hat{V}_i^-} x_{ij}} = 0, && \forall i \in V, \\ & x_{ij} \le c_{ij}, && \forall (i, j) \in \hat{E}, \\ & x_{ij} \in \mathbb{R}_+, && \forall (i, j) \in E. \end{aligned}
Below, we illustrate this Max-Flow reformulation. Added edge (t,s)(t, s) is colored blue to match the corresponding variable objective function. Flow balance at cc is illustrated as it was before. maxflow-cycle

To form the dual of this problem, we rewrite it compactly using matrix and vector quantities and distinguishing between quantities pertaining to the original graph GG and added edge (t,s)(t, s):

zp=max  0x+1xtss.t.  Ax+atsxts=0,xc,xR+E,xtsR+. \begin{aligned} z_p^* = \max~~ & \boldsymbol{0}^\top \boldsymbol{x} + 1 x_{ts} \\ \text{s.t.}~~ & A \boldsymbol{x} + \boldsymbol{a}_{ts} x_{ts} = \boldsymbol{0}, \\ & \boldsymbol{x} \le \boldsymbol{c}, \\ & \boldsymbol{x} \in \mathbb{R}_+^{|E|}, x_{ts} \in \mathbb{R}_+. \\ \end{aligned}
Here, AA denotes the node/edge adjacency matrix for the original graph GG. The rows of AA are indexed in VV and the columns in EE. The column of AA corresponding to edge (i,j)(i, j) has a 1-1 in the row corresponding to node ii and a 11 in the row corresponding to row jj. The vector ats\boldsymbol{a}_{ts} denotes a similar incidence vector for edge (t,s)(t, s).

Max-Flow Dual LP

Following the simple rules for forming the dual LP, we find the dual of the Max-Flow problem is

zd=min  0z+cys.t.  Az+y0,atsz1,yR+E,zRV. \begin{aligned} z_d^* = \min~~ & \boldsymbol{0}^\top \boldsymbol{z} + \boldsymbol{c}^\top \boldsymbol{y} \\ \text{s.t.}~~ & A^\top \boldsymbol{z} + \boldsymbol{y} \ge \boldsymbol{0}, \\ & \boldsymbol{a}_{ts}^\top \boldsymbol{z} \ge 1, \\ & \boldsymbol{y} \in \mathbb{R}_+^{|E|}, \boldsymbol{z} \in \mathbb{R}^{|V|}. \end{aligned}

Recall the structure of AA. The column corresponding to edge (i,j)(i, j) has 11 in the jthj^\text{th} position, 1-1 in the ithi^\text{th} position, and 00 elsewhere. Likewise, ats\boldsymbol{a}_{ts} has 11 in the sths^\text{th} position, 1-1 in the ttht^\text{th} position, and 00 elsewhere. Incorporating this information, an equivalent formulation of the dual LP is

zd=min  (i,j)Ecijyijs.t.  yijzizj,(i,j)E,zszt1,yijR+,(i,j)E,ziR,iV. \begin{aligned} z_d^* = \min~~ & \sum_{(i, j) \in E} c_{ij} y_{ij} && \\ \text{s.t.}~~ & y_{ij} \ge z_i - z_j, && \forall (i, j) \in E, \\ & z_s - z_t \ge 1, && \\ & y_{ij} \in \mathbb{R}_+, && \forall (i, j) \in E, \\ & z_i \in \mathbb{R}, && \forall i \in V. \end{aligned}

Dual LP Interpretation as Min-Cut

From complementary slackness, we know that if xts>0x_{ts}^* > 0, then zszt=1{z_s^* - z_t^* = 1}. It is clear from the problem structure that for each (i,j)E(i, j) \in E the values of ziz_i and zjz_j are only important to the extent of their difference zizjz_i - z_j. As such, we are free to fix any one ziz_i to a constant value. Without loss of generality, let us fix zt=0z_t = 0. Then, again supposing the instance is non-trivial, complementary slackness requires zs=1z_s = 1 for the dual solution to be optimal.

Next, let (y,z)(\boldsymbol{y}^\prime, \boldsymbol{z}^\prime) be a feasible dual solution. If yij>zizj{y_{ij}^\prime > z_i^\prime - z_j^\prime} and yij>0y_{ij}^\prime > 0, then (y,z)(\boldsymbol{y}^\prime, \boldsymbol{z}^\prime) is suboptimal. To see this, let y\boldsymbol{y}^{\prime\prime} be such that yij=max{zizj,0}.{y_{ij}^{\prime\prime} = \max\{z_i^\prime - z_j^\prime, 0\}}. Then, because cij>0c_{ij} > 0 for all (i,j)E(i, j) \in E, the solution (y,z)(\boldsymbol{y}^{\prime\prime}, \boldsymbol{z}^\prime) is strictly better than (y,z)(\boldsymbol{y}^\prime, \boldsymbol{z}^\prime). That is, given our suppositions that ensure a non-trivial optimal solution in the primal LP, an optimal solution to the dual LP must satisfy yij=max{zizj,0}.{y_{ij}^* = \max\{z_i^* - z_j^*, 0\}}.

With this in mind, it is easy to show that an optimal dual solution must also satisfy 0z1\boldsymbol{0} \le \boldsymbol{z}^* \le \boldsymbol{1}. Again consider feasible solution (y,z)(\boldsymbol{y}^\prime, \boldsymbol{z}^\prime). To see this, let (y,z)(\boldsymbol{y}^{\prime\prime}, \boldsymbol{z}^{\prime\prime}) be such that zi=min{max{zi,0},1}{z_i^{\prime\prime} = \min\{\max\{z_i^\prime, 0\}, 1\}} for all iNi \in N and yij=max{zizj,0}{y_{ij}^{\prime\prime} = \max\{z_i^{\prime\prime} - z_j^{\prime\prime}, 0\}} for all (i,j)E(i, j) \in E. That is, let z\boldsymbol{z}^{\prime\prime} be z\boldsymbol{z}^\prime clipped to [0,1][0, 1] and let y\boldsymbol{y}^{\prime\prime} satisfy the criteria for optimality. For each edge (i,j)(i, j), there are three cases to consider:

  1. If zizjz_i^\prime \le z_j^\prime, then zizjz_i^{\prime\prime} \le z_j^{\prime\prime} so yij=0yijy_{ij}^{\prime\prime} = 0 \le y_{ij}^\prime.
  2. If zi>zjz_i^\prime > z_j^\prime and zi ⁣ ⁣[0,1]z_i^\prime \!\in\! [0, 1] and zj ⁣ ⁣[0,1]z_j^\prime \!\in\! [0, 1], then zizj ⁣= ⁣zizjz_i^{\prime\prime} - z_j^{\prime\prime} \!=\! z_i^\prime - z_j^\prime so yijyijy_{ij}^{\prime\prime} \le y_{ij}^\prime.
  3. If zi>zjz_i^\prime > z_j^\prime and either zi ⁣∉ ⁣[0,1]z_i^\prime \!\not\in\! [0, 1] or zj ⁣∉ ⁣[0,1]z_j^\prime \!\not\in\! [0, 1], then zi ⁣ ⁣zj ⁣< ⁣zi ⁣ ⁣zjz_i^{\prime\prime}\!-\!z_j^{\prime\prime} \!<\! z_i^\prime\!-\!z_j^\prime so yij<yijy_{ij}^{\prime\prime} < y_{ij}^\prime.

If zi∉[0,1]{z_i^\prime \not\in [0, 1]} for at least one iNi \in N, then case (3) is applicable for at least one edge, so y<y\boldsymbol{y}^{\prime\prime} < \boldsymbol{y}^\prime and cycy\boldsymbol{c}^\top \boldsymbol{y}^{\prime\prime} \le \boldsymbol{c}^\top \boldsymbol{y}^\prime. Ergo, an optimal solution to the dual LP must also satisfy 0zi10 \le z_i^* \le 1. As such, zizj1z_i^* - z_j^* \le 1 for every (i,j)E(i, j) \in E so 0yij=max{zizj,0}1{0 \le y_{ij}^* = \max\{z_i^* - z_j^*, 0\} \le 1} as well.

Finally, note that the constraint coefficients in this problem form a totally unimodular matrix. As such, all extreme point solutions are integral. Since, as we discussed, all optimal solutions are confined to [0,1]V+E[0, 1]^{|V| + |E|}, all the extreme point solutions are not just integral solutions but 0/1 solutions. Since every LP has an extreme point solution that is optimal, there exists at least one optimal 0/1 solution. Of course, it is possible for two or more 0/1 solutions to exist, and in that case there also exist infinitely many non-integral solutions in their convex hull.

The crux of all this analysis is that there exists an integer programming (IP) restriction of the Max-Flow dual LP that yields the same zdz_d^*. Its formulation is

zd=min  (i,j)Ecijyijs.t.  zs=1,zt=0,yij=max{zizj,0}{0,1},(i,j)E,zi{0,1},iV. \begin{aligned} z_d^* = \min~~ & \sum_{(i, j) \in E} c_{ij} y_{ij} && \\ \text{s.t.}~~ & z_s = 1, z_t = 0, && \\ & y_{ij} = \max\{z_i - z_j, 0\} \in \{0, 1\}, && \forall (i, j) \in E, \\ & z_i \in \{0, 1\}, && \forall i \in V. \end{aligned}
In this IP, z\boldsymbol{z} encodes a vertex bipartition: iAi \in A if zi=1z_i = 1 or iBi \in B if zi=0z_i = 0. Because zs=1z_s = 1 and zt=0z_t = 0, the only feasible partitions are those that separate ss from tt, and in fact all such partitions are feasible. Similarly, y\boldsymbol{y} encodes the minimal directed cut that separates AA from BB and thus separates ss from tt: (i,j)(i, j) in the cut if yij=1y_{ij} = 1 or not in the cut if yij=0y_{ij} = 0. To see that the cut indeed separates ss from tt, consider that for every path PP from ss to tt, the constraints require that
(i,j)Pyij=(i,j)Pmax{zizj,0}(i,j)P(zizj)=zszt=1. \begin{aligned} \sum_{(i, j) \in P} y_{ij} &= \sum_{(i, j) \in P} \max\{z_i - z_j, 0\} \\ &\ge \sum_{(i, j) \in P} (z_i - z_j) = z_s - z_t = 1. \end{aligned}
That is, the constraints require that at least one edge in every path from ss to tt participates in the cut. An example ss-tt cut is shown below. In the image, nodes belonging to sets AA and BB are, respectively, colored green and red. The minimal cut separating AA from BB is represented by the dotted edges. mincut-suboptimal

As a closing thought, recall that though the Min-Cut IP formulation in this section is a restriction of the Max-Flow’s dual LP, its objective value is the same. Therefore, by strong duality, the maximum ss-tt flow in a graph is equal to the value of the minimum ss-tt cut (i.e., zp=zdz_p^* = z_d^*). This is in contrast to the weak duality often exhibited by other discrete optimization problems defined on graphs on otherwise. For example, the number of edges in a maximum matching is only sure to be less than or equal to the number of vertices in a minimum edge cover. The strong duality of the continuous Max-Flow problem and discrete Min-Cut problem is remarkable.

You May Also Like