线性规划的对偶
阅读信息
1119 词 7 分钟 本页总访问量 加载中...
LP:
By knowing the lower bound of an LP, we can tell how good our solution \(f(\boldsymbol{x})\) is
Assume the solution of an LP \(f^*\) has a lower bound \(f_{\mathrm{LB}}\), then for all feasible \(\boldsymbol{x}\), \(f^*\) is also the lower bound of \(f(\boldsymbol{x})\)
How to find it?
Noting that the constraints are in the form of lower bound, we can find \(\boldsymbol{\lambda}, \boldsymbol{\mu}\) and \(\boldsymbol{\mu} \geq \boldsymbol{0}\),
Dual LP
If \(\boldsymbol{A}^T \boldsymbol{\lambda} + \boldsymbol{G}^T\boldsymbol{\mu} = \boldsymbol{c}\), then we can get the lower bound of \(f^*\): \(\boldsymbol{\lambda}^T\boldsymbol{b} + \boldsymbol{\mu}^T \boldsymbol{h}\)
To maximize the lower bound, we only need to maximize
This is called the Dual LP of the former LP
Relations

The lower Dual:
and
Weak and Strong Duality
Consider the LP form
Weak duality
We have
Proof
Noting
It holds for all LP problems. When will the "=" hold?
Strong duality
We say strong duality holds if \(f^* = \psi^*\)
If \(f^*\) or \(\psi^*\) is finite, then strong duality holds, and there exist optimal solutions \(\boldsymbol{x}^*, \boldsymbol{y}^*\) such that \(\boldsymbol{c}^T\boldsymbol{x}^* = f^* = \psi^* = \boldsymbol{b}^T\boldsymbol{y}^*\)
A weaker form: If LP has a optimal solution, then the dual also has one and strong duality holds.
Proof
By the KKT conditions, there exists a Lagrange multiplier \(\boldsymbol{\mu}^*\) s.t.
stationarity \(\boldsymbol{c} - \boldsymbol{A}^T\boldsymbol{\mu} = \boldsymbol{0}\)
nonnegativity \(\boldsymbol{\mu}^* \geq \boldsymbol{0}\)
complementary slackness \(\left(\boldsymbol{\mu}^*\right)^T(\boldsymbol{Ax} - \boldsymbol{b}) = 0\)
So \(\boldsymbol{c}^T \boldsymbol{x}^* = (\boldsymbol{A\mu}^*)^T\boldsymbol{x}^* = \boldsymbol{b}^T\boldsymbol{\mu}\)
Example: Wasserstein distance
The (first) Wasserstein distance \(W_1(a, b)\) between the two distributions \(a, b\) is the optimal value of the optimal transport problem:
i.e.
To be completed...
Dual LP via Lagrangian
The lagrangian of general LP is
If \(\boldsymbol{x} \in \Omega\) and \(\boldsymbol{\mu} \geq \boldsymbol{0}\), then
So
Maximize the lower bound equals to
for all \(\boldsymbol{x} \in \Omega\). Noting that
is bounded below iff \(\boldsymbol{c} - \boldsymbol{A}^T\boldsymbol{\lambda} - \boldsymbol{G}^T\boldsymbol{\mu} = \boldsymbol{0}\)! So the dual problem is equal to (note that \(\phi = \inf \mathcal{L}\))
Which is the dual LP.
Lagrange Dual Function (in general cases)
The function \(\phi(\boldsymbol{\lambda}, \boldsymbol{\mu})\) is called the dual function.
The dual function is always concave, no matter the primal problem is convex or not.
Proof
Obviously \(\phi\) is the minus supremum of a affine function in \(\boldsymbol{\lambda}, \boldsymbol{\mu}\).
In weak duality cases, we call \(f^* - \phi^*\) the (optimal) duality gap
Duality gap
The difference \(f(\boldsymbol{x}) - \phi(\boldsymbol{\lambda}, \boldsymbol{\mu})\) associated with \(\boldsymbol{x}\) and \((\boldsymbol{\lambda}, \boldsymbol{\mu})\) is called the duality gap.
If the gap is less than \(\varepsilon\), we can tell that the duat solution \((\boldsymbol{\lambda}, \boldsymbol{\mu})\) serves as a proof or certificate that \(\boldsymbol{x}\) is \(\varepsilon\)-suboptimal
Slater’s Theorem
Consider the convex problem
Slater's condition The above problem is strictly feasible, i.e.
Refined Slater’s condition
If some \(g_j\) are affine, \(g_j(\boldsymbol{x}_0) < 0\) can be relaxed to \(g_j(\boldsymbol{x}_0) \leq 0\)
Slater’s Theorem
Strong duality holds for (CP) under (refined) Slater’s condition.
Proof
Here \(D\) is the domain of all functions in (CP) and \(\Omega\) is the feasible set.

If \(f^* = -\infty\), then \(f^* = \phi^* = -\infty\) by weak duality.
Due to (CP) is strongly feasible, if \(f^* > -\infty\), then \(f^* < +\infty\). Assume \(\boldsymbol{A} \in \mathbb{R}^{k \times n}\) with \(\operatorname{rank}A = k\).
Let
Similarly we know that \(C\) is convex.
It's easy to know that \((\boldsymbol{0}, \boldsymbol{0}, f(\boldsymbol{x})) \in C\) for \(\boldsymbol{x} \in \Omega\), and since \(f^* = \inf_{\boldsymbol{x} \in \Omega}f(\boldsymbol{x})\), \((\boldsymbol{0}, \boldsymbol{0}, f^*) \in \partial C\).
So there exists a supporting hyperplane at \((\boldsymbol{0}, \boldsymbol{0}, f^*) \in \partial C\), satisfies:
Then, letting \(\boldsymbol{u}, t \to +\infty\) yields \(\boldsymbol{\mu} \geq \boldsymbol{0}, \alpha \geq 0\).
Noting \((\boldsymbol{g}(\boldsymbol{x}), \boldsymbol{h}(\boldsymbol{x}), f(\boldsymbol{x})) \in C\),
Then we have to show \(\alpha \neq 0\) by Slater's condition.
Now assume \(\alpha = 0\)
By the inequality above, if \(\boldsymbol{x}\) is feasible, (due to \(\alpha = 0\) and \(\boldsymbol{\mu} \geq \boldsymbol{0}\))
So \(\boldsymbol{\mu}^T\boldsymbol{g}(\boldsymbol{x}) = \boldsymbol{0}\)
Let \(\boldsymbol{x}_0\) satisfies Slater's condition, i.e. \(\boldsymbol{x}_0 \in \operatorname{int}D, \boldsymbol{h}(\boldsymbol{x}_0) = \boldsymbol{0}, \boldsymbol{g}(\boldsymbol{x}_0) < \boldsymbol{0}\). Easily we know that \(\boldsymbol{\mu} = \boldsymbol{0}\).
Also we know that \(\boldsymbol{\lambda}^T\boldsymbol{h}(\boldsymbol{x}) \geq 0\) for all \(\boldsymbol{x} \in D\).
\(\boldsymbol{x}_0 \in \operatorname{int}D\) infers there exists \(\delta > 0\), for all \(\boldsymbol{z} \in B(0, \delta), \boldsymbol{x}_0 + \boldsymbol{z} \in \operatorname{int}D\), so
So \(\boldsymbol{A}^T\boldsymbol{\lambda} = \boldsymbol{0}\). Due to \(\boldsymbol{A}\) has full row rank, \(\boldsymbol{\lambda} = \boldsymbol{0}\).
By now we have got \((\boldsymbol{\mu}, \boldsymbol{\lambda}, \alpha) = \boldsymbol{0}\), contradiction!
So \(\alpha \neq 0\)
Then let \(\boldsymbol{\mu}^* = \boldsymbol{\mu} / \alpha, \boldsymbol{\lambda}^* = \boldsymbol{\lambda} / \alpha\), we have
Minimizing over \(\boldsymbol{x}\) we get
And by definition \(f^* \geq \phi^*\), so \(f^* = \phi^*\), strong duality holds!
If \(\operatorname{rand}\boldsymbol{A} \leq k\), we just abandon some duplicated constraints in \(\boldsymbol{h}\) and do the same above.
Dual formulation of (soft-margin SVM)
Due to is constrains are all affine and are all inequality, according to Slater's condition, strong duality holds.
The lagrangian is
minimize it,
So the dual is
Revisit KKT condition
KKT condition holds iff
1. Strong duality holds, \(f^* = \phi^*\)
2. \(\boldsymbol{x}^*\) is the primal optimal solution, \(f^* = f(\boldsymbol{x}^*)\)
3. \((\boldsymbol{\lambda}^*, \boldsymbol{\mu}^*)\) is a dual optimal solution, \(\phi^* = \phi(\boldsymbol{\lambda}^*, \boldsymbol{\mu}^*)\)
Common forms of dual LP