This post is the continued part of [link].

# Constrained Optimization

## Interpretation of Lagrangian Multipliers

Consider the family of problems:

$$\min f(x) | h(x)=u.$$

For each $u$, $x^*(u)$ and $\lambda ^*$ are the corresponding regular local minimum and Lagrangian multiplier respectively. Denote $p(u)=f(x^*(u))$, then:

$$\nabla p(u) = - \lambda ^* (u)$$

where the $\lambda ^*$ can be interpreted as shadow price.

Here gives a brief proof:

Assume the original constrained optimization problem is: $$\min f(x), s.t.\quad h(x)=u,u=Const$$ then according to the KKT condition: $$\nabla f(x(u))+\lambda \nabla h(x(u))=0$$ $$ \frac{dx(u)^T}{du}(\nabla f(x(u))+\lambda \nabla h(x(u)))=0$$ where for the $h(x(u))$: $$h(x(u))=u\rightarrow \frac{dh(x(u))}{du}=\frac{dx(u)^T}{du}\cdot \nabla h(x(u)) =1$$ then finally: $$\frac{f(x^*(u))}{du}=\frac{dx(u)^T}{du}\cdot \nabla f(x(u))=-\lambda$$

## Solving Constrained Optimization

### One-dimension Condition

For the one-dimension constrained optimization problem,

$$\min_{x\in [a,b]}z(x)$$

where $z(x)$ is convex and continuous. Here are two methods: bisection method and golden section method:

**Bisection Method**

- Let $[a^n, b^n]$ be the current interval
- $\tilde{x}^n=\frac{a^n+b^n}{2}$
- If $\frac{dz(\tilde{x}^n)}{dx}<0$, discard $[a^n,\tilde{x}^n]$ and $\tilde{x}^n,b^n \rightarrow a^{n+1},b^{n+1}$; otherwise discard $[\tilde{x}^n,b^n]$ and $a^n,\tilde{x}^n \rightarrow a^{n+1},b^{n+1}$;
- If $|b^{N}-a^N|<\epsilon$, return. Otherwise go to step 2.

Golden section method is similar to the bisection method while the main difference is that you do not have to calculate the derivative of the lost function and for each iteration you need only calculate one extra value.

**Golden Section Method**

- Let $[a^n,b^n]$ be the current interval
- Set $x_L^n=0.618a^n+(1-0.618)b^n$ and $x_R^n=(1-0.618)a^n+0.618b^n$
- If $z(x_L^n)<z(x_R^n)$, discard $[x_R^n,b^n]$ and $a^n,x_R^n\rightarrow a^{n+1}, b^{n+1}$; otherwise discard $[a^n,x_L^n]$ and $x_L^n,b^n \rightarrow a^{n+1}, b^{n+1}$;
- If stopping criterion met, return; otherwise go step 2.

### Normal Situation

By applying the thought of the iterative descent method to the constrained optimization problem, the feasible descent direction is:

$$\nabla f(x^k)^T(y-x^k)<0$$

where $y\in X$ and $X$ is a convex region.

The step size can be chosen by searching for a minimum between $y$ and $x^k$.

**Conditional Gradient Method (Frank-Wolfe Method)**

For a linear constrained problem:

$$\min_{x\in R^n}z(x)\quad s.t.\sum_i h_{ij} x_i \ge b_j\forall j\in J$$

- Start with a feasible solution $x^0$, $n=1$;
- Direction finding. Find $y^n$ that solves the linear program:
- $$\min z_L^n(y)=\sum_i (\frac{\partial z(x^n)}{\partial x_i})y_i, s.t. \sum_i h_{ij}y_i\ge b_j \forall j\in J$$
- Step size determination. Find $\alpha_n$ that solves:
- $$\min_{0\le \alpha \le 1}z[x^n+\alpha (y^n-x^n)]$$
- Move. Set $x^{n+1}=x^n+\alpha_n(y_n-x_n)$.
- Convergence test. If $z(x^n)-z(x^{n+1})\le \kappa$, stop. Otherwise, let $n=n+1$ and go to step 1.

Another method called Gradient Projection Method is first find the gradient descent direction and then project the direction to the feasible plate.

# Linear Programming

## Standard Form of LP

The standard form of the linear programming is:

$$\min c^T x, s.t.Ax=b,x\ge 0$$

so the standard linear programming problem is convex program because both the function and the feasible region are convex. Then the KKT conditions are both necessary and sufficient.

There are only three possible results for a linear programming problem:

- Has an optimal solution (may be multiple global optimum solutions)
- Has no optimal solution because the problem is not bounded
- No feasible region.

## Primal and Dual Problems

The primal problem:

$$\min c^T x ,s.t.Ax=0,x\ge 0$$

and the dual problem:

$$\max p^Tb,s.t.A^Tp\le c$$

They are called primal problem and the dual problem because they share the same KKT conditions:

$$c^Tx=p^Tb,Ax=b,A^Tp\le c,x\ge 0$$

If both $x$ and $p$ are feasible, then we have: $c^Tx\ge p^Tb$. At optimality, $c^Tx^*=p^{*T}b$. There are two special conditions: if the primal is unbounded, then the dual is infeasible; if the primal is not feasible, then the dual will be unbounded.

## Simplex Method

### General procedure of simplex method

- Choose a starting feasible basis, B.
- Compute $B^{-1}$, then the basic feasible solution is $x_B=B^{-1}$ and $x_N=0$ and the objective value is $c^Tx=c^T_BB^{-1}b$.
- Calculate the reducted cost for each nonbasic variable, $\bar{c}_j=c_j-c^T_BB^{-1}A_j$. If all reducted costs are greater or equal than zero, stop and the current basic feasible solution is optimal. Otherwise, select a nonbasic variable, $x_j$, with a negative reduced cost and to to step 3.
Calculate $d_B=B^{-1}A_j$. If $d_B\ge 0$, stop and the problem is unbounded. Otherwise, let $x_{B(l)}$ be the variable which yields the maximum feasible step size,

$$-\frac{x_{B(l)}}{d_{B(l)}}=\min (-\frac{x_{B(i)}}{d_{B(i)}})$$

Replace the column$A_{B(l)}$ with column $A_j$ to obtain a new basis, $B$. Go to step 1.

Here is a brief interpretation of simplex method:

First, we select $B$ as the basis matrix and the corresponding variables are called basic variables, which means that these variables do not have to be zero (they may be zero, they just do not have to be). Then the left-over variables are called non-basic variables which are all equal to zero. When we selct the matrix $B$ and if $B$ is invertible, we can get a series value of the basic varialbes. Besides, all the value of the basic variables have to be greater than zero otherwise they will not satisfied the constraints. So usually finding the starting feasible basis is not that easy and later will discuss about how to find the feasible basis.

Assume that we have found the basis matrix $B$ and also the feasible basic variables. Then we can move from this point to another joint vertex by choosing a entering variable and a leaving variable. Then along the direction of each entering variable, the reduced cost can be calculated. If all the feasible directions towards a candidate entering variables are not the reduced direction, then this point is the optimal point. Otherwise, we can always find a reduced direction, which means that we can determine an entering variable. After entering variable we can move towards this entering variable and see which basic variables become zero first. If no variable is reduced along the direction, then this problem will be unbounded, otherwise, we can find a leaving variable. Then we have finished an iteration to the next point.

### Finding the start feasible basis

Sometimes it is hard to find the start feasible basis. One possible way to find a feasible basis is to add extra variables to be the basic variables and then set their weights as large numbers.

### Sensitivity Analysis

- If the cost vector $c$ changes, then the current basis will still be the optimal if the reduced costs are greater zero for all non-basic variables, otherwise continue the iteration of simplex method. Things are similar if the matrix $A$ changes.

If you have a better understanding towards the simplex method, it is not difficult to analyze the possible result if we change some of the parameters.

# Code of Numerical Methods

I have coded most of the numerical methods included in this course and the code has been pushed to GitHub, see link.