Constrained Optimization - Exam Notes¶
This chapter presents constrained optimization in a clean, exam-ready structure: theory first, formulas second, then solved problems, intuition, and high-priority exam points.
SECTION 1: THEORY CONCEPTS - DEFINITIONS & RELATIONSHIPS¶
1.1 Constrained Optimization¶
A constrained optimization problem minimizes or maximizes an objective while satisfying constraints:
\[ \min_{x} \; f(x) \quad \text{subject to} \quad g_i(x) \le 0,\; i=1,\dots,m, \qquad h_j(x) = 0,\; j=1,\dots,p \]
1.2 Linear Programming vs Quadratic Programming¶
- Linear programming (LP): objective and constraints are linear.
- Quadratic programming (QP): quadratic objective with linear constraints (quadratic constraints give a QCQP).
1.3 Feasible Region and Active Constraints¶
- Feasible region: all points satisfying constraints.
- Active (tight) constraint: holds with equality at optimum.
- In many problems, optimum occurs at boundary points.
1.3A Geometric Visualization¶
In this geometry:
- the blue curve is the objective landscape,
- the red vertical wall is the constraint boundary,
- the unconstrained minimum may be infeasible,
- the constrained optimum is the best feasible point.
1.4 Lagrangian Interpretation¶
The Lagrangian combines objective and constraints into one function:
\[ \mathcal{L}(x,\lambda,\nu) = f(x) + \sum_{i=1}^{m} \lambda_i\, g_i(x) + \sum_{j=1}^{p} \nu_j\, h_j(x) \]
Meaning:
- feasible choices avoid the violation penalty,
- infeasible choices receive a larger penalty through the multipliers.
1.5 Primal and Dual¶
- Primal: original optimization in variables \(x\).
- Dual: transformed optimization in multipliers \((\lambda,\nu)\).
Dual is often easier to analyze and gives bounds on primal optimum.
1.6 Weak Duality, Strong Duality, Duality Gap¶
- Weak duality: dual optimum never exceeds primal optimum (for minimization primal).
- Strong duality: primal and dual optima are equal.
- Duality gap: \(p^{\star} - d^{\star} \ge 0\) for a minimization primal; it is zero exactly when strong duality holds.
For convex problems under suitable regularity (for example Slater condition), strong duality typically holds.
1.7 KKT Conditions (Optimality System)¶
KKT is the main condition set for constrained optimization:
1. primal feasibility
2. dual feasibility
3. complementary slackness
4. stationarity
For convex problems, KKT is the standard test for optimality.
1.8 Constraint Qualifications¶
- LICQ: active constraint gradients are linearly independent.
- Slater condition (convex inequalities): at least one strictly feasible point exists.
These conditions support strong duality and reliable KKT use.
1.9 Sensitivity (Shadow Price Meaning)¶
Multiplier values act as shadow prices: they indicate how the optimal objective changes when a constraint is tightened or relaxed. For a minimization primal with constraints \(g_i(x) \le b_i\), \(\partial p^{\star}/\partial b_i = -\lambda_i^{\star}\).
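To make the shadow-price reading concrete, here is a minimal check on an assumed 1-D example (not from the notes): minimize \((x-3)^2\) subject to \(x \le b\), where stationarity gives the active-constraint multiplier \(\lambda^{\star} = 2(3-b)\).

```python
# Shadow-price check on an assumed toy problem: minimize (x-3)^2 s.t. x <= b.
# For b < 3 the constraint is active, x* = b, and stationarity
# 2(x-3) + lambda = 0 gives lambda* = 2(3-b).
def p_star(b):
    x = min(b, 3.0)            # constrained optimum
    return (x - 3.0) ** 2      # optimal objective value

b = 1.0
lam_star = 2.0 * (3.0 - b)     # multiplier at the optimum (= 4 here)
eps = 1e-6
sensitivity = (p_star(b + eps) - p_star(b)) / eps  # finite-difference d p*/d b
print(sensitivity, -lam_star)  # both approximately -4.0
```

The finite-difference slope of \(p^{\star}(b)\) matches \(-\lambda^{\star}\), which is the sensitivity statement above.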
SECTION 2: FORMULAE¶
2.1 Standard Constraint Conversion¶
Convert inequalities to \(\le 0\) form before writing the Lagrangian:
\[ g(x) \le b \;\Rightarrow\; g(x) - b \le 0, \qquad g(x) \ge b \;\Rightarrow\; b - g(x) \le 0 \]
2.2 Dual Function and Dual Problem¶
The dual function is
\[ g(\lambda,\nu) = \inf_{x}\, \mathcal{L}(x,\lambda,\nu) \]
If the primal is a minimization, the dual is the maximization
\[ \max_{\lambda \ge 0,\, \nu}\; g(\lambda,\nu) \]
2.3 Minimax Inequality (Weak Duality Form)¶
Equivalent statement:
\[ d^{\star} = \sup_{\lambda \ge 0,\, \nu}\, \inf_{x}\, \mathcal{L}(x,\lambda,\nu) \;\le\; \inf_{x}\, \sup_{\lambda \ge 0,\, \nu}\, \mathcal{L}(x,\lambda,\nu) = p^{\star} \]
2.4 KKT Conditions (Differentiable Form)¶
Given candidate \((x^{\star},\lambda^{\star},\nu^{\star})\):
(A) Primal feasibility: \(g_i(x^{\star}) \le 0\) for all \(i\), \(h_j(x^{\star}) = 0\) for all \(j\).
(B) Dual feasibility: \(\lambda_i^{\star} \ge 0\) for all \(i\).
(C) Complementary slackness: \(\lambda_i^{\star}\, g_i(x^{\star}) = 0\) for all \(i\).
(D) Stationarity: \(\nabla f(x^{\star}) + \sum_i \lambda_i^{\star} \nabla g_i(x^{\star}) + \sum_j \nu_j^{\star} \nabla h_j(x^{\star}) = 0\).
Quick reading rule:
- \(\lambda_i^{\star} > 0\) implies constraint \(i\) is active.
- If a constraint is strictly inactive (\(g_i(x^{\star}) < 0\)), its multiplier is zero.
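As a quick numerical illustration, the four conditions can be checked at a candidate point. The toy problem below (minimize \((x_1-1)^2+(x_2-2)^2\) subject to \(x_1+x_2 \le 2\)) has hand-derived KKT solution \(x^{\star}=(0.5,1.5)\), \(\lambda^{\star}=1\); the code only verifies it:

```python
import numpy as np

# KKT check at the hand-derived candidate for:
# minimize (x1-1)^2 + (x2-2)^2  s.t.  x1 + x2 - 2 <= 0.
x = np.array([0.5, 1.5])                      # candidate x*
lam = 1.0                                     # candidate lambda*

g = x[0] + x[1] - 2.0                         # constraint value g(x*)
grad_f = np.array([2*(x[0] - 1.0), 2*(x[1] - 2.0)])
grad_g = np.array([1.0, 1.0])

primal_feasible = g <= 1e-9                   # (A)
dual_feasible = lam >= 0.0                    # (B)
comp_slack = abs(lam * g) <= 1e-9             # (C)
stationary = np.allclose(grad_f + lam * grad_g, 0.0)  # (D)
print(primal_feasible, dual_feasible, comp_slack, stationary)  # True True True True
```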
2.5 Three Exam Workflows¶
Workflow A: Solve a primal with the Lagrangian
1. Convert constraints to standard form.
2. Build \(\mathcal{L}\).
3. Write the KKT equations.
4. Solve the system and verify feasibility.

Workflow B: Convert a primal to its dual
1. Write the Lagrangian \(\mathcal{L}(x,\lambda,\nu)\).
2. Minimize \(\mathcal{L}\) w.r.t. \(x\): treat \((\lambda,\nu)\) as constants and find the \(x\)-values that minimize \(\mathcal{L}\) by setting derivatives to zero. Call this \(x(\lambda,\nu)\).
3. Substitute back: plug \(x(\lambda,\nu)\) into the Lagrangian.
4. Simplify: the result is \(g(\lambda,\nu)\). The dual problem is "maximize \(g(\lambda,\nu)\)" subject to multiplier feasibility constraints (for inequalities, \(\lambda \ge 0\)).

Workflow C: Solve a dual problem
1. Differentiate \(g\) and solve \(\nabla g = 0\).
2. Enforce multiplier feasibility.
3. Evaluate the objective at the optimum.
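Workflows B and C can be carried out symbolically with SymPy. The primal below (minimize \(x^2\) subject to \(x \ge 1\)) is an assumed toy example, not one of the solved problems:

```python
import sympy as sp

# Workflow B: primal -> dual for: minimize x^2 s.t. x >= 1 (standard form: 1 - x <= 0)
x, lam = sp.symbols('x lam', real=True)
L = x**2 + lam * (1 - x)                      # Lagrangian

x_of_lam = sp.solve(sp.diff(L, x), x)[0]      # stationarity: x(lambda) = lam/2
g = sp.expand(L.subs(x, x_of_lam))            # dual function: lam - lam**2/4

# Workflow C: maximize g over lam >= 0
lam_star = sp.solve(sp.diff(g, lam), lam)[0]  # lambda* = 2 (dual feasible)
d_star = g.subs(lam, lam_star)                # dual optimum = 1
p_star = sp.Integer(1)                        # primal optimum at x* = 1
print(g, lam_star, d_star, p_star)            # zero duality gap: d* = p* = 1
```

The same substitute-and-maximize pattern is what the exam template in Section 4.4 compresses into five steps.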
SECTION 3: SOLVED PROBLEMS¶
Problem 1: Equality-Constrained Quadratic (Primal Solve)¶
Minimize:
Lagrangian:
Stationarity:
Constraint gives:
So:
Problem 2: Inequality-Constrained Circle Problem¶
Minimize:
Lagrangian:
Stationarity:
Complementary slackness implies active constraint (\(\lambda\neq0\)):
Substitute:
Optimal value:
Problem 3: Three-Variable Equality Geometry¶
Minimize:
Lagrangian:
Stationarity:
Use constraint:
Hence:
Problem 4: Primal to Dual Conversion (Single Variable)¶
Primal:
Lagrangian:
Minimize over \(x\):
Substitute into \(\mathcal{L}\):
Dual:
Problem 5: Primal to Dual Conversion (Two Constraints)¶
Primal:
Standard form:
Lagrangian:
Stationarity gives:
Dual function:
Dual problem:
Problem 6: Solving a Dual Fully¶
Primal:
Use \(g(x,y)=2x+y+5\le0\), Lagrangian:
Minimize over \(x,y\):
Dual function:
Maximize:
Dual optimum:
Problem 7: Minimax Verification (No Duality Gap)¶
For:
Primal optimum:
The dual, derived in the same form as Problem 4, gives optimum:
Hence:
Weak duality holds and strong duality also holds here.
SECTION 4: TRICKS & MENTAL MODELS¶
4.1 Fence-Penalty Model¶
Think of constraints as fenced boundaries:
- inside the boundary: no violation cost,
- outside the boundary: the multiplier penalty increases cost.
4.2 Active Constraint Fast Test¶
Use complementary slackness immediately: \(\lambda_i^{\star}\, g_i(x^{\star}) = 0\).
If \(\lambda_i^{\star}>0\), constraint must be tight.
4.3 Game View (Min-Max Thinking)¶
Primal selects variables; dual selects penalty strengths. Dual maximization pushes the best valid lower bound on primal optimum.
4.4 Exam Speed Pattern¶
Most conversion questions follow one template:
1. normalize constraints,
2. write the Lagrangian,
3. stationarity,
4. substitute,
5. final dual objective.
4.5 Geometry Memory Hook¶
At optimum, objective improvement direction is blocked by active constraints; this is exactly what stationarity encodes.
4.6 ML Relevance (SVM Link)¶
SVM optimization is a constrained optimization problem. Dual form is often preferred for cleaner optimization and kernelized formulations.
SECTION 5: IMPORTANT POINTS FOR EXAM¶
- Always convert inequalities into standard \(\le0\) form before forming \(\mathcal{L}\).
- One inequality constraint corresponds to one nonnegative multiplier.
- For minimization primal, dual is maximization.
- Write weak duality correctly: \(d^{\star}\le p^{\star}\).
- KKT must be written as four parts: primal feasibility, dual feasibility, complementary slackness, stationarity.
- In convex problems with Slater condition, KKT is typically sufficient and duality gap is zero.
- If exam asks only "convert primal to dual," stop after writing dual objective with multiplier constraints.
- If exam asks to solve dual, optimize \(g\) and check multiplier feasibility.
- Use complementary slackness early to reduce algebra.
- Add one interpretation line (active constraint, penalty meaning, or duality gap meaning) after derivation for higher-quality answers.
SECTION 6: PRACTICAL IMPLEMENTATION, PSEUDOCODE, AND SMART TRICKS¶
6.1 Which Solver to Use (Practical Rule)¶
- Convex problem with clear algebraic form: use CVXPY modeling.
- Smooth nonlinear constraints with gradients/Jacobians: use scipy.optimize.minimize with trust-constr or SLSQP.
- Derivative-free / noisy constraints: use COBYLA/COBYQA-style methods.
- Large streaming ML setup: prefer first-order primal-dual or projected methods.
6.2 Pseudocode: Primal-Dual (Inequality Constraints)¶
```
Input: f(x), g_i(x) <= 0, step sizes eta_x, eta_lambda
Initialize x_0, lambda_0 >= 0
for t = 0..T-1:
    # Primal step (descent on Lagrangian wrt x)
    x_{t+1} = x_t - eta_x * grad_x L(x_t, lambda_t)
    # Dual step (ascent on multipliers)
    lambda_{t+1} = lambda_t + eta_lambda * g(x_{t+1})
    # Projection to dual-feasible region
    lambda_{t+1} = max(lambda_{t+1}, 0)   # elementwise
end
Return x_T, lambda_T
```
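A runnable Python version of this loop, applied to an assumed toy problem (minimize \((x_1-1)^2+(x_2-2)^2\) subject to \(x_1+x_2 \le 2\), whose KKT solution is \(x^{\star}=(0.5,1.5)\), \(\lambda^{\star}=1\)):

```python
import numpy as np

# Primal-dual gradient loop for: minimize (x1-1)^2 + (x2-2)^2 s.t. x1 + x2 - 2 <= 0.
def grad_f(x):
    return np.array([2*(x[0] - 1.0), 2*(x[1] - 2.0)])

def g(x):
    return x[0] + x[1] - 2.0       # constraint value (<= 0 when feasible)

grad_g = np.array([1.0, 1.0])      # gradient of the linear constraint
x, lam = np.array([0.0, 0.0]), 0.0
eta_x, eta_lam = 0.05, 0.05
for _ in range(5000):
    x = x - eta_x * (grad_f(x) + lam * grad_g)   # primal descent on L
    lam = max(lam + eta_lam * g(x), 0.0)         # dual ascent + projection
print(x, lam)  # approaches (0.5, 1.5) and 1.0
```

With a strongly convex objective and small steps the iterates spiral into the saddle point of the Lagrangian; larger steps can oscillate.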
6.3 Pseudocode: Augmented Lagrangian (Equality Constraints)¶
```
Input: f(x), c(x) = 0, penalty rho > 0
Initialize x_0, lambda_0
for k = 0..K-1:
    # Minimize augmented Lagrangian wrt x
    x_{k+1} = argmin_x [ f(x) + lambda_k^T c(x) + (rho/2) * ||c(x)||^2 ]
    # Multiplier update
    lambda_{k+1} = lambda_k + rho * c(x_{k+1})
end
Return x_K, lambda_K
```
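A runnable sketch of the same loop, using scipy.optimize.minimize for the inner argmin on an assumed toy problem (minimize \(x^2+y^2\) subject to \(x+y=1\); optimum \((0.5,0.5)\) with multiplier \(-1\)):

```python
import numpy as np
from scipy.optimize import minimize

# Augmented Lagrangian for: minimize x^2 + y^2 s.t. x + y - 1 = 0.
def f(w):
    return w[0]**2 + w[1]**2

def c(w):
    return w[0] + w[1] - 1.0       # equality-constraint residual

rho, lam = 10.0, 0.0
z = np.zeros(2)
for _ in range(20):
    # inner minimization of the augmented Lagrangian wrt x
    aug = lambda w: f(w) + lam * c(w) + 0.5 * rho * c(w)**2
    z = minimize(aug, z).x
    # multiplier update
    lam = lam + rho * c(z)
print(z, lam)  # approaches (0.5, 0.5) and -1.0
```

The multiplier converges geometrically with factor roughly \(1/(1+\rho)\) on this quadratic, so a handful of outer iterations suffices.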
6.4 Python Blueprint (SciPy)¶
```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

def f(x):
    return (x[0] - 1.0)**2 + (x[1] - 2.0)**2

def grad_f(x):
    return np.array([2*(x[0] - 1.0), 2*(x[1] - 2.0)])

lin_con = LinearConstraint([[1.0, 1.0]], [-np.inf], [2.0])  # x0 + x1 <= 2
bounds = Bounds([0.0, 0.0], [np.inf, np.inf])                # x >= 0

res = minimize(
    f, x0=np.array([0.5, 0.5]),
    jac=grad_f,
    method="trust-constr",
    constraints=[lin_con],
    bounds=bounds,
)
print(res.x)  # optimum near [0.5, 1.5]
```
6.5 Smart Technical Tricks¶
- Normalize constraints early into \(\le 0\) and \(=0\) forms before any derivation.
- Check convexity signs before assuming strong duality.
- Use active-constraint detection via complementary slackness to reduce equation count.
- In numerical solvers, provide gradients/Jacobians when possible; convergence improves significantly.
- Always verify primal feasibility after optimization; objective value alone is not enough.
- Track KKT residuals as a stopping diagnostic in iterative methods.
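The KKT-residual diagnostic from the last point can be sketched as follows (function and argument names are illustrative, not a library API):

```python
import numpy as np

# KKT residuals for: minimize f(x) s.t. g_i(x) <= 0, given gradients from the caller.
def kkt_residuals(grad_f, g_vals, grad_g, lam):
    """Return (stationarity, primal, dual, slackness) residual norms."""
    stationarity = np.linalg.norm(grad_f + grad_g.T @ lam)  # grad f + sum_i lam_i grad g_i
    primal = np.linalg.norm(np.maximum(g_vals, 0.0))        # inequality violation
    dual = np.linalg.norm(np.minimum(lam, 0.0))             # negative multipliers
    slackness = np.linalg.norm(lam * g_vals)                # lam_i * g_i(x)
    return stationarity, primal, dual, slackness

# At a KKT point all four residuals vanish; the values below are for
# minimize (x1-1)^2 + (x2-2)^2 s.t. x1 + x2 <= 2 at x* = (0.5, 1.5), lambda* = 1.
r = kkt_residuals(np.array([-1.0, -1.0]),   # grad f(x*)
                  np.array([0.0]),          # g(x*)
                  np.array([[1.0, 1.0]]),   # grad g(x*)
                  np.array([1.0]))
print(r)  # all zeros at a KKT point
```

In an iterative solver, stop when the maximum of these four norms falls below a tolerance.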
6.6 Common Failure Modes¶
- Infeasible initial assumptions (no feasible point).
- Ignoring dual feasibility (\(\lambda \ge 0\)).
- Treating nonconvex problems as if strong duality must hold.
- Reporting stationary points without feasibility checks.
6.7 Competition-Grade Workflow (Kaggle-Style)¶
- Start with a simple feasible baseline and verify constraint satisfaction.
- Lock validation protocol first (split strategy + metric), then optimize.
- Tune model hyperparameters and constraint penalties jointly.
- Track both objective metric and violation metric on validation folds.
- Ensemble only models that meet feasibility thresholds reliably.
6.8 Missing Advanced Point: Many-Constraint ML Problems¶
In fairness/ranking/safety settings, the number of constraints can be very large.
A practical approach is to parameterize multipliers (multiplier models) rather than storing one multiplier per constraint; this improves scalability while preserving primal-dual structure.
References¶
- Convex Optimization - Stephen Boyd and Lieven Vandenberghe (Book page)
- Convex Optimization (PDF)
- Machine Learning Mastery: Lagrange Multiplier Approach with Inequality Constraints
- Machine Learning Mastery: A Gentle Introduction to Optimization / Mathematical Programming
- SciPy Optimize Tutorial (Constrained minimization)
- SciPy minimize API
- CVXPY Introduction Tutorial
- CVXPY Constraints Tutorial
- The Kaggle Book (chapter listing; validation/modeling/optimization practice)
- Heavily-Constrained Learning with Lagrange Multiplier Models (NeurIPS 2020)