An **optimization problem** is the task of minimizing some **objective function** $J(\boldsymbol{x}) \in \mathbb{R}$ over some [[domain]] $\mathcal{X}$, subject to **inequality constraints** given by $\boldsymbol{g}: \mathcal{X} \to \mathbb{R}^{N}$ and **equality constraints** given by $\boldsymbol{h}: \mathcal{X} \to \mathbb{R}^{M}$:
^problem
$$
\min_{\boldsymbol{x} \in \mathcal{X}} J(\boldsymbol{x}) \quad \text{subject to} \quad
\boldsymbol{g}(\boldsymbol{x}) \le \boldsymbol{0} \quad \text{and} \quad
\boldsymbol{h}(\boldsymbol{x}) = \boldsymbol{0}
$$
^program
We solve an optimization problem using some [[algorithm]]; see [[optimization algorithm]].
("[[closed form solution]]" is really a subjective term.)
Note that equality constraints are technically *redundant*, since we could express $\boldsymbol{h}(\boldsymbol{x}) = \boldsymbol{0}$ by the two inequality constraints $\boldsymbol{h}(\boldsymbol{x}) \le \boldsymbol{0}$ and $-\boldsymbol{h}(\boldsymbol{x}) \le \boldsymbol{0}$.
But it's usually more illuminating to treat them separately.
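As a concrete sketch (my own toy example, not part of the formal definition), here is how such a program might be posed with `scipy.optimize.minimize`; the objective, constraints, and starting point below are made up, and note that SciPy's `"ineq"` convention is $-\boldsymbol{g}(\boldsymbol{x}) \ge \boldsymbol{0}$ rather than $\boldsymbol{g}(\boldsymbol{x}) \le \boldsymbol{0}$:

```python
import numpy as np
from scipy.optimize import minimize

# toy objective J(x) = x1^2 + x2^2
def J(x):
    return x[0] ** 2 + x[1] ** 2

# inequality constraint g(x) = 1 - x1 - x2 <= 0  (SciPy wants -g(x) >= 0)
# equality constraint  h(x) = x1 - 2 x2 = 0
constraints = [
    {"type": "ineq", "fun": lambda x: x[0] + x[1] - 1.0},
    {"type": "eq",   "fun": lambda x: x[0] - 2.0 * x[1]},
]

result = minimize(J, x0=np.array([1.0, 1.0]), constraints=constraints)
print(result.x, result.fun)  # a feasible minimizer and its objective value
```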
# terminology
We call $\boldsymbol{x}$ **feasible** if it obeys the constraints.
We call $\boldsymbol{x}$ **strictly** feasible if $\boldsymbol{g}(\boldsymbol{x}) < \boldsymbol{0}$ and $\boldsymbol{h}(\boldsymbol{x}) = \boldsymbol{0}$.
We call $\boldsymbol{x}^{\star}$ a **global minimum** if $J(\boldsymbol{x}^{\star}) \le J(\boldsymbol{x})$ for all feasible $\boldsymbol{x}$.
We call $\widetilde{\boldsymbol{x}}$ a **local minimum** if $J(\widetilde{\boldsymbol{x}}) \le J(\boldsymbol{x})$ for all feasible $\boldsymbol{x}$ in some [[neighbourhood]] of $\widetilde{\boldsymbol{x}}$.
We say the $n$th inequality constraint $g^{n}$ is **binding** (or **active**) at $\boldsymbol{x}$ if $g^{n}(\boldsymbol{x}) = 0$.
We say $\boldsymbol{\delta} \in \mathcal{X}$ is a **direction of decrease** at $\boldsymbol{x}$ if moving in that direction decreases $J$;
ie the [[directional derivative]] $\partial J(\boldsymbol{x}) \boldsymbol{\delta} < 0$;
ie $\boldsymbol{\delta}$ makes an obtuse angle with the gradient: $\nabla J(\boldsymbol{x})^{\top} \boldsymbol{\delta} < 0$.
See [[loss landscape]].
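A quick numeric sanity check of the direction-of-decrease condition (the quadratic objective, point, direction, and step size are all made-up toy choices): if $\nabla J(\boldsymbol{x})^{\top} \boldsymbol{\delta} < 0$, then $J(\boldsymbol{x} + t \boldsymbol{\delta}) < J(\boldsymbol{x})$ for small enough $t > 0$.

```python
import numpy as np

def J(x):
    return 0.5 * x @ x          # toy objective

def grad_J(x):
    return x                    # its gradient

x = np.array([2.0, -1.0])
delta = np.array([-1.0, 0.0])   # candidate direction

slope = grad_J(x) @ delta       # directional derivative along delta
print(slope)                    # -2.0 < 0, so delta is a direction of decrease

t = 1e-3                        # small step
print(J(x + t * delta) < J(x))  # True: moving along delta decreases J
```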
[[Lagrangian duality]] is a powerful tool: it always yields lower bounds on the optimal value (weak duality), and for [[convex]] problems it often solves them exactly (strong duality).
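As a toy illustration of the lower-bound property (my own example): for $\min_{x} x^{2}$ subject to $1 - x \le 0$, the dual function is $q(\lambda) = \inf_{x} \left( x^{2} + \lambda (1 - x) \right) = \lambda - \lambda^{2}/4$, and $q(\lambda) \le J(x^{\star}) = 1$ for every $\lambda \ge 0$:

```python
import numpy as np

# toy problem: minimize x^2 subject to 1 - x <= 0; optimum at x* = 1 with value 1
# dual function q(lambda) = lambda - lambda^2 / 4
def dual(lam):
    return lam - lam ** 2 / 4.0

lams = np.linspace(0.0, 4.0, 9)
print(dual(lams))        # every entry is <= 1 (weak duality)
print(dual(lams).max())  # 1.0 at lambda = 2: strong duality for this convex problem
```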
# [[machine learning]] perspective
![[machine learning#^goal]]
"machine learning method" is a synonym for "[[optimization]] problem and algorithm".
How to frame the optimization problem and choose an algorithm?
We must consider *inductive biases*:
![[induct#^inductive-bias]]
Then we also need to conduct [[model assessment]] to estimate the [[generalization error]].
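A minimal sketch of that assessment step, assuming a held-out split as the estimator (the data and linear model below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up data: y = 2x + noise
X = rng.normal(size=(200, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)

# hold out a test split that the fitting step never sees
train, test = slice(0, 150), slice(150, 200)
w = np.linalg.lstsq(X[train], y[train], rcond=None)[0]  # fit by least squares

test_mse = np.mean((X[test] @ w - y[test]) ** 2)  # estimate of the generalization error
print(test_mse)
```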
# [[archetype|model organism]]s
[[two-layer perceptron|2LP]]
[[Rosenbrock function]]
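A minimal sketch of the [[Rosenbrock function]] as a test problem (the optimizer and starting point are arbitrary choices on my part); its global minimum sits at $(1, 1)$:

```python
import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, minimized at (1, 1)."""
    return (1.0 - x[0]) ** 2 + 100.0 * (x[1] - x[0] ** 2) ** 2

result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]), method="BFGS")
print(result.x)  # approximately [1., 1.]
```

Its narrow curved valley is what makes it a standard stress test for [[optimization algorithm]]s.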