given some [[optimization|objective function]] $J(\hat{\theta})$ and a decision rule $\delta : \boldsymbol{y} \mapsto \hat{\theta}$, where $\boldsymbol{y} \sim p_{*}$ is the [[dataset]], we call the [[mean|expected value]] of the objective the [[risk]]:

$$
\text{Risk}(\delta) = \mathop{\mathbb{E}}_{\boldsymbol{y} \sim p_{*}} [J(\delta(\boldsymbol{y}))]
$$
^risk

e.g. in [[machine learning]]:

1. the objective function is the [[generalization error]]
2. we [[latent variable|observe]] some [[train loss|train]]ing [[dataset]] $\mathcal{D} \sim p_{*}^{N}$
3. the [[statistic|decision rule]] is the [[optimization algorithm]] $\text{fit} : \mathcal{D} \mapsto f$ that outputs e.g. a [[prediction rule]] in [[supervised]] learning

$$
\text{Risk}(\text{fit}) = \mathop{\mathbb{E}}_{\mathcal{D} \sim p_{*}^{N}} [\text{Err}_{\text{g}}(\text{fit}[\mathcal{D}])]
$$

Even more generally, we can define the [[Bayes risk]] when there is a whole *class* of [[statistics|data generating process]]es we care about doing well under.

# sources

[[2013HastieEtAlElementsStatisticalLearning|ESL]] uses "expected test error"
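since the risk is an expectation over datasets drawn from $p_{*}$, it can be estimated by Monte Carlo: draw many datasets, apply the decision rule to each, and average the objective. A minimal sketch, under assumed specifics not in the note (Gaussian data, the sample mean as the decision rule $\delta$, and squared error to the true parameter as the objective $J$):

```python
import numpy as np

# hypothetical setup: theta* = 2.0, dataset y ~ N(theta*, 1)^N,
# decision rule delta = sample mean, objective J = squared error
rng = np.random.default_rng(0)
theta_star, N, trials = 2.0, 20, 10_000

def delta(y):
    # decision rule: map the dataset y to an estimate theta_hat
    return y.mean()

def J(theta_hat):
    # objective function: squared error against the true parameter
    return (theta_hat - theta_star) ** 2

# Risk(delta) = E_{y ~ p*}[ J(delta(y)) ], averaged over many sampled datasets
risk = np.mean([J(delta(rng.normal(theta_star, 1.0, size=N)))
                for _ in range(trials)])
print(risk)  # close to the analytic risk of the mean, sigma^2 / N = 0.05
```

in the machine-learning instance above, `delta` would be `fit`, each draw would be a training set $\mathcal{D} \sim p_{*}^{N}$, and `J` would be the generalization error of the fitted prediction rule.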