The [[risk]] of a [[statistic|decision rule]] (e.g. an [[train loss|erm]]) over a given [[statistics|data generating process]] $p_{*}$ is
![[risk#^risk]]
What if we care about a whole family of problems $p_{*}[\theta]$
parameterized by $\theta \in \Theta$?
Given a [[prior]] $\pi \in \triangle(\Theta)$, the **Bayes risk** is the [[mean|expected]] risk over the prior:
$
\mathrm{Risk}_\text{Bayes}[\pi](\delta) = \mathop{\mathbb{E}}_{\theta \sim \pi} [\mathrm{Risk}[\theta](\delta)] = \mathop{\mathbb{E}}_{\substack{\theta \sim \pi \\ \boldsymbol{y} \sim p_{*}[\theta]}} J[\theta](\delta(\boldsymbol{y}))
$
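As a toy numeric check (my own construction; the Bernoulli setup, squared-error loss, and sample-mean rule are assumptions, not from the note), the Bayes risk is just the prior-weighted average of the per-$\theta$ risks:

```python
import numpy as np

# Toy problem: theta is a Bernoulli bias, y is n coin flips from p*[theta],
# the decision rule delta(y) is the sample mean, and the loss is
# J[theta](a) = (a - theta)^2.
rng = np.random.default_rng(0)
n = 20

def risk(theta, trials=200_000):
    # Risk[theta](delta) = E_{y ~ p*[theta]} (delta(y) - theta)^2,
    # estimated by Monte Carlo; the exact value is theta*(1-theta)/n.
    ybar = rng.binomial(n, theta, size=trials) / n
    return np.mean((ybar - theta) ** 2)

# Prior pi: uniform over two possible biases.
thetas = [0.2, 0.8]
pi = [0.5, 0.5]

# Bayes risk = E_{theta ~ pi} Risk[theta](delta)
bayes_risk = sum(w * risk(t) for w, t in zip(pi, thetas))
print(bayes_risk)  # close to 0.2 * 0.8 / 20 = 0.008 for both thetas
```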
The risk-*minimizing* [[statistic|decision rule]] within some class $\mathcal{D}$ for a given [[prior]] $\pi$ is called its **Bayes rule**, i.e.
$
\delta_{\text{Bayes}}[\mathcal{D}][\pi] = \mathop{\mathrm{arg\,min}}_{\delta \in \mathcal{D}} \text{Risk}_\text{Bayes}[\pi](\delta)
$
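When $\mathcal{D}$ is a small parametric class, the Bayes rule can be found by direct search over the parameter. A sketch under assumed details (a Bernoulli toy problem and a class of shrinkage rules $\delta_\lambda(y) = \lambda \bar{y} + (1-\lambda)\,0.5$, neither from the note):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
thetas = np.array([0.2, 0.8])  # support of the prior
pi = np.array([0.5, 0.5])      # uniform prior weights

def bayes_risk(lam, trials=100_000):
    # E_{theta ~ pi, y ~ p*[theta]} (delta_lam(y) - theta)^2 by Monte Carlo,
    # where delta_lam shrinks the sample mean toward 0.5.
    total = 0.0
    for w, theta in zip(pi, thetas):
        ybar = rng.binomial(n, theta, size=trials) / n
        a = lam * ybar + (1 - lam) * 0.5
        total += w * np.mean((a - theta) ** 2)
    return total

# Grid search over the class D = {delta_lam : lam in [0, 1]}.
lams = np.linspace(0, 1, 21)
risks = [bayes_risk(lam) for lam in lams]
best = lams[int(np.argmin(risks))]
print(best, min(risks))
```

Some shrinkage beats the plain sample mean ($\lambda = 1$) here: trading a little bias toward the prior mean for lower variance lowers the Bayes risk.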
Note that if a Bayes rule achieves *constant* risk (the same value for every $\theta$), then it must be [[minimax]].
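To spell out why (writing $c$ for the constant risk): for any competitor $\delta$, the worst case dominates the average, and the Bayes rule minimizes the average, so
$
\sup_{\theta} \mathrm{Risk}[\theta](\delta) \geq \mathop{\mathbb{E}}_{\theta \sim \pi} [\mathrm{Risk}[\theta](\delta)] \geq \mathop{\mathbb{E}}_{\theta \sim \pi} [\mathrm{Risk}[\theta](\delta_{\text{Bayes}})] = c = \sup_{\theta} \mathrm{Risk}[\theta](\delta_{\text{Bayes}})
$
i.e. no rule has smaller worst-case risk than $\delta_{\text{Bayes}}$.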