University of Toronto
August 9, 2023
\[ \pi(\theta | x)=\frac{f(x|\theta)\pi(\theta)}{m(x)} \]
The marginal distribution of the data, \(m(x)\), does not depend on \(\theta\), so it acts as a normalizing constant for the posterior.
The posterior of \(\theta\) is proportional to the likelihood times the prior.
\[ \pi(\theta | x) \propto f(x|\theta)\pi(\theta) \]
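The proportionality can be checked numerically. The sketch below is only an illustration, not part of the original notes: it evaluates likelihood times prior on a grid and divides by a Riemann-sum approximation of \(m(x)\). The function name `grid_posterior`, the flat prior, and the data (7 successes in 10 Bernoulli trials) are assumptions chosen for the example.

```python
def grid_posterior(likelihood, prior, grid):
    """Approximate pi(theta | x) on an equally spaced grid.

    The sum over the grid is a Riemann-sum stand-in for m(x).
    """
    step = grid[1] - grid[0]
    unnormalized = [likelihood(t) * prior(t) for t in grid]
    m_x = sum(unnormalized) * step  # approximates the marginal m(x)
    return [u / m_x for u in unnormalized]


# Illustrative (hypothetical) data: 7 successes in 10 Bernoulli trials,
# with a flat prior on (0, 1).
n, successes = 10, 7
grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints of 1000 cells
step = grid[1] - grid[0]

posterior = grid_posterior(
    likelihood=lambda t: t**successes * (1 - t) ** (n - successes),
    prior=lambda t: 1.0,
    grid=grid,
)

total_mass = sum(posterior) * step
posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(f"total mass = {total_mass:.6f}, posterior mean = {posterior_mean:.4f}")
```

Under these assumptions the exact posterior is \(\text{Beta}(8,4)\), so the grid mean should be close to \(8/12\).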
Given data \(x=(x_1,...,x_n)\), realizations of a random sample \(X=(X_1,...,X_n)\), where \(X_i\sim F_{\theta}, \; \theta\in\Theta\), how do we arrive at an estimate \(\widehat{\theta}\)?
Frequentist: Find the value of \(\theta\) that maximizes the log-likelihood
\[ \widehat{\theta}_{MLE}=\arg \max_{\theta\in\Theta} \ell(\theta) \]
Bayesian: Find the posterior distribution \(\pi(\theta|X_1,...,X_n)\) and then the estimator can be defined based on an appropriate summary statistic.
Some common choices for summaries: the posterior mean, the posterior median, and the posterior mode.
Example 1 (Bernoulli model with Beta prior) Suppose we have a random sample \(X_1,...,X_n \overset{iid}{\sim} \text{Bernoulli}(\theta)\) and choose prior distribution \(\theta\sim\text{Beta}(a,b)\).
Find the posterior distribution of \(\theta|X_1,...,X_n\).
Find the posterior mean.
Find the posterior median.
Find the posterior mode.
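A minimal sketch of the three summaries for Example 1, assuming the standard conjugate result \(\theta|X_1,...,X_n \sim \text{Beta}(a+\sum x_i,\; b+n-\sum x_i)\). The mean and mode use closed forms (the mode formula needs both posterior parameters greater than 1); the median has no simple closed form, so it is approximated by accumulating area on a grid. The helper names and the example values \(a=b=1\), \(n=10\), \(\sum x_i=7\) are hypothetical.

```python
import math


def beta_pdf(t, a, b):
    """Beta(a, b) density at t in (0, 1), computed on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))


def beta_quantile(p, a, b, cells=20000):
    """Approximate the p-quantile by accumulating midpoint-rule area."""
    step = 1.0 / cells
    area = 0.0
    for i in range(cells):
        t = (i + 0.5) * step
        area += beta_pdf(t, a, b) * step
        if area >= p:
            return t
    return 1.0


def beta_bernoulli_summaries(a, b, n, successes):
    """Posterior parameters plus mean, median, and mode of theta."""
    a_post = a + successes
    b_post = b + n - successes
    mean = a_post / (a_post + b_post)
    median = beta_quantile(0.5, a_post, b_post)
    mode = (a_post - 1) / (a_post + b_post - 2)  # valid when a_post, b_post > 1
    return a_post, b_post, mean, median, mode


# Hypothetical inputs: uniform prior Beta(1, 1), 7 successes in 10 trials.
a_post, b_post, mean, median, mode = beta_bernoulli_summaries(1, 1, 10, 7)
print(a_post, b_post, round(mean, 4), round(median, 4), round(mode, 4))
```

With a uniform prior the posterior mode coincides with the MLE \(\bar{x}\), while the mean and median are pulled toward \(1/2\).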
Example 2 (Location Normal model with Normal prior) Suppose we have a random sample \(X_1,...,X_n \overset{iid}{\sim} \text{N}(\mu, \sigma_0^2)\) with \(\sigma^2_0\) known and choose prior distribution \(\mu \sim \text{N}(\mu_0,\tau_0^2)\).
Find the posterior distribution of \(\mu|X_1,...,X_n\).
Find an estimate for \(\mu\).
Compare with the MLE.
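The comparison with the MLE can be sketched numerically. The function below assumes the standard conjugate posterior for this model, whose mean is a precision-weighted average of the prior mean \(\mu_0\) and the MLE \(\bar{x}\). The function name and all numeric inputs are hypothetical.

```python
def normal_posterior(xbar, n, sigma0_sq, mu0, tau0_sq):
    """Posterior mean and variance of mu for known data variance sigma0_sq."""
    precision = 1 / tau0_sq + n / sigma0_sq  # prior precision + data precision
    post_var = 1 / precision
    post_mean = post_var * (mu0 / tau0_sq + n * xbar / sigma0_sq)
    return post_mean, post_var


# Hypothetical inputs: xbar = 2.0 from n = 25 observations, sigma0^2 = 4,
# prior N(0, 1).
xbar, n = 2.0, 25
post_mean, post_var = normal_posterior(xbar, n, sigma0_sq=4.0, mu0=0.0, tau0_sq=1.0)
mle = xbar  # the MLE of mu in this model is the sample mean

# A very diffuse prior (large tau0^2) should give a posterior mean near the MLE.
diffuse_mean, _ = normal_posterior(xbar, n, sigma0_sq=4.0, mu0=0.0, tau0_sq=1e8)

print(f"posterior mean = {post_mean:.4f}, MLE = {mle:.4f}, diffuse = {diffuse_mean:.4f}")
```

The posterior mean lies between \(\mu_0\) and \(\bar{x}\), and moves toward \(\bar{x}\) as \(n\) or \(\tau_0^2\) grows.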
A \(100(1-\alpha)\%\) credible interval for \(\theta\), given a random sample \(X=(X_1,...,X_n)\), is any pair \((L_n, U_n)\) such that
\[ \Pr\left( \left. L_n<\theta<U_n \right| X_1,...,X_n \right) = 1-\alpha \]
If \(q_\alpha\) represents the \(\alpha\)-quantile of the posterior, that is, \[ \int_{-\infty}^{q_\alpha} \pi (\theta|X_1,...,X_n)d\theta=\alpha \] then the following are \(100(1-\alpha)\%\) credible intervals: \((-\infty, q_{1-\alpha})\), \((q_\alpha, \infty)\), and \((q_{\alpha/2}, q_{1-\alpha/2})\).
When \(\Theta \subset \mathbb{R}\) (as opposed to \(\Theta = \mathbb{R}\)), the “\(\pm \infty\)” are replaced by the endpoints of \(\Theta\).
How do we pick a way of constructing a credible interval?
Choose the shortest interval.
The equal-tailed interval \[ (q_{\alpha/2},q_{1-\alpha/2}) \] is the shortest interval if the posterior is symmetric and unimodal.
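For a skewed posterior, the equal-tailed and shortest intervals can differ. The sketch below is an illustration under stated assumptions: it takes a hypothetical \(\text{Beta}(2,8)\) posterior, approximates its quantiles on a grid, and scans intervals of the form \((q_p, q_{p+1-\alpha})\) for \(p\in[0,\alpha]\) to find the shortest one. All function names are made up for the example.

```python
import math
from bisect import bisect_left


def beta_cdf_on_grid(a, b, cells=20000):
    """Grid midpoints on (0, 1) and the Beta(a, b) CDF accumulated over them."""
    step = 1.0 / cells
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    mids, cdf, area = [], [], 0.0
    for i in range(cells):
        t = (i + 0.5) * step
        log_pdf = log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t)
        area += math.exp(log_pdf) * step
        mids.append(t)
        cdf.append(area)
    return mids, cdf


def quantile(p, mids, cdf):
    """Smallest grid midpoint whose accumulated area reaches p."""
    idx = bisect_left(cdf, p)
    return mids[min(idx, len(mids) - 1)]


def equal_tailed_interval(alpha, mids, cdf):
    return quantile(alpha / 2, mids, cdf), quantile(1 - alpha / 2, mids, cdf)


def shortest_interval(alpha, mids, cdf, steps=200):
    """Scan lower-tail masses p in [0, alpha]; keep the shortest (q_p, q_{p+1-alpha})."""
    best = None
    for k in range(steps + 1):
        p = alpha * k / steps
        lo = quantile(p, mids, cdf)
        hi = quantile(min(p + 1 - alpha, 1.0), mids, cdf)
        if best is None or hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


# Hypothetical right-skewed posterior: Beta(2, 8), alpha = 0.05.
mids, cdf = beta_cdf_on_grid(2, 8)
equal_tailed = equal_tailed_interval(0.05, mids, cdf)
shortest = shortest_interval(0.05, mids, cdf)
print("equal-tailed:", equal_tailed, "length", equal_tailed[1] - equal_tailed[0])
print("shortest:    ", shortest, "length", shortest[1] - shortest[0])
```

Both intervals carry posterior probability of about \(1-\alpha\); the scan simply shifts mass between the two tails until the length is minimized.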
Frequentist confidence interval: If samples of size \(n\) are drawn repeatedly and independently, \(100(1-\alpha)\%\) of the resulting intervals would contain the true value of the parameter in the long run.
Bayesian credible interval: Given the observed data, the posterior probability that the parameter is in the interval is \(1-\alpha\).
Example 3 (Location Normal model with Normal prior) Suppose we have a random sample \(X_1,...,X_n \overset{iid}{\sim} \text{N}(\mu, \sigma_0^2)\) with \(\sigma^2_0\) known and choose prior distribution \(\mu \sim \text{N}(\mu_0,\tau_0^2)\).
\[ \mu |\mathbf{x} \sim N \left( \left( \frac{1}{\tau_0^2}+\frac{n}{\sigma_0^2}\right) ^{-1} \left( \frac{\mu_0}{\tau_0^2}+\frac{n}{\sigma_0^2} \bar{x} \right), \left( \frac{1}{\tau_0^2}+\frac{n}{\sigma_0^2}\right) ^{-1} \right) \]
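Because this posterior is Normal, hence symmetric and unimodal, an equal-tailed credible interval follows directly from its quantiles. A minimal sketch using Python's standard-library `statistics.NormalDist`; the function name and the numeric inputs are hypothetical, since the notes give no data for this example.

```python
from statistics import NormalDist


def normal_credible_interval(xbar, n, sigma0_sq, mu0, tau0_sq, alpha=0.05):
    """Equal-tailed 100(1 - alpha)% credible interval for mu."""
    precision = 1 / tau0_sq + n / sigma0_sq
    post_var = 1 / precision
    post_mean = post_var * (mu0 / tau0_sq + n * xbar / sigma0_sq)
    posterior = NormalDist(mu=post_mean, sigma=post_var**0.5)
    return posterior.inv_cdf(alpha / 2), posterior.inv_cdf(1 - alpha / 2)


# Hypothetical inputs: xbar = 2.0, n = 25, sigma0^2 = 4, prior N(0, 1).
lower, upper = normal_credible_interval(
    xbar=2.0, n=25, sigma0_sq=4.0, mu0=0.0, tau0_sq=1.0
)
print(f"95% credible interval for mu: ({lower:.4f}, {upper:.4f})")
```

The interval is centred at the posterior mean with half-width \(z_{1-\alpha/2}\) times the posterior standard deviation.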
Example 4 (Bernoulli model with Beta prior) Let \(X_1,...,X_n \overset{iid}{\sim} \text{Bernoulli}(\theta)\). Choose prior distribution \(\theta\sim\text{Beta}(12,12)\). Suppose we observe \(7\) heads in \(10\) flips of this coin.
\[ \theta|\mathbf{x} \sim \text{Beta}(a+n\bar{x}, b+n(1-\bar{x})) \]
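As a quick check, plugging the values from Example 4 (\(a=b=12\), \(n=10\), \(n\bar{x}=7\)) into this posterior gives the following. The mean and mode formulas are standard properties of the Beta distribution rather than results stated in these notes.

\[ \theta|\mathbf{x} \sim \text{Beta}(12+7,\; 12+3) = \text{Beta}(19, 15), \qquad E[\theta|\mathbf{x}] = \frac{19}{19+15} = \frac{19}{34} \approx 0.559, \qquad \text{mode} = \frac{19-1}{19+15-2} = \frac{18}{32} = 0.5625 \]

Both summaries sit between the prior mean \(1/2\) and the MLE \(\bar{x}=0.7\), reflecting how strongly the \(\text{Beta}(12,12)\) prior pulls the estimate toward \(1/2\) with only 10 observations.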
Example 5 (Exponential model with Gamma prior) Let \(X_1,...,X_n \overset{iid}{\sim} \text{Exp}(\lambda)\) with density \(f_\lambda (x_i)=\lambda e^{-x_i\lambda}\). Use prior distribution \(\lambda \sim\text{Gamma}(\alpha,\beta)\) where \((\alpha,\beta)=(2, 3)\). Suppose we observe 7 in a sample of size \(n=10\).
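A sketch of the posterior derivation for this example, assuming the Gamma prior uses the shape-rate parametrization \(\pi(\lambda) \propto \lambda^{\alpha-1}e^{-\beta\lambda}\); the notes do not say which parametrization is intended. The sum \(\sum x_i\) is kept symbolic.

\[ \pi(\lambda|\mathbf{x}) \propto \left( \prod_{i=1}^n \lambda e^{-x_i\lambda} \right) \lambda^{\alpha-1}e^{-\beta\lambda} = \lambda^{\alpha+n-1} e^{-\left(\beta+\sum_{i=1}^n x_i\right)\lambda} \]

Under that assumption, \(\lambda|\mathbf{x} \sim \text{Gamma}\left(\alpha+n,\; \beta+\sum_{i=1}^n x_i\right)\). With \((\alpha,\beta)=(2,3)\) and \(n=10\) this is \(\text{Gamma}\left(12,\; 3+\sum_{i=1}^{10} x_i\right)\), with posterior mean \(12 / \left(3+\sum_{i=1}^{10} x_i\right)\).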