class: center, middle, inverse, title-slide # STA 360/602L: Module 7.2 ## Metropolis in action ### Dr. Olanrewaju Michael Akande --- ## Count data - We will use the Metropolis sampler on count data with predictors, so let's first do some general review. -- - Suppose you have count data as your response variable. -- - For example, we may want to explain the number of c-sections carried out in hospitals using potential predictors such as hospital type, (that is, private vs public), location, size of the hospital, etc. -- - The models we have covered so far are not (completely) adequate for count data with predictors. -- - Of course there are instances where linear regression, with some transformations (especially taking logs) on the response variable, might still work reasonably well for count data. -- - That's not the focus here, so we won't cover that. --- ## Poisson regression - As we have seen so far, a good distribution for modeling count data with no limit on the total number of counts is the .hlight[Poisson distribution]. -- - As a reminder, the Poisson pmf is given by .block[ .small[ `$$\Pr[Y = y | \lambda] = \dfrac{\lambda^y e^{-\lambda}}{y!}; \ \ \ \ y=0,1,2,\ldots; \ \ \ \ \lambda > 0.$$` ] ] -- - Remember that .block[ .small[ `$$\mathbb{E}[Y = y] = \mathbb{V}[Y = y] = \lambda.$$` ] ] -- - When our data fails this assumption, we may have what is known as .hlight[over-dispersion] and may want to consider the [Negative Binomial distribution](https://en.wikipedia.org/wiki/Negative_binomial_distribution) instead (actually easy to fit within the Bayesian framework!). -- - With predictors, index `\(\lambda\)` with `\(i\)`, so that each `\(\lambda_i\)` is a function of `\(\boldsymbol{X}\)`. Therefore, the .hlight[random component] of the glm is .block[ .small[ $$p(y_i | \lambda_i) = \textrm{Poisson}(\lambda_i); \ \ \ i=1,\ldots,n. $$ ] ] --- ## Poisson regression - We must ensure that `\(\lambda_i > 0\)` at any value of `\(\boldsymbol{X}\)`, therefore, we need a .hlight[link function] that enforces this. A natural choice is .block[ `$$\textrm{log}\left(\lambda_i\right) = \beta_{0} + \beta_{1} x_{i1} + \beta_{2} x_{i2} + \ldots + \beta_{p} x_{ip}.$$` ] -- - Combining these pieces give us our full mathematical representation for the .hlight[Poisson regression]. -- - Clearly, `\(\lambda_i\)` has a natural interpretation as the "expected count", and .block[ `$$\lambda_i = e^{\beta_{0} + \beta_{1} x_{i1} + \beta_{2} x_{i2} + \ldots + \beta_{p} x_{ip}}$$` ] so the `\(e^{\beta_{j}}\)`'s are .hlight[multiplicative effects] on the expected counts. -- - For the frequentist version, in .hlight[R], use the `glm` command but set the option `family = “poisson”`. --- ## Analysis of horseshoe crabs - We have data from a study of nesting horseshoe crabs (J. Brockmann, Ethology, 102: 1–21, 1996). The data has been discussed in Agresti (2002). -- - Each female horseshoe crab in the study had a male crab attached to her in her nest. -- - The study investigated factors that affect whether the female crab had any other males, called satellites, residing nearby her. -- - The response outcome for each female crab is her number of satellites. -- - We have several factors (including the female crab's color, spine condition, weight, and carapace width) which may influence the presence/absence of satellite males. -- - The data is called `hcrabs` in the R package `rsq`. --- ## Analysis of horseshoe crabs - Let's fit the Poisson regression model to the data. In vector form, we have .block[ .small[ $$ `\begin{split} y_i & \sim \textrm{Poisson}(\lambda_i); \ \ \ i=1,\ldots,n;\\ \\ \text{log}[\lambda_i] & = \boldsymbol{\beta}^T \boldsymbol{x}_i \end{split}` $$ ] ] where `\(y_i\)` is the number of satellites for female crab `\(i\)`, and `\(\boldsymbol{x}_i\)` contains the intercept and female crab `\(i\)`'s + color; + spine condition; + weight; and + carapace width. -- - Suppose we specify a normal prior for `\(\boldsymbol{\beta} = (\beta_0, \beta_1, \beta_2, \ldots, \beta_{p-1})\)`, `\(\pi(\boldsymbol{\beta}) = \mathcal{N}_p(\boldsymbol{\beta}_0, \Sigma_0)\)`. -- - Can you write down the posterior for `\(\boldsymbol{\beta}\)`? Can you sample directly from it? --- ## Analysis of horseshoe crabs - We can use Metropolis to generate samples from the posterior. -- - First, we need a "symmetric" proposal density `\(\boldsymbol{\beta}^\star \sim g[\boldsymbol{\beta}^\star | \boldsymbol{\beta}^{(s)}]\)`; a reasonable choice is usually a multivariate normal centered on `\(\boldsymbol{\beta}^{(s)}\)`. -- - What about the variance of the proposal density? We can use the variance of the ols estimate, that is, `\(\hat{\sigma}^2 \left(\boldsymbol{X}^T \boldsymbol{X}\right)^{-1}\)`, which we can scale using `\(\delta\)`, to tune the acceptance ratio. -- - Here, `\(\hat{\sigma}^2\)` is calculated as the sample variance of `\(\textrm{log}[y_i + c]\)`, for some small constant `\(c\)`, to avoid problems when `\(y_i = 0\)`. -- - So we have `\(g[\boldsymbol{\beta}^\star | \boldsymbol{\beta}^{(s)}] = \mathcal{N}_p\left(\boldsymbol{\beta}^{(s)}, \delta \hat{\sigma}^2 \left(\boldsymbol{X}^T \boldsymbol{X}\right)^{-1} \right)\)`. -- - Finally, since we do not have any information apriori about `\(\boldsymbol{\beta}\)`, let's set the prior for it to be `\(\pi(\boldsymbol{\beta}) = \mathcal{N}_p(\boldsymbol{\beta}_0 = \boldsymbol{0}, \Sigma_0 = \boldsymbol{I})\)`. --- ## Analysis of horseshoe crabs - The Metropolis algorithm for this model is: 1. Given a current `\(\boldsymbol{\beta}^{(s)}\)`, generate a _candidate_ value `\(\boldsymbol{\beta}^\star \sim g[\boldsymbol{\beta}^\star | \boldsymbol{\beta}^{(s)}] = \mathcal{N}_p\left(\boldsymbol{\beta}^{(s)}, \delta \hat{\sigma}^2 \left(\boldsymbol{X}^T \boldsymbol{X}\right)^{-1} \right)\)`. -- 2. Compute the acceptance ratio .block[ .small[ $$ `\begin{split} r & = \dfrac{\pi(\boldsymbol{\beta}^\star | Y)}{\pi(\boldsymbol{\beta}^{(s)} | Y)} = \dfrac{\pi(\boldsymbol{\beta}^\star) \cdot p(Y | \boldsymbol{\beta}^\star)}{\pi(\boldsymbol{\beta}^{(s)}) \cdot p(Y | \boldsymbol{\beta}^{(s)})}\\ \\ & = \dfrac{ \mathcal{N}_p(\boldsymbol{\beta}^\star | \boldsymbol{\beta}_0 = \boldsymbol{0}, \Sigma_0 = \boldsymbol{I}) \cdot \prod\limits^n_{i=1} \text{Poisson}\left(Y_i| \lambda_i = \text{exp}\left\{ \left(\boldsymbol{\beta}^\star \right)^T \boldsymbol{x}_i \right\} \right) }{ \mathcal{N}_p(\boldsymbol{\beta}^{(s)} | \boldsymbol{\beta}_0 = \boldsymbol{0}, \Sigma_0 = \boldsymbol{I}) \cdot \prod\limits^n_{i=1} \text{Poisson}\left(Y_i| \lambda_i = \text{exp}\left\{ \left(\boldsymbol{\beta}^{(s)} \right)^T \boldsymbol{x}_i \right\} \right) }. \end{split}` $$ ] ] -- 3. Sample `\(u \sim U(0,1)\)` and set .block[ .small[ `\begin{eqnarray*} \boldsymbol{\beta}^{(s+1)} = \left\{ \begin{array}{ll} \boldsymbol{\beta}^\star & \quad \text{if} \quad u < r \\ \boldsymbol{\beta}^{(s)} & \quad \text{if} \quad \text{otherwise} \\ \end{array} \right. . \end{eqnarray*}` ] ] --- class: center, middle # Move to the R script [here](https://sta-602l-s21.github.io/Course-Website/slides/Horseshoe.R). --- class: center, middle # What's next? ### Move on to the readings for the next module!