class: center, middle, inverse, title-slide # STA 360/602L: Module 3.9 ## MCMC and Gibbs sampling III ### Dr. Olanrewaju Michael Akande --- ## Recap of normal model - Sampling model: .block[ .small[ `$$y_{i} | \mu, \tau \overset{iid}{\sim} \mathcal{N} \left( \mu, \tau^{-1}\right).$$` ] ] -- - Suppose we want to specify our uncertainties about `\(\mu\)` and `\(\tau\)` independently of each other. That is, we want `\(\pi(\mu,\tau) = \pi(\mu)\pi(\tau)\)`. -- - For example, .block[ .small[ $$ `\begin{split} \mu & \sim \mathcal{N}\left(\mu_0, \sigma_0^2\right).\\ \tau \ & \sim \textrm{Gamma}\left(\dfrac{\nu_0}{2}, \dfrac{\nu_0}{2\tau_0}\right).\\ \end{split}` $$ ] ] -- - Then in this form, where `\(\sigma_0^2\)` is not proportional to `\(\dfrac{1}{\tau}\)`, the marginal density of `\(\tau\)` is not a gamma density (or a density we can easily sample from). -- - We need to do .hlight[Gibbs sampling]. --- ## Full conditionals - That is, we need .block[ `$$\mu|Y,\tau \sim \mathcal{N}(\mu_n, \tau_n^{-1}),$$` ] where .block[ $$ `\begin{split} \mu_n & = \dfrac{\frac{\mu_0}{\sigma^2_0} + n\tau\bar{y}}{\frac{1}{\sigma^2_0} + n\tau}\\ & \ \ \ \ \ \ \ \ \ \ \ \ \\ \tau_n & = \frac{1}{\sigma^2_0} + n\tau.\\ \end{split}` $$ ] --- ## Full conditionals - and .block[ $$ `\begin{split} \tau| \mu,Y & \sim \textrm{Gamma}\left(\dfrac{\nu_n}{2}, \dfrac{\nu_n\sigma_n^2(\mu)}{2}\right), \end{split}` $$ ] where .block[ $$ `\begin{split} \nu_n & = \nu_0 + n\\ & \ \ \ \ \ \ \ \ \ \ \ \ \\ \sigma_n^2(\mu) & = \dfrac{1}{\nu_n} \left[ \dfrac{\nu_0}{\tau_0} + \sum^n_{i=1} (y_i - \mu)^2 \right] = \dfrac{1}{\nu_n}\left[ \dfrac{\nu_0}{\tau_0} + ns^2_n(\mu) \right]\\ & \ \ \ \ \ \ \ \ \ \ \ \ \\ \textrm{with} \ \ \ s^2_n(\mu) & = \dfrac{1}{n} \sum^n_{i=1} (y_i - \mu)^2 \ \ \Rightarrow \ \ ns^2_n(\mu) = (n-1)s^2 + n(\bar{y} - \mu)^2. \\ \end{split}` $$ ] --- ## Recall the Pygmalion data - Let's implement this Gibbs sampler for the Pygmalion data. -- - For now, let's focus only on the accelerated group. -- - Data for accelerated group (A): 20, 10, 19, 15, 9, 18. -- - Summary statistics: `\(\bar{y}_A = 15.2\)`; `\(s_A = 4.71\)`. --- ## Recall the Pygmalion data - Suppose we assume, as we did before, that these improvement scores are normal with mean `\(\mu\)` and variance `\(\frac{1}{\tau}\)`. -- - As a reminder, in the conjugate case, we had .block[ $$ `\begin{split} \tau \ & \sim \textrm{Gamma}\left(\frac{\nu_0}{2}, \frac{\nu_0}{2\tau_0}\right)\\ \mu|\tau & \sim \mathcal{N}\left(\mu_0, \frac{1}{\kappa_0 \tau}\right).\\ \end{split}` $$ ] -- - We set + `\(\mu_{0} = 0\)`, to reflect "no idea whether students would improve IQ on average"; + `\(\kappa_{0} = 1\)`, to reflect "little faith in this belief, equivalent to having only 1 prior observation in each group"; + `\(\nu_{0} = 1\)` and `\(1/\tau_{0} = 100\)`, based on literature, that is, SD of change scores should be around 10. --- ## Recall the Pygmalion data - Now, in the non-conjugate case, we have .block[ .small[ $$ `\begin{split} \mu & \sim \mathcal{N}\left(\mu_0, \sigma_0^2\right).\\ \tau \ & \sim \textrm{Gamma}\left(\dfrac{\nu_0}{2}, \dfrac{\nu_0}{2\tau_0}\right).\\ \end{split}` $$ ] ] - Suppose for `\(\mu\)`, we use a `\(\mathcal{N}(0,100)\)` prior, and for `\(\tau\)` we use a `\(\textrm{Ga}(\frac{1}{2},50)\)` prior. -- - Matching with .block[ .small[ $$ `\begin{split} \mu & \sim \mathcal{N}\left(\mu_0, \sigma_0^2\right).\\ \tau \ & \sim \textrm{Gamma}\left(\dfrac{\nu_0}{2}, \dfrac{\nu_0}{2\tau_0}\right),\\ \end{split}` $$ ] ] we have: `\(\mu_0 = 0\)`, `\(\sigma_0^2 = 100\)`, `\(\nu_0 = 1\)` and `\(\tau_0 = 1/100\)`. --- ## Gibbs sampling for Pygmalion data ```r y <- c(20,10,19,15,9,18) #data y_bar <- mean(y); s2 <- var(y); n <- length(y) #sample statistics you'll need S <- 10000 # number of samples to draw PHI <- matrix(nrow=S,ncol=3); #matrix to save results colnames(PHI) <- c("mu","tau","sigma2") PHI[1,] <- phi <- c(y_bar,1/s2,s2) #starting values are MLEs mu0 <- 0; sigma02 <- 100; nu0 <- 1; tau0 <- 1/100 #hyperparameters ###### Gibbs sampler ###### set.seed(1234) #to replicate results exactly for(s in 2:S) { #first, draw new mu taun <- 1/sigma02 + n*phi[2] mun <- (mu0/sigma02 + n*y_bar*phi[2])/taun phi[1] <- rnorm(1,mun,sqrt(1/taun)) #now, draw new tau/sigma2 nun <- nu0+n #trick to speed up calculation of sum(y_i-\mu)^2 s2nmu <- (nu0/tau0 + (n-1)*s2 + n*(y_bar-phi[1])^2)/nun phi[2] <- rgamma(1,nun/2,nun*s2nmu/2) phi[3] <- 1/phi[2] #save sigma2 #save the current joint draws PHI[s,] <- phi } ###### End of Gibbs sampler ###### ``` --- ## Pygmalion data: mean ```r plot(PHI[,1],ylab=expression(mu),xlab="Iteration", main=expression(paste("Sampled values of ",mu))) abline(a=mean(PHI[,1]),b=0,col="red4",lwd=2) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: mean ```r hist(PHI[,1],col=rainbow(20),xlab=expression(mu),ylab="Density",freq=F,breaks=50, main=expression(paste("Posterior density of ",mu))) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: mean ```r round(mean(PHI[,1]),3) ``` ``` ## [1] 13.99 ``` ```r round(quantile(PHI[,1],c(0.025,0.5,0.975)),3) ``` ``` ## 2.5% 50% 97.5% ## 7.520 14.217 19.277 ``` Posterior summaries for `\(\mu\)`: + Posterior mean `\(\approx 14\)`. + Posterior median `\(\approx 14.22\)`. + 95% credible interval `\(\approx (7.52, 19.28)\)`. For context, `\(\bar{y}_A = 15.2\)`, and we used a `\(\mathcal{N}(0,100)\)` prior for `\(\mu\)`. --- ## Pygmalion data: precision ```r plot(PHI[,2],ylab=expression(tau),xlab="Iteration", main=expression(paste("Sampled values of ",tau))) abline(a=mean(PHI[,2]),b=0,col="red4",lwd=2) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: precision ```r hist(PHI[,2],col=rainbow(50),xlab=expression(tau),ylab="Density",freq=F,breaks=50, main=expression(paste("Posterior density of ",tau))) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: precision ```r round(mean(PHI[,2]),3) ``` ``` ## [1] 0.028 ``` ```r round(quantile(PHI[,2],c(0.025,0.5,0.975)),3) ``` ``` ## 2.5% 50% 97.5% ## 0.006 0.025 0.069 ``` Posterior summaries for `\(\tau\)`: + Posterior mean `\(\approx 0.028\)`. + Posterior median `\(\approx 0.025\)`. + 95% credible interval `\(\approx (0.006, 0.069)\)`. For context, `\(s_A = 4.71\)`, which means sample precision `\(= 1/4.71^2 = 0.045\)`. Also, we used a `\(\textrm{Ga}(\frac{1}{2},50)\)` prior for `\(\tau\)`. --- ## Pygmalion data: variance ```r plot(PHI[,3],ylab=expression(sigma^2),xlab="Iteration", main=expression(paste("Sampled values of ",sigma^2))) abline(a=mean(PHI[,3]),b=0,col="red4",lwd=2) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: variance ```r hist(PHI[,3],col=rainbow(10),xlab=expression(sigma^2),ylab="Density",freq=F,breaks=100, main=expression(paste("Posterior density of ",sigma^2))) ``` <img src="3-9-gibbs-sampling-III_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" /> --- ## Pygmalion data: variance ```r round(mean(PHI[,3]),2) ``` ``` ## [1] 53.34 ``` ```r round(quantile(PHI[,3],c(0.025,0.5,0.975)),2) ``` ``` ## 2.5% 50% 97.5% ## 14.52 39.60 174.11 ``` Posterior summaries for `\(\sigma^2\)`: + Posterior mean `\(= 53.34\)`. + Posterior median `\(= 39.60\)`. + 95% credible interval `\(= (14.52, 174.11)\)`. For context, `\(s_A = 4.71\)`, which means sample variance `\(4.71^2 = 22.18\)`. Again, we used a `\(\textrm{Ga}(\frac{1}{2},50)\)` prior for `\(\tau\)`. --- class: center, middle # What's next? ### Move on to the readings for the next module!