Due: 11:59pm, Sunday, May 30
You all should have R and RStudio installed on your computers by now. If you do not, first install the latest version of R here: https://cran.rstudio.com (remember to select the right installer for your operating system). Next, install the latest version of RStudio here: https://www.rstudio.com/products/rstudio/download/. Scroll down to the “Installers for Supported Platforms” section and find the right installer for your operating system.
You are required to use R Markdown to type up this lab report. If you do not already know how to use R Markdown, here is a very basic R Markdown template: https://sta-360-602l-su21.github.io/Course-Website/labs/resources/LabReport.Rmd. Refer to the resources tab of the course website (here: https://sta-360-602l-su21.github.io/Course-Website/resources/) for links to help you learn how to use R Markdown.
You MUST submit both your .Rmd and .pdf files to the course site on Gradescope here: https://www.gradescope.com/courses/190490/assignments. Make sure to knit to pdf and not html; ask the TA about knitting to pdf if you cannot figure it out. Be sure to submit under the right assignment entry.
You will need the following R packages. If you do not already have them installed, please install them first using the install.packages function.
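As a minimal sketch, a one-time installation of both packages could look like the following. It assumes a working internet connection and a default CRAN mirror; run it once in the console rather than inside your .Rmd, so that knitting does not reinstall the packages each time.

```r
# One-time setup: install the two packages this lab uses.
# Run in the console, not in a knitted chunk.
install.packages(c("mvtnorm", "coda"))
```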
library(mvtnorm)
library(coda)
One of the problems with Gibbs sampling is that it moves very slowly when posterior variables are highly correlated. This lab explores this issue. Suppose you fit a Bayesian model to a set of data and your posterior includes three variables \(X\), \(Y\), and \(Z\) whose joint posterior \(\pi(X, Y, Z)\) is multivariate normal with \[\begin{eqnarray*} \begin{pmatrix}X\\ Y\\ Z \end{pmatrix} & \sim & \mathcal{N}_3\left[\boldsymbol{\theta} = \left(\begin{array}{c} 0\\ 0\\ 0 \end{array}\right),\Sigma = \left(\begin{array}{ccc} 1 & 0.9 & 0.1 \\ 0.9 & 1 & 0.1 \\ 0.1 & 0.1 & 1 \end{array}\right)\right]. \end{eqnarray*}\]
For deriving the conditional distributions needed in the questions below, you should refer to the class slides on the form of conditional normal distributions for any given multivariate normal distribution.
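For reference, the standard result for conditional normal distributions can be stated as follows. The partition notation (\(W_a\), \(W_b\), and the subscripted blocks) is introduced here only for illustration and may differ from the notation in the class slides. Suppose a multivariate normal vector is split into two blocks, \[\begin{pmatrix}W_a\\ W_b \end{pmatrix} \sim \mathcal{N}\left[\begin{pmatrix}\boldsymbol{\theta}_a\\ \boldsymbol{\theta}_b \end{pmatrix}, \begin{pmatrix}\Sigma_{aa} & \Sigma_{ab}\\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}\right].\] Then the conditional distribution of \(W_a\) given \(W_b = w_b\) is again normal, \[W_a \mid W_b = w_b \sim \mathcal{N}\left[\boldsymbol{\theta}_a + \Sigma_{ab}\Sigma_{bb}^{-1}(w_b - \boldsymbol{\theta}_b),\ \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}\right].\] Each conditional asked for below is an instance of this result with a particular choice of which variables play the role of \(W_a\) and which play the role of \(W_b\).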
1. Given the multivariate normal distribution above, what are the posterior complete conditionals for \(X\), \(Y\), and \(Z\)? That is, derive \(\pi(X | Y,Z)\), \(\pi(Y | X,Z)\), and \(\pi(Z | X,Y)\). Note that you should have three univariate normal distributions.
2. Write a Gibbs sampler that alternates updating each of the variables. You can set the initial values for all three variables to 0 and the number of MCMC samples to 1,000. Provide a trace plot and an autocorrelation plot of the draws for either \(X\) or \(Y\). Comment on the plots.
3. One option for dealing with this high correlation is doing block updates, where multiple variables are updated at once. Give the conditional distributions for \((X,Y) | Z\) and \(Z|(X,Y)\). Note that you should have one bivariate normal distribution and one univariate normal distribution.
4. Write a Gibbs sampler using the conditional distributions in Exercise 3 above, where \(X\) and \(Y\) are updated together (using a random draw from a bivariate normal), alternating with \(Z\) being updated. You can once again set the initial values for all three variables to 0 and the number of MCMC samples to 1,000. Provide a trace plot and an autocorrelation plot of the draws for either \(X\) or \(Y\), and comment on the plots.
5. Comment on the difference between the performance of the two Gibbs samplers. Why is the second more efficient?
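If you have not used coda or mvtnorm before, the sketch below shows the general workflow on a deliberately simpler toy problem: a standard bivariate normal with correlation 0.9, not the three-variable posterior in this lab. The variable names (U, V, draws, joint_draws) and the seed are arbitrary choices made for illustration. It runs a one-variable-at-a-time Gibbs sampler, produces a trace plot and an autocorrelation plot, and then draws both variables jointly with rmvnorm for comparison.

```r
library(mvtnorm)
library(coda)

set.seed(360)    # arbitrary seed, for reproducibility only
rho    <- 0.9    # toy correlation; NOT the lab's 3 x 3 covariance matrix
n_iter <- 1000

# Storage for the Gibbs draws, one row per iteration
draws <- matrix(NA_real_, nrow = n_iter, ncol = 2,
                dimnames = list(NULL, c("U", "V")))
u <- 0
v <- 0           # initial values

for (s in seq_len(n_iter)) {
  # Full conditionals of a standard bivariate normal with correlation rho:
  # U | V = v ~ N(rho * v, 1 - rho^2), and symmetrically for V | U = u.
  u <- rnorm(1, mean = rho * v, sd = sqrt(1 - rho^2))
  v <- rnorm(1, mean = rho * u, sd = sqrt(1 - rho^2))
  draws[s, ] <- c(u, v)
}

# Wrap the draws as coda objects and inspect the chain for U
chain   <- mcmc(draws)
chain_u <- mcmc(draws[, "U"])
traceplot(chain_u)        # trace plot
autocorr.plot(chain_u)    # autocorrelation plot
effectiveSize(chain)      # effective sample size for U and V

# For comparison: draw (U, V) jointly in a single step with rmvnorm
joint_draws <- rmvnorm(n_iter, mean = c(0, 0),
                       sigma = matrix(c(1, rho, rho, 1), nrow = 2))
```

The same three coda calls (traceplot, autocorr.plot, effectiveSize) apply unchanged to the samplers you write for the exercises; only the sampling loop differs.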
10 points: 2 points for each question.