Instructions

Please make sure your final output file is a pdf document. You can submit handwritten solutions for non-programming exercises or type them using R Markdown, LaTeX or any other word processor. All programming exercises must be done in R, typed up clearly and with all code attached. Submissions should be made on gradescope: go to Assignments \(\rightarrow\) Homework 5 (360) for students in STA 360 and Homework 5 (602) for students in STA 602.

Questions

  1. Continuation of the swimming data from class. Recall the problem from class on swimming times. Download the data here: http://www2.stat.duke.edu/~pdh10/FCBS/Exercises/swim.dat. You can also create the data by manually typing it into R from the link or from the video for Module 6.6.

    The file contains data on the amount of time in seconds it takes each of 4 high school swimmers to swim 50 yards. There are 6 times for each student, taken every two weeks. That is, each swimmer has six measurements at \(W = 2, 4, 6, 8, 10, 12\) weeks. Each row corresponds to a swimmer and a higher column index indicates a later date. Assume again that the model for each swimmer is \[T_{i} = \beta_0 + \beta_1 (W_i - \bar{W}) + \epsilon_i,\] where \(T_i\) represents the swimming times and \(\epsilon_i \sim \mathcal{N}(0,\sigma^2)\).

    • Part (a): Using the g-prior with g = n = 6, generate samples/realizations from the prior predictive distribution for a single swimmer over the 12 weeks \((W = 2, 4, 6, 8, 10, 12)\) and create a density plot of the predictive draws (one for each \(W\)). Are the values plausible?
      (3 points for all students)
    • Part (b): Using the data, and the g-prior with g = n = 6 for each swimmer, give the posterior distributions of \(\beta_0\), \(\beta_1\) and \(\sigma^2\) for each swimmer.
      (Students in STA 360: 3 points; Students in STA 602: 2 points)
    • Part (c): For each swimmer \(j\), plot their posterior predictive distributions for a future time \(T^\star\) two weeks after the last recorded observation (overlay the 4 densities in a single plot).
      (Students in STA 360: 3 points; Students in STA 602: 2 points)
    • Part (d): The coach of the team has to recommend which of the swimmers to compete in a swim meet in two weeks time. Using draws from the predictive distributions, compute \(P(Y_j^\star = \text{max}(Y_1^\star,Y_2^\star,Y_3^\star,Y_4^\star))\) for each swimmer \(j\), and based on this make a recommendation to the coach.
      (1 point for all students)
  2. Hoff 9.2.

    • Part (a): 2 points for all students
    • Part (b): 2 points for all students

    You must write your own sampler for part (a). For part (b), you don’t need to write your own Gibbs sampler. Just follow our approach from class and use the packages. Be sure to compare your results to the results from part (a). You can find the data file azdiabetes.dat mentioned in the question here: http://www2.stat.duke.edu/~pdh10/FCBS/Exercises/.

  3. Metropolis-Hastings. Consider the following sampling model: \[y_1, \ldots, y_n | \theta_1, \theta_2 \sim p(y | \theta_1, \theta_2),\] with the priors on \(\theta_1\) and \(\theta_2\) set as \(\pi_1(\theta_1)\) and \(\pi_2(\theta_2)\) respectively, where \(\theta_1, \theta_2 \in \mathbb{R}\).

    Suppose we are interested in generating random samples from the posterior distribution \(\pi(\theta_1, \theta_2 | y_1, \ldots, y_n)\). For each of the following proposal distributions, write down the acceptance ratio for using Metropolis-Hastings to generate the samples we desire. Make sure to simplify the ratios as much as possible for each proposal! Also, comment on whether or not the resulting acceptance ratios are intuitive, that is, what do the resulting acceptance ratios tell about the different samplers.

    • Part (a): Full conditionals: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = p(\theta_1^\star | y_1, \ldots, y_n, \theta_2^{(s)});\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = p(\theta_2^\star | y_1, \ldots, y_n, \theta_1^{(s)}). \end{split} \]
      (2 points for all students)
    • Part (b): Priors: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \pi_1(\theta_1^\star);\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \pi_2(\theta_2^\star). \end{split} \]
      (2 points for all students)
    • Part (c): Random walk: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \mathcal{N}(\theta_1^{(s)}, \delta^2);\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \mathcal{N}(\theta_2^{(s)}, \delta^2). \end{split} \]
      (2 points for all students)

    In each case, you only need to spend time working through the acceptance ratio for one of the two parameters. The other one should become obvious once you’ve completed the first.

  4. Continuation of the biased coin problem from class. Suppose we have univariate data \(y_1, \ldots, y_n | \theta \sim \text{Bernoulli}(\theta)\) and wish to test: \(\mathcal{H}_0: \theta = 0.5 \ \ \text{vs. } \mathcal{H}_1: \theta \neq 0.5\). We already derived the Bayes factors in class, so you should leverage that.

    Study the asymptotic behavior of the Bayes factor in favor of \(\mathcal{H}_1\). For \(\theta \in \{0.25, 0.46, 0.5, 0.54\}\), make a plot of the Bayes factor (y-axis) against sample size (x-axis). You should have four plots. Comment (in detail) on the implications of the true value of \(\theta\) on the behavior of the Bayes factor in favor of \(\mathcal{H}_1\), as a function of sample size. When answering this question, remind yourself of what Bayes factors actually mean and represent! One line answers will not be sufficient here; explain clearly and in detail what you think the plots mean or represent.
    (Students in STA 360: you can attempt for practice but you don’t need to submit; Students in STA 602: 2 points)

    When you are making the plots, generate the data sequentially. That is, the way the data will be generated as Bernoulli outcomes 0 or 1 for each \(y_i\) in a real experiment. For example, for \(n = 1\), sample \(y_1 = 0\) or \(y_1 = 1\). For \(n = 2\), sample \(y_2 = 0\) or \(y_2 = 1\) and add it to the existing \(y_1\). For \(n = 3\), sample \(y_3 = 0\) or \(y_3 = 1\) and add it to the existing \(y_1\) and \(y_2\). Keep doing that for a sequence of \(n\) values, as \(n\) gets very large, until you see a very discernible pattern in your plots.

Grading

20 points.