Partial Solutions to Homework 04

Exercise B: Practice with moment generating functions (mgf)

In this exercise, you will be looking into a different argument to show that \[W_1=\frac{\left(n-1\right)S^2}{\sigma^2}\sim \chi^2_{n-1}.\] Assume that \(Y_1,\ldots,Y_n\) are IID \(N\left(\mu,\sigma^2\right)\), where both \(\mu\) and \(\sigma^2\) are unknown.

  1. (1 point) Show first that \[\sum_{i=1}^n \left(\frac{Y_i-\mu}{\sigma}\right)^2= \frac{\left(n-1\right)S^2}{\sigma^2} + \left(\frac{\overline{Y}-\mu}{\sigma/\sqrt{n}}\right)^2 \]

ANSWER: Skipped here, as this is a straightforward and recurring derivation.

  1. (2 points) Let \(W=\displaystyle\sum_{i=1}^n \left(\frac{Y_i-\mu}{\sigma}\right)^2\). What is the distribution of \(W\)? Write down your argument to support your answer.

ANSWER: Since \(Y_1,\ldots,Y_n\) are IID \(N\left(\mu,\sigma^2\right)\), we can conclude that \(\dfrac{Y_1-\mu}{\sigma}, \ldots, \dfrac{Y_n-\mu}{\sigma}\) are IID \(N\left(0,1\right)\). Because \(W\) is a sum of squares of independent standard normal random variables, \(W\sim\chi^2_n\) by LM Definition 7.3.1.

  1. (2 points) Let \(W_2=\displaystyle\left(\frac{\overline{Y}-\mu}{\sigma/\sqrt{n}}\right)^2\). What is the distribution of \(W_2\)? Write down your argument to support your answer.

ANSWER: Since \(Y_1,\ldots,Y_n\) are IID \(N\left(\mu,\sigma^2\right)\), we can conclude that \(\dfrac{\overline{Y}-\mu}{\sigma/\sqrt{n}}\sim N\left(0,1\right)\). Because \(W_2\) is the square of a standard normal random variable, \(W_2\sim\chi^2_1\) by LM Definition 7.3.1.

  1. (3 points) Use LM Theorem 4.6.5 to compute the mgfs for \(W\) and \(W_2\). Are there restrictions on the domain of these functions?

ANSWER: Given Items 2 and 3, and the fact that the pdf of a gamma distribution with \(r=m/2\) and \(\lambda=1/2\) matches the pdf of a \(\chi^2_m\) distribution, we can use LM Theorem 4.6.5 to conclude that \(M_W\left(t\right)=\left(1-2t\right)^{-n/2}\) and \(M_{W_2}\left(t\right)=\left(1-2t\right)^{-1/2}\). Yes, there is a restriction: for these mgfs to be well-defined, \(t\) has to be less than \(1/2\) (refer to LM Definition 3.12.1).
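
As a quick numerical sanity check (not required for the homework), we can compare the closed form for \(M_W\left(t\right)\) with a Monte Carlo estimate of \(\mathbb{E}\left(e^{tW}\right)\); the choices \(n=5\) and \(t=0.2\) below are arbitrary, as long as \(t<1/2\).

# Sketch: Monte Carlo check of the mgf of W
set.seed(1)
n <- 5
t <- 0.2
# Draws of W, each a sum of n squared standard normals
w <- replicate(10^5, sum(rnorm(n)^2))
# The two numbers should be close
c(closed.form = (1 - 2*t)^(-n/2), monte.carlo = mean(exp(t*w)))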

  1. (3 points) Can you apply LM Theorem 3.12.3 directly to the mgf for \(W\) and use it to indirectly recover the mgf for \(W_1\)? If it is possible to apply directly, write down your argument to obtain the mgf for \(W_1\). If it is not possible to apply directly, write down what conditions have to be satisfied first in order to use LM Theorem 3.12.3. Do you think these conditions are satisfied? After that, write down your argument to obtain the mgf for \(W_1\).

ANSWER: We can apply LM Theorem 3.12.3b provided that \(W_1\) and \(W_2\) are independent. This condition is satisfied because \(W_1\) is a function of \(S^2\), \(W_2\) is a function of \(\overline{Y}\), and LM Theorem 7.3.2 states that \(\overline{Y}\) and \(S^2\) are independent whenever \(Y_1,\ldots,Y_n\) are IID \(N\left(\mu,\sigma^2\right)\). By LM Theorem 3.12.3b and Item 1, \(M_W\left(t\right)=M_{W_1}\left(t\right)M_{W_2}\left(t\right)\), so the mgf of \(W_1\) is obtained as follows: \[M_{W_1}\left(t\right)=\frac{M_W\left(t\right)}{M_{W_2}\left(t\right)}=\left(1-2t\right)^{-\left(n-1\right)/2}\]

  1. (2 points) Now that you have obtained the mgf for \(W_1\) in Item 5, can you identify the distribution of \(W_1\)? Cite the theorem in LM that you have to apply to obtain your finding.

ANSWER: Yes, by LM Theorem 3.12.2, the distribution of \(W_1\) is \(\chi^2_{n-1}\).

  1. (2 points) Use the mgf in Item 5 to obtain the mean and variance of \(W_1\).

ANSWER: By LM Theorem 3.12.1, we need derivatives of the mgf. Observe that \(M_{W_1}^\prime\left(t\right)=\left(n-1\right)\left(1-2t\right)^{-\left(n+1\right)/2}\) and \(M_{W_1}^{\prime\prime}\left(t\right)=\left(n-1\right)\left(n+1\right)\left(1-2t\right)^{-\left(n+3\right)/2}\). Therefore, \(\mathbb{E}\left(W_1\right)=M_{W_1}^\prime\left(0\right)=n-1\), \(\mathbb{E}\left(W_1^2\right)=M_{W_1}^{\prime\prime}\left(0\right)=\left(n-1\right)\left(n+1\right)\), and \(\mathsf{Var}\left(W_1\right)=\mathbb{E}\left(W_1^2\right)-\left[\mathbb{E}\left(W_1\right)\right]^2=\left(n-1\right)\left(n+1\right)-\left(n-1\right)^2=2\left(n-1\right)\).
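
If you want to double-check the calculus, you can differentiate \(M_{W_1}\) numerically at \(t=0\); the sketch below uses central differences with the arbitrary choice \(n=10\), for which the mean should be \(n-1=9\) and the variance \(2\left(n-1\right)=18\).

# Sketch: numerical derivatives of the mgf of W1 at t = 0
n <- 10
M <- function(t) (1 - 2*t)^(-(n - 1)/2)
h <- 10^-5
# Central differences approximate the first and second derivatives
m1 <- (M(h) - M(-h))/(2*h)           # approximates E(W1) = n - 1
m2 <- (M(h) - 2*M(0) + M(-h))/h^2    # approximates E(W1^2) = (n - 1)(n + 1)
c(mean = m1, variance = m2 - m1^2)   # expect 9 and 18 when n = 10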

  1. (2 points) Use Item 7 to find the mean and variance of \(S^2\).

ANSWER: Based on Item 7, \(\mathbb{E}\left(W_1\right)=\mathbb{E}\left(\dfrac{\left(n-1\right)S^2}{\sigma^2}\right)=n-1\). Therefore, \(\mathbb{E}\left(S^2\right)=\left(n-1\right)\dfrac{\sigma^2}{\left(n-1\right)}=\sigma^2\).

Next, we have \(\mathsf{Var}\left(\dfrac{\left(n-1\right)S^2}{\sigma^2}\right)=2\left(n-1\right)\). Therefore, \(\mathsf{Var}\left(S^2\right)=2\dfrac{\sigma^4}{\left(n-1\right)^2}\left(n-1\right)=\dfrac{2\sigma^4}{n-1}\).
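
These two moments are easy to confirm by simulation; the sketch below uses the arbitrary choices \(n=10\), \(\mu=1\), and \(\sigma^2=4\).

# Sketch: simulate S^2 repeatedly and compare its moments with the theory
set.seed(1)
n <- 10
sigma2 <- 4
s2 <- replicate(10^5, var(rnorm(n, mean = 1, sd = sqrt(sigma2))))
c(mean.s2 = mean(s2), theory = sigma2)            # expect about 4
c(var.s2 = var(s2), theory = 2*sigma2^2/(n - 1))  # expect about 3.56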

  1. (4 points) I already showed in class how to work out the details of maximum likelihood estimation of \(\mu\) and \(\sigma^2\) in the IID \(N\left(\mu,\sigma^2\right)\). Revisit those details and determine whether \(S^2\) is an efficient estimator of \(\sigma^2\).

ANSWER: I do not repeat the calculations involved in obtaining the Fisher information (which is a \(2\times 2\) matrix now). Refer to our notes. In this nice case, you can read off the CRLB for \(\sigma^2\) from the second row, second column entry of the inverse of the Fisher information matrix, which is \(\dfrac{2\sigma^4}{n}\). \(S^2\) happens to be an unbiased estimator of \(\sigma^2\) with variance obtained in Item 8. Since \(\dfrac{2\sigma^4}{n-1} > \dfrac{2\sigma^4}{n}\), where the latter is the CRLB for \(\sigma^2\), we must conclude that \(S^2\) is not an efficient estimator of \(\sigma^2\).
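
To quantify how close \(S^2\) is to efficiency, note that the ratio of the CRLB to \(\mathsf{Var}\left(S^2\right)\) is \(\dfrac{2\sigma^4/n}{2\sigma^4/\left(n-1\right)}=\dfrac{n-1}{n}\), which is below 1 for every finite \(n\) but approaches 1 as \(n\) grows. A small sketch tabulating this ratio:

# Sketch: efficiency of S^2 relative to the CRLB, (n - 1)/n
n <- c(5, 10, 50, 1000)
data.frame(n = n, efficiency = (n - 1)/n)  # approaches 1 as n grows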

Exercise C: Confidence intervals for \(\sigma^2\)

You will be using a Monte Carlo simulation to evaluate the performance of two different \(100\left(1-\alpha\right)\%\) confidence intervals for \(\sigma^2\).

  1. (3 points) Use the finding in Item 6 in Exercise B to construct a 90% conservative confidence interval for \(\sigma^2\) based on Chebyshev’s inequality. Make sure to provide expressions for \(L\) and \(U\) and show that \(\mathbb{P}\left(L\leq \sigma^2 \leq U\right) \geq 0.9\) is satisfied.

ANSWER: Using the mean and variance from Item 8 along with Chebyshev’s inequality, setting \(c^2=10\) so that \(1-1/c^2=0.9\), we will have \[\mathbb{P}\left(|S^2-\sigma^2|\leq c\sqrt{\frac{2\sigma^4}{n-1}}\right)\geq 0.9\] Rearranging the inequality inside the probability statement, we will have \[\mathbb{P}\left(S^2-c\sqrt{\frac{2\sigma^4}{n-1}} \leq \sigma^2 \leq S^2+c\sqrt{\frac{2\sigma^4}{n-1}}\right) \geq 0.9\] But there is a problem with this interval, as it depends on the unknown \(\sigma^2\). We cannot just replace \(\sigma^2\) with \(S^2\), as the inequality will no longer hold.

One approach would be to write \[|S^2-\sigma^2|\leq c\sqrt{\frac{2\sigma^4}{n-1}} \Rightarrow 0\leq S^2\leq \sigma^2\left(1+c\sqrt{\frac{2}{n-1}}\right).\] Therefore, \[\mathbb{P}\left(\dfrac{S^2}{1+c\sqrt{\dfrac{2}{n-1}}}\leq \sigma^2 < \infty \right)\geq 0.9,\] so that \(L=\dfrac{S^2}{1+c\sqrt{2/\left(n-1\right)}}\) and \(U=\infty\).
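
To get a sense of how conservative this lower bound is, the sketch below evaluates the shrinkage factor \(1/\left(1+c\sqrt{2/\left(n-1\right)}\right)\) for \(n=10\).

# Sketch: the Chebyshev lower bound is S^2 times a fixed shrinkage factor
n <- 10
c <- sqrt(10)
1/(1 + c*sqrt(2/(n - 1)))  # about 0.40, so L is roughly 0.40 * S^2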

  1. (3 points) Exercise B made you go through the argument to show that \(\displaystyle\frac{\left(n-1\right)S^2}{\sigma^2}\sim \chi^2_{n-1}\). Use this to construct an exact 90% confidence interval for \(\sigma^2\). Make sure to provide expressions for \(L\) and \(U\) and show that \(\mathbb{P}\left(L\leq \sigma^2 \leq U\right) = 0.9\) is satisfied.

ANSWER: This essentially forces you to prove LM Theorem 7.5.1a. Since \(\displaystyle\frac{\left(n-1\right)S^2}{\sigma^2}\sim \chi^2_{n-1}\), we can find \(c_1\) and \(c_2\) such that \[\mathbb{P}\left(c_1\leq \frac{\left(n-1\right)S^2}{\sigma^2}\leq c_2\right)=0.9\] Here we can set \(c_1\) and \(c_2\) such that \(\mathbb{P}\left(\chi^2_{n-1}\leq c_1\right)=0.05\) and \(\mathbb{P}\left(\chi^2_{n-1}\geq c_2\right)=0.05\). Based on Table A.3 and the notation in the textbook, \(c_1=\chi^2_{0.05, n-1}\) and \(c_2=\chi^2_{0.95, n-1}\). Next, dividing through by \(\left(n-1\right)S^2\) and taking reciprocals (which reverses the inequalities), we have \[\chi^2_{0.05, n-1}\leq \frac{\left(n-1\right)S^2}{\sigma^2}\leq \chi^2_{0.95, n-1} \Rightarrow \frac{\left(n-1\right)S^2}{\chi^2_{0.05, n-1}} \geq \sigma^2 \geq \frac{\left(n-1\right)S^2}{\chi^2_{0.95, n-1}}\] Therefore, \[L=\frac{\left(n-1\right)S^2}{\chi^2_{0.95, n-1}},\ \ U=\frac{\left(n-1\right)S^2}{\chi^2_{0.05, n-1}}\]
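
As a quick illustration (with the arbitrary choice \(n=10\)), the cutoffs can be computed with qchisq() and the coverage verified with pchisq():

# Sketch: chi-square cutoffs for a 90% interval when n = 10
n <- 10
c1 <- qchisq(0.05, df = n - 1)
c2 <- qchisq(0.95, df = n - 1)
# The last entry checks that the middle area is 0.9
c(c1 = c1, c2 = c2, prob = pchisq(c2, df = n - 1) - pchisq(c1, df = n - 1))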

  1. (3 points) Use the intervals constructed in Items 1 and 2 to write R code to conduct a Monte Carlo simulation where you draw random samples of size \(n=10\) from \(N\left(1,4\right)\) and then calculate the two confidence intervals obtained in Items 1 and 2. Compare the coverage properties and the average lengths of the two intervals. Show your code and discuss your findings. You may find the command qchisq() useful.

ANSWER: Below you may find the code for this Monte Carlo simulation. The length of the confidence interval based on Chebyshev’s inequality cannot be calculated because it has no upper bound, so a comparison using average length is not possible.

The coverage rates are compatible with the stated guarantees, because the data are normal and the conditions behind both intervals are satisfied: the Chebyshev interval covers about 99% of the time (at least 90% is guaranteed, since it is conservative), while the interval based on the pivotal quantity covers about 90% of the time, as an exact interval should.

# Reuse code from HW03
# the c for Chebyshev
c <- sqrt(10)
cons.ci <- function(n)
{
  # Random draws from N(1,4)
  y <- rnorm(n, 1, 2)
  # Compute sample variance
  s.var <- var(y)
  # Chebyshev lower bound
  lb.chebyshev <- s.var/(1+c*sqrt(2)/sqrt(n-1))
  # CI based on pivotal
  # have to make adjustments here to match the notation in textbook
  ci.pivotal <- c((n-1)*s.var/qchisq(0.95, n-1, lower.tail = TRUE), (n-1)*s.var/qchisq(0.05, n-1, lower.tail = TRUE))
  # Collect three numbers: the Chebyshev lower bound and
  # the lower and upper limits of the pivotal interval
  return(c(lb.chebyshev, ci.pivotal))
}
# Set the sample size for yourself 
n <- 10
# Construct interval 10000 times
results <- replicate(10^4, cons.ci(n))
# Calculate rate of capture
# true sigma squared is 4!
mean(results[1, ] < 4)
[1] 0.9925
mean(results[2, ] < 4 & 4 < results[3, ])
[1] 0.8991

  1. (3 points) Modify the R code in Item 3, keeping the population mean equal to 1 and the population variance equal to 4 but making the shape of the distribution non-normal. You are free to choose any distribution that satisfies these conditions, provided that you can draw random samples from it in R. Compare the coverage properties and the average lengths of the two intervals. Show your code and discuss your findings. You may find the command qchisq() useful.

ANSWER: The length of the confidence interval based on Chebyshev’s inequality cannot be calculated because it has no upper bound. In this situation, a comparison using average length is not possible.

It might be a bit difficult to find a distribution which works, but you have many to choose from and to try. One choice that works is \(\mathsf{U}\left(1-2\sqrt{3},1+2\sqrt{3}\right)\), because its mean is \(1\) and its variance is \(\left(4\sqrt{3}\right)^2/12=4\).

The coverage rate for the interval based on Chebyshev’s inequality is still compatible with the stated guarantee, even though the conditions behind the guarantee are no longer satisfied. The coverage rate for the interval based on the pivotal quantity is higher than expected: we should see a value close to 90%, but the simulation gives about 98%.

c <- sqrt(10)
cons.ci <- function(n)
{
  # Random draws from specified uniform distribution
  y <- runif(n, min = 1 - 2*sqrt(3), max = 1 + 2*sqrt(3))
  # Compute sample variance
  s.var <- var(y)
  # Chebyshev lower bound
  lb.chebyshev <- s.var/(1+c*sqrt(2)/sqrt(n-1))
  # CI based on pivotal
  # have to make adjustments here to match the notation in textbook
  ci.pivotal <- c((n-1)*s.var/qchisq(0.95, n-1, lower.tail = TRUE), (n-1)*s.var/qchisq(0.05, n-1, lower.tail = TRUE))
  # Collect three numbers: the Chebyshev lower bound and
  # the lower and upper limits of the pivotal interval
  return(c(lb.chebyshev, ci.pivotal))
}
# Set the sample size for yourself 
n <- 10
# Construct interval 10000 times
results <- replicate(10^4, cons.ci(n))
# Calculate rate of capture
# true sigma squared is 4!
mean(results[1, ] < 4)
[1] 1
mean(results[2, ] < 4 & 4 < results[3, ])
[1] 0.979

  1. (1 point) Will increasing the sample size from \(n=10\) to \(n=1000\) help in Item 4? Discuss your findings.

ANSWER: Below you will find the code based on Item 4, but this time with \(n=1000\). Notice that the target coverage rate of 90% is not attained even for a large sample size. Both types of confidence intervals were constructed to be valid for finite \(n\) under normality. Although Chebyshev’s inequality itself holds for any distribution with a finite variance, the mean and variance of \(S^2\) on which the interval relies were derived under normality; likewise, the pivotal quantity \(\left(n-1\right)S^2/\sigma^2\) no longer has a \(\chi^2_{n-1}\) distribution. Therefore, you will see the performance of the confidence intervals deteriorating rather than improving.

c <- sqrt(10)
cons.ci <- function(n)
{
  # Random draws from specified uniform distribution
  y <- runif(n, min = 1 - 2*sqrt(3), max = 1 + 2*sqrt(3))
  # Compute sample variance
  s.var <- var(y)
  # Chebyshev lower bound
  lb.chebyshev <- s.var/(1+c*sqrt(2)/sqrt(n-1))
  # CI based on pivotal
  # have to make adjustments here to match the notation in textbook
  ci.pivotal <- c((n-1)*s.var/qchisq(0.95, n-1, lower.tail = TRUE), (n-1)*s.var/qchisq(0.05, n-1, lower.tail = TRUE))
  # Collect three numbers: the Chebyshev lower bound and
  # the lower and upper limits of the pivotal interval
  return(c(lb.chebyshev, ci.pivotal))
}
# Set the sample size for yourself 
n <- 1000
# Construct interval 10000 times
results <- replicate(10^4, cons.ci(n))
# Calculate rate of capture
# true sigma squared is 4!
mean(results[1, ] < 4)
[1] 0.6868
mean(results[2, ] < 4 & 4 < results[3, ])
[1] 0.1853

Exercise D: The connection between \(\chi^2\) and \(F\) distributions in large samples

Let \(V \sim \chi^2_m\), \(U\sim\chi^2_n\), and \(V\) and \(U\) are independent. From LM Theorem 7.3.3, you already know that \(F=\left(V/m\right)/\left(U/n\right)\) has an \(F\) distribution with \(m\) numerator degrees of freedom and \(n\) denominator degrees of freedom. Your task is to apply the asymptotic tools to show that \[m\times F =m\times \dfrac{V/m}{U/n}= V \times \frac{1}{(U/n)} \overset{d}{\to} \chi^2_m.\]

  1. (2 points) It is possible to express \(U/n\) as a sample average of IID random variables. Show how to do this. What is the common distribution of these random variables?

ANSWER: Since \(U\sim\chi^2_n\), it can be represented as a sum of squares of \(n\) independent standard normal random variables, i.e., \(\displaystyle U=\sum_{i=1}^n Z_i^2\) where \(Z_1,\ldots,Z_n\) are independent \(N(0,1)\) random variables (refer to LM Definition 7.3.1). Dividing by \(n\) gives \[\frac{U}{n}=\frac{1}{n}\sum_{i=1}^n Z_i^2,\] which is a sample average of IID random variables. The common distribution of each \(Z_i^2\) is \(\chi^2_1\).

  1. (2 points) Apply the law of large numbers to \(U/n\) and show that it converges to a known constant. Make sure to determine the value of this constant and how you got this value.

ANSWER: The law of large numbers applies to the sequence of IID \(\chi^2_1\) random variables \(Z_1^2,Z_2^2, \ldots\). By the law of large numbers, \[\frac{U}{n}=\frac{1}{n}\sum_{i=1}^nZ_i^2\overset{p}{\to} \mathbb{E}\left(Z_i^2\right)\] as \(n\to\infty\). Since \(Z_i^2\sim\chi^2_1\), the same calculation as in Item 7 of Exercise B (the mean of a \(\chi^2_m\) random variable is \(m\)) gives \(\mathbb{E}\left(Z_i^2\right)=1\).
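
A quick numerical illustration of this convergence (the sample sizes below are arbitrary):

# Sketch: U/n, a sample average of squared standard normals, approaches 1
set.seed(1)
sapply(c(10, 10^2, 10^4, 10^6), function(n) mean(rnorm(n)^2))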

  1. (3 points) Show how to apply Theorem 1 to obtain the desired result. Make sure to write a complete argument and cite which parts of Theorem 1 you have used.

ANSWER: The given information already provides the key steps: \[m\times F =m\times \dfrac{V/m}{U/n}= V \times \frac{1}{(U/n)}.\] By Item 2, \(U/n\overset{p}{\to}1\). Based on Theorem 1 Item 5, \(\dfrac{1}{(U/n)}\overset{p}{\to}1\) as \(n\to\infty\). In addition, \(V\sim \chi^2_m\) whether or not \(n\to\infty\). Therefore, \(V\overset{d}{\to}\chi^2_m\) as \(n\to\infty\). By Theorem 1 Item 8 (or Item 9, depending on how you set up your argument), \(m\times F\overset{d}{\to}\chi^2_m\) as \(n\to\infty\).
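
The convergence in distribution can also be checked numerically; the sketch below (with the arbitrary choices \(m=3\) and \(n=10^4\)) compares empirical quantiles of \(m\times F\) with the corresponding \(\chi^2_m\) quantiles.

# Sketch: for large n, m*F with (m, n) df behaves like chi-square with m df
set.seed(1)
m <- 3
n <- 10^4
x <- m*rf(10^5, df1 = m, df2 = n)
p <- c(0.1, 0.5, 0.9)
# The two rows should be close
rbind(m.times.F = quantile(x, p), chi.square = qchisq(p, df = m))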

Exercise E: Confidence intervals for \(\sigma^2_X/\sigma^2_Y\)

Suppose you have \(X_1,X_2,\ldots,X_m\) are IID \(N\left(\mu_X,\sigma^2_X\right)\) and \(Y_1,Y_2,\ldots,Y_n\) are IID \(N\left(\mu_Y,\sigma^2_Y\right)\). Assume that these two sets of random samples are independent of each other.

  1. (2 points) What is the distribution of \(\dfrac{\left(m-1\right)S^2_X}{\sigma^2_X}\)? Show your work and cite the results needed to support your argument.

ANSWER: From the given information, \(X_1,X_2,\ldots,X_m\) are IID \(N\left(\mu_X,\sigma^2_X\right)\). By LM Theorem 7.3.2, \(\dfrac{\left(m-1\right)S^2_X}{\sigma^2_X}\sim \chi_{m-1}^2\).

  1. (2 points) What is the distribution of \(\dfrac{\left(n-1\right)S^2_Y}{\sigma^2_Y}\)? Show your work and cite the results needed to support your argument.

ANSWER: From the given information, \(Y_1,Y_2,\ldots,Y_n\) are IID \(N\left(\mu_Y,\sigma^2_Y\right)\). By LM Theorem 7.3.2, \(\dfrac{\left(n-1\right)S^2_Y}{\sigma^2_Y}\sim \chi_{n-1}^2\).

  1. (2 points) Why are \(\dfrac{\left(m-1\right)S^2_X}{\sigma^2_X}\) and \(\dfrac{\left(n-1\right)S^2_Y}{\sigma^2_Y}\) independent random variables? Show your work and cite the results needed to support your argument.

ANSWER: Since \(X_1,X_2,\ldots,X_m\) and \(Y_1,Y_2,\ldots,Y_n\) are independent samples, and since \(S^2_X\) is a function of the \(X\)s alone while \(S^2_Y\) is a function of the \(Y\)s alone, \(S^2_X\) and \(S^2_Y\) are independent of each other. Therefore, \(\dfrac{\left(m-1\right)S^2_X}{\sigma^2_X}\) and \(\dfrac{\left(n-1\right)S^2_Y}{\sigma^2_Y}\) have to be independent.

  1. (3 points) Using the results in Items 1 to 3, propose a pivotal quantity which can be used to construct a \(100\left(1-\alpha\right)\%\) confidence interval for \(\sigma^2_X/\sigma^2_Y\). After proposing this quantity, construct a \(100\left(1-\alpha\right)\%\) confidence interval. Provide the appropriate \(L\) and \(U\).

ANSWER: A pivotal quantity which can be used to construct a \(100\left(1-\alpha\right)\%\) confidence interval for \(\sigma^2_X/\sigma^2_Y\) is \[\dfrac{\dfrac{\left(m-1\right)S^2_X}{\sigma^2_X}/\left(m-1\right)}{\dfrac{\left(n-1\right)S^2_Y}{\sigma^2_Y}/\left(n-1\right)}=\dfrac{S^2_X}{S^2_Y}\dfrac{\sigma^2_Y}{\sigma^2_X}.\] By LM Theorem 7.3.3 and Items 1 to 3, this quantity has an \(F_{m-1,n-1}\) distribution. Therefore, we can find \(c_1\) and \(c_2\) such that \[\mathbb{P}\left(c_1 \leq \dfrac{S^2_X}{S^2_Y}\dfrac{\sigma^2_Y}{\sigma^2_X}\leq c_2\right)=1-\alpha.\] Using the notation in the book and in Table A.4, and keeping the same quantile convention as the \(\chi^2\) cutoffs in Exercise C, set \(c_1=F_{\alpha/2,m-1,n-1}\) and \(c_2=F_{1-\alpha/2,m-1,n-1}\), so that each tail has probability \(\alpha/2\). Solving the inequalities for \(\sigma^2_X/\sigma^2_Y\), we will have \[\mathbb{P}\left(\dfrac{S^2_Y}{S^2_X}F_{\alpha/2,m-1,n-1} \leq \dfrac{\sigma^2_Y}{\sigma^2_X}\leq \dfrac{S^2_Y}{S^2_X} F_{1-\alpha/2,m-1,n-1}\right)=1-\alpha,\] which then implies \[\mathbb{P}\left(\dfrac{S^2_X}{S^2_Y}\dfrac{1}{F_{\alpha/2,m-1,n-1}} \geq \dfrac{\sigma^2_X}{\sigma^2_Y}\geq \dfrac{S^2_X}{S^2_Y} \dfrac{1}{F_{1-\alpha/2,m-1,n-1}}\right)=1-\alpha.\] So, \[L=\dfrac{S^2_X}{S^2_Y} \dfrac{1}{F_{1-\alpha/2,m-1,n-1}}, \ \ U=\dfrac{S^2_X}{S^2_Y}\dfrac{1}{F_{\alpha/2,m-1,n-1}}.\]
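
A minimal R sketch of this interval, using simulated data (the sample sizes, means, and standard deviations below are hypothetical and chosen only for illustration):

# Sketch: 90% confidence interval for sigma_X^2 / sigma_Y^2 via the F pivot
set.seed(1)
x <- rnorm(12, mean = 0, sd = 2)  # m = 12 observations
y <- rnorm(15, mean = 0, sd = 1)  # n = 15 observations
alpha <- 0.10
ratio <- var(x)/var(y)
# qf() returns lower-tail quantiles, matching the convention above
c(L = ratio/qf(1 - alpha/2, df1 = length(x) - 1, df2 = length(y) - 1),
  U = ratio/qf(alpha/2, df1 = length(x) - 1, df2 = length(y) - 1))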

WARNING: You might think this is fundamentally different from the result in LM Theorem 9.5.2, but you should pay attention to the sample sizes and LM Exercises 7.3.11 and 7.3.12. Try to make the solution here match the result found in LM Theorem 9.5.2 (of course, adjust the notation in that theorem accordingly!).