Description
This homework provides further practice with multiple linear regression. Attach the complete R code for Problem 2 at the end of the homework. Total: 90 points.

(25 points) Understanding the general linear regression model. For each of the following models, indicate whether the techniques of multiple linear regression can be used to estimate the coefficients $\beta_i$. Explain. (Here, we assume that all $X_i$'s are nonrandom, and that the $\varepsilon_i$'s in each model are i.i.d. with $E(\varepsilon_i) = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2$.)


$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \log X_{i2} + \beta_3 X_{i1}^2 + \varepsilon_i$.



$Y_i = \log(\beta_1 X_{i1}) + \beta_2 X_{i2} + \varepsilon_i$, where $\beta_1 > 0$ and $X_{i1} > 0$ for all $i$.



$Y_i = \log(\beta_1 + X_{i1}) + \beta_2 X_{i2} + \varepsilon_i$, where $\beta_1 > 0$ and $X_{i1} > 0$ for all $i$.

$Y_i = \exp(\beta_0 + \beta_1 X_{i1}) + \varepsilon_i$.

$Y_i = \exp(\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1} X_{i2} + \varepsilon_i)$.
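As a reminder of the criterion being asked about, and not a solution to any particular item, one common way to state "linear in the parameters" is sketched below. The symbols $g$, $Z_{ik}$, $\beta_k^*$, and $p$ are notation assumed here for illustration; they do not appear in the assignment.

```latex
% A model can be handled by multiple linear regression if it can be rewritten as
g(Y_i) = \beta_0^* + \sum_{k=1}^{p-1} \beta_k^* Z_{ik} + \varepsilon_i ,
\qquad E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2 ,
% where g is a known transformation of the response (possibly involving the
% predictors), each Z_{ik} is a known function of the predictors containing no
% unknown parameters, and each \beta_k^* is a one-to-one function of the
% original parameters.
```

The error term must remain additive with mean zero and constant variance after the rewriting; a transformation that makes the mean function linear but distorts the error structure does not qualify.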


(45 points) Data analysis: general testing framework. Consider the multiple linear regression model: $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4} + \varepsilon_i$, $i = 1, \ldots, n$, $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$.
Describe how you would test (at the 0.05 significance level):

$H_0: \beta_1 = \beta_2 = 0$ vs. $H_a$: either $\beta_1$ or $\beta_2$ is not equal to 0.

$H_0: \beta_1 = 1,\ \beta_2 = 2$ vs. $H_a$: not both equalities in $H_0$ hold.

$H_0: \beta_2 = \beta_3$ vs. $H_a: \beta_2 \neq \beta_3$.
Perform the test for the dataset "HW6Q2.txt".
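For reference, the "general testing framework" in the title is usually taken to mean the full-versus-reduced-model comparison. A sketch of the standard statistic follows, assuming that reading; the notation $SSE(F)$, $SSE(R)$, $df_F$, $df_R$ (error sums of squares and error degrees of freedom of the full and reduced models) is introduced here and is not given in the assignment.

```latex
F^* = \frac{\bigl( SSE(R) - SSE(F) \bigr) / (df_R - df_F)}{SSE(F) / df_F}
\;\sim\; F_{\,df_R - df_F,\; df_F} \quad \text{under } H_0 ,
\qquad df_F = n - 5 .
% Decision rule at level 0.05: reject H_0 when F^* > F(0.95;\, df_R - df_F,\, df_F).
```

The full model is the five-coefficient model stated above; the reduced model is whatever that model becomes once the constraints in $H_0$ are imposed, so $df_R - df_F$ equals the number of independent constraints.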

(20 points) Rigorous derivation. Consider the multiple linear regression model in matrix form $Y = X\beta + \varepsilon$ with $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I_n$. Let $H = X(X^T X)^{-1} X^T$ be the hat matrix. Show that


$H^T = H$ and $H^2 = H$.



The diagonal elements of H are all between 0 and 1.
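Before writing the proof, it can help to check the claimed properties numerically on a small example. The sketch below is a sanity check only, not a proof: it uses a made-up 4-by-2 design matrix (an intercept column plus one hypothetical predictor) and plain Python, and all helper names are illustrative.

```python
# Toy design matrix (hypothetical data): an intercept column and one predictor.
x = [1.0, 2.0, 4.0, 7.0]
X = [[1.0, xi] for xi in x]
n = len(X)

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the closed-form adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hat matrix H = X (X^T X)^{-1} X^T, here a 4x4 matrix.
Xt = transpose(X)
H = matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)
H2 = matmul(H, H)

tol = 1e-9  # tolerance for floating-point comparisons
symmetric = all(abs(H[i][j] - H[j][i]) < tol for i in range(n) for j in range(n))
idempotent = all(abs(H2[i][j] - H[i][j]) < tol for i in range(n) for j in range(n))
diag_in_unit = all(-tol <= H[i][i] <= 1 + tol for i in range(n))
trace = sum(H[i][i] for i in range(n))  # equals the number of columns of X

print(symmetric, idempotent, diag_in_unit, round(trace, 6))
```

A single numerical example cannot establish the result for general $X$, but a failed check would point to an error in how $H$ was formed.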
