Inferences in Linear Regression

#econometrics #economics

Oh, Hyunzi. (email: wisdom302@naver.com)
Korea University, Graduate School of Economics.
2024 Spring, instructed by prof. Kim, Dukpa.


Main References

  • Kim, Dukpa. (2024). "Econometric Analysis" (2024 Spring) ECON 518, Department of Economics, Korea University.
  • Davidson and MacKinnon. (2021). "Econometric Theory and Methods", Oxford University Press, New York.

Normality

Assumption (assumption-Normality).
  • A-N) the errors are normally distributed, i.e. $\varepsilon_i \mid X \sim N(0, \sigma^2)$ independently across $i$, and hence $\varepsilon \mid X \sim N(0, \sigma^2 I_n)$.
Theorem (distribution of fitted value and residual).

Under A1~A5 and A-N, we have the following results:

  1. $\hat{y} \mid X \sim N(X\beta,\ \sigma^2 P_X)$, where $P_X = X(X'X)^{-1}X'$.
  2. $\hat{\varepsilon} \mid X \sim N(0,\ \sigma^2 M_X)$, where $M_X = I_n - P_X$.
  3. $\hat{y}$ and $\hat{\varepsilon}$ are independent.

Proof. Remark that $\hat{y} = P_X y = X\beta + P_X\varepsilon$ and $\hat{\varepsilon} = M_X y = M_X\varepsilon$. Also, by Assumption 1 (assumption-Normality), we have $\varepsilon \mid X \sim N(0, \sigma^2 I_n)$.

  1. As $\hat{y} = X\beta + P_X\varepsilon$ and $\mathrm{Var}(P_X\varepsilon \mid X) = \sigma^2 P_X P_X' = \sigma^2 P_X$, we have $\hat{y} \mid X \sim N(X\beta, \sigma^2 P_X)$.
  2. As $\hat{\varepsilon} = M_X\varepsilon$ and $\mathrm{Var}(M_X\varepsilon \mid X) = \sigma^2 M_X M_X' = \sigma^2 M_X$, we have $\hat{\varepsilon} \mid X \sim N(0, \sigma^2 M_X)$.
  3. Since $\mathrm{Cov}(P_X\varepsilon, M_X\varepsilon \mid X) = \sigma^2 P_X M_X = 0$, we use Normal Distribution Theory > Lemma 5 (independent between matrices normal) for the proof.

This completes the proof.
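
As a quick numerical illustration (my own sketch, not part of the lecture notes), the snippet below verifies the algebraic fact driving part 3, namely $P_X M_X = 0$, for a randomly generated design matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.standard_normal((n, k))

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix P_X onto col(X)
M = np.eye(n) - P                      # annihilator matrix M_X

# zero covariance between P_X eps and M_X eps, which under normality gives independence
print(np.allclose(P @ M, 0.0))         # True up to floating-point error
```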

Theorem (distribution of least square estimates).

Under A1~A5 and A-N, we have the following results:

  1. $\hat{\beta} \mid X \sim N(\beta,\ \sigma^2 (X'X)^{-1})$.
  2. $(n-k)\, s^2 / \sigma^2 = \hat{\varepsilon}'\hat{\varepsilon} / \sigma^2 \sim \chi^2_{n-k}$, where $s^2 = \hat{\varepsilon}'\hat{\varepsilon}/(n-k)$.
  3. $\hat{\beta}$ and $s^2$ are independent.

Proof.

  1. Note that $\hat{\beta} = \beta + (X'X)^{-1}X'\varepsilon$; then by $\varepsilon \mid X \sim N(0, \sigma^2 I_n)$, we have $\hat{\beta} \mid X \sim N\big(\beta,\ \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}\big) = N\big(\beta,\ \sigma^2 (X'X)^{-1}\big)$.
  2. Note that $\hat{\varepsilon}'\hat{\varepsilon} = \varepsilon' M_X \varepsilon$ and $\mathrm{rank}(M_X) = \mathrm{tr}(M_X) = n-k$. Since $M_X$ is symmetric and idempotent, by Introductory Linear Algebra > Theorem 6 (decomposition of symmetric and idempotent matrix), we can decompose the matrix as $M_X = Q_1 Q_1'$, where $Q_1$ is the $n \times (n-k)$ matrix with the first $n-k$ eigenvectors of $M_X$ corresponding to eigenvalue $1$, and $Q_1'Q_1 = I_{n-k}$. Thus we have $\varepsilon' M_X \varepsilon / \sigma^2 = (Q_1'\varepsilon/\sigma)'(Q_1'\varepsilon/\sigma)$. Remark that $Q_1'\varepsilon/\sigma \mid X \sim N(0, I_{n-k})$. Therefore, we have $\hat{\varepsilon}'\hat{\varepsilon}/\sigma^2 \sim \chi^2_{n-k}$ by Normal Distribution Theory > Lemma 9 (multivariate normal and chi-squared distribution).
  3. Note that $\hat{\beta} - \beta = (X'X)^{-1}X'\varepsilon$ and $\hat{\varepsilon} = M_X\varepsilon$. We now use Normal Distribution Theory > Lemma 5 (independent between matrices normal), by showing $\mathrm{Cov}\big((X'X)^{-1}X'\varepsilon,\ M_X\varepsilon \mid X\big) = \sigma^2 (X'X)^{-1}X' M_X = 0$. Therefore, $\hat{\beta}$ and $\hat{\varepsilon}$ are independent, resulting in the independence between $\hat{\beta}$ and $s^2$.

This completes the proof.
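
A small Monte Carlo sketch can make the three statements concrete; the setup below (sample size, coefficients, and number of replications are my own choices, not from the notes) checks the sampling variance of the first coefficient against $\sigma^2[(X'X)^{-1}]_{11}$ and the fit of $(n-k)s^2/\sigma^2$ to a $\chi^2_{n-k}$ distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k, sigma = 100, 3, 2.0
X = rng.standard_normal((n, k))
beta = np.array([1.0, -0.5, 0.25])
XtX_inv = np.linalg.inv(X.T @ X)

draws_b0, draws_q = [], []
for _ in range(5000):
    y = X @ beta + sigma * rng.standard_normal(n)
    b = XtX_inv @ X.T @ y                   # OLS estimate
    e = y - X @ b                           # residuals
    s2 = e @ e / (n - k)                    # unbiased variance estimate
    draws_b0.append(b[0])
    draws_q.append((n - k) * s2 / sigma**2)

# empirical variance of the first coefficient vs. its theoretical variance
print(np.var(draws_b0), sigma**2 * XtX_inv[0, 0])
# (n-k) s^2 / sigma^2 should behave like chi^2_{n-k}: mean n-k, KS test not systematically rejecting
print(np.mean(draws_q), n - k)
print(stats.kstest(draws_q, "chi2", args=(n - k,)).pvalue)
```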

Single-Hypothesis Test

The t-test concerns the hypotheses:

  • $H_0$ (null hypothesis): $\beta_j = \beta_j^0$
  • $H_1$ (alternative hypothesis): $\beta_j \neq \beta_j^0$

where $\beta_j^0$ is given by the researcher. The t-test checks whether a coefficient $\beta_j$ is significantly different from $\beta_j^0$.

Theorem (t-test).

Under A1~A5 and A-N, we have
$$
t_j = \frac{\hat{\beta}_j - \beta_j^0}{se(\hat{\beta}_j)} \sim t_{n-k} \quad \text{under } H_0, \qquad \text{where } se(\hat{\beta}_j) = \sqrt{s^2 \big[(X'X)^{-1}\big]_{jj}}.
$$

Proof. From Theorem 3 (distribution of least square estimates), we have $\hat{\beta} \mid X \sim N(\beta, \sigma^2 (X'X)^{-1})$, $(n-k)s^2/\sigma^2 \sim \chi^2_{n-k}$, and the two are independent. Now let $\hat{\beta}_j$ denote the $j$-th element of $\hat{\beta}$ and $a_{jj}$ the $j$-th diagonal element of $(X'X)^{-1}$. Then, under $H_0$,
$$
t_j = \frac{(\hat{\beta}_j - \beta_j^0)/\sqrt{\sigma^2 a_{jj}}}{\sqrt{\dfrac{(n-k)s^2/\sigma^2}{n-k}}} = \frac{\hat{\beta}_j - \beta_j^0}{\sqrt{s^2 a_{jj}}} \sim t_{n-k},
$$
since it is a standard normal divided by the square root of an independent $\chi^2_{n-k}$ variable over its degrees of freedom.
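
For concreteness, a minimal numpy/scipy sketch of the statistic above (the data-generating process and the tested coefficient are assumptions of mine, not from the notes):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k = 100, 3
X = rng.standard_normal((n, k))
beta = np.array([1.0, 0.0, 0.5])           # true beta_1 = 0, so H0 below is true
y = X @ beta + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - k)

j, beta_j0 = 1, 0.0                        # H0: beta_1 = 0
se_j = np.sqrt(s2 * XtX_inv[j, j])         # standard error of beta_hat_j
t_stat = (b[j] - beta_j0) / se_j
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k)   # two-sided p-value from t_{n-k}
print(t_stat, p_value)
```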

Remark (decomposition of t-stat).

Note that
$$
t_j = \frac{\hat{\beta}_j - \beta_j^0}{\sqrt{s^2 a_{jj}}} = \frac{\hat{\beta}_j - \beta_j}{\sqrt{s^2 a_{jj}}} + \frac{\beta_j - \beta_j^0}{\sqrt{s^2 a_{jj}}},
$$
where the first term follows $t_{n-k}$ and the second term is $0$ under $H_0$ and nonzero under $H_1$. Thus, while a $t$-statistic around $0$ is likely to be generated under the null hypothesis, if it is significantly different from $0$, it is likely to have come from the alternative hypothesis.

Definition (confidence interval).

A $(1-\alpha)$ confidence interval for $\beta_j$ is constructed as
$$
\Big[\hat{\beta}_j - c_{1-\alpha/2}\, se(\hat{\beta}_j),\ \ \hat{\beta}_j + c_{1-\alpha/2}\, se(\hat{\beta}_j)\Big],
$$
where $c_{1-\alpha/2}$ denotes the $(1-\alpha/2)$ quantile of a Student's t distribution with $n-k$ degrees of freedom.
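
A short sketch of this construction, assuming an estimate, standard error, and degrees of freedom purely for illustration:

```python
from scipy import stats

# hypothetical output of an OLS fit with n - k = 97 residual degrees of freedom
b_j, se_j, df = 0.48, 0.11, 97
alpha = 0.05

c = stats.t.ppf(1 - alpha / 2, df)         # (1 - alpha/2) quantile of t_{n-k}
ci = (b_j - c * se_j, b_j + c * se_j)
print(ci)                                  # 95% confidence interval for beta_j
```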

Given the significance level $\alpha$, the following are defined:

|               | Do not reject $H_0$ | Reject $H_0$     |
|---------------|---------------------|------------------|
| $H_0$ is true | Good (A)            | Type I Error (B) |
| $H_1$ is true | Type II Error (C)   | Good (D)         |
  • Type I error: the probability of rejecting $H_0$ when $H_0$ is correct.
  • Type II error: the probability of not rejecting $H_0$ when $H_1$ is correct.
  • Size of the test: the probability of rejecting $H_0$ when $H_0$ is true (cell B).
  • Power of the test: the probability of rejecting $H_0$ when $H_1$ is true (cell D).

Note that the significance level $\alpha$ is often set as either $0.01$, $0.05$, or $0.10$.
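
The size and power definitions can be illustrated by a Monte Carlo experiment; the sketch below (design matrix, sample size, and effect size are my own choices) estimates the rejection frequency of the two-sided t-test under $H_0$ and under one alternative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, k, alpha, reps = 50, 2, 0.05, 2000
X = rng.standard_normal((n, k))
XtX_inv = np.linalg.inv(X.T @ X)
crit = stats.t.ppf(1 - alpha / 2, n - k)   # two-sided critical value

def reject_rate(beta1):
    """Fraction of replications in which H0: beta_1 = 0 is rejected."""
    beta = np.array([1.0, beta1])
    rejections = 0
    for _ in range(reps):
        y = X @ beta + rng.standard_normal(n)
        b = XtX_inv @ X.T @ y
        s2 = (y - X @ b) @ (y - X @ b) / (n - k)
        t = b[1] / np.sqrt(s2 * XtX_inv[1, 1])
        rejections += abs(t) > crit
    return rejections / reps

print(reject_rate(0.0))   # size: close to alpha = 0.05 since H0 is true
print(reject_rate(0.5))   # power: rejection rate under this alternative
```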

Definition (p-value).

A p-value is the probability, computed under the null hypothesis, of obtaining a test result at least as extreme as the one actually observed.
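
For a two-sided t-test, this is the probability mass in both tails beyond the observed statistic; a minimal sketch (the observed statistic and degrees of freedom are assumed values):

```python
from scipy import stats

t_obs, df = 2.1, 97                        # assumed observed statistic and degrees of freedom
p_value = 2 * stats.t.sf(abs(t_obs), df)   # P(|T| >= |t_obs|) under H0, T ~ t_{df}
print(p_value)
```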

Joint-Hypothesis Test

The F-test concerns the hypotheses:

  • $H_0$ (null hypothesis): $R\beta = r$
  • $H_1$ (alternative hypothesis): $R\beta \neq r$

where $R$ is a $q \times k$ matrix and $r$ is a $q \times 1$ vector given by the researcher. The F-test checks whether the estimate $R\hat{\beta}$ is significantly different from $r$.

Theorem (f-test).

Under A1~A5 and A-N, we have
$$
F = \frac{(R\hat{\beta} - r)'\big[R\, s^2 (X'X)^{-1} R'\big]^{-1}(R\hat{\beta} - r)}{q} \sim F_{q,\, n-k} \quad \text{under } H_0,
$$
where $q$ is the number of restrictions (rows of $R$).

Note that $(R\hat{\beta} - r)'\big[\sigma^2 R(X'X)^{-1}R'\big]^{-1}(R\hat{\beta} - r) \sim \chi^2_q$ under $H_0$ and $(n-k)s^2/\sigma^2 \sim \chi^2_{n-k}$, and the two are independent as $\hat{\beta}$ (a function of $\hat{y}$) and $\hat{\varepsilon}$ are independent of each other by Theorem 2 (distribution of fitted value and residual).
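
A sketch of the Wald form of the statistic above for one simulated sample (the restriction $\beta_2 = \beta_3 = 0$, using zero-based indexing, and all parameter values are my own illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k = 120, 4
X = rng.standard_normal((n, k))
beta = np.array([1.0, 0.3, 0.0, 0.0])      # the last two coefficients are truly zero
y = X @ beta + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - k)

R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])       # H0: beta_2 = beta_3 = 0 (q = 2 restrictions)
r = np.zeros(2)
q = R.shape[0]

diff = R @ b - r
F = diff @ np.linalg.inv(R @ (s2 * XtX_inv) @ R.T) @ diff / q
p_value = stats.f.sf(F, q, n - k)          # upper-tail probability of F_{q, n-k}
print(F, p_value)
```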

Remark (alternative expression of F-stat).

The F statistic can alternatively be expressed as
$$
F = \frac{(SSR_R - SSR_U)/q}{SSR_U/(n-k)},
$$
where $SSR_R$ denotes the restricted sum of squared residuals and $SSR_U$ denotes the unrestricted one.
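
The SSR form can be computed by fitting the restricted and unrestricted models separately; a minimal sketch, assuming the restriction simply drops the last two regressors:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, q = 120, 4, 2
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, 0.3, 0.0, 0.0]) + rng.standard_normal(n)

def ssr(Xmat, y):
    """Sum of squared residuals from an OLS fit of y on Xmat."""
    b = np.linalg.lstsq(Xmat, y, rcond=None)[0]
    e = y - Xmat @ b
    return e @ e

ssr_u = ssr(X, y)           # unrestricted model: all k regressors
ssr_r = ssr(X[:, :2], y)    # restricted model: last q = 2 coefficients forced to zero
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
print(F)
```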

Remark (f-test and t-test).

If the $F$-test is performed for a single restriction ($q = 1$), then it can be expressed as a $t$-test using Normal Distribution Theory > Proposition 17 (f and t distribution), i.e. $F = t^2$, since $F_{1,\, n-k} = t_{n-k}^2$.
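
A quick numerical check of $F = t^2$ for a single exclusion restriction (all simulation choices below are mine, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 80, 3
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
s2 = (y - X @ b) @ (y - X @ b) / (n - k)

j = 2                                      # single restriction H0: beta_2 = 0
t = b[j] / np.sqrt(s2 * XtX_inv[j, j])     # t statistic

R = np.zeros((1, k)); R[0, j] = 1.0        # same restriction in R beta = r form (r = 0)
F = (R @ b) @ np.linalg.inv(R @ (s2 * XtX_inv) @ R.T) @ (R @ b)
print(t**2, F)                             # equal up to floating-point error
```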