Instrumental Variables

#econometrics #economics

Oh, Hyunzi. (email: wisdom302@naver.com)
Korea University, Graduate School of Economics.
2024 Spring, instructed by prof. Kim, Dukpa.


Main References

  • Kim, Dukpa. (2024). "Econometric Analysis" (2024 Spring) ECON 518, Department of Economics, Korea University.
  • Davidson and MacKinnon. (2021). "Econometric Theory and Methods", Oxford University Press, New York.

Model and Assumptions

Given the model $y = X\beta + u$, we consider the situation where A1* from Asymptotic Results in Basic Linear Model > Assumption 2 (ASM for consistency) is violated, i.e. $E[x_t u_t] \neq 0$. Then, from the Econometric Analysis/Asymptotics > Theorem 1 (weak law of large numbers), we can expect that $\text{plim}\, \frac{1}{n} X'u = E[x_t u_t] \neq 0$, which indicates that $\hat\beta_{OLS}$ is now asymptotically biased, i.e. inconsistent.

Source of Correlation

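  1. Omitted Variables: Suppose that the true model is given by $y = X_1\beta_1 + X_2\beta_2 + u$ and $X_2$ is unobservable, then the regression without $X_2$ gives $y = X_1\beta_1 + (X_2\beta_2 + u)$, where the composite error is correlated with $X_1$ whenever $X_1$ and $X_2$ are correlated.
  2. Errors-in-variables: Suppose that the true model is $y_t = \beta x_t^* + u_t$ and the only observable data of $x_t^*$ is $x_t = x_t^* + v_t$, which is a noisy version. If we use $x_t$, then the regression equation can be written as $y_t = \beta x_t + (u_t - \beta v_t)$ where, by construction, $x_t$ and $u_t - \beta v_t$ are correlated.
  3. Simultaneity bias: Suppose that the true model is given by the set of equations $y_t = \beta x_t + u_t$ and $x_t = \gamma y_t + v_t$. Then, by construction, the regressor $x_t$ is correlated with the error $u_t$, because $u_t$ affects $y_t$ via the first equation, and $y_t$ affects $x_t$ via the second equation.
  4. Lagged dependent variable with serially correlated errors: Consider the system of $y_t = \beta y_{t-1} + u_t$ and $u_t = \rho u_{t-1} + e_t$. Then $y_{t-1}$ and $u_t$ are correlated since both are influenced by $u_{t-1}$.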
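The errors-in-variables case above can be illustrated with a small simulation. This is a minimal sketch under assumed settings that are not in the notes: a scalar regressor with unit variance, measurement noise with unit variance, and true slope 1. The function names `ols_slope` and `simulate_measurement_error` are hypothetical. Under these assumptions the OLS slope on the noisy regressor should settle near 0.5 rather than 1.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope of y on x without intercept (all series are mean zero by construction)."""
    return float(x @ y / (x @ x))

def simulate_measurement_error(n, beta=1.0, noise_sd=1.0, seed=0):
    """Return the OLS slope using the true regressor and using its noisy version."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(size=n)                       # true (unobserved) regressor
    u = rng.normal(size=n)                            # structural error
    y = beta * x_star + u
    x_obs = x_star + noise_sd * rng.normal(size=n)    # observed, noisy regressor
    return ols_slope(x_star, y), ols_slope(x_obs, y)

slope_true, slope_noisy = simulate_measurement_error(n=200_000)
print(slope_true, slope_noisy)
```

With equal variances for the true regressor and the noise, the probability limit of the noisy slope is $\beta \cdot \frac{1}{1+1} = 0.5$, so the estimate does not approach the true value however large the sample.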

Instrumental Variable Estimator

Proposition (instrumental variable estimator).

Given a set of instrumental variables $Z$ that are uncorrelated with the errors but correlated with the regressors, i.e. $E[z_t u_t] = 0$ and $E[z_t x_t'] \neq 0$, we have an Instrumental Variable (IV) estimator defined as $\hat\beta_{IV} = (Z'X)^{-1} Z'y$.

Proof. Similar to Geometry of Least Squares Estimator > Proposition 3 (Ordinary Least Squares estimator of $\beta$), we derive the result using orthogonal projection and the method of moments.

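  1. using MOM
    • Given the condition that $E[z_t u_t] = 0$, we have $E[z_t (y_t - x_t'\beta)] = 0$, so that $\beta = (E[z_t x_t'])^{-1} E[z_t y_t]$, where the last equation holds since $E[z_t x_t']$ is assumed to be non-singular.
    • Replacing the population mean with the sample mean, we have $\hat\beta_{IV} = \left(\frac{1}{n}\sum_{t=1}^{n} z_t x_t'\right)^{-1} \frac{1}{n}\sum_{t=1}^{n} z_t y_t = (Z'X)^{-1} Z'y$.
  2. using orthogonal projection
    • Requiring the residual to be orthogonal to the columns of $Z$, i.e. $Z'(y - X\hat\beta_{IV}) = 0$, we have $Z'X\hat\beta_{IV} = Z'y$, which gives the same estimator.

Proposition (IV variance estimator).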

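Given the IV estimator, the variance estimator is $\widehat{\text{Var}}(\hat\beta_{IV}) = \hat\sigma^2 (Z'X)^{-1} Z'Z (X'Z)^{-1}$, where $\hat\sigma^2$ is the residual-based estimate of the error variance, e.g. $\hat\sigma^2 = \frac{1}{n}\hat u'\hat u$ with $\hat u = y - X\hat\beta_{IV}$.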
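The two propositions can be sketched numerically. The block below assumes the standard forms $\hat\beta_{IV} = (Z'X)^{-1}Z'y$ and $\hat\sigma^2 (Z'X)^{-1} Z'Z (X'Z)^{-1}$ with $\hat\sigma^2 = \hat u'\hat u / n$; the divisor $n$ is an assumption, and the function name `iv_estimate` and the simulated design are hypothetical.

```python
import numpy as np

def iv_estimate(y, X, Z):
    """Simple IV estimator (Z'X)^{-1} Z'y and its estimated covariance matrix."""
    n = X.shape[0]
    ZX = Z.T @ X
    beta = np.linalg.solve(ZX, Z.T @ y)        # (Z'X)^{-1} Z'y
    resid = y - X @ beta
    sigma2 = resid @ resid / n                 # assumed divisor: n
    ZX_inv = np.linalg.inv(ZX)
    cov = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T   # s2 (Z'X)^{-1} Z'Z (X'Z)^{-1}
    return beta, cov

# Hypothetical design: x is endogenous through an unobserved common factor.
rng = np.random.default_rng(42)
n = 50_000
z = rng.normal(size=n)
common = rng.normal(size=n)                    # drives both x and u
x = 0.8 * z + common + rng.normal(size=n)
u = common + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])           # the constant instruments itself

beta_iv, cov_iv = iv_estimate(y, X, Z)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print("IV :", beta_iv, "s.e.:", np.sqrt(np.diag(cov_iv)))
print("OLS:", beta_ols)
```

In this design the OLS slope is pulled above the true value 2 by the common factor, while the IV slope stays close to it.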

Example (IV estimator for school years).

Consider the regression $y_i = \beta_1 + \beta_2 s_i + u_i$, where $y_i$ is the individual's earnings and $s_i$ is the years of schooling. Since $s_i$ and $u_i$ are likely to be strongly correlated through an unobserved determinant of both, ability $a_i$, we introduce the instrument $z_i = (1, q_i)'$, where the constant term stays since $E[1 \cdot u_i] = 0$ anyway, and $q_i$ is the individual's quarter of birth. Note that $\text{Cov}(q_i, u_i) = 0$, since ability and the quarter of birth should not be correlated, but $\text{Cov}(q_i, s_i) \neq 0$, since an individual born in an earlier quarter of the year is likely to receive a different amount of schooling, depending on the starting month of the school year.

Remark (weak instrument).

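An instrumental variable $z_t$ is called a weak instrument if it is only weakly correlated with the regressor, i.e. $\text{Cov}(z_t, x_t) \approx 0$.

While we cannot test for $E[z_t u_t] = 0$ since the true error is unknown, one might want to test for $E[z_t x_t'] \neq 0$, since we have the data for both $z_t$ and $x_t$.

Here, we can run the first-stage regression $x_t = z_t'\pi + v_t$ and, for the null hypothesis $H_0: \pi = 0$, apply Inferences in Linear Regression > Theorem 8 (f-test). Note that the $F$-stat must have a large value (i.e. a small $p$-value), since the null hypothesis states that the instrument is irrelevant, which is the case we hope to reject.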
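A first-stage relevance check of this kind can be sketched as follows. The block assumes a single endogenous regressor, a first stage that includes a constant, and the usual $F$ statistic comparing restricted and unrestricted residual sums of squares; the function name `first_stage_f` and the simulated data are hypothetical.

```python
import numpy as np

def first_stage_f(x, Z_excl):
    """F statistic for H0: all coefficients on the instruments are zero
    in the regression of x on a constant and the instruments."""
    n = x.shape[0]
    Z_excl = Z_excl.reshape(n, -1)                   # allow a single 1-D instrument
    q = Z_excl.shape[1]                              # number of restrictions
    W = np.column_stack([np.ones(n), Z_excl])
    coef, *_ = np.linalg.lstsq(W, x, rcond=None)
    rss_u = np.sum((x - W @ coef) ** 2)              # unrestricted RSS
    rss_r = np.sum((x - x.mean()) ** 2)              # restricted RSS (constant only)
    return float(((rss_r - rss_u) / q) / (rss_u / (n - q - 1)))

rng = np.random.default_rng(1)
n = 1_000
z_strong = rng.normal(size=n)
z_irrelevant = rng.normal(size=n)                    # unrelated to x by construction
x = 0.5 * z_strong + rng.normal(size=n)

f_strong = first_stage_f(x, z_strong)
f_weak = first_stage_f(x, z_irrelevant)
print(f_strong, f_weak)
```

A relevant instrument produces a large statistic, while an irrelevant one produces a statistic of ordinary size for an $F(1, n-2)$ draw.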

Asymptotic Properties

Consistency

Assumption (ASM for IV consistency).
  • I1*) $E[z_t u_t] = 0$: contemporaneously uncorrelated with the errors.
  • I2) $y = X\beta + u$: linearity.
  • I3) $\text{rank}(Z'X) = k$: full rank, where $X$ and $Z$ are both $n \times k$ matrices.
  • I4&5*) , where is a non-singular matrix.
  • I6) , , and .
  • I7) , where is a non-singular matrix.
  • I8) is the average of variances over .

Where the IV estimator is defined by $\hat\beta_{IV} = (Z'X)^{-1} Z'y$.

Theorem (consistency of IV estimator).

Under I1*, I2, I3, I4&5*, I6, and I7, we have $\text{plim}\, \hat\beta_{IV} = \beta$.

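Proof. Using I2 and I3, we can derive the IV estimator as $\hat\beta_{IV} = (Z'X)^{-1} Z'(X\beta + u) = \beta + \left(\frac{1}{n} Z'X\right)^{-1} \frac{1}{n} Z'u$. Then, first, I6 shows that $\frac{1}{n} Z'X$ converges in probability to a non-singular limit.

Secondly, using I1*, i.e., $E[z_t u_t] = 0$, we have $E\left[\frac{1}{n} Z'u\right] = 0$. And lastly, using I4&5* and I7, we have $\text{Var}\left(\frac{1}{n} Z'u\right) \to 0$, thus we have $\text{mslim}\, \frac{1}{n} Z'u = 0$ and by Convergence of Random Variables > Lemma 17 (mslim implies plim), $\text{plim}\, \frac{1}{n} Z'u = 0$. Now, combining the results, we have $\text{plim}\, \hat\beta_{IV} = \beta + \left(\text{plim}\, \frac{1}{n} Z'X\right)^{-1} \cdot 0 = \beta$, thus $\hat\beta_{IV}$ is a consistent estimator for $\beta$.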
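The consistency argument can be checked informally by letting the sample size grow. This is a minimal sketch for a scalar regressor without intercept, where the IV estimator reduces to $z'y / z'x$; the data-generating process and the function name `iv_slope` are assumptions made for illustration.

```python
import numpy as np

def iv_slope(n, seed):
    """Scalar IV estimate z'y / z'x on one simulated sample of size n (true slope 2)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                     # instrument, independent of u
    common = rng.normal(size=n)                # makes x endogenous
    x = 0.8 * z + common + rng.normal(size=n)
    u = common + rng.normal(size=n)
    y = 2.0 * x + u
    return float(z @ y / (z @ x))

estimates = {n: iv_slope(n, seed=0) for n in (100, 10_000, 1_000_000)}
for n, b in estimates.items():
    print(n, round(b, 4))
```

The estimate at the largest sample size should sit very close to the true slope, in line with $\frac{1}{n} Z'u$ vanishing in probability.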

Theorem (consistency of IV variance estimator).

Under I1*, I2, I3, I4&5*, I6, I7, and I8, we have

Proof. Note that we have From the definition of , Now, by I6 of , we have and from Theorem 6 (consistency of IV estimator), Also, by the boundedness of , we have Therefore, by Econometric Analysis/Asymptotics > Proposition 20 (properties of op and Op), we have which completes the proof.

Asymptotic Normality

Assumption (ASM for IV asymptotic normality).
  • I1**) is an sequence with and .
  • I2) $y = X\beta + u$: linearity.
  • I3) $\text{rank}(Z'X) = k$: full rank, where $X$ and $Z$ are both $n \times k$ matrices.

Lemma (Cramer-Wold device in IV estimator).

Proof. Note that by I1**, and , we have and Thus, we have Now using Econometric Analysis/Asymptotics > Theorem 12 (Cramer-Wold device), for any real vector such that , we have and since , by Econometric Analysis/Asymptotics > Theorem 8 (Lindeberg-Levy CLT), we have Therefore, we have or This completes the proof.

Theorem (asymptotic normality).

Under I1**, I2, and I3, we have
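As a hedged sketch of the form such a result typically takes, consider the decomposition $\sqrt{n}(\hat\beta_{IV} - \beta) = \left(\frac{1}{n} Z'X\right)^{-1} \frac{1}{\sqrt{n}} Z'u$. The display below additionally assumes conditional homoskedasticity, $E[u_t^2 \mid z_t] = \sigma^2$, and the existence of the limits $Q_{ZX} = \text{plim}\, \frac{1}{n} Z'X$ (non-singular) and $Q_{ZZ} = \text{plim}\, \frac{1}{n} Z'Z$. These labels and the homoskedasticity condition are assumptions introduced here, not statements taken from the notes.

```latex
\begin{aligned}
\sqrt{n}\,(\hat\beta_{IV} - \beta)
  &= \left(\tfrac{1}{n} Z'X\right)^{-1} \tfrac{1}{\sqrt{n}}\, Z'u, \\
\tfrac{1}{\sqrt{n}}\, Z'u
  &\xrightarrow{d} N\!\left(0,\ \sigma^2 Q_{ZZ}\right), \\
\sqrt{n}\,(\hat\beta_{IV} - \beta)
  &\xrightarrow{d} N\!\left(0,\ \sigma^2\, Q_{ZX}^{-1}\, Q_{ZZ}\, (Q_{ZX}')^{-1}\right).
\end{aligned}
```

Under these assumptions, replacing $\sigma^2$, $Q_{ZX}$, and $Q_{ZZ}$ with their sample counterparts gives a variance of the form $\hat\sigma^2 (Z'X)^{-1} Z'Z (X'Z)^{-1}$ for $\hat\beta_{IV}$.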

Generalized Instrumental Variable Estimator

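If there are more instrumental variables than regressors, we can use linear combinations of them, weighted so as to maximize their correlation with $X$.

Suppose we have $l \geq k$ instruments, collected in a full-rank $n \times l$ matrix $W$. Then let $Z = WJ$, where $J$ is an $l \times k$ matrix, so that each column of $Z$ is a weighted combination of the columns of $W$.

We find the optimal $J$ that maximizes the correlation between $Z = WJ$ and $X$, which is given by the least squares coefficients $\hat J = (W'W)^{-1} W'X$, so that $W\hat J = P_W X$ with $P_W = W(W'W)^{-1}W'$. Then, the generalized IV estimator is $\hat\beta_{GIV} = (X' P_W X)^{-1} X' P_W y$.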

Remark (GIV and IV estimator).

Note that if $l = k$, then $\hat\beta_{GIV}$ reduces to $\hat\beta_{IV} = (W'X)^{-1} W'y$.
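The generalized estimator and the remark can be sketched numerically. The block assumes the standard form $\hat\beta_{GIV} = (X'P_W X)^{-1} X'P_W y$, computed through the fitted values $P_W X$ from regressing $X$ on the instrument matrix $W$. The function name `giv_estimate` and the simulated design are hypothetical.

```python
import numpy as np

def giv_estimate(y, X, W):
    """Generalized IV: (X' P_W X)^{-1} X' P_W y, using fitted values P_W X."""
    J_hat, *_ = np.linalg.lstsq(W, X, rcond=None)   # (W'W)^{-1} W'X
    X_hat = W @ J_hat                                # P_W X
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

rng = np.random.default_rng(7)
n = 20_000
w1, w2 = rng.normal(size=n), rng.normal(size=n)
common = rng.normal(size=n)                          # makes x endogenous
x = 0.6 * w1 + 0.4 * w2 + common + rng.normal(size=n)
u = common + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
W_over = np.column_stack([np.ones(n), w1, w2])       # l = 3 instruments, k = 2
W_just = np.column_stack([np.ones(n), w1])           # l = k = 2

beta_giv = giv_estimate(y, X, W_over)
beta_just = giv_estimate(y, X, W_just)
beta_iv = np.linalg.solve(W_just.T @ X, W_just.T @ y)   # simple IV formula
print(beta_giv, beta_just, beta_iv)
```

With as many instruments as regressors, the generalized formula and the simple IV formula return the same numbers up to floating-point error, which is the reduction stated in the remark.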