Convergence of Random Variables

Oh, Hyunzi. (email: wisdom302@naver.com)
Korea University, Graduate School of Economics.
2024 Spring, instructed by prof. Kim, Dukpa.


Main References

  • Kim, Dukpa. (2024). "Econometric Analysis" (2024 Spring) ECON 518, Department of Economics, Korea University.
  • Davidson and MacKinnon. (2021). "Econometric Theory and Methods", Oxford University Press, New York.

Modes of Convergence

Pointwise and Uniform Convergence

Definition (convergence).

A sequence of numbers $\{a_n\}_{n=1}^{\infty}$ is convergent to the limit $a$, i.e. $a_n \to a$, if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $|a_n - a| < \varepsilon$ for all $n \geq N$.

Definition (pointwise convergence).

Let $f : X \to \mathbb{R}$ be a function and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions, where $f_n : X \to \mathbb{R}$ for all $n$. Then, $f_n$ converges pointwise to $f$ if $\lim_{n \to \infty} f_n(x) = f(x)$, or $f_n(x) \to f(x)$, for all $x \in X$.

Example (example of pointwise convergence).

Let $f(x) = 0$ for all $x \in \mathbb{R}$, and define $f_n(x) = x/n$. Then, for each fixed $x$, $|f_n(x) - f(x)| = |x|/n \to 0$ as $n \to \infty$, so $f_n$ converges pointwise to $f$.

Definition (uniform convergence).

Let $f : X \to \mathbb{R}$ be a function and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions, where $f_n : X \to \mathbb{R}$ for all $n$. Then, $f_n$ converges uniformly to $f$ if $\lim_{n \to \infty} \sup_{x \in X} |f_n(x) - f(x)| = 0$, or equivalently, for every $\varepsilon > 0$ there exists $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \geq N$ and all $x \in X$.

Example (pointwise does not imply uniform convergence).

From the given functions $f$ and $f_n$ in Example 3 (example of pointwise convergence), we have $\sup_{x \in \mathbb{R}} |f_n(x) - f(x)| = \sup_{x \in \mathbb{R}} |x|/n = \infty$ for every $n$, thus $f_n$ does not converge uniformly to $f$.

Example (example of uniform convergence).

Let $f(x) = x$ on $[0, 1]$, and define $f_n(x) = \lfloor nx \rfloor / n$, where $\lfloor \cdot \rfloor$ is a step-function, i.e. it returns the integer part of its input. Then, $0 \leq f(x) - f_n(x) < 1/n$ for every $x$, so
$$\sup_{x \in [0,1]} |f_n(x) - f(x)| \leq \frac{1}{n} \to 0,$$
thus $f_n$ converges uniformly to $f$. Moreover, pointwise convergence also holds, since uniform convergence implies pointwise convergence.
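This uniform bound can be checked numerically; a minimal sketch, assuming the reconstruction $f_n(x) = \lfloor nx \rfloor / n$ on $[0,1]$ (the grid resolution and function names are illustrative):

```python
import math

def f_n(n, x):
    # n-th step-function approximation f_n(x) = floor(n*x)/n (assumed reconstruction)
    return math.floor(n * x) / n

def sup_error(n, grid=10_000):
    # approximate sup_{x in [0,1]} |f_n(x) - x| on a fine grid of x values
    return max(abs(f_n(n, i / grid) - i / grid) for i in range(grid + 1))

# the worst-case error over all x shrinks like 1/n: uniform convergence
errs = {n: sup_error(n) for n in (1, 10, 100)}
```

The key point is that the bound $1/n$ does not depend on $x$, which is exactly what pointwise convergence alone would not give.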

Almost Sure Convergence

First, recall the formal definition of a random variable:

Definition 5 (random variable).

A random variable $X$ is a measurable function from $\Omega$ to $\mathbb{R}$, i.e. it assigns a real number $X(\omega)$ to each outcome $\omega \in \Omega$.

where the probability space is defined as

Definition 1 (probability space).

A probability space is the triple $(\Omega, \mathcal{F}, P)$ consisting of three elements:

  1. A sample space, $\Omega$, is the set of all possible outcomes of a random experiment. An element of $\Omega$ is $\omega$, which is called an outcome.
  2. An event space, $\mathcal{F}$, is a collection of subsets of $\Omega$ that forms a $\sigma$-field. An element of $\mathcal{F}$ is $A$, which is called an event.
  3. A probability function (measure), $P : \mathcal{F} \to [0, 1]$, assigns each event to a probability, which is a number between $0$ and $1$.

Definition (almost surely converges).

The sequence of random vectors $\{X_n\}$ converges almost surely to $X$, i.e. $X_n \xrightarrow{a.s.} X$, if
$$P\left(\left\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\right\}\right) = 1.$$

Example (example of almost sure convergence).

Let $\Omega = [0, 1]$ with the uniform probability measure, and define the random variables as $X_n(\omega) = \omega^n$. Note that for $\omega \in [0, 1)$, $\lim_{n \to \infty} \omega^n = 0$. Thus, we have
$$\left\{\omega : \lim_{n \to \infty} X_n(\omega) = 0\right\} = [0, 1),$$
and
$$P([0, 1)) = P([0, 1]) - P(\{1\}) = 1 - 0 = 1,$$
where the third equality holds by the property of the probability measure that $P(\{1\}) = 0$. Thus, by Definition 7 (almost surely converges), we have $X_n \xrightarrow{a.s.} 0$.
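The example can be checked numerically; a minimal deterministic sketch, assuming the reconstruction $X_n(\omega) = \omega^n$ on $\Omega = [0,1]$ with the uniform measure:

```python
def X(n, omega):
    # X_n(omega) = omega**n on Omega = [0, 1] (assumed reconstruction)
    return omega ** n

# on a grid of outcomes omega in [0, 1): X_n(omega) -> 0 as n grows
grid = [i / 100 for i in range(100)]        # omits only the point omega = 1
worst = max(X(2000, w) for w in grid)       # ~ 0.99**2000, essentially zero

# the single exceptional outcome omega = 1 has X_n(1) = 1 for every n,
# but P({1}) = 0 under the uniform measure, so X_n -> 0 almost surely
exceptional = X(2000, 1.0)
```

Note that convergence is slower the closer $\omega$ is to $1$; almost sure convergence only requires that the set of non-convergent outcomes has probability zero.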

Proposition (almost surely to pointwise convergence).

If $X_n(\omega) \to X(\omega)$ for every $\omega \in \Omega$, i.e. $X_n \to X$ pointwise, then $X_n \xrightarrow{a.s.} X$.

Proof. Since $X_n(\omega) \to X(\omega)$ for every $\omega \in \Omega$, the convergence set $\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}$ equals $\Omega$. Also, since $P(\Omega) = 1$, the definition of almost sure convergence holds. $\square$

Note that in Definition 7 (almost surely converges), we do not care about the timing at which $X_n(\omega)$ enters the neighborhood of $X(\omega)$ and remains there forever. Thus we introduce further concepts.

Definition (event eventually and infinitely often).

Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $\{A_n\}_{n=1}^{\infty}$ be a sequence of events in $\mathcal{F}$. The event that $A_n$ happens eventually (e.v.) is the event that, from some time on, every $A_n$ occurs:
$$\{A_n \text{ e.v.}\} = \bigcup_{N=1}^{\infty} \bigcap_{n=N}^{\infty} A_n.$$
Using De Morgan's law, the complementary notion is the event that $A_n$ happens infinitely often (i.o.), i.e. $A_n$ occurs for infinitely many $n$:
$$\{A_n \text{ i.o.}\} = \bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} A_n,$$
with $(\{A_n \text{ e.v.}\})^c = \{A_n^c \text{ i.o.}\}$.

Here, the terms e.v. and i.o. emphasize that, for convergence of a sequence, the only important thing is the long-run behavior, not the behavior over any initial finite horizon of time.

Remark (almost sure convergence, and ev and io).

Define a sequence of sets as $A_n(\varepsilon) = \{\omega : \|X_n(\omega) - X(\omega)\| < \varepsilon\}$. Then, $X_n \xrightarrow{a.s.} X$ if and only if, for every $\varepsilon > 0$, the probability that the events $A_n(\varepsilon)$ happen eventually equals $1$:
$$P(A_n(\varepsilon) \text{ e.v.}) = P\left(\bigcup_{N=1}^{\infty} \bigcap_{n=N}^{\infty} A_n(\varepsilon)\right) = 1,$$
which condition is equivalent to
$$P(A_n(\varepsilon)^c \text{ i.o.}) = P\left(\bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} A_n(\varepsilon)^c\right) = 0,$$
i.e. the probability that $\|X_n - X\| \geq \varepsilon$ happens infinitely often equals $0$.

Usually, Remark 11 (almost sure convergence, and ev and io) is another common way to define Definition 7 (almost surely converges). For a brief understanding, suppose $X_n(\omega) \to X(\omega)$ pointwise and let $\varepsilon > 0$. By the definition, there exists $N$ such that $\|X_n(\omega) - X(\omega)\| < \varepsilon$ for all $n \geq N$.
Thus we have $\omega \in \bigcap_{n=N}^{\infty} A_n(\varepsilon) \subseteq \bigcup_{N=1}^{\infty} \bigcap_{n=N}^{\infty} A_n(\varepsilon)$. Alternatively, applying this outcome by outcome to the probability-one convergence set results in the definition of almost sure convergence. For the infinitely often part, it can be easily derived using De Morgan's law.

Convergence in Probability

Definition (convergence in probability).

Let $\{X_n\}$ be a sequence of random vectors. $X_n$ converges in probability to $X$, i.e. $X_n \xrightarrow{p} X$ or $\operatorname{plim} X_n = X$, if for every $\varepsilon > 0$,
$$\lim_{n \to \infty} P(\|X_n - X\| > \varepsilon) = 0,$$
or equivalently,
$$\lim_{n \to \infty} P(\|X_n - X\| \leq \varepsilon) = 1.$$

Convergence in probability is a direct translation of mathematical convergence in terms of probability.
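A concrete instance is the sample mean of Bernoulli draws (the weak law of large numbers); the Bernoulli$(0.5)$ setting and the tolerance $\varepsilon = 0.1$ below are illustrative choices, and the deviation probability is computed exactly from the binomial distribution rather than simulated:

```python
from math import comb

def prob_deviation(n, p=0.5, eps=0.1):
    # exact P(|Xbar_n - p| > eps), where Xbar_n is the mean of n Bernoulli(p)
    # draws, so that n * Xbar_n ~ Binomial(n, p)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if abs(k / n - p) > eps)

# the deviation probabilities vanish as n grows: Xbar_n ->p p
probs = [prob_deviation(n) for n in (10, 100, 1000)]
```

Monotone decay toward zero of these probabilities, for each fixed $\varepsilon$, is exactly the definition above.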

Proposition (aslim implies plim).

If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{p} X$.

Proof. Suppose $X_n \xrightarrow{a.s.} X$ and fix $\varepsilon > 0$. Then, by Remark 11 (almost sure convergence, and ev and io), we have
$$P(\|X_n - X\| > \varepsilon \text{ i.o.}) = P\left(\bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} \{\|X_n - X\| > \varepsilon\}\right) = 0.$$
Therefore,
$$\limsup_{n \to \infty} P(\|X_n - X\| > \varepsilon) \leq P(\|X_n - X\| > \varepsilon \text{ i.o.}) = 0,$$
where the inequality holds by the continuity of the probability measure from above, i.e. $X_n \xrightarrow{p} X$. $\square$

Example (plim does not imply aslim).

Suppose $\Omega = [0, 1]$ with the uniform probability measure, and let $\mathbf{1}_A$ be the indicator function of a set $A$. Now write each $n \geq 1$ as $n = 2^k + j$ with $0 \leq j < 2^k$, and let $X_n = \mathbf{1}_{[j 2^{-k}, (j+1) 2^{-k}]}$, so that the indicator intervals sweep across $[0, 1]$ over and over as $n$ grows.
Here, $X_n \not\xrightarrow{a.s.} 0$, since even as $n$ gets infinitely larger, for every $\omega$ there will always exist some $n' > n$ such that $X_{n'}(\omega) = 1$.
However, in terms of the probability, $P(|X_n - 0| > \varepsilon) = P(X_n = 1) = 2^{-k} \to 0$, since the interval gets smaller as $n \to \infty$. Thus, we have $X_n \xrightarrow{p} 0$.
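Both claims can be checked directly; a deterministic sketch, assuming the sliding-interval reconstruction $X_n = \mathbf{1}_{[j 2^{-k}, (j+1) 2^{-k}]}$ for $n = 2^k + j$ (the sample outcome $\omega = 0.3$ is an arbitrary illustration):

```python
def interval(n):
    # write n = 2**k + j with 0 <= j < 2**k: the n-th sliding interval
    k = n.bit_length() - 1
    j = n - 2**k
    return j / 2**k, (j + 1) / 2**k

def X(n, omega):
    # X_n = indicator of the n-th interval
    a, b = interval(n)
    return 1 if a <= omega <= b else 0

def length(n):
    # length of the n-th interval = P(X_n = 1) = 2**(-k)
    a, b = interval(n)
    return b - a

# P(X_n = 1) -> 0, so X_n ->p 0; yet any fixed omega is covered once in
# every dyadic block 2**k <= n < 2**(k+1), hence infinitely often: no a.s. limit
omega = 0.3
hits = [n for n in range(1, 1024) if X(n, omega) == 1]   # blocks k = 0,...,9
```

Each dyadic block contributes exactly one hit at this $\omega$, which is the "infinitely often" behavior that rules out almost sure convergence.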

Mean Square Convergence

Instead of computing the probability that $X_n$ stays around $X$, we can measure the on-average variability of $X_n$ around $X$.

Definition (mean-square convergence).

Let $\{X_n\}$ be a sequence of random vectors. $X_n$ converges in mean square to $X$, i.e. $X_n \xrightarrow{m.s.} X$, if
$$\lim_{n \to \infty} E\left[\|X_n - X\|^2\right] = 0.$$

Note that convergence in mean square is a stronger concept than convergence in probability, since it requires not only high occurrence around $X$, but also low volatility around $X$. Before showing this rigorously, we first state Markov's inequality.

Proposition (Markov's inequality).

For a random variable $X$, we have
$$P(|X| \geq c) \leq \frac{E[|X|^r]}{c^r}$$
for any $c > 0$ and $r > 0$, if $E[|X|^r] < \infty$.

This statement is a direct corollary of the original statement, Statistical Proof > Theorem 26 (Markov's inequality).
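The inequality can be verified exactly on a small discrete distribution; the support points and probabilities below are hypothetical, chosen only for illustration:

```python
# a discrete |X| with hypothetical support and probabilities, used to check
# P(|X| >= c) <= E[|X|**r] / c**r for several c and r
values = [0.5, 1.0, 2.0, 5.0]
probs  = [0.4, 0.3, 0.2, 0.1]

def tail(c):
    # P(|X| >= c)
    return sum(p for v, p in zip(values, probs) if v >= c)

def moment(r):
    # E[|X|**r]
    return sum(p * v**r for v, p in zip(values, probs))

# Markov's bound holds for every cutoff c and moment order r tried
holds = all(tail(c) <= moment(r) / c**r + 1e-12
            for c in (0.5, 1.0, 2.0, 4.0) for r in (1, 2))
```

The bound is loose for small $c$ (where it can exceed $1$) and bites only in the tail, which is exactly how it is used in the proof below.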

Lemma (mslim implies plim).

If $X_n \xrightarrow{m.s.} X$, then $X_n \xrightarrow{p} X$. However, the converse does not hold.

Proof. We use Proposition 16 (Markov's inequality) for the proof. Suppose $X_n \xrightarrow{m.s.} X$, and let $\varepsilon > 0$ and $r = 2$. Then,
$$P(\|X_n - X\| > \varepsilon) \leq \frac{E\left[\|X_n - X\|^2\right]}{\varepsilon^2}.$$
Since we have $E[\|X_n - X\|^2] \to 0$ by the mean-square convergence, we have
$$\lim_{n \to \infty} P(\|X_n - X\| > \varepsilon) = 0,$$
meaning that the definition of convergence in probability holds. Therefore, we have $X_n \xrightarrow{p} X$. $\square$

Example (mslim and aslim have no clear relationship).

Let $\Omega = [0, 1]$ with the uniform probability measure, and define $X_n = n \cdot \mathbf{1}_{[0, 1/n]}$. Then, we have
$$E\left[X_n^2\right] = n^2 \cdot P([0, 1/n]) = n^2 \cdot \frac{1}{n} = n \to \infty,$$
thus $X_n \not\xrightarrow{m.s.} 0$.
However, for each $\omega > 0$, if $n > 1/\omega$, then $X_n(\omega) = 0$. Thus we have $X_n \xrightarrow{a.s.} 0$.
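Both sides of the example can be checked numerically; a deterministic sketch, assuming the reconstruction $X_n = n \cdot \mathbf{1}_{[0, 1/n]}$ on $\Omega = [0, 1]$ (the outcome $\omega = 0.05$ is an arbitrary illustration):

```python
def X(n, omega):
    # X_n = n on [0, 1/n], 0 elsewhere (Omega = [0, 1], uniform measure)
    return n if omega <= 1 / n else 0

def second_moment(n):
    # E[X_n**2] = n**2 * P([0, 1/n]) = n**2 / n = n, which diverges
    return n**2 * (1 / n)

# diverging second moments: no mean-square convergence to 0 ...
moments = [second_moment(n) for n in (1, 10, 100)]

# ... yet for any fixed omega > 0, X_n(omega) = 0 once n > 1/omega,
# so X_n(omega) -> 0 for every omega > 0: almost sure convergence to 0
omega = 0.05
tail_values = [X(n, omega) for n in range(21, 200)]
```

The rare but very tall spike near $0$ is what inflates the second moment without affecting the pathwise limit.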

Convergence in Distribution

Definition (convergence in distribution).

Let $\{X_n\}$ be a sequence of random vectors. $X_n$ converges in distribution to $X$, i.e. $X_n \xrightarrow{d} X$, if
$$\lim_{n \to \infty} F_n(x) = F(x)$$
for all continuity points $x$ of $F$, where $F_n$ and $F$ are the cumulative distribution functions of $X_n$ and $X$, respectively.

If the Cumulative Distribution Function (CDF) of $X_n$ converges pointwise to the CDF of $X$ at every continuity point of the latter, then $X_n$ converges in distribution. Note that $X_n \xrightarrow{d} X$ does not require the convergence of the random variables $X_n$ themselves, only of their distributions.

Lemma (plim implies dlim).

If $X_n \xrightarrow{p} X$, then $X_n \xrightarrow{d} X$.

Proof. First, note that for any random variables $A$, $B$ and a real number $c$,
$$P(A \leq c) \leq P(B \leq c + \varepsilon) + P(|A - B| > \varepsilon),$$
where $\varepsilon > 0$, since $\{A \leq c\} \subseteq \{B \leq c + \varepsilon\} \cup \{|A - B| > \varepsilon\}$. Thus we have
$$F_n(x) \leq F(x + \varepsilon) + P(|X_n - X| > \varepsilon), \qquad F(x - \varepsilon) \leq F_n(x) + P(|X_n - X| > \varepsilon).$$
Using this result, we now show that $F_n \to F$ pointwise at every point where $F$ is continuous. Let $x$ be such a point. Then, for every $\varepsilon > 0$,
$$F(x - \varepsilon) - P(|X_n - X| > \varepsilon) \leq F_n(x) \leq F(x + \varepsilon) + P(|X_n - X| > \varepsilon).$$
Since $X_n \xrightarrow{p} X$, we have $P(|X_n - X| > \varepsilon) \to 0$, and thus
$$F(x - \varepsilon) \leq \liminf_{n \to \infty} F_n(x) \leq \limsup_{n \to \infty} F_n(x) \leq F(x + \varepsilon).$$
Letting $\varepsilon \to 0$, and by the continuity of $F$ at $x$, we therefore have $\lim_{n \to \infty} F_n(x) = F(x)$, i.e. $F_n(x) \to F(x)$ at every continuity point, which means $X_n \xrightarrow{d} X$. $\square$

Remark (dlim implies plim to a constant).

Let $c$ be a constant. Then $X_n \xrightarrow{d} c$ implies $X_n \xrightarrow{p} c$.

Proposition (portmanteau theorem).

For a sequence of random vectors $\{X_n\}$, $X_n \xrightarrow{d} X$ if and only if
$$E[f(X_n)] \to E[f(X)]$$
for every bounded and continuous functional $f$.

Transformation in Convergence

Continuous Mapping Theorem

Theorem (Continuous Mapping Theorem).

Let $g$ be a continuous function and $X_n$, $X$ be random vectors. Then we have:

  1. If $X_n \xrightarrow{a.s.} X$, then $g(X_n) \xrightarrow{a.s.} g(X)$.
  2. If $X_n \xrightarrow{p} X$, then $g(X_n) \xrightarrow{p} g(X)$.
  3. If $X_n \xrightarrow{d} X$, then $g(X_n) \xrightarrow{d} g(X)$.

Proof. Recall Introductory Analysis > Definition 8 (continuity), and suppose a function $g$ is continuous, i.e. for every $x$ and every $\varepsilon > 0$ there exists $\delta > 0$ such that $\|y - x\| < \delta$ implies $\|g(y) - g(x)\| < \varepsilon$.

  1. almost sure convergence
    Assume $X_n \xrightarrow{a.s.} X$, i.e. $P(\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}) = 1$. Note that by the continuity of $g$, for any $\omega$ in this convergence set and an arbitrary $\varepsilon > 0$, pick $\delta > 0$ such that $\|X_n(\omega) - X(\omega)\| < \delta$ implies $\|g(X_n(\omega)) - g(X(\omega))\| < \varepsilon$. Thus, we have $g(X_n(\omega)) \to g(X(\omega))$ on the same probability-one set. Therefore, we have $P(\{\omega : \lim_{n \to \infty} g(X_n(\omega)) = g(X(\omega))\}) = 1$, which means $g(X_n) \xrightarrow{a.s.} g(X)$.

  2. convergence in probability
    Assume $X_n \xrightarrow{p} X$, i.e. $P(\|X_n - X\| \geq \delta) \to 0$ for every $\delta > 0$. Since $g$ is (uniformly) continuous, for an arbitrary $\varepsilon > 0$ we can pick $\delta > 0$ such that $\|X_n - X\| < \delta$ implies $\|g(X_n) - g(X)\| < \varepsilon$, i.e. $\{\|g(X_n) - g(X)\| > \varepsilon\} \subseteq \{\|X_n - X\| \geq \delta\}$. Therefore, we have $P(\|g(X_n) - g(X)\| > \varepsilon) \leq P(\|X_n - X\| \geq \delta) \to 0$, which means $g(X_n) \xrightarrow{p} g(X)$.

  3. convergence in distribution
    Assume $X_n \xrightarrow{d} X$. Then by Proposition 22 (portmanteau theorem), it is equivalent to $E[f(X_n)] \to E[f(X)]$ for every bounded and continuous functional $f$. Now, using the portmanteau theorem again, it suffices to prove that $E[f(g(X_n))] \to E[f(g(X))]$ for every such $f$; note that the composite function $f \circ g$ is also a bounded continuous functional, since $g$ is continuous. Therefore we have $g(X_n) \xrightarrow{d} g(X)$. $\square$

Example (example of CMT).
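A standard illustration (supplied here, not from the notes): if $Z_n \xrightarrow{d} Z \sim N(0,1)$, then the CMT with $g(x) = x^2$ gives $Z_n^2 \xrightarrow{d} \chi^2_1$. A seeded Monte Carlo sketch compares the moments of squared standard normals with those of $\chi^2_1$:

```python
import random

random.seed(42)

# squared draws from N(0,1) should behave like chi-squared with 1 d.f.:
# the CMT with g(x) = x**2 applied to Z_n ->d N(0,1)
draws = [random.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]

mean_sq = sum(draws) / len(draws)                     # E[chi2_1] = 1
below_one = sum(d <= 1 for d in draws) / len(draws)   # P(chi2_1 <= 1) = P(|Z| <= 1) ~ 0.6827
```

Matching the mean and the CDF value at $1$ is of course only a spot-check of distributional convergence, not a proof.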

Slutsky Theorem

Theorem (Slutsky theorem).

Let $c$ be a constant, and let $X_n$, $Y_n$, $X$ be random vectors. If $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{p} c$, and $g$ is a continuous function at all points $(x, c)$, then we have
$$g(X_n, Y_n) \xrightarrow{d} g(X, c).$$

Proof. Since $Y_n \xrightarrow{p} c$, by Lemma 20 (plim implies dlim) we have a random vector converging in distribution:
$$(X_n, Y_n) \xrightarrow{d} (X, c).$$
Thus, by Theorem 23 (Continuous Mapping Theorem), we have $g(X_n, Y_n) \xrightarrow{d} g(X, c)$, since $g$ is continuous. $\square$

Remark (alternative form of Slutsky).

Let $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, where $c$ is a constant. Then

  1. $X_n + Y_n \xrightarrow{d} X + c$,
  2. $Y_n X_n \xrightarrow{d} c X$,
  3. $Y_n^{-1} X_n \xrightarrow{d} c^{-1} X$ if $c$ is invertible.

Example (example of Slutsky).
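A standard illustration (supplied here, not from the notes) is the feasible t-ratio: if $\sqrt{n}(\bar X_n - \mu)/\sigma \xrightarrow{d} N(0,1)$ and $\hat\sigma_n \xrightarrow{p} \sigma$, then Slutsky gives $\sqrt{n}(\bar X_n - \mu)/\hat\sigma_n \xrightarrow{d} N(0,1)$. A seeded sketch with $U(0,1)$ data checks the plim ingredient numerically:

```python
import random

random.seed(7)

# U(0,1) data: mu = 0.5, sigma = sqrt(1/12); the sample sd is consistent,
# so the feasible t-ratio inherits the N(0,1) limit by Slutsky
n = 100_000
xs = [random.random() for _ in range(n)]

xbar = sum(xs) / n
sigma_hat = (sum((x - xbar) ** 2 for x in xs) / (n - 1)) ** 0.5
sigma = (1 / 12) ** 0.5                         # true sd of U(0, 1)

t_ratio = n ** 0.5 * (xbar - 0.5) / sigma_hat   # approximately one N(0,1) draw
```

Replacing the unknown $\sigma$ by $\hat\sigma_n$ is exactly the step that requires item 3 of the remark above.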

Joint Convergence

(Diagram: $\xrightarrow{a.s.}$ and $\xrightarrow{m.s.}$ each imply $\xrightarrow{p}$, which in turn implies $\xrightarrow{d}$; between $\xrightarrow{a.s.}$ and $\xrightarrow{m.s.}$ there is no implication in general.)
Proposition (joint dlim is dlim).

For $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$, we have
$$(X_n, Y_n) \xrightarrow{d} (X, c),$$
provided $c$ is constant.

Note that this proposition is a direct corollary of Theorem 25 (Slutsky theorem).

Proposition (joint plim is plim).

For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, we have
$$(X_n, Y_n) \xrightarrow{p} (X, Y).$$

Proof. Note that $\|(X_n, Y_n) - (X, Y)\| \leq \|X_n - X\| + \|Y_n - Y\|$, so
$$P(\|(X_n, Y_n) - (X, Y)\| > \varepsilon) \leq P\left(\left\{\|X_n - X\| > \frac{\varepsilon}{2}\right\} \cup \left\{\|Y_n - Y\| > \frac{\varepsilon}{2}\right\}\right) \leq P\left(\|X_n - X\| > \frac{\varepsilon}{2}\right) + P\left(\|Y_n - Y\| > \frac{\varepsilon}{2}\right) \to 0,$$
where the second inequality holds by sub-additivity of the probability measure. $\square$