Law of Large Numbers

Convergence of Random Variables

Definition (converge of random variables).

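Let $\{X_n\}_{n \ge 1}$ be a sequence of random variables on a probability space $(\Omega, \mathcal{F}, P)$, let $X$ be a random variable on the same space, and denote $\|X\|_p = (E|X|^p)^{1/p}$.

  1. We say $X_n$ converges to $X$ in $L^p$ if $\|X_n - X\|_p \to 0$, and denote this as $X_n \to X$ in $L^p$.
  2. We say $X_n$ converges to $X$ almost surely in $(\Omega, \mathcal{F}, P)$ if $P(\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}) = 1$, and denote this as $X_n \xrightarrow{a.s.} X$.
  3. We say $X_n$ converges in probability to $X$ in $(\Omega, \mathcal{F}, P)$ if $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$ for every $\varepsilon > 0$, and denote this as $X_n \xrightarrow{p} X$.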

Proposition (lim and aslim implies plim).

  1. If $X_n \to X$ in $L^p$ for $p \ge 1$, then $X_n \xrightarrow{p} X$.
  2. If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{p} X$.

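Proof. (1) Fix $\varepsilon > 0$. By Definition 1 (converge of random variables), we have

$$P(|X_n - X| > \varepsilon) \le \frac{E|X_n - X|^p}{\varepsilon^p},$$

where the inequality holds by Basic Definitions in Probability > Remark 27 (common form of Chebyshev). Since $X_n \to X$ in $L^p$, we have $E|X_n - X|^p \to 0$ as $n \to \infty$. Thus we have the desired result.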

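(2) Put $A = \{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}$. Since $X_n \xrightarrow{a.s.} X$, by Definition 1 (converge of random variables), we have $P(A) = 1$. Then, for $\omega \in A$ and any fixed $\varepsilon > 0$, we have $1_{\{|X_n - X| > \varepsilon\}}(\omega) \to 0$, and thus $1_{\{|X_n - X| > \varepsilon\}} \to 0$ almost surely. Also, note that for any $n$, we have $|1_{\{|X_n - X| > \varepsilon\}}| \le 1$. Then by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $P(|X_n - X| > \varepsilon) = E\,1_{\{|X_n - X| > \varepsilon\}} \to 0$, which means $X_n \xrightarrow{p} X$.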
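
The Chebyshev step in part (1) can be checked numerically. The following is a minimal sketch rather than part of the original argument: it assumes the toy model $X_n - X = G/\sqrt{n}$ with $G$ standard normal (so that $E|X_n - X|^2 = 1/n$) and compares the empirical frequency of $\{|X_n - X| > \varepsilon\}$ with the empirical second moment divided by $\varepsilon^2$. The function name `tail_and_bound` and its parameters are illustrative choices.

```python
import random


def tail_and_bound(n, eps=0.5, trials=20_000, seed=0):
    """Estimate P(|D_n| > eps) and E[D_n^2] / eps^2 for D_n = G / sqrt(n).

    D_n plays the role of X_n - X; G ~ N(0, 1) is an assumed toy model.
    """
    rng = random.Random(seed)
    exceed = 0
    second_moment = 0.0
    for _ in range(trials):
        d = rng.gauss(0.0, 1.0) / n ** 0.5
        exceed += abs(d) > eps          # indicator of the tail event
        second_moment += d * d          # accumulates the p = 2 moment
    tail = exceed / trials
    bound = (second_moment / trials) / eps ** 2
    return tail, bound


for n in (4, 16, 64, 256):
    tail, bound = tail_and_bound(n)
    print(f"n={n:4d}  empirical tail={tail:.4f}  Chebyshev bound={bound:.4f}")
```

Both columns shrink as $n$ grows, and the empirical tail never exceeds the empirical bound, since the indicator of $\{|d| > \varepsilon\}$ is at most $d^2/\varepsilon^2$ sample by sample.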

Exercise (aslim and continuous function).

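If $f : \mathbb{R} \to \mathbb{R}$ is continuous and $X_n \xrightarrow{a.s.} X$, then we have $f(X_n) \xrightarrow{a.s.} f(X)$.

Proof. Let $A = \{\omega : X_n(\omega) \to X(\omega)\}$ and fix an arbitrary $\omega \in A$. Since $f$ is continuous and $X_n(\omega) \to X(\omega)$ as $n \to \infty$, we have $f(X_n(\omega)) \to f(X(\omega))$. Therefore $A \subset \{\omega : f(X_n(\omega)) \to f(X(\omega))\}$, so $P(f(X_n) \to f(X)) \ge P(A) = 1$, meaning that $f(X_n) \xrightarrow{a.s.} f(X)$.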

Lemma (linearity of convergence).

Let $\{X_n\}$ and $\{Y_n\}$ be sequences of random variables:

  1. If and , then .
  2. If and , then .
  3. If and , then .

Proof. We only prove the first case; the other cases follow directly from Exercise 3 (aslim and continuous function).

Lemma (aslim inequality preserved under limit).

Let $X_n \le Y_n$ almost surely for all $n$, and suppose $X_n \xrightarrow{a.s.} X$ and $Y_n \xrightarrow{a.s.} Y$. Then we have $X \le Y$ almost surely.

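Proof. Since $X_n \le Y_n$ almost surely, we have $P(X_n \le Y_n) = 1$ for all $n$, and hence $P(\bigcap_{n} \{X_n \le Y_n\}) = 1$ as a countable intersection of almost sure events.

Also, on the almost sure event where both sequences converge, for an arbitrary $\varepsilon > 0$ there exists some $N$ such that $|X_n - X| < \varepsilon$ and $|Y_n - Y| < \varepsilon$ for any $n \ge N$.

Then, for $n \ge N$, we have $X - \varepsilon < X_n \le Y_n < Y + \varepsilon$, that is, $X < Y + 2\varepsilon$. Since $\varepsilon$ is arbitrary, this means that $X \le Y$ almost surely.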

Infinitely Often and Events Eventually

Definition (limsup and liminf in set space).

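Let $\{A_n\}_{n \ge 1} \subset \mathcal{F}$. We define

$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m, \qquad \liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m.$$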

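Definition (infinitely often and events eventually).

  1. We say $A_n$ infinitely often (i.o.) for $\{A_n \text{ i.o.}\} = \limsup_{n} A_n = \{\omega : \omega \in A_n \text{ for infinitely many } n\}$.
  2. We say $A_n$ eventually (e.v.) for $\{A_n \text{ e.v.}\} = \liminf_{n} A_n = \{\omega : \omega \in A_n \text{ for all but finitely many } n\}$.

Example (sufficient condition for a.s. convergence).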

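Define $A_n(\varepsilon) = \{|X_n - X| > \varepsilon\}$ and suppose $P(A_n(\varepsilon) \text{ i.o.}) = 0$ for every $\varepsilon > 0$. Then we have $X_n \xrightarrow{a.s.} X$.

Proof. Put $B = \bigcup_{k \ge 1} \{A_n(1/k) \text{ i.o.}\}$. Then as $P(A_n(1/k) \text{ i.o.}) = 0$ for each $k$, we have $P(B) = 0$. Thus for $\omega \in B^c$ and each $k$, we have $|X_n(\omega) - X(\omega)| \le 1/k$ for any $n \ge N_k(\omega)$, where $N_k(\omega) < \infty$. This means $X_n(\omega) \to X(\omega)$ if $\omega \in B^c$. As $P(B^c) = 1$, we thus have $P(\lim_n X_n = X) = 1$, and by Definition 1 (converge of random variables), we have $X_n \xrightarrow{a.s.} X$.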

Remark (i.o. complement and e.v.).

From Definition 7 (infinitely often and events eventually), we have the following relationships.

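  1. $\{A_n \text{ i.o.}\}^c = \{A_n^c \text{ e.v.}\}$.
  2. $\{A_n \text{ e.v.}\}^c = \{A_n^c \text{ i.o.}\}$.

Proof. It suffices to prove only the first relationship, since the second follows by applying the first to the events $A_n^c$. Note that from the definition and De Morgan's laws,

$$\{A_n \text{ i.o.}\}^c = \Big(\bigcap_{n \ge 1} \bigcup_{m \ge n} A_m\Big)^c = \bigcup_{n \ge 1} \bigcap_{m \ge n} A_m^c = \{A_n^c \text{ e.v.}\},$$

which completes the proof.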

Borel-Cantelli Lemma

Lemma (first Borel-Cantelli lemma).

If the sum of the probabilities of the events $\{A_n\}$ is finite, then the probability that infinitely many of them occur is zero, that is,

$$\sum_{n=1}^{\infty} P(A_n) < \infty \implies P(A_n \text{ i.o.}) = 0.$$

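Proof. From Definition 6 (limsup and liminf in set space), we have $\{A_n \text{ i.o.}\} = \bigcap_{n \ge 1} B_n$ with $B_n = \bigcup_{m \ge n} A_m$, and note that the sequence of events $\{B_n\}$ is non-increasing, as $B_{n+1} \subset B_n$. Then, by continuity from above in Measure Theoretic Preliminaries > Theorem 18 (properties of measure),

$$P(A_n \text{ i.o.}) = \lim_{n \to \infty} P(B_n),$$

and by subadditivity in Measure Theoretic Preliminaries > Theorem 18 (properties of measure), we have

$$P(B_n) \le \sum_{m \ge n} P(A_m) \to 0 \quad (n \to \infty),$$

where the limit $\lim_n P(B_n)$ always exists in $[0, 1]$ because $P(B_n)$ is non-increasing, and the right-hand side vanishes as the tail of a convergent series. Thus we have the desired result.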

Theorem (second Borel-Cantelli lemma).

If the events $\{A_n\}$ are independent and $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P(A_n \text{ i.o.}) = 1$.

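Proof. Note that

$$\{A_n \text{ i.o.}\}^c = \bigcup_{n \ge 1} \bigcap_{m \ge n} A_m^c.$$

Thus, it suffices to show that $P(\bigcap_{m \ge n} A_m^c) = 0$ for any $n$, as this gives $P(\{A_n \text{ i.o.}\}^c) \le \sum_{n \ge 1} P(\bigcap_{m \ge n} A_m^c) = 0$.

Then for $N \ge n$, by independence and the inequality $1 - x \le e^{-x}$, we have

$$P\Big(\bigcap_{m=n}^{N} A_m^c\Big) = \prod_{m=n}^{N} (1 - P(A_m)) \le \exp\Big(-\sum_{m=n}^{N} P(A_m)\Big).$$

By taking limits as $N \to \infty$ on both sides, we have

$$P\Big(\bigcap_{m \ge n} A_m^c\Big) = \lim_{N \to \infty} P\Big(\bigcap_{m=n}^{N} A_m^c\Big) \le \lim_{N \to \infty} \exp\Big(-\sum_{m=n}^{N} P(A_m)\Big) = 0,$$

where the first equality holds since $\bigcap_{m=n}^{N} A_m^c$ is decreasing as $N \to \infty$ and by Measure Theoretic Preliminaries > Theorem 18 (properties of measure). Thus we have the desired result.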
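
The contrast between the two Borel-Cantelli lemmas can be seen in a small simulation. This is only a sketch under assumed choices that do not appear in the text above: independent events with $P(A_n) = 1/n^2$ (summable) versus $P(A_n) = 1/n$ (not summable), truncated at a finite horizon `n_max`. The helper names `count_occurrences` and `average_count` are hypothetical.

```python
import random


def count_occurrences(prob, n_max, rng):
    """Count how many of the independent events A_1..A_{n_max} occur, where P(A_n) = prob(n)."""
    return sum(rng.random() < prob(n) for n in range(1, n_max + 1))


def average_count(prob, n_max=20_000, runs=20, seed=0):
    """Average number of occurring events over several independent runs."""
    rng = random.Random(seed)
    return sum(count_occurrences(prob, n_max, rng) for _ in range(runs)) / runs


summable = average_count(lambda n: 1.0 / n ** 2)   # sum of P(A_n) is finite
divergent = average_count(lambda n: 1.0 / n)       # sum of P(A_n) is infinite

print("average count with P(A_n) = 1/n^2:", summable)
print("average count with P(A_n) = 1/n:  ", divergent)
```

With summable probabilities the count stays close to $\sum_n 1/n^2 \approx 1.64$ however large `n_max` is, while in the harmonic case it keeps growing like $\log$ of `n_max`. A finite simulation can only suggest, not prove, the "infinitely often" statements.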

Subsequent Convergence and Convergence

Remark (subsequent convergence and convergence).

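Let $\{x_n\}$ be a sequence of elements in a topological space. If every subsequence $\{x_{n_k}\}$ has a further subsequence $\{x_{n_{k_j}}\}$ that converges to $x$, then we have $x_n \to x$.

Proof. Suppose, for contradiction, that $x_n \not\to x$. Then there exists some open neighborhood $U$ of $x$ such that $x_n \notin U$ for infinitely many $n$, which implies that for every $k$ there exists $n_k > n_{k-1}$ such that $x_{n_k} \notin U$. Now let $\{x_{n_{k_j}}\}$ be a further subsequence of $\{x_{n_k}\}$. Then, for each $j$, we have $x_{n_{k_j}} \notin U$, meaning that $x_{n_{k_j}} \not\to x$, thus a contradiction. Therefore $x_n \to x$ must hold.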

Theorem (aslim subsequence and plim sequence).

$X_n \xrightarrow{p} X$ if and only if every subsequence $\{X_{n_k}\}$ has a further subsequence $\{X_{n_{k_j}}\}$ that converges to $X$ almost surely.

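Proof. ($\Rightarrow$) Assume $X_n \xrightarrow{p} X$, and let $\{X_{n_k}\}$ be a subsequence. Then, since we have $X_{n_k} \xrightarrow{p} X$, we have $P(|X_{n_k} - X| > \varepsilon) \to 0$ for every $\varepsilon > 0$. Next, for each $j$, we can choose $k_j > k_{j-1}$ such that $P(|X_{n_{k_j}} - X| > 1/j) \le 2^{-j}$. Now let $A_j = \{|X_{n_{k_j}} - X| > 1/j\}$. Then we have $\sum_{j} P(A_j) \le \sum_{j} 2^{-j} < \infty$. Thus by Lemma 10 (first Borel-Cantelli lemma), we have $P(A_j \text{ i.o.}) = 0$. Then by Remark 9 (i.o. complement and e.v.), for almost every $\omega$, we have $|X_{n_{k_j}}(\omega) - X(\omega)| \le 1/j$ except for finitely many $j$, meaning that $X_{n_{k_j}} \xrightarrow{a.s.} X$.

($\Leftarrow$) Assume that for any subsequence $\{X_{n_k}\}$, there exists a further subsequence $X_{n_{k_j}} \xrightarrow{a.s.} X$. Then by Proposition 2 (lim and aslim implies plim), we have $X_{n_{k_j}} \xrightarrow{p} X$, meaning $P(|X_{n_{k_j}} - X| > \varepsilon) \to 0$ for every $\varepsilon > 0$. Now fix $\varepsilon > 0$ and let $a_n = P(|X_n - X| > \varepsilon)$. Then, for any subsequence $\{a_{n_k}\}$, there exists a further subsequence $a_{n_{k_j}} \to 0$ as $j \to \infty$. Thus by Remark 12 (subsequent convergence and convergence), we have $a_n \to 0$, implying $X_n \xrightarrow{p} X$.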

Corollary (composition of convergence).

For a continuous function $f$, if the random variables $X_n \xrightarrow{p} X$, then we have $f(X_n) \xrightarrow{p} f(X)$. Furthermore, if $f$ is bounded, then $E f(X_n) \to E f(X)$.

Uncorrelated Random Variables

Definition (uncorrelated random variable).

A family of random variables $\{X_i\}_{i \in I}$, for given index set $I$ with $E X_i^2 < \infty$, is said to be uncorrelated if $E(X_i X_j) = E X_i \, E X_j$ whenever $i \neq j$.

Remark (independent implies uncorrelated).

Remark (inequalities on absolute value).

Note that for all .

Lemma (variance of uncorrelated random variables).

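Let $X_1, \dots, X_n$ be uncorrelated random variables with $E X_i^2 < \infty$ for all $i$. Then we have

$$\operatorname{Var}(X_1 + \cdots + X_n) = \operatorname{Var}(X_1) + \cdots + \operatorname{Var}(X_n).$$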

Weak Law of Large Numbers

Weak Laws in L2 Space

Theorem (weak law of large numbers in L2 space).

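Let $X_1, X_2, \dots$ be uncorrelated random variables with $E X_i = \mu$ for all $i$ and $\operatorname{Var}(X_i) \le C < \infty$. Then, for $S_n = X_1 + \cdots + X_n$, we have $S_n/n \to \mu$ in $L^2$, and in probability as well.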
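
The $L^2$ statement can be illustrated numerically: for uncorrelated variables with common mean $\mu$ and common variance $\sigma^2$, the mean-square error of $S_n/n$ is $\sigma^2/n$. The sketch below assumes i.i.d. Uniform(0, 1) draws (independent, hence uncorrelated, with $\mu = 1/2$ and $\sigma^2 = 1/12$); this choice and the function name `mean_square_error` are illustrative, not taken from the notes.

```python
import random


def mean_square_error(n, reps=200, seed=0):
    """Monte Carlo estimate of E[(S_n/n - mu)^2] for i.i.d. Uniform(0, 1) draws."""
    rng = random.Random(seed)
    mu = 0.5
    total = 0.0
    for _ in range(reps):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        total += (sample_mean - mu) ** 2
    return total / reps


for n in (10, 100, 1000):
    # Second column: Monte Carlo estimate; third column: exact value sigma^2 / n.
    print(n, mean_square_error(n), 1.0 / (12 * n))
```

The estimate should track $1/(12n)$ up to Monte Carlo noise, which is the quantity Chebyshev's inequality then turns into convergence in probability.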

Truncation Method

Lemma (calculation of pth moment).

For a random variable $X$ and $p > 0$, we have

$$E|X|^p = \int_0^{\infty} p\, y^{p-1} P(|X| > y)\, dy.$$
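
As a sanity check, assuming the lemma refers to the standard tail formula $E|X|^p = \int_0^\infty p\,y^{p-1} P(|X| > y)\,dy$, the integral can be evaluated numerically for a distribution whose moments are known. The sketch below uses $X \sim \mathrm{Exp}(1)$, for which $P(X > y) = e^{-y}$ and $E X^p = \Gamma(p+1)$. The function names, the midpoint rule, and the cutoff `upper` are illustrative assumptions.

```python
import math


def tail_moment(p, tail, upper=50.0, steps=50_000):
    """Midpoint-rule approximation of the integral of p * y**(p-1) * tail(y) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += p * y ** (p - 1) * tail(y)
    return total * h


def exp_tail(y):
    """P(X > y) for X ~ Exp(1)."""
    return math.exp(-y)


for p in (1, 2, 3):
    # Compare the tail integral with the exact moment Gamma(p + 1).
    print(p, tail_moment(p, exp_tail), math.gamma(p + 1))
```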

Lemma (first condition for truncation convergence).

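For a random variable $X$, if $E|X| < \infty$, then $x P(|X| > x) \to 0$ as $x \to \infty$.

Proof. Define a function $g_x = |X| 1_{\{|X| > x\}}$. Then we have the relationship $x 1_{\{|X| > x\}} \le g_x \le |X|$. By integrating both sides with respect to $P$, by monotonicity of Measure Theoretic Preliminaries > Theorem 67 (properties of integrability), we have $x P(|X| > x) \le E g_x \le E|X| < \infty$ since $E|X| < \infty$.
Also, because $|X| < \infty$ almost surely, we have $g_x \to 0$ almost surely as $x \to \infty$. Thus by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $x P(|X| > x) \le E g_x \to 0$ as $x \to \infty$.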

Lemma (second condition for truncation convergence).

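For a random variable $X$, if $x P(|X| > x) \to 0$ as $x \to \infty$, then we have $E[X^2 1_{\{|X| \le n\}}]/n \to 0$ as $n \to \infty$.

Proof. Put $Y_n = X 1_{\{|X| \le n\}}$. Then we have $P(|Y_n| > y) = 0$ for $y \ge n$, and for $y < n$, we have $P(|Y_n| > y) = P(y < |X| \le n) \le P(|X| > y)$.

Now, using Lemma 20 (calculation of pth moment), we have

$$E Y_n^2 = \int_0^{\infty} 2y\, P(|Y_n| > y)\, dy \le \int_0^{n} 2y\, P(|X| > y)\, dy.$$

Now let $g(y) = 2y\, P(|X| > y)$. Then, by the assumption of $y P(|X| > y) \to 0$, we have the following facts:

  1. $g$ is bounded, i.e. $\sup_{y \ge 0} g(y) = M < \infty$, where $0 \le g(y) \le 2y$.
  2. $g(y) \to 0$ as $y \to \infty$, i.e. for any $\varepsilon > 0$ there exists $K$ such that $g(y) < \varepsilon$ for all $y \ge K$.

Thus, for a large enough $n$, we have

$$\frac{E Y_n^2}{n} \le \frac{1}{n} \int_0^{K} g(y)\, dy + \frac{1}{n} \int_K^{n} g(y)\, dy \le \frac{K M}{n} + \varepsilon,$$

thus we can conclude that $E Y_n^2 / n \to 0$.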

Weak Law of Large Numbers

Definition 5 (identical distribution).

Let $X$ and $Y$ both be measurable functions. We say $X$ and $Y$ are equal in distribution, and denote this as $X \overset{d}{=} Y$, if their distributions are identical, i.e., $P(X \in B) = P(Y \in B)$ for all Borel sets $B$.

Definition (independently and identically distributed).

The random variables $X_1, X_2, \dots$ are called independently and identically distributed (i.i.d.) if they are independent and $P(X_i \in B) = P(X_1 \in B)$ for all $i$ and all Borel sets $B$.

Lemma (weak law of large numbers).

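Let $X_1, X_2, \dots$ be i.i.d. random variables with $x P(|X_i| > x) \to 0$ as $x \to \infty$ for any $i$. Now put $S_n = X_1 + \cdots + X_n$ and $\mu_n = E[X_1 1_{\{|X_1| \le n\}}]$. Then $S_n/n - \mu_n \xrightarrow{p} 0$.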

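Proof. Put $X_{n,k} = X_k 1_{\{|X_k| \le n\}}$ for $1 \le k \le n$. Then, as the $X_k$ are identically distributed, the $X_{n,k}$ are identically distributed. Thus by letting $\bar{S}_n = X_{n,1} + \cdots + X_{n,n}$, we have $E \bar{S}_n = n \mu_n$. We need to show that $P(|S_n/n - \mu_n| > \varepsilon) \to 0$ as $n \to \infty$ for every $\varepsilon > 0$.

Note that since $\{|S_n/n - \mu_n| > \varepsilon\} \subset \{S_n \neq \bar{S}_n\} \cup \{|\bar{S}_n/n - \mu_n| > \varepsilon\}$, we have

$$P(|S_n/n - \mu_n| > \varepsilon) \le P(S_n \neq \bar{S}_n) + P(|\bar{S}_n/n - \mu_n| > \varepsilon),$$

thus we need to check that the two components both converge to zero. For the first, $P(S_n \neq \bar{S}_n) \le \sum_{k=1}^{n} P(|X_k| > n) = n P(|X_1| > n) \to 0$ by the assumption. For the second, by Chebyshev's inequality and independence, $P(|\bar{S}_n/n - \mu_n| > \varepsilon) \le \operatorname{Var}(\bar{S}_n)/(n^2 \varepsilon^2) = \operatorname{Var}(X_{n,1})/(n \varepsilon^2) \le E X_{n,1}^2/(n \varepsilon^2) \to 0$ by the lemma on the second condition for truncation convergence.

Combining the results, we have $S_n/n - \mu_n \xrightarrow{p} 0$.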

Corollary (weak law of large numbers).

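Let $X_1, X_2, \dots$ be i.i.d. random variables with $E|X_i| < \infty$ for any $i$. Now put $S_n = X_1 + \cdots + X_n$ and $\mu = E X_1$. Then $S_n/n \xrightarrow{p} \mu$.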

Strong Law of Large Numbers

Conditions For Strong Laws

Theorem (first strong law of large numbers).

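Let $X_1, X_2, \dots$ be i.i.d. with $E X_i = \mu$ and $E X_i^4 < \infty$. If $S_n = X_1 + \cdots + X_n$, we have $S_n/n \xrightarrow{a.s.} \mu$.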

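Proof. Note that for $X_i' = X_i - \mu$, we have $E X_i' = 0$ and $\frac{1}{n} \sum_{i=1}^{n} X_i' = S_n/n - \mu$, meaning that we can assume $\mu = 0$ without loss of generality.

Expanding $E S_n^4$, every term that contains some $X_i$ to the first power vanishes by independence and $E X_i = 0$, so $E S_n^4 = n E X_1^4 + 3n(n-1)(E X_1^2)^2 \le C n^2$ for some constant $C < \infty$. This gives us, by Chebyshev's inequality, $P(|S_n| > n\varepsilon) \le E S_n^4/(n\varepsilon)^4 \le C/(n^2 \varepsilon^4)$ for every $\varepsilon > 0$. Then by Lemma 10 (first Borel-Cantelli lemma), we have $P(|S_n| > n\varepsilon \text{ i.o.}) = 0$, implying that $S_n/n \xrightarrow{a.s.} 0$.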

Theorem (necessary condition for strong law).

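Let $X_1, X_2, \dots$ be i.i.d. with $E|X_i| = \infty$. Also, put $S_n = X_1 + \cdots + X_n$. Then we have the following results.

  1. $P(|X_n| \ge n \text{ i.o.}) = 1$.
  2. $P(\lim_{n \to \infty} S_n/n \text{ exists in } (-\infty, \infty)) = 0$.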

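Proof. (1) First, using Lemma 20 (calculation of pth moment), we have

$$\sum_{n=0}^{\infty} P(|X_1| > n) \ge \int_0^{\infty} P(|X_1| > x)\, dx = E|X_1| = \infty,$$

where the inequality holds as $x \mapsto P(|X_1| > x)$ is a decreasing function and the last equality holds by the assumption $E|X_1| = \infty$. Since the $X_n$ are identically distributed, this gives $\sum_{n \ge 1} P(|X_n| \ge n) = \infty$. Then, by Theorem 11 (second Borel-Cantelli lemma), we have $P(|X_n| \ge n \text{ i.o.}) = 1$, as the $X_n$ are i.i.d.

(2) First, remark that

$$\frac{S_n}{n} - \frac{S_{n+1}}{n+1} = \frac{S_n}{n(n+1)} - \frac{X_{n+1}}{n+1}.$$

Now let $C = \{\omega : \lim_{n} S_n(\omega)/n \text{ exists in } (-\infty, \infty)\}$. As we have $P(|X_n| \ge n \text{ i.o.}) = 1$ from the proof of (1), it suffices to show that $C \cap \{|X_n| \ge n \text{ i.o.}\} = \emptyset$, which will give us $P(C) = 0$.

Then for $\omega \in C \cap \{|X_n| \ge n \text{ i.o.}\}$, we have $|S_n/n - S_{n+1}/(n+1)| > 1/2$ infinitely often, since $S_n/(n(n+1)) \to 0$ as $n \to \infty$ (as $\omega \in C$), and $|X_{n+1}|/(n+1) \ge 1$ infinitely often. This contradicts the convergence of $S_n(\omega)/n$. Thus we have $C \cap \{|X_n| \ge n \text{ i.o.}\} = \emptyset$, which is the desired result.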

Note: $\omega \in C$ means that $S_n(\omega)/n$ converges, i.e. it is a Cauchy sequence. Thus what we need to show is that the two conditions contradict each other: on $\{|X_n| \ge n \text{ i.o.}\}$ the sequence $S_n/n$ is not Cauchy, so it does not converge. This is what we show on the last line of the proof.
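
The first claim can be illustrated with a distribution that has no finite mean. The sketch below assumes i.i.d. standard Cauchy variables, sampled by the inverse-CDF formula $\tan(\pi(U - 1/2))$ with $U$ uniform on $[0, 1)$, and counts how often $|X_n| \ge n$ occurs up to a finite horizon. The Cauchy choice and the helper names are illustrative assumptions, not part of the theorem.

```python
import math
import random


def count_large_values(n_max, rng):
    """Count indices n <= n_max with |X_n| >= n for i.i.d. standard Cauchy X_n."""
    count = 0
    for n in range(1, n_max + 1):
        x = math.tan(math.pi * (rng.random() - 0.5))  # inverse-CDF sampling
        count += abs(x) >= n
    return count


def average_count(n_max, runs=50, seed=0):
    """Average the count over several independent runs to reduce noise."""
    rng = random.Random(seed)
    return sum(count_large_values(n_max, rng) for _ in range(runs)) / runs


for n_max in (100, 1_000, 10_000):
    print(n_max, average_count(n_max))
```

The average count keeps increasing with the horizon, roughly like $(2/\pi)\log$ of `n_max`, which is consistent with the events $\{|X_n| \ge n\}$ occurring infinitely often. A finite simulation is only suggestive here.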

Truncation Method

Lemma (sufficient condition for the strong law).

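Let $X_1, X_2, \dots$ be i.i.d. where $E|X_i| < \infty$ and $E X_i = \mu$. Then by letting $Y_k = X_k 1_{\{|X_k| \le k\}}$ and $T_n = Y_1 + \cdots + Y_n$, we have $(S_n - T_n)/n \xrightarrow{a.s.} 0$, where $S_n = X_1 + \cdots + X_n$. In particular, $S_n/n \xrightarrow{a.s.} \mu$ as soon as $T_n/n \xrightarrow{a.s.} \mu$.

Proof. Note that $|Y_k| \le |X_k|$ and $Y_k = X_k$ if $|X_k| \le k$. Since the $X_k$ are i.i.d., we have

$$\sum_{k=1}^{\infty} P(X_k \neq Y_k) = \sum_{k=1}^{\infty} P(|X_1| > k) \le \int_0^{\infty} P(|X_1| > t)\, dt = E|X_1| < \infty,$$

where the first inequality holds as $t \mapsto P(|X_1| > t)$ is a decreasing function and the last equality holds by Lemma 20 (calculation of pth moment). Then by Lemma 10 (first Borel-Cantelli lemma), we have $P(X_k \neq Y_k \text{ i.o.}) = 0$. Thus, for almost every $\omega$, we have $X_k(\omega) = Y_k(\omega)$ for all but finitely many $k$, and for some $R(\omega) < \infty$, we have $|S_n(\omega) - T_n(\omega)| \le R(\omega)$ for all $n$. Therefore, we have $|S_n - T_n|/n \le R/n \to 0$ as $n \to \infty$. This gives us $(S_n - T_n)/n \xrightarrow{a.s.} 0$, meaning that $S_n/n$ and $T_n/n$ have the same almost sure limit whenever either exists, which concludes the proof.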
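
The key estimate in this truncation step is that $\sum_k P(|X_1| > k)$ is bounded by $E|X_1|$, so only finitely many $X_k$ are altered by truncation. The following sketch checks this for an assumed example, $X_k \sim \mathrm{Exp}(1)$ with $P(X_1 > k) = e^{-k}$ and $E|X_1| = 1$, and lists the indices where $Y_k = X_k 1_{\{|X_k| \le k\}}$ differs from $X_k$ along one simulated path. The distribution and the name `mismatch_indices` are illustrative choices.

```python
import math
import random


def mismatch_indices(n_max, seed=0):
    """Indices k <= n_max where Y_k = X_k * 1{|X_k| <= k} differs from X_k, for X_k ~ Exp(1)."""
    rng = random.Random(seed)
    return [k for k in range(1, n_max + 1) if rng.expovariate(1.0) > k]


# Sum of P(X_1 > k) over k >= 1 for Exp(1), truncated where terms are negligible.
tail_sum = sum(math.exp(-k) for k in range(1, 200))
mean = 1.0  # E|X_1| for Exp(1)

print("sum of tail probabilities:", tail_sum, "<= E|X_1| =", mean)
print("indices with X_k != Y_k:", mismatch_indices(100_000))
```

Any mismatches are confined to the first few indices, so $S_n - T_n$ stays bounded along the path and vanishes after division by $n$.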

Lemma (bounded sum of variances).

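Let $X_1, X_2, \dots$ be i.i.d. where $E|X_i| < \infty$ and $E X_i = \mu$. Then by letting $Y_k = X_k 1_{\{|X_k| \le k\}}$, we have

$$\sum_{k=1}^{\infty} \frac{\operatorname{Var}(Y_k)}{k^2} \le 4 E|X_1| < \infty.$$

Proof. Using Lemma 20 (calculation of pth moment), we have

$$\operatorname{Var}(Y_k) \le E Y_k^2 = \int_0^{\infty} 2y\, P(|Y_k| > y)\, dy \le \int_0^{k} 2y\, P(|X_1| > y)\, dy.$$

Then, using Measure Theoretic Preliminaries > Theorem 75 (Fubini theorem), we have

$$\sum_{k=1}^{\infty} \frac{E Y_k^2}{k^2} \le \sum_{k=1}^{\infty} k^{-2} \int_0^{\infty} 1_{\{y < k\}}\, 2y\, P(|X_1| > y)\, dy = \int_0^{\infty} \Big(2y \sum_{k > y} k^{-2}\Big) P(|X_1| > y)\, dy.$$

Now, note that if $y \ge 1$, then $2y \sum_{k > y} k^{-2} \le 2y/\lfloor y \rfloor \le 4$, and if $0 \le y < 1$, then $2y \sum_{k > y} k^{-2} \le 2\big(1 + \sum_{k \ge 2} k^{-2}\big) \le 4$, where $\lfloor y \rfloor$ denotes the integer part of $y$.

Thus we have

$$\sum_{k=1}^{\infty} \frac{E Y_k^2}{k^2} \le 4 \int_0^{\infty} P(|X_1| > y)\, dy = 4 E|X_1| < \infty$$

by the assumption.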

Strong Law of Large Numbers

Theorem (strong law of large numbers).

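Let $X_1, X_2, \dots$ be i.i.d. where $E|X_i| < \infty$ and $E X_i = \mu$. Then for $S_n = X_1 + \cdots + X_n$, we have $S_n/n \xrightarrow{a.s.} \mu$.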

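Proof. By Lemma 28 (sufficient condition for the strong law), it suffices to show that $T_n/n \xrightarrow{a.s.} \mu$, where $Y_k = X_k 1_{\{|X_k| \le k\}}$ and $T_n = Y_1 + \cdots + Y_n$.

Without loss of generality, we can assume that $X_i \ge 0$, since for each $i$, we have $X_i = X_i^+ - X_i^-$ with $E X_i^{\pm} \le E|X_i| < \infty$, and $\{X_i^+\}$ and $\{X_i^-\}$ are both i.i.d.

Below, we will show that

$$\frac{\mu}{\alpha} \le \liminf_{n \to \infty} \frac{T_n}{n} \le \limsup_{n \to \infty} \frac{T_n}{n} \le \alpha \mu \quad \text{a.s.}$$

for any $\alpha > 1$. Then, by Lemma 5 (aslim inequality preserved under limit), we can take limits $\alpha \downarrow 1$ on both sides so that $\mu \le \liminf_n T_n/n \le \limsup_n T_n/n \le \mu$ almost surely, and by Measure Theoretic Preliminaries > Remark 40 (limit, limsup, and liminf), we have $\lim_n T_n/n = \mu$ almost surely, which is the desired result.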

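Firstly, we show that $T_{k(n)}/k(n) \xrightarrow{a.s.} \mu$, where $k(n) = \lfloor \alpha^n \rfloor$. Put $T_{k(n)} = \sum_{m=1}^{k(n)} Y_m$ where $Y_m = X_m 1_{\{|X_m| \le m\}}$. Note that the $Y_m$ are still independent, and we can apply Lemma 18 (variance of uncorrelated random variables). Then, using Basic Definitions in Probability > Proposition 26 (Chebyshev's inequality) and Measure Theoretic Preliminaries > Theorem 75 (Fubini theorem), for any $\varepsilon > 0$ we have

$$\sum_{n=1}^{\infty} P(|T_{k(n)} - E T_{k(n)}| > \varepsilon k(n)) \le \varepsilon^{-2} \sum_{n=1}^{\infty} \frac{\operatorname{Var}(T_{k(n)})}{k(n)^2} = \varepsilon^{-2} \sum_{n=1}^{\infty} k(n)^{-2} \sum_{m=1}^{k(n)} \operatorname{Var}(Y_m) = \varepsilon^{-2} \sum_{m=1}^{\infty} \operatorname{Var}(Y_m) \sum_{n : k(n) \ge m} k(n)^{-2}.$$

Then, using $\lfloor \alpha^n \rfloor \ge \alpha^n/2$, we have $k(n)^{-2} \le 4 \alpha^{-2n}$ for $n \ge 1$, and using the geometric series, we have

$$\sum_{n : \alpha^n \ge m} \lfloor \alpha^n \rfloor^{-2} \le 4 \sum_{n : \alpha^n \ge m} \alpha^{-2n} \le 4 (1 - \alpha^{-2})^{-1} m^{-2},$$

as the first term is less than or equal to $m^{-2}$. (Note that $k(n) \ge m$ implies $\alpha^n \ge m$.)

Thus we have

$$\sum_{n=1}^{\infty} P(|T_{k(n)} - E T_{k(n)}| > \varepsilon k(n)) \le 4 (1 - \alpha^{-2})^{-1} \varepsilon^{-2} \sum_{m=1}^{\infty} \frac{\operatorname{Var}(Y_m)}{m^2},$$

and by Lemma 29 (bounded sum of variances), the right-hand side is finite. Then by Lemma 10 (first Borel-Cantelli lemma), we have $P(|T_{k(n)} - E T_{k(n)}| > \varepsilon k(n) \text{ i.o.}) = 0$, which means, as $\varepsilon$ is arbitrary, that $(T_{k(n)} - E T_{k(n)})/k(n) \xrightarrow{a.s.} 0$. Also, since $|X_1 1_{\{|X_1| \le k\}}| \le |X_1|$ and $X_1 1_{\{|X_1| \le k\}} \to X_1$ as $k \to \infty$, with $E|X_1| < \infty$ as in the proof of Lemma 28 (sufficient condition for the strong law), the Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem) implies $E Y_k = E[X_1 1_{\{|X_1| \le k\}}] \to E X_1 = \mu$, so we have $E T_{k(n)}/k(n) \to \mu$. Thus combining the results, we have $T_{k(n)}/k(n) \xrightarrow{a.s.} \mu$.

Now, we handle the values outside of the index $k(n)$. Let $m$ be such that $k(n) \le m < k(n+1)$. As we are assuming $X_i \ge 0$, we have $T_{k(n)} \le T_m \le T_{k(n+1)}$ and

$$\frac{k(n)}{k(n+1)} \cdot \frac{T_{k(n)}}{k(n)} = \frac{T_{k(n)}}{k(n+1)} \le \frac{T_m}{m} \le \frac{T_{k(n+1)}}{k(n)} = \frac{k(n+1)}{k(n)} \cdot \frac{T_{k(n+1)}}{k(n+1)}.$$

Since $k(n) = \lfloor \alpha^n \rfloor$, we have $k(n+1)/k(n) \to \alpha$ as $n \to \infty$. As we already have $T_{k(n)}/k(n) \xrightarrow{a.s.} \mu$, the left-hand side converges to $\mu/\alpha$ and the right-hand side converges to $\alpha \mu$ almost surely. Therefore, we have

$$\frac{\mu}{\alpha} \le \liminf_{m \to \infty} \frac{T_m}{m} \le \limsup_{m \to \infty} \frac{T_m}{m} \le \alpha \mu \quad \text{a.s.}$$

for any $\alpha > 1$.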
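
Almost sure convergence is a statement about whole sample paths: along a single path, $\sup_{n \ge N} |S_n/n - \mu|$ should shrink as $N$ grows. The sketch below tracks this quantity along one simulated path, assuming i.i.d. $\mathrm{Exp}(1)$ draws with $\mu = 1$. The distribution, the finite horizon `n_total`, and the name `max_tail_deviation` are illustrative assumptions, and a finite path can only suggest the limit.

```python
import random


def max_tail_deviation(n_total, n_start, mu=1.0, seed=0):
    """Largest |S_n/n - mu| over n_start <= n <= n_total along one Exp(1) sample path."""
    rng = random.Random(seed)
    running_sum = 0.0
    worst = 0.0
    for n in range(1, n_total + 1):
        running_sum += rng.expovariate(1.0)
        if n >= n_start:
            worst = max(worst, abs(running_sum / n - mu))
    return worst


for n_start in (100, 1_000, 10_000):
    # Same seed, hence the same path: the supremum is taken over a shrinking tail.
    print(n_start, max_tail_deviation(200_000, n_start))
```

Because the same seed reproduces the same path, the printed values are non-increasing in `n_start` by construction. What the strong law adds is that they tend to zero.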