Central Limit Theorems

Preliminary Theorems

Strong Law of Large Numbers

Theorem 30 (strong law of large numbers).

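Let $X_1, X_2, \dots$ be i.i.d. where $E|X_1| < \infty$ and $EX_1 = \mu$. Then for $S_n = X_1 + \cdots + X_n$, we have $S_n/n \to \mu$ almost surely as $n \to \infty$.

Note that an equivalent statement of the strong law is $P(\lim_{n\to\infty} S_n/n = \mu) = 1$.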

Remark (further cases of strong laws).

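If $EX_1 = \infty$, then as $n \to \infty$ we have $S_n/n \to \infty$ almost surely, and if $EX_1 = -\infty$, we have $S_n/n \to -\infty$ almost surely.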

The proof of the case when will be approached in this section.
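As a quick numerical illustration of the strong law (not part of any proof), the sketch below tracks the running average $S_n/n$ of simulated i.i.d. draws. The Exponential(1) distribution, the seed, and the helper name `running_means` are assumptions made for this example only.

```python
import random

def running_means(draw, n, seed=0):
    """Return [S_1/1, S_2/2, ..., S_n/n] for i.i.d. draws produced by draw(rng)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += draw(rng)
        means.append(total / k)
    return means

# Exponential(1) has a finite first moment and mean 1 (illustrative choice).
means = running_means(lambda rng: rng.expovariate(1.0), 200_000)
for k in (10, 1_000, 200_000):
    print(k, round(means[k - 1], 4))
```

The printed averages should drift toward the mean 1 as the index grows, which is the behaviour the theorem describes along a single sample path.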

Distribution Functions

Theorem 7 (properties of distribution function).

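Let $F$ be any distribution function, say $F(x) = P(X \le x)$. Then, we have the following properties:

  1. $F$ is non-decreasing.
  2. $\lim_{x\to\infty} F(x) = 1$, and $\lim_{x\to-\infty} F(x) = 0$.
  3. $F$ is right-continuous, i.e., $\lim_{y \downarrow x} F(y) = F(x)$.
  4. If $F(x-) = \lim_{y \uparrow x} F(y)$, then $F(x-) = P(X < x)$.

Remark 8 (sufficient condition for continuous distribution function).

Note that from Theorem 7 (properties of distribution function), we can conclude that $F$ is continuous at $x$ if and only if $P(X = x) = F(x) - F(x-) = 0$.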

Proposition 10 (distribution function is almost surely continuous).

Let $F$ be a distribution function. Then $F$ is discontinuous at at most countably many points.

Dense in Real Space

Definition (dense).

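For $D \subseteq \mathbb{R}$, $D$ is dense in $\mathbb{R}$ if $\overline{D} = \mathbb{R}$, meaning that for every $x \in \mathbb{R}$ and every $\epsilon > 0$ there exists $d \in D$ with $|x - d| < \epsilon$.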

Proposition (properties of dense set).

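Let $D$ be dense in $\mathbb{R}$. Then we have the following results for $a, b \in \mathbb{R}$:

  • $a \le b$ if and only if for all $d \in D$ such that $d < a$, we have $d \le b$.
  • $a \ge b$ if and only if for all $d \in D$ such that $d > a$, we have $d \ge b$.

Proof. We only prove the 'if' direction of the first case, since the converse is immediate. Arguing by contradiction, suppose $a > b$. Then, since $D$ is dense, we have some $d \in D$ with $b < d < a$, which is a contradiction to the assumption that $d < a$ implies $d \le b$.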

Weak Convergence

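Definition (weak convergence).

  1. A sequence of distribution functions $F_n$ is said to converge weakly to $F$ if $F_n(x) \to F(x)$ for all continuity points $x$ of $F$, and we denote this as $F_n \Rightarrow F$.
  2. A sequence of random variables $X_n$ is said to converge weakly to $X$, denoted $X_n \Rightarrow X$, if their distribution functions converge weakly.

Example (example of continuous distribution function).

Let $X$ have distribution function $F$, and put $X_n = X + 1/n$ with distribution function $F_n$. Then we have $F_n(x) = P(X + 1/n \le x) = F(x - 1/n) \to F(x-)$, so $F_n(x) \to F(x)$ exactly at the continuity points of $F$.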

Example (weak convergence of dirac-delta measure).

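Let $\delta_x$ be a dirac-delta measure with unit mass at $x$, where $\delta_x(A) = 1$ if $x \in A$ and $\delta_x(A) = 0$ otherwise. Then for $x_n \to x$ where $x_n, x \in \mathbb{R}$, we have $\delta_{x_n} \Rightarrow \delta_x$.

Proof. Note that $\delta_x$ is a probability measure. For a sequence of disjoint sets $A_1, A_2, \dots$, we have $\delta_x(\cup_i A_i) = 1$ if and only if $x \in \cup_i A_i$, implying that there is exactly one $i$ such that $x \in A_i$, so $\delta_x(\cup_i A_i) = \sum_i \delta_x(A_i)$.

The distribution function of $\delta_{x_n}$ is $F_n(y) = 1_{(y \ge x_n)}$. Then the limit of $F_n(y)$ will be $1$ if $y > x$ and $0$ if $y < x$, since $x_n \to x$. Thus $F_n(y) \to F(y) = 1_{(y \ge x)}$ at the continuity points $y \ne x$, implying $\delta_{x_n} \Rightarrow \delta_x$.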
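The role of continuity points in the examples above can be checked numerically. The following is a minimal sketch, assuming the point mass at $1/n$ converging to the point mass at $0$; the helper names `F` and `F_n` are hypothetical.

```python
def F(x):
    """Distribution function of the point mass at 0."""
    return 1.0 if x >= 0 else 0.0

def F_n(x, n):
    """Distribution function of the point mass at 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0

for x in (-0.5, 0.0, 0.5):
    values = [F_n(x, n) for n in (1, 10, 100, 1000)]
    print(x, values, "limit target:", F(x))
```

At $x = 0$, the only discontinuity of the limit, `F_n(0, n)` stays at 0 for every $n$ while `F(0)` is 1, which is why weak convergence only asks for convergence at continuity points.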

Weak Convergence and Convergence

Theorem (weak convergence and aslim).

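If $F_n \Rightarrow F$ then there exist random variables $Y_n$ and $Y$ with distribution functions $F_n$ and $F$ such that $Y_n \to Y$ almost surely.

Proof. Without loss of generality, let $\Omega = (0,1)$, $\mathcal{F}$ be the Borel sets, and $P$ be Lebesgue measure. Here, we use Basic Definitions in Probability > Theorem 9 (sufficient condition for distribution function) by letting $Y_n(x) = \sup\{y : F_n(y) < x\}$ and $Y(x) = \sup\{y : F(y) < x\}$, where $x \in (0,1)$. Then $Y_n$ has distribution function $F_n$ for each $n$, and $Y$ has distribution function $F$.

Now we will show that $Y_n(x) \to Y(x)$ for all but a countable number of $x$. Here, we have to deal with the case when $Y$ is discontinuous, which happens when $F$ is constant at the level $x$ on an interval. Define $a_x = \sup\{y : F(y) < x\}$ and $b_x = \inf\{y : F(y) > x\}$. By Basic Definitions in Probability > Theorem 7 (properties of distribution function), $F$ is a non-decreasing and right-continuous function, and we have the following facts for $a_x$ and $b_x$: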

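  • If $y < a_x$, then $F(y) < x$.
  • If $F(y) < x$, then $a_x$ is an upper bound of $\{z : F(z) < x\}$, which contains $y$. Thus $y \le a_x$.
  • If $y > b_x$, then $F(y) > x$.
  • If $F(y) > x$, then $b_x$ is a lower bound of $\{z : F(z) > x\}$, which contains $y$. Thus $y \ge b_x$.

Since $Y(x) = a_x$ by definition, it suffices to show that $Y_n(x) \to a_x$ for almost every $x$, which can be divided into two statements:

  1. The set of $x$ such that $a_x < b_x$ is a null set, i.e. $P(\{x : a_x < b_x\}) = 0$.
    • (step 1) The intervals $(a_x, b_x)$ are disjoint for different $x$.
    • (step 2) The set $\{x : a_x < b_x\}$ is at most countable.
  2. For any $x$ such that $a_x = b_x$, we have $Y_n(x) \to Y(x)$.
    • (step 3) $\liminf_n Y_n(x) \ge Y(x)$.
    • (step 4) $\limsup_n Y_n(x) \le Y(x)$.

If both statements hold, we have $P(\{x : Y_n(x) \to Y(x)\}) = 1$, i.e. $Y_n \to Y$ almost surely.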

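Now, we work through the four steps.

(step 1) First, we show that the intervals $(a_x, b_x)$ are disjoint along $x$, i.e. for $x < x'$, we have $(a_x, b_x) \cap (a_{x'}, b_{x'}) = \emptyset$. Here, we need to show $b_x \le a_{x'}$. Using Proposition 3 (properties of dense set), it suffices to show that for any $y$ such that $y < b_x$, we have $y \le a_{x'}$.

If $y < b_x$, then $F(y) \le x < x'$, since $F(y) > x$ would give $y \ge b_x$; and $F(y) < x'$ gives $y \le a_{x'}$. Thus, by the prior facts of $a_x$ and $b_x$, we have $b_x \le a_{x'}$, therefore the intervals $(a_x, b_x)$ are disjoint for different $x$.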

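(step 2) Define $N = \{x \in (0,1) : a_x < b_x\}$. We show that $N$ is at most countable.
If $x \in N$, i.e. $a_x < b_x$, then we can choose $q_x \in \mathbb{Q}$ such that $a_x < q_x < b_x$. Since the intervals $(a_x, b_x)$ are disjoint by step 1, $x \mapsto q_x$ is an injective function.

As $\{q_x : x \in N\}$ is the image of the injective function, we have $\{q_x : x \in N\} \subseteq \mathbb{Q}$, and since $\mathbb{Q}$ is a countable set, the image is an at most countable set. Thus the set $N$ is at most countable. Since a countable set is a null set under Lebesgue measure, we have $P(N) = 0$, which is statement 1.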

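(step 3) Here, we show that $\liminf_n Y_n(x) \ge Y(x)$ for $x$ with $a_x = b_x$.
Let $y$ be a continuity point of $F$ that satisfies $y < Y(x)$. Then, since $y$ is smaller than the supremum $a_x = Y(x)$, it is contained in the set, i.e. $F(y) < x$. Then, as $F_n(y) \to F(y)$ by the assumption, we have $F_n(y) < x$ for large $n$, which gives $Y_n(x) \ge y$, and since this holds for all $y$ under such restrictions, we have $\liminf_n Y_n(x) \ge Y(x)$.

(step 4) Lastly, we show that $\limsup_n Y_n(x) \le Y(x)$ for $x$ with $a_x = b_x$.
Let $y$ be a continuity point of $F$ that satisfies $y > Y(x)$. Then, since $Y(x) = a_x = b_x$, i.e. $y > b_x$, we have $F(y) > x$. As $F_n(y) \to F(y)$ by the assumption, we have $F_n(y) > x$ for large $n$, so $y$ is an upper bound of $\{z : F_n(z) < x\}$ and we have $Y_n(x) \le y$. Since this holds for all such $y$, we have $\limsup_n Y_n(x) \le Y(x)$.

Now, combining the results from steps 3 and 4, we have $Y(x) \le \liminf_n Y_n(x) \le \limsup_n Y_n(x) \le Y(x)$, so equality holds throughout, i.e. $\liminf_n Y_n(x) = \limsup_n Y_n(x) = Y(x)$. Then by Measure Theoretic Preliminaries > Remark 40 (limit, limsup, and liminf), we have $Y_n(x) \to Y(x)$.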
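The quantile construction used in this proof can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes $F_n$ to be the distribution function of $N(1/n, 1)$ and $F$ that of $N(0,1)$, and it approximates $\sup\{y : F(y) < x\}$ by bisection. The helper names `normal_cdf` and `quantile` are hypothetical.

```python
import math

def normal_cdf(y, mean=0.0):
    """Distribution function of N(mean, 1)."""
    return 0.5 * (1.0 + math.erf((y - mean) / math.sqrt(2.0)))

def quantile(cdf, x, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup{y : cdf(y) < x} by bisection on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < x:
            lo = mid   # mid belongs to {y : cdf(y) < x}
        else:
            hi = mid   # mid is an upper bound of that set
    return lo

x = 0.3  # a fixed point of the sample space (0, 1)
y_limit = quantile(normal_cdf, x)
gaps = {}
for n in (1, 10, 100, 1000):
    y_n = quantile(lambda y: normal_cdf(y, mean=1.0 / n), x)
    gaps[n] = y_n - y_limit
    print(n, gaps[n])
```

For this family the gap $Y_n(x) - Y(x)$ equals $1/n$, so the same uniform point $x$ drives all the coupled variables to the limit, which is the pointwise convergence the proof establishes.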

Continuous Mapping Theory

Lemma (identical distribution function and identical distribution).

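Let $X$ and $Y$ both be random variables. Then $X \overset{d}{=} Y$ if and only if their distribution functions are identical.

Proof. Note that the 'only if' part is trivial, since if $X, Y$ are identically distributed, then their distributions are the same, i.e. $P(X \in B) = P(Y \in B)$ for all Borel sets $B$. Thus we have $F_X(x) = P(X \in (-\infty, x]) = P(Y \in (-\infty, x]) = F_Y(x)$, meaning that the two have the same distribution functions.

Now, we prove the 'if' part.

($\Leftarrow$) Suppose the two have the same distribution functions, i.e. $F_X = F_Y$. We need to show that for any $B \in \mathcal{B}(\mathbb{R})$, we have $P(X \in B) = P(Y \in B)$.

First let $\mathcal{P} = \{(-\infty, x] : x \in \mathbb{R}\}$ and note that it is a $\pi$-system since for any $x, y \in \mathbb{R}$, we have $(-\infty, x] \cap (-\infty, y] = (-\infty, \min(x,y)] \in \mathcal{P}$. Now denote $\mathcal{L} = \{B \in \mathcal{B}(\mathbb{R}) : P(X \in B) = P(Y \in B)\}$. Then we have $\mathcal{P} \subseteq \mathcal{L}$ since for any $x$, we have $P(X \in (-\infty, x]) = F_X(x) = F_Y(x) = P(Y \in (-\infty, x])$. Now we check whether $\mathcal{L}$ is a $\lambda$-system.

  1. $\mathbb{R} \in \mathcal{L}$, since $P(X \in \mathbb{R}) = 1 = P(Y \in \mathbb{R})$.
  2. Let $A, B \in \mathcal{L}$ and $A \subseteq B$. Then $P(X \in B \setminus A) = P(X \in B) - P(X \in A) = P(Y \in B) - P(Y \in A) = P(Y \in B \setminus A)$, thus $B \setminus A \in \mathcal{L}$.
  3. Let $A_n \in \mathcal{L}$ where $A_n \uparrow A$. Then we have $P(X \in A) = \lim_n P(X \in A_n) = \lim_n P(Y \in A_n) = P(Y \in A)$, thus $A \in \mathcal{L}$.

As $\mathcal{L}$ is a $\lambda$-system containing $\mathcal{P}$, by Measure Theoretic Preliminaries > Theorem 15 (Dynkin's pi-lambda theorem), we have $\sigma(\mathcal{P}) = \mathcal{B}(\mathbb{R}) \subseteq \mathcal{L}$. Thus $P(X \in B) = P(Y \in B)$ for every Borel set $B$ in $\mathbb{R}$, and $X$ and $Y$ are identically distributed. This completes the proof.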

Theorem (weak convergence and bounded convergence).

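Let $X_n$ be a sequence of random variables. Then $X_n \Rightarrow X$ if and only if for every bounded continuous function $g$, we have $Eg(X_n) \to Eg(X)$.

Proof. ($\Rightarrow$) Suppose $X_n \Rightarrow X$. By Definition 4 (weak convergence), their distribution functions converge weakly, i.e. $F_n \Rightarrow F$. Then by Theorem 7 (weak convergence and aslim), there exist random variables $Y_n$ and $Y$ with the same distribution functions $F_n$ and $F$ such that $Y_n \to Y$ almost surely. Note that by Lemma 8 (identical distribution function and identical distribution), we also have $Y_n \overset{d}{=} X_n$ and $Y \overset{d}{=} X$.

Using Law of Large Numbers > Exercise 3 (aslim and continuous function), as $g$ is continuous, we have $g(Y_n) \to g(Y)$ almost surely, and since $g$ is a bounded function, $|g(Y_n)| \le \sup_x |g(x)| < \infty$. Then by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $E_Y g(Y_n) \to E_Y g(Y)$, where $E_Y$ denotes that the integration was performed under $Y$'s probability space.

Remark that by Basic Definitions in Probability > Theorem 28 (change of the variable formula), we have $Eg(X_n) = \int g(x)\,dF_n(x)$ and $E_Y g(Y_n) = \int g(x)\,dF_n(x)$. As $Y_n \overset{d}{=} X_n$ for all $n$, we therefore have $Eg(X_n) = E_Y g(Y_n)$, and likewise $Eg(X) = E_Y g(Y)$. Thus the result equals to $Eg(X_n) \to Eg(X)$.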

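($\Leftarrow$) Here, we define some bounded continuous function and show the reverse conclusion.

For the given random variable $X$, a point $x \in \mathbb{R}$, and an arbitrary $\epsilon > 0$, define a function $g_{x,\epsilon}(y) = \min\{1, \max\{0, 1 - (y - x)/\epsilon\}\}$, which is continuous: it equals $1$ for $y \le x$, equals $0$ for $y \ge x + \epsilon$, and is linear in between.

Note that since $g_{x,\epsilon}(y) = 1$ for all $y \le x$ and $g_{x,\epsilon}(y) = 0$ when $y \ge x + \epsilon$, we have the inequalities of $1_{(y \le x)} \le g_{x,\epsilon}(y) \le 1_{(y \le x + \epsilon)}$ for all $y$.

Then for the distribution functions $F_n$ of the random variables $X_n$, we have $\limsup_n F_n(x) = \limsup_n E1_{(X_n \le x)} \le \limsup_n Eg_{x,\epsilon}(X_n) = Eg_{x,\epsilon}(X) \le F(x + \epsilon)$, where the limit holds by the assumption of $Eg(X_n) \to Eg(X)$. As we have the inequality for every $\epsilon > 0$, by letting $\epsilon \downarrow 0$, we have $\limsup_n F_n(x) \le F(x)$ by the continuity of Measure Theoretic Preliminaries > Theorem 18 (properties of measure).

On the contrary, we have $\liminf_n F_n(x) \ge \liminf_n Eg_{x-\epsilon,\epsilon}(X_n) = Eg_{x-\epsilon,\epsilon}(X) \ge F(x - \epsilon)$, and by letting $\epsilon \downarrow 0$, we have $\liminf_n F_n(x) \ge F(x-)$ by the continuity of measure.

Combining the results, we have $F(x-) \le \liminf_n F_n(x) \le \limsup_n F_n(x) \le F(x)$. Thus, if $x$ is a continuity point of $F$, then by Measure Theoretic Preliminaries > Remark 40 (limit, limsup, and liminf) we have $\lim_n F_n(x) = F(x)$, which implies $F_n \Rightarrow F$, and by Definition 4 (weak convergence), we have $X_n \Rightarrow X$.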
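The 'only if' direction can be illustrated on a case where the expectations are plain sums. The sketch below assumes $X_n$ is uniform on the grid $\{1/n, 2/n, \dots, 1\}$, which converges weakly to the Uniform$(0,1)$ distribution, and takes $g = \cos$ as the bounded continuous test function; the helper name `expect_g` is hypothetical.

```python
import math

def expect_g(g, n):
    """E g(X_n) when X_n is uniform on {1/n, 2/n, ..., n/n}."""
    return sum(g(k / n) for k in range(1, n + 1)) / n

g = math.cos                 # bounded and continuous
limit = math.sin(1.0)        # E g(X) for X ~ Uniform(0, 1): integral of cos over [0, 1]
for n in (10, 100, 1000):
    print(n, expect_g(g, n), limit)
```

Here $Eg(X_n)$ is a Riemann sum, so its convergence to $Eg(X)$ is the familiar convergence of Riemann sums to the integral.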

Theorem (continuous mapping theorem).

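Let $g$ be a measurable function and define $D_g = \{x : g \text{ is discontinuous at } x\}$. If $X_n \Rightarrow X$ and $P(X \in D_g) = 0$, then $g(X_n) \Rightarrow g(X)$. Furthermore, if $g$ is bounded, then we have $Eg(X_n) \to Eg(X)$.

Proof. By the assumption of $X_n \Rightarrow X$, using Theorem 7 (weak convergence and aslim), let $Y_n$ and $Y$ be the random variables such that $Y_n \to Y$ almost surely with $Y_n \overset{d}{=} X_n$ and $Y \overset{d}{=} X$.

Let $f$ be a bounded continuous function. Then for $D_{f \circ g}$ we have $D_{f \circ g} \subseteq D_g$, since if $f \circ g$ is discontinuous at a point then $g$ must be discontinuous there. Now, given the assumption of $P(X \in D_g) = 0$, we have $P(Y \in D_{f \circ g}) \le P(Y \in D_g) = 0$, implying that $f \circ g$ is continuous at $Y$ almost surely. Then, as $Y_n \to Y$ almost surely, we have $f(g(Y_n)) \to f(g(Y))$ almost surely. Also, as $f$ is bounded, $f \circ g$ is bounded. Thus by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $Ef(g(Y_n)) \to Ef(g(Y))$. Then by Theorem 9 (weak convergence and bounded convergence), we have $g(Y_n) \Rightarrow g(Y)$.

For the distribution functions, since $Y_n \overset{d}{=} X_n$ and $Y \overset{d}{=} X$, we have $g(Y_n) \overset{d}{=} g(X_n)$ and $g(Y) \overset{d}{=} g(X)$, and thus we have $g(X_n) \Rightarrow g(X)$.

For the second conclusion, by letting $g$ be bounded and repeating the process to have $g(Y_n) \to g(Y)$ almost surely, we have $Eg(Y_n) \to Eg(Y)$ by dominated convergence, and therefore have $Eg(X_n) \to Eg(X)$.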

Weak Convergence and Limits

Theorem (alternative definitions of weak convergence).

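Let $(\Omega, \mathcal{F}, P)$ be a probability space. The following statements are equivalent:

  1. $X_n \Rightarrow X$.
  2. For all open sets $G$, $\liminf_n P(X_n \in G) \ge P(X \in G)$.
  3. For all closed sets $K$, $\limsup_n P(X_n \in K) \le P(X \in K)$.
  4. For all Borel sets $A$ with $P(X \in \partial A) = 0$, $\lim_n P(X_n \in A) = P(X \in A)$.

Proof. ($1 \Rightarrow 2$) Suppose $X_n \Rightarrow X$. Then using Theorem 7 (weak convergence and aslim), let $Y_n$ and $Y$ be the random variables with $Y_n \overset{d}{=} X_n$ and $Y \overset{d}{=} X$, such that $Y_n \to Y$ almost surely. Let $G$ be an open set, then we have $\liminf_n 1_G(Y_n) \ge 1_G(Y)$ since

  • If $Y \notin G$, then we have $\liminf_n 1_G(Y_n) \ge 0 = 1_G(Y)$, since $1_G(Y_n)$ is either $0$ or $1$.
  • If $Y \in G$, then as $Y_n \to Y$ and $G$ is open, we have $Y_n \in G$ for large $n$, so $\liminf_n 1_G(Y_n) = 1 = 1_G(Y)$ almost surely.

Thus by Measure Theoretic Preliminaries > Corollary 64 (Fatou's Lemma), we have $\liminf_n P(Y_n \in G) \ge P(Y \in G)$, and since $Y_n \overset{d}{=} X_n$ and $Y \overset{d}{=} X$, we have $\liminf_n P(X_n \in G) \ge P(X \in G)$, which is the desired result.

($2 \Leftrightarrow 3$) Suppose $G$ is an open set; then by Topological Preliminaries > Definition 7 (closed set), let $K = G^c$, which is closed. Thus we have $P(X_n \in K) = 1 - P(X_n \in G)$. As we have $\liminf_n P(X_n \in G) \ge P(X \in G)$ under (2), then from $\limsup_n (1 - a_n) = 1 - \liminf_n a_n$, it equals to $\limsup_n P(X_n \in K) = 1 - \liminf_n P(X_n \in G) \le 1 - P(X \in G) = P(X \in K)$. The same computation with the roles of $G$ and $K$ exchanged gives the converse.

($2, 3 \Rightarrow 4$) Put $K = \overline{A}$ and $G = A^\circ$. Then by Topological Preliminaries > Remark 11 (closure and boundary), we have $\overline{A} = A^\circ \cup \partial A$, and since $A^\circ$ and $\partial A$ are disjoint, $P(X \in K) = P(X \in G) + P(X \in \partial A)$. As we assume $P(X \in \partial A) = 0$, we have $P(X \in K) = P(X \in A) = P(X \in G)$, as its boundary is a null set.

Thus by (2), we have $\liminf_n P(X_n \in A) \ge \liminf_n P(X_n \in G) \ge P(X \in G) = P(X \in A)$, and by (3), we have $\limsup_n P(X_n \in A) \le \limsup_n P(X_n \in K) \le P(X \in K) = P(X \in A)$. Combining the results, by Measure Theoretic Preliminaries > Remark 40 (limit, limsup, and liminf), as we have $P(X \in A) \le \liminf_n P(X_n \in A) \le \limsup_n P(X_n \in A) \le P(X \in A)$, we have $\lim_n P(X_n \in A) = P(X \in A)$.

($4 \Rightarrow 1$) Suppose for some $x \in \mathbb{R}$, $A = (-\infty, x]$. Then $\partial A = \{x\}$. Let $x$ be a continuity point of $F$, then since $P(X = x) = F(x) - F(x-) = 0$, we have $P(X \in \partial A) = 0$. Then, we have $F_n(x) = P(X_n \in A) \to P(X \in A) = F(x)$ for every continuity point $x$. Thus $F_n \Rightarrow F$, meaning $X_n \Rightarrow X$.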

Corollary (weak convergence and convergence of pdf).

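Suppose $X_n \Rightarrow X$ with $X$ having probability density function $f$. Then for $a < b$, we have $P(a < X_n < b) \to \int_a^b f(x)\,dx$.

Proof. As $X$ has pdf $f$, we have $P(X = x) = 0$ for every $x$, and for either $A = (a,b)$ or $A = [a,b]$, we have $\partial A = \{a, b\}$. Thus we have $P(X \in \partial A) = 0$. Therefore by (4) of Theorem 11 (alternative definitions of weak convergence), we have $P(X_n \in A) \to P(X \in A) = \int_a^b f(x)\,dx$.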

Limit Sequence of Distribution

Theorem (Helly's selection theorem).

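For any sequence $F_n$ of distribution functions, there exists a subsequence $F_{n(k)}$ and a right-continuous non-decreasing function $F$ such that $\lim_{k\to\infty} F_{n(k)}(y) = F(y)$ for all continuity points $y$ of $F$.

Proof. Before the proof, remark that $F$ may not be a distribution function, although it is right-continuous and non-decreasing, since its limits at $\pm\infty$ need not be $0$ and $1$: we may have $\lim_{x\to-\infty} F(x) > 0$ or $\lim_{x\to\infty} F(x) < 1$.

Now, consider a sequence of distribution functions $F_n$. We first construct a subsequence that converges at each fixed rational point.

Let $q_1, q_2, \dots$ be an enumeration of the rational numbers, which exists as $\mathbb{Q}$ is countable. Since each $F_n$ is a distribution function, the set $\{F_n(q_k) : n \ge 1\}$ is bounded in $[0,1]$. Since it is a bounded sequence in the real space, it has a convergent subsequence.

Let $n_0(i) = i$. Then for each fixed $k \ge 1$, construct a sequence $n_k(i)$ that is a subsequence of $n_{k-1}(i)$ such that $F_{n_k(i)}(q_k) \to G(q_k)$ as $i \to \infty$. Then, for each $k$ and $j \le k$, we have $F_{n_k(i)}(q_j) \to G(q_j)$, where $n_1$ is a subsequence of $n_0$, $n_2$ is a subsequence of $n_1$, and so on. Thus for the sequence of diagonal elements $n(k) = n_k(k)$ we have $F_{n(k)}(q) \to G(q)$ for all $q \in \mathbb{Q}$, as the diagonal elements $n(k)$, $k \ge j$, form a subsequence of $n_j$ for any $j$.

Now, we need to extend the result to the real numbers. While $G$ may not be right-continuous, as each $F_n$ is a distribution function by the assumption, we have $G(q) \le G(r)$ as $q \le r$, and $0 \le G(q) \le 1$ for all $q \in \mathbb{Q}$.

Note that every $x \in \mathbb{R}$ can be expressed as $x = \inf\{q \in \mathbb{Q} : q > x\}$. Similarly, we define $F(x) = \inf\{G(q) : q \in \mathbb{Q}, q > x\}$. Then we have the following results: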

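(1) $0 \le F(x) \le 1$, since $0 \le G(q) \le 1$ for any $q \in \mathbb{Q}$. And trivially, if $x < y$, which implies $\{q \in \mathbb{Q} : q > y\} \subseteq \{q \in \mathbb{Q} : q > x\}$, then $F(x) \le F(y)$, meaning that $F$ is non-decreasing.

(2) $F$ is right continuous: To check this, we need to show $\lim_n F(x_n) = F(x)$ for a sequence $x_n \downarrow x$.

($\ge$) Since $x_n$ is a decreasing sequence and $x_n \ge x$, we have $F(x_n) \ge F(x)$ for all $n$. Thus $\lim_n F(x_n) \ge F(x)$.

($\le$) Let $\epsilon > 0$ be arbitrarily given. Then, there exists a rational $q > x$ such that $G(q) < F(x) + \epsilon$. Also, since $x_n \downarrow x$, for a large enough $n$ we have $x_n < q$, so $F(x_n) \le G(q)$. Thus, combining the results, we have $F(x_n) \le G(q) < F(x) + \epsilon$ as $n \to \infty$. This gives us $\lim_n F(x_n) \le F(x) + \epsilon$ for any $\epsilon > 0$, implying $\lim_n F(x_n) \le F(x)$.

(3) $F_{n(k)}(x) \to F(x)$: Let $x$ be any continuity point of $F$. For an arbitrarily given $\epsilon > 0$, choose rationals $r_1 < r_2 < x < s$ such that $F(x) - \epsilon < F(r_1) \le F(r_2) \le F(x) \le F(s) < F(x) + \epsilon$, which is possible since $F$ is non-decreasing and is continuous at $x$.

Now, be reminded of the relationships $F_{n(k)}(r_2) \to G(r_2) \ge F(r_1)$ and $F_{n(k)}(s) \to G(s) \le F(s)$. Therefore if $k$ is large enough, we have $F(x) - \epsilon < F_{n(k)}(r_2) \le F_{n(k)}(x) \le F_{n(k)}(s) < F(x) + \epsilon$ for any $\epsilon > 0$. This gives us $F_{n(k)}(x) \to F(x)$ for all continuity points $x$.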

Tightness and Weak Convergence

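Definition (vague convergence).

Let $F_n$ be a sequence of distribution functions that converges to $F$ at all continuity points of $F$. When $F$ may not be a distribution function, we say that $F_n$ converges vaguely to $F$ and denote $F_n \Rightarrow_v F$. For instance, in Theorem 13 (Helly's selection theorem), we have $F_{n(k)} \Rightarrow_v F$.

Definition (tight).

Let $F_n$ be a sequence of distribution functions. We say $\{F_n\}$ is tight if for all $\epsilon > 0$ there is an $M_\epsilon$ such that $\limsup_{n\to\infty} [1 - F_n(M_\epsilon) + F_n(-M_\epsilon)] \le \epsilon$, which means for the corresponding distributions $\mu_n$, we have $\limsup_{n\to\infty} \mu_n((-M_\epsilon, M_\epsilon]^c) \le \epsilon$.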

Theorem (tightness and weak convergence).

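Every subsequential limit is the distribution function of a probability measure if and only if the sequence $F_n$ is tight.

Proof. ($\Leftarrow$) Suppose $F_n$ is a sequence of distribution functions and is tight. Then by Theorem 13 (Helly's selection theorem), we have $F_{n(k)} \Rightarrow_v F$ along a subsequence. Now we need to show that $F$ is a distribution function, and it suffices to show that for all $\epsilon > 0$, $\limsup_{x\to\infty} [1 - F(x) + F(-x)] \le \epsilon$. Remark that $F$ is non-decreasing with values in $[0,1]$, and the statement equals to $\lim_{x\to\infty} F(x) - \lim_{x\to-\infty} F(x) = 1$. For each $\epsilon$, let $r < -M_\epsilon$ and $s > M_\epsilon$, where $r, s$ are fixed and are continuity points of $F$. As we have $F_{n(k)} \Rightarrow_v F$, we therefore have $F_{n(k)}(r) \to F(r)$ and $F_{n(k)}(s) \to F(s)$. Thus $1 - F(s) + F(r) = \lim_k [1 - F_{n(k)}(s) + F_{n(k)}(r)] \le \limsup_n [1 - F_n(M_\epsilon) + F_n(-M_\epsilon)] \le \epsilon$, which implies $\limsup_{x\to\infty} [1 - F(x) + F(-x)] \le \epsilon$, where the limits exist as $F$ is a non-decreasing function. Since $\epsilon$ is given arbitrarily, we have $\lim_{x\to\infty} F(x) = 1$ and $\lim_{x\to-\infty} F(x) = 0$, which gives us that $F$ is a distribution function.

($\Rightarrow$) Suppose every subsequential limit of $F_n$ is a distribution function. Arguing by contradiction, assume $F_n$ is not tight. This means there exist $\epsilon > 0$ and a subsequence $n(k) \to \infty$ such that $1 - F_{n(k)}(k) + F_{n(k)}(-k) \ge \epsilon$ for all $k$. Now for some sub-subsequence $n(k_j)$, using Theorem 13 (Helly's selection theorem), suppose $F_{n(k_j)} \Rightarrow_v F$. By the supposition, $F$ is a distribution function. Now let $r < 0 < s$ where $r, s$ are continuity points of $F$. Then we have $1 - F(s) + F(r) = \lim_j [1 - F_{n(k_j)}(s) + F_{n(k_j)}(r)] \ge \liminf_j [1 - F_{n(k_j)}(k_j) + F_{n(k_j)}(-k_j)] \ge \epsilon$. As $r, s$ are continuity points, by letting $s \to \infty$ and $r \to -\infty$ we have $1 - \lim_{s\to\infty} F(s) + \lim_{r\to-\infty} F(r) \ge \epsilon$, i.e. $F$ is not a distribution function, which is a contradiction to the supposition. Therefore $F_n$ must be tight.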
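To make the tightness condition concrete, the sketch below evaluates $1 - F_n(M) + F_n(-M)$ for two illustrative normal families: $N(n, 1)$, whose mass escapes to infinity, and $N(0, 1 + 1/n)$, which stays concentrated. Both families, the cutoff $M = 10$, and the helper names are assumptions chosen for this example.

```python
import math

def normal_cdf(x, mean, var):
    """Distribution function of N(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def tail_mass(mean, var, M):
    """1 - F(M) + F(-M) for F the distribution function of N(mean, var)."""
    return 1.0 - normal_cdf(M, mean, var) + normal_cdf(-M, mean, var)

M = 10.0
for n in (1, 5, 20, 50):
    escaping = tail_mass(mean=float(n), var=1.0, M=M)      # N(n, 1): not tight
    bounded = tail_mass(mean=0.0, var=1.0 + 1.0 / n, M=M)  # N(0, 1 + 1/n): tight
    print(n, escaping, bounded)
```

For the escaping family the tail mass beyond any fixed $M$ tends to 1, so no single $M_\epsilon$ works, while for the second family a single $M$ already controls every $n$.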

Characteristic Functions

Complex Random Variables

Remark (complex random variables).

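A random variable $Z$ is complex valued if $Z = X + iY$ where $X$ and $Y$ are real-valued random variables.

Definition (integrability of complex valued random variables).

Let $Z = X + iY$ be a complex-valued random variable. Then $Z \in L^1$, i.e. $Z$ is integrable, if $E|Z| < \infty$, which holds if and only if $E|X| < \infty$ and $E|Y| < \infty$. In that case $E(Z) = E(X) + iE(Y)$.

Proposition (properties of complex integrability).

Let $Z = X + iY$ where $Z \in L^1$. Then we have the following facts.

  1. $E(cZ) = cE(Z)$, $c \in \mathbb{C}$.
  2. $|E(Z)| \le E|Z|$.

Proof. (1) Since $c \in \mathbb{C}$, let $c = a + ib$. Then we have $E(cZ) = E[(aX - bY) + i(bX + aY)] = (aEX - bEY) + i(bEX + aEY) = (a + ib)(EX + iEY) = cE(Z)$.

(2) Note that for $E(Z) \in \mathbb{C}$, we can express $|E(Z)| = e^{-i\theta}E(Z)$ for some $\theta \in \mathbb{R}$. Thus we have $|E(Z)| = E(e^{-i\theta}Z) = E[\mathrm{Re}(e^{-i\theta}Z)] \le E|e^{-i\theta}Z| = E|Z|$, which completes the proof.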

Characteristic Functions

Definition (characteristic function).

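Let $X$ be a random variable. We define its characteristic function by $\varphi(t) = Ee^{itX} = E\cos(tX) + iE\sin(tX)$.

Remark (integral notation).

Using Riemann-Stieltjes integration, we can denote $\varphi(t) = \int e^{itx}\,dF(x)$, where $F$ is the distribution function of $X$.

Proposition (properties of characteristic function).

Let $X$ be a random variable and $\varphi$ be its characteristic function defined as $\varphi(t) = Ee^{itX}$.

  1. $\varphi(0) = 1$.
  2. $\varphi(-t) = \overline{\varphi(t)}$.
  3. $|\varphi(t)| \le 1$.
  4. $\varphi$ is uniformly continuous.
  5. $\varphi_{aX+b}(t) = e^{itb}\varphi_X(at)$.
  6. If $X$ and $Y$ are independent, $\varphi_{X+Y}(t) = \varphi_X(t)\varphi_Y(t)$.

Proof. (1) The result is trivial since $\varphi(0) = Ee^{0} = 1$.

(2) Using the properties of complex numbers and the trigonometric functions, $\varphi(-t) = E[\cos(-tX) + i\sin(-tX)] = E[\cos(tX) - i\sin(tX)] = \overline{\varphi(t)}$.

(3) This is trivial from the Euler equation and can also be shown by $|\varphi(t)| = |Ee^{itX}| \le E|e^{itX}| = 1$.

(4) From $|\varphi(t+h) - \varphi(t)| = |E(e^{i(t+h)X} - e^{itX})| \le E|e^{itX}||e^{ihX} - 1| = E|e^{ihX} - 1|$, note that as $h \to 0$, we have $|e^{ihX} - 1| \to 0$ with $|e^{ihX} - 1| \le 2$, and thus by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), $E|e^{ihX} - 1| \to 0$, where the bound does not depend on $t$. Therefore $\varphi$ is uniformly continuous.

(5) Using the definition, $Ee^{it(aX+b)} = e^{itb}Ee^{i(at)X} = e^{itb}\varphi_X(at)$.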

Example (characteristic function of poisson distribution).

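From the definition of the Poisson distribution, its distribution with parameter $\lambda$ is $P(X = k) = e^{-\lambda}\lambda^k/k!$ for $k = 0, 1, 2, \dots$; then its characteristic function is $\varphi(t) = \exp(\lambda(e^{it} - 1))$.

Proof. Remark that $e^{z} = \sum_{k=0}^\infty z^k/k!$, and thus we have $Ee^{itX} = \sum_{k=0}^\infty e^{itk}e^{-\lambda}\frac{\lambda^k}{k!} = e^{-\lambda}\sum_{k=0}^\infty \frac{(\lambda e^{it})^k}{k!} = \exp(\lambda(e^{it} - 1))$, which completes the proof.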
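The series manipulation in this proof can be sanity-checked numerically. The following is a small sketch, assuming the illustrative values $t = 1.3$ and $\lambda = 4$ and a truncation at 200 terms; the function names are hypothetical.

```python
import cmath
import math

def poisson_chf_series(t, lam, terms=200):
    """Truncated series sum_k e^{itk} e^{-lam} lam^k / k!."""
    total = 0j
    weight = math.exp(-lam)          # P(X = 0)
    for k in range(terms):
        total += cmath.exp(1j * t * k) * weight
        weight *= lam / (k + 1)      # P(X = k + 1) obtained from P(X = k)
    return total

def poisson_chf_closed(t, lam):
    """Closed form exp(lam * (e^{it} - 1))."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1.0))

t, lam = 1.3, 4.0
print(poisson_chf_series(t, lam), poisson_chf_closed(t, lam))
```

The two printed complex numbers should agree to floating-point accuracy, since the Poisson weights decay fast enough that the truncation error is negligible.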

Example (characteristic function of normal distribution).

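From the definition of the standard normal distribution, for $X \sim N(0,1)$, its probability density function is $f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$ and its characteristic function is $\varphi(t) = e^{-t^2/2}$.

Proof. From $e^{itx} = \cos(tx) + i\sin(tx)$, as $\sin(tx)f(x)$ is odd in $x$, we have the equality $\int \sin(tx)f(x)\,dx = 0$, thus $\varphi(t) = \int \cos(tx)\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx$. Now we show that the integral equals to $e^{-t^2/2}$.
For $\varphi(t)$, its differential is $\varphi'(t) = \int -x\sin(tx)\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx = -\int t\cos(tx)\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx$ by integration by parts, thus we have $\varphi'(t) = -t\varphi(t)$. This implies $\frac{d}{dt}[\varphi(t)e^{t^2/2}] = 0$, so we have $\varphi(t)e^{t^2/2} = \varphi(0) = 1$, thus we have $\varphi(t) = e^{-t^2/2}$.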

Characteristic Function of Transformation

Corollary (ch.f of iid summation).

If are i.i.d. with the characteristic functions , for , we have

Proof.Using (5) and (6) from Proposition 22 (properties of characteristic function), we have where the last equality holds by the i.i.d.

Lemma (sufficient conditions for unique distribution).

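Suppose $\mu, \nu$ are probability measures on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that $\mu((a,b]) = \nu((a,b])$ for all $a < b$. Then we have $\mu = \nu$ on $\mathcal{B}(\mathbb{R})$.

Proof. We use Dynkin's $\pi$-$\lambda$ theorem to prove the lemma.
Let $\mathcal{P} = \{(a,b] : a, b \in \mathbb{R}\}$ where $(a,b] = \emptyset$ if $a \ge b$. Then $\mathcal{P}$ is a $\pi$-system, since for any $(a,b], (c,d] \in \mathcal{P}$, we have $(a,b] \cap (c,d] = (\max(a,c), \min(b,d)] \in \mathcal{P}$. Now let $\mathcal{L} = \{A \in \mathcal{B}(\mathbb{R}) : \mu(A) = \nu(A)\}$; then by the assumption of $\mu((a,b]) = \nu((a,b])$, we have $\mathcal{P} \subseteq \mathcal{L}$. Below, we show that $\mathcal{L}$ is a $\lambda$-system:

  1. $\mathbb{R} \in \mathcal{L}$: $\mu(\mathbb{R}) = 1 = \nu(\mathbb{R})$, as they are probability measures.
  2. if $A, B \in \mathcal{L}$ and $A \subseteq B$, then $B \setminus A \in \mathcal{L}$: $\mu(B \setminus A) = \mu(B) - \mu(A) = \nu(B) - \nu(A) = \nu(B \setminus A)$.
  3. if $A_n \in \mathcal{L}$ and $A_n \uparrow A$, then $A \in \mathcal{L}$: $\mu(A) = \lim_n \mu(A_n) = \lim_n \nu(A_n) = \nu(A)$.

As $\mathcal{L}$ is a $\lambda$-system containing the $\pi$-system $\mathcal{P}$, we have $\sigma(\mathcal{P}) = \mathcal{B}(\mathbb{R}) \subseteq \mathcal{L}$, i.e. $\mu = \nu$.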
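
Proposition (distribution function of characteristic function).

Let $X_1, \dots, X_n$ be the random variables with distribution functions $F_1, \dots, F_n$ and characteristic functions $\varphi_1, \dots, \varphi_n$. Put $F = \sum_{i=1}^n \lambda_i F_i$ where $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$. Then for $F$, its characteristic function is $\sum_i \lambda_i\varphi_i$, and furthermore, for the distributions $\mu_i$ of $X_i$, the distribution of $F$ is $\mu = \sum_i \lambda_i\mu_i$.

Proof. First, $F$ is a distribution function:

  1. $F$ is non-decreasing, as $F_i$ is non-decreasing for all $i$ and $\lambda_i \ge 0$.
  2. $\lim_{x\to\infty} F(x) = 1$ and $\lim_{x\to-\infty} F(x) = 0$, as $\lim_{x\to\infty} F_i(x) = 1$ and $\lim_{x\to-\infty} F_i(x) = 0$ for all $i$ and $\sum_i \lambda_i = 1$.
  3. $F$ is right-continuous, as $F_i$ is right-continuous for all $i$.

As $F$ is a distribution function, there exists some random variable $X$ such that $F_X = F$. Using Definition 20 (characteristic function), $\varphi_X(t) = \int e^{itx}\,dF(x) = \sum_i \lambda_i\int e^{itx}\,dF_i(x) = \sum_i \lambda_i\varphi_i(t)$. Also, for $a < b$, $\mu((a,b]) = F(b) - F(a) = \sum_i \lambda_i(F_i(b) - F_i(a)) = \sum_i \lambda_i\mu_i((a,b])$. Now we show that the distribution is unique if it has the same value for every set in $\mathcal{P}$. Using the same logic of Lemma 26 (sufficient conditions for unique distribution), define $\mathcal{P} = \{(a,b] : a, b \in \mathbb{R}\}$, which is a $\pi$-system as $(a,b] \cap (c,d] = (\max(a,c), \min(b,d)]$, and note that it is $\emptyset$ if $a \ge b$.
Also, let $\mathcal{L} = \{A \in \mathcal{B}(\mathbb{R}) : \mu(A) = \sum_i \lambda_i\mu_i(A)\}$; then $\mathcal{L}$ is a $\lambda$-system since

  1. $\mathbb{R} \in \mathcal{L}$: $\mu(\mathbb{R}) = 1 = \sum_i \lambda_i\mu_i(\mathbb{R})$.
  2. For any $A, B \in \mathcal{L}$ such that $A \subseteq B$, $\mu(B \setminus A) = \mu(B) - \mu(A) = \sum_i \lambda_i(\mu_i(B) - \mu_i(A)) = \sum_i \lambda_i\mu_i(B \setminus A)$.
  3. For any sequence $A_n \in \mathcal{L}$ such that $A_n \uparrow A$, we have $\mu(A) = \lim_n \mu(A_n) = \lim_n \sum_i \lambda_i\mu_i(A_n) = \sum_i \lambda_i\mu_i(A)$.

Also, $\mathcal{P} \subseteq \mathcal{L}$ by the computation above. Thus, by Measure Theoretic Preliminaries > Theorem 15 (Dynkin's pi-lambda theorem), we have $\sigma(\mathcal{P}) = \mathcal{B}(\mathbb{R}) \subseteq \mathcal{L}$. Thus $\mu = \sum_i \lambda_i\mu_i$ holds on every Borel set of interest.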

Characteristic Functions to Weak Convergence

Uniqueness of Characteristic Function

Lemma (limit of sine integral).

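Let $S(T) = \int_0^T \frac{\sin x}{x}\,dx$. Then we have $\lim_{T\to\infty} S(T) = \frac{\pi}{2}$.

Proof. First, consider $I(a) = \int_0^\infty e^{-ax}\frac{\sin x}{x}\,dx$ for $a \ge 0$, and note that $I(0) = \int_0^\infty \frac{\sin x}{x}\,dx = \lim_{T\to\infty} S(T)$, which is the integral of interest. Now, by differentiating with respect to $a$, we have $I'(a) = -\int_0^\infty e^{-ax}\sin x\,dx$. Using the integration by parts, we have $\int_0^\infty e^{-ax}\sin x\,dx = 1 - a^2\int_0^\infty e^{-ax}\sin x\,dx$, thus $\int_0^\infty e^{-ax}\sin x\,dx = \frac{1}{1+a^2}$. Therefore $I'(a) = -\frac{1}{1+a^2}$, and this implies $I(a) = -\arctan(a) + C$ where $C$ is the constant.

We can calculate $C$ by letting $a \to \infty$, and we have $I(a) \to 0$ and $\arctan(a) \to \pi/2$, thus $C = \pi/2$. Finally, $I(0) = \pi/2$; this completes the proof.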
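The limit of the sine integral can also be observed numerically. The sketch below approximates the integral of $\sin(x)/x$ over $[0, T]$ with composite Simpson quadrature; the step density and the helper names `sinc` and `sine_integral` are assumptions of this example.

```python
import math

def sinc(x):
    """sin(x)/x extended continuously by 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def sine_integral(T, steps_per_unit=200):
    """Composite Simpson approximation of the integral of sin(x)/x over [0, T]."""
    n = 2 * max(1, int(T * steps_per_unit / 2))  # even number of subintervals
    h = T / n
    total = sinc(0.0) + sinc(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * sinc(k * h)
    return total * h / 3.0

for T in (10.0, 50.0, 200.0):
    print(T, sine_integral(T), math.pi / 2)
```

The values oscillate around $\pi/2$ with an amplitude of order $1/T$, which matches the slow, conditionally convergent behaviour of this integral.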

Theorem (inversion formula).

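Let $\mu$ be a probability measure and $\varphi(t) = \int e^{itx}\,\mu(dx)$ be its characteristic function. If $a < b$, then $\lim_{T\to\infty} \frac{1}{2\pi}\int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\varphi(t)\,dt = \mu((a,b)) + \frac{1}{2}\mu(\{a,b\})$.

Proof. Remark that $\frac{e^{-ita} - e^{-itb}}{it} = \int_a^b e^{-ity}\,dy$. Now denote $I_T = \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\int e^{itx}\,\mu(dx)\,dt$; then, since $|\int_a^b e^{-ity}\,dy| \le b - a$ and $|e^{itx}| = 1$, the integral is finite by $\int_{-T}^{T}\int (b - a)\,\mu(dx)\,dt = 2T(b-a) < \infty$. Thus, by Measure Theoretic Preliminaries > Theorem 75 (Fubini theorem), we have $I_T = \int\int_{-T}^{T} \frac{e^{it(x-a)} - e^{it(x-b)}}{it}\,dt\,\mu(dx) = \int \left[\int_{-T}^{T} \frac{\sin(t(x-a))}{t}\,dt - \int_{-T}^{T} \frac{\sin(t(x-b))}{t}\,dt\right]\mu(dx)$, where the cosine terms vanish because $\cos(ct)/t$ is odd in $t$; and by defining $R(\theta, T) = \int_{-T}^{T} \frac{\sin(\theta t)}{t}\,dt$, we can re-express $I_T = \int [R(x-a, T) - R(x-b, T)]\,\mu(dx)$.

As we have $R(\theta, T) = 2\,\mathrm{sgn}(\theta)\,S(T|\theta|)$ with $S(T) = \int_0^T \frac{\sin x}{x}\,dx \to \frac{\pi}{2}$, using the results from Lemma 28 (limit of sine integral), the limit differs by the position of $x$: $R(x-a, T) - R(x-b, T) \to 2\pi$ if $a < x < b$, $\to \pi$ if $x = a$ or $x = b$, and $\to 0$ if $x < a$ or $x > b$. Then, note that $|R(\theta, T)| \le 2\sup_y S(y) < \infty$, so the integrand is bounded, and $\mu$ is a probability measure. Thus by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $\frac{1}{2\pi}I_T \to \mu((a,b)) + \frac{1}{2}\mu(\{a,b\})$; this completes the proof.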
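As a numerical check of the inversion formula, the sketch below evaluates the truncated integral for the standard normal characteristic function $e^{-t^2/2}$ and compares it with $\Phi(b) - \Phi(a)$ (the normal distribution has no atoms, so the boundary term vanishes). The quadrature settings and all helper names are assumptions made for this illustration.

```python
import cmath
import math

def inversion_integral(chf, a, b, T=16.0, n=4096):
    """Simpson approximation of (1/2pi) * int_{-T}^{T} (e^{-ita} - e^{-itb})/(it) * chf(t) dt."""
    def integrand(t):
        if t == 0.0:
            return (b - a) * chf(0.0)      # limit of the kernel as t -> 0
        kernel = (cmath.exp(-1j * t * a) - cmath.exp(-1j * t * b)) / (1j * t)
        return kernel * chf(t)
    h = 2.0 * T / n                        # n must be even for Simpson's rule
    total = integrand(-T) + integrand(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(-T + k * h)
    return (total * h / 3.0 / (2.0 * math.pi)).real

def normal_chf(t):
    """Characteristic function of N(0, 1)."""
    return math.exp(-t * t / 2.0)

def Phi(x):
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

a, b = -1.0, 1.0
print(inversion_integral(normal_chf, a, b), Phi(b) - Phi(a))
```

Because $e^{-t^2/2}$ decays very quickly, truncating at $T = 16$ already reproduces the interval probability to many digits; for slowly decaying characteristic functions a much larger $T$ would be needed.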

Exercise (end point of inversion formula).

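Let $\mu$ be a probability measure and $\varphi$ be its characteristic function. Then we have $\mu(\{a\}) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} e^{-ita}\varphi(t)\,dt$.

Proof. From $|e^{-ita}e^{itx}| = 1$, as $\mu$ is a probability measure, the integral is finite. By exploiting Measure Theoretic Preliminaries > Theorem 75 (Fubini theorem), we have $\frac{1}{2T}\int_{-T}^{T} e^{-ita}\varphi(t)\,dt = \int \frac{1}{2T}\int_{-T}^{T} e^{it(x-a)}\,dt\,\mu(dx)$. If $x = a$, then $\frac{1}{2T}\int_{-T}^{T} e^{it(x-a)}\,dt = 1$, and if $x \ne a$, as $|\sin(T(x-a))| \le 1$ for all $T$, we have $\frac{1}{2T}\int_{-T}^{T} e^{it(x-a)}\,dt = \frac{\sin(T(x-a))}{T(x-a)} \to 0$, meaning that the inner integral converges to $1_{(x = a)}$ and is bounded by $1$. Thus by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $\frac{1}{2T}\int_{-T}^{T} e^{-ita}\varphi(t)\,dt \to \int 1_{(x=a)}\,\mu(dx) = \mu(\{a\})$; this completes the proof.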

Corollary (identical characteristic function implies same distribution).

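Suppose $X, Y$ are random variables with their characteristic functions $\varphi_X(t) = \varphi_Y(t)$ for all $t$. Then, for their distributions $\mu_X, \mu_Y$, we have $\mu_X = \mu_Y$.

Proof. As $\varphi_X = \varphi_Y$, by Theorem 29 (inversion formula), we have $\mu_X((a,b)) + \frac{1}{2}\mu_X(\{a,b\}) = \mu_Y((a,b)) + \frac{1}{2}\mu_Y(\{a,b\})$ for all $a < b$, and by Exercise 30 (end point of inversion formula), we also have $\mu_X(\{a\}) = \mu_Y(\{a\})$ for all $a$. This gives us $\mu_X((a,b]) = \mu_Y((a,b])$ for all $a < b$. By Lemma 26 (sufficient conditions for unique distribution), we thus have $\mu_X = \mu_Y$, which completes the proof.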

Corollary (normal distribution closed under addition).

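Let $X_1, X_2$ be independent random variables with normal distributions $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$. Then we have $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.

Proof. Remark Example 24 (characteristic function of normal distribution), which together with (5) of Proposition 22 (properties of characteristic function) gives us $\varphi_{X_1}(t) = \exp(i\mu_1 t - \sigma_1^2t^2/2)$ and $\varphi_{X_2}(t) = \exp(i\mu_2 t - \sigma_2^2t^2/2)$. Now define $Y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$, with $\varphi_Y(t) = \exp(i(\mu_1 + \mu_2)t - (\sigma_1^2 + \sigma_2^2)t^2/2)$. By Corollary 31 (identical characteristic function implies same distribution), it suffices to show that $\varphi_{X_1 + X_2} = \varphi_Y$. As $X_1$ and $X_2$ are independent, by Proposition 22 (properties of characteristic function), we have $\varphi_{X_1+X_2}(t) = \varphi_{X_1}(t)\varphi_{X_2}(t) = \varphi_Y(t)$, thus we have $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.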

Continuity Theorem

Theorem (continuity theorem).

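Let $\mu_n$ be probability measures with characteristic functions $\varphi_n$ for $1 \le n \le \infty$.

  1. If $\mu_n \Rightarrow \mu_\infty$, then $\varphi_n(t) \to \varphi_\infty(t)$ for all $t$.
  2. If $\varphi_n(t) \to \varphi(t)$ for all $t$, where $\varphi$ is continuous at $0$, then $\{\mu_n\}$ is tight and $\mu_n \Rightarrow \mu$, where $\mu$ is the probability measure with characteristic function $\varphi$.

Proof. (1) Suppose $\mu_n \Rightarrow \mu_\infty$. Now let $g(x) = e^{itx}$, which is bounded and continuous. Then by Theorem 9 (weak convergence and bounded convergence), applied to the real and imaginary parts of $g$, we have $\varphi_n(t) = \int e^{itx}\,\mu_n(dx) \to \int e^{itx}\,\mu_\infty(dx) = \varphi_\infty(t)$.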

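(2) First, we show that $\{\mu_n\}$ is tight.
Note that, as in Lemma 28 (limit of sine integral), integrating $e^{itx}$ in $t$ produces a sine term, and we have $\int_{-u}^{u} (1 - e^{itx})\,dt = 2u - \frac{2\sin(ux)}{x}$. By Measure Theoretic Preliminaries > Theorem 75 (Fubini theorem), we have $\frac{1}{u}\int_{-u}^{u} (1 - \varphi_n(t))\,dt = 2\int \left(1 - \frac{\sin(ux)}{ux}\right)\mu_n(dx)$. Note that $|\sin x| \le |x|$, thus we have $1 - \frac{\sin(ux)}{ux} \ge 0$. Discarding the integral over $(-2/u, 2/u)$ and using $|\sin(ux)| \le 1$, we have $\frac{1}{u}\int_{-u}^{u} (1 - \varphi_n(t))\,dt \ge 2\int_{|x| \ge 2/u} \left(1 - \frac{1}{|ux|}\right)\mu_n(dx) \ge \mu_n(\{x : |x| > 2/u\})$. Now we need to show that $\{\mu_n\}$ is tight by showing that $\limsup_n \mu_n(\{x : |x| > 2/u\}) \le 2\epsilon$ for arbitrary $\epsilon > 0$ and a suitable $u$.

As $\varphi$ is assumed to be continuous at $0$ and by Proposition 22 (properties of characteristic function), $\varphi(0) = \lim_n \varphi_n(0) = 1$, thus we have $\frac{1}{u}\int_{-u}^{u} (1 - \varphi(t))\,dt \to 0$ as $u \to 0$. Now for given $\epsilon > 0$, fix $u$ such that $\frac{1}{u}\int_{-u}^{u} (1 - \varphi(t))\,dt < \epsilon$. Note that by assumption, $\varphi_n(t) \to \varphi(t)$ and $|1 - \varphi_n(t)| \le 2$ for all $t$. Then by Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $\frac{1}{u}\int_{-u}^{u} (1 - \varphi_n(t))\,dt \to \frac{1}{u}\int_{-u}^{u} (1 - \varphi(t))\,dt$, and there exists some $N$, that is for all $n \ge N$, we have $\mu_n(\{x : |x| > 2/u\}) \le \frac{1}{u}\int_{-u}^{u} (1 - \varphi_n(t))\,dt < 2\epsilon$, meaning that $\{\mu_n\}$ is tight.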

Secondly, we show that $\varphi$ is the characteristic function of $\mu$.
As $\{\mu_n\}$ is tight, by Theorem 16 (tightness and weak convergence), every subsequential limit is a probability distribution, i.e. $\mu_{n(k)} \Rightarrow \mu$ along some subsequence. Then, by (1), we have $\varphi_{n(k)}(t) \to \varphi_\mu(t)$ where $\varphi_\mu$ is the characteristic function of $\mu$, and since $\varphi_n(t) \to \varphi(t)$, we have $\varphi_\mu = \varphi$.

Lastly, we show that $\mu_n \Rightarrow \mu$.
Let $g$ be a bounded and continuous function. From the second result of $\varphi_\mu = \varphi$ and (1), any subsequence of $\int g\,d\mu_n$ has a further subsequence that converges to $\int g\,d\mu$, i.e. $\int g\,d\mu_{n(k_j)} \to \int g\,d\mu$. As this holds for an arbitrary subsequence in the real space, by Law of Large Numbers > Remark 12 (subsequent convergence and convergence), we have $\int g\,d\mu_n \to \int g\,d\mu$. Then by the converse of Theorem 9 (weak convergence and bounded convergence), we therefore have $\mu_n \Rightarrow \mu$, where $\mu$ is the unique probability measure that corresponds to $\varphi$, whose uniqueness is provided by Corollary 31 (identical characteristic function implies same distribution).

Central Limit Theorem

Moments and Derivatives

Lemma (boundary of correction term).

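We have $\left|e^{ix} - \sum_{m=0}^{n} \frac{(ix)^m}{m!}\right| \le \min\left(\frac{|x|^{n+1}}{(n+1)!}, \frac{2|x|^n}{n!}\right)$.

Proof. Integrating by parts, we have $\int_0^x (x-s)^n e^{is}\,ds = \frac{x^{n+1}}{n+1} + \frac{i}{n+1}\int_0^x (x-s)^{n+1}e^{is}\,ds$. If $n = 0$, the left-hand side is $(e^{ix} - 1)/i$, and by rearranging the terms, $e^{ix} = 1 + ix + i^2\int_0^x (x-s)e^{is}\,ds$. Now for $n = 1$, we have $e^{ix} = 1 + ix + \frac{i^2x^2}{2} + \frac{i^3}{2}\int_0^x (x-s)^2e^{is}\,ds$, and iterating with respect to $n$ gives us $e^{ix} = \sum_{m=0}^{n} \frac{(ix)^m}{m!} + \frac{i^{n+1}}{n!}\int_0^x (x-s)^n e^{is}\,ds$, which gives us $\left|e^{ix} - \sum_{m=0}^{n} \frac{(ix)^m}{m!}\right| = \frac{1}{n!}\left|\int_0^x (x-s)^n e^{is}\,ds\right|$. Now it is left to show that the right hand side is bounded.

The first bound is the useful one if $x$ is small: as $|e^{is}| \le 1$ for all $s$, we have $\frac{1}{n!}\left|\int_0^x (x-s)^n e^{is}\,ds\right| \le \frac{|x|^{n+1}}{(n+1)!}$. The second is the useful one if $x$ is large: by the integration by parts above with $n - 1$ in place of $n$ and $\frac{x^n}{n} = \int_0^x (x-s)^{n-1}\,ds$, as $|e^{is} - 1| \le 2$, we have $\frac{1}{n!}\left|\int_0^x (x-s)^n e^{is}\,ds\right| = \frac{1}{(n-1)!}\left|\int_0^x (x-s)^{n-1}(e^{is} - 1)\,ds\right| \le \frac{2|x|^n}{n!}$. Thus we have the desired result.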

Corollary (expected value of boundary).

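Taking expectation values on Lemma 34 (boundary of correction term), we have $\left|Ee^{itX} - \sum_{m=0}^{n} E\frac{(itX)^m}{m!}\right| \le E\min\left(\frac{|tX|^{n+1}}{(n+1)!}, \frac{2|tX|^n}{n!}\right)$.

Proof. Using properties of expectation, in particular $|E(Z)| \le E|Z|$, we have $\left|Ee^{itX} - \sum_{m=0}^{n} E\frac{(itX)^m}{m!}\right| \le E\left|e^{itX} - \sum_{m=0}^{n} \frac{(itX)^m}{m!}\right| \le E\min\left(\frac{|tX|^{n+1}}{(n+1)!}, \frac{2|tX|^n}{n!}\right)$, which is the desired result.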

Definition (small o).

We denote $f(t) = o(g(t))$ as $t \to 0$ if $\lim_{t\to0} f(t)/g(t) = 0$.

We use this notation to simplify the proof.

Theorem (error term of characteristic function).

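If $E|X|^2 < \infty$ and $t \to 0$, then $\varphi(t) = 1 + itE(X) - \frac{t^2E(X^2)}{2} + o(t^2)$.

Proof. As $E|X|^2 < \infty$, consider the case when $n = 2$ in Corollary 35 (expected value of boundary), which gives $\left|\varphi(t) - 1 - itE(X) + \frac{t^2E(X^2)}{2}\right| \le t^2E\min\left(\frac{|t||X|^3}{6}, |X|^2\right)$. If $t \to 0$, then the minimum goes to $0$, as $|t||X|^3/6 \to 0$. This also means that $\min(|t||X|^3/6, |X|^2) \to 0$ almost surely, where the minimum is dominated by $|X|^2$ and $E|X|^2$ is finite. Thus applying Measure Theoretic Preliminaries > Theorem 68 (dominated convergence theorem), we have $E\min(|t||X|^3/6, |X|^2) \to 0$, implying that the error term is $o(t^2)$. Thus we have $\varphi(t) = 1 + itE(X) - \frac{t^2E(X^2)}{2} + o(t^2)$.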

Theorem (existence of finite second moments).

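If $\limsup_{h \downarrow 0} \frac{\varphi(h) - 2\varphi(0) + \varphi(-h)}{h^2} > -\infty$ then $E|X|^2 < \infty$.

Proof. Remark that as $e^{ihx} + e^{-ihx} = 2\cos(hx)$ we have $\frac{e^{ihx} - 2 + e^{-ihx}}{h^2} = -\frac{2(1 - \cos(hx))}{h^2} \le 0$, where the inequality holds since $\cos(y) \le 1$ for any $y$.

Applying l'Hospital's rule, $\lim_{h\to0} \frac{2(1 - \cos(hx))}{h^2} = \lim_{h\to0} \frac{x\sin(hx)}{h} = x^2$, where the last equality holds since $\frac{\sin(hx)}{hx} \to 1$ as $h \to 0$.

As $\frac{2(1 - \cos(hX))}{h^2} \ge 0$ and $\frac{2(1 - \cos(hX))}{h^2} \to X^2$ as $h \to 0$, by Measure Theoretic Preliminaries > Corollary 64 (Fatou's Lemma), we have $E(X^2) \le \liminf_{h \downarrow 0} E\frac{2(1 - \cos(hX))}{h^2} = -\limsup_{h \downarrow 0} \frac{\varphi(h) - 2\varphi(0) + \varphi(-h)}{h^2} < \infty$, where the last inequality holds by the assumption. Thus we have $E|X|^2 < \infty$.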

Central Limit Theorem

Lemma (convergence of complex sequence).

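If $c_n \to c \in \mathbb{C}$, then we have $\left(1 + \frac{c_n}{n}\right)^n \to e^{c}$.

Proof. We just provide intuition for the proof. Note that $\log(1 + z) = z + O(|z|^2)$ as $z \to 0$, thus $n\log\left(1 + \frac{c_n}{n}\right) = c_n + O\left(\frac{|c_n|^2}{n}\right) \to c$, where the assumption provides $c_n \to c$.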
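Since only intuition is given above, a numerical check may be reassuring. The sketch below assumes the illustrative choices $c = -0.5 + 2i$ and $c_n = c + 1/n$ and prints the distance between $(1 + c_n/n)^n$ and $e^c$.

```python
import cmath

c = complex(-0.5, 2.0)           # illustrative limit point
target = cmath.exp(c)
errors = {}
for n in (10, 1_000, 100_000):
    c_n = c + 1.0 / n            # any sequence with c_n -> c would do
    errors[n] = abs((1 + c_n / n) ** n - target)
    print(n, errors[n])
```

The error shrinks roughly like $1/n$, in line with the $O(|c_n|^2/n)$ correction term in the heuristic.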

Theorem (central limit theorem).

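Let $X_1, X_2, \dots$ be i.i.d. with $EX_i = \mu$ and $\mathrm{Var}(X_i) = \sigma^2 \in (0, \infty)$. If $S_n = X_1 + \cdots + X_n$, then $\frac{S_n - n\mu}{\sigma\sqrt{n}} \Rightarrow \chi$ where $\chi \sim N(0,1)$.

Proof. Remark that since $X_i' = X_i - \mu$ have $EX_i' = 0$ and $\mathrm{Var}(X_i') = \sigma^2$, and are still i.i.d., we can consider $X_i'$ in place of $X_i$. Thus we may assume $\mu = 0$.

For $S_n/(\sigma\sqrt{n})$, as $X_i$ are i.i.d. with common characteristic function $\varphi$, by Corollary 25 (ch.f of iid summation) and Proposition 22 (properties of characteristic function), we have $\varphi_{S_n/(\sigma\sqrt{n})}(t) = \left[\varphi\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n$, and by Theorem 37 (error term of characteristic function), if $t, \sigma$ are fixed, then $\varphi\left(\frac{t}{\sigma\sqrt{n}}\right) = 1 - \frac{t^2}{2n} + o\left(\frac{1}{n}\right)$. Thus we have $\varphi_{S_n/(\sigma\sqrt{n})}(t) = \left(1 - \frac{t^2}{2n} + o\left(\frac{1}{n}\right)\right)^n$. Let $c_n = -\frac{t^2}{2} + n \cdot o\left(\frac{1}{n}\right)$, so that $c_n \to -\frac{t^2}{2}$ as $n \to \infty$. Therefore by Lemma 39 (convergence of complex sequence) we have $\varphi_{S_n/(\sigma\sqrt{n})}(t) = \left(1 + \frac{c_n}{n}\right)^n \to e^{-t^2/2} = \varphi_\chi(t)$; note that the last equality identifies the limit as the standard normal, as the characteristic functions are unique to each distribution, by Corollary 31 (identical characteristic function implies same distribution).

As $\varphi_{S_n/(\sigma\sqrt{n})}(t) \to e^{-t^2/2}$, which is continuous at $0$, by Theorem 33 (continuity theorem), we have $\mu_n \Rightarrow \mu$, where $\mu$ is the probability measure with characteristic function $e^{-t^2/2}$ and $\mu_n$ is the distribution of $S_n/(\sigma\sqrt{n})$ with characteristic function $\varphi_{S_n/(\sigma\sqrt{n})}$. This gives us $\frac{S_n}{\sigma\sqrt{n}} \Rightarrow \chi$.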
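The convergence of characteristic functions at the heart of this proof can be seen in a case where everything is explicit. The sketch below assumes i.i.d. signs $X_i = \pm 1$ with probability $1/2$ each (mean $0$, variance $1$, characteristic function $\cos t$), an example chosen here for illustration; the function name is hypothetical.

```python
import math

def chf_normalized_sum(t, n):
    """Characteristic function of S_n / sqrt(n) for i.i.d. signs X_i = +1 or -1 with probability 1/2."""
    # Each X_i has mean 0, variance 1, and characteristic function cos(t).
    return math.cos(t / math.sqrt(n)) ** n

t = 1.5
limit = math.exp(-t * t / 2.0)   # characteristic function of N(0, 1) at t
for n in (1, 10, 100, 10_000):
    print(n, chf_normalized_sum(t, n), limit)
```

The printed values approach $e^{-t^2/2}$ as $n$ grows, which by the continuity theorem is exactly what weak convergence of $S_n/\sqrt{n}$ to the standard normal requires.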