Chapter 3: conditional expectation and martingales Flashcards
(26 cards)
Conditional expectation for X and Z discrete random variables, taking values in {x_1,...,x_m} and {z_1,...,z_n}:
P[X=x_i | Z=z_j] = ?
E[X | Z=z_j] = ?
Y = E[X|Z], where if Z(ω)=z_j, then … ?
Conditional expectation defined:
P[X=x_i | Z=z_j] = P[X=x_i, Z=z_j] / P[Z=z_j]
E[X | Z=z_j] = sum over i of x_i P[X=x_i | Z=z_j]
Y = E[X|Z], where if Z(ω)=z_j, then Y(ω) = E[X | Z=z_j]
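A quick worked instance of these formulas (my own added example, assuming two independent fair coin tosses with X = number of heads and Z = 1 if the first toss is heads, 0 otherwise):
```latex
% Illustrative example (added, not from the card): two independent fair coin tosses,
% X = number of heads, Z = indicator that the first toss is heads.
\[
P[X=2 \mid Z=1] = \frac{P[X=2,\ Z=1]}{P[Z=1]} = \frac{1/4}{1/2} = \frac{1}{2},
\qquad
E[X \mid Z=1] = 0\cdot 0 + 1\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{2} = \tfrac{3}{2}.
\]
% Similarly E[X | Z=0] = 1/2, so Y = E[X|Z] equals 3/2 on {Z=1} and 1/2 on {Z=0}.
```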
Problems with this elementary definition:
1) it is not clear how discrete and continuous random variables interact
2) what if our random variables are neither all discrete nor all continuous?
For a probability space (Ω, F, P), a random variable is a map X: Ω → R.
If F is large, we may want to work with a sub-sigma-algebra G of F: we want a random variable Y such that
1) Y is in mG ie Y is G-measurable
Y depends on the info we have
2) Y is the best way to approximate X with a G-measurable random variable
Eg best prediction for X, given G, info we have up to today
Unique best prediction
Eg minimise E[ |Y-X|]
Minimise var(Y-X) etc
Theorem 3.1.1- conditional expectation
Let X be an L^1 random variable on (Ω, F, P). Let G be a sub-sigma-field of F. Then there exists a random variable Y in L^1 such that
1) Y is G-measurable
2) For every event G in G, E[Y 1_G] = E[X 1_G]
(Here 1_G is the indicator function of the event G in G: it is 1 on G and 0 off it, so only behaviour on the event contributes.)
Moreover, if Y' in L^1 is a second random variable satisfying conditions 1) and 2), then P[Y = Y'] = 1.
(Doesn’t tell us what Y is)
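A minimal sanity check of the two conditions (added illustration, not from the card): take G = {∅, Ω} and Y = E[X], a constant, which is trivially G-measurable.
```latex
% Condition 2) checked for the only two events in G = {∅, Ω}:
\[
E[Y\,1_{\emptyset}] = 0 = E[X\,1_{\emptyset}],
\qquad
E[Y\,1_{\Omega}] = E[Y] = E[X] = E[X\,1_{\Omega}].
\]
% So E[X | {∅, Ω}] = E[X], which reappears later as the "no info" property.
```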
Definition 3.1.2 best way to approximate X given only info in G
We refer to Y as a version of the conditional expectation of X given G, and write Y = E[X|G].
Sketch of the idea behind Definition 3.1.2: Y as conditional expectation
Look at the missing information and use expectation to average it out, in order to predict X.
Y is a random variable, depending only on the information in G.
Example: let X_1 and X_2 be independent random variables
P[ X_i =1] = P [X_i = -1] = 0.5
Claim: E[(X_1 + X_2) | sigma(X_1)] = X_1
Note: X_1 + X_2 plays the role of X, and sigma(X_1) plays the role of G
That is, the info is the sigma-field generated by X_1, not X_2,
and X_1 plays the role of Y.
E[(X_1 + X_2) | sigma(X_1)] = X_1 + 0
The "+0" is because E[X_2] = 0: we have no information about X_2, so it averages out.
Proof: we need to check properties one and two
1)
X_1 is in m sigma(X_1) by Lemma 2.2.5, so Y = X_1 is G-measurable, i.e. Y is in mG (here G = sigma(X_1)).
2) Take an event G in G (i.e. an event which is G-measurable).
E[X 1_G] = E[(X_1 + X_2) 1_G]
= E[X_1 1_G] + E[X_2 1_G]
(Note: X_2 is in m sigma(X_2) by Lemma 2.2.5, i.e. it is sigma(X_2)-measurable.
We want to reach E[Y 1_G], which here is E[X_1 1_G].
Similarly, the indicator 1_G is in mG by Lemma 2.2.4.
sigma(X_1) and sigma(X_2) are independent, so X_2 and 1_G are independent.)
= E[X_1 1_G] + E[X_2] E[1_G]
= E[X_1 1_G] = E[Y 1_G], since E[X_2] = 0, as required.
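A quick numerical sanity check of the claim (an added sketch, not part of the notes): simulate X_1 and X_2, then average X_1 + X_2 separately over the samples with X_1 = 1 and with X_1 = -1.
```python
# Monte Carlo sanity check (illustrative sketch): E[(X_1 + X_2) | sigma(X_1)] = X_1,
# so the average of X_1 + X_2 over samples with X_1 = +1 should be close to +1,
# and close to -1 over samples with X_1 = -1.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.choice([-1, 1], size=n)   # P[X_1 = 1] = P[X_1 = -1] = 0.5
x2 = rng.choice([-1, 1], size=n)   # independent of X_1

s = x1 + x2
for value in (-1, 1):
    mask = (x1 == value)
    print(f"X_1 = {value:+d}: empirical mean of X_1 + X_2 given X_1 ≈ {s[mask].mean():+.3f}")
# Expected output: approximately -1.000 and +1.000 respectively.
```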
conditional expectation
E[X|G] = Y
X is the RV we want to predict
G is a sigma-field representing the info we currently know
Y is the conditional expectation of X given G, best guess for X
Proposition 3.2.2 properties of conditional expectations
Let G, H (curly G and H) be sub-sigma-fields of F and X, Y, Z in L^1,
a_1, a_2 in R
Then, almost surely
LINEARITY:
E[a_1 X_1 + a_2 X_2 | G] = a_1 E[X_1|G] + a_2 E[X_2|G]
ABSOLUTE VALUES:
|E[X|G]| less than or equal to E[ |X| |G]
MONOTONICITY:
If X is less than or equal to Y then
E[ X|G] less than or equal to E[Y |G]
CONSTANTS:
If a is in R (deterministic) then
E[ a|G] =a
MEASURABILITY:
If X is G measurable ( X depends on info only in G) (show you’ve checked this condition)
Then E[X|G ] =X
INDEPENDENCE:
If X is independent of G ( X depends on info we don’t have)
Then E[X|G] = E[X]
TAKING OUT WHAT IS KNOWN:
(Z adds no new info beyond the conditioning: it is already known given G)
If Z is G-measurable, then E[ZX|G] = ZE[X|G]
TOWER:
If H is a subset of G, then E[E[X|G]|H] = E[X|H]
TAKING E:
It holds that E[E[X|G]] = E[X]
Ie interaction between conditional expectation and expectation
NO INFO: It holds that E[X | {∅, Ω}] = E[X]
I.e. taking the conditional expectation given the smallest sigma-field {∅, Ω} (just the empty set and the whole sample space) gives no info; if we know nothing, the best guess is E[X].
(Remember first 5 properties and always write which one you’ve used)
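A small worked illustration (my own computation, added) of the "taking out what is known" and "taking E" properties, using the coin-flip variables X_1, X_2 from the earlier example and recalling E[X_1 + X_2 | sigma(X_1)] = X_1:
```latex
% With X_1, X_2 independent, P[X_i=1]=P[X_i=-1]=1/2, and G = sigma(X_1):
\[
E[X_1(X_1+X_2) \mid \mathcal{G}]
  = X_1\, E[X_1+X_2 \mid \mathcal{G}] \quad\text{(taking out what is known)}
  = X_1 \cdot X_1 = 1,
\]
\[
E\big[\, E[X_1+X_2 \mid \mathcal{G}] \,\big] = E[X_1] = 0 = E[X_1+X_2]
  \quad\text{(taking E)} .
\]
```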
Lemma 3.2.3
Expectations of X and conditional expectations
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
I.e. we measure the distance between X and Y (the conditional expectation of X).
Y is at least as good an estimator of X as Y' is, i.e. there is no better G-measurable approximation of X than Y.
Lemma 3.2.3 PROOF
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
E[ (X-Y’)^2]= E [ (X-Y +Y-Y’)^2]
BY LINEARITY
= E[ (X-Y)^2] + 2E[(X-Y)(Y-Y’)] + E[ (Y-Y’)^2]
The last term is bigger than or equal to 0, since (Y-Y')^2 is bigger than or equal to 0 (MONOTONICITY of expectation).
(Middle term: by the TAKING E rule, E[(X-Y)(Y-Y')] = E[ E[(X-Y)(Y-Y') | G] ]. Since Y-Y' is G-measurable, TAKING OUT WHAT IS KNOWN gives = E[(Y-Y') E[(X-Y)|G]], and by LINEARITY and MEASURABILITY, E[(X-Y)|G] = E[X|G] - Y.
But E[X|G] - Y = 0, since Y = E[X|G], so the middle term vanishes.)
Thus
E[ (X-Y’)^2]
Bigger than or equal to
E[ (X-Y)^2]
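A numerical illustration of the lemma (an added sketch): take X = X_1 + X_2 and G = sigma(X_1) from the earlier example, so Y = X_1, and compare with an arbitrary alternative G-measurable guess Y' = 0.5 X_1 (my choice).
```python
# Illustrative sketch: the conditional expectation Y = E[X | G] minimises mean-squared
# error among G-measurable predictors. Here X = X_1 + X_2, G = sigma(X_1), so Y = X_1;
# Y' = 0.5 * X_1 is an arbitrary alternative sigma(X_1)-measurable guess.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x1 = rng.choice([-1, 1], size=n)
x2 = rng.choice([-1, 1], size=n)
x = x1 + x2                # X = X_1 + X_2, the variable we want to predict

y = x1                     # Y  = E[X | sigma(X_1)] = X_1 (from the earlier example)
y_alt = 0.5 * x1           # Y' = some other sigma(X_1)-measurable predictor

print("E[(X - Y )^2] ≈", np.mean((x - y) ** 2))       # ≈ 1.00
print("E[(X - Y')^2] ≈", np.mean((x - y_alt) ** 2))   # ≈ 1.25 (larger, as the lemma predicts)
```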
A stochastic process
A stochastic process (S_n)_{n=0} ^ infinity
(Or sometimes n=1)
Is a sequence of RVs.
A stochastic process is bounded
A stochastic process (S_n) is bounded if for some c in R we have
|S_n| less than or equal to c, for all n
DEF 3.3.1 filtration
A sequence of sigma-fields (F_n)_{n=0}^infinity
Is known as a filtration if F_0 ⊆ F_1 ⊆ F_2 ⊆ ... ⊆ F
DEF 3.3.2 adapted
A stochastic process X= (X_n) is adapted to the filtration (F_n) if for all n, X_n is F_n measurable
- if the filtration contains the info we can see, then based on this info we can see the value of X_n.
We "watch X happen": its value is a random value we know at each time n in N.
DEF 3.3.3 martingale
A process M=(M_n)_{n=0} ^infinity
Is a martingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] = M_n
M3 is the martingale property of fairness
Def SUBMARTINGALE
A process M=(M_n)_{n=0} ^infinity
Is a submartingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] bigger than or equal to M_n
M3 is the submartingale version of the fairness property (on average, non-decreasing)
DEF SUPERMARTINGALE
A process M=(M_n)_{n=0} ^infinity
Is a supermartingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] less than or equal to M_n
M3 is the supermartingale version of the fairness property (on average, non-increasing)
Example: a martingale
Consider (X_n), a sequence of independent random variables with
P[X_i = 1] = P[X_i = -1] = 0.5
S_n = X_1 + ... + X_n
(Total win after n plays)
F_n = sigma(X_1…,X_n)
F_n is the info from the first n rounds
Then (Sn) is a martingale:
Equally likely to win or lose and independence outcomes thus in the long run expect 0, previous results don’t help you.
Checking this:
(M1) S_n ∈ mF_n by p2.5, because the X_i are sigma(X_i)-measurable and sigma(X_i) ⊆ F_n for i ≤ n
(M2) |S_n| = |X_1 + ... + X_n| ≤ |X_1| + ... + |X_n| = n, by the triangle inequality (each |X_i| = 1)
so S_n is bounded, hence S_n ∈ L^1
(M3)
E[S_{n+1} | F_n]
= E[(X_{n+1} + X_1 + ... + X_n) | F_n]
= E[X_{n+1}] + (X_1 + ... + X_n)   (X_{n+1} is independent of F_n; the sum S_n is F_n-measurable)
= S_n
(as E[X_{n+1}] = 0)
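A simulation sketch I've added to illustrate (M3): average S_6 over paths that share a given value of S_5. (Strictly this conditions on the value of S_5 rather than on all of F_5, but for this walk both give S_5.)
```python
# Illustrative sketch: for the simple random walk, the average of S_6 over paths
# with a given value of S_5 should be (approximately) that value: the fairness property.
import numpy as np

rng = np.random.default_rng(2)
paths, n = 200_000, 5
steps = rng.choice([-1, 1], size=(paths, n + 1))   # columns are X_1, ..., X_6
s = steps.cumsum(axis=1)                           # s[:, k-1] = S_k

s_n, s_next = s[:, n - 1], s[:, n]                 # S_5 and S_6
for value in np.unique(s_n):
    mask = (s_n == value)
    print(f"S_5 = {value:+d}: empirical mean of S_6 given S_5 ≈ {s_next[mask].mean():+.3f}")
# Each printed average should be close to the conditioning value itself.
```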
Example 3.3.9: a general example of a martingale, built from a filtration
Example 3.3.9 Let Z ∈ L^1
be a random variable and let (F_n) be a filtration. Then
M_n = E[Z|F_n]
is a martingale.
- a sequence of successively better approximations to Z, obtained by taking conditional expectations wrt F_n
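A short added check that this really is a martingale, using the TOWER property:
```latex
% Check of (M3) for M_n = E[Z | F_n], using TOWER with F_n ⊆ F_{n+1}:
\[
E[M_{n+1} \mid \mathcal{F}_n]
  = E\big[\, E[Z \mid \mathcal{F}_{n+1}] \,\big|\, \mathcal{F}_n \big]
  = E[Z \mid \mathcal{F}_n]
  = M_n .
\]
% (M1): E[Z | F_n] is F_n-measurable by definition of conditional expectation.
% (M2): by ABSOLUTE VALUES and TAKING E, E[|M_n|] ≤ E[|Z|] < ∞.
```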
Lemma 3.3.6: expectation stays constant
Let (F_n) be a filtration and suppose that (M_n) is a martingale.
Then
E[M_n] = E[M_0]
for all n ∈ N
proof:
(M3) implies E[M_{n+1}|F_n] = M_n
And by taking expectation:
E[E[M_{n+1}|F_n]] = E[M_n]
By the "taking E" property of conditional expectation, the LHS equals E[M_{n+1}], so
E[M_{n+1}] = E[M_n]
Then by induction the result follows:
E[M_0] = E[M_n]
Example of a sub-martingale:
Take (X_i) to be iid
P[X_i =2] =P[X_i =-1] =0.5
(ie not symmetrical and expectation of X_i not equal to 0)
Now, E[X_i] = 0.5·2 + 0.5·(-1) = 0.5 > 0
S_n = X_1 + ... + X_n
Check (M1): X_i ∈ mF_i, i.e. measurable wrt the generated filtration F_i = sigma(X_1,...,X_i),
so X_i ∈ mF_n for i ≤ n, so by p, S_n ∈ mF_n
(M2) |X_i| ≤ 2,
so |S_n| ≤ |X_1| + ... + |X_n| ≤ 2n < infinity, by the triangle inequality
So S_n is bounded by 2n, so S_n ∈ L^1
(M3) E[S_{n+1} | F_n]
= E[(S_n + X_{n+1}) | F_n]          (by definition of S_{n+1})
= E[S_n | F_n] + E[X_{n+1} | F_n]   (by linearity)
= S_n + E[X_{n+1}]                  (S_n ∈ mF_n; X_{n+1} is independent of F_n)
= S_n + 0.5 ≥ S_n
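For this biased walk the on-average increase can also be computed directly (an added check, consistent with the submartingale version of Lemma 3.3.6 on the next card):
```latex
% Each step has mean E[X_i] = 0.5, so by linearity of expectation
\[
E[S_n] = \sum_{i=1}^{n} E[X_i] = 0.5\, n ,
\]
% which increases with n: E[S_0] ≤ E[S_1] ≤ E[S_2] ≤ ...
```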
lemma 3.3.6***
if (M_n) is a submartingale
If (M_n) is a submartingale, then by definition E[M_{n+1} | F_n] ≥ M_n, so taking expectations gives E[M_{n+1}] ≥ E[M_n].
(Hence E[M_0] ≤ E[M_n].)
lemma 3.3.6***
if (M_n) is a supermartingale
For supermartingales we get E[M_{n+1}] ≤ E[M_n].
(Hence E[M_0] ≥ E[M_n].)
In words: submartingales, on average, increase, whereas supermartingales, on average, decrease.
The use of "super-" and "sub-" is counter-intuitive in this respect.