2.2 Filtrations

Let (Ω, F) be a measurable space. A filtration in discrete time is a sequence of σ-algebras ⟨Ft⟩ such that Ft ⊂ F and Ft ⊂ Ft+1 for all t = 0, 1, .... In continuous time, the second condition is replaced by Fs ⊂ Ft for all s ≤ t.

3 Markov processes

A Markov process captures the idea of a short-memory stochastic process: once its current state is known, past history is irrelevant for predicting its future.

Definition. Let (Ω, F) be a measurable space and let (P, F) be, respectively, a probability measure on, and a filtration of, this space. Let X be a stochastic process in discrete time on (Ω, F). Then X is called a (P, F)-Markov process if

1. X is F-adapted, and
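The nesting condition Ft ⊂ Ft+1 can be made concrete on a finite sample space. The following Python sketch (the coin-toss setup and all names are illustrative, not from the notes) builds the filtration generated by N coin tosses: the atoms of Ft are the events determined by the first t tosses, and Ft ⊂ Ft+1 holds exactly when every atom of Ft is a union of atoms of Ft+1.

```python
from itertools import product

# Sample space: all sequences of N coin tosses (0 = tails, 1 = heads).
N = 4
omega = list(product([0, 1], repeat=N))

def atoms(t):
    """Atoms of F_t: outcomes grouped by the first t tosses.

    F_t is the sigma-algebra generated by these atoms, so F_t ⊂ F_{t+1}
    holds iff every atom of F_t is a union of atoms of F_{t+1}.
    """
    groups = {}
    for w in omega:
        groups.setdefault(w[:t], set()).add(w)
    return list(groups.values())

# Check the filtration property: each atom of F_t splits into atoms of F_{t+1}.
for t in range(N):
    for a in atoms(t):
        finer = [b for b in atoms(t + 1) if b <= a]
        assert a == set().union(*finer)
```

At t = 0 the only atom is Ω itself (we know nothing); at t = N each singleton outcome is an atom (we know everything), mirroring the growth of information the definition encodes.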
2. For each t ∈ Z+ and each Borel set B ∈ B(R),

P(Xt+1 ∈ B | Ft) = P(Xt+1 ∈ B | σ(Xt)). (9)

Sometimes, when the probability measure and filtration are understood, we will talk simply of a Markov process.

Remark. Often the filtration F is taken to be the one generated by the process X itself.

Proposition. Let (Ω, F, P, F) be a filtered probability space and let X be a (P, F)-Markov process. Let t, k ∈ Z+. Let f : R → R be a Borel function such that f(Xt+k) is integrable. Then

E[f(Xt+k) | Ft] = E[f(Xt+k) | σ(Xt)] (10)

and hence there is a Borel function g : Z+ × Z+ × R → R such that, for each t,

E[f(Xt+k) | Ft] = g(t, k, Xt). (11)

Proof. To show it for k = 1, use that f(Xt+1) is a σ(Xt+1)-measurable random variable and hence is the limit of a monotone increasing sequence of σ(Xt+1)-simple random variables. But such random variables are linear combinations of indicator functions of sets Xt+1⁻¹(B) with B a Borel set. This completes the proof for k = 1. To prove it for arbitrary positive k, use induction: to prove it for k + 1 assuming it true for k, use the law of iterated expectations.

The vector case is a simple extension of the scalar case. However, it is important that the definition of a vector Markov process is not that each component is Markov.
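For a time-homogeneous finite-state chain, the function g(t, k, x) of the proposition is explicit: E[f(Xt+k) | Xt = x] = (Pᵏf)(x), where P is the one-step transition matrix. The following Python sketch (the two-state matrix P and the function f are made-up examples, not from the notes) checks this against a brute-force sum over all length-k paths, which is exactly the repeated application of iterated expectations used in the induction step.

```python
from itertools import product
import numpy as np

# Hypothetical two-state chain on {0, 1}: P[i, j] is the probability
# of moving from state i to state j in one step.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
f = np.array([1.0, 5.0])   # an arbitrary Borel function on the state space

def cond_exp_brute(x, k):
    """E[f(X_{t+k}) | X_t = x] by summing over all length-k paths."""
    total = 0.0
    for path in product(range(2), repeat=k):
        prob, cur = 1.0, x
        for s in path:
            prob *= P[cur, s]
            cur = s
        total += prob * f[cur]
    return total

k = 3
g = np.linalg.matrix_power(P, k) @ f   # g(x) = E[f(X_{t+k}) | X_t = x]
for x in (0, 1):
    assert abs(g[x] - cond_exp_brute(x, k)) < 1e-12
```

The agreement of the two computations is just associativity of matrix multiplication, which is the finite-state face of the law of iterated expectations.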
Instead, we require that all the relevant (for the future of X) bits of information in Ft are in the σ-algebra generated by all the stochastic variables in Xt, i.e. σ(Xt) is defined as the single σ-algebra σ({X1,t, X2,t, ..., Xn,t}). This means that, for each Borel function f : Rⁿ → Rᵐ,

E[f(Xt+k) | Ft] = E[f(Xt+k) | σ(Xt)] (12)

and hence that there is a Borel function g : Z+ × Rⁿ → Rᵐ such that, for each t,

E[f(Xt+1) | Ft] = g(t, Xt). (13)

(The case k = 1 is so important that we stress it here by ignoring greater values of k.)

3.0.1 Probability transition functions and time homogeneity

Definition. Let (Ω, F, P, F) be a filtered probability space and let X be a (P, F)-Markov process. Then, for each t = 1, 2, ..., its probability transition function Qt : R × B(R) → [0, 1] is defined via

Qt(x, B) = P(Xt ∈ B | Xt−1 = x). (14)

Note that any Markov process has a sequence of probability transition functions. Note also that, for each fixed t and x, Qt+1(x, ·) is a probability measure on B(R). Meanwhile, if we fix B, Qt+1(Xt(·), B) is a random variable. Indeed, it is the conditional probability of Xt+1 ∈ B given Xt, i.e. Qt+1(Xt, B) = E[IB(Xt+1) | σ(Xt)]. Moreover, the conditional expectation of any σ(Xt+1)-measurable random variable (given Xt) is an integral with respect to the measure Qt+1(Xt, ·), in the following sense.
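For a finite-state chain, the probability transition function is nothing but a transition matrix: Qt+1(x, ·) is row x, and Qt+1(x, B) sums that row over the states in B. A minimal Python sketch (the 3-state matrix Q is a made-up example) checks the two properties just noted: each row is a probability measure, and for fixed B the map x ↦ Q(x, B) is a function of the current state only, so Q(Xt, B) is σ(Xt)-measurable.

```python
import numpy as np

# Hypothetical 3-state transition matrix: row x is the measure Q(x, ·).
Q = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8],
              [0.6, 0.2, 0.2]])

# Each fixed x gives a probability measure Q(x, ·): nonnegative, total mass 1.
assert np.all(Q >= 0)
assert np.allclose(Q.sum(axis=1), 1.0)

def Q_of(x, B):
    """Q(x, B) = P(X_{t+1} ∈ B | X_t = x) for a set B of states."""
    return Q[x, list(B)].sum()

# For fixed B, x ↦ Q(x, B) depends on the current state alone, so
# Q(X_t, B) is a σ(X_t)-measurable random variable.
B = {0, 2}
assert abs(Q_of(1, B) - 0.9) < 1e-12
```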
Proposition. Let (Ω, F, P, F) be a filtered probability space and let X be a (P, F)-Markov process. Let ⟨Qt⟩ be its probability transition functions and let Z ∈ L1(Ω, σ(Xt+1), P), say Z = f(Xt+1) with f Borel. Then, for each t = 0, 1, ...,

E[Z | Xt] = ∫_R f(y) Qt+1(Xt, dy) (15)

or, put differently, we have for each t = 0, 1, ... and each x,

E[Z | Xt = x] = ∫_R f(y) Qt+1(x, dy). (16)

Proof. We will show it first for an indicator variable Z = IA(Xt+1) where A ∈ B(R). Then f(y) = IA(y). We now need to show that the random variable ∫_R f(y) Qt+1(Xt, dy) qualifies as the conditional expectation E[Z | Xt]. Clearly it is σ(Xt)-measurable. But does it integrate to the right thing? Well, let G ∈ σ(Xt) and recall that, by definition, Qt+1(Xt, A) = E(IA(Xt+1) | σ(Xt)). Hence

∫_G ∫_R f(y) Qt+1(Xt, dy) P(dω) = ∫_G ∫_R IA(y) Qt+1(Xt, dy) P(dω)
= ∫_G Qt+1(Xt, A) P(dω) = ∫_G E(IA(Xt+1) | σ(Xt)) P(dω) = ∫_G IA(Xt+1) P(dω). (17)

Meanwhile, since Z = IA(Xt+1), we obviously have

∫_G Z P(dω) = ∫_G IA(Xt+1) P(dω). (18)
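With finitely many states, the integral in (16) collapses to the sum Σ_y f(y) Q(x, y), and the proposition can be checked numerically. The Python sketch below (the matrix Q, the law mu_t, and f are all made-up examples) computes E[f(Xt+1) | Xt = x] directly from the joint law of (Xt, Xt+1) and compares it with the row sum Σ_y f(y) Q(x, y).

```python
import numpy as np

# With finitely many states, ∫ f(y) Q(x, dy) is the sum Σ_y f(y) Q[x, y].
Q = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8],
              [0.6, 0.2, 0.2]])          # hypothetical transition matrix
mu_t = np.array([0.5, 0.3, 0.2])         # hypothetical law of X_t
f = np.array([2.0, -1.0, 4.0])           # Z = f(X_{t+1})

# Right-hand side of (16): x ↦ ∫ f(y) Q(x, dy).
rhs = Q @ f

# Left-hand side: E[Z | X_t = x] from the joint distribution
# P(X_t = x, X_{t+1} = y) = mu_t[x] * Q[x, y].
joint = mu_t[:, None] * Q
lhs = (joint * f[None, :]).sum(axis=1) / joint.sum(axis=1)

assert np.allclose(lhs, rhs)
```

The cancellation of mu_t[x] between numerator and denominator is the finite-state version of the integration argument in the proof: the conditional expectation given Xt = x does not depend on the law of Xt, only on the transition measure Q(x, ·).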
To show the theorem for an arbitrary Z ∈ L1(Ω, σ(Xt+1), P), use the Monotone Convergence Theorem. □

We now use the probability transition function to define a time homogeneous Markov process.

Definition. Let (Ω, F, P, F) be a filtered probability space and let X be a (P, F)-Markov process. Let ⟨Qt⟩ (t = 1, 2, ...) be its probability transition functions. If there is a Q such that Qt = Q for all t = 1, 2, ..., then X is called a time homogeneous Markov process.

Proposition. Let (Ω, F, P, F) be a filtered probability space and let X be a time homogeneous (P, F)-Markov process. For any nonnegative integers k, t, let Yt+k = f(Xt+k), where f is a Borel function such that Yt+k ∈ L1(Ω, σ(Xt+k), P). Then for each k = 0, 1, ... there is a Borel function gk : R → R such that, for each t = 0, 1, ...,

E[Yt+k | Ft] = gk(Xt). (19)

In particular, there is a Borel function h such that, for each t = 0, 1, ...,

E[Yt+1 | Ft] = h(Xt). (20)

3.1 Finite-state Markov chains in discrete time

This is perhaps the simplest class of Markov processes. Let (Ω, F, P) be a probability space and let X = {x1, x2, ..., xn} be a finite set. Let X : Z+ → X be a stochastic process. Denote by µt the vector whose ith element is the probability that Xt = xi and suppose there is