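Lemma 8.1: If $\theta_0$ is identified and $E_{\theta_0}[|\log p(X;\theta)|] < \infty$ for all $\theta \in \Theta$, then $Q_0(\theta) = E_{\theta_0}[\log p(X;\theta)]$ is uniquely maximized at $\theta = \theta_0$.

Proof: By Jensen's inequality, for any strictly convex function $g(\cdot)$ and any random variable $Y$ that is not almost surely constant, $E[g(Y)] > g(E[Y])$. Take $g(y) = -\log(y)$ and $Y = p(X;\theta)/p(X;\theta_0)$. Because $\theta_0$ is identified, $Y$ is not almost surely constant when $\theta \neq \theta_0$, so the inequality is strict. So, for $\theta \neq \theta_0$,

$$E_{\theta_0}\left[-\log \frac{p(X;\theta)}{p(X;\theta_0)}\right] > -\log E_{\theta_0}\left[\frac{p(X;\theta)}{p(X;\theta_0)}\right].$$

Note that

$$E_{\theta_0}\left[\frac{p(X;\theta)}{p(X;\theta_0)}\right] = \int \frac{p(x;\theta)}{p(x;\theta_0)}\, p(x;\theta_0)\, d\mu(x) = \int p(x;\theta)\, d\mu(x) = 1.$$

So $E_{\theta_0}\left[-\log \frac{p(X;\theta)}{p(X;\theta_0)}\right] > -\log 1 = 0$, or

$$Q_0(\theta_0) = E_{\theta_0}[\log p(X;\theta_0)] > E_{\theta_0}[\log p(X;\theta)] = Q_0(\theta).$$

This inequality holds for all $\theta \neq \theta_0$.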
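The lemma can be checked numerically in a simple case. The sketch below is only an illustration, not part of the proof. It assumes a Bernoulli model, a hypothetical true value `theta0 = 0.3`, and a finite grid standing in for $\Theta$. For this model $Q_0(\theta) = \theta_0 \log\theta + (1-\theta_0)\log(1-\theta)$, and the gap $Q_0(\theta_0) - Q_0(\theta)$ is the quantity that Jensen's inequality shows to be strictly positive.

```python
import math

def q0(theta, theta0):
    """Population objective Q0(theta) = E_{theta0}[log p(X; theta)]
    for X ~ Bernoulli(theta0) (an assumed example model)."""
    return theta0 * math.log(theta) + (1.0 - theta0) * math.log(1.0 - theta)

theta0 = 0.3                              # hypothetical true parameter
grid = [k / 100 for k in range(1, 100)]   # 0.01, 0.02, ..., 0.99

# Gap Q0(theta0) - Q0(theta) at every grid point.
gaps = {theta: q0(theta0, theta0) - q0(theta, theta0) for theta in grid}

# Grid point at which Q0 is largest.
argmax = max(grid, key=lambda t: q0(t, theta0))

print("argmax of Q0 on the grid:", argmax)
print("smallest gap away from theta0:",
      min(g for t, g in gaps.items() if t != theta0))
```

On this grid the gap is zero only at `theta0` and positive elsewhere, which is the unique-maximum property the lemma states.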
Under technical conditions for the limit of the maximum to be the maximum of the limit, $\hat{\theta}(X_n)$ should converge in probability to $\theta_0$. Sufficient conditions for the maximum of the limit to be the limit of the maximum are that the convergence is uniform and the parameter space is compact.
The discussion so far only allows for a compact parameter space. In theory, compactness requires that one know bounds on the true parameter value, although this constraint is often ignored in practice. It is possible to drop this assumption if the function $Q(\theta; X_n)$ cannot rise too much as $\theta$ becomes unbounded. We will discuss this later.
Definition (Uniform Convergence in Probability): $Q(\theta; X_n)$ converges uniformly in probability to $Q_0(\theta)$ if

$$\sup_{\theta \in \Theta} |Q(\theta; X_n) - Q_0(\theta)| \xrightarrow{P(\theta_0)} 0.$$

More precisely, for all $\epsilon > 0$,

$$P_{\theta_0}\left[\sup_{\theta \in \Theta} |Q(\theta; X_n) - Q_0(\theta)| > \epsilon\right] \to 0.$$

Why isn't pointwise convergence enough? Uniform convergence guarantees that for almost all realizations, the entire path in $\theta$ lies in the $\epsilon$-sleeve around $Q_0(\theta)$. This ensures that the maximum is close to $\theta_0$. For pointwise convergence, we know that at each $\theta$, most of the realizations are in the $\epsilon$-sleeve, but there is no guarantee that for another value of $\theta$ the same set of realizations is in the sleeve. Thus, the maximum need not be near $\theta_0$.
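To make the definition concrete, the following sketch computes the sup-distance between a sample objective and its limit for one simulated data set at several sample sizes. Everything specific here is an assumption made for illustration: a Bernoulli model, the sample objective taken to be the average log-likelihood $Q(\theta; X_n) = \frac{1}{n}\sum_i \log p(X_i;\theta)$, a hypothetical true value `theta0 = 0.3`, and the compact set $\Theta = [0.05, 0.95]$ approximated by a grid.

```python
import math
import random

def sup_gap(sample, theta0, grid):
    """Approximate sup over the grid of |Q(theta; X_n) - Q0(theta)|
    for the Bernoulli average log-likelihood (assumed example model)."""
    xbar = sum(sample) / len(sample)

    def q_n(t):   # sample objective: average log-likelihood
        return xbar * math.log(t) + (1.0 - xbar) * math.log(1.0 - t)

    def q_0(t):   # population objective: expected log-likelihood
        return theta0 * math.log(t) + (1.0 - theta0) * math.log(1.0 - t)

    return max(abs(q_n(t) - q_0(t)) for t in grid)

random.seed(0)
theta0 = 0.3                                        # hypothetical true parameter
grid = [0.05 + 0.9 * k / 200 for k in range(201)]   # grid on [0.05, 0.95]

results = {}
for n in (100, 1000, 20000):
    sample = [1 if random.random() < theta0 else 0 for _ in range(n)]
    results[n] = sup_gap(sample, theta0, grid)
    print(n, results[n])
```

In this model the difference is $(\bar{X}_n - \theta_0)\,[\log\theta - \log(1-\theta)]$, so the supremum over $[0.05, 0.95]$ equals $|\bar{X}_n - \theta_0| \log 19$. Compactness keeps the factor $\log 19$ finite. On all of $(0, 1)$ the supremum would be infinite whenever $\bar{X}_n \neq \theta_0$.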
Theorem 8.2: Suppose that $Q(\theta; X_n)$ is continuous in $\theta$ and there exists a function $Q_0(\theta)$ such that
1. $Q_0(\theta)$ is uniquely maximized at $\theta_0$,
2. $\Theta$ is compact,
3. $Q_0(\theta)$ is continuous in $\theta$,
4. $Q(\theta; X_n)$ converges uniformly in probability to $Q_0(\theta)$.
Then $\hat{\theta}(X_n)$, defined as the value of $\theta \in \Theta$ which for each $X_n = x_n$ maximizes the objective function $Q(\theta; X_n)$, satisfies

$$\hat{\theta}(X_n) \xrightarrow{P} \theta_0.$$
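The conclusion of the theorem can be illustrated by Monte Carlo. The sketch below estimates $P_{\theta_0}[|\hat{\theta}(X_n) - \theta_0| > \epsilon]$ at several sample sizes. It is a rough illustration under assumptions that are not in the theorem itself: a Bernoulli model with average log-likelihood as the objective, a hypothetical `theta0 = 0.3`, $\Theta = [0.05, 0.95]$ replaced by a grid, `eps = 0.05`, and 200 replications per sample size. The helper names `grid_mle` and `miss_frequency` are made up for this example.

```python
import math
import random

def grid_mle(sample, grid):
    """Maximize the Bernoulli average log-likelihood over a finite grid."""
    xbar = sum(sample) / len(sample)
    return max(grid, key=lambda t: xbar * math.log(t)
                                   + (1.0 - xbar) * math.log(1.0 - t))

def miss_frequency(n, theta0, grid, eps, reps, rng):
    """Fraction of simulated samples of size n whose grid maximizer
    lies more than eps away from theta0."""
    misses = 0
    for _ in range(reps):
        sample = [1 if rng.random() < theta0 else 0 for _ in range(n)]
        if abs(grid_mle(sample, grid) - theta0) > eps:
            misses += 1
    return misses / reps

rng = random.Random(0)
theta0 = 0.3                                        # hypothetical true parameter
grid = [0.05 + 0.9 * k / 200 for k in range(201)]   # grid on [0.05, 0.95]

freq = {n: miss_frequency(n, theta0, grid, eps=0.05, reps=200, rng=rng)
        for n in (50, 500, 5000)}
for n, f in freq.items():
    print(n, f)
```

The estimated miss frequency should shrink toward zero as `n` grows, which is what convergence in probability of $\hat{\theta}(X_n)$ to $\theta_0$ means.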