Non-parametric estimation in a semimartingale regression model. Part 1. Oracle inequalities

Konev Victor Vasilyevich; Pergamenshchikov Serguei Markovich

VESTNIK TOMSKOGO GOSUDARSTVENNOGO UNIVERSITETA (TOMSK STATE UNIVERSITY BULLETIN)

2009 Mathematics and Mechanics No. 3(7)

UDC 519.2

V. Konev, S. Pergamenshchikov

NON-PARAMETRIC ESTIMATION IN A SEMIMARTINGALE REGRESSION MODEL. PART 1. ORACLE INEQUALITIES¹

This paper considers the problem of estimating a periodic function in a continuous time regression model with a general square integrable semimartingale noise. A model selection adaptive procedure is proposed. Sharp non-asymptotic oracle inequalities have been derived.

Keywords: Non-asymptotic estimation; Non-parametric regression; Model selection; Sharp oracle inequality; Semimartingale noise.

AMS 2000 Subject Classifications: Primary: 62G08; Secondary: 62G05

1. Introduction

Consider a regression model in continuous time

$$dy_t = S(t)\,dt + d\xi_t, \quad 0 \le t \le n, \qquad (1)$$

where $S$ is an unknown 1-periodic $\mathbb{R} \to \mathbb{R}$ function, $S \in L_2[0,n]$; $(\xi_t)_{t \ge 0}$ is a square integrable unobservable semimartingale noise such that for any function $f$ from $L_2[0,n]$ the stochastic integral

$$I_n(f) = \int_0^n f_s\, d\xi_s \qquad (2)$$

is well defined with

$$E\, I_n(f) = 0 \quad\text{and}\quad E\, I_n^2(f) \le \sigma^* \int_0^n f_s^2\, ds, \qquad (3)$$

where $\sigma^*$ is some positive constant.

An important example of the disturbance $(\xi_t)_{t \ge 0}$ is the following process

$$\xi_t = \varrho_1 w_t + \varrho_2 z_t, \qquad (4)$$

where $\varrho_1$ and $\varrho_2$ are unknown constants, $|\varrho_1| + |\varrho_2| > 0$, $(w_t)_{t \ge 0}$ is a standard Brownian motion, $(z_t)_{t \ge 0}$ is a compound Poisson process defined as

$$z_t = \sum_{j=1}^{N_t} Y_j, \qquad (5)$$

where $(N_t)_{t \ge 0}$ is a standard homogeneous Poisson process with unknown intensity $\lambda > 0$ and $(Y_j)_{j \ge 1}$ is an i.i.d. sequence of random variables with

$$E\, Y_j = 0 \quad\text{and}\quad E\, Y_j^2 = 1. \qquad (6)$$

1 The paper is supported by the RFFI - Grant 09-01-00172-a.

Let $(T_k)_{k \ge 1}$ denote the arrival times of the process $(N_t)_{t \ge 0}$, that is,

$$T_k = \inf\{t \ge 0 : N_t = k\}. \qquad (7)$$

As is shown in Lemma A.2, the condition (3) holds for the noise (4) with

$$\sigma^* = \varrho_1^2 + \lambda \varrho_2^2.$$
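The following minimal sketch illustrates the noise (4) and the constant $\sigma^* = \varrho_1^2 + \lambda\varrho_2^2$ numerically. It is only an illustration under stated assumptions: the jumps $Y_j$ are taken to be symmetric $\pm 1$ variables (one possible law with $E\,Y_j = 0$, $E\,Y_j^2 = 1$), and the function names `sigma_star` and `sample_noise` are hypothetical, not part of the paper.

```python
import math
import random


def sigma_star(rho1: float, rho2: float, lam: float) -> float:
    """Constant from Lemma A.2: sigma* = rho1^2 + lam * rho2^2."""
    return rho1 ** 2 + lam * rho2 ** 2


def sample_noise(rho1: float, rho2: float, lam: float, t: float,
                 rng: random.Random) -> float:
    """Draw xi_t = rho1 * w_t + rho2 * z_t for the noise (4) at a fixed time t."""
    w_t = rng.gauss(0.0, math.sqrt(t))  # Brownian motion at time t
    # Poisson arrivals on [0, t] via exponential inter-arrival times.
    z_t = 0.0
    clock = rng.expovariate(lam)
    while clock <= t:
        z_t += rng.choice((-1.0, 1.0))  # assumed jump law: E Y = 0, E Y^2 = 1
        clock += rng.expovariate(lam)
    return rho1 * w_t + rho2 * z_t


rng = random.Random(0)
draws = [sample_noise(1.0, 2.0, 0.5, 1.0, rng) for _ in range(20000)]
# For f = 1 on [0, 1] the second moment of I_1(f) = xi_1 should be close to sigma*.
empirical_second_moment = sum(x * x for x in draws) / len(draws)
```

With these parameter values `sigma_star` returns 3.0, and the Monte Carlo second moment of $\xi_1$ should be close to that value up to sampling error.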

The problem is to estimate the unknown function $S$ in the model (1) on the basis of observations $(y_t)_{0 \le t \le n}$.

Solving this problem also solves the following problem of functional statistics. Let observations $(x^k)_{1 \le k \le n}$ be a segment of a sequence of independent identically distributed random processes $x^k = (x_t^k)_{0 \le t \le 1}$ specified on the interval $[0,1]$, which obey the stochastic differential equations

$$dx_t^k = S(t)\,dt + d\xi_t^k, \quad x_0^k = x_0, \quad 0 \le t \le 1, \qquad (8)$$

where $(\xi^k)_{1 \le k \le n}$ is an i.i.d. sequence of random processes $(\xi_t^k)_{0 \le t \le 1}$ with the same distribution as the process (4). The problem is to estimate the unknown function $S(t) \in L_2[0,1]$ on the basis of observations $(x^k)_{1 \le k \le n}$. This model can be reduced to (1), (4) in the following way. Let $y = (y_t)_{0 \le t \le n}$ denote the process defined as

$$y_t = \begin{cases} x_t^1, & \text{if } 0 \le t \le 1; \\ y_{k-1} + x^k_{t-k+1} - x_0, & \text{if } k-1 \le t \le k,\ 2 \le k \le n. \end{cases}$$

This process satisfies the stochastic differential equation

$$dy_t = \tilde S(t)\,dt + d\tilde\xi_t,$$

where $\tilde S(t) = S(\{t\})$ and

$$\tilde\xi_t = \begin{cases} \xi_t^1, & \text{if } 0 \le t \le 1; \\ \tilde\xi_{k-1} + \xi^k_{t-k+1}, & \text{if } k-1 \le t \le k,\ 2 \le k \le n; \end{cases}$$

$\{t\} = t - [t]$ is the fractional part of the number $t$.
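The reduction above can be sketched for discretely sampled paths. This is a minimal illustration, assuming every path $x^k$ is recorded on the same grid of $[0,1]$ and starts at $x_0$; the helper name `concatenate_paths` is hypothetical.

```python
from typing import List, Sequence


def concatenate_paths(paths: Sequence[Sequence[float]], x0: float) -> List[float]:
    """Glue n paths sampled on a common grid of [0, 1] into one path on [0, n].

    On [k-1, k] the glued path equals y_{k-1} + x^k_{t-k+1} - x0.
    """
    y = list(paths[0])
    for path in paths[1:]:
        offset = y[-1] - x0  # y_{k-1} - x0
        # Skip the first grid point: it coincides with the previous endpoint.
        y.extend(offset + value for value in path[1:])
    return y


glued = concatenate_paths([[0.0, 1.0, 2.0], [0.0, 3.0, 5.0]], x0=0.0)
```

Here the second path is shifted by the endpoint of the first one, so the glued path is continuous at $t = 1$.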

In this paper we will consider the estimation problem for the regression model (1) in $L_2[0,1]$ with the quality of an estimate $\hat S$ being measured by the mean integrated squared error (MISE)

$$R(\hat S, S) := E_S \|\hat S - S\|^2, \qquad (9)$$

where $E_S$ stands for the expectation with respect to the distribution $P_S$ of the process (1) given $S$;

$$\|f\|^2 := \int_0^1 f^2(t)\,dt.$$

It is natural to treat this problem from the standpoint of the model selection approach. The origin of this method goes back to the early seventies with the pioneering papers by Akaike [1] and Mallows [16], who proposed to introduce penalization in a log-likelihood type criterion. Further progress was made by Barron, Birgé and Massart [2, 17], who developed a non-asymptotic model selection method which enabled one to derive non-asymptotic oracle inequalities for a Gaussian non-parametric regression model with i.i.d. disturbance. An oracle inequality yields an upper bound for the estimate risk via the minimal risk corresponding to a chosen family of estimates. Galtchouk and Pergamenshchikov [6] developed the Barron-Birgé-Massart technique, treating the problem of estimating a non-parametric drift function in a diffusion process from the standpoint of sequential analysis. Fourdrinier and Pergamenshchikov [5] extended the Barron-Birgé-Massart method to models with dependent observations. In contrast to all the above-mentioned papers on the model selection method, where the estimation procedures were based on the least squares estimates, they proposed to use an arbitrary family of projective estimates in an adaptive estimation procedure, and they discovered that one can employ the improved least squares estimates to increase the estimation quality. Konev and Pergamenshchikov [14] applied this improved model selection method to the non-parametric estimation problem of a periodic function in a continuous time model with a coloured noise having unknown spectral characteristics. In all cited papers non-asymptotic oracle inequalities have been derived which enable one to establish the optimal convergence rate for the minimax risks. Moreover, in the latter paper the oracle inequalities have been found for the robust risks.

In addition to the optimal convergence rate, an important problem is that of the efficiency of adaptive estimation procedures. In order to examine the efficiency property one has to obtain the oracle inequalities in which the principal term has the factor close to unity.

The first result in this direction is most likely due to Kneip [13] who obtained, for a Gaussian regression model, an oracle inequality with the factor close to unity at the principal term. Oracle inequalities of this type were obtained as well in [3] and in [4] for the inverse problems. It will be observed that the derivation of oracle inequalities in all these papers rests upon the fact that by applying the Fourier transformation one can reduce the initial model to the statistical Gaussian model with independent observations. Such a transform is possible only for Gaussian models with independent homogeneous observations or for inhomogeneous ones with known correlation characteristics. This restriction significantly narrows the area of application of such estimation procedures and rules out a broad class of models including, in particular, the heteroscedastic regression models widely used in econometrics (see, for example, [12]). For constructing adaptive procedures in the case of inhomogeneous observations one needs to amend the approach to the estimation problem. Galtchouk and Pergamenshchikov [7, 8] have developed a new estimation method intended for heteroscedastic regression models. The heart of this method is to combine the Barron-Birgé-Massart non-asymptotic penalization method [2] and the Pinsker weighted least squares method minimizing the asymptotic risk (see, for example, [18, 19]). Combining these approaches results in a significant improvement of the estimation quality (see the numerical example in [7]). As was shown in [8] and [9], the Galtchouk-Pergamenshchikov procedure is efficient with respect to the robust minimax risk, i.e. the minimax risk with the additional supremum operation over the whole family of admissible model distributions. In the sequel [10, 11], this approach has been applied to the problem of drift estimation in a diffusion process.

In this paper we apply this procedure to the estimation of a regression function $S$ in a semimartingale regression model (1). The rest of the paper is organized as follows. In Section 2 we construct the model selection procedure on the basis of weighted least squares estimates and state the main results in the form of oracle inequalities for the quadratic risks. Section 3 gives the proofs of all theorems. In the Appendix some technical results are established.

2. Model selection

This section gives the construction of a model selection procedure for estimating a function $S$ in (1) on the basis of weighted least squares estimates and states the main results.

For estimating the unknown function $S$ in model (1), we apply its Fourier expansion in the trigonometric basis $(\phi_j)_{j \ge 1}$ in $L_2[0,1]$ defined as

$$\phi_1 = 1, \quad \phi_j(x) = \sqrt{2}\, Tr_j(2\pi [j/2] x), \quad j \ge 2, \qquad (10)$$

where the function $Tr_j(x) = \cos(x)$ for even $j$ and $Tr_j(x) = \sin(x)$ for odd $j$; $[x]$ denotes the integer part of $x$. The corresponding Fourier coefficients

$$\theta_j = (S, \phi_j) = \int_0^1 S(t)\,\phi_j(t)\,dt \qquad (11)$$

can be estimated as

$$\hat\theta_{j,n} = \frac{1}{n} \int_0^n \phi_j(t)\,dy_t. \qquad (12)$$

In view of (1), we obtain

$$\hat\theta_{j,n} = \theta_j + \frac{1}{\sqrt{n}}\,\xi_{j,n}, \quad \xi_{j,n} = \frac{1}{\sqrt{n}}\, I_n(\phi_j), \qquad (13)$$

where $I_n$ is given in (2).
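As a rough numerical illustration of (10) and (12), the sketch below implements the trigonometric basis and a discretised version of the coefficient estimate. It assumes the observations are available as increments of $y$ on a uniform grid of $[0, n]$, and the names `phi`, `inner_product` and `theta_hat` are hypothetical; the Riemann-sum discretisation is a simplification, not the estimator analysed in the paper.

```python
import math
from typing import Sequence


def phi(j: int, x: float) -> float:
    """Trigonometric basis (10): phi_1 = 1, phi_j = sqrt(2) * Tr_j(2*pi*[j/2]*x)."""
    if j == 1:
        return 1.0
    angle = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(angle) if j % 2 == 0 else math.sin(angle))


def inner_product(i: int, j: int, grid_size: int = 1000) -> float:
    """Midpoint-rule approximation of the L2[0,1] inner product (phi_i, phi_j)."""
    h = 1.0 / grid_size
    return h * sum(phi(i, (k + 0.5) * h) * phi(j, (k + 0.5) * h)
                   for k in range(grid_size))


def theta_hat(j: int, increments: Sequence[float], n: int) -> float:
    """Discretised (12): (1/n) * sum_k phi_j(t_k) * (y_{t_{k+1}} - y_{t_k})."""
    h = n / len(increments)
    return sum(phi(j, k * h) * dy for k, dy in enumerate(increments)) / n


# Noise-free check with S = phi_2 on [0, 2]: the increments are S(t_k) * h.
increments = [phi(2, k * 0.001) * 0.001 for k in range(2000)]
coefficient_2 = theta_hat(2, increments, 2)
coefficient_3 = theta_hat(3, increments, 2)
```

In the noise-free check the estimate of $\theta_2$ should be close to one and the estimate of $\theta_3$ close to zero, which reflects the orthonormality of the basis.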

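For any sequence $x = (x_j)_{j \ge 1}$, we set

$$|x|^2 = \sum_{j=1}^{\infty} x_j^2 \quad\text{and}\quad \#(x) = \sum_{j=1}^{\infty} \mathbf{1}_{\{|x_j| > 0\}}. \qquad (14)$$

Now we impose the additional conditions on the noise $(\xi_t)_{t \ge 0}$.

$C_1$) There exists some positive constant $\sigma > 0$ such that the sequence $\varsigma_{j,n} = E\,\xi_{j,n}^2 - \sigma$, $j \ge 1$, for any $n \ge 1$, satisfies the following inequality

$$c_1^*(n) = \sup_{x \in \mathcal{H},\ \#(x) \le n} |B_{1,n}(x)| < \infty,$$

where $\mathcal{H} = [-1,1]^{\infty}$ and

$$B_{1,n}(x) = \sum_{j=1}^{\infty} x_j\, \varsigma_{j,n}. \qquad (15)$$

$C_2$) Assume that for all $n \ge 1$

$$c_2^*(n) = \sup_{|x| \le 1,\ \#(x) \le n} E\, B_{2,n}^2(x) < \infty,$$

where

$$B_{2,n}(x) = \sum_{j=1}^{\infty} x_j\, \tilde\xi_{j,n} \quad\text{with}\quad \tilde\xi_{j,n} = \xi_{j,n}^2 - E\,\xi_{j,n}^2. \qquad (16)$$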

As is stated in Theorem 2, Conditions $C_1$) and $C_2$) hold for the process (4). Further we introduce a class of weighted least squares estimates for $S(t)$ defined as

$$\hat S_\gamma = \sum_{j=1}^{\infty} \gamma(j)\, \hat\theta_{j,n}\, \phi_j, \qquad (17)$$

where $\gamma = (\gamma(j))_{j \ge 1}$ is a sequence of weight coefficients such that

$$0 \le \gamma(j) \le 1 \quad\text{and}\quad 0 < \#(\gamma) \le n. \qquad (18)$$

Let $\Gamma$ denote a finite set of weight sequences $\gamma = (\gamma(j))_{j \ge 1}$ with these properties, $\nu = \mathrm{card}(\Gamma)$ be its cardinal number and

$$\mu = \max_{\gamma \in \Gamma} \#(\gamma). \qquad (19)$$

The model selection procedure for the unknown function $S$ in (1) will be constructed on the basis of estimates $(\hat S_\gamma)_{\gamma \in \Gamma}$. The choice of a specific set of weight sequences $\Gamma$ will be discussed at the end of this section. In order to find a proper weight sequence $\gamma$ in the set $\Gamma$ one needs to specify a cost function. When choosing an appropriate cost function one can use the following argument. The empirical squared error

$$\mathrm{Err}_n(\gamma) = \|\hat S_\gamma - S\|^2$$

can be written as

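$$\mathrm{Err}_n(\gamma) = \sum_{j=1}^{\infty} \gamma^2(j)\, \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{\infty} \gamma(j)\, \hat\theta_{j,n}\, \theta_j + \sum_{j=1}^{\infty} \theta_j^2. \qquad (20)$$

Since the Fourier coefficients $(\theta_j)_{j \ge 1}$ are unknown, the weight coefficients $(\gamma(j))_{j \ge 1}$ cannot be determined by minimizing this quantity. To circumvent this difficulty one needs to replace the terms $\hat\theta_{j,n}\, \theta_j$ by some estimators $\tilde\theta_{j,n}$. We set

$$\tilde\theta_{j,n} = \hat\theta_{j,n}^2 - \frac{\hat\sigma_n}{n}, \qquad (21)$$

where $\hat\sigma_n$ is an estimator for the quantity $\sigma$ in condition $C_1$).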

For this change in the empirical squared error, one has to pay some penalty. Thus, one comes to the cost function of the form

$$J_n(\gamma) = \sum_{j=1}^{\infty} \gamma^2(j)\, \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{\infty} \gamma(j)\, \tilde\theta_{j,n} + \rho\, \hat P_n(\gamma), \qquad (22)$$

where $\rho$ is some positive constant, $\hat P_n(\gamma)$ is the penalty term defined as

$$\hat P_n(\gamma) = \frac{\hat\sigma_n\, |\gamma|^2}{n}. \qquad (23)$$

In the case when the value of $\sigma$ in $C_1$) is known, one can put $\hat\sigma_n = \sigma$ and

$$P_n(\gamma) = \frac{\sigma\, |\gamma|^2}{n}. \qquad (24)$$

Substituting the weight coefficients minimizing the cost function, that is

$$\hat\gamma = \mathrm{argmin}_{\gamma \in \Gamma}\, J_n(\gamma), \qquad (25)$$

in (17) leads to the model selection procedure

$$\hat S_* = \hat S_{\hat\gamma}. \qquad (26)$$

It will be noted that $\hat\gamma$ exists, since $\Gamma$ is a finite set. If the minimizing sequence $\hat\gamma$ in (25) is not unique, one can take any minimizer.
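The selection rule (21) to (26) can be sketched as follows. This is a minimal illustration over truncated (finite-length) weight sequences; the function names `cost` and `select_weights`, the toy coefficients and the family of projection weights are assumptions made for the example only.

```python
from typing import List, Sequence


def cost(gamma: Sequence[float], theta_hat: Sequence[float],
         sigma_hat: float, n: int, rho: float) -> float:
    """Cost function (22) with the penalty (23)."""
    fit = sum(g * g * th * th for g, th in zip(gamma, theta_hat))
    # theta_tilde_j = theta_hat_j^2 - sigma_hat / n, see (21)
    cross = sum(g * (th * th - sigma_hat / n) for g, th in zip(gamma, theta_hat))
    penalty = sigma_hat * sum(g * g for g in gamma) / n
    return fit - 2.0 * cross + rho * penalty


def select_weights(family: Sequence[Sequence[float]], theta_hat: Sequence[float],
                   sigma_hat: float, n: int, rho: float) -> Sequence[float]:
    """Return a minimiser of the cost over a finite family, as in (25)."""
    return min(family, key=lambda gamma: cost(gamma, theta_hat, sigma_hat, n, rho))


theta_hat: List[float] = [1.0, 0.5, 0.01]          # toy estimated coefficients
family = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]          # projection weights
best = select_weights(family, theta_hat, sigma_hat=1.0, n=100, rho=0.25)
```

In this toy example the third coefficient is smaller than the noise level $\hat\sigma_n / n$, so keeping it costs more than it gains and the procedure retains only the first two coefficients.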

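Theorem 1. Assume that the conditions $C_1$) and $C_2$) hold with $\sigma > 0$. Then for any $n \ge 1$ and $0 < \rho < 1/3$, the estimator (26) satisfies the oracle inequality

$$R(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\, \min_{\gamma \in \Gamma} R(\hat S_\gamma, S) + \frac{1}{n}\, B_n(\rho), \qquad (27)$$

where the risk $R(\cdot, S)$ is defined in (9),

$$B_n(\rho) = \Psi_n(\rho) + \frac{6\mu\, E_S |\hat\sigma_n - \sigma|}{1 - 3\rho}$$

and

$$\Psi_n(\rho) = \frac{2\sigma\sigma^*\nu + 4\sigma c_1^*(n) + 2\nu c_2^*(n)}{\sigma\rho(1 - 3\rho)}. \qquad (28)$$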

Now we check conditions $C_1$) and $C_2$) for the model (1) with the noise (4) to arrive at the following result.

Theorem 2. Suppose that the coefficients $\varrho_1$ and $\varrho_2$ in model (1), (4) are such that $\varrho_1^2 + \varrho_2^2 > 0$ and $E\, Y_j^4 < \infty$. Then the estimation procedure (26), for any $n \ge 1$ and $0 < \rho < 1/3$, satisfies the oracle inequality (27) with

$$\sigma = \sigma^* = \varrho_1^2 + \lambda\varrho_2^2, \quad c_1^*(n) = 0, \quad\text{and}\quad \sup_{n \ge 1} c_2^*(n) \le 4\sigma^*\left(\sigma^* + \varrho_2^2\, E\, Y_1^4\right).$$

The proofs of Theorems 1, 2 are given in Section 3.

Corollary 3. Let the conditions of Theorem 1 hold and the quantity $\sigma$ in $C_1$) be known. Then, for any $n \ge 1$ and $0 < \rho < 1/3$, the estimator (26) satisfies the oracle inequality

$$R(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\, \min_{\gamma \in \Gamma} R(\hat S_\gamma, S) + \frac{1}{n}\, \Psi_n(\rho),$$

where $\Psi_n(\rho)$ is given in (28).
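To get a feel for the constants in the oracle inequality, the short sketch below evaluates the leading factor $(1 + 3\rho - 2\rho^2)/(1 - 3\rho)$ and the additive term, taking $\Psi_n(\rho) = (2\sigma\sigma^*\nu + 4\sigma c_1^*(n) + 2\nu c_2^*(n)) / (\sigma\rho(1 - 3\rho))$ as read from (28). The function names and the numerical inputs are hypothetical and serve only as an illustration.

```python
def leading_factor(rho: float) -> float:
    """Multiplicative constant in the oracle inequality, valid for 0 < rho < 1/3."""
    if not 0.0 < rho < 1.0 / 3.0:
        raise ValueError("rho must lie in (0, 1/3)")
    return (1.0 + 3.0 * rho - 2.0 * rho ** 2) / (1.0 - 3.0 * rho)


def psi(rho: float, sigma: float, sigma_star: float, nu: int,
        c1: float, c2: float) -> float:
    """Additive term Psi_n(rho), with c1 = c_1^*(n) and c2 = c_2^*(n)."""
    numerator = 2.0 * sigma * sigma_star * nu + 4.0 * sigma * c1 + 2.0 * nu * c2
    return numerator / (sigma * rho * (1.0 - 3.0 * rho))


factors = [leading_factor(rho) for rho in (0.2, 0.1, 0.01)]
example_psi = psi(rho=0.25, sigma=1.0, sigma_star=1.0, nu=2, c1=0.0, c2=3.0)
```

The factor decreases towards one as $\rho \to 0$, while $\Psi_n(\rho)$ grows like $1/\rho$; this is the trade-off behind choosing $\rho$ small but not too small.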

2.1. Estimation of $\sigma$

Now we consider the case of unknown quantity $\sigma$ in the condition $C_1$). One can estimate $\sigma$ as

$$\hat\sigma_n = \sum_{j=l}^{n} \hat\theta_{j,n}^2 \quad\text{with}\quad l = [\sqrt{n}] + 1. \qquad (29)$$
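The estimator (29) sums the squared estimated coefficients over the high-frequency indices $j = [\sqrt{n}] + 1, \dots, n$, where the signal part $\theta_j^2$ is expected to be small. A minimal sketch, assuming the estimates are stored in a list whose entry `j - 1` holds $\hat\theta_{j,n}$ (the function name `estimate_sigma` is hypothetical):

```python
import math
from typing import Sequence


def estimate_sigma(theta_hat: Sequence[float]) -> float:
    """Estimator (29): sum of theta_hat_j^2 over j = l, ..., n with l = [sqrt(n)] + 1.

    Here n = len(theta_hat) and theta_hat[j - 1] is the j-th estimated coefficient.
    """
    n = len(theta_hat)
    l = math.isqrt(n) + 1
    return sum(th * th for th in theta_hat[l - 1:])


sigma_hat = estimate_sigma([1.0, 2.0, 3.0, 4.0])
```

For $n = 4$ one has $l = 3$, so only the third and fourth coefficients enter the sum.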

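Proposition 4. Suppose that the conditions of Theorem 1 hold and the unknown function $S(t)$ is continuously differentiable for $0 \le t < 1$ such that

$$|\dot S|_1 = \int_0^1 |\dot S(t)|\,dt < +\infty. \qquad (30)$$

Then, for any $n \ge 1$,

$$E_S |\hat\sigma_n - \sigma| \le \frac{K_n(S)}{\sqrt{n}}, \qquad (31)$$

where

$$K_n(S) = 4|\dot S|_1^2 + \sigma + \sqrt{c_2^*(n)} + \frac{4|\dot S|_1 \sqrt{\sigma^*}}{n^{1/4}} + \frac{c_1^*(n)}{\sqrt{n}}.$$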

The proof of Proposition 4 is given in Section 3. Theorem 1 and Proposition 4 imply the following result.

Theorem 5. Suppose that the conditions of Theorem 1 hold and $S$ satisfies the conditions of Proposition 4. Then, for any $n \ge 1$ and $0 < \rho < 1/3$, the estimate (26) satisfies the oracle inequality

$$R(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\, \min_{\gamma \in \Gamma} R(\hat S_\gamma, S) + \frac{1}{n}\, D_n(\rho), \qquad (32)$$

where

$$D_n(\rho) = \Psi_n(\rho) + \frac{2\rho(1 - \rho)\,\mu\, K_n(S)}{(1 - 3\rho)\sqrt{n}}.$$

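2.2. Specification of weights in the selection procedure (26)

Now we will specify the weight coefficients $(\gamma(j))_{j \ge 1}$ in a way proposed in [7] for a heteroscedastic discrete time regression model. Consider a numerical grid of the form

$$\mathcal{A}_n = \{1, \dots, k^*\} \times \{t_1, \dots, t_m\},$$

where $t_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We assume that both parameters $k^* \ge 1$ and $0 < \varepsilon \le 1$ are functions of $n$, i.e. $k^* = k^*(n)$ and $\varepsilon = \varepsilon(n)$, such that

$$\lim_{n\to\infty} k^*(n) = +\infty, \quad \lim_{n\to\infty} \frac{k^*(n)}{\ln n} = 0, \quad \lim_{n\to\infty} \varepsilon(n) = 0 \quad\text{and}\quad \lim_{n\to\infty} n^{\delta} \varepsilon(n) = +\infty \qquad (33)$$

for any $\delta > 0$. One can take, for example,

$$\varepsilon(n) = \frac{1}{\ln(n+1)} \quad\text{and}\quad k^*(n) = \sqrt{\ln(n+1)}$$

for $n \ge 1$.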

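For each $\alpha = (\beta, t) \in \mathcal{A}_n$, we introduce the weight sequence

$$\gamma_\alpha = (\gamma_\alpha(j))_{j \ge 1}$$

given as

$$\gamma_\alpha(j) = \mathbf{1}_{\{1 \le j \le j_0\}} + \left(1 - (j/\omega_\alpha)^{\beta}\right) \mathbf{1}_{\{j_0 < j \le \omega_\alpha\}}, \qquad (34)$$

where $j_0 = j_0(\alpha) = [\omega_\alpha / \ln n]$,

$$\omega_\alpha = (A_\beta\, t\, n)^{1/(2\beta+1)} \quad\text{and}\quad A_\beta = \frac{(\beta+1)(2\beta+1)}{\pi^{2\beta}\beta}.$$

We set

$$\Gamma = \{\gamma_\alpha,\ \alpha \in \mathcal{A}_n\}. \qquad (35)$$

It will be noted that in this case $\nu = k^* m$.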
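The weight family above can be sketched in a few lines, reading the weights as $\gamma_\alpha(j) = 1$ for $j \le j_0$ and $1 - (j/\omega_\alpha)^\beta$ for $j_0 < j \le \omega_\alpha$, with $\omega_\alpha = (A_\beta t n)^{1/(2\beta+1)}$. The function names `pinsker_weights` and `grid` are hypothetical, the code assumes $n \ge 2$ so that $\ln n > 0$, and taking the integer part of $k^*(n)$ is an assumption made so that the grid is well defined.

```python
import math
from typing import List, Tuple


def pinsker_weights(beta: int, t: float, n: int) -> List[float]:
    """Weights gamma_alpha(1), ..., gamma_alpha([omega]) for alpha = (beta, t)."""
    a_beta = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
    omega = (a_beta * t * n) ** (1.0 / (2 * beta + 1))
    j0 = int(omega / math.log(n))  # requires n >= 2
    weights = []
    for j in range(1, int(omega) + 1):
        weights.append(1.0 if j <= j0 else 1.0 - (j / omega) ** beta)
    return weights


def grid(n: int) -> List[Tuple[int, float]]:
    """Grid A_n for the example choice eps = 1/ln(n+1), k* = sqrt(ln(n+1))."""
    eps = 1.0 / math.log(n + 1)
    k_star = max(1, int(math.sqrt(math.log(n + 1))))  # integer part: an assumption
    m = int(1.0 / eps ** 2)
    return [(beta, i * eps) for beta in range(1, k_star + 1) for i in range(1, m + 1)]


weights = pinsker_weights(beta=1, t=1.0, n=1000)
family_size = len(grid(1000))  # nu = k* m
```

The weights equal one on the lowest frequencies, then decay to zero at $j \approx \omega_\alpha$; weights beyond $[\omega_\alpha]$ are zero and are not stored.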

Remark 1. It will be observed that the specific form of weights (34) was proposed by Pinsker [19] for the filtration problem with known smoothness of a regression function observed with an additive Gaussian white noise in continuous time. Nussbaum [18] used these weights for the Gaussian regression estimation problem in discrete time.

The minimal mean square risk, called the Pinsker constant, is provided by the weighted least squares estimate with the weights (34), where the index $\alpha$ depends on the smoothness order of the function $S$. In this case the smoothness order is unknown and, instead of one estimate, one has to use a whole family of estimates containing in particular the optimal one.

The problem is to study the properties of the whole class of estimates. Below we derive an oracle inequality for this class which yields the best mean square risk up to a multiplicative and an additive constant when the smoothness of the unknown function $S$ is not available. Moreover, it will be shown that the multiplicative constant tends to unity and the additive one vanishes as $n \to \infty$ with a rate higher than any minimax rate.

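In view of the assumptions (33), for any $\delta > 0$, one has

$$\lim_{n\to\infty} \frac{\nu}{n^{\delta}} = 0.$$

Moreover, by (34) for any $\alpha \in \mathcal{A}_n$

$$\sum_{j=1}^{\infty} \mathbf{1}_{\{\gamma_\alpha(j) > 0\}} \le \omega_\alpha.$$

Therefore, taking into account that $A_\beta \le A_1 < 1$ for $\beta \ge 1$, we get

$$\mu \le (n/\varepsilon)^{1/3}.$$

Therefore, for any $\delta > 0$,

$$\lim_{n\to\infty} \frac{\mu}{n^{1/3+\delta}} = 0.$$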

Applying this limiting relation to the analysis of the asymptotic behavior of the additive term $D_n(\rho)$ in (32) one comes to the following result.

Theorem 6. Suppose that the conditions of Theorem 1 hold and $S \in C^1[0,1]$. Then, for any $n \ge 1$ and $0 < \rho < 1/3$, the estimate (26) with the weight coefficients (35) satisfies the oracle inequality (32) with the additive term $D_n(\rho)$ obeying, for any $\delta > 0$, the following limiting relation

$$\lim_{n\to\infty} \frac{D_n(\rho)}{n^{\delta}} = 0.$$

3. Proofs

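3.1. Proof of Theorem 1

Substituting (22) in (20) yields for any $\gamma \in \Gamma$

$$\mathrm{Err}_n(\gamma) = J_n(\gamma) + 2\sum_{j=1}^{\infty} \gamma(j)\, \theta'_{j,n} + \|S\|^2 - \rho\, \hat P_n(\gamma), \qquad (36)$$

where

$$\theta'_{j,n} = \tilde\theta_{j,n} - \theta_j \hat\theta_{j,n} = \frac{1}{\sqrt{n}}\, \theta_j \xi_{j,n} + \frac{1}{n}\, \tilde\xi_{j,n} + \frac{1}{n}\, \varsigma_{j,n} + \frac{\sigma - \hat\sigma_n}{n}$$

and the sequences $(\varsigma_{j,n})_{j \ge 1}$ and $(\tilde\xi_{j,n})_{j \ge 1}$ are defined in conditions $C_1$) and $C_2$).

Denoting

$$L(\gamma) = \sum_{j=1}^{\infty} \gamma(j), \quad M(\gamma) = \frac{1}{\sqrt{n}} \sum_{j=1}^{\infty} \gamma(j)\, \theta_j\, \xi_{j,n}, \qquad (37)$$

and taking into account the definition of the "true" penalty term in (24), we rewrite (36) as

$$\mathrm{Err}_n(\gamma) = J_n(\gamma) + 2\,\frac{\sigma - \hat\sigma_n}{n}\, L(\gamma) + 2M(\gamma) + \frac{2}{n}\, B_{1,n}(\gamma) + 2\sqrt{P_n(\gamma)}\, \frac{B_{2,n}(e(\gamma))}{\sqrt{\sigma n}} + \|S\|^2 - \rho\, \hat P_n(\gamma), \qquad (38)$$

where $e(\gamma) = \gamma/|\gamma|$, the functions $B_{1,n}$ and $B_{2,n}$ are defined in (15) and (16).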

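Let $\gamma_0 = (\gamma_0(j))_{j \ge 1}$ be a fixed sequence in $\Gamma$ and $\hat\gamma$ be as in (25). Substituting $\gamma_0$ and $\hat\gamma$ in the equation (38), we consider the difference

$$\mathrm{Err}_n(\hat\gamma) - \mathrm{Err}_n(\gamma_0) = J_n(\hat\gamma) - J_n(\gamma_0) + 2\,\frac{\sigma - \hat\sigma_n}{n}\, L(\hat x) + \frac{2}{n}\, B_{1,n}(\hat x) + 2M(\hat x)$$
$$\qquad + 2\sqrt{P_n(\hat\gamma)}\, \frac{B_{2,n}(\hat e)}{\sqrt{\sigma n}} - 2\sqrt{P_n(\gamma_0)}\, \frac{B_{2,n}(e_0)}{\sqrt{\sigma n}} - \rho\, \hat P_n(\hat\gamma) + \rho\, \hat P_n(\gamma_0),$$

where $\hat x = \hat\gamma - \gamma_0$, $\hat e = e(\hat\gamma)$ and $e_0 = e(\gamma_0)$. Note that by (19)

$$|L(\hat x)| \le |L(\hat\gamma)| + |L(\gamma_0)| \le 2\mu.$$

Therefore, by making use of the condition $C_1$) and taking into account that the cost function $J_n$ attains its minimum at $\hat\gamma$, one comes to the inequality

$$\mathrm{Err}_n(\hat\gamma) - \mathrm{Err}_n(\gamma_0) \le \frac{4\mu\, |\hat\sigma_n - \sigma|}{n} + \frac{2c_1^*(n)}{n} + 2M(\hat x)$$
$$\qquad + 2\sqrt{P_n(\hat\gamma)}\, \frac{B_{2,n}(\hat e)}{\sqrt{\sigma n}} - \rho\, \hat P_n(\hat\gamma) + \rho\, \hat P_n(\gamma_0) - 2\sqrt{P_n(\gamma_0)}\, \frac{B_{2,n}(e_0)}{\sqrt{\sigma n}}. \qquad (39)$$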

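Applying the elementary inequality

$$2|ab| \le \varepsilon a^2 + \varepsilon^{-1} b^2 \qquad (40)$$

with $\varepsilon = \rho$ implies the estimate

$$2\sqrt{P_n(\gamma)}\, \frac{|B_{2,n}(e(\gamma))|}{\sqrt{\sigma n}} \le \rho\, P_n(\gamma) + \frac{B_{2,n}^2(e(\gamma))}{n\sigma\rho}.$$

We recall that $0 < \rho < 1$. Therefore, from here and (39), it follows that

$$\mathrm{Err}_n(\hat\gamma) \le \mathrm{Err}_n(\gamma_0) + 2M(\hat x) + \frac{2B_{2,n}^*}{n\sigma\rho} + \frac{2c_1^*(n)}{n} + \frac{1}{n}\, |\hat\sigma_n - \sigma| \left(|\hat\gamma|^2 + |\gamma_0|^2 + 4\mu\right) + 2\rho\, P_n(\gamma_0),$$

where $B_{2,n}^* = \sup_{\gamma \in \Gamma} B_{2,n}^2(e(\gamma))$. In view of (19), one has

$$\sup_{\gamma \in \Gamma} |\gamma|^2 \le \mu.$$

Thus, one gets

$$\mathrm{Err}_n(\hat\gamma) \le \mathrm{Err}_n(\gamma_0) + 2M(\hat x) + \frac{2B_{2,n}^*}{n\sigma\rho} + \frac{2c_1^*(n)}{n} + \frac{6\mu}{n}\, |\hat\sigma_n - \sigma| + 2\rho\, P_n(\gamma_0). \qquad (41)$$

In view of Condition $C_2$), one has

$$E_S\, B_{2,n}^* \le \sum_{\gamma \in \Gamma} E_S\, B_{2,n}^2(e(\gamma)) \le \nu\, c_2^*(n), \qquad (42)$$

where $\nu = \mathrm{card}(\Gamma)$.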

Now we examine the first term in the right-hand side of (39). Substituting (13) in (37) and taking into account (3), one obtains that for any non-random sequence $x = (x(j))_{j \ge 1}$ with $\#(x) < \infty$

$$E_S\, M^2(x) \le \sigma^*\, \frac{1}{n} \sum_{j=1}^{\infty} x^2(j)\, \theta_j^2 = \sigma^*\, \frac{1}{n}\, \|S_x\|^2, \qquad (43)$$

where $S_x = \sum_{j=1}^{\infty} x(j)\, \theta_j\, \phi_j$. Let us denote

$$Z^* = \sup_{x \in \Gamma_1} \frac{n M^2(x)}{\|S_x\|^2},$$

where $\Gamma_1 = \Gamma - \gamma_0$. In view of (43), this quantity can be estimated as

$$E_S\, Z^* \le \sum_{x \in \Gamma_1} \frac{n\, E_S\, M^2(x)}{\|S_x\|^2} \le \sum_{x \in \Gamma_1} \sigma^* = \sigma^* \nu. \qquad (44)$$

Further, by making use of the inequality (40) with $\varepsilon = \rho\, \|S_x\|$, one gets

$$2|M(x)| \le \rho\, \|S_x\|^2 + \frac{Z^*}{n\rho}. \qquad (45)$$

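Note that, for any $x \in \Gamma_1$,

$$\|S_x\|^2 - \|\hat S_x\|^2 = \sum_{j=1}^{\infty} x^2(j) \left(\theta_j^2 - \hat\theta_{j,n}^2\right) \le -2M_1(x), \qquad (46)$$

where $\hat S_x = \sum_{j=1}^{\infty} x(j)\, \hat\theta_{j,n}\, \phi_j$ and

$$M_1(x) = \frac{1}{\sqrt{n}} \sum_{j=1}^{\infty} x^2(j)\, \theta_j\, \xi_{j,n}.$$

Since $|x(j)| \le 1$ for any $x \in \Gamma_1$, one gets

$$E_S\, M_1^2(x) \le \sigma^*\, \frac{\|S_x\|^2}{n}.$$

Denoting

$$Z_1^* = \sup_{x \in \Gamma_1} \frac{n M_1^2(x)}{\|S_x\|^2},$$

one has

$$E_S\, Z_1^* \le \sigma^* \nu.$$

By the same argument as in (45), one derives

$$2|M_1(x)| \le \rho\, \|S_x\|^2 + \frac{Z_1^*}{n\rho}. \qquad (47)$$

From here and (46), one finds the upper bound for $\|S_x\|$, i.e.

$$\|S_x\|^2 \le \frac{\|\hat S_x\|^2}{1 - \rho} + \frac{Z_1^*}{n\rho(1 - \rho)}.$$

Using this bound in (45) gives

$$2M(x) \le \frac{\rho\, \|\hat S_x\|^2}{1 - \rho} + \frac{Z^* + Z_1^*}{n\rho(1 - \rho)}.$$

Setting $x = \hat x$ in this inequality and taking into account that

$$\|\hat S_{\hat x}\|^2 = \|\hat S_{\hat\gamma} - \hat S_{\gamma_0}\|^2 \le 2\left(\mathrm{Err}_n(\hat\gamma) + \mathrm{Err}_n(\gamma_0)\right),$$

we obtain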

$$2M(\hat x) \le \frac{2\rho \left(\mathrm{Err}_n(\hat\gamma) + \mathrm{Err}_n(\gamma_0)\right)}{1 - \rho} + \frac{Z^* + Z_1^*}{n\rho(1 - \rho)}.$$

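From here and (41), it follows that

$$\mathrm{Err}_n(\hat\gamma) \le \frac{1 + \rho}{1 - 3\rho}\, \mathrm{Err}_n(\gamma_0) + \frac{2(1 - \rho)}{n(1 - 3\rho)} \left[\frac{B_{2,n}^*}{\sigma\rho} + c_1^*(n) + 3\mu\, |\hat\sigma_n - \sigma|\right] + \frac{Z^* + Z_1^*}{n\rho(1 - 3\rho)} + \frac{2\rho(1 - \rho)}{1 - 3\rho}\, P_n(\gamma_0). \qquad (48)$$

Taking the expectation yields

$$R(\hat S_*, S) \le \frac{1 + \rho}{1 - 3\rho}\, R(\hat S_{\gamma_0}, S) + \frac{2(1 - \rho)}{n(1 - 3\rho)} \left[\frac{\nu\, c_2^*(n)}{\sigma\rho} + c_1^*(n) + 3\mu\, E_S |\hat\sigma_n - \sigma|\right] + \frac{2\sigma^* \nu}{n\rho(1 - 3\rho)} + \frac{2\rho(1 - \rho)}{1 - 3\rho}\, P_n(\gamma_0).$$

Using the upper bound for $P_n(\gamma_0)$ in Lemma A.1, one obtains

$$R(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\, R(\hat S_{\gamma_0}, S) + \frac{1}{n}\, B_n(\rho),$$

where $B_n(\rho)$ is defined in (27).

Since this inequality holds for each $\gamma_0 \in \Gamma$, this completes the proof of Theorem 1.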

3.2. Proof of Theorem 2

We have to verify Conditions $C_1$) and $C_2$) for the process (4).

Condition $C_1$) holds with $c_1^*(n) = 0$. This follows from Lemma A.2 if one puts $f = g = \phi_j$, $j \ge 1$. Now we check Condition $C_2$). By the Ito formula and Lemma A.2, one gets

$$dI_t^2(f) = 2I_{t-}(f)\, dI_t(f) + \varrho_1^2 f^2(t)\,dt + \varrho_2^2\, d\sum_{0 \le s \le t} f^2(s) (\Delta z_s)^2$$

and

$$E\, I_t^2(f) = \sigma^* \int_0^t f^2(s)\,ds.$$

Therefore, putting

$$\tilde I_t(f) = I_t^2(f) - E\, I_t^2(f),$$

we obtain

$$d\tilde I_t(f) = \varrho_2^2 f^2(t)\, dm_t + 2I_{t-}(f) f(t)\, d\xi_t, \quad \tilde I_0(f) = 0 \quad\text{and}\quad m_t = \sum_{0 \le s \le t} (\Delta z_s)^2 - \lambda t. \qquad (49)$$

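Now we set

$$\bar I_t(x) = \sum_{j=1}^{\infty} x_j\, \tilde I_t(\phi_j),$$

where $x = (x_j)_{j \ge 1}$ with $\#(x) \le n$ and $|x| \le 1$. This process obeys the equation

$$d\bar I_t(x) = \varrho_2^2\, \Phi_t(x)\, dm_t + 2\zeta_{t-}(x)\, d\xi_t, \quad \bar I_0(x) = 0,$$

where

$$\Phi_t(x) = \sum_{j \ge 1} x_j\, \phi_j^2(t) \quad\text{and}\quad \zeta_t(x) = \sum_{j \ge 1} x_j\, I_t(\phi_j)\, \phi_j(t).$$

Now we show that

$$E \int_0^n \bar I_{t-}(x)\, d\bar I_t(x) = 0. \qquad (50)$$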

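Indeed, note that

$$\int_0^n \bar I_{t-}(x)\, d\bar I_t(x) = \varrho_2^2 \sum_{j \ge 1} x_j \int_0^n \tilde I_{t-}(\phi_j)\, \Phi_t(x)\, dm_t + 2 \sum_{j \ge 1} x_j \int_0^n \tilde I_{t-}(\phi_j)\, \zeta_{t-}(x)\, d\xi_t.$$

Therefore, Lemma A.4 directly implies

$$E \int_0^n \tilde I_{t-}(\phi_j)\, \Phi_t(x)\, dm_t = \sum_{l \ge 1} x_l\, E \int_0^n I_{t-}^2(\phi_j)\, \phi_l^2(t)\, dm_t - \sum_{l \ge 1} x_l\, E \int_0^n \left(E\, I_t^2(\phi_j)\right) \phi_l^2(t)\, dm_t = 0.$$

Moreover, we note that

$$\int_0^n \tilde I_{t-}(\phi_j)\, \zeta_{t-}(x)\, d\xi_t = \sum_{l \ge 1} x_l \int_0^n \tilde I_{t-}(\phi_j)\, I_{t-}(\phi_l)\, \phi_l(t)\, d\xi_t$$

and

$$\int_0^n \tilde I_{t-}(\phi_j)\, I_{t-}(\phi_l)\, \phi_l(t)\, d\xi_t = \int_0^n I_{t-}^2(\phi_j)\, I_{t-}(\phi_l)\, \phi_l(t)\, d\xi_t - \int_0^n \left(E\, I_t^2(\phi_j)\right) I_{t-}(\phi_l)\, \phi_l(t)\, d\xi_t.$$

From Lemma A.5, it follows that

$$E \int_0^n I_{t-}^2(\phi_j)\, I_{t-}(\phi_l)\, \phi_l(t)\, d\xi_t = 0$$

and we come to (50). Furthermore, by the Ito formula one obtains

$$\bar I_n^2(x) = 2\int_0^n \bar I_{t-}(x)\, d\bar I_t(x) + 4\varrho_1^2 \int_0^n \zeta_t^2(x)\,dt + \varrho_2^4 \sum_{k=1}^{\infty} \Phi_{T_k}^2(x)\, Y_k^4\, \mathbf{1}_{\{T_k \le n\}}$$
$$\qquad + 4\varrho_2^2 \sum_{k=1}^{\infty} \zeta_{T_k-}^2(x)\, Y_k^2\, \mathbf{1}_{\{T_k \le n\}} + 4\varrho_2^3 \sum_{k=1}^{\infty} \Phi_{T_k}(x)\, \zeta_{T_k-}(x)\, Y_k^3\, \mathbf{1}_{\{T_k \le n\}}.$$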

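By Lemma A.3 one has $E(\zeta_{T_k-}(x)\, |\, T_k) = 0$. Therefore, taking into account (50), we calculate

$$E\, \bar I_n^2(x) = 4\varrho_1^2\, E \int_0^n \zeta_t^2(x)\,dt + \varrho_2^4\, E\, Y_1^4\, D_{1,n}(x) + 4\varrho_2^2\, D_{2,n}(x), \qquad (51)$$

where

$$D_{1,n}(x) = \sum_{k=1}^{\infty} E\, \Phi_{T_k}^2(x)\, \mathbf{1}_{\{T_k \le n\}} \quad\text{and}\quad D_{2,n}(x) = \sum_{k=1}^{\infty} E\, \zeta_{T_k-}^2(x)\, \mathbf{1}_{\{T_k \le n\}}.$$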

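By applying Lemma A.2, one has

$$E \int_0^n \zeta_t^2(x)\,dt = \sum_{i,j} x_i x_j \int_0^n \phi_i(t)\, \phi_j(t)\, E\, I_t(\phi_i)\, I_t(\phi_j)\,dt$$
$$\qquad = \sigma^* \sum_{i,j} x_i x_j \int_0^n \phi_i(t)\, \phi_j(t) \left(\int_0^t \phi_i(s)\, \phi_j(s)\,ds\right) dt$$
$$\qquad = \frac{\sigma^*}{2} \sum_{i,j} x_i x_j \left(\int_0^n \phi_i(t)\, \phi_j(t)\,dt\right)^2 \le \frac{\sigma^*}{2}\, n^2. \qquad (52)$$

Further it is easy to check that

$$D_{1,n} = \lambda \int_0^n \Phi_t^2(x)\,dt = \lambda \int_0^n \Big(\sum_{j \ge 1} x_j\, \phi_j^2(t)\Big)^2 dt.$$

Therefore, taking into account that $\#(x) \le n$ and $|x| \le 1$, we estimate $D_{1,n}$ by applying the Cauchy-Schwarz-Bunyakovskii inequality

$$D_{1,n} \le 4\lambda n \Big(\sum_{j \ge 1} |x_j|\Big)^2 \le 4\lambda n\, \#(x) \le 4\lambda n^2. \qquad (53)$$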

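Finally, we write down the process $\zeta_t(x)$ as

$$\zeta_t(x) = \int_0^t Q_x(t,s)\, d\xi_s \quad\text{with}\quad Q_x(t,s) = \sum_{j \ge 1} x_j\, \phi_j(s)\, \phi_j(t).$$

By putting

$$\tilde D_{2,n} = E \sum_{k=2}^{\infty} \mathbf{1}_{\{T_k \le n\}} \sum_{l=1}^{k-1} Q_x^2(T_k, T_l)$$

and applying Lemma A.3 we obtain

$$D_{2,n} = \varrho_1^2 \sum_{k=1}^{\infty} E \int_0^{T_k} Q_x^2(T_k, s)\,ds\, \mathbf{1}_{\{T_k \le n\}} + \varrho_2^2\, \tilde D_{2,n} = \lambda \varrho_1^2 \int_0^n \int_0^t Q_x^2(t,s)\,ds\,dt + \varrho_2^2\, \tilde D_{2,n}.$$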

Moreover, one can rewrite the second term in the last equality as

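$$\tilde D_{2,n} = \sum_{l=1}^{\infty} E\, \mathbf{1}_{\{T_l \le n\}} \sum_{k=l+1}^{\infty} Q_x^2(T_k, T_l)\, \mathbf{1}_{\{T_k \le n\}} = \lambda^2 \int_0^n \left(\int_0^{n-s} Q_x^2(s+z, s)\,dz\right) ds = \lambda^2 \int_0^n \left(\int_0^t Q_x^2(t,s)\,ds\right) dt.$$

Thus,

$$D_{2,n} = (\lambda \varrho_1^2 + \lambda^2 \varrho_2^2) \int_0^n \left(\int_0^t Q_x^2(t,s)\,ds\right) dt \le (\lambda \varrho_1^2 + \lambda^2 \varrho_2^2)\, n^2 = \lambda \sigma^* n^2. \qquad (54)$$

The equation (51) and the inequalities (52) - (54) imply the validity of condition $C_2$) for the process (4). Hence Theorem 2.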

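3.3. Proof of Proposition 4

Substituting (13) in (29) yields

$$\hat\sigma_n = \sum_{j=l}^{n} \theta_j^2 + \frac{2}{\sqrt{n}} \sum_{j=l}^{n} \theta_j\, \xi_{j,n} + \frac{1}{n} \sum_{j=l}^{n} \xi_{j,n}^2. \qquad (55)$$

Further, denoting

$$x'_j = \mathbf{1}_{\{l \le j \le n\}} \quad\text{and}\quad x''_j = \frac{1}{\sqrt{n}}\, \mathbf{1}_{\{l \le j \le n\}},$$

we represent the last term in (55) as

$$\frac{1}{n} \sum_{j=l}^{n} \xi_{j,n}^2 = \frac{1}{n}\, B_{1,n}(x') + \frac{1}{\sqrt{n}}\, B_{2,n}(x'') + \frac{n - l + 1}{n}\, \sigma,$$

where the functions $B_{1,n}(\cdot)$ and $B_{2,n}(\cdot)$ are defined in conditions $C_1$) and $C_2$). Combining these equations leads to the inequality

$$E_S |\hat\sigma_n - \sigma| \le \sum_{j \ge l} \theta_j^2 + \frac{2}{\sqrt{n}}\, E_S \Big|\sum_{j=l}^{n} \theta_j\, \xi_{j,n}\Big| + \frac{1}{n}\, |B_{1,n}(x')| + \frac{1}{\sqrt{n}}\, E\, |B_{2,n}(x'')| + \frac{l - 1}{n}\, \sigma.$$

By Lemma A.6 and conditions $C_1$), $C_2$), one gets

$$E_S |\hat\sigma_n - \sigma| \le \frac{4|\dot S|_1^2}{\sqrt{n}} + \frac{2}{\sqrt{n}}\, E_S \Big|\sum_{j=l}^{n} \theta_j\, \xi_{j,n}\Big| + \frac{c_1^*(n)}{n} + \frac{\sqrt{c_2^*(n)}}{\sqrt{n}} + \frac{\sigma}{\sqrt{n}}.$$

In view of the inequality (3), the expectation on the right-hand side can be estimated as

$$E_S \Big|\sum_{j=l}^{n} \theta_j\, \xi_{j,n}\Big| \le \sqrt{\sigma^* \sum_{j=l}^{n} \theta_j^2} \le \frac{2\sqrt{\sigma^*}\, |\dot S|_1}{n^{1/4}}.$$

Hence Proposition 4.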

4. Appendix

A.1. Property of the penalty term (24)

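Lemma A.1. Assume that the condition $C_1$) holds with $\sigma > 0$. Then for any $n \ge 1$ and $\gamma \in \Gamma$,

$$P_n(\gamma) \le E_S\, \mathrm{Err}_n(\gamma) + \frac{c_1^*(n)}{n}.$$

Proof. By the definition of $\mathrm{Err}_n(\gamma)$ one has

$$\mathrm{Err}_n(\gamma) = \sum_{j=1}^{\infty} \left((\gamma(j) - 1)\, \theta_j + \gamma(j)\, \frac{1}{\sqrt{n}}\, \xi_{j,n}\right)^2.$$

In view of the condition $C_1$) this leads to the desired result

$$E_S\, \mathrm{Err}_n(\gamma) \ge \frac{1}{n} \sum_{j=1}^{\infty} \gamma^2(j)\, E\, \xi_{j,n}^2 \ge P_n(\gamma) - \frac{c_1^*(n)}{n}.$$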

A.2. Properties of the process (4)

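Lemma A.2. Let $f$ and $g$ be any non-random functions from $L_2[0,n]$ and $(I_t(f))_{t \ge 0}$ be the process (2) with the noise defined by (4). Then, for any $0 \le t \le n$,

$$E\, I_t(f)\, I_t(g) = \sigma^* \int_0^t f(s)\, g(s)\,ds,$$

where $\sigma^* = \varrho_1^2 + \lambda \varrho_2^2$.

This Lemma is a direct consequence of Ito's formula as well as the following result.

Lemma A.3. Let $Q$ be a bounded $[0,\infty) \to \mathbb{R}$ function measurable with respect to $\mathcal{B}[0,+\infty) \otimes \mathcal{G}_k$, where

$$\mathcal{G}_k = \sigma\{T_1, \dots, T_k\} \quad\text{with some } k \ge 2. \qquad (A.1)$$

Then

$$E\left(I_{T_k-}(Q)\, |\, \mathcal{G}_k\right) = 0$$

and

$$E\left(I_{T_k-}^2(Q)\, |\, \mathcal{G}_k\right) = \varrho_1^2 \int_0^{T_k} Q^2(s)\,ds + \varrho_2^2 \sum_{l=1}^{k-1} Q^2(T_l).$$

Now we will study stochastic cadlag processes $\eta = (\eta_t)_{0 \le t \le n}$ of the form

$$\eta_t = \sum_{l=0}^{\infty} u_l(t)\, \mathbf{1}_{\{T_l \le t < T_{l+1}\}}, \qquad (A.2)$$

where $u_0(t)$ is a function measurable with respect to $\sigma\{w_s, s \le t\}$ and the coefficient $u_l(t)$, $l \ge 1$, is a function measurable with respect to

$$\sigma\{w_s, s \le t,\ Y_1, \dots, Y_l,\ T_1, \dots, T_l\}.$$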

Now we show the following result.

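Lemma A.4. Let $\eta = (\eta_t)_{0 \le t \le n}$ be a stochastic non-negative process given by (A.2), such that

$$E \int_0^n \eta_u\,du < \infty.$$

Then

$$E \int_0^n \eta_{u-}\, dm_u = 0,$$

where the process $m = (m_t)$ is defined in (49).

Proof. Note that the stochastic integral, with respect to the martingale (49), can be written as

$$\int_0^n \eta_{u-}\, dm_u = \sum_{0 \le u \le n} \eta_{u-} (\Delta z_u)^2 - \lambda \int_0^n \eta_u\,du = \sum_{k=1}^{\infty} \eta_{T_k-}\, Y_k^2\, \mathbf{1}_{\{T_k \le n\}} - \lambda \int_0^n \eta_u\,du.$$

Therefore, taking into account the representation (A.2), we obtain

$$\int_0^n \eta_{u-}\, dm_u = \Upsilon_1 - \lambda \Upsilon_2, \qquad (A.3)$$

where

$$\Upsilon_1 = \sum_{k=1}^{\infty} u_{k-1}(T_k-)\, Y_k^2\, \mathbf{1}_{\{T_k \le n\}} \quad\text{and}\quad \Upsilon_2 = \int_0^n \eta_u\,du.$$

Recalling that $E\, Y_j^2 = 1$ and $u_k \ge 0$, we calculate

$$E\, \Upsilon_1 = \sum_{k=1}^{\infty} E\, u_{k-1}(T_k-)\, \mathbf{1}_{\{T_k \le n\}}.$$

Moreover, the functions $(u_k)$ are cadlag processes, therefore the Lebesgue measure of the set $\{t \in \mathbb{R}_+ : u_k(t-) \ne u_k(t)\}$ equals zero. Thus,

$$E\, u_{k-1}(T_k-)\, \mathbf{1}_{\{T_k \le n\}} = \lambda\, E\, \mathbf{1}_{\{T_{k-1} \le n\}} \int_0^{n - T_{k-1}} u_{k-1}(T_{k-1} + u)\, e^{-\lambda u}\,du.$$

This implies

$$E\, \Upsilon_1 = \lambda \sum_{l=0}^{\infty} E\, \mathbf{1}_{\{T_l \le n\}} \int_0^{n - T_l} u_l(T_l + u)\, e^{-\lambda u}\,du. \qquad (A.4)$$

Similarly we obtain

$$E\, \Upsilon_2 = \sum_{l=0}^{\infty} E\, \mathbf{1}_{\{T_l \le n\}} \int_{T_l}^{n} u_l(t)\, \mathbf{1}_{\{t < T_{l+1}\}}\,dt = \sum_{l=0}^{\infty} E\, \mathbf{1}_{\{T_l \le n\}} \int_0^{n - T_l} u_l(T_l + u)\, e^{-\lambda u}\,du. \qquad (A.5)$$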

Substituting (A.4) and (A.5) in (A.3) implies the assertion of Lemma A.4.

Lemma A.5. Assume that $E\, Y_1^4 < \infty$. Then, for any measurable bounded non-random functions $f$ and $g$, one has

$$E \int_0^n I_{t-}^2(f)\, I_{t-}(g)\, g(t)\, d\xi_t = 0.$$

Proof. First we note that

$$E \int_0^n I_{t-}^2(f)\, I_{t-}(g)\, g(t)\, dz_t = \sum_{j \ge 1} E\left(I_{T_j-}^2(f)\, I_{T_j-}(g)\, g(T_j)\, \mathbf{1}_{\{T_j \le n\}}\right) E\, Y_j = 0.$$

Therefore, to prove this lemma one has to show that

$$E \int_0^n I_t^2(f)\, I_t(g)\, g(t)\, dw_t = 0. \qquad (A.6)$$

To this end we represent the stochastic integral $I_t(f)$ as

$$I_t(f) = \varrho_1\, I_t^w(f) + \varrho_2\, I_t^z(f),$$

where $I_t^w(f) = \int_0^t f_s\, dw_s$ and $I_t^z(f) = \int_0^t f_s\, dz_s$.

Note that

$$E\, |I_t^z(f)|^4 \le M^4\, E\, Y_1^4\, E\, N_n^2 = M^4\, E\, Y_1^4\, (\lambda n + \lambda^2 n^2) < \infty,$$

where $M = \sup_{0 \le t \le n} (|f(t)| + |g(t)|)$.

Therefore, taking into account that the processes $(w_t)$ and $(z_t)$ are independent, we get

$$E \int_0^n (I_t^z(f))^4\, (I_t^w(g))^2\, g^2(t)\,dt < \infty,$$

i.e.

$$E \int_0^n (I_t^z(f))^2\, I_t^w(g)\, g(t)\, dw_t = 0.$$

Similarly, we obtain

$$E \int_0^n (I_t^w(f))^2\, I_t^z(g)\, g(t)\, dw_t = 0 \quad\text{and}\quad E \int_0^n I_t^w(f)\, I_t^z(f)\, I_t^z(g)\, g(t)\, dw_t = 0.$$

Therefore, to show (A.6) one has to check that

$$E \int_0^n \eta_t\, dw_t = 0, \qquad (A.7)$$

where $\eta_t = (I_t^z(f))^2\, I_t^z(g)\, g(t)$.

Taking into account that the processes $(\eta_t)$ and $(w_t)$ are independent, we get

$$E \left|\int_0^n \eta_t\, dw_t\right| \le E \sqrt{\int_0^n \eta_t^2\,dt} \le \sqrt{n}\, E \sup_{0 \le t \le n} |\eta_t|.$$

Here, the last term can be estimated as

$$E \sup_{0 \le t \le n} |\eta_t| \le M^4\, E \Big(\sum_{j=1}^{N_n} |Y_j|\Big)^3 \le M^4\, E\, |Y_1|^3\, E\, N_n^3 < \infty.$$

Hence the stochastic integral $\int_0^n \eta_t\, dw_t$ is an integrable random variable and

$$E \int_0^n \eta_t\, dw_t = E\, E\left(\int_0^n \eta_t\, dw_t\ \Big|\ \eta_t,\ 0 \le t \le n\right) = 0.$$

Thus we obtain the equality (A.7) which implies (A.6). Hence Lemma A.5.

A.3. Property of the Fourier coefficients

Lemma A.6. Suppose that the function $S$ in (1) is differentiable and satisfies the condition (30). Then the Fourier coefficients (11) satisfy the inequality

$$\sup_{l \ge 2}\ l \sum_{j=l}^{\infty} \theta_j^2 \le 4|\dot S|_1^2.$$

Proof. In view of (10), one has

$$\theta_{2p} = -\frac{1}{\sqrt{2}\,\pi p} \int_0^1 \dot S(t) \sin(2\pi p t)\,dt$$

and

$$\theta_{2p+1} = \frac{1}{\sqrt{2}\,\pi p} \int_0^1 \dot S(t) \left(\cos(2\pi p t) - 1\right) dt = -\frac{\sqrt{2}}{\pi p} \int_0^1 \dot S(t) \sin^2(\pi p t)\,dt, \quad p \ge 1.$$

From here, it follows that, for any $j \ge 2$,

$$\theta_j^2 \le \frac{2|\dot S|_1^2}{j^2}.$$

Taking into account that

$$\sup_{l \ge 2}\ l \sum_{j \ge l} \frac{1}{j^2} \le 2,$$

we arrive at the desired result.

Acknowledgments

This research has been executed in the framework of the State Contract № 02.740.11.5026.

REFERENCES

1. Akaike H. A new look at the statistical model identification // IEEE Trans. on Automatic Control. 1974. P. 716 - 723.

2. Barron A., Birgé L., and Massart P. Risk bounds for model selection via penalization // Probab. Theory Relat. Fields. 1999. P. 301 - 415.

3. Cao Y. and Golubev Y. On oracle inequalities related to a polynomial fitting // Math. Meth. Stat. 2005. No. 4. P. 431 - 450.

4. Cavalier L., Golubev G.K., Picard D. and Tsybakov A. Oracle inequalities for inverse problems // Ann. Statist. 2002. P. 843 - 874.

5. Fourdrinier D. and Pergamenshchikov S. Improved selection model method for the regression with dependent noise // Ann. Institute Statist. Math. 2007. No. 3. P. 435 - 464.

6. Galtchouk L. and Pergamenshchikov S. Non-parametric sequential estimation of the drift in diffusion processes // Math. Meth. Stat. 2004. No. 1. P. 25 - 49.

7. Galtchouk L. and Pergamenshchikov S. Sharp non-asymptotic oracle inequalities for non-parametric heteroscedastic regression models // J. Non-parametric Stat. 2009. V. 21. No. 1. P. 1 - 16.

8. Galtchouk L. and Pergamenshchikov S. Adaptive asymptotically efficient estimation in heteroscedastic non-parametric regression // J. Korean Statist. Soc. 2009. URL: http://ees.elsivier.com/jkss

9. Galtchouk L. and Pergamenshchikov S. Adaptive asymptotically efficient estimation in heteroscedastic non-parametric regression via model selection. 2009. URL: http://hal.archives-ouvertes.fr/hal-00326910/fr/

10. Galtchouk L. and Pergamenshchikov S. Adaptive sequential estimation for ergodic diffusion processes in quadratic metric. Part 1. Sharp non-asymptotic oracle inequalities // Prepublication 2007/06, IRMA, Universite Louis Pasteur de Strasbourg, 2007.

11. Galtchouk L. and Pergamenshchikov S. Adaptive sequential estimation for ergodic diffusion processes in quadratic metric. Part 2. Asymptotic efficiency // Prepublication 2007/07, IRMA, Universite Louis Pasteur de Strasbourg, 2007.

12. Goldfeld S.M. and Quandt R.E. Nonlinear Methods in Econometrics. North-Holland, London, 1972.

13. Kneip A. Ordered linear smoothers // Ann. Stat. 1994. P. 835 - 866.

14. Konev V.V. and Pergamenshchikov S.M. General model selection estimation of a periodic regression with a Gaussian noise // Ann. Institute Statist. Math. 2008. URL: http://dx.doi.org/10.1007/s10463-008-0193-1

15. Jacod J. and Shiryaev A.N. Limit theorems for stochastic processes. V. 1. N.Y.: Springer, 1987.

16. Mallows C. Some comments on Cp // Technometrics. 1973. P. 661 - 675.

17. Massart P. A non-asymptotic theory for model selection // ECM Stockholm. 2004. P. 309 - 323.

18. Nussbaum M. Spline smoothing in regression models and asymptotic efficiency in L2 // Ann. Statist. 1985. P. 984 - 997.

19. Pinsker M.S. Optimal filtration of square integrable signals in gaussian white noise // Probl. Transmis. Inform. 1981. P. 120 - 133.

ABOUT THE AUTHORS:

Konev Victor, Department of Applied Mathematics and Cybernetics, Tomsk State University, Lenin str. 36, 634050 Tomsk, Russia. E-mail: [email protected]

Pergamenshchikov Serguei, Laboratoire de Mathématiques Raphael Salem, Avenue de l'Université, BP. 12, Université de Rouen, F76801, Saint Etienne du Rouvray, Cedex France and Department of Mathematics and Mechanics, Tomsk State University, Lenin str. 36, 634041 Tomsk, Russia. E-mail: [email protected]

The article was accepted for publication on 26.08.2009.
