
ВЕСТНИК ТОМСКОГО ГОСУДАРСТВЕННОГО УНИВЕРСИТЕТА
Управление, вычислительная техника и информатика. 2015. № 1 (30)

UDC 519.233.22

M.I. Kusainov

ON OPTIMAL ADAPTIVE PREDICTION OF MULTIVARIATE ARMA(1,1) PROCESS

The problem of the asymptotic efficiency of adaptive one-step predictors for a stable multivariate ARMA(1,1) process with unknown dynamic parameters is considered. The predictors are based on truncated estimators of the unknown matrix. The truncated estimation method is a modification of the truncated sequential estimation method that yields estimators with a given accuracy from samples of fixed size. The criterion of prediction optimality is based on a loss function defined as a linear combination of the sample size and the sample mean of the squared prediction errors. The cases of known and unknown noise variance are studied. In the latter case the optimal sample size is a special stopping time.

Keywords: adaptive predictors; asymptotic risk efficiency; multivariate ARMA; optimal sample size; stopping time; truncated parameter estimators.

According to Ljung's concept of constructing complete probabilistic models of dynamic systems, prediction is a crucial part of such models (see [1, 2]). A model is said to be useful if it allows one to make predictions of high statistical quality. Models of dynamic systems often have unknown parameters, which must be estimated in order to build adaptive predictors. The quality of adaptive prediction depends explicitly on the chosen estimators of the model parameters.

There is a wide variety of possible estimation methods. For example, the sequential estimation method makes it possible to obtain estimators with guaranteed accuracy from samples of finite but random and unbounded size (see, e.g., [3] among others). The more modern truncated sequential estimation method yields estimators with prescribed accuracy from samples of random but bounded size (see, e.g., [4]).

This work suggests predictors based upon the truncated estimators of parameters introduced in [5, 6] as a modification of the truncated sequential estimators. Truncated estimators were constructed for ratio-type functionals; they use samples of fixed (non-random) size and have guaranteed accuracy in the sense of the L_{2m}-norm, m ≥ 1.

The requirement of both good prediction quality and a reasonable duration of the observations needed to achieve it is formulated as a risk efficiency problem. The criterion is given by a loss function, and the optimization is performed with respect to it. The loss function describing the sample mean of squared prediction errors and the sample size, as well as the corresponding risk, were examined for the scalar AR(1) process in [7]. It was shown there that the least squares estimators of the dynamic parameter are asymptotically risk efficient. Later, this result was refined and extended to other stochastic models in [8], using sequential estimators of the unknown parameters.

In this paper we construct and investigate real-time predictors based on truncated estimators in the case of a more general model. We consider the problem of minimizing the risk associated with the sample size and the prediction of values of a stable multivariate ARMA(1,1) process with an unknown dynamic matrix parameter. The proposed procedure is shown to be asymptotically risk efficient as the cost of prediction error tends to infinity.

The same problem was considered for the scalar AR(1) case in [9] and for the multivariate AR(1) case in [10]. The ARMA model was studied in [1, 2], among others. A thorough review of risk efficient parameter estimation and adaptive prediction problems for autoregressive processes was recently given in [11] (see also the references therein).

1. Problem statement

Consider the multivariate stable ARMA(1,1) process satisfying the equation

x(k) = Ax(k−1) + ξ(k) + Mξ(k−1),  k ≥ 1,  (1)

where A and M are p × p matrix parameters whose eigenvalues lie inside the unit circle, which provides the stability of the process (henceforth we shall refer to such matrices as "stable" ones). We assume the parameter A to be unknown and M to be known. The random vectors ξ(k), k ≥ 1, are independent and identically distributed (i.i.d.) with zero mean and finite variance σ² = E‖ξ(1)‖²; we also assume the components ξ_j(k), j = 1, …, p, to be uncorrelated and identically distributed, so that the covariance matrix Σ = Eξ(1)ξ′(1) is diagonal with elements σ²/p. Denote by Λ₀ ⊂ ℝ^{p×p} the stability region of A.

It is known that the optimal one-step predictor in the mean square sense is the conditional expectation of the process given its past, i.e.

x_opt(k) = Ax(k−1) + Mξ(k−1),  k ≥ 1.

Since both the parameter A and the value of ξ(k−1) are unknown, it is natural to replace them with estimators Â_{k−1} and ξ̂(k−1), which we specify in Section 2 below.
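To make the setting concrete, here is a minimal simulation sketch of the process (1) and the predictor x_opt(k) in Python (ours, not from the paper; the paper's own experiments, in MATLAB, are described in Section 4). The matrices are the stable examples used there.

```python
import numpy as np

# Minimal sketch (ours): simulate the stable multivariate ARMA(1,1) process (1)
# and the optimal one-step predictor x_opt(k), which uses the true A, M and the
# true noise xi(k-1); both are available inside a simulation.
rng = np.random.default_rng(0)
p = 2
A = np.array([[0.4, 0.7], [0.4, -0.5]])   # stable matrices from Section 4
M = np.array([[0.1, -0.2], [0.7, 0.4]])
sigma2, n = 1.0, 500

xi = rng.normal(scale=np.sqrt(sigma2 / p), size=(n + 1, p))  # cov = (sigma^2/p) I
x = np.zeros((n + 1, p))
x[0] = rng.normal(size=p)                 # initial value x(0)
x_opt = np.zeros((n + 1, p))
for k in range(1, n + 1):
    x_opt[k] = A @ x[k - 1] + M @ xi[k - 1]
    x[k] = x_opt[k] + xi[k]               # this is exactly equation (1)

# The optimal prediction error is exactly xi(k), so its mean squared norm
# should be close to sigma^2.
print(np.mean(np.sum((x[1:] - x_opt[1:]) ** 2, axis=1)))
```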

Define the adaptive predictors as follows (see, e.g., [1, 12]):

x̂(k) = Â_{k−1}x(k−1) + Mξ̂(k−1),  k ≥ 1,  (2)

for which the corresponding prediction errors have the form

e(k) = x(k) − x̂(k) = (A − Â_{k−1})x(k−1) + M(ξ(k−1) − ξ̂(k−1)) + ξ(k).

Let ē²(n) denote the sample mean of the squared prediction errors,

ē²(n) = (1/n) ∑_{k=1}^{n} ‖e(k)‖².

Define the loss function

A 2

Ln =— e (n) + n, n

where the parameter A(> 0) is the cost of prediction error. The corresponding risk function

Rn = E0 Ln = AE0 e2(n) + n, (3)

n

E0 denotes expectation under the distribution P0 with the given parameter 8 = (Xu,...,Xpp,|au,...,^pp,ct2). Define the set © such that for 8e© the matrices A and M are stable and ct2 > 0. The main aim is to minimize the risk Rn on the sample size n . We consider the cases of known and unknown ct2 .

2. Main result

In this section we solve the stated optimization problem under different conditions on the model parameters. We use, similarly to [10], the truncated estimation method introduced in [5]. This method makes it possible to obtain ratio-type estimators with guaranteed accuracy using a sample of fixed size. This property may essentially simplify the investigation of the analytical properties of various adaptive procedures.

Let the truncated estimators of the autoregressive parameter A be based on the following Yule–Walker-type estimators:

Ã_k = Φ_k G_k^{−1},  k ≥ 2,  Ã₀ = Ã₁ = 0,  (4)

Φ_k = (1/(k−1)) ∑_{i=2}^{k} x(i)x′(i−2),  G_k = (1/(k−1)) ∑_{i=2}^{k} x(i−1)x′(i−2),

and have the form

Â_k = Ã_k χ(|Δ_k| > H_k),  k ≥ 2.  (5)

Here Δ_k = det(G_k), χ(B) denotes the indicator function of the set B, and

H_k = log^{−1/2} k.  (6)
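In code, (4)–(6) can be rendered as follows (a sketch of ours; the function name and array layout are our conventions).

```python
import numpy as np

def truncated_estimator(x, k):
    """Truncated Yule-Walker-type estimator (4)-(6) of A from x(0), ..., x(k).

    Sketch: assumes x is an array of shape (at least k+1, p) and k >= 2;
    only the rows x[0..k] are used.
    """
    p = x.shape[1]
    # Sample cross-covariance matrices Phi_k and G_k from (4).
    Phi = sum(np.outer(x[i], x[i - 2]) for i in range(2, k + 1)) / (k - 1)
    G = sum(np.outer(x[i - 1], x[i - 2]) for i in range(2, k + 1)) / (k - 1)
    # Truncation (5)-(6): keep the ratio-type estimate only when G_k is
    # "well invertible", i.e. |det G_k| exceeds the threshold H_k = log^{-1/2} k.
    if abs(np.linalg.det(G)) > 1.0 / np.sqrt(np.log(k)):
        return Phi @ np.linalg.inv(G)
    return np.zeros((p, p))
```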

We note that, according to [5], H_k can be taken to be any decreasing, slowly varying positive function. We take the estimators of ξ(k) in the following form:

ξ̂(k) = ∑_{i=0}^{k−1} (−M)^i (x(k−i) − Â_k x(k−1−i)),  k ≥ 1.  (7)

Substituting (7) into (2), the prediction error can be rewritten as

e(k) = ξ(k) + (−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i).
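The following sketch (ours) turns the definitions (2) and (7) and the loss L_n into code; `estimate` stands for any estimator of A from x(0..k), e.g. `truncated_estimator` above, and we adopt the convention ξ̂(0) = 0, since no residuals are available at k = 0.

```python
import numpy as np

def adaptive_loss(x, M, A_cost, estimate):
    """Sketch: run the adaptive predictor (2) with noise estimates (7) over
    x(0..n) and return the loss L_n = (A_cost/n) * mean squared error + n."""
    n, p = x.shape[0] - 1, x.shape[1]
    A_hat = np.zeros((p, p))             # \hat A_0 = \hat A_1 = 0, as in (4)
    xi_hat = np.zeros(p)                 # convention: \hat xi(0) = 0
    sq_err = []
    for k in range(1, n + 1):
        x_pred = A_hat @ x[k - 1] + M @ xi_hat          # predictor (2)
        sq_err.append(np.sum((x[k] - x_pred) ** 2))
        A_hat = estimate(x, k) if k >= 2 else np.zeros((p, p))
        # Noise estimate (7): alternating-sign inversion of the MA part.
        xi_hat = np.zeros(p)
        Mi = np.eye(p)                                   # holds (-M)^i
        for i in range(k):
            xi_hat += Mi @ (x[k - i] - A_hat @ x[k - 1 - i])
            Mi = Mi @ (-M)
    return A_cost / n * np.mean(sq_err) + n              # loss L_n
```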

2.1. Known σ² case

If the noise variance σ² is known, then instead of Â_k in (2) we shall use the projection of the estimators (5) onto a closed ball B ⊂ ℝ^{p×p} such that Λ₀ ⊂ B,

A*_k = proj_B Â_k,

ensuring

‖A*_k − A‖ ≤ d_B,  (8)

where d_B is the diameter of B. Given that σ² is known, the property (8) allows one to weaken the moment conditions on the noise compared to the more general case of unknown σ² (see Section 2.2 below). Rewrite the formulae accordingly:

ξ̂*(k) = ∑_{i=0}^{k−1} (−M)^i (x(k−i) − A*_k x(k−1−i)),  x*(k) = A*_{k−1}x(k−1) + Mξ̂*(k−1),

e*(k) = x(k) − x*(k) = ξ(k) + (−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (A*_{k−1} − A)x(k−1−i),

e*²(n) = (1/n) ∑_{k=1}^{n} ‖e*(k)‖²,  L_n = (A/n) e*²(n) + n,  R_n = E_θ L_n = (A/n) E_θ e*²(n) + n.

To minimize the risk R_n we rewrite it in the form

R_n = (A/n)(σ² + D_n) + n,  (9)

where

D_n = (1/n) ∑_{k=1}^{n} E_θ ‖x*(k) − x_opt(k)‖² = (1/n) ∑_{k=1}^{n} E_θ ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (A*_{k−1} − A)x(k−1−i)‖².

We shall use the properties of the estimators Â_k given in Lemma 1 below. Define k₀ = max{p, [e^{|Δ|⁻²}]}, where [a] denotes the integer part of a and

Δ = lim_{k→∞} Δ_k,  P_θ-a.s.

Now we establish conditions on the system parameters under which Δ ≠ 0. It can be shown, similarly to, e.g., [13], that due to the ergodicity of the process (x(k))_{k≥0}

G_k = (1/(k−1)) ∑_{i=2}^{k} x(i−1)x′(i−2) → G as k → ∞,  P_θ-a.s.,

where

G = AF + MΣ,  F = ∑_{i≥0} A^i S A′^i,  S = AΣM′ + MΣA′ + Σ + MΣM′.  (10)

The condition for Δ ≠ 0 is thus the nondegeneracy of G. For example, in the scalar case p = 1 we have

G = ((A + M)(1 + AM)/(1 − A²)) σ²,

which is the first-order autocovariance; the condition reduces to A + M ≠ 0, since the stability of the process implies 1 + AM ≠ 0.
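The nondegeneracy of G is also easy to check numerically; a sketch of ours, evaluating (10) with the geometric series for F truncated at a large index:

```python
import numpy as np

# Sketch (ours): evaluate G = A F + M Sigma from (10) and check det(G) != 0,
# so that the limiting determinant Delta = det(G) is nonzero.
A = np.array([[0.4, 0.7], [0.4, -0.5]])
M = np.array([[0.1, -0.2], [0.7, 0.4]])
p, sigma2 = 2, 1.0
Sigma = sigma2 / p * np.eye(p)            # noise covariance (sigma^2/p) I

S = A @ Sigma @ M.T + M @ Sigma @ A.T + Sigma + M @ Sigma @ M.T
F, Ai = np.zeros((p, p)), np.eye(p)
for _ in range(500):                      # F = sum_{i>=0} A^i S A'^i; the terms
    F += Ai @ S @ Ai.T                    # decay geometrically since A is stable
    Ai = Ai @ A
G = A @ F + M @ Sigma
print(np.linalg.det(G))                   # nonzero => the estimators are consistent
```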

From here on, C denotes non-negative constants whose exact values are not critical.

Lemma 1. Assume the model (1) and let for some integer m ≥ 1 the conditions

E‖ξ(1)‖^{4pm} < ∞,  E‖x(0)‖^{4pm} < ∞  (11)

hold. Assume also that the matrix G defined in (10) is nondegenerate. Then the truncated estimators Â_k satisfy

(i) for 1 ≤ k < k₀

E_θ ‖Â_k − A‖^{2m} ≤ C;  (12)

(ii) for k ≥ k₀

E_θ ‖Â_k − A‖^{2m} ≤ C log^m k / k^m.  (13)

The proof of Lemma 1 is similar to that of assertion (31) in [5] and of Lemma 1 in [10].

Now we rewrite D_n in the form

D_n = (σ²/(np)) ∑_{k=1}^{n} ‖M^k‖² + (1/n) ∑_{k=1}^{n} E_θ ‖∑_{i=0}^{k−1} (−M)^i (A*_{k−1} − A)x(k−1−i)‖² −
  − (2/n) ∑_{k=1}^{n} E_θ [(−M)^k ξ(0)]′ ∑_{i=0}^{k−1} (−M)^i (A*_{k−1} − A)x(k−1−i).  (14)


Consider the first summand. It is known that M^k = TJ^kT^{−1}, where J is the Jordan canonical form of M and the columns of T are generalized eigenvectors of M. It can then be shown that ‖M^k‖ ≤ Ck^{p−1} max_{1≤i≤p} |μ_i|^k, where μ_i, i = 1, …, p, are the eigenvalues of M. Boundedness of the series now follows from the stability of M, so we have

(σ²/(np)) ∑_{k=1}^{n} ‖M^k‖² ≤ C/n.

Consider the second summand of (14):

(1/n) ∑_{k=1}^{n} E_θ ‖∑_{i=0}^{k−1} (−M)^i (A*_{k−1} − A)x(k−1−i)‖² ≤ (1/n) ∑_{k=1}^{n} ∑_{i=0}^{k−1} E_θ ‖M^i (A*_{k−1} − A)x(k−1−i)‖² +
  + (1/n) ∑_{k=1}^{n} ∑_{i,j=0; j≠i}^{k−1} E_θ ‖A*_{k−1} − A‖² ‖M^j x(k−1−j)‖ · ‖M^i x(k−1−i)‖.  (15)

If the conditions E‖ξ(1)‖^{4p} < ∞, E‖x(0)‖^{4p} < ∞ hold, then, using the Cauchy–Schwarz–Bunyakovsky inequality, (8) and (13) (note that, by (8), the bound (13) carries over to A*_{k−1}), we get

(1/n) ∑_{k=1}^{n} ∑_{i=0}^{k−1} E_θ ‖M^i (A*_{k−1} − A)x(k−1−i)‖² ≤ (C/n) ∑_{k=1}^{n} ∑_{i=0}^{k−1} (E_θ ‖Â_{k−1} − A‖²)^{1/2} ‖M^i‖² ≤
  ≤ (C/n) ∑_{k=1}^{n} log^{1/2}k / k^{1/2} ≤ C log^{1/2}n / n^{1/2}.

The second summand of (15), as well as the third summand of D_n itself, is treated in the same way and has no impact on the result. Thus, the use of the estimators A*_k in the adaptive predictors yields

D_n ≤ Cn^{−1/2} log^{1/2} n = o(1) as n → ∞.

Considering (9), the stated risk minimization problem reduces to the minimization of the principal term:

R_n ≈ (A/n)σ² + n → min over n.

Since the parameter σ² is known and the function (A/n)σ² + n is convex in n with derivative 1 − Aσ²/n², the expression is easily minimized by the optimal sample size

n_A^0 = A^{1/2}σ.  (16)

The corresponding approximate minimal risk value is

R_{n_A^0} = 2A^{1/2}σ + O(A^{1/4} log^{1/2} A) as A → ∞.  (17)

Thus, the following corollary holds.

Corollary 1. Assume that E‖ξ(1)‖^{4p} < ∞, E‖x(0)‖^{4p} < ∞ and the variance σ² is known. Then the number n_A^0 defined in (16) minimizes the risk function R_n defined in (9), and the asymptotic formula (17) for R_{n_A^0} holds.
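A quick numerical check of (16) and (17) over an integer grid (our sketch, with the cost A = 5000 used in Section 4):

```python
import numpy as np

# Sketch (ours): the principal risk term A*sigma^2/n + n is convex in n, so a
# grid search confirms the minimizer n_A^0 = sqrt(A)*sigma from (16).
A_cost, sigma2 = 5000.0, 1.0
n = np.arange(1, 1000)
risk = A_cost * sigma2 / n + n
print(n[np.argmin(risk)])                        # 71, close to n_A^0 = 70.71
print(risk.min(), 2 * np.sqrt(A_cost * sigma2))  # both ~ 141.4 = 2 A^{1/2} sigma
```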

2.2. Unknown σ² case

Since σ² is directly involved in the expression (9) for R_n, the optimal sample size cannot be obtained as before. Similarly to [7, 8, 10], one uses the stopping time T_A as an estimator of n_A^0, replacing σ² in its definition with an estimator σ̂²_n:

T_A = inf{n ≥ n_A: n ≥ A^{1/2} σ̂_n},  (18)

where n_A is the initial sample size, depending on A and specified below (see Theorem 1), and

σ̂²_n = (p/(p + ‖M‖²)) · (1/n) ∑_{k=1}^{n} ‖x(k) − Â_n x(k−1)‖².  (19)

The choice of the estimator is motivated by the fact that, by the strong law of large numbers,

(1/n) ∑_{k=1}^{n} ‖x(k) − Ax(k−1)‖² = (1/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² → σ²(1 + ‖M‖²/p) as n → ∞,  P_θ-a.s.
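In code, (18) and (19) look as follows (our sketch; `estimate` is as in the sketches above, and ‖M‖ is taken to be the Frobenius norm, which matches the computation of E‖Mξ(k−1)‖² under the diagonal covariance Σ):

```python
import numpy as np

def sigma2_hat(x, A_hat, M):
    """Variance estimator (19) from x(0..n) and an estimate A_hat of A."""
    n, p = x.shape[0] - 1, x.shape[1]
    resid = x[1:] - x[:-1] @ A_hat.T               # x(k) - \hat A_n x(k-1)
    return p / (p + np.linalg.norm(M) ** 2) * np.sum(resid ** 2) / n

def stopping_time(x, M, A_cost, n_init, estimate):
    """Stopping time (18): the first n >= n_init with n >= A^{1/2} sigma_hat_n.
    Returns None if the available sample is exhausted first (our convention)."""
    for n in range(max(n_init, 2), x.shape[0]):
        s = np.sqrt(sigma2_hat(x[: n + 1], estimate(x, n), M))
        if n >= np.sqrt(A_cost) * s:
            return n
    return None
```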

In this section we define the predictors of x(k) using the truncated estimators Â_k themselves instead of the projections A*_k. Rewrite the needed formulae:

ξ̂(k) = ∑_{i=0}^{k−1} (−M)^i (x(k−i) − Â_k x(k−1−i)),  x̃(k) = Â_{k−1}x(k−1) + Mξ̂(k−1),  (20)

ẽ(k) = x(k) − x̃(k) = ξ(k) + (−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i),

ẽ²(n) = (1/n) ∑_{k=1}^{n} ‖ẽ(k)‖²,  L̃_n = (A/n) ẽ²(n) + n,  R̃_n = E_θ L̃_n = (A/n) E_θ ẽ²(n) + n.  (21)

Analogously to [7], we prove the asymptotic equivalence of T_A and n_A^0 in the almost sure and mean senses (see (23), (24) below), as well as the optimality of the adaptive prediction procedure in the sense of the equivalence of the correspondingly modified risk

R̃_A = E_θ L̃_{T_A} = A E_θ [ẽ²(T_A)/T_A] + E_θ T_A  (22)

and R_{n_A^0}, see (17).

Theorem 1. Assume that E‖ξ(1)‖^{16p} < ∞, E‖x(0)‖^{16p} < ∞ and that n_A in (18) satisfies max{k₀, A^r log² A} ≤ n_A = o(A^{1/2}) for some r ∈ (2/5, 1/2). Let the predictors x̃(k) be defined by (20) and the risk functions by (21), (22). Then for every θ ∈ Θ

T_A/n_A^0 → 1 as A → ∞,  P_θ-a.s.,  (23)

E_θ T_A/n_A^0 → 1 as A → ∞,  (24)

R̃_A/R_{n_A^0} → 1 as A → ∞.  (25)

The proof of Theorem 1 is presented in Section 3.

Remark 1. The third assertion of Theorem 1 is also true for predictors based on the projected estimators A*_k.

3. Proof of Theorem 1

First, we prove the properties (23), (24) of the stopping time T_A.

From the conditions of Theorem 1 on the noise moments it follows that for θ ∈ Θ

sup_{k≥0} E_θ ‖x(k)‖^{16p} ≤ C.  (26)

Denote

C_M = p/(p + ‖M‖²)

and rewrite formula (19) for σ̂²_n using (1):

σ̂²_n = (C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1) + (A − Â_n)x(k−1)‖² = (C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² + W_n + v_n,  (27)

where

W_n = (C_M/n) ∑_{k=1}^{n} ‖(Â_n − A)x(k−1)‖²,  v_n = −(2C_M/n) ∑_{k=1}^{n} [(Â_n − A)x(k−1)]′(ξ(k) + Mξ(k−1)).

Now we show that

σ̂²_n → σ² as n → ∞,  P_θ-a.s.  (28)

Consider W_n. It follows from the definition (5) of the truncated estimators Â_n that they are asymptotically equivalent to the corresponding correlation estimators (4) (see, e.g., p. 8 in [5]). Since the estimators (4) are strongly consistent, we have

Â_n − A → 0 as n → ∞,  P_θ-a.s.

Given that

(1/n) ∑_{k=1}^{n} x(k−1)x′(k−1) → F as n → ∞,  P_θ-a.s.,

where F is the constant matrix defined in (10), it follows that

W_n → 0 as n → ∞,  P_θ-a.s.

Similar arguments are used to show that

v_n → 0 as n → ∞,  P_θ-a.s.

The relation (28) follows from these facts, the representation (27) and the strong law of large numbers. From the definition (18) of T_A it follows that T_A → ∞ as A → ∞ with P_θ-probability one. Therefore, by (28), we have σ̂²_{T_A} → σ², P_θ-a.s., and hence

T_A/(A^{1/2}σ) → 1 as A → ∞,  P_θ-a.s.,

which is exactly the assertion (23), since n_A^0 = A^{1/2}σ.

To prove (24), for any positive A we introduce the auxiliary sequence

γ_{A,n} = n²A^{−1}/(2 log A),  n ≥ 1,

and denote

m_n = (C_M/n) ∑_{k=1}^{n} (‖ξ(k) + Mξ(k−1)‖² − σ²(1 + ‖M‖²/p)),

so that (C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² = σ² + m_n. By the definition of T_A and (27) we have

E_θ T_A ≤ n_A + ∑_{n≥n_A} P_θ(n²A^{−1} < (C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² + W_n + v_n) ≤
 ≤ n_A + ∑_{n≥n_A} {P_θ(n²A^{−1} < (C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² + γ_{A,n}) + P_θ(W_n + |v_n| > γ_{A,n})} ≤
 ≤ n_A + ∑_{n≥n_A} {P_θ(n²A^{−1} < σ² + 2γ_{A,n}) + P_θ(|v_n| > γ_{A,n}/2) + P_θ(W_n > γ_{A,n}/2) + P_θ(|m_n| > γ_{A,n})}.  (29)

Note that

n_A + ∑_{n≥n_A} P_θ(n²A^{−1} < σ² + 2γ_{A,n}) = ñ_A,

where

ñ_A = inf{n ≥ 1: n²A^{−1} ≥ σ² + 2γ_{A,n}} = [A^{1/2}σ(1 + 1/(log A − 1))^{1/2}] + 1,

and therefore

ñ_A/(A^{1/2}σ) → 1 as A → ∞.  (30)
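For completeness, the algebra behind the explicit form of ñ_A above (our check):

```latex
\[
n^2 A^{-1} \ge \sigma^2 + 2\gamma_{A,n}
\iff n^2 A^{-1}\Bigl(1 - \frac{1}{\log A}\Bigr) \ge \sigma^2
\iff n \ge A^{1/2}\sigma\Bigl(\frac{\log A}{\log A - 1}\Bigr)^{1/2}
   = A^{1/2}\sigma\Bigl(1 + \frac{1}{\log A - 1}\Bigr)^{1/2},
\]
```

so ñ_A is the smallest such integer, and (30) follows at once.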

Now we show that the other summands on the right-hand side of (29) vanish, when appropriately normalized, as A → ∞.

Consider the probability P_θ(|v_n| > γ_{A,n}/2). According to (26), the Chebyshev inequality and the Cauchy–Schwarz–Bunyakovsky inequality, for n ≥ n_A we have

P_θ(|v_n| > γ_{A,n}/2) = P_θ((2C_M/n) |∑_{k=1}^{n} [(Â_n − A)x(k−1)]′(ξ(k) + Mξ(k−1))| > γ_{A,n}/2) ≤
 ≤ (C/(nγ_{A,n})) ∑_{k=1}^{n} (E_θ ‖Â_n − A‖² E_θ ‖x(k−1)‖² ‖ξ(k) + Mξ(k−1)‖²)^{1/2} ≤
 ≤ C log^{1/2}n/(n^{1/2} γ_{A,n}) ≤ CA log A · log^{1/2}n/n^{5/2}.

From the assumptions on n_A it follows that

A^{−1/2} ∑_{n≥n_A} P_θ(|v_n| > γ_{A,n}/2) ≤ CA^{1/2} log A ∑_{n≥n_A} log^{1/2}n/n^{5/2} ≤
 ≤ CA^{1/2} log^{3/2} A · n_A^{−3/2} ≤ CA^{(1−3r)/2} log^{−3/2} A → 0 as A → ∞.

The probability P_θ(W_n > γ_{A,n}/2) is treated analogously.

As for the probability P_θ(|m_n| > γ_{A,n}), note that m_n is a sum of martingale terms; thus the Chebyshev inequality and the Burkholder inequality yield

P_θ(|m_n| > γ_{A,n}) = P_θ((C_M/n) |∑_{k=1}^{n} (‖ξ(k) + Mξ(k−1)‖² − σ²(1 + ‖M‖²/p))| > γ_{A,n}) ≤ Cγ_{A,n}^{−2} n^{−1} = 4CA² log² A · n^{−5}.

Therefore, by the assumptions on n_A,

A^{−1/2} ∑_{n≥n_A} P_θ(|m_n| > γ_{A,n}) ≤ CA^{3/2} log² A ∑_{n≥n_A} n^{−5} ≤ CA^{3/2} log² A · n_A^{−4} ≤ CA^{(3−8r)/2} log^{−6} A → 0 as A → ∞.

Then from (29) it follows that

lim sup_{A→∞} E_θ T_A/(A^{1/2}σ) ≤ 1.

The same arguments can be used to show that

lim inf_{A→∞} E_θ T_A/(A^{1/2}σ) ≥ 1,

and thus, in view of (30), the assertion (24) holds.

Regarding (25), rewrite its left-hand side using (17) and (22):

R̃_A/R_{n_A^0} = (A E_θ [ẽ²(T_A)/T_A] + E_θ T_A)/(2A^{1/2}σ + O(A^{1/4} log^{1/2} A)).  (31)

From (24) and (31) it follows that to prove (25) it suffices to show the convergence

A^{1/2} E_θ [ẽ²(T_A)/(T_A σ)] → 1 as A → ∞.  (32)

Define

N′ = [(σ − ε)A^{1/2}],  N″ = [(σ + ε)A^{1/2}] + 1,  0 < ε < σ.

We will need the following properties,

P_θ(T_A ≤ N′) = O(A^{−r}),  P_θ(T_A > N″) = O(A^{−1}),  (33)

which we prove similarly to Lemma 4 of [7].

Denote δ₁ = σ² − (σ − ε)². Using the non-negativity of W_n and the definitions of T_A and σ̂²_n, one gets

P_θ(T_A ≤ N′) ≤ P_θ(T_A ≤ (σ − ε)A^{1/2}) = P_θ(σ̂²_n ≤ A^{−1}n² for some n_A ≤ n ≤ (σ − ε)A^{1/2}) ≤
 ≤ P_θ((C_M/n) ∑_{k=1}^{n} ‖ξ(k) + Mξ(k−1)‖² + v_n ≤ (σ − ε)² for some n ≥ n_A) =
 = P_θ(m_n + v_n ≤ −δ₁ for some n ≥ n_A) ≤ ∑_{n≥n_A} P_θ(|m_n| ≥ δ₁/2) + ∑_{n≥n_A} P_θ(|v_n| ≥ δ₁/2).  (34)

Consider the first summand. By the Chebyshev inequality, (26) and the Burkholder inequality,

∑_{n≥n_A} P_θ(|m_n| ≥ δ₁/2) ≤ C ∑_{n≥n_A} E_θ |m_n|⁴ ≤ C ∑_{n≥n_A} n^{−2} ≤ CA^{−r} log^{−2} A.  (35)

Analogously to the estimates used for (29), one can show that

sup_{n≥2} E_θ (n log^{−1} n · |v_n|)² < ∞.

Thus,

∑_{n≥n_A} P_θ(|v_n| ≥ δ₁/2) ≤ C ∑_{n≥n_A} n^{−2} log² n ≤ Cn_A^{−1} log² A ≤ CA^{−r}.  (36)

The first property of (33) follows from (34)-(36).


Now we prove the second property in (33). Denote δ₂ = (σ + ε)² − σ². Then, by the definition (18) of T_A and (27),

P_θ(T_A > N″) ≤ P_θ((C_M/N″) ∑_{k=1}^{N″} ‖ξ(k) + Mξ(k−1)‖² + W_{N″} + v_{N″} > A^{−1}(N″)²) ≤
 ≤ P_θ((C_M/N″) ∑_{k=1}^{N″} ‖ξ(k) + Mξ(k−1)‖² + |W_{N″} + v_{N″}| > (σ + ε)²) ≤
 ≤ P_θ(|m_{N″}| > δ₂/2) + P_θ(|W_{N″} + v_{N″}| > δ₂/2).

By the Chebyshev and Burkholder inequalities,

P_θ(|m_{N″}| > δ₂/2) ≤ C(N″)^{−2} = O(A^{−1}),  P_θ(|W_{N″} + v_{N″}| > δ₂/2) ≤ C(N″)^{−2} = O(A^{−1}).

Thus, the second assertion in (33) holds true.

To prove (32) we show that

A^{1/2} E_θ [ẽ²(T_A)/(T_A σ)] χ(T_A ≤ N′) → 0,  A^{1/2} E_θ [ẽ²(T_A)/(T_A σ)] χ(T_A > N″) → 0,  (37)

A^{1/2} E_θ [ẽ²(T_A)/(T_A σ)] χ(N′ < T_A ≤ N″) → 1 as A → ∞.  (38)

Let us prove the first assertion in (37). By the definition of ẽ(k) we get

A^{1/2} E_θ [ẽ²(T_A)/T_A] χ(T_A ≤ N′) =
 = A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖² χ(T_A ≤ N′) +
 + 2A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ξ′(k)((−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)) χ(T_A ≤ N′) +
 + A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖ξ(k)‖² χ(T_A ≤ N′).  (39)

Consider the first summand. By the Cauchy–Schwarz–Bunyakovsky inequality and the definition of T_A, we have

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖² χ(T_A ≤ N′) ≤
 ≤ A^{1/2} P_θ^{1/2}(T_A ≤ N′) (1/n_A²) ∑_{k=1}^{N′} (E_θ ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖⁴)^{1/2}.

Examine the expression under the square root. Its most significant summand is treated using the Cauchy–Schwarz–Bunyakovsky inequality and Lemma 1:

E_θ ‖∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖⁴ ≤ E_θ [‖Â_{k−1} − A‖⁴ (∑_{i=0}^{k−1} ‖M^i‖ ‖x(k−1−i)‖)⁴] ≤
 ≤ (E_θ ‖Â_{k−1} − A‖⁸)^{1/2} (E_θ (∑_{i=0}^{k−1} ‖M^i‖ ‖x(k−1−i)‖)⁸)^{1/2} ≤ C (log² k/k²)(E_θ (∑_{i=0}^{k−1} ‖M^i‖ ‖x(k−1−i)‖)⁸)^{1/2}.

It can easily be shown, employing the Hölder inequality and (26), that

E_θ (∑_{i=0}^{k−1} ‖M^i‖ ‖x(k−1−i)‖)⁸ ≤ C,

and hence, by the assumptions on n_A and r, the properties (33) and Lemma 1, we have

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖² χ(T_A ≤ N′) ≤
 ≤ CA^{1/2} P_θ^{1/2}(T_A ≤ N′) (1/n_A²) ∑_{k=1}^{N′} log k/k ≤ CA^{(1−5r)/2} log^{−2} A → 0 as A → ∞.

Consider the second summand of (39). Doob's maximal inequality for martingales (see, e.g., [14]) and the Cauchy–Schwarz–Bunyakovsky inequality yield

A^{1/2} E_θ (1/T_A²) |∑_{k=1}^{T_A} ξ′(k)((−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i))| χ(T_A ≤ N′) ≤
 ≤ A^{1/2} n_A^{−2} P_θ^{1/2}(T_A ≤ N′) (E_θ max_{1≤n≤N′} |∑_{k=1}^{n} ξ′(k)((−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i))|²)^{1/2} ≤
 ≤ CA^{(1−5r)/2} log A → 0 as A → ∞.

Consider the last summand of (39). We have

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖ξ(k)‖² χ(T_A ≤ N′) ≤ A^{1/2} n_A^{−2} P_θ^{1/2}(T_A ≤ N′) ∑_{k=1}^{N′} (E_θ ‖ξ(k)‖⁴)^{1/2} ≤ CA^{(2−5r)/2} log^{−4} A → 0 as A → ∞.

Thus, the first part of (37) has been proved. Similar arguments are applied to the second part, with χ(T_A ≤ N′) replaced by χ(T_A > N″), to get

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖² χ(T_A > N″) ≤ CA^{−1} → 0,

A^{1/2} E_θ (1/T_A²) |∑_{k=1}^{T_A} ξ′(k)((−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i))| χ(T_A > N″) ≤ CA^{−3} log A → 0,

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖ξ(k)‖² χ(T_A > N″) ≤ CA^{−1} → 0,

and to (38), with χ(T_A ≤ N′) replaced by χ(N′ < T_A ≤ N″), to get

A^{1/2} E_θ (1/T_A²) ∑_{k=1}^{T_A} ‖(−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i)‖² χ(N′ < T_A ≤ N″) ≤ CA^{−1/2} log² A → 0,

A^{1/2} E_θ (1/T_A²) |∑_{k=1}^{T_A} ξ′(k)((−M)^k ξ(0) − ∑_{i=0}^{k−1} (−M)^i (Â_{k−1} − A)x(k−1−i))| χ(N′ < T_A ≤ N″) ≤ CA^{−1/2} log A → 0.

Now we show that

A^{1/2} E_θ (1/(T_A² σ)) ∑_{k=1}^{T_A} ‖ξ(k)‖² χ(N′ < T_A ≤ N″) → 1 as A → ∞.

To this end, rewrite the left-hand side as follows:

A^{1/2} E_θ (1/(T_A² σ)) ∑_{k=1}^{T_A} ‖ξ(k)‖² χ(N′ < T_A ≤ N″) =
 = A^{1/2} E_θ (1/(T_A² σ)) ∑_{k=1}^{T_A} (‖ξ(k)‖² − σ²) χ(N′ < T_A ≤ N″) + A^{1/2} σ E_θ (1/T_A) χ(N′ < T_A ≤ N″).

We show that the first summand converges to 0 and the second one converges to 1. By Doob's maximal inequality and the Cauchy–Schwarz–Bunyakovsky inequality,

A^{1/2} E_θ (1/(T_A² σ)) |∑_{k=1}^{T_A} (‖ξ(k)‖² − σ²)| χ(N′ < T_A ≤ N″) ≤ CA^{1/2} (N′)^{−2} (E_θ max_{1≤n≤N″} |∑_{k=1}^{n} (‖ξ(k)‖² − σ²)|²)^{1/2} ≤
 ≤ CA^{−1/2} (∑_{k=1}^{N″} E_θ ‖ξ(k)‖⁴)^{1/2} ≤ CA^{−1/4} → 0 as A → ∞.

Consider the second summand. To prove its convergence to 1 it suffices (see, e.g., [15]) to show that

P_θ-lim_{A→∞} A^{1/2} (σ/T_A) χ(N′ < T_A ≤ N″) = 1  (40)

and that the family of random variables

Z = {A^{1/2} (σ/T_A) χ(N′ < T_A ≤ N″), A > 0}  (41)

is uniformly integrable.

The property (40) is fulfilled since, according to (23),

T_A/(A^{1/2}σ) → 1 as A → ∞,  P_θ-a.s.,

and hence

lim_{A→∞} χ(N′ < T_A ≤ N″) = lim_{A→∞} χ(1 − ε/σ < T_A/(A^{1/2}σ) ≤ 1 + ε/σ) = 1,  P_θ-a.s.

Property (41) holds true since the family Z is uniformly bounded. This proves (38) and, with it, (32), (25) and Theorem 1.

4. Numerical simulation

To confirm the theoretical results, we performed a numerical simulation programmed in MATLAB for a two-dimensional stable ARMA(1,1) process. The needed expected values are approximated by sample means over 100 realizations. E.g., the realizations of the stopping time T_A are

T_A^{(n)} = inf{k ≥ n_A: k ≥ A^{1/2} σ̂_k^{(n)}},  n = 1, …, 100,

and the expectation E_θ T_A is then estimated as follows:

Ê_θT_A = (1/100) ∑_{n=1}^{100} T_A^{(n)}.

The initial value x(0) and the noises ξ(k) are generated from the multivariate Gaussian distributions

x(0) ~ N(0, I),  ξ(k) ~ N(0, (σ²/2) I),

where I is the 2 × 2 identity matrix, so that E‖ξ(k)‖² = σ². We take the true values of the matrix parameters to be

A = | 0.4   0.7 |   M = | 0.1  −0.2 |
    | 0.4  −0.5 |,      | 0.7   0.4 |,

with eigenvalues λ₁ = 0.6446, λ₂ = −0.7446 of A and μ₁ = 0.25 + 0.3428i, μ₂ = 0.25 − 0.3428i of M, which satisfy the stability conditions.
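A compact Python rendition of this experiment (our sketch; it relies on the `truncated_estimator` and `stopping_time` sketches above and uses a small fixed burn-in n_init in place of the theoretical n_A, whose asymptotic lower bound is not meant to be taken literally at finite A):

```python
import numpy as np

# Sketch (ours): average the stopping time T_A over 100 realizations and
# compare with n_A^0 = A^{1/2}*sigma, as in Table 1(a) for A = 5000, sigma^2 = 1.
rng = np.random.default_rng(1)
p = 2
A = np.array([[0.4, 0.7], [0.4, -0.5]])
M = np.array([[0.1, -0.2], [0.7, 0.4]])
A_cost, sigma2 = 5000.0, 1.0
n_init, n_max, n_rep = 20, 400, 100

def simulate(n):
    xi = rng.normal(scale=np.sqrt(sigma2 / p), size=(n + 1, p))
    x = np.zeros((n + 1, p))
    x[0] = rng.normal(size=p)
    for k in range(1, n + 1):
        x[k] = A @ x[k - 1] + xi[k] + M @ xi[k - 1]   # equation (1)
    return x

T = []
for _ in range(n_rep):
    t = stopping_time(simulate(n_max), M, A_cost, n_init, truncated_estimator)
    T.append(t if t is not None else n_max)
print(np.mean(T), np.sqrt(A_cost * sigma2))           # estimate of E T_A vs 70.71
```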

Tables 1(a, b) compare the estimates of R_{n_A^0} with the value 2A^{1/2}σ implied by (17), for two values of the prediction error cost A, together with the estimates of E_θT_A/n_A^0 and R̃_A/R_{n_A^0}.

T a b l e 1

Estimates of the crucial values

a) The prediction error cost A = 5000

σ²  | n_A^0  | Ê_θT_A/n_A^0 | R̂_{n_A^0} | R̂_{n_A^0}/(2A^{1/2}σ) | R̃_A/R_{n_A^0}
1   | 70.71  | 1.02         | 166.9      | 1.18                   | 0.98
3   | 122.5  | 1.01         | 282.5      | 1.15                   | 0.99
5   | 158.1  | 1.00         | 341.5      | 1.08                   | 1

b) The prediction error cost A = 10000

σ²  | n_A^0  | Ê_θT_A/n_A^0 | R̂_{n_A^0} | R̂_{n_A^0}/(2A^{1/2}σ) | R̃_A/R_{n_A^0}
1   | 100    | 1.01         | 222.3      | 1.12                   | 0.99
3   | 173.2  | 0.99         | 377.5      | 1.09                   | 1
5   | 223.6  | 1.00         | 468.1      | 1.04                   | 1

As the tables show, the ratio Ê_θT_A/n_A^0 converges to 1 as the optimal sample size n_A^0 grows, which is also reflected in the fact that the values of the risks R̃_A and R_{n_A^0}, corresponding to the cases of unknown and known noise variance respectively, are very close to each other. At the same time, the values of both risks are accurately approximated by 2A^{1/2}σ only if the prediction error cost, and with it the optimal sample size, are rather large.

5. Summary

This paper considered the problem of jointly optimizing the one-step prediction quality and the sample size for a stable multivariate ARMA(1,1) process with an unknown dynamic matrix parameter. The cases of known and unknown noise variance were studied. In both cases the optimization is based on a loss function involving the sample mean of the squared prediction errors. If the noise variance is unknown, the risk function depends on the mean duration of observations, defined in this case via a stopping time. It was shown that the two risk functions are asymptotically equivalent.

The adaptive predictors were constructed on the basis of truncated estimators of the dynamic matrix parameter. These estimators possess guaranteed statistical properties on samples of fixed size. The use of such estimators essentially simplifies the analytical investigation of the statistical properties of adaptive predictors and can be useful in various other adaptive procedures (control, filtering, etc.).

Acknowledgements

The author wishes to thank his scientific adviser, Professor V.A. Vasiliev, who provided valuable comments and ideas regarding the research theme.

REFERENCES

1. Ljung L. System Identification: Theory for the User. Upper Saddle River: Prentice Hall, 1987.
2. Ljung L., Söderström T. Theory and Practice of Recursive Identification. Cambridge: The MIT Press, 1983.
3. Konev V., Pergamenshchikov S. On the Duration of Sequential Estimation of Parameters of Stochastic Processes in Discrete Time // Stochastics. 1986. V. 18. Is. 2. P. 133–154. DOI: 10.1080/17442508608833405.
4. Konev V., Pergamenshchikov S. Truncated Sequential Estimation of the Parameters in Random Regression // Sequential Analysis. 1990. V. 9. Is. 1. P. 19–41. DOI: 10.1080/07474949008836194.
5. Vasiliev V.A. A Truncated Estimation Method with Guaranteed Accuracy // Annals of the Institute of Statistical Mathematics. 2014. V. 66. P. 141–163. DOI: 10.1007/s10463-013-0409-x.
6. Vasiliev V. Guaranteed Estimation of Logarithmic Density Derivative by Dependent Observations // Topics in Nonparametric Statistics: Proceedings of the First Conference of the International Society for Nonparametric Statistics / eds. M.G. Akritas et al. New York: Springer, 2014. DOI: 10.1007/978-1-4939-0569-0_31.
7. Sriram T. Sequential Estimation of the Autoregressive Parameter in a First Order Autoregressive Process // Sequential Analysis. 1988. V. 7. Is. 1. P. 53–74. DOI: 10.1080/07474948808836142.
8. Konev V., Lai T. Estimators with Prescribed Precision in Stochastic Regression Models // Sequential Analysis. 1995. V. 14. Is. 3. P. 179–192. DOI: 10.1080/07474949508836330.
9. Vasiliev V., Kusainov M. Asymptotic Risk-Efficiency of One-Step Predictors of a Stable AR(1) // Proceedings of the XII All-Russian Conference on Control Problems. Moscow, 2014.
10. Kusainov M., Vasiliev V. On Optimal Adaptive Prediction of Multivariate Autoregression // Sequential Analysis. 2015. V. 34. Is. 2 (to be published).
11. Sriram T., Iaci R. Sequential Estimation for Time Series Models // Sequential Analysis. 2014. V. 33. Is. 2. P. 136–157.
12. Box G., Jenkins G., Reinsel G. Time Series Analysis: Forecasting and Control. Hoboken: Wiley, 2008.
13. Pergamenshchikov S. Asymptotic Properties of the Sequential Plan for Estimating the Parameter of a First-Order Autoregression // Theory of Probability and Its Applications. 1991. V. 36. Is. 1. P. 42–53.
14. Liptser R., Shiryaev A. Statistics of Random Processes. New York: Springer, 1977.
15. Gikhman I., Skorokhod A. Introduction to the Theory of Random Processes. Philadelphia: Saunders, 1969.

Kusainov Marat Islambekovich. E-mail: [email protected]
Tomsk State University, Tomsk, Russian Federation

Received 10 December 2014.
