Series «Mathematics»
Volume 1 (2007), No. 1, pp. 70-85
Online access to the journal: http://isu.ru/izvestia
UDC 517.977
Search of equilibrium controls in differential game with boundary conditions
Olga Vasilieva ([email protected])
University of Valle, Cali
Abstract. The present paper deals with a finite-time differential game of m players with nonzero sum. It should be emphasized that the players' states are governed by boundary value ODE systems (rather than initial value systems). By the end (or solution) of the game we mean an equilibrium situation, which is attained by applying an equilibrium control strategy. Our purpose, therefore, is to design a well-founded algorithm for equilibrium control search. To fulfil this task we shall make use of optimal control techniques.
Keywords: optimal control, differential game, dynamic system, boundary conditions
1. Introduction

Within the framework of the classical approach to game theory (see, e.g., [1]), the ultimate goal is to find a decision which would be suitable for every player. In other words, one seeks a decision strategy that maintains some sort of balance in the game. Such a strategy was originally defined in a strict mathematical way by J. F. Nash in his revolutionary work [5] and was called the equilibrium strategy.
Initially, the concept of Nash equilibrium was used in mathematical programming (e.g., multicriteria optimization and static game theory), where Nash equilibrium solutions frequently occurred at saddle points of the payoff function. Later, this concept was generalized to differential (i.e., dynamic) games [1, 3] and grew into the Nash equilibrium control strategy.
In order to find an equilibrium control strategy in a differential game of m players, one can use various approaches. Among them, there is one method of particular interest to us, which consists in reducing the differential game to a problem of optimal control [9]. The latter is then solved using some gradient or subgradient search procedure or, alternatively, a successive approximation technique based on the maximum principle.
In most cases, the state of a finite-time nonzero-sum dynamic game of m players is governed by a system of ordinary differential equations with initial conditions, so that the problem of optimal control is to be solved along the profile of an initial value system (see, e.g., [8, 13]). In this paper we replace the initial conditions with boundary conditions and thus deliberately complicate the situation. In this case, the dynamic game can also be reduced to a problem of optimal control which, however, should be solved along the profile of a boundary value system of ODE.
For many types of problems in economics and engineering dealing with controllable dynamic processes, it is essential that the system of ODE describing the process meet boundary conditions. The optimal control problem with boundary conditions has independent scientific and practical significance of its own when we speak of optimization in some areas of applied chemical (nuclear and subatomic) engineering and other industrial processes [4, pp. 255-312]. Therefore, we shall need a reliable method capable of solving optimal control problems with boundary conditions numerically.
In the preceding works [10, 11, 12], an iterative solution algorithm capable of handling optimal control problems with boundary conditions was proposed and thoroughly justified. Here we do not give all the details of this algorithm, since we are dealing with a differential game and use that algorithm as a single step within the framework of another method, designed to find an equilibrium control strategy.
Thus, the search for an equilibrium situation in the differential game is reduced to a sequence of optimal control problems, each of which is to be solved by means of the iterative method given in [10, 11, 12]. The problem of maintaining a high level of output while trying to minimize the flow of expendable supplies, posed in [4, pp. 263-268], can serve as a classical example illustrating the practical relevance of the dynamic game considered in the present paper from a rigorous mathematical standpoint.
The paper is organized as follows. Section 2 gives the formulation of the differential game along with essential definitions and assumptions related to the problem entries. Section 3 presents the theoretical background needed to construct and justify a method capable of finding an equilibrium control strategy. The method itself is described in Section 4. As a necessary condition for optimality we employ the maximum principle of L. S. Pontryagin adapted to control problems with boundary conditions [11], which can also play the role of a sufficient condition under some additional assumptions. Such a particular case is considered in Section 5 in order to demonstrate the potential of the proposed method.
2. Problem Formulation and Preliminaries
We shall focus on a finite-time differential game of $m$ players with variable sum. The duration of the game is defined by the time interval $T = [t_0, t_1]$. Each participant $i$ of the game chooses his control $u_i = u_i(t)$, $u_i(t) \in \mathbb{R}^{r_i}$, $i = 1, 2, \ldots, m$, out of the class of measurable functions, restricted by the direct constraint
$$u_i(t) \in U_i, \quad t \in T, \qquad (2.1)$$
where $U_i \subset \mathbb{R}^{r_i}$ are specified compact sets. Thus, the situation in the game is characterized by the collection of admissible controls
$$u(t) = (u_1(t), u_2(t), \ldots, u_m(t)), \quad u(t) \in U,$$
$$U = U_1 \times U_2 \times \cdots \times U_m \subset \mathbb{R}^r, \qquad r = \sum_{i=1}^{m} r_i.$$
For each situation, the state vector $x = x(t, u)$, $x(t, u) \in \mathbb{R}^n$, is determined by the following boundary value problem:
$$\dot{x} = A(t)\, x + b(u, t), \qquad (2.2)$$
$$L_0\, x(t_0) + L_1\, x(t_1) - g = 0, \qquad \operatorname{rank}[L_0\ L_1] = n.$$
Here $A(t)$ is a time-dependent matrix and $L_0$, $L_1$ are specified numerical $(n \times n)$ matrices, while $b(u, t)$ is a prescribed $(n \times 1)$ vector function and $g$ is a constant numerical vector from $\mathbb{R}^n$.
Remark 1. It has been demonstrated in [11] that if
$$\det\,[L_0 + L_1 X(t_1)] \neq 0, \qquad (2.3)$$
where the fundamental matrix $X = X(t)$ of solutions of the homogeneous system (2.2) satisfies the matrix equation
$$\dot{X} = A(t)\, X, \qquad X(t_0) = I, \quad I \text{ being the identity matrix},$$
then every set of admissible controls generates a unique solution $x = x(t, u)$ of the boundary value problem (2.2).
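To make Remark 1 concrete, here is a minimal numerical sketch of the check (2.3), assuming scipy is available: it integrates the matrix equation above and tests whether $\det[L_0 + L_1 X(t_1)]$ is nonzero. The entries A, L0, L1 below are illustrative placeholders rather than data from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def bvp_is_solvable(A, L0, L1, t0, t1, tol=1e-10):
    """Check condition (2.3): det[L0 + L1 X(t1)] != 0, where
    X(t) solves dX/dt = A(t) X with X(t0) = I."""
    n = L0.shape[0]

    def rhs(t, X_flat):
        X = X_flat.reshape(n, n)
        return (A(t) @ X).ravel()

    sol = solve_ivp(rhs, (t0, t1), np.eye(n).ravel(), rtol=1e-10, atol=1e-12)
    X_t1 = sol.y[:, -1].reshape(n, n)
    return abs(np.linalg.det(L0 + L1 @ X_t1)) > tol

# Illustrative entries (placeholders): A(t) = [[0, 1], [1, 0]] with
# boundary conditions fixing x1(t0) and x2(t1).
A = lambda t: np.array([[0.0, 1.0], [1.0, 0.0]])
L0 = np.array([[1.0, 0.0], [0.0, 0.0]])
L1 = np.array([[0.0, 0.0], [0.0, 1.0]])
print(bvp_is_solvable(A, L0, L1, 0.0, 1.0))  # True
```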
The gain of each player $i$, $i = 1, 2, \ldots, m$, is determined by the cost functional
$$J_i(u) = \varphi_i(x(t_0, u),\, x(t_1, u)), \quad i = 1, 2, \ldots, m, \qquad (2.4)$$
where all scalar functions $\varphi_i$, $i = 1, 2, \ldots, m$, are continuous in $x(t_0, u)$, $x(t_1, u)$ together with their partial derivatives.
Every participant of the game intends to maximize his payoff (2.4) by choosing his admissible control $u_i$ out of the set of available controls $U_i$.
By the end (or solution) of the game we mean a set of admissible controls $u^* = u^*(t)$ and the corresponding state vector $x^* = x(t, u^*)$ for which
$$J_i(u^*) = \max_{u_i \in U_i} J_i(u_1^*, \ldots, u_{i-1}^*, u_i, u_{i+1}^*, \ldots, u_m^*), \quad i = 1, 2, \ldots, m. \qquad (2.5)$$
Now we can draw a close analogy with the finite-dimensional case considered in [2]. Namely, we shall refer to the set of controls $u^* = u^*(t)$ satisfying (2.5) as an equilibrium strategy or equilibrium control. It should also be noted that here we consider the most general situation of a game with variable sum:
$$J(u) = \sum_{i=1}^{m} J_i(u) \neq \text{const}, \quad u \in U. \qquad (2.6)$$
3. Method of Solution
Preliminarily, in the control set $u = (u_1, \ldots, u_m)$ we formally select the control function $u_i$ which corresponds to the payoff functional $J_i(u)$ (i.e., carries the same index $i$) and introduce the following notation:
$$u = (u_i; v_i), \qquad v_i = (u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_m).$$
Now we shall pose a series of auxiliary subproblems whose solutions will be used later on.
The first one is to find an admissible control $\bar{u}_i = \bar{u}_i(t, v_i)$ which maximizes the corresponding cost $J_i(u) = J_i(u_i; v_i)$ with respect to $u_i$ for a fixed admissible $v_i$ on the solutions $x_i(t) = x(t, (u_i; v_i))$ of the boundary value problem (2.2):
$$J_i(\bar{u}_i(v_i); v_i) = \max_{u_i \in U_i} J_i(u_i; v_i), \quad i = 1, 2, \ldots, m. \qquad (3.1)$$
On the grounds of the previous works [10, 11] we can assert that the optimal control $\bar{u}_i = \bar{u}_i(t, v_i)$ in (3.1) fulfills the maximum condition
$$\langle \bar{\psi}_i(t),\, b((\bar{u}_i; v_i), t) \rangle = \max_{u_i \in U_i} \langle \bar{\psi}_i(t),\, b((u_i; v_i), t) \rangle, \quad t \in T, \quad i = 1, 2, \ldots, m, \qquad (3.2)$$
on the profiles $\bar{\psi}_i(t) = \psi(t, (\bar{u}_i; v_i))$, $i = 1, 2, \ldots, m$, of the conjugate boundary value problem
$$\dot{\psi} = -A(t)'\, \psi, \qquad (3.3)$$
$$B_0\, \psi(t_0) - B_1\, \psi(t_1) + B_0\, \frac{\partial \varphi_i(x_i(t_0), x_i(t_1))}{\partial x_i(t_0)} + B_1\, \frac{\partial \varphi_i(x_i(t_0), x_i(t_1))}{\partial x_i(t_1)} = 0, \qquad (3.4)$$
$$B_0\, L_0' + B_1\, L_1' = 0, \qquad \operatorname{rank}[B_0\ B_1] = n. \qquad (3.5)$$
Here $\langle \cdot\,, \cdot \rangle$ stands for the inner product, the prime $'$ denotes matrix transposition, and the numerical matrices $B_0$, $B_1$, which define the boundary conditions (3.4), are chosen arbitrarily so as to satisfy (3.5). It has been demonstrated in [11] that the conjugate BVP (3.3)-(3.5) always has a unique solution if the direct BVP (2.2) does.
Additionally, if all $\varphi_i(x(t_0), x(t_1))$ are concave in $x(t_0)$, $x(t_1)$, then the maximum principle becomes both a necessary and a sufficient condition for optimality. In that case one can find the globally optimal control $\bar{u}_i = \bar{u}_i(t, v_i)$ using the numerical technique worked out in [11]. Otherwise, we shall suppose that $\bar{u}_i$ conveys the global maximum to $J_i$ with respect to $u_i$.
Remark 2. Using the formulation (3.1), one can determine the equilibrium controls (2.5) as solutions of the system of operator equations
$$u_i(t) = \bar{u}_i(t, v_i), \quad i = 1, 2, \ldots, m, \qquad (3.6)$$
where the equality is to hold almost everywhere on $T$.
The spurious simplicity of this approach rarely yields desirable results, even in the finite-dimensional situation [2, 5]. Therefore, we suggest another method of equilibrium control search.
Under the assumption of resolvability of the auxiliary subproblems (3.1), let us form two auxiliary functionals:
$$\bar{J}(u) = \sum_{i=1}^{m} J_i(\bar{u}_i(v_i); v_i), \qquad (3.7)$$
$$\Phi(u) = \bar{J}(u) - J(u) = \sum_{i=1}^{m} \left[ J_i(\bar{u}_i(v_i); v_i) - J_i(u_i; v_i) \right], \qquad (3.8)$$
where each summand in (3.8) is nonnegative due to Eq. (3.1).
Theorem 1. $\Phi(u) \geq 0$ for all $u \in U$, and $\Phi(u^*) = 0$ if and only if $u^*$ is a set of equilibrium strategies.
The result of Theorem 1 immediately follows from the definition (2.5), the auxiliary problem (3.1), Remark 1, and the form of the functionals (2.6), (3.7), (3.8).
Thus, the search for equilibrium controls is reduced to a problem of optimal control:
$$u^*: \quad \Phi(u^*) = \min_{u \in U} \Phi(u) = 0, \qquad (3.9)$$
where the objective functional $\Phi(u)$ is defined on the profile of the original BVP (2.2) as well as on the profiles $\bar{x}_i = x(t, (\bar{u}_i(v_i); v_i))$, $i = 1, 2, \ldots, m$, of the boundary value problems
$$\dot{\bar{x}} = A(t)\, \bar{x} + b((\bar{u}_i(v_i); v_i), t), \qquad (3.10)$$
$$L_0\, \bar{x}(t_0) + L_1\, \bar{x}(t_1) - g = 0. \qquad (3.11)$$
Unfortunately, the solution technique used for (3.1) cannot simply be transferred to the latter problem (3.9)-(3.11). The reason is concealed in the behavior of the corresponding conjugate variable $\psi$ (see Eqs. (3.3), (3.4)). Indeed, the solutions $\bar{\psi}_i$ of the conjugate BVPs (3.3)-(3.4) depend continuously on $\bar{x}_i$, whereas admissible controls $\bar{u}_i$ satisfying the maximum condition (3.2) might be discontinuous with respect to $\bar{\psi}_i$. As a result, the system (3.10) could turn into a nonlinear one with a discontinuous right-hand side.
In order to apply the same technique that was used to solve problems of the type (3.1), we shall introduce the approximate functional
$$\Phi_u(z) = \bar{J}_u(z) - J(z), \quad z = z(t), \quad z \in U, \qquad (3.12)$$
where
$$\bar{J}_u(z) = \sum_{i=1}^{m} J_i(\bar{u}_i; p_i) = J_1(\bar{u}_1, z_2, \ldots, z_m) + \cdots + J_m(z_1, \ldots, z_{m-1}, \bar{u}_m), \qquad (3.13)$$
$$\bar{u}_i = \bar{u}_i(t, v_i), \qquad p_i = (z_1, \ldots, z_{i-1}, z_{i+1}, \ldots, z_m), \quad i = 1, 2, \ldots, m.$$
This allows us to formulate the following result.
Lemma 1. The functional $\Phi_u(z)$, $z \in U$, defined on the profiles of the systems (2.2), (3.10) for fixed $\bar{u}_i$, $i = 1, 2, \ldots, m$, satisfies the same conditions as each cost functional $J_i(u)$, $i = 1, 2, \ldots, m$, on the profile of (2.2). Moreover,
$$\Phi_u(u) = \Phi(u), \quad u \in U; \qquad \Phi_u(z) \leq \Phi(z), \quad z \in U, \ u \in U.$$
Both statements of Lemma 1 are obvious. It is easy to see that
$$\Phi_u(z) = \sum_{i=1}^{m} J_i(\bar{u}_i; p_i) - J(z) \leq \max_{z_1 \in U_1} J_1(z_1; p_1) + \cdots + \max_{z_m \in U_m} J_m(z_m; p_m) - J(z) = \bar{J}(z) - J(z) = \Phi(z).$$
Thus, we see that the functional $\Phi_u(z)$, $z \in U$, approximates the functional $\Phi(z)$, $z \in U$, from below, and that their values coincide for $z = u$. All of the above results in the following statement.
Proposition 3.1. If the functional set
$$U^* = \{ u = u(t) : u(t) \in U,\ t \in T,\ \Phi(u) = 0 \}$$
is not empty and, for each vector function $u = u(t)$, $u(t) \in U$, $t \in T$, the corresponding set
$$Z^*(u) = \{ z = z(t) : z(t) \in U,\ t \in T,\ \Phi_u(z) = 0 \}$$
is also nonempty, then $U^* \subset Z^*(u)$.
Proposition 3.1 affirms that if there is any solution $u^*$ of (3.9), then $u^* \in Z^*(u^*)$, i.e., $\Phi_{u^*}(u^*) = 0$. However, at the moment, we cannot guarantee the existence of such a solution.
4. Iterative Algorithm
Now we have formed sufficient ground to set forth the iterative numerical algorithm by means of which one can determine an equilibrium situation in the differential game with variable sum. Below we outline the general scheme of this algorithm.
STAGE 1. Given some arbitrary initial approximation
$$u^k(t) \in U, \quad k = 0, 1, 2, \ldots, \qquad u^k = (u_i^k; v_i^k), \quad i = 1, 2, \ldots, m,$$
we calculate the corresponding state vector $x^k = x(t, u^k)$ by integrating the linear BVP (2.2). Then we solve a series of optimal control problems (3.1), determine the admissible controls $\bar{u}_i^k = \bar{u}_i(t, v_i^k)$ and their corresponding state profiles
$$\bar{x}_{ik} = x(t, (\bar{u}_i^k; v_i^k)), \quad i = 1, 2, \ldots, m,$$
according to (3.10), (3.11).
STAGE 2. Calculate the value of the functional $\Phi(u^k)$. IF $\Phi(u^k) > 0$ THEN go to STAGE 3.
IF $\Phi(u^k) = 0$ THEN $u^k = u^*$ is an optimal solution (i.e., an equilibrium strategy), and the solution process is terminated. STOP.
STAGE 3. By formulae (3.12), (3.13) with $u = u^k$, $z = u$ we form the approximate functional
$$\Phi_k(u) = \Phi_{u^k}(u), \qquad \Phi_k(u^k) = \Phi(u^k) > 0.$$
It should be noted that the functional
$$\Phi_k(u) = \sum_{i=1}^{m} \left[ \varphi_i(\bar{x}_{ik}(t_0), \bar{x}_{ik}(t_1)) - \varphi_i(x(t_0), x(t_1)) \right] \qquad (4.1)$$
is defined on the profile $x = x(t, u)$ of the BVP (2.2) as well as on the profiles $\bar{x}_{ik} = x(t, (\bar{u}_i^k; v_i))$ of the BVPs
$$\dot{\bar{x}} = A(t)\, \bar{x} + b((\bar{u}_i^k; v_i), t), \qquad (4.2)$$
$$L_0\, \bar{x}(t_0) + L_1\, \bar{x}(t_1) - g = 0, \quad i = 1, 2, \ldots, m.$$
Then, according to (3.9), the problem of optimal control
$$u^{k+1}: \quad \Phi_k(u^{k+1}) = \min_{u \in U} \Phi_k(u) \qquad (4.3)$$
is to be solved.
is to be solved. Unlike (3.9)-(3.11), the problem (4.1)-(4.3) does not conceal uncertainties related to behavior of gonjugate variable and, on the contrary, is very much alike the problem (3.1). To solve the latter, one can use iterative procedure described in [10, 11]. However, it should be mentioned that here we are seeking not the minimum of (4.1) itself but its zero value:
$$u^{k+1}: \quad \Phi_k(u^{k+1}) = 0. \qquad (4.4)$$
STAGE 4. Update the counter $k := k + 1$ (so that the control $u^{k+1}$ found at Stage 3 becomes the current approximation $u^k$) and return to STAGE 1.
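The scheme above translates directly into code. The sketch below is a schematic, not definitive, rendering of Stages 1-4: the problem-specific solvers best_response (for the auxiliary problems (3.1)) and minimize_phi_k (for (4.3), (4.4)) are hypothetical user-supplied callables, and a tolerance replaces the exact test $\Phi(u^k) = 0$.

```python
from typing import Callable, Sequence

def equilibrium_search(
    u0: Sequence,                 # initial admissible controls (u_1, ..., u_m)
    best_response: Callable,      # solves (3.1): (i, u) -> u_bar_i for fixed v_i
    phi: Callable,                # evaluates Phi(u) by (3.8) on the BVP profiles
    minimize_phi_k: Callable,     # solves (4.3)-(4.4) for frozen responses u_bar
    tol: float = 1e-6,
    max_iter: int = 100,
):
    """Schematic rendering of Stages 1-4 of the iterative algorithm."""
    u = list(u0)
    for _ in range(max_iter):
        # STAGE 1: solve the m auxiliary problems (3.1) at the current iterate.
        u_bar = [best_response(i, u) for i in range(len(u))]
        # STAGE 2: optimality test Phi(u^k) = 0 (Theorem 1), up to a tolerance.
        if phi(u, u_bar) <= tol:
            return u              # u^k is an (approximate) equilibrium strategy
        # STAGE 3: drive the approximate functional Phi_k down to zero, (4.4).
        u_next = minimize_phi_k(u, u_bar)
        # STAGE 4: the new iterate becomes the current approximation.
        u = u_next
    raise RuntimeError("equilibrium not reached within max_iter iterations")
```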
In some particular cases the algorithm described above may provide an analytical rather than numerical solution of the original problem. To illustrate this, let us analyze a simplified version of the problem (2.1)-(2.5). First, we restrict the number of players to two:
$$u = (u_1, u_2), \qquad |u_1(t)| \leq 1, \quad |u_2(t)| \leq 1, \quad t \in T = [t_0, t_1].$$
Second, assume that the state of the game is described by a system of differential equations linear in both $x \in \mathbb{R}^2$ and $u \in \mathbb{R}^2$:
$$\dot{x} = A(t)\, x + D_1(t)\, u_1(t) + D_2(t)\, u_2(t) + b(t),$$
$$L_0\, x(t_0) + L_1\, x(t_1) - g = 0.$$
Third, suppose that both cost functionals are also linear:
$$J_i(u) = \langle c_i, x(t_0, u) \rangle + \langle d_i, x(t_1, u) \rangle, \quad i = 1, 2,$$
where $c_i$, $d_i$, $i = 1, 2$, are specified numerical vectors from $\mathbb{R}^2$. In order to solve the problems (3.1), we shall use the maximum principle (3.2)-(3.5), which in this case is a necessary and sufficient condition for globally optimal control. By virtue of the maximum condition (3.2) we have
$$\bar{u}_1(t, u_2) = \operatorname{sign}\{ D_1(t)'\, \bar{\psi}_1(t) \}, \qquad \bar{u}_2(t, u_1) = \operatorname{sign}\{ D_2(t)'\, \bar{\psi}_2(t) \}, \qquad (4.5)$$
where $\bar{\psi}_1(t)$, $\bar{\psi}_2(t)$ are solutions of the conjugate problems (3.3)-(3.5). Note that here $\bar{\psi}_i(t)$, $i = 1, 2$, depend only upon the specified entries of the original problem. Therefore
$$\bar{u}_1(t, u_2) = \bar{u}_1(t), \qquad \bar{u}_2(t, u_1) = \bar{u}_2(t).$$
Now, according to formulae (3.7), (3.8), we form the nonnegative functional $\Phi(u)$:
$$\Phi(u) = \langle c_1, \bar{x}(t_0, \bar{u}_1, u_2) - x(t_0, u) \rangle + \langle d_1, \bar{x}(t_1, \bar{u}_1, u_2) - x(t_1, u) \rangle + \langle c_2, \bar{x}(t_0, u_1, \bar{u}_2) - x(t_0, u) \rangle + \langle d_2, \bar{x}(t_1, u_1, \bar{u}_2) - x(t_1, u) \rangle$$
$$= \langle c_1, \delta x_1(t_0) \rangle + \langle d_1, \delta x_1(t_1) \rangle + \langle c_2, \delta x_2(t_0) \rangle + \langle d_2, \delta x_2(t_1) \rangle,$$
where by $\delta x_i$, $i = 1, 2$, we have denoted the profiles of the following BVPs:
$$\delta\dot{x}_i = A(t)\, \delta x_i + D_i(t)\, [\bar{u}_i(t) - u_i(t)], \quad i = 1, 2,$$
$$L_0\, \delta x_i(t_0) + L_1\, \delta x_i(t_1) = 0, \qquad \operatorname{rank}[L_0\ L_1] = n.$$
It is obvious that $u^*$ with $\Phi(u^*) = 0$ is attained when
$$u_1^*(t) = \bar{u}_1(t), \qquad u_2^*(t) = \bar{u}_2(t).$$
Thus, in this particular case, the equilibrium controls are immediately calculated by formulae (4.5).
5. Numerical Example
Consider a differential game of two players $u = (u_1, u_2)$ with finite duration $T = [0, 1]$. Each player may choose his strategy subject to the direct constraint $|u_i(t)| \leq 1$, $i = 1, 2$, for $t \in [0, 1]$. The state of the game is governed by the two-dimensional system of ordinary differential equations
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = x_1(t) + u_1(t) - 2u_2(t), \qquad (5.1)$$
with the boundary conditions
$$x_1(0) = 1, \qquad x_2(1) = 0. \qquad (5.2)$$
The objective functionals
$$J_1(u_1, u_2) = x_2(0) \to \max, \qquad J_2(u_1, u_2) = -x_1^2(1) \to \max \qquad (5.3)$$
express the payoffs of the two players, respectively. Our task is to find a set of equilibrium strategies $u^* = (u_1^*, u_2^*)$ which satisfies the two conditions of the form (2.5) simultaneously:
$$J_1(u^*) = \max_{|u_1(t)| \leq 1} J_1(u_1, u_2^*), \qquad J_2(u^*) = \max_{|u_2(t)| \leq 1} J_2(u_1^*, u_2).$$
Before attacking the problem posed above, we ought to make sure that the BVP (5.1), (5.2) is resolvable for any admissible entry $u$. To do so, we go back to Remark 1 and check condition (2.3). In this particular case we have
$$L_0 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad L_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad X(t) = \frac{1}{2} \begin{pmatrix} e^t + e^{-t} & e^t - e^{-t} \\ e^t - e^{-t} & e^t + e^{-t} \end{pmatrix}$$
and
$$L_0 + L_1 X(1) = \begin{pmatrix} 1 & 0 \\ \dfrac{e^2 - 1}{2e} & \dfrac{e^2 + 1}{2e} \end{pmatrix}.$$
It is obvious that condition (2.3) holds, since
$$\det\,[L_0 + L_1 X(1)] = \frac{e^2 + 1}{2e} > 0.$$
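This computation is easy to verify numerically (a quick sanity check using the closed-form fundamental matrix, which in this example is built from hyperbolic functions):

```python
import numpy as np

# Fundamental matrix X(1) of the homogeneous system x1' = x2, x2' = x1.
X1 = np.array([[np.cosh(1.0), np.sinh(1.0)],
               [np.sinh(1.0), np.cosh(1.0)]])
L0 = np.array([[1.0, 0.0], [0.0, 0.0]])
L1 = np.array([[0.0, 0.0], [0.0, 1.0]])
# det = cosh(1) = (e^2 + 1)/(2e) ~ 1.5431 > 0, so condition (2.3) holds.
print(np.linalg.det(L0 + L1 @ X1))
```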
According to (3.3), the corresponding adjoint system is
$$\dot{\psi}_1(t) = -\psi_2(t), \qquad \dot{\psi}_2(t) = -\psi_1(t),$$
whose general solution can be written as
$$\psi_1(t) = c_1 e^t + c_2 e^{-t}, \qquad \psi_2(t) = -c_1 e^t + c_2 e^{-t}.$$
The boundary conditions (3.4) for the two payoff functionals take the following form:
$$\text{for } J_1: \quad \psi_1(1) = 0, \quad \psi_2(0) = -1; \qquad \text{for } J_2: \quad \psi_1(1) = -2x_1(1), \quad \psi_2(0) = 0. \qquad (5.4)$$
Here we have chosen the matrices $B_0$, $B_1$ as
$$B_0 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad B_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$
so as to fulfill condition (3.5).
Since the problem (5.1)-(5.3) is linear-quadratic and $J_1$, $J_2$ are concave, the maximum principle plays the role of both a necessary and a sufficient condition for global optimality while we resolve the two auxiliary problems of the maximum principle (3.1), (3.2):
$$\text{find } \bar{u}_1(t, u_2): \quad J_1(\bar{u}_1, u_2) = \max_{|u_1| \leq 1} J_1(u_1, u_2), \qquad (5.5)$$
$$\text{find } \bar{u}_2(t, u_1): \quad J_2(u_1, \bar{u}_2) = \max_{|u_2| \leq 1} J_2(u_1, u_2). \qquad (5.6)$$
The auxiliary problem (5.5) can be solved analytically using Eq. (4.5):
$$\bar{u}_1(t, u_2) = -1.$$
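For the reader's convenience, the short calculation behind this value, left implicit in the paper, is as follows. Imposing the boundary conditions (5.4) for $J_1$ on the general solution of the adjoint system gives
$$c_1 e + c_2 e^{-1} = 0, \qquad -c_1 + c_2 = -1, \qquad \text{whence} \quad c_1 = \frac{1}{1 + e^2}, \quad c_2 = -\frac{e^2}{1 + e^2},$$
so that
$$\psi_2(t) = -\frac{e^t + e^{2 - t}}{1 + e^2} < 0 \quad \text{for } t \in [0, 1],$$
and, since $D_1 = (0, 1)'$ is the coefficient of $u_1$ in the right-hand side of (5.1), formula (4.5) yields $\bar{u}_1(t, u_2) = \operatorname{sign} \psi_2(t) = -1$.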
In order to solve the second problem (5.6), we shall employ the iterative process of the maximum principle [6, 7]. Before doing so, let us examine problem (5.6) by means of formal analysis.
Remark 3. Analyzing Eq. (5.6), we should note that the corresponding conjugate variable changes its sign together with $x_1(1)$, so that $\bar{u}_2(t, u_1)$ is proportional to $-\operatorname{sign} x_1(1)$ (see Eq. (5.4)). Since the state variable $x_1(1)$ depends upon $u_1$ by means of the system (5.1), we may affirm that $\bar{u}_2(t, u_1)$ is piecewise constant within $[0, 1]$, i.e., subject to some point-wise constraint; thus we can write symbolically
$$\bar{u}_2(t, u_1) \in \{ a_i(u_1) \},$$
where each $a_i(u_1)$ is some constant depending on $u_1$.
It is clear that the equilibrium control $u^* = (u_1^*, u_2^*)$ must be a solution of the optimal control problem (3.9), which can be represented as
$$0 \leq \Phi(u) = \bar{x}_2(0, \bar{u}_1(u_2), u_2) - x_2(0, u_1, u_2) - \bar{x}_1^2(1, u_1, \bar{u}_2(u_1)) + x_1^2(1, u_1, u_2) \to \min, \qquad (5.7)$$
where the functional $\Phi(u)$ is defined on the profiles of a boundary value problem consisting of 6 differential equations:
$$x_i(t, u_1, u_2),\ i = 1, 2: \qquad \dot{x}_1 = x_2, \quad \dot{x}_2 = x_1 + u_1 - 2u_2, \qquad x_1(0) = 1, \quad x_2(1) = 0;$$
$$\bar{x}_i(t, \bar{u}_1, u_2),\ i = 1, 2: \qquad \dot{\bar{x}}_1 = \bar{x}_2, \quad \dot{\bar{x}}_2 = \bar{x}_1 - 1 - 2u_2, \qquad \bar{x}_1(0) = 1, \quad \bar{x}_2(1) = 0;$$
$$\bar{x}_i(t, u_1, \bar{u}_2),\ i = 1, 2: \qquad \dot{\bar{x}}_1 = \bar{x}_2, \quad \dot{\bar{x}}_2 = \bar{x}_1 + u_1 + 2\,\operatorname{sign} \bar{x}_1(1), \qquad \bar{x}_1(0) = 1, \quad \bar{x}_2(1) = 0. \qquad (5.8)$$
Now we can clearly see the principal complexity of the problem (5.7): the discontinuity of the right-hand side with respect to the state variable at $t = 1$ (in this particular example). Generally speaking, we would have to apply an iterative process in order to integrate this problem. However, there is another way to determine the state trajectory $x(t, u)$.
Along with the solvability condition (see Remark 1), an analogue of the Cauchy formula was derived in [11] for the representation of the state $x(t, u_1, u_2)$ of a BVP similar to (5.1). According to that formula, we can write
$$x(t, u) = X(t) \left\{ \int_0^t X^{-1}(\tau)\, b(u, \tau)\, d\tau + [L_0 + L_1 X(1)]^{-1} \left( g - L_1 X(1) \int_0^1 X^{-1}(\tau)\, b(u, \tau)\, d\tau \right) \right\}.$$
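As an illustration, this representation can be evaluated numerically. The sketch below is a minimal implementation for the present example, assuming scipy is available; the helper state is a hypothetical name, and the integrals are computed by quadrature.

```python
import numpy as np
from scipy.integrate import quad

# Closed-form fundamental matrix of (5.1) and its inverse.
X = lambda t: np.array([[np.cosh(t), np.sinh(t)], [np.sinh(t), np.cosh(t)]])
Xinv = lambda t: np.array([[np.cosh(t), -np.sinh(t)], [-np.sinh(t), np.cosh(t)]])

L0 = np.array([[1.0, 0.0], [0.0, 0.0]])
L1 = np.array([[0.0, 0.0], [0.0, 1.0]])
g = np.array([1.0, 0.0])

def state(t, u1, u2):
    """Evaluate x(t, u) by the Cauchy-type formula; u1, u2 are callables."""
    b2 = lambda tau: u1(tau) - 2.0 * u2(tau)       # b(u,t) = (0, u1 - 2 u2)'
    def integral(upper):                           # componentwise quadrature
        f = lambda tau, i: (Xinv(tau) @ np.array([0.0, b2(tau)]))[i]
        return np.array([quad(f, 0.0, upper, args=(i,))[0] for i in range(2)])
    x0 = np.linalg.solve(L0 + L1 @ X(1.0), g - L1 @ X(1.0) @ integral(1.0))
    return X(t) @ (x0 + integral(t))

# Sanity check against (5.9), (5.10): for u = (0, 0),
# x2(0) ~ -0.76 and x1(1) ~ 0.65.
zero = lambda t: 0.0
print(state(0.0, zero, zero)[1], state(1.0, zero, zero)[0])
```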
In our particular case we have
$$X^{-1}(t) = \frac{1}{2} \begin{pmatrix} e^t + e^{-t} & -(e^t - e^{-t}) \\ -(e^t - e^{-t}) & e^t + e^{-t} \end{pmatrix}, \qquad [L_0 + L_1 X(1)]^{-1} = \begin{pmatrix} 1 & 0 \\ -\dfrac{e^2 - 1}{e^2 + 1} & \dfrac{2e}{e^2 + 1} \end{pmatrix}$$
and $b(u, t) = (0,\; u_1(t) - 2u_2(t))'$. Here we do not need the entire profile of (5.1); our interest is restricted to the end-points of (5.1):
$$x_2(0, u_1, u_2) = -\frac{e^2 - 1}{e^2 + 1} + \frac{e^2 - 1}{2(e^2 + 1)} \int_0^1 \left( e^{\tau} - e^{-\tau} \right) \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau - \frac{1}{2} \int_0^1 \left( e^{\tau} + e^{-\tau} \right) \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau;$$
$$x_1(1, u_1, u_2) = \frac{2e}{e^2 + 1} + \left[ \frac{(e^2 - 1)^2}{4e(e^2 + 1)} - \frac{e^2 + 1}{4e} \right] \int_0^1 e^{\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau + \left[ \frac{e^2 + 1}{4e} - \frac{(e^2 - 1)^2}{4e(e^2 + 1)} \right] \int_0^1 e^{-\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau.$$
If we carry out the calculations to two decimal places, we obtain
$$x_2(0, u_1, u_2) = -0.76 - 0.12 \int_0^1 e^{\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau - 0.88 \int_0^1 e^{-\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau;$$
$$x_1(1, u_1, u_2) = 0.65 - 0.32 \left\{ \int_0^1 e^{\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau - \int_0^1 e^{-\tau} \left[ u_1(\tau) - 2u_2(\tau) \right] d\tau \right\}.$$
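The rounded coefficients follow from the exact expression for $x_2(0, u_1, u_2)$ above: collecting the factors of the two integrals gives
$$\frac{e^2 - 1}{2(e^2 + 1)} - \frac{1}{2} = -\frac{1}{e^2 + 1} \approx -0.12, \qquad -\frac{e^2 - 1}{2(e^2 + 1)} - \frac{1}{2} = -\frac{e^2}{e^2 + 1} \approx -0.88.$$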
By introducing the convenient notation
$$b_2(u, t) = u_1(t) - 2u_2(t), \qquad n_1(t) = 0.12\,e^t + 0.88\,e^{-t}, \qquad n_2(t) = e^t - e^{-t},$$
and calculating
$$\int_0^1 n_1(\tau)\, d\tau = 0.76, \qquad \int_0^1 n_2(\tau)\, d\tau = 1.09,$$
which we shall need in the further computations for piecewise constant controls, we can finally represent the missing end-points of (5.1) as
$$x_2(0, u_1, u_2) = -0.76 - \int_0^1 n_1(\tau)\, b_2(u, \tau)\, d\tau, \qquad (5.9)$$
$$x_1(1, u_1, u_2) = 0.65 - 0.32 \int_0^1 n_2(\tau)\, b_2(u, \tau)\, d\tau. \qquad (5.10)$$
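Both integrals can be checked by direct evaluation:
$$\int_0^1 n_1(\tau)\, d\tau = 0.12\,(e - 1) + 0.88\,(1 - e^{-1}) \approx 0.76, \qquad \int_0^1 n_2(\tau)\, d\tau = e + e^{-1} - 2 \approx 1.09.$$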
Taking into account the fact that the optimal controls in (5.5), (5.6) are piecewise constant, we obtain from the representations (5.9), (5.10) that
$$\bar{u}_1(t, u_2) = -1, \quad t \in [0, 1],$$
$$\bar{u}_2(t, u_1) = \begin{cases} -1 & \text{if } u_1 = -1, \\ -0.93 & \text{if } u_1 = 0, \\ -0.68 & \text{if } u_1 = 0.5, \\ -0.43 & \text{if } u_1 = 1. \end{cases}$$
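These best-response values can be reproduced from (5.10): since $J_2 = -x_1^2(1)$, player 2 drives $x_1(1)$ to zero whenever the constraint $|u_2| \leq 1$ permits and saturates at the boundary otherwise. A minimal check with the rounded coefficients:

```python
import numpy as np

# Best response of player 2 from (5.10): choose a constant u2 in [-1, 1]
# that makes x1(1) = 0.65 - 0.32 * 1.09 * (u1 - 2*u2) as close to 0 as
# possible (J2 = -x1(1)^2 is maximized at x1(1) = 0).
def u2_bar(u1):
    b2_target = 0.65 / (0.32 * 1.09)      # value of b2 giving x1(1) = 0
    u2 = (u1 - b2_target) / 2.0            # from b2 = u1 - 2*u2
    return np.clip(u2, -1.0, 1.0)          # saturate at the constraint

for u1 in (-1.0, 0.0, 0.5, 1.0):
    print(u1, round(float(u2_bar(u1)), 2))  # -1.0, -0.93, -0.68, -0.43
```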
Now we shall illustrate how the problem (5.1)-(5.3) can be solved using the iterative algorithm described in Section 4. First, we choose some initial approximation, e.g., $u_1^0(t) = 0$, $u_2^0(t) = 0$, and solve the two auxiliary problems of optimal control (5.5), (5.6) using the successive approximation technique [11, 6, 7] based on the maximum principle (Stage 1 of the algorithm described in Section 4). This gives us the numerical solution $\bar{u}_1^0(t) = -1$, $\bar{u}_2^0(t) = -0.93$, for which we have (Stage 2)
$$\Phi(u^0) = 0.76 > 0$$
according to (3.7), (3.8). Then we pass to the next stage (Stage 3) and form the approximate functional
$$\Phi_0(u) = \bar{x}_2(0, \bar{u}_1^0, u_2) - x_2(0, u_1, u_2) - \bar{x}_1^2(1, u_1, \bar{u}_2^0) + x_1^2(1, u_1, u_2) \qquad (5.11)$$
by formulae (3.12), (3.13), (4.1), which coincides with $\Phi(u)$ at $u = u^0$:
$$\Phi_0(u^0) = \Phi(u^0) = 0.76 > 0.$$
Our task is to find $u^1 = (u_1^1, u_2^1)$ such that $\Phi_0(u^1) = 0$. This means that we have to minimize $\Phi_0(u)$ down to its zero value on the profiles of the $6 \times 6$ BVP (2.2), (4.2), which takes the form
$$x_i(t, u_1, u_2),\ i = 1, 2: \qquad \dot{x}_1 = x_2, \quad \dot{x}_2 = x_1 + u_1 - 2u_2, \qquad x_1(0) = 1, \quad x_2(1) = 0;$$
$$\bar{x}_i(t, \bar{u}_1^0, u_2),\ i = 1, 2: \qquad \dot{\bar{x}}_1 = \bar{x}_2, \quad \dot{\bar{x}}_2 = \bar{x}_1 - 1 - 2u_2, \qquad \bar{x}_1(0) = 1, \quad \bar{x}_2(1) = 0;$$
$$\bar{x}_i(t, u_1, \bar{u}_2^0),\ i = 1, 2: \qquad \dot{\bar{x}}_1 = \bar{x}_2, \quad \dot{\bar{x}}_2 = \bar{x}_1 + u_1 + 1.86, \qquad \bar{x}_1(0) = 1, \quad \bar{x}_2(1) = 0. \qquad (5.12)$$
Comparing (5.8) and (5.12), we can clearly see the role of $\Phi_0(u)$, which approximates $\Phi(u)$ from below. It is obvious that (5.12) is simpler than (5.8), since it does not depend internally on the sign of the state variable $\bar{x}_1$. It should also be noted that the system (5.12) consists of three subsystems whose profiles are different.
It should be emphasized that here we are not seeking the global minimum of the approximate functional $\Phi_0(u)$; we are only interested in descending to its zero level. In order to solve the problem of optimal control (4.3), (4.4), we have applied the iterative procedure worked out on the basis of the earlier works [10, 11, 12]. As a result, we have obtained $u_1^1(t) = -1$, $u_2^1(t) = -0.93$, which fulfills $\Phi_0(u^1) = 0$ (Stage 4).
Now we return to Stage 1 and make use of the form of $\Phi_0(u)$, taking into account the representations (5.9), (5.10), since we only need the end-points rather than the entire profiles of the state vector $x$. Having solved the two auxiliary problems (5.5), (5.6) at $u^1$, we arrive at
$$\bar{u}_1^1(t) = -1, \qquad \bar{u}_2^1(t) = -1$$
and calculate $\Phi(u^1) = 0.03 > 0$. Then we form
$$\Phi_1(u) = \bar{x}_2(0, \bar{u}_1^1, u_2) - x_2(0, u_1, u_2) - \bar{x}_1^2(1, u_1, \bar{u}_2^1) + x_1^2(1, u_1, u_2)$$
and solve the problem of optimal control (4.3)
$$u^2: \quad \Phi_1(u^2) = \min_{|u_i(t)| \leq 1,\ i = 1, 2} \Phi_1(u),$$
which results in
$$u_1^2(t) = -1, \qquad u_2^2(t) = -1, \qquad \Phi(u^2) = 0.$$
Thus, we have found the equilibrium control strategy $u^* = (u_1^*, u_2^*)$, where
$$u_1^*(t) = -1, \qquad u_2^*(t) = -1,$$
and for which
$$J_1(u^*) = 0, \qquad J_2(u^*) = -0.1.$$
Note that we have solved a nonzero-sum differential game (2.6), where $J(u) = J_1(u) + J_2(u)$ and $J(u^*) = -0.1 \neq 0$.
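As a quick cross-check against the representations (5.9), (5.10): for $u^* = (-1, -1)$ we have $b_2(u^*, t) \equiv -1 + 2 = 1$, hence $x_1(1, u^*) = 0.65 - 0.32 \cdot 1.09 \approx 0.30$ and $J_2(u^*) = -x_1^2(1, u^*) \approx -0.1$, in agreement with the value reported above.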
6. Conclusion
In this paper, the search for an equilibrium control strategy in a finite-time nonzero-sum differential game has been carried out by reducing the game to a series of optimal control problems and then applying iterative processes of the maximum principle. By formulation, the game process has been subordinated to a system of ODE with boundary conditions. The situation when, by the end of the game, some players are supposed to arrive at prescribed states and, at the same time, select their strategies so as to maintain equilibrium is inherent in many practical models. On the other hand, boundary conditions usually set up a barrier in the way of the evolution of solution techniques. A radical step to overcome that obstacle was taken by applying relevant algorithms of optimal control rather than customary methods of game theory. As a result, we have obtained an iterative algorithm for equilibrium control search, whose implementation was then demonstrated by means of a numerical example.
References
1. Basar, T. and G.J.Olsder, Dynamic Noncooperative Game Theory, SIAM, Philadelphia, PA (1998).
2. Belenkii, V., Volkonskii, V. and S.Ivanov, Iterative Methods in the Game Theory and Programming, Nauka, Moscow, USSR (1974) [in Russian].
3. Bryson, A.E. and Y.C.Ho, Applied Optimal Control, John Wiley and Sons, New York, NY (1975).
4. Fedorenko, R.P., Approximate Solution of Optimal Control Problem, Nauka, Moscow, USSR (1978) [in Russian].
5. Nash, J.F., "Noncooperative Games", Annals of Mathematics, Vol. 54, No. 2, pp. 286-295 (1951).
6. Vasiliev, O.V., Beltyukov, N.B. and V.A.Terletzky, Optimization Algorithms for Dynamic Systems Based on the Maximum Principle, in "Surveys on Cybernetics: Models and Analysis of Large-Scale Systems", Nauka, Moscow, USSR, pp. 17-38 (1991) [in Russian].
7. Vasiliev, O.V, Optimization Methods, World Federation Publishers, Inc., Atlanta, GA (1996).
8. Vasiliev, O.V. and O.O.Vasilieva, "Inverse Problems and Equilibrium Strategies in the Theory of Optimal Control", Zurich University Press, Switzerland (1997).
9. Vasilieva, O.O. and K.Mizukami, "One Method for Equilibrium Control Searching in Convex Differential Games", Proceedings of SICE, Kanazawa, Japan, pp.1453-1456 (1993).
10. Vasilieva, O.O. and K.Mizukami, "Convergence of Improved Successive Approximation Method for Optimal Control Problem", Proceedings of the 1st ASCC, Tokyo, Japan, Vol. 2, pp. 901-904 (1994).
11. Vasilieva, O.O. and K.Mizukami. "Optimal Control for Boundary Value Problem", Russian Mathematics, Vol.38, pp.31-39 (1994).
12. Vasilieva, O.O. and K.Mizukami, "Dynamical Processes Described by a Boundary Value Problem: Necessary Conditions of Optimality and Methods of Solving", Journal of Computer and Systems Sciences International, Vol. 39, pp. 90-95 (2000).
13. Vasilieva, O.O. and O.V.Vasiliev. "Search of Equilibrium Controls in Differential m-person Game", Russian Mathematics, Vol.44, pp.7-12 (2000).
O. O. Vasilieva
Search of equilibrium controls in a differential game with boundary conditions
Abstract. The paper deals with a finite-time differential game of m players with nonzero sum. The players' states are governed by systems with boundary conditions (rather than systems with initial conditions). By the end of the game we mean an equilibrium situation, which is attained by applying an equilibrium control strategy. The purpose of the work is to design a well-founded algorithm for the search of equilibrium controls. Optimal control techniques are employed to this end.