Научная статья на тему 'Численный алгоритм оценки параметров математической модели динамики ВИЧ инфекции CD4+ T-клеток'

Численный алгоритм оценки параметров математической модели динамики ВИЧ инфекции CD4+ T-клеток Текст научной статьи по специальности «Математика»

CC BY
104
20
i Надоели баннеры? Вы всегда можете отключить рекламу.
Ключевые слова
ПРОСТЕЙШАЯ МАТЕМАТИЧЕСКАЯ МОДЕЛЬ ВНУТРИКЛЕТОЧНОЙ ДИНАМИКИ ВИЧ-ИНФЕКЦИИ / ОЦЕНКА ПАРАМЕТРОВ / ОБРАТНАЯ ЗАДАЧА / ОПТИМИЗАЦИОННЫЙ АЛГОРИТМ / МЕТОД НЕЛДЕРА-МИДА

Аннотация научной статьи по математике, автор научной работы — Кабанихин Сергей Игоревич, Криворотько Ольга Игоревна, Мортье Адель, Воронов Дмитрий Андреевич, Ермоленко Дарья Владимировна

Численно исследована задача аппроксимации значений шести коэффициентов базовой модели ВИЧ-инфекции по дополнительным измерениям концентраций виремии и целевых клеток крови в фиксированные моменты времени, называемая обратной задачей. Обратная задача для динамической модели ВИЧ сводится к задаче минимизации целевого функционала. Реализован шестимерный алгоритм Нелдера-Мида для решения задачи минимизации. Для достаточно небольшого интервала значений неизвестных коэффициентов относительно некоторой средней величины (около 33 % от общего интервала между экспериментальными значениями крайних коэффициентов), предлагаемый алгоритм демонстрирует хорошую сходимость с вероятностью 17 %. Приведены и обсуждены результаты численного эксперимента.

i Надоели баннеры? Вы всегда можете отключить рекламу.
iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.
i Надоели баннеры? Вы всегда можете отключить рекламу.

A numerical algorithm of parameter estimation for dynamic model for hiv infection of CD4+ T cells

The problem of the values approximation of 6 coefficients of the HIV infection basic model so-called the inverse problem has been computationally investigated by the complementary measurements of viremia and target cells blood concentrations through the fixed timing. The inverse problem for HIV dynamic model has been reduced to the problem of objective functional minimization. The 6-dimensional Nelder-Mead algorithm has been implemented to the minimization task realization. The proposed algorithm demonstrates the good convergence with probability of 17 % for the fairly small interval range of unknown coefficients relating to average mean (about 33% of the total bandwidth between experimental extreme coefficients values). The numerical experiment results has been performed and discussed.

Текст научной работы на тему «Численный алгоритм оценки параметров математической модели динамики ВИЧ инфекции CD4+ T-клеток»

УДК 519.622.2, 004.942

A NUMERICAL ALGORITHM OF PARAMETER ESTIMATION FOR DYNAMIC MODEL FOR HIV INFECTION OF CD4+ T CELLS

Sergey Igorevich KABANIKHIN12, Olga Igorevna KRIVOROTKO12,

Adele MORTIER3, Dmitriy Andreevich VORONOV2, Darya Vladimirovna YERMOLENKO2

1 Institute of Computational Mathematics and Mathematical Geophysics of SB RAS 630090, Novosibirsk, Lavrent'ev ave., 6

2 Novosibirsk State University 630090, Novosibirsk, Pirogov str., 2

3 Telecom ParisTech

75013, Paris, 46 rue Barrault

The problem of the values approximation of 6 coefficients of the HIV infection basic model - so-called the inverse problem has been computationally investigated by the complementary measurements of viremia and target cells blood concentrations through the fixed timing. The inverse problem for HIV dynamic model has been reduced to the problem of objective functional minimization. The 6-dimensional Nelder-Mead algorithm has been implemented to the minimization task realization. The proposed algorithm demonstrates the good convergence with probability of 17 % for the fairly small interval range of unknown coefficients relating to average mean (about 33% of the total bandwidth between experimental extreme coefficients values). The numerical experiment results has been performed and discussed.

Keywords: HIV infection, parameter estimation, inverse problem, optimization algorithm, Nelder-Mead method.

The human immunodeficiency virus (HIV), by causing a progressive failure of the human immune system, allows life-threatening opportunistic infections and cancers to thrive. Within blood, HIV includes as both free virus particles and virus within infected immune cells, specifically CD4+ T cells.

HIV infection leads to low levels of CD4+ T cells through a number of mechanisms, including apoptosis of uninfected bystander cells, direct viral killing of infected cells, and killing of infected CD4+ T cells by CD8 cytotoxic lymphocytes that recognize infected cells.

The research in the within-host dynamics of HIV infections focuses on these mechanisms, and especially the spread of the virus in vivo. The latest models take into account virus's cell-to-cell interactions, T cell activation, and immune exhaustion [19]. However, a simpler model involving ordinary

non-linear equations also provides satisfying results for early times, with a limited parameter set.

Since the direct problem remains quite easy to solve, it lends itself well to the study of HIV inverse problem. With this basic model and regular data of viremia and T cells loads over time after infection, we propose an algorithm which characterizes individual parameters in HIV modeling. The knowledge of these parameters could open the way to more personalized (and more efficient) treatments.

The paper organized as follows. In a second section, the different parameters and the model of the problem are presented in order to justify the inverse problem approach.

Then, in a third section, some results and plots of the outputs are shown and analysed, which leads to a concluding paragraph reviewing the global performance of the approach and the accuracy of the

Kabanikhin S.I. - doctor ofphysical and mathematical sciences, corresponding member of RAS, professor, director, e-mail: [email protected]

Krivorotko O.I. - candidate of physical and mathematical sciences, junior researcher, senior lecturer of the chair for

advanced mathematics, e-mail: [email protected]

Mortier Adele - student, e-mail: [email protected]

Voronov D.A. - senior lecturer of the chair for advanced mathematics, e-mail: [email protected] Yermolenko D.V. - student, e-mail: [email protected]

approximate solution. Finally, we will consider future perspectives which could enhance the model or the solving program.

STATEMENT OF THE PROBLEM

A basic model of HIV dynamics involves only three determining variables: uninfected and infected target cells (mostly CD4+ T cell), and free HIV viruses within blood. Their concentrations are respectively noted T(t), I(t), V(t). We can also build the vector quantity X(t), which characterizes the whole immune response to the infection: X(t) = (T(t), I(t), V(t))T.

In practice, it is relatively easy to measure this vector variable, and thus access to the solution of the direct problem of HIV dynamics with an acceptable level of precision: we simply need daily blood samples from HIV-positive patients. Besides, interactions between the three blood concentrations T(t), I(t), V(t) can be modeled quite simply by a SIR model with vital dynamics and constant population [5, 13, 15]. It is a basic compartmental model that involves Susceptible cells I(t), Infectious virions V(t), and Recovered immune cells T(t):

dX dt

dT = X-,

TT - eVT,

^ = PVT - 5I,

(1)

dV dt

= pI - cV,

{X^ (tk)}

фо.

(4)

Henceforth, we will note (4) the sampled solution of (1)-(2) for a given q. A realistic time sampling cannot exceed for instance 5 measurements a day (exhaustion caused by blood taking, equipments required). A frequency of 2 blood samples per day during 5 days was chosen to solve the inverse problem. As a result, sampled times are:

KW vktk = k/2. (5)

As for the SIR model, it allows a relatively easy solving of the direct problem for all q we may consider. Moreover, we can assume that the solution qeex of the inverse problem is the value of q which minimizes the following misfit function, based on the method of Least Squares:

J:.

R6 ^

I

ite[to. tf

\Xq(tk)- Xq (tk)

(6)

with initial conditions at time t0

T(to) = To, I(to) = Io, V(to) = Vo. (2)

The description of the parameters and its possible value interval are given in Table 1. These six coefficients may vary according to the metabolism of each individual. But contrary to the variable X(t), they remain quite difficult to measure experimentally. That is the reason why tackling the HIV dynamics modeling with an inverse problem approach seems appropriate [8]. So let qex be the 6-dimen-sional vector that contains all information about the immune features of a given individual:

qex = ft, dr, P, 5, p, c]T. (3)

The exact solution qex will be the unknown element of the inverse problem; its components shall be in the ranges specified above. The goal of the inverse problem modeling is to approximate these coefficients eonly by knowing the behaviour of the function X1 (t) through blood samples of a given individual (i.e., a given qex) in a given time interval. These baseline data can be written as follows:

Consequently, the HIV dynamics inverse problem can be reduced to a problem of optimization (6) and ordinary differential equation (ODE) solving. The first step will be carried out through the Nelder-Mead method [12], while a fifth-order Runge-Kutta method [4] will handle the second step. We will now explain in details the running of these two methods and their implementation.

NUMERICAL METHODS FOR SOLVING DIRECT

AND INVERSE PROBLEMS

Direct problem solving

The solving of the direct problem relies on a Runge-Kutta method for non-stiff ordinary differential equations, inspired by the Matlab solver ode45.m [3], which gave satisfying results in other studies [14]. The default method of ode45.m is the Dormand-Prince [4] embedded method, designed to produce an estimate of the local truncation error of a single Runge-Kutta step. It allows to control the error with adaptive stepsize [18] by managing two methods, one with order 5 and one with order 4.

Minimization of the misfit function

By definition, the solution of the inverse problem is the only q for which J(q) = 0. And since J is a positive function:

J(q) = 0 ^^ q = argmin(J). (7)

In papers [1, 6] the gradient algorithm was proposed to find the minimum of the misfit function J. In this paper, to find the minimum of the misfit function, the Nelder-Mead method [12] was implemented for a 6-dimensional space. The method uses the concept of a simplex, which is a special polytope. In our case, it is made up of 7 different q from the interval described in Table 1. At each steps, q e R6 is updated using one of these four transformations:

reflection q * = contraction q * expansion q *!

reduction q = i

q + a ■

* = q - ß

* = q - y-q*q,

r+2 ■ qqi ■

Here q = centroid^J (q), qh = argmax^ (J (q)), qt = argmin q (J (q)), a = 1, P = 1/2, y = 2.

However, the Nelder-Mead method is a heuristic search method that can converge to non-stationary points, i.e., local minima. In this case, we can obtain a wrong solution of inverse problem (1)-(2), (4). This is all the more probable as the initial random set of points q0 is dispersed around the expected solution. That is why, without the use of an auxiliary method, the initial random generation of the array of the zeros approximation remains a key issue.

Besides, the Nelder-Mead method does not take into account the unrealistic character of the qn it returns. Indeed, the standard stopping condition of this algorithm is:

X J(q„)- J(q)f < Ю-

qeR6

For getting the synthetic data (4) we solve of the direct problem for a fixed qe by the Runge-Kutta method. Afterwards, the value of qex should not be reused by the program.

NUMERICAL RESULTS

In this section we demonstrate the numerical results of solving inverse problem (1)-(2), (4) with synthetic data.

Initial settings

We set the time interval [t0, tf) as [0, 5) and the time sampling frequency h = 0.5: t0 < tk = kh < f. The value of the exact solution qe used to generate synthetic data was equal to the mean values of Table 1 (column 3).

The plots of synthetic data generated with the set of coefficients qex are demonstrated at Figure 1.

As for the generation of the set of initial points q0, a uniform random distribution over an adjustable interval around the expected solution was chosen. Considering each coefficient, this interval represents a given percentage of the total bandwidth between extrema of Table 1. For a given coefficient, e.g. X, it can thus be written:

It only quantifies how close to each other are the misfit function's values at the vertices of the simplex. Thus, a subsidiary method should check if the found minimum belongs to Table 1, and if it actually makes the function J converge towards 0. If these two conditions are actually fulfilled, the program can stop. If not, the NelderMead method has to be launched once again, with an other random set of points qn.

Generation of synthetic data

In practice, the blood samples (4) come from experimental measurements (which can, by the way, be erroneous in a certain extent). But here, synthetic exact data have to be generated from a fixed qex, in order to check the good convergence of the algorithm, which should be able to recalculate this value by itself.

^x] with

KL = v - -0/3,

B. Convergence in the Nelder-Mead Algorithm

The Nelder-Mead method does not always converge to a global minimum for the function J, and may be launched several times before giving a relevant solution of the inverse problem. In this section, we focus on the results of the Nelder-Mead method when it has been run successfully, so as to estimate the accuracy of the convergence. The behaviours of eight random set were analyzed step-by-step during the Nelder-Mead method, by plotting the changing coordinates of the simplex's centroid. Three sets used initial random points from a "2/3-adjusted in-

Table 1

Extreme and mean values of immune coefficients [7]

Parameter Minimum Mean Maximum Units

1 4.3 x 10-2 1.089 x 10-1 2 x 10-1 HL-1 day-1

dT 4.3 x 10-3 1.089 x 10-2 2 x 10-2 day-1

ß 1.9 x 10-4 1.179 x 10-3 4.8 x 10-3 ^L day-1

5 1.3 x 10-1 3.660 x 10-1 8 x 10-1 day-1

P 9.8 x 101 1.427 x 103 7.1 x 103 day-1

c 3 3 3 day-1

Here: X and dT are respectively the birth rate and the dying rate of target cells; p quantifies the infection by free virus; 5 is the dying rate of infected cells; p and c are respectively the production rate and the clearing rate of the virions.

200 000-

0

1000- 0.0010- 1000-1

800- 0.0008- 800-

ь zß 600- 0.0006- Ь ся 600-

о - 0.0004- о О *—• -

Т1 400- 0.0002 400-

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

СИ О bJ 200- 12 3 4 1 d 5 а 200-

0 1 1 1 1 2 3 4 1 5 0

Time (days)

400 000-, с

/"'S

2 3 Time (days)

2 3 Time (days)

Fig. 1. Concentration of uninfected target cells Tq (t) (a), of infected target cells Iq (t) (b), and of free HIV viruses Vq' (t) (c).

Table 2

Coefficients of the simplex's centroid q at different steps of the Nelder-Mead method

Step number dT

Value Relative error (%) Value Relative error (%)

1 0.120758863 10.89 0.011395613 4.64

10 0.115669636 6.22 0.008978078 -17.56

30 0.114106539 4.78 0.008449692 -22.41

60 0.113997897 4.68 0.008368867 -23.15

100 0.113992729 4.68 0.008364289 -23.19

ß 5

Step number Value Relative error (%) Value Relative error (%)

1 0.001814283 53.88 0.42271688 15.50

10 0.001305623 10.74 0.379353123 3.65

30 0.001185291 0.53 0.368954912 0.81

60 0.001178254 -0.06 0.36633236 0.09

100 0.001179676 0.06 0.366008649 0.00

P c

Step number Value Relative error (%) Value Relative error (%)

1 1984.812433 39.09 3 0.00

10 1447.04152 1.40 3 0.00

30 1429.956621 0.21 3 0.00

60 1426.260772 -0.05 3 0.00

100 1425.38208 -0.11 3 0.00

0.0045-,

7 0.0035 ^ 0.0025 H

t j

a.

u £

g 0.0015

I

0.0005-0.8-,

•—s

0.60.4 0.2

t

j

u £

•3

j

£

6000400020000

0.20 n 0.16 0.120.08 0.04 0.020-

'>, 0.016-V

•i 0.012-

v—/

S

g 0.008 >

0.004

t"

~r

"i-1—

20 40 60 80 Iteration step

0.0011797-

"e, 0.00117974

I 0.113992-j

^ 0.1139910.113990 0.0083647

0.0083645

л •a

0.0083643

S

J 0.0083641 >

0.0083639

100

90

Iteration step

100

Centroid

Exact | • | Initial points -------Minimum [-------\Maximum

Fig. 2. Convergence of the coefficients ft (a, b), S (c, d), p (e, f), X (g, h) and dT (i, j) using an initial "2/3-adjusted interval"; a, c, e, g, i - global convergence, b, d, f, h, j - later behaviour

terval". The 5 recovered coefficients of problem (1) by using the Nelder-Mead method are demonstrated on Figure 2. Note, that the coefficient c has the exact value at Step 0 (see Table 2 for details).

Overall, we can note a good convergence of the different coefficients towards their exact value, with a final relative error lower than 0.05 %. The only exception seems to be dT, whose values become apparently further from the exact one.

Besides, the final relative errors seemed to be all the more dispersed that the initial interval allowed for the random points generation is wide.

Another way to estimate the convergence of the algorithm is to focus on the behavior of the misfit function for each computation step. With the initial data form «2/3-adjusted interval», we obtain the behaviour of the misfit function at each computation step (see Fig. 3).

Iteration step

Fig. 3. The global behaviour of the misfit function at each computation step (base 10 logarithm)

Most of the time, the simplex algorithm stopped at the 100th iteration step (maximum preset number of allowed iterations). However, we can assume that a hundred step provides a quite satisfying accuracy insofar as the misfit function seems to reach a stationary state, with a value close to unity.

CONCLUSION

The Nelder-Mead method has managed to find the coefficients of the initial differential system with a good precision. Nevertheless, its performance remains quite low, because of a total random choice of initial conditions, and a heuristic method that is also too sensible to these conditions. Finally, the trials using synthetic data (which do not include the possible errors caused by experimental measurement) can only give an overview of the operating effectiveness of the whole program.

Future Perspective

A certain number of improvements of the algorithm could be examined, in order to increase its performance in the search of extrema, or its ability to take into account the later fluctuations of viral loads and T-cell blood concentrations.

First of all, the SIR simple model does not integrate the influence of latency in human body, which is yet an important long-term factor during the infection. Some studies [9] aimed to deal with this effects, by considering for instance the long-term role of peripheral virion reservoirs. As a result, the time becomes an explicit variable of the global differential system; nevertheless, the solving method does not have to change a lot, since the Runge-Kutta complete algorithm is precisely designed for this eventuality.

Furthermore, the Nelder-Mead method could be optimized in itself. In this regard, some methods are best suited for constrained optimization [2, 17]; at each step, the points out of preset limits are reset in the good interval. This could be of great advantage,

insofar as it prevents the program from looping a long time around unrealistic minima.

Besides, some auxiliary loops including Monte-Carlo method could be able to increase the probability of success during the Nelder-Mead algorithm, by presetting the more favorable initial conditions. Some of these techniques are based on the Metropolis algorithm [10, 11, 16], which samples the probability distribution of the model space (ie, the possible solutions of inverse problem) by using of modified random walk. Combining this statistical approach with the classic simplex search could provide a more realistic model, with a better performance and a reasonable complexity (Monte Carlo all alone remains time-consuming).

ACKNOWLEDGMENT

The authors would like to thank Prof. A.I. Ilyin for problem statement and fruitful discussions. This work was partially supported by the Ministry of Education and Science of Russian Federation and the project No. 107 with Republic of Kazakhstan "Development of software for research and numerical solving of direct and inverse problems of pharmacokinetics and epidemiology".

REFERENCES

1. Afraites L., Atlas A. Parameters identification in the mathematical model of immune competition cells // J. Inverse Ill-Posed Probl. 2015. 23. (4). 323-337.

2. BoxM. J. A new method of constrained optimization and a comparison with other methods // Comput. J. 1965. 8. (1). 42-52.

3. Compere M. ODE Solvers for Octave and Mat-lab. 1999. https://sites.google.com/site/comperem/ home/ode solvers.

4. Dormand J.R., Prince P.J. A family of embedded Runge-Kutta formulae // J. Comput. Appl. Math. 1980. 6. (1). 19-26.

5. Hethcote H.W. Three basic epidemiological models // Applied Mathematical Ecology. Eds. S.A. Levin, T.G. Hallam, L.J. Gross. Berlin; Heidelberg: SpringerVerlag, 1989. 18. 119-144.

6. Ilyin A., Kabanikhin S.I., Nurseitov D.B. et al. Analysis of ill-posedness and numerical methods of solving a nonlinear inverse problem in pharmacokinet-ics for the two-compartmental model with extravascu-lar drug administration // J. Inverse Ill-Posed Probl. 2012. 20. (1). 39-64.

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

7. Jones E., Roemer P., RaghupathiM., Pan-kavich S. Analysis and simulation of the three-component model of HIV dynamics // SIAM Undergrad. Res. Online. 2014. 7. 89-106.

8. Kabanikhin S.I. Inverse and Ill-Posed Problems. Theory and Applications. Berlin: de Gruyter, 2011.

9. Mbogo W.R., LuboobiL.S., OdhiamboJ.W. Stochastic model for Langerhans cells and HIV dynamics in vivo // ISRN Appl. Math. 2014. 2014. ID 594617.

10. Metropolis N., Ulam S. The Monte Carlo method // J. Am. Stat. Assoc. 1949. 44. (247). 335-341.

11. Mosegaard K., Tarantola A. Monte Carlo sampling of solutions to inverse problems // J. Geo-phys. Res. 1995. 100. (B7). 12431-12447.

12. Nelder J.A., Mead R. A simplex method for function minimization // Comput. J. 1965. 7. (4). 308313.

13. Perelson A.S., Ribeiro R.M. Modeling the within-host dynamics of HIV infection // BMC Biol. 2013. 11. 96.

14. Roemer P. Stochastic modeling of the persistence of HIV: early population dynamics // Trident Scholar Project Report. 2013. (420). 64 p.

15. Stafford M.A., Corey L., Cao Y. et al. Modeling plasma virus concentration during primary HIV infection // J. Theor. Biol. 2000. 203. (3). 285-301.

16. Tarantola A. Inverse problem theory and methods for model parameter estimation // Society for Industrial and Applied Mathematics. Philadelphia, 2005. 41-54.

17. Tauler R., Brown S.D., Walczak B. Three basic epidemiological models // Comprehensive Chemometrics: Chemical and Biochemical Data Analysis. Eds. S.D. Brown, R. Tauler, B. Walczak. Elsevier Science: 2009. 18. 556-557.

18. Van Der Houwen P.J., Sommeijer B.P. Parallel iteration of high-order Runge-Kutta methods with stepsize control // J. Comput. Appl. Math. 1990. 29. (1). 111-127.

19. Zhang C., Zhou S., Groppelli E. et al. Hybrid spreading mechanisms and T cell activation shape the dynamics of HIV-1 infection // PLoS Comput. Biol. 2015. 11. (4). e1004179.

ЧИСЛЕННЫЙ АЛГОРИТМ ОЦЕНКИ ПАРАМЕТРОВ МАТЕМАТИЧЕСКОЙ МОДЕЛИ ДИНАМИКИ ВИЧ ИНФЕКЦИИ CD4+ T-КЛЕТОК

Сергей Игоревич КАБАНИХИН1, Ольга Игоревна КРИВОРОТЬКО12, Адель МОРТЬЕ3, Дмитрий Андреевич ВОРОНОВ2, Дарья Владимировна ЕРМОЛЕНКО2

1 Институт вычислительной математики и математической геофизики СО РАН 630090, Новосибирск, пр. Академика Лаврентьева, 6

2Новосибирский государственный университет 630090, г. Новосибирск, ул. Пирогова, 2

3 Telecom ParisTech 75013, Paris, 46 rue Barrault

Численно исследована задача аппроксимации значений шести коэффициентов базовой модели ВИЧ-инфекции по дополнительным измерениям концентраций виремии и целевых клеток крови в фиксированные моменты времени, называемая обратной задачей. Обратная задача для динамической модели ВИЧ сводится к задаче минимизации целевого функционала. Реализован шестимерный алгоритм Нелдера-Мида для решения задачи минимизации. Для достаточно небольшого интервала значений неизвестных коэффициентов относительно некоторой средней величины (около 33 % от общего интервала между экспериментальными значениями крайних коэффициентов), предлагаемый алгоритм демонстрирует хорошую сходимость с вероятностью 17 %. Приведены и обсуждены результаты численного эксперимента.

Ключевые слова: простейшая математическая модель внутриклеточной динамики ВИЧ-инфекции, оценка параметров, обратная задача, оптимизационный алгоритм, метод Нелдера-Мида.

Кабанихин С.И. - д.ф.-м.н., член-корреспондент РАН, профессор, ВРИО директора, е-mail: [email protected]

Криворотько О.И. - к.ф.-м.н., младший научный сотрудник, старший преподаватель кафедры высшей математики физического факультета, е-mail: [email protected] Мортье Адель - студент, е-mail: [email protected]

Воронов Д.А. - старший преподаватель кафедры высшей математики физического факультета, е-mail: dmitriy.voronov.89@gmail. com

Ермоленко Д.В. - студент 1 курса магистратуры, е-mail: [email protected]

i Надоели баннеры? Вы всегда можете отключить рекламу.