
CC BY


UDC 556.5

Vestnik of Saint Petersburg University. Earth Sciences. 2021. Vol. 66. Issue 1

Deep neural networks in hydrology: the new generation of universal and efficient models*

G. V. Ayzel

State Hydrological Institute,

23, 2-ia liniia V. O., St. Petersburg, 199004, Russian Federation; University of Potsdam, Institute for Environmental Sciences and Geography, 24-25, Karl-Liebknecht-Str., Potsdam, 14476, Germany

For citation: Ayzel, G. V. (2021). Deep neural networks in hydrology: the new generation of universal and efficient models. Vestnik of Saint Petersburg University. Earth Sciences, 66 (1), 5-18. https://doi.org/10.21638/spbu07.2021.101

For around a decade, deep learning (the sub-field of machine learning concerned with artificial neural networks composed of many computational layers) has been reshaping statistical model development in many research areas, such as image classification, machine translation, and speech recognition. Geoscientific disciplines in general, and hydrology in particular, are no exception. Recently, deep learning-based techniques have been rapidly gaining popularity for solving a wide range of hydrological problems: modeling and forecasting of river runoff, regionalization of hydrological model parameters, assessment of available water resources, and identification of the main drivers of recent changes in water balance components. This growing popularity of deep neural networks is primarily due to their high universality and efficiency. These qualities, together with the rapidly growing amount of accumulated environmental information and the increasing availability of computing facilities and resources, allow us to speak of deep neural networks as a new generation of mathematical models that will, if not replace existing solutions, then significantly enrich the field of geophysical process modeling. This paper provides a brief overview of the current state of development and application of deep neural networks in hydrology. It also gives a qualitative long-term forecast of the development of deep learning technology for hydrological modeling challenges, based on the Gartner Hype Curve, which describes in general terms the life cycle of modern technologies.

Keywords: deep neural networks, deep learning, machine learning, hydrology, modeling.

* The reported study was funded by RFBR (project no. 19-35-60005). © St. Petersburg State University, 2021

1. Introduction

Nowadays, technologies based on artificial neural networks surround us everywhere: for example, they help us to find information on the Internet, translate text, and plan the route from home to work. But this has not always been the case. The history of the development and application of artificial neural networks is full of ups and downs, high hopes, and disappointments. The first work we know of that started the artificial neural network research field is an article by McCulloch and Pitts, published in 1943. In this paper, the authors introduced the concept and logical design of an artificial neuron: a linear combination of input signals that is then transformed by a nonlinear threshold activation function. The proposed model of a single artificial neuron allowed the authors to describe the first concept of an artificial neural network consisting of many interconnected artificial neurons. It should be noted that 77 years after the publication of the article by McCulloch and Pitts, the basic concept of artificial neural networks has remained essentially unchanged. What has changed is the scale: McCulloch and Pitts considered neural networks with at most seven neurons, whereas in modern neural networks this number can reach several billion (Adiwardana et al., 2020).

Until the late 1960s, research on artificial neural networks was experiencing its first rise. One of the central works of this period is an article by Rosenblatt (1958), in which he proposed a solution to the problem of binary classification based on the linear perceptron and introduced the concept of "training" a neural network. Rosenblatt's work served as an incentive to use artificial neural networks based on linear perceptrons for solving applied problems. However, the excitement about neural networks did not last long. Minsky and Papert's book Perceptrons, published in 1969, was full of harsh criticism of Rosenblatt's idea and slowed research in the field of neural networks until the mid-1980s.

The second wave of interest in neural networks is associated with the classic works of Rumelhart, Hinton, and Williams (Rumelhart et al., 1985; 1986), which appeared in the second half of the 1980s. In these works, the authors proposed back-propagation, a conceptually new way of training artificial neural networks of arbitrary architecture, which is still widely used today. However, another period of frustration soon followed, because the computing resources and data volumes available for training neural networks were minimal at that time. Simpler and less demanding machine learning algorithms, for example, decision trees or support vector machines, achieved comparable or even higher efficiency. This second "winter" in neural network research lasted about 20 years, until the mid-2000s, when the modern deep learning revolution began (Schmidhuber, 2015).

Since the mid-2000s, a new era in the study of artificial neural networks has begun. In this short period, everything came together: progress in computational resources and the emergence of large data sets made it possible to develop and train deep neural networks, that is, artificial neural networks with hundreds or thousands of computational layers and millions of neurons. The field of research on such networks was called deep learning. In 2006, a group of scientists headed by Geoffrey Hinton proposed a much faster and more stable way of training deep neural networks based on local pre-training and gradient descent methods (Hinton et al., 2006). In the same year, a deep neural network for the first time set a new record in handwritten digit recognition on the MNIST data set (Ranzato et al., 2006). The discovery-rich year 2006 also brought the first work in which a deep neural network was trained not on the central processing unit (CPU) but on the graphics processing unit (GPU), which reduced training time by an order of magnitude. In 2012, the deep convolutional neural network AlexNet (Krizhevsky et al., 2012) set a new record in recognizing different classes of objects at the ImageNet competition (Deng et al., 2009). Three years later, the ResNet-152 deep neural network showed recognition accuracy on ImageNet exceeding that of humans (Russakovsky et al., 2015). It is the success in image recognition on the ImageNet data set that has been the source of the modern hype around deep neural networks (Russakovsky et al., 2015), which is also confirmed by the dynamics of queries to the Google search engine (Fig. 1). Over the past decade, numerous studies of deep learning have shown outstanding results in pattern recognition (LeCun et al., 2015; Russakovsky et al., 2015), machine translation (Sutskever et al., 2014), and speech recognition (LeCun et al., 2015). Deep neural networks are used everywhere. Prominent examples include personalized voice assistants (Google Assistant from Google or Alice from Yandex) and autonomous vehicles (Rao and Frtunikj, 2018).

Fig. 1. Relative frequency of "Artificial Neural Networks" and "Deep Neural Networks" queries to the Google search engine, 2004-2018, according to the Google Trends service (Trends.google.com, 2020)

Deep neural networks have, with some delay, also found wide application in the Earth sciences: for example, in cloud type recognition (Zhang et al., 2018), determination of land cover type from remote sensing data (Zhang et al., 2016), detection of land-use change (Zhao and Du, 2016), detection of extreme weather events (Racah et al., 2017), and precipitation nowcasting (Shi et al., 2017; Agrawal et al., 2019; Ayzel et al., 2019). The recent work by Reichstein et al. (2019) summarizes the latest advances in both classical machine learning models and modern deep neural networks in Earth sciences. Its authors highlight the development prospects and the difficulties that the scientific community may face when using deep neural networks: for example, the interpretability of models and of the calculations based on them, the lack of constraints on extrapolation to rare events, and high computational complexity.

Deep neural networks have found application to hydrological problems with even greater delay than in Earth sciences in general (Shen, 2018). In his review, Shen (2018) identifies three areas in which deep neural networks have demonstrated high efficiency: (1) extraction of relevant hydrological information from remotely sensed data (Song et al., 2016; Tao et al., 2016; Moraux et al., 2019; Pan et al., 2019), (2) modeling the dynamics of hydrological quantities (Kratzert et al., 2018; Ayzel, 2019; Kratzert et al., 2019b; Ayzel et al., 2020; Song et al., 2020), and (3) modeling and generation of complex spatial and temporal distributions of hydrological quantities (Laloy et al., 2018). At the moment, we are witnessing an avalanche-like growth of interest in deep neural networks within the hydrological community (Shen et al., 2018). This is also confirmed by the number of scientific articles in the field of water resources research indexed in the Web of Science reference database (Fig. 2): the number of articles published in 2019 is six times higher than in 2016. Even allowing for the low base effect, there is still no doubt that the number of publications on the application of deep neural networks in hydrology will increase exponentially over the next few years.

Fig. 2. The total number of published scientific papers, 2010-2018, in the Web of Science reference database on the topic "Deep Neural Networks", included in the research field "Water Resources" (Webofscience.com, 2020)

The objective of this paper is to discuss and critically analyze deep neural networks as universal and efficient next-generation hydrological models. Hereinafter we define a hydrological model in the broad sense, meaning any mathematical model describing a process (or processes) of the land hydrological cycle: for example, a model of river flow formation or a model of soil moisture dynamics. In addition to the conceptual postulation of the universality and efficiency of deep neural networks in hydrology (Sections 2, 3), this paper considers the contribution of big data and computational resources to the high efficiency of deep-learning models (Section 4) and the interpretability of deep neural network models and the calculations made on their basis (Section 5), and gives a forecast of the development of deep learning for solving hydrological problems in the period up to 2050 (Section 6).

2. Universal models of the new generation

The universality of artificial neural networks is not merely a loose characterization: the well-known theorem of Hornik et al. (1989) asserts that any continuous function on a compact domain can be approximated to arbitrary accuracy by a neural network with a single hidden layer, given enough hidden neurons. Let us present this theorem as a general equation:

y = f(X) + e, (1)

where y — target variable, f — neural network, X — input variables, e — error.

Thus, any neural network is a universal approximator: a model that functionally connects the target variable with the space of its features. Such a simple mathematical definition of a neural network opens up unlimited possibilities for modeling natural (and other) processes: all we need is a data set describing the behavior of the target variable as a function of its features, and the computational time to choose a suitable neural network architecture empirically.
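The approximation claim can be made concrete with a small constructive sketch. It is not the proof of Hornik et al. (1989) and not a trained network: the weights are written down by hand so that steep sigmoid units form a staircase that follows the target function. The names `build_network`, `max_error`, and `target`, the interval [0, 1], and the steepness value are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_network(f, n_hidden, steepness=20.0):
    """Construct (not train) a one-hidden-layer network on [0, 1].

    Hidden unit i is a steep sigmoid centred between grid points i-1 and i;
    its output weight is the jump f(x_i) - f(x_{i-1}).
    """
    grid = [i / n_hidden for i in range(n_hidden + 1)]
    centres = [(grid[i - 1] + grid[i]) / 2 for i in range(1, n_hidden + 1)]
    jumps = [f(grid[i]) - f(grid[i - 1]) for i in range(1, n_hidden + 1)]
    slope = steepness * n_hidden  # sharper units as the grid gets finer
    bias = f(grid[0])

    def network(x):
        hidden = [sigmoid(slope * (x - c)) for c in centres]
        return bias + sum(w * h for w, h in zip(jumps, hidden))

    return network


def max_error(f, network, n_points=1001):
    """Largest absolute deviation on an evenly spaced grid over [0, 1]."""
    xs = [i / (n_points - 1) for i in range(n_points)]
    return max(abs(f(x) - network(x)) for x in xs)


def target(x):
    return math.sin(2 * math.pi * x)


# Approximation error for 10 and for 100 hidden units.
errors = {n: max_error(target, build_network(target, n)) for n in (10, 100)}
```

Adding hidden units shrinks the steps of the staircase, so the maximum error falls as the hidden layer grows. In practice the weights are found by training rather than by construction.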

As an example, let us consider the problem of river runoff modeling. The traditional way to solve this problem is to use hydrological models, which explicitly describe the processes of river runoff formation: for example, interception of precipitation by vegetation, and water movement in the soil and the channel network. Such models mostly have a small number of input variables (precipitation sums, air temperature) and a small number of parameters (from a few to a few tens), the optimal values of which are calibrated against the available river runoff observations. A neural network differs in that it describes runoff formation processes implicitly (the so-called black box), based only on patterns found between the input data (the space of which can be arbitrary) and river runoff. Similar to the calibration of hydrological model parameters, neural network training adjusts the weights of individual neurons (from hundreds to hundreds of thousands) so as to minimize the error between calculated and observed values of river runoff.
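The parallel between calibration and training can be sketched with a toy example. The one-parameter linear reservoir below is a hypothetical stand-in for a conceptual hydrological model, the precipitation series is invented, and the "observations" are generated by the model itself with a known parameter. Calibration then reduces to picking the candidate value that minimizes the error, the same objective that neural network training pursues over many more weights.

```python
def linear_reservoir(precip, k):
    """Toy conceptual model: each day a fraction k of stored water runs off."""
    storage, runoff = 0.0, []
    for p in precip:
        storage += p
        q = k * storage
        storage -= q
        runoff.append(q)
    return runoff


def mean_squared_error(simulated, observed):
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)


def calibrate(precip, observed, candidates):
    """Pick the parameter value that minimizes the simulation error."""
    return min(
        candidates,
        key=lambda k: mean_squared_error(linear_reservoir(precip, k), observed),
    )


# Synthetic forcing and "observations" generated with a known parameter.
precip = [0.0, 12.0, 3.0, 0.0, 0.0, 7.5, 20.0, 0.0, 1.0, 0.0] * 5
observed = linear_reservoir(precip, k=0.3)
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
best_k = calibrate(precip, observed, candidates)
```

Because the observations are noise-free, the search recovers the parameter used to generate them. With real data the minimum is only a best compromise, and a neural network replaces the grid search with gradient-based optimization.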

Thus, when using neural networks for modeling hydrological processes, we are limited neither in the choice of factors that form the feature space of the target variable nor in the choice of the architecture of the model itself. Such freedom of choice, however, has its disadvantages. First, with neural networks we lose the crucial property of explicit interpretability of the model: the weights of individual neurons are optimized only to minimize model error on the training sample, and there are no requirements for their physical feasibility. Second, the space of possible combinations of input data and deep neural network architectures is so large that, in any case, a researcher will have to make a subjective choice.

The universality of deep neural networks is also expressed in the universality of the modeling approach, the general concept of which can be presented in five successive stages (Fig. 3): (1) data collection, (2) data storage, (3) data preparation, (4) training of the deep neural network model, (5) analysis and presentation of the model results. At present, the researcher has a central role in this approach: they decide which data sources to store and how; based on the hypothesis, they determine the division of available data into predictors and target variables; and they select the technology stack and deep neural network architecture to be used for the modeling.
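The five stages can be sketched as five small functions. This is an illustrative skeleton only: the data are synthetic, the file name is hypothetical, and a least-squares line stands in for the deep neural network at stage 4 so that the example stays self-contained.

```python
import json
import os
import tempfile


def collect():
    """Stage 1: gather raw records (synthetic here)."""
    return [{"precip": p, "runoff": 0.4 * p + 1.0} for p in range(20)]


def store(records, path):
    """Stage 2: persist the raw data."""
    with open(path, "w") as fh:
        json.dump(records, fh)


def prepare(path, n_train):
    """Stage 3: load, choose predictor and target, split into train/test."""
    with open(path) as fh:
        records = json.load(fh)
    pairs = [(r["precip"], r["runoff"]) for r in records]
    return pairs[:n_train], pairs[n_train:]


def train(pairs):
    """Stage 4: fit the model (a least-squares line as a stand-in)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
             / sum((x - mean_x) ** 2 for x, _ in pairs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def analyze(model, pairs):
    """Stage 5: summarize the results (mean absolute error on held-out data)."""
    return sum(abs(model(x) - y) for x, y in pairs) / len(pairs)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "observations.json")
    store(collect(), path)
    train_pairs, test_pairs = prepare(path, n_train=15)
    model = train(train_pairs)
    test_error = analyze(model, test_pairs)
```

The researcher's decisions described above map onto the arguments: what `collect` gathers, which fields `prepare` treats as predictor and target, and which model `train` fits.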

However, with the emergence of automated machine learning (AutoML), the role of the researcher in this approach to modeling environmental processes will shift more and more from technical issues to the analysis of model output and the formulation of conceptual hypotheses (Hutter et al., 2019). The idea of automated machine learning is to let a computer algorithm select the optimal structure of the deep neural network based on a set of heuristics implemented in it. Today, one of the most advanced algorithms in this field is Neural Architecture Search (NAS), presented by a group of researchers from Google in 2018 (Zoph et al., 2018). Using NAS makes it possible to create deep neural networks that are both highly efficient and contain fewer parameters.

The actively developing direction of automated machine learning embodies humanity's long-held dream of a cyber world in which routine tasks are taken over by artificial intelligence, freeing the holders of natural intelligence for creative work. Thus, in the coming years, we should expect technologies that will allow the researcher to set only the target variable and a meta-description of the research area, while the search for suitable input data and the modeling itself will be done automatically. Undoubtedly, the development of such technologies would not have been possible without the high universality of deep neural networks, which allows them to find complex patterns in heterogeneous observation data.

Fig. 3. A general approach to modeling natural processes based on deep neural networks. Prepared by the author

3. Efficient models of the new generation

Although the universality of deep neural networks is an important characteristic that distinguishes them favorably from both classical machine learning models and physically-based models, it is the high efficiency of deep neural networks that has made them so popular. As we noted earlier, in computer science deep neural networks confidently surpass classical approaches and methods in efficiency: for example, in tasks of classification and localization of objects, machine translation, and speech recognition (Sutskever et al., 2014; LeCun et al., 2015; Schmidhuber, 2015).

Unfortunately, given the infancy of deep learning in hydrology, it is currently impossible to say with certainty that deep neural networks are superior in efficiency to classical modeling approaches. This is primarily because the practice of holding 'competitions' to compare the efficiency of different models on a predetermined data set has not become widespread in hydrology. It is worth noting that the situation in hydrological modeling is gradually improving, as open data sets covering many hundreds of river catchments (e.g., CAMELS or CANOPEX) are now available to enable such comparisons (Arsenault et al., 2016; Addor et al., 2017).
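Comparing models on a shared data set requires a shared score. The text does not name a metric, so the sketch below assumes the Nash-Sutcliffe efficiency (NSE), a common choice in hydrology: one minus the ratio of the squared simulation error to the variance of the observations about their mean. The runoff values are invented for illustration.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


observed = [1.0, 3.0, 2.0, 6.0, 4.0]

# Three hypothetical "models" scored against the same observations.
perfect = list(observed)                                      # reproduces the data
climatology = [sum(observed) / len(observed)] * len(observed)  # always the mean
biased = [o + 1.0 for o in observed]                           # constant offset

nse_perfect = nash_sutcliffe_efficiency(observed, perfect)
nse_climatology = nash_sutcliffe_efficiency(observed, climatology)
nse_biased = nash_sutcliffe_efficiency(observed, biased)
```

A perfect simulation scores 1 and a model that always predicts the observed mean scores 0, so any useful model should fall between them on the benchmark data.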

In the study by Kratzert et al. (2018), it was shown that a deep neural network based on the LSTM architecture achieves an efficiency of basin-scale streamflow modeling comparable to that of the conceptual hydrological model SAC-SMA. Interestingly, the desire to test the performance of deep neural networks for regional-scale runoff modeling, including for ungauged basins, led two independent groups of researchers to similar conclusions. Specifically, it was shown that deep neural networks based on the LSTM architecture demonstrate higher efficiency of river runoff modeling for ungauged basins when all available regional information is used to train them (Kratzert et al., 2019b; Ayzel et al., 2020). Thus, it was clearly demonstrated how the universality of deep neural networks helps them to achieve higher efficiency: unlike classical hydrological models, deep neural networks are not limited in the amount of information that they can learn during training. It is this property that allows deep neural networks to 'see and remember' a complete picture of the peculiarities of river runoff formation on a regional scale during the learning process. In the next section, we will discuss in more detail the contribution of big data and available computational resources to the efficiency of deep neural networks.

4. The role of big data and computational resources

As noted in the Introduction, the revolution in deep learning began in the mid-2000s and was caused by two main reasons: (1) the emergence of large data sets, and (2) the increasing availability of computational resources. The combination of these two factors made it possible to unlock the full potential of deep neural networks and to demonstrate their high efficiency in addressing a plethora of scientific and engineering challenges. In 2017, a group of researchers from Google published a paper that experimentally confirmed a positive correlation between the sample size and the efficiency of deep neural networks trained on it (Sun et al., 2017). Thus, using as much data as possible to train deep neural networks allows the latter to increase their generalization capacity and, consequently, their efficiency.

However, the question arises: how much data is enough for successful training of a deep neural network, and do such data sets exist in hydrology? Preliminary answers can be found in the paper by Gauch et al. (2019), which shows experimentally that 9 years of observations from 531 river catchments of the CAMELS data set (Addor et al., 2017) are sufficient for the performance of a regional deep neural network model of river runoff formation to reach a plateau. Thus, the data sets of hydrometeorological variables and parameters already available in the public domain (e.g., CAMELS or CANOPEX) can be considered big data, the use of which will make it possible to unlock the full potential of deep-learning models in modeling daily river runoff dynamics. It should be noted that, by the standards of modern hydrology, these data sets are not extraordinarily large. To date, the amount of open hydrological and meteorological data that can be used to train deep neural networks is enormous and growing. It includes, for example, climate reanalyses, numerical weather models, remote sensing, and automated environmental monitoring systems. Thus, in terms of data availability, hydrology is ready for the expansion of deep-learning models and techniques.
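The idea of performance reaching a plateau can be illustrated with a learning curve: the same model is trained on samples of increasing size and scored on a fixed test set. The sketch below does not reproduce the experiment of Gauch et al. (2019). It uses synthetic data and a straight-line fit as a stand-in model, and the sample sizes, number of repeats, and seed are arbitrary assumptions.

```python
import random


def fit_line(pairs):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / sxx
    return slope, mean_y - slope * mean_x


def sample(rng, n):
    """Synthetic data: y = 2x + 1 plus Gaussian noise with unit variance."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)) for x in xs]


def learning_curve(sizes, n_repeats=30, n_test=500, seed=0):
    """Mean test error for each training size, averaged over repeats."""
    rng = random.Random(seed)
    test = sample(rng, n_test)
    curve = {}
    for n in sizes:
        errors = []
        for _ in range(n_repeats):
            slope, intercept = fit_line(sample(rng, n))
            errors.append(
                sum((slope * x + intercept - y) ** 2 for x, y in test) / n_test
            )
        curve[n] = sum(errors) / n_repeats
    return curve


curve = learning_curve(sizes=(5, 20, 80, 320, 1280))
```

The test error drops quickly for the first few increases in sample size and then flattens near the noise level. Beyond that point more data of the same kind brings little improvement, which is the sense in which a data set can be called big enough.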

Progress in the field of deep learning is closely linked to progress in computational methods and technologies (Schmidhuber, 2015). A similar situation exists in the field of numerical weather prediction (Bauer et al., 2015). Modern CPUs and GPUs perform an ever-increasing number of operations per second, which means that researchers can develop and apply more complex model architectures without any loss of performance (Fuhrer et al., 2018). This also significantly reduces the time required to test a hypothesis or conduct a numerical experiment. At present, research centers and scientific institutes all over the world have powerful computing clusters on which deep neural networks can be trained quickly and efficiently. In Russia, access to computational clusters specialized for deep neural networks is also provided by the Centers for Collective Use (CCU). Those who cannot use specialized computational clusters can turn to interactive computational environments, which offer limited functionality free of charge. The best-known service of this kind that can be used for training deep neural networks is Google Colab (colab.research.google.com).

To summarize the section, we note that deep neural networks have a bright future in hydrology: a large amount of data is publicly available that can be used to train neural networks; the necessary computing resources are available at universities, scientific institutes, and CCUs, as well as on the Internet.

5. Interpretability

Let us consider how the properties of universality, efficiency, and interpretability of hydrological models relate to the three classes into which they can be conditionally divided: (1) traditional physically-based models, (2) classical machine learning models, and (3) deep neural networks. Physically-based models are not universal, but can be efficient enough and are easily interpreted. The same is, to some extent, true of classical machine learning methods (for example, Random Forest): universality is not their strong point, but such models can be very efficient, and there are robust methods for their interpretation. Although deep neural networks have high universality (Section 2) and efficiency (Section 3), it is technically challenging to interpret such models because of the large number of parameters and the nonlinearity of internal connections (Samek et al., 2017; Gunning et al., 2019).

However, the interpretation of deep neural networks and of the calculations based on them is becoming an increasingly popular topic, including in Earth sciences and hydrology (Kratzert et al., 2019a). This is because the high efficiency of deep neural networks makes them promising candidates for use in critical areas related to the prediction of natural hazards. At the same time, decision-making in such critical areas must be based on models we can trust, which is impossible without a full understanding of how and on what basis the model makes predictions, what the limits of its application are, and how robust it is under extreme conditions (Bremer et al., 2019; Rudin, 2019).

In their study, McGovern et al. (2019) offer a range of statistical methods for interpreting machine learning models that can reveal hidden mechanisms of how these models make decisions. In particular, the authors propose four methods for interpreting deep neural networks: (1) saliency maps (Simonyan et al., 2014), (2) gradient-weighted class-activation maps (Selvaraju et al., 2017), (3) backward optimization (Olah et al., 2017), and (4) novelty detection (Wagstaff and Lee, 2018). Each method has its advantages and disadvantages (McGovern et al., 2019). However, all these methods can only be applied post hoc, to an already trained model, which creates a risk of biased evaluation: researchers may look for confirmation of what they already expect, that is, fall into confirmation bias (Arazo et al., 2019).
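The first of these methods rests on a simple idea: the sensitivity of the model output to each input, measured by the gradient, indicates which inputs matter for a given prediction. The sketch below illustrates that idea only. The tiny network and its weights are hypothetical, and the gradient is estimated by finite differences rather than by back-propagation as in Simonyan et al. (2014).

```python
import math

# Hypothetical fixed weights of a tiny network with 3 inputs and 2 hidden units.
# The third input has zero weight everywhere, so the output cannot depend on it.
HIDDEN_WEIGHTS = [[1.5, -0.5, 0.0],
                  [0.8, 0.1, 0.0]]
OUTPUT_WEIGHTS = [1.0, -2.0]


def model(inputs):
    """Forward pass: tanh hidden layer followed by a linear output."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in HIDDEN_WEIGHTS]
    return sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden))


def saliency(f, inputs, step=1e-5):
    """Absolute central-difference estimate of d f / d input_i at `inputs`."""
    scores = []
    for i in range(len(inputs)):
        up = list(inputs)
        down = list(inputs)
        up[i] += step
        down[i] -= step
        scores.append(abs(f(up) - f(down)) / (2 * step))
    return scores


scores = saliency(model, [0.2, -0.4, 1.0])
```

The unused third input receives a score of zero, while the other two receive positive scores that depend on the point at which the gradient is evaluated. This local, after-the-fact character is what the caution about post hoc methods refers to.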

The interpretability of deep neural networks will always remain in the focus of the scientific community: adapting deep learning technologies to critical applications undoubtedly requires a thorough understanding of their decision-making mechanisms. On the other hand, is it so essential to understand what a deep neural network bases its predictions on if its use can significantly increase the predictability of extreme hydrometeorological events? There is no doubt that the answers to this and other questions concerning the interpretability of deep neural networks will be obtained as deep learning becomes widespread in hydrology.

6. Development forecast

Since 1995, the Gartner Hype Curve has been published once a year by the Gartner group of companies. The report charts the forecast public expectations of the most advanced technologies (O'Leary, 2008). Conceptually, the Gartner Hype Curve can be presented as five consecutive periods that a technology goes through after its appearance: (1) the innovation trigger, (2) the peak of inflated expectations, (3) the trough of disillusionment, (4) the slope of enlightenment, and (5) the plateau of productivity (Fig. 4).

Fig. 4. Gartner Hype Curve for deep learning in hydrology, with the years 2012 to 2045 marked along the five stages (innovation trigger, peak of inflated expectations, trough of disillusionment, slope of enlightenment, plateau of productivity). Prepared by the author

Based on subjective experience gained in both research disciplines, the author proposes a hypothetical Gartner Hype Curve for deep learning in hydrology (Fig. 4). Based on this curve, it is possible to make a simplified forecast of the development of deep learning in hydrology for the period until 2050. At the moment (2020), we are in the middle of the innovation trigger (launch), which will last until 2025: within these five years, the technology will become popular and widespread. The period of inflated expectations will fall between 2025 and 2035: during these ten years, the technology will reach maximum popularity among researchers (by 2030), after which expectations will decline. By then, it will no longer have the same novelty and relevance; the technology will become commonplace and will acquire the status of a regular research tool. The subsequent trough of disillusionment will last five to ten years and will be associated with a natural decline in interest in a tool that has become commonplace. However, during the slope of enlightenment (2040-2045), deep learning is expected to acquire some qualitatively new applications that will allow it to reach the plateau of productivity (2050). For example, the classic machine learning model Random Forest (Ho, 1995) has undergone a similar development cycle and is now on its productivity plateau. It should be noted that the Random Forest model can calculate the importance of input variables for its predictions, which enabled it to reach the productivity plateau. This feature is the basis of many modern studies that identify and quantify the contribution of a particular factor to the dynamics of hydrometeorological values (Avanzi et al., 2020; Konapala and Mishra, 2020).

7. Conclusion

This study provides a brief overview of the field of deep neural networks and their applications in hydrology. It shows that deep neural networks are universal and efficient models of the next generation, the use of which will make it possible to exploit the potential of existing "big" environmental data. However, the low interpretability of deep neural networks will constrain their application in critical areas (e.g., natural hazard prediction), where confidence in the modeling system, based on the transparency and determinism of its simulations, plays a crucial role. A simplified forecast of the development of deep learning in hydrology was made based on the Gartner Hype Curve (Fig. 4). According to the proposed forecast, the peak of inflated expectations around deep neural networks in hydrology will occur between 2025 and 2035.

The availability of big data in hydrology and related disciplines, together with easy access to computational resources, makes deep learning technology especially promising for solving scientific and practical problems. There is no doubt that a growing number of studies demonstrating the high efficiency of deep neural networks for hydrological applications will be published in the coming years. The author hopes that Russian universities and scientific institutes will strengthen their support for research in this promising area, which will also bring us closer to the goals and objectives set in the Strategy of Scientific and Technological Development of the Russian Federation.

References

Addor, N., Newman, A. J., Mizukami, N. and Clark, M. P. (2017). The CAMELS data set: catchment attributes and meteorology for large-sample studies. Hydrology and Earth System Sciences (HESS), 21 (10), 5293-5313.

Adiwardana, D., Luong, M. T., So, D. R., Hall, J., Fiedel, N., Thoppilan, R., Yang, Z., Kulshreshtha, A., Nemade, G., Lu, Y. and Le, Q. V. (2020). Towards a Human-like Open-Domain Chatbot. ArXiv, [online] Available at: https://arxiv.org/abs/2001.09977 [Accessed Mar. 04, 2021].

Agrawal, S., Barrington, L., Bromberg, C., Burge, J., Gazen, C. and Hickey, J. (2019). Machine Learning for Precipitation Nowcasting from Radar Images. ArXiv, [online] Available at: https://arxiv.org/abs/1912.12132 [Accessed Mar. 04, 2021].

Arazo, E., Ortego, D., Albert, P., O'Connor, N. E. and McGuinness, K. (2019). Pseudo-labeling and confirmation bias in deep semi-supervised learning. ArXiv, [online] Available at: https://arxiv.org/abs/1908.02983 [Accessed Mar. 04, 2021].

Arsenault, R., Bazile, R., Ouellet Dallaire, C. and Brissette, F. (2016). CANOPEX: A Canadian hydrometeorological watershed database. Hydrological Processes, 30 (15), 2734-2736.

Avanzi, F., Johnson, R. C., Oroza, C. A., Hirashima, H., Maurer, T. and Yamaguchi, S. (2020). Insights into preferential-flow snowpack runoff using Random Forest. Water Resources Research, 55 (12), 10727-10746.

Ayzel, G. (2019). Does Deep Learning Advance Hourly Runoff Predictions? In: Proceedings of the V International Conference on Information Technologies and High-Performance Computing. [online] Khabarovsk: CEUR-WS, 62-70. Available at: http://ceur-ws.org/Vol-2426/paper9.pdf [Accessed Mar. 04, 2021].

Ayzel, G., Heistermann, M., Sorokin, A., Nikitin, O. and Lukyanova, O. (2019). All convolutional neural networks for radar-based precipitation nowcasting. Procedia Computer Science, 150, 186-192.

Ayzel, G., Kurochkina, L., Kazakov, E. and Zhuravlev, S. (2020). Runoff prediction in ungauged basins: benchmarking the efficiency of deep learning. In: IV Vinogradov Conference "Hydrology: from Learning to Worldview" in Memory of Outstanding Russian Hydrologist Yury Vinogradov. [online] Saint Petersburg: E3S Web Conf., 1-6. Available at: https://doi.org/10.1051/e3sconf/202016301001 [Accessed Mar. 04, 2021].

Bauer, P., Thorpe, A. and Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525 (7567), 47-55.

Bremer, L. L., Hamel, P., Ponette-Gonzalez, A. G., Pompeu, P. V., Saad, S. I. and Brauman, K. A. (2019). Who are we measuring and modeling for? Supporting multi-level decision-making in watershed management. Water Resources Research, 56 (1), e2019WR026011.

Deng, J., Dong, W., Socher, R., Li, L. J., Li, K. and Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition (CVPR2009). [online] Miami: IEEE, 248-255. Available at: https://ieeexplore.ieee.org/document/5206848 [Accessed Mar. 04, 2021].

Fuhrer, O., Chadha, T., Hoefler, T., Kwasniewski, G., Lapillonne, X., Leutwyler, D., Lüthi, D., Osuna, C., Schär, C., Schulthess, T. C. and Vogt, H. (2018). Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0. Geoscientific Model Development, 11 (4), 1665-1681.

Gauch, M., Mai, J. and Lin, J. (2019). The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction. ArXiv, [online] Available at: https://arxiv.org/abs/1911.07249 [Accessed Mar. 04, 2021].

Gunning, D., Stefik, M., Choi, J., Miller, T., Stumpf, S. and Yang, G. Z. (2019). XAI — Explainable artificial intelligence. Science Robotics, 4 (37), eaay7120.

Ho, T. K. (1995). Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition. [online] Montreal: IEEE, 1, 278-282 [Accessed Mar. 04, 2021].

Hornik, K., Stinchcombe, M. and White, H. (1989). Multilayer feedforward networks are universal approximators. Neural networks, 2 (5), 359-366.

Hinton, G. E., Osindero, S. and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural computation, 18 (7), 1527-1554.

Hutter, F., Kotthoff, L. and Vanschoren, J. (2019). Automated Machine Learning. New York, USA: Springer.

Konapala, G. and Mishra, A. (2020). Quantifying climate and catchment control on hydrological drought in continental United States. Water Resources Research, 56 (1), e2018WR024620.


Kratzert, F., Klotz, D., Brenner, C., Schulz, K. and Herrnegger, M. (2018). Rainfall-runoff modelling using long short-term memory (LSTM) networks. Hydrology and Earth System Sciences, 22 (11), 6005-6022.

Kratzert, F., Herrnegger, M., Klotz, D., Hochreiter, S. and Klambauer, G. (2019a). NeuralHydrology-Interpreting LSTMs in Hydrology. In: W. Samek, G. Montavon, A. Vedaldi, L. K. Hansen, K.-R. Müller, ed., Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Bern: Springer, 347-362.

Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A. K., Hochreiter, S. and Nearing, G. S. (2019b). Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning. Water Resources Research, 55 (12), 11344-11354.

Krizhevsky, A., Sutskever, I. and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems 25 (NIPS 2012). [online] Lake Tahoe: NIPS, 1097-1105. Available at: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf [Accessed Mar. 04, 2021].

Laloy, E., Hérault, R., Jacques, D. and Linde, N. (2018). Training-image based geostatistical inversion using a spatial generative adversarial neural network. Water Resources Research, 54 (1), 381-406.

LeCun, Y., Bengio, Y. and Hinton, G. (2015). Deep learning. Nature, 521 (7553), 436-444.

McCulloch, W. S. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics, 5 (4), 115-133.

McGovern, A., Lagerquist, R., John Gagne, D., Jergensen, G. E., Elmore, K. L., Homeyer, C. R. and Smith, T. (2019). Making the black box more transparent: Understanding the physical implications of machine learning. Bulletin of the American Meteorological Society, 100 (11), 2175-2199.

Minsky, M. and Papert, S. A. (1969). Perceptrons: An introduction to computational geometry. Boston: MIT Press.

Moraux, A., Dewitte, S., Cornelis, B. and Munteanu, A. (2019). Deep Learning for Precipitation Estimation from Satellite and Rain Gauges Measurements. Remote Sensing, 11 (21), 2463.

Olah, C., Mordvintsev, A. and Schubert, L. (2017). Feature visualization. Distill, 2 (11), e7.

O'Leary, D. E. (2008). Gartner's hype cycle and information system research issues. International Journal of Accounting Information Systems, 9 (4), 240-252.

Pan, B., Hsu, K., AghaKouchak, A. and Sorooshian, S. (2019). Improving precipitation estimation using convolutional neural network. Water Resources Research, 55 (3), 2301-2321.

Racah, E., Beckham, C., Maharaj, T., Kahou, S. E., Prabhat, M. and Pal, C. (2017). Extreme Weather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events. In: Advances in Neural Information Processing Systems 31 (NIPS 2017). [online] Long Beach: NIPS, 3402-3413. Available at: https://papers.nips.cc/paper/6932-extremeweather-a-large-scale-climate-dataset-for-semi-supervised-detection-localization-and-understanding-of-extreme-weather-events.pdf [Accessed Mar. 04, 2021].

Ranzato, M. A., Poultney, C., Chopra, S. and LeCun, Ya. (2007). Efficient learning of sparse representations with an energy-based model. In: Advances in neural information processing systems 21 (NIPS 2007). [online] Vancouver: NIPS, 1137-1144. Available at: https://papers.nips.cc/paper/3112-efficient-learning-of-sparse-representations-with-an-energy-based-model.pdf [Accessed Mar. 04, 2021].

Rao, Q. and Frtunikj, J. (2018). Deep learning for self-driving cars: chances and challenges. In: Proceedings of the 1st International Workshop on Software Engineering for AI in Autonomous Systems. [online] Gothenburg: IEEE, 35-38. Available at: https://ieeexplore.ieee.org/document/8452728 [Accessed Mar. 04, 2021].

Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J. and Carvalhais, N. (2019). Deep learning and process understanding for data-driven Earth system science. Nature, 566 (7743), 195-204.

Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65 (6), 386.

Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1 (5), 206-215.

Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1985). Learning internal representations by error propagation. [report] California Univ San Diego La Jolla, Inst for Cognitive Science, San Diego.

Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323 (6088), 533-536.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M. and Berg, A. C. (2015). Imagenet large scale visual recognition challenge. International journal of computer vision, 115 (3), 211-252.

Samek, W., Wiegand, T. and Müller, K. R. (2017). Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. [online] Available at: https://arxiv.org/abs/1708.08296 [Accessed Mar. 04, 2021].

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D. and Batra, D. (2017). Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision. [online] Venice: IEEE, 618-626. Available at: https://ieeexplore.ieee.org/document/8237336 [Accessed Mar. 04, 2021].

Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61, 85-117.

Shen, C. (2018). A transdisciplinary review of deep learning research and its relevance for water resources scientists. Water Resources Research, 54 (11), 8558-8593.

Shen, C., Laloy, E., Elshorbagy, A., Albert, A., Bales, J., Chang, F. J., Ganguly, S., Hsu, K. L., Kifer, D., Fang, Z. and Fang, K. (2018). HESS Opinions: Incubating deep-learning-powered hydrologic science advances as a community. Hydrology and Earth System Sciences, 22 (11), 5639-5656.

Shi, X., Gao, Z., Lausen, L., Wang, H., Yeung, D. Y., Wong, W. K. and Woo, W. C. (2017). Deep learning for precipitation nowcasting: A benchmark and a new model. In: Advances in neural information processing systems 30 (NIPS 2017). [online] Long Beach: NIPS, 5617-5627. Available at: https://papers.nips.cc/paper/7145-deep-learning-for-precipitation-nowcasting-a-benchmark-and-a-new-model [Accessed Mar. 04, 2021].

Simonyan, K., Vedaldi, A. and Zisserman, A. (2014). Deep inside convolutional networks: Visualising image classification models and saliency maps. [online] Available at: https://arxiv.org/abs/1312.6034 [Accessed Mar. 04, 2021].

Song, X., Zhang, G., Liu, F., Li, D., Zhao, Yu. and Yang, J. (2016). Modeling spatio-temporal distribution of soil moisture by deep learning-based cellular automata model. Journal of Arid Land, 8 (5), 734-748.

Song, T., Ding, W., Wu, J., Liu, H., Zhou, H. and Chu, J. (2020). Flash Flood Forecasting Based on Long Short-Term Memory Networks. Water, 12 (1), 109.

Sun, C., Shrivastava, A., Singh, S. and Gupta, A. (2017). Revisiting unreasonable effectiveness of data in deep learning era. In: Proceedings of the IEEE international conference on computer vision. [online] Venice: IEEE, 843-852. Available at: https://ieeexplore.ieee.org/document/8237359 [Accessed Mar. 04, 2021].

Sutskever, I., Vinyals, O. and Le, Q. V. (2014). Sequence to sequence learning with neural networks. In: Advances in neural information processing systems 27 (NIPS 2014). [online] Montreal: NIPS, 3104-3112. Available at: https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf [Accessed Mar. 04, 2021].

Tao, Yu., Gao, X., Hsu, K., Sorooshian, S. and Ihler, A. (2016). A deep neural network modeling framework to reduce bias in satellite precipitation products. Journal of Hydrometeorology, 17 (3), 931-945.

Trends.google.com. (2020). Google Trends' Official Website. [online] Available at: https://trends.google.com [Accessed Mar. 04, 2021].

Wagstaff, K. L. and Lee, J. (2018). Interpretable discovery in large image data sets. [online] Available at: https://arxiv.org/pdf/1806.08340.pdf [Accessed Mar. 04, 2021].

Webofscience.com. (2020). Web of Science Core Collection's Official Website. [online] Available at: https://apps.webofknowledge.com [Accessed Mar. 04, 2021].

Zhang, L., Zhang, L. and Du, B. (2016). Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geoscience and Remote Sensing Magazine, 4 (2), 22-40.

Zhang, J., Liu, P., Zhang, F. and Song, Q. (2018). CloudNet: Ground-based cloud classification with deep convolutional neural network. Geophysical Research Letters, 45 (16), 8665-8672.

Zhao, W. and Du, S. (2016). Learning multiscale and deep representations for classifying remotely sensed imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 113, 155-165.

Zoph, B., Vasudevan, V., Shlens, J. and Le, Q. V. (2018). Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. [online] Salt Lake City: IEEE, 8697-8710. Available at: https://ieeexplore.ieee.org/document/8579005 [Accessed Mar. 04, 2021].

Received: April 23, 2020 Accepted: December 14, 2020

Contact information: Georgy V. Ayzel — [email protected]

Deep neural networks in hydrology: the new generation of universal and efficient models*

G. V. Ayzel

State Hydrological Institute,

23, 2-ya liniya V. O., St. Petersburg, 199004, Russian Federation; University of Potsdam, Institute of Environmental Science and Geography, 24-25, Karl-Liebknecht-Str., Potsdam, 14476, Germany

For citation: Ayzel, G. V. (2021). Deep neural networks in hydrology: the new generation of universal and efficient models. Vestnik of Saint Petersburg University. Earth Sciences, 66 (1), 5-18. https://doi.org/10.21638/spbu07.2021.101

Over the last decade, deep learning, the area of machine learning concerned with artificial neural networks composed of many computational layers, has been changing the landscape of statistical model development in many research areas, such as image classification, machine translation, and speech recognition. The geographical sciences, and the field of land hydrology within them, are no exception to this movement. Recently, modern deep learning techniques and methods have been rapidly gaining popularity for solving a wide range of hydrological problems: modeling and forecasting of river runoff, regionalization of model parameters, assessment of available water resources, and identification of the factors driving recent changes in the water regime. This growing popularity of deep neural networks is due primarily to their high universality and efficiency. These qualities, together with the rapidly growing amount of accumulated environmental information and the increasing availability of computing facilities and resources, allow us to speak of deep neural networks as a new generation of mathematical models designed, if not to replace existing solutions, then to significantly enrich the field of geophysical process modeling. This paper presents a brief overview of the current state of the development and application of deep neural networks in hydrology. It also proposes a qualitative long-term forecast of the development of deep learning technology for hydrological modeling problems based on the Gartner Hype Curve, which describes in general terms the life cycle of modern technologies.

Keywords: deep neural networks, deep learning, machine learning, hydrology, modeling.

Received: April 23, 2020

Accepted for publication: December 14, 2020

Contact information:

Georgy Vladimirovich Ayzel: [email protected]

* This research was supported by the Russian Foundation for Basic Research (RFBR), grant No. 19-35-60005.
