Information and Signal Processing
UDC 004.383.8,004.85=111
E.N. Benderskaya
SOFT COMPUTING BASED ON NONLINEAR DYNAMIC SYSTEMS: POSSIBLE FOUNDATION OF NEW DATA MINING SYSTEMS
The article describes how the structure of an AI system is formed by an incoming image in nonlinear dynamic systems. The main steps of a new approach to image recognition, together with its strengths and limitations, are presented. An example of using a chaotic dynamic system to solve a clustering problem is shown.
SOFT COMPUTING; NONLINEAR DYNAMIC SYSTEM; AI; CHAOTIC DYNAMICS; IMAGE RECOGNITION; PATTERN RECOGNITION; TURING MACHINE.
Today there is a large number of problem-solving methods in various fields. Many of them are heuristic, and it is not always clear which type of problem a given method solves best or which parameter values yield the best result for a given problem.
When solving any problem, the question arises of which method will find a solution that satisfies the initial requirements. The classic approach is to decompose the initial problem (usually by functional decomposition) into subproblems and to find the best method for each subproblem. Researchers often restrict themselves to well-known or previously tested methods, so the choice among the many possible methods contains a random component.
For this reason, the second way to address the task is to develop AI systems that select the methods and parameters best suited to a particular problem. Where suitable AI systems already exist for a subject area, they are searched for and applied instead of developing a new one. These systems fall into three main categories:
• multi-level automated systems with an expert at the highest level (advisory systems);
• groups of methods that solve problems by majority rule;
• universal methods (whose results are less accurate than those of specialized methods).
One of the main problems in developing AI systems is ensuring that the system is sophisticated enough for the problem to be solved.
Such systems should be able to recognize both simple and sophisticated images, approaching the abilities of their biological counterparts.
One possible way of resolving this dilemma is to use multi-level decision-making systems that include a preliminary assessment of the problem. Usually the higher level is automated but not fully automatic and requires an expert. Advisory systems increase the number of methods considered beyond those the developer already knows. Such systems also take into account the accumulated knowledge about the features of each method and, based on the task input data and the requirements for the solution, can recommend the best method and its optimum settings. However, the final decision rests with the expert developer.
Because the process of selecting the optimal method for a problem is difficult to formalize, a somewhat redundant but fairly effective approach can be offered: create a system that includes the most suitable methods, apply all of them to the problem, and then select the best solution by some quality criterion or by majority rule [14]. Examples are the committees of decision rules in the theory of pattern recognition and the formal algebra defined on the set of these rules, developed by Yu. Zhuravlev [14].
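The majority-rule selection described above can be illustrated with a minimal Python sketch. The three threshold rules and the function name `committee_decision` are hypothetical stand-ins for independent recognition methods; this is not the algebraic construction of [14], only the simplest form of voting.

```python
from collections import Counter
from typing import Callable, Sequence


def committee_decision(methods: Sequence[Callable[[float], str]], x: float) -> str:
    """Return the label chosen by most methods.

    If several labels share the top count, the one that was voted first wins.
    """
    votes = [method(x) for method in methods]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical threshold rules standing in for independent methods.
rules = [
    lambda x: "A" if x < 0.5 else "B",
    lambda x: "A" if x < 0.4 else "B",
    lambda x: "A" if x < 0.7 else "B",
]

decision = committee_decision(rules, 0.45)  # votes are A, B, A
```

The redundancy mentioned in the text is visible here: every method is run on every input, and the extra cost buys robustness against the failure of any single rule.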
The other design approach, pursued by scientists from different fields, is the creation of a fairly universal method of solving the problem. Such a method would be suitable for a wide range of problem conditions and insensitive to deviations of the actual data from the data embedded a priori in the method. This approach yields universal methods whose results are, on average, close to the optimal solution. For a particular problem, however, the quality of the solution may be much lower than is potentially achievable. This applies to the main quality indicators, such as the probability of a correct solution and its accuracy, as well as to secondary indicators: complexity, memory cost and time consumed. The difficulty of finding a suitable method, as well as of developing a general method, lies in the fact that the complexity of the method (and therefore of the structure used) has to be adequate to the problem being solved. It would seem that the more complex the method, the wider the range of problems it can solve. However, simple problems solved by a complex method often produce unsatisfactory results. Figuratively speaking, the additional degrees of freedom in the method, being unconstrained by the input data, generate errors. This is most clearly demonstrated by a neural network with an excessive number of elements applied to a simple problem. Instead of learning and subsequently generalizing, the network does not extract patterns. It simply stores the input examples and reproduces all their features, which may reflect not the structure of the input space but the peculiarities of data measurement and acquisition. As a result, incorrect results are obtained on the test data (or, even worse, when the network is already in use).
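The effect of excessive degrees of freedom can be reproduced in a small sketch. As an assumption made for brevity, a polynomial fit replaces the neural network: a degree-9 polynomial has as many coefficients as there are training points and therefore memorizes the measurement error, while a degree-1 polynomial matches the true (linear) dependence. The data, the alternating error pattern and the function name `fit_errors` are all illustrative.

```python
import numpy as np


def fit_errors(degree: int) -> tuple[float, float]:
    """Fit a polynomial to 10 noisy samples of y = 2x + 1.

    Returns (training MSE against the noisy samples,
             test MSE against the noise-free dependence).
    """
    x_train = np.linspace(-1.0, 1.0, 10)
    # Deterministic alternating "measurement error" keeps the run reproducible.
    noise = 0.2 * (-1.0) ** np.arange(10)
    y_train = 2.0 * x_train + 1.0 + noise

    coeffs = np.polyfit(x_train, y_train, degree)

    x_test = np.linspace(-1.0, 1.0, 201)
    y_test = 2.0 * x_test + 1.0

    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return float(train_mse), float(test_mse)


simple_train, simple_test = fit_errors(1)     # structure adequate to the task
complex_train, complex_test = fit_errors(9)   # one coefficient per sample
```

The complex model reaches a lower training error because it reproduces the error pattern itself, and a higher test error for the same reason.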
Consider ways to ensure adequate structural complexity:
1) principle of adjustment (development) of the structure (method) for a particular problem;
2) principles of synergetic control;
3) principle of minimum description length.
In the first case, a neural network is developed for a specific application, which ensures that the complexity of the structure matches the complexity of the problem [7]. To implement this method, developed by A. Galushkin, one must supply the a priori information to the primary and secondary optimization functions, which are then used to determine the adequacy of the structure.
A. Kolesnikov has developed a whole theory of synergetic control [10, 11], which makes maximum use of the natural dynamics of the controlled object to achieve «nonviolent» control toward a given goal (subspace, trajectory or point). This approach ensures that the control system and the controlled object are of matching complexity. Similar to the first principle of adjusting to the problem is the principle of minimum description length (proposed by A. Potapov), illustrated in detail for image recognition tasks [13]. To compare problem-solving methods, a metric is constructed that corresponds to the length of the description of a method that solves the problem, and the method with the minimum value of the metric is selected.
Analysis of the main existing approaches leads to the idea of a new approach that combines all three principles to ensure that the system structure [7] is adequate to the complexity of the problem. The proposed approach is based on the assumption that the complexity of the system can be controlled directly by its response to dynamic changes in the input image. For this purpose, one can use a dynamic self-organizing system whose structure, and accordingly its complexity, is sensitive to changes in the input data. This involves the concepts of the complexity of the dynamic system and the complexity of the attractor.
By structural complexity in this case we mean not only the complexity of connections in the system itself, but also the complexity of the generated image in the phase space (attractor), which reflects the dynamics of the system and hence the complexity of the task.
Chaotic Dynamics — New Opportunities to Solve Complex Problems
Analysis of trends in the mathematical apparatus, from both the static and the dynamic point of view, also leads to the idea of using nonlinear dynamic systems that are highly sensitive to changes in the input space for the development of intelligent systems. Such a system contains in its dynamics all possible problem solutions, simple as well as complex. Unlike an artificially constructed universal approach, it is a universal system that organizes itself and adjusts to the solution.
Mathematical methods of nonlinear dynamics and chaos can be regarded as the next stage in the development of mathematical methods. There is a tendency to shift from deterministic models to more complex statistical models and then to chaotic models, which can be deterministic but, owing to nonlinearity and a large number of elements, exhibit complex and often unpredictable behavior.
Fig. 1 is a schematic representation of the development stages of the mathematical apparatus in terms of the complexity of the methods, the models and the objects they can describe. The stages are conventional in that they do not reflect when the models came into existence. Many of them were proposed long ago but could not be applied for lack of suitable computational tools for modelling at the time; now these models are quite popular.
The development of mathematical methods and models from the point of view of the logical apparatus (static focus, the left column of Fig. 1) can be represented as follows. In the beginning there was classical logic, which operated with crisp numbers and precise sets. This is largely why classical computational architectures require exact and specific input of the source data when performing calculations. Such input is impossible for some complex and hard-to-formalize problems.
Fig. 1. Evolution of formal methods: dealing with uncertainty (static focus: crisp-set logic, interval type-1 fuzzy logic, interval type-2 fuzzy logic; dynamic focus: static attractor, limit cycle attractor, torus attractor)
A significant breakthrough in information processing and in overcoming linguistic uncertainty was the introduction of the concept of «fuzzy sets» and the development of the theory of fuzzy logic. It became possible to perform operations on a whole interval at once: the element on which the operations are performed is now an interval instead of a single point.
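The shift from a point to an interval as the operand can be illustrated by plain interval arithmetic. This is a deliberately reduced sketch: it omits the membership functions of a full fuzzy set, and the class name `Interval` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] used as a single operand."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # The sum covers every x + y with x in self and y in other.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # The extreme products are always reached at the end points.
        products = [
            self.lo * other.lo, self.lo * other.hi,
            self.hi * other.lo, self.hi * other.hi,
        ]
        return Interval(min(products), max(products))


total = Interval(1.0, 2.0) + Interval(3.0, 5.0)
product = Interval(-1.0, 2.0) * Interval(3.0, 4.0)
```

One operation here processes every point of both operands at once, which is the property the text attributes to the transition from numbers to intervals.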
Further development of the theory of fuzzy sets and fuzzy logic is in some sense following an extensive path: introducing fuzzy sets of the second type, which are in effect an «interval on an interval», increasing the dimension, and so on. This certainly enhances the capabilities of devices that deal with complexly organized and uncertain data, but it is not as effective a step as the transition from a number to an interval.
From the point of view of dynamic models, the growing complexity of the mathematical apparatus can be traced through the attractors (attracting sets) of dynamic systems (right column of Fig. 1). First came models of systems whose dynamics converge to a set of individual points of attraction in the phase space (point attractors), then to a set of closed trajectories (attractors of the limit cycle and torus types), and finally to a set of trajectories that occupy a region of the phase space in the form of an infinite number of changing states (chaotic attractors).
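Three of these attractor types can be observed in a single one-dimensional system. The sketch below assumes the parameterized logistic map x(t+1) = r x(t)(1 - x(t)), which is not a system used in this article but is the standard textbook illustration; a torus attractor cannot appear in a one-dimensional map and is therefore not shown. The function name `attractor_size` and the rounding to four decimals are illustrative choices.

```python
def attractor_size(r: float, x0: float = 0.2,
                   transient: int = 1000, samples: int = 200) -> int:
    """Count the distinct states visited after the transient has died out.

    1 means a point attractor, a small number k means a cycle of period k,
    and a count close to `samples` indicates a chaotic attractor.
    """
    x = x0
    for _ in range(transient):          # discard the transient process
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(samples):            # record the states on the attractor
        x = r * x * (1.0 - x)
        seen.add(round(x, 4))
    return len(seen)


point = attractor_size(2.8)   # converges to a single fixed point
cycle = attractor_size(3.2)   # alternates between two values
chaos = attractor_size(4.0)   # never settles
```

The same rule with a single changed parameter thus moves through the right column of Fig. 1 from the simplest to the most complex attracting set.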
For static models, too, the next level of generalization, which extends the ability to perform calculations simultaneously on a whole set of possible solutions, is modelling with a chaotic attractor.
Trends in neural networks likewise point to the necessity of using the capacity of chaotic dynamic systems for solving AI problems and related tasks (e.g. coding and information transfer). A dynamic neural network with an irregular structure makes it possible to form a solution on the boundary between order and chaos, which corresponds to a variety of structures of the output space whose extremes are ordered dynamics (a cycle) and turbulent dynamics (no structure at all).
This is the next step in the development of neural network structures: not only are the weights of the network adjusted, but a collective solution is found by a set of nonlinear elements of the same type, each of which has unstable dynamics, while together, under the influence of the input data, they form a stable dynamic system.
Control of the Structural Complexity of the System
Control of chaos is often associated with the task of suppressing chaotic oscillations, that is, shifting the system to a stable periodic motion or to a state of equilibrium. In a broad sense, it is the transformation of the chaotic behavior of the system into regular behavior, or into chaotic behavior with different properties.
The problems arising in chaos control differ considerably from the traditional problems of automatic control [1, 2]. Instead of classic control goals, such as bringing the trajectory of the system to a set point or to a given motion, soft goals are set in chaos control: creating modes with partially specified properties, qualitatively changing the phase portrait of the system, synchronizing chaotic oscillations, and others. Unlike traditional control work, physical applications of chaos theory focus not on finding the most effective way of achieving a goal, but on investigating whether it can be achieved in principle and on determining the class of motions possible for the controlled physical system [1, 2].
The study of the dynamics of ensembles consisting of a large number of nonlinear elements is one of the main trends in the theory of nonlinear oscillations and waves. The main factor in the dynamics of ensembles of oscillating systems that leads to ordered space-time behavior is the synchronization of the ensemble elements. Numerous studies show that spatially distributed random oscillatory systems have many beneficial properties. In some of them, self-synchronization occurs at specific values of the system parameters. By self-synchronization we mean the process in which identical elements of the system, each characterized by chaotic dynamics and initialized in different ways, begin over time to oscillate synchronously without outside influence.
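Self-synchronization can be observed in a small numerical experiment. The sketch assumes a globally coupled map in which every element follows the chaotic map f(y) = 1 - 2y² and is pulled toward the ensemble mean with a hypothetical coupling strength; this is an illustrative ensemble, not the network studied later in the article. With coupling 0.8 the difference between any two elements shrinks by a factor of at most 0.8 per step (since |f(a) - f(b)| ≤ 4|a - b| on [-1, 1]), so the elements merge into one chaotic trajectory.

```python
import random


def chaotic_map(y: float) -> float:
    """Chaotic map on the interval [-1, 1]."""
    return 1.0 - 2.0 * y * y


def spread_after(coupling: float, n: int = 20,
                 steps: int = 300, seed: int = 1) -> float:
    """Iterate n identical chaotic elements from different initial states.

    Returns max - min of the outputs after `steps` iterations:
    a value near zero means the ensemble oscillates synchronously.
    """
    rng = random.Random(seed)
    y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        f = [chaotic_map(v) for v in y]
        mean_f = sum(f) / n
        # Each element mixes its own dynamics with the ensemble mean.
        y = [(1.0 - coupling) * fi + coupling * mean_f for fi in f]
    return max(y) - min(y)


synchronized = spread_after(coupling=0.8)   # strong coupling
independent = spread_after(coupling=0.0)    # no interaction
```

No external signal is applied in either run; the only difference is the parameter value, in line with the remark that self-synchronization occurs only for specific parameters of the system.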
When an external influence acts on the nonlinear dynamic system, we get a response that reflects both the external conditions of the problem and the input signals that characterize the problem being solved. With this approach, instead of creating a model for solving the problem, a target setting is given, that is, the required outcome of solving the problem, and it is assumed that the solution is not unique. In any case, in its form of presentation the solution is a variety, which can be interpreted either as a single solution or as a set of basic solutions.
Instead of the usual representation of the original problem as a set of functions, or the splitting of the system into separate parts, the synergetic approach synthesizes and studies the system as a whole. Changing the state of a particular element may not affect the state of the system as a whole; however, the joint dynamics of all the elements defines a unique state of the macroscopic system. This state of the system is the solution of the problem [3, 4].
It is precisely the occurrence of synchronization (collective behavior) that allows living systems to adapt, learn, and extract information in real time when solving computationally complex problems (owing to distributed information processing). Many elements with complex dynamics together produce efficient computing [8, 9].
A computing device that implements the proposed approach can be a set of asynchronous models of dynamic systems that interact with each other and combine the following properties: being hybrid, being asynchronous, forming clusters (no rigid centralization, dynamic clustering of related models), and being stochastic [8, 9]. O. Granichin developed a computational model for such a device based on the following set of basic parameters [8]:
• set of computational primitives (dynamic models H with parameters from the set Q);
• memory X: the total space of states of all models;
• tape S: a dynamic graph with a finite bit string s indicating which models are included at particular nodes;
• program G: the rules (or goals), defined on graph S, for «switching» the tape and the model parameters when the pair (x, q) appears at one of the «active» nodes in the switching set J;
• cycle: the time interval between successive switches;
• breakpoint set T.
One can speak of a generalization of the Turing machine [8], which can be represented as a tuple of interrelated components <A, H, Q, q, q0, X, x, x0, S, s, s0, J, G, T>, where A is the set of models (computational primitives); H is the evolution operator; Q is the set of states (parameter values); X is the memory; S is the generic tape (graph); J is the switching set; G is the program (goals); T is the breakpoint set.
The main stages in using the complex operating modes of chaotic systems to solve practical problems can be represented by the following sequence. The initial state is given and the goal is defined: to reach a certain state. It is assumed that the goal can be achieved by following a trajectory that passes near one of the attractors. The system is then started and the input signals corresponding to the task are fed to it. After a transient process the system settles onto an attractor. A search is made for a trajectory that is reachable by a small perturbation of the system and passes close enough to the desired point or sequence of points corresponding to the desired state of the system. If no such trajectory is found, random input is fed into the system to make it jump to another attractor, and this is repeated until the goal is achieved.
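The sequence above can be sketched for the simplest possible case. The code assumes a single one-dimensional chaotic map in place of a multi-attractor system, so the "jump to another attractor" is reduced to a random restart, and "reachable by a small perturbation" is reduced to coming within a tolerance `tol` of the goal. All names and parameter values are hypothetical.

```python
import random
from typing import Optional, Tuple


def chaotic_step(y: float) -> float:
    """One step of the chaotic map on [-1, 1]."""
    return 1.0 - 2.0 * y * y


def reach_goal(target: float, tol: float = 0.01, budget: int = 5000,
               max_restarts: int = 20, seed: int = 0
               ) -> Optional[Tuple[float, int]]:
    """Wait until the free-running trajectory passes within `tol` of `target`.

    Returns (state, number of steps taken), or None if every attempt fails.
    A perturbation no larger than `tol` would then place the system exactly
    on the target state.
    """
    rng = random.Random(seed)
    y = rng.uniform(-1.0, 1.0)           # initial state
    steps = 0
    for _ in range(max_restarts):
        for _ in range(budget):
            if abs(y - target) <= tol:   # goal is within a small perturbation
                return y, steps
            y = chaotic_step(y)
            steps += 1
        # Goal not reached: random input moves the system elsewhere.
        y = rng.uniform(-1.0, 1.0)
    return None


result = reach_goal(target=0.3)
```

The sketch relies on the fact that a chaotic trajectory visits every neighborhood of its attractor, so the control effort is spent on waiting rather than on forcing the system.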
The Chaotic Neural Network — Example of a Structure and Dynamics Determined by Input Images
In chaotic dynamics, structures are produced under the influence of an external perturbation, and the dynamics may initially include the entire set of possible options. Chaotic systems allow us to move to the next level of aggregation in the concept of the computing process and to perform calculations simultaneously on a whole set of possibilities. This set is shaped by external signals, which provides adequate complexity. In many ways this is similar to the principle used in quantum computing, where the system holds the entire set of solutions until the answer is found.
We consider a relatively simple and clear example of using external images to form the structure of a system: the use of various metrics based on the input data to calculate the connection matrix of a chaotic neural network (CNN) [5, 6]. The network is capable of solving the clustering problem on the basis of the input data alone, without any additional or prior information about the task.
A feature of this oscillatory neural network is the chaotic dynamics of the individual neuron outputs and their mutual self-clustering, which does not depend on the initial conditions. This allows the CNN to solve clustering problems with minimal prior information about the objects to be grouped into clusters. One can draw an analogy between the formation of functional and logical structures in a CNN and the self-generated functional clusters of activity that the brain forms to solve different problems.
CNN is a one-layer recurrent network in which the elements are connected to «each other» without having any connection back to «themselves»:
$$y_i(t+1) = \frac{1}{C_i} \sum_{j \neq i}^{N} W_{ij}\, f\big(y_j(t)\big), \quad t = 1, \dots, T, \tag{1}$$

$$f\big(y(t)\big) = 1 - 2y^2(t), \tag{2}$$

$$W_{ij} = \exp\!\left(-\frac{|x_i - x_j|^2}{2a}\right), \tag{3}$$

where $C_i = \sum_{j \neq i} W_{ij}$, $i, j = 1, \dots, N$; $a$ is the scaling constant, computed by the algorithm presented in [5, 6]; $W_{ij}$ is the connection strength (weight) between neurons $i$ and $j$; $N$ is the number of neurons, which is equal to the number of points in the input image, represented in the form $X = (x_1, x_2, \dots, x_m)$; $m$ is the dimension of the image space; $T$ is the simulation time. As shown in [5], for the nonlinear transformation $f(y(t))$ one can use any mapping that generates chaotic oscillations; however, the logistic mapping (2) is preferred.
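Equations (1)-(3) translate almost directly into array operations. The following NumPy sketch is a minimal illustration under stated assumptions: the scaling constant `a` is passed in by hand rather than computed by the algorithm of [5, 6], the initial outputs are drawn uniformly from [-1, 1], and the function names `cnn_weights` and `cnn_run` are hypothetical.

```python
import numpy as np


def cnn_weights(points: np.ndarray, a: float) -> np.ndarray:
    """Connection matrix (3): W_ij = exp(-|x_i - x_j|^2 / 2a), no self-links."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    w = np.exp(-sq_dist / (2.0 * a))
    np.fill_diagonal(w, 0.0)             # the sum in (1) excludes j = i
    return w


def cnn_run(points: np.ndarray, a: float, steps: int, seed: int = 0) -> np.ndarray:
    """Iterate (1) with the logistic mapping (2).

    Returns an array of shape (steps, N) with the output of every neuron
    at every time step.
    """
    w = cnn_weights(points, a)
    c = w.sum(axis=1)                    # normalization C_i
    rng = np.random.default_rng(seed)
    y = rng.uniform(-1.0, 1.0, size=len(points))   # assumed initial outputs
    history = np.empty((steps, len(points)))
    for t in range(steps):
        y = (w @ (1.0 - 2.0 * y ** 2)) / c
        history[t] = y
    return history


# Hypothetical input image: two compact groups of points in the plane.
image = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
outputs = cnn_run(image, a=0.5, steps=50)
```

Note that the input image enters only through the weights: no labels and no number of clusters are supplied. If a point lies so far from all others that its weights underflow to zero, $C_i$ becomes zero and the division fails, so in practice `a` has to be matched to the scale of the data.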
Training the CNN consists of assigning the weights, which are computed from the relations between the points of the input image (3) and uniquely determine the field that acts on all the neurons of the network. As this field is not uniform, analysing and solving the system of difference equations (1) is much more difficult.
Study of the dynamics of ensembles of systems consisting of a large number of nonlinear elements is one of the main directions of development of nonlinear oscillations and waves theory. The main factor in the dynamics of ensembles of oscillating systems, which leads to ordered space-time behavior of the ensemble, is the synchronization of the elements [12].
Analysis of the CNN dynamics for different images (input structures that reflect the impact of the external environment on the system) with the same system parameters reveals a distinct «music» of oscillations in each of the clusters formed in the system. In Fig. 2 one can clearly distinguish ensembles of elements whose output oscillations differ markedly in character, which indicates the existence of self-generated clusters in the system and the presence of fragmentary synchronization [5]. With this type of synchronization, the instantaneous outputs of neurons belonging to the same cluster do not coincide in either amplitude or phase, and there is no fixed phase shift between any two sequences. By fragmentary cluster synchronization we mean synchronization in the sense that each cluster is characterized by a unique «melody» of oscillations, encoded in the temporal sequence of the output values of its neurons. The proposed method for detecting cluster synchronization is described in detail in [5, 6]; it is based on an analysis of the relative distance between the instantaneous output values of each pair of neurons over a varying time interval.
The difficulty of using chaos and developing a chaos logic is also reflected in the fact that the term «chaos» covers several fundamentally different modes of a system. To separate useful chaos from the rest, the expression «deterministic chaos» has come into use. The word «deterministic» was introduced to emphasize the repeatability of experiments, which is what makes it possible to perform calculations with such chaos and to apply it in practice.
Fig. 2. Fragmentary synchronization for two different input images (one can see the «music» of oscillations of each cluster)
The need to address increasingly complex problems, and the opportunities provided by the synergetic principles of analysis and synthesis, lead to the idea that for complex problems with manifest emergent properties the more effective approach is holistic analysis of the system as a whole, without division. This is not a departure from functional decomposition but a significant addition to it, since in fragmenting the system we often lose the unique features associated with system-level patterns.
Thus, we propose a general approach to solving different tasks: reducing the original problem to a control problem, an optimization problem, or a pattern recognition problem. This approach resembles the neural network approach in that problems of different types are reduced to the same type of problem, solvable by homogeneous network structures. In this approach, the complexity of the method (and of the system that implements it) is adequate to the complexity of the problem being solved, just as in the formal theory of neural network structure synthesis through functions of primary and secondary optimization [7]. On the other hand, a single approach makes it easy to combine operations. For information systems, for example, these are the perception, storage and actual processing of information. The path from association to storage and subsequent recognition is consistent with current ideas on how living systems solve problems.
REFERENCES / СПИСОК ЛИТЕРАТУРЫ
1. Andrievskii B.R., Fradkov A.L. Control of chaos: methods and applications. I. Methods. Automation and Remote Control. 2003, Vol. 64, Iss. 5, pp. 673-713.
2. Andrievskii B.R., Fradkov A.L. Control of chaos: methods and applications. II. Applications. Automation and Remote Control. 2004, Vol. 65, Iss. 4, pp. 505-533.
3. Benderskaya E.N. Nonlinear Trends in Modern Artificial Intelligence: A New Perspective. Beyond AI: Interdisciplinary Aspects of Artificial Intelligence. Topics in Intelligent Engineering and Informatics. Springer, 2013, Vol. 4, pp. 113-124.
4. Benderskaya E.N., Zhukova S.V. Multidisciplinary Trends in Modern Artificial Intelligence: Turing's Way. AIECM, Turing 2012, Book Chapters: Artificial Intelligence, Evolutionary Computation and Metaheuristics. Springer, 2013, pp. 320-343.
5. Benderskaya E.N., Zhukova S.V. Fragmentary Synchronization in Chaotic Neural Network and Data Mining. HAIS 2009. LNCS. Springer, Heidelberg, 2009, Vol. 5572, pp. 319-326.
6. Benderskaya E.N., Zhukova S.V. Clustering by Chaotic Neural Networks with Mean Field Calculated Via Delaunay Triangulation. HAIS 2008. LNCS (LNAI). Springer, Heidelberg, 2008, Vol. 5271, pp. 408-416.
7. Galushkin A.I. Neural Networks Theory. Springer, 2007, 396 p.
8. Granichin O.N., Vasil'ev V.I. Computational model based on evolutionary primitives. Turing machine generalization. Internat. Journal of Nanotechnology and Molecular Computation. 2010, Vol. 2, No. 2, pp. 30-43.
9. Granichin O.N., Izmakova O.A. A Randomized Stochastic Approximation Algorithm for Self-Learning. Automation and Remote Control. 2005, Vol. 66, Iss. 8, pp. 1239-1248.
10. Kolesnikov A., Veselov G., Monti A., Ponci F., Santi E., Dougal R. Synergetic synthesis of dc-dc boost converter controllers: theory and experimental analysis. Proceedings IEEE Applied Power Electronics Conference and Exposition APEC 17th. Dallas, TX, 2002, pp. 409-415.
11. Kolesnikov A., Veselov G., Popov A., Kolesnikov A., Kuzmenko A., Dougal R.A., Kondratiev I. Synergetic approach to the modelling of power electronic systems. IEEE 7th Workshop on Computers in Power Electronics. Blacksburg, VA, USA, 2000, pp. 259-262.
12. Pikovsky A., Rosenblum M., Kurths J. Synchronization: A Universal Concept in Nonlinear Sciences. Cambridge Nonlinear Science Series. Cambridge University Press, 2003, 432 p.
13. Potapov A.S. Principle of representational minimum description length in image analysis and pattern recognition. Pattern Recognition and Image Analysis. (Advances in Mathematical Theory and Applications). 2012, Vol. 22, No. 1, pp. 82-91.
14. Zhuravlev Yu.I. An algebraic approach to recognition or classification problems. Pattern Recognition and Image Analysis. (Advances in Mathematical Theory and Applications). 1988, Vol. 8, p. 59.
BENDERSKAYA, Elena N. St. Petersburg State Polytechnical University. 195251, Politekhnicheskaya Str. 29, St. Petersburg, Russia. E-mail: [email protected]
BENDERSKAYA Elena Nikolaevna, Associate Professor of the Department of Computer Systems and Software Technologies, St. Petersburg State University, Candidate of Technical Sciences. 195251, Russia, St. Petersburg, Politekhnicheskaya Str. 29. E-mail: [email protected]
© St. Petersburg State Polytechnical University, 2014