WAVELET TYPE SELECTION IN THE PROBLEM OF ANOMALY INTRUSIONS DETECTION IN COMPUTER NETWORKS USING MULTIFRACTAL ANALYSIS METHODS
Sheluhin Oleg Ivanovich,
Head of department of Information security and Automation, professor, D.Sc (Techn), MTUCI, Moscow, Russia, [email protected]
Sirukhi Joseph Were,
Enerproject Group, Deputy Head of Technical Service Department, PhD, MTUCI, Moscow, Russia, [email protected]
Pankrushin Alexandr Vladimirovich,
PhD student of department of Information security and Automation, MTUCI, Moscow, Russia, [email protected]
Keywords: Hurst parameter, long-range dependence, packet traffic, parameter estimation, telecommunications networks, time-scale analysis, online analysis, wavelet decomposition, sliding window analysis.
The article examines the specifics of detecting anomalous bursts of telecommunications traffic in real time using fractal methods. The method is based on a running estimate of the fractal properties of the traffic, obtained with a sliding window and multiresolution wavelet analysis. The fractal properties of the traffic are characterized by the Hurst parameter. Multiresolution wavelet analysis is implemented with the discrete wavelet transform of scaling processes. As a result, network traffic is mapped into the time-frequency domain, where it can be analyzed in detail. The wavelet transform can thus be viewed as a way of observing time sequences of all durations simultaneously on different scales. Because in the detection problem the amount of traffic analyzed is limited by the size of the analysis window, the number of available decomposition levels differs between mother wavelets. These features are illustrated with two wavelet types, Haar and Daubechies. These wavelet systems were chosen because they are orthonormal and form a basis in which the number of vanishing moments can easily be increased.
A comparative analysis of the reliability of detecting anomalous traffic bursts is carried out for the two wavelet types, using anomalies caused by the Neptune (SYN-flood) attack as an example. The proposed approach to estimating self-similarity parameters in real time is needed, for example, to build effective intrusion detection for computer networks, and it offers wide opportunities for analyzing the properties of telecommunications signals. It is shown that in the detection of anomalous traffic bursts the choice of the mother wavelet type is essential. It is also shown that the size of the analysis window is limited on the one hand by the accuracy requirements of the Hurst parameter estimator and on the other hand by the anomaly duration. Recommendations are given on the choice of wavelet type for detecting anomalous intrusions into computer networks using multifractal analysis methods.
For citation:
Sheluhin O.I., Sirukhi J.W., Pankrushin A.V. Wavelet type selection in the problem of anomaly intrusions detection in computer networks using multifractal analysis methods. T-Comm. 2015. Vol. 9. No. 4, pp. 88-92.
Problem formulation
It was shown in [1, 2, 8, 9] that, for estimating the Hurst parameter H, the only property of the wavelet that matters is the number of vanishing moments. It makes no difference whether the wavelet is symmetric or not, or whether its basis is orthogonal, semi-orthogonal or biorthogonal [3, 4, 10]. Anomaly detection, however, has its own features that stem from the formulation of the problem [11, 12]: they are related to the size of the analyzed sample.
Because in the detection problem the amount of traffic analyzed is limited by the size of the analysis window, the number of available decomposition levels differs between mother wavelets. As a result, the accuracy of anomaly detection depends on the choice of wavelet type. We illustrate this with two types of wavelets, Haar and Daubechies. These wavelet systems were chosen because they are orthonormal and form a basis in which the number of vanishing moments can easily be increased.
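As a point of reference, a wavelet-based estimate of the Hurst parameter of the kind discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a hand-written orthonormal Haar transform, regresses log2 of the q-th absolute moment of the detail coefficients on the octave j over [j1, j2] with weights equal to the number of coefficients, and converts the slope with H = slope/q + 1/2, which presumes a stationary long-range dependent series such as packet counts. The function names, the weighting scheme and the omission of the bias correction used in [3, 4] are all assumptions made for the sketch.

```python
import math
import random


def haar_details(x, levels):
    """Detail coefficients d_1..d_levels of an orthonormal Haar DWT (hypothetical helper)."""
    approx = list(x)
    details = []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        n = len(approx) // 2
        if n == 0:
            break  # not enough samples left for another octave
        detail = [(approx[2 * k] - approx[2 * k + 1]) / root2 for k in range(n)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / root2 for k in range(n)]
        details.append(detail)
    return details


def hurst_wavelet(x, j1, j2, q=2):
    """Estimate H from the slope of log2 E|d_j|^q versus octave j on [j1, j2]."""
    if not 1 <= j1 < j2:
        raise ValueError("need 1 <= j1 < j2")
    details = haar_details(x, j2)
    if len(details) < j2:
        raise ValueError("window too short for the requested octaves")
    octaves, logs, weights = [], [], []
    for j in range(j1, j2 + 1):
        d = details[j - 1]
        moment = sum(abs(c) ** q for c in d) / len(d)
        octaves.append(j)
        logs.append(math.log2(moment))  # fails for an all-zero octave (constant input)
        weights.append(len(d))          # assumption: weight by coefficient count
    w_sum = sum(weights)
    j_mean = sum(w * j for w, j in zip(weights, octaves)) / w_sum
    y_mean = sum(w * y for w, y in zip(weights, logs)) / w_sum
    num = sum(w * (j - j_mean) * (y - y_mean) for w, j, y in zip(weights, octaves, logs))
    den = sum(w * (j - j_mean) ** 2 for w, j in zip(weights, octaves))
    slope = num / den
    return slope / q + 0.5


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    print(hurst_wavelet(noise, 1, 6, q=2))
```

For uncorrelated noise the estimate should lie near 0.5, while values approaching 1 indicate stronger long-range dependence. Passing q = 4 or q = 6 gives the higher-order estimates used in the multifractal setting, under the same slope convention.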
Experimental description
We consider two cases that show how the Hurst parameter H is estimated when different types of wavelets are used for anomaly detection.
In the first case the data are obtained by sliding-window analysis [5, 6] with a window of 1000 samples, regardless of the wavelet type. In the second case we require an equal number of decomposition levels for any choice of mother wavelet.
In the first case, with the Haar mother wavelet, the number of decomposition levels for the selected window size is 9. With the Daubechies wavelet, the number of possible decomposition levels for the same 1000-sample window was found to be only 5. Since the Hurst parameter estimation error depends directly on the number of available decomposition levels, the error, characterized by the sample variance, is significantly lower for the Haar wavelet than for the Daubechies wavelet.
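The dependence of the number of octaves on window size and filter length can be checked with a short calculation. The sketch below assumes a widely used rule for the maximum useful DWT depth, namely the largest J for which (L - 1) * 2^J does not exceed the window length, where L is the number of filter taps (2 for Haar, 2M for a Daubechies wavelet dbM). The article does not state which criterion the authors applied, so the function names and the rule itself are assumptions.

```python
def max_dwt_level(n_samples, filter_len):
    """Largest J such that (filter_len - 1) * 2**J <= n_samples (assumed rule)."""
    if filter_len < 2:
        raise ValueError("filter must have at least 2 taps")
    level = 0
    while (filter_len - 1) * 2 ** (level + 1) <= n_samples:
        level += 1
    return level


def min_window_for_level(levels, filter_len):
    """Smallest window length that admits the given number of octaves under the same rule."""
    return (filter_len - 1) * 2 ** levels


if __name__ == "__main__":
    print("Haar, 1000 samples:", max_dwt_level(1000, 2))
    print("12-tap filter, 1000 samples:", max_dwt_level(1000, 12))
    print("12-tap filter, 2900 samples:", max_dwt_level(2900, 12))
    print("window needed for 8 octaves, 12 taps:", min_window_for_level(8, 12))
```

Under this rule the Haar wavelet gives 9 octaves for a 1000-sample window, in agreement with the text. A 12-tap filter such as db6 gives 6 octaves for 1000 samples rather than the 5 reported, so the authors presumably used a stricter criterion or a longer filter in the first case; the same rule gives 8 octaves for a 2900-sample window.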
Experimental Results
Fig. 1 shows the results of estimating the self-similarity index in a sliding window for different values of q. The network traffic is presented as a realization of length N = 10^4 containing an anomaly caused by the Neptune (SYN-flood) attack. The duration of the attack is shown in the upper panel and marked by a window.
The Hurst parameter was estimated online with a window of N0 = 10^3 samples shifted in steps of Δ = 10. Following the recommendations for calculating the Hurst parameter given in [3, 4], the probabilities of correct anomaly detection and of false alarm were calculated for both wavelets and for two variants of the Hurst parameter calculation: over all available decomposition levels (octaves), that is, the interval [j1, j2] = [1:N], and over the selected interval [j1, j2] = [3:N], which excludes the wavelet coefficients of the first two decomposition levels (N here denotes the highest available octave).
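The sliding-window procedure described above can be sketched as follows. The window length and step are the values quoted in the text; the function names and the idea of passing the estimator as a callable are assumptions made for illustration, and any per-window statistic (for example a wavelet-based Hurst estimate) can be plugged in.

```python
def sliding_windows(n_samples, window, step):
    """Yield (start, stop) index pairs of consecutive analysis windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start + window <= n_samples:
        yield start, start + window
        start += step


def sliding_estimates(series, window, step, estimator):
    """Apply `estimator` to each window and return the list of current estimates."""
    return [estimator(series[a:b]) for a, b in sliding_windows(len(series), window, step)]


if __name__ == "__main__":
    # Values from the text: N = 10^4 samples, N0 = 10^3, step = 10.
    spans = list(sliding_windows(10_000, 1_000, 10))
    print(len(spans), spans[0], spans[-1])
```

Each estimate is attributed to the whole window, so an anomaly that is short compared with the window affects many consecutive estimates, which is one reason the window size has to be matched to the anomaly duration.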
Fig. 1. Traffic realization during anomaly occurrence and Hurst parameter estimates for different values of q using Haar wavelets. Analysis window parameters: Δ = 10, N0 = 10^3
The numerical results for Haar wavelets for the first variant of the calculation ([j1, j2] = [1:9]) are given in Table 1.
Table 1
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Haar wavelets, case 1, [j1, j2] = [1:9]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.92   0.7    0.64    0.8    0.62   0.56
Probability of correct detection   0.88   0.91   0.74    0.98   0.97   0.97
Probability of false alarm         0.12   0.18   0.15    0.68   0.48   0.36
As seen in the histograms of the calculated Hurst parameter (Fig. 2a-c), before the anomaly the distribution of the Hurst parameter estimates is close to the normal law.
When the window passes through the anomaly region, a second peak clearly appears in the histograms, which illustrates the possibility of estimating the moment at which the anomaly occurs.
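The appearance of a second peak can be checked numerically with a simple histogram. The sketch below is an illustration only: the bin layout, the `min_count` cutoff and the idea of counting separated groups of populated bins as modes are assumptions, not the procedure used in the article, and the sample values are invented for the demonstration.

```python
def histogram(values, lo, hi, bins):
    """Counts of values in `bins` equal-width bins on [lo, hi]; values outside are ignored."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / width), bins - 1)  # put v == hi into the last bin
            counts[idx] += 1
    return counts


def count_modes(counts, min_count=1):
    """Number of separated runs of bins holding at least `min_count` values."""
    modes = 0
    inside_run = False
    for c in counts:
        populated = c >= min_count
        if populated and not inside_run:
            modes += 1
        inside_run = populated
    return modes


if __name__ == "__main__":
    before = [0.55, 0.57, 0.63]                 # invented estimates, no anomaly
    during = before + [0.65, 0.66, 0.93, 0.95, 0.97]  # invented estimates, with anomaly
    print(count_modes(histogram(before, 0.0, 1.0, 10)))
    print(count_modes(histogram(during, 0.0, 1.0, 10)))
```

A change from one mode to two as the window moves over the data is the numerical counterpart of the second peak visible in the histograms.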
The numerical results in Table 1 show a good level of correct detection (74-91%) with a low false alarm probability (12-18%) at the level alpha = 0.01. As the same table shows, raising alpha to 0.05 naturally increases the probability of correct detection (up to 98%) but also significantly increases the probability of false alarm (up to 68%).
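The trade-off between correct detection and false alarm can be made concrete with a small calculation. The article does not spell out how the threshold is derived from alpha, so the sketch below assumes that it is the upper (1 - alpha) empirical quantile of the estimates from the training interval and that a window is flagged when its estimate exceeds the threshold. Both rules and all function names are assumptions.

```python
import math


def threshold_from_alpha(training_estimates, alpha):
    """Upper (1 - alpha) empirical quantile of the training estimates (assumed rule)."""
    ordered = sorted(training_estimates)
    k = math.ceil((1.0 - alpha) * len(ordered)) - 1
    k = min(max(k, 0), len(ordered) - 1)
    return ordered[k]


def detection_rates(estimates, is_anomaly, threshold):
    """Return (P_correct_detection, P_false_alarm) for the rule `estimate > threshold`."""
    hits = misses = false_alarms = quiet = 0
    for h, anomalous in zip(estimates, is_anomaly):
        flagged = h > threshold
        if anomalous:
            hits += flagged
            misses += not flagged
        else:
            false_alarms += flagged
            quiet += not flagged
    p_detect = hits / (hits + misses) if hits + misses else float("nan")
    p_false = false_alarms / (false_alarms + quiet) if false_alarms + quiet else float("nan")
    return p_detect, p_false


if __name__ == "__main__":
    est = [0.5, 0.6, 0.95, 0.97, 0.7, 0.93]          # invented window estimates
    truth = [False, False, True, True, False, False]  # invented ground truth
    print(detection_rates(est, truth, 0.92))
```

Under this assumed rule a larger alpha lowers the threshold, so more anomalous windows are flagged but more normal windows are flagged as well, which matches the direction of the changes seen in the tables.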
Numerical results for the second variant ([j1, j2] = [3:9]) are given in Table 2.
Fig. 2. Histograms of the calculated Hurst parameter distribution at the end of the training interval before the anomaly occurs (left) and during the anomaly period (right) for different orders q: a) q = 2, b) q = 4, c) q = 6
Table 2
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Haar wavelets, case 1, [j1, j2] = [3:9]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.92   0.77   0.7     0.87   0.73   0.67
Probability of correct detection   0.91   0.75   0.72    0.95   0.85   0.76
Probability of false alarm         0.39   0.28   0.27    0.52   0.36   0.31
Analysis of the histograms of the calculated Hurst parameter distribution during the anomaly (Fig. 3a-c) shows that the results are similar to those of the first variant: the distribution contains several maxima, which makes it possible to detect the anomaly.
The numerical results in Table 2 show that the probability of anomaly detection remains high, 72-91%, as in the first variant, but the false alarm probability rises to 27-39% at the same level alpha = 0.01.
Turning to the results obtained with Daubechies wavelets (numerical results for both variants are given in Tables 3 and 4), we note the absence of multiple peaks in the histograms of the probability distribution for the variant with [j1, j2] = [1:5] (Fig. 4a-c) and the correspondingly low probability of correct detection (2-26%).
Fig. 3. Histograms of the calculated Hurst parameter distribution at the time of anomaly occurrence, calculated for orders q = 2, 4, 6
Table 3
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Daubechies wavelets, case 1, [j1, j2] = [1:5]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.91   0.73   0.66    0.86   0.67   0.67
Probability of correct detection   0.02   0.23   0.26    0.27   0.4    0.37
Probability of false alarm         0.12   0.15   0.17    0.19   0.25   0.24
Table 4
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Daubechies wavelets, case 1, [j1, j2] = [3:5]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.99   0.92   0.89    0.9    0.81   0.77
Probability of correct detection   0.18   0.14   0.09    0.43   0.34   0.32
Probability of false alarm         0.17   0.14   0.13    0.31   0.21   0.2
For the calculations with [j1, j2] = [3:5], the probability distribution histogram does contain closely spaced peaks, but they are too weakly pronounced, which is reflected in the numerical results in Table 4 (a low probability of correct detection, 9-18%).
Fig. 4. Histograms of the calculated Hurst parameter distribution at the end of the training interval before the anomaly occurs (left) and during the anomaly (right) for different orders q: a) q = 2, b) q = 4, c) q = 6
For the second case, which requires an equal number of decomposition levels for any choice of mother wavelet type, the analysis window for the Daubechies wavelet (db6) had to be increased to N0 = 2900 samples, with Δ = 10, so that the number of octaves equals 8.
Fig. 5. Traffic realization during anomaly occurrence and Hurst parameter estimates for different values of q using Daubechies wavelets (db6). Analysis window parameters: Δ = 10, N0 = 2900
The numerical characteristics of anomaly detection reliability obtained with the Hurst parameter as the informative parameter and Daubechies wavelets (db6) are shown in Tables 5 and 6.
Analysis of the numerical data shows that the detection accuracy increased significantly compared with the window giving only 5 octaves, but the probability of false alarm is unacceptably high because the size of the analysis window is out of proportion to the duration of the attack.
Table 5
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Daubechies wavelets, case 2, [j1, j2] = [1:8]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.99   0.63   0.57    0.78   0.6    0.55
Probability of correct detection   0      0.48   0.3     0.84   0.71   0.39
Probability of false alarm         0      0.39   0.27    0.94   0.6    0.37
Table 6
Characteristics of the reliability of detecting the anomaly caused by the Neptune attack using Daubechies wavelets, case 2, [j1, j2] = [3:8]
                                   Alpha = 0.01          Alpha = 0.05
                                   q=2    q=4    q=6     q=2    q=4    q=6
Threshold                          0.89   0.53   0.47    0.71   0.51   0.44
Probability of correct detection   0.4    0.51   0.45    0.74   0.56   0.5
Probability of false alarm         0.39   0.74   0.57    0.98   0.77   0.67
Fig. 8. Histograms of the calculated Hurst parameter distribution at the end of the training interval before the anomaly occurs (left) and during the anomaly (right) for different orders q: a) q = 2, b) q = 4, c) q = 6
As can be seen from Fig. 8, the histogram of the calculated Hurst parameter distribution contains two pronounced maxima, which accounts for the better numerical indicators in Tables 5 and 6.
Thus, the numerical analysis leads to the conclusion that in detection problems the choice of the mother wavelet type is essential.
Conclusions
To detect changes in the fractal dimension caused by anomalous changes in the properties of telecommunications traffic in real (current) time, we propose an estimation method based on the mathematical apparatus of multiresolution analysis.
For the detection problem we propose using both monofractal and multifractal estimation.
The proposed method generalizes known results to the case of sliding-window analysis and, in contrast to known works, makes it possible to calculate a current estimate of the Hurst value rather than assuming a preset one.
It is shown that in detection problems the choice of the mother wavelet type is essential, and that the size of the analysis window is limited on the one hand by the accuracy requirements of the Hurst parameter estimate and on the other hand by the anomaly duration.
The proposed method of estimating self-similarity parameters in real time is needed, for example, to create effective means of intrusion detection in computer networks, and it offers wide opportunities for analyzing the properties of telecommunications signals.
References
1. Sheluhin O.I., Sakalema D.Zh., Filinova A.S. Intrusion detection in computer networks. Network anomaly. Moscow. Hotline - Telecom, 2013. 220 p.
2. Sheluhin O.I. Multifractals. Information applications. Moscow. Hotline - Telecom, 2011. 576 p.
3. Abry P., Veitch D. Wavelet analysis of long-range dependent traffic. IEEE Trans. on Info. Theory, 1998. Vol. 44. No. 1, pp. 2-15.
4. Abry P., Taqqu M.S., Flandrin P., Veitch D. Wavelets for the analysis, estimation, and synthesis of scaling data, in Park K., Willinger W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000, pp. 39-88.
5. Sheluhin O.I., Pankrushin A.V. Validation of network traffic anomaly detection methods discrete wavelet analysis. T-Comm, 2013. No. 10, pp. 110-115.
6. Sheluhin O.I., Pankrushin A.V. Measuring of Reliability of Network Anomalies Detection Using Methods of Discrete Wavelet Analysis. Science and Information (SAI) Conference 2013, London, UK, pp. 393-397.
7. Veitch D., Abry P. A wavelet based joint estimator of the parameters of long-range dependence. IEEE Transactions on Information Theory (special issue on Multiscale statistical signal analysis and its applications), 1999. Vol. 45. No. 3, pp. 878-897.
8. Veitch D., Abry P., Flandrin P., Chainais P. Infinitely divisible cascade analysis of network traffic data, in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (Istanbul, Turkey), June 2000.
9. Sheluhin O.I., Smolskiy S.M., Osin A.V. Self-similar processes in telecommunications. John Wiley & Sons, 2007. 320 p.
10. Mallat S. Wavelets in signal processing. Transl. from English. Moscow. Mir, 2005. 671 p.
11. Sheluhin O.I., Antonion A.A. Analysis of changes in the fractal properties of telecommunication traffic caused by abnormal invasions. T-Comm, 2014. No. 6, pp. 61-64.
12. Sheluhin O.I., Atayero A.A. Integrated Model for Information Communication Systems and Networks. Design and Development. IGI Global. USA, 2012. 462 p.
13. Sheluhin O.I., Pankrushin A.V. Detection of anomalies in network traffic using the methods of fractal analysis in real time. T-Comm, 2014. No. 8, pp. 108-112.
WAVELET TYPE SELECTION IN THE PROBLEM OF DETECTING ANOMALOUS INTRUSIONS INTO COMPUTER NETWORKS USING MULTIFRACTAL ANALYSIS METHODS
Sheluhin Oleg Ivanovich, Moscow Technical University of Communications and Informatics, Head of the Department of Information Security and Automation, professor, D.Sc. (Techn.), Moscow, Russia, [email protected] Sirukhi Joseph Were, Enerproject Group, PhD, Moscow, Russia, [email protected] Pankrushin Alexandr Vladimirovich, Moscow Technical University of Communications and Informatics, PhD student of the Department of Information Security and Automation, MTUCI, Moscow, Russia, [email protected]
The specifics of real-time detection of anomalous bursts of telecommunications traffic by fractal methods are considered. The method is based on a running estimate of the fractal properties of the traffic using a sliding window and on multiresolution wavelet analysis. The fractal properties of the traffic are characterized by the Hurst exponent. Multiresolution wavelet analysis is carried out using the discrete wavelet transform of scaling processes. As a result, network traffic is mapped into the time-frequency domain, where it can be analyzed in detail. The wavelet transform can thus be viewed as a way of observing time sequences of all durations simultaneously on different scales. It is shown that in the detection of anomalous traffic bursts the choice of the mother wavelet type is essential. It is shown that the size of the analysis window is limited on the one hand by the accuracy requirements of the Hurst parameter estimate and on the other hand by the anomaly duration. Recommendations are given on the choice of wavelet type for detecting anomalous intrusions into computer networks by multifractal analysis methods.