Video compression method on the basis of discrete wavelet transform for application in video information systems with non-standard parameters

Valitskaya N.S.; Vlasyuk I.V.; Potashnikov A.M.

DOI: 10.36724/2072-8735-2020-14-3-47-53

Natalya S. Valitskaya,

Moscow Technical University of Communications and Informatics, Moscow, Russia, [email protected]

Igor V. Vlasyuk,

Moscow Technical University of Communications and Informatics, Moscow, Russia, [email protected]

Aleksei M. Potashnikov,

Moscow Technical University of Communications and Informatics, Moscow, Russia, [email protected]

Keywords: video information systems, synchronization, parallelization, fault tolerance, video compression, discrete wavelet transform, filtering, additional calculations

The features of image compression standards based on the discrete wavelet transform (DWT) are considered. Particular attention is paid to data synchronization and fault tolerance in video information systems in which, to obtain high spatial resolution, the image is divided into fragments that are then decoded and played back separately. A modification of a DWT-based codec for low-latency video transmission systems is presented. It takes into account the option, developed at the TiZV department of MTUCI, of changing the filtering direction of the samples from vertical and horizontal to diagonal. This makes it possible to slightly reduce computational complexity by reducing the number of extrapolated elements processed at the edges of the image (or tile), to simplify parallel processing of data, and to build an additional level of fault tolerance for the transmission system directly into the video codec by generating a pair of independent, mutually complementary data streams. A variant of optimizing the quantizer parameters of the DWT subband signals is shown, taking into account the spatial frequency response of the human visual system. The possibility and scenarios of using the presented video codec in video information systems with non-standard parameters (as a rule, with ultra-high resolution and often a complex shape of the display area) are analyzed.

For citation:

Valitskaya N.S., Vlasyuk I.V., Potashnikov A.M. (2020) Video compression method on the basis of discrete wavelet transform for application in video information systems with non-standard parameters. T-Comm, vol. 14, no. 3, pp. 47-53. (in Russian)

Introduction

Video information systems are currently under active development. When images must be used whose resolution is so high that no interface exists for their transmission, or when the resources of one video server are insufficient, several video servers connected through an Ethernet switch can be used [1]. Because network transmission is not stable, the signals must be synchronized when they are displayed on a video wall.

An analysis of Internet sources, including publications hosted on electronic abstracting platforms such as eLibrary and Scopus, showed that in video information systems, synchronization is currently implemented in the process of data packaging. However, this is not enough: loss of synchronization can occur at the stage of decoding transport streams, since the time required to complete this process is uncontrollable.

Thus, it is necessary to ensure synchronization at the level of a video codec capable of forming tiles of the complete image, because this is the first and, accordingly, the last operation in the process of transmitting media data [2].

To implement such a synchronization method, compression standards based on the wavelet transform can be used. They make it possible to display on all screen modules a single fragment of the low-frequency component reproduced by a single source (video server), which reduces the number of blocks in the decoding process in which streams can potentially go out of sync. The high-frequency components, although they can be reproduced by different sources, do not significantly affect the perceived integrity of the image even with some desynchronization, especially under the non-optimal viewing conditions characteristic of video information systems. This method therefore provides synchronization built into the stream. However, the use of wavelet transforms introduces a problem in the form of additional calculations.

JPEG2000 codec algorithm

An example of a wavelet-based compression standard is JPEG2000. The block diagram of the JPEG2000 encoder is shown in Figure 1 [3].

Pre-processing includes dividing each component of the frame into areas (tiles), shifting the brightness level (DC level shifting) to equalize the dynamic range, and converting the color space. The wavelet transform is a convolution of the original signal with the wavelet function.

A two-dimensional discrete wavelet transform (DWT) is applied to the image in accordance with formulas (1-2) [4]:

$$W_\varphi(j_0, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \varphi_{j_0, m, n}(x, y) \quad (1)$$

$$W_\psi^i(j, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \psi^i_{j, m, n}(x, y) \quad (2)$$

where: $f(x, y)$ - function of the source tile; $W_\varphi(j_0, m, n)$ - coefficients that determine the approximation of the function at scale $j_0$; $i = \{H, V, D\}$ - index identifying the "directed" wavelets; $W_\psi^i(j, m, n)$ - coefficients that determine the horizontal, vertical and diagonal details when $i = H$, $V$ and $D$ respectively; $M \times N$ - image tile resolution; $j_0$ - arbitrary initial scale; $j = 0, 1, \ldots, J-1$ - scales; $\varphi_{j_0, m, n}(x, y)$ - two-dimensional scaling function; $\psi^i_{j, m, n}(x, y)$ - two-dimensional wavelet functions.

Fig. 1. JPEG2000 image compression diagram (blocks: preprocessing with tiling, DC level shifting and RGB to YCbCr transform; forward wavelet transform; quantization; ROI; rate control; Tier-1 and Tier-2 encoders)

Two-dimensional discrete wavelet transform is performed by filtering each row and column of a pre-processed image tile using a low-pass filter and high-pass filter, and it does not matter if these operations are applied to rows or columns in the first place.

Since the number of samples doubles as a result of this process, the output of each filter is subsampled with a step of 2; therefore, the total number of samples remains constant.

An example of a two-dimensional wavelet transform is shown in Figure 2.

Fig. 2. DWT (first level of decomposition)

The rows of the original image are subjected to low-pass and high-pass filtering; as a result, the number of samples doubles. The row samples are then subsampled with a step of 2, which means removing every second column of the LF and HF components obtained in the previous step of the wavelet transform. Similar actions are then performed on the columns: first the number of column samples doubles as a result of filtering, then it is halved by subsampling, i.e. by removing every second row.

At this stage, the two-dimensional discrete wavelet transform is complete. The result is the decomposition of the image tile into 4 components (LL, LH, HL, HH) [4]. The meaning of these four subbands is summarized in Table 1.

This decomposition is the first level of decomposition. As shown in Figure 3, the two-dimensional discrete wavelet transform can be reapplied to the low-frequency component of the image, which will mean an increase in the level of decomposition by one.
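The row and column filtering with subsampling described above can be illustrated with a minimal sketch. It assumes the shortest possible filter pair (Haar) purely for brevity; JPEG2000 itself uses longer 5/3 and 9/7 filters, and the function names `haar_dwt2` and `haar_dwt2_multilevel` are hypothetical. Subband names follow the convention of Table 1 (first letter: filter along x, second letter: filter along y).

```python
import numpy as np

def haar_dwt2(tile):
    """One level of a separable 2D Haar DWT (assumed filter, even tile sizes).

    Returns the subbands (LL, HL, LH, HH).
    """
    a = np.asarray(tile, dtype=float)
    # Filter along x (within each row) and keep every second sample.
    lo_x = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    hi_x = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # Filter along y (within each column) and keep every second sample.
    ll = (lo_x[0::2, :] + lo_x[1::2, :]) / np.sqrt(2)
    lh = (lo_x[0::2, :] - lo_x[1::2, :]) / np.sqrt(2)
    hl = (hi_x[0::2, :] + hi_x[1::2, :]) / np.sqrt(2)
    hh = (hi_x[0::2, :] - hi_x[1::2, :]) / np.sqrt(2)
    return ll, hl, lh, hh

def haar_dwt2_multilevel(tile, levels):
    """Reapply the transform to the LL subband `levels` times."""
    ll = np.asarray(tile, dtype=float)
    details = []  # one (HL, LH, HH) triple per decomposition level
    for _ in range(levels):
        ll, hl, lh, hh = haar_dwt2(ll)
        details.append((hl, lh, hh))
    return ll, details

tile = np.arange(64, dtype=float).reshape(8, 8)
ll3, details = haar_dwt2_multilevel(tile, levels=3)
print(ll3.shape, [d[0].shape for d in details])
```

Each level halves the resolution of the LL subband in both directions, so an 8x8 tile supports at most three levels in this sketch.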

Table 1

Names and notation of subbands

| Name | Notation | Filter along x | Filter along y |
|---|---|---|---|
| Low frequency (LF) | LL | LF | LF |
| Horizontal (HF) | HL | HF | LF |
| Vertical (HF) | LH | LF | HF |
| Diagonal (HF) | HH | HF | HF |

The coefficients obtained after a two-dimensional discrete wavelet transform are quantized in accordance with formula (4):

Fig. 3. DWT (levels 1-3 of decomposition; panels: level 1, level 2 and level 3, with the detail subbands HL, LH and HH of each level)

The required number of levels of decomposition is determined in accordance with the spatial frequency response (SFR) of the human visual system, shown in Figure 4 [5]. Without taking into account the anisotropy of the SFR of the visual system, it can also be represented in a two-dimensional version by rotation about the vertical axis.

Fig. 4. SFR of the human visual system (horizontal axis: lg(f), per./deg.)

In the figure, Kmean1, Kmean2, ... Kmean6 are the average sensitivity levels within each subband for 6 iterations of the DWT (decomposition levels). At a frequency of 1 per./deg. the sensitivity of the visual system is quite low, and further division of the spectrum does not make sense. We pose the problem: with the LF subband halved at each level of decomposition, determine the number of the last iteration Jmax at which fmin = 1 per./deg. In this case Jmax can be calculated by the following formula:

$$J_{max} = \log_2 \frac{f_{max}}{f_{min}} = \log_2 \frac{60}{1} \approx 6 \quad (3)$$
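The estimate of the number of decomposition levels can be checked numerically. The sketch below assumes, as the text states, that the LF subband is halved at each level, and takes the upper frequency of 60 per./deg. from formula (3); the function name is hypothetical.

```python
import math

def max_decomposition_level(f_max, f_min):
    """Number of halvings of the LF band needed to go from f_max down to f_min."""
    return math.log2(f_max / f_min)

j_exact = max_decomposition_level(60.0, 1.0)  # frequencies in per./deg.
j_max = round(j_exact)                        # nearest whole number of levels
print(f"{j_exact:.2f} -> {j_max}")
```

The exact value is slightly below 6, so six levels bring the lower edge of the LF subband just under 1 per./deg.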

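$$q_b(i, j) = \operatorname{sign}(y_b(i, j)) \left\lfloor \frac{|y_b(i, j)|}{\Delta_b} \right\rfloor \quad (4)$$

where: $y_b(i, j)$ - discrete wavelet transform coefficient; $\operatorname{sign}(y_b(i, j))$ - DWT coefficient sign; $\Delta_b$ - quantization step.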

The quantizer dead zone ("deadband") is located near zero and is equal to twice the quantization step.
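A minimal sketch of such a quantizer is given below. It assumes the usual JPEG2000 dead-zone form q = sign(y) * floor(|y| / step) for formula (4); the function name and the sample values are hypothetical.

```python
import numpy as np

def deadzone_quantize(coeffs, step):
    """Dead-zone scalar quantizer: q = sign(y) * floor(|y| / step)."""
    y = np.asarray(coeffs, dtype=float)
    return (np.sign(y) * np.floor(np.abs(y) / step)).astype(int)

coeffs = [-3.9, -1.9, -0.5, 0.0, 1.9, 2.0, 5.1]
print(deadzone_quantize(coeffs, step=2.0))
```

With a step of 2, every coefficient in the open interval (-2, 2) maps to zero, so the dead zone is 4 wide, that is, twice the quantization step. A coarser subband is obtained simply by passing a larger `step`.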

The quantization resolution differs between frequency subbands: subbands to which the visual system is less sensitive are quantized more coarsely. In effect, the quantization step adapts to the average sensitivity level within each subband.

Before encoding, the tile subbands are divided into blocks so that the code blocks corresponding to one subband have the same size. The PCRD-opt (Post-Compression Rate-Distortion optimization) algorithm makes it possible to achieve the desired data transfer rate with minimal distortion. To highlight a region of high quality, the region of interest (ROI), a single-bit image mask is used, transformed to cover all the coefficients corresponding to the ROI [6].

Additional calculations

Additional calculations occur when filtering pixels located at the edges of the image. Since the resolution of the frame is greater than the resolution of one screen module, the image is divided into tiles, and the number of edge pixels increases. To process one such pixel, $\frac{N-1}{2}$ additional samples are required, where N is the filter order. With each increase in the level of decomposition, the area of pixels outside the image needed to process the edge pixels doubles. This is because the wavelet transform halves the resolution of the low-frequency component in rows and columns compared with the previous level. Accordingly, at decomposition level J, $\frac{N-1}{2}$ samples correspond to $D_{op}$ samples of the original image, which essentially determines half the filter aperture according to formula (5):

$$D_{op} = \frac{N-1}{2} \cdot 2^{J-1} \quad (5)$$

Fig. 5. Expanding the filter aperture while increasing the level of decomposition

The expansion of the sampling area outside the image necessary for processing the extreme pixels, with an increase in the level of decomposition, is shown in Figure 5.

For example, consider the case of applying an 11th-order filter. At the first level of decomposition, R1 = 5 pixels; at the second, after subsampling the samples with a step of 2, the distance R1' = 10 pixels of the original image; and at the third, respectively, R1'' = 20 pixels.
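The numbers in this example can be reproduced with a short sketch. It assumes that the half-aperture in original-image pixels is (N - 1) / 2 at the first level and doubles with every further level, which is what the example describes; the function name is hypothetical.

```python
def extra_samples(filter_order, level):
    """Half-aperture in original-image pixels at a given decomposition level.

    Assumes an odd filter order N: (N - 1) / 2 samples at level 1,
    doubling at each subsequent level.
    """
    half_aperture = (filter_order - 1) // 2
    return half_aperture * 2 ** (level - 1)

for level in (1, 2, 3):
    print(level, extra_samples(11, level))
```

For N = 11 this gives 5, 10 and 20 pixels for levels 1 to 3, matching R1, R1' and R1''.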

The filter aperture grows with both the level of decomposition and the order of the filter.

A method to reduce computational complexity

To reduce computational complexity, we use the method developed by M.O. Sedov. In [7], the author describes the principle of an adaptive discrete wavelet transform, which allows the filtering direction to be changed depending on the image structure. The idea is as follows: the image is divided into blocks; if a block contains vertical and/or horizontal edges, classical filtering by rows and columns is applied to it; if the block contains diagonal edges, filtering is performed along the elements parallel to the main diagonal and then along the elements perpendicular to it. Diagonal filtering can be done with the classical filters if the image is rotated by 45 degrees.

We take advantage of the latter solution. Filtering at an angle of 45 degrees can be used to reduce computational complexity when processing pixels located at the edges of the image. It is more convenient to carry out this type of filtering by rotating the tile by 45 degrees and applying classical filtering by rows and columns. However, when the image is rotated, the orthogonal sampling structure is violated: neighbouring rows become shifted by half a pixel, and such a tile cannot be processed by simple means.

The problem can be solved as follows. The tile is divided into two "chess" (checkerboard) structures: one contains only the odd pixels in odd lines and the even pixels in even lines (structure A), and the other contains the remaining pixels of the tile (structure B), as shown in Fig. 6. After a 45-degree rotation, a two-dimensional DWT with the required number of decomposition levels is applied to each structure separately, in accordance with the diagram in Fig. 1.

Fig. 6. a) The part of the tile containing structure A; b) the part of the tile containing structure B

A pixel of the source image Pr of size X × Y, where $x \in [0, X)$, $y \in [0, Y)$ are the coordinates of the pixels of the original image, belongs to structure A if the condition $\frac{x+y}{2} \in \mathbb{N}$ is met (that is, if x + y is even); otherwise it belongs to structure B.
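The division into the two structures can be sketched with a parity mask. The sketch assumes zero-based pixel coordinates and marks the pixels that do not belong to a structure with NaN; the function name is hypothetical, and the subsequent 45-degree rotation is not shown.

```python
import numpy as np

def split_into_structures(tile):
    """Split a tile into "chess" structures A and B by the parity of x + y."""
    tile = np.asarray(tile, dtype=float)
    y, x = np.indices(tile.shape)  # row (y) and column (x) index of every pixel
    mask_a = (x + y) % 2 == 0      # x + y even: the pixel belongs to structure A
    structure_a = np.where(mask_a, tile, np.nan)
    structure_b = np.where(mask_a, np.nan, tile)
    return structure_a, structure_b, mask_a

tile = np.arange(16, dtype=float).reshape(4, 4)
a, b, mask_a = split_into_structures(tile)
print(a)
print(b)
```

Every pixel of the tile lands in exactly one structure, so merging A and B by the same mask recovers the original tile.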

When the image is divided into such structures, the distance between the samples increases by a factor of √2. The larger the distance between the samples, the lower the filter cutoff frequency, so fewer DWT iterations are required to reach the selected frequency fmin. This is equivalent to working with a tile whose resolution in each of the original (horizontal and vertical) directions is √2 times lower. The filter becomes lower-frequency while the number of additional horizontal and vertical samples remains the same or, equivalently, the filter aperture is smaller at the same cutoff frequency, as shown in Figure 7.

Fig. 7. Filter aperture for classical filtering (X1) and for filtering at an angle of 45 degrees (X2) at the same cutoff frequency

We now show how the gain achieved by diagonal filtering depends on the resolution of the screen module of the video wall. First, we determine the areas of additional pixels for classical filtering, D1, and for filtering at an angle of 45 degrees, D2, as a percentage of the image resolution in one direction.

In formulas (6-7), doubled distances appear in D1 and D2 because X1 and X2 characterize the half-aperture of the filter, i.e. the area of additional calculations when the filter window is positioned symmetrically about the edge of the tile. When the gain is related to the image resolution along one side, it must be taken into account that the tile has two edges in each of the horizontal and vertical directions.

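$$D_1 = \frac{X_1 \cdot 2}{l} \cdot 100, \% \quad (6)$$

$$D_2 = \frac{X_2 \cdot 2}{l} \cdot 100, \% \quad (7)$$

where $l$ is the image resolution in one direction.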

The gain is determined in accordance with formula (8):

$$V = D_1 - D_2, \% \quad (8)$$

Graphs of the dependence of the gain on the resolution of the screen module with the filter order N=11 for different levels of decomposition are shown in Figure 8.
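The gain calculation can be sketched as follows, assuming that each area of additional pixels is the doubled half-aperture expressed as a percentage of the resolution in one direction, as described for formulas (6-8). The half-aperture values and the module resolution in the usage example are hypothetical and are not taken from Figure 8.

```python
def extra_area_percent(half_aperture, resolution):
    """Additional pixels at both tile edges as a percentage of the resolution."""
    return 2 * half_aperture / resolution * 100.0

def gain_percent(x1, x2, resolution):
    """Gain V = D1 - D2 of diagonal filtering over classical filtering, in %."""
    d1 = extra_area_percent(x1, resolution)  # classical filtering
    d2 = extra_area_percent(x2, resolution)  # filtering at 45 degrees
    return d1 - d2

# Hypothetical values: half-apertures of 20 and 10 pixels, 1000-pixel module.
print(gain_percent(x1=20, x2=10, resolution=1000))
```

Because both areas are divided by the resolution, the same difference in half-aperture yields a larger percentage gain for smaller screen modules.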

Fault tolerance

Splitting a tile into two parts and processing them in parallel increases fault tolerance. In the event of failures at the network or software level, part of the image data may be lost, for example individual transform coefficients. If low-frequency components are lost, this can lead to highly noticeable artifacts in the image.

When the two chess structures formed from the pixels of the same part of the frame are processed in parallel, with independent entropy encoding and transmission, the amount of transmitted data increases, but less than twofold. At the same time, a tile can be restored from only one of the components using one of the known methods of local restoration of image elements, such as those used in demosaicing [9].

In the worst case this reduces the resolution by a factor of √2 vertically or horizontally, which causes almost imperceptible distortions and is quite acceptable for emergency operation of the system.
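Restoration of a tile from a single structure can be illustrated with a minimal sketch. It assumes the simplest local method, averaging the available horizontal and vertical neighbours, as a stand-in for the demosaicing-style methods mentioned above; it is not the method of [9], and the function name is hypothetical.

```python
import numpy as np

def restore_from_structure_a(tile_a, mask_a):
    """Fill pixels missing from structure A with the mean of their 4-neighbours.

    All horizontal and vertical neighbours of a structure-B pixel belong to
    structure A, so the average uses only received samples.
    """
    mask_a = np.asarray(mask_a, dtype=bool)
    vals = np.where(mask_a, tile_a, 0.0).astype(float)  # missing pixels -> 0
    pv = np.pad(vals, 1)                                # zero border
    pw = np.pad(mask_a.astype(float), 1)                # 1 where a sample exists
    neigh_sum = pv[:-2, 1:-1] + pv[2:, 1:-1] + pv[1:-1, :-2] + pv[1:-1, 2:]
    neigh_cnt = pw[:-2, 1:-1] + pw[2:, 1:-1] + pw[1:-1, :-2] + pw[1:-1, 2:]
    filled = neigh_sum / np.maximum(neigh_cnt, 1.0)
    return np.where(mask_a, vals, filled)

y, x = np.indices((6, 6))
tile = (x + 2 * y).astype(float)            # smooth test tile
mask_a = (x + y) % 2 == 0
tile_a = np.where(mask_a, tile, np.nan)     # structure B was lost
restored = restore_from_structure_a(tile_a, mask_a)
print(np.abs(restored - tile).max())
```

On smooth content the interior is recovered exactly by this average; errors appear at the tile border and around sharp detail, which is consistent with the resolution loss described above.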

Conclusion

The study of DWT-based image compression standards in video information systems revealed a number of advantages: convenient synchronization of streams, since the low-frequency transform coefficients can be output from a single source with a short delay; scalability; and no need for image post-processing, owing to the specific nature of the distortions. Shortcomings were also identified: often insufficient code parallelism, sensitivity to errors in the data stream for the low-frequency transform coefficients, and the need for extrapolation when processing the image near its boundaries.

The proposed scheme, which divides the image into two "chess" structures and applies wavelet transforms to them in the directions ±π/4 to the vertical, eliminates or reduces several of these disadvantages. It forms two independent data streams in a natural way, each of which allows the image to be restored with high subjective quality by simple means; it reduces the size of the extrapolated areas at the edges of the image; and it improves the parallelization of the video compression algorithm by a factor of two.

The research results can be applied to create video information systems with increased requirements for synchronization accuracy and fault tolerance.

References

1. Gorodilov M.A., Dolgovesov B.S., Khramtsov I.D., Radostev A.H. (2015). Features of multi-screen display of distributed multimedia data. Bulletin of NSU. Series: Physics, vol. 10, no. 2, pp. 91-98. (in Russian)

2. Valitskaya N.S., Vlasyuk I.V. (2019). Synchronization methods for video information systems. Telecommunications and Information Technologies, no. 2, pp. 51-57. (in Russian)

3. Vlasyuk I.V., Romanova E.P., Sidorova A.I. (2010). Coding of high-quality areas in the JPEG 2000 image compression standard, T-Comm, no. 9, pp. 53-55. (in Russian)

4. Gonzalez R., Woods R. Digital image processing. Translated from English by P.A. Chochia. Moscow: Technosphere, 2005. 1072 p. (in Russian)

5. Fairchild Mark D. Color appearance models. Chichester: John Wiley & Sons, Ltd., 2005. 385 p.

6. Acharya T., Tsai P. JPEG2000 standard for image compression: concepts, algorithms and VLSI architectures. Hoboken: John Wiley & Sons, Inc., 2005. 274 p.

7. Sedov M.O. (2012). Adaptive discrete wavelet transform. T-Comm, no. 9, pp. 127-128. (in Russian)

8. Dvorkovich V.P., Gilmanshin A.V. (2008). A new approach to the use of wavelet filters in image processing. Digital signal processing, no. 1, pp. 37-42. (in Russian)

9. Bezrukov V.N., Vlasyuk I.V., Romanov S.G. Patent No. 2557261 Russian Federation, H04N 5/335, H04N 11/24 2013138922/07; declared 08/20/2013; publ. 07/20/2015, Bull. No. 20. Method and apparatus for generating image signals in standard and high definition digital television systems. (in Russian)

Information about the authors:

Natalya S. Valitskaya, MTUCI, student, Moscow, Russia

Igor V. Vlasyuk, MTUCI, Cand. Sci. (Eng.), Associate Professor of the TiZV department, Moscow, Russia

Aleksei M. Potashnikov, MTUCI, junior researcher of the TsTiV department, Moscow, Russia
