URAL MATHEMATICAL JOURNAL, 2018, Vol. 4, No. 1, pp. 14-23
DOI: 10.15826/umj.2018.1.002
OPTIMIZATION OF THE ALGORITHM FOR DETERMINING THE HAUSDORFF DISTANCE FOR CONVEX POLYGONS
Dmitry I. Danilov† and Alexey S. Lakhtin‡
Ural Federal University, Ekaterinburg, Russia; †Danilov.dmitry.i@gmail.com, ‡alexey.lakhtin@urfu.ru
Abstract: The paper provides a brief historical analysis of problems that use the Hausdorff distance, analyzes the existing elements of Hausdorff distance optimization for convex polygons, and demonstrates an optimization approach. The existing algorithm served as the basis for a proposed low-level optimization with superoperative memory, which ensures finding a precise solution by a full search over the corresponding pairs of vertices and sides of the polygons while excluding certain pairs from consideration. This approach significantly accelerates the solution of the stated problem.
Key words: Hausdorff distance, Polygon, Optimization, Optimal control theory, Differential games, Theory of image recognition.
1. Introduction
Image recognition is not a new problem in its essence and arises in the most diverse lines of research, ranging from applied tasks in the field of analog signal security and digitization to problems in the theories of optimal control and differential games. The most intensive development and improvement of methods for solving such problems is observed in the current period. This is due, above all, to the need to relieve people of the enormous information loads that arise and to employ both the thinking and the perception characteristic of recognition. All these problems are of a pronounced interdisciplinary nature and form the basis for the development of a new generation of specialized and applied technical recognition systems used in various fields, including medicine [21] and the development of artificial intelligence. One of the earliest lines of research was optical character recognition (OCR).
The recognition and comparison of images [4, 5], including recognition and localization, is a relevant problem of the era of digital information processing. As is known, the most important information about shape is contained in the outlines of objects. Many real-world objects can be recognized from the images of their outlines, with no need to use the original gray-scale images. Due to this, recognition algorithms are most often designed to operate on binary, outline, or near-outline images.
One of the known methods for detecting and analyzing objects in binary outline images, distinguishable from the surrounding context by their geometric properties, is the geometric measurement of the distance between image points. One of the approaches to solving this problem is to modify the Hausdorff metric to identify objects geometrically close to arbitrary reference ones specified by bit masks. In this approach, the image is considered as a set of complex elements or a set of points in a two-dimensional Euclidean space. For these sets, a measure of mutual proximity is calculated; in the case of complex elements, the Hausdorff metric is used. Modifications of the Hausdorff metric have been used in image recognition since 1993 [6]. They have an intuitive operation principle and an explicit connection to object geometry, and they require no training sample. However, this approach is little known compared with neural networks, although an increasing number of related publications have appeared in recent years.
2. Main Results
The biggest disadvantage of algorithms that use modifications of the Hausdorff metric is a rather high computational complexity, on average 2 to 3 times higher than that of the simplest correlation algorithms. Another is non-invariance to rotation and scale, which, in the absence of a priori information on the orientation and size of the objects to be recognized, forces one to scan a multitude of versions of the reference sample at different rotation angles and scales. Therefore, one of the relevant problems is the development of a technology for optimizing the calculations in these algorithms.
Among the large number of publications dealing with this subject, several review works should be singled out [2, 4, 5]. These works touch upon both theoretical and applied aspects of algorithm development. Back in the 1990s, the works of P. Gruber [4, 5] covering various aspects of approximating convex bodies were released, and the adjacent agenda was further developed and studied in [2]. The author emphasizes that, along with the refinement of classical approaches, outstanding results on attractive sets have been obtained.
The Hausdorff metric [1, 14] defines a distance h on a given set D between its subsets X, Y, where

h(X, Y) = max { max_{x∈X} min_{y∈Y} d(x, y), max_{y∈Y} min_{x∈X} d(x, y) },

where d(x, y) is the distance between elements of subsets of the given set.
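As an illustration, the definition above can be computed directly for finite point sets (for polygons, the vertex sets alone do not suffice in general, since the nearest point may lie on a side); a minimal C++ sketch with illustrative names:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Pt { double x, y; };

// Euclidean distance between two points.
double dist(const Pt& p, const Pt& q) {
    return std::hypot(p.x - q.x, p.y - q.y);
}

// Directed Hausdorff distance: max over X of min over Y of d(x, y).
double directedHausdorff(const std::vector<Pt>& X, const std::vector<Pt>& Y) {
    double h = 0.0;
    for (const Pt& x : X) {
        double best = std::numeric_limits<double>::infinity();
        for (const Pt& y : Y) best = std::min(best, dist(x, y));
        h = std::max(h, best);
    }
    return h;
}

// Symmetric Hausdorff distance h(X, Y): the maximum of the two directed ones.
double hausdorff(const std::vector<Pt>& X, const std::vector<Pt>& Y) {
    return std::max(directedHausdorff(X, Y), directedHausdorff(Y, X));
}
```

For example, for X = {(0,0), (1,0)} and Y = {(0,0), (3,0)} the directed distances are 1 and 2, so h(X, Y) = 2.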
In [20], the definition of the metric space was formulated for convex unbounded closed subsets of a Banach space using the Hausdorff metric, which establishes how the properties of the space of such convex subsets with this metric differ from those of the usual metric space. The research in [20] led to the important assertion that not every object in this metric space can be approximated by generalized polyhedra; therefore, the concept of a generalized polyhedron was introduced, and approximation criteria were proposed. It was shown that the uniform continuity of the support function is a necessary and sufficient condition for approximation.
Another, no less important, problem of minimizing the Hausdorff metric between two convex polygons was addressed in [18]. The authors consider two polygons: one fixed and the other changing its location on the plane (rotation or parallel translation). Ushakov, Lebedev et al. [11-13, 18] developed and tested iterative step-by-step shift and rotation algorithms that ensure a reduction in the Hausdorff distance between a moving and a fixed object, using the differential properties of the function of the Euclidean distance to a convex set and the geometric properties of the Chebyshev center of a compact set, and proved theorems on the correctness of the developed algorithms for a wide range of cases. A multiple restart of the algorithm allows choosing the best option.
The works of A.B. Kurzhansky [10] and F.L. Chernousko [3] use the approximation of sets of attainability of differential games with ellipsoids and parallelepipeds in solving problems of the optimal control of dynamical systems. In this case, the Hausdorff distance is the criterion of optimality.
Also, the work of A.S. Lakhtin [18] considers the algorithm for obtaining a precise analytic solution of the problem of minimizing the Hausdorff distance through a full search among a finite number of options depending on the number of vertices of the given polygons. The options are pairs or triples of vectors connecting the vertices of two polygons or the vertex of one and the side of the other polygon.
After analyzing all the presented ideas and methods, it was decided to take the ideas underlying the analytical algorithm from publication [18] and the numerical subgradient method from publication [18] as the basis for the newly proposed algorithm.
Of all the algorithms already considered, the most suitable for improvement is the analytical algorithm of the step-by-step displacement between a moving and a fixed object from publication [18]. For ease of perception of the material, the ideas underlying this method are given below.
Suppose that two convex polyhedra A, B ⊂ R^n are given. It is required to move them so as to minimize the Hausdorff distance between them, which, as is known, is calculated by the formula

d(A, B) = max { max_{a∈A} min_{b∈B} ||a - b||, max_{b∈B} min_{a∈A} ||a - b|| }.

Assuming that polyhedron A is fixed and B moves by parallel translation along vector x ∈ R^n, we obtain a convex function F(x) = d(A, B + x), whose minimum point x* is sought, i.e., F* = F(x*) = min_{x∈R^n} F(x).
For convenience, the planar case is considered, i.e., A, B ⊂ R^2 are convex polygons, but the idea of the proposed method can be used in spaces of higher dimension. It was proven in [18] that ∂F(x) = co {L_A(x) ∪ L_B(x)}, where

L_A(x) = {-l : i ∈ I_A(x), (l, a_i - x) - ρ_B(l) = F(x), ||l|| = 1}

and

I_A(x) = {i : dist(a_i, B + x) = F(x)}.

Set L_B(x) is defined similarly.
Define sets L*_A(x*) and L*_B(x*) as the sets of vectors co-directed with the unit vectors from sets L_A(x*) and L_B(x*), respectively, and having length F* = F(x*).
Vectors from the set L*_A(x*) ∪ L*_B(x*) connecting the vertices of different polygons are called type V vectors. Vectors from the set L*_A(x*) ∪ L*_B(x*) connecting the vertex of one polygon with a side of the other, perpendicular to that side, are called type W vectors. Note that any vector from L*_A(x*) ∪ L*_B(x*) is of either type V or type W.
By definition, any type V vector has the form l_k = b_{j_k} + x* - a_{i_k} and, therefore, satisfies the equality F* = ||l_k|| = ||b_{j_k} + x* - a_{i_k}|| = ||x* - (a_{i_k} - b_{j_k})||. Geometrically, this corresponds to the distance from the point with coordinates x* to the point with coordinates (a_{i_k} - b_{j_k}).
Let the type W vector be a vector from vertex a_{i_k} to the side with endpoints (b_{j_k} + x*) and (b_{j_k+1} + x*); then the following equality holds:

F* = |(x* - (a_{i_k} - b_{j_k})) × (b_{j_k+1} - b_{j_k})| / ||b_{j_k+1} - b_{j_k}||.

Geometrically, this corresponds to the distance from the point with coordinates x* to the straight line passing through the points with coordinates (a_{i_k} - b_{j_k}) and (a_{i_k} - b_{j_k+1}).
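The type W case is the standard distance from a point to a line computed via the 2D cross product; a minimal C++ sketch of this computation (names are illustrative):

```cpp
#include <cmath>

struct Vec { double x, y; };

// z-component of the 2D cross product.
double cross(const Vec& u, const Vec& v) { return u.x * v.y - u.y * v.x; }

// Distance from point p to the infinite line through a and b:
// |(p - a) x (b - a)| / ||b - a||, matching the type W formula above.
double pointLineDist(const Vec& p, const Vec& a, const Vec& b) {
    Vec pa{p.x - a.x, p.y - a.y};
    Vec ba{b.x - a.x, b.y - a.y};
    return std::fabs(cross(pa, ba)) / std::hypot(ba.x, ba.y);
}
```

For example, the distance from (0, 1) to the line through (-1, 0) and (1, 0) is 1.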
Thus, the problem of finding the minimum point x* reduces to a full search among a finite number of options. The pairs of vertices of different polygons give type V vectors, and the pairwise consideration of the vertices of the same polygon with the sides of the other gives type W vectors. In the auxiliary space, it is required each time to solve the problem of optimal placement of point x* providing the shortest distances to the corresponding points and straight lines. In other words, it is required in each case to find the center of a circle passing through given points and tangent to the given straight lines.
Based on the described work, the algorithm was implemented, which, through a full search among a finite number of options depending on the number of vertices of the given polygons, finds a precise analytical solution of the optimization problem posed.
One of the important drawbacks of the described algorithm is the need for a full search, which leads to a very high computational complexity. Despite this shortcoming, the algorithm ensures finding a precise solution in a fixed time.
The first stage of the work was the implementation of the algorithm itself without any optimization. C++ was chosen as the programming language for the implementation because of its execution speed, the absence of unnecessary calls, its closeness to assembly language, and the abundance of tools for optimizing and parallelizing the algorithm's execution.
The implementation of this algorithm was divided into several logical parts. The first part was the creation of data structures, both to store the polyhedra and to hold the results at each step of the algorithm, as well as to enable writing and reading the data structures to and from a file. The second part includes all the auxiliary algorithms, which perform the following operations: finding vectors from a vertex of one polygon to the other, and checking whether a given vertex-vertex or vertex-side pair of vectors (in both directions) provides the best optimization. Similar checks are performed for triples of vectors from the set of vertex-vertex and vertex-side pairs (in both directions). These algorithms include finding the center of a circle from three points, from a tangent and two points, from two tangents and one point, and from three tangents. The third part is the algorithm for the full search among all possible pairs and triples of vectors found in the other parts of the implementation.
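The "center of a circle from three points" subproblem mentioned above is the classical circumcenter computation; a minimal C++ sketch (degenerate collinear input, where the denominator vanishes, is not handled):

```cpp
#include <cmath>

struct Pt { double x, y; };

// Circumcenter of the triangle (a, b, c): the point equidistant from all three,
// obtained by intersecting the perpendicular bisectors of the sides.
Pt circumcenter(const Pt& a, const Pt& b, const Pt& c) {
    double d  = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
    double a2 = a.x * a.x + a.y * a.y;
    double b2 = b.x * b.x + b.y * b.y;
    double c2 = c.x * c.x + c.y * c.y;
    return { (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d,
             (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d };
}
```

For example, the circumcenter of the right triangle (0,0), (2,0), (0,2) is (1, 1), the midpoint of its hypotenuse.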
The result obtained is new. Prior to this, there has been only a theoretical justification for the analytical method, but no ways for its implementation that could be officially referred to could be found. As a result, this algorithm was implemented. The resulting implementation was tested on a large number of pairs of convex polygons of various types. This result is of independent value, both for subsequent testing of any approaches to optimization, and for testing any subgradient methods.
During the operation of the algorithm, which performs a full search among all possible pairs and triples of vectors, statistical data were accumulated on which vectors influence the formation of the final optimal result. Based on the processing of these data, Hypothesis 1 was formulated. Its idea is that some groups of vectors can be excluded from the search, since they do not participate in the formation of the final optimal result. It was suggested that such vectors are those that go from one polygon to the other but intersect some side of the other polygon.
Hypothesis 1 was tested on the set of polygons from which it had been formulated, see Fig. 1 and Fig. 2. But when the set of polygons was expanded, counterexamples were found. In Fig. 3, the type V vector from vertex 3 to vertex 5 intersects the side beginning at vertex 6 and ending at vertex 0. In Fig. 4, the type V vector from vertex 0 to vertex 3 intersects the side beginning at vertex 1 and ending at vertex 2. Thus, Hypothesis 1 was not confirmed. Nevertheless, this heuristic idea is viable, since in quite a large number of cases, for polygons without peculiar features, Hypothesis 1 holds and provides a substantial reduction in the search options.
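The filter behind Hypothesis 1 reduces to a segment-intersection test between a candidate vector and a polygon side; a minimal C++ sketch of a proper-crossing check (touching endpoints and collinear overlaps are not counted as intersections here):

```cpp
struct Pt { double x, y; };

// Signed area test: > 0 if b lies to the left of the ray o -> a.
double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// True if segments (p1, p2) and (q1, q2) properly cross, i.e. each segment's
// endpoints lie strictly on opposite sides of the other segment's line.
bool properIntersect(const Pt& p1, const Pt& p2, const Pt& q1, const Pt& q2) {
    double d1 = cross(q1, q2, p1), d2 = cross(q1, q2, p2);
    double d3 = cross(p1, p2, q1), d4 = cross(p1, p2, q2);
    return ((d1 > 0) != (d2 > 0)) && ((d3 > 0) != (d4 > 0));
}
```

A vector would be excluded under Hypothesis 1 when `properIntersect` returns true for it against some side of the other polygon.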
After a counterexample to Hypothesis 1 was found, a new hypothesis was required. This hypothesis was based on the idea of using a support function. To begin with, recall the definition of the Hausdorff distance through support functions [19]:

H(A, B) = max { max_{a∈A} max_{||l||=1} ((l, a) - ρ_B(l)), max_{b∈B} max_{||l||=1} ((l, b) - ρ_A(l)) }.

Within this representation of the Hausdorff distance, a minimum is also used; the arg min is the set of elements at which the minimum is attained. Assume that these will be the required elements.
Further, the notion of the visible part of a polygon is introduced. Fixing a direction vector and looking along this direction at each polygon of the given pair separately, as, for example, is shown in Fig. 2, one sees that only a few sides and vertices of the polygons are visible. After this operation, a set of "visible" vertices and sides is obtained, to which boundary sides are added. The boundary sides are those sides of the polygon that were not included in the original sample but one of whose vertices was added. Based on the resulting set of vertices and sides, the set of vertex-vertex and vertex-side vectors involved in the full search algorithm is built. This completes the first stage of the algorithm; its output is the set of vectors for the algorithm under consideration.
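The "visible part" selection can be sketched as follows, assuming the polygon vertices are listed counterclockwise: a side is taken as visible when its outward normal points against the viewing direction. This is an illustrative reconstruction, not the authors' exact criterion:

```cpp
#include <vector>

struct Pt { double x, y; };

// For a convex polygon with vertices v in counterclockwise order, side i runs
// from v[i] to v[(i+1) % n]; its outward normal is (dy, -dx). Looking along
// direction u, a side faces the viewer when dot(normal, u) < 0.
std::vector<int> visibleSides(const std::vector<Pt>& v, Pt u) {
    std::vector<int> out;
    int n = (int)v.size();
    for (int i = 0; i < n; ++i) {
        Pt a = v[i], b = v[(i + 1) % n];
        double nx = b.y - a.y, ny = -(b.x - a.x);  // outward normal (CCW order)
        if (nx * u.x + ny * u.y < 0) out.push_back(i);
    }
    return out;
}
```

For the unit square listed counterclockwise from the origin, looking along direction (0, 1), only the bottom side (index 0) is visible.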
Figure 1. The triple of SSS type vectors compliant with Hypothesis 1
Figure 2. The triple of VVV type vectors compliant with Hypothesis 1
Further, this set of vectors should be reduced to a more limited one. This is done by intersecting the sets of vectors obtained by the algorithm described in the first stage but with a modified direction vector. The first stage is performed four times, each time rotating the direction vector by 10 degrees. All the resulting sets are intersected to obtain the first part of the input for the search algorithm, see Fig. 5. To obtain the second part, the same intersection of sets is performed with the direction vectors rotated by 180 degrees, see Fig. 6. The two parts obtained are combined and passed to the full search algorithm.
To test each of the described algorithms, a testing system was developed that includes the following components. The main component is the generation of convex polygons, for which various algorithms, considered below in more detail, were implemented.
The first algorithm is the following one. On a plane, N points are arranged randomly in the following way: N/4 points with positive abscissas and ordinates, N/4 points with negative abscissas
Figure 3. The triple of VVV type vectors non-compliant with Hypothesis 1
Figure 4. The triple of VVV type vectors non-compliant with Hypothesis 1
and positive ordinates, N/4 points with positive abscissas and negative ordinates, and N/4 points with negative abscissas and ordinates. The next step is to select the point with the lowest abscissa and ordinate values. Relative to this point, the convex hull of the given set is built using the following algorithm: take the point chosen at the previous stage and choose the next one at the minimum positive turning angle. This step is repeated until the starting point is reached. As a result, a convex polygon is obtained. The downside of this algorithm is that the number of vertices of the output polygon cannot be controlled.
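The hull construction described above is the gift-wrapping (Jarvis march) algorithm; a minimal C++ sketch (duplicate and collinear points are assumed absent):

```cpp
#include <vector>

struct Pt { double x, y; };

// > 0 if b lies to the left of the ray o -> a.
double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Gift wrapping: start from the lowest-leftmost point and repeatedly pick the
// point making the minimum positive turn (all other points to the left of the
// chosen edge), until the starting point is reached again.
std::vector<Pt> convexHull(const std::vector<Pt>& pts) {
    int n = (int)pts.size(), start = 0;
    for (int i = 1; i < n; ++i)
        if (pts[i].x < pts[start].x ||
            (pts[i].x == pts[start].x && pts[i].y < pts[start].y))
            start = i;
    std::vector<Pt> hull;
    int cur = start;
    do {
        hull.push_back(pts[cur]);
        int next = (cur + 1) % n;
        for (int i = 0; i < n; ++i)
            if (cross(pts[cur], pts[next], pts[i]) < 0) next = i;  // i is more clockwise
        cur = next;
    } while (cur != start);
    return hull;  // counterclockwise order
}
```

For example, five points forming a square with one interior point yield a hull of four vertices.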
The second algorithm constructs a polygon from a triangle. At the first stage, the triangle is constructed by placing three arbitrary points on a plane. The input of the algorithm is the number of vertices that the output polygon should have. To reach the required number of vertices, a point is added to a side of the polygon iteratively as follows: first, an arbitrary side of the polygon is selected, and then a point is chosen between the ends of the segment of this side so that the polygon remains convex with the added
Figure 5. Selection of the vector vertices to perform a search under Hypothesis 2
Figure 6. Selection of the vector vertices to perform a search under Hypothesis 2
point. This algorithm is iterated until the required number of vertices is reached.
The third method is manual testing. It was decided to take a set of 20 to 30 polygons of different construction. The algorithm arbitrarily selects one of them, provided that the second selected polygon (if any) is not similar to the given one.
The described algorithms are used to generate polygons used in the testing of the Hausdorff distance minimization algorithm. At each stage, a pair of polygons is generated, which are sent to the input of one of the algorithms.
To automate the work, storage systems for the following objects were also created: the generated polygons and the results of the optimization algorithm, which include the following parameters: the type of algorithm used (full search or some other option), the displacement vector of the second polygon relative to the first, the type of the set of vectors used to obtain the given displacement vector (V is a vector to a vertex, S is a vector perpendicular to a side of the polygon; the following sets
of vectors were considered: VV, VS, SS, VVV, VVS, VSS, SSS), the Hausdorff distance obtained after minimization, the number of sets of vectors considered before the desired pair or triple of vectors was found, the list of vectors used in the pair or triple described before. Writing and reading algorithms were developed for the storage system.
Based on the previously described algorithms and storage systems, automated tests were developed that could perform the following functions: generating polygon pairs automatically; saving them to a file for further use; performing a full search algorithm to calculate the costs necessary to minimize the Hausdorff distance; performing one or more of the optimized algorithms; comparing the results to verify the validity of the optimized algorithm; and saving the optimization results.
This testing was performed for all possible combinations of pairs of polygons with 3 to 10 vertices inclusive. The test results for Hypothesis 2 are shown in Table 1. Testing was also carried out selectively for polygons with more than 10 vertices. Based on the testing results, the statistical data described in Table 1 were collected, including the number of vertices of polygons A and B as well as how many fewer steps were needed, as a percentage of the number of steps in the analytical algorithm, to find the position at which the minimum Hausdorff distance is reached.
|A| \ |B|      3         4         5         6         7         8         9        10
    3         0%       0-15%     0-15%     5-15%     5-15%    10-30%    10-30%    10-30%
    4        0-15%     0-20%     0-20%     0-20%     0-20%    10-30%    10-30%    15-30%
    5        0-15%    10-20%    10-20%    10-20%    10-25%    10-30%    10-30%    15-30%
    6        5-15%    10-20%    10-20%    10-20%    10-25%    10-30%    10-30%    15-30%
    7        5-15%    10-20%    10-20%    10-20%    10-25%    10-30%    10-30%    15-30%
    8        10-30%   10-30%    10-30%    10-30%    10-30%    10-30%    10-30%    15-30%
    9        10-30%   10-30%    10-30%    10-30%    10-30%    10-30%    10-30%    15-30%
   10        10-30%   15-30%    15-30%    15-30%    15-30%    15-30%    15-30%    15-30%
Table 1. Statistical data obtained by testing Hypothesis 2
The results show that the degree of improvement depends on both the geometric features of the polygons and their location. Therefore, the degree of reduction in the number of search steps can vary for the same number of vertices. For polygons with at most 10 vertices, the reduction in the number of algorithm steps reaches 30%. The average percentage grows monotonically with the number of vertices of polygons A and B. The overall result is a significant acceleration of the algorithm for polygons with six or more vertices.
3. Conclusion
As a result, the analytical algorithm was implemented. This result is of independent value, both for subsequent testing of any optimization approaches and for any subgradient methods. Two hypotheses were tested. The test of the first hypothesis resulted in finding a counterexample. As a consequence, the second hypothesis was implemented, for which no counterexamples were found on a large and diverse sample of polygon pairs.
As a result, the algorithm was developed, which ensures finding the precise optimal mutual arrangement of polygons in all the cases tested, despite a significant reduction in the search scope. The advantages achieved are as follows: the ability to solve a large number of practical problems not only accurately, but also quickly; the implemented algorithm combines speed and quality. The only
drawback of the algorithm is the absence of a rigorous proof of the fact that vectors determining the optimal position of the polygons will not be ignored in the process of the search reduction. The research can be continued in this direction.
The results of this work can be applied in comparing images [6, 7], recognizing images, recognizing and localizing human faces [8, 9] and emotions on faces [16], as well as in one of the methods of medical imaging based on wave transformation using the Hausdorff distance [21].
REFERENCES
1. Arutyunov A.V. Lectures on Convex and Polysemantic Analysis. Moscow: Fizmatlit, 2014. 184 p.
2. Bronshtein E.M. Approximation of convex sets by polytopes. Sovremennaya matematika. Fundamental'nye napravleniya [Contemporary Mathematics. Fundamental Directions], 2007. Vol. 22. P. 5-37. (in Russian)
3. Chernousko F.L. Otsenivanie fazovogo sostoyaniya dinamicheskih sistem [Estimation of the Phase State of Dynamic Systems]. Moscow: Nauka, 1988. (in Russian)
4. Gruber P.M. Approximation by convex polytopes. Polytopes: Abstract, Convex and Computational. Bisztriczky T., McMullen P., Schneider R., Weiss A.I. (eds.) NATO ASI Series (Series C: Mathematical and Physical Sciences), 1994. Vol. 440. P. 173-203. DOI: 10.1007/978-94-011-0924-6
5. Gruber P.M. Comparision of best and random approximation of convex bodies by polytopes. Rend. Circ. Mat. Palermo, II. Ser., Suppl., 1997. Vol. 50. P. 189-216.
6. Huttenlocher D.P., Klanderman G.A., Rucklidge W.J. Comparing images using the Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993. Vol. 15. no. 9. P. 850-863. DOI: 10.1109/34.232073
7. Huttenlocher D.P., Rucklidge W.J. A multi-resolution technique for comparing images using the Hausdorff distance. 1992. [Technical Report 1321, Cornell University]. URL: http://hdl.handle.net/1813/6165
8. Jesorsky O., Kirchberg K.J., Frischholz R.W. Robust face detection using the Hausdorff distance. Audio- and Video-Based Biometric Person Authentication. AVBPA 2001, Lecture Notes in Computer Science, 2001. Vol. 2091. P. 90-95. DOI: 10.1007/3-540-45344-X_14
9. Kirchberg K.J., Jesorsky O., Frischholz R.W. Genetic model optimization for Hausdorff distance-based face localization. Biometric Authentication. BioAW 2002. Lecture Notes in Computer Science, 2002. Vol. 2359. P. 103-111. DOI: 10.1007/3-540-47917-1_11
10. Kurzhansky A.B., Filippova T.F. On the description of sets of surviving trajectories of differential inclusion. Doklady akademii nauk SSSR [Report of AS of the USSR], 1986. Vol. 289. no. 1. P. 38-41.
11. Lakhtin A.S. Konstruktsii negladkogo i mnogoznachnogo analiza v zadachah dinamicheskoy optimizatsii i teorii uravneniy Gamil'tona-Yakobi. Diss. dokt. fiz.-mat. nauk [Constructions of Non-smooth and Polysemantic Analyses in Problems of Dynamic Optimization and the Theory of Hamilton-Jacobi Equations. Dr. phys. and math. sci. diss.], Yekaterinburg, 2001. (in Russian)
12. Lakhtin A.S., Ushakov V.N. The problem of optimizing the Hausdorff distance between two convex polyhedra. Sovremennaya matematika i ee prilozheniya [Modern Mathematics and its Applications], 2003. Vol. 9. P. 60-67. (in Russian)
13. Lebedev P.D., Uspensky A.A. Procedures for calculating the non-convexity of a planar set. Comput. Math. Math. Phys., 2009. Vol. 49, no. 3. P. 418-427. DOI: 10.1134/S096554250903004X
14. Petrov N.N. Vvedenie v vypuklyy analiz: uchebnoe posobie. [Introduction to Convex Analysis: Study Guide], 2009. (in Russian)
15. Romanov A.V., Kataeva L.Yu. On the use of superoperative memory for the solution of a system of algebraic equations by the method of alternating directions. VII Mezhdunarodnaya nauchno-tekhnicheskaya molodezhnaya konferentsiya "Budushchee tekhnicheskoy nauki", Nizhniy Novgorod, 16.05.2008: tezisy doklada [The Future of the Engineering Science. VII International Scientific and Technical Youth Conference: Book of Abstracts], 2008. P. 33-34. (in Russian)
16. Rosenblum M., Yacoob Y., Davis L. Human expression recognition from motion using a radial basis function network architecture. IEEE Transactions on Neural Networks, 1996. Vol. 7, no. 5. P. 1121-1138. DOI: 10.1109/72.536309
17. Schlesinger M.I., Vodolazskii Y.V. and Yakovenko V.M. Recognizing the similarity of polygons in a strengthened Hausdorff metric. Cybern. Syst. Anal., 2014. Vol. 50, no. 3. P. 476-486. DOI: 10.1007/s10559-014-9636-2
18. Ushakov V.N., Lebedev P.D. Iterative methods for minimization of the Hausdorff distance between movable polygons. Vestn. Udmurtsk. Univ. Mat. Mekh. Komp. Nauki, 2017. Vol. 27, no. 1. P. 86-97. (in Russian) DOI: 10.20537/vm170108
19. Ushakov V.N., Lakhtin A.S., Lebedev P.D. Optimization of the Hausdorff distance between sets in Euclidean space. Proc. Steklov Inst. Math., 2015. Vol. 291, Suppl. 1. P. 222-238. DOI: 10.1134/S0081543815090151
20. Yaksubaev K.D., Shuklina Yu.A. Metric space of unlimited convex sets and unlimited polyhedron. Mezhdunarodnyy nauchno-issledovatel'skiy zhurnal [International Research Journal], 2017. No. 5 (59), part 3. P. 162-164. DOI: 10.23670/IRJ.2017.59.103
21. Zhang J., Liu Y. Medical image registration based on wavelet transform using Hausdorff distance. Transactions on Edutainment VII, Lecture Notes in Computer Science, 2012. Vol. 7145. P. 248-254. DOI: 10.1007/978-3-642-29050-3_24