PH.D. DISSERTATION PREFACE -- Henry S. Baird -- Xerox PARC

PH.D. DISSERTATION PREFACE -- Henry S. Baird

____________________________________________________________

Ph.D. Dissertation Preface.

This work deals with the computer vision task of recognizing rigid patterns in the plane which have undergone unknown distortions. We view a pattern as a set of "locally-defined" features: that is, each can be located, independently of the rest, by examining small regions of the image.

Given a model pattern and an image, the goals of recognition are to match each image feature with the corresponding model feature, and to locate the overall instance of the model within the image. Finding such a "global" matching can be difficult, and provably near-linear-time algorithms to do so are rare.

Distortions studied include arbitrary translation, rotation, and scaling, and bounded location error ("noise"). Many prior workers have restricted attention to translation and small changes in rotation and scale. We permit noise bounds to be specified as arbitrary convex polygons about each model feature location.

We show that efficient recognition is possible even when only the location of features is known. A pruned tree-search method is developed which makes use of the Soviet ellipsoid algorithm for testing feasibility of systems of linear constraints. We give a general analysis of the geometry of constraints on distortions which are consistent with arbitrary matchings. The expected number of feasible partial matchings examined to find m successful matchings is proven to be O( m n ), where n is the number of pattern features. An interesting blend of theoretical analysis and practical implementation shows that the algorithm has an overall expected runtime that is theoretically asymptotically quadratic in n, but practically linear for patterns with fewer than 100 features.

These results hold for a class of random patterns, under moderate noise likely to occur in practice, and assuming there are no spurious or missing feature points in the image. If spurious or missing features occur, the algorithm's asymptotic runtime will worsen.

The approach extends directly to 3D patterns and more general affine distortions, at generally higher asymptotic costs. Although we have exploited only location of features, other properties such as orientation, size, order along boundaries, and semantic labels, can be easily used within the same framework of pruned search, for improved results.

baird@parc.xerox.com

Last updated December 7, 1998.