Looking through the lens of Information Theory (IT) has proved effective for both formulating and designing algorithmic solutions to many problems in computer vision and pattern recognition (CVPR): image matching, clustering and segmentation, salient point detection, feature selection and dimensionality reduction, projection pursuit, optimal classifier design, and many others. Researchers are now widely bringing IT elements into the CVPR arena. Among these elements are measures (entropy, mutual information, Kullback–Leibler divergence, Jensen–Shannon divergence...), principles (maximum entropy, minimax entropy, minimum description length...) and theories (rate distortion theory, coding, the method of types...).
This book introduces and explores these elements, together with entropy estimation, following an approach of incremental complexity. At the same time, the main CVPR problems are formulated and the most representative algorithms are presented, selected according to the authors’ preferences for sketching the IT–CVPR field. Interesting connections between IT elements applied to different problems are highlighted, with the aim of drawing a basic, skeletal research roadmap. This roadmap is far from comprehensive at present, owing to time and space constraints and to the current state of development of the approach. The result is a novel tool, unique in its conception, for both CVPR and IT researchers, intended to contribute as much as possible to the cross-fertilization of both areas.
The motivation for and origin of this manuscript is our awareness that many scattered sources of IT-based solutions to CVPR problems exist, and that a systematic text focusing on one important question is lacking: How useful is IT for CVPR? At the same time, we needed a research language common to all the members of the Robot Vision Group. Energy minimization, graph theory, and Bayesian inference, among others, were adequate methodological tools in our daily research. Consequently, these tools were key to designing and building a solid background for our Ph.D. students. We soon realized that IT was a unifying component that flowed naturally through our rationales for tackling CVPR problems. Thus, some of us took on the task of writing a text in which we could advance as far as possible in establishing the fundamental links between CVPR and IT. Readers (beginners and senior researchers alike) will judge to what extent we have both answered the above fundamental question and reached our objectives.
Although the text is addressed to CVPR–IT researchers and students, it is also open to an interdisciplinary audience. One of the most interesting examples is the computational vision community, which includes people interested in both biological vision and psychophysics. Other examples are roboticists and people interested in developing wearable solutions for the visually impaired (which is the subject of active work in our research group).
In its basic conception, this text may be used for a one-semester IT-based course on CVPR. Only some rudiments of algebra and probability are necessary. IT items are introduced as the text flows from one computer vision or pattern recognition problem to another. We have deliberately avoided a succession of theorem–proof pairs for the sake of a smooth presentation. Proofs, when needed, are embedded in the text, and they are usually excellent pretexts for presenting or highlighting interesting properties of IT elements. Numerical examples with toy settings of the problems are often included to give a better understanding of the IT-based solution. When formal elements from other branches of mathematics, such as field theory and optimization, are needed, we present them briefly and refer to excellent books fully dedicated to their description.
Problems, questions and exercises are also proposed at the end of each chapter. The purpose of the problems section is not only to consolidate what has been learnt, but also to go one step further by testing the ability to generalize the concepts presented in each chapter. This section is preceded by a brief literature review that outlines the key papers for the CVPR topic that is the subject of the chapter. The references to these papers, together with sketched solutions to the problems, will be made progressively accessible on the website http://www.rvg.ua.es/ITinCVPR.
We start the book with a brief introduction (Chapter 1) to the four axes of IT–CVPR interaction (measures, principles, theories, and entropy estimators). There we also present the skeletal research roadmap (the ITinCVPR tube). Then we walk through six chapters, each one tackling a different problem from the IT perspective. Chapter 2 is devoted to interest points, edge detection, and grouping; interest points allow us to introduce the concept of entropy and its links with Chernoff information, Sanov’s theorem, phase transitions and the method of types. Chapter 3 covers contour and region-based image segmentation, mainly from the perspective of model order selection through the minimum description length (MDL) principle, although the Jensen–Shannon measure and the Jaynes principle of maximum entropy are also introduced; the question of learning a segmentation model is tackled through links with maximum entropy and belief propagation; and the unification of generative and discriminative processes for segmentation and recognition is explored through information divergence measures. Chapter 4 reviews registration, matching, and recognition by considering the following: image registration through minimization of mutual information and related measures; alternative derivations of Jensen–Shannon divergence that yield deformable matching; shape comparison through Fisher information; and structural matching and learning driven by MDL. Chapter 5 is devoted to image and pattern clustering and is mainly rooted in three IT approaches to clustering: Gaussian mixtures (incremental method for adequate order selection), information bottleneck (agglomerative and robust with model order selection) and mean-shift; IT is also present in initial proposals for ensembles clustering (consensus finding). Chapter 6 reviews the main approaches to feature selection and transformation: simple wrappers and filters exploiting IT for bypassing the curse of dimensionality; the minimax entropy principle for learning patterns using a generative approach; and ICA/gPCA methods based on IT (ICA and neg-entropy, info-max and minimax ICA, generalized PCA and effective dimension). Finally, Chapter 7, Classifier Design, analyzes the main IT strategies for building classifiers. This includes decision trees, but also multiple trees and random forests, and how to improve boosting algorithms by means of IT-based criteria. This final chapter ends with an information projection analysis of maximum entropy classifiers and a careful exploration of the links between Bregman divergences and classifiers.
We acknowledge the contribution of many people to this book. First, we thank the many scientists who offered guidance and support and who have made important contributions to the field. Researchers from different universities and institutions such as Alan Yuille, Hamid Krim, Chen Ping-Feng, Gozde Unal, Ajit Rajwadee, Anand Rangarajan, Edwin Hancock, Richard Nock, Shun-ichi Amari, and Mario Figueiredo, among many others, contributed their advice, deep knowledge and highly qualified expertise. We also thank all our colleagues in the Robot Vision Group of the University of Alicante, especially Antonio Peñalver, Juan Manuel Sáez, and Miguel Cazorla, who contributed figures, algorithms, and important results from their research. Finally, we thank the editorial board staff: Catherine Brett for her initial encouragement and support, and Simon Rees and Wayne Wheeler for their guidance and patience.
University of Alicante, Spain
(© 2009, Springer)