We describe a fast connected components labeling algorithm using a region coloring approach. It computes region attributes such as size, moments, and bounding boxes in a single pass through the image. Working in the context of real-time pupil detection for an eye tracking system, we compare the time performance of our algorithm with a contour tracing-based labeling approach and a region coloring method developed for a hardware eye detection system. We find that region attribute extraction performance exceeds that of these comparison methods. Further, labeling each pixel, which requires a second pass through the image, has comparable performance. © Springer-Verlag 2009.
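The single-pass idea described above can be sketched as a flood-fill-style region coloring that accumulates each region's size, first moments, and bounding box as pixels are visited. This is an illustrative sketch only, not the paper's exact algorithm; the function name, 4-connectivity, and binary-image input are assumptions.

```python
from collections import deque

def region_attributes(img):
    """Region coloring sketch: flood-fill each unvisited foreground pixel,
    accumulating size, first moments (for the centroid), and bounding box
    per region in a single traversal of the image. Illustrative only."""
    h, w = len(img), len(img[0])
    visited = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if not img[sy][sx] or visited[sy][sx]:
                continue
            size, mx, my = 0, 0, 0
            x0, y0, x1, y1 = sx, sy, sx, sy
            q = deque([(sy, sx)])
            visited[sy][sx] = True
            while q:
                y, x = q.popleft()
                size += 1; mx += x; my += y
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                # 4-connected neighbors (an assumption; 8-connectivity is
                # equally common for blob labeling).
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] \
                            and not visited[ny][nx]:
                        visited[ny][nx] = True
                        q.append((ny, nx))
            regions.append({"size": size,
                            "centroid": (mx / size, my / size),
                            "bbox": (x0, y0, x1, y1)})
    return regions
```

Note that no per-pixel label image is produced here; as the abstract observes, assigning labels to every pixel would require a second pass (or a label buffer written during the fill).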
We test a number of the leading computational color constancy algorithms using a comprehensive set of images. The images were of 33 different scenes under 11 different illuminants representative of common lighting conditions. The algorithms studied include two gray world methods, a version of the Retinex method, several variants of Forsyth's gamut-mapping method, Cardei et al.'s neural net method, and Finlayson et al.'s Color by Correlation method. We discuss a number of issues in applying color constancy ideas to image data, and study in depth the effect of different preprocessing strategies. We compare the performance of the algorithms on image data with their performance on synthesized data. All data used for this study are available online at http://www.cs.sfu.ca/~color/data, and implementations for most of the algorithms are also available (http://www.cs.sfu.ca/~color/code). Experiments with synthesized data (part one of this paper) suggested that the methods which emphasize the use of the input data statistics, specifically Color by Correlation and the neural net algorithm, are potentially the most effective at estimating the chromaticity of the scene illuminant. Unfortunately, we were unable to realize comparable performance on real images. Here, exploiting pixel intensity proved to be more beneficial than exploiting the details of image chromaticity statistics, and the three-dimensional (3-D) gamut-mapping algorithms gave the best performance.
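Of the algorithms listed above, gray world is the simplest to state and makes a useful reference point: it assumes the scene average is achromatic, so the mean RGB is taken as proportional to the illuminant color, which is then discounted with a per-channel (von Kries-style) scaling. The sketch below assumes pixels as RGB triples and is illustrative only.

```python
def gray_world_estimate(pixels):
    """Gray-world illuminant estimate: the mean RGB of the image is
    assumed proportional to the illuminant color (illustrative sketch)."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

def diagonal_correct(pixels, est):
    """Von Kries-style diagonal correction: scale each channel so the
    estimated illuminant maps to a neutral gray."""
    g = sum(est) / 3.0
    return [[p[c] * g / est[c] for c in range(3)] for p in pixels]
```

A reddish cast, for example, yields a mean with an elevated R component, so the correction attenuates R relative to G and B.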
Sensor sharpening has been proposed as a method for improving color constancy algorithms, but it has not previously been tested in the context of actual color constancy computations. In this paper we test sensor sharpening in this role for three different cameras, the human cone sensitivity estimates, and the XYZ response curves. We find that when the sensors are already relatively sharp, sensor sharpening does not offer much improvement and can have a detrimental effect. However, when the sensors are less sharp, sharpening can have a substantial positive effect. The degree of improvement is heavily dependent on the particular color constancy algorithm. Thus we conclude that using sensor sharpening for improving color constancy can offer a significant benefit, but its use needs to be evaluated with respect to both the sensors and the algorithm.
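Sensor sharpening, as discussed above, amounts to transforming sensor responses into a "sharpened" basis with a linear map T, applying a diagonal illuminant correction there, and mapping back. A minimal sketch, where T and the diagonal gains d are illustrative placeholders rather than fitted values:

```python
import numpy as np

def sharpened_diagonal_correction(rho, T, d):
    """Sharpened diagonal correction sketch: rho' = T^-1 D T rho, where
    rho is the RGB sensor response, T is a 3x3 sharpening transform, and
    D = diag(d) is the per-channel illuminant correction applied in the
    sharpened basis. T and d here are placeholders, not fitted values."""
    T = np.asarray(T, dtype=float)
    rho = np.asarray(rho, dtype=float)
    return np.linalg.inv(T) @ (np.diag(d) @ (T @ rho))
```

With T equal to the identity this reduces to the ordinary diagonal (von Kries) model; the point of sharpening is that a well-chosen T makes the diagonal model a better fit for broad, overlapping sensors.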
We introduce the use of images for word sense disambiguation, either alone or in conjunction with traditional text-based methods. The approach is based on a recently developed method for automatically annotating images by using a statistical model for the joint probability of image regions and words. The model itself is learned from a database of images with associated text. To use the model for word sense disambiguation, we constrain the predicted words to be possible senses for the word under consideration. When word prediction is constrained to a narrow set of choices (such as possible senses), it can be quite reliable. We report on experiments using the resulting sense probabilities as is, as well as augmenting a state-of-the-art text-based word sense disambiguation algorithm. In order to evaluate our approach, we developed a new corpus, ImCor, which consists of a substantive portion of the Corel image data set associated with disambiguated text drawn from the SemCor corpus. Our experiments using this corpus suggest that visual information can be very useful in disambiguating word senses. They also illustrate that associated non-textual information such as image data can help ground language meaning. © 2005 Elsevier B.V. All rights reserved.
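The key constraint step described above, restricting the image-based word predictor to the candidate senses of the ambiguous word and renormalizing, can be sketched as follows. The function name and the dictionary representation of the predictor's scores are assumptions for illustration.

```python
def image_sense_probs(word_scores, candidate_senses):
    """Restrict an image-based word predictor's scores to the candidate
    senses of the ambiguous word and renormalize over that narrow set
    (illustrative sketch; score values are placeholders)."""
    scores = {s: word_scores.get(s, 0.0) for s in candidate_senses}
    z = sum(scores.values())
    if z == 0.0:
        # No image evidence for any sense: fall back to uniform.
        return {s: 1.0 / len(candidate_senses) for s in candidate_senses}
    return {s: v / z for s, v in scores.items()}
```

The resulting distribution can be used on its own, or mixed with a text-based disambiguator's posterior (e.g., by linear interpolation), as in the experiments the abstract describes.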
The main thrust of this paper is to modify the multi-scale retinex (MSR) approach to image enhancement so that the processing is more justified from a theoretical standpoint. This leads to a new algorithm with fewer arbitrary parameters that is more flexible, maintains color fidelity, and still preserves the contrast-enhancement benefits of the original MSR method. To accomplish this we identify the explicit and implicit processing goals of MSR. By decoupling the MSR operations from one another, we build an algorithm composed of independent steps that separates out the issues of gamma adjustment, color balance, dynamic range compression, and color enhancement, which are all jumbled together in the original MSR method. We then extend MSR with color constancy and chromaticity-preserving contrast enhancement.
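The dynamic-range-compression core that MSR builds on is the single-scale retinex: the log of the signal minus the log of its Gaussian-smoothed surround, averaged over several surround scales. A minimal 1-D sketch (the 2-D case smooths with a 2-D Gaussian; scale choices here are placeholders):

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of the given radius."""
    k = [math.exp(-(x * x) / (2.0 * sigma * sigma))
         for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def single_scale_retinex(signal, sigma):
    """SSR on a positive 1-D luminance signal: log(signal) minus log of
    its Gaussian surround, with edge clamping. Illustrative sketch."""
    r = int(3 * sigma)
    k = gaussian_kernel(sigma, r)
    out = []
    n = len(signal)
    for i in range(n):
        surround = sum(k[j + r] * signal[min(max(i + j, 0), n - 1)]
                       for j in range(-r, r + 1))
        out.append(math.log(signal[i]) - math.log(surround))
    return out

def multi_scale_retinex(signal, sigmas=(1.0, 4.0, 16.0)):
    """MSR: average SSR outputs over several surround scales
    (the scale values are placeholders, not the paper's settings)."""
    ssr = [single_scale_retinex(signal, s) for s in sigmas]
    return [sum(v) / len(sigmas) for v in zip(*ssr)]
```

A uniform signal maps to zero everywhere, which is exactly the behavior the decoupled design above isolates: this step performs only dynamic range compression, leaving gamma adjustment, color balance, and color enhancement to separate stages.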