Alternaria is one of the most cosmopolitan fungal genera encountered and impacts humans and human activities in areas of material degradation, phytopathology, food toxicology, and respiratory disease. Contemporary methods of taxon identification rely on assessments of morphology related to sporulation, which are critical for accurate diagnostics. However, the morphology of Alternaria is quite complex, and precise characterization can be laborious, time-consuming, and often restricted to experts in this field. To make morphological characterization easier and more broadly accessible, a generalized statistical model was developed for the three-dimensional geometric structure of the sporulation apparatus. The model is inspired by Lindenmayer systems (L-systems), the widely used grammar-based models for plants, which build structure by repeated application of rules for growth. Adjusting the parameters of the underlying probability distributions yields variations in the morphology, and thus the approach provides an excellent tool for exploring the morphology of Alternaria under different assumptions, as well as for understanding how it is largely the consequence of local rules for growth. Further, different choices of parameters lead to different model groups, which can then be visually compared to published descriptions or microscopy images to validate parameters for species-specific models. The approach supports automated analysis, as the models can be fit to image data using statistical inference, and the explicit representation of the geometry allows the accurate computation of any morphological quantity. Furthermore, because the model can encode the statistical variation of geometric parameters for different species, it will allow automated species identification from microscopy images using statistical inference. In summary, the approach supports visualization of morphology, automated quantification of phenotype structure, and identification based on form.
© 2011 British Mycological Society.
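The core mechanism the abstract describes, building variable structure by repeated stochastic application of local growth rules, can be sketched as a minimal stochastic L-system. The symbols, productions, and probabilities below are invented for illustration; they are not the paper's fitted distributions for Alternaria.

```python
import random

# Stochastic L-system sketch: each symbol is rewritten by a production
# drawn at random, so repeated application of purely local rules yields
# variable branching structures. Rules here are illustrative only.
RULES = {
    "A": [(0.6, "A[B]A"),   # elongate and branch
          (0.4, "AB")],     # elongate, then terminate in a spore cell
    "B": [(1.0, "B")],      # spore-bearing cell: no further growth
}

def rewrite(symbol, rng):
    """Pick one production for `symbol` according to rule probabilities."""
    r, acc = rng.random(), 0.0
    for p, production in RULES.get(symbol, [(1.0, symbol)]):
        acc += p
        if r < acc:
            return production
    return symbol

def grow(axiom, steps, seed=0):
    """Apply the rewriting rules `steps` times starting from `axiom`."""
    rng = random.Random(seed)
    s = axiom
    for _ in range(steps):
        s = "".join(rewrite(c, rng) if c in RULES else c for c in s)
    return s

structure = grow("A", 3, seed=0)
```

Varying the seed (or the probabilities, standing in for the paper's parameterized distributions) produces different morphologies from the same grammar, which is the sense in which parameter choices define model groups.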
We present a general model for tracking smooth trajectories of multiple targets in complex data sets, where tracks potentially cross each other many times. As the number of overlapping trajectories grows, exploiting smoothness becomes increasingly important to disambiguate the association of successive points. However, in many important problems an effective parametric model for the trajectories does not exist. Hence we propose modeling trajectories as independent realizations of Gaussian processes with kernel functions which allow for arbitrary smooth motion. Our generative statistical model accounts for the data as coming from an unknown number of such processes, together with expectations for noise points and the probability that points are missing. For inference we compare two methods: a modified version of the Markov chain Monte Carlo data association (MCMCDA) method, and a Gibbs sampling method which is much simpler and faster, and gives better results by being able to search the solution space more efficiently. In both cases, we compare our results against the smoothing provided by linear dynamical systems (LDS). We test our approach on videos of birds and fish, and on 82 image sequences of pollen tubes growing in a Petri dish, each with up to 60 tubes with multiple crossings. We achieve 93% accuracy on image sequences with up to ten trajectories (35 sequences) and 88% accuracy when there are more than ten (42 sequences). This performance surpasses that of using an LDS motion model, and far exceeds a simple heuristic tracker. © 2011 IEEE.
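The key modeling idea, a nonparametric smoothness prior in place of a parametric motion model, can be sketched with plain Gaussian-process regression. The RBF kernel below encodes only that motion is smooth; the hyperparameters and the toy one-dimensional track are illustrative, not values from the paper.

```python
import numpy as np

def rbf(t1, t2, length=2.0, var=1.0):
    """Squared-exponential kernel over observation times."""
    d = t1[:, None] - t2[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_smooth(times, obs, noise=0.1, query=None):
    """GP posterior mean of positions at `query` times given noisy points."""
    query = times if query is None else query
    K = rbf(times, times) + noise**2 * np.eye(len(times))
    Ks = rbf(query, times)
    return Ks @ np.linalg.solve(K, obs)  # applies per coordinate column

# Toy single-target track: a smooth curve with additive observation noise.
t = np.arange(10, dtype=float)
true_x = np.sin(0.5 * t)
noisy = true_x + 0.05 * np.random.default_rng(0).standard_normal(10)
smooth = gp_smooth(t, noisy)
```

In the full model, the posterior likelihood of a candidate point under each track's GP is what disambiguates crossing trajectories during data association; the sketch shows only the smoothing component.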
We present a comprehensive strategy for evaluating image retrieval algorithms. Because automated image retrieval is only meaningful in its service to people, performance characterization must be grounded in human evaluation. Thus we have collected a large data set of human evaluations of retrieval results, both for query by image example and query by text. The data is independent of any particular image retrieval algorithm and can be used to evaluate and compare many such algorithms without further data collection. The data and calibration software are available on-line (http://kobus.ca/research/data). We develop and validate methods for generating sensible evaluation data, calibrating for disparate evaluators, mapping image retrieval system scores to the human evaluation results, and comparing retrieval systems. We demonstrate the process by providing grounded comparison results for several algorithms. © 2005 IEEE.
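One step the abstract names, calibrating for disparate evaluators before pooling their judgments, can be sketched by standardizing each evaluator's raw scores so a harsh and a lenient grader contribute comparably. The rating data below is invented for illustration; the paper's actual calibration uses its released data and software.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Map one evaluator's scores to zero mean and unit variance."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s if s > 0 else 0.0 for x in scores]

def pooled_item_scores(ratings):
    """ratings: dict of evaluator -> scores over the same retrieval results.
    Returns one calibrated score per result, averaged across evaluators."""
    calibrated = [standardize(v) for v in ratings.values()]
    return [mean(col) for col in zip(*calibrated)]

# Two evaluators with the same preference ordering on different raw scales.
ratings = {
    "harsh":   [1, 2, 1, 3],
    "lenient": [3, 4, 3, 5],
}
pooled = pooled_item_scores(ratings)
```

After standardization the two evaluators agree exactly here, so the pooled scores preserve the shared ordering; retrieval systems can then be compared by how well their scores track these calibrated human judgments.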
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation) and corresponding to particular image regions (region naming). Auto-annotation might help organize and access large collections of images. Region naming is a model of object recognition as a process of translating image regions to words, much as one might translate from one language to another. Learning the relationships between image regions and semantic correlates (words) is an interesting example of multi-modal data mining, particularly because it is typically hard to apply data mining techniques to collections of images. We develop a number of models for the joint distribution of image regions and words, including several which explicitly learn the correspondence between regions and words. We study multi-modal and correspondence extensions to Hofmann's hierarchical clustering/aspect model, a translation model adapted from statistical machine translation (Brown et al.), and a multi-modal extension to mixture of latent Dirichlet allocation (MoM-LDA). All models are assessed using a large collection of annotated images of real scenes. We study in depth the difficult problem of measuring performance. For the annotation task, we look at prediction performance on held-out data. We present three alternative measures, oriented toward different types of task. Measuring the performance of correspondence methods is harder, because one must determine whether a word has been placed on the right region of an image. We can use annotation performance as a proxy measure, but accurate measurement requires hand-labeled data, and thus must occur on a smaller scale. We show results using both an annotation proxy, and manually labeled data.
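The translation-model view of region naming can be sketched with IBM-Model-1-style EM: treat each image's region tokens (e.g. cluster labels of segmented blobs) as the source language and its annotation words as the target, then estimate p(word | region) from co-occurrence. The toy corpus and token names below are invented for illustration and are not the paper's data or features.

```python
from collections import defaultdict

# Each item pairs an image's region tokens with its annotation words.
corpus = [
    (["sky_blob", "grass_blob"], ["sky", "grass"]),
    (["sky_blob", "water_blob"], ["sky", "water"]),
    (["grass_blob", "water_blob"], ["grass", "water"]),
]

regions = {r for rs, _ in corpus for r in rs}
words = {w for _, ws in corpus for w in ws}
t = {(w, r): 1.0 / len(words) for w in words for r in regions}  # uniform init

for _ in range(20):                       # EM for translation probabilities
    count = defaultdict(float)
    total = defaultdict(float)
    for rs, ws in corpus:                 # E-step: expected soft alignments
        for w in ws:
            z = sum(t[(w, r)] for r in rs)
            for r in rs:
                c = t[(w, r)] / z
                count[(w, r)] += c
                total[r] += c
    for (w, r), c in count.items():       # M-step: renormalize per region
        t[(w, r)] = c / total[r]
```

Because "sky_blob" co-occurs with "sky" more often than with any other word, EM concentrates p("sky" | "sky_blob"), which is exactly the correspondence that region naming needs and that annotation-only models do not learn explicitly.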