Jacobus J Barnard

Professor, Computer Science
Associate Director, Faculty Affairs-SISTA
Professor, Electrical and Computer Engineering
Professor, Cognitive Science - GIDP
Professor, Genetics - GIDP
Professor, Statistics - GIDP
Professor, BIO5 Institute
Member of the General Faculty
Member of the Graduate Faculty
Contact
(520) 621-4632

Research Interest

Kobus Barnard, PhD, is an associate professor in the recently formed University of Arizona School of Information: Science, Technology, and Arts (SISTA), created to foster computational approaches across disciplines in both research and education. He also has University of Arizona appointments with Computer Science, ECE, Statistics, Cognitive Science, and BIO5. He leads the Interdisciplinary Visual Intelligence Lab (IVILAB), currently housed in SISTA. Research in the IVILAB revolves around building top-down statistical models that link theory and semantics to data; such models support going from data to knowledge using Bayesian inference. Much of this work is in the context of inferring semantics and geometric form from images and video. For example, in collaboration with multiple researchers, the IVILAB has applied this approach to problems in computer vision (e.g., tracking people in 3D from video, understanding 3D scenes from images, and learning models of object structure) and biological image understanding (e.g., tracking pollen tubes growing in vitro, inferring the morphology of neurons grown in culture, extracting the 3D structure of filamentous fungi of the genus Alternaria from brightfield microscopy image stacks, and extracting the 3D structure of Arabidopsis plants). An additional IVILAB research project, Semantically Linked Instructional Content (SLIC), aims to improve access to educational video through searching and browsing.

Dr. Barnard holds an NSF CAREER grant and has received support from three additional NSF grants, the DARPA Mind's Eye program, ONR, the Arizona Biomedical Research Commission (ABRC), and a BIO5 seed grant. He was supported by NSERC (Canada) during graduate and post-graduate studies (NSERC A, B, and PDF). His work on computational color constancy was awarded the Governor General's gold medal for the best dissertation across disciplines at SFU. He has published over 80 papers, including one awarded best paper on cognitive computer vision in 2002.

Publications

Barnard, K., & Funt, B. (1999). Camera calibration for color research. Proceedings of SPIE - The International Society for Optical Engineering, 3644, 576-585.

Abstract:

In this paper we introduce a new method for determining the relationship between signal spectra and camera RGB which is required for many applications in color. We work with the standard camera model, which assumes that the response is linear. We also provide an example of how the fitting procedure can be augmented to include fitting for a previously estimated non-linearity. The basic idea of our method is to minimize squared error subject to linear constraints, which enforce positivity and range of the result. It is also possible to constrain the smoothness, but we have found that it is better to add a regularization expression to the objective function to promote smoothness. With this method, smoothness and error can be traded against each other without being restricted by arbitrary bounds. The method is easily implemented as it is an example of a quadratic programming problem, for which there are many software solutions available. In this paper we provide the results using this method and others to calibrate a Sony DXC-930 CCD color video camera. We find that the method gives low error, while delivering sensors which are smooth and physically realizable. Thus we find the method superior to methods which ignore any of these considerations.
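The constrained fit described in this abstract is a standard quadratic program: a least-squares data term, a smoothness regularizer, and linear positivity/range constraints. Below is a minimal sketch of that formulation, assuming the signal spectra are stacked in a matrix and using CVXPY as the solver; the paper does not name a particular package, and the variable names and regularization weight are illustrative only.

```python
# A minimal sketch (not the paper's code) of the constrained, regularized
# least-squares fit the abstract describes: recover one channel of a camera's
# spectral sensitivity from measured spectra and camera responses.
import numpy as np
import cvxpy as cp

def fit_sensor(spectra, responses, lam=1e-2):
    """spectra: (m, n) matrix of signal spectra sampled at n wavelengths;
    responses: (m,) camera responses for one channel; lam: smoothness weight."""
    n = spectra.shape[1]
    s = cp.Variable(n)                       # unknown sensor sensitivity
    # Second-difference operator used as a smoothness regularizer.
    D = np.diff(np.eye(n), n=2, axis=0)
    data_term = cp.sum_squares(spectra @ s - responses)
    smooth_term = lam * cp.sum_squares(D @ s)
    # Positivity and an upper bound keep the sensor physically realizable.
    constraints = [s >= 0, s <= 1]
    prob = cp.Problem(cp.Minimize(data_term + smooth_term), constraints)
    prob.solve()
    return s.value
```

Varying the regularization weight trades smoothness against fitting error, as the abstract describes, without imposing arbitrary hard bounds on smoothness.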

Barnard, K., & Forsyth, D. (2001). Exploiting image semantics for picture libraries. Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, 469.

Abstract:

A system for learning the semantics of collections of images from features and associated text is discussed. The application of this system to digital image libraries is explored. The nature of search and browsing is considered, and it is argued that for many applications these should be used together.

Schlecht, J., & Barnard, K. (2009). Learning models of object structure. Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference, 1615-1623.

Abstract:

We present an approach for learning stochastic geometric models of object categories from single-view images. We focus here on models expressible as a spatially contiguous assemblage of blocks. Model topologies are learned across groups of images, and one or more such topologies are linked to an object category (e.g., chairs). Fitting learned topologies to an image can be used to identify the object class, as well as detail its geometry. The latter goes beyond labeling objects, as it provides the geometric structure of particular instances. We learn the models using joint statistical inference over category parameters, camera parameters, and instance parameters. These produce an image likelihood through a statistical imaging model. We use trans-dimensional sampling to explore topology hypotheses, and alternate between Metropolis-Hastings and stochastic dynamics to explore instance parameters. Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held-out single-view images.
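For readers unfamiliar with the sampling machinery mentioned above, the following is a minimal, generic Metropolis-Hastings sketch over a fixed-dimensional parameter vector. It is not the paper's implementation, which alternates such moves with stochastic-dynamics updates and trans-dimensional jumps over model topologies; the log_posterior function and the Gaussian proposal scale are placeholders.

```python
# Generic random-walk Metropolis-Hastings over a continuous parameter vector.
import numpy as np

def metropolis_hastings(log_posterior, theta0, n_steps=10_000, step=0.05,
                        rng=None):
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    logp = log_posterior(theta)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal around the current state.
        proposal = theta + step * rng.standard_normal(theta.shape)
        logp_prop = log_posterior(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
        samples.append(theta.copy())
    return np.array(samples)
```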

Barnard, K., Duygulu, P., & Forsyth, D. (2003). Recognition as translating images into text. Proceedings of SPIE - The International Society for Optical Engineering, 5018, 168-178.

Abstract:

We present an overview of a new paradigm for tackling long-standing computer vision problems. Specifically, our approach is to build statistical models which translate from visual representations (images) to semantic ones (associated text). As providing optimal text for training is difficult at best, we propose working with whatever associated text is available in large quantities. Examples include large image collections with keywords, museum image collections with descriptive text, news photos, and images on the web. In this paper we discuss how the translation approach can give a handle on difficult questions such as: What counts as an object? Which objects are easy to recognize and which are hard? Which objects are indistinguishable using our features? How can low-level vision processes, such as feature-based segmentation, be integrated with high-level processes such as grouping? We also summarize some of the models proposed for translating from visual information to text, and some of the methods used to evaluate their performance.
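As a toy illustration of the translation idea (mapping discrete image-region tokens to caption words), here is an IBM-Model-1-style EM lexicon learner. This is a hedged sketch of the general technique, not the authors' code; the data layout (lists of quantized "blob" tokens paired with unordered word lists) and the function name are assumptions made for illustration.

```python
# Toy EM lexicon learning: estimate t(word | blob) from loosely paired data.
from collections import defaultdict

def learn_lexicon(corpus, n_iters=20):
    """corpus: list of (blobs, words) pairs, each a list of discrete tokens."""
    vocab = {w for _, words in corpus for w in words}
    # Uniform initialization of t(w | b).
    t = defaultdict(lambda: 1.0 / len(vocab))
    for _ in range(n_iters):
        counts = defaultdict(float)   # expected counts for (w, b)
        totals = defaultdict(float)   # expected counts for b
        # E-step: distribute each word's count over the image's blobs.
        for blobs, words in corpus:
            for w in words:
                norm = sum(t[(w, b)] for b in blobs)
                for b in blobs:
                    c = t[(w, b)] / norm
                    counts[(w, b)] += c
                    totals[b] += c
        # M-step: renormalize so that t(. | b) sums to one for each blob.
        t = defaultdict(float,
                        {(w, b): counts[(w, b)] / totals[b]
                         for (w, b) in counts})
    return t
```

Given such a lexicon, annotating a new image amounts to scoring words against its blob tokens, which is the flavor of image-to-text translation the abstract surveys.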

Barnard, K., Duygulu, P., & Forsyth, D. (2001). Clustering art. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2, II434-II441.

Abstract:

We extend a recently developed method for learning the semantics of image databases using text and pictures. We incorporate statistical natural language processing in order to deal with free text. We demonstrate the current system on a difficult dataset, namely 10,000 images of work from the Fine Arts Museum of San Francisco. The images include line drawings, paintings, and pictures of sculpture and ceramics. Many of the images have associated free text whose nature varies greatly, from physical description to interpretation and mood. We use WordNet to provide semantic grouping information and to help disambiguate word senses, as well as to emphasize the hierarchical nature of semantic relationships. This allows us to impose a natural structure on the image collection that reflects semantics to a considerable degree. Our method produces a joint probability distribution for words and picture elements. We demonstrate that this distribution can be used (a) to provide illustrations for given captions and (b) to generate words for images outside the training set. Results from this annotation process yield a quantitative study of our method. Finally, our annotation process can be seen as a form of object recognizer that has been learned through a partially supervised process.
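The WordNet-based hierarchical grouping mentioned above can be illustrated with NLTK's WordNet interface; the paper does not specify a toolkit, and the function below, with its crude first-sense choice, is illustrative only.

```python
# Sketch: walk up WordNet hypernym chains to get coarser semantic labels
# for a caption word (requires: nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

def hypernym_chain(word, max_depth=5):
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return []
    chain, synset = [], synsets[0]        # crude sense choice: first noun sense
    for _ in range(max_depth):
        hypernyms = synset.hypernyms()
        if not hypernyms:
            break
        synset = hypernyms[0]
        chain.append(synset.lemmas()[0].name())
    return chain

# e.g. hypernym_chain("painting") yields progressively broader terms, so
# images whose captions differ at the word level can still be grouped
# where their hypernym chains agree higher in the hierarchy.
```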