Jacobus J Barnard
Associate Director, Faculty Affairs-SISTA
Associate Professor, BIO5 Institute
Associate Professor, Electrical and Computer Engineering
Professor, Cognitive Science - GIDP
Professor, Computer Science
Professor, Genetics - GIDP
Professor, Statistics - GIDP
(520) 621-6613
Research Interest
Kobus Barnard, PhD, is an associate professor in the recently formed University of Arizona School of Information: Science, Technology, and Arts (SISTA), created to foster computational approaches across disciplines in both research and education. He also has University of Arizona appointments with Computer Science, ECE, Statistics, Cognitive Science, and BIO5. He leads the Interdisciplinary Visual Intelligence Lab (IVILAB), currently housed in SISTA.

Research in the IVILAB revolves around building top-down statistical models that link theory and semantics to data. Such models support going from data to knowledge using Bayesian inference. Much of this work is in the context of inferring semantics and geometric form from images and video. For example, in collaboration with multiple researchers, the IVILAB has applied this approach to problems in computer vision (e.g., tracking people in 3D from video, understanding 3D scenes from images, and learning models of object structure) and biological image understanding (e.g., tracking pollen tubes growing in vitro, inferring the morphology of neurons grown in culture, extracting the 3D structure of filamentous fungi from the genus Alternaria from brightfield microscopy image stacks, and extracting the 3D structure of Arabidopsis plants). An additional IVILAB research project, Semantically Linked Instructional Content (SLIC), aims to improve access to educational video through searching and browsing.

Dr. Barnard holds an NSF CAREER grant and has received support from three additional NSF grants, the DARPA Mind's Eye program, ONR, the Arizona Biomedical Research Commission (ABRC), and a BIO5 seed grant. He was supported by NSERC (Canada) during graduate and postgraduate studies (NSERC A, B, and PDF). His work on computational color constancy was awarded the Governor General's gold medal for the best dissertation across disciplines at SFU. He has published over 80 papers, including one awarded best paper on cognitive computer vision in 2002.


Cardei, V. C., Funt, B., & Barnard, K. (1999). White point estimation for uncalibrated images. Final Program and Proceedings - IS and T/SID Color Imaging Conference, 97-100.


Color images often must be color balanced to remove unwanted color casts. We extend previous work on using a neural network for illumination, or white-point, estimation from the case of calibrated images to that of uncalibrated images of unknown origin. The results show that the chromaticity of the ambient illumination can be estimated with an average CIE Lab error of 5ΔE. Comparisons are made to the grayworld and white patch methods.
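
For context, the grayworld and white patch baselines mentioned above are simple to state. Below is a minimal Python sketch of both (illustrative only, not the paper's neural network; the function names and the percentile-based white patch variant are assumptions):

```python
import numpy as np

def grayworld_whitepoint(image):
    """Estimate the illuminant as the mean RGB of the image
    (grayworld assumption: the scene averages to gray)."""
    # image: H x W x 3 array of linear RGB values
    return image.reshape(-1, 3).mean(axis=0)

def whitepatch_whitepoint(image, percentile=99):
    """Estimate the illuminant from the brightest pixels per channel
    (a robust variant of the max-RGB / white patch method)."""
    return np.percentile(image.reshape(-1, 3), percentile, axis=0)

# Usage: remove a color cast by scaling each channel (von Kries style).
rng = np.random.default_rng(0)
img = rng.random((64, 64, 3)) * np.array([1.0, 0.8, 0.6])  # synthetic warm cast
est = grayworld_whitepoint(img)
balanced = img * (est.mean() / est)  # map the estimated white point to neutral
```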

Barnard, K., & Funt, B. (1999). Camera calibration for color research. Proceedings of SPIE - The International Society for Optical Engineering, 3644, 576-585.


In this paper we introduce a new method for determining the relationship between signal spectra and camera RGB which is required for many applications in color. We work with the standard camera model, which assumes that the response is linear. We also provide an example of how the fitting procedure can be augmented to include fitting for a previously estimated non-linearity. The basic idea of our method is to minimize squared error subject to linear constraints, which enforce positivity and range of the result. It is also possible to constrain the smoothness, but we have found that it is better to add a regularization expression to the objective function to promote smoothness. With this method, smoothness and error can be traded against each other without being restricted by arbitrary bounds. The method is easily implemented as it is an example of a quadratic programming problem, for which many software solutions are available. In this paper we provide results from using this method and others to calibrate a Sony DXC-930 CCD color video camera. We find that the method gives low error while delivering sensors which are smooth and physically realizable. We thus find the method superior to approaches that ignore any of these considerations.
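
The core computation described above is a bounded least-squares fit with a smoothness penalty, solvable as a quadratic program. Here is a minimal sketch under those assumptions, using SciPy's bounded least squares in place of a general QP solver (fit_sensor, lam, and the second-difference regularizer are illustrative choices, not the paper's exact formulation):

```python
import numpy as np
from scipy.optimize import lsq_linear

def fit_sensor(spectra, responses, lam=1e-2):
    """Fit one camera sensor curve s >= 0 from measured signal spectra.

    Solves  min_s ||A s - r||^2 + lam ||D s||^2  subject to s >= 0,
    where rows of A are signal spectra, r holds the camera responses
    for one channel, and D is a second-difference (curvature) operator.
    Stacking sqrt(lam) * D under A turns the regularized problem into
    a single bounded least-squares (quadratic programming) problem.
    """
    n = spectra.shape[1]
    D = np.diff(np.eye(n), n=2, axis=0)          # (n-2) x n curvature operator
    A_aug = np.vstack([spectra, np.sqrt(lam) * D])
    r_aug = np.concatenate([responses, np.zeros(D.shape[0])])
    return lsq_linear(A_aug, r_aug, bounds=(0.0, np.inf)).x
```

Increasing lam trades fit error for smoothness, matching the trade-off the abstract describes.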

Barnard, K., & Forsyth, D. (2001). Exploiting image semantics for picture libraries. Proceedings of First ACM/IEEE-CS Joint Conference on Digital Libraries, 469.


A system for learning the semantics of collections of images from image features and associated text is discussed. The application of this system to digital image libraries is explored. The nature of search and browsing is considered, and it is argued that for many applications the two should be used together.

Schlecht, J., & Barnard, K. (2009). Learning models of object structure. Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference, 1615-1623.


We present an approach for learning stochastic geometric models of object categories from single-view images. We focus here on models expressible as a spatially contiguous assemblage of blocks. Model topologies are learned across groups of images, and one or more such topologies are linked to an object category (e.g., chairs). Fitting learned topologies to an image can be used to identify the object class, as well as detail its geometry. The latter goes beyond labeling objects, as it provides the geometric structure of particular instances. We learn the models using joint statistical inference over category parameters, camera parameters, and instance parameters. These produce an image likelihood through a statistical imaging model. We use trans-dimensional sampling to explore topology hypotheses, and alternate between Metropolis-Hastings and stochastic dynamics to explore instance parameters. Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held-out single-view images.
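
To make the sampling step concrete, here is a generic random-walk Metropolis-Hastings loop over continuous instance parameters. This is a sketch only: it assumes a user-supplied log posterior (prior plus image likelihood) and omits the paper's trans-dimensional moves between topologies and its stochastic-dynamics updates:

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_steps=1000, step=0.1, rng=None):
    """Random-walk Metropolis-Hastings over a continuous parameter vector.

    log_post: function mapping a parameter vector to its log posterior.
    """
    if rng is None:
        rng = np.random.default_rng()
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for _ in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.shape)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio); the symmetric
        # Gaussian proposal cancels in the Hastings ratio.
        if np.log(rng.random()) < lp_new - lp:
            theta, lp = proposal, lp_new
        samples.append(theta.copy())
    return np.array(samples)
```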

Barnard, K., Duygulu, P., & Forsyth, D. (2003). Recognition as translating images into text. Proceedings of SPIE - The International Society for Optical Engineering, 5018, 168-178.


We present an overview of a new paradigm for tackling long-standing computer vision problems. Specifically, our approach is to build statistical models which translate from visual representations (images) to semantic ones (associated text). As providing optimal text for training is difficult at best, we propose working with whatever associated text is available in large quantities. Examples include large image collections with keywords, museum image collections with descriptive text, news photos, and images on the web. In this paper we discuss how the translation approach can give a handle on difficult questions such as: What counts as an object? Which objects are easy to recognize and which are hard? Which objects are indistinguishable using our features? How can low-level vision processes, such as feature-based segmentation, be integrated with high-level processes such as grouping? We also summarize some of the models proposed for translating from visual information to text, and some of the methods used to evaluate their performance.
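
As a toy illustration of the translation idea (not the paper's models, which build on EM-based machine-translation formulations), one can form a p(word | region) table from co-occurrence counts between vector-quantized region features ("blobs") and the words attached to each image. All names below are illustrative:

```python
import numpy as np

def cooccurrence_translation(images, n_blobs, vocab):
    """Toy image-region-to-word translation table from co-occurrence.

    images: list of (blob_ids, words) pairs, where blob_ids index
    vector-quantized region features and words are the associated text.
    Returns an (n_blobs x len(vocab)) array of p(word | blob). Real
    translation models use EM to resolve which word belongs to which
    region; raw co-occurrence is the simplest starting point.
    """
    word_idx = {w: j for j, w in enumerate(vocab)}
    counts = np.zeros((n_blobs, len(vocab)))
    for blob_ids, words in images:
        for b in blob_ids:
            for w in words:
                counts[b, word_idx[w]] += 1
    counts += 1e-9  # keep rows for unseen blobs well defined
    return counts / counts.sum(axis=1, keepdims=True)

# Usage: annotate a new image by summing p(word | blob) over its blobs
# and taking the highest-scoring words.
```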