Jacobus J Barnard
Associate Director, Faculty Affairs-SISTA
Associate Professor, BIO5 Institute
Associate Professor, Electrical and Computer Engineering
Member of the Graduate Faculty
Professor, Cognitive Science - GIDP
Professor, Computer Science
Professor, Genetics - GIDP
Professor, Statistics-GIDP
Primary Department
(520) 621-4632
Research Interest
Kobus Barnard, PhD, is an associate professor in the recently formed University of Arizona School of Information: Science, Technology, and Arts (SISTA), created to foster computational approaches across disciplines in both research and education. He also has University of Arizona appointments with Computer Science, ECE, Statistics, Cognitive Sciences, and BIO5. He leads the Interdisciplinary Visual Intelligence Lab (IVILAB) currently housed in SISTA. Research in the IVILAB revolves around building top-down statistical models that link theory and semantics to data. Such models support going from data to knowledge using Bayesian inference. Much of this work is in the context of inferring semantics and geometric form from image and video. For example, in collaboration with multiple researchers, the IVILAB has applied this approach to problems in computer vision (e.g., tracking people in 3D from video, understanding 3D scenes from images, and learning models of object structure) and biological image understanding (e.g., tracking pollen tubes growing in vitro, inferring the morphology of neurons grown in culture, extracting 3D structure of filamentous fungi from the genus Alternaria from brightfield microscopy image stacks, and extracting 3D structure of Arabidopsis plants). An additional IVILAB research project, Semantically Linked Instructional Content (SLIC) is on improving access to educational video through searching and browsing.Dr. Barnard holds an NSF CAREER grant, and has received support from three additional NSF grants, the DARPA Mind’s eye program, ONR, the Arizona Biomedical Research Commission (ABRC), and a BIO5 seed grant. He was supported by NSERC (Canada) during graduate and post-graduate studies (NSERC A, B and PDF). His work on computational color constancy was awarded the Governor General’s gold medal for the best dissertation across disciplines at SFU. He has published over 80 papers, including one awarded best paper on cognitive computer vision in 2002.


Barnard, K., Duygulu, P., Forsyth, D., Freitas, N. D., Blei, D. M., & Jordan, M. I. (2003). Matching Words and Pictures. Journal of Machine Learning Research, 3(6), 1107-1135.


We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation) and corresponding to particular image regions (region naming). Auto-annotation might help organize and access large collections of images. Region naming is a model of object recognition as a process of translating image regions to words, much as one might translate from one language to another. Learning the relationships between image regions and semantic correlates (words) is an interesting example of multi-modal data mining, particularly because it is typically hard to apply data mining techniques to collections of images. We develop a number of models for the joint distribution of image regions and words, including several which explicitly learn the correspondence between regions and words. We study multi-modal and correspondence extensions to Hofmann's hierarchical clustering/aspect model, a translation model adapted from statistical machine translation (Brown et al.), and a multi-modal extension to mixture of latent Dirichlet allocation (MoM-LDA). All models are assessed using a large collection of annotated images of real scenes. We study in depth the difficult problem of measuring performance. For the annotation task, we look at prediction performance on held out data. We present three alternative measures, oriented toward different types of task. Measuring the performance of correspondence methods is harder, because one must determine whether a word has been placed on the right region of an image. We can use annotation performance as a proxy measure, but accurate measurement requires hand labeled data, and thus must occur on a smaller scale. We show results using both an annotation proxy, and manually labeled data.

Barnard, K. (1999). Color constancy with fluorescent surfaces. Final Program and Proceedings - IS and T/SID Color Imaging Conference, 257-261.


Fluorescent surfaces are common in the modern world, but they present problems for machine color constancy because fluorescent reflection typically violates the assumptions needed by most algorithms. The complexity of fluorescent reflection is likely one of the reasons why fluorescent surfaces have escaped the attention of computational color constancy researchers. In this paper we take some initial steps to rectify this omission. We begin by introducing a simple method for characterizing fluorescent surfaces. It is based on direct measurements, and thus has low error and avoids the need to develop a comprehensive and accurate physical model. We then modify and extend several modern color constancy algorithms to address fluorescence. The algorithms considered are CRULE and derivatives, Color by Correlation, and neural net methods. Adding fluorescence to Color by Correlation and neural net methods is relatively straight forward, but CRULE requires modification so that its complete reliance on diagonal models can be relaxed. We present results for both synthetic and real image data for fluorescent capable versions of CRULE and Color by Correlation, and we compare the results with the standard versions of these and other algorithms.

Barnard, K., Cardei, V., & Funt, B. (2002). A comparison of computational color constancy algorithms - Part I: Methodology and experiments with synthesized data. IEEE Transactions on Image Processing, 11(9), 972-984.

PMID: 18249720;Abstract:

We introduce a context for testing computational color constancy, specify our approach to the implementation of a number of the leading algorithms, and report the results of three experiments using synthesized data. Experiments using synthesized data are important because the ground truth is known, possible confounds due to camera characterization and pre-processing are absent, and various factors affecting color constancy can be efficiently investigated because they can be manipulated individual and precisely. The algorithms chosen for close study include two gray world methods, a limiting case of a version of the Retinex method, a number of variants of Forsyth's gamut-mapping method, Cardei et al.'s neural net method, and Finlayson et al.'s Color by Correlation method. We investigate the ability of these algorithms to make estimates of three different color constancy quantities: the chromaticity of the scene illuminant, the overall magnitude of that illuminant, and a corrected, illumination invariant, image. We consider algorithm performance as a function of the number of surfaces in scenes generated from reflectance spectra, the relative effect on the algorithms of added specularities, and the effect of subsequent clipping of the data. All data is available on-line at http://www.cs.sfu.ca/∼color/data, and implementations for most of the algorithms are also available (http://www.cs.sfu.ca/∼color/code).

Guan, J., Brau, E., Simek, K., Morrison, C. T., Butler, E. A., & Barnard, K. J. (2015). Moderated and Drifting Linear Dynamical Systems. International Conference on Machine Learning.

This venue is a peer reviewed, competitive conference (acceptance rate: 26%) and the full paper is published as part of the conference proceedings [ CSRanking endorsed, A* ]

Ramanan, D., Forsyth, D. A., & Barnard, K. (2005). Detecting, localizing and recovering kinematics of textured animals. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2, 635-642.


We develop and demonstrate an object recognition system capable of accurately detecting, localizing, and recovering the kinematic configuration of textured animals in real images. We build a deformation model of shape automatically from videos of animals and an appearance model of texture from a labeled collection of animal images, and combine the two models automatically. We develop a simple texture descriptor that outperforms the state of the art. We test our animal models on two datasets; images taken by professional photographers from the Corel collection, and assorted images from the web returned by Google. We demonstrate quite good performance on both datasets. Comparing our results with simple baselines, we show that for the Google set, we can recognize objects from a collection demonstrably hard for object recognition. © 2005 IEEE.