CESCA Seminar Talk (11/15): “Advancing Computer Vision by Leveraging Humans”
Event Date: 2013-11-15.
Event Location: Lavery Hall 350.
Event Contact: chaowang (at) vt (dot) edu
Sponsoring Group: CESCA
Speaker: Prof. Devi Parikh, Department of ECE, Virginia Tech
Historically, humans have played a limited role in advancing the challenging problem of computer vision: either by designing algorithms in their capacity as researchers or by acting as ground-truth generating minions. This seems rather counterproductive, since we often aim to replicate human performance (e.g. in semantic image understanding) and we want humans to communicate with vision systems (e.g. in image search, or for training the systems). In this talk, I will describe my recent efforts to expand the roles humans play in advancing computer vision.
In the first part of my talk, I will describe our recently introduced "human-debugging" paradigm, which allows us to identify weak links in machine vision approaches that require further research. It involves replacing subcomponents of machine vision pipelines with human subjects and examining the resultant effect on overall recognition performance. I will present several of our efforts within this framework that address image classification, object recognition and person detection. I will discuss the lessons learned and present subsequent improvements to computer vision algorithms inspired by these findings.
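The human-debugging paradigm described above can be illustrated with a toy sketch: hold a recognition pipeline fixed, swap one stage for an oracle that stands in for human subjects, and compare end-to-end accuracy. Everything here (the part detector, the toy images, the scores) is a hypothetical illustration, not the actual experimental setup from the talk.

```python
# Minimal sketch of "human-debugging": swap one pipeline stage for an
# oracle (standing in for human subjects) and compare overall accuracy.
# All detectors and data below are made up for illustration.

def machine_part_detector(image):
    # Crude machine component: thresholds a single intensity value.
    return "head" if image["top_intensity"] > 0.5 else "none"

def human_part_detector(image):
    # Stand-in for human subjects: returns the annotated part directly.
    return image["true_part"]

def classify(image, part_detector):
    # Downstream stage kept fixed: "person" iff a head part is found.
    return "person" if part_detector(image) == "head" else "background"

def accuracy(images, part_detector):
    correct = sum(classify(im, part_detector) == im["label"] for im in images)
    return correct / len(images)

# Toy data: the second image defeats the machine detector.
images = [
    {"top_intensity": 0.9, "true_part": "head", "label": "person"},
    {"top_intensity": 0.2, "true_part": "head", "label": "person"},
    {"top_intensity": 0.1, "true_part": "none", "label": "background"},
]

machine_acc = accuracy(images, machine_part_detector)  # 2/3 here
human_acc = accuracy(images, human_part_detector)      # 3/3 here
```

A large gap between the two accuracies flags the replaced stage (here, the part detector) as a weak link worth further research; a small gap suggests the bottleneck lies elsewhere in the pipeline.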
In the second part of my talk, I will present our work on allowing humans and machines to better communicate with each other by exploiting visual attributes. Visual attributes are mid-level concepts such as "furry" and "metallic" that bridge the gap between low-level image features (e.g. texture) and high-level concepts (e.g. rabbit or car). They are shareable across different but related concepts. Most importantly, visual attributes are both machine detectable and human understandable, making them ideal as a mode of communication between the two. I will present our work on discovering a vocabulary of these attributes in the first place and on enhancing the communication power of these attributes by using them relatively. We utilize attributes for a variety of applications including improved image search and effective active learning of image classifiers.
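One way attributes enable human-machine communication, sketched very loosely below, is relative feedback in image search: a user refines results with statements like "furrier than this one". The catalog, attribute scores, and image names are all hypothetical, and a real system would predict attribute scores from image features rather than store them by hand.

```python
# Hedged sketch of attribute-based image search with relative feedback.
# Images are described by mid-level attribute scores ("furry", "metallic");
# the user narrows results relative to a reference image. All values are
# invented for illustration.

catalog = {
    "rabbit.jpg": {"furry": 0.9, "metallic": 0.1},
    "car.jpg":    {"furry": 0.0, "metallic": 0.9},
    "teddy.jpg":  {"furry": 0.7, "metallic": 0.0},
    "robot.jpg":  {"furry": 0.1, "metallic": 0.8},
}

def refine(candidates, attribute, reference, direction):
    """Keep candidates whose attribute score is above ("more") or
    below ("less") the reference image's score."""
    ref_score = catalog[reference][attribute]
    if direction == "more":
        return [im for im in candidates if catalog[im][attribute] > ref_score]
    return [im for im in candidates if catalog[im][attribute] < ref_score]

# User feedback: "furrier than robot.jpg, but less metallic than car.jpg"
results = refine(list(catalog), "furry", "robot.jpg", "more")
results = refine(results, "metallic", "car.jpg", "less")
```

Because attribute names like "furry" are human understandable while their scores are machine computable, the same vocabulary serves both sides of the conversation; this is the sense in which attributes act as a shared mode of communication.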
Devi Parikh is an Assistant Professor in the Bradley Department of Electrical and Computer Engineering at Virginia Tech (VT), where she leads the Computer Vision Lab. She is also a member of the Virginia Center for Autonomous Systems (VaCAS) and the VT Discovery Analytics Center (DAC).
Prior to this, she was a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC), an academic computer science institute affiliated with the University of Chicago. She has held visiting positions at Cornell University, the University of Texas at Austin, Microsoft Research, MIT and Carnegie Mellon University. She received her M.S. and Ph.D. degrees from the Electrical and Computer Engineering department at Carnegie Mellon University in 2007 and 2009, respectively. She received her B.S. in Electrical and Computer Engineering from Rowan University in 2005.
Her research interests include computer vision, pattern recognition and AI in general and visual recognition problems in particular. Her recent work involves leveraging human-machine collaborations for building smarter machines. She has also worked on other topics such as ensemble of classifiers, data fusion, inference in probabilistic models, 3D reassembly, barcode segmentation, computational photography, interactive computer vision, contextual reasoning and hierarchical representations of images.
She is a recipient of the Carnegie Mellon Dean's Fellowship, the National Science Foundation Graduate Research Fellowship, an Outstanding Reviewer Award at CVPR 2012, a Google Faculty Research Award in 2012, and the 2011 Marr Best Paper Prize awarded at the International Conference on Computer Vision (ICCV).
(Previous CESCA Seminar talks are available at