Oh, Those Robot Eyes!


Willow Garage is organizing a workshop at the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) 2010 in San Francisco to discuss the intersection of computer vision with human-robot interaction. Willow Garage is the hardware and open source software organization behind the Robot Operating System (ROS) and the PR2 robot development platform. Here's a recent Willow Garage video of work done at the University of Illinois on how robots can be taught to perceive images:

Of all the human senses, the eyes present the largest quantity of data to the brain. For robots, the choice of eyes varies, from single light-dependent resistors to high-resolution video cameras. Processing hardware ranges from a few transistors in simple detectors to the wide array of machine vision and image processing graphics cards developed for industrial use. Vision system software is often available as freeware or shareware on the Internet. Open source vision libraries such as OpenCV are available for C, C++, and Python, and include software for:

  • Getting input from cameras,
  • Transforming images,
  • Segmenting images and shape matching,
  • Pattern recognition, including face detection,
  • Tracking and motion in two and three dimensions,
  • 3D reconstruction from stereo vision, and
  • Machine learning algorithms.
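To make one item on that list concrete, "transforming images," here is a minimal sketch of a 3x3 edge-detection convolution in plain NumPy. This is an illustration of the underlying primitive, not OpenCV's own code; OpenCV exposes the same operation through functions like `cv2.filter2D`. The image here is a synthetic 8x8 test pattern, chosen just for this example.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a grayscale image with a small kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Multiply the kernel against the patch under it and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Laplacian-style kernel: sums to zero, so flat regions give no response
# and intensity edges give a strong one.
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

# Synthetic 8x8 image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 255.0

edges = convolve2d(img, laplacian)
# The response is nonzero only along the vertical boundary between halves.
```

The same idea, a small kernel slid across the image, underlies blurring, sharpening, and the edge maps used as input to segmentation and shape matching.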

A recent PLoS Computational Biology study leverages advances in stream processing hardware (high-end NVIDIA graphics cards and the PlayStation 3's IBM Cell processor) to accelerate progress in both artificial vision and our understanding of the computational underpinnings of biological vision. Led by researchers at MIT and Harvard, the study takes advantage of newer graphics processing hardware to explore the range of biologically-inspired vision models. The researchers varied selected parameter sets (for example, the number of image filters and the filter weights) in relation to biological vision systems. Candidate parameter sets were randomly sampled to generate computational models, and the models were subjected to "unsupervised learning" using the videos Cars and Planes, Boats, and clips from the TV series Law and Order. The researchers then used synthetic object recognition tests to compare the performance of the models, generating 7,500 model instantiations in three groups of 2,500. They found that combining the five best biologically-inspired models consistently outperformed state-of-the-art machine vision systems. They concluded that "this large-scale screening approach can yield significant, reproducible gains in performance in a variety of basic object recognition tasks and that it holds the promise of offering insight into which computational ideas are most important for achieving this performance."
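The screening strategy itself is simple to state: sample many random parameter sets, score each resulting model, and keep the best few for a combined predictor. The toy sketch below illustrates only that outer loop; the parameter names and the scoring function are invented for this example and stand in for the study's actual biologically-inspired models and object recognition benchmarks.

```python
import random

random.seed(0)

def sample_parameters():
    """Draw one candidate parameter set, e.g. filter count and size.
    These particular choices are illustrative, not the study's."""
    return {
        "n_filters": random.choice([16, 32, 64, 128]),
        "filter_size": random.choice([3, 5, 7, 9]),
        "threshold": random.uniform(0.0, 1.0),
    }

def score(params):
    """Stand-in for the expensive step: in the real screen, this would
    build the model, run unsupervised learning, and measure accuracy
    on object recognition tests. Here it is a toy scoring surface."""
    return (params["n_filters"] / 128.0
            + (1.0 - abs(params["filter_size"] - 7) / 7.0)
            - abs(params["threshold"] - 0.5))

# One screening group of 2,500 candidates, as in the study's groups.
candidates = [sample_parameters() for _ in range(2500)]
ranked = sorted(candidates, key=score, reverse=True)
best_five = ranked[:5]  # the study combined its five best models
```

The point of running thousands of cheap candidates rather than hand-tuning one model is exactly what fast stream processing hardware makes affordable.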

Understanding how the latest graphics hardware can be used to determine the computational underpinnings of biologically-inspired computer vision systems will become increasingly important for robots in human-populated environments as well as for human-robot interaction (HRI). The Willow Garage workshop at CVPR promises to explore computer vision algorithms for tasks such as person detection, pose estimation, scene understanding, and object recognition, tasks that now require human teleoperation of robots.

HRI tasks where humans help robots to see (such as the work done at the University of Illinois) can help clarify the important features of computer vision algorithms. These algorithms are likely to improve the computational accuracy and object recognition capabilities of biologically-inspired vision systems as streaming graphics hardware continues to get faster. Oh, those robot eyes!



  2. it’s a good idea to ask humans via the internet but what happens when it’s 4chan telling the robot what everything is?
