The effects of segmentation and feature choice in a translation model of object recognition

Jacobus J Barnard, Pinar Duygulu, Raghavendra Guru, Prasad Gabbur, David Forsyth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Citations (Scopus)

Abstract

We work with a model of object recognition where words must be placed on image regions. This approach means that large scale experiments are relatively easy, so we can evaluate the effects of various early and mid-level vision algorithms on recognition performance. We evaluate various image segmentation algorithms by determining word prediction accuracy for images segmented in various ways and represented by various features. We take the view that good segmentations respect object boundaries, and so word prediction should be better for a better segmentation. However, it is usually very difficult in practice to obtain segmentations that do not break up objects, so most practitioners attempt to merge segments to get better putative object representations. We demonstrate that our paradigm of word prediction easily allows us to predict potentially useful segment merges, even for segments that do not look similar (for example, merging the black and white halves of a penguin is not possible with feature-based segmentation; the main cue must be "familiar configuration"). These studies focus on unsupervised learning of recognition. However, we show that word prediction can be markedly improved by providing supervised information for a relatively small number of regions together with large quantities of unsupervised information. This supervisory information allows a better and more discriminative choice of features and breaks possible symmetries.

Original language: English (US)
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2
State: Published - 2003
Event: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Madison, WI, United States
Duration: Jun 18 2003 to Jun 20 2003

Other

Other: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Country: United States
City: Madison, WI
Period: 6/18/03 to 6/20/03

Fingerprint

Object recognition
Unsupervised learning
Image segmentation
Merging
Experiments

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Vision and Pattern Recognition
  • Software
  • Control and Systems Engineering

Cite this

Barnard, J. J., Duygulu, P., Guru, R., Gabbur, P., & Forsyth, D. (2003). The effects of segmentation and feature choice in a translation model of object recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 2).

The effects of segmentation and feature choice in a translation model of object recognition. / Barnard, Jacobus J; Duygulu, Pinar; Guru, Raghavendra; Gabbur, Prasad; Forsyth, David.

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Vol. 2 2003.

Barnard, JJ, Duygulu, P, Guru, R, Gabbur, P & Forsyth, D 2003, The effects of segmentation and feature choice in a translation model of object recognition. in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. vol. 2, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Madison, WI, United States, 6/18/03.
Barnard JJ, Duygulu P, Guru R, Gabbur P, Forsyth D. The effects of segmentation and feature choice in a translation model of object recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Vol. 2. 2003
Barnard, Jacobus J ; Duygulu, Pinar ; Guru, Raghavendra ; Gabbur, Prasad ; Forsyth, David. / The effects of segmentation and feature choice in a translation model of object recognition. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Vol. 2 2003.
@inproceedings{77a6753127da4439964abd1ed9cfb174,
title = "The effects of segmentation and feature choice in a translation model of object recognition",
abstract = "We work with a model of object recognition where words must be placed on image regions. This approach means that large scale experiments are relatively easy, so we can evaluate the effects of various early and mid-level vision algorithms on recognition performance. We evaluate various image segmentation algorithms by determining word prediction accuracy for images segmented in various ways and represented by various features. We take the view that good segmentations respect object boundaries, and so word prediction should be better for a better segmentation. However, it is usually very difficult in practice to obtain segmentations that do not break up objects, so most practitioners attempt to merge segments to get better putative object representations. We demonstrate that our paradigm of word prediction easily allows us to predict potentially useful segment merges, even for segments that do not look similar (for example, merging the black and white halves of a penguin is not possible with feature-based segmentation; the main cue must be {"}familiar configuration{"}). These studies focus on unsupervised learning of recognition. However, we show that word prediction can be markedly improved by providing supervised information for a relatively small number of regions together with large quantities of unsupervised information. This supervisory information allows a better and more discriminative choice of features and breaks possible symmetries.",
author = "Barnard, {Jacobus J} and Pinar Duygulu and Raghavendra Guru and Prasad Gabbur and David Forsyth",
year = "2003",
language = "English (US)",
volume = "2",
booktitle = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

}

TY - GEN

T1 - The effects of segmentation and feature choice in a translation model of object recognition

AU - Barnard, Jacobus J

AU - Duygulu, Pinar

AU - Guru, Raghavendra

AU - Gabbur, Prasad

AU - Forsyth, David

PY - 2003

Y1 - 2003

N2 - We work with a model of object recognition where words must be placed on image regions. This approach means that large scale experiments are relatively easy, so we can evaluate the effects of various early and mid-level vision algorithms on recognition performance. We evaluate various image segmentation algorithms by determining word prediction accuracy for images segmented in various ways and represented by various features. We take the view that good segmentations respect object boundaries, and so word prediction should be better for a better segmentation. However, it is usually very difficult in practice to obtain segmentations that do not break up objects, so most practitioners attempt to merge segments to get better putative object representations. We demonstrate that our paradigm of word prediction easily allows us to predict potentially useful segment merges, even for segments that do not look similar (for example, merging the black and white halves of a penguin is not possible with feature-based segmentation; the main cue must be "familiar configuration"). These studies focus on unsupervised learning of recognition. However, we show that word prediction can be markedly improved by providing supervised information for a relatively small number of regions together with large quantities of unsupervised information. This supervisory information allows a better and more discriminative choice of features and breaks possible symmetries.

UR - http://www.scopus.com/inward/record.url?scp=0042440879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0042440879&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0042440879

VL - 2

BT - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

ER -