Fusing object detection and region appearance for image-text alignment

Luca Del Pero, Philip Lee, James Magahern, Emily Hartley, Kobus Barnard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with object detection, which simplifies the problem by reliably identifying a subset of the labels, thereby reducing correspondence ambiguity overall. Comprehensive testing on the SAIAPR TC dataset shows that principled integration of object detection improves the region labeling task.

Original language: English (US)
Title of host publication: MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops
Pages: 1113-1116
Number of pages: 4
DOIs: https://doi.org/10.1145/2072298.2071951
State: Published - Dec 29 2011
Event: 19th ACM International Conference on Multimedia, ACM Multimedia 2011, MM'11 - Scottsdale, AZ, United States
Duration: Nov 28 2011 to Dec 1 2011

Publication series

Name: MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops

Other

Other: 19th ACM International Conference on Multimedia, ACM Multimedia 2011, MM'11
Country: United States
City: Scottsdale, AZ
Period: 11/28/11 to 12/1/11

Keywords

  • Algorithms
  • Performance

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction

Cite this

Del Pero, L., Lee, P., Magahern, J., Hartley, E., & Barnard, K. (2011). Fusing object detection and region appearance for image-text alignment. In MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops (pp. 1113-1116). https://doi.org/10.1145/2072298.2071951