Probabilistic web image gathering

Keiji Yanai, Jacobus J Barnard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Citations (Scopus)

Abstract

We propose a new method for automated large-scale gathering of Web images relevant to specified concepts. Our main goal is to build a knowledge base associated with as many concepts as possible for large-scale object recognition studies. A second goal is supporting the building of more accurate text-based indexes for Web images. In our method, good quality candidate sets of images for each keyword are gathered as a function of analysis of the surrounding HTML text. The gathered images are then segmented into regions, and a model for the probability distribution of regions for the concept is computed using an iterative algorithm based on the previous work on statistical image annotation. The learned model is then applied to identify which images are visually relevant to the concept implied by the keyword. Implicitly, which regions of the images are relevant is also determined. Our experiments reveal that the new method performs much better than Google Image Search and a simple method based on more standard content-based image retrieval methods.
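
The iterative selection step described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes each region is reduced to a single numeric feature, models "concept" and "background" regions with one Gaussian each, and alternates between fitting the two models and re-selecting images. All names (`select_relevant`, `image_score`, the sample data) are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian; sigma is floored to avoid division by zero."""
    sigma = max(sigma, 1e-3)
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def fit(values):
    """Fit a 1-D Gaussian (mean, population standard deviation)."""
    return mean(values), pstdev(values)


def image_score(regions, concept, background):
    """Highest per-region posterior of 'concept' versus 'background'."""
    best = 0.0
    for x in regions:
        p_c = gaussian_pdf(x, *concept)
        p_b = gaussian_pdf(x, *background)
        best = max(best, p_c / (p_c + p_b + 1e-300))
    return best


def select_relevant(images, seed_ids, iterations=5, threshold=0.5):
    """Alternate between fitting region models and re-selecting images.

    images:   dict mapping image id -> list of region features (assumed 1-D)
    seed_ids: ids judged likely relevant from the surrounding HTML text
    """
    relevant = set(seed_ids)
    for _ in range(iterations):
        pos = [x for i, regs in images.items() if i in relevant for x in regs]
        neg = [x for i, regs in images.items() if i not in relevant for x in regs]
        if not pos or not neg:
            break
        concept, background = fit(pos), fit(neg)
        updated = {
            i for i, regs in images.items()
            if image_score(regs, concept, background) >= threshold
        }
        if updated == relevant:  # selection has stabilised
            break
        relevant = updated
    return relevant


# Toy data: concept regions cluster near 1.0, background regions near 5.0.
images = {
    "a": [1.0, 5.0],
    "b": [1.2, 4.8],
    "c": [0.9, 5.2],
    "d": [5.1, 4.9],
    "e": [5.0, 5.3],
}
print(sorted(select_relevant(images, {"a", "b"})))  # ['a', 'b', 'c']
```

Scoring an image by its best region, rather than by an average, reflects the abstract's remark that relevant regions are identified implicitly: an image can be kept because one region matches the concept even when the rest is background.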

Original language: English (US)
Title of host publication: MIR 2005 - Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, Co-located with ACM Multimedia 2005
Publisher: Association for Computing Machinery, Inc
Pages: 57-64
Number of pages: 8
ISBN (Print): 1595932445, 9781595932440
State: Published - Nov 10 2005
Event: 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, MIR 2005 - Singapore, Singapore
Duration: Nov 10 2005 to Nov 11 2005

Other

Other: 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, MIR 2005
Country: Singapore
City: Singapore
Period: 11/10/05 to 11/11/05

Fingerprint

HTML
Object recognition
Image retrieval
Probability distributions
Experiments

Keywords

  • Image selection
  • Probabilistic method
  • Web image mining
  • Web image search

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Signal Processing
  • Software
  • Media Technology

Cite this

Yanai, K., & Barnard, J. J. (2005). Probabilistic web image gathering. In MIR 2005 - Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, Co-located with ACM Multimedia 2005 (pp. 57-64). Association for Computing Machinery, Inc.

@inproceedings{c419af395dbc49599fa2b826da481702,
title = "Probabilistic web image gathering",
abstract = "We propose a new method for automated large-scale gathering of Web images relevant to specified concepts. Our main goal is to build a knowledge base associated with as many concepts as possible for large-scale object recognition studies. A second goal is supporting the building of more accurate text-based indexes for Web images. In our method, good quality candidate sets of images for each keyword are gathered as a function of analysis of the surrounding HTML text. The gathered images are then segmented into regions, and a model for the probability distribution of regions for the concept is computed using an iterative algorithm based on the previous work on statistical image annotation. The learned model is then applied to identify which images are visually relevant to the concept implied by the keyword. Implicitly, which regions of the images are relevant is also determined. Our experiments reveal that the new method performs much better than Google Image Search and a simple method based on more standard content-based image retrieval methods.",
keywords = "Image selection, Probabilistic method, Web image mining, Web image search",
author = "Keiji Yanai and Barnard, {Jacobus J}",
year = "2005",
month = "11",
day = "10",
language = "English (US)",
isbn = "1595932445",
pages = "57--64",
booktitle = "MIR 2005 - Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, Co-located with ACM Multimedia 2005",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - Probabilistic web image gathering

AU - Yanai, Keiji

AU - Barnard, Jacobus J

PY - 2005/11/10

Y1 - 2005/11/10

AB - We propose a new method for automated large-scale gathering of Web images relevant to specified concepts. Our main goal is to build a knowledge base associated with as many concepts as possible for large-scale object recognition studies. A second goal is supporting the building of more accurate text-based indexes for Web images. In our method, good quality candidate sets of images for each keyword are gathered as a function of analysis of the surrounding HTML text. The gathered images are then segmented into regions, and a model for the probability distribution of regions for the concept is computed using an iterative algorithm based on the previous work on statistical image annotation. The learned model is then applied to identify which images are visually relevant to the concept implied by the keyword. Implicitly, which regions of the images are relevant is also determined. Our experiments reveal that the new method performs much better than Google Image Search and a simple method based on more standard content-based image retrieval methods.

KW - Image selection

KW - Probabilistic method

KW - Web image mining

KW - Web image search

UR - http://www.scopus.com/inward/record.url?scp=84928752144&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84928752144&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84928752144

SN - 1595932445

SN - 9781595932440

SP - 57

EP - 64

BT - MIR 2005 - Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, Co-located with ACM Multimedia 2005

PB - Association for Computing Machinery, Inc

ER -