Modeling the statistics of image features and associated text

Jacobus J Barnard, Pinar Duygulu, David Forsyth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a methodology for modeling the statistics of image features and associated text in large datasets. The models used also serve to cluster the images, as images are modeled as being produced by sampling from a limited number of combinations of mixing components. Furthermore, because our approach models the joint occurrence of image features and associated text, it can be used to predict the occurrence of either, based on observations or queries. This supports an attractive approach to image search as well as novel applications such as suggesting illustrations for blocks of text (auto-illustrate) and generating words for images outside the training set (auto-annotate). In this paper we illustrate the approach on 10,000 images of work from the Fine Arts Museum of San Francisco. The images include line drawings, paintings, and pictures of sculpture and ceramics. Many of the images have associated free text whose nature varies greatly, from physical description to interpretation and mood. We incorporate statistical natural language processing in order to deal with free text. We use WordNet to provide semantic grouping information and to help disambiguate word senses, as well as emphasize the hierarchical nature of semantic relationships.
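
The auto-annotate idea in the abstract can be illustrated with a deliberately small sketch. The abstract does not give the model's equations, so this assumes a flat two-component mixture over a single scalar image feature, with hand-picked parameters. The component values, the vocabulary, and the function names are all hypothetical and are not taken from the paper. It shows only the general pattern: infer which mixing component likely produced the observed feature, then mix the per-component word distributions accordingly.

```python
import math

# Hypothetical toy parameters (not from the paper): two mixture components,
# each with a prior, a 1-D Gaussian over one image feature, and a word
# distribution over a three-word vocabulary.
COMPONENTS = [
    {"prior": 0.5, "mean": 0.2, "std": 0.1,
     "words": {"drawing": 0.7, "portrait": 0.2, "vase": 0.1}},
    {"prior": 0.5, "mean": 0.8, "std": 0.1,
     "words": {"drawing": 0.1, "portrait": 0.2, "vase": 0.7}},
]


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def component_posterior(feature):
    """p(component | feature) by Bayes' rule."""
    weights = [c["prior"] * gaussian_pdf(feature, c["mean"], c["std"])
               for c in COMPONENTS]
    total = sum(weights)
    return [w / total for w in weights]


def predict_words(feature):
    """p(word | feature) = sum over components of p(c | feature) * p(word | c)."""
    posterior = component_posterior(feature)
    vocabulary = COMPONENTS[0]["words"].keys()
    return {
        word: sum(p * c["words"][word] for p, c in zip(posterior, COMPONENTS))
        for word in vocabulary
    }


def best_word(feature):
    """Most probable annotation word for an observed image feature."""
    probabilities = predict_words(feature)
    return max(probabilities, key=probabilities.get)
```

Calling `best_word(0.2)` picks the word favored by the first component, because that component explains the feature far better than the second. The reverse direction (text query to likely image features) would follow the same pattern with the roles of words and features swapped.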

Original language: English (US)
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Editors: P.B. Kantor, T. Kanungo, J. Zhou
Pages: 1-11
Number of pages: 11
Volume: 4670
DOIs: https://doi.org/10.1117/12.450716
State: Published - 2002
Externally published: Yes
Event: Documentation Recognition and Retrieval IX - San Jose, CA, United States
Duration: Jan 21 2002 to Jan 22 2002

Other

Other: Documentation Recognition and Retrieval IX
Country: United States
City: San Jose, CA
Period: 1/21/02 to 1/22/02

Fingerprint

Semantics
Statistics
Museums
Painting
Sampling
Processing
natural language processing
moods
occurrences
arts
education
methodology
ceramics

Keywords

  • Aspect model
  • Hierarchical clustering
  • Image retrieval
  • Learning image semantics
  • Recognition

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Condensed Matter Physics

Cite this

Barnard, J. J., Duygulu, P., & Forsyth, D. (2002). Modeling the statistics of image features and associated text. In P. B. Kantor, T. Kanungo, & J. Zhou (Eds.), Proceedings of SPIE - The International Society for Optical Engineering (Vol. 4670, pp. 1-11) https://doi.org/10.1117/12.450716

Modeling the statistics of image features and associated text. / Barnard, Jacobus J; Duygulu, Pinar; Forsyth, David.

Proceedings of SPIE - The International Society for Optical Engineering. ed. / P.B. Kantor; T. Kanungo; J. Zhou. Vol. 4670 2002. p. 1-11.


Barnard, JJ, Duygulu, P & Forsyth, D 2002, Modeling the statistics of image features and associated text. in PB Kantor, T Kanungo & J Zhou (eds), Proceedings of SPIE - The International Society for Optical Engineering. vol. 4670, pp. 1-11, Documentation Recognition and Retrieval IX, San Jose, CA, United States, 1/21/02. https://doi.org/10.1117/12.450716
Barnard JJ, Duygulu P, Forsyth D. Modeling the statistics of image features and associated text. In Kantor PB, Kanungo T, Zhou J, editors, Proceedings of SPIE - The International Society for Optical Engineering. Vol. 4670. 2002. p. 1-11 https://doi.org/10.1117/12.450716
Barnard, Jacobus J ; Duygulu, Pinar ; Forsyth, David. / Modeling the statistics of image features and associated text. Proceedings of SPIE - The International Society for Optical Engineering. editor / P.B. Kantor ; T. Kanungo ; J. Zhou. Vol. 4670 2002. pp. 1-11
@inproceedings{6a6cb2110ba447849af8892c795584f0,
title = "Modeling the statistics of image features and associated text",
abstract = "We present a methodology for modeling the statistics of image features and associated text in large datasets. The models used also serve to cluster the images, as images are modeled as being produced by sampling from a limited number of combinations of mixing components. Furthermore, because our approach models the joint occurrence of image features and associated text, it can be used to predict the occurrence of either, based on observations or queries. This supports an attractive approach to image search as well as novel applications such as suggesting illustrations for blocks of text (auto-illustrate) and generating words for images outside the training set (auto-annotate). In this paper we illustrate the approach on 10,000 images of work from the Fine Arts Museum of San Francisco. The images include line drawings, paintings, and pictures of sculpture and ceramics. Many of the images have associated free text whose nature varies greatly, from physical description to interpretation and mood. We incorporate statistical natural language processing in order to deal with free text. We use WordNet to provide semantic grouping information and to help disambiguate word senses, as well as emphasize the hierarchical nature of semantic relationships.",
keywords = "Aspect model, Hierarchical clustering, Image retrieval, Learning image semantics, Recognition",
author = "Barnard, {Jacobus J} and Pinar Duygulu and David Forsyth",
year = "2002",
doi = "10.1117/12.450716",
language = "English (US)",
volume = "4670",
pages = "1--11",
editor = "P.B. Kantor and T. Kanungo and J. Zhou",
booktitle = "Proceedings of SPIE - The International Society for Optical Engineering",

}

TY - GEN

T1 - Modeling the statistics of image features and associated text

AU - Barnard, Jacobus J

AU - Duygulu, Pinar

AU - Forsyth, David

PY - 2002

Y1 - 2002

N2 - We present a methodology for modeling the statistics of image features and associated text in large datasets. The models used also serve to cluster the images, as images are modeled as being produced by sampling from a limited number of combinations of mixing components. Furthermore, because our approach models the joint occurrence of image features and associated text, it can be used to predict the occurrence of either, based on observations or queries. This supports an attractive approach to image search as well as novel applications such as suggesting illustrations for blocks of text (auto-illustrate) and generating words for images outside the training set (auto-annotate). In this paper we illustrate the approach on 10,000 images of work from the Fine Arts Museum of San Francisco. The images include line drawings, paintings, and pictures of sculpture and ceramics. Many of the images have associated free text whose nature varies greatly, from physical description to interpretation and mood. We incorporate statistical natural language processing in order to deal with free text. We use WordNet to provide semantic grouping information and to help disambiguate word senses, as well as emphasize the hierarchical nature of semantic relationships.

AB - We present a methodology for modeling the statistics of image features and associated text in large datasets. The models used also serve to cluster the images, as images are modeled as being produced by sampling from a limited number of combinations of mixing components. Furthermore, because our approach models the joint occurrence of image features and associated text, it can be used to predict the occurrence of either, based on observations or queries. This supports an attractive approach to image search as well as novel applications such as suggesting illustrations for blocks of text (auto-illustrate) and generating words for images outside the training set (auto-annotate). In this paper we illustrate the approach on 10,000 images of work from the Fine Arts Museum of San Francisco. The images include line drawings, paintings, and pictures of sculpture and ceramics. Many of the images have associated free text whose nature varies greatly, from physical description to interpretation and mood. We incorporate statistical natural language processing in order to deal with free text. We use WordNet to provide semantic grouping information and to help disambiguate word senses, as well as emphasize the hierarchical nature of semantic relationships.

KW - Aspect model

KW - Hierarchical clustering

KW - Image retrieval

KW - Learning image semantics

KW - Recognition

UR - http://www.scopus.com/inward/record.url?scp=0036031157&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036031157&partnerID=8YFLogxK

U2 - 10.1117/12.450716

DO - 10.1117/12.450716

M3 - Conference contribution

AN - SCOPUS:0036031157

VL - 4670

SP - 1

EP - 11

BT - Proceedings of SPIE - The International Society for Optical Engineering

A2 - Kantor, P.B.

A2 - Kanungo, T.

A2 - Zhou, J.

ER -