Alleviating search uncertainty through concept associations: Automatic indexing, co-occurrence analysis, and parallel computing

Hsinchun Chen, Joanne Martinez, Amy Kirchhoff, Tobun D. Ng, Bruce R. Schatz

Research output: Contribution to journal (Article)

25 Citations (Scopus)

Abstract

In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded on object filtering, automatic indexing, and co-occurrence analysis, we performed a large-scale experiment using a parallel supercomputer (SGI Power Challenge) to analyze 400,000+ abstracts in an INSPEC computer engineering collection. Two system-generated thesauri, one based on a combined object filtering and automatic indexing method, and the other based on automatic indexing only, were compared with the human-generated INSPEC subject thesaurus. Our user evaluation revealed that the system-generated thesauri were better than the INSPEC thesaurus in concept recall, but in concept precision the 3 thesauri were comparable. Our analysis also revealed that the terms suggested by the 3 thesauri were complementary and could be used to significantly increase "variety" in search terms and thereby reduce search uncertainty.

Original language: English (US)
Pages (from-to): 206-216
Number of pages: 11
Journal: Journal of the American Society for Information Science
Volume: 49
Issue number: 3
State: Published - 1998

Fingerprint

Automatic indexing
Thesauri
thesaurus
Parallel processing systems
indexing
uncertainty
Supercomputers
Uncertainty
Parallel computing
Indexing
Thesaurus
engineering
experiment
evaluation

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Alleviating search uncertainty through concept associations: Automatic indexing, co-occurrence analysis, and parallel computing. / Chen, Hsinchun; Martinez, Joanne; Kirchhoff, Amy; Ng, Tobun D.; Schatz, Bruce R.

In: Journal of the American Society for Information Science, Vol. 49, No. 3, 1998, p. 206-216.

@article{aa5723067a1848f1a345055212d2e986,
title = "Alleviating search uncertainty through concept associations: Automatic indexing, co-occurrence analysis, and parallel computing",
abstract = "In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded on object filtering, automatic indexing, and co-occurrence analysis, we performed a large-scale experiment using a parallel supercomputer (SGI Power Challenge) to analyze 400,000+ abstracts in an INSPEC computer engineering collection. Two system-generated thesauri, one based on a combined object filtering and automatic indexing method, and the other based on automatic indexing only, were compared with the human-generated INSPEC subject thesaurus. Our user evaluation revealed that the system-generated thesauri were better than the INSPEC thesaurus in concept recall, but in concept precision the 3 thesauri were comparable. Our analysis also revealed that the terms suggested by the 3 thesauri were complementary and could be used to significantly increase {"}variety{"} in search terms and thereby reduce search uncertainty.",
author = "Chen, Hsinchun and Martinez, Joanne and Kirchhoff, Amy and Ng, {Tobun D.} and Schatz, {Bruce R.}",
year = "1998",
language = "English (US)",
volume = "49",
pages = "206--216",
journal = "Journal of the American Society for Information Science",
issn = "2330-1635",
publisher = "John Wiley and Sons Ltd",
number = "3",

}

TY - JOUR

T1 - Alleviating search uncertainty through concept associations

T2 - Automatic indexing, co-occurrence analysis, and parallel computing

AU - Chen, Hsinchun

AU - Martinez, Joanne

AU - Kirchhoff, Amy

AU - Ng, Tobun D.

AU - Schatz, Bruce R.

PY - 1998

Y1 - 1998

N2 - In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded on object filtering, automatic indexing, and co-occurrence analysis, we performed a large-scale experiment using a parallel supercomputer (SGI Power Challenge) to analyze 400,000+ abstracts in an INSPEC computer engineering collection. Two system-generated thesauri, one based on a combined object filtering and automatic indexing method, and the other based on automatic indexing only, were compared with the human-generated INSPEC subject thesaurus. Our user evaluation revealed that the system-generated thesauri were better than the INSPEC thesaurus in concept recall, but in concept precision the 3 thesauri were comparable. Our analysis also revealed that the terms suggested by the 3 thesauri were complementary and could be used to significantly increase "variety" in search terms and thereby reduce search uncertainty.

UR - http://www.scopus.com/inward/record.url?scp=0032028271&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032028271&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0032028271

VL - 49

SP - 206

EP - 216

JO - Journal of the American Society for Information Science

JF - Journal of the American Society for Information Science

SN - 2330-1635

IS - 3

ER -