GANNET: A machine learning approach to document retrieval

Hsinchun Chen, Jinwoo Kim

Research output: Contribution to journal (Article)

18 Scopus citations

Abstract

Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to "intelligent" information retrieval and indexing. More recently, information science researchers have turned to other, newer artificial intelligence-based inductive learning techniques, including neural networks, symbolic learning, and genetic algorithms. The newer techniques have provided great opportunities for researchers to experiment with diverse paradigms for effective information processing and retrieval. In this article we first provide an overview of the newer techniques and their usage in information science research. We then present in detail the algorithms we adopted for GANNET, a hybrid system based on genetic algorithms and neural networks. GANNET performed concept (keyword) optimization for user-selected documents during information retrieval using genetic algorithms. It then used the optimized concepts to perform concept exploration in a large network of related concepts through the Hopfield net parallel relaxation procedure. Based on a test collection of about 3,000 articles from DIALOG and an automatically created thesaurus, and using Jaccard's score as a performance measure, our experiment showed that GANNET improved Jaccard's scores by about 50 percent and helped identify the underlying concepts (keywords) that best describe the user-selected documents.
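As an illustration of the performance measure named in the abstract, the following is a minimal sketch of Jaccard's score between keyword sets, assuming the standard unweighted set definition (size of the intersection divided by size of the union). The function names `jaccard` and `mean_jaccard`, the sample keywords, and the idea of averaging the score over the user-selected documents are assumptions made for illustration; the article's exact formulation may differ.

```python
def jaccard(a, b):
    """Jaccard's score of two keyword collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


def mean_jaccard(candidate, documents):
    """Average Jaccard's score of a candidate keyword set against each document.

    Hypothetical helper: one plausible way to score how well a set of
    concepts describes a group of user-selected documents.
    """
    if not documents:
        return 0.0
    return sum(jaccard(candidate, doc) for doc in documents) / len(documents)


# Hypothetical keyword sets standing in for user-selected documents.
documents = [
    {"information retrieval", "neural networks", "genetic algorithms"},
    {"information retrieval", "genetic algorithms", "thesaurus"},
]
candidate = {"information retrieval", "genetic algorithms"}

score = mean_jaccard(candidate, documents)
print(f"mean Jaccard's score: {score:.3f}")
```

A score of this kind could serve as the fitness value that a genetic algorithm tries to increase when it searches for keyword sets, which is the role the abstract assigns to concept optimization.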

Original language: English (US)
Pages (from-to): 7-41
Number of pages: 35
Journal: Journal of Management Information Systems
Volume: 10
Issue number: 4
State: Published - Mar 1 1994

Keywords

  • Intelligent information retrieval
  • Machine learning

ASJC Scopus subject areas

  • Management Information Systems
  • Computer Science Applications
  • Management Science and Operations Research
  • Information Systems and Management
