An intelligent personal spider (agent) for dynamic Internet/Intranet searching

Hsinchun Chen, Yi-Ming Chung, Marshall Ramsey, Christopher C. Yang

Research output: Contribution to journal › Article

61 Citations (Scopus)

Abstract

As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with searching the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over the current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best-first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages on the Web, based on links and keyword indexing. A plain, static CGI/HTML-based interface was developed first, followed by a recent enhancement: a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the proposed techniques apply equally to the potentially more rewarding Intranet applications. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets.
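The best-first spider described in the abstract can be illustrated with a minimal sketch. The abstract does not state how pages are scored, so the Jaccard similarity over keyword sets used here is an assumption, as are the function names and the small in-memory `PAGES` graph that stands in for fetched and indexed homepages. This is not the authors' implementation.

```python
import heapq

# Hypothetical in-memory "web": url -> indexed keywords and outgoing links.
PAGES = {
    "a": {"keywords": ["spider", "search", "agent"], "links": ["b", "c"]},
    "b": {"keywords": ["spider", "search", "java"], "links": ["d"]},
    "c": {"keywords": ["cooking", "recipes"], "links": ["e"]},
    "d": {"keywords": ["spider", "agent", "search"], "links": []},
    "e": {"keywords": ["search"], "links": []},
}


def jaccard(a, b):
    """Jaccard similarity between two keyword collections (assumed scoring)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_first_spider(pages, start_urls, max_results=3):
    """Visit linked pages in order of similarity to the starting homepages."""
    # The target profile is the union of the starting pages' keywords.
    target = set()
    for url in start_urls:
        target |= set(pages[url]["keywords"])

    visited = set(start_urls)
    frontier = []  # heapq is a min-heap, so scores are stored negated

    def expand(url):
        # Score every unseen outgoing link and add it to the frontier.
        for link in pages[url]["links"]:
            if link in pages and link not in visited:
                visited.add(link)
                score = jaccard(target, pages[link]["keywords"])
                heapq.heappush(frontier, (-score, link))

    for url in start_urls:
        expand(url)

    results = []
    while frontier and len(results) < max_results:
        neg_score, url = heapq.heappop(frontier)  # most similar page so far
        results.append((url, -neg_score))
        expand(url)
    return results


print(best_first_spider(PAGES, ["a"]))
```

The results are listed in visit order rather than sorted by score: a page deeper in the link graph can score higher than one visited before it, because it only becomes reachable once its parent has been expanded.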

Original language: English (US)
Pages (from-to): 41-58
Number of pages: 18
Journal: Decision Support Systems
Volume: 23
Issue number: 1
State: Published - May 1998

Fingerprint

Computer Communication Networks
Intranets
Spiders
Internet
Digital Libraries
HTML
World Wide Web
Intranet
Digital libraries
Genetic Techniques
Research
Genetic algorithms
Communication

Keywords

  • Agents
  • Evolutionary programming
  • Information retrieval
  • Internet
  • Intranet
  • Java
  • Machine learning
  • Semantic retrieval
  • Spider
  • World-Wide Web

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems
  • Information Systems and Management

Cite this

An intelligent personal spider (agent) for dynamic Internet/Intranet searching. / Chen, Hsinchun; Chung, Yi-Ming; Ramsey, Marshall; Yang, Christopher C.

In: Decision Support Systems, Vol. 23, No. 1, 05.1998, p. 41-58.

@article{446196ae0e4b4adb8b13ad7ad53e74d3,
title = "An intelligent personal spider (agent) for dynamic Internet/Intranet searching",
abstract = "As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with searching the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over the current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best-first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages on the Web, based on links and keyword indexing. A plain, static CGI/HTML-based interface was developed first, followed by a recent enhancement: a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the proposed techniques apply equally to the potentially more rewarding Intranet applications. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets.",
keywords = "Agents, Evolutionary programming, Information retrieval, Internet, Intranet, Java, Machine learning, Semantic retrieval, Spider, World-Wide Web",
author = "Hsinchun Chen and Yi-Ming Chung and Marshall Ramsey and Yang, {Christopher C.}",
year = "1998",
month = "5",
language = "English (US)",
volume = "23",
pages = "41--58",
journal = "Decision Support Systems",
issn = "0167-9236",
publisher = "Elsevier",
number = "1",

}

TY - JOUR

T1 - An intelligent personal spider (agent) for dynamic Internet/Intranet searching

AU - Chen, Hsinchun

AU - Chung, Yi-Ming

AU - Ramsey, Marshall

AU - Yang, Christopher C.

PY - 1998/5

Y1 - 1998/5

AB - As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with searching the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over the current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best-first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages on the Web, based on links and keyword indexing. A plain, static CGI/HTML-based interface was developed first, followed by a recent enhancement: a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the proposed techniques apply equally to the potentially more rewarding Intranet applications. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets.

KW - Agents

KW - Evolutionary programming

KW - Information retrieval

KW - Internet

KW - Intranet

KW - Java

KW - Machine learning

KW - Semantic retrieval

KW - Spider

KW - World-Wide Web

UR - http://www.scopus.com/inward/record.url?scp=0032068674&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032068674&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0032068674

VL - 23

SP - 41

EP - 58

JO - Decision Support Systems

JF - Decision Support Systems

SN - 0167-9236

IS - 1

ER -