Using content-based and link-based analysis in building vertical search engines

Michael Chau, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

This paper reports our research in the Web page filtering process in specialized search engine development. We propose a machine-learning-based approach that combines Web content analysis and Web structure analysis. Instead of a bag of words, each Web page is represented by a set of content-based and link-based features, which can be used as the input for various machine learning algorithms. The proposed approach was implemented using both a feedforward/backpropagation neural network and a support vector machine. An evaluation study was conducted and showed that the proposed approaches performed better than the benchmark approaches.
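The page representation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Page` structure, the tiny `DOMAIN_TERMS` lexicon, and the four features chosen. The abstract states only that content-based and link-based features are combined into one input for a learning algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical domain lexicon; the paper's actual lexicon is not given in the abstract.
DOMAIN_TERMS = {"cancer", "tumor", "oncology"}


@dataclass
class Page:
    """Illustrative page record (assumed structure, not the authors')."""
    url: str
    title: str
    body: str
    inlink_anchors: List[str] = field(default_factory=list)  # anchor text of links pointing here
    outlinks: List[str] = field(default_factory=list)        # URLs this page links to


def term_hits(text: str) -> int:
    """Count whitespace-separated tokens that appear in the domain lexicon."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return sum(1 for tok in tokens if tok in DOMAIN_TERMS)


def features(page: Page, known_relevant: Set[str]) -> List[float]:
    """Build one fixed-length vector from content-based and link-based signals."""
    content_based = [
        float(term_hits(page.title)),  # domain terms in the title
        float(term_hits(page.body)),   # domain terms in the body text
    ]
    link_based = [
        float(sum(term_hits(a) for a in page.inlink_anchors)),       # domain terms in incoming anchor text
        float(sum(1 for u in page.outlinks if u in known_relevant)),  # outlinks to pages already judged relevant
    ]
    return content_based + link_based


page = Page(
    url="http://example.org/a",
    title="Oncology news",
    body="New tumor study. Cancer rates fall.",
    inlink_anchors=["cancer research", "home"],
    outlinks=["http://example.org/b", "http://example.org/c"],
)
vector = features(page, known_relevant={"http://example.org/b"})
print(vector)  # [1.0, 2.0, 1.0, 1.0]
```

A vector of this kind, rather than a bag of words, is what would be passed to a classifier such as a neural network or a support vector machine. Training and evaluation are left out of this sketch.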

Original language: English (US)
Title of host publication: Digital Libraries
Subtitle of host publication: International Collaboration and Cross-Fertilization - 7th International Conference on Asian Digital Libraries, ICADL 2004
Editors: Qihao Miao, Ee-peng Lim, Zhaoneng Chen, Yuxi Fu, Hsinchun Chen, Edward Fox
Publisher: Springer-Verlag
Pages: 515-518
Number of pages: 4
ISBN (Print): 9783540240303
State: Published - 2005
Event: 7th International Conference on Asian Digital Libraries, ICADL 2004 - Shanghai, China
Duration: Dec 13 2004 to Dec 17 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3334 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Conference on Asian Digital Libraries, ICADL 2004
Country: China
City: Shanghai
Period: 12/13/04 to 12/17/04

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)


Cite this

Chau, M., & Chen, H. (2005). Using content-based and link-based analysis in building vertical search engines. In Q. Miao, E. Lim, Z. Chen, Y. Fu, H. Chen, & E. Fox (Eds.), Digital Libraries: International Collaboration and Cross-Fertilization - 7th International Conference on Asian Digital Libraries, ICADL 2004 (pp. 515-518). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3334 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-540-30544-6_3