Evaluating the usefulness of sentiment information for focused crawlers

Tianjun Fu, Ahmed Abbasi, Dajun Zeng, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Despite the prevalence of sentiment-related content on the Web, there has been limited work on focused crawlers capable of effectively collecting such content. In this study, we evaluated the efficacy of using sentiment-related information for enhanced focused crawling of opinion-rich web content regarding a particular topic. We also assessed the impact of using sentiment-labelled web graphs to further improve collection accuracy. Experimental results on a large testbed encompassing over half a million web pages revealed that focused crawlers utilizing sentiment information as well as sentiment-labelled web graphs are capable of gathering more holistic collections of opinion-related content regarding a particular topic. The results have important implications for business and marketing intelligence gathering efforts in the Web 2.0 era.

Original language: English (US)
Title of host publication: Proceedings of 20th Annual Workshop on Information Technologies and Systems
Publisher: Social Science Research Network
State: Published - 2010
Event: 20th Annual Workshop on Information Technologies and Systems, WITS 2010 - St. Louis, MO, United States
Duration: Dec 11 2010 → Dec 12 2010

Fingerprint

Testbeds
Websites
Marketing
Industry

Keywords

  • Focused crawler
  • Labelled web graph
  • Sentiment analysis

ASJC Scopus subject areas

  • Information Systems

Cite this

Fu, T., Abbasi, A., Zeng, D., & Chen, H. (2010). Evaluating the usefulness of sentiment information for focused crawlers. In Proceedings of 20th Annual Workshop on Information Technologies and Systems. Social Science Research Network.

@inproceedings{eca16d8ea262427391bfb726f542c5c7,
title = "Evaluating the usefulness of sentiment information for focused crawlers",
abstract = "Despite the prevalence of sentiment-related content on the Web, there has been limited work on focused crawlers capable of effectively collecting such content. In this study, we evaluated the efficacy of using sentiment-related information for enhanced focused crawling of opinion-rich web content regarding a particular topic. We also assessed the impact of using sentiment-labelled web graphs to further improve collection accuracy. Experimental results on a large testbed encompassing over half a million web pages revealed that focused crawlers utilizing sentiment information as well as sentiment-labelled web graphs are capable of gathering more holistic collections of opinion-related content regarding a particular topic. The results have important implications for business and marketing intelligence gathering efforts in the Web 2.0 era.",
keywords = "Focused crawler, Labelled web graph, Sentiment analysis",
author = "Tianjun Fu and Ahmed Abbasi and Dajun Zeng and Hsinchun Chen",
year = "2010",
language = "English (US)",
booktitle = "Proceedings of 20th Annual Workshop on Information Technologies and Systems",
publisher = "Social Science Research Network",
}

TY  - GEN
T1  - Evaluating the usefulness of sentiment information for focused crawlers
AU  - Fu, Tianjun
AU  - Abbasi, Ahmed
AU  - Zeng, Dajun
AU  - Chen, Hsinchun
PY  - 2010
Y1  - 2010
AB  - Despite the prevalence of sentiment-related content on the Web, there has been limited work on focused crawlers capable of effectively collecting such content. In this study, we evaluated the efficacy of using sentiment-related information for enhanced focused crawling of opinion-rich web content regarding a particular topic. We also assessed the impact of using sentiment-labelled web graphs to further improve collection accuracy. Experimental results on a large testbed encompassing over half a million web pages revealed that focused crawlers utilizing sentiment information as well as sentiment-labelled web graphs are capable of gathering more holistic collections of opinion-related content regarding a particular topic. The results have important implications for business and marketing intelligence gathering efforts in the Web 2.0 era.
KW  - Focused crawler
KW  - Labelled web graph
KW  - Sentiment analysis
UR  - http://www.scopus.com/inward/record.url?scp=84900416626&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84900416626&partnerID=8YFLogxK
M3  - Conference contribution
BT  - Proceedings of 20th Annual Workshop on Information Technologies and Systems
PB  - Social Science Research Network
ER  -