Partially supervised learning for radical opinion identification in hate group web forums

Ming Yang, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Web forums are frequently used as platforms for the exchange of information and opinions, as well as for propaganda dissemination. But online content can be misused when the information being distributed, such as radical opinions, is unsolicited or inappropriate. However, radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time-consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high-quality positive examples and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.

Original language: English (US)
Title of host publication: ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities
Pages: 96-101
Number of pages: 6
DOIs: https://doi.org/10.1109/ISI.2012.6284099
State: Published - 2012
Event: 2012 10th IEEE International Conference on Intelligence and Security Informatics, ISI 2012 - Washington, DC, United States
Duration: Jun 11 2012 → Jun 14 2012

Other

Other: 2012 10th IEEE International Conference on Intelligence and Security Informatics, ISI 2012
Country: United States
City: Washington, DC
Period: 6/11/12 → 6/14/12

Fingerprint

Supervised learning
Labeling
Labels

Keywords

  • document classification
  • opinion mining
  • partially supervised learning
  • Web forum

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems

Cite this

Yang, M., & Chen, H. (2012). Partially supervised learning for radical opinion identification in hate group web forums. In ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities (pp. 96-101). [6284099] https://doi.org/10.1109/ISI.2012.6284099

@inproceedings{7986e62d0a314299868371cb53ed4de8,
title = "Partially supervised learning for radical opinion identification in hate group web forums",
keywords = "document classification, opinion mining, partially supervised learning, Web forum",
author = "Ming Yang and Hsinchun Chen",
year = "2012",
doi = "10.1109/ISI.2012.6284099",
language = "English (US)",
isbn = "9781467321037",
pages = "96--101",
booktitle = "ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities",

}

TY - GEN

T1 - Partially supervised learning for radical opinion identification in hate group web forums

AU - Yang, Ming

AU - Chen, Hsinchun

PY - 2012

Y1 - 2012

KW - document classification

KW - opinion mining

KW - partially supervised learning

KW - Web forum

UR - http://www.scopus.com/inward/record.url?scp=84867343799&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867343799&partnerID=8YFLogxK

U2 - 10.1109/ISI.2012.6284099

DO - 10.1109/ISI.2012.6284099

M3 - Conference contribution

SN - 9781467321037

SP - 96

EP - 101

BT - ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities

ER -