A framework for stylometric similarity detection in online settings

Ahmed Abbasi, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Online marketplaces and communication media such as email, web sites, forums, and chat rooms have been ubiquitously integrated into our everyday lives. Unfortunately, the anonymous nature of these channels makes them an ideal avenue for online fraud, hacking, and cybercrime. Anonymity and the sheer volume of online content make cyber identity tracing an essential yet strenuous endeavor for Internet users and human analysts. To address these challenges, we propose a framework for online stylometric analysis to assist in distinguishing authorship in online communities based on writing style. Our framework includes the use of a scalable identity-level similarity detection technique coupled with an extensive stylistic feature set and an identity database. The framework is intended to support stylometric authentication for Internet users as well as provide support for forensic investigations. The proposed technique and extended feature set were evaluated on a test bed encompassing thousands of feedback comments posted by 100 electronic market traders. The method outperformed benchmark stylometric techniques with an accuracy of approximately 95% when differentiating between 200 trader identities. The results indicate that the proposed stylometric analysis approach may help mitigate the effects of online anonymity abuse.
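
The abstract does not specify the similarity technique or the feature set, so the following is only an illustrative sketch of what identity-level stylometric similarity can look like, not the authors' method. It assumes character trigram frequencies as a stand-in for the stylistic features and cosine similarity as the comparison measure. All function names and the sample comments are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in one lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def identity_profile(texts, n=3):
    """Pool all texts written under one identity into relative n-gram frequencies."""
    counts = Counter()
    for t in texts:
        counts.update(char_ngrams(t, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_identity(query_texts, known_identities):
    """Return the known identity whose pooled profile is closest to the query."""
    query = identity_profile(query_texts)
    return max(
        known_identities,
        key=lambda name: cosine(query, identity_profile(known_identities[name])),
    )


# Hypothetical feedback comments, invented for illustration only.
known = {
    "trader_a": ["Great seller!!! Fast shipping!!! A+++"],
    "trader_b": ["item arrived as described, thank you very much."],
}
match = most_similar_identity(["Fast shipping!!! Great item!!! A+++"], known)
```

Comparing pooled per-identity profiles, rather than individual messages, is one simple way to read "identity-level" similarity. A real system would need a far richer feature set and a scalable index over the identity database.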

Original language: English (US)
Title of host publication: Association for Information Systems - 13th Americas Conference on Information Systems, AMCIS 2007
Subtitle of host publication: Reaching New Heights
Pages: 1442-1451
Number of pages: 10
State: Published - Dec 1 2007
Event: 13th Americas Conference on Information Systems, AMCIS 2007 - Keystone, CO, United States
Duration: Aug 10 2007 - Aug 12 2007

Publication series

Name: Association for Information Systems - 13th Americas Conference on Information Systems, AMCIS 2007: Reaching New Heights
Volume: 2

Other

Other: 13th Americas Conference on Information Systems, AMCIS 2007
Country: United States
City: Keystone, CO
Period: 8/10/07 - 8/12/07

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Networks and Communications
  • Information Systems
  • Library and Information Sciences
