A comparison of fraud cues and classification methods for fake escrow website detection

Ahmed Abbasi, Hsinchun Chen

Research output: Contribution to journal › Article

22 Citations (Scopus)

Abstract

The ability to automatically detect fraudulent escrow websites is important in order to alleviate online auction fraud. Despite research on related topics, such as web spam and spoof site detection, fake escrow website categorization has received little attention. The authentic appearance of fake escrow websites makes it difficult for Internet users to differentiate legitimate sites from phonies, making systems for detecting such websites an important endeavor. In this study we evaluated the effectiveness of various features and techniques for detecting fake escrow websites. Our analysis included a rich set of fraud cues extracted from web page text, image, and link information. We also compared several machine learning algorithms, including support vector machines, neural networks, decision trees, naïve Bayes, and principal component analysis. Experiments were conducted to assess the proposed fraud cues and techniques on a test bed encompassing nearly 90,000 web pages derived from 410 legitimate and fake escrow websites. The combination of an extended feature set and a support vector machines ensemble classifier enabled accuracies over 90% and 96% for page and site level classification, respectively, when differentiating fake pages from real ones. Deeper analysis revealed that an extended set of fraud cues is necessary due to the broad spectrum of tactics employed by fraudsters. The study confirms the feasibility of using automated methods for detecting fake escrow websites. The results may also be useful for informing existing online escrow fraud resources and communities of practice about the plethora of fraud cues pervasive in fake websites.
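
The abstract distinguishes page level from site level classification but does not say how site-level labels were obtained. The following is a minimal sketch of one possible relationship between the two, assuming a site is flagged as fake when more than a chosen fraction of its pages are predicted fake. The function name `site_level_labels`, the `threshold` parameter, and the sample data are illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def site_level_labels(page_predictions, threshold=0.5):
    """Aggregate page-level predictions into one label per site.

    page_predictions: iterable of (site_id, is_fake) pairs, one per page.
    threshold: assumed cut-off; a site is labeled fake when the share of
    its pages predicted fake is strictly greater than this value.
    """
    fake_counts = defaultdict(int)
    page_counts = defaultdict(int)
    for site_id, is_fake in page_predictions:
        page_counts[site_id] += 1
        fake_counts[site_id] += int(is_fake)
    return {
        site: fake_counts[site] / page_counts[site] > threshold
        for site in page_counts
    }


# Hypothetical page-level outputs from some classifier (made-up data).
pages = [
    ("site-a", True), ("site-a", True), ("site-a", False),
    ("site-b", False), ("site-b", False), ("site-b", True),
]
labels = site_level_labels(pages)
print(labels)
```

Under this voting assumption, a few misclassified pages do not flip the label of a whole site, which is one plausible reason site-level accuracy can exceed page-level accuracy.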

Original language: English (US)
Pages (from-to): 83-101
Number of pages: 19
Journal: Information Technology and Management
Volume: 10
Issue number: 2-3 SPEC. ISS.
DOI: 10.1007/s10799-009-0059-0
State: Published - 2009

Keywords

  • Fraud cues
  • Internet fraud
  • Machine learning
  • Online escrow services
  • Website classification

ASJC Scopus subject areas

  • Information Systems
  • Communication
  • Business, Management and Accounting (miscellaneous)

Cite this

A comparison of fraud cues and classification methods for fake escrow website detection. / Abbasi, Ahmed; Chen, Hsinchun.

In: Information Technology and Management, Vol. 10, No. 2-3 SPEC. ISS., 2009, p. 83-101.

@article{16b56d05acec43ff814b749ec8df443f,
title = "A comparison of fraud cues and classification methods for fake escrow website detection",
abstract = "The ability to automatically detect fraudulent escrow websites is important in order to alleviate online auction fraud. Despite research on related topics, such as web spam and spoof site detection, fake escrow website categorization has received little attention. The authentic appearance of fake escrow websites makes it difficult for Internet users to differentiate legitimate sites from phonies, making systems for detecting such websites an important endeavor. In this study we evaluated the effectiveness of various features and techniques for detecting fake escrow websites. Our analysis included a rich set of fraud cues extracted from web page text, image, and link information. We also compared several machine learning algorithms, including support vector machines, neural networks, decision trees, na{\"i}ve Bayes, and principal component analysis. Experiments were conducted to assess the proposed fraud cues and techniques on a test bed encompassing nearly 90,000 web pages derived from 410 legitimate and fake escrow websites. The combination of an extended feature set and a support vector machines ensemble classifier enabled accuracies over 90{\%} and 96{\%} for page and site level classification, respectively, when differentiating fake pages from real ones. Deeper analysis revealed that an extended set of fraud cues is necessary due to the broad spectrum of tactics employed by fraudsters. The study confirms the feasibility of using automated methods for detecting fake escrow websites. The results may also be useful for informing existing online escrow fraud resources and communities of practice about the plethora of fraud cues pervasive in fake websites.",
keywords = "Fraud cues, Internet fraud, Machine learning, Online escrow services, Website classification",
author = "Ahmed Abbasi and Hsinchun Chen",
year = "2009",
doi = "10.1007/s10799-009-0059-0",
language = "English (US)",
volume = "10",
pages = "83--101",
journal = "Information Technology and Management",
issn = "1385-951X",
publisher = "Kluwer Academic Publishers",
number = "2-3 SPEC. ISS.",
}

TY - JOUR

T1 - A comparison of fraud cues and classification methods for fake escrow website detection

AU - Abbasi, Ahmed

AU - Chen, Hsinchun

PY - 2009

Y1 - 2009

AB - The ability to automatically detect fraudulent escrow websites is important in order to alleviate online auction fraud. Despite research on related topics, such as web spam and spoof site detection, fake escrow website categorization has received little attention. The authentic appearance of fake escrow websites makes it difficult for Internet users to differentiate legitimate sites from phonies, making systems for detecting such websites an important endeavor. In this study we evaluated the effectiveness of various features and techniques for detecting fake escrow websites. Our analysis included a rich set of fraud cues extracted from web page text, image, and link information. We also compared several machine learning algorithms, including support vector machines, neural networks, decision trees, naïve Bayes, and principal component analysis. Experiments were conducted to assess the proposed fraud cues and techniques on a test bed encompassing nearly 90,000 web pages derived from 410 legitimate and fake escrow websites. The combination of an extended feature set and a support vector machines ensemble classifier enabled accuracies over 90% and 96% for page and site level classification, respectively, when differentiating fake pages from real ones. Deeper analysis revealed that an extended set of fraud cues is necessary due to the broad spectrum of tactics employed by fraudsters. The study confirms the feasibility of using automated methods for detecting fake escrow websites. The results may also be useful for informing existing online escrow fraud resources and communities of practice about the plethora of fraud cues pervasive in fake websites.

KW - Fraud cues

KW - Internet fraud

KW - Machine learning

KW - Online escrow services

KW - Website classification

UR - http://www.scopus.com/inward/record.url?scp=68349154502&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=68349154502&partnerID=8YFLogxK

U2 - 10.1007/s10799-009-0059-0

DO - 10.1007/s10799-009-0059-0

M3 - Article

AN - SCOPUS:68349154502

VL - 10

SP - 83

EP - 101

JO - Information Technology and Management

JF - Information Technology and Management

SN - 1385-951X

IS - 2-3 SPEC. ISS.

ER -