Incremental hacker forum exploit collection and classification for proactive cyber threat intelligence: An exploratory study

Ryan Williams, Sagar Samtani, Mark Patton, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Cyber threats have emerged as a key societal concern. To counter the growing threat of cyber-attacks, organizations have in recent years begun investing heavily in developing Cyber Threat Intelligence (CTI). CTI is fundamentally a data-driven process, and many organizations have traditionally collected and analyzed data from internal log files, resulting in reactive CTI. The online hacker community can offer significant proactive CTI value by alerting organizations to threats they were not previously aware of. Among the various platforms, forums provide the richest metadata, data permanence, and tens of thousands of freely available Tools, Techniques, and Procedures (TTP). However, forums often employ anti-crawling measures such as authentication, throttling, and obfuscation. These measures have restricted many researchers to batch collections. This exploratory study aims to (1) design a novel web crawler augmented with numerous anti-crawling countermeasures to collect hacker exploits on an ongoing basis, (2) employ a state-of-the-art deep learning approach, a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), to automatically classify exploits into pre-defined categories on the fly, and (3) develop interactive visualizations enabling CTI practitioners and researchers to explore collected exploits for proactive, timely CTI. The results of this study indicate, among other findings, that system and network exploits are shared significantly more than other exploit types.
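
The distinction the abstract draws between batch collection and collection "on an ongoing basis" can be illustrated with a minimal sketch. This is not the authors' crawler: the `Post` and `IncrementalCollector` names and their fields are hypothetical, and the sketch assumes only that each forum post carries a stable identifier, so that a repeated pass over the same listing stores only posts that have not been seen before.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Post:
    """A collected forum post (hypothetical fields, for illustration only)."""
    post_id: str
    title: str


@dataclass
class IncrementalCollector:
    """Remembers which post IDs were already stored in earlier passes."""
    seen_ids: Set[str] = field(default_factory=set)

    def collect_new(self, listing: Iterable[Post]) -> List[Post]:
        """Return only the posts not seen before, and remember their IDs."""
        new_posts = []
        for post in listing:
            if post.post_id not in self.seen_ids:
                self.seen_ids.add(post.post_id)
                new_posts.append(post)
        return new_posts


# Two passes over overlapping listings: the second pass keeps only the new post.
collector = IncrementalCollector()
first_pass = collector.collect_new([Post("1", "a"), Post("2", "b")])
second_pass = collector.collect_new([Post("2", "b"), Post("3", "c")])
print([p.post_id for p in first_pass])
print([p.post_id for p in second_pass])
```

A batch collection would re-store every post on each run; here the set of seen identifiers is what makes later passes incremental. In a real system that set would have to be persisted between runs, which this sketch leaves out.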

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Intelligence and Security Informatics, ISI 2018
Editors: Dongwon Lee, Ghita Mezzour, Ponnurangam Kumaraguru, Nitesh Saxena
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 94-99
Number of pages: 6
ISBN (Electronic): 9781538678480
DOIs: https://doi.org/10.1109/ISI.2018.8587336
State: Published - Dec 24 2018
Event: 16th IEEE International Conference on Intelligence and Security Informatics, ISI 2018 - Miami, United States
Duration: Nov 9 2018 → Nov 11 2018

Other

Other: 16th IEEE International Conference on Intelligence and Security Informatics, ISI 2018
Country: United States
City: Miami
Period: 11/9/18 → 11/11/18

Fingerprint

hacker
intelligence
threat
Recurrent neural networks
Metadata
Authentication
Visualization
Incremental
Exploratory study
Threat
neural network
visualization

Keywords

  • CTI
  • Cyber threat intelligence
  • Hacker exploits
  • Hacker forum
  • Recurrent neural network
  • Web crawling

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Communication

Cite this

Williams, R., Samtani, S., Patton, M., & Chen, H. (2018). Incremental hacker forum exploit collection and classification for proactive cyber threat intelligence: An exploratory study. In D. Lee, G. Mezzour, P. Kumaraguru, & N. Saxena (Eds.), 2018 IEEE International Conference on Intelligence and Security Informatics, ISI 2018 (pp. 94-99). [8587336] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ISI.2018.8587336

@inproceedings{5e848264378a4f5f93392e8e8b659b2b,
title = "Incremental hacker forum exploit collection and classification for proactive cyber threat intelligence: An exploratory study",
abstract = "Cyber threats have emerged as a key societal concern. To counter the growing threat of cyber-attacks, organizations, in recent years, have begun investing heavily in developing Cyber Threat Intelligence (CTI). Fundamentally a data driven process, many organizations have traditionally collected and analyzed data from internal log files, resulting in reactive CTI. The online hacker community can offer significant proactive CTI value by alerting organizations to threats they were not previously aware of. Amongst various platforms, forums provide the richest metadata, data permanence, and tens of thousands of freely available Tools, Techniques, and Procedures (TTP). However, forums often employ anti-crawling measures such as authentication, throttling, and obfuscation. Such limitations have restricted many researchers to batch collections. This exploratory study aims to (1) design a novel web crawler augmented with numerous anti-crawling countermeasures to collect hacker exploits on an ongoing basis, (2) employ a state-of-the-art deep learning approach, Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), to automatically classify exploits into pre-defined categories on-the-fly, and (3) develop interactive visualizations enabling CTI practitioners and researchers to explore collected exploits for proactive, timely CTI. The results of this study indicate, among other findings, that system and network exploits are shared significantly more than other exploit types.",
keywords = "CTI, Cyber threat intelligence, Hacker exploits, Hacker forum, Recurrent neural network, Web crawling",
author = "Ryan Williams and Sagar Samtani and Mark Patton and Hsinchun Chen",
year = "2018",
month = "12",
day = "24",
doi = "10.1109/ISI.2018.8587336",
language = "English (US)",
pages = "94--99",
editor = "Dongwon Lee and Ghita Mezzour and Ponnurangam Kumaraguru and Nitesh Saxena",
booktitle = "2018 IEEE International Conference on Intelligence and Security Informatics, ISI 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Incremental hacker forum exploit collection and classification for proactive cyber threat intelligence

T2 - An exploratory study

AU - Williams, Ryan

AU - Samtani, Sagar

AU - Patton, Mark

AU - Chen, Hsinchun

PY - 2018/12/24

Y1 - 2018/12/24

AB - Cyber threats have emerged as a key societal concern. To counter the growing threat of cyber-attacks, organizations, in recent years, have begun investing heavily in developing Cyber Threat Intelligence (CTI). Fundamentally a data driven process, many organizations have traditionally collected and analyzed data from internal log files, resulting in reactive CTI. The online hacker community can offer significant proactive CTI value by alerting organizations to threats they were not previously aware of. Amongst various platforms, forums provide the richest metadata, data permanence, and tens of thousands of freely available Tools, Techniques, and Procedures (TTP). However, forums often employ anti-crawling measures such as authentication, throttling, and obfuscation. Such limitations have restricted many researchers to batch collections. This exploratory study aims to (1) design a novel web crawler augmented with numerous anti-crawling countermeasures to collect hacker exploits on an ongoing basis, (2) employ a state-of-the-art deep learning approach, Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), to automatically classify exploits into pre-defined categories on-the-fly, and (3) develop interactive visualizations enabling CTI practitioners and researchers to explore collected exploits for proactive, timely CTI. The results of this study indicate, among other findings, that system and network exploits are shared significantly more than other exploit types.

KW - CTI

KW - Cyber threat intelligence

KW - Hacker exploits

KW - Hacker forum

KW - Recurrent neural network

KW - Web crawling

UR - http://www.scopus.com/inward/record.url?scp=85061054594&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061054594&partnerID=8YFLogxK

U2 - 10.1109/ISI.2018.8587336

DO - 10.1109/ISI.2018.8587336

M3 - Conference contribution

AN - SCOPUS:85061054594

SP - 94

EP - 99

BT - 2018 IEEE International Conference on Intelligence and Security Informatics, ISI 2018

A2 - Lee, Dongwon

A2 - Mezzour, Ghita

A2 - Kumaraguru, Ponnurangam

A2 - Saxena, Nitesh

PB - Institute of Electrical and Electronics Engineers Inc.

ER -