Unsupervised segmentation of categorical time series into episodes

Paul R Cohen, Brent Heeringa, Niall Adams

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Citations (Scopus)

Abstract

This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of n-grams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. The algorithm also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.
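The two-stage procedure in the abstract (collect n-gram statistics, then let two experts vote inside a sliding window) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`ngram_stats`, `boundary_entropy`, `vote`, `segment`), the `window` and `threshold` parameters, the use of raw rather than normalized scores, and the local-maximum rule for placing boundaries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def ngram_stats(seq, max_len):
    """Count every n-gram up to max_len and the symbols that follow it."""
    freq = Counter()
    followers = defaultdict(Counter)
    for i in range(len(seq)):
        for n in range(1, max_len + 1):
            if i + n > len(seq):
                break
            gram = seq[i:i + n]
            freq[gram] += 1
            if i + n < len(seq):
                followers[gram][seq[i + n]] += 1
    return freq, followers


def boundary_entropy(followers, gram):
    """Entropy (in bits) of the distribution of symbols seen after gram."""
    counts = followers.get(gram)
    if not counts:
        return 0.0
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def vote(seq, window=3):
    """Slide a window over seq; each expert casts one vote per window.

    votes[k] counts the votes for a boundary just before seq[k].
    Assumption: raw scores are compared; no normalization across
    n-gram lengths is applied in this sketch.
    """
    freq, followers = ngram_stats(seq, window)
    votes = [0] * (len(seq) + 1)
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        splits = range(1, window)  # interior split points only
        # Frequency expert: prefer the split whose two parts are most frequent.
        f = max(splits, key=lambda k: freq[chunk[:k]] + freq[chunk[k:]])
        # Entropy expert: prefer the split after the least predictable prefix.
        h = max(splits, key=lambda k: boundary_entropy(followers, chunk[:k]))
        votes[start + f] += 1
        votes[start + h] += 1
    return votes


def segment(seq, window=3, threshold=1):
    """Cut where the vote count exceeds threshold and is a local maximum."""
    votes = vote(seq, window)
    cuts = [k for k in range(1, len(seq))
            if votes[k] > threshold
            and votes[k] >= votes[k - 1]
            and votes[k] >= votes[k + 1]]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    text = "thecatsatonthematthecatatethemat"
    print(segment(text, window=3, threshold=1))
```

The input must be a string (or a tuple of hashable symbols) so that slices can serve as dictionary keys. On a toy input this short the proposed boundaries are not expected to match word boundaries reliably; the sketch only shows how frequency and boundary-entropy evidence are combined into votes.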

Original language: English (US)
Title of host publication: Proceedings - IEEE International Conference on Data Mining, ICDM
Pages: 99-106
Number of pages: 8
State: Published - 2002
Externally published: Yes
Event: 2nd IEEE International Conference on Data Mining, ICDM '02 - Maebashi, Japan
Duration: Dec 9 2002 to Dec 12 2002

Other

Other: 2nd IEEE International Conference on Data Mining, ICDM '02
Country: Japan
City: Maebashi
Period: 12/9/02 to 12/12/02

Fingerprint

Time series
Robots
Entropy
Statistics
Sensors

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Cohen, P. R., Heeringa, B., & Adams, N. (2002). Unsupervised segmentation of categorical time series into episodes. In Proceedings - IEEE International Conference on Data Mining, ICDM (pp. 99-106)

Unsupervised segmentation of categorical time series into episodes. / Cohen, Paul R; Heeringa, Brent; Adams, Niall.

Proceedings - IEEE International Conference on Data Mining, ICDM. 2002. p. 99-106.

Cohen, PR, Heeringa, B & Adams, N 2002, Unsupervised segmentation of categorical time series into episodes. in Proceedings - IEEE International Conference on Data Mining, ICDM. pp. 99-106, 2nd IEEE International Conference on Data Mining, ICDM '02, Maebashi, Japan, 12/9/02.
Cohen PR, Heeringa B, Adams N. Unsupervised segmentation of categorical time series into episodes. In Proceedings - IEEE International Conference on Data Mining, ICDM. 2002. p. 99-106
Cohen, Paul R ; Heeringa, Brent ; Adams, Niall. / Unsupervised segmentation of categorical time series into episodes. Proceedings - IEEE International Conference on Data Mining, ICDM. 2002. pp. 99-106
@inproceedings{4b80579514d54c07b965f24140bc8487,
title = "Unsupervised segmentation of categorical time series into episodes",
abstract = "This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of n-grams, then passes a window over the series and has two {"}expert methods{"} decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. The algorithm also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.",
author = "Cohen, {Paul R} and Brent Heeringa and Niall Adams",
year = "2002",
language = "English (US)",
isbn = "0769517544",
pages = "99--106",
booktitle = "Proceedings - IEEE International Conference on Data Mining, ICDM",
}

TY - GEN

T1 - Unsupervised segmentation of categorical time series into episodes

AU - Cohen, Paul R

AU - Heeringa, Brent

AU - Adams, Niall

PY - 2002

Y1 - 2002

N2 - This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of n-grams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. The algorithm also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.

AB - This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of n-grams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. The algorithm also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.

UR - http://www.scopus.com/inward/record.url?scp=36348933993&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36348933993&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0769517544

SN - 9780769517544

SP - 99

EP - 106

BT - Proceedings - IEEE International Conference on Data Mining, ICDM

ER -