An algorithm for segmenting categorical time series into meaningful episodes

Paul Cohen, Niall Adams

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

28 Scopus citations

Abstract

This paper describes an unsupervised algorithm for segmenting categorical time series. The algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two “expert methods” decide where in the window boundaries should be drawn. The algorithm segments text into words successfully in three languages. We claim that the algorithm finds meaningful episodes in categorical time series, because it exploits two statistical characteristics of meaningful episodes.
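
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names, the window size, the vote threshold, the one-vote-per-expert rule, and the local-maximum rule for drawing boundaries are all assumptions, since the abstract does not specify them. Scores are left raw (not normalized by ngram length) for brevity.

```python
import math
from collections import Counter, defaultdict


def ngram_stats(series, max_n):
    """Count every ngram up to length max_n and the symbols that follow it."""
    freq = Counter()
    followers = defaultdict(Counter)
    for i in range(len(series)):
        for n in range(1, max_n + 1):
            if i + n > len(series):
                break
            gram = series[i:i + n]
            freq[gram] += 1
            if i + n < len(series):
                followers[gram][series[i + n]] += 1
    return freq, followers


def boundary_entropy(gram, followers):
    """Shannon entropy (bits) of the symbols observed right after gram."""
    counts = followers.get(gram)
    if not counts:
        return 0.0
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def segment(series, window=4, threshold=1):
    """Slide a window over series and let two experts vote on split points.

    window and threshold are hypothetical defaults chosen for illustration.
    """
    freq, followers = ngram_stats(series, window)
    votes = [0] * (len(series) + 1)  # votes[k]: boundary just before series[k]
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        splits = range(1, window)
        # Frequency expert (assumed rule): split into the two most frequent parts.
        best_f = max(splits, key=lambda k: freq[chunk[:k]] + freq[chunk[k:]])
        # Entropy expert (assumed rule): split after the least predictable prefix.
        best_h = max(splits, key=lambda k: boundary_entropy(chunk[:k], followers))
        votes[start + best_f] += 1
        votes[start + best_h] += 1
    # Assumed decision rule: cut where votes exceed the threshold and peak locally.
    cuts = [k for k in range(1, len(series))
            if votes[k] > threshold
            and votes[k] >= votes[k - 1] and votes[k] >= votes[k + 1]]
    bounds = [0] + cuts + [len(series)]
    return [series[a:b] for a, b in zip(bounds, bounds[1:])]
```

Calling `segment(text)` on a string with spaces removed returns a list of substrings that concatenate back to `text`. How closely the pieces match real words depends on the amount of data and on the assumed voting and thresholding rules.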

Original language: English (US)
Title of host publication: Advances in Intelligent Data Analysis - 4th International Conference, IDA 2001, Proceedings
Editors: Frank Hoffmann, Gabriela Guimaraes, David J. Hand, Niall Adams, Douglas Fisher
Publisher: Springer-Verlag
Pages: 198-207
Number of pages: 10
ISBN (Print): 3540425810, 9783540425816
DOIs
State: Published - 2001
Externally published: Yes
Event: 4th International Conference on Intelligent Data Analysis, IDA 2001 - Cascais, Portugal
Duration: Sep 13 2001 to Sep 15 2001

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2189
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th International Conference on Intelligent Data Analysis, IDA 2001
Country/Territory: Portugal
City: Cascais
Period: 9/13/01 to 9/15/01

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
