Semi-supervised learning in nonstationary environments

Gregory Ditzler, Robi Polikar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Citations (Scopus)

Abstract

Learning in nonstationary environments, also called learning under concept drift, has been receiving increasing attention due to the growing number of applications that generate data with drifting distributions. These applications are usually associated with streaming data, arriving either online or in batches, and concept drift algorithms are trained to detect and track the drifting concepts. While concept drift itself is a significantly more complex problem than the traditional machine learning paradigm of data drawn from a fixed distribution, the problem is further complicated when obtaining labeled data is expensive and training must rely, in part, on unlabeled data. Independently of concept drift research, semi-supervised approaches have been developed for learning from (limited) labeled and (abundant) unlabeled data; however, such approaches have been largely absent from the concept drift literature. In this contribution, we describe an ensemble-of-classifiers-based approach that takes advantage of both labeled and unlabeled data in addressing concept drift: available labeled data are used to generate classifiers, whose voting weights are determined by the distances between Gaussian mixture model components trained on both labeled and unlabeled data in a drifting environment.
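The weighting idea in the abstract — assign each ensemble member a voting weight based on how far its Gaussian mixture components lie from the components fit to the current (partly unlabeled) data — can be sketched as below. The specific choices here (Bhattacharyya distance between single Gaussian components, and an exponential mapping from distance to weight) are illustrative assumptions for this sketch, not necessarily the exact measures used in the paper.

```python
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussian components
    N(mu1, cov1) and N(mu2, cov2)."""
    cov = (cov1 + cov2) / 2.0
    diff = mu1 - mu2
    # Mahalanobis-like term measuring mean separation
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Term measuring covariance mismatch
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

def voting_weights(distances):
    """Map each classifier's distance to the current data distribution
    to a normalized voting weight (smaller distance -> larger weight).
    The exponential decay is one plausible choice, assumed here."""
    w = np.exp(-np.asarray(distances, dtype=float))
    return w / w.sum()

# Example: a classifier whose GMM component matches the current data
# (distance ~0) receives more voting weight than stale classifiers.
mu_now, cov_now = np.zeros(2), np.eye(2)
d_fresh = bhattacharyya_gaussian(mu_now, cov_now, mu_now, cov_now)
d_stale = bhattacharyya_gaussian(np.array([2.0, 0.0]), np.eye(2),
                                 mu_now, cov_now)
weights = voting_weights([d_fresh, d_stale])
```

In an actual drifting-stream setting, `mu_now`/`cov_now` would come from a mixture fit on the newest labeled-plus-unlabeled batch, and one distance would be computed per stored classifier before each weighted vote.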

Original language: English (US)
Title of host publication: 2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program
Pages: 2741-2748
Number of pages: 8
DOI: 10.1109/IJCNN.2011.6033578
State: Published - Oct 24 2011
Externally published: Yes
Event: 2011 International Joint Conference on Neural Networks, IJCNN 2011 - San Jose, CA, United States
Duration: Jul 31 2011 - Aug 5 2011

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2011 International Joint Conference on Neural Networks, IJCNN 2011
Country: United States
City: San Jose, CA
Period: 7/31/11 - 8/5/11

Fingerprint

Supervised learning
Classifiers
Learning systems

Keywords

  • concept drift
  • ensemble systems
  • incremental learning
  • non-stationary environments
  • unlabeled data

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Ditzler, G., & Polikar, R. (2011). Semi-supervised learning in nonstationary environments. In 2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program (pp. 2741-2748). [6033578] (Proceedings of the International Joint Conference on Neural Networks). https://doi.org/10.1109/IJCNN.2011.6033578

@inproceedings{6805065514c4467dade117602c11fe78,
title = "Semi-supervised learning in nonstationary environments",
keywords = "concept drift, ensemble systems, incremental learning, non-stationary environments, unlabeled data",
author = "Gregory Ditzler and Robi Polikar",
year = "2011",
month = "10",
day = "24",
doi = "10.1109/IJCNN.2011.6033578",
language = "English (US)",
isbn = "9781457710865",
series = "Proceedings of the International Joint Conference on Neural Networks",
pages = "2741--2748",
booktitle = "2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program",

}
