Training deep nets with imbalanced and unlabeled data

Jeff Berry, Ian Fasel, Luciano Fadiga, Diana B Archangeli

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

Training deep belief networks (DBNs) is normally done with large data sets. Our goal is to predict traces of the surface of the tongue in ultrasound images of human speech. Hand-tracing is labor-intensive; the dataset is highly imbalanced since many images are extremely similar. We propose a bootstrapping method which handles this imbalance by iteratively selecting a small subset of images to be hand-traced (thereby reducing human labor time), then (re)training the DBN, making use of an entropy-based diversity measure for the initial selection, thereby achieving over a two-fold reduction in human time required for tracing with human-level accuracy.

Original language: English (US)
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 1754-1757
Number of pages: 4
Volume: 2
State: Published - 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: Sep 9 2012 to Sep 13 2012

Other

Other: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: 9/9/12 to 9/13/12

Fingerprint

Bayesian networks
Personnel
labor
entropy
Ultrasonics
time

Keywords

  • Bootstrapping
  • Class imbalance problem
  • Deep belief networks
  • Speech processing
  • Tongue imaging
  • Ultrasound imaging

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication

Cite this

Berry, J., Fasel, I., Fadiga, L., & Archangeli, D. B. (2012). Training deep nets with imbalanced and unlabeled data. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (Vol. 2, pp. 1754-1757)

@inproceedings{1e1d1d073d4843c886b2f5a2b2c185df,
title = "Training deep nets with imbalanced and unlabeled data",
abstract = "Training deep belief networks (DBNs) is normally done with large data sets. Our goal is to predict traces of the surface of the tongue in ultrasound images of human speech. Hand-tracing is labor-intensive; the dataset is highly imbalanced since many images are extremely similar. We propose a bootstrapping method which handles this imbalance by iteratively selecting a small subset of images to be hand-traced (thereby reducing human labor time), then (re)training the DBN, making use of an entropy-based diversity measure for the initial selection, thereby achieving over a two-fold reduction in human time required for tracing with human-level accuracy.",
keywords = "Bootstrapping, Class imbalance problem, Deep belief networks, Speech processing, Tongue imaging, Ultrasound imaging",
author = "Jeff Berry and Ian Fasel and Luciano Fadiga and Archangeli, {Diana B}",
year = "2012",
language = "English (US)",
isbn = "9781622767595",
volume = "2",
pages = "1754--1757",
booktitle = "13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012",
}

TY  - GEN
T1  - Training deep nets with imbalanced and unlabeled data
AU  - Berry, Jeff
AU  - Fasel, Ian
AU  - Fadiga, Luciano
AU  - Archangeli, Diana B
PY  - 2012
Y1  - 2012
AB  - Training deep belief networks (DBNs) is normally done with large data sets. Our goal is to predict traces of the surface of the tongue in ultrasound images of human speech. Hand-tracing is labor-intensive; the dataset is highly imbalanced since many images are extremely similar. We propose a bootstrapping method which handles this imbalance by iteratively selecting a small subset of images to be hand-traced (thereby reducing human labor time), then (re)training the DBN, making use of an entropy-based diversity measure for the initial selection, thereby achieving over a two-fold reduction in human time required for tracing with human-level accuracy.
KW  - Bootstrapping
KW  - Class imbalance problem
KW  - Deep belief networks
KW  - Speech processing
KW  - Tongue imaging
KW  - Ultrasound imaging
UR  - http://www.scopus.com/inward/record.url?scp=84878398857&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84878398857&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 9781622767595
VL  - 2
SP  - 1754
EP  - 1757
BT  - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
ER  -