Training deep nets with imbalanced and unlabeled data

Jeff Berry, Ian Fasel, Luciano Fadiga, Diana Archangeli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Deep belief networks (DBNs) are normally trained on large data sets. Our goal is to predict traces of the tongue surface in ultrasound images of human speech. Hand-tracing is labor-intensive, and the dataset is highly imbalanced because many images are extremely similar. We propose a bootstrapping method that handles this imbalance by iteratively selecting a small subset of images to be hand-traced (thereby reducing human labor time) and then (re)training the DBN. An entropy-based diversity measure is used for the initial selection. The method achieves over a two-fold reduction in the human time required for tracing, with human-level accuracy.
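The abstract does not define the entropy-based diversity measure, so the following is only a minimal sketch of one possible reading: each image is scored by the Shannon entropy of its intensity histogram, and the highest-scoring images are chosen for the first round of hand-tracing. The function names `intensity_entropy` and `select_initial_subset`, the `bins` parameter, and the assumption that pixel intensities lie in [0, 1] are all hypothetical and are not taken from the paper.

```python
import numpy as np


def intensity_entropy(image, bins=32):
    """Shannon entropy (in bits) of an image's intensity histogram.

    Assumption: pixel intensities are floats in [0, 1].
    """
    counts, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    total = counts.sum()
    if total == 0:
        return 0.0
    # Drop empty bins so that log2 is only taken of positive probabilities.
    probs = counts[counts > 0] / total
    return float(-(probs * np.log2(probs)).sum())


def select_initial_subset(images, k, bins=32):
    """Return the indices of the k images with the highest entropy score.

    This is a stand-in for the paper's diversity measure, not a
    reproduction of it. Ties keep their original order.
    """
    scores = np.array([intensity_entropy(img, bins) for img in images])
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat = np.full((8, 8), 0.5)    # one gray level, zero entropy
    noisy = rng.random((8, 8))     # many gray levels, higher entropy
    print(select_initial_subset([flat, noisy, flat], k=1))
```

In a bootstrapping setup like the one the abstract describes, the returned indices would be the images sent for hand-tracing before the first training pass. How later rounds choose images is not stated in the abstract and is left out of this sketch.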

Original language: English (US)
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 1754-1757
Number of pages: 4
State: Published - Dec 1 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: Sep 9 2012 – Sep 13 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 2

Other

Other: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: 9/9/12 – 9/13/12

Keywords

  • Bootstrapping
  • Class imbalance problem
  • Deep belief networks
  • Speech processing
  • Tongue imaging
  • Ultrasound imaging

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication

Cite this

Berry, J., Fasel, I., Fadiga, L., & Archangeli, D. (2012). Training deep nets with imbalanced and unlabeled data. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (pp. 1754-1757). (13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Vol. 2).