Blending autonomous exploration and apprenticeship learning

Thomas J. Walsh, Daniel Hewlett, Clayton T. Morrison

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

We present theoretical and empirical results for a framework that combines the benefits of apprenticeship and autonomous reinforcement learning. Our approach modifies an existing apprenticeship learning framework that relies on teacher demonstrations and does not necessarily explore the environment. The first change is replacing previously used Mistake Bound model learners with a recently proposed framework that melds the KWIK and Mistake Bound supervised learning protocols. The second change is introducing a communication of expected utility from the student to the teacher. The resulting system only uses teacher traces when the agent needs to learn concepts it cannot efficiently learn on its own.

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
State: Published - 2011
Event: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 - Granada, Spain
Duration: Dec 12 2011 to Dec 14 2011

Other

Other: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
Country: Spain
City: Granada
Period: 12/12/11 to 12/14/11

Fingerprint

Supervised learning
Reinforcement learning
Demonstrations
Students
Communication

ASJC Scopus subject areas

  • Information Systems

Cite this

Walsh, T. J., Hewlett, D., & Morrison, C. T. (2011). Blending autonomous exploration and apprenticeship learning. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011.

Blending autonomous exploration and apprenticeship learning. / Walsh, Thomas J.; Hewlett, Daniel; Morrison, Clayton T.

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011. 2011.

Walsh, TJ, Hewlett, D & Morrison, CT 2011, Blending autonomous exploration and apprenticeship learning. in Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011. 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011, Granada, Spain, 12/12/11.
Walsh TJ, Hewlett D, Morrison CT. Blending autonomous exploration and apprenticeship learning. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011. 2011
Walsh, Thomas J. ; Hewlett, Daniel ; Morrison, Clayton T. / Blending autonomous exploration and apprenticeship learning. Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011. 2011.
@inproceedings{ac351e27f02648f4888fcccea33792c2,
title = "Blending autonomous exploration and apprenticeship learning",
abstract = "We present theoretical and empirical results for a framework that combines the benefits of apprenticeship and autonomous reinforcement learning. Our approach modifies an existing apprenticeship learning framework that relies on teacher demonstrations and does not necessarily explore the environment. The first change is replacing previously used Mistake Bound model learners with a recently proposed framework that melds the KWIK and Mistake Bound supervised learning protocols. The second change is introducing a communication of expected utility from the student to the teacher. The resulting system only uses teacher traces when the agent needs to learn concepts it cannot efficiently learn on its own.",
author = "Walsh, {Thomas J.} and Hewlett, Daniel and Morrison, {Clayton T.}",
year = "2011",
language = "English (US)",
isbn = "9781618395993",
booktitle = "Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011",
}

TY - CONF

T1 - Blending autonomous exploration and apprenticeship learning

AU - Walsh, Thomas J.

AU - Hewlett, Daniel

AU - Morrison, Clayton T

PY - 2011

Y1 - 2011

AB - We present theoretical and empirical results for a framework that combines the benefits of apprenticeship and autonomous reinforcement learning. Our approach modifies an existing apprenticeship learning framework that relies on teacher demonstrations and does not necessarily explore the environment. The first change is replacing previously used Mistake Bound model learners with a recently proposed framework that melds the KWIK and Mistake Bound supervised learning protocols. The second change is introducing a communication of expected utility from the student to the teacher. The resulting system only uses teacher traces when the agent needs to learn concepts it cannot efficiently learn on its own.

UR - http://www.scopus.com/inward/record.url?scp=84860644652&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84860644652&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84860644652

SN - 9781618395993

BT - Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

ER -