Computational models of reinforcement learning: The role of dopamine as a reward signal

R. D. Samson, M. J. Frank, Jean-Marc Fellous

Research output: Contribution to journal › Article

36 Citations (Scopus)

Abstract

Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the necessary robustness, adaptability and processing speed required for evolutionary and behavioral success.

Original language: English (US)
Pages (from-to): 91-105
Number of pages: 15
Journal: Cognitive Neurodynamics
Volume: 4
Issue number: 2
DOI: 10.1007/s11571-010-9109-x
State: Published - Jun 2010

Fingerprint

Reward
Dopamine
Learning
Punishment
Reinforcement (Psychology)
Efficiency
Brain

Keywords

  • Dopamine
  • Reinforcement learning
  • Reward
  • Temporal difference

ASJC Scopus subject areas

  • Cognitive Neuroscience

Cite this

Computational models of reinforcement learning: The role of dopamine as a reward signal. / Samson, R. D.; Frank, M. J.; Fellous, Jean-Marc.

In: Cognitive Neurodynamics, Vol. 4, No. 2, 06.2010, p. 91-105.

@article{35b2f99ec4e645c1a7315594b5282b4c,
title = "Computational models of reinforcement learning: The role of dopamine as a reward signal",
abstract = "Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the necessary robustness, adaptability and processing speed required for evolutionary and behavioral success.",
keywords = "Dopamine, Reinforcement learning, Reward, Temporal difference",
author = "Samson, {R. D.} and Frank, {M. J.} and Fellous, Jean-Marc",
year = "2010",
month = "6",
doi = "10.1007/s11571-010-9109-x",
language = "English (US)",
volume = "4",
pages = "91--105",
journal = "Cognitive Neurodynamics",
issn = "1871-4080",
publisher = "Springer Netherlands",
number = "2"
}

TY - JOUR

T1 - Computational models of reinforcement learning

T2 - The role of dopamine as a reward signal

AU - Samson, R. D.

AU - Frank, M. J.

AU - Fellous, Jean-Marc

PY - 2010/6

Y1 - 2010/6

N2 - Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the necessary robustness, adaptability and processing speed required for evolutionary and behavioral success.

AB - Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the necessary robustness, adaptability and processing speed required for evolutionary and behavioral success.

KW - Dopamine

KW - Reinforcement learning

KW - Reward

KW - Temporal difference

UR - http://www.scopus.com/inward/record.url?scp=77954656517&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954656517&partnerID=8YFLogxK

U2 - 10.1007/s11571-010-9109-x

DO - 10.1007/s11571-010-9109-x

M3 - Article

C2 - 21629583

AN - SCOPUS:77954656517

VL - 4

SP - 91

EP - 105

JO - Cognitive Neurodynamics

JF - Cognitive Neurodynamics

SN - 1871-4080

IS - 2

ER -