Natural language grammatical inference with recurrent neural networks

Steve Lawrence, C. Lee Giles, Sandiway Fong

Research output: Contribution to journal › Article

79 Citations (Scopus)

Abstract

This paper examines the inductive inference of a complex grammar with neural networks - specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.
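The setup the abstract describes can be made concrete with a short sketch. The following is a minimal illustration, not the authors' implementation: a simple Elman-style recurrent network reads a sentence one token at a time and emits a single grammatical/ungrammatical logit, trained by plain gradient descent with backpropagation through time (BPTT). It is written in Python with PyTorch; the vocabulary size, layer widths, and toy data are all illustrative assumptions.

import torch
import torch.nn as nn

class GrammaticalityRNN(nn.Module):
    """Reads token ids step by step; the final hidden state feeds one logit."""
    def __init__(self, vocab_size, embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # A simple Elman-style recurrence; the paper compares several common
        # recurrent architectures, which this sketch does not reproduce.
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, 1)

    def forward(self, tokens):
        _, h_final = self.rnn(self.embed(tokens))   # h_final: (1, batch, hidden)
        return self.readout(h_final.squeeze(0)).squeeze(-1)

# Toy data: two "sentences" as integer token ids, labeled grammatical (1.0)
# and ungrammatical (0.0). The paper instead trains on sharply grammatical
# vs. ungrammatical natural language sentences.
sentences = torch.tensor([[1, 4, 2, 7], [1, 7, 7, 2]])
labels = torch.tensor([1.0, 0.0])

model = GrammaticalityRNN(vocab_size=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # plain gradient descent
loss_fn = nn.BCEWithLogitsLoss()

for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(sentences), labels)
    loss.backward()    # gradients flow through the unrolled time steps: BPTT
    optimizer.step()

The abstract's final point, rule extraction in the form of deterministic finite state automata, can be sketched in the same spirit: quantize the network's continuous hidden states into a small set of discrete states and record the state-to-state transition caused by each input token. The per-unit sign quantization below is an illustrative assumption, not the paper's exact extraction procedure.

def extract_dfa(model, sentences):
    """Build a transition table (state, token) -> next_state from hidden states."""
    with torch.no_grad():
        # outputs: (batch, time, hidden) -- the hidden state after every token
        outputs, _ = model.rnn(model.embed(sentences))
    transitions = {}
    for b in range(sentences.shape[0]):
        state = "start"
        for t in range(sentences.shape[1]):
            # Quantize the continuous hidden vector to a discrete state label.
            next_state = tuple((outputs[b, t] > 0).int().tolist())
            transitions[(state, int(sentences[b, t]))] = next_state
            state = next_state
    return transitions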

Original language: English (US)
Pages (from-to): 126-140
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 12
Issue number: 1
DOI: 10.1109/69.842255
State: Published - Jan 2000
Externally published: Yes

Fingerprint

  • Recurrent neural networks
  • Linguistics
  • Neural networks
  • Finite automata
  • Network architecture
  • Backpropagation
  • Learning algorithms

Keywords

  • Automata extraction
  • Government-and-binding theory
  • Gradient descent
  • Grammatical inference
  • Natural language processing
  • Principles-and-parameters framework
  • Recurrent neural networks
  • Simulated annealing

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Information Systems

Cite this

Natural language grammatical inference with recurrent neural networks. / Lawrence, Steve; Giles, C. Lee; Fong, Sandiway.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 1, 01.2000, p. 126-140.

Research output: Contribution to journal › Article

@article{ef212942cc9c4cb0a2ea9eb52e96a0eb,
title = "Natural language grammatical inference with recurrent neural networks",
abstract = "This paper examines the inductive inference of a complex grammar with neural networks - specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.",
keywords = "Automata extraction, Government-and-binding theory, Gradient descent, Grammatical inference, Natural language processing, Principles-and-parameters framework, Recurrent neural networks, Simulated annealing",
author = "Lawrence, Steve and Giles, {C. Lee} and Fong, Sandiway",
year = "2000",
month = jan,
doi = "10.1109/69.842255",
language = "English (US)",
volume = "12",
pages = "126--140",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "1",

}

TY  - JOUR
T1  - Natural language grammatical inference with recurrent neural networks
AU  - Lawrence, Steve
AU  - Giles, C. Lee
AU  - Fong, Sandiway
PY  - 2000/1
Y1  - 2000/1
N2  - This paper examines the inductive inference of a complex grammar with neural networks - specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.
AB  - This paper examines the inductive inference of a complex grammar with neural networks - specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.
KW  - Automata extraction
KW  - Government-and-binding theory
KW  - Gradient descent
KW  - Grammatical inference
KW  - Natural language processing
KW  - Principles-and-parameters framework
KW  - Recurrent neural networks
KW  - Simulated annealing
UR  - http://www.scopus.com/inward/record.url?scp=33747598711&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33747598711&partnerID=8YFLogxK
U2  - 10.1109/69.842255
DO  - 10.1109/69.842255
M3  - Article
AN  - SCOPUS:33747598711
VL  - 12
SP  - 126
EP  - 140
JO  - IEEE Transactions on Knowledge and Data Engineering
JF  - IEEE Transactions on Knowledge and Data Engineering
SN  - 1041-4347
IS  - 1
ER  -