Creating causal embeddings for question answering with minimal supervision

Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Peter Clark, Michael Hammond

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

A common model for question answering (QA) is that a good answer is one that is closely related to the question, where relatedness is often determined using general-purpose lexical models such as word embeddings. We argue that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined through task-specific embeddings. With causality as a use case, we implement this insight in three steps. First, we generate causal embeddings cost-effectively by bootstrapping cause-effect pairs extracted from free text using a small set of seed patterns. Second, we train dedicated embeddings over this data, by using task-specific contexts, i.e., the context of a cause is its effect. Finally, we extend a state-of-the-art reranking approach for QA to incorporate these causal embeddings. We evaluate the causal embedding models both directly with a causal implication task, and indirectly, in a downstream causal QA task using data from Yahoo! Answers. We show that explicitly modeling causality improves performance in both tasks. In the QA task our best model achieves 37.3% P@1, significantly outperforming a strong baseline by 7.7% (relative).
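The abstract describes a concrete three-step pipeline, which a short sketch can make more tangible. Below is a minimal, self-contained Python illustration of the first two steps as the abstract describes them: bootstrapping cause-effect pairs from free text with a few seed patterns, then training skip-gram-style embeddings with negative sampling in which the context of a cause word is each word of its effect. The seed patterns, hyperparameters, and the causal_similarity reranking feature are hypothetical stand-ins for illustration; they are not the authors' implementation.

```python
import re
import numpy as np

# Hypothetical seed patterns standing in for the paper's small pattern set.
# The passive pattern comes first so "X is caused by Y" is not misparsed.
SEED_PATTERNS = [
    re.compile(r"(?P<effect>[\w ]+?) (?:is|are|was|were) caused by (?P<cause>[\w ]+)", re.I),
    re.compile(r"(?P<cause>[\w ]+?) causes? (?P<effect>[\w ]+)", re.I),
    re.compile(r"(?P<cause>[\w ]+?) leads? to (?P<effect>[\w ]+)", re.I),
]

def extract_pairs(sentences):
    """Bootstrap (cause, effect) text pairs from free text using seed patterns."""
    pairs = []
    for sentence in sentences:
        for pattern in SEED_PATTERNS:
            match = pattern.search(sentence)
            if match:
                pairs.append((match.group("cause").strip().lower(),
                              match.group("effect").strip().lower()))
                break  # take the first matching pattern per sentence
    return pairs

def train_causal_embeddings(pairs, dim=50, epochs=20, lr=0.05, negatives=5, seed=0):
    """Skip-gram-style training with negative sampling, except that the
    'context' of each cause word is every word of its effect -- a
    directional, task-specific notion of context."""
    rng = np.random.default_rng(seed)
    c_idx = {w: i for i, w in enumerate(sorted({w for c, _ in pairs for w in c.split()}))}
    e_idx = {w: i for i, w in enumerate(sorted({w for _, e in pairs for w in e.split()}))}
    C = rng.normal(scale=0.1, size=(len(c_idx), dim))  # cause (input) vectors
    E = rng.normal(scale=0.1, size=(len(e_idx), dim))  # effect (output) vectors
    examples = [(c_idx[cw], e_idx[ew])
                for cause, effect in pairs
                for cw in cause.split() for ew in effect.split()]
    for _ in range(epochs):
        rng.shuffle(examples)
        for ci, ei in examples:
            # One positive target plus `negatives` randomly sampled effects.
            targets = [(ei, 1.0)] + [(int(rng.integers(len(e_idx))), 0.0)
                                     for _ in range(negatives)]
            for ti, label in targets:
                score = 1.0 / (1.0 + np.exp(-C[ci] @ E[ti]))  # sigmoid
                step = lr * (score - label)
                c_old = C[ci].copy()
                C[ci] -= step * E[ti]
                E[ti] -= step * c_old
    return c_idx, C, e_idx, E

def causal_similarity(question_words, answer_words, c_idx, C, e_idx, E):
    """Hypothetical reranking feature: mean cause->effect dot product
    between question words (as causes) and answer words (as effects)."""
    scores = [C[c_idx[q]] @ E[e_idx[a]]
              for q in question_words if q in c_idx
              for a in answer_words if a in e_idx]
    return float(np.mean(scores)) if scores else 0.0

# Toy usage: bootstrap pairs, train, then score a causal question-answer pair.
corpus = ["smoking causes cancer", "deforestation leads to erosion",
          "flooding is caused by heavy rain"]
pairs = extract_pairs(corpus)
c_idx, C, e_idx, E = train_causal_embeddings(pairs)
print(causal_similarity(["smoking"], ["cancer"], c_idx, C, e_idx, E))
```

The design point the sketch tries to capture is directionality: cause and effect words live in separate vector spaces, so the resulting similarity is asymmetric, mirroring the asymmetry of causal questions in a way that general-purpose, symmetric word embeddings cannot express.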

Original language: English (US)
Title of host publication: EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 138-148
Number of pages: 11
ISBN (Electronic): 9781945626258
State: Published - Jan 1 2016
Event: 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016 - Austin, United States
Duration: Nov 1 2016 - Nov 5 2016

Publication series

Name: EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016
Country: United States
City: Austin
Period: 11/1/16 - 11/5/16

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics

Cite this

Sharp, R., Surdeanu, M., Jansen, P., Clark, P., & Hammond, M. (2016). Creating causal embeddings for question answering with minimal supervision. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 138-148). (EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings). Association for Computational Linguistics (ACL).
