Sanity check: A strong alignment and information retrieval baseline for question answering

Vikas Yadav, Rebecca Sharp, Mihai Surdeanu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

While increasingly complex approaches to question answering (QA) have been proposed, the true gain of these systems, particularly with respect to their expensive training requirements, can be inflated when they are not compared to adequate baselines. Here we propose an unsupervised, simple, and fast alignment and information retrieval baseline that incorporates two novel contributions: a one-to-many alignment between query and document terms and negative alignment as a proxy for discriminative information. Our approach not only outperforms all conventional baselines as well as many supervised recurrent neural networks, but also approaches the state of the art for supervised systems on three QA datasets. With only three hyperparameters, we achieve 47% P@1 on an 8th grade Science QA dataset, 32.9% P@1 on a Yahoo! answers QA dataset and 64% MAP on WikiQA.
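
The abstract reports results as P@1 (precision at rank 1) and MAP (mean average precision). As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed for answer reranking. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and it assumes each question comes with a ranked list of 0/1 relevance labels, best candidate first. Evaluation scripts differ in how they treat questions with no correct candidate; this sketch simply scores them as 0.

```python
from typing import List


def precision_at_1(ranked_labels: List[List[int]]) -> float:
    """Fraction of questions whose top-ranked candidate is correct (label 1)."""
    if not ranked_labels:
        return 0.0
    hits = sum(1 for labels in ranked_labels if labels and labels[0] == 1)
    return hits / len(ranked_labels)


def average_precision(labels: List[int]) -> float:
    """Average of precision@k over the ranks k that hold a correct answer."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    # Assumption: a question with no correct candidate contributes 0.
    return total / hits if hits else 0.0


def mean_average_precision(ranked_labels: List[List[int]]) -> float:
    """Mean of the per-question average precision."""
    if not ranked_labels:
        return 0.0
    return sum(average_precision(labels) for labels in ranked_labels) / len(ranked_labels)


# Hypothetical toy data: three questions, candidates already ranked by a system.
toy_rankings = [
    [1, 0, 0],     # correct answer ranked first
    [0, 1, 0],     # correct answer ranked second
    [0, 0, 1, 1],  # two correct answers, ranked third and fourth
]

if __name__ == "__main__":
    print(f"P@1 = {precision_at_1(toy_rankings):.3f}")
    print(f"MAP = {mean_average_precision(toy_rankings):.3f}")
```

P@1 rewards only the top-ranked candidate, so it suits datasets with a single correct answer per question, while MAP also credits correct answers placed lower in the ranking.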

Original language: English (US)
Title of host publication: 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
Publisher: Association for Computing Machinery, Inc
Pages: 1217-1220
Number of pages: 4
ISBN (Electronic): 9781450356572
DOIs: https://doi.org/10.1145/3209978.3210142
State: Published - Jun 27 2018
Event: 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018 - Ann Arbor, United States
Duration: Jul 8 2018 to Jul 12 2018

Other

Other: 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
Country: United States
City: Ann Arbor
Period: 7/8/18 to 7/12/18

Fingerprint

Information retrieval
Recurrent neural networks

Keywords

  • Answer reranking
  • Information retrieval
  • Question answering
  • Unsupervised system

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Yadav, V., Sharp, R., & Surdeanu, M. (2018). Sanity check: A strong alignment and information retrieval baseline for question answering. In 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018 (pp. 1217-1220). Association for Computing Machinery, Inc. https://doi.org/10.1145/3209978.3210142

Sanity check: A strong alignment and information retrieval baseline for question answering. / Yadav, Vikas; Sharp, Rebecca; Surdeanu, Mihai.

41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018. Association for Computing Machinery, Inc, 2018. p. 1217-1220.

Yadav, V, Sharp, R & Surdeanu, M 2018, Sanity check: A strong alignment and information retrieval baseline for question answering. in 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018. Association for Computing Machinery, Inc, pp. 1217-1220, 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018, Ann Arbor, United States, 7/8/18. https://doi.org/10.1145/3209978.3210142

Yadav V, Sharp R, Surdeanu M. Sanity check: A strong alignment and information retrieval baseline for question answering. In 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018. Association for Computing Machinery, Inc. 2018. p. 1217-1220 https://doi.org/10.1145/3209978.3210142

Yadav, Vikas; Sharp, Rebecca; Surdeanu, Mihai. / Sanity check: A strong alignment and information retrieval baseline for question answering. 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018. Association for Computing Machinery, Inc, 2018. pp. 1217-1220

@inproceedings{5030ae8e026646bab4e5cf70e0198ee1,
title = "Sanity check: A strong alignment and information retrieval baseline for question answering",
abstract = "While increasingly complex approaches to question answering (QA) have been proposed, the true gain of these systems, particularly with respect to their expensive training requirements, can be inflated when they are not compared to adequate baselines. Here we propose an unsupervised, simple, and fast alignment and information retrieval baseline that incorporates two novel contributions: a one-to-many alignment between query and document terms and negative alignment as a proxy for discriminative information. Our approach not only outperforms all conventional baselines as well as many supervised recurrent neural networks, but also approaches the state of the art for supervised systems on three QA datasets. With only three hyperparameters, we achieve 47{\%} P@1 on an 8th grade Science QA dataset, 32.9{\%} P@1 on a Yahoo! answers QA dataset and 64{\%} MAP on WikiQA.",
keywords = "Answer reranking, Information retrieval, Question answering, Unsupervised system",
author = "Vikas Yadav and Rebecca Sharp and Mihai Surdeanu",
year = "2018",
month = "6",
day = "27",
doi = "10.1145/3209978.3210142",
language = "English (US)",
pages = "1217--1220",
booktitle = "41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018",
publisher = "Association for Computing Machinery, Inc",

}

TY  - GEN
T1  - Sanity check
T2  - A strong alignment and information retrieval baseline for question answering
AU  - Yadav, Vikas
AU  - Sharp, Rebecca
AU  - Surdeanu, Mihai
PY  - 2018/6/27
Y1  - 2018/6/27
N2  - While increasingly complex approaches to question answering (QA) have been proposed, the true gain of these systems, particularly with respect to their expensive training requirements, can be inflated when they are not compared to adequate baselines. Here we propose an unsupervised, simple, and fast alignment and information retrieval baseline that incorporates two novel contributions: a one-to-many alignment between query and document terms and negative alignment as a proxy for discriminative information. Our approach not only outperforms all conventional baselines as well as many supervised recurrent neural networks, but also approaches the state of the art for supervised systems on three QA datasets. With only three hyperparameters, we achieve 47% P@1 on an 8th grade Science QA dataset, 32.9% P@1 on a Yahoo! answers QA dataset and 64% MAP on WikiQA.
AB  - While increasingly complex approaches to question answering (QA) have been proposed, the true gain of these systems, particularly with respect to their expensive training requirements, can be inflated when they are not compared to adequate baselines. Here we propose an unsupervised, simple, and fast alignment and information retrieval baseline that incorporates two novel contributions: a one-to-many alignment between query and document terms and negative alignment as a proxy for discriminative information. Our approach not only outperforms all conventional baselines as well as many supervised recurrent neural networks, but also approaches the state of the art for supervised systems on three QA datasets. With only three hyperparameters, we achieve 47% P@1 on an 8th grade Science QA dataset, 32.9% P@1 on a Yahoo! answers QA dataset and 64% MAP on WikiQA.
KW  - Answer reranking
KW  - Information retrieval
KW  - Question answering
KW  - Unsupervised system
UR  - http://www.scopus.com/inward/record.url?scp=85051472921&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85051472921&partnerID=8YFLogxK
U2  - 10.1145/3209978.3210142
DO  - 10.1145/3209978.3210142
M3  - Conference contribution
SP  - 1217
EP  - 1220
BT  - 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
PB  - Association for Computing Machinery, Inc
ER  -