Learning to rank answers to non-factoid questions from web collections

Mihai Surdeanu, Massimiliano Ciaramita, Hugo Zaragoza

Research output: Contribution to journal › Article

93 Citations (Scopus)

Abstract

This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question-answer pairs (from online social Question Answering sites) to extract such features and train ranking models that combine them effectively. We investigate a wide range of feature types, some exploiting natural language processing such as coarse word sense disambiguation, named-entity identification, syntactic parsing, and semantic role labeling. Our experiments demonstrate that linguistic features, in combination, yield considerable improvements in accuracy. Depending on the system settings, we measure relative improvements of 14% to 21% in Mean Reciprocal Rank and Precision@1, providing some of the most compelling evidence to date that complex linguistic features such as word senses and semantic roles can have a significant impact on large-scale information retrieval tasks.
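The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: Mean Reciprocal Rank averages 1/rank of the first correct answer over all questions, and Precision@1 is the fraction of questions whose top-ranked answer is correct. The function names and the toy rank lists below are hypothetical and are not taken from the paper; they only show how a relative improvement such as the reported 14% to 21% would be computed.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """ranks[i] is the 1-based rank of the first correct answer for
    question i, or None if no correct answer was retrieved."""
    if not ranks:
        raise ValueError("need at least one question")
    # A question with no correct answer contributes 0 to the sum.
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def precision_at_1(ranks: Sequence[Optional[int]]) -> float:
    """Fraction of questions whose top-ranked answer is correct."""
    if not ranks:
        raise ValueError("need at least one question")
    return sum(1 for r in ranks if r == 1) / len(ranks)


def relative_improvement(baseline: float, system: float) -> float:
    """Relative gain of `system` over `baseline`, e.g. 0.14 means 14%."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (system - baseline) / baseline


# Hypothetical toy data (NOT from the paper): rank of the first
# correct answer for four questions under two rankers.
baseline_ranks = [1, 3, None, 2]
system_ranks = [1, 2, 4, 1]

if __name__ == "__main__":
    for name, metric in (("MRR", mean_reciprocal_rank),
                         ("P@1", precision_at_1)):
        b, s = metric(baseline_ranks), metric(system_ranks)
        print(f"{name}: baseline={b:.3f} system={s:.3f} "
              f"relative gain={relative_improvement(b, s):.1%}")
```

Reporting the gain relative to the baseline, rather than as an absolute difference, is what makes figures such as "14% to 21%" comparable across system settings with different baseline scores.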

Original language: English (US)
Pages (from-to): 351-383
Number of pages: 33
Journal: Computational Linguistics
Volume: 37
Issue number: 2
DOI: 10.1162/COLI_a_00051
State: Published - Jun 2011
Externally published: Yes

Fingerprint

Linguistics
Ranking
Semantics
Syntactics
Information retrieval
Labeling
Learning
Experiments
Processing
Language
Evidence
Linguistic Features
Word Sense
World Wide Web
Semantic Roles

ASJC Scopus subject areas

  • Computer Science Applications
  • Artificial Intelligence
  • Linguistics and Language
  • Language and Linguistics

Cite this

Surdeanu, Mihai; Ciaramita, Massimiliano; Zaragoza, Hugo. Learning to rank answers to non-factoid questions from web collections. In: Computational Linguistics, Vol. 37, No. 2, 06.2011, p. 351-383.

@article{cdd978c8c19e475583cade810f5ec0a8,
title = "Learning to rank answers to non-factoid questions from web collections",
abstract = "This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question-answer pairs (from online social Question Answering sites) to extract such features and train ranking models that combine them effectively. We investigate a wide range of feature types, some exploiting natural language processing such as coarse word sense disambiguation, named-entity identification, syntactic parsing, and semantic role labeling. Our experiments demonstrate that linguistic features, in combination, yield considerable improvements in accuracy. Depending on the system settings, we measure relative improvements of 14{\%} to 21{\%} in Mean Reciprocal Rank and Precision@1, providing some of the most compelling evidence to date that complex linguistic features such as word senses and semantic roles can have a significant impact on large-scale information retrieval tasks.",
author = "Mihai Surdeanu and Massimiliano Ciaramita and Hugo Zaragoza",
year = "2011",
month = "6",
doi = "10.1162/COLI_a_00051",
language = "English (US)",
volume = "37",
pages = "351--383",
journal = "Computational Linguistics",
issn = "0891-2017",
publisher = "MIT Press Journals",
number = "2",

}

TY  - JOUR
T1  - Learning to rank answers to non-factoid questions from web collections
AU  - Surdeanu, Mihai
AU  - Ciaramita, Massimiliano
AU  - Zaragoza, Hugo
PY  - 2011/6
Y1  - 2011/6
AB  - This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question-answer pairs (from online social Question Answering sites) to extract such features and train ranking models that combine them effectively. We investigate a wide range of feature types, some exploiting natural language processing such as coarse word sense disambiguation, named-entity identification, syntactic parsing, and semantic role labeling. Our experiments demonstrate that linguistic features, in combination, yield considerable improvements in accuracy. Depending on the system settings, we measure relative improvements of 14% to 21% in Mean Reciprocal Rank and Precision@1, providing some of the most compelling evidence to date that complex linguistic features such as word senses and semantic roles can have a significant impact on large-scale information retrieval tasks.
UR  - http://www.scopus.com/inward/record.url?scp=79958717927&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79958717927&partnerID=8YFLogxK
U2  - 10.1162/COLI_a_00051
DO  - 10.1162/COLI_a_00051
M3  - Article
AN  - SCOPUS:79958717927
VL  - 37
SP  - 351
EP  - 383
JO  - Computational Linguistics
JF  - Computational Linguistics
SN  - 0891-2017
IS  - 2
ER  -