Multi-hop inference for sentence-level textgraphs: How challenging is meaningfully combining information for science question answering?

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Question Answering for complex questions is often modelled as a graph construction or traversal task, where a solver must build or traverse a graph of facts that answer and explain a given question. This "multi-hop" inference has been shown to be extremely challenging, with few models able to aggregate more than two facts before being overwhelmed by "semantic drift", or the tendency for long chains of facts to quickly drift off topic. This is a major barrier to current inference models, as even elementary science questions require an average of 4 to 6 facts to answer and explain. In this work we empirically characterize the difficulty of building or traversing a graph of sentences connected by lexical overlap, by evaluating chance sentence aggregation quality through 9,784 manually-annotated judgements across knowledge graphs built from three freetext corpora (including study guides and SimpleWikipedia). We demonstrate that semantic drift tends to be high and aggregation quality low, between 0.04% and 3%, and highlight scenarios that maximize the likelihood of meaningfully combining information.
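
To make "a graph of sentences connected by lexical overlap" concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simple regex tokenizer, a small hypothetical stop-word list, no lemmatization, and four invented example sentences; the function names `content_words` and `build_overlap_graph` are illustrative only.

```python
import re
from itertools import combinations

# Hypothetical stop-word list (an assumption; the abstract does not specify any filtering).
STOP_WORDS = {"a", "an", "the", "of", "to", "is", "are", "in",
              "and", "or", "that", "for", "by", "it", "its"}

def content_words(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {tok for tok in tokens if tok not in STOP_WORDS}

def build_overlap_graph(sentences):
    """Return {(i, j): shared_words} for each sentence pair sharing a content word."""
    words = [content_words(s) for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        shared = words[i] & words[j]
        if shared:
            edges[(i, j)] = shared
    return edges

# Invented example sentences, used only for illustration.
sentences = [
    "A plant requires sunlight for photosynthesis.",
    "Sunlight is a kind of energy.",
    "A bee is a kind of insect.",
    "An insect can pollinate a plant.",
]
graph = build_overlap_graph(sentences)
for (i, j), shared in sorted(graph.items()):
    print(i, j, sorted(shared))
```

In this toy graph, sentences 1 and 2 are linked only by the word "kind", although one is about energy and the other about insects. A hop across such an edge is the sort of off-topic connection that the abstract calls semantic drift.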

Original language: English (US)
Title of host publication: NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies - Proceedings of the Student Research Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 12-17
Number of pages: 6
ISBN (Electronic): 9781948087261
State: Published - Jan 1 2018
Event: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - Student Research Workshop, SRW 2018 - New Orleans, United States
Duration: Jun 2 2018 to Jun 4 2018

Publication series

Name: NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Student Research Workshop

Conference

Conference: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - Student Research Workshop, SRW 2018
Country: United States
City: New Orleans
Period: 6/2/18 to 6/4/18

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications

Cite this

Jansen, P. A. (2018). Multi-hop inference for sentence-level textgraphs: How challenging is meaningfully combining information for science question answering? In NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Student Research Workshop (pp. 12-17). (NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Student Research Workshop). Association for Computational Linguistics (ACL).