A multi-pass sieve for coreference resolution

Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, Christopher Manning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

202 Scopus citations

Abstract

Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier's entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
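The architecture described in the abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation: the names `Mention`, `Cluster`, `compatible`, `run_sieve`, `exact_match`, and `pronoun_match` are hypothetical, the two passes are simplified stand-ins for the paper's tiers, and the attribute values are hand-annotated for the example. It shows passes applied one at a time from highest to lowest precision, each pass working on the clusters left by the previous one, and attributes shared across all mentions in a cluster.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mention:
    idx: int                      # position in document order
    text: str
    gender: Optional[str] = None  # None means "unknown"
    number: Optional[str] = None


@dataclass
class Cluster:
    mentions: list

    def attr(self, name):
        """Known (non-None) values of an attribute, pooled over the cluster."""
        return {getattr(m, name) for m in self.mentions} - {None}


def compatible(a, b):
    """Clusters agree unless both know an attribute and share no value."""
    for name in ("gender", "number"):
        va, vb = a.attr(name), b.attr(name)
        if va and vb and not (va & vb):
            return False
    return True


# Hypothetical passes: each decides whether mention m may join antecedent ante.
def exact_match(m, ante, m_cluster, ante_cluster):
    return m.text.lower() == ante.text.lower()


PRONOUNS = {"he", "she", "it", "they"}


def pronoun_match(m, ante, m_cluster, ante_cluster):
    return m.text.lower() in PRONOUNS and compatible(m_cluster, ante_cluster)


def run_sieve(mentions, passes):
    """Apply passes in the given order (highest precision first)."""
    cluster_of = {m.idx: Cluster([m]) for m in mentions}
    for sieve_pass in passes:
        for i, m in enumerate(mentions):
            for ante in reversed(mentions[:i]):       # nearest antecedent first
                a, b = cluster_of[ante.idx], cluster_of[m.idx]
                if a is b:
                    continue                          # already coreferent
                if sieve_pass(m, ante, b, a):
                    merged = Cluster(a.mentions + b.mentions)
                    for x in merged.mentions:
                        cluster_of[x.idx] = merged    # later passes see the merge
                    break
    unique = {id(c): c for c in cluster_of.values()}
    return [sorted(x.idx for x in c.mentions) for c in unique.values()]


# Toy document: the second "Lee" has no gender annotation of its own.
mentions = [
    Mention(0, "Lee", gender="female", number="singular"),
    Mention(1, "Obama", gender="male", number="singular"),
    Mention(2, "Lee"),
    Mention(3, "he", gender="male", number="singular"),
]

clusters = run_sieve(mentions, [exact_match, pronoun_match])
pronoun_only = run_sieve(mentions, [pronoun_match])
print(clusters)
print(pronoun_only)
```

In this toy run, the exact-match pass first merges the two "Lee" mentions, so the merged cluster carries the gender known from the first one. The later pronoun pass therefore rejects that cluster as an antecedent for "he" and links the pronoun to "Obama" instead. With the pronoun pass alone, "he" attaches to the nearest mention with unknown gender, which is the kind of error that ordering by precision and sharing attributes is meant to prevent.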

Original language: English (US)
Title of host publication: EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Pages: 492-501
Number of pages: 10
State: Published - Dec 1 2010
Externally published: Yes
Event: Conference on Empirical Methods in Natural Language Processing, EMNLP 2010 - Cambridge, MA, United States
Duration: Oct 9 2010 to Oct 11 2010

Publication series

Name: EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

Other

Other: Conference on Empirical Methods in Natural Language Processing, EMNLP 2010
Country: United States
City: Cambridge, MA
Period: 10/9/10 to 10/11/10

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

  • Cite this

    Raghunathan, K., Lee, H., Rangarajan, S., Chambers, N., Surdeanu, M., Jurafsky, D., & Manning, C. (2010). A multi-pass sieve for coreference resolution. In EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 492-501). (EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference).