A comparison of statistical and rule-induction learners for automatic tagging of time expressions in English

Jordi Poveda, Mihai Surdeanu, Jordi Turmo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

11 Scopus citations

Abstract

Proper recognition and handling of temporal information contained in a text is key to understanding the flow of events depicted in the text and their accompanying circumstances. Consequently, time expression recognition and representation of the time information they convey in a suitable normalized form is an important task relevant to several problems in Natural Language Processing. In particular, such an analysis is largely significant for Information Extraction (IE), Question Answering (QA) and Automatic Summarization (AS). The most common approach to time expression recognition in the past has been the use of handmade extraction rules (grammars), which also served as the basis for normalization. Our aim is to explore the possibilities afforded by applying machine learning techniques to the recognition of time expressions. We focus on recognizing the appearances of time expressions in text (not normalization) and transform the problem into one of chunking, where the aim is to correctly assign Begin, Inside or Outside (BIO) tags to tokens. In this paper, we explain the knowledge representation used and compare the results obtained in our experiments with two different methods, one statistical (support vector machines) and one of rule induction (FOIL). Our empirical analysis shows that SVMs are superior.
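To make the chunking formulation concrete, the following is a minimal sketch of how time expression spans map to Begin, Inside and Outside tags and back. It is illustrative only and is not the authors' implementation: the function names `spans_to_bio` and `bio_to_spans`, the example sentence, and the span offsets are all assumptions introduced here, and the learners compared in the paper (SVMs and FOIL) would predict the tags from token features rather than receive the spans directly.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # token offsets [start, end), end exclusive


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Assign one B/I/O tag per token from annotated time expression spans."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"                  # first token of the expression
        for i in range(start + 1, end):
            tags[i] = "I"                  # remaining tokens of the expression
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover spans from a tag sequence (the inverse mapping)."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:          # a new expression closes the open one
                spans.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
                start = None
        # "I" extends the open span; an "I" with no preceding "B" is ignored
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hypothetical example sentence with two time expressions:
# "next Friday" (tokens 5-6) and "3 pm" (tokens 8-9).
tokens = "The meeting was moved to next Friday at 3 pm .".split()
gold_spans = [(5, 7), (8, 10)]
tags = spans_to_bio(tokens, gold_spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this encoding, recognition reduces to a per-token classification problem, and predicted tag sequences can be converted back to spans with `bio_to_spans` for evaluation against the annotated expressions.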

Original language: English (US)
Title of host publication: Proceedings - 14th International Symposium on Temporal Representation and Reasoning, TIME 2007
Pages: 141-149
Number of pages: 9
State: Published - Dec 1 2007
Externally published: Yes
Event: 14th International Symposium on Temporal Representation and Reasoning, TIME 2007 - Alicante, Spain
Duration: Jun 28 2007 to Jun 30 2007

Publication series

Name: Proceedings of the International Workshop on Temporal Representation and Reasoning

Other

Other: 14th International Symposium on Temporal Representation and Reasoning, TIME 2007
Country: Spain
City: Alicante
Period: 6/28/07 to 6/30/07

ASJC Scopus subject areas

  • Mathematics (all)
