ACTIV-ES: A comparable, cross-dialect corpus of 'everyday' Spanish from Argentina, Mexico, and Spain

Jerid Francom, Mans Hulden, Adam P Ussishkin, Julieta Fumagalli, Mikel Santesteban, Julio Serrano

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

2 Citations (Scopus)

Abstract

Corpus resources for Spanish have proved invaluable for a number of applications in a wide variety of fields. However, a majority of resources are based on formal, written language and/or are not built to model language variation between varieties of the Spanish language, despite the fact that most language in 'everyday' use is informal/dialogue-based and shows rich regional variation. This paper outlines the development and evaluation of the ACTIV-ES corpus, a first step toward producing a comparable, cross-dialect corpus representative of the 'everyday' language of various regions of the Spanish-speaking world.

Original language: English (US)
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Publisher: European Language Resources Association (ELRA)
Pages: 1733-1737
Number of pages: 5
ISBN (Electronic): 9782951740884
State: Published - Jan 1 2014
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: May 26 2014 to May 31 2014

Other

Other: 9th International Conference on Language Resources and Evaluation, LREC 2014
Country: Iceland
City: Reykjavik
Period: 5/26/14 to 5/31/14

Fingerprint

dialect
Argentina
Mexico
Spain
colloquial
Spanish language
written language
regional difference
language
resources
speaking
dialogue
evaluation

Keywords

  • Corpora
  • Dialects
  • Spanish

ASJC Scopus subject areas

  • Linguistics and Language
  • Library and Information Sciences
  • Education
  • Language and Linguistics

Cite this

Francom, J., Hulden, M., Ussishkin, A. P., Fumagalli, J., Santesteban, M., & Serrano, J. (2014). ACTIV-ES: A comparable, cross-dialect corpus of 'everyday' Spanish from Argentina, Mexico, and Spain. In Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 (pp. 1733-1737). European Language Resources Association (ELRA).

@inproceedings{7f3f0003509a4d7885527c9c7380038b,
title = "ACTIV-ES: A comparable, cross-dialect corpus of 'everyday' Spanish from Argentina, Mexico, and Spain",
abstract = "Corpus resources for Spanish have proved invaluable for a number of applications in a wide variety of fields. However, a majority of resources are based on formal, written language and/or are not built to model language variation between varieties of the Spanish language, despite the fact that most language in 'everyday' use is informal/dialogue-based and shows rich regional variation. This paper outlines the development and evaluation of the ACTIV-ES corpus, a first-step to produce a comparable, cross-dialect corpus representative of the 'everyday' language of various regions of the Spanish-speaking world.",
keywords = "Corpora, Dialects, Spanish",
author = "Jerid Francom and Mans Hulden and Ussishkin, {Adam P} and Julieta Fumagalli and Mikel Santesteban and Julio Serrano",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
pages = "1733--1737",
booktitle = "Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014",
publisher = "European Language Resources Association (ELRA)",

}

TY - GEN

T1 - ACTIV-ES

T2 - A comparable, cross-dialect corpus of 'everyday' Spanish from Argentina, Mexico, and Spain

AU - Francom, Jerid

AU - Hulden, Mans

AU - Ussishkin, Adam P

AU - Fumagalli, Julieta

AU - Santesteban, Mikel

AU - Serrano, Julio

PY - 2014/1/1

Y1 - 2014/1/1

AB - Corpus resources for Spanish have proved invaluable for a number of applications in a wide variety of fields. However, a majority of resources are based on formal, written language and/or are not built to model language variation between varieties of the Spanish language, despite the fact that most language in 'everyday' use is informal/dialogue-based and shows rich regional variation. This paper outlines the development and evaluation of the ACTIV-ES corpus, a first-step to produce a comparable, cross-dialect corpus representative of the 'everyday' language of various regions of the Spanish-speaking world.

KW - Corpora

KW - Dialects

KW - Spanish

UR - http://www.scopus.com/inward/record.url?scp=84981189835&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84981189835&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84981189835

SP - 1733

EP - 1737

BT - Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

PB - European Language Resources Association (ELRA)

ER -