A study of abbreviations in the UMLS.

H. Liu, Yves A Lussier, C. Friedman

Research output: Contribution to journalArticle

34 Citations (Scopus)

Abstract

Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5%, and a recall of 96%. The UMLS abbreviations were highly ambiguous: 33.1% of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66%.

Original languageEnglish (US)
Pages (from-to)393-397
Number of pages5
JournalProceedings / AMIA ... Annual Symposium. AMIA Symposium
StatePublished - 2001
Externally publishedYes

Fingerprint

Unified Medical Language System
Medicine
Information Systems
Language

Cite this

A study of abbreviations in the UMLS. / Liu, H.; Lussier, Yves A; Friedman, C.

In: Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 2001, p. 393-397.

Research output: Contribution to journalArticle

@article{c5019366df2c4773821d529ad1543dac,
title = "A study of abbreviations in the UMLS.",
abstract = "Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5{\%}, and a recall of 96{\%}. The UMLS abbreviations were highly ambiguous: 33.1{\%} of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66{\%}.",
author = "H. Liu and Lussier, {Yves A} and C. Friedman",
year = "2001",
language = "English (US)",
pages = "393--397",
journal = "Proceedings / AMIA . Annual Symposium. AMIA Symposium",
issn = "1531-605X",
publisher = "Hanley & Belfus",

}

TY - JOUR

T1 - A study of abbreviations in the UMLS.

AU - Liu, H.

AU - Lussier, Yves A

AU - Friedman, C.

PY - 2001

Y1 - 2001

N2 - Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5%, and a recall of 96%. The UMLS abbreviations were highly ambiguous: 33.1% of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66%.

AB - Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5%, and a recall of 96%. The UMLS abbreviations were highly ambiguous: 33.1% of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66%.

UR - http://www.scopus.com/inward/record.url?scp=0035748436&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035748436&partnerID=8YFLogxK

M3 - Article

SP - 393

EP - 397

JO - Proceedings / AMIA . Annual Symposium. AMIA Symposium

JF - Proceedings / AMIA . Annual Symposium. AMIA Symposium

SN - 1531-605X

ER -