Terminological mapping for high throughput comparative biology of phenotypes.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

20 Scopus citations

Abstract

Comparative biological studies have led to remarkable biomedical discoveries. While genomic science and technologies are advancing rapidly, precisely specifying a phenotype and comparing it to related phenotypes of other organisms remains challenging. This study examined the systematic use of terminology and knowledge-based technologies to enable high-throughput comparative phenomics. More specifically, we measured the accuracy of a multi-strategy automated classification method for bridging the gap between a phenotypic terminology (MGD: Phenoslim) and a broad-coverage clinical terminology (SNOMED CT). Furthermore, we qualitatively evaluated the additional emerging properties of the combined terminological network for comparative biology and discovery science. Measured against the gold standard (n = 100), the accuracies (precision / recall) of the composite automated methods were 67% / 97% for mapping identical concepts and 85% / 98% for classification. Quantitatively, only 2% of the phenotypic concepts were missing from the clinical terminology. Qualitatively, however, the gap was larger: problems of conceptual scope and granularity, along with subtle yet significant homonymy problems, were observed. These results suggest that, as observed in other domains, additional strategies are required for combining terminologies.
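For readers unfamiliar with the precision / recall figures quoted above, the following is a minimal sketch of how such scores are conventionally computed when automated term mappings are compared with a gold standard. It is not the authors' evaluation code: the function name `precision_recall`, the term pairs, and the concept identifiers are all hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted mappings against gold mappings."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # mappings that are both proposed and correct
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (phenotype term, clinical concept id) pairs.
# These are NOT taken from the study; the ids are made up.
gold = {("cataract", "C1"), ("deafness", "C2"), ("obesity", "C3")}
predicted = {
    ("cataract", "C1"),
    ("deafness", "C2"),
    ("obesity", "C9"),   # wrong target concept
    ("tremor", "C4"),    # mapping absent from the gold standard
}

precision, recall = precision_recall(predicted, gold)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this toy case two of the four proposed mappings are correct and two of the three gold mappings are recovered, which illustrates how a method can recover most gold-standard mappings (high recall) while still proposing incorrect ones (lower precision).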

Original language: English (US)
Title of host publication: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
Pages: 202-213
Number of pages: 12
Publication status: Published - 2004
Externally published: Yes

Cite this

Lussier, Y. A., & Li, J. (2004). Terminological mapping for high throughput comparative biology of phenotypes. In Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing (pp. 202-213).