Competency evaluation of plant character ontologies against domain literature

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

Specimen identification keys are still the most commonly created tools used by systematic biologists to access biodiversity information. Creating identification keys requires analyzing and synthesizing large amounts of information from specimens and their descriptions and is a very labor-intensive and time-consuming activity. Automating the generation of identification keys from text descriptions becomes a highly attractive text mining application in the biodiversity domain. Fine-grained semantic annotation of morphological descriptions of organisms is a necessary first step in generating keys from text. Machine-readable ontologies are needed in this process because most biological characters are only implied (i.e., not stated) in descriptions. The immediate question to ask is "How well do existing ontologies support semantic annotation and automated key generation?" With the intention to either select an existing ontology or develop a unified ontology based on existing ones, this paper evaluates the coverage, semantic consistency, and inter-ontology agreement of a biodiversity character ontology and three plant glossaries that may be turned into ontologies. The coverage and semantic consistency of the ontology/glossaries are checked against the authoritative domain literature, namely, Flora of North America and Flora of China. The evaluation results suggest that more work is needed to improve the coverage and interoperability of the ontology/glossaries. More concepts need to be added to the ontology/glossaries and careful work is needed to improve the semantic consistency. The method used in this paper to evaluate the ontology/glossaries can be used to propose new candidate concepts from the domain literature and suggest appropriate definitions.

Original language: English (US)
Pages (from-to): 1144-1165
Number of pages: 22
Journal: Journal of the American Society for Information Science and Technology
Volume: 61
Issue number: 6
DOIs: 10.1002/asi.21325
State: Published - Jun 2010

Fingerprint

ontology
evaluation
Glossaries
Semantics
Biodiversity
coverage
literature
Competency
Interoperability
candidacy
Personnel
labor
China

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications

Cite this

@article{5394c4e297fd44bfa7445ea3c67bc83b,
title = "Competency evaluation of plant character ontologies against domain literature",
abstract = "Specimen identification keys are still the most commonly created tools used by systematic biologists to access biodiversity information. Creating identification keys requires analyzing and synthesizing large amounts of information from specimens and their descriptions and is a very labor-intensive and time-consuming activity. Automating the generation of identification keys from text descriptions becomes a highly attractive text mining application in the biodiversity domain. Fine-grained semantic annotation of morphological descriptions of organisms is a necessary first step in generating keys from text. Machine-readable ontologies are needed in this process because most biological characters are only implied (i.e., not stated) in descriptions. The immediate question to ask is {"}How well do existing ontologies support semantic annotation and automated key generation?{"} With the intention to either select an existing ontology or develop a unified ontology based on existing ones, this paper evaluates the coverage, semantic consistency, and inter-ontology agreement of a biodiversity character ontology and three plant glossaries that may be turned into ontologies. The coverage and semantic consistency of the ontology/glossaries are checked against the authoritative domain literature, namely, Flora of North America and Flora of China. The evaluation results suggest that more work is needed to improve the coverage and interoperability of the ontology/glossaries. More concepts need to be added to the ontology/glossaries and careful work is needed to improve the semantic consistency. The method used in this paper to evaluate the ontology/glossaries can be used to propose new candidate concepts from the domain literature and suggest appropriate definitions.",
author = "Hong Cui",
year = "2010",
month = "6",
doi = "10.1002/asi.21325",
language = "English (US)",
volume = "61",
pages = "1144--1165",
journal = "Journal of the Association for Information Science and Technology",
issn = "2330-1635",
publisher = "John Wiley and Sons Ltd",
number = "6",
}

TY - JOUR

T1 - Competency evaluation of plant character ontologies against domain literature

AU - Cui, Hong

PY - 2010/6

Y1 - 2010/6

N2 - Specimen identification keys are still the most commonly created tools used by systematic biologists to access biodiversity information. Creating identification keys requires analyzing and synthesizing large amounts of information from specimens and their descriptions and is a very labor-intensive and time-consuming activity. Automating the generation of identification keys from text descriptions becomes a highly attractive text mining application in the biodiversity domain. Fine-grained semantic annotation of morphological descriptions of organisms is a necessary first step in generating keys from text. Machine-readable ontologies are needed in this process because most biological characters are only implied (i.e., not stated) in descriptions. The immediate question to ask is "How well do existing ontologies support semantic annotation and automated key generation?" With the intention to either select an existing ontology or develop a unified ontology based on existing ones, this paper evaluates the coverage, semantic consistency, and inter-ontology agreement of a biodiversity character ontology and three plant glossaries that may be turned into ontologies. The coverage and semantic consistency of the ontology/glossaries are checked against the authoritative domain literature, namely, Flora of North America and Flora of China. The evaluation results suggest that more work is needed to improve the coverage and interoperability of the ontology/glossaries. More concepts need to be added to the ontology/glossaries and careful work is needed to improve the semantic consistency. The method used in this paper to evaluate the ontology/glossaries can be used to propose new candidate concepts from the domain literature and suggest appropriate definitions.

AB - Specimen identification keys are still the most commonly created tools used by systematic biologists to access biodiversity information. Creating identification keys requires analyzing and synthesizing large amounts of information from specimens and their descriptions and is a very labor-intensive and time-consuming activity. Automating the generation of identification keys from text descriptions becomes a highly attractive text mining application in the biodiversity domain. Fine-grained semantic annotation of morphological descriptions of organisms is a necessary first step in generating keys from text. Machine-readable ontologies are needed in this process because most biological characters are only implied (i.e., not stated) in descriptions. The immediate question to ask is "How well do existing ontologies support semantic annotation and automated key generation?" With the intention to either select an existing ontology or develop a unified ontology based on existing ones, this paper evaluates the coverage, semantic consistency, and inter-ontology agreement of a biodiversity character ontology and three plant glossaries that may be turned into ontologies. The coverage and semantic consistency of the ontology/glossaries are checked against the authoritative domain literature, namely, Flora of North America and Flora of China. The evaluation results suggest that more work is needed to improve the coverage and interoperability of the ontology/glossaries. More concepts need to be added to the ontology/glossaries and careful work is needed to improve the semantic consistency. The method used in this paper to evaluate the ontology/glossaries can be used to propose new candidate concepts from the domain literature and suggest appropriate definitions.

UR - http://www.scopus.com/inward/record.url?scp=77952997623&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952997623&partnerID=8YFLogxK

U2 - 10.1002/asi.21325

DO - 10.1002/asi.21325

M3 - Article

AN - SCOPUS:77952997623

VL - 61

SP - 1144

EP - 1165

JO - Journal of the Association for Information Science and Technology

JF - Journal of the Association for Information Science and Technology

SN - 2330-1635

IS - 6

ER -