Building a large-scale testing dataset for conceptual semantic annotation of text

Xiao Wei, Daniel Dajun Zeng, Xiangfeng Luo, Wei Wu

Research output: Contribution to journal › Article

1 Scopus citation

Abstract

One major obstacle facing research on semantic annotation is the lack of large-scale testing datasets. In this paper, we develop a systematic approach to constructing such datasets. The approach is based on guided ontology auto-construction and annotation methods that require little prior domain knowledge and little user knowledge of the documents. We demonstrate the efficacy of the proposed approach by developing a large-scale testing dataset using information available from MeSH and PubMed. The resulting testing dataset consists of a large-scale ontology, a large-scale set of annotated documents, and baselines for evaluating target algorithms; it can be employed to evaluate both ontology construction algorithms and semantic annotation algorithms.
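The abstract does not describe the dataset's schema, but a minimal sketch can illustrate what one record of such a testing dataset might look like and how a baseline evaluation could be run against it. The field names, the sample PMID, and the MeSH descriptor IDs below are illustrative assumptions, not the authors' actual data format.

```python
# Hypothetical sketch of one record in a conceptual-annotation testing
# dataset built from PubMed abstracts and MeSH concepts. The schema is
# an assumption for illustration, not the paper's published format.

from dataclasses import dataclass, field


@dataclass
class AnnotatedDocument:
    pmid: str                                              # PubMed identifier of the source abstract
    text: str                                              # abstract text to be annotated
    gold_concepts: set[str] = field(default_factory=set)   # gold-standard MeSH descriptor IDs


def precision_recall(gold: set[str], predicted: set[str]) -> tuple[float, float]:
    """Compare an annotation algorithm's output against the gold baseline."""
    if not predicted or not gold:
        return 0.0, 0.0
    hits = len(gold & predicted)
    return hits / len(predicted), hits / len(gold)


# Usage: evaluate a target annotation algorithm on one document.
doc = AnnotatedDocument(
    pmid="12345678",                        # placeholder PMID
    text="...",                             # abstract text would go here
    gold_concepts={"D003920", "D007333"},   # example MeSH descriptor IDs
)
predicted = {"D003920", "D009765"}          # hypothetical algorithm output
p, r = precision_recall(doc.gold_concepts, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision and recall against gold annotations are standard baseline measures for this task; the paper's own evaluation baselines may differ.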

Original language: English (US)
Pages (from-to): 63-72
Number of pages: 10
Journal: International Journal of Computational Science and Engineering
Volume: 16
Issue number: 1
DOIs
State: Published - 2018
Externally published: Yes

Keywords

  • MeSH
  • PubMed
  • evaluation baseline
  • evaluation parameters
  • guided annotation method
  • ontology auto-construction
  • ontology concept learning
  • a priori knowledge
  • semantic annotation
  • testing dataset

ASJC Scopus subject areas

  • Computational Mathematics
  • Modeling and Simulation
  • Computational Theory and Mathematics
  • Hardware and Architecture
  • Software
