Deep learning based topic identification and categorization: Mining diabetes-related topics on Chinese health websites

Xinhuan Chen, Yong Zhang, Jennifer Xu, Chunxiao Xing, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

4 Scopus citations

Abstract

As millions of people are diagnosed with diabetes every year, the demand for information about diabetes continues to increase. China is one of the countries with a large population of diabetes patients, and many Chinese health websites provide diabetes-related news and articles. However, because most of these online articles are uncategorized or lack a clear topic and theme, users often cannot find their topics of interest effectively and efficiently. Health topic identification and categorization on Chinese websites cannot be easily addressed by straightforwardly applying existing approaches and methods that were developed for English documents. To address this problem and meet users' demand for diabetes-related information, we propose a deep learning based framework to identify and categorize topics related to diabetes in online Chinese articles. Our experiments on datasets of over 19,000 online articles showed that the framework categorized diabetes-related topics more effectively and accurately than most of the state-of-the-art benchmark approaches.

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 481-500
Number of pages: 20
Volume: 9642
ISBN (Print): 9783319320243
DOIs: https://doi.org/10.1007/978-3-319-32025-0_30
State: Published - 2016
Event: 21st International Conference on Database Systems for Advanced Applications, DASFAA 2016 - Dallas, United States
Duration: Apr 16 2016 to Apr 19 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9642
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 21st International Conference on Database Systems for Advanced Applications, DASFAA 2016
Country: United States
City: Dallas
Period: 4/16/16 to 4/19/16

Keywords

  • Chinese
  • Deep learning
  • Healthcare
  • Text classification

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Chen, X., Zhang, Y., Xu, J., Xing, C., & Chen, H. (2016). Deep learning based topic identification and categorization: Mining diabetes-related topics on Chinese health websites. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9642, pp. 481-500). Springer Verlag. https://doi.org/10.1007/978-3-319-32025-0_30