Segmentation of lecture videos based on text: A method combining multiple linguistic features

Ming Lin, Michael Chau, Jay F. Nunamaker, Hsinchun Chen

Research output: Contribution to journal › Conference article

17 Scopus citations

Abstract

In multimedia-based e-Learning systems, there is a strong need to segment lecture videos into topic units so that the videos can be organized for browsing and made searchable. Automatic segmentation is highly desirable because manual segmentation is costly. Although much research has been conducted on topic segmentation of transcribed spoken text, most approaches rely on domain-specific cues and a formal presentation format, and require extensive training; none of these features exists in lecture videos, which contain unscripted and spontaneous speech. In addition, lecture videos usually have few scene changes, so the visual information that most video segmentation methods rely on is not available. Furthermore, even when there are scene changes, they do not match the topic transitions. In this paper, we use the transcribed speech text extracted from the audio track of a video to segment lecture videos into topics. We review related research and propose a new segmentation approach. Our approach uses features such as noun phrases and combines multiple content-based and discourse-based features. Our preliminary results show that noun phrases are salient features and that combining multiple features shows promise for improving segmentation accuracy.
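As a rough illustration of text-based topic segmentation in general, the sketch below places a boundary wherever the vocabulary on either side of a sentence gap stops overlapping. It is a generic lexical-cohesion baseline and not the method proposed in the paper: the stopword list, the `window` and `threshold` parameters, and all function names are assumptions made for this example, and plain content words stand in for the noun phrases and discourse-based features that the abstract mentions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "are", "by", "each", "for", "from", "has", "in",
             "is", "it", "of", "on", "that", "the", "this", "to", "we"}


def content_words(sentence):
    """Lowercased word tokens with stopwords removed (a crude stand-in for noun phrases)."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def gap_similarities(sentences, window=2):
    """Similarity between the `window` sentences before and after each sentence gap."""
    bags = [Counter(content_words(s)) for s in sentences]
    sims = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        sims.append(cosine(left, right))
    return sims


def segment(sentences, window=2, threshold=0.1):
    """Return each index i at which a new topic segment starts at sentences[i]."""
    sims = gap_similarities(sentences, window)
    return [gap for gap, sim in enumerate(sims, start=1) if sim < threshold]


# Made-up six-sentence transcript with one topic shift after the third sentence.
transcript = [
    "A database stores tables of records.",
    "Each table in the database has rows and columns.",
    "Queries retrieve rows from a table.",
    "Neural networks learn weights from training examples.",
    "Training adjusts the weights of the network by gradient descent.",
    "Gradient descent minimizes the training loss.",
]
print(segment(transcript))  # prints [3]
```

On this toy transcript the only gap with no shared vocabulary is the one before the fourth sentence, so a single boundary is reported at index 3. Real lecture transcripts are far noisier, which is why a fixed threshold on word overlap alone is unlikely to be sufficient and why combining several features is of interest.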

Original language: English (US)
Article number: CLATL03
Pages (from-to): 23-32
Number of pages: 10
Journal: Proceedings of the Hawaii International Conference on System Sciences
Volume: 37
State: Published - 2004
Event: Proceedings of the Hawaii International Conference on System Sciences - Big Island, HI, United States
Duration: Jan 5 2004 to Jan 8 2004

ASJC Scopus subject areas

  • Computer Science (all)
