A Shortcut-Stacked Document Encoder for Extractive Text Summarization

Peng Yan, Linjing Li, Daniel Zeng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

When summarizing, a human needs to understand the whole document rather than each sentence in isolation. However, inter-sentence features within a document are not adequately modeled by previous neural network-based models, which mostly use only a single-layer recurrent neural network as the document encoder. To learn high-quality context-aware representations, we propose a shortcut-stacked document encoder for extractive summarization. We stack multiple bidirectional long short-term memory (LSTM) layers and add shortcut connections between them to increase representation capacity. The shortcut-stacked document encoder is built on a temporal convolutional neural network-based sentence encoder to capture the hierarchical structure of the document. The sentence representations produced by the document encoder are then fed to a sentence selection classifier for summary extraction. Experiments on the well-known CNN/Daily Mail dataset show that the proposed model outperforms several recently proposed strong baselines, including both extractive and abstractive neural network-based models. Furthermore, ablation and position analyses also demonstrate the effectiveness of the proposed shortcut-stacked document encoder.
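The core idea above — each stacked bidirectional layer receives the original sentence representations concatenated with the outputs of all previous layers — can be sketched as follows. This is a minimal illustration, not the paper's implementation: a simple tanh RNN stands in for the full LSTM, the sentence vectors are random stand-ins for the temporal-CNN sentence encoder's output, and all dimensions (`n_layers`, `d_h`, the toy document size) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_dir(x, W, U):
    # one direction of a recurrent layer; a plain tanh RNN is used here
    # in place of an LSTM purely to keep the sketch short
    h = np.zeros(U.shape[0])
    out = []
    for t in range(x.shape[0]):
        h = np.tanh(W @ x[t] + U @ h)
        out.append(h)
    return np.stack(out)

def bi_layer(x, d_h):
    # bidirectional layer: run forward and backward, concatenate states
    d_in = x.shape[1]
    Wf, Uf = 0.1 * rng.standard_normal((d_h, d_in)), 0.1 * rng.standard_normal((d_h, d_h))
    Wb, Ub = 0.1 * rng.standard_normal((d_h, d_in)), 0.1 * rng.standard_normal((d_h, d_h))
    fwd = rnn_dir(x, Wf, Uf)
    bwd = rnn_dir(x[::-1], Wb, Ub)[::-1]
    return np.concatenate([fwd, bwd], axis=1)  # shape (T, 2 * d_h)

def shortcut_stacked_encoder(sent_reprs, n_layers=3, d_h=8):
    # shortcut connections: the input to layer i is the original sentence
    # representations concatenated with the outputs of layers 1..i-1
    layer_in = sent_reprs
    outputs = []
    for _ in range(n_layers):
        out = bi_layer(layer_in, d_h)
        outputs.append(out)
        layer_in = np.concatenate([sent_reprs] + outputs, axis=1)
    return outputs[-1]  # context-aware representation per sentence

# toy document: 5 sentences, each already encoded to a 6-dim vector
# (in the paper this would come from the temporal-CNN sentence encoder)
doc = rng.standard_normal((5, 6))
ctx = shortcut_stacked_encoder(doc)
print(ctx.shape)  # one 2*d_h vector per sentence: (5, 16)
```

In the full model, each row of `ctx` would then be scored by the sentence selection classifier to decide whether that sentence belongs in the summary.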

Original language: English (US)
Title of host publication: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728119854
DOI: 10.1109/IJCNN.2019.8852051
State: Published - Jul 2019
Externally published: Yes
Event: 2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary
Duration: Jul 14, 2019 – Jul 19, 2019

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2019-July

Conference

Conference: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Country: Hungary
City: Budapest
Period: 7/14/19 – 7/19/19

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Yan, P., Li, L., & Zeng, D. (2019). A Shortcut-Stacked Document Encoder for Extractive Text Summarization. In 2019 International Joint Conference on Neural Networks, IJCNN 2019 [8852051] (Proceedings of the International Joint Conference on Neural Networks; Vol. 2019-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IJCNN.2019.8852051
