MDA: Multimodal Data Augmentation Framework for Boosting Performance on Image-Text Sentiment/Emotion Classification Tasks

Nan Xu, Wenji Mao, Penghui Wei, Daniel Zeng

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Multimodal data analysis has drawn increasing attention with the explosive growth of multimedia data. Although traditional unimodal analysis tasks have accumulated abundant labeled datasets, labeled multimodal datasets remain scarce because multimodal annotation is difficult and complex, and unimodal knowledge cannot easily be transferred to multimodal data. Unfortunately, there is also little related work on data augmentation in the multimodal domain, especially for image-text data. In this paper, to address the scarcity of labeled multimodal data, we propose a Multimodal Data Augmentation (MDA) framework for boosting performance on multimodal classification tasks. Our framework learns a cross-modality matching network to select image-text pairs from existing unimodal datasets to form a synthetic multimodal dataset, and uses this dataset to enhance the performance of classifiers. We take multimodal sentiment analysis and multimodal emotion analysis as the experimental tasks, and the experimental results show the effectiveness of our framework in boosting performance on multimodal classification tasks.
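
The abstract describes the core mechanism at a high level: a learned cross-modality matching network scores candidate image-text pairs drawn from unimodal datasets, and the best-matching pairs are kept as a synthetic multimodal dataset for training classifiers. Below is a minimal sketch of that selection step, assuming pre-extracted image and text features, a simple two-branch projection network, and a top-k selection rule; the feature dimensions, architecture, function names, and selection criterion are illustrative assumptions, not the authors' actual design.

# Hypothetical sketch of cross-modality matching for synthetic pair selection.
# All names, dimensions, and the top-k rule are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalityMatcher(nn.Module):
    """Scores how well an image embedding matches a text embedding."""

    def __init__(self, img_dim=2048, txt_dim=768, shared_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, shared_dim)   # image -> shared space
        self.txt_proj = nn.Linear(txt_dim, shared_dim)   # text  -> shared space

    def forward(self, img_feats, txt_feats):
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        # Cosine-similarity matrix of shape (num_images, num_texts).
        return img @ txt.t()


def select_synthetic_pairs(matcher, img_feats, txt_feats, top_k=1000):
    """Keep the top-k most compatible (image, text) index pairs."""
    with torch.no_grad():
        scores = matcher(img_feats, txt_feats)                      # (I, T)
        flat = scores.flatten()
        top = torch.topk(flat, k=min(top_k, flat.numel()))
        img_idx = torch.div(top.indices, scores.size(1), rounding_mode="floor")
        txt_idx = top.indices % scores.size(1)
    return list(zip(img_idx.tolist(), txt_idx.tolist(), top.values.tolist()))


if __name__ == "__main__":
    # Toy features standing in for pre-extracted CNN / text-encoder outputs.
    # In practice the matcher would first be trained on existing paired
    # image-text data before being used to score unimodal candidates.
    matcher = CrossModalityMatcher()
    img_feats = torch.randn(500, 2048)   # 500 images from a unimodal dataset
    txt_feats = torch.randn(800, 768)    # 800 texts from a unimodal dataset
    for i, t, s in select_synthetic_pairs(matcher, img_feats, txt_feats, top_k=5):
        print(f"image {i} <-> text {t}  (score {s:.3f})")

The selected pairs would then be labeled (e.g., with the labels of their source unimodal samples) and mixed into the training set of the downstream sentiment or emotion classifier, which is the augmentation effect the abstract refers to.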

Original language: English (US)
Journal: IEEE Intelligent Systems
DOIs
State: Accepted/In press - 2020
Externally published: Yes

Keywords

  • Annotations
  • Automation
  • Boosting
  • cross-modality matching
  • Data analysis
  • Data augmentation
  • multimodal classification
  • Sentiment analysis
  • Social networking (online)
  • synthetic dataset
  • Task analysis

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Artificial Intelligence

