CTFN: Hierarchical learning for multimodal sentiment analysis using coupled-translation fusion network

Jiajia Tang, Kang Li, Xuanyu Jin, Andrzej Cichocki, Qibin Zhao, Wanzeng Kong

Research output: Chapter in book, report, or conference proceedings › Conference contribution › peer-reviewed

5 Citations (Scopus)

Abstract

Multimodal sentiment analysis is a challenging research area concerned with fusing multiple heterogeneous modalities. A key challenge is that some modalities may be missing during multimodal fusion. Existing techniques, however, require all modalities as input and are therefore sensitive to missing modalities at prediction time. In this work, the coupled-translation fusion network (CTFN) is first proposed to model bi-directional interplay via couple learning, ensuring robustness to missing modalities. Specifically, a cyclic consistency constraint is introduced to improve translation performance, which allows the Transformer decoder to be discarded so that only the encoder is used. This yields a much lighter model. Owing to couple learning, CTFN can model bi-directional cross-modality intercorrelation in parallel. Based on CTFN, a hierarchical architecture is further established to exploit multiple bi-directional translations, yielding twice as many multimodal fusion embeddings as traditional translation methods. Moreover, a convolution block is used to further highlight explicit interactions among those translations. For evaluation, CTFN was verified on two multimodal benchmarks with extensive ablation studies. The experiments demonstrate that the proposed framework achieves state-of-the-art or competitive performance. Additionally, CTFN remains robust when modalities are missing.
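
The cyclic consistency constraint mentioned in the abstract can be illustrated with a toy example. The sketch below is not the CTFN implementation: the paper uses Transformer encoders as translators, whereas this uses plain linear maps, and every name and value in it (apply_linear, cycle_consistency_loss, the feature vectors, the weight matrices) is hypothetical. It only shows the general idea that a coupled pair of translators is penalized when a round trip through both directions fails to reconstruct the input.

```python
# Toy illustration of a cycle-consistency penalty for one coupled pair of
# translators. Hypothetical names and values throughout; NOT the CTFN code.
# Assumption: translators are plain linear maps instead of Transformer encoders.

def apply_linear(weights, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def mean_squared_error(predicted, target):
    """Average squared difference between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def cycle_consistency_loss(forward_weights, backward_weights, source, target):
    """Sum of both round-trip reconstruction errors for a coupled pair."""
    # source -> target space -> back to source space
    source_round_trip = apply_linear(
        backward_weights, apply_linear(forward_weights, source)
    )
    # target -> source space -> back to target space
    target_round_trip = apply_linear(
        forward_weights, apply_linear(backward_weights, target)
    )
    return (
        mean_squared_error(source_round_trip, source)
        + mean_squared_error(target_round_trip, target)
    )


# Hypothetical 2-dimensional features for two modalities.
text_features = [1.0, 2.0]
audio_features = [0.5, -1.0]

# Forward translator doubles each feature; two candidate backward translators.
text_to_audio = [[2.0, 0.0], [0.0, 2.0]]
exact_inverse = [[0.5, 0.0], [0.0, 0.5]]  # undoes the forward map exactly
poor_inverse = [[1.0, 0.0], [0.0, 1.0]]   # identity, does not undo it

perfect_loss = cycle_consistency_loss(
    text_to_audio, exact_inverse, text_features, audio_features
)
poor_loss = cycle_consistency_loss(
    text_to_audio, poor_inverse, text_features, audio_features
)

print(perfect_loss)  # 0.0
print(poor_loss)     # 3.125
```

A backward translator that exactly inverts the forward one gives zero loss, while a mismatched pair is penalized, which is the pressure that pushes the two directions of a coupled translation to agree with each other.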

Original language: English
Title of host publication: ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 5301-5311
Number of pages: 11
ISBN (electronic): 9781954085527
Publication status: Published - 2021
Event: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 2021 to 6 Aug 2021

Publication series

Name: ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference

Conference

Conference: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
City: Virtual, Online
Period: 1/08/21 to 6/08/21
