---
license: cc-by-nc-4.0
---

## COGMEN: Official PyTorch Implementation

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cogmen-contextualized-gnn-based-multimodal/multimodal-emotion-recognition-on-iemocap)](https://paperswithcode.com/sota/multimodal-emotion-recognition-on-iemocap?p=cogmen-contextualized-gnn-based-multimodal)
**CO**ntextualized **G**NN based **M**ultimodal **E**motion recognitio**N**

![Teaser image](logo.png)
**Picture:** *Sample picture for the logo*

This repository contains the official PyTorch implementation of the following paper:
> **COGMEN: COntextualized GNN based Multimodal Emotion recognitioN**<br>
> **Paper:** https://arxiv.org/abs/2205.02455
>
> **Authors:** Abhinav Joshi, Ashwani Bhat, Ayush Jain, Atin Vikram Singh, Ashutosh Modi<br>
>
> **Abstract:** *Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person’s emotions are influenced by the other speaker’s utterances and their own emotional state over the utterances. In this paper, we propose COntextualized Graph Neural Network based Multimodal Emotion recognitioN (COGMEN) system that leverages local information (i.e., inter/intra dependency between speakers) and global information (context). The proposed model uses Graph Neural Network (GNN) based architecture to model the complex dependencies (local and global information) in a conversation. Our model gives state-of-the-art (SOTA) results on IEMOCAP and MOSEI datasets, and detailed ablation experiments show the importance of modeling information at both levels.*
## Requirements

- We use PyG (PyTorch Geometric) for the GNN component of our architecture, specifically [RGCNConv](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.RGCNConv) and [TransformerConv](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.TransformerConv); a minimal usage sketch follows this list.
- We use [comet](https://comet.ml) for logging all our experiments, and its Bayesian optimizer for hyperparameter tuning (see the logging sketch under "Training networks").
- For textual features we use [SBERT](https://www.sbert.net/).
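
A minimal sketch of how these two PyG layers can be composed; the sizes, random graph, and layer arguments below are illustrative assumptions, not the paper's exact configuration:

```python
# Illustrative only: relation-aware message passing (RGCNConv) followed by
# graph attention (TransformerConv). All shapes and edges are assumptions.
import torch
from torch_geometric.nn import RGCNConv, TransformerConv

num_nodes, in_dim, hidden_dim, num_relations = 10, 100, 64, 4
x = torch.randn(num_nodes, in_dim)                  # utterance node features
edge_index = torch.randint(0, num_nodes, (2, 30))   # random edges for demo
edge_type = torch.randint(0, num_relations, (30,))  # relation id per edge

rgcn = RGCNConv(in_dim, hidden_dim, num_relations=num_relations)
attn = TransformerConv(hidden_dim, hidden_dim, heads=1)

h = rgcn(x, edge_index, edge_type).relu()  # relation-aware aggregation
h = attn(h, edge_index)                    # attention over neighbors
print(h.shape)  # torch.Size([10, 64])
```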
### Installation

- [Install PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)
- [Install Comet.ml](https://www.comet.ml/docs/python-sdk/advanced/)
- [Install SBERT](https://www.sbert.net/) (usage sketch below)
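
Since SBERT supplies the textual features, here is a minimal sketch of extracting sentence embeddings with `sentence-transformers`; the model name is an illustrative assumption and may differ from the checkpoint used in the paper:

```python
# Assumed checkpoint for illustration; the paper's SBERT model may differ.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
utterances = ["I can't believe you did that!", "I'm so happy for you."]
embeddings = model.encode(utterances)  # numpy array, one row per utterance
print(embeddings.shape)                # (2, 384) for this model
```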
## Preparing datasets for training

```sh
python preprocess.py --dataset="iemocap_4"
```
## Training networks

```sh
python train.py --dataset="iemocap_4" --modalities="atv" --from_begin --epochs=55
```
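
Training runs are logged with comet (per the Requirements above). A standalone sketch of that kind of logging, assuming a `COMET_API_KEY` is configured; the project name and logged values are placeholders, not the repository's exact integration:

```python
# Placeholder logging sketch; reads the API key from the COMET_API_KEY
# environment variable or .comet.config. Not train.py's actual code.
from comet_ml import Experiment

experiment = Experiment(project_name="cogmen")  # hypothetical project name
experiment.log_parameters({"dataset": "iemocap_4", "modalities": "atv", "epochs": 55})

for epoch in range(3):  # stand-in for the real training loop
    experiment.log_metric("train_loss", 1.0 / (epoch + 1), step=epoch)

experiment.end()
```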
## Run Evaluation [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1biIvonBdJWo2TiYyTiQkxZ_V88JEXa_d?usp=sharing)

```sh
python eval.py --dataset="iemocap_4" --modalities="atv"
```
## Citation

Please cite the paper using the following citation:
```bibtex
@inproceedings{joshi-etal-2022-cogmen,
    title = "{COGMEN}: {CO}ntextualized {GNN} based Multimodal Emotion recognitio{N}",
    author = "Joshi, Abhinav and
      Bhat, Ashwani and
      Jain, Ayush and
      Singh, Atin and
      Modi, Ashutosh",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-main.306",
    pages = "4148--4164",
    abstract = "Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person{'}s emotions are influenced by the other speaker{'}s utterances and their own emotional state over the utterances. In this paper, we propose COntextualized Graph Neural Network based Multimodal Emotion recognitioN (COGMEN) system that leverages local information (i.e., inter/intra dependency between speakers) and global information (context). The proposed model uses Graph Neural Network (GNN) based architecture to model the complex dependencies (local and global information) in a conversation. Our model gives state-of-the-art (SOTA) results on IEMOCAP and MOSEI datasets, and detailed ablation experiments show the importance of modeling information at both levels.",
}
```
## Acknowledgments

The structure of our code is inspired by [pytorch-DialogueGCN-mianzhang](https://github.com/mianzhang/dialogue_gcn).