Combining generative adversarial networks for emotion prediction in college students: An application to interactive musical interpretation
Abstract
Traditional emotion prediction methods rely heavily on large amounts of labeled data and often struggle to capture subtle variations and individual differences in emotional expression. This paper enhances Generative Adversarial Networks (GANs) to improve emotion prediction accuracy, providing college students with a more intelligent and personalized learning experience in interactive music interpretation. First, a spatial channel attention mechanism is incorporated into the generator of the D2M-GAN multimodal generative adversarial network, improving the model’s ability to focus on salient information. In the discriminator, the traditional large-kernel convolutional layer is replaced with multi-scale convolutional layers, strengthening the model’s ability to assess the authenticity of the generated data. To further optimize the model, the generative network is rewarded and penalized according to music theory rules, and convergence is accelerated by optimizing the loss function. Together, these changes improve the intelligence and personalization of interactive music interpretation. The accuracy and generalization ability of the proposed Deep Two-Modal Generative Adversarial Network with Spatial Channel Attention (D2M-GAN-SCA) are evaluated using cross-validation and comparative validation. The experimental results demonstrate that the generator with the spatial channel attention mechanism, combined with the multi-scale convolutional discriminator, significantly improves emotion prediction accuracy, reaching 97.03% after 1400 training iterations. The model also shows notable improvements in loss function stability, convergence speed, and the quality of the generated music.
These advancements provide robust support for emotion prediction and real-time interactive music generation, facilitating a more engaging and personalized online learning experience for college students in music interpretation.
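As a minimal illustration of the spatial channel attention idea described above, the NumPy sketch below gates a feature map first per channel (global average pooling followed by a learned projection and a sigmoid) and then per spatial position (channel-wise mean followed by a sigmoid). The function name, tensor shapes, and the single projection matrix `w_c` are illustrative assumptions for exposition, not the paper’s exact generator design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_channel_attention(feat, w_c):
    """Gate a (C, H, W) feature map by channel, then by spatial position.

    feat : (C, H, W) generator feature map
    w_c  : (C, C) channel-attention projection (assumed learnable)
    """
    # Channel attention: global average pool -> projection -> sigmoid gate
    gap = feat.mean(axis=(1, 2))              # (C,)
    ch_gate = sigmoid(w_c @ gap)              # (C,), values in (0, 1)
    feat = feat * ch_gate[:, None, None]
    # Spatial attention: channel-wise mean -> sigmoid gate over (H, W)
    sp_gate = sigmoid(feat.mean(axis=0))      # (H, W)
    return feat * sp_gate[None, :, :]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))          # toy feature map
w = rng.standard_normal((8, 8)) * 0.1
y = spatial_channel_attention(x, w)
print(y.shape)                                # (8, 16, 16): shape preserved
```

Because both gates lie in (0, 1), the module only re-weights features rather than adding new ones, which is what lets the generator emphasize emotionally salient channels and regions without changing the feature map’s shape.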
Copyright (c) 2025 Author(s)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright on all articles published in this journal is retained by the author(s), while the author(s) grant the publisher the right of first publication of the article.
Articles published in this journal are licensed under a Creative Commons Attribution 4.0 International License, which means they may be shared, adapted, and distributed provided that the original published version is cited.