Biomechanical and machine learning approaches to automating the identification of musical styles and emotions through human motion analysis
Abstract
This study explores the intricate relationship between biomechanical movement and musical expression, focusing on the identification of musical styles and emotions. Violin performance is characterized by complex interactions between physical actions (bowing technique, finger placement, and posture) and the resulting acoustic output. Recent advances in motion capture technology and sound analysis have enabled more objective examination of these processes. However, the current literature frequently addresses biomechanics and acoustic features in isolation, lacking an integrated understanding of how physical movements translate into specific musical expressions. Machine learning (ML), particularly Long Short-Term Memory (LSTM) networks, offers a promising avenue for bridging this gap: LSTM models capture temporal dependencies in sequential data, making them well suited to the dynamic nature of violin performance. In this work, we propose a comprehensive model that combines biomechanical analysis with Mel-spectrogram-based LSTM modeling to automate the identification of musical styles and emotions in violin performances. Using motion capture systems, Inertial Measurement Units (IMUs), and high-fidelity audio recordings, we collected synchronized biomechanical and acoustic data from violinists performing various musical excerpts. The LSTM model was trained on this dataset to learn the connections between physical movements and the acoustic features of each performance. Key findings demonstrate the effectiveness of this integrated approach: the LSTM model achieved a validation accuracy of 92.5% in classifying musical styles and emotions, with precision, recall, and F1-score reaching 94.3%, 92.6%, and 93.4%, respectively, by the 100th epoch. The analysis also revealed strong correlations between specific biomechanical parameters, such as shoulder joint angle and bowing velocity, and acoustic features, such as sound intensity and vibrato amplitude.
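The Mel-spectrogram-based LSTM pipeline described above can be made concrete with a minimal sketch, given below in Python using librosa and PyTorch. The feature dimensions, hidden size, layer count, and number of style/emotion classes are illustrative assumptions; the abstract does not specify the architecture.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

def mel_features(audio_path, sr=22050, n_mels=128):
    """Load a recording and return a (time, n_mels) log-Mel feature matrix."""
    y, _ = librosa.load(audio_path, sr=sr)
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(S, ref=np.max).T.astype(np.float32)

class StyleEmotionLSTM(nn.Module):
    """LSTM classifier over per-frame acoustic (plus biomechanical) features."""
    def __init__(self, n_features, n_classes, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # classify from the final time step

# Hypothetical instantiation: 128 Mel bands plus 12 IMU-derived channels,
# six style/emotion classes (all assumed values).
model = StyleEmotionLSTM(n_features=128 + 12, n_classes=6)
```

In practice the per-frame IMU channels would be resampled to the spectrogram frame rate and concatenated with the Mel features before being fed to the network.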
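The reported figures (92.5% validation accuracy; 94.3% precision, 92.6% recall, and 93.4% F1 by epoch 100) are standard multi-class classification metrics. A sketch of how such numbers are typically computed, assuming scikit-learn and macro averaging (the abstract does not state the averaging scheme):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    acc = accuracy_score(y_true, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return acc, prec, rec, f1
```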
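The reported movement-sound correlations (e.g., shoulder joint angle and bowing velocity versus sound intensity and vibrato amplitude) can be quantified with a Pearson correlation over synchronized, uniformly resampled signals. The sketch below uses SciPy on synthetic data; the signal names and the linear relationship are purely hypothetical stand-ins for the study's measurements.

```python
import numpy as np
from scipy.stats import pearsonr

def movement_sound_correlation(bio_signal, acoustic_signal):
    """Pearson r and p-value for a biomechanical signal (e.g., bowing
    velocity from an IMU) against an acoustic feature (e.g., RMS sound
    intensity), both sampled on a common time grid."""
    return pearsonr(bio_signal, acoustic_signal)

# Hypothetical example with synthetic, correlated series.
rng = np.random.default_rng(0)
bow_velocity = rng.normal(size=500)
intensity = 0.8 * bow_velocity + rng.normal(scale=0.5, size=500)
r, p = movement_sound_correlation(bow_velocity, intensity)
print(f"r = {r:.2f}, p = {p:.3g}")
```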