Analysis and identification of nocturnal groaning syndrome based on multimodal data

  • Xiaohui Xu North University of China, Shanxi 030000, China
  • Min Yu Department of Orthodontics, Peking University School and Hospital of Stomatology, Beijing 100081, China
  • Qing Wang Pharmacovigilance Research Center for information technology and Data Science, Cross-strait Tsinghua Research Institute, Xiamen 361000, China
  • Xuemei Gao Department of Orthodontics, Peking University School and Hospital of Stomatology, Beijing 100081, China
  • Wenai Song North University of China, Shanxi 030000, China
  • Xu Gong Department of Orthodontics, Peking University School and Hospital of Stomatology, Beijing 100081, China
  • Yi Lei Pharmacovigilance Research Center for information technology and Data Science, Cross-strait Tsinghua Research Institute, Xiamen 361000, China
Keywords: multimodal fusion; feature fusion; night moaning; pattern recognition
Article ID: 717

Abstract

Nocturnal groaning syndrome is a common sleep disorder characterized by irregular groaning or vocalizations during nighttime sleep, representing a significant area of research in sleep disorders. This study proposes a multimodal recognition approach based on speech, image, and text modalities. The study analyzes audio features using Mel Frequency Cepstral Coefficients (MFCC), extracts image features with a pretrained MobileNetV2, and identifies key physiological signals from text using the TF-IDF algorithm. Subsequently, Multimodal Compact Bilinear Pooling (MCB) is employed to fuse audio and image features, and a Text-Image CNN is used to combine image and text features. A Support Vector Machine (SVM) is then used to classify the fused multimodal features, and decision-level fusion is performed using weighting criteria. Experimental results demonstrate an identification accuracy of 89.5% on the test set, significantly enhancing the effectiveness of auxiliary diagnosis for nocturnal groaning syndrome.
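
The two fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the output dimension `d`, the random seed, and the example weights are all assumptions made for illustration. The first part approximates the outer product of two feature vectors by projecting each with a count sketch and multiplying in the frequency domain, which is the usual formulation of compact bilinear pooling. The second part combines per-classifier class probabilities with normalized weights, one plausible reading of "decision-level fusion using weighting criteria".

```python
import numpy as np


def count_sketch(x, h, s, d):
    """Project x into d dimensions: bucket h[i] accumulates s[i] * x[i]."""
    sketch = np.zeros(d)
    np.add.at(sketch, h, s * x)  # unbuffered add handles repeated buckets
    return sketch


def mcb_fuse(x, y, d=64, seed=0):
    """Compact bilinear pooling of two 1-D feature vectors (illustrative).

    The hash indices and signs are drawn once from a seeded generator,
    so the same seed always yields the same projection.
    """
    rng = np.random.default_rng(seed)
    hx = rng.integers(0, d, size=x.shape[0])
    sx = rng.choice([-1.0, 1.0], size=x.shape[0])
    hy = rng.integers(0, d, size=y.shape[0])
    sy = rng.choice([-1.0, 1.0], size=y.shape[0])
    # Circular convolution of the two sketches, computed via the FFT.
    fx = np.fft.rfft(count_sketch(x, hx, sx, d))
    fy = np.fft.rfft(count_sketch(y, hy, sy, d))
    return np.fft.irfft(fx * fy, n=d)


def weighted_decision_fusion(probs, weights):
    """Weighted average of per-classifier class-probability rows."""
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return weights @ probs


# Hypothetical stand-ins for an MFCC vector and a MobileNetV2 embedding.
audio_feat = np.random.default_rng(1).normal(size=40)
image_feat = np.random.default_rng(2).normal(size=1280)
fused = mcb_fuse(audio_feat, image_feat, d=64)

# Two hypothetical classifiers voting over two classes, weighted 3:1.
final = weighted_decision_fusion([[0.8, 0.2], [0.4, 0.6]], [3, 1])
```

In a pipeline like the one described, the fused vector would be passed to a classifier such as an SVM, and the fusion weights would be chosen from validation performance. The abstract does not state how the weights were set, so the 3:1 ratio here is purely illustrative.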

Published
2024-11-25
How to Cite
Xu, X., Yu, M., Wang, Q., Gao, X., Song, W., Gong, X., & Lei, Y. (2024). Analysis and identification of nocturnal groaning syndrome based on multimodal data. Molecular & Cellular Biomechanics, 21(3), 717. https://doi.org/10.62617/mcb717
Section
Article