Application of social media data mining in biomechanical and tactical analysis of tennis tournament players
Abstract
The rise of social media has provided a rich source of real-time data for analyzing player performance and tactics in professional sports, particularly tennis. This study applies social media data mining techniques to tennis-related discussions on Twitter, with the aim of identifying biomechanical patterns and tactical strategies during major tournaments. We propose a hybrid model that combines Bidirectional Encoder Representations from Transformers (BERT), which generates contextual embeddings, with a Bidirectional Long Short-Term Memory (Bi-LSTM) network, which models the sequential nature of tweets. The data set consists of tweets discussing the four major tournaments: the Australian Open, French Open, Wimbledon, and US Open. The analysis focuses on physical attributes such as footwork, speed, and endurance, and on tactical decisions such as serve placement, net play, and shot selection. Our methodology includes preprocessing the data, tokenizing the text, and applying sentiment analysis to capture public perception of player performance. The model achieves an accuracy of 88.5% and an F1-score of 87.95%, outperforming comparative models such as BERT with CNN and GloVe with LSTM. The analysis highlights key player-specific tactics, including Rafael Nadal’s baseline dominance and Novak Djokovic’s defensive play, as well as tournament-specific strategies, such as serve-and-volley at Wimbledon and baseline control at the French Open. Furthermore, sentiment analysis reveals a positive public perception of player performance, with emotions such as excitement and admiration frequently expressed during intense match moments. This study demonstrates the effectiveness of applying advanced NLP techniques to social media data for sports analytics. The insights generated can help players, coaches, and analysts improve performance strategies and understand public reactions.
Using social media data, our approach provides a scalable framework for analyzing tactical shifts and player performance in other sports contexts.
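The preprocessing and tokenizing steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the actual cleaning rules or term lexicon, so everything named here (`preprocess`, `tag_categories`, and both keyword sets) is a hypothetical stand-in built only from the example terms the abstract itself lists. It uses simple regular expressions rather than the BERT tokenizer, and it is meant to show the kind of filtering that might precede the model, not the authors' implementation.

```python
import re
from typing import List, Set

# Hypothetical keyword sets, taken from the example terms in the abstract.
# The study's real lexicon (if any) is not given.
BIOMECHANICAL_TERMS = {"footwork", "speed", "endurance"}
TACTICAL_TERMS = {"serve", "net", "volley", "baseline"}

URL_RE = re.compile(r"https?://\S+")   # links
MENTION_RE = re.compile(r"@\w+")       # @user mentions
TOKEN_RE = re.compile(r"[a-z0-9']+")   # lowercase word-like tokens


def preprocess(tweet: str) -> List[str]:
    """Lowercase a tweet, drop URLs and @mentions, and split it into tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return TOKEN_RE.findall(text)


def tag_categories(tokens: List[str]) -> Set[str]:
    """Label a token list as biomechanical and/or tactical by keyword match."""
    labels = set()
    if BIOMECHANICAL_TERMS.intersection(tokens):
        labels.add("biomechanical")
    if TACTICAL_TERMS.intersection(tokens):
        labels.add("tactical")
    return labels


if __name__ == "__main__":
    example = "Nadal's FOOTWORK is unreal! https://t.co/abc @rolandgarros #Baseline"
    tokens = preprocess(example)
    print(tokens)
    print(sorted(tag_categories(tokens)))
```

In a pipeline like the one described, tweets filtered this way would then be passed to a subword tokenizer and the BERT and Bi-LSTM layers; plain keyword matching is only a coarse first pass and cannot capture context the way the contextual embeddings do.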
Copyright (c) 2024 Hongmin Yu, Xiaokang Wei
This work is licensed under a Creative Commons Attribution 4.0 International License.