Analyzing arXiv Publication Trends of Generative AI After DeepSeek Release  
Author Kai-Yuan Hsiao

 

Co-Author(s) Wei Shan Chang; Mingchih Chen; Ben-Chang Shia

 

Abstract This study investigates publication trends in generative artificial intelligence by analyzing preprints submitted to arXiv between November 2022 and March 2025, a period marked by the rapid development of large language models (LLMs) and the release of the DeepSeek model series. Using Latent Dirichlet Allocation (LDA) for topic modeling, we extracted key thematic patterns from over 60,000 articles classified under arXiv's machine learning and artificial intelligence categories. The results reveal three dominant research directions: foundational machine learning methods and training strategies, reasoning capabilities and task-oriented applications of LLMs, and the emergence of multimodal systems that integrate vision and language. The release of DeepSeek, which emphasizes computational efficiency and open accessibility, coincides with increased academic interest in scalable architectures and human-centric AI. These findings highlight the evolving priorities in generative AI research and suggest that future developments will continue to focus on efficiency, reasoning, and multimodal integration.

 

Keywords Generative AI, Large Language Model, Natural Language Processing, Text Mining
   
Article #: RQD2025-119
 

Proceedings of the 30th ISSAT International Conference on Reliability & Quality in Design
August 6-8, 2025