Analyzing arXiv Publication Trends of Generative AI After DeepSeek Release  
Author Kai-Yuan Hsiao

 

Co-Author(s) Wei Shan Chang; Mingchih Chen; Ben-Chang Shia

 

Abstract This study investigates publication trends in generative artificial intelligence by analyzing preprints submitted to arXiv between November 2022 and March 2025, a period marked by the rapid development of large language models (LLMs) and the release of the DeepSeek model series. Using Latent Dirichlet Allocation (LDA) for topic modeling, we extracted key thematic patterns from over 60,000 articles classified under arXiv's machine learning and artificial intelligence categories. The results reveal three dominant research directions: foundational machine learning methods and training strategies, reasoning capabilities and task-oriented applications of LLMs, and the emergence of multimodal systems that integrate vision and language. The release of DeepSeek, which emphasizes computational efficiency and open accessibility, coincides with increased academic interest in scalable architectures and human-centric AI. These findings highlight the evolving priorities in generative AI research and suggest that future developments will continue to focus on efficiency, reasoning, and multimodal integration.

 

Keywords Generative AI, Large Language Model, Natural Language Processing, Text Mining
   
Article #: RQD2025-119
 

Proceedings of the 30th ISSAT International Conference on Reliability & Quality in Design
August 6-8, 2025