Experts discuss DeepSeek’s contributions to AI development

By LU HANG / 02-20-2025 / Chinese Social Sciences Today

DeepSeek epitomizes the transformation of China’s AI development model—from “exchanging market for technology” to an “innovation-driven” approach, and from achieving “single-point breakthroughs” to tackling challenges through “system engineering.” Photo: IC PHOTO


On Feb. 8, 2025, a special forum on the impact and implications of DeepSeek was held in Xi’an, northwest China’s Shaanxi Province. Scholars at the forum focused on DeepSeek’s technological breakthroughs, industry transformations, and implications for future development, while exploring how technological innovation and independent research and development (R&D) can enhance China’s competitiveness in AI.


Technological breakthroughs

DeepSeek is the product of systematic, collaborative innovation across algorithms, software, and hardware. Wu Fei, director of the Institute of Artificial Intelligence at Zhejiang University, believes that DeepSeek’s outstanding performance is the result of collective intelligence and effort. While the DeepSeek model is based on the Transformer architecture and does not represent a disruptive innovation in foundational theory, it marks an impressive leap forward in the long course of AI development and offers deep insights for the future of AI.


Wu emphasized that AI’s current achievements are the result of multiple “significant leaps” made over time. This is exemplified by the 2024 Nobel Prize in Physics, awarded to John Hopfield and Geoffrey Hinton for work on optimizing neural network parameters from the perspectives of physical energy minimization and the Boltzmann distribution, respectively, which laid a solid historical foundation for the rise of deep learning.
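
For readers unfamiliar with the two ideas named above, their textbook forms can be written as follows. The notation is an assumption made for illustration and does not come from the forum: s_i are unit states, w_ij are symmetric connection weights, b_i are biases, and T is a temperature. Sign conventions for the bias term vary between texts.

```latex
% Hopfield network: stored patterns sit at local minima of an energy function
E(s) = -\frac{1}{2} \sum_{i \neq j} w_{ij}\, s_i s_j - \sum_i b_i s_i

% Boltzmann machine: states are sampled with probability given by the Boltzmann distribution
P(s) = \frac{e^{-E(s)/T}}{Z}, \qquad Z = \sum_{s'} e^{-E(s')/T}
```

In both cases, learning amounts to adjusting w_ij and b_i so that desired states have low energy and therefore, under the second formula, high probability.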


DeepSeek’s “progressive breakthrough” path demonstrates that leapfrog development can be achieved through system engineering optimization, underscoring the critical role of system-level engineering innovation in scientific and technological advancement, Wu added.


A key reason for DeepSeek’s substantial improvement in cost-performance is its deep innovation in system software. According to Zhai Jidong, a professor of Computer Science and Technology at Tsinghua University, collaborative innovation across algorithms, software, and hardware is the key to breaking the traditional paradigm in which large language models depend on sheer computing power.


DeepSeek’s main innovation is in its algorithm, incorporating a new MoE (Mixture of Experts) architecture that combines shared experts with a large number of fine-grained routed experts, Zhai said. By compressing general knowledge into shared experts, DeepSeek reduces parameter redundancy in the routed experts and improves parameter efficiency. Without increasing the total number of parameters, the system divides knowledge among more finely segmented routed experts, enabling more accurate and targeted knowledge representation through flexible combinations of those experts.
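
The shared-plus-routed design described above can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not DeepSeek’s implementation: the function names, the layer sizes, and the use of single linear maps as “experts” are all hypothetical, chosen only to show that shared experts run on every token while a router activates a small top-k subset of the routed (routing) experts.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Toy expert: one dim x dim linear map (real experts are small feed-forward nets)."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def moe_forward(x, shared, routed, gate, k):
    """Sum of all shared experts plus a gated mix of the top-k routed experts."""
    out = [0.0] * len(x)
    for expert in shared:  # shared experts process every token
        out = [o + y for o, y in zip(out, expert(x))]

    # Router: one affinity score per routed expert, then keep the k largest.
    probs = softmax([sum(g * xi for g, xi in zip(row, x)) for row in gate])
    chosen = sorted(range(len(routed)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)  # renormalise over the chosen experts

    for i in chosen:
        weight = probs[i] / norm
        out = [o + weight * y for o, y in zip(out, routed[i](x))]
    return out, chosen


rng = random.Random(0)
DIM, N_SHARED, N_ROUTED, TOP_K = 8, 1, 16, 4  # illustrative sizes only
shared = [make_expert(DIM, rng) for _ in range(N_SHARED)]
routed = [make_expert(DIM, rng) for _ in range(N_ROUTED)]
gate = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_ROUTED)]

token = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
output, active = moe_forward(token, shared, routed, gate, TOP_K)
print(f"active routed experts: {sorted(active)} of {N_ROUTED}")
```

Only TOP_K of the N_ROUTED routed experts do any work for a given token, which is why total parameter count can grow without a matching growth in per-token compute.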


Zhai also noted that load imbalance across experts, which lowers training efficiency in traditional MoE models, has been effectively mitigated by load-balancing algorithms. At the system software level, DeepSeek incorporates extensive system engineering optimizations, which have significantly reduced the cost of model training.
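
The article does not say which load-balancing algorithm is meant. As one possible illustration, the sketch below simulates a bias-based scheme: each expert carries a routing bias that is nudged down when the expert is overloaded and up when it is underloaded. The function names, the skewed synthetic router scores, and every constant are assumptions made for this example.

```python
import random


def route(scores, bias, k):
    """Pick the k experts with the highest bias-adjusted scores.

    In this sketch the bias influences selection only.
    """
    adjusted = [s + b for s, b in zip(scores, bias)]
    return sorted(range(len(scores)), key=lambda i: adjusted[i], reverse=True)[:k]


def update_bias(bias, load, step):
    """Nudge overloaded experts down and underloaded experts up by a fixed step."""
    mean = sum(load) / len(load)
    return [b - step if n > mean else b + step if n < mean else b
            for b, n in zip(bias, load)]


def simulate(balance, n_experts=8, k=2, tokens=128, batches=150, step=0.02, seed=0):
    """Return the per-expert token counts of the final batch."""
    rng = random.Random(seed)
    skew = [0.3 * i for i in range(n_experts)]  # synthetic router that prefers later experts
    bias = [0.0] * n_experts
    load = [0] * n_experts
    for _ in range(batches):
        load = [0] * n_experts
        for _ in range(tokens):
            scores = [skew[i] + rng.gauss(0.0, 1.0) for i in range(n_experts)]
            for i in route(scores, bias, k):
                load[i] += 1
        if balance:
            bias = update_bias(bias, load, step)
    return load


def spread(load):
    """Gap between the busiest and the idlest expert."""
    return max(load) - min(load)


unbalanced = simulate(balance=False)
balanced = simulate(balance=True)
print("without balancing:", unbalanced, "spread", spread(unbalanced))
print("with balancing:   ", balanced, "spread", spread(balanced))
```

In a real training run, an expert that receives far more tokens than its peers becomes the straggler that every other device waits on, so narrowing this spread translates directly into better hardware utilization.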


“What DeepSeek has taught us is more about how to fully exploit the extreme performance of hardware through collaborative innovation of algorithms and software under the conditions of limited computing power,” Zhai said, underlining its great significance to the future development of AI and large models in China. 


With the advent of the deep learning era, the slow evolution of machine learning theory has increasingly lagged behind rapid technological progress, creating a growing gap between theory and technology. The performance of deep learning, driven by many “heuristic” techniques, remains difficult to explain within the existing theoretical framework, underscoring the urgency of addressing the bottleneck of reconstructing a modern theoretical system for machine learning.


After analyzing the characteristics and limitations of traditional statistical learning theory in machine learning, Meng Deyu, a professor from the School of Mathematics and Statistics at Xi’an Jiaotong University, gave a comprehensive introduction to three major challenges widely observed in modern machine learning technologies, particularly large models. These challenges are: the limited scope of generalization in learning theory, as revealed by the phenomenon of “task generalization ability”; the deviation in generalization trends of learning theory, as exposed by the phenomenon of “intelligence emergence”; and the absence of generalization boundaries in learning theory, as indicated by the “robustness-precision paradox.”


Meng emphasized that with the rapid development of modern engineering technologies, particularly DeepSeek, machine learning capabilities are increasingly benefiting the public. It has become essential for the field of AI to elucidate the mathematical mechanisms behind these phenomena by reconstructing the theoretical foundation of machine learning, thereby transforming it into a discipline that is both theoretically sound and technically effective.


Enhancing competitiveness in AI

DeepSeek has undoubtedly made unique contributions to global AI development. Its breakthroughs in algorithm optimization and model generalization capabilities have not only expanded the boundaries of AI technology but also provided valuable references for technological innovation across various fields. As global competition in AI intensifies, China holds notable advantages in application scenarios, data scale, and policy support. However, the country still faces “stranglehold” challenges in core technologies such as fundamental algorithms, high-end chips, and open-source frameworks.


Enhancing competitiveness in AI is a strategic contest with decisive consequences for national prosperity. Li Fu, deputy dean of the School of Artificial Intelligence at Xidian University, stated that China is advancing toward self-reliance in key areas between 2025 and 2030 through a “four-in-one” independent R&D approach. This strategy focuses on breakthroughs in hardware, foundational progress in software, innovations in algorithms, and the expansion of applications, thereby laying the groundwork for a globally influential AI innovation system.


Intelligent technologies play an irreplaceable role in enhancing performance and efficiency. DeepSeek’s success stems not only from its superior technology but also from its seamless integration with intelligent hardware, creating a powerful ecosystem synergy that accelerates the widespread application of AI across industries. The open-sourcing of AI models offers a glimpse of universal intelligence accessible to all. 


Guo Bin, a professor from the School of Computer Science at Northwestern Polytechnical University, believes that lowering the cost of large-model inference will have a significant positive impact on industry and other fields. He predicted that DeepSeek will inject new vitality into China’s industrial development, especially in overseas markets, where it is expected to drive the global expansion of domestic hardware manufacturers and industries, create new opportunities for global industrial innovation, and contribute to human progress and development.


DeepSeek’s rise to prominence was no accident; it epitomizes the transformation of China’s AI development model—from “exchanging market for technology” to an “innovation-driven” approach, and from achieving “single-point breakthroughs” to tackling challenges through “system engineering.” On this path, China has neither been completely independent nor passively followed others. Instead, while building “technological sovereignty” in key areas, it actively participates in the global innovation network through open-source collaboration, standard-setting, and ecosystem co-construction. Looking ahead, with the advancement of infrastructure projects such as the national supercomputing internet and the “East Data, West Computing” initiative, China is poised to achieve new breakthroughs in the three major AI domains: chips, frameworks, and algorithms. With “limited computing power and unlimited intelligence,” China’s AI sector is presenting its solutions for the intelligent era.


The forum was co-organized by the Computer Federation of Shaanxi Province and the China Computer Federation Chapters, among other organizations. 


Edited by CHEN MIRONG