
How Can China's Intelligent Computing Achieve a "DeepSeek-Style Breakthrough"?

2025-03-31   

On-wafer generative variable-structure computing can open a new path for China to break out of the "process cocoon" of computing chips, creating a systems-engineering route to innovation that achieves "first-class capabilities" with "third-rate materials and second-rate processes". Since the beginning of this year, Chinese large-model companies, represented by DeepSeek, have used "scaled-down" GPU chips to cut the training cost of models with billions of parameters to one tenth that of comparable models, through algorithm optimization, targeted training, and open-source ecosystem collaboration. In doing so they have moved from brute-force accumulation of computing power to efficiency gains generated from within. While the world marvels at this feat of asymmetric innovation, we must also recognize that, where the independent and sustainable development of artificial intelligence (AI) technology and industry is concerned, China has not yet shaken off its dependence on high-end, and even "scaled-down", physical devices such as intelligent computing chips. For the foreseeable future the external environment may grow harsher, and challenges such as routine blockade and containment and supply-chain uncertainty will be hard to avoid. China urgently needs a "DeepSeek-style breakthrough" in intelligent computing power: using beyond-limit innovation to decouple hardware computing-power gains from their current tight binding to advances in process technology. In other words, to be able to compete in AI, China must not only continue the revolution at the algorithm level and break out of the "computing power cocoon", but also overtake by changing lanes, breaking out of the "process technology cocoon" through deep integration of algorithms and the physical layer.
The on-wafer generative architecture, centered on generative variable-structure computing and the Software-Defined System-on-Wafer (SDSoW), offers a new technical path for resolving the mismatch between algorithm models and the hardware that carries computing power, and for sustaining the growth of computing power through software-hardware co-design. The "shoes" of computing architecture need to be fitted to the "feet" of algorithms. Advanced chip manufacturing processes provide higher transistor density, raise computing power per unit of chip area, and give large-model training and inference stronger computing resources. However, under an engineering design paradigm rooted in reductionism, the physical computing-power gains won through arduous iterations of the manufacturing process are hard for software algorithms running on large-scale distributed physical systems to use effectively. There is a structural mismatch between the peak computing power of chips and the system-level benefit delivered to algorithms. In addition, distributed systems are constrained by an "impossible triangle" of large scale, low latency, and high bandwidth. Simply stacking thousands, tens of thousands, or even hundreds of thousands of GPU cards can hardly meet the non-linear growth in computing power that large-model training demands under the scaling law. In short, because the von Neumann architecture separates storage from computation, there is a systematic mismatch between hardware system design (chip process, memory bandwidth, concurrent units, and so on) and the computational characteristics of algorithm models (computational density, data-flow patterns, precision requirements, and so on).
Even when advances in the chip manufacturing process bring performance gains, those gains are heavily discounted by a systems-engineering paradigm that incurs insertion loss at every level. Breaking out of the "process cocoon" and seeking solutions in a higher dimension requires transforming both the traditional rigid computing architecture and the paradigm for its technical and physical implementation. For nearly 80 years, computing has relied on the von Neumann architecture, which consists of a few major components: arithmetic units, controllers, internal memory, and input/output devices. Whether the workload is a complex AI algorithm or a simple data-processing task, it is "hard-packed" into this rigid architecture in the hope of handling ever-changing applications "once and for all". It is like having to walk in size 37 shoes no matter how big your feet are. If the shoes do not fit, you cannot walk quickly: big shoes on small feet make you trip, small shoes on big feet hurt, and the outcome is often "cutting the foot to fit the shoe". To resolve the structural contradiction between rigid-architecture computing power and diverse algorithms, one must draw on the physics principle of solving a problem by raising its dimensionality and introduce a new mechanism: adaptive computing architecture. In 2009, inspired by the mimic octopus, nature's master of camouflage, Chinese scientists first proposed mimic computing, a domain-specific variable-structure computing approach based on software-hardware co-design. The mimic octopus adapts to changing circumstances and can hide in environments such as sandy seabeds or coral reefs; likewise, mimic computing lets the "shoes" better fit the "feet" that walk in them.
In 2018, Turing Award winners and computer architecture pioneers David Patterson and John Hennessy predicted that domain-specific computing architectures co-designed in software and hardware, built on software-hardware co-design languages, would become one of the mainstream directions of the coming decade's golden age of computer architecture. Not long ago, Tesla's Dojo supercomputing project announced its solution for a computing paradigm shift, proposing a hardware architecture that reshapes itself to the task like a Transformer and moving from "algorithms adapted to hardware" to "hardware defined by algorithms". The core of building a first-class system from second-rate components through generative variable-structure computing lies in dynamically reconstructing the computing architecture according to algorithm requirements. This requires the hardware that carries computing power to support software-driven changes in physical structure, greatly improving how efficiently a specific computing structure executes a specific algorithm. SDSoW aims to move computing architecture from a "rigid pipeline" to a "software-moldable" form and to close the loop for generative variable-structure computing from theory to technical and physical implementation, making it possible to build first-class systems from second-rate devices or components. Specifically, SDSoW has five major capabilities. The first is system-level breakthrough capability. SDSoW breaks away from "core device determinism" and abandons the engineering route of "chips, modules, chassis, racks, and systems", in which each level is stacked on the last and adds insertion loss.
Through wafer-level heterogeneous integration, SDSoW deconstructs functions and recombines them on the wafer, reaching functional equivalence with system-level optimization, and uses systems-engineering methods to turn process shortcomings into a secondary rather than primary contradiction. The second is overall efficiency enhancement. Wafer-level high-density interconnects, ultra-short distances, and heterogeneous packaging deliver system-level gains in bandwidth, latency, and power consumption. In an SDSoW system, bandwidth can rise by an order of magnitude, latency can fall by an order of magnitude, power consumption can fall by an order of magnitude, and system efficiency can improve by three orders of magnitude. The third is the integration of general-purpose and specialized computing. Because the wafer-level system hardware is programmable and redefinable, SDSoW functionality and performance can be configured in real time through software or generated by large AI models. On the same physical carrier, generative variable-structure computing can take "one platform, many forms" according to different application requirements or usage scenarios, meeting the special computing-power needs of specialized scenarios while also accommodating the more flexible general-purpose needs within a domain. The fourth is open-source collaboration. By establishing an open-source community for SDSoW and releasing basic interconnect protocols, dynamic controllers, and generative variable-structure computing toolchains, SDSoW can build an ecosystem "led by China with global participation" that uses openness to break monopolies and forms a comparative advantage over the Chiplet route. The fifth is endogenous security capability.
SDSoW can address at the source the new kinds of security challenges that an open ecosystem brings in new domains, introducing an endogenous security architecture that makes the system both open and controllable. Even if the supply chain is insecure, the system can still offer network resilience that is "secure by default, out of the box" under open conditions. In short, on-wafer generative variable-structure computing can open a new path for China to break out of the "process technology cocoon" of computing chips, creating a systems-engineering route to innovation that achieves "first-class capabilities" with "third-rate materials and second-rate processes". By vertically integrating applications and design, algorithms and computing power, it decouples China's computing infrastructure products from their strong dependence on advanced chip manufacturing processes and maximizes the combined benefits of system architecture and process progress. On-wafer generative variable-structure computing can also offer a Chinese solution for making intelligent computing globally accessible. Based on wafer-level integration and packaging, generative variable-structure computing opens new directions for breakthroughs in algorithm architecture, for a revolution in the physical carrier, and for deep coupling between the engineering implementation of algorithms and innovation in the computing paradigm.
At present, breakthroughs in root theory should be pursued as soon as possible, focusing on underlying theories such as on-wafer thermodynamics, heterogeneous integration theory, and the mathematical description of reconfigurable architectures. Breakthroughs in root technology should be continuously strengthened, covering key technologies such as wafer-level bonding, 3D interconnection, wafer-level operating systems, and languages and compilers for generative variable-structure computing, so as to achieve an independent ecosystem spanning architecture design, physical implementation, and application. The cultivation of root industries should be promoted on multiple fronts: driven by emerging market demand in areas such as intelligent driving, embodied intelligence, industrial digital twins, and AI all-in-one machines, and through a three-in-one model of "open scenarios + system innovation + ecosystem aggregation", China can break through the "small yard, high fence" and "containment and blockade" with beyond-limit innovation and unconventional measures, forging a path of technological equity and sustainable industrial development with Chinese characteristics. (New Society)

Editor: Ou Xiaoling  Responsible editor: Shu Hua

Source: Science and Technology Daily

