Building a strong foundation for artificial intelligence data
2025-04-25
"Digital technologies such as 5G, artificial intelligence, and large models are developing rapidly, and the digital industry's business revenue grew 8.2% year-on-year in the first two months of this year," said Xie Shaofeng, Chief Engineer of the Ministry of Industry and Information Technology. He noted that China has formed a complete industrial system covering the basic layer, framework layer, model layer, and application layer. High-quality industry datasets have been built in sectors such as steel and coal, and a number of competitive general-purpose and industry models have been cultivated. Domestic large models have topped the download rankings of mainstream global open-source communities.
The development of artificial intelligence is inseparable from the development and utilization of data resources. High-quality data is the foundation of artificial intelligence applications, providing strong support for both general-purpose and industry models. This year's Government Work Report proposes accelerating the improvement of the data infrastructure system and deepening the development and utilization of data resources, and it also emphasizes support for the widespread application of large models.
As the main arena in which the market value of data elements is realized, enterprises have strong demand for the development and utilization of data resources. According to the Organisation for Economic Co-operation and Development, data flows boost profit growth by about 10% on average across industries, and by as much as 32% in industries such as digital platforms and finance. "We encourage enterprises to fully develop and utilize the data they generate, or lawfully obtain and hold, in the course of production and operations, provided that they do not violate laws and regulations or endanger national security and public interests," said Chen Ronghui, Deputy Director of the National Data Administration.
In recent years, China Telecom has strengthened the aggregation of enterprise data and external data, building an advantage in massive multi-source data, and has achieved notable results in applying that data to enterprise digital transformation. In user service, it uses network resource data, network perception data, and historical complaint data to proactively fix problems that affect customer experience. In anti-fraud identification, it has built an anti-fraud model based on data such as calls, roaming behavior, and terminals to identify potentially fraudulent numbers and handle them promptly. In livelihood services, drawing on operator location data and government public data, it has built emergency notification and livelihood care capabilities and provides big data SMS reminder services to more than 300 enterprise users.
"The deep integration of data elements and artificial intelligence technology is a key driving force for the digital transformation of industries, and data is the core element for training and optimizing artificial intelligence models," said Huang Zhiyong, Deputy General Manager of China Telecom Group Co., Ltd. He explained that, based on a 500,000-hour desensitized audio dataset, China Telecom has created the industry's first large speech model that supports free mixing of 50 dialects. At the same time, in fields such as education and transportation, it has worked with users to build 99 industry datasets covering semantic, speech, image, video, and other data types, and has launched more than 50 industry models. For example, the Star Government Hotline large model, developed from knowledge base and work order data, has been deployed on 12345 citizen service hotlines in Shanghai and other places, raising the first-contact resolution rate by 30% and the dispatch accuracy rate by 15%.
iFlytek, which has also been deeply involved in government affairs for many years, recently upgraded its Spark X1 model again. By integrating more complex types of data from a variety of scenarios, the model generalizes better, and its range of applications has expanded in key industries such as education, healthcare, and justice. In the judicial industry, for example, the large model summarizes case facts and the reasoning behind judgments from the case materials, accurately follows user instructions, analyzes points of dispute in detail, quickly locates key information, and produces accurate output, giving users professional and reliable intelligent support.
Behind the upgrade of iFlytek Spark X1 is a series of technological innovations and conceptual breakthroughs. First, a large-scale, multi-stage reinforcement learning method based on problem difficulty was proposed to improve model performance in complex reasoning, mathematics, code, language understanding, and other scenarios. Second, a training method that mixes fast and slow thinking within a unified model makes full use of the mutual reinforcement of data and lets users deploy and use the model more efficiently and conveniently.
In addition, large models are being adopted at an accelerating pace in industries such as electronics, raw materials, and consumer goods, where they are applied in research and development and design, pilot testing, production and manufacturing, and operations management. In the collection and selection of typical cases, we found that a flat panel display company in Beijing used a large model to generate production schedules with one click, cutting production line scheduling time by 75%.
Xie Shaofeng said that, as a next step, the Ministry of Industry and Information Technology will strengthen its research and development planning for general-purpose and industry large models, accelerate the construction of high-quality datasets in the industrial field, and consolidate the foundation. (New Society)
Editor: He Chuanning  Responsible editor: Su Suiyue
Source: Economic Daily