Explainer: Why is Chinese AI startup DeepSeek shaking up the tech world?

XINHUA
Published Jan. 29, 2025, 11:33 • Jin Bowen, Xiong Maoling, Chen Binjie, Guo Shuang, Lian Yi, Fu Tian, Wu Xiaoling, Zeng Hui
A humanoid robot takes selfies with a visitor at the 7th World Voice Expo in Hefei, east China's Anhui Province, Oct. 24, 2024. (Xinhua/Fu Tian)

"To see the DeepSeek new model, it's super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient."

BEIJING, Jan. 29 (Xinhua) -- The artificial intelligence (AI) community is abuzz with excitement over DeepSeek-R1, a new open-source model developed by the Chinese startup DeepSeek.

Released on Jan. 20, it had soared to the top of the free charts on Apple's App Store by Monday, surpassing OpenAI's ChatGPT.

According to DeepSeek, in tasks such as mathematics, coding and natural language reasoning, the model performs on par with the leading models from heavyweights like OpenAI, but at only a fraction of the cost and computing power its competitors require.

Here's what DeepSeek has done and why it is sending shockwaves through the AI industry.

WHAT IS DEEPSEEK?

Officially known as DeepSeek Artificial Intelligence Fundamental Technology Research Co., Ltd., the firm was founded in July 2023. As an innovative technology startup, DeepSeek is dedicated to developing cutting-edge large language models (LLMs) and related technologies.

Since releasing its first model, "DeepSeek LLM," in January last year, the company has gone through multiple rounds of iteration. In December, the startup launched its open-source LLM "V3", which overtook all of Meta's open-source LLMs and rivaled OpenAI's closed-source GPT-4o, according to U.S. media reports.

The just-released R1 model has achieved an important technological breakthrough -- using pure deep learning methods to allow reasoning capabilities to emerge spontaneously in AI.

Rather than relying on traditional approaches such as Chain-of-Thought (CoT) and Supervised Fine-Tuning (SFT), DeepSeek has distinguished itself in the AI industry by adopting Reinforcement Learning (RL) as a core training method.

While CoT and SFT rely on step-by-step reasoning and huge amounts of labeled data, respectively, RL enables models to learn through interaction and reward mechanisms, making it better suited for complex and dynamic tasks.

The adoption of RL has allowed the startup to enhance its models' reasoning, adaptability and efficiency, setting it apart as a frontrunner in the field.

When queried about the meaning of "DeepSeek," its latest R1 chatbot replied, "The name reflects the company's mission to deeply explore and advance the foundational technologies of AI, aiming to push the boundaries of AI innovation and application."

"BIGGER IS NO LONGER ALWAYS SMARTER"

According to its V3 model technical report, the model's training cost was approximately 5.57 million U.S. dollars, making it the least expensive among LLMs.

Renowned U.S. economist Jeffrey Sachs, a professor and director of the Center for Sustainable Development at Columbia University, told Xinhua that the breakthrough made by DeepSeek "shows the possibility of advanced AI at much lower costs than was widely believed in the United States until yesterday."

DeepSeek-V3 makes it "look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2,048 GPUs for 2 months, $6M)," posted Andrej Karpathy, a founding member of OpenAI, on X.

Compared to other well-known models, DeepSeek achieved an order-of-magnitude reduction in training cost.

The cost is "a stark contrast to the hundreds of millions, if not billions, that U.S. companies typically invest in similar technologies," said Marc Andreessen, a prominent tech investor, describing DeepSeek's R1 as "one of the most amazing breakthroughs" he had ever seen.

The AI industry's development has long relied on piling up computing power. The cost-efficient DeepSeek model may upend the AI landscape.

Praising the DeepSeek-V3 Technical Report as "very nice and detailed," Karpathy said that the report is worth reading through.

U.S. investment bank and financial services provider Morgan Stanley said that DeepSeek demonstrates an alternative path to efficient model training, distinct from the current arms race among hyperscalers, by significantly increasing data quality and improving model architecture.

"Bigger is no longer always smarter," it said.

People visit the exhibition area of Chinese company Shokz during the Consumer Electronics Show (CES) 2025 in Las Vegas, the United States, Jan. 7, 2025. (Photo by Zeng Hui/Xinhua)

OPEN-SOURCE MODEL

"To see the DeepSeek new model, it's super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient," said Microsoft CEO Satya Nadella.

Open source allows researchers, developers, and users to access the model's underlying code and its "weights" -- the parameters that determine how the model processes information -- enabling them to use, modify or enhance the model to suit their needs.

DeepSeek has greatly benefited from open-source principles and, in turn, demonstrates a strong commitment to sharing knowledge and contributing to the collective advancement of technology.

Meta's chief AI scientist Yann LeCun said: "They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it."

"That is the power of open research and open source," LeCun added.

Echoing LeCun, Sachs, the U.S. economist, said, "DeepSeek's business and development model is open source, which is a compelling and successful model for science, technology and business."

DeepSeek's U.S. counterpart, OpenAI, started as an open-source organization but later shifted to a closed-source model. DeepSeek has taken a different path.

Highlighting the importance of fostering collaboration and innovation through open-source principles, Liang Wenfeng, the founder of DeepSeek, said that building a robust technological ecosystem is the priority.

"We won't choose closed-source," Liang said, making the company's stance clear.■
