DeepSeek, a Chinese rival to OpenAI and other US AI companies

There’s been quite the kerfuffle over DeepSeek during the last few days. This January 27, 2025 article by Alexandra Mae Jones for Canadian Broadcasting Corporation (CBC) news online was my introduction to DeepSeek AI, Note: A link has been removed,

There’s a new player in AI on the world stage: DeepSeek, a Chinese startup that’s throwing tech valuations into chaos and challenging U.S. dominance in the field with an open-source model that they say they developed for a fraction of the cost of competitors.

DeepSeek’s free AI assistant — which by Monday [January 27, 2025] had overtaken rival ChatGPT to become the top-rated free application on Apple’s App Store in the United States — offers the prospect of a viable, cheaper AI alternative, raising questions on the heavy spending by U.S. companies such as Apple and Microsoft, amid a growing investor push for returns.

U.S. stocks dropped sharply on Monday [January 27, 2025], as the surging popularity of DeepSeek sparked a sell-off in U.S. chipmakers.

“[DeepSeek] performs as well as the leading models in Silicon Valley and in some cases, according to their claims, even better,” Sheldon Fernandez, co-founder of DarwinAI, told CBC News. “But they did it with a fractional amount of the resources is really what is turning heads in our industry.”

What is DeepSeek?

Little is known about the small Hangzhou startup behind DeepSeek, which was founded out of a hedge fund in 2023, but largely develops open-source AI models. 

Its researchers wrote in a paper last month that the DeepSeek-V3 model, launched on Jan. 10 [2025], cost less than $6 million US to develop and uses less data than competitors, running counter to the assumption that AI development will eat up increasing amounts of money and energy. 

Some analysts are skeptical about DeepSeek’s $6 million claim, pointing out that this figure only covers computing power. But Fernandez said that even if you triple DeepSeek’s cost estimates, it would still cost significantly less than its competitors. 

The open source release of DeepSeek-R1, which came out on Jan. 20 [2025] and uses DeepSeek-V3 as its base, also means that developers and researchers can look at its inner workings, run it on their own infrastructure and build on it, although its training data has not been made available. 

“Instead of paying OpenAI $20 a month or $200 a month for the latest advanced versions of these models, [people] can really get these types of features for free. And so it really upends a lot of the business model that a lot of these companies were relying on to justify their very high valuations.”

A key difference between DeepSeek’s AI assistant, R1, and other chatbots like OpenAI’s ChatGPT is that DeepSeek lays out its reasoning when it answers prompts and questions, something developers are excited about. 

“The dealbreaker is the access to the raw thinking steps,” Elvis Saravia, an AI researcher and co-founder of the U.K.-based AI consulting firm DAIR.AI, wrote on X, adding that the response quality was “comparable” to OpenAI’s latest reasoning model, o1.

U.S. dominance in AI challenged

One of the reasons DeepSeek is making headlines is because its development occurred despite U.S. actions to keep Americans at the top of AI development. In 2022, the U.S. curbed exports of computer chips to China, hampering their advanced supercomputing development.

The latest AI models from DeepSeek are widely seen to be competitive with those of OpenAI and Meta, which rely on high-end computer chips and extensive computing power.

Christine Mui in a January 27, 2025 article for Politico notes the stock ‘crash’ taking place while focusing on the US policy implications, Note: Links set by Politico have been removed while I have added one link

A little-known Chinese artificial intelligence startup shook the tech world this weekend by releasing an OpenAI-like assistant, which shot to the No.1 ranking on Apple’s app store and caused American tech giants’ stocks to tumble.

From Washington’s perspective, the news raised an immediate policy alarm: It happened despite consistent, bipartisan efforts to stifle AI progress in China.

In tech terms, what freaked everyone out about DeepSeek’s R1 model is that it replicated — and in some cases, surpassed — the performance of OpenAI’s cutting-edge o1 product across a host of performance benchmarks, at a tiny fraction of the cost.

The business takeaway was straightforward. DeepSeek’s success shows that American companies might not need to spend nearly as much as expected to develop AI models. That both intrigues and worries investors and tech leaders.

The policy implications, though, are more complex. Washington’s rampant anxiety about beating China has led to policies that the industry has very mixed feelings about.

On one hand, most tech firms hate the export controls that stop them from selling as much to the world’s second-largest economy, and force them to develop new products if they want to do business with China. If DeepSeek shows those rules are pointless, many would be delighted to see them go away.

On the other hand, anti-China, protectionist sentiment has encouraged Washington to embrace a whole host of industry wishlist items, from a lighter-touch approach to AI rules to streamlined permitting for related construction projects. Does DeepSeek mean those, too, are failing? Or does it trigger a doubling-down?

DeepSeek’s success truly seems to challenge the belief that the future of American AI demands ever more chips and power. That complicates Trump’s interest in rapidly building out that kind of infrastructure in the U.S.

Why pour $500 billion into the Trump-endorsed “Stargate” mega project [announced by Trump on January 21, 2025] — and why would the market reward companies like Meta that spend $65 billion in just one year on AI — if DeepSeek claims it only took $5.6 million and second-tier Nvidia chips to train one of its latest models? (U.S. industry insiders dispute the startup’s figures and claim they don’t tell the full story, but even at 100 times that cost, it would be a bargain.)

Tech companies, of course, love the recent bloom of federal support, and it’s unlikely they’ll drop their push for more federal investment to match anytime soon. Marc Andreessen, a venture capitalist and Trump ally, argued today that DeepSeek should be seen as “AI’s Sputnik moment,” one that raises the stakes for the global competition.

That would strengthen the case that some American AI companies have been pressing for the new administration to invest government resources into AI infrastructure (OpenAI), tighten restrictions on China (Anthropic) and ease up on regulations to ensure their developers build “artificial general intelligence” before their geopolitical rivals.

The British Broadcasting Corporation’s (BBC) Peter Hoskins & Imran Rahman-Jones provided a European perspective and some additional information in their January 27, 2025 article for BBC news online, Note: Links have been removed,

US tech giant Nvidia lost over a sixth of its value after the surging popularity of a Chinese artificial intelligence (AI) app spooked investors in the US and Europe.

DeepSeek, a Chinese AI chatbot reportedly made at a fraction of the cost of its rivals, launched last week but has already become the most downloaded free app in the US.

AI chip giant Nvidia and other tech firms connected to AI, including Microsoft and Google, saw their values tumble on Monday [January 27, 2025] in the wake of DeepSeek’s sudden rise.

In a separate development, DeepSeek said on Monday [January 27, 2025] it will temporarily limit registrations because of “large-scale malicious attacks” on its software.

The DeepSeek chatbot was reportedly developed for a fraction of the cost of its rivals, raising questions about the future of America’s AI dominance and the scale of investments US firms are planning.

DeepSeek is powered by the open source DeepSeek-V3 model, which its researchers claim was trained for around $6m – significantly less than the billions spent by rivals.

But this claim has been disputed by others in AI.

The researchers say they use already existing technology, as well as open source code – software that can be used, modified or distributed by anybody free of charge.

DeepSeek’s emergence comes as the US is restricting the sale of the advanced chip technology that powers AI to China.

To continue their work without steady supplies of imported advanced chips, Chinese AI developers have shared their work with each other and experimented with new approaches to the technology.

This has resulted in AI models that require far less computing power than before.

It also means that they cost a lot less than previously thought possible, which has the potential to upend the industry.

After DeepSeek-R1 was launched earlier this month, the company boasted of “performance on par with” one of OpenAI’s latest models when used for tasks such as maths, coding and natural language reasoning.

In Europe, Dutch chip equipment maker ASML ended Monday’s trading with its share price down by more than 7% while shares in Siemens Energy, which makes hardware related to AI, had plunged by a fifth.

“This idea of a low-cost Chinese version hasn’t necessarily been forefront, so it’s taken the market a little bit by surprise,” said Fiona Cincotta, senior market analyst at City Index.

“So, if you suddenly get this low-cost AI model, then that’s going to raise concerns over the profits of rivals, particularly given the amount that they’ve already invested in more expensive AI infrastructure.”

Singapore-based technology equity adviser Vey-Sern Ling told the BBC it could “potentially derail the investment case for the entire AI supply chain”.

Who founded DeepSeek?

The company was founded in 2023 by Liang Wenfeng in Hangzhou, a city in southeastern China.

The 40-year-old, an information and electronic engineering graduate, also founded the hedge fund that backed DeepSeek.

He reportedly built up a store of Nvidia A100 chips, now banned from export to China.

Experts believe this collection – which some estimates put at 50,000 – led him to launch DeepSeek, by pairing these chips with cheaper, lower-end ones that are still available to import.

Mr Liang was recently seen at a meeting between industry experts and the Chinese premier Li Qiang.

In a July 2024 interview with The China Academy, Mr Liang said he was surprised by the reaction to the previous version of his AI model.

“We didn’t expect pricing to be such a sensitive issue,” he said.

“We were simply following our own pace, calculating costs, and setting prices accordingly.”

A January 28, 2025 article by Daria Solovieva for salon.com covers much the same territory as the others and includes a few details about security issues,

The pace at which U.S. consumers have embraced DeepSeek is raising national security concerns similar to those surrounding TikTok, the social media platform that faces a ban unless it is sold to a non-Chinese company.

The U.S. Supreme Court this month upheld a federal law that requires TikTok’s sale. The Court sided with the U.S. government’s argument that the app can collect and track data on its 170 million American users. President Donald Trump has paused enforcement of the ban until April to try to negotiate a deal.

But “the threat posed by DeepSeek is more direct and acute than TikTok,” Luke de Pulford, co-founder and executive director of non-profit Inter-Parliamentary Alliance on China, told Salon.

DeepSeek is a fully Chinese company and is subject to Communist Party control, unlike TikTok which positions itself as independent from parent company ByteDance, he said. 

“DeepSeek logs your keystrokes, device data, location and so much other information and stores it all in China,” de Pulford said. “So you’ll never know if the Chinese state has been crunching your data to gain strategic advantage, and DeepSeek would be breaking the law if they told you.”  

I wonder if AI companies in other countries also log keystrokes, etc. Is it theoretically possible that one of those governments or their government agencies could gain access to your data? It’s obvious in China but people in other countries may have the same issues.

Censorship: DeepSeek and ChatGPT

Anis Heydari’s January 28, 2025 article for CBC news online reveals some surprising results from a head to head comparison between DeepSeek and ChatGPT,

The Chinese-made AI chatbot DeepSeek may not always answer some questions about topics that are often censored by Beijing, according to tests run by CBC News and The Associated Press, and is providing different information than its U.S.-owned competitor ChatGPT.

The new, free chatbot has sparked discussions about the competition between China and the U.S. in AI development, with many users flocking to test it. 

But experts warn users should be careful with what information they provide to such software products.

It is also “a little bit surprising,” according to one researcher, that topics which are often censored within China are seemingly also being restricted elsewhere.

“A lot of services will differentiate based on where the user is coming from when deciding to deploy censorship or not,” said Jeffrey Knockel, who researches software censorship and surveillance at the Citizen Lab at the University of Toronto’s Munk School of Global Affairs & Public Policy.

“With this one, it just seems to be censoring everyone.”

Both CBC News and The Associated Press posed questions to DeepSeek and OpenAI’s ChatGPT, with mixed and differing results.

For example, DeepSeek seemed to indicate an inability to answer fully when asked “What does Winnie the Pooh mean in China?” For many Chinese people, the Winnie the Pooh character is used as a playful taunt of President Xi Jinping, and social media searches about that character were previously, briefly banned in China. 

DeepSeek said the bear is a beloved cartoon character that is adored by countless children and families in China, symbolizing joy and friendship.

Then, abruptly, it added the Chinese government is “dedicated to providing a wholesome cyberspace for its citizens,” and that all online content is managed under Chinese laws and socialist core values, with the aim of protecting national security and social stability.

CBC News was unable to produce this response. DeepSeek instead said “some internet users have drawn comparisons between Winnie the Pooh and Chinese leaders, leading to increased scrutiny and restrictions on the character’s imagery in certain contexts,” when asked the same question on an iOS app on a CBC device in Canada.

Asked if Taiwan is a part of China — another touchy subject — it [DeepSeek] began by saying the island’s status is a “complex and sensitive issue in international relations,” adding that China claims Taiwan, but that the island itself operates as a “separate and self-governing entity” which many people consider to be a sovereign nation.

But as that answer was being typed out, for both CBC and the AP, it vanished and was replaced with: “Sorry, that’s beyond my current scope. Let’s talk about something else.”

… Brent Arnold, a data breach lawyer in Toronto, says there are concerns about DeepSeek, which explicitly says in its privacy policy that the information it collects is stored on servers in China.

That information can include the type of device used, user “keystroke patterns,” and even “activities on other websites and apps or in stores, including the products or services you purchased, online or in person” depending on whether advertising services have shared those with DeepSeek.

“The difference between this and another AI company having this is now, the Chinese government also has it,” said Arnold.

While much, if not all, of the data DeepSeek collects is the same as that of U.S.-based companies such as Meta or Google, Arnold points out that — for now — the U.S. has checks and balances if governments want to obtain that information.

“With respect to America, we assume the government operates in good faith if they’re investigating and asking for information, they’ve got a legitimate basis for doing so,” he said. 

Right now, Arnold says it’s not accurate to compare Chinese and U.S. authorities in terms of their ability to take personal information. But that could change.

“I would say it’s a false equivalency now. But in the months and years to come, we might start to say you don’t see a whole lot of difference in what one government or another is doing,” he said.

Graham Fraser’s January 28, 2025 article comparing DeepSeek to the others (OpenAI’s ChatGPT and Google’s Gemini) for BBC news online took a different approach,

Writing Assistance

When you ask ChatGPT what the most popular reasons to use ChatGPT are, it says that assisting people to write is one of them.

From gathering and summarising information in a helpful format to even writing blog posts on a topic, ChatGPT has become an AI companion for many across different workplaces.

As a proud Scottish football [soccer] fan, I asked ChatGPT and DeepSeek to summarise the best Scottish football players ever, before asking the chatbots to “draft a blog post summarising the best Scottish football players in history”.

DeepSeek responded in seconds, with a top ten list – Kenny Dalglish of Liverpool and Celtic was number one. It helpfully summarised which position the players played in, their clubs, and a brief list of their achievements.

DeepSeek also detailed two non-Scottish players – Rangers legend Brian Laudrup, who is Danish, and Celtic hero Henrik Larsson. For the latter, it added “although Swedish, Larsson is often included in discussions of Scottish football legends due to his impact at Celtic”.

For its subsequent blog post, it did go into detail of Laudrup’s nationality before giving a succinct account of the careers of the players.

ChatGPT’s answer to the same question contained many of the same names, with “King Kenny” once again at the top of the list.

Its detailed blog post briefly and accurately went into the careers of all the players.

It concluded: “While the game has changed over the decades, the impact of these Scottish greats remains timeless.” Indeed.

For this fun test, DeepSeek was certainly comparable to its best-known US competitor.

Coding

Brainstorming ideas

Learning and research

Steaming ahead

The tasks I set the chatbots were simple but they point to something much more significant – the winner of the so-called AI race is far from decided.

For all the vast resources US firms have poured into the tech, their Chinese rival has shown their achievements can be emulated.

Reception from the science community

Days before the news outlets discovered DeepSeek, the company published a paper about its Large Language Models (LLMs) and its new chatbot on arXiv. Here’s a little more information,

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

[over 100 authors are listed]

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.

Cite as: arXiv:2501.12948 [cs.CL]
(or arXiv:2501.12948v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2501.12948

Submission history

From: Wenfeng Liang [view email]
[v1] Wed, 22 Jan 2025 15:19:35 UTC (928 KB)

You can also find a PDF version of the paper here or another online version here at Hugging Face.

As for the science community’s response, the title of Elizabeth Gibney’s January 23, 2025 article “China’s cheap, open AI model DeepSeek thrills scientists” for Nature says it all, Note: Links have been removed,

A Chinese-built large language model called DeepSeek-R1 is thrilling scientists as an affordable and open rival to ‘reasoning’ models such as OpenAI’s o1.

These models generate responses step-by-step, in a process analogous to human reasoning. This makes them more adept than earlier language models at solving scientific problems and could make them useful in research. Initial tests of R1, released on 20 January, show that its performance on certain tasks in chemistry, mathematics and coding is on par with that of o1 — which wowed researchers when it was released by OpenAI in September.

“This is wild and totally unexpected,” Elvis Saravia, an AI researcher and co-founder of the UK-based AI consulting firm DAIR.AI, wrote on X.

R1 stands out for another reason. DeepSeek, the start-up in Hangzhou that built the model, has released it as ‘open-weight’, meaning that researchers can study and build on the algorithm. Published under an MIT licence, the model can be freely reused but is not considered fully open source, because its training data has not been made available.

“The openness of DeepSeek is quite remarkable,” says Mario Krenn, leader of the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. By comparison, o1 and other models built by OpenAI in San Francisco, California, including its latest effort, o3, are “essentially black boxes”, he says.

DeepSeek hasn’t released the full cost of training R1, but it is charging people using its interface around one-thirtieth of what o1 costs to run. The firm has also created mini ‘distilled’ versions of R1 to allow researchers with limited computing power to play with the model. An “experiment that cost more than £300 with o1, cost less than $10 with R1,” says Krenn. “This is a dramatic difference which will certainly play a role in its future adoption.”

The kerfuffle has died down for now.