
Deep Lessons from DeepSeek

While the US blinked and the rest of the world woke up to a new force in GenAI, what lessons does this Black Swan event hold for countries like India? Here are five “deep” thoughts to consider.

DQINDIA Online

Here’s a short story with a twist: Mr Liang Wenfeng, the 40-year-old founder of DeepSeek, is a math enthusiast who graduated with a degree in AI from Zhejiang University in Hangzhou, Zhejiang Province, China. In 2015, he co-founded High-Flyer, a quantitative hedge fund that uses mathematical modeling, statistical analysis, and computer algorithms to integrate AI into trading strategies. Under his leadership, the company’s assets surged tenfold, from 1 billion yuan (US$138 million) in 2016 to 10 billion yuan by 2019. The twist? The company acquired more than 10,000 Nvidia GPUs just before US AI chip sanctions on China took effect on December 2, 2024.


Here’s another: In 2019, Gregory Zuckerman, a seasoned investigative reporter for The Wall Street Journal, published “The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution,” which chronicled Jim Simons’ pioneering data-driven, algorithmic approach to outperforming the market. The twist? The Chinese translation of the book, released in 2021, had a preface by Mr Liang. In it, he detailed how Jim Simons significantly influenced his work and his beliefs about using mathematics to analyze financial data.

There is no doubt the DeepSeek tsunami has rattled the stock markets like never before. Shares of Nvidia, the poster child of the AI boom, sank 17% on January 27, wiping US$593 billion off the chipmaker’s market value, a record one-day loss for any US-listed company. Broadcom ended down 17.4% in the US, Google parent Alphabet was down 4.2%, and Microsoft fell 2.1%.



While the US blinked and the rest of the world woke up to a new force in GenAI, what lessons does this Black Swan event hold for countries like India? Here are five “deep” thoughts to consider, in alphabetical order:

Adapt Open-Source

Adapting open-source tech, as exemplified by DeepSeek, is crucial for fostering innovation and ensuring security in the rapidly evolving tech landscape. While open-source platforms offer unparalleled opportunities for collaboration and cost-efficiency, blindly accepting them without customization can expose organizations to vulnerabilities and inefficiencies.



DeepSeek’s approach of tailoring open-source solutions to meet specific needs and security standards highlights the importance of strategic adaptation. This method not only enhances performance and reliability but also mitigates risks associated with generic implementations. By integrating and refining open-source technologies, companies can leverage the full potential of these tools without compromising security and operational integrity.

DeepSeek’s journey exemplifies the power of open-source adaptation. While the R1 model garnered significant attention, DeepSeek continued to innovate by launching Janus-Pro, a compact image-generation model designed to rival OpenAI’s DALL·E 3, highlighting the importance of not just adopting open-source, but refining and advancing it. On January 29, the first day of the Lunar New Year, Alibaba introduced the latest iteration of its Qwen 2.5 open-source model. These examples illustrate how open-source development, when thoughtfully adapted, can drive significant advancements, as well as maintain a competitive edge for companies in AI.



The flip side: Can data shared on DeepSeek pose a security risk? The US, Taiwan, South Korea, Italy, France, and Belgium have raised security concerns. That is because DeepSeek states in its privacy terms that it collects and stores data in data centers in China and that any disputes that arise would be governed under Chinese law. The key question: Do open-source AI models pose a lower security risk since their code and algorithms are continuously monitored by thousands of impartial contributors, unlike proprietary models?

Boost Bootstrapping


DeepSeek has displayed remarkable ingenuity with its foundational V3 model, positioning it as a formidable competitor to OpenAI’s GPT-4. While OpenAI reportedly invested US$100 million and used 25,000 of Nvidia’s top-tier H100 chips for GPT-4, DeepSeek claims to have achieved comparable performance with a mere US$5.6 million and 2,000 H800 chips, which Nvidia designed specifically to comply with US export controls.
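A quick back-of-the-envelope calculation puts these figures side by side. The sketch below uses only the numbers quoted above, which are reported or claimed rather than audited; the variable names are illustrative.

```python
# Figures as quoted in the article: reported (GPT-4) and claimed (DeepSeek V3).
gpt4_cost_usd = 100_000_000     # reported OpenAI spend on GPT-4
deepseek_cost_usd = 5_600_000   # DeepSeek's claimed spend on V3
gpt4_chips = 25_000             # Nvidia H100 chips
deepseek_chips = 2_000          # Nvidia H800 chips

cost_ratio = gpt4_cost_usd / deepseek_cost_usd
chip_ratio = gpt4_chips / deepseek_chips

print(f"Cost ratio: {cost_ratio:.1f}x")  # Cost ratio: 17.9x
print(f"Chip ratio: {chip_ratio:.1f}x")  # Chip ratio: 12.5x
```

If the claims hold, that is roughly an 18-fold difference in spending and a 12.5-fold difference in chip count, before accounting for the H800 being a weaker chip than the H100.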

This feat stemmed from DeepSeek’s innovative engineering approach. Instead of constructing a monolithic system, the company segmented its AI into specialized “experts” tailored for distinct tasks. Its true breakthrough came with the development of the R1 reasoning model.
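The “experts” idea is generally known as a mixture-of-experts design: a small router scores the experts for each input and only the top few are run. The following is a minimal toy sketch of that routing step, not DeepSeek’s actual implementation. The function names, the scaling “experts,” and the random router weights are all hypothetical and chosen only for illustration.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical stand-in for an expert network: it just scales its input.
    return lambda x: [scale * v for v in x]

def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend their outputs."""
    # One router score per expert: dot product of that expert's weight row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen

random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(scale=i + 1) for i in range(num_experts)]
router_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]

x = [0.5, -1.0, 2.0, 0.1]
y, used = moe_forward(x, router_weights, experts, top_k=2)
print(f"experts used: {used} (2 of {num_experts})")
```

The point of the design is that compute per input depends on `top_k`, not on the total number of experts, so a model can grow its overall capacity without a matching rise in the cost of each query.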

For its training, DeepSeek leveraged open-source foundational models such as Meta’s Llama and Alibaba’s Qwen. It then applied a key technique that OpenAI appears to have used in its o1 model: enabling the AI to mimic human reasoning one step at a time, self-correcting as it progresses, with the flexibility to “think harder” when required.



Indian companies could draw lessons from DeepSeek’s bootstrapping methods: emphasize modular design, leverage open-source resources, and focus on incremental advancements (like the Japanese Kaizen methodology) instead of betting solely on big-ticket projects. This strategy can foster innovation and ensure sustainability and resilience in a rapidly evolving tech landscape.

Call to Collaborate


China’s early economic success was driven by export-led manufacturing and state-owned heavy industry. The cheese has since moved. In the late 1990s, state-owned enterprises accounted for more than 50% of China’s industrial output; today, they produce 30%. Private firms have become the primary drivers of job creation and efficiency gains, contributing over 50% of tax revenue and 60% of GDP.

New policies are helping the private sector by providing greater access to scientific infrastructure and financing channels. Technological self-reliance is a strategic priority for China. At a recent meeting with entrepreneurs, Chinese Premier Li Qiang stressed the need to “concentrate efforts to break through key core technologies.” The goal is to maintain strategic state oversight while leveraging the dynamism of the private sector.

DeepSeek’s success exemplifies this vision. The company’s “local-first” approach, employing PhDs from Chinese universities, aligns with Beijing’s broader strategy to reduce reliance on foreign tech while fostering domestic innovation. By leveraging open-source models and innovative engineering, DeepSeek has shown how private firms can thrive under this new paradigm.

India is thinking along similar lines. On January 30, India’s IT Minister Mr Ashwini Vaishnaw announced that six Indian entities might release their foundational models this year. “The foundational models made in India will be able to compete with the best of the best in the world,” he said. Additionally, India has secured 18,693 GPUs to power AI R&D. The government may launch a portal to allow startups and researchers to access this computing power at prices lower than global benchmark rates.

Countries like Singapore focus on winning by collaborating. That means working closely with academia (for research, development, closed-loop trials), industry, including small and medium enterprises (for marketing, sales, business development), government bodies (for co-funding R&D, advocacy, opening international doors) and even multinational corporations (for embedding in larger systems, opening sales channels, co-branding). By adopting a similar collaborative approach, Indian firms can drive innovation and build a resilient, self-sustaining ecosystem capable of thriving in a dynamic global market.

Depend on Data

In GenAI, parameters are the internal variables of an AI model that it learns from the training data. These parameters define the behavior of the model and determine how it processes input data to produce output. More parameters generally mean the model can capture more complex patterns in the data.

DeepSeek, however, achieved competitive performance with a low count of active parameters, through innovative engineering and efficient use of resources. Its R1 model activates only 37 billion parameters per forward pass, making it more resource-efficient than models that activate far more parameters for each query.
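The gap between the parameters a model stores and the parameters it activates can be illustrated with simple arithmetic. The sketch below first uses a hypothetical toy model, then applies the same ratio to R1. The 37 billion figure is from this article; the 671 billion total is not, and is assumed here from DeepSeek’s public model documentation.

```python
def moe_param_counts(shared_b, per_expert_b, num_experts, top_k):
    """Return (total, active) parameter counts, in billions, for a toy expert-based model."""
    total = shared_b + per_expert_b * num_experts
    active = shared_b + per_expert_b * top_k
    return total, active

# Hypothetical toy model: 10B shared parameters, 16 experts of 5B each, 2 experts per input.
toy_total, toy_active = moe_param_counts(shared_b=10, per_expert_b=5, num_experts=16, top_k=2)
print(f"toy model: {toy_active}B active of {toy_total}B total")  # toy model: 20B active of 90B total

# R1: 37B active per forward pass (from the article).
# Assumption: 671B total parameters, a figure not stated in the article.
r1_active_b = 37
r1_total_b = 671
r1_fraction = r1_active_b / r1_total_b
print(f"R1 active fraction: {r1_fraction:.1%}")  # R1 active fraction: 5.5%
```

Under that assumption, only about one in eighteen of the model’s parameters does work on any given forward pass, which is where the efficiency gain comes from.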

The flip side: Questions are being raised whether the firm underplayed the number of high-end chips it used to develop the model, even if others argue its claims are plausible. “There is also speculation that DeepSeek has trained its models by studying the results of American ones, a process known as distillation,” The Economist noted in its recent commentary. “OpenAI has said it has evidence that points to DeepSeek distilling its models, in violation of its terms of service.”
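For readers unfamiliar with the term, distillation in its classic textbook form trains a small “student” model to match the output probabilities of a larger “teacher.” The sketch below shows only that generic loss on made-up numbers; it says nothing about what DeepSeek did or did not do. When only a model’s text output is available, a student would typically be fine-tuned on that generated text instead. All names and values here are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Made-up logits over three candidate next words.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student))  # True
```

Training drives this loss toward zero, at which point the student reproduces the teacher’s preferences over outputs without ever seeing the teacher’s internal weights.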

The key to algorithmic excellence is not more raw data, but more quality data. “Data is not just binary bits; it must be curated and vetted to build robust models like ChatGPT,” Dr PJ Narayanan, director of the International Institute of IT in Hyderabad, was quoted as saying in The Straits Times, Singapore, on February 1. He emphasized the need to train models on Indian datasets to ensure linguistic, cultural, and contextual relevance.

As the head of a 2023 government AI expert group, Dr Narayanan stressed the need for significant investments in hardware resources to boost AI expertise and cautioned against relying on frugal innovation. “Frugal innovation is good, but that cannot be the primary motive,” he said. “India needs significant investments in hardware resources to make rapid progress, while being creative about using resources effectively.”

Efficiency, not Expense

DeepSeek’s latest R1 model, released on January 20, was reportedly built with just US$6 million in raw computing power and inferior AI chips, a fraction of the money and resources spent by firms like OpenAI and Google. Silicon Valley veteran Marc Andreessen has called it “AI’s Sputnik moment.”

DeepSeek claims on its WeChat account that its R1 model is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, a fraction of what US companies charge for their premium offerings. “Even if OpenAI is closed source, it cannot stop others from catching up,” Mr Liang was quoted as saying by the Chinese portal 36Kr. “Open source is like a cultural practice, rather than a business practice.”

He also wants to keep prices low for users. “Our principle is neither to sell at a loss nor to seek excessive profits,” Mr Liang told CCTV News. “The current pricing allows for a modest profit margin above our costs. Grabbing users wasn’t our primary goal. We reduced prices because we believe that both AI and API (application programming interface) services should be affordable and accessible to everyone.”

How are companies reacting to this? On January 28, Bloomberg News reported that Microsoft and OpenAI were investigating whether a group linked to DeepSeek had obtained data output from OpenAI’s technology without authorization. But that has not stopped Microsoft from offering DeepSeek’s model to customers. On January 29, Microsoft said it had added R1 to its Azure AI Foundry, a repository of more than 1,800 models that companies can use to design and manage AI programs.

The bottom line: The GenAI revolution has just begun. Much like the dawn of the Internet era, the possibilities are vast and endless. It took the World Wide Web 84 months to reach 100 million users. ChatGPT achieved this milestone in just two months. Companies and countries cannot afford to ignore the transformative potential of GenAI, as it promises to reshape industries, redefine work, and drive unprecedented innovation. Embracing this revolution is not an option but a necessity for staying competitive in the rapidly evolving digital landscape.


By Raju Chellam

Raju Chellam is a former Editor of Dataquest and is currently based in Singapore, where he is the Editor-in-Chief of the AI Ethics & Governance Body of Knowledge, and Chair of Cloud & Data Standards.

maildqindia@cybermedia.co.in
