DeepSeek Founder’s Exclusive Interviews – Origins

We are honored to present English translations of two exclusive interviews with DeepSeek CEO Liang Wenfeng. They are the only in-depth reports to date on DeepSeek’s early development and vision, and they vividly demonstrate the power of open source.

This article is the first in a series of two interviews. Originally published in May 2023, it is titled, “The Wild Ride of High-Flyer Quant: The Path to Large Models for an Under-the-Radar AI Giant.” This interview takes you behind the scenes of DeepSeek’s parent company, the leading Chinese quantitative hedge fund, High-Flyer Quant (幻方). You’ll discover how High-Flyer Quant, in a disruptive move, transitioned from the financial sector to the realm of artificial intelligence and large language models.

The article provides a deep dive into High-Flyer Quant’s prescient investment in 10,000 GPUs as early as 2021, a move that predated many major technology companies. This not only demonstrates High-Flyer Quant’s substantial resources but also highlights its unwavering commitment to, and long-term investment in, Artificial General Intelligence (AGI). As they stated when announcing their large model initiative, echoing the words of filmmaker François Truffaut: “Be audaciously ambitious, and radically genuine.” Unlike those merely chasing the ChatGPT trend, High-Flyer Quant and DeepSeek have consistently striven to explore the uncharted territories of AGI, pursuing genuine breakthroughs in artificial intelligence.

Through this interview, you will appreciate the technological vision of DeepSeek founder, Liang Wenfeng, and High-Flyer Quant’s unique culture of innovation and talent development. These factors laid a solid foundation for DeepSeek’s remarkable rise in the field of large language models.

We highly recommend reading this interview to gain an in-depth understanding of DeepSeek’s origins, its extraordinary journey, and its relentless pursuit of technological innovation.

The Wild Ride of High-Flyer Quant: The Path to Large Models for an Under-the-Radar AI Giant

By: Yu Lili. Edited by: Liu Jing. May 25, 2023.

In the rush of large model competitions, High-Flyer Quant stands out as one of the most unconventional players.

This game is destined for a select few. Many start-ups have reconsidered their directions or contemplated exiting after major enterprises entered the arena, yet this quantitative fund has forged ahead independently.

In May, High-Flyer Quant established a new independent organization for large models, named “DeepSeek,” with a focus on developing truly human-level artificial intelligence. Their goal is not merely to replicate ChatGPT but to research and uncover more of the unknown mysteries of Artificial General Intelligence (AGI).

Moreover, in this field, highly dependent on scarce talent, High-Flyer Quant seeks to assemble a team of committed individuals, leveraging what they consider their greatest asset: collective curiosity.

In quantitative finance, High-Flyer Quant is a “top fund” with a scale exceeding 100 billion yuan, but its newfound attention in the AI wave is rather dramatic.

When the shortage of high-performance GPU chips among Chinese cloud providers emerged as the most immediate bottleneck hindering the emergence of generative AI in China, a report by Caijing Eleven revealed that fewer than five Chinese enterprises possess over 10,000 GPU units. Alongside several leading tech giants, this exclusive group includes a quantitative fund company named High-Flyer Quant. Typically, owning 10,000 NVIDIA A100 chips is considered the threshold for self-training large models.

Although rarely examined from an AI angle, this company is already an invisible AI titan: In 2019, High-Flyer Quant founded an AI company, investing nearly 200 million yuan in a self-developed deep learning platform, “Fire-Flyer 1,” equipped with 1,100 GPUs; two years later, it increased its investment to 1 billion yuan for “Fire-Flyer 2,” which is equipped with about 10,000 NVIDIA A100 graphics cards.

This means that, purely in terms of computational power, High-Flyer Quant secured its entry ticket to building something like ChatGPT even earlier than many major firms.

Nevertheless, large models demand substantial computational power, algorithms, and data: an initial investment of 50 million US dollars, with each training run costing tens of millions of US dollars. Companies without billions of US dollars in capital struggle to keep pace. Despite these challenges, High-Flyer Quant remains optimistic. Founder Liang Wenfeng shared, “The key is our desire and ability to do this, making us one of the most suitable candidates.”

This mysterious optimism mainly arises from High-Flyer Quant’s unique growth path.

Quantitative investment was imported from the U.S., so most founders of China’s top quantitative funds have backgrounds in American or European hedge funds. High-Flyer Quant is the exception: it was founded entirely by a local team and grew through self-exploration.

In 2021, only six years after its founding, High-Flyer Quant achieved a scale of 100 billion yuan and was recognized as one of the “Big Four Quantitative Giants.”

Emerging from an outsider’s perspective, High-Flyer Quant consistently disrupts. Several industry insiders noted that High-Flyer Quant “always uses a novel approach in R&D, products, and sales to penetrate the industry.”

A founder of another top quantitative fund commented that over the years, High-Flyer Quant has “not followed conventional paths,” instead doing things “in their way.” Even when the approach is a bit unconventional or controversial, “they dare speak openly and act on their ideas.”

Internally, High-Flyer Quant attributes its growth to “selecting a group of inexperienced but high-potential individuals and fostering an organizational structure and corporate culture that encourages innovation.” They believe this could be the secret for large model start-ups competing against bigger companies.

Perhaps a crucial secret lies with High-Flyer Quant’s founder, Liang Wenfeng.

While studying AI at Zhejiang University, Liang firmly believed “artificial intelligence will change the world,” a conviction not widely held in 2008.

After graduating, instead of joining a major firm, he worked from a cheap rental in Chengdu, experiencing numerous failures in different scenarios before eventually breaking into the complex world of finance and founding High-Flyer Quant.

An interesting detail is that in the early years, an equally eccentric friend from Shenzhen tried to involve him in an “unreliable” aircraft project. That friend later created a 100-billion-US-dollar company named DJI.

Thus, beyond the discussion of funds, people, and computational power involved in developing large models, we discussed with Liang Wenfeng the organizational structure necessary for fostering innovation and how long such enthusiasm can last.

After more than a decade of entrepreneurship, this marked the first public interview with the rarely seen “techie” founder.

Coincidentally, on April 11th, when High-Flyer Quant announced its large model initiative, they quoted French New Wave director Truffaut’s advice to young directors: “Be audaciously ambitious, and radically genuine.”

Here is the dialogue:

1. Doing Research, Doing Exploration

Do the most important and most difficult things.

36Kr: Recently, High-Flyer Quant announced its decision to enter the field of large models. Why would a quantitative fund do such a thing?

Liang Wenfeng: Our work on large models actually has no direct relationship with quantitative finance or the finance industry. We’ve established a separate new company called DeepSeek to do this.

Many of the core members of High-Flyer Quant’s team are AI experts. At the time, we tried many scenarios and ultimately chose the complex field of finance. Artificial General Intelligence (AGI) is probably one of the next most difficult things, so for us, it’s a question of how to do it, not why to do it.

36Kr: Are you going to train your own large model, or a vertical-specific model—for example, one related to finance?

Liang Wenfeng: What we want to do is Artificial General Intelligence (AGI). Large language models are probably a necessary path to AGI and have initially shown characteristics of AGI, so we will start here, and later there will be vision and other aspects.

36Kr: Because of the entry of large companies, many startups have abandoned the broader direction of building general-purpose large models.

Liang Wenfeng: We won’t prematurely design applications based on the model; we will focus on the large model itself.

36Kr: Many people believe that it’s not a good time for startups to enter the field after the large companies have reached a consensus.

Liang Wenfeng: It seems that, at present, both large companies and startups will find it difficult to establish a crushing technological advantage over their competitors in a short period of time. Because OpenAI has shown the way, and both are based on public papers and code, by next year at the latest, both large companies and startups will have built their own large language models.

Both large companies and startups have their own opportunities. Existing vertical scenarios are not in the hands of startups; this stage is not very friendly to startups. But because these scenarios are, after all, scattered and fragmented small demands, they are more suitable for flexible startup organizations. In the long run, the entry barrier for large model applications will become lower and lower, and startups will have opportunities at any time in the next 20 years.

Our goal is also very clear: to not focus on vertical-specific applications or immediate applications, but on research and exploration.

36Kr: Why do you define it as “research and exploration”?

Liang Wenfeng: A kind of curiosity-driven motivation. From a long-term perspective, we want to verify some conjectures. For example, we believe that the essence of human intelligence may be language, and human thinking may be a linguistic process. What you think of as thinking might actually be you weaving language in your mind. This means that human-like artificial intelligence (AGI) may emerge from large language models.

From a short-term perspective, there are still many mysteries to be solved in GPT-4. While replicating it, we will also conduct research to uncover the secrets.

36Kr: But research means greater costs.

Liang Wenfeng: If we only do replication, we can build on public papers or open-source code, train only a few times, or even just fine-tune, and the cost is very low. But doing research requires various experiments and comparisons, requires more computing power, and has higher requirements for personnel, so the cost is higher.

36Kr: Then where does the research funding come from?

Liang Wenfeng: High-Flyer Quant, as one of our investors, has ample research and development budget. In addition, there is a donation budget of several hundred million yuan each year, which was previously given to public welfare organizations. If necessary, some adjustments can be made.

36Kr: But to build a foundational large model, you can’t even get to the table without two or three hundred million US dollars. How can you sustain such continuous investment?

Liang Wenfeng: We are also talking to different investors. From our contacts, we feel that many VCs have concerns about doing research. They have exit requirements and hope to quickly produce commercialized products. According to our priority of doing research, it is difficult to obtain financing from VCs. But we have computing power and an engineering team, which is equivalent to having half the chips.

36Kr: What deductions and assumptions have you made about the business model?

Liang Wenfeng: What we are thinking now is that we can share most of our training results publicly later, so that it can be combined with commercialization. We hope that more people, even a small app, can use large models at low cost, instead of the technology being controlled by a few people and companies, forming a monopoly.

36Kr: Some large companies will also provide some services in the later stages. What is your differentiated part?

Liang Wenfeng: The models of large companies may be bundled with their platforms or ecosystems, while ours is completely free.

36Kr: In any case, it’s a bit crazy for a commercial company to engage in unlimited investment in research and exploration.

Liang Wenfeng: If you must find a commercial reason, it may be impossible to find one, because it’s not cost-effective.

From a commercial point of view, basic research has a very low return on investment. When early investors in OpenAI invested money, they were certainly not thinking about how much return they would get, but really wanted to do this.

What we are relatively certain about now is that since we want to do this and have the ability, at this point in time, we are one of the most suitable candidates.

2. The 10,000-GPU Reserve and Its Cost

Something exciting may not be measured purely in monetary terms.

36Kr: GPUs are a scarce resource in this wave of ChatGPT-driven startups. You had the foresight to reserve 10,000 of them back in 2021. Why?

Liang Wenfeng: Actually, it was a gradual process, from the first GPU, to 100 GPUs in 2015, 1,000 GPUs in 2019, and then 10,000. Before we had a few hundred GPUs, we hosted them in IDCs. But when the scale grew larger, hosting couldn’t meet our needs, so we started building our own data centers. Many people might think there’s some hidden business logic behind this, but in reality, it was primarily driven by curiosity.

36Kr: What kind of curiosity?

Liang Wenfeng: Curiosity about the boundaries of AI capabilities. For many people outside the industry, the ChatGPT wave is a huge shock. But for those of us within the industry, the impact of AlexNet in 2012 already ushered in a new era. AlexNet’s error rate was far lower than other models at the time, reviving neural network research that had been dormant for decades. Although the specific technical directions have been changing, the combination of models, data, and computing power has remained constant. Especially after OpenAI released GPT-3 in 2020, the direction was very clear: a large amount of computing power was needed. But even in 2021, when we invested in building Fire-Flyer 2, most people still couldn’t understand it.

36Kr: So you started focusing on computing power reserves as early as 2012?

Liang Wenfeng: For researchers, the thirst for computing power is endless. After doing small-scale experiments, you always want to do larger-scale experiments. After that, we consciously deployed as much computing power as possible.

36Kr: Many people thought that building this computer cluster was for your quantitative hedge fund business to use machine learning for price prediction?

Liang Wenfeng: If it were purely for quantitative investment, a very small number of GPUs would suffice. We’ve done a lot of research beyond investment. We’re more interested in figuring out what kind of paradigm can completely describe the entire financial market, whether there’s a simpler way to express it, where the boundaries of different paradigms’ capabilities lie, whether these paradigms have broader applicability, and so on.

36Kr: But this process is also a money-burning endeavor.

Liang Wenfeng: Something exciting may not be measured purely in monetary terms. It’s like buying a piano for your home. Firstly, you can afford it, and secondly, it’s because there’s a group of people eager to play music on it.

36Kr: Graphics cards typically depreciate at a rate of 20%.

Liang Wenfeng: We haven’t calculated it precisely, but it shouldn’t be that much. NVIDIA’s graphics cards are hard currency; even older cards from many years ago are still being used by many people. When we previously retired our old cards, they were still quite valuable in the second-hand market, so we didn’t lose much.

36Kr: Building a computer cluster, maintenance costs, labor costs, and even electricity bills are all considerable expenses.

Liang Wenfeng: Electricity and maintenance costs are actually very low; these expenses only account for about 1% of the hardware cost per year. Labor costs are not low, but they are also an investment in the future; people are the company’s greatest asset. We also choose people who are relatively down-to-earth and curious, and they have the opportunity to do research here.

36Kr: In 2021, High-Flyer Quant was among the first companies in the Asia-Pacific region to receive A100 GPUs. Why were you earlier than some cloud vendors?

Liang Wenfeng: We conducted preliminary research, testing, and planning for new GPUs very early on. As for some cloud vendors, as far as I know, their previous demands were scattered. It wasn’t until 2022, with autonomous driving and the demand for renting machines for training, and the ability to pay, that some cloud vendors started to build their infrastructure. It’s difficult for large companies to purely do research and training; they are more likely to be driven by business needs.

36Kr: How do you view the competitive landscape of large models?

Liang Wenfeng: Large companies definitely have advantages, but if they can’t apply them quickly, they may not be able to persist, because they need to see results.

Some leading startups also have solid technology, but like the previous wave of AI startups, they all face the challenge of commercialization.

36Kr: Some people might think that a quantitative fund emphasizing its AI work is just blowing a bubble for other businesses.

Liang Wenfeng: But in fact, our quantitative fund has basically stopped raising funds from external sources.

36Kr: How do you distinguish between AI believers and speculators?

Liang Wenfeng: Believers were here before and will be here after. They are more likely to buy GPUs in bulk or sign long-term agreements with cloud vendors, rather than renting them short-term.

3. How to Make Innovation Truly Happen

Innovation often emerges organically; it cannot be forced or taught.

36Kr: How is the recruitment process going for the DeepSeek team?

Liang Wenfeng: The initial team has been assembled. Initially, due to a shortage of manpower, we temporarily seconded some people from High-Flyer Quant. When ChatGPT 3.5 became popular at the end of last year, we started the recruitment process, but we still need more people to join.

36Kr: Talent for large model startups is also scarce. Some investors say that many suitable candidates might only be found in the AI labs of giants like OpenAI and Facebook AI Research. Will you be recruiting such talent from overseas?

Liang Wenfeng: If the goal is short-term, finding experienced people is the right approach. But in the long run, experience is less important; fundamental abilities, creativity, and passion are more crucial. From this perspective, there are many suitable candidates in China.

36Kr: Why is experience less important?

Liang Wenfeng: It’s not necessarily the case that only those who have done something before can do it. High-Flyer Quant has a principle for hiring: we look at ability, not experience. Our core technical positions are primarily filled by recent graduates or those with one or two years of experience.

36Kr: Do you think experience is an obstacle in innovative businesses?

Liang Wenfeng: When doing something, experienced people will tell you without hesitation that it should be done this way. But people without experience will repeatedly explore, think very carefully about how it should be done, and then find a solution that fits the current situation.

36Kr: High-Flyer Quant entered the financial industry as a complete outsider with no financial background and became a leader within a few years. Is this hiring principle one of the secrets?

Liang Wenfeng: Our core team, including myself, initially had no quantitative experience, which is quite unusual. I can’t say it’s the secret to success, but it’s part of High-Flyer Quant’s culture. We don’t intentionally avoid experienced people, but we focus more on ability.

Take our sales positions, for example. Our two main salespeople were complete newcomers in this industry. One used to be in foreign trade, dealing with German machinery, and the other was writing code in the back office of a securities firm. When they entered this industry, they had no experience, no resources, and no established network.

And now, we are probably the only large-scale private equity firm that can primarily rely on direct sales. Direct sales mean we don’t have to share fees with intermediaries, resulting in higher profit margins for the same scale and performance. Many firms have tried to imitate us, but they haven’t succeeded.

36Kr: Why have many firms tried to imitate you but failed?

Liang Wenfeng: Because this alone is not enough to make innovation happen. It needs to be matched with the company’s culture and management.

In fact, they couldn’t achieve anything in the first year; they only started to see some results in the second year. But our evaluation criteria are different from those of typical companies. We don’t have KPIs, nor do we have so-called targets.

36Kr: So what are your evaluation criteria?

Liang Wenfeng: Unlike typical companies, we don’t focus on the volume of client orders. How much our salespeople sell isn’t directly tied to their commissions from the beginning. We encourage them to develop their own networks, meet more people, and build greater influence.

We believe that an honest salesperson who earns clients’ trust may not get them to place orders in the short term, but will come across as a reliable person.

36Kr: After selecting the right people, how do you get them up to speed?

Liang Wenfeng: Give them important tasks and don’t interfere. Let them find their own solutions and exercise their initiative.

Actually, a company’s DNA is very difficult to imitate. For example, how do you judge the potential of someone with no experience, and how do you enable them to grow after they are hired? These things can’t be directly copied.

36Kr: What do you think are the necessary conditions for building an innovative organization?

Liang Wenfeng: Our conclusion is that innovation requires minimal intervention and management, allowing everyone to have the space for free expression and opportunities for trial and error. Innovation often emerges organically; it cannot be forced or taught.

36Kr: This is an unconventional management style. In this situation, how do you ensure that a person works efficiently and in the direction you want?

Liang Wenfeng: Ensure alignment of values during recruitment, and then use corporate culture to ensure alignment of actions. Of course, we don’t have a written corporate culture because any written document would hinder innovation. Most of the time, it’s about the managers leading by example. When encountering a situation, how you make decisions becomes a guideline.

36Kr: Do you think that in this wave of large model competition, the innovative organizational structure of startups will be a breakthrough point in competing with larger companies?

Liang Wenfeng: If you deduce the approach of startups according to textbook methodologies, in the current situation, what they are doing would not allow them to survive.

But the market is changing. The real determining force is often not some existing rules and conditions, but an ability to adapt and adjust to change.

The organizational structures of many large companies are no longer able to respond and act quickly, and they are easily constrained by their previous experience and inertia. In this new AI wave, a new batch of companies will definitely emerge.

4. True Madness

Innovation is expensive and inefficient, sometimes accompanied by waste.

36Kr: What excites you the most about doing this?

Liang Wenfeng: To find out if our hypotheses are true. If they are, it’s very exciting.

36Kr: In this round of hiring for the large model, what are the absolute must-have requirements?

Liang Wenfeng: Passion and solid foundational skills. Nothing else is as important.

36Kr: Is it easy to find people like that?

Liang Wenfeng: Their passion usually shows through because they really want to do this. So these people are often looking for you at the same time.

36Kr: Large models could be something that requires endless commitment. Are you concerned about the costs involved?

Liang Wenfeng: Innovation is expensive and inefficient, sometimes accompanied by waste. That’s why innovation can only emerge after the economy reaches a certain level of development. When things are very poor, or in industries not driven by innovation, cost and efficiency are crucial. Look at OpenAI; they burned through a lot of money to get where they are.

36Kr: Do you feel like you’re doing something crazy?

Liang Wenfeng: I don’t know if it’s crazy, but there are many things in this world that can’t be logically explained. It’s like many programmers who are also dedicated contributors to open-source communities. Even after a tiring day, they still go and contribute code.

36Kr: There’s a kind of spiritual reward in that.

Liang Wenfeng: It’s like when you hike 50 kilometers. Your body is completely exhausted, but you’re spiritually fulfilled.

36Kr: Do you think the madness driven by curiosity can last forever?

Liang Wenfeng: Not everyone can be crazy for their entire life, but most people, in their younger years, can be completely devoted to something without any utilitarian purpose.

36Kr is a prominent brand and a pioneering platform dedicated to serving New Economy participants in China. With the mission of empowering New Economy participants to achieve more, 36Kr continuously connects and serves six communities: startups, TMT giants, traditional enterprises, institutional investors, local governments, and individuals. 36Kr accelerates the flow of four major elements (information, talent, capital, and technology) to promote the development of a rapid, stable, and sustainable New Economy. 36Kr offers comprehensive services that span the entire corporate lifecycle, from investment and public relations to funding and IPO guidance, as well as market value and brand management, customer engagement, consulting and solution provision, government collaboration, and international expansion for businesses.

