A bunch of ML researchers who were initially hired to do quant work published th...

sailingparrot · 2025-01-29T23:58:11 1738195091

> A bunch of ML researchers who were initially hired to do quant work

Very interesting! I'm sure you have a source for this claim?

This myth of DS being a side project literally started from one tweet. DeepSeek the company is funded by a company whose main business is being a hedge fund, but DeepSeek itself has from day 1 been all about building LLMs to reach AGI, completely independently.

This is like saying SpaceX is the side project of a few carmaking bros, just because Elon funded and manages both. They are unrelated.

Again, you can easily google the names of the authors and look at their backgrounds: you will find people with PhDs in LLM/multimodal models, internships at Microsoft Research, etc. No trace of a background in quant or time series prediction or any of that.

From the mouth of the CEO himself 2 years ago: "Our large-model project is unrelated to our quant and financial activities. We’ve established an independent company called DeepSeek, to focus on this." [0]

It's really interesting to see how, after 10 years of debating the mythical 10x engineer, we have now overnight created the mythical 100x Chinese quant bro researcher, who can build 50x better models than the best U.S. people, after 6pm while working on his side project.

[0]: https://www.chinatalk.media/p/deepseek-from-hedge-fund-to-fr...

maxglute · 2025-01-30T01:22:13 1738200133

See this earlier interview from 2020.

https://www.pekingnology.com/p/ceo-of-deepseeks-parent-high-...

TL;DR: High-Flyer started very much as an exclusively ML/AI-focused quant investment firm, with a lot of compute for finance AI and mining. Then the CCP cracked down on mining... then finance, so Liang probably decided to pivot to LLM/AGI, which likely started as a side project, but probably isn't one anymore now that DeepSeek has taken off and Liang just met with the PRC premier a few days ago. DeepSeek being an independent company doesn't mean DeepSeek isn't Liang's side project using compute bought with hedge fund money that is primarily used for hedge fund work, cushioned/allowed to get by with low margins by hedge fund profits.

sailingparrot · 2025-01-30T01:30:28 1738200628

Yes, see my analogy with Elon.

The point is, the team actually doing the DeepSeek work is working on this as its exclusive project, and was hired exclusively for this, etc.

They aren't doing this on the side of their main quant job, and destroying U.S. researchers just as a hobby as the myth would have us believe.

maxglute · 2025-01-30T01:43:48 1738201428

That's a fair distinction. IMO it should still be categorized as a side project in the sense that it's Liang's pet project, the same way Jeff Bezos spends $$$ on his forever clock with a separate org but ultimately with Amazon resources. DeepSeek / Liang can fixate on AGI rather than on whether it is profit-making or loss-making, since hardware / capex depreciation is likely eaten by the High-Flyer / quant side. No reason to believe DeepSeek spent hundreds of millions to build out another compute chain separate from High-Flyer's. The myth that seasoned finance quants are using 20% time to crush US researchers is false, but the reality/narrative that a bunch of fresh-out-of-school Gen Z kids from tier-1 PRC universities are destroying US researchers is kind of just as embarrassing.

asdasdsddd · 2025-01-30T00:37:33 1738197453

Just to be pedantic, SpaceX predates Tesla.

benatkin · 2025-01-30T01:09:11 1738199351

The carmaking bro predates SpaceX. He had a BMW in college and got a supercar in 1997. While he wasn’t a carmaker yet, he got started with cars earlier.

islewis · 2025-01-30T00:11:05 1738195865

A valid response to my initial comment, which was a bit tongue-in-cheek.

However, I'm not sure that them being LLM researchers rather than quant researchers changes the dynamic of their relaxed security posture.

sailingparrot · 2025-01-30T00:20:20 1738196420

> However, I'm not sure that them being LLM researchers compared to quant researchers changes the dynamic of their relaxed security posture.

Indeed it does not, but that's not the part I was commenting on.

spoaceman7777 · 2025-01-29T23:21:56 1738192916

First ever? Their math, coding, and other models have been making a splash since 2023.

The mythologizing around DeepSeek is just absurd.

"DeepSeek is the tale of one lowly hedge fund manager overcoming the wicked American AI devils". Every day I hear variations of this, and the vast majority of it is based entirely on "vibes" emanating from some unknown place.

sho_hn · 2025-01-30T00:18:03 1738196283

What I find amusing is that this closely mirrors the breakout moment OpenAI had with ChatGPT. They had been releasing models for quite some time before slapping the chatbot interface on one, and then it blew up within a few days.

It's fascinating that a couple of years and a few competitors in, the DeepSeek moment parallels it so closely.

quantified · 2025-01-29T23:29:42 1738193382

Models and security are very different uses of our synapses. Publishing any number of models is no proof of anything beyond models, talented mathematicians and programmers though they may be.

tonyhart7 · 2025-01-30T00:12:48 1738195968

Well, security isn't their job to begin with.