Hacker News | smy20011's comments

I think the good thing about it is that if you are given a good specification, you are likely to get a good result. Writing a C compiler is not something new, but this will be great for all the porting projects.


I miss entering a flow state when coding. When vibe coding, you are constantly interrupted and only think very shallowly. I have never seen anyone enter a flow state while vibe coding.


The two ways I get into flow state these days are in setting up agentic loops, so I can get out of the way by letting AI check the results for itself, and by doing more things. I've got ~4 Claude Code instances working on problems, per project, and I've got multiple projects I'm working on at the same time.


Same here; waiting for the response destroys any focus I had.


I think it's not that the AI is working toward "misaligned" goals. The user never specifies the goal clearly enough for the AI system to work with.

However, I think producing a detailed enough specification requires the same or an even larger amount of work than writing the code. We write a rough specification and clarify it during the process of coding. There is a minimal amount of effort required to produce these specifications, and AI will not help you speed that up.


That makes me wonder about the "higher and higher-level language" escalator. When you're writing in assembly, is it more work to write the code than the spec? And the reverse is true if you can code up your system in Ruby? If so, does that imply anything about the "spec driven" workflow people are using with AIs? Are we right on the cusp where writing natural language specs and writing high level code are comparably productive?


Programming languages can be a thinking tool for a lot of tasks, very much like other notations, such as sheet music and map drawing. A condensed and somewhat formal manner of describing ideas can increase communication speed. It may lack nuance, but in some cases, nuance is harmful.

The nice thing about code compared to other notation is that it's useful on its own. You describe an algorithm and the machine can then solve the problem ad infinitum. It's one step instead of the two steps of writing a spec and having an LLM translate it, then having to verify the output and alter it.

Assembly and high-level languages are equivalent in terms of semantics. The latter help in managing complexity, by reducing harmful possibilities (manual memory management, off-by-one errors) and providing common patterns (iterators/collections, structs and other data structures, ...) so that categories of problems are easily solved. There's no higher level of computing model unlocked, just a faster level of productivity unlocked by following proven patterns.

Spec-driven workflow is a mirage, because even the best specs will leave a lot of unspecified details. Those details are crucial, as most of programming is making the computer not do the various things it can do.


> most of programming is making the computer not do the various things it can do

This is a very stimulating way of putting it!


I believe that the issue right now is that we're using languages designed for human creation in an AI context. I think we probably want languages that are optimized for AI-written but human-read code, so the surface texture is a lot different.

My particular hypothesis on this is something that feels a little bit like python and ruby, but has an absolutely insane overkill type system to help guide the AI. I also threw in a little lispiness on my draft: https://github.com/jaggederest/locque/


I don't know; LLMs thrive on human text, so I would wager that a language designed for humans would quite closely match an ideal one for LLMs. Probably the only difference is that LLMs are not "lazy": they better tolerate boilerplate, and lower-complexity structures likely fit them better. (E.g., they can't really one-shot understand some imported custom operator that is not very common in their training data.)

Also, they rely surprisingly closely on "good" code patterns, like comments and naming conventions.

So if anything, a managed language [1] with a decent type system and not a lot of features would be the best, especially if it has a lot of code in its training data. So I would rather vote on Java, or something close.

[1] Reasoning about lifetimes, even if aided by the compiler, is a global property, and LLMs are not particularly good at that.


But that is less fundamental than you make it sound. LLMs work well with human language because that's all they are trained on. So what else _could_ an ideal language possibly look like?

On the other hand: the usefulness of LLMs will always be gated by their interface to the human world. So even if their internal communication might be superseded at some point, their contact surface can only evolve if their partners/subjects/masters can interface with it.


When I think of the effect of a single word on Agent behavior - I wonder if a 'compiler' for the human prompt isn't something that would benefit the engineer.

I've had comical instances where asking an agent to "perform the refactor within somespec.md" results in it ... refactoring the spec as opposed to performing a refactor of the code mentioned in the spec. If I say "Implement the refactor within somespec.md" it's never misunderstood.

With LLMs _so_ strongly aligned on language and having deep semantic links, a hypothetical prompt compiler could ensure that your intent converts into the strongest weighted individual words to ensure maximal direction following and outcome.

Intent classification (task frame) -> Reference Binding (inputs v targets) -> high-leverage word selection .... -> Constraints(?) = <optimal prompt>
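A toy sketch of the last stage of such a pipeline (all names and the rewrite table below are hypothetical illustrations, based on the perform-vs-implement anecdote above) might look like:

```python
# Toy "prompt compiler" final stage: rewrite ambiguous verbs into wording the
# agent reliably follows. The rewrite table is invented for illustration.

REWRITES = {
    "perform the refactor within": "implement the refactor described in",
    "do the changes in": "implement the changes described in",
}

def compile_prompt(prompt: str) -> str:
    lowered = prompt.lower()
    for ambiguous, strong in REWRITES.items():
        lowered = lowered.replace(ambiguous, strong)
    return lowered

print(compile_prompt("Perform the refactor within somespec.md"))
```

A real version would run the intent-classification and reference-binding stages first; this only shows word-level substitution.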


If you are on the same wavelength as someone, you don't need to produce a full spec. You can trust that the other person has the same vision as you and will pick reasonable ways to implement things. This is one reason why personalized AI agents are important.


> I think producing detailed enough specification requires same or even larger amount of work than writing code

Our team has started dedicating much more time to writing documentation for our SaaS app. No one seems to want to do it naturally, but there is very large potential in opening your system to machine automation, not just for coding but for customer-facing tooling. I saw a preview of that possible future in New Relic, where they have an AI chat use their existing SQL-like query language to build tables and charts from natural-language queries right in the web app. Theirs kinda sucks, but there's so much potential there that it is very likely going to change how we build UIs and software interfaces.

Plus, having lots of documentation on how stuff works also helps sales, support, and SEO.


Detailed specification also helps root out conflicting design requirements and points at the desired behavior when bugs are actually found. It also helps when other stakeholders can read it and see misalignment with what their users/customers actually need.


As of today though, that doesn't work. Even straightforward tasks that are perfectly spec-ed can't be reliably done with agents, at least in my experience.

I recently used Claude for a refactor. I had an exact list of call sites, with positions etc. The model had to add .foo to a bunch of builders that were either at that position or slightly before (the code position was for .result() or whatever.) I gave it the file and the instruction, and it mostly did it, but it also took the opportunity to "fix" similar builders near those I specified.

That is after iterating a few times on the prompt (first time it didn't want to do it because it was too much work, second time it tried to do it via regex, etc.)


My thought too. To extend this: coding agents will make code cheap and specifications cheaper, but they may also invert the relative opportunity cost of not writing a good spec.


> The user never specifies the goal clearly enough for the AI system to work with.

This is sort of a fundamental problem with all AI. If you tell a robot assistant to "make a cup of tea", how's it supposed to know that that implies "don't break the priceless vase in the kitchen" and "don't step on the cat's tail", et cetera. You're never going to align it well enough with "human values" to be safe. Even just defining in human-understandable terms what those values are is a deep existential question of philosophy, let alone specifying it for a machine that's capable of acting in the world independently.


According to https://cdn.realfood.gov/Daily%20Serving%20Sizes.pdf, their recommendations do not meet their calorie goal. E.g., for 2,000 calories, you can eat 4 eggs, 3 cups of milk, 4 slices of bread, 2 apples, and 3 tbsp of oil per day.

The total comes to 1,608 kcal/day.

It's a very depressing diet menu.
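For what it's worth, the parent's total can be roughly sanity-checked with typical per-item calorie values (the numbers below are common approximations, not figures from the linked PDF):

```python
# Rough check of the parent comment's arithmetic; per-item kcal values are
# assumed approximations, not taken from the linked PDF.
menu = {
    "egg": (4, 78),
    "cup of milk": (3, 149),
    "slice of bread": (4, 80),
    "apple": (2, 95),
    "tbsp of oil": (3, 120),
}

total = sum(count * kcal for count, kcal in menu.values())
print(total)  # lands around 1,600 kcal, well short of the 2,000 kcal target
```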


This is exactly what I eat every day and I am phenomenally happy and successful.


Just don't play games that have a winner and a loser. Play games where both sides can win.


> Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.

Only if you are familiar with the project/code. If not, you're thrown into a foreign codebase and have no idea how to tweak it.


And potentially make incredibly risky mistakes while the AI assures you it’s fine.


I use Tailscale to build my personal podcast, which includes local weather and stocks I'm interested in. I run the whole pipeline on a Steam Deck and use Tailscale to securely deliver the generated podcast to my phone.


How are you going from weather and stock information to a podcast? Is there some sort of TTS step there?


Yeah, fully local LLM+TTS setup.

I use a Jupyter notebook to fetch the stock and weather info, feed that into a local LLM, and convert the result to speech using an open-source TTS.

https://github.com/smy20011/MorningRadio
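For readers curious about the shape of such a pipeline, here is a minimal sketch (all function names and data below are hypothetical stand-ins; see the linked repo for the real implementation):

```python
# Minimal sketch of a local "morning podcast" pipeline. Names and data are
# hypothetical stand-ins, not taken from the linked repo.

def fetch_data():
    # Stand-in for the Jupyter step that pulls live weather and stock quotes.
    return {"weather": "Sunny, 18C", "stocks": {"GOOG": "+1.2%", "NVDA": "-0.8%"}}

def build_script(data):
    # Turn the raw data into a narration prompt for the local LLM.
    lines = [f"Good morning! Today's weather: {data['weather']}."]
    for ticker, change in data["stocks"].items():
        lines.append(f"{ticker} moved {change} yesterday.")
    return " ".join(lines)

script = build_script(fetch_data())
# Next steps (not shown): polish `script` with a local LLM, synthesize audio
# with an open-source TTS, and serve the file over Tailscale.
print(script)
```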


See NotebookLM as an example.


What is the delivery mechanism?



Curious why large open-source projects (like Linux) have no managers for all the political stuff but work well, while every private company has a huge management chain for less complex stuff.


I'm pretty sure the majority of Linux contributions are made by corporations these days, which almost certainly have internal management structures.


Did they? Deepseek spent about 17 months achieving SOTA results with a significantly smaller budget. While xAI's model isn't a substantial leap beyond Deepseek R1, it utilizes 100 times more compute.

Given $3 billion, xAI would choose to invest $2.5 billion in GPUs and $0.5 billion in talent. Deepseek would invest $1 billion in GPUs and $2 billion in talent.

I would argue that the latter approach (Deepseek's) is more scalable. It's extremely difficult to increase compute by 100 times, but with sufficient investment in talent, achieving a 10x increase in compute is more feasible.


We don't actually know how much money DeepSeek spent or how much compute they used. The numbers being thrown around are suspect, the paper they published didn't reveal the costs of all models nor the R&D cost it took to develop them.

In any AI R&D operation the bulk of the compute goes on doing experiments, not on the final training run for whatever models they choose to make available.


One thing I (intuitively) don't doubt: they spent less money developing R1 than OpenAI spent on marketing, lobbying, and management compensation.


What makes you say that? Do you think Chinese top tier talent is cheap?


I did not refer to the talent directly contributing to the technical progress.

P.S. - Clarification: I meant I was not referring to talent at OpenAI. And yes, I have very little doubt that talent at DeepSeek is a lot cheaper than the things I listed above for OpenAI. I would be interested in a breakdown of OpenAI's costs, to see if even their technical talent costs more than the things I mentioned.


Do you think $1.5M a year in compensation is cheap? That's in the range of what OpenAI offers.


What is cheap? But compared to the US, yes. Almost everywhere talent is 'cheap' compared to the US unless they move to the US.


How experienced are you with Chinese AI talent compensation?


I'm sure the salaries at Deepseek in China were lower than the salaries at OpenAI.


How are you sure about that?


A qualified guess. Do you have something that indicates dev salaries are lower in US vs China?


One example is that I've received offers to work in big tech in China at or exceeding my FAANG compensation here in the Bay Area. I have other reasons to believe as well but I can't talk about that in public.


Definitely cheaper than American top tier talent


How much cheaper? I’m curious because I’ve seen the offers that Chinese tech companies pay and it’s in the millions for the top talent.


> The numbers being thrown around are suspect, the paper they published didn't reveal the costs of all models nor the R&D cost it took to develop them.

Did any lab release such figures? It would be interesting to see.


>It's extremely difficult to increase compute by 100 times, but with sufficient investment in talent, achieving a 10x increase in compute is more feasible.

The article explains how, in reality, the opposite is true, especially when you look at it long term. Compute power grows exponentially; humans do not.


If the bitter lesson were true, we'd be getting SOTA results out of two-layer neural networks using tanh as the activation function.

It's a lazy blog post that should be thrown out after a minute of thought by anyone in the field.


That's not how the economics work. There has been a lot of research showing that deeper nets are more efficient. So if you spend a ton of compute money on a model, you'll want the best output, even though you could just as well build something shallow that may well be state of the art for its depth but can't hold up against the competition on real tasks.


Which is my point.

You need a ton of specialized knowledge to use compute effectively.

If we had infinite memory and infinite compute, we'd just throw every problem of length n at a tensor of size R^(n^n).

The issue is that we don't have enough memory in the world to store that tensor for something as trivial as MNIST. And as you can imagine, the exponentiated exponential grows a bit faster than the exponential, so we never will.
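The scale of that claim is easy to check, assuming n = 784 pixels for MNIST and one entry per element of an n^n-sized tensor:

```python
import math

# Number of entries in an n^n tensor for an MNIST-sized input (n = 784 pixels).
# log10(784^784) = 784 * log10(784), so we avoid computing the huge number itself.
n = 784
digits = n * math.log10(n)
print(f"784^784 has about 10^{digits:.0f} entries")
```

That is astronomically more than any conceivable amount of storage, which is the parent's point.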


Then how does this invalidate the bitter lesson? It's like you're saying if aerodynamics were true, we'd have planes flying like insects by now. But that's simply not how it works at large scales - in particular if you want to build something economical.


Because if the bitter lesson were true, no one would be wasting their time with convolutions or attention blocks. You'd just replace them with a general tensor that allows every possible hyper-relation between all points instead.


Humans don't grow exponentially indefinitely. But there are only something on the order of 100k AI researchers employed in the big labs right now. Meanwhile, there are around 20mn software engineers globally, and around 200k math graduates per year.

The number of humans who could feasibly work on this problem is pretty high, and the labs could grow an order of magnitude, and still only be tapping into the top 1-2% of engineers & mathematicians. They could grow two orders of magnitude before they've absorbed all of the above-average engineers & mathematicians in the world.


I'd actually say the market is stretched pretty thin by now. I've been an AI researcher for a decade, and what passes as an AI researcher or engineer these days is borderline worthless. You can get a lot of people who can use scripts and middleware like frontend Lego sets to build things, but I'd say there are fewer than 1k people in the world right now who can actually meaningfully improve algorithmic design. There are a lot more people out there who do systems design and cloud ops, so only when you choose to go for scaling will you find a plentiful supply of human brainpower.


Do you know of any places where people who are interested in research congregate? Every forum, meetup, or journal gets overwhelmed by bullshit within a year of being good.


Universities (at least certain ones) and startups (more in absolute terms than universities, but there's also a much bigger fraction of swindlers). Most blogs and forums are garbage. If you're not inside these ecosystems, try to find out who the smart/talented people are by reading influential papers. Then you can start following them on X, linkedin etc. and often you'll see what they're up to next. For example, there's a pretty clear research paper and hiring trail of certain people that eventually led to GPT-4, even though OpenAI never published anything on the architecture.


I am in correspondence with a number of worthwhile authors; it's just that there isn't any place where they congregate in the (semi-)open, and without the weirdos who do stuff with the models, you're missing out on a lot.

My favorite example I can never share in polite company is that the (still sota) best image segmentation algorithm I ever saw was done by a guy labeling parts of the vagina for his stable diffusion fine tune pipeline. I used what he'd done as the basis for a (also sota 2 years later) document segmentation model.

Found him on a subreddit about stable diffusion that's now completely overrun by shitesters and he's been banned (of course).


It's pretty easy nowadays to come up with a narrow domain SOTA in image tasks. All you need to do is label some pictures and do a bit of hyperparameter search. This can literally be done by high schoolers on a laptop. And that's exactly what they do in those subreddits where everyone primarily cares about creating explicit content. The real frontier for algorithmic development is large domains (which need a lot more data by default as well). But there actually are some big-game explicit content platforms engaged in research in this area and they have shown somewhat interesting results.


Humans do write code that scales with compute.

Performance is always raw performance * software efficiency. You can use shitty software and waste all those FLOPs.


Algorithmic improvements in new fields are often bigger than hardware improvements.


Large teams are very hard to scale.

There is a reason why startups innovate and large companies follow.


Deepseek's innovations are applicable to xAI's setup; the results simply multiply with their compute scale.

Deepseek didn't have option A or B available; they only had the extreme-optimisation option to work with.

It’s weird that people present those two approaches as mutually exclusive ones.


It's not an either/or. Your hiring of talent is only limited by your GPU spend if you can't hire because you ran out of money.

In reality pushing the frontier on datacenters will tend to attract the best talent, not turn them away.

And in talent, it is the quality rather than the quantity that counts.

A 10x breakthrough in algorithms will compound with a 10x scale-out in compute, not hinder it.

I am a big fan of Deepseek, Meta and other open model groups. I also admire what the Grok team is doing, especially their astounding execution velocity.

And it seems like Grok 2 is scheduled to be opened as promised.


Have fun hiring any talent after three years of advertising to students that all programming/data jobs are going to be obsolete.


Not that simple. It could cause a resource curse [1] for developers: why optimize algorithms when you have nearly infinite resources? For Deepseek, their constraints were one of the reasons they achieved a breakthrough. One of their contributions, FP8 training, was finding a way to train models on GPUs whose FP32 performance is limited due to export controls.

[1]: https://www.investopedia.com/terms/r/resource-curse.asp#:~:t...
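One reason lower-precision training pays off is visible in a simple back-of-envelope on parameter memory (the model size below is illustrative, not Deepseek's actual figures):

```python
# Parameter memory for a 7B-parameter model at different numeric widths.
# Activations, gradients, and optimizer state are excluded; illustrative only.

def param_gib(n_params, bytes_per_param):
    return n_params * bytes_per_param / 2**30

fp32 = param_gib(7e9, 4)  # 32-bit floats: 4 bytes per parameter
fp8 = param_gib(7e9, 1)   # 8-bit floats: 1 byte per parameter
print(f"fp32: {fp32:.1f} GiB, fp8: {fp8:.1f} GiB")
```

Narrower weights also cut the bandwidth needed to move them between GPUs, which matters most on export-limited interconnects.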


R1 came out when Grok 3's training was still ongoing. They shared their techniques freely, so you would expect the next round of models to incorporate as many of those techniques as possible. The bump you would get from the extra compute occurs in the next cycle.

If Musk really can get 1 million GPUs and they incorporate some algorithmic improvements, it'll be exciting to see what comes out.


Deepseek didn’t seem to invest in talent as much as it did in smuggling restricted GPUs into China via 3rd countries.

Also, not for nothing, scaling compute x100 or even x1000 is much easier than scaling talent x10 or even x2, since you don't need workers, you need discovery.


Talent is not something you can just freely pick up from your local Walmart.


Deepseek was a crypto-mining operation before they pivoted to AI. They have an insane number of GPUs lying around, so we have no idea how much compute they have compared to xAI.


Do you have any sources for that? When I searched "DeepSeek crypto mining" the first result was your comment, the other results were just about the wide tech market selloff after DeepSeek appeared (that also affected crypto). As far as I know, they had many GPUs because their parent company was using AI algorithms for trading for many years.

https://en.wikipedia.org/wiki/High-Flyer


You know crypto mining is illegal in China, right? Of course they avoid mentioning it. Discussion boards in China had ex-employees mention doing crypto mining, but it's all been wiped.


Crypto GPUs have nothing to do with AI GPUs.

Crypto mining is an embarrassingly parallel problem, requiring little to no communication between GPUs. To a first approximation, in crypto, 10x-ing the number of "cores" per GPU, 10x-ing the number of GPUs per rig, and 10x-ing the number of rigs you own are basically equivalent. An infinite number of extremely slow GPUs would do just as well as one infinitely fast GPU. This is why consumer GPUs are great for crypto.

AI is the opposite. In AI, you need extremely fast communication between GPUs. This means getting as much memory per GPU as possible (to make communication less necessary), and putting all the GPUs in one datacenter.

Consumer GPUs, which were used for crypto, don't support the fast interconnect technologies needed for AI training, and they don't come in the 80GB memory versions that AI labs need. This is Nvidia's price-differentiation strategy.
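The scaling difference can be caricatured with a toy throughput model (the sync-cost constant below is invented; only the shape of the curves matters):

```python
# Toy model: mining shards need no cross-GPU traffic, while data-parallel
# training pays a synchronization cost that grows with cluster size.
# The 0.05 sync constant is made up purely for illustration.

def mining_throughput(n_gpus, hashes_per_gpu=1.0):
    return n_gpus * hashes_per_gpu  # embarrassingly parallel: perfect scaling

def training_throughput(n_gpus, step_compute=1.0, sync_cost=0.05):
    return n_gpus * step_compute / (1.0 + sync_cost * n_gpus)

# 10x more GPUs gives mining the full 10x, but training much less.
print(mining_throughput(100) / mining_throughput(10))
print(training_throughput(100) / training_throughput(10))
```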


No relevant crypto has been mined on GPUs for a long time.

And a point was made to make mining less parallel. For example, Ethereum used a DAG, which required at least 1 GB of memory, so raw GPU compute alone was not enough.

https://ethereum.stackexchange.com/questions/1993/what-actua...

Also, any such GPUs are now several generations old, so their FLOPS/watt is likely irrelevant.


> While xAI's model isn't a substantial leap beyond Deepseek R1, it utilizes 100 times more compute.

I'm not sure if it's close to 100x more. xAI had 100K Nvidia H100s, while this is what SemiAnalysis writes about DeepSeek:

> We believe they have access to around 50,000 Hopper GPUs, which is not the same as 50,000 H100, as some have claimed. There are different variations of the H100 that Nvidia made in compliance to different regulations (H800, H20), with only the H20 being currently available to Chinese model providers today. Note that H800s have the same computational power as H100s, but lower network bandwidth.

> We believe DeepSeek has access to around 10,000 of these H800s and about 10,000 H100s. Furthermore they have orders for many more H20’s, with Nvidia having produced over 1 million of the China specific GPU in the last 9 months. These GPUs are shared between High-Flyer and DeepSeek and geographically distributed to an extent. They are used for trading, inference, training, and research. For more specific detailed analysis, please refer to our Accelerator Model.

> Our analysis shows that the total server CapEx for DeepSeek is ~$1.6B, with a considerable cost of $944M associated with operating such clusters. Similarly, all AI Labs and Hyperscalers have many more GPUs for various tasks including research and training than they commit to an individual training run, due to centralization of resources being a challenge. X.AI is unique as an AI lab with all their GPUs in 1 location.

https://semianalysis.com/2025/01/31/deepseek-debates/

I don't know how much slower the GPUs they have are, but if they have 50K of them, that doesn't sound like 100x less compute to me. Also, a company that has N GPUs and trains AI on them for 2 months can achieve the same results as a company that has 2N GPUs and trains for 1 month. So DeepSeek could spend a longer time training to offset having fewer GPUs than competitors.
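The trade-off in that last sentence is just arithmetic: total training compute is roughly GPU count times time, ignoring efficiency differences between GPU variants:

```python
# Back-of-envelope: total compute budget = number of GPUs x training time,
# so a smaller cluster can match a bigger one by running longer.
# The counts are illustrative, not actual cluster sizes.

def gpu_months(num_gpus, months):
    return num_gpus * months

print(gpu_months(50_000, 2) == gpu_months(100_000, 1))
```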


Having 50K of them isn't the same thing as having 50K in one high-bandwidth cluster, right? xAI has all theirs so far in one connected cluster, all homogeneous H100s, right?


Deepseek spent at least $1.5 billion on hardware.


Looks like the beginning of the downfall of America.


We're a long way past that point.

