Not to stand up for Claude Code in any way; I don’t like the company or use the product. This is just a related tangent:
one of my favorite software projects, Arcan, is built on the idea that there are a lot of similarities between game engines, desktop environments, web browsers, and multimedia players. https://speakerdeck.com/letoram/arcan?slide=2
They have a really cool TUI setup that is kinda in a real sense made with a small game engine :)
Why do you think they’re jealous of future fraudsters?
But like genuinely, this sort of take confuses me so much. It’s like, if someone made fun of Putin and the consensus was that they’re just jealous they don’t have a country of their own to run.
Yeah I’m with you. Hate to assume such things but with how much AI spam is out there on programmer blogs these days I kinda just give up reading the blog post once something becomes confusing. Most of the time there’s not any insight to be learned by investigating deeper.
This one also has a lot of “It’s not X, it’s Y” type phrasing
Maybe if most people would agree the corporation is big and bad and should have penalties, it’s more democratic to go with that decision than the decision nine unelected philosopher kings come up with.
Yeah that's odd. It seems like you'd want an n-1 dimensional grid on the surface of the unit sphere rather than an n dimensional grid within which the sphere resides.
Looking at the paper (https://arxiv.org/abs/2504.19874) they cite earlier work that does exactly that. They object that grid projection and binary search perform exceptionally poorly on the GPU.
I don't think they're using a regular grid as depicted on the linked page. Equation 4 from the paper is how they compute centroids for the MSE optimal quantizer.
Why specify MSE optimal you ask? Yeah so it turns out there's actually two quantization steps, a detail also omitted from the linked page. They apply QJL quantization to the residual of the grid quantized data.
My description is almost certainly missing key details; I'm not great at math and this is sufficiently dense to be a slog.
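For anyone who wants the two-step idea in concrete form, here is a rough sketch, not the paper's actual construction. Stage 1 is a stand-in (plain rounding to a uniform grid, which is my assumption, not their MSE-optimal quantizer). Stage 2 keeps one sign bit per random Gaussian projection of the residual, plus the residual's norm as one extra float (also an assumption of this sketch), and uses the standard sign-projection estimator to correct an inner product. All function names are made up for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coarse_quantize(x, step=0.5):
    # Stage 1 stand-in (assumption): round each coordinate to a uniform grid.
    return [step * round(v / step) for v in x]

def sign_sketch(r, proj):
    # Stage 2: keep only the sign of each random projection of the residual.
    return [1 if dot(row, r) >= 0 else -1 for row in proj]

def estimate_dot(q, bits, r_norm, proj):
    # For Gaussian g: E[(g.q) * sign(g.r)] = sqrt(2/pi) * (q.r) / |r|,
    # so rescaling the average by sqrt(pi/2) * |r| estimates q.r.
    m = len(proj)
    acc = sum(dot(row, q) * b for row, b in zip(proj, bits))
    return math.sqrt(math.pi / 2) * r_norm * acc / m

rng = random.Random(0)
d, m = 16, 2000  # vector dimension, number of sign bits (both arbitrary)
x = [rng.gauss(0, 1) for _ in range(d)]
q = [rng.gauss(0, 1) for _ in range(d)]
proj = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]

x_hat = coarse_quantize(x)
r = [a - b for a, b in zip(x, x_hat)]  # residual left over after stage 1
r_norm = math.sqrt(dot(r, r))
bits = sign_sketch(r, proj)

exact = dot(q, x)
coarse_only = dot(q, x_hat)
corrected = coarse_only + estimate_dot(q, bits, r_norm, proj)
print(exact, coarse_only, corrected)
```

The point of the second stage in this sketch is that the correction term is an unbiased estimate of the inner product with the residual, so the coarse quantizer's error does not systematically skew dot products. Using far more sign bits than dimensions, as here, is only to make the estimate visibly tight; it is not a realistic bit budget.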
Yes. Great catch. I simplified the grid just for visualization purposes.
I've updated the visualization. The grid is actually not uniformly spaced. Each coordinate is quantized independently using optimal centroids for the known coordinate distribution. In 2D, unit-circle coordinates follow the arcsine distribution (concentrating near ±1), so the centroids cluster at the edges, not the center.
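A minimal sketch of that 2D case, assuming plain Lloyd's algorithm (1D k-means) as the way to find MSE-optimal centroids; it is not necessarily how the paper's Equation 4 is evaluated. The samples are x = cos(θ) with θ spread evenly over (0, π), which follows the arcsine distribution. The function names and the choice of 4 levels are mine.

```python
import math
from bisect import bisect_right

def arcsine_samples(n):
    # x = cos(theta) with theta evenly spaced over (0, pi): one coordinate
    # of a uniformly distributed point on the unit circle.
    return [math.cos((i + 0.5) * math.pi / n) for i in range(n)]

def lloyd_max_1d(samples, k, iters=100):
    xs = sorted(samples)
    n = len(xs)
    # Initialise each centroid at the median of one of k equal-count buckets.
    centroids = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        # Cell boundaries sit halfway between neighbouring centroids.
        bounds = [(centroids[j] + centroids[j + 1]) / 2 for j in range(k - 1)]
        sums = [0.0] * k
        counts = [0] * k
        for x in xs:
            j = bisect_right(bounds, x)
            sums[j] += x
            counts[j] += 1
        # Move each centroid to the mean of the samples in its cell.
        new = [sums[j] / counts[j] if counts[j] else centroids[j]
               for j in range(k)]
        converged = max(abs(a - b) for a, b in zip(new, centroids)) < 1e-12
        centroids = new
        if converged:
            break
    return centroids

centroids = lloyd_max_1d(arcsine_samples(20000), k=4)
print([round(c, 3) for c in centroids])
```

With 4 levels, a uniform grid on [-1, 1] would put the outer centroids at ±0.75. Here they land further out, and the gap between an outer centroid and its inner neighbour is smaller than the gap across the center, which is the edge clustering described above.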
“TurboQuant, QJL, and PolarQuant are more than just practical engineering solutions; they’re fundamental algorithmic contributions backed by strong theoretical proofs. These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds.”
I also instinctively reacted to that fragment, but at this point I think this is overreacting to a single expression. It's not just a normal thing to say in English, it's something people have been saying for a long time before LLMs existed.
> Redefining AI efficiency with extreme compression
"Redefine" is a favorite word of AI. Honestly no need to read further.
> the key-value cache, a high-speed "digital cheat sheet" that stores frequently used information under simple labels
No competent engineer would describe a cache as a "cheat sheet". Cheat sheets are static, but caches dynamically update during execution. Students don't rewrite their cheat sheets during the test, do they? LLMs love their inaccurate metaphors.
> QJL: The zero-overhead, 1-bit trick
> It reduces each resulting vector number to a single sign bit (+1 or -1). This algorithm essentially creates a high-speed shorthand that requires zero memory overhead.
Why does it keep emphasizing zero overhead? Why is storing a single bit a "trick?" Either there's currently an epidemic of algorithms that use more than one bit to store a bit, or the AI is shoving in extra plausible-sounding words to pad things out. You decide which is more likely.
It's 1:30am and I can't sleep, and I still regret wasting my time on this slop.
I say you're fixating on the wrong signal here. "Redefine" and "cheat sheet" are normal words people frequently use, and I see worse metaphors in human-written text routinely.
It's the structure and rhythm at the sentence and paragraph levels that's the current tell, as SOTA LLMs all seem to overuse clarification constructs like "it's not X, it's Y" and "it's X, a Y and a Z", and "it's X, it's essentially doing Y".
Thing is, I actually struggle to find what's so off-putting about these, given that they're usually used correctly. So far, the best hypothesis I have for what makes AI text stand out is that LLM output is too good. Most text written by real humans (including my own) is shit, with the best of us caring about communicating clearly, and most people not even that; nobody spends time refining the style and rhythm, unless they're writing a poem. You don't expect a blog post or a random Internet article (much less a HN comment) to be written in the same style as a NYT bestseller book for general audience - but LLMs do that naturally, they write text better at paragraph level than most people ever could, which stands out as jarring.
> Either there's currently an epidemic of algorithms that use more than one bit to store a bit, or the AI is shoving in extra plausible-sounding words to pad things out. You decide which is more likely.
Or, those things matter to authors and possibly the audience. Which is reasonable, because LLMs made the world suddenly hit hard against global capacity constraints in compute, memory, and power; between that and edge devices/local use, everyone who pays attention is interested in LLM efficiency.
Not if you view text as a medium for communication, i.e. as a way for a sender to serialize some idea they have in their mind and transfer it to the reader for deserialization.
The AI doesn't know what the sender meant. It can't add any clarity. It can only corrupt and distort whatever message the sender was trying to communicate.
Fixating on these tells is a way for the receiver of the message to detect that it has been corrupted and there is no point in trying to deserialize it. The harder you try to interpret an AI-generated message, the less sense it will make.
LLM prose is very bland and smooth, in the same way that bland white factory bread is bland and smooth. It also typically uses a lot of words to convey very simple ideas, simply because the data is typically based on a small prompt that it tries to decompress. LLMs are capable of very good data transformation and good writing, but not when they are asked to write an article based on a single sentence.
That's true. I.e. it's not that they're not capable of doing better, it's just whoever's prompting them is typically too lazy to add an extra sentence or three (or a link) to steer it to a different region of the latent space. There's easily a couple dozen dimensions almost always left at their default values; it doesn't take much to alter them and nudge the model to sample from a more interesting subspace style-wise.
(Still, it makes sense to do it as a post-processing style transfer step, as verbosity is a feature while the model is still processing the "main" request: each token produced is a unit of computation, so the more terse the answer, the dumber it gets. These days that's somewhat mitigated by "thinking" and agentic loops.)
"The X Trick" or "The Y Dilemma" or similar snowclones in a header is also a big AI thing. Humans use this construction too, but LLMs love it out of all proportion. I call it The Ludlum Delusion (since that's how every Robert Ludlum book is titled).
There is also the possibility that the article went through the hands of the company's communications department, which has writers who probably write at LLM level.
Only because people are lazy, and don't bother with a simple post-processing step: attach a bunch of documents or text snippets written by a human (whether yourself or, say, some respected but stylistically boring author), and ask the LLM to match style/tone.
They’re pointing out that if the jar was _filled_ with sand, then of course you can’t fit any rocks in because it’s full. It’s cute but misunderstands the original metaphor I think.
I feel like we give what’s some pretty impressive engineering short shrift because it’s just for entertainment