I find it really interesting that it uses a Mamba/Transformer hybrid. Is it the only significant model right now using (at least partly) SSM layers? That must contribute to lower VRAM requirements, right? Does it impact how KV caching works?
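For rough intuition (back-of-the-envelope only; the layer counts and dimensions below are made up, not taken from any actual model): attention layers need a KV cache that grows linearly with context length, while an SSM/Mamba layer carries a fixed-size recurrent state, so a hybrid only pays the cache cost for its attention layers.

```python
# Illustrative memory math; all shapes are hypothetical.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # Keys and values for every attention layer, every token: grows with seq_len.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

def ssm_state_bytes(n_layers, d_model, d_state, bytes_per_elem=2):
    # Each SSM layer keeps one (d_model x d_state) recurrent state: no growth.
    return n_layers * d_model * d_state * bytes_per_elem

# Hypothetical hybrid: 8 attention layers + 40 Mamba layers.
for seq_len in (4_096, 32_768, 262_144):
    attn = kv_cache_bytes(n_layers=8, n_kv_heads=8, head_dim=128, seq_len=seq_len)
    ssm = ssm_state_bytes(n_layers=40, d_model=4096, d_state=128)
    print(f"{seq_len:>7} tokens: KV cache {attn / 1e9:.2f} GB, SSM state {ssm / 1e9:.3f} GB")
```

At long contexts the KV cache dominates, which is why swapping most attention layers for SSM layers cuts inference VRAM so much.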
Maybe what they should do in the future is just automatically provide AI reviews for all papers, and state that the reviewers' job is to correct any problems and fill in details that were missed. That would encourage manual review of the AI's work, and it would let authors predict, in a structured way, what kind of feedback they'll get. (E.g., if the standard prompt were made public, authors could optimize their submissions for the initial automatic review, forcing the human reviewer to fill in the gaps.)
OK, of course the human reviewers could still use AI here, but then so could the authors, ad infinitum...
A lot of "generative" work is like this. While you can come up with benchmarks galore, at the end of the day how a model "feels" only seems to come out from actual usage. Just read /r/localllama for opinions on which models are "benchmaxed" as they put it. It seems to be common knowledge in the local LLM community that many models perform well on benchmarks but that doesn't always reflect how good they actually are.
In my case, I was until recently working on TTS, and this was a huge barrier for us. We used all the common signal-quality and MOS-prediction models that judge so-called "naturalness", "expressiveness", etc. But none of them really helped us decide when one model was better than another, or when a model was "good enough" for release. Our internal evaluations correlated poorly with them, and we even disagreed quite a bit within the team on output quality. This made hyperparameter tuning, as well as commercial planning, extremely difficult, and we suffered greatly for it. (Notice my use of the past tense here...)
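To make "correlated poorly" concrete, here's the kind of sanity check I mean, with invented numbers: rank your checkpoints by human rating and by the automated metric, and see whether the rankings agree at all.

```python
# Hypothetical data for illustration; the point is the check, not the numbers.
from scipy.stats import spearmanr

human_mos    = [4.2, 3.8, 4.5, 3.1, 4.0, 3.6]        # team ratings per checkpoint
metric_score = [0.71, 0.74, 0.69, 0.70, 0.73, 0.72]  # automated "naturalness" score

rho, p = spearmanr(human_mos, metric_score)
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
# A rho near zero (or negative) means the metric can't rank models the way
# humans do, so it's useless for deciding which checkpoint to ship.
```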
Having good metrics is just key, and I'm now at the point where I'd go so far as to say that if good metrics don't exist, it's almost not even worth working on something. (Almost.)
I would love to read more, but apart from not finding a lot of time lately, when I do read, it's fiction. Occasionally I'll read a textbook on a topic I'm really interested in, and I've read blogs and articles on various sciency themes, but when it comes to books, I've just never been very into non-fiction. I don't try often, but when I do, I get one or two chapters in and just... fail to pick it up again.
I know that non-fiction would be "good for me." Particularly reading more on topics I'm less knowledgeable about, like finance and business and politics. Personal growth. However, I do find that fiction helps expand my perspective and even, somehow, my knowledge, but it's different from non-fiction, less direct. I don't read for that explicitly, although I do like the effect. I read because... I guess because it's nice for my brain to be somewhere else. I don't know. But non-fiction has never done it for me... my mind just gets bored, I think, trying to absorb what someone else wants me to know. Even when I find the topic interesting.
I guess there are people who like non-fiction and people who like fiction, and they often cross over, but I think most people lean one way or the other. I can see positives and negatives on either side. People who read both equally must be rare? Or maybe that's just my impression.
I think this depends heavily on which non-fiction, particularly contrasted with which fiction, you're currently reading.
I don't think reading the same self-help books as a bunch of CEOs who see themselves as bold outsiders to the system will actually benefit you; it didn't make them self-aware.
Fiction contains information and ideas; it helps you expand your horizons, and that's generally a good thing. As long as you're not reading a very limited subset of fiction, it will be beneficial.
Reading science fiction has given me ideas that I would have never had before. I can comfortably say that it has expanded my narrow mind. Even pulp space-opera helped here!
Apart from that, taking the time to grok the architecture or top-rated issues of open source projects helps make you a better developer, or at least avoid obvious mistakes when coding some new feature of your own.
It is a strange phenomenon, though, these walls of text that LLMs output, when you consider that one thing they're really good at is summarization, and that if they're trained on bug-report data, you'd expect them to reproduce its style and conciseness.
Is it mainly post-training that causes this behaviour? They seem to do it for everything, like they're really biased towards super-verbose output these days. Maybe it's something to do with reasoning models being trained to produce longer outputs?
This sounds more correct to me. I've read somewhere that better generalization is usually associated with wider, smoother minima, and that this is why regularization matters: it has a smoothing effect on the loss landscape.
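Here's a toy way to see why flatness should matter for generalization (my own sketch, assuming test loss is roughly the training landscape with the minimum shifted a little): expand the loss to second order around the minimum and compare the penalty at different curvatures.

```python
# Near a minimum w*, L(w* + d) ≈ L(w*) + 0.5 * h * d**2, where h is the
# curvature (the 1D Hessian). If going from train to test shifts the optimal
# weights slightly, a flat minimum (small h) pays a much smaller penalty.
shift = 0.05  # hypothetical train -> test shift of the minimum's location
for h, label in [(1.0, "flat minimum"), (100.0, "sharp minimum")]:
    penalty = 0.5 * h * shift ** 2
    print(f"{label:>13} (h = {h:>5.1f}): extra test loss ≈ {penalty:.5f}")
```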
Yes. This is also not hard to see intuitively from scratch.
Say you have a smooth but highly flexible model y = f(x) and some data points you are fitting with a machine learning algorithm. For whatever reason, the algorithm decides it wants to reduce training error by interpolating some specific point (x0, y0) without negatively affecting training error on nearby points. The direct, guaranteed-successful way to do this is to adjust the model so that f(x0) = y0 exactly, by adding a Dirac delta at x0 and leaving the rest of f exactly as-is. But this cannot be done in a differentiable model, as it would create a discontinuity. The next best thing such a model can actually do is replace the Dirac delta with a smooth but very narrow bump (e.g. a Gaussian). But this narrow bump will inevitably have extremely high curvature at x0: the bump is flat at its peak, yet it has to merge with the neighborhood around x0 within a very short distance.
Think of driving: if you have to change lanes in a very short distance, you're going to have to steer hard. Steering is curvature.
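To put a number on that curvature (my own sketch of the argument above): a Gaussian bump of amplitude A and width sigma has second derivative -A/sigma^2 at its peak, so the curvature explodes as the bump narrows.

```python
import numpy as np

# A narrow Gaussian bump used to interpolate a single point (x0, y0)
# while leaving the rest of the model essentially untouched.
def bump(x, x0=0.0, amplitude=1.0, sigma=0.1):
    return amplitude * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

# The second derivative at the peak is -amplitude / sigma**2; verify it
# numerically with a central difference and watch it blow up as sigma shrinks.
x0, amplitude, eps = 0.0, 1.0, 1e-6
for sigma in (1.0, 0.1, 0.01):
    numeric = (bump(x0 - eps, sigma=sigma) - 2 * bump(x0, sigma=sigma)
               + bump(x0 + eps, sigma=sigma)) / eps ** 2
    print(f"sigma = {sigma:>4}: curvature at x0 ≈ {numeric:.1f} "
          f"(exact: {-amplitude / sigma ** 2:.1f})")
```

Halve the width and the curvature quadruples: exactly the "steer hard in a short distance" picture.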