Thanks. The hardest part has been slogging through the segfaults and documenting all the unprincipled things I've had to add. Post-bootstrap, I have to undo it all, because my IR is a semantically rich JSON format that is Turing-incomplete by design. I'm building a substrate for rich applications over bounded computation, like eBPF but for applications and inference.
I don't buy this. I've long wondered whether the larger models, while exhibiting more useful knowledge, aren't also more wasteful, as we greedily explore the frontier of "bigger is getting us better results, so make it bigger". Qwen3-Coder-Next seems to be a point in favor of that thought: we need to spend some time exploring what smaller models are capable of.
Perhaps I'm grossly wrong -- I guess time will tell.
You are not wrong: small models can be trained for niche use cases, and there are lots of people and companies doing that. The problem is that you need one of those for each use case, whereas the bigger models can cover a much wider problem space.
There is also the counter-intuitive phenomenon where training a model on a wider variety of content than apparently necessary for the task makes it better somehow. For example, models trained only on English content exhibit measurably worse performance at writing sensible English than those trained on a handful of languages, even when controlling for the size of the training set. It doesn't make sense to me, but it probably does to credentialed AI researchers who know what's going on under the hood.
Not an AI researcher and I don't really know, but intuitively it makes a lot of sense to me.
To do well as an LLM you want to end up with the weights that get furthest in the direction of "reasoning".
So assume that with just one language there's a possibility of getting stuck in local optima: weights that do well on the English test set but don't reason well.
If the same model size then has to learn several languages with the same number of weights, that eliminates a lot of those local optima, because unless the weights land in a regime where real reasoning/deeper concepts are "understood", it's simply not possible to do well across several languages with that weight budget.
And if you speak several languages, that naturally brings in more abstraction: the concept of "cat" is different from the word "cat" in any given language, and so on.
Is that counterintuitive? If I had a model trained on 10 different programming languages, including my target language, I would expect it to do better than a model trained only on my target language, simply because it has access to so much more code/algorithms/examples than my language alone.
i.e. there is a lot of commonality between programming languages just as there is between human languages, so training on one language would be beneficial to competency in other languages.
Cool, I didn't know about this phenomenon. Reading up a little, it seems like multilingual training forces the model to optimize its internal "conceptual layer" weights better instead of relying solely on English linguistics. Papers also mention issues arising from overdoing it, so my guess is even credentialed AI researchers are currently limited to empirical methods here.
Between GLM-4.7-Flash and this announcement, THIS is what I'm excited to see in this space: pushing the capabilities of _small_ models further and further. It really feels like we're breaking into a space where models that can run on hardware I actually own are getting better and better, and that has me excited.
Just going to jump in here and say that there's another reason I might want Rust with a Garbage Collector: The language/type-system/LSP is really nice to work with. There have indeed been times that I really miss having enums + traits, but DON'T miss the borrow checker.
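To make "enums + traits" concrete: the combination I miss is roughly the sketch below. It's a throwaway Rust example with made-up names, not anything from a real project, and note that nothing in it ever fights the borrow checker:

    // A sum type: the compiler forces every `match` to cover all variants.
    enum Shape {
        Circle { radius: f64 },
        Rect { w: f64, h: f64 },
    }

    // A trait: shared behaviour without inheritance.
    trait Area {
        fn area(&self) -> f64;
    }

    impl Area for Shape {
        fn area(&self) -> f64 {
            match self {
                Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
                Shape::Rect { w, h } => w * h,
            }
        }
    }

    fn main() {
        let shapes = vec![
            Shape::Circle { radius: 1.0 },
            Shape::Rect { w: 2.0, h: 3.0 },
        ];
        for s in &shapes {
            println!("{}", s.area());
        }
    }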
Maybe try a different ML-influenced language like OCaml or Scala. The main innovation of Rust is bringing a nice ML-style type system to a more low level language.
I wouldn't recommend OCaml unless you plan to never support Windows. It finally does support it in OCaml 5, but it's still based around Cygwin, which totally sucks balls.
Also the OCaml community is minuscule compared to Rust's. And the syntax is pretty bonkers in places, whereas Rust is mostly sane.
Compile time is pretty great though. And the IDE support is also pretty good.
There are other nice things about Rust over OCaml that are mainly just due to its popularity. There are libraries for everything, the ecosystem is polished, you can find answers to any question easily, etc. I don't think the same can be said for OCaml, or at least not to the same extent. It's still a fairly niche language compared to Rust.
I remember about 5 years ago, Stack Overflow for OCaml was a nightmare. It was a mishmash of Core (from Jane Street), Batteries, and raw OCaml. New developers were confronted with the prospect of opening multiple libraries with the same functionality (not the correct way of solving any problem).
Jane Street apparently has a version of OCaml extended with affine types. I'd like to test that, because that would (almost) be the best of all worlds.
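(For anyone unfamiliar: "affine" roughly means a value may be used at most once. Rust's move semantics already behave this way, so here's a tiny Rust illustration of the idea; I'm not claiming this is how Jane Street's extension actually looks.)

    struct FileHandle {
        path: String,
    }

    // Taking `h` by value "uses up" the handle.
    fn consume(h: FileHandle) {
        println!("closing {}", h.path);
    }

    fn main() {
        let h = FileHandle { path: "/tmp/x".to_string() };
        consume(h);
        // consume(h); // error[E0382]: use of moved value: `h`
        // An affine type system rejects that second use at compile time.
    }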
I think you're referring to OxCaml. I'd love to see this make a huge splash. Right now one of the biggest shortcomings of OCaml is that you're still stuck implementing so much stuff from scratch. Languages like Rust, Go, and Java have HUGE ecosystems. OCaml is just as old as these languages (even older than Rust, which it inspired; Rust's original compiler was written in OCaml). Since it's not been as popular, it's hard to find well-supported libraries.
I too hope that OxCaml's features bring new blood to OCaml. I've been using OCaml for a few years for personal projects, and I find the language really simple and powerful at the same time, but I had to implement some foundational libraries myself (e.g. proper JSON, parser combinators), and now I'm considering porting one of those projects to Rust just so I can have unboxed types and better Windows support.
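By "foundational" I mean things like the following: a stripped-down parser-combinator core, sketched here in Rust (since that's where the port would go), with made-up names and no real error handling, just to show the shape of what you end up hand-rolling in OCaml:

    // A parser takes a &str and, on success, returns the parsed value
    // plus the remaining input.
    type Parser<'a, T> = Box<dyn Fn(&'a str) -> Option<(T, &'a str)> + 'a>;

    // Parse one specific character.
    fn ch<'a>(expected: char) -> Parser<'a, char> {
        Box::new(move |input: &'a str| {
            let mut chars = input.chars();
            match chars.next() {
                Some(c) if c == expected => Some((c, chars.as_str())),
                _ => None,
            }
        })
    }

    // Run two parsers in sequence, pairing their results.
    fn seq<'a, A: 'a, B: 'a>(p: Parser<'a, A>, q: Parser<'a, B>) -> Parser<'a, (A, B)> {
        Box::new(move |input: &'a str| {
            let (a, rest) = p(input)?;
            let (b, rest) = q(rest)?;
            Some(((a, b), rest))
        })
    }

    fn main() {
        let ab = seq(ch('a'), ch('b'));
        assert_eq!(ab("abc"), Some((('a', 'b'), "c")));
        assert_eq!(ab("xbc"), None);
    }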
> even older than Rust
That's an understatement, (O)Caml is between 17 and 25 years older than Rust 0.1 depending on which Caml implementation you start counting from.
I always wrote Lua off, scoffing at the 1-based indexing, until I was "forced" to learn it thanks to Neovim. What a delightful little language it is. I do wish I could do certain things less verbosely (lambdas would be nice) -- but then again, I defeat myself by suggesting it, because not having all the features makes Lua so approachable.
I used Lua professionally. I prefer the 1-based indexing... it just feels more natural. For some reason the C apologists here will scream that 0-based is the only way to go (it isn't; it's just a historical artifact). Languages like Ada let you use 0, 1, or any arbitrary starting index.
Same here. In fact, something I wish the Neovim team would do is create a book where popular plugin authors write tutorials that recreate the basic functionality of their plugins.
Seems like a no-brainer that would help bring in more revenue too, and it'd be an "evergreen" book since new authors can contribute over time.
I can't be the only one that would immediately buy a copy. :D
I'm actually trying to work on a video series to do just this. I've made my own rudimentary plugins reproducing several popular ones, and would like to walk through how I made: a) a file tree, b) a picker/fzf replacement, c) a hop/leap replacement, d) a surround plugin, e) a code formatter, f) a hydra (sub-modes) plugin, g) many "UI" (interactive) buffers, etc.
None of these are published because the popular ones are better and provide more functionality, but I want to share what I believe is more valuable: what I learned while writing them.
(I personally don’t use patches like this because “Lua 5.1” is something pretty standardized with a bunch of different implementations; e.g. I wrote my Lua book with a C# developer who was using the moonsharp Lua implementation)
Mise is a hard sell for me when I can have pure Nix shells. However, I can see this gaining wider adoption since its learning curve is so much lower than Nix's.
I've seen several Mac users have the same experience: going all-in on nix-darwin and then getting frustrated. But nix-darwin is one of the worst ways of getting into Nix, because its goal is to make your whole macOS system configurable with Nix, but macOS is a moving target and (unlike Linux) not built to be modular at all. I know people put a lot of hard work into nix-darwin, but it's simply not the main focus of Nix as a whole and sadly it might not ever become a seamless experience. (I'm not a mac user so not keeping up, but I do see colleagues trying it out from time to time.)
The solution here is: use Nix but don't use nix-darwin (at least not until you're generally comfortable with Nix for package management and dev shells). You do NOT have to use nix-darwin on Mac to reap 80% of the benefits of Nix (especially in a team setting).
After dropping nix-darwin, I think almost everyone will find that it's very easy to use Nix for sharing project setups with bespoke tooling. I just had a new team member onboard, knowing nothing about Nix, in a day or less, with several different languages and unusual tools.
> After dropping nix-darwin, I think almost everyone will find that it's very easy to use Nix for sharing project setups with bespoke tooling
Ahh, but I tried that too. I originally decided to play with nix-darwin because I was on a contract that used nix in their repos to ease onboarding of academic collaborators.
In practice, it was complicated enough that most of us ended up relying on the 2 nix experts to make any real changes, and when they left, the nix configs stagnated.
It might be the case that nix-darwin, and our particular python/ML repos, were "hard mode" for nix, but I truly think I gave it a fair shake.
If nix requires a lot of effort to do anything off the beaten path, it's just not the tool for me.
To be clear, I don't try to Nix-everything. I just use it to 1) install a bunch of CLI tools into my nix-env, and 2) set up dev shells. That's pretty much it, but even that is a huge boon. Still, I'm keeping an eye on mise, for sure.
If you think of it more in the context of making it easy for people other than you (and your bespoke machine) to bootstrap a project, that's where it really shines. The TOML config is very simple for people to understand.
I use it because I want people to be able to get projects up and running quickly without having to comb through an outdated README, trying to deal with all of the different ways people like to install and use non-compiled languages, etc. Managing anything Node/Ruby/Python is all annoying.
I have tried similar workflows (Neovim + Opencode/Codex CLI), and for me, the biggest downside compared to Cursor is the lack of a tab completion model as good as Cursor's. Supermaven is the best one I've found so far for Neovim, but it gives worse suggestions and can only suggest changes on the same line you are on.
Agree - leaps and bounds beyond anything I would have dreamed possible a few years ago...but... IDK, if I'm honest, the sound was way off too, not just the visuals. The music sounded detuned slightly, and the crowd noise was "crackly" etc. etc. It had a low-fidelity "quality" to it.
Personally, I have mixed feelings. I'm impressed, but I'm not looking forward to the new "movies" that are going to litter YouTube et al., generated from this.