I do feel that GitHub's product development has been less exciting in recent years, but that's natural for any maturing platform. While I can't judge whether there are fewer talented people involved, I haven't noticed an increase in mistakes, and the platform continues to grow. It would be unfair to overlook the hard work that goes into maintaining GitHub and shipping new features (even if some of those features aren't to everyone's taste). I'm grateful for GitHub and hope it continues to thrive. Peace.
Seeing systems used in the most advanced areas of human civilization never fails to amaze me. They were created half a century ago, yet they still function flawlessly in the autonomous, harsh environment of space. Meanwhile, I consider it a win if my Python API server survives a month without breaking. I always wonder: how did those engineers create something so robust, while I, despite standing on the shoulders of decades of software engineering progress, seem unable to avoid introducing bugs with every commit?
Back then, management cared that their one chance would work. Today, management just wants it to mostly work.
Incentives and goals are very different between the two. We could very much build even more incredible things today, and I would argue that we actually do; just only in the places that matter enough to justify that kind of special effort.
I really wish there were a de facto state-of-the-art coding agent that is LLM-agnostic, so that LLM providers wouldn't keep reinventing the wheel with tools like Codex and Gemini CLI. Models should be pluggable providers, not bundled into independent programs. That way, the agent could focus on refining its agentic logic and improve faster than ever before.
Currently Claude Code is the best, but I don't think Anthropic would pivot it into what I described. Maybe we still need to wait for the next groundbreaking open-source coding agent to come out.
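To make the pluggable-provider idea concrete, here's a minimal sketch (the interface and names are hypothetical, not any existing tool's API):

    from dataclasses import dataclass, field
    from typing import Protocol

    @dataclass
    class Completion:
        text: str
        tool_calls: list[dict] = field(default_factory=list)

    class LLMProvider(Protocol):
        # Anything that turns a chat transcript plus tool schemas into a completion.
        def complete(self, messages: list[dict], tools: list[dict]) -> Completion: ...

    class AnthropicProvider:
        def complete(self, messages: list[dict], tools: list[dict]) -> Completion:
            ...  # call Anthropic's API here

    class OpenAIProvider:
        def complete(self, messages: list[dict], tools: list[dict]) -> Completion:
            ...  # call OpenAI's API here

    def agent_loop(provider: LLMProvider, task: str) -> None:
        # All the agentic logic (planning, tool dispatch, retries) lives here,
        # written once, independent of whichever provider is plugged in.
        messages = [{"role": "user", "content": task}]
        while True:
            result = provider.complete(messages, tools=[])
            messages.append({"role": "assistant", "content": result.text})
            if not result.tool_calls:
                break  # model is done; no more tools to run

The agent's value would then live entirely in the loop, and swapping vendors would be a one-line change.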
There is Aider (aider.chat), and it has been around for a couple of years now. Great tool.
Alas, you don't install Claude Code or Gemini CLI for the actual CLI tool. You install it because the only way agentic coding makes sense is through subscription billing at the vendor - SOTA models burn through tokens too fast for pay-per-use API billing to make sense here; we're talking literally a day of basic use costing more than a monthly subscription to the Max plan at $200 or so.
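For a rough back-of-envelope (all figures below are illustrative assumptions, not measured numbers):

    # Hypothetical daily volume for heavy agentic use; agents re-read a lot of context.
    input_tokens = 20_000_000
    output_tokens = 1_000_000

    # Assumed frontier-model API rates, in dollars per token.
    price_in = 15 / 1_000_000
    price_out = 75 / 1_000_000

    daily_cost = input_tokens * price_in + output_tokens * price_out
    print(f"${daily_cost:.0f} per day")  # -> $375 per day, vs ~$200/month for a subscription

At anywhere near those volumes, the subscription pays for itself on day one.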
Aider is in a sad state. The maintainer hasn't "maintained" it for quite some time now (look at the open PRs and issues). It's definitely not state of the art anymore, but it was one of the first and best in its day. A fork, Aider CE, was created by some members of the Discord community: https://github.com/dwash96/aider-ce The fork looks promising and works well, but there is (sadly) so much more development happening in the other AI CLI tools nowadays.
With increasingly aggressive usage limits (Claude now has weekly limits), the "agentic" style of token burning seems much less practical to me. Coming from Aider and trying tools like OpenCode, the "use models to discover the relevant files" pattern seems very token-heavy and even wasteful, whereas with Aider you include the relevant files up front and spend your tokens on the real work.
Opencode (from SST; not the thing that got rebranded as Crush) seems to be just that. I've had a very good experience with it over the last couple of days, having previously used gemini-cli quite a bit. Opencode also has/hosts a couple of "free" model options right now, which are quite decent IMO.
There are many, many similar alternatives, so here's a random sampling: Crush, Aider, Amp Code, Emacs with gptel/acp-shell, and Editor Code Assistant (which aims for an editor-agnostic backend that plugs into different editors).
Finally... there is quite a lot of scope for co-designing the affordances/primitives supported by the coding agent and the LLM backing it (especially in LLM post-training). So factorizing these two into completely independent pieces seems unlikely, for now, to yield the most powerful capabilities.
It’s not gonna happen any time soon. The model is fine-tuned on traces generated by the scaffolding (e.g., dependent on what tool calls are available), and the scaffolding is co-developed with the strengths/weaknesses of the specific model.
Cherry Studio is my daily go-to. I hope Onyx desktop can be a great alternative for personal users who just want a dedicated app to access any LLM with the full power of MCP and various tools.
Great article! I always feel that the choice of embedding model is quite important, but it's seldom mentioned. Most tutorials about RAG just tell you to use a common model like OpenAI's text embeddings, making it seem as though any model would do. Even though I'm somewhat aware of this, I lack the knowledge and methods to determine which model is best suited for my scenario. Can you give some suggestions on how to evaluate that? Also, I'm wondering what you think about open-source embedding models like embeddinggemma-300m or e5-large.
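To be concrete about what I'd want to measure: something like recall@k over a small labeled set of my own queries, run against each candidate model. A rough sketch (the model ids and data are placeholders; sentence-transformers is one way to run the open-source models):

    import numpy as np
    from sentence_transformers import SentenceTransformer

    # Placeholder corpus: doc id -> text, plus (query, relevant doc id) pairs.
    docs = {"d1": "how to rotate API keys", "d2": "setting up TLS certificates"}
    labeled = [("rotate my credentials", "d1"), ("enable https on the server", "d2")]

    def recall_at_k(model_name: str, k: int = 1) -> float:
        # Note: e5-style models expect "query: "/"passage: " prefixes; omitted for brevity.
        model = SentenceTransformer(model_name)
        doc_ids = list(docs)
        doc_vecs = model.encode([docs[d] for d in doc_ids], normalize_embeddings=True)
        hits = 0
        for query, relevant_id in labeled:
            q = model.encode([query], normalize_embeddings=True)[0]
            top_k = np.argsort(-(doc_vecs @ q))[:k]  # rank docs by cosine similarity
            hits += relevant_id in {doc_ids[i] for i in top_k}
        return hits / len(labeled)

    for name in ["intfloat/e5-large-v2", "google/embeddinggemma-300m"]:
        print(name, recall_at_k(name))

Even a hundred labeled pairs from real user queries would probably tell me more than any public leaderboard.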
I'm not against AI-generated userscripts, but I wish I could review and edit the scripts more easily. Currently, the extension hides the code too much and doesn't provide a decent editing experience.
By saying "under your control," I believe it means you are not locked in by cloud providers. Since it's easy to switch and back up between different blob storages, if one is down, your data remains accessible and manageable.