An improvement on "git not handling non-text files" is a semantic understanding step, i.e. parsing, between the file write and version tracking.
Take a docx, write the file, parse it into entities e.g. paragraph, table, etc. and track changes on those entities instead of the binary blob. You can apply the same logic to files used in game development.
The hard part is making this fast enough. But I am working on this with lix [0].
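A minimal sketch of that parse-and-track idea (hypothetical types, not the actual lix API): parse both versions of a file into entities with stable ids, then diff the entity sets instead of the binary blobs.

```typescript
// Sketch only: assumes entities carry a stable id that survives re-parsing.
interface Entity {
  id: string;          // stable identity, e.g. a paragraph id inside a docx
  type: string;        // "paragraph", "table", ...
  value: unknown;      // the parsed content
}

type ChangeOp = "created" | "updated" | "deleted";

interface Change {
  entityId: string;
  op: ChangeOp;
  snapshot?: unknown;  // new value (absent for deletions)
}

// Compare two parsed versions of the same file entity by entity.
function diffEntities(before: Entity[], after: Entity[]): Change[] {
  const prev = new Map(before.map((e) => [e.id, e]));
  const next = new Map(after.map((e) => [e.id, e]));
  const changes: Change[] = [];

  for (const [id, e] of next) {
    if (!prev.has(id)) {
      changes.push({ entityId: id, op: "created", snapshot: e.value });
    } else if (JSON.stringify(prev.get(id)!.value) !== JSON.stringify(e.value)) {
      changes.push({ entityId: id, op: "updated", snapshot: e.value });
    }
  }
  for (const id of prev.keys()) {
    if (!next.has(id)) changes.push({ entityId: id, op: "deleted" });
  }
  return changes;
}
```

The resulting change list, not the blob, is what gets versioned.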
Simple left or right merge. One overwrites the other one.
The appeal of structured file formats like .docx, .json, etc. is exactly this. Images are unstructured, and a simple "do you want to keep the left or right image?" is good enough.
That doesn't really address the game dev use case then. Artists and designers want to prevent conflicts, not just throw away half the work and redo it.
Ok well what if I draw the foreground and you add something to the background and now my changes visually block your changes? Even if the file is merged, our work is wasted and must be redone. P4 is often popular in industry because artists can lock files and inform others that work is being done in that area.
If you actually want to capture those customers it's a use case that needs to be addressed.
I only diff the changed files. Producing a blob from the BASON AST is trivial (one scan). Things may get slow for larger files, e.g. the tree-sitter C++ parser is a 25MB C file, 750KLoC; it takes a couple of seconds to import. But it never changes, so no biggie.
There is room for improvement, but nothing show-stopping so far. I plan to round-trip the Linux kernel with full history; that should surface all the bottlenecks.
P.S. I checked lix. It uses a SQL database. That solves some things, but also creates an impedance mismatch; must be a 10x slowdown at least. I use a key-value store and a custom binary format, so it works nicely. One could go a level deeper still and use a custom storage engine; it would be even faster. Git is all custom.
Good framing. Source code is already a serialization of an AST; we just forgot that and started treating it as text. The practical problem is adoption: every tool in the ecosystem reads bytes.
This is exactly the reason why weave stays on top of git instead of replacing storage. Parsing three file versions at merge time is fine (it was about 5-67ms); parsing on every read/write would be a different story. I know about Lix, but will check it out again.
I had never heard of Emdash before, and I follow AI tools closely. It just shows how much noise there is and how hard it is to promote apps. Emdash looks solid. I almost went and built something similar because I wasn't aware of it.
I am not sure the multi-agent approach is what it's hyped up to be. As long as we are working on parallel work streams with defined contracts (say, an agreed-upon API definition that the backend implements and the frontend uses), I'd assume that running independent agent coding sessions is faster and in fact more desirable, so that neither side bends the code to comply with under-specified contracts.
Usually I find the hype is centered around creating software no one cares about. If you're creating a prototype for dozens of investors to demo, I seriously doubt you'd take the "mainstream" approach.
Maybe a dumb question on my side, but if you are using a GUI like Emdash with Claude Code, are you getting the full Claude Code harness under the hood, or are you "just" leveraging the model?
I can answer for Conductor: you're getting the full Claude Code, it's just a GUI wrapper on top of CC. It makes it easy to create worktrees (1 click) and manage them.
It can, but the plugins are not developed for production readiness yet. I should clarify that.
The way to write a plugin:
Take an off the shelf parser for pdf, docx, etc. and write a lix plugin. The moment a plugin parses a binary file into structured data, lix can handle the version control stuff.
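As a sketch, a plugin under this model only needs to map bytes to entities and back. The interface and names below are illustrative assumptions, not the real lix plugin API:

```typescript
// Hypothetical plugin contract: the plugin parses, lix versions.
interface DetectedEntity {
  entityId: string;    // stable id, e.g. a docx paragraph id
  schemaKey: string;   // e.g. "docx_paragraph", "docx_table"
  snapshot: unknown;   // parsed content of the entity
}

interface FileFormatPlugin {
  glob: string;        // which files this plugin handles
  // binary file -> structured entities
  parse(data: Uint8Array): DetectedEntity[];
  // structured entities -> binary file (round trip)
  serialize(entities: DetectedEntity[]): Uint8Array;
}

// A toy plugin for a line-based format, just to show the contract:
const txtPlugin: FileFormatPlugin = {
  glob: "*.txt",
  parse(data) {
    const text = new TextDecoder().decode(data);
    return text.split("\n").map((line, i) => ({
      entityId: `line_${i}`,
      schemaKey: "txt_line",
      snapshot: line,
    }));
  },
  serialize(entities) {
    return new TextEncoder().encode(
      entities.map((e) => String(e.snapshot)).join("\n"),
    );
  },
};
```

A real docx or pdf plugin would delegate `parse` to an off-the-shelf parser; the shape of the contract stays the same.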
The merge algebra is similar to git's three-way merge. Given that lix tracks individual changes, the three-way merge is more fine-grained.
In case of a conflict, you can either decide to do last write wins or surface the conflict to the user e.g. "Do you want to keep version A or version B?"
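A sketch of entity-level three-way merging (illustrative, not lix's actual implementation): because changes are tracked per entity, a conflict only arises when both sides changed the same entity, and that conflict can either be surfaced to the user or resolved last-write-wins.

```typescript
type Snapshot = string | null; // null = entity deleted

interface MergeResult {
  merged: Map<string, Snapshot>;
  conflicts: { entityId: string; a: Snapshot; b: Snapshot }[];
}

function threeWayMerge(
  base: Map<string, Snapshot>,
  a: Map<string, Snapshot>,
  b: Map<string, Snapshot>,
): MergeResult {
  const merged = new Map<string, Snapshot>();
  const conflicts: MergeResult["conflicts"] = [];
  const ids = new Set([...base.keys(), ...a.keys(), ...b.keys()]);

  for (const id of ids) {
    const vBase = base.get(id) ?? null;
    const vA = a.get(id) ?? null;
    const vB = b.get(id) ?? null;
    let result: Snapshot;
    if (vA === vB) result = vA;            // both sides agree
    else if (vA === vBase) result = vB;    // only B changed this entity
    else if (vB === vBase) result = vA;    // only A changed this entity
    else {                                 // both changed it: surface conflict
      conflicts.push({ entityId: id, a: vA, b: vB });
      continue;
    }
    if (result !== null) merged.set(id, result);
  }
  return { merged, conflicts };
}
```

For last-write-wins, the conflict branch would simply pick one side instead of recording the conflict.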
The SQL engine is unrelated to merging. Lix uses SQL as its storage and query engine, but not for merges.
> But then the first thing it talks about is diffing files. Which honestly shouldn’t even be a feature of VCS. That’s just a separate layer.
There is nuance between git's line-by-line diffing and what lix does.
For text diffing, it holds true that diffing is a separate layer. Text files are small, which allows on-the-fly diffing by comparing two documents; that's what git does.
On the fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive. Both in terms of materializing two files at specific commits, and then diffing these two files.
What lix does under the hood is tracking individual changes, _which allows rendering a diff without on the fly diffing_. So lix is kind of responsible for the diffs but only in the sense that it provides a SQL API to query changes between two states. How the diff is rendered is up to the application.
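The difference can be illustrated with a toy change log (a hypothetical data model, not lix's actual schema): if every write records entity-level change rows, rendering a diff or blame is a lookup over stored rows, with no file materialization and no diff algorithm.

```typescript
// Assumed model: one row per touched entity per commit, written at save time.
interface ChangeRow {
  commit: number;
  entityId: string;
  snapshot: string | null; // null = entity deleted
}

// Changes recorded at write time by the parse step:
const changeLog: ChangeRow[] = [
  { commit: 1, entityId: "C43", snapshot: "=SUM(A1:A10)" },
  { commit: 2, entityId: "B2", snapshot: "hello" },
  { commit: 3, entityId: "C43", snapshot: "=SUM(A1:A20)" },
];

// "Blame" for one entity is a filter over stored changes:
// no materializing two file versions, no diffing them.
function entityHistory(log: ChangeRow[], entityId: string): ChangeRow[] {
  return log.filter((row) => row.entityId === entityId);
}
```

The SQL API mentioned above is the query interface over exactly this kind of stored change data.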
> On the fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive. Both in terms of materializing two files at specific commits, and then diffing these two files.
I don’t think that’s actually true?
How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?
I’ve worked with some tools that can diff images. Works great. Not a problem in need of solving.
In any case I’ll give benefit of the doubt that this project solves some real problem in a useful way. I’m not sure what it is.
My goals in a VCS for binary files seem to be very very very different than yours.
> How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?
If version control is embedded in an app, constantly.
Imagine a cell in a spreadsheet. An application wants to display a "blame" for a cell C43 i.e. how did the cell change over time?
The lix way is this SQL query:

    SELECT * FROM state_history
    WHERE file_id = '<the_spreadsheet>'
      AND schema_key = 'excel_cell'
      AND entity_id = 'C43';
Diffing on the fly is not possible. The information on what changed needs to be available without diffing. Otherwise, diffing the entire spreadsheet file at every commit to find out how cell C43 changed takes ages.
[0] https://github.com/opral/lix