It does RAG over your codebase and provides code completion. The gain is local inference, and it's actually useful with smaller models.
The plugin itself provides chat too, but my gut feeling is that ggerganov runs several models at the same time, given he uses a 192 GB machine.
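For anyone curious what "local inference for completion" looks like under the hood: llama.vscode talks to a local llama-server, which exposes a fill-in-the-middle `/infill` endpoint. A minimal sketch of building such a request, assuming the default llama.cpp field names and a hypothetical local port 8012 (check your llama-server version for the exact API):

```python
import json

def build_infill_payload(prefix: str, suffix: str, n_predict: int = 64) -> dict:
    """Build the JSON body for a fill-in-the-middle completion request.

    Field names follow llama.cpp's /infill server API; verify against
    your llama-server version before relying on them.
    """
    return {
        "input_prefix": prefix,   # code before the cursor
        "input_suffix": suffix,   # code after the cursor
        "n_predict": n_predict,   # max tokens to generate
    }

payload = build_infill_payload("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
print(json.dumps(payload, indent=2))

# To actually send it you'd need a running server, e.g.:
#   llama-server -m some-model.gguf --port 8012
# then POST the payload to http://127.0.0.1:8012/infill and read
# the "content" field of the JSON response.
```

The editor plugin does essentially this on every keystroke, which is why a small, fast model matters more here than a huge one.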
Have not tried this scenario yet, but looking at my API bill I'm probably going to try 100% local dev at some point. Besides, vibe coding with existing tools doesn't seem to work that well for enterprise-size codebases.
> Thanks! I was trying to really push the idea of having it be almost a business book with code snippets. When I went through college, you could either study business or study software. But these days being good at both is kind of table stakes
1. I will buy a business book with code snippets
2. I am biased about 1, because I also agree with the premise that being good at both (or at least passable) is table stakes these days
Is this the one? https://github.com/ggml-org/llama.vscode It seems to be built for code completion rather than outright agent mode