RAG is far more accessible and cheaper than finetuning. But it is true that finetuning is severely overlooked in situations where it would outperform alternatives like RAG.
This assumes the team has equal ability to either engineer a RAG-based system or to finetune an LLM. Those are different skillsets, and even selecting which LLM should be finetuned is a complex question, let alone aligning it, deploying it, optimizing inference, etc.
The budget question comes into play as well. Even if the same text is repeatedly fed to the LLM, that cost is spread out over time, whereas finetuning is a sort of upfront capex; that spread makes RAG financially more accessible.
Now bear in mind, I'm a big proponent of finetuning where applicable and I try to raise awareness of the possibilities it opens. But one cannot deny RAG is a lot more accessible to teams that are more likely developers / AI engineers than ML engineers/researchers.
You are certainly right, managed platforms make finetuning much easier. But managed/closed model finetuning is pretty limited and in fact should be named “distribution modeling” or something.
Results with this method are significantly more limited compared to all the power open-weight finetuning gives you (and the skillset needed in return).
And in either case don’t forget alignment and evals.
> Results with this method are significantly more limited compared to all the power open-weight finetuning gives you (and the skillset needed in return).
I am not sure I understand why you are so certain that finetuned top market models, built by top researchers, will be significantly worse than whatever open-source model you pick.
It’s significantly harder to get right; it’s a very big stepwise increase in technical complexity over in-context learning/RAG.
There are now some lighter versions of finetuning that don’t update all the model weights but instead train small adapter layers, a technique called LoRA, which is way more viable commercially atm in my opinion.
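For a sense of what that looks like in practice, here's a minimal LoRA sketch using Hugging Face's peft library; the base model name and target modules are placeholder assumptions, not a recommendation:

```python
# Minimal LoRA sketch (Hugging Face transformers + peft).
# The base model and target_modules below are placeholders; adjust for your model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],   # attention projections that get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of the full parameter count
# Train `model` as usual; only the adapter weights are updated and saved.
```

Because only the small adapter matrices are trained, the VRAM and storage cost is a fraction of full finetuning, which is what makes it commercially attractive.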
I'm not an expert in either, but RAG is like dropping some 'useful' info into the prompt context, while fine tuning is more like performing a mix of retraining, appending re-interpretive model layers, and/or brain surgery.
I'll leave it to you to guess which one is harder to do.
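To make the contrast concrete, here is a rough sketch of the RAG side: rank your stored snippets against the question and paste the best ones into the prompt. The `embed()` callable is a stand-in for whatever embedding model you use; it's an assumption, not a specific API.

```python
# Simplified RAG sketch: rank stored snippets by embedding similarity,
# then drop the top matches into the prompt context. No model weights change.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def build_prompt(question: str, docs: list[str], embed, top_k: int = 3) -> str:
    q_vec = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(embed(d), q_vec), reverse=True)
    context = "\n\n".join(ranked[:top_k])
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# The finished prompt goes to any LLM, local or API-hosted, unchanged.
```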
There were initial difficulties in finetuning that made it less appealing early on, and that's snowballed a bit into having more of a focus on RAG.
Some of the issues still exist, of course:
* Finetuning takes time and compute; for one-off queries, using in-context learning is vastly more efficient (i.e., look it up with RAG).
* Early finetuning efforts had trouble reliably memorizing information. We've got a much better idea of how to add information to a model now, though it takes more training data.
* Full finetuning is very VRAM intensive; optimizations like LoRA were initially good at transferring style but not content. Today, LoRA content training is viable but requires training code that supports it [1].
* If you need a very specific memorized result and it's costly to get it wrong, good RAG is pretty much always going to be more efficient, since it injects the exact text in context. (Bad RAG makes the problem worse, of course).
* Finetuning requires more technical knowledge: you've got to understand the hyperparameters, avoid underfitting and overfitting, evaluate the results, etc. (a rough sketch of the evaluation side follows this list).
* Finetuning requires more data. RAG works with a handful of datapoints; finetuning requires at least three orders of magnitude more data.
* Finetuning requires extra effort to avoid forgetting what the model already knows.
* RAG works pretty well when the task that you are trying to perform is well-represented in the training data.
* RAG works when you don't have direct control over the model (i.e., API use).
* You can't finetune most of the closed models.
* Big, general models have outperformed specialized models over the past couple of years; if it doesn't work now, just wait for OpenAI to make their next model better on your particular task.
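As a rough illustration of the evaluation point above, here is a minimal eval-harness sketch: score the finetuned model on a held-out task set and on a general set to watch for forgetting. The `generate` callable and the tiny example sets are assumptions for illustration, not a specific API.

```python
# Minimal eval-harness sketch: check that the finetune learned the task
# without forgetting general ability. `generate` is a stand-in for your
# model's inference call (an assumption here).
def accuracy(eval_set: list[tuple[str, str]], generate) -> float:
    """Fraction of prompts whose output contains the expected answer."""
    hits = sum(1 for prompt, expected in eval_set
               if expected.lower() in generate(prompt).lower())
    return hits / len(eval_set)

# Tiny illustrative sets; in practice these are held-out splits of real data.
task_set = [("What is our refund window?", "30 days")]
general_set = [("What is the capital of France?", "Paris")]

def report(generate) -> None:
    print(f"task: {accuracy(task_set, generate):.2%}, "
          f"general: {accuracy(general_set, generate):.2%}")
```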
On the other hand:
* Finetuning generalizes better.
* Finetuning has more influence on token distribution.
* Finetuning is better at learning new tasks that aren't as present in the pretraining data.
* Finetuning can change the style of output (e.g., instruction training).
* When finetuning pays off, it gives you a bigger moat (no one else has that particular model).
* You control which tasks you are optimizing for, without having to wait for other companies to maybe fix your problems for you.
* You can run a much smaller, faster specialized model because it's been optimized for your tasks.
* Finetuning + RAG outperforms just RAG. Not by a lot, admittedly, but there are some advantages.
Plus, RL training for reasoning has been demonstrating unexpectedly effective improvements on relatively small amounts of data & compute.
So there are reasons to do both, but the larger investment that finetuning requires means that RAG has generally been more popular. In general, the past couple of years have been won by the bigger models scaling fast, but with finetuning difficulty dropping there is a bit more reason to do your own finetuning.
That said, for the moment the expertise + expense + time of finetuning makes it a tough business proposition if you don't have a very well-defined task to perform, a large dataset to leverage, or some other way to get an advantage over the multi-billion dollar investment in the big models.
Nuance is hard. Binary choices are fast, comforting, and require less thought. Certainty feels safer than ambiguity — especially in conflict, where complexity threatens identity. And in most arenas (tech, media, politics), decisive hot takes get applause. Fence-sitters get ignored.
I worked at a startup where the CEO swore up and down that real-time fine-tuning was the future — that models would continuously update with company data. It sounded cool until you remember:
That’s not how LLMs work.
It’s not efficient.
It’s not flexible.
And it’s not even necessary — we already have RAG.
Pipedreams make good pitch decks. But they break when you hit production.
It's a fucking pipedream, this. That's not how LLMs work, it's not efficient, it's not useful (we have RAG for reference augmentation), and it’s not even desirable unless you want your model overfitting on stale, internal narratives every night.
Hope one day it will be practical to do nightly finetunes of a model per company on all core corporate data stores.
This could create a seamless native model experience that knows about (almost) everything you’re doing.