Hacker News

I think any meaningful context engineering strategies will be trade secrets.


Maybe, but we're getting to a place where each LLM call gets cheaper and faster and has a larger context window, so it may not matter long term.


Context size is often not the only issue. The real issue is attention: context length is one factor in how well the LLM attends to the broad scope of a task, but you can easily observe, anecdotally, the thing forgetting details or going off the rails when only a fraction of the context window is in use. Oftentimes it's effective to just say "don't ever go above 20% of the max."
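That "stay under 20% of the window" heuristic is easy to enforce mechanically; here's a minimal sketch of such a budget guard (the function names and the 0.2 ratio are illustrative choices, not any vendor's API):

```python
def within_budget(prompt_tokens: int, max_context: int, ratio: float = 0.2) -> bool:
    """Return True if the prompt stays under `ratio` of the context window."""
    return prompt_tokens <= int(max_context * ratio)

def trim_to_budget(tokens: list, max_context: int, ratio: float = 0.2) -> list:
    """Keep only the most recent tokens that fit inside the budget."""
    budget = int(max_context * ratio)
    return tokens[-budget:]
```

In practice you'd count tokens with the model's own tokenizer and decide whether to trim, summarize, or refuse when the budget is exceeded.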


Some of that is, or at least was, down to the training: extending the context window but not training on sufficiently long data or using weak evaluation metrics caused issues. More recent models have been getting better, though long context performance is still not as good as short context performance, even if the definition of "short context" has been greatly extended.

RoPE is great and all, but doesn't magically give 100% performance over the lengthened context; that takes more work.
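For context, RoPE just rotates each query/key pair of dimensions by an angle proportional to the token's position, so attention scores end up depending on relative distance. A minimal NumPy sketch of the standard formulation (not any particular model's code):

```python
import numpy as np

def rope(x, pos: int, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embedding to a single query/key vector.
    x: shape (d,) with d even; consecutive pairs (x[2i], x[2i+1])
    are rotated by angle pos * theta_i, theta_i = base**(-2i/d)."""
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    theta = base ** (-np.arange(0, d, 2) / d)   # (d/2,) per-pair frequencies
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin             # 2-D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out
```

The rotation makes `np.dot(rope(q, m), rope(k, n))` depend only on `m - n`, which is what lets the positional scheme extrapolate to longer sequences at all; it doesn't guarantee the model was actually trained to use those distant positions well, which is the parent's point.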


Idk if trade secrets really exist in a world where engineers at every level hop between the same handful of companies every other Monday.


Why do you think that?


Competitive edge. Some agents will be better than others, and therefore worth paying for. If one writes an AI trading agent, for example, there's no reason to share it, just as there isn't today with regular trading algos.

I’m not saying it won’t eventually be known, but not in these initial stages.

The only thing separating Claude, Gemini and ChatGPT is their context and prompt engineering, assuming the frontier models belong to the same class of capability. You can absolutely release a competitor to these things that could perform better for certain things (or even all things, if you introduce brand new context engineering ideas), if you wanted to.


No, I mean why do you think that effective context engineering will remain a black art, instead of becoming something with standard practices that work well for most use cases?


I can’t say it will remain a black art, because the tech itself creates new paradigms constantly. An LLM can be fine-tuned on context engineering examples, similar to Chain-of-Thought tuning, which is how we got reasoning loops. With enough fine-tuning, we could get a similar context loop, in which case those keeping things hidden will be washed away by new paradigms.

Even if someone fine-tuned an LLM with this type of data, DeepSeek has shown that a teacher-student strategy can be used to steal from whatever model you trained (exfiltrating your value-add, which is how they stole from OpenAI). Stealing is already a thing in this space, so don’t be shocked if over time you see a lot more protectionism, something we already see geopolitically on the hardware front.
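The teacher-student strategy mentioned here is ordinary knowledge distillation: train the student on the teacher's softened output distribution rather than on hard labels. A minimal sketch of the classic distillation loss (illustrative only; this is not DeepSeek's actual pipeline):

```python
import numpy as np

def softmax(logits, T: float = 1.0) -> np.ndarray:
    """Temperature-scaled softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the standard distillation objective."""
    p = softmax(teacher_logits, T)              # teacher soft targets
    q = softmax(student_logits, T)              # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)
```

If you can query a model's outputs at scale, you can minimize this loss and pull much of its behavior into your own weights, which is why API access alone doesn't protect the value-add.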

I don’t know what’s going to happen, but I can confidently say that if humans are involved at this stage, there will absolutely be some level of information siloing, and stealing.

——

But to directly answer your question:

“… instead of becoming something with standard practices that work well for most use cases?”

In no uncertain terms, the answer is because of money.


Imagine where we would be if academia or open source had had this train of thought.

No algorithms, no Linux, no open protocols, maybe not even the internet.


Sure, it’s a horrible attitude. With that said, there is a time and place for everything. At the very beginning of AI, which is where we are, it’s not necessarily evil to carve out your advantages and share later.



