
Do we know for a fact that the mechanisms are actually used that way inside the model?

My understanding was that they know how the model was designed to work, but that there's been very little (no?) progress on the black-box problem, so we really don't know much at all about what actually happens internally.

Without a better understanding of what actually happens when an LLM generates an answer, I stick with the most basic answer: it's simply predicting what a human would say. I could be wildly misinformed here; I don't work directly in the space, and it's been moving faster than I'm interested in keeping up with.
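
For concreteness, this is roughly all that "predicting the next token" means at the interface level. A minimal sketch using the Hugging Face transformers library, with gpt2 as a stand-in for any causal LM:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    for _ in range(5):
        with torch.no_grad():
            logits = model(ids).logits        # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()      # greedy: take the single most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    print(tok.decode(ids[0]))

Everything opaque happens inside the single call that produces the logits; what computation goes on in there is exactly the black-box question.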


