This is much more concise than my usual attempts to explain why LLMs don’t “know” things. I’ll be stealing it. Maybe with a different example corpus, lol.
I actually fashioned this logic out of the philosophy question of why certain neural firings appear as sound in our brain while others appear as vision. What gives?
iirc there were some experiments where they rewired the optic nerve and inner ear in mice to route (so to speak) to different areas of the brain (different cortical destinations, I think), and the higher-level biological structures of those areas were built up accordingly (regular visual-cortex-like neural structures for visual data, etc.). I think it was done on very young baby mice or some such (classic creepy stuff; I don't remember which decade; Connectionism researchers).
This does not answer the good general abstract question of "how is semantics possible through relative context / relations to other terms only?", but it speaks to how different modalities of information (e.g. visual data vs. sound data) are represented, modelled, and processed using different neural structures, which presumably encode different aspects of the information (e.g. the obvious layman's guess: temporality / relation across the time axis is much more important for sound data).
In the case of a person, the external sensory data provides the grounding. Consider a prisoner who has spent a long time in the hole in his cell: he starts hallucinating because there is no sensory information to ground his neuronal firings.
Philosophically speaking, sensory data is no more "external" or grounded than words are. You do not see - your eyes see. You do not hear - your ears hear. You cannot truly interact with the world except at a remove, through various organs. You never perceive the world "as it really is". Your brain attempts to build a consistent model that explains all your sensory input - including words. Where words and sensory input disagree, words can even win - I can tell you that you are in VR, or dreaming, or that immigrants are the cause of all your problems, and if I am careful and charismatic you may believe me...
tl;dr what you think of as "grounding" is just yet more relative context...