What if the question is "What are the main themes of this work?"
Or anything where the answer isn't 'close', in embedding space, to the words used in the question?
How well does this work versus giving the model the whole text as a prompt?
I assume worse, but I'm not sure how this approach compares to putting the full text in the prompt, or to splitting it into N sections, running the model on each, and then summarizing the results.
Is there any material comparing the different embedding models? I'm working on information retrieval from government documents, and without any ML experience it's daunting.
You pretty much summed up the drawbacks of the embeddings approach. In my experience it's quite hard to extract the relevant parts of a text, especially when the text is uniform.
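To make that failure mode concrete, here's a toy bag-of-words cosine similarity. It's a crude stand-in for a real embedding model (which captures far more semantics), but it illustrates why a "main themes" question retrieves nothing useful: it shares almost no vocabulary with any passage.

```python
import math
from collections import Counter

def embed(text):
    # Toy 'embedding': just word counts (a real model would use a neural encoder).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

passages = [
    "The whale pulled the boat for three days before it tired.",
    "Ahab nailed a gold coin to the mast as a reward.",
]
query = "What are the main themes of this work?"
scores = [cosine(embed(query), embed(p)) for p in passages]
# Every score is low: no single passage is 'close' to the question's words,
# even though the whole text clearly has themes.
```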
I don't think it's as much of a band-aid as it first appears since this roughly mimics how a human would do it.
The problem is that humans have continuous information retrieval and storage, whereas the current crop of embedding systems is static and mostly one-shot.
Humans have limited working memory: we quickly forget short-term memories (unless they're highly significant), and our long-term memory fades selectively if not reactivated or intense.
This weird leaky memory has advantages and disadvantages. Forgetting is useful: it removes garbage.
Machine models could vary the balance of these temporal memory types, apply dropout, etc. We may get some weird behavior.
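A minimal sketch of that leaky-memory idea, with everything (class name, half-life, reactivation boost, forgetting floor) made up for illustration rather than taken from any real system:

```python
class LeakyMemory:
    """Toy memory store: an item's strength halves every `half_life`
    time units, recalling it reactivates (boosts) it, and items that
    fade below `floor` are forgotten entirely."""

    def __init__(self, half_life=60.0, floor=0.05):
        self.half_life = half_life
        self.floor = floor
        self.items = {}  # key -> (strength, last_touched)

    def store(self, key, now, significance=1.0):
        # More 'intense' items start stronger, so they survive longer.
        self.items[key] = (significance, now)

    def recall(self, key, now):
        if key not in self.items:
            return None
        strength, last = self.items[key]
        faded = strength * 0.5 ** ((now - last) / self.half_life)
        if faded < self.floor:
            del self.items[key]  # forgotten: the garbage-removal upside
            return None
        # Reactivation refreshes the memory, per the point above.
        self.items[key] = (min(faded * 2.0, 10.0), now)
        return faded
```

Varying `half_life` per item would give you the short-term/long-term balance in one structure.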
I would guess we will see many innovations in how memory is stored in systems like these.