Well, that's not actually how it works - they're just getting a model (WikiSP & EntityLinker) to write a query that returns the fact from Wikidata. Did you read the post or just the headline?
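Concretely, the heavy lifting is just a query against Wikidata, something along these lines - my own toy SPARQL, not necessarily what WikiSP actually emits:

    import requests

    # Hypothetical example: look up the filming locations of "A Bronx Tale"
    # from the public Wikidata SPARQL endpoint. wdt:P915 is Wikidata's
    # "filming location" property.
    SPARQL = """
    SELECT ?locationLabel WHERE {
      ?film rdfs:label "A Bronx Tale"@en ;
            wdt:P915 ?location .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    """

    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": SPARQL, "format": "json"},
        headers={"User-Agent": "wikidata-demo/0.1"},
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["locationLabel"]["value"])  # e.g. "New Jersey", "New York"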
Besides, let's not forget that humans are also trained on language data, and although humans can also be wrong, a human who had memorised all of Wikidata (by reading sentences/facts as 'training data') would be pretty good at a pub quiz.
Also, we obviously can't see inside how OpenAI trains GPT, but I wouldn't be surprised if higher-authority sources (e.g. Wikidata) are given a higher weight in the training data, or if sources such as Wikidata are used with reinforcement learning to ensure that answers covered by the dataset are 'correctly' answered without hallucination.
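Purely to illustrate what "higher weight in the training data" could mean mechanically, here's a toy per-source loss weighting in PyTorch - the source tags and weights are made up, and nothing here reflects OpenAI's actual pipeline:

    import torch
    import torch.nn.functional as F

    # Toy illustration only: samples tagged with a higher-authority source
    # contribute more to the language-model loss.
    SOURCE_WEIGHT = {"wikidata": 2.0, "web_crawl": 1.0}  # made-up numbers

    def weighted_lm_loss(logits, targets, sources):
        # logits: (batch, seq, vocab), targets: (batch, seq)
        per_token = F.cross_entropy(
            logits.transpose(1, 2), targets, reduction="none"
        )                                      # (batch, seq)
        per_example = per_token.mean(dim=1)    # (batch,)
        weights = torch.tensor(
            [SOURCE_WEIGHT[s] for s in sources], device=per_example.device
        )
        return (weights * per_example).mean()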
Ah, I did misunderstand how it worked, thanks -- I was looking at the flow chart and just focusing on the part that said "From Wikidata, the filming location of 'A Bronx Tale' includes New Jersey and New York", which had an arrow feeding it into GPT-3...
I'm not really sure how useful something this simple is, then. If it's not actually improving the factual accuracy of the model itself, it's really just a hack that makes the whole system even harder to reason about.
> For more complex and knowledge-intensive tasks, it's possible to build a language model-based system that accesses external knowledge sources to complete tasks. This enables more factual consistency, improves reliability of the generated responses, and helps to mitigate the problem of "hallucination".
> Meta AI researchers introduced a method called Retrieval Augmented Generation (RAG) to address such knowledge-intensive tasks. RAG combines an information retrieval component with a text generator model. RAG can be fine-tuned and its internal knowledge can be modified in an efficient manner and without needing retraining of the entire model.
> RAG takes an input and retrieves a set of relevant/supporting documents given a source (e.g., Wikipedia). The documents are concatenated as context with the original input prompt and fed to the text generator which produces the final output. This makes RAG adaptive for situations where facts could evolve over time. This is very useful as LLMs' parametric knowledge is static.
> RAG allows language models to bypass retraining, enabling access to the latest information for generating reliable outputs via retrieval-based generation.
> Lewis et al., (2021) proposed a general-purpose fine-tuning recipe for RAG. A pre-trained seq2seq model is used as the parametric memory and a dense vector index of Wikipedia is used as non-parametric memory (accessed using a neural pre-trained retriever). [...]
> RAG performs strongly on several benchmarks such as Natural Questions, WebQuestions, and CuratedTrec. RAG generates responses that are more factual, specific, and diverse when tested on MS-MARCO and Jeopardy questions. RAG also improves results on FEVER fact verification.
> This shows the potential of RAG as a viable option for enhancing outputs of language models in knowledge-intensive tasks.
So, via methods like these, I think having ground-truth facts somewhere in the process should improve accuracy.
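As a rough sketch of that retrieve-then-generate loop (the retriever, generator, and prompt format are placeholders, not the Lewis et al. implementation):

    # Minimal retrieve-then-generate sketch, assuming some vector index over
    # Wikipedia passages and an LLM completion function already exist.
    # `search_index` and `generate` are placeholders, not a real library API.

    def rag_answer(question, search_index, generate, k=5):
        # 1. Retrieve the k most relevant passages for the question.
        passages = search_index(question, top_k=k)
        # 2. Concatenate them with the original prompt as grounding context.
        context = "\n\n".join(passages)
        prompt = (
            f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        )
        # 3. Let the generator produce the final, hopefully grounded, answer.
        return generate(prompt)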
In this context, these are more expert systems than LLMs, and as you enumerate, they can work well if they're built carefully. For example, Google surfaces search engine results directly. This is similar, but more powerful, because the Wikimedia Foundation can actually improve results, fill gaps, and tune overall performance, while Google DGAF.
I would expect that as the tide rises on this tech, self-hosting training and serving prompts becomes easier. For Wikimedia, it'll just be another cluster and data pipeline system at their datacenter.