ChatGPT is not up to date unless you start using the plugins. This sort of indexing is based on vector databases and intermediate prompting steps. If you want to get technical, the academic term is "Retrieval-Augmented Generation" (RAG).
Hallucination is unfortunately inevitable when it comes to any autoregressive model, even with RAG. You can minimize hallucination by prompting, but you'll still see some factually incorrect responses here and there (https://zilliz.com/blog/ChatGPT-VectorDB-Prompt-as-code).
I unfortunately don't think we'll be able to solve hallucination anytime soon. Maybe with the successor to the transformer architecture?
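For anyone curious what the retrieval step looks like in practice, here's a toy sketch: a bag-of-words "embedding" and an in-memory list standing in for the vector database. Everything here (the `embed`/`retrieve` helpers, the sample documents) is illustrative; a real RAG setup would use a learned embedding model and an actual vector database.

```python
import math

def embed(text, vocab):
    # Toy "embedding": word counts over a fixed vocabulary.
    # A real system would call an embedding model instead.
    words = [w.strip("?.,") for w in text.lower().split()]
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# "Vector database": documents stored alongside their vectors.
docs = [
    "Amazon CodeCatalyst is a unified software development service",
    "CloudFormation templates describe AWS infrastructure as code",
]
vocab = sorted({w for d in docs for w in d.lower().split()})
index = [(d, embed(d, vocab)) for d in docs]

def retrieve(query, k=1):
    # Rank stored documents by similarity to the query vector.
    qv = embed(query, vocab)
    ranked = sorted(index, key=lambda it: cosine(qv, it[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

# Retrieval-augmented prompt: prepend the best-matching document
# as context so the model answers from it rather than from memory.
question = "What is Amazon CodeCatalyst?"
context = retrieve(question)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
```

The point is only that the LLM sees fresh text at prompt time; it doesn't stop the model from hallucinating around that context.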
Hallucination is naturally a concern for anyone looking to depend upon LLM-generated answers.
We’ve been testing LLM responses with a CLI that generates accuracy statistics, which is especially useful when the use-case Q/A set is limited.
If a ‘confidence’ score can be returned to the user, they at least have an indication that a given response carries a higher quality risk.
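A minimal sketch of that kind of evaluation loop, assuming keyword-based grading against a small Q/A set (the `grade`/`evaluate` helpers, the threshold, and the stubbed answer function are all hypothetical, not from any particular CLI):

```python
def grade(answer, expected_keywords):
    """Fraction of expected keywords present in the answer."""
    answer = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in answer)
    return hits / len(expected_keywords)

def evaluate(cases, answer_fn, threshold=0.5):
    # Score every Q/A case and flag low-confidence answers so the
    # quality risk can be surfaced to the user.
    results = []
    for question, keywords in cases:
        score = grade(answer_fn(question), keywords)
        results.append({
            "question": question,
            "score": score,
            "low_confidence": score < threshold,
        })
    accuracy = sum(r["score"] for r in results) / len(results)
    return accuracy, results

# Stubbed answer function standing in for the actual LLM call.
cases = [
    ("What is Amazon CodeCatalyst?", ["software development", "service"]),
]
stub = lambda q: "Amazon CodeCatalyst is a unified software development service."
accuracy, report = evaluate(cases, stub)
```

Keyword matching is crude; the same loop works with any grading function, e.g. one that asks a second model to judge the answer.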
Here's one example question that ChatGPT utterly fails at, but that this answers fine: "What is Amazon CodeCatalyst?"
ChatGPT: "I'm sorry, but as of my knowledge cut-off in September 2021, there was no service, tool, or product known as Amazon CodeCatalyst offered by Amazon Web Services (AWS). [...]"
You can ask the same questions to ChatGPT and get the same or better answers.
I also know from personal experience that you can use ChatGPT to:
- convert Python/boto3 to any language that has an AWS SDK
- convert CloudFormation to Terraform or the CDK
- write scripts that use the SDK
You will get the occasional hallucination.