Hallucination & Grounding of Generative AI
Generative AI has boomed in recent times. With the advent of foundational models (trained on billions of data points and parameters to learn underlying patterns), such as large language models or LLMs, generative AI has produced remarkably creative results. It's amazing to see how far AI and machine learning have come.
Along with this comes the question of the reliability of these models. Do they always show us the correct information? Can we trust such AI systems with critical tasks? What if the AI system generates biased or harmful content?
Such behaviour of generative AI models is known as hallucination, wherein the model produces irrelevant outputs that are not grounded in its inputs. Research continues on ways to keep models from hallucinating. If not monitored correctly, generative AI models can hallucinate with high confidence. As we talk about AI mimicking the human brain, there are real challenges involved and a clear need for building trustworthy systems.
Ways to Tackle Hallucinations & Ground Them
The term grounding was first introduced by Microsoft. Grounding means providing the most accurate and factual answers possible, tied to the inputs provided.
Fine-Tuning
Fine-tuning an AI model refers to training a pre-trained model further on data from a specific use case. This helps the AI system produce outputs that follow the patterns of, and stay constrained to, that domain. Fine-tuning can improve performance, provided the model is trained well enough to learn the patterns in the custom data. It is also considered a form of transfer learning, since the weights of the pre-trained model are retrained on the new data.
For example, a large language model such as GPT-3 could be fine-tuned for sentiment analysis, or an image model such as the Vision Transformer could be fine-tuned to detect objects such as vehicles for autonomous driving.
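To make this concrete, here is a minimal sketch of what fine-tuning for sentiment analysis might look like with the Hugging Face Transformers library. The model name, dataset and hyperparameters are illustrative assumptions, not a prescribed recipe.

```python
# Minimal fine-tuning sketch (illustrative only): a pre-trained DistilBERT is
# retrained on a small slice of a sentiment dataset. Names and settings are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")                        # example sentiment dataset
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)          # pre-trained weights, new head

args = TrainingArguments(
    output_dir="sentiment-finetune",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()   # the pre-trained weights are updated on the custom data
```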
There are limitations to this process, as it is time-consuming and expensive. Even after being fine-tuned, the model may still not do well at preventing hallucinations.
Prompt Engineering
Prompt engineering, as the name suggests, is the process of giving the generative AI model instructions about what task needs to be performed. A prompt spells out what content is to be generated and, often, how it should be produced. This piece of information can be highly helpful in guiding the AI system toward exactly what to look for. Different prompt engineering techniques are being explored, and prompts can take the form of text, images, speech and more. The most talked-about example these days is how OpenAI's ChatGPT works: the better the prompts, the more accurate the results.
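As a small illustration of the idea, the snippet below builds a prompt that tells the model its role, hands it the only context it may use, and states what to do when the answer is not there. The wording, context and question are made up for the example.

```python
# Illustrative prompt template (the wording and fields are assumptions, not a fixed recipe).
# The idea: state the model's role, supply the context it may use, and say explicitly
# what to do when the answer is not in that context.
context = "Acme Corp's refund window is 30 days from the delivery date."
question = "Can I return a product after six weeks?"

prompt = f"""You are a customer-support assistant.
Answer ONLY using the context below.
If the context does not contain the answer, reply "I don't know."

Context:
{context}

Question: {question}
Answer:"""

print(prompt)  # this string would then be sent to the LLM of your choice
```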
Similarly, for image generation models, StabilityAI's Stable Diffusion offers negative prompts, which can help impose some constraints on what gets generated.
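Below is a rough sketch of how a negative prompt might be passed to Stable Diffusion through the diffusers library; the model id and prompt strings are assumptions chosen for illustration.

```python
# Sketch of a negative prompt with Stable Diffusion via the diffusers library.
# The model id and prompt text are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

image = pipe(
    prompt="a watercolor painting of a mountain village at sunrise",
    negative_prompt="blurry, low quality, extra limbs, text, watermark",
).images[0]

image.save("village.png")
```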
Nevertheless, this method has its limitations in restraining the model from hallucinating. Prompt engineering cannot by itself ensure that the system is fully informed about the domain data the model needs.
Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG), as the name suggests, combines two components: a retrieval-based component and a generative one. The retrieval-based component uses a search algorithm to fetch relevant information from pre-existing data and knowledge bases, based on the query or prompt provided. This information is then fed into the next layer, the generative component, which could be an RNN, a transformer or a GPT-style model that generates content conditioned on the retrieved information, effectively summarising the inputs received from the first layer. This can be particularly useful in cases where the AI system itself has limited or incomplete information. RAG models appeared a few years ago and have been used in conversational bots, image captioning, and other language and content generation tasks.
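The sketch below shows the retrieve-then-generate flow in miniature: a toy TF-IDF retriever pulls the most relevant snippets, and a stub stands in for the generative component. The knowledge snippets and the generate() placeholder are assumptions; a production RAG system would typically use dense embeddings and a real LLM.

```python
# Minimal retrieve-then-generate sketch. The knowledge snippets, the TF-IDF
# retriever, and the generate() stub are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "The warranty on the X200 laptop covers hardware faults for 24 months.",
    "Battery replacements for the X200 are free within the first 12 months.",
    "Software issues are handled by the online support portal, not the warranty.",
]

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (the retrieval step)."""
    vectorizer = TfidfVectorizer().fit(docs + [query])
    doc_vecs = vectorizer.transform(docs)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_vecs)[0]
    return [docs[i] for i in scores.argsort()[::-1][:k]]

def generate(query, passages):
    """Placeholder for the generative step: an LLM would summarise the passages."""
    context = "\n".join(passages)
    return f"Prompt sent to the generator:\nContext:\n{context}\nQuestion: {query}"

query = "How long is the laptop warranty?"
print(generate(query, retrieve(query, knowledge_base)))
```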
Even with this process, it is hard to say whether the AI system will stay grounded and not hallucinate, since it depends on pre-existing data that may itself carry bias. One way to mitigate this is human evaluation, which we discuss in the next section.
Reinforcement Learning
We’re all aware of how reinforcement learning works through trial and error. For every correct action the model is rewarded, and for wrong actions it is penalised. In this way the model learns when and which actions to take. This is commonly how most game-playing agents are designed.
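As a toy illustration of that reward/penalty loop (not tied to any particular generative model), the snippet below has an epsilon-greedy agent learn which of three simulated actions is rewarded most often. The reward probabilities are made up for the example.

```python
# Toy illustration of learning from reward and punishment: an epsilon-greedy
# agent learns which of three actions pays off most often. Rewards are simulated.
import random

reward_prob = [0.2, 0.5, 0.8]          # hidden "correctness" of each action
values = [0.0, 0.0, 0.0]               # the agent's running estimate per action
counts = [0, 0, 0]

for step in range(1000):
    if random.random() < 0.1:                       # explore occasionally
        action = random.randrange(3)
    else:                                           # otherwise exploit the best guess
        action = values.index(max(values))
    reward = 1 if random.random() < reward_prob[action] else -1   # reward or punishment
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # incremental average

print("Learned action values:", [round(v, 2) for v in values])
```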
In generative AI, reinforcement learning can be quite useful for avoiding hallucinations. This can be done in two ways: one is bot-based (adversarial networks) and the other uses human feedback, or human-in-the-loop.
Adversarial networks are built from generator and discriminator models, as in GANs (generative adversarial networks). The generator keeps producing synthetic data until the discriminator can no longer distinguish the synthetic data from real-world data. If properly trained, adversarial networks could help restrict models from hallucinating and keep the AI system from straying from the required outputs.
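Here is a minimal, assumption-laden sketch of that generator/discriminator loop in PyTorch on toy one-dimensional data; the network sizes and hyperparameters are arbitrary choices for illustration.

```python
# Minimal GAN sketch on toy 1-D data: the generator learns to mimic samples from
# N(4, 1.25) while the discriminator tries to tell real from fake.
# Architecture sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 1.25 + 4.0          # "real-world" samples
    noise = torch.randn(64, 8)
    fake = generator(noise)

    # Discriminator step: label real samples as 1, generated samples as 0.
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: try to make the discriminator call fakes real.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print("Generated sample mean:", generator(torch.randn(1000, 8)).mean().item())
```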
RLHF, or reinforcement learning from human feedback, is not new and has been used before. In one setup, the system keeps generating multiple variations of the content and presents them for human evaluation. Humans continuously evaluate the variations based on factors such as coherence, relevance, and quality, and feed their judgements back to the AI system. The AI system then uses this feedback to adjust its generation strategy and learns to produce new variations that are closer to the desired outcome. Though this method is expensive and time-consuming, it can deliver better results. It has its own challenges, and how far human evaluators can keep up is another story. For more information, check out Chip Huyen’s detailed blog on the topic.
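One core ingredient of RLHF is a reward model trained on human preference pairs. The sketch below trains such a model on made-up embedding vectors with a standard pairwise preference loss; the data, dimensions and architecture are all illustrative assumptions. In a full pipeline, this reward model would then steer the generator through a policy-optimisation step such as PPO.

```python
# Sketch of a reward model learned from human preference pairs. Random vectors
# stand in for response embeddings; the pairwise loss pushes the "preferred"
# response to score higher than the "rejected" one. All values are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

reward_model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Pretend these are embeddings of (preferred, rejected) response pairs
# chosen by human annotators.
preferred = torch.randn(256, 32) + 0.5
rejected = torch.randn(256, 32) - 0.5

for epoch in range(200):
    r_pref = reward_model(preferred)
    r_rej = reward_model(rejected)
    # Pairwise loss: maximise the score margin between preferred and rejected.
    loss = -F.logsigmoid(r_pref - r_rej).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()

gap = (reward_model(preferred) - reward_model(rejected)).mean().item()
print("Mean reward gap after training:", round(gap, 3))
```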
Summing Up
All of these techniques can be combined and mixed-and-matched to get better results. Fine-tuning along with prompt engineering can enhance results for certain use cases, and adding reinforcement learning with human feedback can help handle certain edge cases.
Results also depend on factors such as dataset size, compute power (which such massive, data-hungry models usually need a lot of), and how well the model can learn patterns in the input data. Training and building such systems can scale to unforeseen capabilities. This article is inspired by, and elaborates on, my recent learnings and the podcast below.