IMPROVE LLM PERFORMANCE USING
RETRIEVAL AUGMENTED GENERATION
Satyam Sangwan
July 10th, 2024
What is Retrieval Augmented Generation?
An LLM’s knowledge is static
LLMs may have an insufficient “understanding” of niche and specialised
information that was not prominent in their training data.
One way we can mitigate these limitations is to augment a model with a specialised
and mutable knowledge base.
RAG does not fundamentally change how we use an LLM; it's still prompt-in and
response-out.
What is RAG?
RAG adds a retrieval step to this basic prompt-and-response process: based on the user's prompt, relevant information is retrieved from an external knowledge base and injected into the prompt before being passed to the LLM.
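A minimal sketch of this injection step, assuming a hypothetical retrieve_context helper and an illustrative prompt template (neither is from the original deck):

```python
# Minimal sketch of the RAG prompt-augmentation step.
# `retrieve_context` is a hypothetical placeholder standing in for the
# retriever described later in this deck.

def retrieve_context(query: str, knowledge_base: list[str], top_k: int = 3) -> list[str]:
    """Placeholder retriever: return the top_k most relevant items.
    A real system would rank by embedding similarity (see below)."""
    return knowledge_base[:top_k]

def build_augmented_prompt(query: str, knowledge_base: list[str]) -> str:
    context = "\n".join(retrieve_context(query, knowledge_base))
    # Retrieved passages are injected into the prompt before it is
    # passed to the LLM; this template wording is an assumption.
    return (
        "Use the following context to answer the question.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```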
RAG is a flexible and (relatively) straightforward way to improve LLM-based systems.
How it works
A RAG system has two core components: a retriever and a knowledge base.
A retriever takes a user prompt and returns relevant items from a knowledge base. This typically works using so-called text embeddings: numerical representations of text in concept space.
With text embeddings, we can compute a similarity score between the user's query and each item in the knowledge base. The result of this process is a ranking of each item's relevance to the input query.
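One way to sketch such a retriever, assuming the open-source sentence-transformers library and the all-MiniLM-L6-v2 model (both illustrative choices, not prescribed by this deck), with made-up knowledge-base snippets:

```python
# Sketch of an embedding-based retriever: embed the query and every
# knowledge-base item, then rank items by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

# Toy knowledge base; contents are made up for this example.
knowledge_base = [
    "RAG augments an LLM with an external knowledge base.",
    "Text embeddings map text to vectors in concept space.",
    "Fine-tuning updates a model's weights on new data.",
]

query = "How does RAG give an LLM new knowledge?"

# Embed the query and every knowledge-base item.
kb_embeddings = model.encode(knowledge_base)
query_embedding = model.encode(query)

# Cosine similarity between the query and each item yields the ranking.
scores = util.cos_sim(query_embedding, kb_embeddings)[0]
ranked = sorted(zip(scores.tolist(), knowledge_base), reverse=True)
for score, item in ranked:
    print(f"{score:.3f}  {item}")
```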
What are Text Embeddings?
Text embeddings are numerical representations of text in concept space: semantically similar pieces of text map to nearby vectors.
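A toy illustration of the idea, using made-up three-dimensional vectors (real embeddings typically have hundreds of dimensions):

```python
# Toy illustration: embeddings are just vectors, and closeness in
# "concept space" is measured with cosine similarity.
# These 3-dimensional vectors are made up for illustration.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = np.array([0.9, 0.1, 0.3])       # hypothetical embedding of "cat"
kitten = np.array([0.85, 0.15, 0.35])  # hypothetical embedding of "kitten"
car = np.array([0.1, 0.9, 0.2])        # hypothetical embedding of "car"

print(cosine_similarity(cat, kitten))  # close to 1: similar concepts
print(cosine_similarity(cat, car))     # smaller: dissimilar concepts
```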
How it works
Knowledge base
This houses all the information you want to make available to the LLM.
Building it can be broken down into four key steps:
1. Load your source documents.
2. Chunk the documents into pieces small enough to inject into a prompt.
3. Embed each chunk with a text embedding model.
4. Store the chunks and their embeddings in a vector database.
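A minimal sketch of these four steps, reusing the sentence-transformers model from above; the file name, chunk size, and in-memory store are assumptions, with a plain list standing in for a real vector database:

```python
# Sketch of the four knowledge-base steps: load, chunk, embed, store.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

# 1. Load the source documents (file name is an assumption).
with open("docs.txt", encoding="utf-8") as f:
    document = f.read()

# 2. Chunk them into pieces small enough to inject into a prompt.
chunk_size = 500  # characters; a real pipeline would usually chunk by tokens
chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]

# 3. Embed each chunk into concept space.
embeddings = model.encode(chunks)

# 4. Store chunk/embedding pairs (here a plain list stands in for a vector DB).
vector_store = list(zip(chunks, embeddings))
```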
References
https://www.shawhintalebi.com/
Thank you