The document discusses Retrieval Augmented Generation (RAG) as a method to enhance the performance of large language models (LLMs) by integrating a dynamic knowledge base. RAG involves a retrieval step that extracts relevant information based on user prompts before passing it to the LLM, thereby addressing limitations in the model's static knowledge. The process utilizes text embeddings to rank the relevance of items in the knowledge base to improve the response quality of LLMs.

Uploaded by

Satyam Sangwan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views9 pages

Gen AI

The document discusses Retrieval Augmented Generation (RAG) as a method to enhance the performance of large language models (LLMs) by integrating a dynamic knowledge base. RAG involves a retrieval step that extracts relevant information based on user prompts before passing it to the LLM, thereby addressing limitations in the model's static knowledge. The process utilizes text embeddings to rank the relevance of items in the knowledge base to improve the response quality of LLMs.

Uploaded by

Satyam Sangwan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

LLM Fine-Tuning

IMPROVE LLM PERFORMANCE USING RETRIEVAL AUGMENTED GENERATION

Satyam Sangwan

July 10th, 2024


What is Retrieval Augmented Generation
An LLM's knowledge is static.

LLMs may have an insufficient "understanding" of niche and specialised information that was not prominent in their training data.

One way we can address these limitations is to augment a model with a specialised and mutable knowledge base.

RAG does not fundamentally change how we use an LLM; it's still prompt-in and response-out.
What is RAG?
RAG works by adding a step to this basic process of prompt and response.

A retrieval step is performed where, based on the user's prompt, relevant information is extracted from an external knowledge base and injected into the prompt before being passed to the LLM.

This makes RAG a flexible and (relatively) straightforward way to improve LLM-based systems.
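The retrieve-inject-generate flow described above can be sketched in a few lines. The retriever and LLM below are toy stand-ins (assumptions for illustration), not a real retrieval model or LLM API:

```python
# Minimal sketch of the RAG flow: retrieve relevant context, inject it into
# the prompt, then call the LLM. Prompt-in, response-out is unchanged.

def retrieve(prompt: str, knowledge_base: list[str], top_k: int = 1) -> list[str]:
    """Toy retriever: rank knowledge-base items by word overlap with the prompt."""
    prompt_words = set(prompt.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda item: len(prompt_words & set(item.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_augmented_prompt(prompt: str, context: list[str]) -> str:
    """Inject the retrieved context into the prompt before the LLM call."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + prompt

def llm(augmented_prompt: str) -> str:
    """Stand-in for a real LLM call; here it simply echoes the context line."""
    return augmented_prompt.splitlines()[1]

def answer(prompt: str, knowledge_base: list[str]) -> str:
    context = retrieve(prompt, knowledge_base)
    augmented = build_augmented_prompt(prompt, context)
    return llm(augmented)  # still prompt-in, response-out

kb = ["RAG adds a retrieval step before the LLM call",
      "Fine-tuning updates model weights"]
print(answer("What step does RAG add?", kb))
```

A real system would swap the word-overlap retriever for embedding-based similarity search and the echo function for an actual LLM call; the shape of the pipeline stays the same.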
How it works
A RAG system has two components: a retriever and a knowledge base.

A retriever takes a user prompt and returns relevant items from the knowledge base. This typically works using so-called text embeddings: numerical representations of text in concept space.

Text embeddings allow us to compute a similarity score between the user's query and each item in the knowledge base.

The result of this process is a ranking of each item's relevance to the input query.
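The similarity-then-ranking step can be sketched with cosine similarity over embedding vectors. The tiny hand-made "embeddings" and item names below are assumptions for illustration; a real system would obtain vectors from an embedding model:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical 3-d embeddings for a query and three knowledge-base items.
query_vec = [0.9, 0.1, 0.0]
kb_vecs = {
    "refund policy":  [0.8, 0.2, 0.1],
    "shipping times": [0.1, 0.9, 0.2],
    "store hours":    [0.0, 0.2, 0.9],
}

# Score every item against the query, then rank (most relevant first).
ranking = sorted(kb_vecs, key=lambda k: cosine(query_vec, kb_vecs[k]), reverse=True)
print(ranking)
```

The output of this step is exactly the relevance ranking described above: the item whose vector points in nearly the same direction as the query vector comes first.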
What are Text Embeddings
Text embeddings represent pieces of text as numerical vectors in a shared concept space, so that semantically similar texts map to nearby points.
How it works
Knowledge base

This houses all the information you want to make available to the LLM.

The process can be broken down into four key steps:


References
https://www.shawhintalebi.com/
Thank you
