Advanced-Q&A-and-RAG-Series
• RAG | Q&A with various databases
• SQLDB, VectorDB, GraphDB
Objectives:
• RAG: Text Embedding + Vector Search
• Q&A: LLM Agent querying the database (SQL, Cypher, etc.)
Projects
• Q&A-and-RAG-with-SQL-and-TabularData
• KnowledgeGraph-Q&A-and-RAG-with-TabularData
• KnowledgeGraph-Q&A-and-RAG-with-Text
RAG with Vector DB:
Question -> Embedding -> Vector search -> Retrieval -> LLM -> Answer
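The RAG flow above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the chunks are made-up sample text, and the final LLM call is left out so that only the retrieval and prompt-building steps are shown.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Vector search: rank chunks by similarity to the question embedding."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into the text sent to the LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Made-up sample chunks, for illustration only.
chunks = [
    "the titanic sank in 1912",
    "neo4j is a graph database",
    "sql databases store tables",
]
question = "what is a graph database"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In a real pipeline the `embed` function would call an embedding model and the chunks would live in a vector database; the ranking step stays conceptually the same.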
Q&A with SQL DB:
Reference:
https://python.langchain.com/docs/use_cases/sql/
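As a rough sketch of the Q&A idea (an LLM agent turns a question into SQL and runs it), the example below uses the standard-library `sqlite3` module. The table, the extra sample rows, and `stub_llm_generate_sql` are assumptions for illustration; a real agent would send the question and schema to an LLM instead of returning a fixed query.

```python
import sqlite3


def get_schema(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements so the LLM can see the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(r[0] for r in rows)


def stub_llm_generate_sql(question: str, schema: str) -> str:
    """Hypothetical stand-in for the LLM call that writes the SQL query."""
    return "SELECT COUNT(*) FROM passengers WHERE Survived = 1"


def run_query(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Run generated SQL, refusing anything that is not a SELECT."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE passengers (Name TEXT, Pclass INTEGER, Survived INTEGER)"
)
# Illustrative rows; only the first mirrors the example row in the slides.
conn.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Mr. Owen", 3, 0), ("Passenger B", 1, 1), ("Passenger C", 2, 1)],
)

schema = get_schema(conn)
sql = stub_llm_generate_sql("How many passengers survived?", schema)
result = run_query(conn, sql)
```

The SELECT-only check is one simple guard against destructive queries; production agents usually add read-only database credentials as well.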
RAG on Tabular Data
• Option 1: Treat each row as one chunk of the vectorDB
Str(
Survived: 0
Pclass: 3
Name: Mr. Owen
…
…
)
• Option 2: Perform vector search on the embeddings of one specific column
Ex: Column name "description": values (str)
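Both strategies come down to deciding which text gets embedded. A minimal sketch, assuming rows are plain Python dicts (the helper names `row_to_chunk` and `column_chunks` are hypothetical):

```python
def row_to_chunk(row: dict) -> str:
    """Strategy 1: serialize a whole row into one text chunk."""
    return "\n".join(f"{col}: {val}" for col, val in row.items())


def column_chunks(rows: list[dict], column: str) -> list[str]:
    """Strategy 2: embed only the values of one text column."""
    return [str(r[column]) for r in rows if r.get(column) not in (None, "")]


row = {"Survived": 0, "Pclass": 3, "Name": "Mr. Owen"}
chunk = row_to_chunk(row)

# Illustrative rows with a free-text "description" column.
rows = [
    {"id": 1, "description": "first sample description"},
    {"id": 2, "description": ""},
]
descriptions = column_chunks(rows, "description")
```

Strategy 1 keeps every field searchable, while strategy 2 keeps the embeddings focused on the one column that carries free text.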
Q&A-and-RAG-with-SQL-and-TabularData
• Red: Data preparation pipeline for SQL DB. (Q&A)
• Green: Data preparation pipeline for VectorDB. (RAG)
• Yellow: Chat pipeline for interacting with the SQL agent and SQL database. (Q&A)
• Blue: Chat pipeline for interacting with the Embedding model, LLM, and VectorDB. (RAG)
Content
1. Why knowledge graph
2. Series schema
3. Knowledge graph fundamentals
4. How to construct knowledge graph
5. ChatBot schema
6. Knowledge graph agent
7. RAG with knowledge graph
8. Part 1: Constructing knowledge graph with movie dataset
a) Code walk through (Notebooks)
I. Preparing GraphDB
II. Data preparation
III. Populating the graph DB
IV. Q&A with Graph DB
V. RAG with Graph DB
VI. Chatbot design and test
9. Part 2: Microsoft project – medical chatbot with unstructured data
a) Code walk through
Why Knowledge Graph
•Pros:
• Suitable for both structured and unstructured data
• Domain-Specific Applications: Very powerful for scenarios where the desired knowledge graph is known
• Explainability and Traceability: Lets us ask questions whose answers combine multiple knowledge points within a single database, or even across multiple databases, which normal RAG and direct Q&A cannot do.
•Use cases:
• Chatbot for unstructured medical reports (from multiple doctors or hospitals)
• Chatbot for accessing complex relationships between databases
•Cons:
• Not as generic as the conventional RAG approaches
• Requires more technical knowledge
• Slower implementation
• Not as mature as RAG
•Deciding whether to use a KG or conventional RAG depends on your data's characteristics, your application's specific needs, and considerations such as scalability, flexibility, and explainability.
•You can also consider using a combination of both if it suits your use case.
LLM Model Matters!
• Cypher query knowledge
• Context length
Part 1: Knowledge Graph for Movie Dataset
Part 2: Medical Chatbot Using Unstructured Data (Microsoft project)
Knowledge Graph: Fundamentals
Knowledge graph: A database that stores information in nodes and relationships
Node properties: key (e.g. name) + value (e.g. Farzad)
Nodes can have labels that group them together.
Relationships have direction, type, and properties
A relationship connects two nodes
Example (figure): two Person nodes (Name: Farzad; Name: Mike) and a YouTube Video node, with the properties Period: Weekly, Year: 2024, Topic: AI projects
Patterns:
(Person)-[Produces]->(YouTube Video)
(Person)-[Produces]->(YouTube Video)<-[Watches]-(Person)
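To make the vocabulary concrete, here is a small in-memory model of nodes and relationships using Python dataclasses. It is only a sketch of the data structure, not how a graph database stores data, and the placement of the Period, Year, and Topic properties is an assumption, since the slide figure does not survive in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                       # groups similar nodes
    properties: dict = field(default_factory=dict)   # key/value pairs


@dataclass
class Relationship:
    start: Node                                      # direction: start -> end
    type: str
    end: Node
    properties: dict = field(default_factory=dict)

    def pattern(self) -> str:
        """Render the relationship in the pattern notation used above."""
        return f"({self.start.label})-[{self.type}]->({self.end.label})"


farzad = Node("Person", {"Name": "Farzad"})
mike = Node("Person", {"Name": "Mike"})
# Assumed placement: Year and Topic on the video, Period on the relationship.
video = Node("YouTube Video", {"Year": 2024, "Topic": "AI projects"})

produces = Relationship(farzad, "Produces", video, {"Period": "Weekly"})
watches = Relationship(mike, "Watches", video)
```

Each relationship has exactly one start node and one end node, which is what gives it a direction.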
How to construct the knowledge graph?
1. Domain expert implementation
Pros: Consistent graph structure
Cons: Not easy to implement
2. Hybrid: Use an LLM's knowledge (e.g. GPT 4) to guide you in constructing the knowledge graph
Pros:
- Consistent graph structure.
- Less expertise is needed when collaborating with the LLM.
Cons: Still requires some basic knowledge of coding and graph knowledge
3. LLM (e.g. LLMGraphTransformer)
Pros: Easy to use
Cons: Inconsistent graph structure (due to nondeterministic behavior of the LLMs)
KnowledgeGraph-Q&A-and-RAG-with-TabularData
• Green: Data preparation pipeline for GraphDB. (Q&A and RAG)
• Yellow: Chat pipeline for interacting with the graph agent and GraphDB. (Q&A)
• Blue: Chat pipeline for interacting with the Embedding model, LLM, and GraphDB. (RAG)
Q&A on Graph DB using Graph Agent
RAG with Vector DB: Question -> Embedding -> Vector search -> Retrieval -> LLM -> Answer
Q&A with SQL DB
Knowledge Graph Agent: Q&A
https://python.langchain.com/docs/use_cases/graph/quickstart/
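A graph agent follows the same shape as the SQL agent: give the LLM the graph schema, let it write a Cypher query, then check the query before running it. The sketch below covers only the prompt-building and checking steps. `stub_llm` is a hypothetical stand-in for the model call, the relationship types are taken from the movie schema in Part 1, and executing the query against a graph database is left out.

```python
import re

# Relationship types assumed from the Part 1 movie schema.
ALLOWED_RELATIONSHIPS = {"IN_GENRE", "DIRECTED", "ACTED_IN"}


def build_cypher_prompt(question: str, schema: str) -> str:
    """Give the LLM the graph schema so it can write a matching query."""
    return (
        "Translate the question into one Cypher query.\n"
        f"Graph schema:\n{schema}\n"
        f"Question: {question}\nCypher:"
    )


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call."""
    return "MATCH (p:Person)-[:DIRECTED]->(m:Movie) RETURN p.name, m.title LIMIT 5"


def relationship_types(cypher: str) -> set[str]:
    """Extract the first relationship type from each [...] pattern (simplified)."""
    return set(re.findall(r"\[\s*\w*\s*:\s*(\w+)", cypher))


def check_cypher(cypher: str, allowed: set[str]) -> str:
    """Reject queries that use relationship types missing from the schema."""
    unknown = relationship_types(cypher) - allowed
    if unknown:
        raise ValueError(f"unknown relationship types: {sorted(unknown)}")
    return cypher


schema = "(:Person)-[:DIRECTED]->(:Movie)"
prompt = build_cypher_prompt("Who directed which movie?", schema)
cypher = check_cypher(stub_llm(prompt), ALLOWED_RELATIONSHIPS)
```

The regular expression is deliberately simple and does not handle alternatives such as `[:A|B]`; it is meant to show where a validation step fits, since the quality of the generated Cypher depends heavily on the model.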
Part 1: Knowledge Graph for Movie Dataset
Dataset
Knowledge graph
Nodes and their properties:
- Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING}
- Person {name: STRING}
- Genre {name: STRING}
- Location {name: STRING}
- SimilarMovie {name: STRING}
Relationships:
(:Movie)-[:IN_GENRE]->(:Genre)
(:Person)-[:DIRECTED]->(:Movie)
(:Person)-[:ACTED_IN]->(:Movie)
(:Movie)-[:WAS_TAKEN_IN]->(:Location)
(:Movie)-[:IS_SIMILAR_TO]->(:SimilarMovie)
Figure: schema diagram with Movie (id, released, tagline, title, imdbRating) at the center, connected to Person (name), Genre (name), Location (name), and Similar Movie (name).
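Populating the graph from tabular data can be sketched as turning each row into parameterized Cypher `MERGE` statements. The column names (`genres`, `directors`, `actors`) and the sample row are assumptions for illustration; the node labels and relationship types follow the schema above. The code only builds the statements and does not connect to a database.

```python
def movie_row_to_cypher(row: dict) -> list[tuple[str, dict]]:
    """Build (query, parameters) pairs that upsert one movie and its links."""
    statements = [(
        "MERGE (m:Movie {id: $id}) "
        "SET m.title = $title, m.imdbRating = $imdbRating",
        {"id": row["id"], "title": row["title"], "imdbRating": row["imdbRating"]},
    )]
    for genre in row.get("genres", []):
        statements.append((
            "MATCH (m:Movie {id: $id}) "
            "MERGE (g:Genre {name: $name}) "
            "MERGE (m)-[:IN_GENRE]->(g)",
            {"id": row["id"], "name": genre},
        ))
    # Directors and actors are both Person nodes; only the relationship differs.
    for column, rel_type in (("directors", "DIRECTED"), ("actors", "ACTED_IN")):
        for person in row.get(column, []):
            statements.append((
                "MATCH (m:Movie {id: $id}) "
                "MERGE (p:Person {name: $name}) "
                f"MERGE (p)-[:{rel_type}]->(m)",
                {"id": row["id"], "name": person},
            ))
    return statements


# Hypothetical sample row, not taken from the dataset.
row = {
    "id": "1",
    "title": "Example Movie",
    "imdbRating": 7.5,
    "genres": ["Drama"],
    "directors": ["Director A"],
    "actors": ["Actor B"],
}
statements = movie_row_to_cypher(row)
```

Using `MERGE` rather than `CREATE` keeps the load idempotent, so a person who both directs and acts ends up as a single Person node. With a Neo4j driver session, each pair could then be executed as a query plus its parameters.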
Part 2: Medical Chatbot Using Unstructured Data
What you will learn:
- How to prompt an LLM to construct the knowledge graph.
- How to create a knowledge graph from unstructured text.
Pipeline (figure): Prompt -> GPT 4 + Neo4j (Cypher)
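One possible shape for this pipeline: prompt the LLM to return entities and relationships as JSON, parse the result, and convert each triple into Cypher. Everything below is an illustrative assumption, including the prompt wording, the JSON keys, the sample note, and the stubbed model output; it is not the prompt used in the Microsoft project.

```python
import json

PROMPT_TEMPLATE = """Extract entities and relationships from the medical note below.
Return only JSON: a list of objects with the keys
"head", "head_label", "relation", "tail", "tail_label".

Note:
{note}"""

REQUIRED_KEYS = {"head", "head_label", "relation", "tail", "tail_label"}


def parse_triples(llm_output: str) -> list[dict]:
    """Parse the model's JSON and keep only complete triples."""
    triples = json.loads(llm_output)
    return [t for t in triples if REQUIRED_KEYS <= t.keys()]


def triple_to_cypher(t: dict) -> tuple[str, dict]:
    """Turn one triple into a parameterized MERGE query."""
    # Labels and types are interpolated into the query text, so restrict them
    # to plain identifiers; the node names go in as parameters.
    for name in (t["head_label"], t["relation"], t["tail_label"]):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    query = (
        f"MERGE (h:{t['head_label']} {{name: $head}}) "
        f"MERGE (t:{t['tail_label']} {{name: $tail}}) "
        f"MERGE (h)-[:{t['relation']}]->(t)"
    )
    return query, {"head": t["head"], "tail": t["tail"]}


# Hypothetical note and stubbed LLM output, for illustration only.
prompt = PROMPT_TEMPLATE.format(note="Patient A reports a headache.")
stub_output = json.dumps([{
    "head": "Patient A", "head_label": "Patient",
    "relation": "HAS_SYMPTOM",
    "tail": "headache", "tail_label": "Symptom",
}])
triples = parse_triples(stub_output)
query, params = triple_to_cypher(triples[0])
```

Because LLM output is nondeterministic, constraining the allowed labels and relationship types in the prompt is one way to keep the resulting graph structure consistent.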
Resources: LLMs for constructing Knowledge Graph
https://github.com/neo4j-partners/neo4j-generative-ai-azure
https://python.langchain.com/docs/use_cases/graph/constructing/