
AI for custom documents
Team Name: DocSpeak

Team Members:
1) Apurv Jadhav
2) Jayaram Iyer
3) Vedant Kesarkar
4) Soham Jadhav

Affiliated Institute: SIES GST


Problem Statement

Some existing problems with LLMs:

• A limit on the amount of text you can input
• Outdated knowledge
• No straightforward way to compare information from multiple sources
Motivation
• As a student, it becomes very difficult to go through the textbooks prescribed by MU for the subjects I am studying.

• In such a case, it would be easy if I could just provide a document to an LLM and let it do the work for me.

• The project we created helps the user provide custom knowledge to the LLM and communicate with it.

• It also enables them to provide multiple documents about the same topic and make comparisons between them.
Methodology

1. Data Ingestion: Convert each PDF document into text, since PDFs are a binary format, and split the text into manageable chunks.
2. Text Embedding: Use OpenAI's embedding models to create embeddings (numerical vector representations) of the text chunks. These embeddings serve as the numerical representation of each document.
3. Vector Store: Store these embeddings in a vector store, essentially a database, organized by namespaces corresponding to different documents or years. (Steps 1-3 are sketched in the code after this list.)
4. Query Processing:
• For single-document queries: extract the relevant namespace from the query, retrieve the embeddings from the corresponding vector store, and generate a response using GPT-4.
• For multi-document queries: extract the years mentioned in the query, map them to their corresponding namespaces, retrieve embeddings from multiple namespaces, and combine them to generate a response.
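
A minimal sketch of the ingestion pipeline (steps 1-3), in Python. The file names, index name, namespace scheme, chunk sizes, and the text-embedding-ada-002 model are illustrative assumptions rather than the project's confirmed configuration.

```python
import os

from openai import OpenAI            # embedding + chat API client
from pinecone import Pinecone        # vector store client
from pypdf import PdfReader          # PDF -> text extraction

client = OpenAI()  # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("docspeak")  # assumed index name

def pdf_to_chunks(path: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Step 1: extract text from a PDF and split it into overlapping chunks."""
    text = "".join(page.extract_text() or "" for page in PdfReader(path).pages)
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def ingest(path: str, namespace: str) -> None:
    """Steps 2-3: embed each chunk and upsert it under the document's namespace."""
    chunks = pdf_to_chunks(path)
    embeddings = client.embeddings.create(
        model="text-embedding-ada-002",  # assumed embedding model
        input=chunks,
    ).data
    index.upsert(
        vectors=[
            {
                "id": f"{namespace}-{i}",
                "values": item.embedding,
                "metadata": {"text": chunks[i], "source": path},
            }
            for i, item in enumerate(embeddings)
        ],
        namespace=namespace,
    )

# One namespace per year/document, as described in step 3.
ingest("report_2022.pdf", namespace="report-2022")
ingest("report_2023.pdf", namespace="report-2023")
```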
Innovation

• Multi-Document Querying: Users can query across multiple PDFs simultaneously, enabling trend analysis and insight extraction.
• Dynamic Namespace Extraction: Automatic identification of document identifiers streamlines querying across multiple documents (a sketch follows this list).
• Intelligent Contextual Analysis: Leveraging GPT-3.5, our system provides accurate responses tailored to the query and document context.
• Source Attribution: Displaying the sources used for response generation enhances transparency and supports informed decision-making.
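
A minimal sketch of the dynamic namespace extraction described above, assuming namespaces are keyed by four-digit years; the NAMESPACES mapping and the regex are illustrative assumptions.

```python
import re

# Assumed year-to-namespace mapping; in practice this is built at ingestion time.
NAMESPACES = {"2022": "report-2022", "2023": "report-2023"}

def extract_namespaces(query: str) -> list[str]:
    """Find four-digit years in the query and map them to vector-store namespaces."""
    years = re.findall(r"\b(?:19|20)\d{2}\b", query)
    return [NAMESPACES[y] for y in years if y in NAMESPACES]

print(extract_namespaces("Compare the findings from 2022 and 2023"))
# -> ['report-2022', 'report-2023']
```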
Flowchart

(Pipeline flowchart from the original slides; not reproduced in the text export.)
Implementation

• Data Preparation: Load PDF documents, convert them to text, split the text into chunks, and assign namespaces (year + document identifier).

• Vector Storage: Use a service like Pinecone to store the embeddings, organized by namespaces.

• Query Handling: Extract years from queries, map them to namespaces, retrieve the relevant embeddings, and generate responses using GPT-4 (a sketch follows this list).
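
A minimal sketch of the query-handling flow, reusing the client, index, NAMESPACES, and extract_namespaces from the earlier sketches; the prompt wording, top_k value, and model names are illustrative assumptions.

```python
def answer(query: str) -> str:
    """Retrieve context from every namespace the query mentions, then ask GPT-4."""
    # Fall back to all namespaces when the query names no year.
    namespaces = extract_namespaces(query) or list(NAMESPACES.values())

    # Embed the query with the same model used at ingestion time.
    query_vector = client.embeddings.create(
        model="text-embedding-ada-002",  # assumed embedding model
        input=[query],
    ).data[0].embedding

    # Gather the top matches from each relevant namespace.
    excerpts, sources = [], set()
    for ns in namespaces:
        result = index.query(
            vector=query_vector, top_k=3, namespace=ns, include_metadata=True
        )
        for match in result.matches:
            excerpts.append(match.metadata["text"])
            sources.add(match.metadata["source"])  # kept for source attribution

    # Generate a response grounded in the retrieved chunks.
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided document excerpts."},
            {"role": "user",
             "content": "Excerpts:\n" + "\n---\n".join(excerpts)
                        + f"\n\nQuestion: {query}"},
        ],
    )
    return completion.choices[0].message.content + f"\n\nSources: {sorted(sources)}"
```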
Future Scope

1. Advanced Image Extraction: Implement cutting-edge image-processing algorithms to extract text and relevant information from images, enabling users to query and interact with both textual and visual content.

2. Visual Data Analytics: Develop tools for visual data analytics, enabling users to gain insights and derive meaning from visual content through interactive visualizations and data-exploration techniques.

3. Generative Image Query: Integrate generative models like DALL-E to create images based on user queries, allowing for visually rich and diverse search results.
Conclusion

Our project streamlines document exploration by integrating generative AI technologies. With a focus on user experience and customization, it offers intuitive features, such as multi-document querying and source attribution, that support informed decision-making and meaningful outcomes. This empowers users to maximize productivity, deepen their understanding, and act with confidence.
