
RiskGuru Chatbot

Madhavi Aghera
Enrollment ID: 202211049
MTech Major Project - II Report
Mode: Off Campus
DAIICT, Gandhinagar, Gujarat, India
202211049@daiict.ac.in

Mentor Name: Amaresh Chandra Singh
Manager - AI/ML Team
Company Name: Injala Pvt. Ltd.
Company Address: 1401 to 1410, 14th Floor, B Block, Westgate by True Value, S.G. Highway, Ahmedabad, Gujarat, India
Amaresh.Singh@injala.com

On Campus Supervisor: Prosenjit Kundu
Assistant Professor
DAIICT, Gandhinagar, Gujarat, India
prosenjit_kundu@daiict.ac.in

On Campus Co-Supervisor: Bakul Gohel
Assistant Professor
DAIICT, Gandhinagar, Gujarat, India
bakul_gohel@daiict.ac.in

Abstract—This project is a segment of the RiskGuru product, which comprises several AI-driven modules. Among these modules is a Chatbot, designed to respond to user queries. Chatbots play a pivotal role in modern software solutions by providing instant and personalized assistance to users. They serve as efficient virtual assistants, capable of addressing inquiries, guiding users through processes, and even executing tasks autonomously. By leveraging natural language processing and artificial intelligence, chatbots enhance user experience, streamline communication, and improve efficiency within software ecosystems. In essence, chatbots are indispensable components of software solutions, driving engagement, productivity, and innovation in various industries.
The focus of this project is on managing policy documents, enabling the Chatbot to address queries across various sections of policies. The project consists of three stages: factual, explanatory, and generative. To achieve this, we are exploring different types of LLM models and their effectiveness in the insurance domain. We are also generating and refining a variety of Prompts & Responses to achieve this task.
Engaging in this project provides me with an opportunity to delve into the domain of insurance and its associated documents. The domain's complexity necessitates expert guidance for comprehension. Therefore, RiskGuru aims to assist users in reviewing their documents and selecting the most suitable policies.
Index Terms—Annotations, Prompt Engineering, Large Language Models (LLM), Prompt & Responses (P&R), Quantization
I. INTRODUCTION

Injala is the leading disruptor in the insurance industry, offering a range of risk management software solutions, including Wrapportal, Kinetic, Asuretify, Anzenn, Prequaligy, and RiskGuru. Collectively, these software solutions form a complete insurance technology ecosystem known as the Risk Management Operating System (RMIS 360). RMIS 360 offers automation, rapid verification, document review, comparison, and other functionalities.
RiskGuru serves as an AI Advisor, offering intelligent analysis for diverse insurance documents. Its AI-driven functionalities ensure consistency, accuracy, thoroughness, and speed, effectively preventing costly errors and omissions in policy coverage. By providing optimal match coverages and uncovering hard terms through dissecting form wording and coverage intent, RiskGuru delivers unparalleled value. Additionally, it extracts key policy details, ensuring quick results.
Our project is structured into three distinct stages: factual, explanatory, and generative. Leveraging the capabilities of LLM models and efficient prompt engineering techniques, we aim to facilitate the extraction of key policy details.
II. OBJECTIVE

The RiskGuru Chatbot allows users to access necessary information from policies without navigating through lengthy and complex documents, enabling quicker decision-making and facilitating the purchase of policies that align best with their needs. Moreover, this approach aids users in comprehending the coverage, limits, and any other conditions associated with the policy.

III. LEARNING OUTCOMES

Although I already possessed knowledge of Python, Machine Learning, Deep Learning, Natural Language Processing, and other fundamental concepts, the scheduled sessions were initially planned to span a total of six months to cover all these topics comprehensively. However, upon our request, the team accommodated our needs by agreeing to assess us through assignments and provided modules to work on. Due to their coordination and support, I have been able to learn the following topics within a relatively short period:

A. Domain Knowledge

1) Types of Policies:
• General Liability: Commercial general liability (CGL) insurance is a policy type that offers coverage to businesses for injury and property damage resulting from the business's operations or accidents that occur on the business premises. However, it does not cover intentional damages or accidents involving automobiles, aircraft, or watercraft.
• Auto Liability: Auto liability insurance provides coverage for businesses that use vehicles as part of their operations. It proves beneficial in the event of crises related to automobiles.
• Workers Compensation: Workers' compensation is a type of employer insurance coverage that provides benefits to workers who sustain injuries or become disabled as a result of their job duties.
• Umbrella and Excess: An umbrella liability policy supplements underlying liability insurance policies by extending coverage beyond their limits. Once the underlying coverage is exhausted due to claim payments, the umbrella policy can step in to cover the remaining balance.
2) Policy Sections:
The policy contains sections such as Declaration, Rating, Coverage, Form list, Endorsement & Notice.
• Declaration: The declaration page provides comprehensive information such as carrier details, names and addresses of the insured, insurer, and producer. It also encompasses essential policy details like the policy number, effective date, retroactive date, coverage list, limits, premium breakdown, and any fees included in the policy.
• Rating: Rating refers to the process of determining the premium amount for an insurance policy based on the insured's risk profile. Insurers use various factors, such as the insured's age, location, type of property, and claims history, to calculate the premium rate. The rating process helps insurers ensure that premiums accurately reflect the level of risk associated with insuring the policyholder.
• Form list: A form list is a document that outlines the various forms and endorsements attached to an insurance policy. It serves as a checklist to ensure that all necessary forms and endorsements are included and properly executed.
• Endorsement: An endorsement is a modification or addition to an insurance policy that changes the terms or coverage provided by the policy. Endorsements can be used to add or remove coverage, change coverage limits, or modify policy conditions. They are typically attached to the policy as separate documents and are signed by both the insurer and the insured to indicate mutual agreement to the changes.
• Notice: A notice in insurance refers to formal communication provided by the insurer to the insured or other parties involved in the insurance policy. Notices may include important information about changes to the policy, updates on claims processing, renewal reminders, cancellation notices, or other relevant notifications.
3) ACORD 25 | Certificate of Insurance (COI):
The Certificate of Insurance (COI), also known interchangeably as an ACORD form, is a concise document that summarizes an insurance policy and confirms that a business holds insurance coverage. An ACORD Certificate of Insurance includes various key components to communicate important information. A COI contains details of the producer, insured, insurer, certificate number, coverages, type of insurance, policy number, policy effective & expiration dates, policy limits, description of operations/vehicles/locations, additional insured, signature, and cancellation notice.

B. Concepts

1) Prompt Engineering:
Hallucination in Large Language Models (LLMs) describes the occurrence when the model generates text that lacks factual basis or coherence with the provided context. Instead of producing accurate responses, the model generates fictional, misleading, or nonsensical content, presenting false knowledge as if it were accurate.
Prompt engineering aims to define the role and scenario for the model to guide its generation process, preventing it from producing unbounded hallucinations. This approach helps ensure that the model generates responses that are grounded in the provided context and aligned with the intended purpose, minimizing the risk of producing nonsensical or misleading content [1]. There are some well-known types of prompt engineering; a short illustrative sketch follows this list.
• Retrieval-Augmented Generation (RAG): RAG is a framework that combines retrieval-based and generation-based approaches to improve the quality and relevance of the generated text based on domain knowledge. It incorporates a retrieval mechanism to fetch relevant passages or documents from a knowledge source (e.g., Wikipedia) based on the input query or prompt. It then uses a language model (e.g., GPT) to generate responses conditioned on both the input query and the retrieved passages, ensuring that the generated text is contextually relevant and informative. RAG is useful for tasks requiring contextually rich responses, such as question answering, summarization, and dialogue systems.
• Chain of Thoughts (COT): This is a few-shot learning technique. COT is a prompting technique that involves generating a sequence of interconnected prompts to guide the generation of coherent and contextually relevant text. Instead of providing a single prompt, which may limit the scope or coherence of the generated output, Chain of Thoughts prompts the model with a series of related questions or statements, each building upon the previous one. This approach encourages the model to maintain a coherent narrative or logical progression throughout the generated text, resulting in more structured and meaningful outputs. Chain of Thoughts is particularly useful in tasks requiring the generation of complex or multi-faceted responses, such as storytelling, essay writing, or problem-solving. By breaking down the prompt into a series of interconnected thoughts, this technique helps guide the model's generation process and ensures consistency and coherence in the output.
• Reason + Act (ReACT): This is also a few-shot learning technique. ReACT is similar to the RAG framework but is specifically tailored for conversational applications, such as chatbots and virtual assistants, and has much greater reasoning capacity. It gathers additional information to arrive at a response by drawing on both private and public (external) knowledge bases.
• Directional Stimulus Prompting (DSP): DSP aims to influence the output of language models by providing directional prompts that guide the generation process towards specific topics, sentiments, or styles. Prompts are carefully crafted to evoke a particular response from the model. They often include keywords, cues, or instructions designed to steer the model's output in a desired direction. DSP prompts may vary in complexity and specificity, depending on the desired outcome, and can range from simple directives to more nuanced instructions tailored to the task. DSP is commonly used in tasks such as content generation, sentiment modification, or topic manipulation, where guiding the model's output towards a specific direction is crucial.
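To make the RAG and Chain of Thoughts ideas above concrete, the following is a minimal, self-contained sketch rather than the production RiskGuru code; the policy snippets, the TF-IDF retriever, and the prompt wording are illustrative assumptions, not the project's actual data or prompts.

# Sketch: assembling a retrieval-augmented prompt and a chain-of-thought variant.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical snippets standing in for chunks of a policy document.
chunks = [
    "Policy Number: GL-123456. Effective Date: 01/01/2024. Expiration Date: 01/01/2025.",
    "General Liability limit: $1,000,000 per occurrence, $2,000,000 general aggregate.",
    "Endorsement CG 20 10 adds the certificate holder as an additional insured.",
]
query = "What is the per occurrence limit of the general liability coverage?"

# Retrieval step: rank chunks by TF-IDF cosine similarity to the query.
vectorizer = TfidfVectorizer().fit(chunks + [query])
scores = cosine_similarity(vectorizer.transform([query]), vectorizer.transform(chunks))[0]
context = chunks[scores.argmax()]

# Generation step: the retrieved chunk is injected into the prompt so the LLM answers
# from the provided context instead of hallucinating.
rag_prompt = (
    "You are an insurance policy assistant. Answer only from the context below.\n"
    f"Context: {context}\n"
    f"Question: {query}\n"
    "Answer:"
)

# Chain-of-thought style variant: ask the model to reason step by step before answering.
cot_prompt = rag_prompt.replace(
    "Answer:", "Think step by step: identify the coverage, then state its limit.\nAnswer:"
)
print(rag_prompt)

In practice the resulting prompt string would be passed to whichever LLM is being evaluated; only the retrieval and prompt-assembly steps are shown here.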
2) Parameter Efficient Fine Tuning (PEFT):
Parameter Efficient Fine Tuning (PEFT) is a strategy utilized to enhance the performance of large language models (LLMs) on particular tasks while mitigating the computational burden associated with their massive parameter count [2]. Despite LLMs' effectiveness in natural language processing (NLP) tasks, their extensive parameters make training and storage computationally expensive. Fine-tuning, a common technique to adapt these models for specific tasks, requires training all of these parameters, which can be very costly. PEFT addresses this issue by focusing on a subset of parameters for training while freezing the remaining ones, thereby significantly reducing computational resources while maintaining performance. There are several concepts that work alongside PEFT.
• Quantization: The main concept is to change the precision of the parameters in a pre-trained model so that it can fit into less powerful hardware [3]. It converts the parameters from high precision to low precision. Most parameters are converted to a simpler form to save memory, but a small part that is crucial for adjusting the model during training stays in its original form, because simplifying it might affect how well the model learns. Post Training Quantization (PTQ) and Quantization Aware Training (QAT) are the two modes of training with quantization. There are various quantization techniques; below are some widely used PEFT techniques that use quantization (a configuration sketch follows this list).
  – LoRA: Low-Rank Adaptation is a technique that uses SVD (singular value decomposition) to decompose high-rank matrices into a combination of low-rank matrices. Instead of adjusting all the model's parameters, LoRA introduces small, specialized matrices that capture the essence of the new task [4].
  – QLoRA: An extension of LoRA that introduces 4-bit NormalFloat (NF4), Double Quantization, and Paged Optimizers [5].
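A minimal sketch of how a QLoRA-style setup could be configured is shown below; it assumes the Hugging Face transformers, peft, and bitsandbytes libraries and a CUDA-capable GPU, and the base model name is only an example, not necessarily the model used in the project.

# Sketch: 4-bit NF4 quantization with double quantization (QLoRA) plus LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # 4-bit NormalFloat [5]
    bnb_4bit_use_double_quant=True,       # double quantization [5]
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",          # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA: small trainable low-rank matrices are injected into the attention projections
# while the quantized base weights stay frozen [4].
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()        # only a small fraction of parameters is trainable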
C. Non-tech values:
In addition to technical expertise, I have learned the importance of certain non-technical skills crucial for my career growth.
• Sharpening soft skills like communication and teamwork, which are essential for success in any workplace.
• I have also realized the value of keeping code organized, checking its coverage, and documenting it well, ensuring that projects stay manageable and understandable.
• Lastly, effective task management is key for meeting deadlines and achieving goals efficiently.

IV. TOOLS

A. LabelImg
LabelImg is a user-friendly tool used for annotating images, providing a graphical interface for this task. Developed in Python, it utilizes the Qt framework for its visual interface. Annotations created with LabelImg are saved as XML files in PASCAL VOC format, the format used by ImageNet. It also supports YOLO format [6].
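As an illustration of how such annotations can be consumed downstream, the sketch below parses one PASCAL VOC XML file with the Python standard library; the file name and the class labels are hypothetical examples, not the project's actual annotation schema.

# Sketch: reading a LabelImg PASCAL VOC annotation.
import xml.etree.ElementTree as ET

tree = ET.parse("declaration_page_001.xml")   # annotation exported by LabelImg (placeholder name)
root = tree.getroot()

print("image:", root.findtext("filename"))
for obj in root.iter("object"):
    label = obj.findtext("name")              # e.g. "policy_number", "effective_date"
    box = obj.find("bndbox")
    xmin, ymin = int(box.findtext("xmin")), int(box.findtext("ymin"))
    xmax, ymax = int(box.findtext("xmax")), int(box.findtext("ymax"))
    print(label, (xmin, ymin, xmax, ymax))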
B. YEDDA
YEDDA, formerly known as SUTDAnnotator, is a versatile tool designed for annotating various forms of text, including multiple languages such as English and Chinese, as well as symbols and emojis. It is compatible with all major operating systems, including Windows, Linux, and macOS [7].

C. Postman
Postman is a powerful API testing and development tool that simplifies the process of working with APIs. It offers a user-friendly interface for sending requests, testing responses, and debugging APIs. With features like automated testing and collaboration capabilities, Postman streamlines API development workflows and enhances productivity.

D. Amazon Textract
Amazon Textract is a machine learning service provided by Amazon Web Services (AWS) that enables users to extract text and data from scanned documents, PDFs, and images. Leveraging advanced machine learning algorithms, Textract can accurately detect and extract printed text, handwriting, tables, and forms from a wide range of document types. We have utilized this service with policy and ACORD documents to provide context for generating prompts and responses for training large language models (LLMs).
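A minimal sketch of calling Textract from Python is shown below; it assumes boto3 is installed and AWS credentials are configured, and the file name and region are placeholders.

# Sketch: extracting the text of a scanned policy page with Amazon Textract.
import boto3

textract = boto3.client("textract", region_name="us-east-1")

with open("policy_declaration_page.png", "rb") as f:
    response = textract.detect_document_text(Document={"Bytes": f.read()})

# Textract returns a list of Blocks; LINE blocks carry the detected text.
lines = [block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"]
print("\n".join(lines))

For tables and key-value form fields, the analyze_document API with FeatureTypes=["TABLES", "FORMS"] can be used instead.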
E. Azure DevOps
Azure DevOps is a cloud platform from Microsoft that streamlines the entire software development lifecycle. It includes a range of services such as version control, build automation, continuous integration and delivery (CI/CD), agile planning, and project management. With Azure DevOps, teams can collaborate more efficiently, automate repetitive tasks, track progress, and deliver high-quality software faster.

V. CONTRIBUTIONS

A. Understanding Policy Documents & Helping the Team Restructure the Policy Pipeline
As mentioned, policy documents are complex and often require the guidance of domain experts for comprehension. Our deliberate efforts involved dissecting the primary sections of the policy to compile a list of classes crucial for training the classifier module. In the RiskGuru pipeline shown in the figure below, policy section classification is a pivotal point that determines which section the model should consult to address specific queries.

Fig. 1. RiskGuru Policy Pipeline

B. Data Segregation
As mentioned earlier, while the classification module consists of a list of classes, there is currently a lack of readily available data. However, the company owns a vast array of policies, from which we aim to extract our requirements. Our strategy involves converting all policies into images and systematically organizing them into folders corresponding to their respective classes, including Declaration, Rating, Form List, and various Endorsement sections.
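The sketch below illustrates this segregation step; it assumes the pdf2image package (and its poppler dependency) is installed, and the paths, class name, and DPI are illustrative rather than the project's actual settings.

# Sketch: converting a policy PDF to page images and filing them under a class folder.
from pathlib import Path
from pdf2image import convert_from_path

pdf_path = Path("policies/sample_policy.pdf")      # placeholder input policy
section_class = "Declaration"                      # e.g. Declaration, Rating, Form List, ...
out_dir = Path("dataset") / section_class
out_dir.mkdir(parents=True, exist_ok=True)

# In practice the class per page would come from manual labeling or the classifier;
# here a single class is hard-coded purely for illustration.
for page_no, page in enumerate(convert_from_path(str(pdf_path), dpi=200), start=1):
    page.save(out_dir / f"{pdf_path.stem}_page{page_no:03d}.png", "PNG")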
C. Annotation and P&R Generation
Data annotation and prompt & response generation are critical aspects of training models. Data annotation involves labeling datasets with relevant information, providing supervision for machine learning algorithms. This annotated data is essential for training accurate models. Prompt & response generation involves crafting input prompts and generating corresponding responses, which is crucial for developing conversational AI systems like chatbots. These processes are essential for creating high-quality datasets and ensuring that models effectively understand and generate meaningful natural language text.
Following the data segregation process, the next step involves preparing prompts to feed into the model. This includes generating prompts based on the respective section folders and aligning them with the chat template designed for a particular model. The template includes questions and answers with their corresponding context, ensuring the model receives structured input for effective training and comprehension.
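A minimal sketch of this formatting step is shown below; the template string, question, context, and answer are illustrative, and the exact chat markers depend on the model that is ultimately chosen.

# Sketch: turning one annotated prompt & response pair into a structured training example.
example = {
    "context": "Policy Number: GL-123456. Effective Date: 01/01/2024.",
    "question": "What is the policy number?",
    "answer": "The policy number is GL-123456.",
}

# Generic instruction-style chat template; many instruction-tuned models use a similar
# [INST] ... [/INST] convention, but the exact markers are model-specific.
template = (
    "<s>[INST] Use the context to answer the question.\n"
    "Context: {context}\n"
    "Question: {question} [/INST] {answer}</s>"
)

training_text = template.format(**example)
print(training_text)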
D. RAG Module
RAG (Retrieval-Augmented Generation) with LLMs (Large Language Models) holds significant importance in natural language processing tasks, particularly in the domain of question answering and information retrieval. RAG enhances the capabilities of LLMs by integrating a retrieval component that accesses external knowledge sources. This enables LLMs to generate responses based not only on learned patterns from training data but also on relevant information retrieved from external sources. By leveraging this combination of generation and retrieval, RAG with LLMs can produce more accurate, coherent, and contextually relevant responses, thereby improving the overall performance and effectiveness of natural language understanding and generation tasks.
After testing the RAG module on a pre-trained LLM, we observed satisfactory results at some stages but found that it encounters difficulties in understanding the structure of policy documents. Consequently, we made the strategic decision to first train the LLM model specifically on policy data. By doing so, we aim to enhance the model's comprehension of policy document structures and nuances. Subsequently, we plan to integrate the RAG module with this tailored LLM model, leveraging its augmented capabilities to provide more accurate and contextually relevant results. This approach is expected to improve performance in generating responses to policy-related queries.

E. LLM Module for the Declaration Section
I have been assigned the task of fine-tuning LLM models for the declaration section of policies, which involves selecting the most suitable LLM model for this purpose and integrating it with the declaration section. Due to the unavailability of sufficient data, there is a need to generate prompts and responses for our designated section. Currently, this module is in progress; I have completed the training pipeline and am now focusing on developing the inferencing pipeline and generating more prompts to get better training results.
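To sketch what the inferencing pipeline could look like once a section-specific adapter is trained, the example below loads a LoRA adapter on top of a base model and generates an answer; the base model name, adapter path, and prompt are placeholders, not the project's actual artifacts.

# Sketch: inference with a fine-tuned LoRA adapter for declaration-section queries.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "mistralai/Mistral-7B-v0.1"        # example base model
adapter_path = "checkpoints/declaration-lora"  # hypothetical fine-tuned adapter

tokenizer = AutoTokenizer.from_pretrained(base_name)
base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_path)   # attach the declaration-section adapter

prompt = "[INST] Context: Policy Number: GL-123456.\nQuestion: What is the policy number? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))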
VI. CONCLUSION
Along with the classifier module, fine-tuned models trained
on section-wise prompts, and the RAG module, the RiskGuru
chatbot will be able to deliver improved responses for policy
documents.
Currently, our focus is solely on handling factual data. In
the subsequent phase, we aim to delve into the explanatory
and generative aspects of the chatbot. Moving forward, the
project’s scope extends to comparing policy documents and
suggesting the most suitable policy to users based on the
coverages offered. This expanded functionality is intended to
enrich the user experience and offer invaluable assistance in
selecting the right insurance policies.
ACKNOWLEDGMENT
I want to thank my manager, colleagues, and mentors for
their invaluable support and guidance throughout my learning
journey. I have been truly fortunate to work with such an exceptional team and under the guidance of a supportive manager.
Their mentorship helped me grow professionally and fostered
a friendly and nurturing work environment. The expertise and
experience gained from collaborating with my colleagues will
undoubtedly hold great significance for me in the future.
REFERENCES
[1] https://paperswithcode.com/task/prompt-engineering
[2] https://huggingface.co/blog/peft
[3] https://www.tensorops.ai/post/what-are-quantized-llms
[4] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, "LoRA: Low-Rank Adaptation of Large Language Models," arXiv preprint arXiv:2106.09685, 2021. (https://arxiv.org/abs/2106.09685)
[5] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, "QLoRA: Efficient Finetuning of Quantized LLMs," Advances in Neural Information Processing Systems, vol. 36, 2023. (https://arxiv.org/abs/2305.14314)
[6] https://github.com/HumanSignal/labelImg
[7] https://github.com/jiesutd/YEDDA
