AIML Internship Report
Bachelor of Technology
(Electronics and Communication)
Bhavay Savaliya
Roll No. (U20EC096)
B. TECH. IV(EC), 8th Semester
Guided By:
(Year: 2023-24)
CERTIFICATE
This is to certify that the Internship Training Report entitled “Innovations in Risk
Estimation Models, Automation Solutions, and Chatbot Development for Health-
care Applications” is presented & submitted by Bhavay Savaliya, bearing Roll No.
U20EC096, of B.Tech. IV, 8th Semester, in partial fulfillment of the requirements
for the award of B.Tech. Degree in Electronics & Communication Engineering for aca-
demic year 2023-24.
He has successfully and satisfactorily completed Internship Training Exam in all
respects. We certify that the work is comprehensive, complete, and fit for evaluation.
Examiners:
3. Chatbot Development for Help and Support: Undertook the task of creating a
chatbot to provide assistance and support by fine-tuning Large Language Models. He
demonstrated creativity and innovation in leveraging cutting-edge technology to
enhance user experience and support services.
We commend Bhavay Savaliya for his outstanding performance and wish him all the
best in his future endeavours.
Acknowledgements
I would like to express my profound gratitude and deep regards to my guide Dr. Kishor
P. Upla for his guidance. I am heartily thankful for his suggestions and for the clarity
with which he explained the concepts of the topic, which helped me greatly in this work.
I would also like to thank Prof. Dr. Jignesh N. Sarvaiya, Head of the Electronics
Engineering Department, SVNIT, and all the faculty of DoECE for their co-operation
and suggestions. I am very grateful to all my classmates for their support.
Bhavay Savaliya
Sardar Vallabhbhai National Institute of Technology
Surat
May 2024
Abstract
In the realm of healthcare, efficient utilization of data holds immense potential for im-
proving patient outcomes and streamlining medical processes. This internship report
outlines the development and implementation of several innovative solutions leverag-
ing deep neural networks (DNNs) and advanced natural language processing (NLP)
techniques in the medical domain.
The primary focus of this internship was to harness the power of machine learning
to analyze and interpret medical data. Classification and regression models were constructed
using deep neural networks, enabling the classification of medical conditions
and the prediction of a patient's likelihood of developing a disease. These models were
trained on comprehensive datasets, providing insights into patterns and trends crucial
for clinical decision-making.
Furthermore, a novel solution was devised to automate the extraction of pertinent in-
formation from patient reports. Integrating optical character recognition (OCR) via
Pytesseract and leveraging the Gemini API, this solution efficiently extracts essential
parameters from textual data, facilitating seamless integration into medical databases.
This streamlined approach enhances data accessibility and accuracy, empowering health-
care professionals with actionable insights derived from patient records.
Additionally, a conversational agent (chatbot) was developed utilizing the Gemini API,
offering an intuitive interface for users to interact with medical data effectively. Through
natural language understanding and generation capabilities, the chatbot provides per-
sonalized assistance, answers queries, and facilitates data retrieval, ultimately enhanc-
ing the user experience and optimizing workflow efficiency.
Table of Contents
Page
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Chapters
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Internship Insights: Unveiling My Professional Growth . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Industry Demand for Risk Estimation Models . . . . . . . . . . 2
1.2.2 Addressing Manual Processes with Automation Solutions . . . 3
1.2.3 Enhancing Patient Engagement and Support through Chatbot
Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.4 Convergence of Technology and Healthcare . . . . . . . . . . . 3
1.3 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3.1 Development of Accurate and Robust Risk Estimator Models . 3
1.3.2 Implementation of an Automated Solution for Medical Param-
eter Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.3 Development of a Chatbot for Patient Help and Support . . . . 4
1.3.4 Evaluation, Validation and Deployment of Developed Solutions 4
1.4 Organization of Project Report . . . . . . . . . . . . . . . . . . . . . . 5
1.4.1 Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . 5
1.4.2 Chapter 2: Literature Survey . . . . . . . . . . . . . . . . . . . 5
1.4.3 Chapter 3: Proposed Work . . . . . . . . . . . . . . . . . . . . 5
1.4.4 Chapter 4: Experimental Work . . . . . . . . . . . . . . . . . . 5
1.4.5 Chapter 5: Conclusion and Future Scope . . . . . . . . . . . . 6
2 Literature Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Linear Regression [1] . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Logistic Regression [2] . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Regression Using Deep Neural Networks . . . . . . . . . . . . . . . . 10
2.4 Classification Using Deep Neural Networks . . . . . . . . . . . . . . . 11
2.5 Normalization (Why Normalize Data?) . . . . . . . . . . . . . . . . . . 12
2.6 Optical Character Recognition (OCR) . . . . . . . . . . . . . . . . . . 14
2.6.1 Basic: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.6.2 Challenges: . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6.3 Applications: . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
List of Figures
List of Abbreviations
API Application Programming Interface
CNN Convolutional Neural Network
LLM Large Language Model
ML Machine Learning
NLP Natural Language Processing
OCR Optical Character Recognition
RNN Recurrent Neural Network
Chapter 1
Introduction
As I look back on my internship at DQCare, I feel really grateful and excited to share
what I’ve learned. DQCare is a company that uses advanced technology, like AI and
image processing, to make healthcare better.
During my time there, I got to be part of some really cool projects that are chang-
ing the way healthcare works. I learned a lot and grew a ton, both as a professional and
as a person.
that make a real difference for the users, every experience taught me something important.
Using Fig. 1.1, I will summarize my learning during the internship:
1. First, I had to complete a task in which I had to insert values from a given CSV
file into a specific website and save the results back to the CSV file. Since the
CSV file contained 9,000 entries, I learned Selenium to automate this task.
2. After generating and filtering the dataset, a risk estimator model was created
using this data.
3. As it was a deep neural network model, deploying it with Python would have
produced a Docker image larger than 500 MB, since the TensorFlow library alone
is around 450 MB. So I learned the Candle framework and recreated the model
with it; because Candle is a Rust ML framework, the final Docker image was only
22 MB.
4. Then I researched OCR, Table Transformers, and other techniques that can help
extract data from PDFs and images.
5. I realized that using the Gemini API and prompt engineering, I could automate
the data extraction. So I learned all about integrating with the Gemini API.
6. My last task was to build an AI chatbot for Help and Support for the clients.
So I learned fine-tuning of open-source Large Language Models using the huggingface/transformers
library. As the dataset provided by DQCare was not sufficient,
I provided the data in the context field of Gemini, set the chatbot's rules in the
system prompt, and answered user queries through the user prompt.
7. Then I learned to containerize the back-end as a Docker image and deployed the
solutions on Render.
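The CSV automation in step 1 can be sketched as a small round-trip loop. Here `submit_to_website` is a hypothetical stand-in for the Selenium interaction (the real task drove a webdriver to fill the site's form for each of the 9,000 rows):

```python
import csv
import io

def submit_to_website(row):
    # Hypothetical stand-in for the Selenium step: the real code filled
    # the website's form with these values and read the result back.
    # Here we simply derive a dummy result so the sketch is runnable.
    return {"id": row["id"], "result": float(row["value"]) * 2}

def process_csv(in_text):
    # Read every entry, "submit" it, and write the results back as CSV.
    reader = csv.DictReader(io.StringIO(in_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "result"])
    writer.writeheader()
    for row in reader:                     # ~9,000 entries in the real task
        writer.writerow(submit_to_website(row))
    return out.getvalue()

print(process_csv("id,value\n1,2\n2,3\n"))
```

With Selenium, `submit_to_website` would instead locate the form fields with `driver.find_element`, send the values, and scrape the displayed result.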
1.2 Motivation
1.2.1 Industry Demand for Risk Estimation Models
The healthcare industry is increasingly recognizing the importance of predictive an-
alytics and risk estimation models in enhancing patient care and optimizing resource
allocation. With the growing volume of patient data being generated daily, there is a
pressing need for accurate risk assessment tools that can assist healthcare providers in
identifying high-risk patients and implementing proactive intervention strategies. The
creation of various risk estimator models during the internship period stemmed from
this industry demand to improve patient outcomes and operational efficiency within
healthcare organizations.
1.3 Objectives
1.3.1 Development of Accurate and Robust Risk Estimator Models
The primary objective of the internship project was to develop a series of risk estimator
models capable of accurately predicting various healthcare outcomes, such as disease
progression, treatment response, and patient prognosis. These models aimed to leverage
machine learning algorithms and statistical techniques to analyze patient data, including
demographic information, medical history, and clinical parameters, and generate risk
scores or probabilities. The goal was to create models that not only demonstrate high
predictive performance but also exhibit robustness and generalizability across diverse
patient populations and healthcare settings.
1.4 Organization of Project Report
application.
Chapter 2
Literature Survey
Deep learning, a subset of artificial intelligence (AI), has emerged as a transformative
technology in healthcare, revolutionizing various aspects of medical practice and re-
search. Deep learning has enormous potential to improve patient care, diagnosis, and
treatment because of its capacity to automatically discover complex patterns and rep-
resentations from enormous volumes of data. An overview of the uses, difficulties,
and potential directions of deep learning in healthcare is given by this review of the
literature.
2.1 Linear Regression [1]

1. Basic Idea: The goal of linear regression is to determine which straight line best
fits a given set of data points. This line represents the relationship between the
independent variables (commonly denoted X) and the dependent variable (typically
denoted y).
2. Equation: The equation for a linear regression model with independent variables
X is:

y = β0 + βX + ϵ    (2.1)
3. Evaluation: Metrics like Root Mean Squared Error (RMSE) and R-squared
(the percentage of the dependent variable’s variance that can be predicted from
the independent variable or variables) are utilised in evaluating linear regression
models.
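As a small worked example (with NumPy standing in for a statistics package), the snippet below fits a least-squares line to toy data and computes the RMSE and R-squared metrics described above:

```python
import numpy as np

# Toy data that is approximately y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Ordinary least squares fit of y = beta0 + beta1 * x.
b1, b0 = np.polyfit(x, y, 1)          # slope (beta1), intercept (beta0)
y_hat = b0 + b1 * x

# Evaluation metrics from the list above.
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
r2 = 1.0 - float(np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2))
print(round(rmse, 3), round(r2, 3))
```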
2.2 Logistic Regression [2]

1. Basic Idea: Using a logistic (or sigmoid) function to represent the probability of a
binary result is the fundamental notion underpinning logistic regression. Logistic
regression is specifically developed for binary classification problems, where the
result variable can take only two possible values (usually coded as 0 and 1). This
is in contrast to linear regression, which is used to predict continuous outcomes.
2. Equation: The equation for a logistic regression model with independent variables
X is:

p(y = 1|x) = 1 / (1 + e^(−(β0 + βX)))    (2.2)
• p(y = 1|x) is the probability of the outcome being 1 (belonging to the
positive class) given the predictor array X.
• X is the array of independent variables.
• β0 is the constant coefficient (parameter) of the model.
• β is an array of coefficients (parameters) of the model.
(c) Precision: Out of all the positive predictions the model makes, precision
indicates the proportion of correct positive predictions. It is calculated as:

Precision = TP / (TP + FP)    (2.4)
(d) Recall: The ratio of the number of true positive predictions among all actual
positive data cases is known as recall. It is calculated as:
Recall = TP / (TP + FN)    (2.5)
(e) F1 Score: The harmonic mean of recall and precision is the F1 score. It is
calculated as follows and offers a combination of recall and precision:
F1 Score = (2 × Precision × Recall) / (Precision + Recall)    (2.6)
(f) Receiver Operating Characteristic (ROC) Curve: Plotting the true pos-
itive rate (TPR) or recall versus the false positive rate (FPR) at different
threshold values is known as the ROC curve. It allows the representation of
the sensitivity vs. specificity trade-off.
(g) Area Under the ROC Curve (AUC-ROC): The model’s overall perfor-
mance is summarised by AUC-ROC for all threshold values. It shows the
likelihood that a randomly selected positive instance would be ranked higher
by the model than a randomly selected negative case.
(h) Precision-Recall Curve: The precision-recall curve plots precision against
recall at different threshold levels, much like the ROC curve. It is particularly
useful when working with unbalanced datasets.
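These definitions are easy to verify on example confusion-matrix counts:

```python
# Precision, recall, and F1 exactly as in Eqs. (2.4)-(2.6).
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(p, r):
    return 2 * p * r / (p + r)

tp, fp, fn = 80, 20, 40      # example counts from a confusion matrix
p, r = precision(tp, fp), recall(tp, fn)
print(round(p, 4), round(r, 4), round(f1(p, r), 4))  # → 0.8 0.6667 0.7273
```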
2.3 Regression Using Deep Neural Networks

1. Architecture:
(a) Input, hidden, and output layers are among the several layers of intercon-
nected neurons that make up DNNs.
(b) In regression tasks, the output layer typically consists of a single neuron,
and the network predicts continuous values.
2. Loss Function:
(a) For regression assignments, the mean squared error (MSE) is frequently
utilised as the loss function.
(b) The aim is to minimise the discrepancy between the predicted values and
the target values.
3. Activation Functions:
4. Optimization:
(a) During training, network parameters are often updated using Gradient Descent-
based optimisation algorithms like Adam or RMSprop.
(b) Mini-batch gradient descent is often employed to train DNNs efficiently on
large datasets.
5. Regularization:
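To make these pieces concrete, here is a minimal NumPy sketch of a regression DNN: one ReLU hidden layer, a single linear output neuron, MSE loss, and plain full-batch gradient descent. This is an illustration only; the internship models were built in TensorFlow and trained with Adam.

```python
import numpy as np

# Toy continuous target on 2 input features.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 2))
y = (X[:, 0] - 2.0 * X[:, 1]).reshape(-1, 1)

W1 = rng.normal(0.0, 0.5, (2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)   # single output neuron
lr = 0.1

def forward(X):
    h = np.maximum(0.0, X @ W1 + b1)           # ReLU hidden activations
    return h, h @ W2 + b2                      # linear output for regression

for _ in range(1000):                          # full-batch gradient descent
    h, y_hat = forward(X)
    g_out = 2.0 * (y_hat - y) / len(X)         # d(MSE)/d(y_hat)
    gW2, gb2 = h.T @ g_out, g_out.sum(0)
    g_h = (g_out @ W2.T) * (h > 0)             # backprop through ReLU
    gW1, gb1 = X.T @ g_h, g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((forward(X)[1] - y) ** 2))
print(round(mse, 4))
```

Mini-batch training, as described above, would simply iterate this update over shuffled slices of X.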
2.4 Classification Using Deep Neural Networks
1. Architecture:
(a) Similar to regression, DNNs for classification tasks consist of input, hidden,
and output layers.
(b) The output layer in classification tasks usually comprises multiple neurons,
each representing a class label. Softmax activation is used to produce class
probabilities.
2. Loss Function:
(a) The loss function for classification tasks is often Cross-Entropy Loss, also
known as Log Loss.
(b) The aim is to minimise the discrepancy between the predicted class probabilities
and the true class labels.
3. Activation Functions:
(a) The output layer of binary classification tasks frequently uses the sigmoid
activation function to generate probabilities between 0 and 1.
(b) The softmax activation function is utilised to generate probabilities over sev-
eral classes in multi-class classification problems.
4. Optimization:
(a) Gradient Descent-based optimization algorithms are also used for training
classification DNNs.
(b) To further modify the learning rate during training, strategies such as learn-
ing rate scheduling might be used.
5. Evaluation Metrics:
(a) For classification tasks, accuracy, precision, recall, F1 score, and area under
the ROC curve (AUC-ROC) are common evaluation metrics.
(b) These metrics provide insights into the model’s performance in correctly
predicting class labels.
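The classification head described here, a softmax output layer scored with cross-entropy loss, can be sketched directly in NumPy:

```python
import numpy as np

# Softmax turns raw output-layer activations (logits) into class probabilities.
def softmax(z):
    z = z - z.max(axis=1, keepdims=True)     # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Cross-entropy (log loss): negative mean log-probability of the true class.
def cross_entropy(probs, labels):
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.2, 0.3]])         # two samples, three classes
labels = np.array([0, 1])                    # true class labels

probs = softmax(logits)
print(probs.sum(axis=1))                     # each row sums to 1
print(round(cross_entropy(probs, labels), 4))
```

For binary classification, a single sigmoid output neuron plays the same role, as noted in item 3(a).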
2. Facilitates Interpretability:
1. Min-Max Scaling:

x_scaled = (x − min(x)) / (max(x) − min(x))

where x is the original feature value, min(x) is the minimum value of the
feature, and max(x) is the maximum value of the feature.

2. Standardization:

x_scaled = (x − mean(x)) / std(x)

where x is the original feature value, mean(x) is the mean of the feature,
and std(x) is the standard deviation of the feature.
• Min-Max Scaling:
• Standardization:
Considerations:
2. Impact on Interpretability:
• Normalization changes the scale of the features but does not affect their
distribution or relationship with the target variable.
• Interpretability of the model remains intact after normalization, as the rela-
tive importance of features is preserved.
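Both transforms are one-liners; scikit-learn's MinMaxScaler and StandardScaler implement the same formulas:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # one feature column

# Min-max scaling: maps the feature into [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standardization: zero mean, unit standard deviation.
x_std = (x - x.mean()) / x.std()

print(x_minmax)                                # values 0, 0.25, 0.5, 0.75, 1
print(float(x_std.mean()), float(x_std.std())) # ~0.0 and ~1.0
```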
2.6 Optical Character Recognition (OCR)

2.6.1 Basic:
1. Image Acquisition: OCR begins with the acquisition of an image containing text.
This image can be obtained from various sources, such as scanned documents,
photographs, or screenshots.
3. Text Localization: Text localization techniques identify the regions in the image
that contain text. This step is crucial for isolating the text from the background
and other non-text elements.
4. Text Segmentation: Text segmentation divides the localized text regions into
individual characters or words. This step is necessary for recognizing and inter-
preting each character or word separately.
7. Postprocessing: Postprocessing techniques are applied to improve the accuracy
of the recognized text. This may include spell checking, language modeling, and
context-based correction.
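As a sketch of how this pipeline is driven in practice, `ocr_image` below wraps Pytesseract (the OCR engine used in this work; it needs the tesseract binary installed, so the import is deferred and the call is not executed here), and `postprocess` is a toy illustration of step 7:

```python
import re

def ocr_image(path):
    # Steps 1-6 in one call: Pytesseract handles localization, segmentation,
    # and recognition. Requires the tesseract binary and pytesseract package.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(path))

def postprocess(text):
    # Step 7 (toy version): drop characters that cannot appear in the
    # expected report fields, then collapse runs of whitespace.
    text = re.sub(r"[^A-Za-z0-9.,:%/()\s-]", "", text)
    return re.sub(r"\s+", " ", text).strip()

print(postprocess("Hemoglobin :  13.5  g/dL\n\nWBC ~ 7,200"))
```

Real postprocessing would add spell checking and context-based correction, as noted above.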
2.6.2 Challenges:
1. Variability in Fonts and Styles: OCR systems must handle a wide range of
fonts, styles, sizes, and orientations of text present in images.
2. Noise and Distortions: Images may contain noise, distortions, blurring, or degra-
dation due to scanning artifacts, poor lighting conditions, or low-resolution cam-
eras.
4. Multi-lingual and Multi-script Text: OCR systems must support multiple lan-
guages and scripts, each with its own unique character sets and writing conven-
tions.
2.6.3 Applications:
1. Document Digitization: OCR is widely used for converting printed documents,
books, and manuscripts into editable and searchable digital formats.
2. Data Entry and Form Processing: OCR automates data entry tasks by extract-
ing information from documents, invoices, forms, and receipts.
4. Mobile Scanning Apps: OCR is integrated into mobile scanning apps for cap-
turing text from images of business cards, whiteboards, and documents.
2. End-to-End OCR Systems: End-to-end OCR systems that directly map input
images to output text without intermediate processing steps are being developed,
offering potential improvements in efficiency and accuracy.
3. Multimodal OCR: Integrating OCR with other modalities such as natural lan-
guage processing (NLP) and computer vision enables more sophisticated docu-
ment understanding and semantic analysis.
7. Text Generation: Text generation involves producing natural language text au-
tomatically. This can include tasks like language modeling, text summarization,
and dialogue generation.
These are just some of the key theories and techniques in NLP, and the field is
constantly evolving with new advances in deep learning, reinforcement learning, and
other areas of AI.
Chapter 3
Proposed Work
3.1 Overview
The initial phase of the internship involved studying medical parameters and developing
predictive models for cardiovascular risk estimation, retinopathy estimation, and
diabetes prediction using deep learning techniques with TensorFlow. Subsequently, a
solution was created to streamline the extraction of essential patient details from reports
in image or PDF format, followed by the development of a chatbot using data provided
by DQCare.
3.2 Methodology
3.2.1 Creation of various estimation and classification models
• Filtering Dataset: Entries containing null or otherwise invalid values were
removed from the dataset.
• Normalization: The dataset was normalised using scikit-learn's StandardScaler.
• Training of model: First, a logistic regression model was created using the
scikit-learn library. Then a deep neural network model was created using
TensorFlow with Adam as the optimizer.
• Evaluation of model: The logistic regression model achieved an accuracy of
0.724; the deep neural network model achieved an accuracy of 0.732.
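As an illustrative, self-contained stand-in for this training step, the snippet below fits a logistic regression model by gradient descent in NumPy on synthetic standardized data. The actual models were trained with scikit-learn and TensorFlow on the filtered medical dataset, with the accuracies reported above.

```python
import numpy as np

# Synthetic stand-in for the filtered dataset: 3 features, binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = (X @ w_true > 0).astype(float)

# Standardize features, mirroring the StandardScaler step above.
X = (X - X.mean(0)) / X.std(0)

w = np.zeros(3)
b = 0.0
lr = 0.5
for _ in range(300):                       # gradient descent on the log loss
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= lr * X.T @ (p - y) / len(y)
    b -= lr * float(np.mean(p - y))

p = 1 / (1 + np.exp(-(X @ w + b)))
acc = float(np.mean((p > 0.5) == y))       # training accuracy
print(round(acc, 3))
```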
4. Check if any data extracted: The final JSON response is checked for emptiness.
If it is empty, no data was extracted from the report, and the user is
notified. Otherwise, the extracted data is saved to the database; the report
is also saved to cloud storage, and the report's original file name and a
generated SHA-256 key are saved to the database.
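This check can be sketched as follows; the commented-out save_* calls are hypothetical placeholders for the real database and cloud-storage writes:

```python
import hashlib
import json

def handle_extraction(json_text, report_bytes, file_name):
    # Parse the final JSON response and check whether anything was extracted.
    data = json.loads(json_text)
    if not data:
        return "no data extracted from report"      # notify the user
    # Key used to index the stored report alongside its original file name.
    key = hashlib.sha256(report_bytes).hexdigest()
    # save_to_database(data)                 # hypothetical persistence calls
    # save_to_cloud_storage(report_bytes)
    # save_report_index(file_name, key)
    return key

print(handle_extraction("{}", b"...", "report.pdf"))
print(handle_extraction('{"hemoglobin": 13.5}', b"report-bytes", "report.pdf"))
```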
Advantages:
(a) Usually a better choice than fine-tuning other Large Language Models
when only a small dataset is available.
(b) The Google Gemini model is trained on a vast dataset. This means that
if a user asks a valid question beyond the given data, the model can still
generate a response drawing on its internet-scale training data.
(c) Maintenance is easier, as no manual model deployment is needed.
Disadvantages:
(a) There may be security issues, as the context data is exposed to Google
Gemini.
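The way the chatbot request was assembled can be sketched as plain prompt composition; the field names below are illustrative only, not the exact Gemini API request schema:

```python
# The chatbot's rules live in the system prompt, DQCare's data in the
# context field, and the user's question in the user prompt. The dict
# keys here are illustrative assumptions, NOT the Gemini API schema.
def build_request(rules, context_data, user_query):
    return {
        "system_prompt": rules,
        "context": context_data,
        "user_prompt": user_query,
    }

req = build_request(
    rules="Answer only questions about DQ Care and digital healthcare.",
    context_data="DQ Care is a digital healthcare platform ...",
    user_query="Are there any age restrictions?",
)
print(sorted(req))   # → ['context', 'system_prompt', 'user_prompt']
```

In the deployed chatbot, this composed request was sent to Gemini, which generated the answer shown to the user.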
Chapter 4
Experimental Work
In this chapter, we will see a demo of the solutions built by my team. The first solution
is a simple front end, created using React, that has a file-upload option, a graph
displaying the cardiovascular risk estimation, and tables showing the contents of the
database. The second is the AI chatbot integrated with the Streamlit front end.
Hematology Report
1. Example 1:
• Question: Hello
• Answer: Hello! I’m your virtual health assistant from DQ Care, here to
help you navigate the world of digital healthcare and answer any questions
you have. What can I do for you today?
2. Example 2:
Do you have any other questions I can help you with today?
3. Example 3:
• Question: Are there any age restrictions for using a digital healthcare plat-
form?
• Answer: Hello! I’m your virtual health assistant from DQ Care, here to
help you navigate the world of digital healthcare.
Is there anything else you’d like to know about DQ Care or any other health
related concerns you have?
Chapter 5
Conclusion & Future Scope
5.1 Conclusion
The integration of risk prediction models within the application serves as a proactive
measure to empower users in managing their health effectively. By providing users
with insight into their percentage risk for diabetes complications, these models encourage
adherence to dietary plans and regular medical check-ups, thereby mitigating
potential risks and promoting overall well-being.

Furthermore, the implementation of data extraction techniques from medical reports
offers users the convenience of maintaining organized medical records within the
application. This functionality enables users to access comprehensive summaries and
insights directly from the app, facilitating informed decision-making and enhancing
collaboration between users and healthcare professionals. Consequently, both users and
healthcare providers benefit from streamlined access to essential information, thereby
simplifying processes and improving efficiency in healthcare management.

Moreover, the incorporation of an AI chatbot within the application augments user
experience by providing real-time assistance and guidance. The chatbot's ability to
address queries related to app functionalities and medical terminology not only enhances
user engagement but also fosters a greater understanding of health-related concepts. By
offering personalized support and educational resources, the chatbot contributes to an
enriched user experience, ultimately promoting health literacy and empowerment.

In essence, the combination of risk prediction models, data extraction capabilities,
and AI chatbot functionalities within the application underscores a commitment to
leveraging technology for the betterment of healthcare outcomes.
References
[1] P. Schober and T. R. Vetter, “Linear regression in medical research,” Anesthesia &
Analgesia, vol. 132, no. 1, pp. 108–109, Jan 2021.
[4] Białek, “CVD Risk Calculator – Cardiovascular Disease 10-year Risk,” https://
www.omnicalculator.com/health/cvd-risk.