Liver Cancer Prediction Using Machine Learning
Submitted by
Allangi Suresh
(21173-CM-004)
Submitted to
BONAFIDE CERTIFICATE
This is to certify that this project work entitled “Liver Cancer Prediction
Using Machine Learning” is the bona fide record of work done by Mr./Ms.
________________________________________ bearing the Pin No.
_______________ of the final year, along with batch mates, submitted
in partial fulfilment of the requirements for the award of the Diploma in
Computer Engineering to the State Board of Technical Education and
Training. The results embodied in this project report have not been
submitted to any other board, university, or institute for the award of a
diploma.
EXTERNAL EXAMINER
Table of Contents
1. Abstract…………………………..……………………………………………1
2. Acknowledgement………..………………………………………………2
3. Introduction………..…………………………………………………….3-7
3.1. Background and Motivation
3.2. Objectives
3.3. Overview of the Project
4. Literature Review……..……………………………………………....8-9
4.1. Introduction
4.2. Early Approaches
4.3. Machine Learning Techniques
4.4. Feature Selection and Importance
4.5. Data Imbalance and Bias
4.6. Performance Evaluation
4.7. Challenges and Future Directions
4.8. Conclusion
5. Data Description……………………………………………………10-11
5.1. Overview of the Project
5.2. Laboratory Test Result
5.3. Histopathological Data
5.4. Risk Factors
6. Methodology………….………………………………………………12-13
6.1. Data Collection and Acquisition
6.2. Data Preprocessing
6.3. Model Development
6.4. Model Evaluation
6.5. Model Validation
6.6. Clinical Translation
6.7. Continuous Improvement and Research
7. Languages, Technologies, and Machine Learning Tools Used…14-15
7.1. Python
7.2. Flask
7.3. Scikit-learn
7.4. NumPy and Pandas
7.5. Matplotlib and Seaborn
7.6. Pickle
7.7. Machine Learning Algorithms
8. Requirement Analysis………………………….………..………16-18
8.1. Objective
8.2. Functional Requirements
8.3. Non-Functional Requirements
8.4. Modules Used
8.5. Installation Instructions
9. Experimental Setup………………………….……………..…….19-20
9.1. Data Splitting
9.2. Model Training
9.3. Hyperparameter Tuning
10. Implementation..…………………………………………...21-30
10.1. Data Visualization
10.2. Data Pre-Processing
10.3. Algorithm
10.4. Model Training
10.5. Model Selection
10.6. Model Testing
10.7. Model Evaluating
11. Project Code……………………….…………………….……31-47
11.1. Backend of the Project
11.2. Frontend of the Project
11.3. Connection to the frontend and backend
12. References……………………………………………………………48
1. ABSTRACT
2. ACKNOWLEDGMENTS
We would like to express our sincere gratitude to Mr. Girish Reddy, our
mentor and guide, whose invaluable support and expertise have been
instrumental in the successful completion of this project on liver cancer
prediction. Mr. Girish Reddy's guidance, encouragement, and insightful
feedback have inspired us throughout the project journey, helping us
navigate challenges and achieve our goals effectively.
We would also like to extend our appreciation to our colleagues and peers
for their collaboration, encouragement, and constructive discussions, which
have enriched our understanding and contributed to the project's progress.
Last but not least, we would like to thank our families and friends for their
unwavering support, patience, and encouragement throughout this
endeavour. Their understanding and encouragement have been invaluable
in sustaining our motivation and drive to pursue excellence.
Thank you to everyone who has played a part in this project. Your support
and collaboration have been indispensable, and we are deeply grateful for
the opportunity to work together towards a common goal.
3. INTRODUCTION
exacerbate the impact of liver disease on vulnerable populations,
underscoring the need for equitable and effective diagnostic strategies.
Against this backdrop, the motivation for the liver disease prediction
project is clear:
3.2. Objectives:
The objectives of the liver disease prediction project are multifaceted
and aim to address key challenges in the diagnosis and management of
liver disease. The primary objectives are:
2. Incorporate Diverse Data Sources: The project aims to incorporate
diverse data sources, including patient medical history, laboratory test
results, imaging studies, lifestyle factors, and genetic information, to
enhance the predictive capabilities of the models. By leveraging multiple
data modalities, the models can capture the complex interplay of factors
contributing to liver disease risk.
8. Contribute to Medical Research and Knowledge: The project aims to
contribute to the advancement of medical research and knowledge in the
field of liver disease diagnosis and management. This involves
disseminating research findings through publications, presentations, and
open-access repositories, as well as fostering collaboration with other
research groups and institutions.
evaluated to identify the most effective approaches for predicting liver
disease risk.
5. Model Evaluation: The performance of the developed models is
evaluated using appropriate metrics such as accuracy, sensitivity,
specificity, precision, recall, and area under the receiver operating
characteristic curve (AUC-ROC). Model evaluation involves cross-
validation, external validation on independent datasets, and comparison
against baseline models to assess predictive performance and
generalizability.
6. Validation and Clinical Translation: Validated models are further
assessed in real-world clinical settings to evaluate their effectiveness in
identifying individuals at risk of liver disease. This involves collaboration
with healthcare providers and stakeholders to integrate the predictive
models into clinical practice, develop decision support tools, and evaluate
their impact on patient outcomes and healthcare delivery.
7. Continuous Improvement and Research: The liver disease prediction
project is an iterative process, with ongoing efforts to refine and improve
the predictive models based on feedback from clinical implementation and
new research findings. Continuous collaboration with medical
professionals, researchers, and data scientists ensures the project's
relevance, accuracy, and effectiveness in addressing the evolving challenges
of liver disease diagnosis and management.
4. LITERATURE REVIEW
4.1. Introduction:
Liver disease is a global health burden affecting millions of individuals
worldwide. Early detection and accurate prediction of liver disease play a
crucial role in effective treatment and management. In recent years, machine
learning techniques have gained attention for their potential in predicting liver
disease based on clinical and demographic data. This literature survey aims to
review the state-of-the-art research in liver disease prediction using machine
learning methods.
4.4. Feature Selection and Importance:
Feature selection techniques have been employed to identify the most
relevant predictors of liver disease. Methods such as recursive feature
elimination and feature importance analysis help prioritize informative features
and improve model interpretability.
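As a sketch of how recursive feature elimination works in practice (using scikit-learn with synthetic stand-in data rather than a real liver-disease dataset):

```python
# Recursive feature elimination (RFE): repeatedly fit an estimator and
# drop the weakest feature until the requested number remains.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: 10 features, only 4 of them informative.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=42)

selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4)
selector.fit(X, y)
print(selector.support_)   # boolean mask: which features were kept
print(selector.ranking_)   # rank 1 = selected; higher = eliminated earlier
```

The `support_` mask can then be used to subset the original feature matrix before model training.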
4.5. Data Imbalance and Bias:
Addressing data imbalance and bias is crucial in liver disease prediction,
as datasets often exhibit skewed class distributions and demographic disparities.
Techniques such as oversampling, undersampling, and bias correction
algorithms help mitigate these challenges and improve model generalization.
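A minimal oversampling sketch using `sklearn.utils.resample` on a toy frame (dedicated libraries such as imbalanced-learn provide richer methods like SMOTE):

```python
# Balance a skewed binary dataset by duplicating minority-class rows.
import pandas as pd
from sklearn.utils import resample

df = pd.DataFrame({"feature": range(12),
                   "label": [1] * 9 + [0] * 3})  # 9 majority, 3 minority

majority = df[df["label"] == 1]
minority = df[df["label"] == 0]

# Sample the minority class with replacement until the classes match.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=42)
balanced = pd.concat([majority, minority_up])
print(balanced["label"].value_counts())  # 9 rows of each class
```

Oversampling should be applied to the training split only, so that duplicated rows never leak into the test set.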
4.6. Performance Evaluation:
Performance evaluation metrics such as accuracy, sensitivity, specificity,
precision, recall, and area under the ROC curve (AUC-ROC) are commonly
used to assess the predictive performance of liver disease models. Cross-
validation and external validation on independent datasets are essential for
validating model robustness and generalization.
4.7. Challenges and Future Directions:
Despite significant advancements, several challenges remain in liver
disease prediction, including data heterogeneity, model interpretability, and
clinical applicability. Future research directions may involve integrating
multimodal data sources, enhancing model explainability, and conducting real-
world validation studies to translate research findings into clinical practice.
4.8. Conclusion:
In conclusion, machine learning techniques offer promising avenues for
liver disease prediction, providing valuable insights for early diagnosis and
personalized healthcare interventions. Continued research efforts in this field
are essential to develop reliable and interpretable models for improving patient
outcomes and reducing the global burden of liver disease.
5. Dataset Description
5.4. Risk Factors:
- Alcohol Consumption: Self-reported alcohol consumption habits,
including frequency and quantity of alcohol intake.
- Smoking Status: Self-reported smoking status (current smoker, former
smoker, non-smoker).
6. Methodology
- Conduct cross-validation to estimate the generalization performance of the
models and identify potential sources of overfitting or underfitting.
- Compare the performance of different models and select the best-performing
model(s) for further evaluation.
6.5. Model Validation:
- Validate the selected model(s) on independent datasets or through external
validation studies to assess their robustness and generalizability.
- Collaborate with healthcare providers and stakeholders to evaluate the
clinical utility and real-world effectiveness of the predictive models in
identifying individuals at risk of liver disease.
6.6. Clinical Translation:
- Integrate the validated predictive models into clinical practice by developing
decision support tools, electronic health record (EHR) systems, or mobile
applications.
- Provide training and education to healthcare professionals on the use of the
predictive models for early detection and risk assessment of liver disease.
- Monitor the impact of the predictive models on patient outcomes, healthcare
delivery, and resource utilization, and iterate on the models based on feedback
and performance metrics.
6.7. Continuous Improvement and Research:
- Continuously monitor and update the predictive models based on new data,
research findings, and emerging technologies.
- Foster collaboration with medical professionals, researchers, and data
scientists to advance scientific understanding of liver disease pathology, risk
factors, and predictive modeling techniques.
- Disseminate research findings through publications, presentations, and
knowledge sharing platforms to contribute to the broader scientific community
and promote further research in the field.
7. Languages, Technologies, and Machine Learning
Tools Used
7.1. Python:
Python served as the foundational programming language for the
project, offering versatility, extensive libraries, and ease of use for various
tasks including data preprocessing, model training, and web development.
7.2. Flask:
Flask, a lightweight web framework for Python, was instrumental in
constructing the project's website. Flask facilitated URL routing, HTTP
request handling, and template rendering, enabling seamless integration of
machine learning functionalities into the web application.
7.3. Scikit-learn:
Scikit-learn, a leading machine learning library for Python, played a
pivotal role in training the predictive model for Liver cancer classification.
It provided a wide array of algorithms, including Support Vector Machines
(SVM), Logistic Regression (LR), Decision Trees (DT), and Random Forests
(RF), allowing for comprehensive exploration and selection of the most
suitable model for the task at hand.
7.6. Pickle:
Pickle, a Python module for object serialization, was utilized to save
the trained machine learning model to a file. This facilitated persistent
storage of the model, allowing for efficient loading within the web
application. Pickle ensured seamless integration of the trained model with
the Flask framework, enabling real-time predictions on user input.
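The save/load round trip can be sketched as follows (a small stand-in model and file name are used here, not the project's actual ones):

```python
import pickle
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Serialize the trained model to disk...
with open("demo_model.pickle", "wb") as f:
    pickle.dump(model, f)

# ...and reload it later, e.g. inside the Flask app, for predictions.
with open("demo_model.pickle", "rb") as f:
    restored = pickle.load(f)

print((restored.predict(X) == model.predict(X)).all())  # True
```

Because pickle can execute arbitrary code on load, model files should only ever be loaded from trusted sources.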
8. Requirement Analysis
8.1. Objective:
4. Model Persistence: The system should allow for the trained machine
learning model to be persisted and loaded efficiently for real-time
predictions.
5. Verify that Python and pip are installed properly:
```
python --version
pip --version
```
6. If Python and pip are installed correctly, you can use pip to install
additional packages as needed for your project. For example:
```
pip install flask scikit-learn numpy pandas matplotlib seaborn
```
This command installs the required modules for the Liver cancer
prediction project, including Flask for web development and Scikit-learn,
NumPy, Pandas, Matplotlib, and Seaborn for machine learning and data
analysis functionalities.
9. Experimental Setup
1. Training Set: The majority of the dataset (e.g., 70% or 80%) will be
allocated to the training set. This portion of the data will be used to train
the machine learning models.
2. Testing Set: The remaining portion of the dataset (e.g., 30% or 20%)
will be reserved for the testing set. This independent subset of the data will
be used to evaluate the trained models' performance and assess their
generalization ability on unseen data.
2. Training: Fit the initialized models to the training data, allowing them to
learn patterns and relationships between input features and target labels
(benign or malignant).
best results. In this project, hyperparameter tuning will be performed using
techniques such as grid search or random search to explore the
hyperparameter space and identify the optimal combination of
hyperparameters for each model.
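A grid search over SVM hyperparameters, for instance, can be sketched like this (synthetic data and an illustrative grid, not the project's exact settings):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=8, random_state=1)

# Every combination in the grid is evaluated with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10], "kernel": ["rbf", "linear"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # best hyperparameter combination found
print(search.best_score_)   # its mean cross-validated accuracy
```

`RandomizedSearchCV` follows the same interface but samples the grid randomly, which scales better to large hyperparameter spaces.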
10. IMPLEMENTATION
3. Box Plots: Box plots can be used to visualize the distribution of a feature
across different classes (benign vs. malignant). Box plots provide
information about the median, quartiles, and outliers, allowing for
comparisons between classes.
methods (e.g., decision tree feature importance).
10.2. Data Pre-Processing:
1. Data Cleaning:
- Handling Missing Values: Identify and handle missing values in the
dataset. Options include imputation (e.g., replacing missing values with the
mean, median, or mode), deletion of rows or columns with missing values,
or using algorithms that can handle missing values directly.
- Handling Outliers: Detect and address outliers in the dataset.
Outliers can skew statistical analyses and model predictions. Techniques
such as Z-score normalization or winsorization can be used to handle
outliers.
2. Data Transformation:
- Feature Scaling: Scale the features to a similar range to ensure that
no single feature dominates the others. Common techniques include Min-
Max scaling and Standardization (Z-score normalization).
- Encoding Categorical Variables: Convert categorical variables into
numerical representations that can be used by machine learning
algorithms. This can be done using techniques such as one-hot encoding or
label encoding.
- Feature Engineering: Create new features or transform existing
features to capture additional information that may improve model
performance. This could involve deriving new features from existing ones
or applying mathematical transformations.
3. Feature Selection:
- Select the most relevant features that are informative for predicting
Liver cancer outcomes. Feature selection techniques such as univariate
feature selection, recursive feature elimination, or feature importance from
tree-based models can be used to identify the most important features.
- Dimensionality Reduction: Reduce the dimensionality of the dataset
by removing irrelevant or redundant features. Techniques such as Principal
Component Analysis (PCA) or Singular Value Decomposition (SVD) can be
used for dimensionality reduction.
4. Data Splitting:
- Split the preprocessed dataset into training and testing sets. The training
set is used to train the machine learning model, while the testing set is used
to evaluate its performance. Typically, a random split (e.g., 80% training,
20% testing) is used, ensuring that the distribution of classes is preserved
in both sets.
5. Normalization:
- Normalize the data to ensure that all features have a similar scale. This is
particularly important for distance-based algorithms like K-Nearest
Neighbors (KNN) or Support Vector Machines (SVM).
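The steps above can be sketched as a single scikit-learn pipeline on a toy frame (the column names follow the dataset in Section 11; the values are made up):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Age":     [65, 40, np.nan, 55, 30, 70, 48, 62],
    "Albumin": [3.3, 4.0, 3.9, np.nan, 4.4, 2.9, 3.7, 4.1],
    "Dataset": [1, 2, 1, 1, 2, 1, 2, 1],   # target label
})

X, y = df.drop(columns="Dataset"), df["Dataset"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y)  # preserve class ratio

prep = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # z-score normalization
])
X_train_p = prep.fit_transform(X_train)  # fit on training data only
X_test_p = prep.transform(X_test)        # reuse training statistics
print(X_train_p.shape, X_test_p.shape)   # (6, 2) (2, 2)
```

Fitting the imputer and scaler on the training split alone, then reusing them on the test split, prevents information from the test set leaking into preprocessing.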
10.3. Algorithm:
which can be advantageous for certain applications.
10.5. Model Selection:
1. Evaluation Metrics:
- Before selecting a model, it's essential to define evaluation metrics
that reflect the desired performance criteria. Common metrics for binary
classification tasks like Liver cancer prediction include accuracy, precision,
recall, F1-score, and area under the ROC curve (AUC-ROC). These metrics
provide insights into different aspects of the model's performance, such as
its ability to correctly classify benign and malignant instances, minimize
false positives, and balance precision and recall.
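These metrics can be computed directly with scikit-learn; the labels below are toy values, with 1 marking the positive (malignant) class:

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true  = [1, 0, 1, 1, 0, 1, 0, 0]                   # ground-truth labels
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.7, 0.6, 0.1]   # predicted probabilities

print("accuracy ", accuracy_score(y_true, y_pred))   # 0.75
print("precision", precision_score(y_true, y_pred))  # 3 TP / (3 TP + 1 FP) = 0.75
print("recall   ", recall_score(y_true, y_pred))     # 3 TP / (3 TP + 1 FN) = 0.75
print("f1       ", f1_score(y_true, y_pred))         # 0.75
print("auc      ", roc_auc_score(y_true, y_score))   # 0.9375
```

Note that AUC-ROC is computed from the continuous scores, not the thresholded predictions, which is why it can differ from the other metrics.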
2. Cross-Validation (Optional):
- Cross-validation can be employed to assess the generalization
performance of different models and mitigate overfitting. In k-fold cross-
validation, the training dataset is divided into k subsets (folds), and the
model is trained and evaluated k times, each time using a different fold as
the validation set and the remaining folds as the training set. The average
performance across folds provides a more robust estimate of the model's
performance and helps identify models that generalize well to unseen data.
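With scikit-learn, k-fold cross-validation reduces to a single call (synthetic data and a stand-in classifier are used here for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=150, n_features=6, random_state=7)

# k = 5: train on 4 folds, validate on the held-out fold, rotate 5 times.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # averaged estimate of generalization accuracy
```

The spread of the per-fold scores also gives a rough sense of how sensitive the model is to the particular split.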
the other hand, may generalize better but may not capture complex
patterns in the data as effectively. Strike a balance between model
complexity and performance based on the specific requirements of the
Liver cancer prediction task.
5. Interpretability:
- Consider the interpretability of the models, especially in healthcare
settings where interpretability and transparency are crucial for gaining
trust from healthcare professionals and patients. Simple models like
logistic regression or decision trees are often more interpretable than
complex models like neural networks or ensemble methods. Choose a
model that provides a good balance between performance and
interpretability, depending on the stakeholders' needs.
10.6. Model Testing:
Model testing assesses whether the trained models generalize well to new
instances and provide accurate classifications of liver masses as benign or
malignant. Here's a detailed overview of the model testing process:
• Prediction:
• Performance Evaluation:
10.7. Model Evaluating:
1. Data Preprocessing:
- Feature Selection: Identify and select relevant features that contribute most to
the prediction task.
2. Model Selection:
3. Evaluation Metrics:
4. Cross-Validation:
5. Model Evaluation:
- Train-Test Split: Divide the dataset into training and testing sets to evaluate
the model's generalization ability.
6. Interpretation:
7. Model Deployment:
- Save the trained model to a file (e.g., using pickle or joblib) for future use.
- Integrate the model into a production environment for real-world predictions.
- Handle any concept drift or changes in data distribution that may affect model
performance.
By following these steps, the liver disease prediction model can be
evaluated for accuracy and reliability before real-world use.
11. Project Code
11.1. Backend of the Project:
- Platform: Jupyter
# Imports and data load (the dataset file name is assumed; the columns
# match the Indian Liver Patient Dataset)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("indian_liver_patient.csv")
df.shape
(583, 11)
df.columns
Index(['Age', 'Gender', 'Total_Bilirubin', 'Direct_Bilirubin',
'Alkaline_Phosphotase', 'Alamine_Aminotransferase',
'Aspartate_Aminotransferase', 'Total_Protiens', 'Albumin',
'Albumin_and_Globulin_Ratio', 'Dataset'],
dtype='object')
df.describe()
df.duplicated().sum()
df.drop_duplicates(inplace=True)
df.duplicated().sum()
df.isna().sum()
Age 0
Gender 0
Total_Bilirubin 0
Direct_Bilirubin 0
Alkaline_Phosphotase 0
Alamine_Aminotransferase 0
Aspartate_Aminotransferase 0
Total_Protiens 0
Albumin 0
Albumin_and_Globulin_Ratio 4
Dataset 0
dtype: int64
df[df['Albumin_and_Globulin_Ratio'].isna()]
df['Albumin_and_Globulin_Ratio'].fillna(df['Albumin_and_Globulin_Ratio'].median(), inplace=True)
df.isna().sum()
Age 0
Gender 0
Total_Bilirubin 0
Direct_Bilirubin 0
Alkaline_Phosphotase 0
Alamine_Aminotransferase 0
Aspartate_Aminotransferase 0
Total_Protiens 0
Albumin 0
Albumin_and_Globulin_Ratio 0
Dataset 0
dtype: int64
df.head()
df['Gender'].value_counts()
Gender
Male 430
Female 140
Name: count, dtype: int64
df['Gender']=df['Gender'].map({'Female':0,'Male':1})
df.head()
sns.countplot(x='Dataset', data=df)
plt.savefig("pie.png")
plt.figure(figsize = (20,20))
sns.heatmap(df.corr(), annot = True)
plt.savefig("heatmap.png")  # separate file name so the count plot above is not overwritten
df['Dataset'].value_counts()
Dataset
1 406
2 164
Name: count, dtype: int64
X=df.drop(['Dataset'],axis=1)
X.head()
y=df['Dataset']
y.sample(5)
14 1
347 1
468 1
295 1
243 1
Name: Dataset, dtype: int64
Data Preprocessing
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.25, random_state=42)
## standardize the dataset
from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()
X_train=scaler.fit_transform(X_train)
X_test=scaler.transform(X_test)  # transform only: reuse the training-set statistics
X_train.shape
(427, 10)
Model Creation
Linear Regression
accuracy_lst=list()
from sklearn.linear_model import LinearRegression
model_ln =LinearRegression()
model_ln.fit(X_train,y_train)
pred_ln=model_ln.predict(X_test)
accuracy=model_ln.score(X_test,y_test)  # R^2 score: linear regression is ill-suited to this classification task
accuracy
0.07033305757604935
Logistic Regression
from sklearn.linear_model import LogisticRegression
model_lr = LogisticRegression(random_state=51, C=1, penalty='l1', solver='liblinear')
model_lr.fit(X_train,y_train)
pred_lr=model_lr.predict(X_test)
model_lr.score(X_test,y_test)
from sklearn.metrics import accuracy_score
accuracy=accuracy_score(y_test,pred_lr)
accuracy
0.6153846153846154
accuracy_lst.append(accuracy*100)
Decision Tree
from sklearn.tree import DecisionTreeClassifier
model_dtr = DecisionTreeClassifier()
model_dtr.fit(X_train,y_train)
pred_dtr=model_dtr.predict(X_test)
accuracy=model_dtr.score(X_test,y_test)
accuracy
0.6153846153846154
accuracy_lst.append(accuracy*100)
SVM
from sklearn.svm import SVC
model_svc=SVC(kernel='rbf',random_state=0)
model_svc.fit(X_train,y_train)
pred_svc=model_svc.predict(X_test)
accuracy=model_svc.score(X_test,y_test)
accuracy
0.7272727272727273
accuracy_lst.append(accuracy*100)
Random Forest
from sklearn.ensemble import RandomForestClassifier
model_rfc=RandomForestClassifier()
model_rfc.fit(X_train,y_train)
pred_rfc=model_rfc.predict(X_test)
accuracy=model_rfc.score(X_test,y_test)
accuracy
0.7412587412587412
accuracy_lst.append(accuracy*100)
KNeighbours Classifier
from sklearn.neighbors import KNeighborsClassifier
model_knn=KNeighborsClassifier()
model_knn.fit(X_train,y_train)
accuracy=model_knn.score(X_test,y_test)
accuracy
0.6713286713286714
accuracy_lst.append(accuracy*100)
Naive Bayes
from sklearn.naive_bayes import GaussianNB
model_nb=GaussianNB()
model_nb.fit(X_train,y_train)
accuracy=model_nb.score(X_test,y_test)
accuracy
0.5454545454545454
accuracy_lst.append(accuracy*100)
Confusion Matrix
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test,pred_rfc))
[[92 11]
 [26 14]]
Accuracy Graph
accuracy_lst
[74.82517482517483,
61.53846153846154,
72.72727272727273,
74.12587412587412,
67.13286713286713,
54.54545454545454]
import pickle
#pickle.dump(model_rfc, open('model.pkl','wb'))
with open('model.pickle','wb') as f:
pickle.dump(model_rfc,f)
with open('model.pickle','rb') as f:
    model=pickle.load(f)
pred=model.predict(X_test)
from sklearn.metrics import accuracy_score
accuracy=accuracy_score(y_test,pred)
accuracy
0.7412587412587412
11.2. Frontend of the Project:
-Languages: HTML, CSS
-Platform: VS Code
Liver.html
<!DOCTYPE html>
<html>
<head>
<title>liver cancer</title>
<style>
@import url('https://fonts.googleapis.com/css2?family=Poppins:wght@200&family=Ubuntu:wght@300&display=swap');
*{
padding: 0;
margin: 0;
box-sizing: border-box;
font-family: 'Poppins',sans-serif;
outline: none;
user-select: none;
}
body{
padding: 0 50px;
background-color: rgb(0, 255, 170);
}
.header{
display: flex;
justify-content: center;
align-items: center;
margin: 0 auto;
padding: 40px 0;
}
.header h1{
font-family: 'Ubuntu', sans-serif;
letter-spacing: 4px;
font-size: 50px;
font-weight: 700;
}
.row{
display: flex;
align-items: center;
justify-content: space-between;
width: 100%;
padding: 10px 0;
margin-bottom: 20px;
}
input{
border: none;
background-color: white;
border: none;
color: #000;
width: 100%;
margin: 0 10px;
padding: 10px 10px;
font-size: 15px;
font-weight: 700;
box-shadow: -8px -8px 15px rgba(57, 56, 56, 0.236), 5px 5px 15px rgba(17, 17, 17, 0.489);
border-radius: 6px;
outline: none;
}
input::placeholder{
color: #000;
}
.footer{
display: flex;
align-items: center;
justify-content: center;
margin: 0 auto;
}
.button{
font-size: 22px;
color: #fff;
background: #000;
width: 250px;
height: 60px;
cursor: pointer;
border-radius: 6px;
box-shadow: -8px -8px 15px rgba(57, 56, 56, 0.236), 5px 5px 15px rgba(17, 17, 17, 0.489);
outline: none;
border: none;
display: grid;
place-content: center;
}
.loader{
pointer-events: none;
width: 30px;
height: 30px;
border-radius: 50%;
border: 3px solid transparent;
border-top-color: #fff;
animation: an1 1s ease infinite;
}
</style>
</head>
<body>
<div class="header">
<h1>Liver Cancer</h1>
</div>
<div class="body">
<form action="{{ url_for('predict') }}"
method="post">
<div class="row">
<input type="text" name="Age"
placeholder="Age" required="required">
<input type="text" name="Gender"
placeholder="Gender" required="required">
</div>
<div class="row">
<input type="text"
name="Total_Bilirubin" placeholder="Total Bilirubin"
required="required">
<input type="text"
name="Direct_Bilirubin" placeholder="Direct Bilirubin"
required="required">
<input type="text"
name="Alkaline_Phosphotase" placeholder="Alkaline
Phosphotase" required="required">
</div>
<div class="row">
<input type="text"
name="Alamine_Aminotransferase" placeholder="Alamine
Aminotransferase" required="required">
<input type="text"
name="Aspartate_Aminotransferase" placeholder="Aspartate
Aminotransferase" required="required">
<input type="text"
name="Total_Protiens" placeholder="Total Protiens"
required="required">
</div>
<div class="row">
<input type="text" name="Albumin"
placeholder="Albumin" required="required">
<input type="text"
name="Albumin_and_Globulin_Ratio" placeholder="Albumin and
Globulin Ratio" required="required">
</div>
<div class="footer">
<button class="button">Submit</button>
</div>
</form>
</div>
</body>
</html>
Predict.html
<!DOCTYPE html>
<html>
<head>
<title>results</title>
<style>
*{
padding: 0;
margin: 0;
box-sizing: border-box;
font-family: 'Poppins',sans-serif;
outline: none;
user-select: none;
}
body{
background-color: rgb(0, 255, 170);
height: 95vh;
display: grid;
place-content: center;
}
.predict{
display: flex;
align-items: center;
justify-content: center;
margin: 0 auto;
font-size: 30px;
}
</style>
</head>
<body>
<div class="predict">
{% if output == 1 %}
<p>Positive liver cancer</p>
{% else %}
<p>Negative liver cancer</p>
{% endif %}
</div>
</body>
</html>
11.3. Connection to the frontend and backend:
from flask import Flask, render_template, request
import pickle
import numpy as nu
import pandas as pa

app = Flask(__name__)
# load the model pickled during training (the file name must match the
# one used by pickle.dump in the backend notebook)
model = pickle.load(open('model.pickle','rb'))
@app.route("/")
def test():
return render_template("liver.html")
@app.route("/predict", methods=['POST','GET'])
def predict():
    # note: the model was trained on StandardScaler-scaled features, so the
    # same fitted scaler should also be applied to these raw inputs
    input_features = [float(x) for x in request.form.values()]
    features_values = [nu.array(input_features)]
    print(input_features)
    features_names = ['Age','Gender','Total_Bilirubin','Direct_Bilirubin',
                      'Alkaline_Phosphotase','Alamine_Aminotransferase',
                      'Aspartate_Aminotransferase','Total_Protiens',
                      'Albumin','Albumin_and_Globulin_Ratio']
    df = pa.DataFrame(features_values, columns=features_names)
    output = model.predict(df)[0]  # scalar class label
    print(output)
    return render_template("predict.html", output=output)
if __name__=='__main__':
    app.run(debug=True)
Output-1:
Output-2:
12. References
The following references and resources were used for the liver disease
prediction project:
1. Datasets:
- The UCI Machine Learning Repository hosts several datasets related to liver
disease prediction, such as the "Liver Disorders Dataset" and the "Indian Liver
Patient Dataset".
- Kaggle is another platform where you can find datasets related to liver
disease prediction.
2. Research Papers:
- "Prediction of liver disease using machine learning algorithms" by Abdar,
M. et al. (2019).
- "Prediction of liver disease using ensemble classification" by Shenoy, P. et
al. (2018).
- "Predictive modeling for diagnosis of liver disorder using machine learning
techniques" by Srivastava, S. et al. (2019).
3. Books:
- "Introduction to Machine Learning with Python" by Andreas C. Müller and
Sarah Guido.
- "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow"
by Aurélien Géron.
4. Tutorials and Courses:
- Coursera and Udemy offer various courses on machine learning and data
science, which often include projects and case studies related to medical
prediction tasks.
- YouTube channels like sentdex and Data School provide tutorials on
implementing machine learning algorithms in Python for medical prediction
tasks.
5. GitHub Repositories:
- Search GitHub for repositories related to liver disease prediction or
healthcare analytics. You may find code implementations, datasets, and project
ideas.
6. Online Communities:
- Join communities such as Stack Overflow, Reddit (e.g., r/MachineLearning),
and LinkedIn groups related to data science and machine learning to ask
questions, share insights, and learn from others' experience.