OBJECTIVE
1. Design a complete solution that demonstrates end-to-end pipeline development and
the management of an AIML project
2. Develop an understanding of all stages of a machine learning project lifecycle
3. Demonstrate an understanding of the challenges encountered during project
development and provide ways to tackle them
4. Showcase an understanding of software engineering best practices while developing the
project
REFERENCES
1. https://arxiv.org/abs/1901.11210
2. https://ieeexplore.ieee.org/abstract/document/9342702
3. https://arxiv.org/abs/1711.05225
4. https://www.codekarle.com/
5. https://medium.com/@kethan.pothula/amazon-system-design-and-architecture-d787a6572f35
6. https://medium.com/double-pointer/system-design-interview-amazon-flipkart-ebay-or-similar-e-commerce-applications-35a0bc764421
PROBLEM STATEMENT
Develop an application that takes a chest X-ray as input from a website and predicts lung
pathologies. Develop a data pipeline that includes all of the following stages of the Machine
Learning Project Life Cycle –
Modules Details
Different technical modules are elaborated as follows –
1. Project Requirement Documentation
2. Data
a. Data Ingestion
b. Data Preparation
c. Data sources:
https://www.kaggle.com/ashishpatel26/chexnet-radiologist-level-pneumonia-detection
https://github.com/ieee8023/covid-chestxray-dataset
d. Database: https://www.orthanc-server.com/
3. AIML Module
a. Data pre-processing
b. Models
i. VGG16 -> COVID vs. non-COVID classification
ii. VGG16 -> Healthy vs. unhealthy patient classification
iii. CheXNet model
iv. Technologies: Python, TensorFlow, Keras
c. Model Training
d. Model Deployment
e. Model Prediction
4. Front-End & Website
a. Technologies: JavaScript, HTML, CSS
b. Sample website: https://mlmed.org/tools/xray/
c. https://kinsta.com/blog/shopify-alternatives/
5. Back-end
a. Technologies: Python, MySQL, OpenPACS (Orthanc)
i. https://www.orthanc-server.com/
6. Integration and Development
a. Technologies: Docker, Nginx, AWS EC2 small instance
7. Security
8. Testing
a. Write test cases
9. Documentation
a. Project documentation - detailed
b. Concise technical documentation (Sample: https://arxiv.org/abs/1901.11210)
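The Data Ingestion step in module 2 could be sketched as follows. This is a minimal illustration only, under stated assumptions: the brief names MySQL, but the sketch uses Python's built-in sqlite3 as a stand-in so that it runs without a server; the raw_data schema (sha256, path, label), the folder layout root/<label>/<image>, the file suffixes, and the function names init_db and ingest_directory are all hypothetical.

```python
import hashlib
import sqlite3
from pathlib import Path

# Assumed image file types; adjust to whatever the data sources actually contain.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".dcm"}


def init_db(conn: sqlite3.Connection) -> None:
    """Create the raw_data table if it is missing (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS raw_data (
            sha256 TEXT PRIMARY KEY,  -- content hash, used to skip duplicates
            path   TEXT NOT NULL,     -- where the image file lives on disk
            label  TEXT NOT NULL      -- taken from the parent folder name
        )
        """
    )
    conn.commit()


def ingest_directory(conn: sqlite3.Connection, root: Path) -> int:
    """Register every image under root/<label>/ and return the number of new rows."""
    inserted = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        cur = conn.execute(
            "INSERT OR IGNORE INTO raw_data (sha256, path, label) VALUES (?, ?, ?)",
            (digest, str(path), path.parent.name),
        )
        inserted += cur.rowcount  # 1 for a new row, 0 if the hash already exists
    conn.commit()
    return inserted
```

Keying the table on a content hash makes the ingestion safe to re-run: a second pass over the same folder inserts nothing. In the actual project the same statements would go through a MySQL connector, and DICOM images would more likely be stored in Orthanc than on the local file system.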
MILESTONES
There are four milestones –
1) Week 1 –
a) Participants are expected to complete documentation and architectural design by
creating the UML diagrams.
b) Prepare the local environment with the right tools and installations
c) Ingest the data sources and build the data-ingestion-service, which populates “raw_data”
2) Week 2 –
a) Set up the model-training-service project
b) Set up appropriate connectors to load data from MySQL and Orthanc
c) Identify the cleaning and preprocessing steps needed to transform the raw data, and
complete the data preparation step.
3) Week 3 –
a) Split the preprocessed data into training and test sets
b) Complete model training and deployment
c) Serialize the re-trained model.
d) Push this model to the model registry (MLflow) along with its hyperparameters
4) Week 4 –
a) Dockerize all the projects by adding appropriate Dockerfile
b) Test the built application by loading different chest X-rays and validating the output
c) Test quantitatively using accuracy measures
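The train/test split in Week 3 and the quantitative test in Week 4 could be sketched as follows. This is an illustration under assumptions, not a prescribed method: the brief does not say how the data should be split or which accuracy measures to report, so the stratified split, the 20% test fraction, the choice of accuracy, precision and recall, and the function names stratified_split and binary_metrics are all hypothetical. Only the Python standard library is used.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one test sample per class (assumption for tiny classes).
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Stratifying matters here because COVID-positive images are likely to be much rarer than the other classes, and recall on the positive class is usually worth reporting next to accuracy for a screening task, since a model that predicts "healthy" for everything can still score a high accuracy on imbalanced data.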
Team:
Project manager: Varun Seshadrinathan