
Predictive Analytics

Case Study by:

Datascience | Training | Visualization


Klaymatrix

A company, “XYZ”, produces industrial chemicals and supplies them to clients across the globe. XYZ wants to
analyse data related to one of the industrial chemicals it produces, “TURBO-CHEM”. Every year, tonnes of
TURBO-CHEM are wasted because they fail to meet stringent quality requirements. The stakes are high in
terms of raw material and production cost.

A proposed solution is to predict beforehand whether a lot would be rejected, based on the composition of its
key components. If XYZ can do this before quality checks, it can adjust the composition and save substantially
on cost. To carry out this exercise, a model needs to be built that can accurately predict whether a
sample would be accepted, based on some key parameters.

PROBLEM STATEMENT


XYZ produces the industrial chemical “TURBO-CHEM” at its 4 plant locations: R1, R2, R3 and R4. Five key
components are produced during the production cycle, namely z897, z678, z143, z435 and z987. (There are
other components as well, mostly organic compounds, which have no bearing on the final product.)

Further, different “types” of “TURBO-CHEM” are produced: Type zz48, Type zz49, Type zz50, Type zz51,
Type zz52, Type zz53, Type zz54, Type zz55, Type zz56, Type zz57, Type zz58 and Type zz59. These types
depend on the composition of the 5 key components and the end use of the product.

Every day a sample is taken from a lot of 1000 litres and, based on the composition of the 5 key components,
the lot is rated as “OK” or “NOT OK”.

If the sample is NOT OK, the entire lot is rejected.


DESCRIPTION


○ The case study data is in two parts:

○ Train_Dataset.csv

○ Test_Dataset.csv

○ Each data set looks as shown in the snapshot below; each row represents a lot.

DATA
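As a rough illustration of reading one lot per row, the sketch below parses a CSV with Python's standard `csv` module. The snapshot's column headers are not reproduced in this brief, so the names `Plant`, `Type` and `Status` are assumptions, as is the use of the five component codes as column names. The two sample rows are made up for the example and are not taken from the case-study data.

```python
import csv
import io

# Assumed column names: the brief does not list the actual CSV headers.
COMPONENTS = ["z897", "z678", "z143", "z435", "z987"]
REQUIRED = ["Plant", "Type", *COMPONENTS, "Status"]


def load_lots(handle):
    """Read one lot per row and convert the component columns to float."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    lots = []
    for row in reader:
        for name in COMPONENTS:
            row[name] = float(row[name])
        lots.append(row)
    return lots


# Made-up rows for illustration only, not real case-study values.
sample = io.StringIO(
    "Plant,Type,z897,z678,z143,z435,z987,Status\n"
    "R1,zz48,1.2,0.4,3.1,0.9,2.2,OK\n"
    "R3,zz52,1.9,0.7,2.8,1.4,2.0,NOT OK\n"
)
lots = load_lots(sample)
print(len(lots), lots[0]["Plant"], lots[1]["Status"])
```

With the real files, the same function could be called as `load_lots(open("Train_Dataset.csv", newline=""))`, after adjusting `REQUIRED` to match the actual headers.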


With the given parameters, predict as accurately as possible whether the “Status” would be “OK” or “Not OK”.

OBJECTIVE
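Since the objective is a two-class prediction, the “Status” label usually has to be turned into a number first. The sketch below is one possible encoding, not a prescribed one. It treats a rejected lot as the positive class (1), which is an assumption, and it normalises case and spacing because this brief writes the label both as “NOT OK” and as “Not OK”. The function name `encode_status` is hypothetical.

```python
def encode_status(label):
    """Map a Status string to 1 (rejected, "NOT OK") or 0 (accepted, "OK")."""
    # Upper-case and collapse whitespace so "Not OK", "NOT OK" and " not  ok " agree.
    key = " ".join(label.strip().upper().split())
    if key == "OK":
        return 0
    if key == "NOT OK":
        return 1
    raise ValueError(f"unexpected Status value: {label!r}")


encoded = [encode_status(s) for s in ["OK", "Not OK", "NOT OK", " ok "]]
print(encoded)  # [0, 1, 1, 0]
```

Raising on unknown values, rather than silently mapping them to a class, makes labelling problems in the data visible early.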


Evaluation will be based on the following:

○ Accuracy and logic

○ Performance on the evaluation data

○ Performance on the confusion matrix and ROC curve

○ EDA

○ Data visualization

○ Overall presentation

EVALUATION
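To make the confusion-matrix and ROC criteria concrete, the sketch below computes a binary confusion matrix, the accuracy, and the area under the ROC curve in plain Python. The AUC uses the pairwise-ranking definition: the fraction of (positive, negative) pairs in which the positive receives the higher score, with ties counted as half. The labels, scores and 0.5 threshold are made up for illustration, and “NOT OK” is assumed to be the positive class (1).

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 is the positive class."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted scores, for illustration only.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
auc = roc_auc(y_true, scores)
print(tn, fp, fn, tp, round(accuracy, 3), round(auc, 3))
```

The pairwise AUC is quadratic in the number of lots, which is fine for a sketch. A library implementation would be the more usual choice on the full evaluation data.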

Thank You
https://www.linkedin.com/company/klaymatrix
support@klaymatrix.com

http://www.klaymatrix.com/
