
Reduce maintenance cost through predictive techniques

Data Basics
• 125k records, 12 fields (9 attributes, date, device ID, failure tag)
• No missing values
• Event rate of 8 basis points (0.08% of records are failures)
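
One basis point is 0.01%, or 1 in 10,000. A minimal sketch of the conversion follows. The function name event_rate_bps is an assumption, and the counts in the usage line are illustrative round numbers, not figures taken from the data.

```python
def event_rate_bps(n_events: int, n_records: int) -> float:
    """Return the event rate in basis points (1 bp = 0.01% = 1/10,000)."""
    if n_records <= 0:
        raise ValueError("n_records must be positive")
    return n_events / n_records * 10_000


# Illustrative counts only: 100 failure records out of 125,000 records.
rate = event_rate_bps(100, 125_000)
```

With these illustrative counts the rate comes out at 8 basis points, matching the order of magnitude quoted above.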

Some assumptions
• date is the date on which the data was recorded
Some observations
• Data is unique by device ID + date (1 exception)
• Once a device fails, it doesn’t come back into operation (5 exceptions)
• The devices have a daily entry (in 99.82% of cases)
• Close to 1,200 devices; 9.1% of the devices failed in operation
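
Checks like these can be sketched in plain Python. The field names device, date and failure are assumptions, as is the definition used for "daily entry" (share of consecutive records per device exactly one day apart); the deck does not say how its figures were computed. The rows are toy data.

```python
from collections import Counter, defaultdict
from datetime import date


def duplicate_keys(rows):
    """Return (device, date) keys that occur more than once."""
    counts = Counter((r["device"], r["date"]) for r in rows)
    return [key for key, n in counts.items() if n > 1]


def devices_seen_after_failure(rows):
    """Return devices that have a record dated after their first failure."""
    first_failure = {}
    for r in rows:
        if r["failure"] == 1:
            known = first_failure.get(r["device"])
            if known is None or r["date"] < known:
                first_failure[r["device"]] = r["date"]
    return sorted({
        r["device"] for r in rows
        if r["device"] in first_failure and r["date"] > first_failure[r["device"]]
    })


def daily_gap_share(rows):
    """Share of consecutive records per device that are exactly one day apart."""
    dates = defaultdict(set)
    for r in rows:
        dates[r["device"]].add(r["date"])
    gaps = []
    for ds in dates.values():
        ordered = sorted(ds)
        gaps.extend((b - a).days for a, b in zip(ordered, ordered[1:]))
    return sum(g == 1 for g in gaps) / len(gaps) if gaps else 1.0


def failed_device_share(rows):
    """Share of distinct devices with at least one failure record."""
    devices = {r["device"] for r in rows}
    failed = {r["device"] for r in rows if r["failure"] == 1}
    return len(failed) / len(devices)


# Toy rows (not from the data): A is seen after failing, B has a duplicate key.
rows = [
    {"device": "A", "date": date(2015, 1, 1), "failure": 0},
    {"device": "A", "date": date(2015, 1, 2), "failure": 1},
    {"device": "A", "date": date(2015, 1, 3), "failure": 0},
    {"device": "B", "date": date(2015, 1, 1), "failure": 0},
    {"device": "B", "date": date(2015, 1, 1), "failure": 0},
    {"device": "B", "date": date(2015, 1, 3), "failure": 0},
]
dupes = duplicate_keys(rows)
returned = devices_seen_after_failure(rows)
```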
More observations
• Attributes 2, 3, 4, 7 and 8 are all equal to 0 in 83% of the data, so these 5 attributes can be used to segment it
• This all-zero segment has an event rate of 2 basis points, whereas the remaining data has an event rate of 40 basis points
• Attribute values (except for attribute 1) are almost monotonically increasing: for every other attribute, a decrease in value for a device over time is rare
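
A rough sketch of the segmentation and the monotonicity check, under assumptions: the column names attribute2, attribute3 and so on are hypothetical, and the rows are toy data. The last line checks that the two segment rates quoted above blend back to roughly the overall event rate.

```python
# Hypothetical column names for the five attributes used in the segmentation.
ZERO_ATTRS = ("attribute2", "attribute3", "attribute4", "attribute7", "attribute8")


def segment_event_rates(rows):
    """Split rows by whether all ZERO_ATTRS are 0; return share and event rate (bps)."""
    stats = {"all_zero": [0, 0], "remainder": [0, 0]}  # [records, failures]
    for r in rows:
        key = "all_zero" if all(r[a] == 0 for a in ZERO_ATTRS) else "remainder"
        stats[key][0] += 1
        stats[key][1] += r["failure"]
    total = sum(n for n, _ in stats.values())
    return {
        key: {"share": n / total, "bps": f / n * 10_000 if n else 0.0}
        for key, (n, f) in stats.items()
    }


def count_decreases(values):
    """Count steps where one device's attribute drops from one record to the next."""
    return sum(1 for prev, cur in zip(values, values[1:]) if cur < prev)


def make_row(failure, **nonzero):
    """Build a toy row with all segmentation attributes at 0 unless overridden."""
    row = {a: 0 for a in ZERO_ATTRS}
    row.update(nonzero)
    row["failure"] = failure
    return row


# Toy rows (not from the data): three all-zero records and one failing remainder record.
rows = [make_row(0), make_row(0), make_row(0), make_row(1, attribute7=8)]
rates = segment_event_rates(rows)
drops = count_decreases([0, 0, 1, 3, 2, 5])

# Consistency check on the quoted figures: segment shares times segment rates.
blended_bps = 0.83 * 2 + 0.17 * 40
```

The blended figure is about 8.5 basis points, in line with the overall event rate of 8 basis points given that the quoted shares and rates are rounded.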
Feature Creation
• Created features for the difference between the current attribute value and the value on the previous day (absolute and % difference)
• Created features for the difference between the current attribute value and the value when the device came into operation
• Created a feature for the number of days a device has been live
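
The three feature families above could be built along these lines. This is a sketch under assumptions, not the code behind the deck: field and feature names are hypothetical, "previous day" is approximated by the device's previous record, and the % difference is left undefined (None) when the previous value is 0, since the deck does not say how that case was handled.

```python
from collections import defaultdict
from datetime import date


def build_features(rows, attr):
    """Add previous-record, since-start and days-live features for one attribute."""
    by_device = defaultdict(list)
    for r in rows:
        by_device[r["device"]].append(r)

    out = []
    for recs in by_device.values():
        recs.sort(key=lambda r: r["date"])
        first, prev = recs[0], None
        for r in recs:
            feat = dict(r)  # copy so the input rows are left untouched
            if prev is None:
                # First record of a device: nothing to compare against.
                feat[f"{attr}_diff"] = None
                feat[f"{attr}_pct_diff"] = None
            else:
                diff = r[attr] - prev[attr]
                feat[f"{attr}_diff"] = diff
                # Assumption: % difference is undefined when the previous value is 0.
                feat[f"{attr}_pct_diff"] = diff / prev[attr] if prev[attr] != 0 else None
            feat[f"{attr}_since_start"] = r[attr] - first[attr]
            feat["days_live"] = (r["date"] - first["date"]).days
            out.append(feat)
            prev = r
    return out


# Toy rows for a single device (not from the data), deliberately out of order.
rows = [
    {"device": "A", "date": date(2015, 1, 4), "attribute1": 12},
    {"device": "A", "date": date(2015, 1, 1), "attribute1": 10},
    {"device": "A", "date": date(2015, 1, 2), "attribute1": 15},
]
feats = build_features(rows, "attribute1")
```

Because 83% of records have zeros in several attributes, the zero-denominator case is common, so how it is encoded (None, 0, or a flag) is a modeling choice worth making explicit.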
Modeling Methods and Results
• Class balancing techniques can be tried; the simplest one, random oversampling, was used
• Logistic regression was used for classification
• AUROC = 0.820
• AUPR = 0.082
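
For context, a ranking no better than chance would score an AUPR close to the event rate, roughly 0.0008 here. The pipeline can be sketched in plain Python as below. Everything in it is an assumption for illustration: the synthetic single-feature data, the alternating train/test split, the gradient-descent fit standing in for a library logistic regression, oversampling applied to the training half only, and average precision used as the AUPR estimate. The deck does not describe these details.

```python
import math
import random


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    idx = pos + neg + rng.choices(minority, k=len(majority) - len(minority))
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [p - yi for p, yi in zip(predict_proba(X, w, b), y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errs, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b


def auroc(y, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y, scores):
    """Mean precision at each positive, scanning from highest to lowest score."""
    ranked = sorted(zip(scores, y), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Synthetic data (not the deck's data): 5% positives with a shifted feature.
labels = [1] * 20 + [0] * 380
rng = random.Random(42)
features = [[rng.gauss(2.0 if label else 0.0, 1.0)] for label in labels]
X_train, y_train = features[0::2], labels[0::2]
X_test, y_test = features[1::2], labels[1::2]

# Oversample the training half only, so test metrics reflect the original class mix.
X_bal, y_bal = random_oversample(X_train, y_train, seed=1)
w, b = fit_logistic(X_bal, y_bal)
test_scores = predict_proba(X_test, w, b)
test_auroc = auroc(y_test, test_scores)
test_aupr = average_precision(y_test, test_scores)
```

The pairwise AUROC is quadratic in the class sizes, which is acceptable for a sketch; a rank-based formula or a library metric would be the usual choice at 125k records.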
