
Artificial Neural Networks as a quality loss function for Six Sigma

Meryem Uluskan

To cite this article: Meryem Uluskan (2018): Artificial Neural Networks as a quality loss function for Six Sigma, Total Quality Management & Business Excellence, DOI: 10.1080/14783363.2018.1520597


Total Quality Management, 2018

https://doi.org/10.1080/14783363.2018.1520597

Meryem Uluskan a,b*

a Department of Industrial Engineering, Faculty of Engineering, Eskisehir Osmangazi University, Eskisehir, Turkey; b ESOGU, Muhendislik Mimarlik Fakultesi, Endustri Muhendisligi Bolumu, Meselik Kampusu, Eskisehir, Turkey

In this study, Artificial Neural Networks (ANNs) are proposed as a quality loss function for Six Sigma projects. An industrial data set consisting of power consumption rates of refrigerators and thermal camera readings around their compressors is analysed. For industrial data, relationships between inputs and outputs can be nonlinear and complex, so traditional statistical models may yield poor inferences. Here, ANNs emerge as effective tools because of their ability to learn nonlinear and complex relationships. While Six Sigma remains the major quality initiative, its popularity has started to decline among industrial practitioners, and the Six Sigma toolbox has not been radically improved since the methodology was established. Therefore, to enhance the Six Sigma toolbox, an ANN-based structure is proposed to detect refrigerators not complying with power specifications through thermal camera readings. Four quality loss function models are compared: one-dimensional parabolic Taguchi loss functions, multivariate Maximum Likelihood cost, logistic regression and finally ANNs. The analyses are conducted with Monte Carlo cross-validation to obtain precision-recall curves for these methods. The ANN-based cost function is shown to outperform the other three methods. Finally, ANNs are found to be an effective tool which may bring new dimensions to the Six Sigma concept.

Keywords: Six Sigma; Artificial Neural Networks; quality loss function; Maximum Likelihood; Taguchi loss function; logistic regression; quality control; pattern recognition

1. Introduction

Six Sigma is a disciplined, project-oriented, statistically based, highly quantitative approach to improving product and process quality, and it is a major force in business improvement (Hahn, Doganaksoy, & Hoerl, 2000; Montgomery & Woodall, 2008). While Six Sigma maintains its status as the major methodological quality initiative, its popularity has already started to decline among industrial practitioners. It can be argued statistically that the enthusiasm for Six Sigma and the expectations for it have started to decrease (Uluskan & Erginel, 2017). There are many reasons for this situation, including issues arising during implementation, inefficient training programmes, and ineffective or improper use of tools. While these issues constitute a major part of the decline of enthusiasm for Six Sigma, an important fact which must not be ignored is that neither the Six Sigma methodology nor the Six Sigma toolbox has been radically improved since the methodology was established.

The enhancement of Six Sigma, which includes topics such as new Six Sigma models and new tools for Six Sigma, has long been studied in the quality engineering literature (Nonthaleerak & Hendry, 2006). These studies proposed new tools for Six Sigma projects.

*Email: meryemulus@yahoo.com


However, further research is needed on the effectiveness of those new tools (Nonthaleerak & Hendry, 2006). Moreover, prior literature found that Artificial Neural Networks (ANNs), an advanced machine-learning tool, are one of the least frequently used tools in Six Sigma projects (Uluskan, 2017). Therefore, in this study it is proposed that ANNs can be used effectively as a quality loss function for Six Sigma projects. The effectiveness of ANNs is tested using data collected in an industrial setting.

In practical situations with real industrial data, the relationships between predictors (inputs) and outcomes (outputs) are generally nonlinear as well as complex. Therefore, the use of traditional statistical models may result in poor inferences regarding real-world data. At this point, ANNs emerge as an effective tool for real industrial situations because they have the ability to learn and model nonlinear and complex relationships. Moreover, after ANNs learn from inputs and their corresponding outputs, they can infer hidden relationships between these and can predict relationships more thoroughly for future data sets. This characteristic of ANNs makes them powerful tools when dealing with real data, as opposed to traditional techniques. Finally, unlike many other prediction techniques, ANNs do not impose restrictions on the input variables, such as assumptions about how the data should be distributed. Therefore, ANNs provide great flexibility in real industrial practice.

As machine learning and data mining technology developed, several disciplines took advantage of these developments. Many disciplines which had previously been accustomed to statistical models adopted contemporary ANNs very quickly. While many different disciplines take advantage of machine learning, quality engineering does not effectively utilise these structures.

The use of statistical tools and the correct interpretation of statistical tests require at least a moderate level of understanding of statistics, so a significant amount of training is needed. However, industrial processes can be so complicated that basic statistical tools cannot model their complexity effectively. On the other hand, the use of machine-learning tools such as ANNs can lead to new understanding in quality engineering. Once practitioners become expert in using machine-learning tools, they can more accurately assess and predict industrial processes.

Therefore, in this study ANNs are proposed as an effective tool to model the quality loss function for Six Sigma projects. This study argues that to increase its effectiveness, Six Sigma must be replenished with new, powerful tools. To support this idea, an industrial data set, which includes power consumption rates of refrigerators as well as thermal camera readings around the compressors and the refrigerators themselves, is analysed. An ANN-based model is proposed to detect refrigerators that do not comply with power specifications by means of thermal camera readings. It is hypothesised that ANN-based models outperform the other statistical models. Consequently, ANNs, an advanced machine-learning tool, are proposed to help provide new insights in quality engineering and increase the effectiveness of Six Sigma projects.

Up to this point, the motivation of this study has been provided. The Background section will first provide a detailed explanation of ANNs and how they are superior to traditional statistical methods. Then, a literature review on the use of ANNs in quality initiatives is provided. Next, three competing methods whose performance will be compared to that of ANNs are introduced: the parabolic Taguchi loss function, Maximum Likelihood cost (a multivariate cost function) and logistic regression. The Background section also provides brief information about retrieval systems, precision, and recall, which are all subjects of pattern recognition. The Methods section introduces how different quality loss functions are used to create retrieval systems for the refrigerator data. This section is enriched by many visual displays, i.e. by figures and videos, to make these concepts more tangible to the readers. Maximum Likelihood and logistic regression-based retrieval and then the proposed solution (i.e. ANN-based retrieval) are thoroughly explained in this section. In the Experiments and Results section, a description of how the experiments were conducted is provided, together with the precision-recall curves for all four methods. A discussion of the results is also presented. Finally, the Conclusion section discusses the success of ANN-based systems as strong candidates for inclusion in the toolbox of Six Sigma projects.

2. Background

2.1. ANNs: a brief introduction

ANNs are information processing models inspired by the way biological nervous systems

process information (Basu, Bhattacharyya, & Kim, 2010) and thus, are computational

models of the brain. They are increasingly being used to model complex, nonlinear data

(Yang, 2009). Inspired by the nervous systems, neural networks are comprised of a large

number of highly interconnected processing elements, i.e. neurons, working in harmony

to solve particular problems (Basu et al., 2010). Accordingly, they have the ability to learn quantitative relationships between input variables and the corresponding output variables (Tu, 1996). Therefore, neural networks are configured for a specific application through a learning process (Basu et al., 2010).

Learning is achieved by training the network with a training data set consisting of input

variables and the known or associated outcomes (Tu, 1996). Once a network has been

trained, it can be used for tasks on a separate test data set (Tu, 1996). In general, a neural network can be divided into three layers: (a) the input layer, (b) the hidden layers and (c) the output layer. The input layer is responsible for receiving information (data) from the external environment, whereas the hidden layers are composed of neurons which are responsible for the internal processing of the data. Ultimately, the output layer, which is also composed of neurons, is responsible for producing and presenting the final network outputs (Da Silva, Spatti, Flauzino, Liboni, & Dos Reis Alves, 2017).

The main characteristics of neural networks are that they have the ability to learn

complex nonlinear input-output relationships, use sequential training procedures, and

adapt themselves to the data (Basu et al., 2010). Recent advances in neural network-

based models result in outstanding accuracies in a variety of ﬁelds (Cerquitelli, Quercia,

& Pasquale, 2017). Mathematically, a neuron k can be described by the following equations

(Tosun, Aydin, & Bilgili, 2016):

$$u_k = \sum_{j=1}^{m} w_{kj}\, x_j, \qquad y_k = \varphi(u_k + b_k),$$

where the $x_j$'s are the inputs and the $w_{kj}$'s are the weights of neuron $k$, $u_k$ is the linear combiner output due to the input signals, $\varphi(\cdot)$ is the activation function, $b_k$ is the bias of the activation function, and finally $y_k$ is the output signal of the neuron.
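As a minimal sketch, the neuron equations above can be written out in Python with NumPy; the tanh activation and all numeric values here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def neuron_output(x, w, b, activation=np.tanh):
    """Compute a single neuron's output y_k = phi(u_k + b_k),
    where u_k = sum_j w_kj * x_j is the linear combiner output."""
    u = np.dot(w, x)          # linear combination of the inputs
    return activation(u + b)  # apply the activation function

# Illustrative values (not from the paper's data set)
x = np.array([0.5, -1.2, 2.0])   # inputs x_j
w = np.array([0.4, 0.1, -0.3])   # weights w_kj
b = 0.05                         # bias b_k
y = neuron_output(x, w, b)
```

Any bounded S-shaped function could play the role of $\varphi(\cdot)$; tanh is used here only because it is a common default.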

There are two main types of neural networks: deep neural networks and shallow networks. A deep neural network has two or more hidden layers, as opposed to a shallow neural network, which usually has only one hidden layer. If there are more than 10 layers, it is called very deep learning. Additional layers make it possible to extract data features from the lower layers, i.e. to extract features from features, which creates the potential to model complex data with fewer neurons than in a shallow network (Schmidhuber, 2015).

With the growing complexity of industrial systems, traditional statistical tools and methods cannot satisfy all the demands of the current complex industrial environment, so data-driven modelling, which is based on computational intelligence and machine-learning methods such as ANNs, is brought into practice. Traditional parametric modelling uses data to search for an optimal value of a parameter that varies within a space of specified dimension. Complex data, such as those encountered in contemporary data analysis, can seldom be fully studied by this traditional approach (Ciampi & Lechevallier, 2007).

Traditional statistical modelling is the formalisation of relationships between variables in the form of mathematical equations. Statistical modelling works on a number of assumptions. For example, linear regression assumes that there is a linear relation between the independent and dependent variables, that observations are independent of each other and that the errors are normally distributed. In statistical modelling, even a nonlinear model has to comply with a continuous separation boundary.

On the other hand, machine learning, upon which data-driven modelling is based, comprises algorithms that can learn from data without relying on rules-based programming. Machine-learning algorithms, in general, are spared most of these assumptions. The biggest advantage of using a machine-learning algorithm is that the boundary need not be continuous. In data-driven modelling, which can be successfully used to model complex industrial data, the underlying relationship among the measured data is calculated by the model itself (Stundner & Al-Thuwaini, 2001).

From a practical or industrial point of view, the use of ANNs can appear somewhat difficult. For industrial practitioners, attaining the necessary experience to use ANNs efficiently can be quite costly in terms of time and financial resources. In general, these methods are often considered a black box, and most of the rules for building a neural network model are empirical rather than theoretical (Kislov & Gravirov, 2018). The computational burden required for model development and the tendency to over-fit are sometimes regarded as disadvantages of ANNs (Tu, 1996). On the other hand, the usage of neural networks has a number of advantages: the ability to detect complex nonlinear relationships between dependent and independent variables, the ability to detect all possible interactions between predictor variables, and the availability of multiple training algorithms (Tu, 1996). Therefore, once practitioners become proficient in ANNs, they are likely to obtain superior results in their industrial processes and projects.

Neural networks represent a novel approach that can provide solutions to problems for which traditional mathematics is not able to find a reasonable solution. These problems are generally complex in nature, and some of the mechanisms involved in the problem have not been fully understood by the researchers studying them (Lolas & Olatunbosun, 2008).

Neural networks model the neural connections in the human brain and imitate the human ability to learn from experience. Their ability to capture knowledge is among the main advantages that make these expert systems attractive for a large variety of applications (Lolas & Olatunbosun, 2008). Therefore, complex problems in quality and Six Sigma projects can be better addressed by ANNs. Previous literature has proposed the potential use of ANNs in Six Sigma projects and provided examples of their usage (Mahanti & Antony, 2005; Brady & Allen, 2006).

Pyzdek's (2009) Six Sigma paradox states that Six Sigma focuses on the reduction of process variation in order to meet or exceed customer requirements, but significant quality levels or improvements can be achieved only by changing the way of thinking within the organisation. In other words, the Six Sigma paradox addresses the necessity of changing the mindset of the organisation in order to eliminate the variation within its processes. This paradox emphasises the significance of creativity within organisations (Pyzdek, 2003). Keeping this paradox in mind, Pyzdek utilised ANNs as a tool for Six Sigma projects (Pyzdek, 1999). He used ANNs to predict the level of defects based on the parameters of the processes, and compared the surface created by ANNs with the corresponding response surface model from the classical design of experiments (DOE). He found that, compared to the classical DOE method, an ANN-based surface can be complex enough to produce better predictions about defect rates. He emphasised that his study simply points out the potential applications of ANNs for quality and performance improvement.

Prior to Pyzdek, Su and Hsieh (1998) similarly examined the potential use of ANNs in creating more efficient response surface models. They argued that practitioners with limited statistical training, especially engineers, can more easily use ANNs in quality control. In Mahanti and Antony (2005), ANNs were mentioned as one of the computational intelligence techniques which support software quality assessment for Six Sigma projects in software development. ANNs are mentioned in additional studies as a potential tool for Six Sigma projects (Patterson, Bonissone, & Pavese, 2005; Brady & Allen, 2006).

In a study by Johnston, Maguire, and McGinnity (2009), the authors utilised ANNs as a key component in their Six Sigma project. In their case study, after observing the nonlinearity between the read/write capability of hard disc drives (HDDs) and the associated predictor parameters, the authors trained ANNs to better predict the performance of HDDs. In research conducted by El-Midany, El-Baz, and Abd-Elwahed (2012), an ANN-based approach was used for performance prediction within a manufacturing environment. The study evaluated the manufacturing system by integrating ANNs with other Six Sigma tools. Fahey and Carroll (2016) proposed a neural network approach in biopharmaceutical manufacturing and compared ANNs to multiple linear regression. In the context of Six Sigma, their research showed how ANNs can be used to interpret the data gathered during the manufacturing process.

In a study by Wu, Wang, Zhang, and Huang (2011), the authors investigated a new Design for Six Sigma (DFSS) approach by employing ANNs to optimise burnishing formation process quality and yield. Their results indicated that the DFSS-neural network method is an effective tool to improve the yield in machining. Kuthe and Tharakan (2009) applied ANNs in a Six Sigma DMADV project in the steel industry and compared the ANNs with regression analysis. Sen (2015) presented a case study of an iron manufacturing plant where the variation of CO from the blast furnace was the problem. The researcher identified the significant process parameters responsible for CO emission and then used ANNs as a Six Sigma tool to model the process.

In examining the previous literature, the potential for ANNs as a Six Sigma tool is clear, and they can be utilised as an advanced tool in Six Sigma projects. Further studies that integrate ANNs into Six Sigma projects are needed to persuade quality engineers to use ANNs.


The quality loss function is defined as a quantitative way of evaluating quality. Taguchi, Chowdhury, and Wu (2005) stated that quality loss can be expressed as a deviation from the target value. The most basic version of the quality loss function is the parabolic loss function (Taguchi et al., 2005):

$$L = k\,(y - m)^2 \qquad (1)$$

where y is the value of the quality characteristic, m is the target value of y, k is a constant and L is the quality loss. This parabolic definition of the loss function is similar to the Gaussian squared-error concept. When there exist multiple quality characteristics to be evaluated, the multivariate version of the quality loss function is defined as follows (Suhr & Batson, 2001):

$$L = \sum_{i=1}^{n} \sum_{j=1}^{n} k_{ij}\,(y_i - m_i)(y_j - m_j) \qquad (2)$$

Keeping in mind the analogy between these loss functions and the Gaussian squared error, a more organised loss function can be written as the Maximum Likelihood cost:

$$L = \mathbf{x}\, C^{-1} \mathbf{x}^{T} \qquad (3)$$

where $\mathbf{x}$ is the vector of the deviations from the target values for the multiple quality characteristics:

$$\mathbf{x} = [\, y_1 - m_1,\; y_2 - m_2,\; \ldots,\; y_n - m_n \,] \qquad (4)$$

and $C^{-1}$ is the inverse of the covariance matrix of the quality characteristics. When the Maximum Likelihood cost is expressed in the form of Equation (2), the following relation must hold:

$$k_{ij} = C^{-1}(i, j) \qquad (5)$$
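Equations (3) and (5) say that the Maximum Likelihood cost is a quadratic form in the inverse covariance matrix, and that its entries play the role of the $k_{ij}$ in Equation (2). A minimal Python sketch with made-up numbers (the paper's data set is not public):

```python
import numpy as np

def ml_cost(y, m, C):
    """Maximum Likelihood cost L = x C^-1 x^T of Equation (3),
    where x is the vector of deviations from the targets, Equation (4)."""
    x = np.asarray(y) - np.asarray(m)   # deviations from target values
    return x @ np.linalg.inv(C) @ x     # quadratic form in C^-1

# Hypothetical two-characteristic example (e.g. two thermal camera readings)
m = [35.0, 42.0]                        # target (mean) values
C = np.array([[4.0, 1.5],               # covariance matrix of the
              [1.5, 3.0]])              # quality characteristics
L = ml_cost([37.0, 41.0], m, C)         # cost of one observed item
```

With $k_{ij} = C^{-1}(i, j)$, evaluating the double sum of Equation (2) on the same deviations gives exactly this value.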

When there exist only two quality characteristics, the Maximum Likelihood cost function can be expressed by means of a surface, as shown in Figure 1. In Figure 1, the Maximum Likelihood surface is plotted based on the mean vector and the covariance matrix of Thermal Cameras 1 and 3 of the refrigerators. This surface touches the x–y plane at the mean point of Thermal Cameras 1 and 3. As the quality characteristics move away from the mean point, the cost starts to increase in all directions. However, the cost surface is elliptically aligned in accordance with the multivariate distribution of the data. In the next sections, a threshold value will be applied to this cost function to detect the items which do not comply with the power specifications.

Binary logistic regression is a model-building technique developed for the case where the outcome variable is binary or dichotomous (Hosmer & Lemeshow, 2000). Unlike linear regression, the dependent variable has a ceiling and a floor. The function describing the transition from the floor to the ceiling is generally desired to be an S-shaped curve, just like the cumulative distribution of a normal random variable (Hosmer & Lemeshow, 2000). Linear regression faces significant problems in dealing with binary dependent variables. Logistic regression applies a transform that allows the dependent variable to approach a ceiling and a floor while the independent variables can range from −∞ to +∞. Although many nonlinear functions can represent the S-shaped curve, the logit transformation has become popular because it is a flexible and easily used function (Pampel, 2000). Consequently, when there are $N$ independent variables (i.e. $x_1, x_2, \ldots, x_N$), logistic regression estimates a multiple linear regression model of the following form (Hosmer & Lemeshow, 2000):

$$\ln\left(\frac{p(x)}{1 - p(x)}\right) = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_N x_N \qquad (6)$$

where $p(x)$ represents the conditional mean of the output (i.e. $Y$) given $x$, namely $E(Y \mid x)$. The left-hand side of the above equation is the logit transformation of $p(x)$ just mentioned. After the logistic model is fitted and the estimated coefficients $\hat{b}_0, \hat{b}_1, \ldots, \hat{b}_N$ are obtained, the estimated logistic probability $\hat{p}(x)$ can be calculated by means of the logistic function:

$$\hat{p}(x) = \frac{e^{\hat{b}_0 + \hat{b}_1 x_1 + \cdots + \hat{b}_N x_N}}{1 + e^{\hat{b}_0 + \hat{b}_1 x_1 + \cdots + \hat{b}_N x_N}} \qquad (7)$$

While the training data used in model fitting include a binary output variable, the estimated logistic probability $\hat{p}(x)$ is a continuous function of $x$. This property allows logistic regression to be used as a quality loss function, as will be described in Section 3.2.
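This idea can be sketched in Python with scikit-learn; the synthetic two-dimensional data below merely stand in for the thermal camera readings, which are not public:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for two thermal camera readings:
# class 1 = incompatible items, class 0 = compatible items
X0 = rng.normal([35.0, 42.0], 1.0, size=(200, 2))   # compatible
X1 = rng.normal([39.0, 46.0], 1.0, size=(40, 2))    # incompatible
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 40)

model = LogisticRegression().fit(X, y)

# Although the training labels are binary, predict_proba returns the
# smooth S-shaped surface p_hat(x) of Equation (7), usable as a loss
loss = model.predict_proba(X)[:, 1]
```

Thresholding `loss` at different levels then yields different retrieval rules, which is how the logistic surface is used in Section 3.2.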

In pattern recognition, precision is defined as the ratio of the number of relevant items retrieved to the number of all retrieved items. Recall is defined as the ratio of the number of relevant items retrieved to the number of all relevant items. Therefore, as the retrieval system applies a stricter rule to retrieve items, recall (i.e. the ability to retrieve more of the relevant items) usually decreases, while precision (i.e. the rate of relevant items within all retrieved items) increases. There is a tradeoff between recall and precision. Consequently, to compare the performance of retrieval systems, precision-recall curves are established. When a precision-recall curve that encloses all the other curves is obtained, a better retrieval system has been created.
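These definitions can be illustrated with scikit-learn's `precision_recall_curve`, here on made-up relevance labels and retrieval scores rather than the study's data:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical example: 1 = relevant (incompatible) item, 0 = irrelevant;
# a higher score means the retrieval system ranks the item as more relevant
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0])
scores = np.array([0.1, 0.2, 0.15, 0.3, 0.25, 0.4, 0.8, 0.7, 0.35, 0.6])

# Each threshold on the score gives one (recall, precision) point;
# sweeping the threshold traces out the precision-recall curve
precision, recall, thresholds = precision_recall_curve(y_true, scores)
```

At the loosest threshold every relevant item is retrieved (recall 1) at low precision; tightening the threshold moves along the curve toward higher precision and lower recall, which is the tradeoff described above.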

In this study, an ANN-based retrieval system will be created to detect the refrigerators

that do not comply with the power speciﬁcations through thermal camera readings. The

ANN retrieval system will be compared to the parabolic Taguchi loss, maximum likelihood

cost and logistic regression models.

3. Methods

The industrial data set used in this study includes thermal camera readings as well as the power consumption rates of refrigerators that were tested in an isolated test room after production. In the test room, several different thermal cameras read the temperature levels in degrees Celsius at different regions of the refrigerator and its compressor. To simplify the overall process and to be able to visualise the models that are created, only two of these thermal cameras were included in the present study. The industrial data set includes thermal camera readings of 478 refrigerators. The power specifications determined by the company are between 150 and 180 Watts. The aim of the research is to detect the refrigerators not complying with the power specifications through thermal camera readings, assuming that the incompatible items differ from the other items in terms of thermal camera readings.

Four different types of retrieval systems are created to detect refrigerators that do not comply with the power specifications by means of thermal camera readings: the parabolic Taguchi loss, Maximum Likelihood cost, logistic regression and finally ANNs. The major aim of the study is to show that ANNs outperform all the competing methods. In this section, how retrieval systems are created by Maximum Likelihood cost, by logistic regression and finally by ANNs will be explained in detail.

3.1. Retrieval based on Maximum Likelihood cost

The aim of this section is to describe the Maximum Likelihood cost used to detect items which do not comply with the power specifications by means of thermal camera readings. The power consumption levels of refrigerators are partially correlated with the thermal camera readings, so the thermal camera readings should provide some cues as to whether a refrigerator complies with the specifications or not. From this point of view, the refrigerators whose power consumptions do not comply with the specifications should yield outlying thermal camera readings in a multivariate sense. Therefore, when a threshold is applied to the Maximum Likelihood cost of the items, the items with incompatible power consumption can be retrieved.
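The thresholding step described above can be sketched as follows; the synthetic readings below are only a stand-in for the study's 478-item data set, and the threshold value is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for two thermal camera readings of 478 refrigerators
X = rng.normal([35.0, 42.0], [2.0, 1.7], size=(478, 2))

mean = X.mean(axis=0)
C_inv = np.linalg.inv(np.cov(X, rowvar=False))

# Maximum Likelihood cost of every item, Equation (3)
d = X - mean
cost = np.einsum('ij,jk,ik->i', d, C_inv, d)

# Items whose cost exceeds the threshold lie outside the elliptical
# boundary and are retrieved as suspected incompatible items
threshold = 6.0
retrieved = cost > threshold
```

Lowering `threshold` shrinks the ellipse and retrieves more items (higher recall, lower precision); raising it does the opposite.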

Figure 2 is the two-dimensional contour plot version of Figure 1 when a threshold, i.e. a cutting value, is applied to the upper part of this 3D cost surface. Accordingly, a threshold on the Maximum Likelihood cost implies an elliptical boundary in the two-dimensional space, as shown in Figure 2. Therefore, the items which lie outside this ellipse are considered the outlying items, so they are retrieved as the items with incompatible power consumption. As shown in Figure 2, empty circles represent compatible items which are within the power specifications, whereas filled circles represent incompatible items which are out of the power specifications. True positives are the incompatible items detected through the threshold applied to the Maximum Likelihood cost function. It can be seen that only a few compatible items, i.e. false alarms, are located outside the threshold ellipse; the majority of the outlying items are incompatible items. The existence of compatible items outside the threshold ellipse decreases the precision of the retrieval system.

Figure 2. The retrieval system and elliptical threshold based on Maximum Likelihood cost.

Moreover, some incompatible items remain inside the threshold ellipse. These items cannot be retrieved easily without also retrieving many irrelevant items. The existence of these incompatible items inside the ellipse decreases the recall of the retrieval system. The retrieval process by means of Maximum Likelihood cost is also depicted in the following video:

https://youtu.be/ujohDoS8tGI

3.2. Retrieval based on logistic regression

In Section 2.5, the mathematical background of logistic regression was described. This section explains how logistic regression is utilised as a quality loss function. As mentioned, while the training data used in model fitting include a binary output variable, the estimated logistic probability $\hat{p}(x)$ is a continuous function of $x$. Therefore, the logistic regression model can be viewed as a 3D surface to which thresholds are applied for the retrieval systems.

As a three-dimensional surface plot, Figure 3(a) shows the logistic regression model trained on the refrigerator data. As can be seen, the region where the incompatible items are located yields 1, while the region where the compatible items are dominantly located yields 0. As mentioned in Section 2.5, the transition from 1 to 0 is a smooth S-shaped transition along the surface. Therefore, this logistic probability can be used as a cost surface to create a successful retrieval system for incompatible items. As the threshold decreases from 1 to 0, more incompatible items can be retrieved while the precision is reduced, as shown in Figure 3(b). Video 2 demonstrates the use of logistic regression for the refrigerator data in a more tangible manner.

Figure 3. The logistic regression model for the refrigerator data: (a) 3D surface of the logistic model, (b) the retrieval system by means of logistic regression.

https://youtu.be/uXHfy3L9xnw

3.3. Retrieval based on ANNs

A more advanced retrieval scheme can be modelled with ANNs. Before applying ANNs, the data are converted to polar coordinates around the mean of the data to make the data more interpretable for the ANN. The polar coordinates consist of radial and angular coordinates (i.e. r and θ) as shown in Figure 4. The radial coordinate, or radius r, is the distance from the reference point (i.e. the mean of the data in our case), and the angular coordinate, or polar angle θ, is the counterclockwise angle from the corresponding horizontal axis.


The compatible and incompatible items are now determined by the value of the radial distance to the origin and the angle with respect to the x-axis. Figure 5(b) shows the new alignment of the data in accordance with the polar coordinates; the original alignment is shown in Figure 5(a). The radial coordinate uses a log scale so that the data can be analysed in a more compact form.

By means of the polar coordinates, the incompatible items are gathered more closely to one another, while the compatible items scatter over a larger area. In this way, the data are spread over the plane in a more homogeneous way, which helps the ANN to better ‘learn’ the structure. The conversion of the data to polar coordinates is also depicted in the following video:

https://youtu.be/VNZWm7RrNr0
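The conversion described above can be sketched in Python with NumPy; the readings are hypothetical, and a point falling exactly on the mean would need special handling because log(0) is undefined:

```python
import numpy as np

def to_polar(X):
    """Convert 2D points to (log radius, angle) around the data mean,
    mirroring the preprocessing described in this study."""
    d = X - X.mean(axis=0)                  # centre the data on its mean
    r = np.hypot(d[:, 0], d[:, 1])          # radial coordinate r
    theta = np.arctan2(d[:, 1], d[:, 0])    # counterclockwise polar angle
    return np.column_stack([np.log(r), theta])

# Hypothetical thermal camera readings (degrees Celsius)
X = np.array([[35.0, 42.0], [36.5, 41.0], [33.8, 43.2], [38.0, 44.5]])
P = to_polar(X)
```

The two columns of `P`, log-radius and angle, are the two inputs fed to the ANN in the next step.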

The next step is to train an ANN which returns higher values for the items which do

not comply with the power speciﬁcations. During the training phase, based on power con-

sumption rates, the training data are labelled as incompatible and compatible. The input

consists of two dimensions of radial (in log scale) and polar coordinates. The output is

a cost value of 1 for incompatible and 0 for compatible items. To prevent the ANN

from overﬁtting the data, twenty percent of the data are reserved for a validation

dataset. When no further gain is obtained with validation data, training iterations stop.

Moreover, the number of neurons in the hidden layer is set to 10, which is a relatively

small number, in order to prevent overﬁtting. In other words, the ANN will not learn

unnecessary details within the training data.
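A training setup along these lines can be sketched with scikit-learn's MLPClassifier. Only the 10 hidden neurons and the 20% validation split with early stopping come from the text; the synthetic data and the remaining hyperparameters are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for the (log-radius, angle) features; 1 = incompatible
X = rng.normal(size=(400, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)

# 10 hidden neurons, plus a 20% validation split with early stopping,
# mirroring the paper's overfitting precautions.
ann = MLPClassifier(hidden_layer_sizes=(10,), early_stopping=True,
                    validation_fraction=0.2, max_iter=2000, random_state=0)
ann.fit(X, y)

# The trained network returns a smooth cost in [0, 1] rather than a hard label.
cost = ann.predict_proba(X)[:, 1]
```

The continuous `predict_proba` output is what produces the smooth ramp between compatible and incompatible regions, rather than the binary 0/1 labels the network was trained on.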

Finally, by preventing the ANN from overﬁtting the data, a smooth ANN-based cost

surface is created, as shown in the 3D plot in Figure 6(a). The output of the training data

is binary (i.e. either 1 or 0). However, for the output of the trained ANN, the transition

from compatible to incompatible states (i.e. transition from low to high values) is a

smooth ramp instead of a sharp increase. This is a natural consequence of preventing

the ANN from overfitting the data.

Figure 5. The original vs. the new alignment of the data obtained by means of polar coordinates.

To demonstrate the opposite, that is the over-fitting case, Figure 6(b) shows an ANN surface obtained with 100 neurons in the

hidden layer. As can be seen, while Figure 6(a) is reasonably smooth to be used as a

quality loss function, Figure 6(b) includes too many unnecessary ﬂuctuations, indicating

overﬁtting.

The match of the ANN cost surface and the data is displayed in Figure 7 where the

contour plot of the cost surface and the scatter of the data are superimposed on each

other.

Finally, based on this ANN-based cost surface, the items that do not comply with the

power speciﬁcations can be retrieved by applying a proper threshold level. In Figure 8,

this retrieval process is depicted by means of the boundary implied by a certain threshold level.

Figure 6. The 3D plot of the Artificial Neural Network-based multivariate cost surfaces: the number of neurons in the hidden layer is (a) 10 and (b) 100.

Again, the true positives in Figure 8 are the incompatible items that are accurately detected as incompatible by the ANN-based cost function, and the false alarms are compatible items that fall beyond this threshold and are therefore incorrectly determined as incompatible by the ANN-based cost function.
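The thresholded retrieval and the true-positive/false-alarm tally can be sketched as follows. The helper function is hypothetical, and the cost values and labels are made up for illustration.

```python
import numpy as np

def retrieve(cost, labels, threshold):
    """Flag items whose ANN cost exceeds the threshold and tally the outcome.

    `labels` uses 1 for truly incompatible items, 0 for compatible ones.
    """
    flagged = cost > threshold
    true_positives = int(np.sum(flagged & (labels == 1)))  # caught defects
    false_alarms = int(np.sum(flagged & (labels == 0)))    # good items flagged
    return flagged, true_positives, false_alarms

cost = np.array([0.9, 0.8, 0.2, 0.6, 0.1])
labels = np.array([1, 1, 0, 0, 0])
_, tp, fa = retrieve(cost, labels, threshold=0.5)  # tp = 2, fa = 1
```

Raising the threshold reduces false alarms at the expense of missed incompatible items, which is precisely the trade-off the precision-recall curves in the next section quantify.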

Figure 7. The superposition of contour of ANN cost surface and the data.


Figure 8. The retrieval system and the threshold which is based on ANN Cost.

Video 4. The retrieval system and the threshold which is based on ANN Cost

https://youtu.be/2hSL0I9YtKs

4.1. Monte Carlo cross-validation

In this section, precision-recall curves of the single dimension parabolic Taguchi loss func-

tion, Multivariate Maximum Likelihood cost, logistic regression and ﬁnally ANN-based-

cost are compared to each other. To provide valid results, a repeated training-test split

(i.e. Monte Carlo cross-validation) technique is applied. In each repetition, the data are randomly divided into two parts: training data and test data. The ANNs, logistic


regression models, mean vectors and the covariance matrices are all trained or obtained

based only on the training data. Then, the test data are used with the trained models to

obtain a precision-recall curve. These experiments are repeated 100 times to obtain an

average smooth precision-recall curve. Figure 9 shows the ﬂowchart of a single iteration

of this process for ANN-based retrieval systems.

Consequently, to compare the performance of retrieval systems, precision-recall curves are

established. The precision-recall curves of the different methods are plotted in Figure 10. A retrieval system whose precision-recall curve encloses all the other curves is the better one; in other words, the higher the precision and recall values simultaneously, the more accurate the method's retrieval process. As can be seen, single-

dimensional Taguchi cost functions are the least effective solutions for these data. The multi-

variate maximum likelihood cost appears to be superior to the single-dimensional Taguchi

cost functions. Logistic regression exceeds the performance of the Maximum Likelihood

cost. Finally, the ANN-based cost function outperforms all the other methods by producing

a precision-recall curve which encloses all the others. This indicates that there is a high poten-

tial for ANNs to provide accurate interpretations of industrial data in Six Sigma projects.
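For reference, the single-dimension parabolic Taguchi loss scores each item on one characteristic only, which is part of why it carries the least information in this comparison. A minimal sketch (the cost constant k and the target value are illustrative):

```python
def taguchi_loss(y, target, k=1.0):
    """Parabolic (quadratic) Taguchi loss: squared deviation from the target."""
    return k * (y - target) ** 2

# An item exactly on target incurs no loss; loss grows quadratically with deviation.
losses = [taguchi_loss(y, target=100.0) for y in (100.0, 102.0, 104.0)]
# -> [0.0, 4.0, 16.0]
```

Because this loss depends on one measurement alone, it cannot exploit the joint relationship between the two dimensions that the multivariate methods use.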

4.3. Discussion

The Maximum Likelihood cost model performs better than the single-dimensional Taguchi

loss functions, because the maximum likelihood cost handles the retrieval issue in the multi-

variate sense. An additional dimension enriches the total available information and also the

joint relationship between these two dimensions produces a more sophisticated retrieval

system. Logistic regression demonstrates superior performance compared to Maximum like-

lihood cost. This is mainly because the incompatible items do not cover the compatible items

in all directions in Figure 2. In other words, the incompatible items are mostly located under

the compatible items as seen in Figure 2. If this were not the case (i.e. the incompatible items

covered the compatible majority from all directions), then logistic regression which only

creates a linear boundary to separate these items would fail. In this scenario, Maximum like-

lihood cost which creates an elliptical boundary to separate these items would perform better

than logistic regression. Nevertheless, ANNs would still be the best method because they can be easily adjusted to any type of scenario. As a result, this study demonstrates that ANNs can be


effectively used on industrial data as long as the main philosophy of this machine-learning

technique is correctly understood by the Six Sigma practitioners.
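The boundary shapes discussed above can be sketched on synthetic data (the class offset and all names are illustrative): a Mahalanobis-style cost produces elliptical contours around the compatible class, while logistic regression draws a single linear boundary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
compatible = rng.normal(0.0, 1.0, size=(200, 2))
# Incompatible items located "under" the compatible majority, as in Figure 2
incompatible = rng.normal(0.0, 1.0, size=(200, 2)) + [0.0, -4.0]

# Maximum-likelihood-style cost: squared Mahalanobis distance from the
# compatible-class mean; contours of this cost are ellipses.
mu = compatible.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(compatible, rowvar=False))

def ml_cost(x):
    d = np.asarray(x, dtype=float) - mu
    return float(d @ cov_inv @ d)

# Logistic regression separates the classes with one linear boundary, which
# suffices here because the incompatible items lie on a single side.
X = np.vstack([compatible, incompatible])
y = np.r_[np.zeros(200), np.ones(200)]
logit = LogisticRegression().fit(X, y)
```

If the incompatible items instead surrounded the compatible majority in all directions, no single linear boundary could separate them, whereas the elliptical Mahalanobis contours still could; an ANN can approximate either shape.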

5. Conclusion

ANNs can be used in a wide range of applications to meet many different needs. As machine-learning and data-mining technology has developed, several disciplines have taken advantage of it; quality engineering, however, has not yet effectively utilised these tools.

While Six Sigma maintains its status as the major methodological quality initiative, the

popularity of this method has already started to decline among industrial practitioners

because of improper and inefﬁcient use of Six Sigma tools. The Six Sigma methodology

or toolbox has not been radically improved since the methodology was established. Therefore, this study argues that, in order to rekindle enthusiasm for Six Sigma, the Six Sigma toolbox must be replenished with new, powerful machine-learning tools.

In real practices, generally, the relationships between inputs and outcomes are nonlinear

as well as complex. Therefore, the use of traditional statistical models may result in poor

inferences regarding the real industrial data. Statistical modelling works on a number of

assumptions. As an example, linear regression assumes that a linear relation exists between the independent and dependent variables, that observations are independent of each other, and that errors are normally distributed. However, ANNs are flexible enough to fit any distribution occurring in industrial data without prior assumptions. This characteristic of ANN-based learning makes them attractive for industrial applications.

ANNs can be considered as an effective tool as they have the ability to learn and

model nonlinear and complex relationships. After ANNs learn from inputs and their cor-

responding outputs, they can infer hidden relationships between these and can predict

relationships more thoroughly for future data sets. This characteristic of ANNs makes

them attractive compared to traditional techniques while dealing with real data. Unlike

many other prediction techniques, ANNs do not impose any restrictions on the input vari-

ables, such as assumptions on how the data should be distributed. Therefore, ANNs

provide great flexibility in real practice; for example, they require less formal statistical training from practitioners.

Considering these advantages of ANNs, in this study ANNs are proposed to be an

effective tool for quality management implementations and to be used for industrial

data sets. To show how ANNs provide superior results compared to traditional techniques, four different quality loss function models (the one-dimensional parabolic Taguchi loss function, the multivariate Maximum Likelihood cost function, logistic regression and ANNs) are compared in terms of their ability to detect the defective

items within a dataset. It is shown that ANNs outperform all the other methods

because they can easily adjust themselves to any type of scenario. Therefore, ANNs are offered as an effective tool for quality management implementations, especially for Six

Sigma projects.

Six Sigma training includes many statistical tools and tests which may require prac-

titioners to deeply understand and employ many statistical rules. Each statistical test has

its own steps which must be carefully fulﬁlled to achieve a valid test. Moreover, industrial

datasets can be so complicated that the basic statistical tools may not uncover the true

relationships within the data. On the other hand, the use of machine-learning tools such

as ANNs can lead to new understanding in quality engineering. Once practitioners learn

how to use ANNs, they can use this tool to solve different kinds of problems. By applying


ANNs to many different problems, engineers will start to become proﬁcient in using ANNs.

As they get better at using ANNs, new insights in quality engineering will be obtained. As

new advanced tools are proved to be successful in Six Sigma projects in the future, many

different techniques will also be utilised by practitioners, such as fuzzy logic systems as well as Adaptive Network-Based Fuzzy Inference Systems (ANFIS). Finally,

this will result in new dimensions and understanding in the Six Sigma concept.

A video depicting the steps in this study can be found at the following link:

Artificial Neural Networks as a Quality Loss Function for Six Sigma

https://youtu.be/y6bSp3RnULQ

Disclosure statement

No potential conﬂict of interest was reported by the author.

Funding

This research is funded by TUBITAK (Scientiﬁc and Technological Research Council of Turkey)

with the project number 115C079.

