
The current issue and full text archive of this journal is available on Emerald Insight at:

www.emeraldinsight.com/1355-2511.htm

Corrosion loop development of oil and gas piping system based on machine learning and group
technology method

Andika Rachman and R.M. Chandima Ratnayake
Department of Mechanical and Structural Engineering and Materials Science,
University of Stavanger – UiS, Stavanger, Norway

Received 2 July 2018
Revised 16 January 2019
Accepted 12 August 2019

Abstract
Purpose – Corrosion loop development is an integral part of the risk-based inspection (RBI) methodology.
The corrosion loop approach allows a group of piping to be analyzed simultaneously, thus reducing
non-value-adding activities by eliminating repetitive degradation mechanism assessment for piping with
similar operational and design characteristics. However, the development of corrosion loops requires a
rigorous process that involves a considerable number of engineering man-hours. Moreover, the corrosion
loop development process is a type of knowledge-intensive work that involves engineering judgement and
intuition, causing the output to have high variability. The purpose of this paper is to reduce the time
required by, and the output variability of, the corrosion loop development process by utilizing machine
learning and group technology methods.
Design/methodology/approach – To achieve the research objectives, k-means clustering and
non-hierarchical classification model are utilized to construct an algorithm that allows automation and a
more effective and efficient corrosion loop development process. A case study is provided to demonstrate the
functionality and performance of the corrosion loop development algorithm on an actual piping data set.
Findings – The results show that corrosion loops generated by the algorithm have lower variability and
higher coherence than corrosion loops produced by manual work. Additionally, the utilization of the
algorithm simplifies the corrosion loop development workflow, which potentially reduces the amount of time
required to complete the development. The application of corrosion loop development algorithm is expected to
generate a “leaner” overall RBI assessment process.
Research limitations/implications – Although the algorithm allows a part of the corrosion loop development
workflow to be automated, it is still deemed necessary to allow the incorporation of the engineer’s
expertise, experience and intuition into the algorithm outputs in order to capture tacit knowledge and refine
the insights generated by the algorithm.
Practical implications – This study shows that the advancement of Big Data analytics and artificial
intelligence can promote the substitution of machines for human labor in highly complex tasks
requiring high qualifications and cognitive skills, including tasks in inspection and maintenance management.
Originality/value – This paper discusses a novel way of developing a corrosion loop. The development of
corrosion loops is an integral part of the RBI methodology, but it has received little attention from scholars
in inspection and maintenance-related subjects.
Keywords Inspection, k-means, Machine learning, Risk-based inspection, Risk assessment,
Asset integrity management, Lean maintenance, Predictive maintenance, Piping system, Corrosion loop
Paper type Research paper

Journal of Quality in Maintenance Engineering, © Emerald Publishing Limited, 1355-2511,
DOI 10.1108/JQME-07-2018-0058

1. Introduction
Equipment used in oil and gas production and processing systems is exposed to various
degradation mechanisms (e.g. corrosion, cracking, fatigue, erosion, etc.). Consequently, its
technical integrity degrades during its operational lifetime. The inability to control
degradation mechanisms may cause failures (e.g. leakages, ruptures, bursts, etc.) that have a
significant impact on the personnel, environment and finances of the organization (Singh
and Pokhrel, 2018). Regular inspection is normally performed to understand the current
operational conditions of the equipment. By understanding the equipment's current
conditions, any circumstances that pose significant hazards to the technical integrity of the
assets can be mitigated by carrying out the necessary maintenance, modification or
replacement (Ratnayake, 2015; Vinod et al., 2014). However, comprehensive inspection of all
equipment in the system is not financially feasible due to the high cost of performing
inspection. Prescriptive/time-based inspection used to be the industry standard for
developing the inspection plan for individual equipment (Schröder and Kauer, 2004).
In recent years, prescriptive inspection planning has been replaced by risk-based inspection
(RBI) planning, which emphasizes inspecting the equipment that poses the most risk to
the system (Shishesaz et al., 2013).
One of the integral parts of RBI assessment is corrosion loop development (Chang et al.,
2005). The corrosion loop approach enables a systematic simplification of degradation
mechanism analysis in the RBI assessment, by grouping a set of piping and equipment with
similar operational and design characteristics. This approach allows a group of piping and
equipment to be analyzed simultaneously, thus reducing non-value-adding activities by
eliminating repetitive degradation mechanism assessment for piping and equipment with
similar operational and design characteristics. However, the process of developing corrosion
loops is itself laborious and repetitive. Furthermore, the corrosion loop development
process is knowledge-intensive work that has high output variability because it involves
engineers’ judgement and intuition. This variation can cause underinspection of higher-risk
equipment and overinspection of lower-risk equipment (Geary, 2002), and threaten the
overall integrity of the system.
The advancement of Big Data analytics and artificial intelligence facilitates the
substitution of machines for human labor in highly complex tasks requiring
high qualifications and cognitive skills (Brynjolfsson and McAfee, 2011, 2014). The
potential productivity enhancement and the reduction of knowledge workers' costs
offered by autonomous information processing can produce a stimulus
that accelerates “machine-for-human” replacement (Loebbecke and Picot, 2015).
Productivity improvement is achieved in the sense that machine intelligence
has considerably more processing and computational capability and scalability than
human labor, thus accelerating the completion of tasks and reducing lead time (Frey and
Osborne, 2017; Ekbia et al., 2015). Furthermore, a computer algorithm is considered to be
less susceptible to human biases (Frey and Osborne, 2017) and less error-prone than
humans (Acemoglu and Autor, 2011), which can be determinants for improving the
quality of knowledge-intensive work.
Based on the above argument, this paper aims to develop a computer algorithm, based
on k-means clustering and non-hierarchical classification method, to reduce the lead time
and the output variability inherent in the corrosion loop development process. k-means
clustering is a type of unsupervised machine learning algorithm that divides n data
points into k clusters so that each data point belongs to the cluster whose center is
closest to it. Meanwhile, the non-hierarchical classification model comes from
a manufacturing concept called group technology, which is intended to analyze and to
arrange parts spectrum and the relevant manufacturing process according to the
design and machining similarity so that a basis of groups and families can be established
for rationalizing the production process (Shunk, 1985). k-means clustering and
non-hierarchical classification model are selected because they solve problems that
resemble corrosion loop development process, i.e. to cluster a set of piping with similar
degradation mechanisms into the same group. The integration of k-means clustering and
non-hierarchical classification model is expected to allow a more effective and efficient
corrosion loop development process and generate a “leaner” overall RBI assessment
process. To demonstrate the functionality and performance of the algorithm, a case study
of corrosion loop development for the piping system of an offshore petroleum production
and processing platform is provided.
The remainder of this paper is structured as follows. Section 2 provides an overview of
the RBI and corrosion loop concepts. Section 3 elaborates the elements that construct the
corrosion loop development algorithm. k-means clustering and the non-hierarchical
classification model are briefly reviewed in this section. In Section 4, a case study is
provided to demonstrate the functionality and performance of the corrosion loop
development algorithm on an actual piping data set. A comparison between the output
(i.e. corrosion loops) generated by the corrosion loop development algorithm and by human
intelligence (i.e. manual work) is performed to examine the impact of applying the algorithm
on the output variability. Additionally, process modeling is performed to examine the
impact of algorithm integration on the typical corrosion loop development process, by
comparing the process before and after the algorithm integration. Section 5 gives the results
and discussion corresponding to the case study. Section 6 concludes the paper.

2. RBI assessment and corrosion loop development


RBI assessment is a methodology to achieve a cost-effective inspection plan and to ensure
compliance with regulatory and corporate requirements. RBI prioritizes equipment inspection
based on the equipment risk level, which allows an organization to focus inspection effort on
high-risk equipment and prevents overinspection of low-risk items (Chang et al., 2005). Risk
in RBI is defined as the product of probability of failure (PoF) and consequence of failure
(CoF). PoF is evaluated based on the variables that influence the failure rate of the equipment,
such as type of degradation mechanism, degradation rate, operational conditions,
equipment design, previous inspection effectiveness and results, and equipment age. CoF is
assessed based on the factors that impact the magnitude of hazards in the event of
hydrocarbon release, such as the type of substance contained in the equipment and process
conditions. RBI considers four categories of consequence effect: personnel safety and health
impact, environmental impact, production losses and facility repair costs (API, 2016).
A typical RBI methodology is shown in Figure 1.
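
The risk definition above (risk as the product of PoF and CoF) can be illustrated with a short sketch. The piping tags, the PoF and CoF values and the names `equipment`, `risk` and `ranking` are hypothetical and chosen only for illustration; they are not taken from the case study.

```python
# Hypothetical piping tags with illustrative PoF (per year) and CoF (cost units).
# None of these values come from the paper.
equipment = {
    "P-101": {"pof": 0.02, "cof": 500.0},
    "P-102": {"pof": 0.10, "cof": 50.0},
    "P-103": {"pof": 0.01, "cof": 100.0},
}


def risk(item):
    """Risk as the product of probability of failure and consequence of failure."""
    return item["pof"] * item["cof"]


# Rank the tags from highest to lowest risk so inspection effort goes to the top of the list.
ranking = sorted(equipment, key=lambda tag: risk(equipment[tag]), reverse=True)
print(ranking)
```

In this toy ranking, a tag with a low PoF but a high CoF can still come first, which is the prioritization behavior that RBI relies on.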
One of the integral parts of RBI assessment is the development of corrosion loops, which
is typically an element of degradation mechanism assessment and PoF assessment (Chang
et al., 2005). Degradation mechanism assessment involves identifying relevant degradation
mechanisms and evaluating their rate in particular piping/equipment (API, 2011). The corrosion
loop approach is normally applied to piping systems, as piping arguably has higher
complexity than other types of equipment (e.g. pressure vessels, tanks, heat exchangers,
etc.) due to its arrangement and quantity. This complexity creates immense challenges in
performing RBI assessment because assessing the degradation mechanisms for every individual
piping in the entire production system entails a significant number of man-hours and costs.
The corrosion loop approach enables a systematic simplification of degradation
mechanism assessment by grouping a set of piping with similar operational and design
characteristics (Mohammed et al., 2017). This approach allows a group of piping to be
analyzed simultaneously, thus reducing non-value-adding activities by eliminating
repetitive degradation mechanism assessment for piping and equipment with similar
operational and design characteristics. It reduces the magnitude of RBI assessment by
simplifying the identification of credible damage mechanisms and ensuring that the applied
inspection techniques are suitable for the corresponding damage mechanisms and piping
metallurgy (Matthews et al., 2014).

Figure 1. RBI methodology: data and information collection; risk assessment process
(consequence of failure assessment and probability of failure assessment); risk ranking;
inspection plan and mitigation (if any); reassessment
Source: API (2016)
In practice, various terms are used for this approach, e.g. corrosion loop, corrosion circuit,
piping group, piping circuit, etc. No consensus regarding the definition and distinction of
each term has been made and they are often used interchangeably. In this paper, the term
corrosion loop is used and defined as a group of piping that has similar operational and
design characteristics, and thus undergoes similar degradation mechanisms and rates.
The key to implementing corrosion loops is defining the boundary features,
i.e. the features that are used to separate piping into different corrosion loops and
thereby distinguish one corrosion loop from the others. A detailed explanation of
boundary features is given in Section 3.2. In practice, each organization may have a different
set of boundary features for developing corrosion loops. Boundary features normally
include the operational and design characteristics of piping that influence the occurrence
and rate of particular degradation mechanisms, such as:
• process conditions, e.g. temperature and/or pressure changes, fluid type and phase
transition, removal/addition of fluid constituents, etc.;
• piping materials of construction;
• piping external condition, e.g. the presence of piping insulation and coating; and
• process units’ battery limits.
In practice, engineers develop corrosion loops manually by referring to particular guidelines
and documentations that exhibit information regarding piping design and operational
characteristics. Additional premises based on engineering judgement and experience are
normally added during the corrosion loop development process. For instance, it is common for
RBI/corrosion engineers to assume that non-hydrocarbon containing units (e.g. firewater
system, air compressor system, corrosion inhibitor injection system, etc.) are less susceptible
to corrosion, i.e. significant difference in operating pressure and temperature of piping in these
units would not affect the type of damage mechanisms and degradation rate of the piping.
This causes the engineers to loosen the boundary requirements, such as grouping a set of
piping into the same corrosion loop even though their operating pressure and temperature are
considerably different. This type of assumption reflects engineers’ unique interpretation
regarding the degradation mechanisms and rates similarity of certain piping, which is the
basic premise to develop a corrosion loop. As each person has different knowledge and
experience, and a unique mental model to approach the essence of a corrosion loop, the
outputs of the corrosion loop development process are uncertain and highly varied. Even the
same engineer can make different assumptions for different piping and corrosion loops.
Consequently, it is difficult to determine whether particular outputs are satisfactory,
which in turn makes it difficult to establish benchmarks for evaluating corrosion loop
development process outcomes (Alvesson, 1993). Furthermore, due to the involvement of
engineering knowledge, judgement and experience, corrosion loop development can be
considered as knowledge-intensive work (Drucker, 1999).
Corrosion loops are normally visualized by marking a process flow diagram (PFD) or a
process and instrumentation diagram with different colors, with each color representing a
corrosion loop. An example of PFD marking to visualize corrosion loops in a gas lift
compressor system in an offshore production system is shown in Figure 2.
A typical corrosion loop development workflow is shown in Figure 3. It can be inferred
that corrosion loop development process is inherently a clustering problem with repetitive
nature because the workflow has to be performed repeatedly for all the piping in the RBI
assessment scope (see the shaded area of Figure 3). Clustering can be defined as the task of
classifying a set of objects by comparing the similarity between objects based on multiple
variables. In the case of corrosion loop development, the objects are the piping and the
variables are the boundary features. The aim of this paper is to create an algorithm based on
k-means clustering and non-hierarchical classification model to automate the corrosion
loop development process. The algorithm will replace the shaded part of the workflow
shown in Figure 3.

Figure 2. An example of PFD marking to visualize corrosion loops in a gas lift compressor
system (corrosion loops CL-1 to CL-5, each annotated with its operating condition, fluid,
material and external condition)

Figure 3. A typical corrosion loop development workflow. For each piping tag, the line list
and PIDs/PFDs are reviewed and the tag is compared with the adjacent piping/equipment in
terms of process unit, materials of construction, fluid type or phase, external condition and
operating pressure/temperature. The outcome is either to start a new corrosion loop or to
maintain the corrosion loop of the adjacent piping/equipment, after which the piping tag is
marked on the PIDs/PFDs. The process ends when all of the piping in the RBI assessment
scope have been analyzed
Note: The shaded part of the workflow is replaced and performed by the algorithm based on
k-means clustering and non-hierarchical classification model
Source: Adapted from Mohammed et al. (2017)

3. Algorithm development
This section discusses the elements that construct the corrosion loop development
algorithm. It comprises three main parts. The first part (Section 3.1) provides
an overview of the key methods that construct the corrosion loop development algorithm
(i.e. the non-hierarchical classification model and k-means clustering). The second part
(Section 3.2) discusses the concepts of boundary features and boundary requirements, which
are the foundation on which the algorithm clusters a set of individual piping into corrosion
loops. The third part (Section 3.3) discusses the corrosion loop development algorithm and
its corresponding pseudocode.

3.1 Overview of methods


3.1.1 Non-hierarchical classification model. Group technology is a manufacturing concept to
analyze and to arrange parts spectrum and the relevant manufacturing process according to
the design and machining similarity so that a basis of groups and families can be
established for rationalizing the production process in the area of small and medium batch
sizes. The basic concept is relatively simple: identify and bring together items that are related
by similar attributes, and then take advantage of similarities to develop simplified and
rationalized procedures in all stages of design and manufacture (Shunk, 1985). The term
similar attributes may mean similar design features, similar production requirements,
similar inspection requirements, etc. Since group technology involves identification of items
with similar attributes, the topic has always been closely associated with classification
processes, as a structured way to bring similar items together, usually by the design
features of shape, size, material and function, or, alternatively by similar production
requirements (Knight, 1998).
Billo and Bidanda (1995) categorize group technology application into three modes:
hierarchical, non-hierarchical and hybrid. A hierarchical classification model generates
classification from general to specific with respect to each feature. Non-hierarchical
classification model attempts to classify parts based on several related features that are not
necessarily hierarchical. The intention is to include diverse and independent features, e.g.
material type, geometry, processing type, etc. Meanwhile, the hybrid model combines
hierarchical and non-hierarchical concepts (Billo and Bidanda, 1995). In this study, the
non-hierarchical classification model is used because corrosion loops depend on diverse and
independent features that are not related hierarchically, such as material type, insulation
type, fluid containment type, etc.
An illustration of piping classification based on the non-hierarchical classification model
is shown in Figure 4. X represents a single value of a particular feature for classification. For
instance, X2,1 is the first value of feature 2. The group name represents the grouping that is
produced after classifying a set of piping through n features. In this study, the non-hierarchical
classification model is used mainly for piping classification based on categorical features.
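
As a minimal sketch of this idea, piping can be grouped by the tuple of its categorical feature values, so that the tuple plays the role of the group name. The function name `classify`, the feature names and the piping records are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def classify(piping, features):
    """Group piping tags by the tuple of their categorical feature values.

    The tuple acts as the group name, e.g. (material, fluid, insulation).
    """
    groups = defaultdict(list)
    for record in piping:
        group_name = tuple(record[f] for f in features)
        groups[group_name].append(record["tag"])
    return dict(groups)


# Hypothetical piping records (not from the case study).
piping = [
    {"tag": "L-001", "material": "Carbon Steel", "fluid": "Gas", "insulation": "Uninsulated"},
    {"tag": "L-002", "material": "SS316L", "fluid": "Gas", "insulation": "Uninsulated"},
    {"tag": "L-003", "material": "Carbon Steel", "fluid": "Gas", "insulation": "Uninsulated"},
    {"tag": "L-004", "material": "SS316L", "fluid": "Condensate", "insulation": "Uninsulated"},
]

groups = classify(piping, ["material", "fluid", "insulation"])
for name, tags in groups.items():
    print(name, tags)
```

Because the features are combined into a flat tuple, no feature takes precedence over another, which matches the non-hierarchical character of the model.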
Figure 4. An illustration of piping classification based on non-hierarchical classification
model: all piping is split successively by Feature 1, Feature 2, …, Feature n–1 and Feature n,
and each resulting group is named by its tuple of feature values, from
(X1,1, X2,1, …, Xn–1,1, Xn,1) to (X1,i, X2,j, …, Xn–1,k, Xn,l)

3.1.2 k-means clustering. Clustering is the task of partitioning data points into disjointed
clusters such that the data points grouped in the same cluster are similar to each
other according to a number of pre-defined criteria, yet different from the data points in
other clusters (Žalik, 2008). The term “similar”, in a clustering problem, means proximate by
a defined similarity measure (Khan and Ahmad, 2004). Clustering analysis is commonly
used in areas such as pattern recognition, data mining, information retrieval and knowledge
discovery (Kanungo et al., 2002).
Clustering is exploratory in nature and reveals structure in data (Jain, 2010).
Moreover, clustering is inherently a subjective matter that requires interpretation using
particular domain knowledge. Among machine learning clustering techniques, k-means is the
most widely used due to its simplicity and efficiency (Huang et al., 2005;
Žalik, 2008). The objective of k-means clustering is to partition a set of n data points in
m-dimensional space into k distinct clusters. In k-means clustering, each cluster is represented
by a centroid or cluster center, and the distances between the centroids and the data points are
computed. Then, the data points are assigned to the cluster that has the closest cluster center.
Let $X = \{x_i \mid i = 1, 2, \ldots, n\}$ be a set of n data points and
$B = \{b_j \mid j = 1, 2, \ldots, k\}$ be a set of k cluster centers, with the cluster $B_j$ of each
center $b_j$ containing $n_j$ data points, $0 < n_j < n$. The k-means algorithm aims
to minimize the following function (Žalik, 2008; Khan and Ahmad, 2004):

$$\mathrm{Cost} = \sum_{j=1}^{k} \sum_{x_t \in B_j} \mathrm{dist}(x_t, b_j), \tag{1}$$

where $x_t$ represents a member of cluster $B_j$ with $b_j$ as the cluster center. The cluster centers are
computed and learned by performing the following steps (Žalik, 2008; Khan and Ahmad, 2004):
(1) Initialize k cluster centers $b_1, b_2, \ldots, b_k$ using random sampling.
(2) Determine the membership of each data point $x_i$ in one of the clusters by finding the
nearest cluster center.
(3) Compute the new cluster centers $b_j$ as:

$$b_j = \frac{\sum_{x_t \in B_j} x_t}{|B_j|}, \tag{2}$$

where $|B_j|$ is the number of data points that are the members of the jth cluster.
(4) Iterate steps 2 and 3 until all centers converge, i.e. no change in all cluster centers values.
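
The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the names `sq_dist` and `kmeans`, the `seed` and `max_iter` parameters, the handling of empty clusters and the sample points are all introduced here for illustration, and the squared Euclidean distance is assumed as dist.

```python
import random


def sq_dist(x, b):
    """Squared Euclidean distance between a data point x and a cluster center b."""
    return sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def nearest(x, centers):
    """Index of the cluster center closest to x."""
    return min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (centers, labels, cost) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Step 1: initialize k cluster centers by random sampling of the data points.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Step 2: assign every data point to its nearest cluster center.
        labels = [nearest(x, centers) for x in points]
        # Step 3: recompute each center as the mean of its members (Equation 2).
        updated = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(centers[j])  # assumption: keep the center of an empty cluster
        # Step 4: stop once no cluster center changes.
        if updated == centers:
            break
        centers = updated
    labels = [nearest(x, centers) for x in points]
    cost = sum(sq_dist(x, centers[lab]) for x, lab in zip(points, labels))  # Equation 1
    return centers, labels, cost


# Illustrative data: two well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, labels, cost = kmeans(points, 2)
print(labels, round(cost, 3))
```

The cluster indices themselves are arbitrary; only the partition of the points matters.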
In general, k-means clustering entails three user-specified parameters (Jain, 2010):
(1) Number of clusters: k-means clustering requires the number of clusters k to be
pre-determined. This creates a major challenge, as the initial estimation of k
requires a priori knowledge regarding the data. Performing a number of clustering
attempts with varied values of k is a way to find the most appropriate value of k for a
particular data set (Wagstaff et al., 2001);
(2) Cluster initialization: k-means clustering starts by locating cluster centers arbitrarily.
Consequently, different cluster initializations may lead to different clusterings because
k-means only converges to local minima (Jain, 2010). The downside is that the
algorithm outputs are sensitive to the initial positions of the cluster centers.
To overcome this, numerous runs of k-means clustering are conducted to achieve the
optimal clusters, i.e. clusters that achieve the lowest cost function value (Likas et al.,
2003; Jain, 2010); and
(3) Distance metric: As referred to in the cost function (Equation 1), k-means clustering
requires the distance metric $\mathrm{dist}(x_t, b_j)$ to be specified. The squared
Euclidean distance between each cluster center $b_j$ and data point $x_t$ is commonly
used as the distance metric (Likas et al., 2003):

$$\mathrm{dist}(x_t, b_j) = \lVert x_t - b_j \rVert^2. \tag{3}$$

With this metric, the cost function is also commonly known as inertia, a measure of how
internally coherent the clusters are (Pedregosa et al., 2011). Consequently, the clusters
typically have a spherical shape, which makes k-means clustering poor at handling clusters
with irregular shapes (Jain, 2010).

3.2 Corrosion loop boundary features and requirements


As mentioned in Section 2, a corrosion loop is defined as a set of piping that has similar
operational and design characteristics, and thus undergoes similar degradation mechanisms
and rates. According to this definition, the algorithm's main task is to cluster a set of piping
based on particular features and requirements. Thus, the algorithm needs a set of rules as an
input to segment a set of piping into corrosion loops. In this paper, the features that are used
to separate piping into different corrosion loops are referred to as boundary features and the
corresponding requirement of each boundary feature is called a boundary requirement.
For instance, the guideline to develop corrosion loops states that a corrosion loop should
consist of a set of piping that have similar operating temperature, where the operating
temperature difference between each piping cannot be more than 10 °C. In this example, the
boundary feature is “operating temperature” and the boundary requirement is that “the
difference of operating temperature between each piping cannot exceed 10 °C.” Boundary
features normally include the operational and design characteristics of piping that influence
the occurrence and rate of particular degradation mechanisms.
It should be noted that each organization may have a different set of boundary features
and requirements for developing corrosion loops. Defining a boundary requirement for
each boundary feature is essential as it provides detailed and precise information about the
exact limits of a corrosion loop and prevents any ambiguity in developing corrosion
loops. Providing just a qualitative description (e.g. “similar,” “comparable,” “analogous,”
etc.) as the boundary requirement may create vagueness that can be interpreted
differently by different persons. The consequence is inconsistent outputs, which are
detrimental to the quality of the inspection plan (e.g. overinspection of low-risk piping and/or
underinspection of high-risk piping).
There are two common types of boundary features: categorical and numerical. A categorical
feature can only take one value from a fixed array of possible values. For example, a piping
can contain either heavy hydrocarbon, light hydrocarbon or non-hydrocarbon. Categorical
features can be nominal (i.e. no intrinsic ordering to the categories) or ordinal (i.e. possible
values are ordered), but they are treated the same in the algorithm. The typical categorical
boundary features for corrosion loop development are piping coating and insulation type, fluid
type and phase, and piping material of construction. Meanwhile, numerical features are variables
that represent a measurable quantity. Some examples of numerical boundary features for corrosion
loop development are pressure, temperature and flow rate. The distinction between these features
is important as they undergo different processes in the algorithm. Categorical and numerical
features have different boundary requirements, which are discussed in Sections 3.2.1 and 3.2.2,
respectively.
3.2.1 Boundary requirements for categorical boundary features. In this paper, it is defined
that all piping in the same corrosion loop shall have identical categorical features value.
Let $K = \{k_i \mid i = 1, 2, \ldots, n\}$ be the total set of n piping tags in the facility,
$C = \{C_j \mid C_j \subseteq K, j = 1, 2, \ldots, m\}$ be a set of m corrosion loops,
$Q = \{q^{(h)} \mid h = 1, 2, \ldots, u\}$ be a set of u categorical boundary features,
$q_{k_t}^{(h)}$ be the value of the hth categorical boundary feature for a single piping
tag $k_t \in C_j$, and $q_{k_z}^{(h)}$ be the value of the hth categorical boundary feature for a
single piping tag $k_z \in C_j$ that is not $k_t$, thus the dissimilarity measure between piping
within the jth corrosion loop ($C_j$) based on categorical boundary features is defined as follows:

$$d_j = \sum_{k_t \in C_j} \; \sum_{\substack{k_z \in C_j \\ k_z \neq k_t}} \; \sum_{h=1}^{u} \delta\left(q_{k_t}^{(h)}, q_{k_z}^{(h)}\right), \tag{4}$$

where:

$$\delta\left(q_{k_t}^{(h)}, q_{k_z}^{(h)}\right) = \begin{cases} 0 & \text{if } q_{k_t}^{(h)} = q_{k_z}^{(h)} \\ 1 & \text{if } q_{k_t}^{(h)} \neq q_{k_z}^{(h)} \end{cases} \tag{5}$$

To ensure all elements of a corrosion loop to have identical categorical features values, $d_j$
shall be equal to 0.
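
The dissimilarity measure of Equations (4) and (5) can be sketched as follows. The function name `categorical_dissimilarity`, the feature names and the piping records are hypothetical and serve only to illustrate how the measure behaves.

```python
def categorical_dissimilarity(loop, features):
    """d_j of Equation (4): mismatches summed over all ordered pairs of distinct piping."""
    d = 0
    for t, kt in enumerate(loop):
        for z, kz in enumerate(loop):
            if t == z:
                continue  # the inner sum skips k_z = k_t
            # Equation (5): 0 for identical values, 1 otherwise, summed over the u features.
            d += sum(1 for h in features if kt[h] != kz[h])
    return d


features = ["material", "fluid"]

# Hypothetical corrosion loops (not from the case study).
uniform_loop = [
    {"material": "Carbon Steel", "fluid": "Gas"},
    {"material": "Carbon Steel", "fluid": "Gas"},
]
mixed_loop = [
    {"material": "Carbon Steel", "fluid": "Gas"},
    {"material": "SS316L", "fluid": "Gas"},
]

print(categorical_dissimilarity(uniform_loop, features))  # 0: requirement fulfilled
print(categorical_dissimilarity(mixed_loop, features))    # 2: one mismatch, counted for both orderings
```

Because the sums run over ordered pairs, each mismatch is counted twice; this does not affect the requirement, which only asks whether the total is zero.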
3.2.2 Boundary requirements for numerical boundary features. Let
$K = \{k_i \mid i = 1, 2, \ldots, n\}$ be the total set of n piping tags in the facility,
$P = \{p^{(g)} \mid g = 1, 2, \ldots, v\}$ be a set of v numerical boundary features, and
$p_{k_t}^{(g)}$ be the value of the gth numerical boundary feature for a single piping
tag $k_t \in C_j$, thus:

$$\Delta P_{C_j}^{(g)} = \max\left\{p_{k_t}^{(g)} \mid k_t \in C_j\right\} - \min\left\{p_{k_t}^{(g)} \mid k_t \in C_j\right\}, \tag{6}$$

where $\Delta P_{C_j}^{(g)}$ is the difference between maximum and minimum value of the gth numerical
boundary feature of a set of piping, which are members of the jth corrosion loop ($C_j$). The
boundary requirement for numerical boundary features in $C_j$ is defined, such that:

$$C_j = \left\{k_t \mid k_t \in C_j \Leftrightarrow \left(\Delta P_{C_j}^{(1)} < a^{(1)}\right) \wedge \left(\Delta P_{C_j}^{(2)} < a^{(2)}\right) \wedge \ldots \wedge \left(\Delta P_{C_j}^{(v)} < a^{(v)}\right)\right\}, \tag{7}$$

where $a^{(1)}, a^{(2)}, \ldots, a^{(v)}$ are the boundary requirement values for the corresponding numerical
boundary features $p^{(1)}, p^{(2)}, \ldots, p^{(v)}$. Let $A = \{a^{(g)} \mid g = 1, 2, \ldots, v\}$ be a set of boundary
requirement values corresponding to P, A is defined such that it is consistent for all
corrosion loops.
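
A check of the numerical boundary requirement in Equations (6) and (7) might look like the sketch below. The function names, the feature names, the limit values and the piping records are assumptions made for illustration, and the strict inequality follows Equation (7).

```python
def numerical_range(loop, feature):
    """Equation (6): maximum minus minimum of one numerical feature within a loop."""
    values = [pipe[feature] for pipe in loop]
    return max(values) - min(values)


def satisfies_requirements(loop, limits):
    """Equation (7): every numerical feature range must stay below its limit a(g)."""
    return all(numerical_range(loop, feature) < limit for feature, limit in limits.items())


# Hypothetical boundary requirement values A (not from the case study).
limits = {"temperature_c": 10, "pressure_barg": 5}

loop_ok = [
    {"temperature_c": 39, "pressure_barg": 20},
    {"temperature_c": 45, "pressure_barg": 22},
]
loop_bad = loop_ok + [{"temperature_c": 111, "pressure_barg": 46}]

print(satisfies_requirements(loop_ok, limits))   # True
print(satisfies_requirements(loop_bad, limits))  # False
```

The same `limits` mapping is applied to every loop, reflecting the statement that A is consistent for all corrosion loops.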

3.3 Corrosion loop development algorithm


Pseudocode of corrosion loop development is as follows:

Function kmeans (n_clusters):
1.  Set n_clusters cluster centers c1, c2, …, ck using random sampling
2.  For i = 1 to (n_clusters + 1):
3.      Determine the membership of each data point in one of the clusters by finding the
        nearest cluster center
4.      Compute the new cluster centers
5.      If all cluster centers have converged:
6.          Return cluster centers and membership of each data point
7.      Else:
8.          Back to the top of the loop

Function CorrosionLoopDevelopment (Dataset, CatBoundFeatures, NumBoundFeatures,
NumBoundReqs):
1.  Group Dataset based on CatBoundFeatures, such that each group has the same value for
    all CatBoundFeatures and store every generated groups into variable GroupedDataSet
2.  Set CorrosionLoopDataset as an empty array to store the results
3.  For each Group in GroupedDataSet:
4.      For k = 1 to (length of Group + 1):
5.          Call kmeans function with n_clusters = k
6.          Fit Group into kmeans model using NumBoundFeatures
7.          Predict the closest cluster each data point in Group belongs to
8.          If NumBoundReqs is fulfilled:
9.              Append all clusters generated from Group into CorrosionLoopDataset
10.         Else:
11.             Back to the top of the loop and try calling kmeans function with another
                k value
12. Return CorrosionLoopDataset
Remark: n_clusters: number of clusters; CatBoundFeatures: a list of categorical boundary
features; NumBoundFeatures: a list of numerical boundary features; NumBoundReqs: a list
of numerical boundary requirements; Dataset: a data set containing information about a set
of individual piping to be grouped; GroupedDataSet: a data set containing information about
a set of individual piping that have been grouped based on CatBoundFeatures; and
CorrosionLoopDataset: a data set containing information about a set of piping that have
been grouped based on CatBoundFeatures and NumBoundFeatures.
To implement the corrosion loop development algorithm, the above-mentioned
pseudocode is used. The kmeans function provides an action to perform k-means
clustering, as described in Section 3.1.2. The CorrosionLoopDevelopment function
represents the corrosion loop development algorithm, which carries out the complete
development of corrosion loops. The CorrosionLoopDevelopment function has two
main steps: grouping piping based on categorical boundary features by using
non-hierarchical classification model, and performing k-means clustering for each
preceding piping group, based on numerical boundary features by calling the kmeans
function. The combination of non-hierarchical classification model and k-means clustering is
used because of the inherent weakness associated with each method. The non-hierarchical
classification model is used mainly for classifying items based on categorical features.
Meanwhile, k-means clustering is not used for categorical features as it can only work on
numerical data. k-means clustering involves the computation of the Euclidean distance,
which requires the data to be numerical (Huang, 1998). While it is possible to convert
categorical data into numerical, this approach does not necessarily produce meaningful
results if the categorical data have no ordinality (Huang, 1998). One approach to handle
categorical values is to convert multiple categorical features into binary numbers, but
this approach expands the number of features and leads to a significant
increase in the computational cost of the algorithm (Ralambondrainy, 1995; Huang, 1998).
It is mentioned in Section 3.2 that each organization may have a different set of boundary
features and boundary requirements for developing corrosion loops. In order to ensure the
applicability and the generalizability of the developed corrosion loop development
algorithm in various sets of boundary features and boundary requirements, the algorithm
has boundary features and boundary requirements as its input parameters. Hence, the
CorrosionLoopDevelopment function has CatBoundFeatures (i.e. categorical boundary
features), NumBoundFeatures (i.e. numerical boundary features) and NumBoundReqs (i.e.
numerical boundary requirements) that serve as the function’s input parameters. The
categorical boundary requirement is not set as one of the input parameters because it is
assumed by default that all piping in the same corrosion loop shall have identical categorical
feature values (see Section 3.2.1).
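The two steps of the pseudocode can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the case study (which relied on Pandas and Scikit-learn): it uses only the standard library, the k-means routine is a plain Lloyd iteration with a fixed iteration cap, the tag and feature names are hypothetical, and all requirement values are assumed to be positive so that the search over k always terminates.

```python
import random
from collections import defaultdict


def kmeans(points, n_clusters, seed=0, max_iter=100):
    """Plain Lloyd iteration; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # random initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(n_clusters),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # memberships stable: converged
            break
        labels = new_labels
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def corrosion_loop_development(dataset, cat_features, num_features, num_reqs):
    # Step 1: group tags that share all categorical boundary feature values.
    groups = defaultdict(list)
    for tag in dataset:
        groups[tuple(tag[f] for f in cat_features)].append(tag)

    loops = []
    for group in groups.values():
        points = [tuple(tag[f] for f in num_features) for tag in group]
        # Step 2: raise k until every cluster meets the numerical requirements.
        for k in range(1, len(group) + 1):
            labels = kmeans(points, k)
            clusters = [[t for t, lab in zip(group, labels) if lab == c]
                        for c in range(k)]
            clusters = [c for c in clusters if c]  # drop empty clusters
            if all(max(t[f] for t in c) - min(t[f] for t in c) < a
                   for c in clusters
                   for f, a in zip(num_features, num_reqs)):
                loops.extend(clusters)
                break  # first k whose clustering satisfies the requirements
    return loops


# Illustrative data set; tags and values are made up for this sketch.
dataset = [
    {"tag": "P-001", "fluid": "gas", "pressure": 10.0, "temperature": 40.0},
    {"tag": "P-002", "fluid": "gas", "pressure": 10.5, "temperature": 42.0},
    {"tag": "P-003", "fluid": "gas", "pressure": 25.0, "temperature": 80.0},
    {"tag": "P-004", "fluid": "oil", "pressure": 10.2, "temperature": 41.0},
]
loops = corrosion_loop_development(
    dataset, ["fluid"], ["pressure", "temperature"], [1.0, 10.0])
for loop in loops:
    print(sorted(t["tag"] for t in loop))
```

In this example the gas piping is split into two loops because a single cluster would span 15 barg, while the oil piping forms its own loop even though its operating conditions are close to those of P-001 and P-002.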
The piping in a corrosion loop should be connected to each other. A limitation of this
corrosion loop development algorithm is its inability to process spatial information about the
piping. Consequently, the piping in a generated corrosion loop may not be
connected to each other. To minimize this, a boundary feature related to where the piping is
located (e.g. the name of the system or processing unit) can be used to ensure that the
piping in a corrosion loop are located near each other. However, this does not guarantee
that the piping are connected. To handle
this, the output of the corrosion loop development algorithm can be modified manually
to ensure connection between the piping in a corrosion loop (see Section 5.3).

4. Case study
A case study is provided to demonstrate the functionality and performance of the corrosion
loop development algorithm on an actual piping data set. In this case study, a MacBook Pro
(16GB 2133 MHz LPDDR3 memory, 2.8 GHz Intel Core i7 processor, 256 GB of flash storage,
and macOS 10.14 as the operating system) with Python 3.7 (with the necessary packages such
as NumPy, Pandas and Scikit-learn (Pedregosa et al., 2011)) and Jupyter Notebook are used to
implement the corrosion loop development algorithm and input the data set into the algorithm.
The construction of the algorithm is described in Section 3.3. A piping data set of an offshore
petroleum production and processing platform is used in this case study. The detailed
description of the data set is given in Section 4.1. The selection of boundary features and
requirements for the algorithm is given in Section 4.2. In order to evaluate the ability of the
corrosion loop development algorithm to perform its intended function, a set of indices is
constructed as described in Section 4.3.

4.1 Data set description


The piping data set (i.e. line list) used in this case study was acquired from an actual RBI
assessment project conducted for an offshore petroleum production and processing
platform. The piping system in this platform connects various equipment (e.g. pressure
vessels, tanks, pumps, compressors, etc.), which process produced gas and condensate
through oil-water separation unit, gas treatment unit, gas dehydration unit, etc. The main
process of the offshore petroleum production and processing platform is shown in Figure 5.
A total of 5,699 piping tags are contained in the data set. Each piping tag is described by its
operating characteristics (e.g. fluid containment, operating temperature, operating pressure,
etc.) and design characteristics (e.g. material of construction, coating type, insulation type,
process unit, etc.).
The scope of the RBI assessment project includes corrosion loop development for the
piping system. Besides the piping data set, a document containing corrosion loop development
guideline for the aforementioned offshore petroleum production and processing platform was
given. This documentation is used for determining the pertinent boundary features and
requirements in this particular case. Section 4.2 provides a detailed explanation about the
selection of boundary features and requirements for this case study.

4.2 Boundary features and requirements selection


As mentioned in Section 4.1, a document containing the guideline for developing corrosion
loops for the aforementioned offshore petroleum production and processing platform was
given to determine the boundary features and requirements used in this case study.
Based on the given corrosion loop development guideline, a corrosion loop is defined
as follows:
• a corrosion loop should comprise piping that have identical fluid type and
phase, materials of construction, insulation and coating type;
• piping within a corrosion loop should belong to the same process unit; and
• piping within a corrosion loop should have similar operating pressure and temperature.
Based on these definitions, eight boundary features, including their boundary requirement
and value type, are identified, as shown in Table I.

Table I. Boundary features and their boundary requirement and value type for corrosion loop development

| Boundary feature | Type | Boundary requirement |
|---|---|---|
| Fluid type | Categorical | Piping within a corrosion loop shall have identical fluid type |
| Fluid phase | Categorical | Piping within a corrosion loop shall have identical fluid phase |
| Process unit | Categorical | Piping within a corrosion loop shall have identical process unit |
| Material of construction | Categorical | Piping within a corrosion loop shall have identical material of construction |
| Insulation type | Categorical | Piping within a corrosion loop shall have identical insulation type |
| Coating type | Categorical | Piping within a corrosion loop shall have identical coating type |
| Operating pressure | Numerical | The difference of operating pressure between piping within the same corrosion loop shall not exceed 1 barg |
| Operating temperature | Numerical | The difference of operating temperature between piping within the same corrosion loop shall not exceed 10 °C |

Figure 5. The main process of the offshore petroleum production and processing platform. [Flow diagram. Blocks: from oil/gas wells; oil-gas separation unit; gas treatment unit; gas compression system; gas dehydration unit; export gas to pipeline; to oil storage. Legend: oil, gas, oil-gas mixture.]

The boundary requirements for numerical boundary features need more elaboration as the
guideline only states “similar” for both operating pressure and temperature. If operating
pressure is described as $p^{(1)}$ and operating temperature is described as $p^{(2)}$, the boundary
requirement for numerical boundary features is defined, such that:

$$C_j = \{k_t \mid k_t \in C_j \Leftrightarrow (\Delta P^{(1)}_{C_j} < 1\ \text{barg}) \wedge (\Delta P^{(2)}_{C_j} < 10\ ^{\circ}\text{C})\}. \qquad (8)$$

In other words, the difference of operating pressure and temperature between piping within
the same corrosion loop shall not exceed 1 barg and 10 °C, respectively. The limit values of
1 barg and 10 °C are set based on expert knowledge. Thus, it is assumed that if the difference
of operating pressure and temperature between a set of piping is within 1 barg and 10 °C,
respectively, the damage mechanisms and damage rates of these piping are approximately similar,
given that the other boundary features are the same for this set of piping.
It should be noted that the aforementioned boundary features and requirements are
specific to this case study as they are derived from the project documentation of a certain
facility. In reality, they may differ between organizations as there is still no consensus
on how to develop corrosion loops and which boundary features and requirements
should be included.
The data set is considered as raw data as it is still incomplete and inconsistent and
contains many errors. Therefore, data preprocessing is required to transform the data set
into a format that is readable and understandable by the algorithm and to ensure that the data
are prepared for further processing (Han et al., 2011). The following data preprocessing
is performed:
• remove all features that do not contribute to corrosion loop development process;
• remove all piping tags that have incomplete values in the features included in
corrosion loop development process; and
• format the data into the appropriate data type to resolve inconsistencies.
After data preprocessing, the line list document contains 4,510 piping tags with eight
features, which correspond to the boundary features listed in Table I.
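The three preprocessing steps listed above can be sketched as follows. This is a minimal illustration using only the Python standard library; the column names, the sample rows and the specific cleaning choices (lower-casing categorical values, discarding tags whose numerical entries cannot be parsed) are assumptions made for this example and are not taken from the case study.

```python
def preprocess(rows, cat_features, num_features, id_field="tag"):
    """Keep boundary features only, drop incomplete tags, fix data types."""
    keep = [id_field] + cat_features + num_features
    cleaned = []
    for row in rows:
        # Step 2: skip tags with a missing or blank value in a kept feature.
        if any(row.get(f) is None or str(row[f]).strip() == "" for f in keep):
            continue
        # Step 1: features outside `keep` are simply not copied.
        record = {f: row[f] for f in keep}
        # Step 3: cast numerical features to float, normalize categorical text.
        try:
            for f in num_features:
                record[f] = float(record[f])
        except (ValueError, TypeError):
            continue  # assumption: unparseable numbers count as incomplete
        for f in cat_features:
            record[f] = str(record[f]).strip().lower()
        cleaned.append(record)
    return cleaned


# Illustrative line list rows; the values are made up for this sketch.
rows = [
    {"tag": "P-001", "fluid": " Gas", "pressure": "10.5", "size": "4in"},
    {"tag": "P-002", "fluid": "gas", "pressure": "", "size": "6in"},
    {"tag": "P-003", "fluid": "OIL", "pressure": 3, "size": None},
]
cleaned = preprocess(rows, ["fluid"], ["pressure"])
print(cleaned)
```

Here P-002 is removed because its operating pressure is blank, whereas P-003 is retained because its only missing value belongs to a feature that is not used for corrosion loop development.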

4.3 Algorithm verification index and process modeling


This section discusses the formulation of indices to evaluate the ability of the corrosion loop
development algorithm to perform its intended function based on the given data set. Three
functionality and performance aspects of the algorithm are evaluated:
(1) the ability of the algorithm to fulfill the boundary requirement for each boundary
feature;
(2) the ability of the algorithm to create highly coherent, low-variability corrosion
loops; and
(3) the impact of the algorithm implementation to the overall corrosion loop development
workflow.
In order to verify the first aspect, two indices are introduced and defined in Section 4.3.1.
The second aspect is verified by establishing an indicator that can measure the coherency
and variability of a corrosion loop (see Section 4.3.2). Additionally, a comparison is made
between corrosion loops generated by the algorithm and human intelligence (i.e. manual
work) to determine whether the algorithm can generate corrosion loops with higher
coherency and lower variability than the manual work. To confirm the third aspect, process
modeling is performed by comparing the corrosion loop development workflow before and
after the algorithm implementation (see Section 4.3.3).
4.3.1 Boundary requirement verification index. The boundary requirement verification index
is defined to confirm whether the algorithm can satisfy the boundary requirement for each
boundary feature. A different index is used for categorical and numerical features:
(1) Categorical boundary requirement verification index: as defined in Section 3.2.1, the
dissimilarity measure between piping within the $j$th corrosion loop based on
categorical boundary features is $d_j$, and to ensure that all elements of a corrosion
loop have identical categorical feature values, $d_j$ shall be equal to 0. Let
$C = \{C_j \mid j = 1, 2, \dots, m\}$ be a set of $m$ corrosion loops generated by the algorithm; thus, the
verification index for the categorical boundary requirement is defined as follows:

$$VI_{Cat} = \sum_{j=1}^{m} d_j. \qquad (9)$$

If $VI_{Cat} = 0$, then all corrosion loops fulfill the categorical boundary requirement.
(2) Numerical boundary requirement verification index: if operating pressure
is described as $p^{(1)}$ and operating temperature is described as $p^{(2)}$, and $\Delta P^{(1)}_{C_j}$ and
$\Delta P^{(2)}_{C_j}$ are the difference between the maximum and minimum value of operating
pressure and temperature of a set of piping members of the $j$th corrosion loop,
respectively, as defined in Section 3.2.2, then the verification indices for operating
pressure and temperature are described as follows:

$$VI^{(1)}_{Num} = \max\{\Delta P^{(1)}_{C_j} \mid j = 1, 2, \dots, m\}, \qquad (10)$$

$$VI^{(2)}_{Num} = \max\{\Delta P^{(2)}_{C_j} \mid j = 1, 2, \dots, m\}. \qquad (11)$$

If $VI^{(1)}_{Num} < 1$ barg and $VI^{(2)}_{Num} < 10$ °C, then the model satisfies all numerical boundary
requirements stated in the previous section.
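The verification indices of Equations (9) to (11) can be computed with a few lines of Python. The sketch below is illustrative only: the exact form of the dissimilarity measure is given in Section 3.2.1, so the stand-in used here (the number of categorical features that take more than one value within a loop) is an assumption that merely shares the property of being 0 when all categorical values are identical. Feature names and sample values are also hypothetical.

```python
def vi_cat(loops, cat_features):
    """Eq. (9): sum over loops of a stand-in dissimilarity measure.

    The stand-in counts the categorical features that are not uniform
    within a loop, so it is 0 exactly when the loop is homogeneous.
    """
    return sum(
        sum(len({tag[f] for tag in loop}) > 1 for f in cat_features)
        for loop in loops
    )


def vi_num(loops, feature):
    """Eqs. (10)-(11): largest max-min spread of one feature over all loops."""
    return max(
        max(tag[feature] for tag in loop) - min(tag[feature] for tag in loop)
        for loop in loops
    )


# Illustrative corrosion loops; the values are made up for this sketch.
loops = [
    [{"fluid": "gas", "pressure": 10.0, "temperature": 40.0},
     {"fluid": "gas", "pressure": 10.5, "temperature": 42.0}],
    [{"fluid": "oil", "pressure": 3.0, "temperature": 55.0}],
]
print(vi_cat(loops, ["fluid"]))
print(vi_num(loops, "pressure"), vi_num(loops, "temperature"))
```

A set of loops passes verification in this sketch when `vi_cat` returns 0 and each `vi_num` value stays below the corresponding requirement value.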
4.3.2 Coherency and variability index. Besides verifying the ability of the algorithm to
meet the boundary requirements of each boundary feature, it is important to understand the
capability of the algorithm to create highly coherent, low-variability corrosion loops.
To achieve that, a coherency and variability index is defined. Before defining the index, the
concept of inertia is introduced. Let $R = \{r^{(s)} \mid s = 1, 2, \dots, w\}$ be a set of all corrosion loop
boundary features (i.e. $R = Q \cup P$); the inertia of corrosion loop $C_j$ is described as follows:

$$I_j = \sum_{k_t \in C_j} \sum_{s=1}^{w} \left( r^{(s)}_{k_t} - \mu^{(s)}_{r_{C_j}} \right)^2, \qquad (12)$$

where $r^{(s)}_{k_t}$ is the value of the $s$th boundary feature of piping $k_t$, where $k_t$ is a member of
corrosion loop $C_j$, and $\mu^{(s)}_{r_{C_j}}$ is the average value of the $s$th boundary feature of all piping
members of $C_j$. In other words, the inertia of a corrosion loop $I_j$ is the sum of the squared
Euclidean distances between the boundary feature values of each member and the corresponding
average boundary feature values in a particular corrosion loop. Thus, the average inertia ($\mu_I$) for a set
of corrosion loops $C = \{C_j \mid j = 1, 2, \dots, m\}$ is described as follows:

$$\mu_I = \frac{\sum_{j=1}^{m} I_j}{m}, \qquad (13)$$

where $I_j$ is the inertia of the $j$th corrosion loop and $m$ is the total number of corrosion loops.
$\mu_I$ is set as the index to determine the coherency and variability in the corrosion loops.
A lower $\mu_I$ value indicates higher coherency and lower variability between piping within a
corrosion loop, i.e. the members of the corrosion loop have higher similarity among each
other. As the computation of $\mu_I$ involves the calculation of the Euclidean distance, which
requires all features to be numerical, all categorical values are converted into
binary numbers. For instance, if the coating type feature has two values, which are coated
and uncoated, the feature will be divided into two features: coating type coated and coating
type uncoated, with 1 and 0 being the possible values for each feature. If a piping is coated,
then it has 1 as the value for feature coating type coated and 0 for feature coating type
uncoated. Additionally, as the range of values between boundary features varies, it is
necessary to standardize the boundary feature values so that they are comparable and
contribute equally to the $\mu_I$ value. Each standardized feature is rescaled by removing its mean and
scaling it to unit variance, so that it has $\mu = 0$ and $\sigma = 1$.
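The computation described above (binary conversion of categorical values, standardization, then Equations (12) and (13)) can be sketched in Python as follows. The sketch uses only the standard library and makes several assumptions for illustration: features are standardized with the population standard deviation, a constant feature is mapped to zeros, and the feature names, loop labels and values are hypothetical.

```python
from statistics import mean, pstdev


def one_hot(rows, feature):
    """Split one categorical feature into one binary column per value."""
    values = sorted({r[feature] for r in rows})
    return {f"{feature}_{v}": [1.0 if r[feature] == v else 0.0 for r in rows]
            for v in values}


def standardize(column):
    """Rescale a column to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]


def average_inertia(rows, loop_ids, cat_features, num_features):
    columns = {}
    for f in cat_features:
        columns.update(one_hot(rows, f))
    for f in num_features:
        columns[f] = [float(r[f]) for r in rows]
    # One standardized feature vector per piping tag.
    vectors = list(zip(*(standardize(col) for col in columns.values())))
    loops = {}
    for vec, loop_id in zip(vectors, loop_ids):
        loops.setdefault(loop_id, []).append(vec)
    inertias = []
    for members in loops.values():
        centroid = [mean(col) for col in zip(*members)]
        # Eq. (12): squared distances of members to the loop average.
        inertias.append(sum((x - c) ** 2
                            for vec in members for x, c in zip(vec, centroid)))
    return mean(inertias)  # Eq. (13)


# Illustrative piping tags; the values are made up for this sketch.
rows = [
    {"coating": "coated", "pressure": 10.0},
    {"coating": "coated", "pressure": 10.0},
    {"coating": "uncoated", "pressure": 20.0},
    {"coating": "uncoated", "pressure": 20.0},
]
coherent = average_inertia(rows, ["A", "A", "B", "B"], ["coating"], ["pressure"])
mixed = average_inertia(rows, ["A", "B", "A", "B"], ["coating"], ["pressure"])
print(coherent, mixed)
```

Grouping identical piping together yields an average inertia of zero, whereas mixing coated and uncoated piping in the same loop yields a clearly larger value, which is the behavior the index is intended to capture.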
There is a need to compare μI of the corrosion loops generated by the algorithm with μI of
the corrosion loops generated by the manual work in order to understand whether the
algorithm can produce corrosion loops with higher coherency and lower variability than
corrosion loops generated by the manual work. The data set used in this case study contains
corrosion loop information for each piping, which was generated by manual work. These
data are used for comparative analysis. To establish an equal comparison, the evaluation of
the μI value is made when the algorithm and the manual work achieve the same total number of
corrosion loops.
4.3.3 Process modeling. Process modeling is performed to examine the impact of
algorithm integration on the corrosion loop development process by comparing the process
before and after the algorithm implementation. The flow chart technique is used to represent
the sequence of work in the corrosion loop development process. A flow chart is selected
because it is flexible, easy to communicate and can provide a high level of detail of the
process being examined (Aguilar-Saven, 2004).

5. Results and discussion


5.1 Boundary requirement verification index evaluation
From Table II, it can be seen that the algorithm is able to conform to the boundary
requirements of each boundary feature. $VI_{Cat}$ of the algorithm is equal to 0, which satisfies
the boundary requirement for categorical boundary features. For numerical features, the
algorithm produces 0.99 and 9.0 for $VI^{(1)}_{Num}$ and $VI^{(2)}_{Num}$, respectively, which also fulfill the
boundary requirements for numerical boundary features.

Table II. Algorithm verification index evaluation results

| Boundary feature | Verification index | Expected output | Algorithm output |
|---|---|---|---|
| Categorical features | $VI_{Cat}$ | 0 | 0 |
| Operating pressure | $VI^{(1)}_{Num}$ | < 1 barg | 0.99 barg |
| Operating temperature | $VI^{(2)}_{Num}$ | < 10 °C | 9.0 °C |

5.2 Coherency and variability index evaluation

As stated in Section 4.3.2, a comparison between the value of average inertia (μI) of corrosion
loops generated by the algorithm and the manual work is performed to understand whether
the algorithm is able to develop results with higher coherency and lower variability than
manual work. Only one sample of manual work is available, which produces 256 corrosion loops
in the given set of piping. To make the comparison impartial, the algorithm is set up to generate
an equal number of corrosion loops. Table III shows the summarized performance of the
two methods.
It can be seen that the algorithm produces a lower μI than the manual work with the same
number of generated corrosion loops, i.e. the algorithm enables the development of corrosion
loops with higher coherency and lower variability. To determine whether the difference in
μI value between the algorithm and the manual work is statistically significant,
a two-sample t-test is performed. The two-sample t-test is normally applied when the
members of the two samples are not identical (Hartshorn, 2015). Let $\mu_{I,A}$ and $\mu_{I,B}$ be the
average inertia produced by the algorithm and manual work, respectively. The null
hypothesis is $\mu_{I,B} - \mu_{I,A} = 0$, while the alternative hypothesis is $\mu_{I,B} - \mu_{I,A} \neq 0$. Table IV shows
the results of the two-sample t-test.
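For readers who wish to reproduce this type of test, the following Python sketch computes the mean difference, its standard error, the t-value and the degrees of freedom for two samples of per-loop inertia. The variant of the two-sample t-test is not specified here, so the sketch assumes the unequal-variance (Welch) form; the sample values are made up and are not the case study data. The significance level would additionally require the t-distribution, which is available in statistical packages but not in the standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (mean difference b - a, standard error, t-value, df).

    Assumes the unequal-variance (Welch) two-sample t-test with the
    Welch-Satterthwaite approximation for the degrees of freedom.
    """
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    diff = mean(sample_b) - mean(sample_a)
    se = sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return diff, se, diff / se, df


# Hypothetical per-loop inertia values (not taken from the case study).
inertia_algorithm = [0.2, 0.4, 0.3, 0.5]
inertia_manual = [5.0, 30.0, 12.0, 40.0]
diff, se, t_value, df = welch_t(inertia_algorithm, inertia_manual)
print(round(diff, 2), round(se, 2), round(t_value, 2), round(df, 2))
```

A positive mean difference with a large t-value would favor rejecting the null hypothesis $\mu_{I,B} - \mu_{I,A} = 0$.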
It can be inferred that the average inertia generated by the algorithm is lower than
that of the manual work at the 1% statistical significance level. This implies that the algorithm is able
to produce the same number of corrosion loops with higher coherency and lower variability
than the manual work. Discrepancy and inconsistency in corrosion loops generated by
manual work are not unusual due to the complex configuration of offshore piping systems
(Mohammed et al., 2017). Jain (2010) argues that humans are naturally adept at
identifying clusters in two or three dimensions. Therefore, a computer algorithm
is required to reveal clusters in high-dimensional data and to reduce the subjectivity and
inherent variability embedded in the clustering outputs (Jain, 2010). Additionally, algorithm
intelligence is considered to be less susceptible to human biases (Frey and Osborne, 2017)
and less error-prone than human intelligence (Acemoglu and Autor, 2011), which can be the
determinants for improving the quality of cognitively demanding work. Staats et al. (2011)
mention that all work shall be highly specified as to content, sequence, timing and outcome
in order to capture an essential element of a lean system. The utilization of the algorithm
inherently compels the standardization of the inputs, process, and boundary features and
requirements. The algorithm ensures that every piping undergoes an identical process. The
inputs and parameters of the algorithm are also highly specified, which ensures the
uniformity of the outcomes.

Table III. Coherency and variability index comparisons between corrosion loops generated by the algorithm and manual work

| Method | Number of corrosion loops | μI |
|---|---|---|
| Algorithm | 256 | 0.39 |
| Manual work | 256 | 26.6 |

Table IV. The results of two-sample t-test

| Mean difference | SE of mean difference | 95% confidence interval of the difference (lower) | 95% confidence interval of the difference (upper) | Degree of freedom | t-value | Sig. level (two-tails) |
|---|---|---|---|---|---|---|
| 26.21 | 10.47 | 5.69 | 46.74 | 255.11 | 3.73 | 0.0002 |

5.3 Changes in the workflow after the implementation of the algorithm

The integration of the algorithm into the corrosion loop development process can be
considered as an effort to automate some parts of the process, making it less vulnerable to
inherent problems of manual work. Bortolotti and Romano (2012) assert that “the automation
is like a magnifying glass that reveals, accelerates and exalts the improvements.” While the
automation of a malfunctioning process is harmful, the automation of a streamlined process is
likely to expedite the achievement of the process objectives and strengthen the competitive
advantages (Bortolotti and Romano, 2012). Furthermore, Spear and Bowen (1999) argue that
in order to be more cost effective, efficient and lean, the pathway for every product and
service must be simple and direct, i.e. the process architecture should be simple.
The workflow after implementing the algorithm is shown in Figure 6. It can be inferred
that the algorithm is able to simplify the initial corrosion loop development workflow
(see Figure 3) by handing the repetitive part over to the algorithm. The
workflow simplification is expected to be a factor that drives the reduction of the
man-hours required to complete a corrosion loop development process.
Additionally, the algorithm enables more convenient modifications if there are any
changes regarding the corrosion loop boundary requirements. This is especially important
considering the nature of the corrosion loop development process, which is constantly changing,
i.e. interactions and activities are added, eliminated, and/or iterated if the state of the work
requires (Browning et al., 2006). The personnel just have to change the requirements of the
algorithm and the corrosion loops will be regenerated automatically. A more laborious
process is required if manual work is used for revising and modifying the corrosion loops.
The “Preprocess piping list data” step, which does not exist in the workflow before the algorithm
integration, is added because the input data need to be refined to make them readable and
understandable by the algorithm. Data preprocessing is always required to prepare the
input data before running the algorithm, as errors and incomplete and inconsistent data may
exist in the input data and can undermine the algorithm results.
The “review” and “modify” corrosion loops steps at the end of the workflow are added to
allow incorporation of tacit knowledge (e.g. personal belief, values, expertise and
individual experience) into the algorithm outputs. Due to its knowledge intensiveness,
corrosion loop development work may still depend on the tacit knowledge of the engineers.
While the ability of machine learning algorithms to process explicit knowledge has already
been proven, their ability to handle judgements and intuition based on tacit knowledge is
still questionable (Tsoukas and Vladimirou, 2001). Hendriks and Vriens (1999) argue that
tacit knowledge is imperative for making knowledge productive, although its transmission
and codification remain problematic. Therefore, it is deemed necessary to supplement
algorithm outputs with human intelligence to capture tacit knowledge (Günther et al.,
2017; Shollo and Galliers, 2016) and to refine insights generated by the algorithm
intelligence (Sharma et al., 2014). In the case study, the results generated from the
algorithm can be considered as a foundation to facilitate the integration of the general
rules of corrosion loop development with additional information and knowledge from the
engineers. Review and modification of the corrosion loops generated by the algorithm are
inserted as the last steps of the suggested workflow to allow adjustment of the algorithm
outputs based on expert knowledge.

Figure 6. Corrosion loop development workflow after the utilization of the algorithm. [Flow chart. Steps: review line list; preprocess line list data; input data into the k-means clustering algorithm; mark PIDs/PFDs based on algorithm outputs; review marked PIDs/PFDs; modify corrosion loop (if necessary).]
6. Conclusion and practical implications
This paper aims to reduce the amount of time and output variability of corrosion loop
development process by utilizing k-means clustering and non-hierarchical classification
method. An algorithm is constructed based on these two techniques. Python and Jupyter
Notebook are used as the tools for constructing the algorithm. A case study from the
corrosion loop development of an offshore petroleum production and processing platform is
provided to demonstrate the approach. Verification is performed to ensure that the
algorithm can produce outputs that fulfill the pre-defined conditions stated in the task
guideline. The outputs generated by the algorithm and manual work (i.e. work performed by
the engineer) are compared. It is evident that the algorithm is able to produce outputs with
lower variability than the manual work. A workflow analysis is conducted to examine the
changes in the workflow of corrosion loop development after the implementation of
the algorithm. The inclusion of the algorithm in the workflow enables the simplification of
the overall corrosion loop development process by handing the repetitive part over to
the algorithm. The algorithm also allows more convenient modifications and
revisions if there are any changes regarding the corrosion loop boundary requirements.
Overall, the algorithm based on k-means clustering and the non-hierarchical classification
model has the potential to generate a “leaner” corrosion loop development process by
reducing the required engineering man-hours and output variability. Although the
algorithm allows a part of the corrosion loop development workflow to be automated, it is still
deemed necessary to allow the incorporation of the engineer’s expertise, experience and
intuition into the algorithm outputs in order to capture tacit knowledge and to refine
insights generated by the algorithm intelligence.

References
Acemoglu, D. and Autor, D. (2011), “Skills, tasks and technologies: implications for employment and
earnings”, Handbook of Labor Economics, Vol. 4, pp. 1043-1171.
Aguilar-Saven, R.S. (2004), “Business process modelling: review and framework”, International Journal
of Production Economics, Vol. 90 No. 2, pp. 129-149.
Alvesson, M. (1993), “Organizations as rhetoric: knowledge-intensive firms and the struggle with
ambiguity”, Journal of Management Studies, Vol. 30, pp. 997-1015.
API (2011), API Recommended Practice 571: Damage Mechanisms Affecting Fixed Equipment in the
Refining Industry, 2nd ed., API.
API (2016), Risk-Based Inspection: API Recommended Practice 580, 3rd ed., API, Washington, DC.
Billo, R.E. and Bidanda, B. (1995), “Representing group technology classification and coding techniques
with object oriented modeling principles”, IIE Transactions, Vol. 27 No. 4, pp. 542-554.
Bortolotti, T. and Romano, P. (2012), “ ‘Lean first, then automate’: a framework for process
improvement in pure service companies: a case study”, Production Planning & Control, Vol. 23
No. 7, pp. 513-522.
Browning, T.R., Fricke, E. and Negele, H. (2006), “Key concepts in modeling product development
processes”, Systems Engineering, Vol. 9 No. 2, pp. 104-128.
Brynjolfsson, E. and McAfee, A. (2011), Race Against the Machine: How the Digital Revolution is
Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and
the Economy, Digital Frontier Press, Lexington.
Brynjolfsson, E. and McAfee, A. (2014), The Second Machine Age: Work, Progress, and Prosperity in a
Time of Brilliant Technologies, W. W. Norton & Company, New York, NY.
Chang, M.-K., Chang, R.-R., Shu, C.-M. and Lin, K.-N. (2005), “Application of risk based inspection
in refinery and processing piping”, Journal of Loss Prevention in the Process Industries, Vol. 18
Nos 4-6, pp. 397-402.
Drucker, P. (1999), “Knowledge-worker productivity: the biggest challenge”, California Management
Review, Vol. 41 No. 2, pp. 79-94.
Ekbia, H., Mattioli, M., Kouper, I., Arave, G., Ghazinejad, A., Bowman, T., Suri, V.R., Tsou, A.,
Weingart, S. and Sugimoto, C.R. (2015), “Big Data, bigger dilemmas: a critical review”, Journal of
the Association for Information Science and Technology, Vol. 66 No. 8, pp. 1523-1545.
Frey, C.B. and Osborne, M.A. (2017), “The future of employment: how susceptible are jobs to
computerisation?”, Technological Forecasting and Social Change, Vol. 114, pp. 254-280.
Geary, W. (2002), Risk Based Inspection: A Case Study Evaluation of Offshore Process Plant, Health and
Safety Laboratory, Sheffield.
Günther, W.A., Rezazade Mehrizi, M.H., Huysman, M. and Feldberg, F. (2017), “Debating Big Data: a
literature review on realizing value from Big Data”, The Journal of Strategic Information
Systems, Vol. 26 No. 3, pp. 191-209.
Han, J., Pei, J. and Kamber, M. (2011), Data Mining: Concepts and Techniques, Elsevier,
San Francisco, CA.
Hartshorn, S. (2015), “Hypothesis testing: a visual introduction to statistical significance”.
Hendriks, P.H. and Vriens, D.J. (1999), “Knowledge-based systems and knowledge management: friends
or foes?”, Information & Management, Vol. 35 No. 2, pp. 113-125.
Huang, J.Z., Ng, M.K., Rong, H. and Li, Z. (2005), “Automated variable weighting in k-means
type clustering”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27 No. 5,
pp. 657-668.
Huang, Z. (1998), “Extensions to the k-means algorithm for clustering large data sets with categorical
values”, Data Mining and Knowledge Discovery, Vol. 2 No. 3, pp. 283-304.
Jain, A.K. (2010), “Data clustering: 50 years beyond k-means”, Pattern Recognition Letters, Vol. 31 No. 8,
pp. 651-666.
Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R. and Wu, A.Y. (2002), “An
efficient k-means clustering algorithm: analysis and implementation”, IEEE Transactions on
Pattern Analysis and Machine Intelligence, Vol. 24 No. 7, pp. 881-892.
Khan, S.S. and Ahmad, A. (2004), “Cluster center initialization algorithm for k-means clustering”,
Pattern Recognition Letters, Vol. 25 No. 11, pp. 1293-1302.
Knight, W. (1998), Group Technology, Concurrent Engineering and Design for Manufacture and
Assembly, Group Technology and Cellular Manufacturing, Springer, pp. 15-36.
Likas, A., Vlassis, N. and Verbeek, J.J. (2003), “The global k-means clustering algorithm”, Pattern
Recognition, Vol. 36 No. 2, pp. 451-461.
Loebbecke, C. and Picot, A. (2015), “Reflections on societal and business model transformation arising
from digitization and Big Data analytics: a research agenda”, The Journal of Strategic
Information Systems, Vol. 24 No. 3, pp. 149-157.
Matthews, S., Al Jaberi, M.S. and Mundyath, D. (2014), “Risk based inspection implementation for
upstream offshore operator: case study”, Abu Dhabi International Petroleum Exhibition and
Conference, Society of Petroleum Engineers, Abu Dhabi.
Mohammed, M., Farah, M.M. and Abdi Adus, W. (2017), Corrosion Looping for Down Stream Petroleum
Plants: An Enigma for RBI Engineers, A Perspective from the Review of Mechanical Integrity
Systems, CORROSION 2017, NACE International, New Orleans, LA.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R. and Dubourg, V. (2011), “Scikit-learn: machine learning in Python”,
Journal of Machine Learning Research, Vol. 12, pp. 2825-2830.
Ralambondrainy, H. (1995), “A conceptual version of the k-means algorithm”, Pattern Recognition
Letters, Vol. 16 No. 11, pp. 1147-1157.
Ratnayake, R.M.C. (2015), “Mechanization of static mechanical systems inspection planning process”,
Journal of Quality in Maintenance Engineering, Vol. 21 No. 2, pp. 227-248.
Schröder, H.-C. and Kauer, R. (2004), “Regulatory requirements related to risk-based inspection and
maintenance”, International Journal of Pressure Vessels and Piping, Vol. 81, pp. 847-854.
Sharma, R., Mithas, S. and Kankanhalli, A. (2014), “Transforming decision-making processes: a
research agenda for understanding the impact of business analytics on organisations”,
European Journal of Information Systems, Vol. 23 No. 4, pp. 433-441.
Shishesaz, M.R., Nazarnezhad Bajestani, M., Hashemi, S.J. and Shekari, E. (2013), “Comparison of
API 510 pressure vessels inspection planning with API 581 risk-based inspection planning
approaches”, International Journal of Pressure Vessels and Piping, Vol. 111-112,
November–December, pp. 202-208.
Shollo, A. and Galliers, R.D. (2016), “Towards an understanding of the role of business intelligence
systems in organisational knowing”, Information Systems Journal, Vol. 26 No. 4, pp. 339-367.
Shunk, D.L. (1985), “Group technology provides organized approach to realizing benefits of CIMS”,
Industrial Engineering, 4th ed., Vol. 17, pp. 74-76, 78.
Singh, M. and Pokhrel, M. (2018), “A fuzzy logic-possibilistic methodology for risk-based inspection
(RBI) planning of oil and gas piping subjected to microbiologically influenced corrosion (MIC)”,
International Journal of Pressure Vessels and Piping, Vol. 159, January, pp. 45-54.
Spear, S. and Bowen, H.K. (1999), “Decoding the DNA of the Toyota production system”, Harvard
Business Review, Vol. 77, pp. 96-106.
Staats, B.R., Brunner, D.J. and Upton, D.M. (2011), “Lean principles, learning, and knowledge work:
evidence from a software services provider”, Journal of Operations Management, Vol. 29 No. 5,
pp. 376-390.
Tsoukas, H. and Vladimirou, E. (2001), “What is organizational knowledge?”, Journal of Management
Studies, Vol. 38 No. 7, pp. 973-993.
Vinod, G., Sharma, P.K., Santosh, T.V., Hari Prasad, M. and Vaze, K.K. (2014), “New approach for risk
based inspection of H2S based process plants”, Annals of Nuclear Energy, Vol. 66, pp. 13-19.
Wagstaff, K., Cardie, C., Rogers, S. and Schrödl, S. (2001), “Constrained k-means clustering with
background knowledge”, ICML, pp. 577-584.
Žalik, K.R. (2008), “An efficient k′-means clustering algorithm”, Pattern Recognition Letters, Vol. 29,
pp. 1385-1391.

Corresponding author
Andika Rachman can be contacted at: rachman.andika@rocketmail.com

