Xiangrong 2020

Structure and Infrastructure Engineering
Maintenance, Management, Life-Cycle Design and Performance
ISSN: 1573-2479 (Print) 1744-8980 (Online) Journal homepage: https://www.tandfonline.com/loi/nsie20
Statistical analysis of spatial distribution of external corrosion defects in buried pipelines using a multivariate Poisson-lognormal model
Xiangrong Wang, Hui Wang, Fujian Tang, Homero Castaneda & Robert Liang
To cite this article: Xiangrong Wang, Hui Wang, Fujian Tang, Homero Castaneda & Robert Liang
(2020): Statistical analysis of spatial distribution of external corrosion defects in buried pipelines
using a multivariate Poisson-lognormal model, Structure and Infrastructure Engineering, DOI:
10.1080/15732479.2020.1766516
To link to this article: https://doi.org/10.1080/15732479.2020.1766516
Published online: 20 May 2020.
Xiangrong Wang (a), Hui Wang (a), Fujian Tang (b), Homero Castaneda (c) and Robert Liang (a)
(a) Department of Civil and Environmental Engineering and Engineering Mechanics, The University of Dayton, Dayton, OH, USA
(b) State Key Laboratory of Coastal and Offshore Engineering, School of Civil Engineering, Dalian University of Technology, Dalian, Liaoning, China
(c) Department of Material Science and Engineering, National Corrosion and Materials Reliability Center, Texas A&M University, College Station, TX, USA
ABSTRACT
It is well recognised that the severity of pipeline external corrosion is highly related to the corrosivity of the surrounding soil environment. However, in practice, the explicit effects of the soil corrosivity variables on the spatial distribution of external corrosion defects are largely unknown. This paper presents a novel modelling and predicting approach for pipeline external corrosion defect count data in terms of spatial patterns, based on a multivariate Poisson-lognormal (MVPLN) model. The MVPLN model can account for the over-dispersion and unobserved heterogeneity of the defect count data, as well as consider the stochastic correlation between corrosion defects with different spatial patterns. The developed model is applied to a pipeline inspection dataset consisting of in-line inspection (ILI) data and corresponding soil corrosivity measurements. Its performance is validated using cross-validation. A comparison study shows that the MVPLN model provides superior modelling results for the spatial distribution of external corrosion defects over the commonly used univariate count data models. In addition, the obtained model coefficients of the soil corrosivity measurements are discussed, and their estimated impacts on the spatial patterns of corrosion defects are verified qualitatively. The potential application to assess the corrosion severity of non-piggable pipeline segments is further demonstrated.

ARTICLE HISTORY
Received 20 November 2019; Revised 29 January 2020; Accepted 25 February 2020

KEYWORDS
External corrosion; maintenance & inspection; multivariate Poisson-lognormal model; pipeline; probability density functions; soil corrosivity; spatial patterns; stochastic models
Abbreviations: CI: confidence interval; Cl⁻: chloride ion; CO₃²⁻: carbonate ion; DIC: deviance information criteria; exGPD: exponentiated generalised Pareto distribution; FeCO₃: iron carbonate; HCO₃⁻: bicarbonate ion; ILI: in-line inspection; K–S test: Kolmogorov–Smirnov test; M-H algorithm: Metropolis–Hastings algorithm; MCMC: Markov chain Monte Carlo; MIC: microbiologically influenced corrosion; MLE: maximum-likelihood estimate; MVNB model: multivariate negative binomial model; MVP model: multivariate Poisson model; MVPLN model: multivariate Poisson-lognormal model; NB distribution/model: negative binomial distribution/model; PLN model: Poisson-lognormal model; PMF: probability mass function; RMSE: root-mean-square error; SO₄²⁻: sulphate ion; SRB: sulphate-reducing bacteria; UVNB model: univariate negative binomial model; UVPLN model: univariate Poisson-lognormal model
CONTACT Hui Wang, hwang12@udayton.edu, Department of Civil and Environmental Engineering and Engineering Mechanics, The University of Dayton, Dayton, OH 45469, USA
© 2020 Informa UK Limited, trading as Taylor & Francis Group

1. Introduction

The pipeline system is a critical part of the infrastructure systems that ensure the safety and functionality of our society, as it provides one of the most reliable ways of transporting large quantities of liquid and gas products over long distances. However, the aging and degrading of pipeline coatings and cathodic protection systems weaken the protection of pipeline metals from the aggressive and corrosive surrounding soil environment and ultimately result in metal-loss corrosion, which severely threatens the integrity and operational safety of underground pipelines. External localised corrosion (a.k.a. pitting corrosion) is known as one of the most common corrosion defects for pipelines under normal operation conditions (Caleyo, Alfonso, Alcantara, & Hallen, 2008; Kishawy & Gabbar, 2010). In practice, the nucleation and propagation of external corrosion defects typically involve multiple factors, complex processes and various uncertainties, which make it extremely difficult to explain and predict such phenomena using explicit electrochemical principles alone. As a result, statistical approaches are widely employed for modelling and evaluating the severity of external corrosion spots on underground pipelines, as well as quantifying the related uncertainties (El-Abbasy, Senouci, Zayed, Mirahadi, & Parvizsedghy, 2014; Katano, Miyata, Shimizu, & Isogai, 2003; Melchers, 2005; Shibata, 1996; Sinha & Pandey, 2002; Velazquez, Caleyo, Valor, & Hallen, 2009; Wang, Yajima, Liang, & Castaneda, 2015a, 2015b).

Analysis of external corrosion defects and their impacts on the integrity of a pipeline can be conducted from either or both of the spatial and temporal dimensions, which could be correlated with each other: 1) temporal analysis focusing
on the depth or the growth rate of the corrosion defects; 2) spatial analysis investigating the spatial distribution and density of the corrosion defects along the pipeline right-of-way. The former has attracted the most attention, and a substantial number of corrosion growth models have been developed in existing studies (Bazan & Beck, 2013; Caleyo, Velazquez, Valor, & Hallen, 2009a, 2009b; Valor, Caleyo, Hallen, & Velazquez, 2013; Wang, Yajima, Liang, & Castaneda, 2016). Although the latter has gained relatively less attention, investigation of the spatial distribution of the corrosion defects is crucial for conducting pipeline safety assessment and optimising pipeline inspection and maintenance strategies.

It has been demonstrated in both experimental and theoretical studies that interaction among nearby corrosion defects may result in complicated and threatening failure modes of the pipeline wall (Benjamin, Freire, Vieira, & Cunha, 2016; Benjamin, Freire, Vieira, & de Andrade, 2008; Chen et al., 2015; De Andrade et al., 2008; Li, Bai, Su, & Li, 2016). From a broader view of system integrity, it has also been reported that the spatial distribution of corrosion defects can exert critical impacts on the system failure probability (Hong, 1999; Kishawy & Gabbar, 2010; Zhou, Hong, & Zhang, 2012). Therefore, this study aims to develop a novel approach for modelling the spatial distribution of the corrosion defects, the application of which, combined with existing temporal analysis approaches, is expected to deliver a more accurate assessment of the pipeline system safety and integrity.

In the literature, corrosion defect count data of a pipeline section are commonly modelled using homogeneous Poisson processes (Alamilla, Oliveros, & García-Vargas, 2009; Alamilla & Sosa, 2008). This implies that a corrosion defect can randomly and independently occur at any location along the pipeline with a constant probability (Alamilla, Espinosa-Medina, & Sosa, 2009; Caleyo, Gonzalez, & Hallen, 2002). Such a strong assumption contradicts observations in practice in two major respects. Firstly, it has been widely recognised that the nucleation and propagation of adjacent corrosion defects can be a highly interactive process rather than completely independent random events (Benjamin, Freire, & Vieira, 2007; Cosham, Hopkins, & Macdonald, 2007; Silva, Guerreiro, & Loula, 2007). Secondly, the soil corrosivity and other corrosion-related factors may vary significantly along the alignment of a pipeline, thereby exerting various impacts on the corrosion severity. Therefore, it seems to be an oversimplification to neglect the spatial variability of these factors and to postulate the distribution of corrosion defects as a stationary random field.

Substantial efforts have been made to address the drawback of the homogeneous Poisson assumption (De La Cruz, Kuniewski, Van Noortwijk, & Gutierrez, 2008; Valor et al., 2015). Taking the corrosion mechanism into account, Valor et al. (2015) developed a spatial distribution model of external corrosion defects using the negative binomial (NB) distribution. It was claimed that the NB distribution could account for corrosion defects with different spatial patterns and provide reasonable fitting results for real-life defect count data. Inspired by this and relevant findings, this study investigates not only the statistical model of the spatial distribution of the corrosion defects, but also the interdependency between the soil corrosivity measures and the spatial density of the corrosion defects. Understanding such interdependency can reveal the complex nature of the corrosion mechanism, thereby guiding the development of countermeasures for corrosion-led pipeline failures. Moreover, it will be demonstrated that this investigation may provide a feasible way to estimate the corrosion severity of the non-piggable pipeline segments.

This study presents a statistical analysis of the spatial distribution of external corrosion defect count data on a buried pipeline based on the multivariate Poisson-lognormal (MVPLN) model. Under the framework of the MVPLN model, counts of both random corrosion defects and clustered defects (Shibata, 2000) of a pipeline segment are considered as correlated dependent variables and modelled jointly. The physical and chemical characteristics of the surrounding soil environment, which have been proven highly related to the density and distribution of external corrosion defects (Alamilla, Espinosa-Medina, et al., 2009; Ferreira, Ponciano, Vaitsman, & Perez, 2007; Frankel, 1998; Punckt et al., 2004), are considered as the explanatory variables. Other unobservable factors are considered as latent effects in the model.

A data set consisting of in-line inspection (ILI) data and soil corrosivity measurements along a 110-km-long buried oil transportation pipeline (Wang et al., 2015b) is used to demonstrate the performance of the presented statistical model. The paper starts with the statistical characteristics of external corrosion defect count data, followed by the MVPLN model and parameter estimation using the Markov chain Monte Carlo (MCMC) technique. Then, the proposed approach is illustrated and validated using a real-life pipeline external corrosion data set. Finally, the validity, effectiveness, additional insights and potential application in modelling the spatial frequency of external corrosion defects for non-piggable pipeline segments are further discussed.

2. Model framework and estimation

2.1. Statistical characteristics of external corrosion defect count data

In the case that an entire pipeline is divided into a finite number of segments with equal length, the number of external corrosion defects in each segment at a given time is count data (non-negative integers). This section discusses the statistical characteristics of the defect count data, which will guide the selection of a suitable statistical model.

2.1.1. Over-dispersion

Assume that the number of corrosion defects $y$ on a pipeline segment follows a homogeneous Poisson process:

$$P(y) = \frac{e^{-\lambda}\lambda^{y}}{y!} \tag{1}$$

in which $\lambda$ is the rate of the process. The Poisson process imposes a strong implicit assumption that the mean and
variance of the defect variable $y$ are equal along the pipeline, which conflicts with the observation of over-dispersed (i.e. the observed variance of the defect count is higher than the mean) corrosion count data in practice (Valor et al., 2015). Mixed Poisson models are commonly employed to relax the 'equal variance' assumption and to address the over-dispersion difficulties (Grandell, 1997; Karlis & Xekalaki, 2007). The probability mass function (PMF) of a Poisson mixture can be expressed in the form:

$$P(y) = \int_{0}^{\infty} \frac{e^{-\lambda}\lambda^{y}}{y!}\, g(\lambda)\, d\lambda \tag{2}$$

in which $g(\cdot)$ is referred to as the mixing probability density function, which determines the density of the rate $\lambda$ in the mixture model. When $g(\cdot)$ is a gamma distribution or a lognormal distribution, the corresponding mixed Poisson model becomes the NB model or the Poisson-lognormal (PLN) model, respectively.

2.1.2. Correlated spatial patterns of corrosion defects and multivariate models

The spatial distribution of corrosion defects follows different spatial patterns (Shibata, 2000). Some of the corrosion defects spread randomly along the pipeline with a low local density (referred to as random defects), while other defects can be distributed in an intensively clustered form (referred to as clustered defects) (Cosham et al., 2007). The random defects and clustered defects may cause damage to pipeline structures at different severity levels but are highly correlated, since they may depend on the same set of observed and unobserved explanatory variables.

To be specific, clustered defects that are very close to each other or even overlapped may cause a more severe decrease in the strength of the pipeline wall at local points than random defects (Benjamin et al., 2007; Silva et al., 2007), thereby leading to a higher possibility of pipeline failure. Meanwhile, it has been reported that the presence of the clustered defects is highly correlated with the propagation of random defects: the former may be transformed from the latter under certain favourable environments (Burstein, Pistorius, & Mattin, 1993; Williams, Stewart, & Balkwill, 1994). Therefore, it seems unreasonable to neglect such a strong correlation and to employ separate univariate models for each type of defect data. Thus, multivariate count data models are preferred since they can model defects of different spatial patterns as distinct dependent variables with consideration of their correlations.

From a practical point of view, adjacent metal loss defects originally detected by ILIs are usually combined into small local clusters based on interaction rules in customary industry practices. There are several different sets of rules for interacting defects. For all of them, if metal loss flaws are closer than pre-defined criteria, they are combined into one large cluster. One of the rules is that 'metal loss defects are considered to interact if they are within one inch (25.4 mm) in axial dimension or six times the nominal wall thickness in the circumferential dimension'. The maximum corrosion depth of the combined defects (usually considered as a single metal loss event) is equal to the greatest depth of the individual defects included in the cluster, and the length of the metal loss event is equal to the total combined length of the cluster.

The processed results are then reported in a spreadsheet. Therefore, as the data source is the processed spreadsheet, it is possible that some of the external corrosion defects are actually local corrosion clusters and are reported as individual events. The spatial interaction between different local clusters is still not clear. A simplified assumption is used that all reported defects are individual corrosion defects. Though this assumption is somewhat strong, the large-scale spatial patterns of external corrosion events, which are the focus of the current research, can still be well modelled using the presented approach.

2.1.3. Unobserved heterogeneity

In statistical analysis, the deficiencies and the unknown variations of the explanatory variables across the sample population are referred to as unobserved heterogeneity. For the corrosion defect count data, the soil corrosivity characteristics can vary considerably along a pipeline segment, due to the inherent heterogeneity of the surrounding soil. Such heterogeneity may not be fully reflected by the soil corrosivity measurements, since multiple measurements along the alignment of a given pipeline segment are averaged. In addition, some other corrosion-related variables, such as sulphide inclusions in steel (Stewart & Williams, 1992; Wranglen, 1974), are nearly impossible to measure in a macroscale analysis and therefore cannot be considered in a statistical corrosion model. It has been noted that neglecting unobserved heterogeneity may lead to biased and inefficient estimations of the model coefficients, which could further result in erroneous inferences and predictions (Mannering, Shankar, & Bhat, 2016).

Based on the discussion presented in this section, the statistical model employed for analysing corrosion defects should be able to tackle the over-dispersion for multivariate defect count data with consideration of unobserved heterogeneity. Table 1 summarises the capabilities of the commonly used count data models, including the Poisson model, the NB model and the PLN model, and their corresponding multivariate versions: the multivariate Poisson (MVP) model (Kocherlakota & Kocherlakota, 2004), the multivariate negative binomial (MVNB) model (Winkelmann, 2000) and the MVPLN model (Chib & Winkelmann, 2001). It can be noticed that, as a multivariate mixed Poisson process, the MVPLN model can address over-dispersion for multivariate count data. Moreover, compared with the MVNB model, which allows only non-negative correlations (Winkelmann, 2008) between different spatial patterns, the MVPLN model allows a more general correlation structure (both positive and negative). In addition, the MVPLN model provides a simple and flexible way to address the unobserved heterogeneity for defect count data by introducing latent variables. Due to these attractive features, the MVPLN model is adopted in this study for modelling the corrosion defect count data.
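The over-dispersion argument of Section 2.1.1 can be illustrated numerically. The following is a minimal sketch, not the authors' code: it compares the variance-to-mean ratio of counts from a homogeneous Poisson process (Equation (1)) with counts from a Poisson-lognormal mixture (Equation (2) with a lognormal mixing density). The mean rate, the latent standard deviation `sigma`, the number of segments and all function names are illustrative assumptions, not values from the paper.

```python
import math
import random
import statistics


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) count by counting exponential gaps that fit in [0, 1)."""
    count, position = 0, rng.expovariate(rate)
    while position < 1.0:
        count += 1
        position += rng.expovariate(rate)
    return count


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for a homogeneous Poisson process."""
    return statistics.pvariance(counts) / statistics.fmean(counts)


def simulate(n_segments=20000, mean_rate=2.0, sigma=0.8, seed=1):
    """Return (homogeneous Poisson counts, Poisson-lognormal counts).

    mean_rate and sigma are hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)
    # Homogeneous Poisson: every segment shares the same rate.
    plain = [sample_poisson(mean_rate, rng) for _ in range(n_segments)]
    # Poisson-lognormal: each segment gets its own rate. The -sigma^2/2 shift
    # keeps the expected rate equal to mean_rate, so only the variance changes.
    mixed = []
    for _ in range(n_segments):
        eps = rng.gauss(0.0, sigma)
        rate = mean_rate * math.exp(eps - 0.5 * sigma ** 2)
        mixed.append(sample_poisson(rate, rng))
    return plain, mixed


if __name__ == "__main__":
    plain, mixed = simulate()
    print("Poisson dispersion index:", round(dispersion_index(plain), 3))
    print("PLN dispersion index:    ", round(dispersion_index(mixed), 3))
```

Under these assumptions the mixture keeps the same mean count but has a variance well above the mean, which is the behaviour a plain Poisson model cannot reproduce.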
Table 1. Comparison of the capabilities of the commonly used count data models.

| Model | Over-dispersion | Multivariate data | General correlation | Unobserved heterogeneity |
|---|---|---|---|---|
| Univariate Poisson model | No | No | No | No |
| Univariate negative binomial model | Yes | No | No | No |
| Univariate Poisson-lognormal model | Yes | No | No | Yes |
| Multivariate Poisson model | No | Yes | No | No |
| Multivariate negative binomial model | Yes | Yes | No | No |
| Multivariate Poisson-lognormal model | Yes | Yes | Yes | Yes |
2.2. The MVPLN model

In this section, a brief introduction to the MVPLN model is provided. More detailed information on this model can be found in Chib and Winkelmann (2001). Let $n$ correspond to the number of pipeline segments and $s$ correspond to the total number of different spatial patterns of the corrosion defects (i.e. $s = 2$ when random defects and clustered defects are considered, as in this study). Let $y_i = (y_{i1}, \ldots, y_{is})$ denote the defect counts of the $s$ spatial patterns in pipeline segment $i$ $(i = 1, 2, \ldots, n)$. Let $\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{is})$ denote the $s$ spatial pattern-specific latent effects for pipeline segment $i$. Let $x_i = (x_{i1}, \ldots, x_{ik})$ denote the $k$-dimensional explanatory variables for pipeline segment $i$. Let $\beta_j = (\beta_{1j}, \ldots, \beta_{kj})^{T}$ denote the model coefficients of the $k$-dimensional explanatory variables for spatial pattern $j$ $(j = 1, 2, \ldots, s)$. Assume that, conditional on the latent effects $\varepsilon_{ij}$, the explanatory variables $x_i$ and the corresponding coefficients $\beta_j$, the defect counts $y_{ij}$ follow a Poisson distribution:

$$y_{ij} \mid \varepsilon_{ij}, x_i, \beta_j \sim \mathrm{Poisson}(\lambda_{ij}) \tag{3}$$

where

$$\lambda_{ij} = \exp(x_i \beta_j + \varepsilon_{ij}) \tag{4}$$

for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, s$. In the MVPLN model, the latent effects $\varepsilon_i$ are assumed to be uncorrelated with the explanatory variables $x_i$ and to follow an $s$-dimensional multivariate normal distribution with a mean vector $0$ and an unrestricted variance–covariance matrix, expressed as

$$\varepsilon_i \mid \Sigma \sim N_s(0, \Sigma) \tag{5}$$

where

$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1s} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{s1} & \sigma_{s2} & \cdots & \sigma_{ss} \end{bmatrix} \tag{6}$$

and $\sigma_{ij} = \sigma_{ji}$ for $i, j \in \{1, 2, \ldots, s\}$.

2.3. Parameter estimation via MCMC simulation

The likelihood function of the MVPLN model is the product of the marginal distributions $P(y_i \mid x_i, \beta, \Sigma)$ with regard to $\varepsilon_i$, which can be expressed as an $s$-variate integration of the Poisson distribution with respect to the distribution of $\varepsilon_i$:

$$P(y_i \mid x_i, \beta, \Sigma) = \int \prod_{j=1}^{s} f_P(y_{ij} \mid \varepsilon_{ij}, x_i, \beta_j)\, \varphi_s(\varepsilon_i \mid 0, \Sigma)\, d\varepsilon_i \tag{7}$$

where $f_P(\cdot)$ denotes the Poisson PMF, and $\varphi_s(\cdot)$ denotes the $s$-variate normal density function. As this multiple integral cannot be solved in a closed form, the MCMC simulation approach is employed to estimate the unknown parameters under the Bayesian inferential framework.

Due to the lack of sufficient prior knowledge of the distributions for $\beta_j$ and $\Sigma$, their prior distributions can be specified as frequently employed vague (less-informative) priors. Specifically, it is assumed that $\beta_j$ and $\Sigma^{-1}$ independently follow a multivariate normal distribution and a Wishart distribution (Chib & Winkelmann, 2001), respectively:

$$\beta_j \sim \varphi_k(\beta_0, B_0^{-1}), \qquad \Sigma^{-1} \sim f_W(r_0, R_0) \tag{8}$$

where $\varphi_k$ is the $k$-variate normal density and $f_W$ is the Wishart density, and $\beta_0$, $B_0^{-1}$, $r_0$, $R_0$ are the associated hyperparameters: $\beta_0$ is the mean vector of the $k$-variate normal density, $B_0^{-1}$ is the covariance matrix of the $k$-variate normal density, $r_0$ is the degree of freedom of the Wishart density and $R_0$ is the scale matrix of the Wishart density. Then, according to Bayes' theorem, the joint posterior density is proportional to the product of the prior probability and the likelihood:

$$\text{Posterior} \propto \text{prior} \times \text{likelihood} = \varphi_k(\beta \mid \beta_0, B_0^{-1})\, f_W(\Sigma^{-1} \mid r_0, R_0) \prod_{i=1}^{n} \int \prod_{j=1}^{s} f_P(y_{ij} \mid \varepsilon_{ij}, x_i, \beta_j)\, \varphi_s(\varepsilon_i \mid 0, \Sigma)\, d\varepsilon_i \tag{9}$$

The above joint posterior can then be simulated by iteratively sampling from three conditional posterior distributions:

$$\pi(\Sigma^{-1} \mid \varepsilon); \quad \pi(\varepsilon \mid Y, X, \beta, \Sigma); \quad \pi(\beta \mid Y, X, \varepsilon, \Sigma) \tag{10}$$

where $\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{bmatrix}$, $\beta = [\beta_1 \cdots \beta_s]$, $X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$, $Y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$.

The sampling process consists of three steps, which sample $\Sigma^{-1}$, $\varepsilon$ and $\beta$ in turn.
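To make the model structure concrete before the individual sampling steps, the sketch below approximates the integral in Equation (7) for a single segment with $s = 2$ by plain Monte Carlo averaging over draws of the latent effects. This is only an illustration of what the integral represents; it is not the MCMC estimation procedure used in the paper, and the function names, coefficient values and covariance matrix are hypothetical.

```python
import math
import random


def poisson_pmf(y, rate):
    """Poisson PMF evaluated on the log scale for numerical stability."""
    return math.exp(-rate + y * math.log(rate) - math.lgamma(y + 1))


def draw_latent(cov, rng):
    """Draw eps ~ N_2(0, cov) using a hand-written 2x2 Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (l11 * z1, l21 * z1 + l22 * z2)


def mc_marginal_likelihood(y, x, beta, cov, n_draws=20000, seed=0):
    """Monte Carlo estimate of Equation (7) for one segment (s = 2).

    y    : observed counts (random, clustered)
    x    : explanatory variables of the segment, length k
    beta : two coefficient vectors, one per spatial pattern, each length k
    cov  : 2x2 covariance matrix of the latent effects (Sigma)
    """
    rng = random.Random(seed)
    # Linear predictors x_i * beta_j from Equation (4), without the latent term.
    linear = [sum(xk * bk for xk, bk in zip(x, b)) for b in beta]
    total = 0.0
    for _ in range(n_draws):
        eps = draw_latent(cov, rng)
        term = 1.0
        for y_j, eta_j, eps_j in zip(y, linear, eps):
            term *= poisson_pmf(y_j, math.exp(eta_j + eps_j))
        total += term
    return total / n_draws


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    x = (1.0, 0.5)
    beta = [(0.4, 0.2), (-0.3, 0.1)]
    cov = [[0.5, 0.2], [0.2, 0.4]]
    print(mc_marginal_likelihood((2, 1), x, beta, cov))
```

A positive off-diagonal entry in `cov` makes the two latent effects move together, which is how the model induces correlation between random and clustered defect counts within the same segment.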
2.3.1. Sampling $\Sigma^{-1}$

The full conditional posterior distribution of $\Sigma^{-1}$ can be expressed as

$$\pi(\Sigma^{-1} \mid \varepsilon) \propto f_W(\Sigma^{-1} \mid r_0, R_0) \prod_{i=1}^{n} \varphi_s(\varepsilon_i \mid 0, \Sigma) \tag{11}$$

By combining terms, the above posterior density can be rewritten as a new Wishart distribution:

$$\Sigma^{-1} \mid \varepsilon \sim f_W\!\left(n + r_0,\ \left[R_0^{-1} + \sum_{i=1}^{n} \varepsilon_i \varepsilon_i^{T}\right]^{-1}\right) \tag{12}$$

Hence, this known parametric distribution can be sampled using a Gibbs sampler.

2.3.2. Sampling $\varepsilon$

As the closed-form density of the conditional posterior density of $\varepsilon$, i.e. $\pi(\varepsilon \mid Y, X, \beta, \Sigma)$, is not available, the values of $\varepsilon$ are sampled using the Metropolis–Hastings (M-H) algorithm (Chib & Greenberg, 1995). According to Chib and Winkelmann (2001), the multivariate $t$ distribution with degrees of freedom $\nu_\varepsilon$ (i.e. $\nu_\varepsilon$ is employed as a tuning parameter in the M-H algorithm, which makes the acceptance rate lie between 20 and 45 per cent), given as $f_T(\varepsilon_i \mid \hat{\varepsilon}_i, V_{\hat{\varepsilon}_i}, \nu_\varepsilon)$, is employed as the proposal density distribution. Here, $\hat{\varepsilon}_i$ is obtained by maximising the posterior probability for $\varepsilon_i$ based on the Newton-Raphson algorithm:

$$\hat{\varepsilon}_i = \arg\max_{\varepsilon_i} \ln \pi(\varepsilon_i \mid y_i, x_i, \beta, \Sigma) \tag{13}$$

and $V_{\hat{\varepsilon}_i} = (-H_{\hat{\varepsilon}_i})^{-1}$ is the inverse of minus the Hessian matrix of $\ln \pi(\varepsilon_i \mid y_i, x_i, \beta, \Sigma)$ given $\hat{\varepsilon}_i$. A proposal value $\varepsilon_i^{*}$ drawn from $f_T(\varepsilon_i \mid \hat{\varepsilon}_i, V_{\hat{\varepsilon}_i}, \nu_\varepsilon)$ is accepted with probability:

$$\min\left\{ \frac{\pi(\varepsilon_i^{*} \mid y_i, x_i, \beta, \Sigma)\, f_T(\varepsilon_i \mid \hat{\varepsilon}_i, V_{\hat{\varepsilon}_i}, \nu_\varepsilon)}{\pi(\varepsilon_i \mid y_i, x_i, \beta, \Sigma)\, f_T(\varepsilon_i^{*} \mid \hat{\varepsilon}_i, V_{\hat{\varepsilon}_i}, \nu_\varepsilon)},\ 1 \right\} \tag{14}$$

2.3.3. Sampling $\beta$

Similar to the sampling of $\varepsilon$, the M-H algorithm is employed to sample $\beta$. Again, the multivariate $t$ distribution with degrees of freedom $\nu_\beta$, given as $f_T(\beta_j \mid \hat{\beta}_j, V_{\hat{\beta}_j}, \nu_\beta)$, is employed as the proposal density distribution of $\beta_j$, with $\hat{\beta}_j$ obtained by maximising the posterior probability for $\beta_j$ based on the Newton-Raphson algorithm:

$$\hat{\beta}_j = \arg\max_{\beta_j} \ln \pi(\beta_j \mid Y_j, X_j, \varepsilon_j, \beta_{-j}, \Sigma) \tag{15}$$

where $Y_j$, $X_j$ and $\varepsilon_j$ denote the $j$th column of matrix $Y$, $X$ and $\varepsilon$, respectively; $\beta_{-j}$ denotes the components of $\beta$ other than $\beta_j$. The sampling of $\beta$ is then conducted by sampling its components $\beta_j$ one at a time, with the acceptance probability of a proposal $\beta_j^{*}$ as

$$\min\left\{ \frac{\pi(\beta_j^{*} \mid Y_j, X_j, \varepsilon_j, \beta_{-j}, \Sigma)\, f_T(\beta_j \mid \hat{\beta}_j, V_{\hat{\beta}_j}, \nu_\beta)}{\pi(\beta_j \mid Y_j, X_j, \varepsilon_j, \beta_{-j}, \Sigma)\, f_T(\beta_j^{*} \mid \hat{\beta}_j, V_{\hat{\beta}_j}, \nu_\beta)},\ 1 \right\} \tag{16}$$

Although iterative MCMC simulation is required for estimating the MVPLN model, it is worth mentioning that the model framework enables sampling $\varepsilon_i$ and $\beta_j$ in parallel schemes, thereby significantly reducing the computation time (Zhan, Aziz, & Ukkusuri, 2015). During the MCMC simulation, a certain number of burn-in iterations are used to monitor the convergence. Trace plots of the generated samples are inspected to ensure that the sample values show no significant periodicities or trends by the end of the burn-in period. The generated samples in the burn-in period are discarded, and the inferences on the model parameters can be directly made using the remaining samples. The corresponding mean and standard deviation of each posterior distribution are calculated; the 2.5th and 97.5th percentiles are used to construct the 95% credible interval.

3. Real-world pipeline corrosion data description and pre-processing

3.1. General data information

To demonstrate the use of the MVPLN model for analysing corrosion defect data, the proposed model is applied to a real-life data set collected from a transporting pipeline system (Wang et al., 2015b; Yajima, Rivera, et al., 2015; Yajima, Wang, Liang, & Castaneda, 2015). The pipeline was originally installed in 1969, using steel pipes with a diameter of 457.2 mm (18 inch) and a wall thickness of 6.41 mm (0.252 inch). ILI was conducted for this pipeline in 2010 using an ultrasonic smart pigging system, which has a measurement threshold of 5% of the pipe wall thickness. The total length of the inspected pipeline interval was about 110 km, along which a total of 1892 external defects were detected and reported by processing the raw ILI signals. The inspected pipeline interval was analysed by dividing it into segments with an equal length of 200 m. Soil corrosivity surveys were carried out for each of these segments to measure the corrosivity factors of the surrounding soil of each segment. After performing correlation analysis, parameters that were strongly correlated with some other parameters were excluded. A total of seven soil corrosivity variables, i.e. half-cell potential, resistivity, pH, and the concentrations of chloride ion (Cl⁻), carbonate ion (CO₃²⁻), bicarbonate ion (HCO₃⁻) and sulphate ion (SO₄²⁻), are considered as explanatory variables in this study.

3.2. Identifying the corrosion defect patterns

To build a multivariate model accounting for corrosion defects with different spatial patterns in a probabilistic manner, it is required to classify the defects based on their associated spatial patterns (i.e. random or clustered) and to acquire the classified defect counts for each pipeline segment. This can be implemented by analysing the distribution of the interval distance between neighbouring defects (also referred to as inter-arrival time in case Poisson processes are used for modelling time sequential events). The
Figure 1. Histogram of the interval distance between neighbouring defects; the empirical CDF of the observed distribution and the CDF of the estimated exGPD mixture.
interval distance distribution is shown in Figure 1. Given that the counts of both random defects and clustered defects in a pipeline segment follow respective mixed Poisson distributions, the interval distance between neighbouring random defects and clustered defects can be considered as exponential mixtures, which can be modelled using exponentiated generalised Pareto distributions (exGPDs) (Ali, 2017; Nadarajah, 2005):

$$f_{\mathrm{exGPD}}(x \mid \sigma, \xi) = \frac{1}{\sigma}\left(1 + \frac{\xi x}{\sigma}\right)^{-\frac{1}{\xi} - 1} \tag{17}$$

in which $(\sigma, \xi) \in \Theta$ are distribution parameters. In this way, the distribution of the interval distance $d$ between neighbouring defects can be considered as a mixture of two exGPDs with the expression:

$$P(d) = \alpha f^{r}_{\mathrm{exGPD}}(d, \Theta_r) + (1 - \alpha) f^{c}_{\mathrm{exGPD}}(d, \Theta_c) \tag{18}$$

in which $f^{r}_{\mathrm{exGPD}}(\cdot)$ and $f^{c}_{\mathrm{exGPD}}(\cdot)$ represent the distribution of interval distance for random defects and clustered defects, respectively; and $\alpha$ denotes the weight of the mixture distribution. Given the form shown in Equation (18), the maximum-likelihood estimates (MLEs) of $\alpha$, $\Theta_r$ and $\Theta_c$ can be inferred using an iterative expectation-maximisation algorithm.

As shown in Figure 1, the obtained estimation results (listed in Table 2) lead to a Kolmogorov–Smirnov (K–S) test p value of 0.642 under the null hypothesis (i.e. the interval distances follow an exGPD distribution), which indicates that the exGPD mixture model is satisfactory for modelling the distribution of interval distances. A critical distance $d^{*}$ of 0.364 m can be obtained by solving the following equation:

$$\alpha f^{r}_{\mathrm{exGPD}}(d^{*}, \Theta_r) = (1 - \alpha) f^{c}_{\mathrm{exGPD}}(d^{*}, \Theta_c) \tag{19}$$

If the spatial distance between a given defect and its nearest neighbouring defect is smaller than $d^{*}$, the given defect is more likely to be a defect with a clustered pattern; otherwise, it is considered as a defect with a random pattern. In this way, the total of 1892 external corrosion defects can be identified as 1180 random defects and 712 clustered defects. Figure 2 shows the numbers of both random and clustered defects in each 200-m segment along the investigated 110-km pipeline. Table 3 contains summary statistics of the defect counts and the measured soil corrosivity explanatory variables.

It should be mentioned that the segments do not have to be of equal length. The reason for the current setting is that the severity of corrosion herein is defined as the number of defects per segment. If segments with various lengths are used, the associated defect numbers are not comparable, yet the severity could be evaluated using the effective area (i.e. the ratio between the corroded area and the area of the segment).

4. Modelling validation and discussions

In this section, the performance of the proposed MVPLN model is assessed for predicting the real-world pipeline defect count data using cross-validation. A comparison study is then conducted to compare the modelling results obtained using the developed MVPLN model with other existing defect count models. Furthermore, a thorough discussion is provided of the estimated correlations between different spatial patterns of corrosion defects and of the obtained model coefficients. Finally, an illustrative case is presented to demonstrate the use of the proposed MVPLN model for predicting corrosion severity for the non-piggable pipeline segments.

4.1. Model validation

The entire defect count data set obtained using ILI is split into five equal-sized groups (non-piggable segments are excluded), as shown in Figure 2, to conduct a fivefold cross-validation of
Table 2. Estimation results of the exGPD mixture model.
                                   Estimated distribution parameters
Interval distance distribution     Shape parameter   Scale parameter   Weight factor
Random defects, f^r_exGPD          0.821             14.984            0.794
Clustered defects, f^c_exGPD       0.106             0.285             0.206
Critical distance d = 0.364 m; K–S test p value = 0.642
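The classification rule implied by the critical distance can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that defect locations are given as one-dimensional chainage values along the pipeline (in metres) and that the nearest-neighbour distance is measured along that axis only. The function name `classify_defects` and the example positions are hypothetical.

```python
import numpy as np

D_CRIT = 0.364  # critical distance in metres (Table 2)

def classify_defects(positions_m, d_crit=D_CRIT):
    """Label each defect by its nearest-neighbour distance along the pipe.

    Assumption: positions are 1-D chainage values in metres.
    """
    pos = np.sort(np.asarray(positions_m, dtype=float))
    if pos.size < 2:
        return pos, np.array(["random"] * pos.size)
    gaps = np.diff(pos)
    # Distance to the previous and to the next defect (inf at the ends).
    left = np.concatenate(([np.inf], gaps))
    right = np.concatenate((gaps, [np.inf]))
    nearest = np.minimum(left, right)
    labels = np.where(nearest < d_crit, "clustered", "random")
    return pos, labels

# Hypothetical defect positions, for illustration only.
positions = [10.0, 10.2, 10.5, 25.0, 40.0, 40.3, 90.0]
pos, labels = classify_defects(positions)
```

Counting the two labels per 200-m segment would then give the pair of count variables that the MVPLN model takes as its dependent variables.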
Figure 2. Distributions of defects with random and clustered patterns along the investigated pipeline.
Table 3. Descriptive statistics of defect counts and soil corrosivity measurements.
Variable description                        Mean          Std. dev.     Minimum   Maximum
Dependent variables
  Number of random defects per segment      2.12          3.26          0         18
  Number of clustered defects per segment   1.28          4.01          0         53
Soil corrosivity characteristics
  Redox potential (Eh)                      –249.98       60.94         –395      50
  Resistivity (@1 m) (ohm.cm)               1.55 × 10^4   3.27 × 10^4   452.39    3.08 × 10^4
  pH value                                  5.47          0.93          2         7.6
  Cl⁻ ion concentration (mmol/L)            3.18          1.21          0.75      10.25
  CO₃²⁻ ion concentration (mmol/L)          0.92          0.80          0         5.33
  HCO₃⁻ ion concentration (mmol/L)          2.87          1.94          0         15.91
  SO₄²⁻ ion concentration (mmol/L)          0.10          0.20          0.04      2.81
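The degree of over-dispersion in the two count variables can be read directly from Table 3. The short sketch below computes the variance-to-mean ratio (dispersion index) from the tabulated means and standard deviations; a Poisson variable would have a ratio of 1. The helper name `dispersion_index` is illustrative.

```python
# Means and standard deviations of defect counts per segment (Table 3).
counts = {
    "random": (2.12, 3.26),
    "clustered": (1.28, 4.01),
}

def dispersion_index(mean, std):
    """Variance-to-mean ratio; equals 1 for a Poisson distribution."""
    return std ** 2 / mean

indices = {name: dispersion_index(m, s) for name, (m, s) in counts.items()}
for name, value in indices.items():
    print(f"{name}: variance/mean = {value:.2f}")
```

Both ratios are well above 1, with the clustered counts the more dispersed of the two, which is in line with the later observation that the univariate Poisson model fits these data poorly.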
the proposed MVPLN model. In each validation step, one data group is used as the test set, and the remaining four groups are used as the training set. Note that the count data samples are grouped according to their spatial sequence along the alignment of the pipeline, instead of being reordered and sampled randomly. Such a partition strategy is adopted because the generated test set has a similar spatial distribution to the non-piggable pipeline segments, which are typically contiguous blocks, rather than sparse points, along the pipeline alignment. In each validation step, an MVPLN model is trained using the training set based on the MCMC simulation process presented in Section 3.2. To be more specific, a total of 50,000 MCMC samples were drawn, and the last 30,000 samples were used to construct the probability density functions of the sampled model coefficients.
Multiple MCMC chains with different initial values were simulated to ensure that the estimated model coefficients converged to the same set of posterior means. Then for the test set, the probability distributions of the defect counts with both random and clustered patterns can be calculated based on the trained MVPLN model. As mentioned above, the corrosion process involves various uncertainties, and the estimation of defect counts may be affected by unobserved factors. As a result, the obtained probability distributions of defect counts may possess considerable variations. Therefore, to better describe the obtained modelling results, two types of estimates of defect counts are provided for validation. The first type of estimate is the median value of the obtained probability distribution of defect counts. The second type of estimate is the 90% quantile value of the obtained probability distribution, which can be considered as an upper bound of the predicted defect counts with a 90% confidence level.
The results of the conducted fivefold cross-validation are shown in Figure 3, in which for each validation case, the estimated defect counts with random and clustered patterns are plotted versus their true values in the respective
scatterplots. The root-mean-square errors (RMSEs) of the estimated counts with respect to the measured count data are calculated for both the training set and the test set. It can be noted from Figure 3 that in all the validation cases, both the training set and the test set (i.e. estimated median values) are scattered around the ideal 1:1 reference line (i.e. the 45-degree line), which indicates that the defect count data are generally well predicted by the constructed MVPLN models. Most of the estimated upper bound values are located above the reference line. Hence, it seems that the use of the 90% quantile values of the estimated defect count distribution is generally conservative.
For the random defect count, the RMSEs of the training sets vary from 0.7549 to 0.9477 in the five validation cases, which are close to the RMSEs of the test sets, which vary from 0.9131 to 1.2714. For the clustered defect count, higher RMSEs are observed for both the training sets (from 1.2783 to 1.8143) and the test sets (from 1.3520 to 2.4475). This is expected since the count data of clustered defects possess much higher variance. Moreover, by comparing the five validation cases, it can be noted that the model performance for test set #5 is slightly inferior to that for the other test sets, as indicated by larger differences in RMSEs between the training sets and test sets for both random defects and clustered defects. As marked in Figure 3e, for certain extreme events in test set #5, the true defect counts are not covered by the estimated 90% confidence intervals.
The modelling of test set #5 may be affected by several factors. First, it can be noted from Figure 2 that the defect count data in group #5 have both much larger values and variations than the first four groups. In this way, estimating the defect count data of group #5 using the first four groups as a training set is essentially an extrapolation problem, which is typically more challenging. Another hypothesis is that the extreme events that occurred in group #5 are caused by certain local factors (e.g. material defect, coating damage during construction) that are not considered in the constructed MVPLN model. Therefore, for practical problems, it is suggested to include more data samples to avoid extrapolation; otherwise, extra attention should be paid to modelling the extreme events. Yet for most of the data points in test set #5, the constructed MVPLN model still provides reasonable estimates with decent accuracy. In general, the feasibility and accuracy of the developed MVPLN model for modelling real-world pipeline corrosion count data are well demonstrated by the presented cross-validation.
4.2. Model comparison
In this section, the performance of the proposed MVPLN model is further compared with the existing commonly used statistical models for count data, including the univariate Poisson-lognormal (UVPLN) model, the univariate Negative Binomial (UVNB) model and the univariate Poisson model. For comparison purposes, the entire pipeline defect count data set is used as the training set for all the candidate models. The MVPLN model is constructed using the same approach as described in the previous section. The statistics of the estimated coefficients are presented in Table 4 for defects with both random and clustered patterns. The obtained probability density functions of the sampled coefficient values are further plotted in Figure 4, which provides a more intuitive representation of the differences between the model coefficients associated with the two spatial patterns. The estimation results of the latent components in the MVPLN model are presented in Table 5. The obtained estimation results are listed in Table 6, together with the model coefficients estimated using the other candidate models.
It can be noticed from Table 6 that for defects with a random pattern all four models give generally similar results in terms of the mean value of the estimated model coefficients and their significance, except for the variable ‘Cl⁻ ion concentration’, which is identified as insignificant in the univariate Poisson model. For the defects with a clustered pattern, however, the univariate Poisson model provides significantly different results (in terms of both coefficient estimates and significance) from those of the other three models. It appears that under the univariate Poisson model, the variables ‘Resistivity (@1 m)’, ‘Cl⁻ ion concentration’ and ‘CO₃²⁻ ion concentration’ are incorrectly declared to be insignificant, leading to seriously overestimated standard deviations of the other coefficients. Such biased estimation results can be attributed to the strongly over-dispersed count data; it is known that the more over-dispersion there is, the more seriously biased the standard errors estimated under the univariate Poisson model are. In this case, the true uncertainties cannot be well represented, and the corresponding interval estimates may not be able to capture the true model coefficients.
The MVPLN, UVPLN and UVNB models can account for over-dispersion, thereby providing model coefficients of generally consistent significance and standard errors that reflect the true model uncertainties. It can be noticed that the quantified uncertainty of the model coefficients from the MVPLN model is noticeably smaller than those from the UVPLN model and the UVNB model. This indicates that the epistemic uncertainty estimated by the MVPLN model is lower than that estimated by the UVPLN model and the UVNB model. The superiority of the MVPLN model can be further demonstrated by comparing the log-likelihood evaluated at the posterior mean of the model coefficients. The MVPLN model provides a total log-likelihood value of –1728.27, which is 281.45, 301.01 and 975.84 greater than the sum of the two log-likelihood values from the UVPLN model, the UVNB model and the univariate Poisson model, respectively. In addition, since the estimation of both the MVPLN model and the UVPLN model involves MCMC simulation, the deviance information criterion (DIC), which is a commonly used measure of the goodness of fit in Bayesian statistics with consideration of the model complexity, is also computed for these two models.
As can be found in Table 6, a significant drop of 386.06 in DIC value is observed for the MVPLN model (4,847.26) compared with the UVPLN model (5,233.32), which indicates the MVPLN model provides a better fit compared
Figure 3. Fivefold cross-validation results (a)–(e): validation cases #1–#5.
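The spatially ordered fivefold partition and the RMSE metric used in the cross-validation can be sketched as below. This is an illustrative outline under stated assumptions, not the authors' code: segments are assumed to be indexed in their order along the pipeline, and `contiguous_folds` and `rmse` are hypothetical helper names. The model-fitting step (MCMC training of the MVPLN model) is deliberately omitted.

```python
import numpy as np

def contiguous_folds(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs whose test folds are contiguous
    blocks, so the spatial order of the segments is preserved."""
    blocks = np.array_split(np.arange(n_samples), n_folds)
    for k in range(n_folds):
        test_idx = blocks[k]
        train_idx = np.concatenate(
            [b for j, b in enumerate(blocks) if j != k]
        )
        yield train_idx, test_idx

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and estimated counts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy example with 20 segments; real data would use all piggable segments.
folds = list(contiguous_folds(20, n_folds=5))
example_rmse = rmse([1, 2, 3], [1, 2, 5])
```

Unlike a shuffled k-fold split, each test fold here is a single stretch of pipeline, which mimics the contiguous non-piggable intervals the model is ultimately meant to predict.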

Table 4. Estimates of the model coefficients for the developed MVPLN model.
                                   Random defects                            Clustered defects
Variable description               Mean      Std. dev.  95% CI               Mean      Std. dev.  95% CI
Constant                           0.553     0.0465     (0.459, 0.648)       –0.145    0.0597     (–0.260, –0.0262)
Soil corrosivity characteristics
  Redox potential                  –0.451    0.0459     (–0.542, –0.360)     –0.513    0.0581     (–0.628, –0.397)
  Resistivity @1 m                 –0.114    0.0357     (–0.184, –0.0413)    –0.108    0.0489     (–0.208, –0.00861)
  pH value                         –0.306    0.0519     (–0.409, –0.204)     –0.284    0.0684     (–0.419, –0.147)
  Cl⁻ ion concentration            0.111     0.0496     (0.015, 0.208)       0.518     0.0558     (0.408, 0.629)
  CO₃²⁻ ion concentration          –0.0641   0.0418     (–0.148, 0.0162)     –0.308    0.0554     (–0.421, –0.199)
  HCO₃⁻ ion concentration          –0.198    0.0476     (–0.294, –0.102)     –0.403    0.0765     (–0.583, –0.277)
  SO₄²⁻ ion concentration          0.0754    0.0414     (–7.36e–3, 0.158)    0.0573    0.0613     (–0.0731, 0.169)
Summary statistics
  Log-likelihood = –1728.27; DIC = 4847.26
1. Significant (at α = 0.05) variables are shown in bold.
2. The 95% credible interval is used to inspect whether the coefficients differ from zero in a statistically significant way; it is computed as the region that contains 95% (2.5%–97.5%) of the sampled coefficient values.
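The credible-interval rule in note 2, combined with the burn-in described in Section 4.1 (50,000 draws, the last 30,000 retained), can be sketched as follows. The chain below is a synthetic stand-in drawn from a normal distribution, used only so that the example runs; it is not output from the authors' sampler, and `summarise_chain` is a hypothetical helper.

```python
import numpy as np

def summarise_chain(samples, n_keep=30_000, level=0.95):
    """Discard the early draws and summarise the retained ones."""
    kept = np.asarray(samples, dtype=float)[-n_keep:]
    tail = (1.0 - level) / 2.0 * 100.0          # 2.5 for a 95% interval
    lo, hi = np.percentile(kept, [tail, 100.0 - tail])
    return {
        "mean": float(kept.mean()),
        "std": float(kept.std(ddof=1)),
        "ci": (float(lo), float(hi)),
        # Significant if the interval does not contain zero.
        "significant": not (lo <= 0.0 <= hi),
    }

# Synthetic stand-in for one coefficient chain (NOT real sampler output);
# centre and spread loosely mimic the redox potential row of Table 4.
rng = np.random.default_rng(0)
chain = rng.normal(loc=-0.451, scale=0.0459, size=50_000)
summary = summarise_chain(chain)
```

Applied to the tabulated intervals, the same rule marks every coefficient as significant except the SO₄²⁻ ion concentration for both patterns and the CO₃²⁻ ion concentration for random defects, whose intervals straddle zero.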
Figure 4. Probability density distributions of the sampled model coefficients for the developed MVPLN model.
Table 5. Estimates of the covariance matrix and the correlation of the latent effects.
Variables                               Mean     Std. dev.       95% CI
σ11 (random pattern)                    0.883    0.114           (0.684, 1.135)
σ12, σ21                                0.943    0.136           (0.707, 1.229)
σ22 (clustered pattern)                 1.334    0.188           (1.006, 1.740)
Correlation ρ = σ12/√(σ11 σ22)          0.869    8.478 × 10^–3   (0.850, 0.883)
with the UVPLN model. The large drop in DIC value is likely related to the high level of correlation between the count data of the two different defect patterns. The absolute difference between the estimated defect counts of each pipeline segment obtained using the developed MVPLN model and the corresponding true value is plotted in Figure 5. The statistics presented above demonstrate that the MVPLN model outperforms the tested univariate models in modelling corrosion defect count data with multiple correlated patterns, in which case using univariate models may lead to less accurate estimates of model coefficients and even seriously biased estimates of the corrosion severity.
Most recently, the so-called dynamic segmentation was proposed by Amaya-Gómez, Sanchez-Silva, and Muñoz (2019). This approach uses information obtained from ILI to train a corrosion degradation model via a Mixed-Levy approach (Amaya-Gómez, Riascos-Ochoa, et al., 2019) under a pressure-stress failure criterion, which is then used, in turn, to develop a dynamic segmentation strategy. This strategy aims at identifying optimal intervention times and locations by taking failure mechanism, corrosion temporal evolution and corrosion spatial distribution (i.e. isolated and grouped) into consideration. Compared with this dynamic segmentation strategy, the approach presented in this work focuses only on the spatial distribution modelling of corrosion defects based on a static segmentation of 200 m. The presented MVPLN model cannot provide any information regarding failure probability or critical segments, whereas these can be quantitatively assessed using the dynamic segmentation approach. In addition, the proposed MVPLN model is developed in the context of pipeline external corrosion and hence, the effects generated by the corrosive soil environment have been extensively investigated. In contrast, the dynamic segmentation approach depends purely on the ILI data and hence is suitable for both internal and external pipeline corrosion assessments without information on the corrosive environment.
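As a quick consistency check on Table 5, the correlation of the latent effects can be recomputed from the posterior means of the covariance entries. This is only a plug-in approximation: the posterior mean of a ratio is not in general equal to the ratio of posterior means, so exact agreement with the tabulated value should not be expected.

```python
import math

# Posterior means of the latent-effect covariance entries (Table 5).
sigma_11 = 0.883   # random pattern
sigma_12 = 0.943   # cross term
sigma_22 = 1.334   # clustered pattern

# Plug-in correlation: rho = sigma_12 / sqrt(sigma_11 * sigma_22)
rho = sigma_12 / math.sqrt(sigma_11 * sigma_22)
print(f"plug-in correlation = {rho:.3f}")
```

The plug-in value agrees with the tabulated posterior mean of 0.869 to three decimal places.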
Table 6. Model coefficients estimation results for the MVPLN model and the other commonly used univariate models.
                                   Multivariate Poisson-   Univariate Poisson-   Univariate negative   Univariate
Pattern / Variable                 lognormal model         lognormal model       binomial model        Poisson model
Random
  Constant                         0.553(0.0465)           0.545(0.0569)         0.541(0.0504)         0.531(0.0776)
  Redox potential                  –0.451(0.0459)          –0.472(0.0507)        –0.388(0.0452)        –0.444(0.0830)
  Resistivity @1 m                 –0.114(0.0357)          –0.109(0.0368)        –0.101(0.0377)        –0.116(0.0575)
  pH value                         –0.306(0.0519)          –0.356(0.0493)        –0.410(0.0591)        –0.264(0.0694)
  Cl⁻ ion concentration            0.111(0.0496)           0.103(0.0462)         0.113(0.0485)         0.0119(0.0895)
  CO₃²⁻ ion concentration          –0.0641(0.0418)         –0.0511(0.0411)       –0.0523(0.0511)       –0.00877(0.0751)
  HCO₃⁻ ion concentration          –0.198(0.0476)          –0.209(0.0535)        –0.175(0.0435)        –0.178(0.0684)
  SO₄²⁻ ion concentration          0.0754(0.0414)          0.0882(0.0499)        0.0719(0.0422)        0.0604(0.0677)
  Log-likelihood                   –                       –1051.92              –1094.67              –1351.89
Clustered
  Constant                         –0.145(0.0597)          –0.140(0.0600)        –0.139(0.0674)        –0.0270(0.0941)
  Redox potential                  –0.513(0.0581)          –0.491(0.0626)        –0.429(0.0602)        –0.531(0.1045)
  Resistivity @1 m                 –0.108(0.0489)          –0.112(0.0599)        –0.0880(0.0815)       –0.179(0.1051)
  pH value                         –0.284(0.0684)          –0.280(0.0665)        –0.274(0.0631)        –0.439(0.1110)
  Cl⁻ ion concentration            0.518(0.0558)           0.494(0.0581)         0.424(0.0583)         0.195(0.1133)
  CO₃²⁻ ion concentration          –0.308(0.0554)          –0.245(0.0602)        –0.279(0.0588)        –0.135(0.1429)
  HCO₃⁻ ion concentration          –0.430(0.0765)          –0.522(0.0885)        –0.415(0.0695)        –0.405(0.1055)
  SO₄²⁻ ion concentration          0.0573(0.0613)          0.0755(0.0622)        0.163(0.0677)         0.0695(0.1547)
  Log-likelihood                   –                       –957.8                –934.61               –1352.22
Summary statistics
  Total log-likelihood             –1728.27                –2009.72              –2029.28              –2704.11
  DIC                              4847.26                 5233.32               –                     –
1. Significant (at α = 0.05) variables are shown in bold.
2. Numbers in parentheses represent posterior standard deviations of the estimates of model coefficients.
3. The univariate Poisson-lognormal model was implemented using a full Bayesian approach (Miaou, Song, & Mallick, 2003).
4. The univariate Poisson model and univariate negative binomial model were implemented in SAS.
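The log-likelihood and DIC differences quoted in the model comparison follow directly from the entries of Table 6. The short check below reproduces that arithmetic; all numbers are taken from the table and the variable names are illustrative.

```python
# Per-pattern log-likelihoods of the univariate models (Table 6):
# (random pattern, clustered pattern)
loglik = {
    "UVPLN": (-1051.92, -957.80),
    "UVNB": (-1094.67, -934.61),
    "Poisson": (-1351.89, -1352.22),
}
MVPLN_TOTAL = -1728.27  # joint log-likelihood of the MVPLN model

# Improvement of the MVPLN model over the summed univariate fits.
gains = {
    name: round(MVPLN_TOTAL - (ll_random + ll_clustered), 2)
    for name, (ll_random, ll_clustered) in loglik.items()
}

# Drop in DIC from the UVPLN model to the MVPLN model.
dic_drop = round(5233.32 - 4847.26, 2)

print(gains)
print(dic_drop)
```

The sums of the per-pattern values also reproduce the "Total log-likelihood" row of the table for the three univariate models.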
4.3. Latent effects and their correlation across defect patterns
Estimates of the covariance matrix and the correlation of the latent effects of the MVPLN model are presented in Table 5. The posterior mean of the correlation between defect counts with different spatial patterns is found to be 0.869, with a 95% credible interval of (0.850, 0.883). This estimated correlation supports the intuition that there exists a highly significant correlation between the defect counts of the two spatial patterns. Such a strong correlation might be due to the underlying corrosion mechanisms. To be more specific, it is well acknowledged that localised pitting corrosion, regardless of the spatial pattern, is preceded by the appearance of metastable pitting, which may transition to stable pits through a stochastic procedure (Burstein et al., 1993). In this way, it is likely that the nucleation of defects, regardless of their spatial patterns, may depend on certain common corrosivity variables.
In addition, it is reported that metastable pits can influence each other in space through the release of aggressive ions that weaken the protective oxide layers, and such interaction may further enhance the nucleation and stabilisation of metastable pits. Therefore, to some extent, the randomly located defects can be considered as an intermediate state in the formation of the clustered defects under a certain favourable environment. Thus, the strong correlation between the defect counts of the two spatial patterns is reasonable. Therefore, as with any statistical model, such a strong correlation of defect counts between different spatial patterns needs to be incorporated into the model framework.
4.4. Estimates of model coefficients
Generally, as presented in Figure 4, the model coefficients corresponding to the clustered corrosion pattern have flatter posterior distributions. From a Bayesian perspective, this is mainly because the number of events identified as clustered defects is smaller (i.e. 1180 random defects and 712 clustered defects); thereby, the sampled coefficients have larger variations. Table 4 provides the model parameter estimates, their corresponding standard deviations and 95% credible intervals for the random corrosion defect and clustered corrosion defect count models under the MVPLN framework.
In the developed MVPLN model, redox potential is identified as the most significant factor that contributes negatively to the defect counts compared with the other six factors related to soil properties. Redox potential is an indicator of electrochemical corrosion reactivity. To some extent, the lower the redox potential, the higher the probability of active corrosion (Jones, 1996). This is consistent with the negative coefficients: –0.451 for random defects and –0.513 for clustered defects.
Resistivity @1 m, pH value, and the concentrations of chloride ion, CO₃²⁻ ion, HCO₃⁻ ion and sulphate ion are the measured parameters related to soil properties. Amongst them, pH value has the most negative coefficient, –0.306 for random defects, which agrees with the fact that more acidic soils have a higher risk of corrosion in buried metallic structures. The external corrosion rate of a pipeline decreases with an increase in the pH value of the soil (Abbas, Norman, & Charles, 2018). A similar value of –0.284 is also observed for clustered defect counts.
The concentration of chloride ion is identified as the most significant factor for clustered defect counts with a mean value of 0.518. The mean value of the model coefficient of chloride concentration for random defects is 0.111, which is also significant. Different from the negative coefficients of redox potential and pH value, the model coefficient for chloride concentration is positive for the count of
Figure 5. Absolute residuals of the MVPLN model (a) defects with random spatial pattern; (b) defects with clustered spatial pattern.
both the random and clustered defects. These are expected results since chloride plays a key role in the nucleation and propagation of pitting corrosion of carbon steels (Cheng, Wilmott, & Luo, 1999; Tang et al., 2013). As the amount of chloride accumulates to a threshold at the steel surface, the passive film breaks down and stable pitting corrosion forms.
Moreover, clustered defects generally have a higher chance of being attacked by chloride ion and hence are more sensitive to the chloride ion concentration, which is consistent with the fact that a higher coefficient (0.518) is observed for clustered defects than for random defects (0.111). In addition, it has been reported that metastable pitting events can exert significant impacts on the nucleation of future events in terms of cooperative stochastic behaviour (Bertocci & Ye, 1984; Leckie & Uhlig, 1966; Wu, Scully, Hudson, & Mikhailov, 1997). To be more specific, the cooperative interaction of metastable pitting events may occur at a certain critical potential, which can lead to an autocatalytic explosion in the metastable pitting density and a consequent sudden rise in the growth of pit density in a local region (Cheng & Luo, 2000; Mikhailov, Jain, Organ, & Hudson, 2006; Scully, Budiansky, Tiwary, Mikhailov, & Hudson, 2008).
Carbon dioxide is one of the main chemical species that cause internal corrosion of transmission pipelines, and extensive research work has been conducted to study its mechanism (Barker, Burkle, Charpentier, Thompson, & Neville, 2018; Javidi & Bekhrad, 2018). For external corrosion of pipelines, both CO₃²⁻ ion and HCO₃⁻ ion originate from the dissolved carbon dioxide in the soil pore solution; these two ions react with the pipeline steel to form a layer of iron carbonate (FeCO3) that adheres to the steel surface. This iron carbonate layer gradually slows down the carbon dioxide and corrosion rate, and consequently causes an increase in the concentrations of CO₃²⁻ and HCO₃⁻ ion in the soil. Therefore, negative model coefficients are observed for both random and clustered defect counts. These results are supported by existing experimental findings that increases in CO₃²⁻ and HCO₃⁻ ion concentrations may result in an increased stability of the FeCO3 passive film and a subsequently increased pitting potential (Lu, Huang, Huang, & Yang, 2006; Mao, Liu, & Revie, 1994).
Moreover, compared with the coefficients for random defects (–0.0641 for CO₃²⁻ ion and –0.198 for HCO₃⁻ ion), the estimated coefficients of both HCO₃⁻ ion and CO₃²⁻ ion for clustered defects are more significant (–0.308 for CO₃²⁻ ion and –0.403 for HCO₃⁻ ion). This is probably because the concentration of both CO₃²⁻ ion and HCO₃⁻ ion is the average of many measured sites within one segment, so the concentration of these two ions may be more sensitive to a relatively large corrosion area, which could be the result of clustered corrosion events.
Resistivity @1 m is identified as a less significant negative factor with mean values of –0.114 and –0.108 for random and clustered defect counts, respectively. The negative value of these two coefficients indicates that an increase in the soil resistivity will reduce the possibility of corrosion defects, which is in agreement with experimental observation (Bradford, 2000). However, it is important to note that soil resistivity alone may not be enough to evaluate the chance of pitting nucleation; for example, bacteria, dissimilar metals or oxygen concentration cells may create severe corrosion in high resistivity soil (Fitzgerald, 1989). This may explain the relatively weak correlation between soil resistivity and defect counts, compared with other explanatory variables.
Sulphate-reducing bacteria (SRB) are widely found in soils and are important strains that cause microbiologically influenced corrosion (MIC) (Li et al., 2018; Liu & Cheng, 2017). An increase in the concentration of sulphate ion will increase corrosion activity, which is consistent with the positive estimated coefficients: 0.0754 for random defect counts and 0.0573 for clustered defect counts. However, compared with other factors related to soil corrosivity, the concentration of sulphate ion is the least significant. On the one hand, this is probably because the soil along the pipeline investigated in this study is not a favourable home for the growth of SRB. On the other hand, in addition to the sulphate ion, MIC also depends on other parameters such as pH, anaerobic condition (low redox potential) and the existence of microbiological activity (Arriba-Rodriguez, Villanueva-Balsera, Ortega-Fernandez, & Rodriguez-Perez, 2018).
Figure 6. (a) Estimated counts of defects with random pattern for the non-piggable segments; (b) estimated counts of defects with clustered pattern for the non-piggable segments.
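How a median and an upper-bound estimate could be obtained for a segment without ILI data is sketched below. This is a simplified illustration and not the authors' procedure. It assumes the standard multivariate Poisson-lognormal form (a log link with bivariate normal latent effects), assumes that the covariates are standardised and ordered as in Table 4, and plugs in the posterior means from Tables 4 and 5 rather than propagating the full posterior. The function `predict_counts` and its arguments are hypothetical.

```python
import numpy as np

# Posterior means from Table 4; columns: (random, clustered).
# Row order (ASSUMED to match the covariate vector): constant, redox
# potential, resistivity, pH, Cl-, CO3 2-, HCO3-, SO4 2-.
BETA = np.array([
    [0.553, -0.145],
    [-0.451, -0.513],
    [-0.114, -0.108],
    [-0.306, -0.284],
    [0.111, 0.518],
    [-0.0641, -0.308],
    [-0.198, -0.403],
    [0.0754, 0.0573],
])
# Posterior mean of the latent-effect covariance matrix (Table 5).
SIGMA = np.array([[0.883, 0.943],
                  [0.943, 1.334]])

def predict_counts(x_std, q=0.90, n_draws=20_000, seed=0):
    """Monte Carlo predictive median and q-quantile of both counts.

    x_std: seven standardised soil covariates (assumption).
    """
    rng = np.random.default_rng(seed)
    x = np.concatenate(([1.0], np.asarray(x_std, dtype=float)))
    eta = x @ BETA                                   # log-mean, shape (2,)
    eps = rng.multivariate_normal(np.zeros(2), SIGMA, size=n_draws)
    counts = rng.poisson(np.exp(eta + eps))          # shape (n_draws, 2)
    return {
        "median": np.median(counts, axis=0),
        "upper": np.quantile(counts, q, axis=0),
    }

# Segment with all covariates at their (assumed) mean values.
est = predict_counts(np.zeros(7))
```

The quantile level `q` is left as a parameter because the validation in Section 4.1 uses the 90% quantile as the upper bound, while the discussion of Figure 6 refers to a 95% confidence level.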
4.5. Estimating corrosion severity of non-piggable pipeline segments
In this section, it is demonstrated how the presented approach can add value to the current maintenance practices for non-piggable pipeline systems. The model is applied to the first non-piggable pipeline interval shown in Figure 2. This non-piggable pipeline interval ranges from 11,000 to 23,000 m and contains a total of 60 segments with an equal length of 200 m. Figure 6 shows the estimated median value and upper bound with a 95% confidence level of the estimated defect counts for each segment. It can be noted that, for the defects with a random pattern, the estimated median counts and the upper bound counts range from 0.447 to 3.881 and from 1.964 to 16.985, respectively.
For the defects with a clustered pattern, the estimated median counts and the upper bound counts range from 0.106 to 15.453 and from 1.002 to 74.332, respectively. Slightly downward trends of the estimated defect counts can be found along the alignment of the non-piggable pipeline interval, indicating that the first half of the interval may encounter a more severe corrosion condition. Remarkably high median and upper bound defect counts are obtained for the 200-m segments starting at 11.8 and 14.8 km, suggesting that these segments may require more attention during inspection and maintenance operations. These estimation results demonstrate that the developed model can provide a flexible alternative to ditch excavation for evaluating the external corrosion severity of non-piggable pipelines.
5. Conclusions
This study investigates the possibility of modelling pipeline external corrosion defect count data with two correlated corrosion patterns (i.e. random pattern and clustered pattern) using an MVPLN model with the Markov chain Monte Carlo (MCMC) algorithm as a computational engine for Bayesian parameter inference. The developed MVPLN model is applied to a data set consisting of external corrosion defect counts and corrosivity measurements of surrounding soils along a 110-km-long buried oil transportation pipeline. The performance of the developed model for predicting real-world corrosion defect counts is assessed using cross-validation.
For comparison purposes, the same data set was also analysed using the UVPLN model, the UVNB model and the univariate Poisson model. Analysis results demonstrated that the MVPLN model generally provided a superior fit over the univariate models, indicated by a noticeably larger log-likelihood value and a smaller DIC value. Meanwhile, the modelling results showed that the analysed defect count data are highly over-dispersed, and there exists a strong
correlation between the two different spatial patterns (i.e. random and clustered defects). Such statistical features of the real-life pipeline defect count data can be well modelled and accommodated by the MVPLN model, resulting in a significant improvement in the goodness of fit.
The rigorous modelling of the spatial distribution with the MVPLN model can provide engineers with valuable information regarding the mechanism of pitting corrosion. The estimated model coefficients from the data-driven approach were thoroughly discussed; the modelling results were found to be in general consistent with corrosion mechanisms reported in the literature. In addition, the constructed model was further utilised to predict the corrosion defect counts on the non-piggable pipeline segments. The presented results demonstrated that the developed model provides a feasible way of estimating the corrosion severity for the non-piggable pipeline.
The results presented in this paper are based on a single data set. Further research with different data sets is planned to confirm the findings in this study. In addition, considerable latent effects have been observed in the developed model, which lead to predictions of corrosion severity with large uncertainty. In further studies, such uncertainty may be mitigated by employing soil corrosivity measurements with better precision (higher measurement point density), as well as by introducing additional quantifiable corrosion-related variables into the model.
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
The corresponding author would like to acknowledge the School of Engineering, The University of Dayton for the financial support to perform part of the work in this project. This work was also supported by the DOT PHMSA Office of Pipeline Safety under Grant 693JK31910018POTA.
ORCID
Xiangrong Wang http://orcid.org/0000-0001-5618-8776
Hui Wang http://orcid.org/0000-0002-7970-6772
Fujian Tang http://orcid.org/0000-0002-3066-5041
Homero Castaneda http://orcid.org/0000-0002-9252-7744
References
Abbas, M. H., Norman, R., & Charles, A. (2018). Neural network modelling of high pressure CO2 corrosion in pipeline steels. Process Safety and Environmental Protection, 119, 36–45. doi:10.1016/j.psep.2018.07.006
Alamilla, J. L., Espinosa-Medina, M. A., & Sosa, E. (2009). Modelling steel corrosion damage in soil environment. Corrosion Science, 51(11), 2628–2638. doi:10.1016/j.corsci.2009.06.052
Alamilla, J. L., Oliveros, J., & García-Vargas, J. (2009). Probabilistic modelling of a corroded pressurized pipeline at inspection time. Structure and Infrastructure Engineering, 5(2), 91–104. doi:10.1080/15732470600924680
Alamilla, J. L., & Sosa, E. (2008). Stochastic modelling of corrosion damage propagation in active sites from field inspection data. Corrosion Science, 50(7), 1811–1819. doi:10.1016/j.corsci.2008.03.005
Ali, S. (2017). Time-between-events control charts for an exponentiated class of distributions of the renewal process. Quality and Reliability Engineering International, 33(8), 2625–2651. doi:10.1002/qre.2223
Amaya-Gómez, R., Riascos-Ochoa, J., Munoz, F., Bastidas-Arteaga, E., Schoefs, F., & Sanchez-Silva, M. (2019). Modeling of pipeline corrosion degradation mechanism with a Levy Process based on ILI (In-Line) inspections. International Journal of Pressure Vessels and Piping, 172, 261–271. doi:10.1016/j.ijpvp.2019.03.001
Amaya-Gómez, R., Sanchez-Silva, M., & Muñoz, F. (2019). Integrity assessment of corroded pipelines using dynamic segmentation and clustering. Process Safety and Environmental Protection, 128, 284–294. doi:10.1016/j.psep.2019.05.049
Arriba-Rodriguez, L.-d., Villanueva-Balsera, J., Ortega-Fernandez, F., & Rodriguez-Perez, F. (2018). Methods to evaluate corrosion in buried steel structures: A review. Metals, 8(5), 334. doi:10.3390/met8050334
Barker, R., Burkle, D., Charpentier, T., Thompson, H., & Neville, A. (2018). A review of iron carbonate (FeCO3) formation in the oil and gas industry. Corrosion Science, 142, 312–341. doi:10.1016/j.corsci.2018.07.021
Bazan, F. A. V., & Beck, A. T. (2013). Stochastic process corrosion growth models for pipeline reliability. Corrosion Science, 74, 50–58. doi:10.1016/j.corsci.2013.04.011
Benjamin, A. C., Freire, J. L. F., & Vieira, R. D. (2007). Analysis of pipeline containing interacting corrosion defects. Experimental Techniques, 31(3), 74–82. doi:10.1111/j.1747-1567.2007.00190.x
Benjamin, A. C., Freire, J. L. F., Vieira, R. D., & Cunha, D. J. (2016). Interaction of corrosion defects in pipelines–Part 2: MTI JIP database of corroded pipe tests. International Journal of Pressure Vessels and Piping, 145, 41–59. doi:10.1016/j.ijpvp.2016.06.006
Benjamin, A. C., Freire, J. L. F., Vieira, R. D., & de Andrade, E. Q. (2008). Burst tests on pipeline containing closely spaced corrosion defects. 25th International Conference on Offshore Mechanics and Arctic Engineering, 4, 103–116. doi:10.1115/OMAE2006-92131
Bertocci, U., & Ye, Y. X. (1984). An examination of current fluctuations during pit initiation in Fe-Cr alloys. Journal of the Electrochemical Society, 131(5), 1011–1017. doi:10.1149/1.2115742
Bradford, S. A. (2000). The practical handbook of corrosion control in soils. Edmonton: CASTI Publishing.
Burstein, G., Pistorius, P., & Mattin, S. (1993). The nucleation and growth of corrosion pits on stainless steel. Corrosion Science, 35(1–4), 57–62. doi:10.1016/0010-938X(93)90133-2
Caleyo, F., Alfonso, L., Alcantara, J., & Hallen, J. M. (2008). On the estimation of failure rates of multiple pipeline systems. Journal of Pressure Vessel Technology, 130(2), 021704–021704. doi:10.1115/1.2894292
Caleyo, F., Gonzalez, J. L., & Hallen, J. M. (2002). A study on the reliability assessment methodology for pipelines with active corrosion defects. International Journal of Pressure Vessels and Piping, 79(1), 77–86. doi:10.1016/S0308-0161(01)00124-7
Caleyo, F., Velazquez, J. C., Valor, A., & Hallen, J. M. (2009a). Markov chain modelling of pitting corrosion in underground pipelines. Corrosion Science, 51(9), 2197–2207. doi:10.1016/j.corsci.2009.06.014
Caleyo, F., Velazquez, J. C., Valor, A., & Hallen, J. M. (2009b). Probability distribution of pitting corrosion depth and rate in underground pipelines: A Monte Carlo study. Corrosion Science, 51(9), 1925–1934. doi:10.1016/j.corsci.2009.05.019
Chen, Y., Zhang, H., Zhang, J., Liu, X., Li, X., & Zhou, J. (2015). Failure assessment of X80 pipeline with interacting corrosion defects. Engineering Failure Analysis, 47, 67–76. doi:10.1016/j.engfailanal.2014.09.013
Cheng, Y., & Luo, J. (2000). Statistical analysis of metastable pitting events on carbon steel. British Corrosion Journal, 35(2), 125–130. doi:10.1179/000705900101501146
Cheng, Y., Wilmott, M., & Luo, J. (1999). The role of chloride ions in pitting of carbon steel studied by the statistical analysis of electrochemical noise. Applied Surface Science, 152(3–4), 161–168. doi:10.1016/S0169-4332(99)00328-1
Chib, S., & Greenberg, E. (1995). Understanding the Metropolis- behaviour of iron in weakly alkaline solutions with or without halides.
Hastings algorithm. The American Statistician, 49(4), 327–335. doi: Corrosion Science, 48(10), 3049–3077. doi:10.1016/j.corsci.2005.11.014
10.1080/00031305.1995.10476177 Mannering, F. L., Shankar, V., & Bhat, C. R. (2016). Unobserved het-
Chib, S., & Winkelmann, R. (2001). Markov Chain Monte Carlo ana- erogeneity and the statistical analysis of highway accident data.
lysis of correlated count data. Journal of Business & Economic Analytic Methods in Accident Research, 11, 1–16. doi:10.1016/j.amar.
Statistics, 19(4), 428–435. doi:10.1198/07350010152596673 2016.04.001
Cosham, A., Hopkins, P., & Macdonald, K. A. (2007). Best practice for the Mao, X., Liu, X., & Revie, R. (1994). Pitting corrosion of pipeline steel
assessment of defects in pipelines – Corrosion. Engineering Failure in dilute bicarbonate solution with chloride ions. CORROSION,
Analysis, 14(7), 1245–1265. doi:10.1016/j.engfailanal.2006.11.035 50(9), 651–657. doi:10.5006/1.3293540
De Andrade, E. Q., Benjamin, A. C., Machado, P. R., Pereira, L. C., Jacob, Melchers, R. E. (2005). Representation of uncertainty in maximum
B. P., Carneiro, E. G., & Noronha, D. B. (2008). Finite element model- depth of marine corrosion pits. Structural Safety, 27(4), 322–334.
ing of the failure behavior of pipelines containing interacting corrosion doi:10.1016/j.strusafe.2005.02.002
defects. 25th International Conference on Offshore Mechanics and Arctic Miaou, S.-P., Song, J. J., & Mallick, B. K. (2003). Roadway traffic crash
Engineering, 4, 315–325. doi:10.1115/OMAE2006-92600 mapping: A space-time modeling approach. Journal of
De La Cruz, J. L., Kuniewski, S. P., Van Noortwijk, J. M., & Gutierrez, Transportation and Statistics, 6, 33–58. Retrieved from https://
M. A. (2008). Spatial nonhomogeneous Poisson process in corrosion rosap.ntl.bts.gov/view/dot/34849/dot_34849_DS10.pdf#page=39
management. Journal of the Electrochemical Society, 155(8), Mikhailov, A. S., Jain, S., Organ, L., & Hudson, J. L. (2006).
C396–406. doi:10.1149/1.2926543 Cooperative stochastic behavior in the onset of localized corrosion.
El-Abbasy, M. S., Senouci, A., Zayed, T., Mirahadi, F., & Parvizsedghy, Chaos (Woodbury, N.Y.), 16(3), 037104. doi:10.1063/1.2214155
L. (2014). Artificial neural network models for predicting condition Nadarajah, S. (2005). Exponentiated Pareto distributions. Statistics,
of offshore oil and gas pipelines. Automation in Construction, 45, 39(3), 255–260. doi:10.1080/02331880500065488
50–65. doi:10.1016/j.autcon.2014.05.003 Punckt, C., B€ olscher, M., Rotermund, H. H., Mikhailov, A. S., Organ,
Ferreira, C. A. M., Ponciano, J. A. C., Vaitsman, D. S., & Perez, D. V. L., Budiansky, N., … Hudson, J. L. (2004). Sudden onset of pitting
(2007). Evaluation of the corrosivity of the soil through its chemical corrosion on stainless steel as a critical phenomenon. Science (New
composition. The Science of the Total Environment, 388(1–3), York, N.Y.), 305(5687), 1133–1136. doi:10.1126/science.1101358
250–255. doi:10.1016/j.scitotenv.2007.07.062 Scully, J. R., Budiansky, N. D., Tiwary, Y., Mikhailov, A. S., & Hudson,
Fitzgerald, J. (1989). The future as a reflection of the past. In V. J. L. (2008). An alternate explanation for the abrupt current increase
Chaker & J. Palmer (Eds.), Effects of soil characteristics on corrosion at the pitting potential. Corrosion Science, 50(2), 316–324. doi:10.
(pp. 1–4). West Conshohocken, PA: ASTM. doi:10.1520/STP19705S 1016/j.corsci.2007.08.002
Frankel, G. (1998). Pitting corrosion of metals a review of the critical Shibata, T. (1996). W.R. Whitney Award Lecture: Statistical and sto-
factors. Journal of the Electrochemical Society, 145(6), 2186–2198. chastic approaches to localized corrosion. Corrosion, 52(11),
doi:10.1149/1.1838615 813–830. doi:10.5006/1.3292074
Grandell, J. (1997). Mixed Poisson processes. London: Chapman and Shibata, T. (2000). Corrosion probability and statistical evaluation of
Hall. corrosion data. New York, NY: John Wiley & Sons.
Hong, H. P. (1999). Inspection and maintenance planning of pipeline Silva, R. C. C., Guerreiro, J. N. C., & Loula, A. F. D. (2007). A study
under external corrosion considering generation of new defects. of pipe interacting corrosion defects using the FEM and neural net-
Structural Safety, 21(3), 203–222. doi:10.1016/S0167-4730(99)00016-8 works. Advances in Engineering Software, 38(11–12), 868–875. doi:
Javidi, M., & Bekhrad, S. (2018). Failure analysis of a wet gas pipeline 10.1016/j.advengsoft.2006.08.047
due to localised CO2 corrosion. Engineering Failure Analysis, 89, Sinha, S. K., & Pandey, M. D. (2002). Probabilistic neural network for reli-
46–56. doi:10.1016/j.engfailanal.2018.03.006 ability assessment of oil and gas pipelines. Computer-Aided Civil and
Jones, D. A. (1996). Principles and prevention of corrosion (2nd ed.). Infrastructure Engineering, 17(5), 320–329. doi:10.1111/1467-8667.00279
Upper Saddle River, NJ: Prentice Hall. Stewart, J., & Williams, D. E. (1992). The initiation of pitting corrosion
Karlis, D., & Xekalaki, E. (2007). Mixed Poisson distributions. on austenitic stainless steel: On the role and importance of sulphide
International Statistical Review, 73(1), 35–58. doi:10.1111/j.1751- inclusions. Corrosion Science, 33(3), 457–474. doi:10.1016/0010-
5823.2005.tb00250.x 938X(92)90074-D
Katano, Y., Miyata, K., Shimizu, H., & Isogai, T. (2003). Predictive Tang, F., Cheng, X., Chen, G., Brow, R. K., Volz, J. S., & Koenigstein,
model for pit growth on underground pipes. CORROSION, 59(2), M. L. (2013). Electrochemical behavior of enamel-coated carbon
155–161. doi:10.5006/1.3277545 steel in simulated concrete pore water solution with various chloride
Kishawy, H. A., & Gabbar, H. A. (2010). Review of pipeline integrity concentrations. Electrochimica Acta, 92, 36–46. doi:10.1016/j.elec-
management practices. International Journal of Pressure Vessels and tacta.2012.12.125
Piping, 87(7), 373–380. doi:10.1016/j.ijpvp.2010.04.003 Valor, A., Alfonso, L., Caleyo, F., Vidal, J., Perez-Baruch, E., & Hallen,
Kocherlakota, S., & Kocherlakota, K. (2004). Bivariate discrete distribu- J. M. (2015). The negative binomial distribution as a model for
tions. New York, NY: Marcel Dekker. external corrosion defect counts in buried pipelines. Corrosion
Leckie, H., & Uhlig, H. (1966). Environmental factors affecting the crit- Science, 101, 114–131. doi:10.1016/j.corsci.2015.09.009
ical potential for pitting in 18–8 stainless steel. Journal of the Valor, A., Caleyo, F., Hallen, J. M., & Velazquez, J. C. (2013).
Electrochemical Society, 113(12), 1262–1267. doi:10.1149/1.2423801 Reliability assessment of buried pipelines based on different corro-
Li, X., Bai, Y., Su, C., & Li, M. (2016). Effect of interaction between sion rate models. Corrosion Science, 66, 78–87. doi:10.1016/j.corsci.
corrosion defects on failure pressure of thin wall steel pipeline. 2012.09.005
International Journal of Pressure Vessels and Piping, 138, 8–18. doi: Velazquez, J. C., Caleyo, F., Valor, A., & Hallen, J. M. (2009).
10.1016/j.ijpvp.2016.01.002 Predictive model for pitting corrosion in buried oil and gas pipe-
Li, X., Xie, F., Wang, D., Xu, C., Wu, M., Sun, D., & Qi, J. (2018). Effect lines. Corrosion, 65(5), 332–342. doi:10.5006/1.3319138
of residual and external stress on corrosion behaviour of X80 pipeline Wang, H., Yajima, A., Liang, R. Y., & Castaneda, H. (2015a). Bayesian
steel in sulphate-reducing bacteria environment. Engineering Failure modeling of external corrosion in underground pipelines based on
Analysis, 91, 275–290. doi:10.1016/j.engfailanal.2018.04.016 the integration of Markov Chain Monte Carlo techniques and clus-
Liu, H., & Cheng, Y. F. (2017). Mechanism of microbiologically influ- tered inspection data. Computer-Aided Civil and Infrastructure
enced corrosion of X52 pipeline steel in a wet soil containing sul- Engineering, 30(4), 300–316. doi:10.1111/mice.12096
fate-reduced bacteria. Electrochimica Acta, 253, 368–378. doi:10. Wang, H., Yajima, A., Liang, R. Y., & Castaneda, H. (2015b). A clus-
1016/j.electacta.2017.09.089 tering approach for assessing external corrosion in a buried pipeline
Lu, Z., Huang, C., Huang, D., & Yang, W. (2006). Effects of a magnetic based on hidden Markov random field model. Structural Safety, 56,
field on the anodic dissolution, passivation and transpassivation 18–29. doi:10.1016/j.strusafe.2015.05.002
16 X. WANG ET AL.
Wang, H., Yajima, A., Liang, R. Y., & Castaneda, H. (2016). Yajima, A., Rivera, H., Mora, R., Martinez, L., Vergara, M., &
Reliability-based temporal and spatial maintenance strategy for Castaneda, H. (2015). Proposed ECDA methodology modification by
integrity management of corroded underground pipelines. Structure including new probabilistic and statistical analysis based on a case of
and Infrastructure Engineering, 12(10), 1281–1294. doi:10.1080/ study for 110km pipeline. Corrosion 2015, Dallas, Texas.
15732479.2015.1113300 Yajima, A., Wang, H., Liang, R. Y., & Castaneda, H. (2015). A
Williams, D. E., Stewart, J., & Balkwill, P. H. (1994). The nucleation, clustering based method to evaluate soil corrosivity for pipeline
growth and stability of micropits in stainless steel. Corrosion external integrity management. International Journal of Pressure
Science, 36(7), 1213–1235. doi:10.1016/0010-938X(94)90145-7
Vessels and Piping, 126–127, 37–47. doi:10.1016/j.ijpvp.2014.12.
Winkelmann, R. (2000). Seemingly unrelated negative binomial regres-
004
sion. Oxford Bulletin of Economics and Statistics, 62(4), 553–560.
Zhan, X., Aziz, H. M. A., & Ukkusuri, S. V. (2015). An efficient paral-
doi:10.1111/1468-0084.00188
Winkelmann, R. (2008). Econometric analysis of count data (5th ed.). lel sampling technique for Multivariate Poisson-Lognormal model:
New York, NY: Springer. Analysis with two crash count datasets. Analytic Methods in
Wranglen, G. (1974). Pitting and sulphide inclusions in steel. Corrosion Accident Research, 8, 45–60. doi:10.1016/j.amar.2015.10.002
Science, 14(5), 331–349. doi:10.1016/S0010-938X(74)80047-8 Zhou, W., Hong, H. P., & Zhang, S. (2012). Impact of dependent sto-
Wu, B., Scully, J., Hudson, J., & Mikhailov, A. (1997). Cooperative stochastic defect growth on system reliability of corroding pipelines.
chastic behavior in localized corrosion I. Model. Journal of the International Journal of Pressure Vessels and Piping, 96–97, 68–77.
Electrochemical Society, 144(5), 1614–1620. doi:10.1149/1.1837650 doi:10.1016/j.ijpvp.2012.06.005

