$$\hat{y}(x) = \sum_{j=0}^{d} \alpha_j x^j \qquad (1)$$
Since the data points $y_i$ are perturbed, they in general do not lie exactly on the polynomial. Consequently, there is a residual
$$r_i = y_i - \hat{y}(x_i) = y_i - \sum_{j=0}^{d} \alpha_j x_i^j \qquad (2)$$
Rewriting equation (2) in matrix form yields
SPE 163302
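$$\begin{bmatrix} r_1 \\ \vdots \\ r_n \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} - \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^d \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \vdots \\ \alpha_d \end{bmatrix} \qquad (3)$$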
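In general, for the measurement data, the design matrix B and the coefficient vector z are defined as
$$B = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^d \end{bmatrix} \quad \text{and} \quad z = \begin{bmatrix} \alpha_0 \\ \vdots \\ \alpha_d \end{bmatrix} \qquad (4)$$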
Consequently,
$$\hat{y} = B z \qquad (5)$$
So the error can be written as
$$E = \sum_{i=1}^{n} r_i^2 = r^{T} r \qquad (6)$$
Then, the coefficient vector is calculated using
$$z = (B^{T} B)^{-1} B^{T} y = B^{+} y \qquad (7)$$
where $B^{+}$ is the pseudo-inverse of B (O'Leary and Harker 2008). The column vectors of the matrix B are the basis functions, and B itself is called a Vandermonde matrix. The basis functions of the Vandermonde matrix are not orthogonal. As a result, the computation becomes ill-conditioned for polynomials of high degree, which makes the Vandermonde basis unsuitable for large-scale problems. We therefore choose another set of basis functions that are orthogonal and for which higher degrees can be computed. Examples of orthogonal polynomials from functional analysis include Legendre, Chebyshev, Gram, and Gegenbauer polynomials.
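The least-squares fit of equations (3) to (7) can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' implementation: the function name `fit_polynomial`, the noisy quadratic, and the choice of NumPy are assumptions made here. It also compares condition numbers to show why the Vandermonde basis becomes troublesome at higher degrees.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the pseudo-inverse, as in equation (7)."""
    # Design matrix B with columns 1, x, x^2, ..., x^d (a Vandermonde matrix).
    B = np.vander(x, N=degree + 1, increasing=True)
    # z = B^+ y, where np.linalg.pinv gives the Moore-Penrose pseudo-inverse.
    z = np.linalg.pinv(B) @ y
    residual = y - B @ z
    return z, residual, B

# Hypothetical data: a noisy quadratic sampled on [-1, 1].
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 0.5 * x**2 + 0.01 * rng.standard_normal(x.size)

z, r, B = fit_polynomial(x, y, degree=2)
E = r @ r  # sum of squared residuals, equation (6)

# The condition number of the Vandermonde matrix grows quickly with the degree.
cond_low = np.linalg.cond(np.vander(x, N=3, increasing=True))
cond_high = np.linalg.cond(np.vander(x, N=21, increasing=True))
print(z, E, cond_low, cond_high)
```

The recovered coefficients stay close to the generating values, while the degree-20 design matrix is far worse conditioned than the degree-2 one.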
In our study of drilling sensor data, we use Gram polynomials to describe the time series of different drilling operations. We choose Gram polynomials because they are not only orthogonal but also uniformly scaled, which is important for fitting complex shapes in drilling time series with better performance than the Vandermonde basis allows. The Gram polynomials are generated by the recurrence (O'Leary and Harker 2010):
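$$g_n(x) = 2\,\alpha_{n-1}\, x\, g_{n-1}(x) - \frac{\alpha_{n-1}}{\alpha_{n-2}}\, g_{n-2}(x) \qquad (8)$$
where
$$\alpha_{n-1} = \frac{m}{n} \left( \frac{n^2 - \tfrac{1}{4}}{m^2 - n^2} \right)^{1/2} \qquad (9)$$
and
$$g_0(x) = 1, \quad g_{-1}(x) = 0, \quad \alpha_{-1} = 1 \qquad (10)$$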
Figure 2 shows what the Gram basis functions look like.
Figure 2: The first six degrees of Gram basis functions.
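The recurrence for the Gram polynomials can be evaluated directly. The sketch below makes several assumptions that are not stated in the text: m is taken as the number of equally spaced nodes, the nodes are placed at x_i = -1 + (2i - 1)/m, and the constant inside the square root of the scaling coefficient is taken as 1/4, the value for which the generated columns come out mutually orthogonal with sum of squares m. The function name `gram_basis` is hypothetical.

```python
import numpy as np

def gram_basis(m, degree):
    """Evaluate Gram polynomials g_0..g_degree on m equispaced nodes in (-1, 1)."""
    if degree >= m:
        raise ValueError("degree must be smaller than the number of nodes m")
    i = np.arange(1, m + 1)
    x = -1.0 + (2.0 * i - 1.0) / m            # assumed node placement

    def alpha(n):
        # Scaling coefficient alpha_{n-1} of the recurrence, for n >= 1.
        return (m / n) * np.sqrt((n**2 - 0.25) / (m**2 - n**2))

    G = np.empty((m, degree + 1))
    G[:, 0] = 1.0                              # g_0(x) = 1
    g_prev2 = np.zeros(m)                      # g_{-1}(x) = 0
    g_prev1 = G[:, 0]
    a_prev2 = 1.0                              # alpha_{-1} = 1
    for n in range(1, degree + 1):
        a_prev1 = alpha(n)                     # alpha_{n-1}
        G[:, n] = 2.0 * a_prev1 * x * g_prev1 - (a_prev1 / a_prev2) * g_prev2
        g_prev2, g_prev1, a_prev2 = g_prev1, G[:, n], a_prev1
    return x, G

# The first six basis functions on 50 nodes.
x, G = gram_basis(m=50, degree=5)
gram_matrix = G.T @ G / 50                     # close to the identity if orthogonal
print(np.round(gram_matrix, 6))
```

Dividing each column by the square root of m would give an orthonormal basis, so the coefficients of a signal follow from a single matrix product with no matrix inversion.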
Applying Basis Functions on Time Series of Drilling Operations
Figure 3 shows two different formation drilling operations (left and right) with the sensor measurements of block position, RPM, flowIn, and hole depth. All sensor measurements are normalized so that the coefficient values stay close to each other and remain comparable. The red lines in the upper four sub-plots of Figure 3 are the Gram polynomials fitted to each sensor's data, while the lowest sub-plot shows the values of the coefficient vectors $z_1$, $z_2$, $z_3$, and $z_4$ as a color map. We call the coefficient matrix in the lowest sub-plot the pattern descriptor of the drilling operation. Gram basis functions of high order are used in the calculations: each sensor's data in each formation drilling operation in Figure 3 is represented by a Gram polynomial of degree 20.
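A pattern descriptor of this kind could be computed as sketched below. Several choices here are assumptions rather than the paper's method: the normalization is a z-score per sensor, and the orthonormal polynomial basis is built by QR factorization of a Legendre-Vandermonde matrix on equispaced nodes, a substitute construction that spans the same nested polynomial spaces as the Gram polynomials. The function names and the synthetic segment are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def discrete_orthonormal_basis(m, degree):
    """Orthonormal polynomial basis on m equispaced nodes (QR-based substitute)."""
    x = np.linspace(-1.0, 1.0, m)
    V = legendre.legvander(x, degree)          # columns P_0(x) .. P_degree(x)
    Q, R = np.linalg.qr(V)                     # orthonormal columns, nested spans
    Q *= np.sign(np.diag(R))                   # make leading coefficients positive
    return Q

def pattern_descriptor(segment, degree=20):
    """segment: (n_samples, n_sensors). Returns an (n_sensors, degree + 1) matrix."""
    segment = np.asarray(segment, dtype=float)
    mean = segment.mean(axis=0)
    std = segment.std(axis=0)
    std[std == 0.0] = 1.0                      # constant channels: avoid division by zero
    normalized = (segment - mean) / std        # assumed normalization (z-score)
    Q = discrete_orthonormal_basis(segment.shape[0], degree)
    return (Q.T @ normalized).T                # one coefficient vector per sensor

# Hypothetical segment: a straight-line block position and a noisy, flat RPM.
t = np.linspace(0.0, 1.0, 300)
rng = np.random.default_rng(1)
segment = np.column_stack([
    30.0 - 25.0 * t,
    140.0 + 1.0 * rng.standard_normal(t.size),
])
D = pattern_descriptor(segment)
print(np.round(D[0, :4], 3))
```

For the straight-line channel only the second coefficient is non-zero, which corresponds to a single distinct entry in the first row of a color map.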
The lowest part of Figure 3 shows that the two formation drilling operations look similar and that their patterns are close to each other. This will be a key indicator in the recognition of drilling operations. Furthermore, the block position data follow a straight line, and this is visible in the coefficient matrix (lowest sub-plot in Figure 3, first row of the color map): of the coefficients of the degree-20 polynomial, only the second has a distinct value (blue color). The second value in the Gram polynomial spectrum represents the linear component (see Figure 2, second basis function). The RPM data show that the value is not held exactly at one level, contrary to the belief in the drilling domain that the driller sets and fixes the RPM at a specific value during formation drilling. The raw data show that RPM and flowIn fluctuate within small ranges: RPM between 139 and 142 rev/min, and flowIn between 1983 and 1985 L/min at a depth of 3070 m. These small fluctuations will be the key indicator for distinguishing the RPM and flowIn trends during formation drilling from other states in which the pumps are off and the drillstring is not rotating.
Figure 3: Two formation drilling operations and their polynomial representations (Patterns).
Trend Analysis of Drilling Operations
In this section, we study the trends of all available sensor measurements using Gram basis methods for a specific drilling operation (cleaning a hole, circulation). We also try to answer a number of questions: What is the best polynomial degree that preserves the required information about each sensor's data? Do we need the data from all sensors to describe each drilling operation? Which sensors must be monitored to recognize the different drilling operations? To answer these questions, we need to analyze each sensor's data during different drilling operations in depth and examine the spectrum (the moments given by the Gram polynomial coefficients) and the corresponding proportion of total power.
We take the Cleaning Hole (Circulation) operation as a case study to answer these questions. Figure 4 shows the sensor data during a cleaning hole operation. The figure is divided into three main sections. The first section shows the raw data and the polynomial approximation over time. The second section shows the corresponding spectrum of the Gram polynomials used to approximate the sensor data. The third section shows the proportion of total power for each polynomial spectrum. The proportion of total power indicates which polynomial degrees, or coefficients, play the main role in representing the information in each sensor's data. A red threshold line at 95% is plotted on the proportion of total power.
The polynomial spectrum shows how the coefficients vary; each change in the coefficient values tells us how important the corresponding polynomial component is for representing and reconstructing the sensor data. For example, the spectrum of the hookload sensor data indicates that the first component carries no information because its value is zero, whereas components 2 to 7 represent most of the information in the hookload data during cleaning hole. The proportion of total power of the hookload data shows that the first five components represent around 95% of the information and that the first component contributes nothing. We therefore need to keep four components (2 to 5) to represent and reconstruct the hookload sensor data.
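The selection of components from the proportion of total power can be sketched as follows. The definition used here, squared coefficients divided by their sum, is an assumption, since the text does not give a formula. The coefficient values are invented to mimic the hookload description (zero first component, about 95% reached within the first five) and are not the measured spectrum.

```python
import numpy as np

def components_for_power(z, threshold=0.95):
    """Return per-component power share, cumulative share, and components needed."""
    z = np.asarray(z, dtype=float)
    power = z**2
    proportion = power / power.sum()           # assumed definition of the power share
    cumulative = np.cumsum(proportion)
    # Smallest number of leading components whose cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold) + 1)
    return proportion, cumulative, k

# Hypothetical spectrum: the first component is zero.
z = [0.0, 3.0, 2.0, 1.5, 1.0, 0.5, 0.3, 0.1]
proportion, cumulative, k = components_for_power(z)

# Components (1-based) among the first k that actually carry power.
kept = [i + 1 for i in range(k) if proportion[i] > 0]
print(k, kept, np.round(cumulative, 3))
```

With this made-up spectrum the threshold is reached at the fifth component, and dropping the empty first component leaves components 2 to 5.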
Applying the same reasoning to the other sensor data in Figure 4, we can conclude the following about the sensor data during the cleaning hole (Circulation) operation: 1) To capture all the information in the Block position sensor data, we need to preserve just one component of the polynomial spectrum. 2) Components 2 to 8 are enough to retain 95% of the information in the torque sensor data. 3) Just two components (2 and 3) are needed to reconstruct the flowIn sensor data. 4) One component each is sufficient for Hole Depth and Bit Depth. 5) Three components (2, 3, and 4) of the polynomial spectrum are enough to retain 95% of the information about the mud pump pressure. 6) The second and third components of the spectrum are adequate to represent RPM.
Figure 4: Applying trend analysis using Gram polynomials on drilling sensors data during cleaning hole (Circulation) operation.
Drilling Patterns Base
The test database consists of four completely drilled offset wells. The drilling operations were marked manually on the raw data by drilling experts. The following table contains information about the wells:
Well Name        Resolution                           Depth   Sensors used in patterns recognition
Learning Well 1  1 data point each second, 1 Hz       4000 m  Hkld, posblock, mdbit, mdhole, flowIn, RPM, torque, prespump
Test Well 1      1 data point each 5 seconds, 0.2 Hz  2550 m  Hkld, posblock, mdbit, mdhole, flowIn, RPM, torque, prespump
Test Well 2      1 data point each 5 seconds, 0.2 Hz  4100 m  Hkld, posblock, mdbit, mdhole, flowIn, RPM, torque, prespump
Test Well 3      1 data point each 5 seconds, 0.2 Hz  1860 m  Hkld, posblock, mdbit, mdhole, flowIn, RPM, torque, prespump
Table 1: Learning and Test Wells Information
We use the first well as a learning well, from which our patterns classifier builds its patterns base. The other wells are segmented using a window, and the segments are sent to the classifier. The classifier measures the similarity of each segment to the patterns in the patterns base and assigns the segment the class of the pattern with the highest similarity.
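The windowing step can be sketched as below. The text does not state the window length or whether windows overlap, so the `window` and `step` values, the function name `segment_windows`, and the random stand-in data are all hypothetical.

```python
import numpy as np

def segment_windows(data, window, step=None):
    """Split (n_samples, n_sensors) data into windows of `window` samples.

    Returns a list of (start_index, segment) pairs. Samples at the end that
    do not fill a complete window are dropped.
    """
    data = np.asarray(data)
    step = window if step is None else step    # default: non-overlapping windows
    starts = range(0, data.shape[0] - window + 1, step)
    return [(s, data[s:s + window]) for s in starts]

# Hypothetical test well: 1000 samples from 8 sensors, 50% overlapping windows.
rng = np.random.default_rng(2)
well = rng.standard_normal((1000, 8))
segments = segment_windows(well, window=120, step=60)
print(len(segments), segments[-1][0], segments[0][1].shape)
```

Each segment would then be turned into a coefficient descriptor and compared with the patterns base.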
Figure 5 illustrates how this classifier works. Particular care should be given to the phase of building the patterns base, because bad patterns in the patterns base lead the classifier to wrong decisions.
Figure 5: Patterns-based classifier
Patterns Similarity Measure
To measure the similarity between a segment of raw data and a pattern, we use the cosine of the angle θ between two vectors (Negi and Bansal 2002; Yang and Shahabi 2004):
$$\cos(\theta) = \frac{\langle a, b \rangle}{\lVert a \rVert \, \lVert b \rVert} \qquad (11)$$
where a is a vector that contains the polynomial coefficients of the segment of raw data, and b is a vector that contains the polynomial coefficients of an existing pattern in the patterns base.
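The cosine measure and the assignment of the most similar pattern can be sketched as follows. Flattening a multi-sensor coefficient matrix into one vector is an assumption made here, and the function names, labels, and coefficient values are hypothetical toy data.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two coefficient vectors (or flattened matrices)."""
    a = np.ravel(a).astype(float)
    b = np.ravel(b).astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(segment_descriptor, patterns_base):
    """patterns_base: list of (label, descriptor). Returns (best_label, best_score)."""
    scores = [(cosine_similarity(segment_descriptor, pattern), label)
              for label, pattern in patterns_base]
    best_score, best_label = max(scores, key=lambda item: item[0])
    return best_label, best_score

# Toy patterns base with two invented descriptors.
patterns_base = [
    ("Drilling", [0.0, -17.0, 0.5, 0.0]),
    ("MakeCon", [0.0, 2.0, 9.0, -4.0]),
]
label, score = classify([0.0, -16.0, 1.0, 0.2], patterns_base)
print(label, round(score, 4))
```

Because the cosine depends only on direction, two segments with the same shape but different coefficient magnitudes receive a similarity of one.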
The Results
Figure 6 shows the results of applying the patterns classifier to the three offset wells used as test wells. The classification accuracy is around 90% (between 88.56% and 94.05% per well), which we consider a high classification rate. The classifier mainly confuses formation drilling operations with non-drilling operations. This is due to the similarity of the flowIn and RPM trends during formation drilling (making hole) and non-drilling situations: in both, the flowIn and RPM sensor data follow straight trends. As a solution, a threshold level can be defined to bring in more information on the trend of flowIn or RPM.
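One possible reading of the threshold idea is sketched below. If the normalization removes the absolute level of a signal, a flat flowIn trace with the pumps on and a flat trace with the pumps off produce similar descriptors, so a check on the raw window mean restores the missing information. The function name, the use of the mean, and both threshold values are hypothetical; only the example ranges for flowIn and RPM are taken from the text.

```python
import numpy as np

def pumps_and_rotation_active(flow_in, rpm, flow_threshold=100.0, rpm_threshold=10.0):
    """True if the raw window means indicate running pumps and a rotating drillstring.

    flow_threshold (L/min) and rpm_threshold (rev/min) are illustrative values.
    """
    return bool(np.mean(flow_in) > flow_threshold and np.mean(rpm) > rpm_threshold)

# Raw values in the ranges reported for formation drilling.
drilling = pumps_and_rotation_active(
    flow_in=np.array([1983.0, 1984.0, 1985.0, 1984.0]),
    rpm=np.array([139.0, 141.0, 142.0, 140.0]),
)
# Hypothetical window with pumps off and no rotation.
idle = pumps_and_rotation_active(flow_in=np.zeros(4), rpm=np.zeros(4))
print(drilling, idle)
```

Such a flag could be combined with the similarity score so that a flat trend is accepted as formation drilling only when the raw levels support it.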
Another reason for confusion in the classification results is the use of all available sensor data. As discussed in the previous section, some sensor data contain the same information. For example, hole depth, bit depth, and posblock carry the same information during a drilling operation and are therefore redundant components, which may cause confusion if the quality of those sensor data is bad.
In addition, some sensor data are more important than others during specific operations. For example, while making a connection, the flowIn, pump pressure, hole depth, and bit depth data are not important for recognizing the operation. The two important sensors here are Block Position and Hookload.
We believe that the accuracy will improve if the previous comments are incorporated into the current implementation of the patterns-based classifier. Furthermore, data without outliers or missing values are expected to give higher accuracy.
Figure 6: Confusion matrices of test wells with classification accuracy of each one.
[Figure 6 panels: TestWell1, TestWell2, TestWell3. Rows: real operations; columns: classified operations.]

Classified operations (total accuracy: 94.05%)
Real operation  Drilling  RuninHole  PulloutofHole  Circulation  MakeCon  Accuracy (operation)
Drilling            1819          0              0            1        0  99.95%
RuninHole             78        789              8            9        7  88.55%
PulloutofHole         81          0            492            1        7  84.68%
Circulation           39         16             12          143       21  61.90%
MakeCon               15         46             38           23     3111  96.23%

Classified operations (total accuracy: 88.61%)
Real operation  Drilling  RuninHole  PulloutofHole  Circulation  MakeCon  Accuracy (operation)
Drilling             726          5              0            1        0  99.18%
RuninHole             95        472              2            3       14  80.55%
PulloutofHole         39          0            372            3       14  86.92%
Circulation           78         36             28          209       13  57.42%
MakeCon               35         20              9           15     1412  94.70%

Classified operations (total accuracy: 88.56%)
Real operation  Drilling  RuninHole  PulloutofHole  Circulation  MakeCon  Accuracy (operation)
Drilling            1095          0              0            9        0  99.18%
RuninHole            105        219              0            1        0  67.38%
PulloutofHole         55          0            340            1        3  85.21%
Circulation           62         32             12          312       30  69.64%
MakeCon               27          5             12           40     1085  92.81%
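The accuracy figures of a confusion matrix follow directly from its counts: the per-operation accuracy is the diagonal entry divided by the row sum, and the total accuracy is the sum of the diagonal divided by the sum of all entries. The sketch below applies this to the matrix under Figure 6 whose first row reads Drilling 1819, 0, 0, 1, 0; the variable names are illustrative.

```python
import numpy as np

labels = ["Drilling", "RuninHole", "PulloutofHole", "Circulation", "MakeCon"]

# Rows: real operations; columns: classified operations.
confusion = np.array([
    [1819,   0,   0,   1,    0],
    [  78, 789,   8,   9,    7],
    [  81,   0, 492,   1,    7],
    [  39,  16,  12, 143,   21],
    [  15,  46,  38,  23, 3111],
])

# Share of each real operation that was classified correctly, in percent.
per_operation = 100.0 * np.diag(confusion) / confusion.sum(axis=1)
# Share of all segments that were classified correctly, in percent.
total = 100.0 * np.trace(confusion) / confusion.sum()

for name, acc in zip(labels, per_operation):
    print(f"{name:<14} {acc:6.2f}%")
print(f"{'Total':<14} {total:6.2f}%")
```

The same two lines of arithmetic apply to the other two matrices.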
Conclusion and Future Work
The suggested method of drilling operations classification can be extended to recognize any other required operation from whatever drilling sensor data are available. The only condition for applying the method is that the trends of the sensor data should be clearly visible. In addition, the method can be improved by using a rejection classifier to reject patterns that do not belong to a particular operation.
In our work, all operations were added to the patterns base automatically, so the errors in the experts' classification are included in the results. As an improvement, a continuous filtering process should be run on the patterns base to remove bad and wrong patterns that lower the classification rate.
Another important improvement is to add the classification results from the test wells to the patterns base, if and only if they are reviewed, corrected, and accepted by experts. This supports the idea of extending the current patterns base with new knowledge from experts, and it shows how to extract that knowledge and teach running systems to do better in the future.
Acknowledgement
The authors would like to thank TDE Thonhauser Data Engineering GmbH for the permission to publish this paper and for providing the sensor data.
References
Florence, Fred, and Fionn Iversen. 2010. IADC/SPE 128958 Real-Time Models for Drilling Process Automation: Equations and Applications.
Negi, Tripti, and Veena Bansal. 2002. Time Series: Similarity Search and its Applications.
O'Leary, Paul, and Matthew Harker. 2008. An Algebraic Framework for Discrete Basis Functions in Computer Vision. 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing (1): 150-157. http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4756064.
O'Leary, Paul, and Matthew Harker. 2010. Surface Modelling Using Discrete Basis Functions for Real-Time Automatic Inspection.
Rabia, H. 1985. Oilwell Drilling Engineering: Principles and Practice.
Thonhauser, G, and W Mathis. 2006. SPE 103211 Automated Reporting Using Rig Sensor Data Enables Superior Drilling Project Management.
Thonhauser, G, G Wallnoefer, and W Mathis. 2006. SPE 99880 Use of Real-Time Rig Sensor Data to Improve Daily Drilling Reporting, Benchmarking and Planning - A Case Study.
Yang, Kiyoung, and Cyrus Shahabi. 2004. A PCA-based similarity measure for multivariate time series. In Proceedings of the 2nd ACM International Workshop on Multimedia Databases - MMDB '04, New York, New York, USA: ACM Press, p. 65. http://dl.acm.org/citation.cfm?id=1032604.1032616 (July 25, 2012).