SPE-203378-MS

A Global Drilling KPIs Analysis System Based on Modern Data Science Techniques

Downloaded from http://onepetro.org/SPEADIP/proceedings-pdf/20ADIP/1-20ADIP/D011S007R002/2384557/spe-203378-ms.pdf/1 by Saudi Aramco, Baraket Mehri on 21 February 2022

Hongbao Zhang, Baoping Lu, and Shunhui Yang, Sinopec Research Institute of Petroleum Engineering; Ke Ke,
Sinopec Tech Middle East LLC; Jian Song, Sinopec International Exploration and Production Corporation; Xutian
Hou, Sinopec Research Institute of Petroleum Engineering; Zhifa Wang, Sinopec Tech Middle East LLC; Xin Jin,
Sinopec Research Institute of Petroleum Engineering

Copyright 2020, Society of Petroleum Engineers

This paper was prepared for presentation at the Abu Dhabi International Petroleum Exhibition & Conference to be held in Abu Dhabi, UAE, 9 – 12 November 2020.
Due to COVID-19 the physical event was changed to a virtual event. The official proceedings were published online on 9 November 2020.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
In drilling engineering, Key Performance Indexes (KPIs) such as drilling time, monthly footage, rate of
penetration and non-productive time are important aids for management optimization and technical
improvement.
Almost all oil companies have deployed database systems to manage drilling information, and these
systems may hold millions of daily reports. In international oil companies, however, conventional
data management systems have gradually shown several shortcomings: data analysis reports become less
reliable as they pass between management levels because of manual intervention; large amounts of
unstructured data are at risk of being lost; KPI analysis is poorly automated; data processing
procedures have a low tolerance for poor-quality data; data standards vary between fields; and KPI
visualization performs poorly.
To solve these problems, an integrated and intelligent drilling KPIs analysis system based on modern
data science techniques was developed and deployed in an international oil company.
The system architecture was designed in 4 sections: an unstructured data processing unit, a database and
management system, a KPIs extraction engine and a data visualization dashboard.
The unstructured data processing unit was designed to convert reports from heterogeneous data sources,
including Excel files, PDF files and heterogeneous databases, into the main database. A 5-layer data
quality control procedure was designed to ensure the reliability of data for KPIs analysis. Dedicated
extraction algorithms were designed to cope with practical problems such as missing reports, missing
data and data errors. A distance-based model was developed to evaluate the similarity of two wells from
their well properties, providing an intelligent way to benchmark and share knowledge between wells. A
drilling anomaly detection model was developed with deep learning and natural language processing to
solve the problem of non-unified activity coding systems in different fields. A data visualization
dashboard with high efficiency, reliability and flexibility was designed to provide information for
different users.
By deploying the system, tens of thousands of daily reports from heterogeneous data sources were
automatically incorporated into the main database, and millions of dollars of manual processing cost
were saved. The data analysis standard was unified by the system, making benchmarking and knowledge
sharing between oilfields possible. Information from thousands of wells was activated by the system, and
the outputs support management optimization, technical improvement, drilling performance
benchmarking and feasibility evaluation of new investments. The time users need to analyze the KPIs of a
field was shortened from several weeks to almost zero, and data quality was validated and improved in
real time through the feedback of KPIs analysis.
A 7-layer convolutional neural network was trained on data from 1,700 wells to detect anomalies in daily
reports with an accuracy of 85%. A new distance-based model was developed to evaluate the similarity of
two wells.

Introduction
In drilling engineering, KPIs such as drilling time, monthly footage, rate of penetration (ROP) and
non-productive time (NPT) are important aids for management optimization and technical improvement.
Almost all oil companies have deployed database systems to manage drilling information, and these
systems may hold millions of daily reports. In international oil companies, however, conventional
data management systems have gradually shown the following shortcomings.
1. The reliability of data analysis reports decreases as they are transferred between levels of the
management architecture (such as wellsite, project, subcompany and headquarters) because of manual
intervention, and the varied data processing principles make benchmarking difficult.
2. In international oil companies, data management procedures and the level of digitalization may differ
between oilfields. Large amounts of data are in unstructured formats, such as .pdf, .xls and .doc, which
isolates the information from the integrated engineering database and creates a risk of losing data assets.
3. The drilling process is full of uncertainty in both geology and engineering, and the quality of drilling
reports is always imperfect because they are recorded manually. A regular KPI analysis procedure with
rigid rules therefore often fails to extract reasonable results because of its low tolerance for
poor data quality and uncertainty. Manual processing and reviewing are necessary to ensure analysis
quality, which is time consuming.
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems
to extract knowledge and insights from structured and unstructured data (Dhar, 2013). Since 2006,
the successful application of machine learning, especially deep learning, in almost all industries has
attracted great attention to data science techniques. The importance of big data has been accepted in both
industry and academia. How to use modern data science techniques to improve operation and management in
the petroleum industry is becoming a research hotspot.
As shown in Figure 1, for oil companies, data science techniques are expected to improve management
efficiency in unstructured data processing, operation analysis, automatic report generation and
knowledge sharing between oilfields.

Figure 1—The Way to Improve Management Efficiency by Data Science Techniques

Figure 2—Changing of Data Transfer Way

In the conventional way of transferring data in drilling engineering, data is collected or reported at
the wellsite, some KPIs are calculated there by rig engineers or supervisors, and the analysis reports
are submitted to the project office and then to higher layers of the management architecture. In each
layer, the data is processed and summarized by specific engineers, which causes extra work and
information loss along the way, and the differing KPI analysis methods make it difficult to benchmark
KPIs between wells, fields or projects. Modern data science, especially artificial intelligence, provides
a new and automatic way to process the data in the different layers.
According to the requirements of different roles in oil companies, an integrated and intelligent drilling
KPIs analysis system based on modern data science techniques was developed and deployed in an
international oil company. The system is built on the daily drilling report (DDR) system: a fully
automated processing procedure is triggered as soon as a DDR is entered into the system, providing an
efficient way to improve management and operations in international oil companies.

System Architecture
An integrated and intelligent drilling KPIs analysis system based on modern data science techniques was
developed and deployed in an international oil company. The system architecture was designed in 4 sections:
an unstructured data processing unit, a database and management system, a KPIs extraction engine and a
data visualization dashboard (Figure 3).

Figure 3—System Architecture

Unstructured data processing unit


In an international oil company, drilling reports may be generated by different data sources, and some of
them may not be included in a database. The unstructured data processing unit converts data from
multiple sources, such as PDF files, Excel files and heterogeneous databases, into one integrated
database, ensuring data integrity across the company.

Database and management system


The integrated engineering database records all drilling-related data (such as well name, well depth,
trajectory, wellbore structure, bottom hole assembly, drilling parameters, rig activity records and cost
details) and provides information for different applications.

KPIs extraction engine


The KPI analysis engine is the core of the system. It comprehensively processes the data in the integrated
engineering database and calculates KPIs by a bottom-up mechanism. The outputs of the KPI analysis engine
are stored in a KPI database, which keeps KPI data isolated from the raw daily reports and provides
information for the front-end dashboard.

Data visualization dashboard


Users in the international oil company are classified into categories so that their requirements can be
better understood. Headquarters managers focus more on overall economic KPIs, such as schedule, cost
per meter and non-productive rate. Technicians at headquarters pay more attention to company-wide
technical problems for long-term optimization planning, such as the types of non-productive time (NPT)
and abnormal activities in drilling operations. In the field, more attention is paid to the performance
of specific contractors or wells. The information for each role is therefore shown in a different view.

Unstructured Data Processing


The DDR structures from 4 oilfields in 4 countries are shown in Figure 4. These reports are in PDF or
Excel format, that is, unstructured data. Even though online operation reporting systems are deployed in
the fields, tens of thousands of reports are still not incorporated into the integrated database at
headquarters and are at risk of being lost. The unstructured data processing unit is designed to process
this data automatically instead of through time-consuming manual input.

Figure 4—Unstructured Daily Drilling Reports

Even though the data structures of the unstructured reports and the integrated database differ, the
recorded information is generally the same, including well/wellbore information, drilling activities,
drill string information, trajectory, drilling parameters and daily cost. A data mapping dictionary is
designed to connect the same information held in different data fields, and the extracted data then
undergoes unit conversion. To ensure data quality during processing, drilling experts designed a
validation procedure based on effective intervals, and invalid data is reported in a log file. Finally,
the extracted and validated data is written into an XML file that is compatible with the integrated
database management system and uploaded into the system.
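
The mapping, unit conversion and interval validation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the field names, conversion factors and validity intervals (`FIELD_MAP`, `UNIT_FACTORS`, `VALID_RANGES`) are hypothetical, and the XML export step is omitted.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("ddr_import")

# Hypothetical mapping from source-report field names to database field names.
FIELD_MAP = {
    "Well": "well_name",
    "Depth (ft)": "measured_depth_m",
    "MW (ppg)": "mud_weight_sg",
}

# Hypothetical unit conversion factors (feet to meters, ppg to specific gravity).
UNIT_FACTORS = {"measured_depth_m": 0.3048, "mud_weight_sg": 0.119826}

# Hypothetical effective intervals of the kind a drilling expert might define.
VALID_RANGES = {"measured_depth_m": (0.0, 12000.0), "mud_weight_sg": (0.8, 2.8)}


def convert_report(raw):
    """Map, convert and validate one raw report.

    Returns (clean_record, rejected_fields); rejected values are also logged.
    """
    record, rejected = {}, []
    for src_name, value in raw.items():
        target = FIELD_MAP.get(src_name)
        if target is None:
            continue  # field is not covered by the mapping dictionary
        if target in UNIT_FACTORS:
            try:
                value = float(value) * UNIT_FACTORS[target]
            except (TypeError, ValueError):
                rejected.append(target)
                log.warning("non-numeric value for %s: %r", target, value)
                continue
            low, high = VALID_RANGES[target]
            if not low <= value <= high:
                rejected.append(target)
                log.warning("%s=%.2f outside [%s, %s]", target, value, low, high)
                continue
        record[target] = value
    return record, rejected
```

Keeping the mapping, the conversion factors and the intervals in separate tables means a new report layout only needs a new mapping, while the validation rules stay shared across fields.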

KPIs Analysis
KPI analysis provides a macro perspective for drilling engineers and managers to understand drilling
performance in terms of schedule, cost and safety. Given the variety of big data, it is difficult to
extract KPIs from DDRs automatically, for reasons such as typing errors, data loss and different
reporting customs between oilfields. Traditional KPI analysis methods based on rigid rules may lead to
false results because they assume perfect data. Therefore, according to the real data conditions in the
database, a 5-layer KPIs extraction method was specifically designed to calculate KPIs directly or
indirectly. As shown in Figure 6, the family of drilling KPIs includes ROP, NPT, cost per meter, daily
footage, monthly footage, etc.

Figure 5—Flowchart of Unstructured Data Processing

Figure 6—Family of Drilling KPIs

Figure 7 illustrates the flowchart for calculating daily ROP from the information in DDRs, as an example
of tolerance to imperfect data. In some reports the daily average ROP is not filled in, for various
reasons, and the KPI can instead be calculated from daily footage and drilling time. Cost per meter,
monthly footage and other KPIs are calculated in the same way.
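
The fallback logic for daily ROP can be sketched as below. The function name, argument names and plausibility checks are assumptions made for illustration; the exact decision steps of the flowchart in Figure 7 are not reproduced here.

```python
from typing import Optional


def daily_rop(reported_rop: Optional[float],
              daily_footage_m: Optional[float],
              drilling_hours: Optional[float]) -> Optional[float]:
    """Return the daily average ROP in m/h, or None if it cannot be derived."""
    # Prefer the value filled in the report when it is present and positive.
    if reported_rop is not None and reported_rop > 0:
        return reported_rop
    # Otherwise fall back to daily footage divided by drilling time.
    if (daily_footage_m is not None and drilling_hours is not None
            and daily_footage_m >= 0 and drilling_hours > 0):
        return daily_footage_m / drilling_hours
    # Neither route is available: leave the KPI empty rather than guess.
    return None
```

Returning `None` instead of zero keeps a missing value distinguishable from a day with no progress, so later aggregation can skip it.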
Based on the data features of 80 international projects, a layered KPI extraction and quality control
strategy was designed in which KPIs are calculated from the bottom up in a 5-layer structure (daily
report, well, project, subcompany, headquarters). In the bottom layer, daily footage, daily cost, net
drilling time and similar quantities are calculated directly or indirectly from DDR information. In the
upper layers, KPIs are calculated according to user requirements, such as average cost per meter or
total NPT of a field. The KPI analysis results of each layer serve as the input for the layer above.
When data is passed upward, it is validated against pre-set rules, and invalid data cannot be used for
analysis in the upper layers.
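
A minimal sketch of the bottom-up roll-up with validation between layers is given below. Only three of the five layers (daily report, well, project) are shown, and the record fields and the validation rule in `is_valid` are hypothetical stand-ins for the pre-set rules mentioned above.

```python
from collections import defaultdict


def is_valid(day):
    """Hypothetical pre-set rule: footage and cost must be present and non-negative."""
    return all(day.get(k) is not None and day[k] >= 0 for k in ("footage_m", "cost_usd"))


def cost_per_meter_by_project(daily_reports):
    """Aggregate daily reports to wells, then wells to projects, and derive cost per meter."""
    wells = defaultdict(lambda: {"footage_m": 0.0, "cost_usd": 0.0, "project": None})
    for day in daily_reports:
        if not is_valid(day):
            continue  # invalid data is not passed on to the upper layer
        well = wells[day["well"]]
        well["footage_m"] += day["footage_m"]
        well["cost_usd"] += day["cost_usd"]
        well["project"] = day["project"]

    projects = defaultdict(lambda: {"footage_m": 0.0, "cost_usd": 0.0})
    for well in wells.values():
        project = projects[well["project"]]
        project["footage_m"] += well["footage_m"]
        project["cost_usd"] += well["cost_usd"]

    # The ratio is computed from the summed totals of the layer below.
    return {
        name: (p["cost_usd"] / p["footage_m"] if p["footage_m"] > 0 else None)
        for name, p in projects.items()
    }
```

Computing the ratio from summed totals, rather than averaging per-well ratios, weights each well by its footage; which of the two the authors use is not stated in the text.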

Figure 7—Calculating flowchart of daily ROP by DDRs

Similarity of wells
In the integrated engineering database, millions of daily reports from tens of thousands of wells are
recorded, which hold huge commercial value. The recorded drilling activities, well costs, drill strings
and drilling parameters can serve as references for new wells or projects. A similar-well search method
was developed according to the features of daily reports. It is based on similarity measurement by
Euclidean distance with feature weighting, and the features include well type, measured depth, vertical
depth, maximum deviation, maximum mud weight, drilling time, wellbore structure, etc.
In the database, the features of all wells are first normalized to a uniform scale, and the Euclidean
distance between two wells is then calculated as a weighted sum over the features. The smaller the
distance between two wells, the higher their similarity.
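The information of 2 wells can be represented by 2 n-dimensional vectors, $[x_{11}, x_{12}, \ldots, x_{1n}]$
and $[x_{21}, x_{22}, \ldots, x_{2n}]$, so the weighted Euclidean distance between the 2 wells can be
calculated by

$$d = \sqrt{\sum_{i=1}^{n} w_i \left(x_{1i} - x_{2i}\right)^2} \qquad (1)$$

where $w_i$ is the weight assigned to the i-th feature.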
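
A short sketch of a similar-well search built on a weighted Euclidean distance follows. It assumes min-max normalization and weights applied inside the square root; the paper does not state which normalization or weight placement is used. Only numeric features are handled, so a categorical feature such as well type would first need a numeric encoding. All names and sample values are hypothetical.

```python
import math


def min_max_normalize(wells, features):
    """Scale each numeric feature to [0, 1] across all wells."""
    scaled = {name: {} for name in wells}
    for feature in features:
        values = [well[feature] for well in wells.values()]
        low, high = min(values), max(values)
        span = high - low
        for name, well in wells.items():
            scaled[name][feature] = (well[feature] - low) / span if span else 0.0
    return scaled


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two normalized feature dictionaries."""
    return math.sqrt(sum(w * (a[f] - b[f]) ** 2 for f, w in weights.items()))


def most_similar(target, wells, weights):
    """Return the name of the well closest to `target` (smaller distance = more similar)."""
    scaled = min_max_normalize(wells, list(weights))
    candidates = [name for name in wells if name != target]
    return min(candidates,
               key=lambda name: weighted_distance(scaled[target], scaled[name], weights))


# Hypothetical example: measured depth in meters, maximum mud weight in specific gravity.
example_wells = {
    "A": {"measured_depth": 3000.0, "max_mud_weight": 1.20},
    "B": {"measured_depth": 3100.0, "max_mud_weight": 1.25},
    "C": {"measured_depth": 6000.0, "max_mud_weight": 2.00},
}
example_weights = {"measured_depth": 0.6, "max_mud_weight": 0.4}
```

Normalizing before measuring distance prevents a feature with large raw values, such as depth in meters, from dominating one with small raw values, such as mud weight.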

Anomaly Detection by Natural Language Processing and Deep Learning


The drilling activity record is one of the most important parts of a DDR and is critical information for
operation monitoring, post-job evaluation, etc. In the global engineering database, the activity coding
systems used for time analysis are almost always different between wells, so it is difficult to analyze
all drilling activities with one unified method based on pre-set rules, and it is almost impossible to
classify all activities manually because of the huge data volume. To improve efficiency, an automatic
drilling activity classification method based on natural language processing and deep learning is
proposed. As shown in Table 1, the machine learning model is designed to identify abnormal operation
activities for the reference of drilling engineers.

Table 1—A typical section of daily drilling activity records

| Time | Operation Activities | Abnormal or not / Sample Class |
|---|---|---|
| 0:00-2:00 | Rig working on TDS to change out saver sub with the new one. (Repair rate) | Yes/Positive |
| 2:00-4:00 | RIH 2 stand to 489m, M/U TDS, circulate to flush PDC bit and stabilizer W/650GPM, 1050psi, ream & back ream W/35RPM meantime. | No/Negative |
| 4:00-8:00 | Rig service TDS & change out wash pipe with new as per company's instruction before drilling in pay zone. | No/Negative |
| 8:00-10:00 | Very tough slide drilling job due to one rig generator broken down and rig generator system cannot provide enough power to increase pump strokes for adjusting PDM working parameters. (1 hour rig zero rate) | Yes/Positive |

Drilling activity records are characterized by huge data volume, noise, imbalanced classes and
irregular text. Based on 460,000 operation records of 1,700 wells from 80 projects around the world,
Word2vec (Mikolov et al., 2013) is used to represent the texts as numerical vectors, and over-sampling
and optimization of the deep learning cost function were used to overcome the data imbalance. A 7-layer
convolutional neural network (Krizhevsky et al., 2012) was trained on the data of the 1,700 wells to
detect anomalies in daily reports with an accuracy of 85%.
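
The two imbalance remedies named above can be illustrated with a small sketch. The paper does not specify the over-sampling scheme or the modified cost function, so this example assumes random over-sampling of the minority class and inverse-frequency class weights (as would be passed to a weighted cross-entropy loss). The Word2vec embedding and the CNN itself are not shown.

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def class_weights(labels):
    """Inverse-frequency weights: rarer classes receive proportionally larger weights."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * count) for cls, count in counts.items()}
```

Over-sampling should be applied to the training split only, so that duplicated records do not leak into validation data. With strongly imbalanced classes, overall accuracy alone can look good even when few abnormal records are found, so per-class precision and recall are worth checking as well.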

The final model is deployed in the data processing engine of the KPIs analysis system and supports a
drilling anomaly detection interface, saving engineers much of the time otherwise spent reading daily
reports.

Web based Visualization Platform


A global drilling KPIs analysis dashboard based on a Browser/Server architecture was developed, and
different views of web pages are provided according to the requirements of different roles in the
international oil company. General KPI information at the headquarters layer, such as the number of
active rigs, NPT rate and cost per meter, is shown on a big-screen page. The headquarters view and the
project view are used to show all KPIs and analysis results directly, including KPI summary, KPI ranking,
statistics and history trends, and all historical information can be analyzed dynamically by data query
and time analysis. User authority management by category ensures data security. KPI analysis on multiple
time scales (24 hours, 1 week, 1 month, 3 months, 1 year, 3 years, etc.) is supported by the system. A
clear, efficient and stable visualization platform provides comprehensive data presentation and a good
interactive experience. The function structure of the dashboard is shown in Figure 10.
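
As an illustration of KPI analysis on multiple time scales, the sketch below computes an NPT rate over trailing windows. It assumes the NPT rate is NPT hours divided by total recorded hours, that a month is 30 days and a year is 365 days, and that each daily record carries `date`, `hours` and `npt_hours` fields; none of these details are given in the paper.

```python
from datetime import date, timedelta

# Window lengths in days; month and year lengths are approximations.
WINDOWS = {"24 hours": 1, "1 week": 7, "1 month": 30, "3 months": 90, "1 year": 365}


def npt_rate(daily_records, end_date, window_days):
    """NPT hours divided by total recorded hours in the window ending on end_date."""
    start = end_date - timedelta(days=window_days)
    total = npt = 0.0
    for rec in daily_records:
        if start < rec["date"] <= end_date:
            total += rec["hours"]
            npt += rec["npt_hours"]
    return npt / total if total > 0 else None


def npt_rate_by_window(daily_records, end_date):
    """Evaluate the NPT rate for every configured window."""
    return {name: npt_rate(daily_records, end_date, days) for name, days in WINDOWS.items()}
```

A call such as `npt_rate_by_window(records, date(2020, 1, 10))` returns one value per window, which is the shape a dashboard widget with a time-scale selector would need.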

Figure 8—Multi-levels KPI extraction

Figure 9—CNN structure for anomaly detection in drilling & completion reports

Figure 10—The functions of KPIs dashboard

Applications
The system has been deployed in an international oil company for 2 years, and information on tens of
thousands of wells from more than 20 countries around the world is managed by the system. Significant
improvements in data processing and operation management were achieved during the application.

Unstructured data processing


More than 20 thousand daily reports were converted to structured data, avoiding the loss of data assets.
Compared with manual processing, the tools could improve data processing efficiency by 50 times, and
data integrity in the database was increased by 3 times. Millions of dollars were saved because of the
reduction in manpower.

Drilling operations monitoring


Platform users can supervise operating conditions in all international projects, conduct KPI analysis
in different time and spatial dimensions, and monitor schedule, cost, time efficiency, etc. The
automatic generation and output of weekly, monthly and yearly reports saves engineers a great deal of
time. Engineers at headquarters can analyze headquarters-level KPIs and benchmark between international
projects, locate abnormal factors affecting headquarters-level KPIs, and design remote technical support
strategies and scientific research directions. Engineers in the projects can understand the overall
status of target fields, analyze operation progress, benchmark between wells, make short-term and
long-term optimization plans, and accelerate the building of the learning curve (Figures 12 to 15).

Figure 11—Time consuming by manually converting and software tools

Figure 12—Deployment of the system in a RTOC

Figure 13—Anomaly detection support by text classification model


Figure 14—KPIs changing trend analysis

Figure 15—KPIs benchmarking between oilfields

Utilization of big historical data


Through similar-well searching, data query, benchmarking and historical trend analysis, users can make
full use of historical data. The similar-well search function allows users to find the most similar well
among several thousand wells for benchmarking and as a reference for concept design. Users can conduct
fast data queries by keyword, and the results can be used for feasibility evaluation of new techniques,
sharing of operation experience, etc. By deploying the system, the historical information of tens of
thousands of wells has been activated.

Data quality improvement


As shown in Figure 16, the common data flow procedure is "acquisition-management-utilization". Data
quality is critical for analysis results, and some of the information in DDRs is entered manually,
which means it is almost impossible to ensure perfect data quality by rule-based methods or reviewing
procedures. During the application of the system, some data quality problems were detected by the data
processing procedure, and the quality of the data could be improved through this feedback. Utilization of
data therefore helps to improve data quality.

Figure 16—Data flow and quality improvement

Conclusions
1. A global drilling KPIs analysis system was developed using advanced artificial intelligence, big
data processing and visualization techniques, and the system was successfully deployed in an
international oil company to manage tens of thousands of wells from more than 20 countries around
the world.
2. The unstructured data processing technique incorporated multi-source unstructured reports into the
integrated database, which greatly improved data processing efficiency and avoided the loss of data assets.
3. A multi-layer KPIs extraction and quality control strategy was developed, and 29 KPI extraction
methods were designed according to the real data conditions of more than 80 international projects. A
7-layer convolutional neural network was trained on data from 1,700 wells to detect anomalies in daily
reports with an accuracy of 85%. A new distance-based model was developed to evaluate the similarity
of two wells.
4. By deploying the system, tens of thousands of daily reports from heterogeneous data sources were
automatically incorporated into the main database, and millions of dollars of manual processing
cost were saved. The data analysis standard was unified by the system, making benchmarking and
knowledge sharing between oilfields possible. Information from thousands of wells was
activated by the system, and the outputs support management optimization, technical
improvement, drilling performance benchmarking and feasibility evaluation of new investments. The
time users need to analyze the KPIs of a field was shortened from several weeks to almost zero, and
data quality was validated and improved in real time through the feedback of KPIs analysis.

Acknowledgement
The authors would like to thank the managers and drilling engineers who provided helpful suggestions
during the development of this work. The research is supported by the National Science and Technology
Major Project of China (Grant No. 2016ZX05033-004) and the National Natural Science Foundation of China
(Grant No. U19B6003-05-01).

References
Dhar, V. 2013. Data Science and Prediction. Communications of the ACM 56 (12): 64–73.
https://doi.org/10.1145/2500499
Mikolov, T., Sutskever, I., Chen, K., et al. 2013. Distributed Representations of Words and Phrases and
their Compositionality. Advances in Neural Information Processing Systems 26: 3111–3119.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. 2012. ImageNet Classification with Deep Convolutional
Neural Networks. NIPS: 1097–1105.
