Hongbao Zhang, Baoping Lu, and Shunhui Yang, Sinopec Research Institute of Petroleum Engineering; Ke Ke,
Sinopec Tech Middle East LLC; Jian Song, Sinopec International Exploration and Production Corporation; Xutian
Hou, Sinopec Research Institute of Petroleum Engineering; Zhifa Wang, Sinopec Tech Middle East LLC; Xin Jin,
Sinopec Research Institute of Petroleum Engineering
This paper was prepared for presentation at the Abu Dhabi International Petroleum Exhibition & Conference to be held in Abu Dhabi, UAE, 9 – 12 November 2020.
Due to COVID-19 the physical event was changed to a virtual event. The official proceedings were published online on 9 November 2020.
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.
Abstract
In drilling engineering, Key Performance Indexes (KPIs) such as drilling time, monthly footage, rate of
penetration and non-productive time are important aids for management optimization and technical
improvement.
Almost all oil companies have deployed database systems to manage drilling information, which may hold
millions of daily reports. In international oil companies, however, several shortcomings of conventional
data management systems have gradually become apparent: data analysis reports lose reliability as they pass
between management levels because of manual intervention; large amounts of unstructured data are at risk of
being lost; KPI analysis is poorly automated; data processing procedures have low tolerance for poor-quality
data; data standards vary between fields; and KPI visualization performs poorly.
To solve these problems, an integrated and intelligent drilling KPI analysis system based on modern
data science techniques was developed and deployed in an international oil company.
The system architecture was designed in four sections: an unstructured data processing unit, a database and
management system, a KPI extraction engine and a data visualization dashboard.
The unstructured data processing unit was designed to convert reports from heterogeneous data sources,
including Excel, PDF and heterogeneous databases, into the main database. A 5-layer data quality control
procedure was designed to ensure the reliability of data for KPI analysis. Dedicated extraction algorithms
were designed to cope with practical conditions such as missing reports, missing data and data errors. A
distance-based model was developed to evaluate the similarity of two wells from their well properties,
providing an intelligent way to benchmark and share knowledge between wells. A drilling anomaly detection
model based on deep learning and natural language processing was developed to solve the problem of
inconsistent coding systems in different fields. A data visualization dashboard with high efficiency,
reliability and flexibility was designed to provide information for different users.
SPE-203378-MS
By deploying the system, tens of thousands of daily reports from heterogeneous data sources were
automatically incorporated into the main database, saving millions of dollars in manual processing costs.
The system unified the data analysis standard, making benchmarking and knowledge sharing between oilfields
possible. Information from thousands of wells was activated by the system, and the outputs support
management optimization, technical improvement, drilling performance benchmarking and feasibility
evaluation of new investments. The time users need to analyze the KPIs of a field was shortened from
several weeks to almost zero, and data quality was validated and improved in real time through feedback
from the KPI analysis.
A 7-layer convolutional neural network was trained on data from 1,700 wells to detect anomalies in daily
reports with an accuracy of 85%.
Introduction
In drilling engineering, KPIs such as drilling time, monthly footage, rate of penetration (ROP) and
non-productive time (NPT) are important aids for management optimization and technical improvement. Almost
all oil companies have deployed database systems to manage drilling information, which may hold millions
of daily reports. In international oil companies, however, several shortcomings of conventional data
management systems have gradually become apparent.
1. The reliability of data analysis reports decreases as they are passed between levels of the management
architecture (such as wellsite, project, subcompany and headquarters) because of manual intervention,
and the varied data processing principles make benchmarking difficult.
2. In international oil companies, data management procedures and digitalization levels may differ
between oilfields. Large amounts of data are in unstructured formats such as .pdf, .xls and .doc, which
isolates the information from the integrated engineering database and creates a risk of losing data assets.
3. The drilling process is full of uncertainty in both geology and engineering, and drilling reports are
always imperfect because they are recorded manually. Regular KPI analysis procedures with rigid rules
therefore often fail to extract reasonable results, because they have low tolerance for poor data quality
and uncertainty. Manual processing and review are needed to ensure analysis quality, which is time
consuming.
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems
to extract knowledge and insights from structured and unstructured data (Dhar, 2013). Since 2006, the
successful application of machine learning, especially deep learning, in almost all industries has attracted
great attention to data science techniques. The importance of big data has been accepted in both industry
and academia. How to use modern data science techniques to improve operations and management in the
petroleum industry is becoming a research hotspot.
As shown in Figure 1, data science techniques are expected to improve management efficiency in oil
companies through unstructured data processing, operation analysis, automatic report generation and
knowledge sharing between oilfields.
In the conventional data transfer workflow in drilling engineering, data is collected or reported at the
wellsite, where rig engineers or supervisors calculate some KPIs, and the analysis reports are submitted
to the project office and then to higher layers of the management architecture. In each layer, the data
is processed and summarized by specific engineers, which causes extra work and information loss, and the
differing KPI analysis methods make it difficult to benchmark KPIs between wells, fields or projects.
Modern data science, especially artificial intelligence, provides a new, automatic way to process the
data in the different layers.
According to the requirements of different roles in oil companies, an integrated and intelligent drilling
KPI analysis system based on modern data science techniques was developed and deployed in an
international oil company. The system is built on the daily drilling report (DDR) system: a fully
automated processing procedure is triggered as soon as a DDR is entered into the system, providing an
efficient way to improve management and operations in international oil companies.
System Architecture
An integrated and intelligent drilling KPI analysis system based on modern data science techniques was
developed and deployed in an international oil company. The system architecture was designed in four
sections: an unstructured data processing unit, a database and management system, a KPI extraction engine
and a data visualization dashboard (Figure 3).
Even though the data structures of the unstructured reports and the integrated database differ, the
recorded information is generally the same, including well/wellbore information, drilling activities, drill
string information, trajectory, drilling parameters, daily cost, etc. A data mapping dictionary is designed
to connect the same information held in different data fields, and the data is then extracted and converted
to common units. To ensure data quality during processing, drilling experts designed a validation procedure
based on effective intervals for each field, and invalid data is reported in a log file. Finally, the
extracted and validated data is written into an XML file that is compatible with the integrated database
management system and uploaded into the system.
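The mapping, unit conversion, validation and XML export steps described above can be sketched as follows.
This is only an illustrative sketch: the field names, conversion factors, effective intervals and XML
layout are hypothetical and are not taken from the deployed system.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source-report field names to
# (database field, factor converting the source unit to the database unit).
FIELD_MAP = {
    "Depth (ft)": ("measured_depth_m", 0.3048),
    "MW (ppg)": ("mud_weight_sg", 0.119826),
    "Daily Cost (USD)": ("daily_cost_usd", 1.0),
}

# Hypothetical effective intervals; in the paper these are set by drilling experts.
VALID_RANGE = {
    "measured_depth_m": (0.0, 12000.0),
    "mud_weight_sg": (0.8, 2.6),
    "daily_cost_usd": (0.0, 5.0e6),
}


def convert_report(raw):
    """Map, convert and validate one raw report; return (record, log)."""
    record, log = {}, []
    for source_name, value in raw.items():
        if source_name not in FIELD_MAP:
            log.append(f"unmapped field: {source_name}")
            continue
        target, factor = FIELD_MAP[source_name]
        try:
            converted = float(value) * factor
        except (TypeError, ValueError):
            log.append(f"non-numeric value for {source_name}: {value!r}")
            continue
        low, high = VALID_RANGE[target]
        if low <= converted <= high:
            record[target] = converted
        else:
            # Invalid values are logged instead of being written to the database.
            log.append(f"{target}={converted:.3f} outside [{low}, {high}]")
    return record, log


def to_xml(record):
    """Serialize a validated record as a small XML document (assumed layout)."""
    root = ET.Element("daily_report")
    for name, value in sorted(record.items()):
        ET.SubElement(root, name).text = f"{value:.3f}"
    return ET.tostring(root, encoding="unicode")
```

Calling `convert_report` on a raw report returns both the validated record and the log entries, so rejected
values remain traceable rather than being silently dropped.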
KPIs Analysis
KPI analysis gives drilling engineers and managers a macro perspective on drilling performance in terms of
schedule, cost and safety. Because of the variety of big data, it is difficult to extract KPIs from DDRs
automatically, for reasons such as typing errors, data loss and different reporting customs between
oilfields. Traditional KPI analysis methods based on rigid rules may lead to false results because they
assume perfect data. Therefore, according to the real data conditions in the database, a 5-layer KPI
extraction method was specifically designed to calculate KPIs directly or indirectly.
As shown in Figure 6, the drilling KPI family includes ROP, NPT, cost per meter, daily footage, monthly
footage, etc.
Figure 7 illustrates the flowchart for calculating daily ROP from the information in DDRs, as an example of
tolerance to imperfect data. In some reports the daily average ROP is not filled in, for various reasons,
and the KPI can instead be calculated from daily footage and drilling time. Cost per meter, monthly
footage, etc. are calculated in the same way.
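The fallback idea described above can be sketched in a few lines. The exact branches of Figure 7 are not
reproduced here; the field names (`avg_rop`, `footage_m`, `drilling_hours`) and the plausibility checks are
assumptions made for illustration.

```python
def daily_rop(report):
    """Return (rop_m_per_hr, source) for one daily report, or (None, reason)."""
    # Direct route: use the reported daily average ROP when it is present and positive.
    rop = report.get("avg_rop")
    if isinstance(rop, (int, float)) and rop > 0:
        return float(rop), "reported"

    # Indirect route: derive ROP from daily footage and net drilling time.
    footage = report.get("footage_m")
    hours = report.get("drilling_hours")
    if (isinstance(footage, (int, float)) and isinstance(hours, (int, float))
            and footage >= 0 and 0 < hours <= 24):
        return footage / hours, "derived"

    # Neither route is possible; the caller can flag the report for review.
    return None, "insufficient data"
```

Returning the source alongside the value lets later quality checks distinguish reported values from derived
ones.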
Based on the data features of 80 international projects, a layered KPI extraction and quality control
strategy was designed, in which KPIs are calculated from the bottom up in a 5-layer structure (daily report,
well, project, subcompany, headquarters). In the bottom layer, daily footage, daily cost, net drilling time,
etc. are calculated directly or indirectly from DDR information. In the upper layers, KPIs such as average
cost per meter or total NPT of a field are calculated according to user requirements. The KPI results of
each layer serve as the input for the layer above. When data is passed upward it is validated against
pre-set rules, and invalid data cannot be used for analysis in the upper layers.
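A minimal sketch of the bottom-up roll-up with validation between layers is given below. Only three of the
five layers are shown, and the record fields and validation rule are hypothetical stand-ins for the
pre-set rules mentioned above.

```python
FIELDS = ("footage_m", "cost_usd", "drilling_hours")


def is_valid(rec, max_hours):
    """Hypothetical pre-set rule: non-negative totals and a bounded drilling time."""
    return (rec["footage_m"] >= 0 and rec["cost_usd"] >= 0
            and 0 <= rec["drilling_hours"] <= max_hours)


def roll_up(records, key, parent_keys=(), max_hours=float("inf")):
    """Aggregate valid records by `key`, keeping `parent_keys` for the next layer."""
    groups = {}
    for rec in records:
        if not is_valid(rec, max_hours):
            continue  # invalid data is not passed to the upper layer
        if rec[key] not in groups:
            group = {key: rec[key]}
            group.update({p: rec[p] for p in parent_keys})
            group.update({f: 0.0 for f in FIELDS})
            groups[rec[key]] = group
        for f in FIELDS:
            groups[rec[key]][f] += rec[f]
    return list(groups.values())


def cost_per_meter(total):
    """Upper-layer KPI computed from aggregated totals."""
    return total["cost_usd"] / total["footage_m"] if total["footage_m"] > 0 else None


daily = [
    {"well": "W1", "project": "P1", "footage_m": 100.0, "cost_usd": 50000.0, "drilling_hours": 10.0},
    # More than 24 drilling hours in one day: rejected by the daily-layer rule.
    {"well": "W1", "project": "P1", "footage_m": 50.0, "cost_usd": 40000.0, "drilling_hours": 30.0},
    {"well": "W2", "project": "P1", "footage_m": 200.0, "cost_usd": 70000.0, "drilling_hours": 12.0},
]
wells = roll_up(daily, "well", parent_keys=("project",), max_hours=24)
projects = roll_up(wells, "project")
```

Each layer consumes only the output of the layer below, so a rejected daily report never contributes to
well-level or project-level KPIs.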
Similarity of wells
In the integrated engineering database, millions of daily reports from tens of thousands of wells are
recorded, which hold great commercial value. The recorded drilling activities, well costs, drill strings
and drilling parameters can serve as references for new wells or projects. A similar-well search method
was developed according to the features of daily reports. It is based on similarity measurement using
Euclidean distance and feature weighting, and the features include well type, measured depth, vertical
depth, maximum deviation, maximum mud weight, drilling time, wellbore structure, etc.
In the database, the features of all wells are normalized to a uniform scale, and the Euclidean distance
between two wells is calculated using a weighted sum. The smaller the distance between two wells, the
higher their similarity.
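The information of two wells can be represented by two n-dimensional vectors $[x_{11}, x_{12}, \ldots, x_{1n}]$
and $[x_{21}, x_{22}, \ldots, x_{2n}]$, so the Euclidean distance $d_{12}$ between the two wells can be calculated by

$$d_{12} = \sqrt{\sum_{k=1}^{n} \left(x_{1k} - x_{2k}\right)^{2}} \tag{1}$$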
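The distance-based similarity search described above can be sketched as follows. The source does not state
how the features are normalized or how the weights enter the distance, so this sketch assumes min-max
normalization and weights that multiply each squared difference; the function names and the numeric-only
features are also illustrative assumptions (categorical features such as well type would first need a
numeric encoding).

```python
import math


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns become 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def weighted_distance(a, b, weights):
    """Euclidean distance with an assumed per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def most_similar(rows, target_index, weights):
    """Return indices of the other wells, sorted by increasing distance to the target."""
    scaled = min_max_normalize(rows)
    target = scaled[target_index]
    others = [i for i in range(len(rows)) if i != target_index]
    return sorted(others, key=lambda i: weighted_distance(scaled[i], target, weights))
```

With all weights equal to 1 the measure reduces to the plain Euclidean distance on the normalized features;
larger weights make the corresponding features dominate the ranking.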
0:00-2:00 | Rig working on TDS to change out saver sub with the new one. (Repair rate) | Yes/Positive
2:00-4:00 | RIH 2 stand to 489m, M/U TDS, circulate to flush PDC bit and stabilizer W/650GPM, 1050psi, ream & back ream W/35RPM meantime. | No/Negative
4:00-8:00 | Rig service TDS & change out wash pipe with new as per company's instruction before drilling in pay zone. | No/Negative
8:00-10:00 | Very tough slide drilling job due to one rig generator broken down and rig generator system cannot provide enough power to increase pump strokes for adjusting PDM working parameters. (1 hour rig zero rate) | Yes/Positive
Drilling activity records are characterized by huge data volume, noise, class imbalance and irregular
formatting. Based on 460,000 operation records of 1,700 wells from 80 projects around the world,
Word2vec (Mikolov et al., 2013) was used to represent the text numerically, and over-sampling and
optimization of the deep learning cost function were used to overcome the data imbalance. A 7-layer
convolutional neural network (Krizhevsky et al., 2012) was trained on the data of the 1,700 wells to detect
anomalies in daily reports with an accuracy of 85%.
Figure 9—CNN structure for anomaly detection in drilling & completion reports
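The over-sampling step mentioned above can be illustrated with simple random over-sampling. The source does
not say which over-sampling variant was used, so this is a generic sketch rather than the authors' method;
the function name and seed parameter are illustrative.

```python
import random
from collections import Counter


def oversample(records, labels, seed=0):
    """Randomly duplicate minority-class records until all classes are equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_records, out_labels = list(records), list(labels)
    for label, count in counts.items():
        # Draw extra copies (with replacement) from the records of this class.
        pool = [r for r, l in zip(records, labels) if l == label]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_records.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_records, out_labels
```

Over-sampling of this kind is normally applied to the training split only, so that duplicated records do
not leak into the data used to measure accuracy.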
Applications
The system has been deployed in an international oil company for 2 years, and it manages information on
tens of thousands of wells from more than 20 countries around the world. Significant improvements in data
processing and operation management have been achieved during the application.
Conclusions
1. A global drilling KPI analysis system was developed using advanced artificial intelligence, big
data processing and visualization techniques, and the system was successfully deployed in an
international oil company to manage tens of thousands of wells from more than 20 countries around
the world.
2. The unstructured data processing technique incorporated multi-source unstructured reports into the
integrated database, which greatly improved data processing efficiency and avoided the loss of data assets.
3. A multi-layer KPI extraction and quality control strategy was developed, and 29 KPI extraction
methods were designed according to the real data conditions of more than 80 international projects. A
7-layer convolutional neural network was trained on data from 1,700 wells to detect anomalies in daily
reports with an accuracy of 85%. A new distance-based model was developed to evaluate the similarity
of two wells.
4. By deploying the system, tens of thousands of daily reports from heterogeneous data sources were
automatically incorporated into the main database, saving millions of dollars in manual processing
costs. The system unified the data analysis standard, making benchmarking and knowledge sharing
between oilfields possible. Information from thousands of wells was activated by the system, and the
outputs support management optimization, technical improvement, drilling performance benchmarking
and feasibility evaluation of new investments. The time users need to analyze the KPIs of a field was
shortened from several weeks to almost zero, and data quality was validated and improved in real time
through feedback from the KPI analysis.
Acknowledgement
The authors would like to thank the managers and drilling engineers who provided helpful suggestions
during the development of this work. The research is supported by the National Science and Technology Major
Project of China (Grant No. 2016ZX05033-004) and the National Natural Science Foundation of China (Grant
No. U19B6003-05-01).
References
Dhar, V. 2013. Data science and prediction. Commun. ACM 56 (12): 64–73. https://doi.org/10.1145/2500499
Mikolov, T., Sutskever, I., Chen, K., et al. 2013. Distributed Representations of Words and Phrases and their
Compositionality. Advances in Neural Information Processing Systems 26: 3111–3119.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. 2012. ImageNet classification with deep convolutional neural
networks. NIPS: 1097–1105.