
Engineering Procedure

SAEP-389 3 December 2014


Process Data Reliability Management
Document Responsibility: Process Optimization Solutions Standards Committee

Saudi Aramco DeskTop Standards


Table of Contents

1 Scope.............................................................. 2
2 Conflicts and Deviations................................. 2
3 References..................................................... 2
4 Definitions and Acronyms............................... 3
5 Instructions..................................................... 6
6 Data Quality.................................................... 6
7 Data Availability............................................ 11
8 Data Consistency.......................................... 17
9 Tag Naming Convention............................... 19
10 Data Implementation Methodology............... 20
11 Archive Performance Checks....................... 21
12 System Date and Time Synchronization....... 22
13 Special Cases............................................... 23
14 Data Security................................................ 25
15 System Performance.................................... 27

Previous Issue: 27 November 2012 Next Planned Update: 27 November 2015


Revised paragraphs are indicated in the right margin Page 1 of 27
Primary contact: Kokolu, Prabhakar Rao (kokolupr) on +966-13-8801589

Copyright©Saudi Aramco 2014. All rights reserved.


Document Responsibility: Process Optimization Solutions Standards Committee SAEP-389
Issue Date: 3 December 2014
Next Planned Update: 27 November 2015 Process Data Reliability Management

1 Scope

1.1 This procedure defines the minimum mandatory instructions needed for
configuration and development of functional design, architecture and
functionalities necessary for process data reliability management.

1.2 This procedure defines methods to collect accurate data, prevent accuracy
decay, and support accurate access, transformation, and interpretation of data
for users.

1.3 This procedure defines methods to health-check existing process data historians
and information to discover and resolve prevailing anomalies and abnormalities.

1.4 This procedure applies to all Saudi Aramco existing data historian systems.
It must be a part of every new project that creates, migrates, replicates, or
integrates data.

1.5 The objective of Process Data Reliability Management (PDRM) is to
implement standard data collection and governance techniques for all Saudi
Aramco facilities, reduce inefficiencies, and increase the level of confidence
in process information. This will help ensure proper data capture, increase
awareness of data quality issues, and facilitate data stewardship activities.

1.6 Additional requirements might be included in the Company's FSD, in which case
both this document and the FSD requirements shall be met.

2 Conflicts and Deviations

2.1 Any conflicts between this procedure and other applicable Saudi Aramco
Engineering Procedures (SAEPs), Materials System Specifications (SAMSSs),
Engineering Standards (SAESs), Standard Drawings (SASDs), or industry
standards, codes, and forms shall be resolved in writing by the Company or
Buyer Representative through the Manager, Process & Control Systems
Department of Saudi Aramco, Dhahran.

2.2 Direct all requests to deviate from this procedure in writing to the Company or
Buyer Representative, who shall follow internal company procedure SAEP-302
and forward such requests to the Manager, Process & Control Systems
Department of Saudi Aramco, Dhahran.

3 References

Material or equipment supplied to this specification shall comply with the references
listed below, unless otherwise noted.


 Saudi Aramco References

Saudi Aramco Engineering Procedures


SAEP-99 Process Automation Networks and Systems Security
SAEP-302 Instructions for Obtaining a Waiver of a Mandatory
Saudi Aramco Engineering Requirement

Saudi Aramco Engineering Standards


SAES-J-004 Instrument Symbols and Identification
SAES-Z-010 Process Automation Networks Connectivity

Saudi Aramco Best Practice


SABP-Z-001 Plant Information System Data Compression

4 Definitions and Acronyms

Archive: An archive is a special database format developed to store and
retrieve sets of time-sequenced data. It is not a flat file or a relational
database; it is a repository for automatically collected data. This data, also
called temporal or time-series data, consists of two components: a recorded
value of a user-determined type and a timestamp. The input/output (I/O) point
identifies and organizes data into data stream series. This format makes it
possible to archive, retrieve, and organize data with minimal demand on system
resources.

Authentication: The authentication model provides single sign-on for historian
users. It determines who the user is and how to confirm that the user really is
who they claim to be. Authentication requires less maintenance for historian
administrators.

Authorization: Specifies what a user is allowed to do. With this security model,
each Server object can have read and/or write permissions defined for any number
of PI identities.

Buffering: Data sent from the interface to the historian is redirected to the buffering
process, which stores and forwards events to the home node. Buffered data is
maintained in First-In, First-Out (FIFO) order.

Compressing: Turns compression on or off.

CompDev: Specifies the compression deviation in the point's engineering units.
As a rule of thumb, set CompDev to the accuracy of the instrument. Set it a
little “loose” to err on the side of collecting, rather than losing, data.
After collecting data for a while, go back and check the data for your most
important tags and adjust CompDev if necessary.


CompDevPercent: Specifies the compression deviation as a percent of the point's
Span attribute.

CompMin: Sets a minimum limit on the time between events in the Archive. Set the
CompMin attribute to zero for any point coming from an interface that does exception
reporting. You typically use CompMin to prevent an extremely noisy point from using
a large amount of archive space.

CompMax: Sets a maximum limit on the time between events in the Archive. If the
time since the last recorded event is greater than or equal to CompMax, then PI
automatically stores the next value in the Archive, regardless of the CompDev setting.

Compression Deviation: If the absolute difference between the current snapshot and
the last archive value is greater than CompDev then the snapshot is sent to the archive.

Dead band: How much a value may differ from the previous value before it is
considered a significant value. When the dead band is exceeded, an exception is
generated.

DRA: Data Reliability Application, an in-house application developed by P&CSD
that detects bad data and bad tag configurations in any PI System and provides
recommendations to fix them.

Exception Deviation: Specifies, in engineering units, how much a value may
differ from the previous value before it is considered a significant value.
This is a dead band which, when exceeded, causes an exception.

ExcDev: This attribute is used to specify how much a point value must change before
the Interface reports the new value to PI. Use ExcDev to specify the exception
deviation in the point's engineering units. As a general rule, the exception deviation
should be set smaller than the accuracy of the instrument system.

ExcDevPercent: ExcDevPercent can be used instead of ExcDev. It sets the
exception deviation as a percentage of the Span attribute. If the Span
attribute is not set correctly, however, the exception reporting will be wrong
too. A typical exception deviation value is about 1% of Span.

ExcMin: Use ExcMin to limit how often (in seconds) the Interface reports a new event
to PI. For example, if you set ExcMin to five, then the Interface discards any values
collected within five seconds of the last reported value. ExcMin is typically set to zero.

ExcMax: Set ExcMax to the maximum length of time (in seconds) you want the
Interface to go without reporting a new event to PI. After this time, the Interface reports
the new event to PI without applying the exception deviation test.


Functional Specification Document (FSD): Provides the technical requirements
for the system.

HA: High Availability methods. Multiple historian servers are installed as a
collective and all collect data from the interfaces. Each historian server
continues to archive and buffer data separately. When one of the servers is
down due to network disruption, maintenance, etc., the other servers remain
available to users. Historian replication enables alternate data sources by
synchronizing the configuration of multiple servers.

Interface Node: Interface Nodes run interfaces. Interfaces get the data from the data
sources and send it to the process historian servers. Each different data source needs an
interface that can interpret it.

Interfaces: Software modules that allow communication between the historian and
a data source, collecting data from the data source and sending it to the
historian (and vice versa). Typical data sources are Distributed Control
Systems (DCSs), Programmable Logic Controllers (PLCs), OPC Servers, lab
systems, and process models. However, the data source could be as simple as a
text file.

IsGood: The IsGood method is used to evaluate a Value object to determine
whether the data contained in the Value property represents valid data or an
error state. When IsGood returns FALSE, the Value property does not contain
valid data and is considered bad data.

Interface Failover: Depending on the data source, an interface can automatically switch
between redundant copies of the interface run on separate interface computers.
This provides uninterrupted collection of process data even when one of the interfaces is
unable to collect data for any reason. When maintenance, hardware failure, or network
failure causes one interface to become unavailable, the redundant interface computer
automatically starts collecting, buffering, and sending data to the Historian Server.

iFields: Intelligent fields; oil fields that are instrumented to collect
downhole information automatically.

OPC methods of getting data: OPC Interface has three methods of getting data:
Advising, Polling, and Event reads (also known as triggered reads). For Advise tags
(referred to as ReadOnChange in the OPC Standard), the OPC Server sends data
whenever a new value is read into the server’s cache. For Polled points, the interface
sends an Asynchronous Refresh call (see Data Access Custom Interface Standard from
OPC Foundation for more details) for the Group. For Event reads, the PI Server
informs the interface when the trigger point has a new event (not necessarily a change
in value) and the interface sends an Asynchronous Read call for the event tags
attached to that trigger. All three kinds of points are read asynchronously by
the interface and the same data routines process all updates.

Point: The point is the basic building block for controlling data flow to and from the
Data Historian Server. For a given timestamp, a point holds a single value.

Performance Points: Points that monitor Windows Performance counters through
the PI Performance Monitor interface.

PctGood: A Historian function that returns the percentage of a given time
period during which the point's archived values are good. The function is
provided with a date range; if it returns Null or an error, the tag is
considered a bad tag.

PDRM: Process Data Reliability Management, the acronym for this procedure.

PI: Plant Information, the name of the Data Historian product from OSISoft.

Span: The Span is the difference between the top of the range and the bottom of
the range, i.e., the range of the instrument. It is required for all numeric
data type points.

Shutdown: Shutdown events are typically written into points to indicate when the
Historian is taken off-line.

Scan Class: A code that interfaces use to schedule data collection. A scan
class consists of a scan period, which tells the interface how often to collect
the data and when to start collecting it.

5 Instructions

This document shall be used to define plans for and implement effective data
collection and well-governed data historian systems. Saudi Aramco facilities
will thereby greatly reduce inefficiencies and increase the level of confidence
in business information. PDRM must address the accuracy of data when initially
collected, accuracy decay, and accurate access, transformation, and
interpretation of the data for users. Its mission is threefold: improve,
prevent, and monitor. PDRM's major task is to investigate current data
historian systems and information processes to find and fix existing problems.

6 Data Quality

The PDRM solution shall help detect data gaps, bad-quality data, and other data
faults leading to historian data degradation. Saudi Aramco facilities shall use
DRA and work with P&CSD to resolve data quality issues. The run-through shall
be carried out every year in order to maintain good-quality data.


The following are recommended techniques to improve process data reliability:

6.1 Exception Reporting Data

Exception reporting shall be used to tune tags and maximize the efficiency of
data flow from the interface machine to the historian server for each point.
Exception reporting takes place on the interface machine before the value is
sent to the historian. It improves process data reliability by reducing the
communication (I/O) burden between the historian and the interface node, and it
filters out “noise.” Exception reporting shall be controlled by setting the
following attributes:
a) Exception deviation shall be slightly smaller than the precision of the
instrument (dead band).
b) Maximum time span between exceptions shall be set to 180 seconds to
ensure sufficient events/day are collected. This shall be the limit on how
long the interface can go without reporting a value. If the maximum time
period elapses without any new value received, the interface shall send a
value regardless of whether the new value differs from the last reported
value. In this case, at least 480 values per day shall be collected.
c) Dead band shall be kept at 0.25% of span.

The following list shows the different OSISoft PI attributes that shall be set in
order to configure exception reporting:
d) Tag attribute ExcMin shall be set to zero (0)
e) Tag attribute ExcMax shall be set to 180 (180 second ~ 3 minutes).
f) Tag attribute ExcDev shall be set to ½ of the tag attribute CompDev.
g) Tag attribute ExcDevPercent shall be set to 0.25% or recalculated from
tag attribute ExcDev. The minimum of these two values shall be used.
h) For more details on these attributes, refer to SABP-Z-001, page 15,
Data Fidelity.
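The exception test mandated in (a) through (c) above can be sketched as follows. This is an illustrative Python sketch, not the vendor's implementation; the class and attribute names are hypothetical, but the defaults follow the ExcMin = 0 and ExcMax = 180 s values this procedure requires.

```python
class ExceptionFilter:
    """Illustrative sketch of interface-side exception reporting.

    A new value is reported to the historian only if it differs from the
    last reported value by more than the dead band (ExcDev), subject to
    the ExcMin / ExcMax time limits mandated in SAEP-389 Section 6.1.
    """

    def __init__(self, exc_dev, exc_min=0.0, exc_max=180.0):
        self.exc_dev = exc_dev    # dead band in engineering units (ExcDev)
        self.exc_min = exc_min    # min seconds between reports (ExcMin = 0)
        self.exc_max = exc_max    # max seconds without a report (ExcMax = 180)
        self._last_value = None
        self._last_time = None

    def report(self, t, value):
        """Return True if (t, value) should be sent to the historian."""
        if self._last_time is not None:
            if t - self._last_time < self.exc_min:
                return False  # too soon after last report: discard
            if (t - self._last_time < self.exc_max
                    and abs(value - self._last_value) <= self.exc_dev):
                return False  # inside the dead band: filter out noise
        # first value, significant change, or ExcMax elapsed: report it
        self._last_value, self._last_time = value, t
        return True
```

For example, with a dead band of 0.5 engineering units, a change of 0.2 is suppressed while a change of 1.0 is reported, and any value is reported once 180 seconds have passed since the last report.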

6.2 Compression Reporting Data

Compression reporting shall be used to tune historian tags and maximize the
efficiency of data storage in the archive for that point (compression testing).
Although modern day historians are capable of storing enormous amounts of
data, it's important to store quality data to improve the historian's efficiency.
More efficient data storage allows for longer periods of on-line data on the same
disk space. PDRM mandates not storing any event that the historian can
essentially recreate by interpolating between surrounding events.

Storing excessive data in the historians affects performance. When clients make
calls to retrieve compressed data or execute summary calculations over large
periods of time, much of the archive data will likely be read from disk the first
time the call is made. This is an expensive operation (compared to reading from
memory). If only quality data is stored in archives, then greater time ranges of
data can be stored in memory (read cache) for quicker access. The procedure for
adjusting the compression parameters to produce efficient archive storage
without losing significant data is to set the following attributes:
a) Compression shall not be disabled. If for some reason all incoming values
are needed, Compression shall still be enabled and the Compression
Deviation shall be removed. This allows all data except successive
identical values to be archived, which is much more efficient.
b) Compression Deviation shall be set to the minimum change that is
measurable by the instrument. Compression Deviation shall be set so as not to
lose data while still resulting in sufficient events/day. The compression
deviation calculations are explained with the following examples:
Temperature range 1 – 150°C: Compression Deviation = (1 × 100)/150 ≈ 0.67%
Pressure range 2 – 2000 psig: Compression Deviation = (2 × 100)/2000 = 0.1%
c) CompMin and CompMax refer to the time between events in the Archive.
A new event shall not be recorded if the time since the last recorded
event is less than the compression minimum time for the point. Similarly,
a new event shall be recorded if the time since the last recorded event is
more than the compression maximum time for the point.
d) Compression shall be disabled for laboratory, manually entered and other
tags where each event is significant in itself and not merely representative
of an underlying flow.

The following list shows the different OSISoft PI attributes that shall be set in
order to configure compression:
e) Tag attribute CompMin shall be set to zero (0).
f) Tag attribute CompMax (maximum compression time) shall be set to
600 sec (10 minutes).
g) Tag attribute CompDev shall be set to double the tag attribute ExcDev.
h) Tag attribute CompDevPercent shall be set to 0.5% or recalculated from
tag attribute CompDev. The minimum of these two values shall be used.
i) For more details, refer to SABP-Z-001, page 15, Data Fidelity.
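The compression rules above can be sketched as a simple deviation test. This is an illustrative simplification: the real PI compression algorithm is a swinging-door test, so the sketch below only approximates it. Function names are hypothetical; the defaults follow the CompMin = 0 and CompMax = 600 s values above, and the percentage helper reproduces the worked deviation examples.

```python
def pass_compression(last_archived, candidate, comp_dev,
                     comp_min=0.0, comp_max=600.0):
    """Decide whether a snapshot event should be written to the archive.

    Simplified deviation test (not PI's swinging-door algorithm).
    last_archived and candidate are (timestamp_seconds, value) pairs.
    """
    t0, v0 = last_archived
    t1, v1 = candidate
    if t1 - t0 < comp_min:
        return False                # events arriving too fast: suppress
    if t1 - t0 >= comp_max:
        return True                 # CompMax elapsed: archive regardless
    return abs(v1 - v0) > comp_dev  # archive only significant changes


def comp_dev_percent(min_change, span):
    """Compression deviation as a percent of span, per the worked
    examples (temperature 1/150 -> ~0.67 %, pressure 2/2000 -> 0.1 %)."""
    return min_change * 100.0 / span
```

For a tag with CompDev = 1.0, a drift of 0.1 is dropped, a jump of 2.0 is archived, and an unchanged value is still archived once 600 seconds have passed.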

6.3 Data Gap Analysis

Shutdown, Bad, and other historian events shall be used to recognize
discontinuity of the data and identify data gaps. Historian states shall be
used to represent error conditions and may be sent as values to tags of any
type. In order to improve process data reliability, data gap analysis shall be
performed for all tags with a data quality of less than 100 percent. This will
help determine whether the data contained in the tags is good. Quality of data
shall be measured by the built-in function PctGood. PDRM shall categorize and
group tags according to their quality to analyze the data and report the
percentage of time the data holds good values.
a) Data Gap Analysis shall identify all tags with data quality less than 100%


b) Data Gap Analysis shall capture tag values for a given period of time.
Tags' digital states shall be validated to generate a data gap report
showing the total time each tag failed and the number of times the tag had
a bad value.
c) Data Gap Analysis shall identify the total time period a tag has been in
error conditions.
d) Data Gap Analysis shall count the number of times a tag has been in error
conditions.
e) Data Gap Analysis shall evaluate the data for error states like
“Shutdown”, “I/O Time Out”, “Bad Value”, “Out of Service”, “Ptcreated”,
etc., by applying the IsGood built-in function. If the IsGood function
fails, it is indicative of invalid data, which is considered bad data.
f) Data Gap Analysis shall identify the percentage of the time period during
which the point's archived values are good. The PctGood function shall be
applied within a date range. If the function returns Null or an error, the
tags shall be considered bad tags.
g) Data Gap Analysis shall identify all tags whose data is frozen or stuck,
not changing for the entire process period. These tags shall be marked as
bad tags.
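The gap-analysis steps above can be sketched as follows. This is an illustrative helper, not the historian's built-in PctGood (which is time-weighted); here a sample-count stand-in is used, the event format is an assumption (chronological (timestamp, value-or-state) pairs, with a string denoting a digital error state), and the error-state names follow item (e).

```python
# Error states per item (e); the exact state strings vary by historian.
BAD_STATES = {"Shutdown", "I/O Time Out", "Bad Value",
              "Out of Service", "Ptcreated"}


def gap_report(events):
    """Sketch of Data Gap Analysis over a list of (timestamp, value) samples.

    Returns (pct_good, bad_runs, frozen):
      pct_good - percent of samples holding good values (PctGood stand-in)
      bad_runs - number of times the tag entered an error condition
      frozen   - True if the good values never change (stuck tag, item g)
    """
    good = sum(1 for _, v in events if v not in BAD_STATES)
    pct_good = 100.0 * good / len(events)

    bad_runs = 0
    prev_bad = False
    for _, v in events:
        is_bad = v in BAD_STATES
        if is_bad and not prev_bad:
            bad_runs += 1        # count each entry into an error state
        prev_bad = is_bad

    numeric = [v for _, v in events if v not in BAD_STATES]
    frozen = len(set(numeric)) <= 1  # value never changes: suspicious
    return pct_good, bad_runs, frozen
```

A tag with three good samples out of six and two separate error excursions would report 50% good data and two failures, and would not be flagged as frozen.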

6.4 Stale and Dead Tags

Two basic indicators shall be monitored to diagnose the condition of the
historian: points that have stopped collecting data (stale points) and points
that have not received data for a long time (dead points).
a) The stale state indicates that the point has not updated within a
specified time. By default, a tag is stale if its current value is over
four hours in the past.
b) The dead state indicates that the point has not updated in the last
12 months.
c) Some possible scenarios for stale or dead tags are (i) No network
connection between the Historian and the interface, (ii) the interface
computer has shut down, or the interface computer has lost connection
with the device, (iii) someone has changed the point attributes.
d) When point values are stale or dead for no known reason, administrator shall
immediately determine the cause. When points are no longer useful, such as
points that represent data from obsolete equipment, decommission them.
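The stale/dead classification in (a) and (b) can be sketched as follows. This is an illustrative helper (not a vendor API) using the default thresholds stated above: stale after four hours, dead after twelve months (taken here as 365 days).

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=4)   # default stale threshold, item (a)
DEAD_AFTER = timedelta(days=365)   # dead threshold (12 months), item (b)


def classify_tag(last_update, now):
    """Classify a historian point by the age of its last update:
    'ok', 'stale' (no update within 4 hours), or 'dead' (no update
    within 12 months)."""
    age = now - last_update
    if age >= DEAD_AFTER:
        return "dead"
    if age >= STALE_AFTER:
        return "stale"
    return "ok"
```

An administrator sweep would run this over all points' snapshot timestamps and investigate every tag not classified as "ok".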


6.5 Data Range Violations Analysis

Data range violation analysis shall report off-range data as suspicious data.
This is done by analyzing tag values against range limits (upper and lower) for
a given period of time. If data is off range more than 20% of the time (total
time being the from/to report period selected for DRA processing), it shall be
marked as suspicious and reported for correction. A standard deviation function
shall be used to determine the quality of the data. This option should be run
specifically on temperature and pressure tags, which shall be identified by
their engineering units (engunits). The following details the data range
violation analysis:
a) Data range violation analysis shall determine if a value is within a
specified range (min and max). If data is off range more than 20% of the
time, then it shall be marked as suspicious and reported for correction.
b) Data range violation analysis shall use standard deviation (StDev) to
identify the quality of data. StDev returns the time-weighted standard
deviation of archive values for the point over a given time interval.
The larger the StDev, the more suspicious the data: a large standard
deviation (around 40%, varying from process to process) indicates that the
data points are far from the mean and shall be considered suspicious,
while a small standard deviation indicates that the data is clustered
closely around the mean and hence of good quality.
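The range-violation screen above can be sketched as follows. This is illustrative only: it uses a plain sample standard deviation as a stand-in for the historian's time-weighted StDev, and the function name and return shape are assumptions.

```python
from statistics import stdev  # sample (not time-weighted) std deviation


def range_violation_report(values, lo, hi, off_range_limit_pct=20.0):
    """Flag a tag as suspicious when its values fall outside [lo, hi]
    more than 20 % of the samples, per SAEP-389 Section 6.5.

    Returns (off_range_pct, sample_stdev, suspicious).
    """
    off = sum(1 for v in values if v < lo or v > hi)
    off_pct = 100.0 * off / len(values)
    sd = stdev(values) if len(values) > 1 else 0.0
    return off_pct, sd, off_pct > off_range_limit_pct
```

For example, a pressure tag ranged 0 - 200 psig whose samples are off range 30% of the time would be reported as suspicious, and its standard deviation would confirm the wide scatter.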

7 Data Availability

Multiple processes bring data into the historian from outside, either manually
or through various interfaces and data integration techniques. Data is
exchanged between systems through real-time (or near real-time) interfaces and
is propagated too fast to allow much verification of its accuracy. PDRM shall
focus on the validity of individual interface attributes, identify data
problems, and react accordingly. The following details the important interfaces
covered by PDRM and how they can be used to improve process data reliability:

7.1 Interface Reporting

PDRM shall generate health check reports on current Process Data Interfaces
from plants. The health check reports shall identify tags, interface scan periods,
and total number of tags associated with each scan class of an interface.
These reports shall help in interface load balancing and result in more robust data
gathering interfaces. The following details the health check report conditions:
a) Interface Reporting shall group all tags used by an Interface with a
unique identifier (point source).


b) Interface Reporting shall identify the total number of monitored
interfaces. The “Location1” tag attribute uniquely identifies an interface
in OSISoft PI.
c) Interface Reporting shall identify the scan period for each point. The scan
period determines the frequency at which input points are scanned for new
values.
d) Scan periods shall be between 5 sec and 30 min. Shorter periods would
result in too many data values, consuming large disk space and misusing
network bandwidth; periods longer than 30 min would result in insufficient
events/day.
e) Interface nodes shall be physically located between the plant's firewall
and the corporate WAN, in the DMZ area, and shall allow remote desktop
inspection from the Saudi Aramco WAN through the use of Citrix Servers.
Remote access shall be used only for monitoring, not for administration or
any other purpose. Refer to SAEP-99, paragraph 5.4.2.J, for communication
details. Most data failures come from interface failures or inadequate
scan frequencies, which need constant inspection. This might require
additional firewall ports to be opened.
f) Data sources (PLCs/DCS/SCADA/Lab, etc.) shall be configured using the
industry-standard OPC interface, avoiding vendor-specific interfaces as
far as possible. The interface architecture shall be as specified in
SAES-Z-010.
g) PDRM recommends that each scan class of an interface be configured with
not more than 800 points per scan class. This is required to optimize
load balancing.
h) PDRM recommends that each Saudi Aramco facility shall document the
complete architecture of their plants process data flow into a historian.
This shall help analyze the data quality implications of any changes in
interface configurations or data collection procedures and thus eliminate
unexpected data errors.
i) PDRM recommends that all Saudi Aramco facilities shall deploy Interface
Failover redundancy architecture as described in SAES-Z-010 while
configuring interfaces. This allows the data collection process to be
controlled at the lowest possible level, and ensures that data collection will
continue even if the connection to the data historian fails.
j) PDRM recommends that all Saudi Aramco facilities configure the interfaces
using the Data Historian vendor-provided Interface Configuration Tools
(PI-ICU); manual configuration of interfaces shall be avoided. The manual
method of configuring an interface does not create performance counters or
watchdog tags for monitoring the health of interfaces.
k) PDRM recommends that all Saudi Aramco facilities shall configure
Performance Counters during interface configurations by using Interface
Configuration Tools (PI-ICU). Performance Counters provide important
insights into a number of performance management problems including
(but not limited to) memory, disk, and process management.
These Performance Counters shall be selected, associated with tags, and
those tags then configured on a historian server.
l) PDRM recommends that all Saudi Aramco facilities shall install and enable
the buffering capabilities on the interface machine while configuring the
interface with the Interface Configuration Tools. This shall handle
connection failures between the interface machine and the historian
server: in the event of such a failure, the interface machine shall buffer
the data until the historian server is brought back online, at which point
the data is restored/sent to the historian. The buffering capability is
limited by the interface machine's hard disk capacity.
m) The OPC interface's method of gathering data shall be configured as
“Advising” (referred to as ReadOnChange in the OPC Standard), allowing
the OPC Server to send data whenever a new value is read into the server's
cache.
n) PDRM strongly recommends that Advise tags and Polled tags not be
mixed in the same Group (i.e., scan class) while configuring interfaces.
The OPC Interface has three methods of getting data: Advising, Polling,
and Event reads (also known as triggered reads). For Advise tags, the
OPC Server sends data whenever a new value is read into the server’s
cache. For Polled points, the interface sends an Asynchronous Refresh call
for the Group. For Event reads, the PI Server informs the interface when
the trigger point has a new event and the interface sends an Asynchronous
Read call for the event tags attached to that trigger. All three kinds of
points are read asynchronously by the interface and the same data routines
process all updates. If Advise tags and Polled tags are in the same scan
class, it can cause odd problems, and the performance of the interface
under those conditions is not guaranteed.
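The health checks in (a) through (d) and (g) above can be sketched as a report generator. This is illustrative only: the tag dictionary keys mirror, but are not, the exact vendor attribute names (PointSource, Location1, scan class period), and the function name is an assumption.

```python
from collections import Counter

MAX_POINTS_PER_SCAN_CLASS = 800       # PDRM recommendation, item (g)
MIN_PERIOD_S, MAX_PERIOD_S = 5, 30 * 60   # scan period limits, item (d)


def interface_health_report(tags):
    """Group tags by (point source, interface, scan period) and flag
    overloaded scan classes and out-of-range scan periods.

    Each tag is assumed to be a dict with 'pointsource', 'location1'
    and 'scan_period_s' keys. Returns a list of
    ((pointsource, interface, period), tag_count, problems) tuples.
    """
    counts = Counter((t["pointsource"], t["location1"], t["scan_period_s"])
                     for t in tags)
    report = []
    for (src, iface, period), n in sorted(counts.items()):
        problems = []
        if n > MAX_POINTS_PER_SCAN_CLASS:
            problems.append("scan class overloaded")
        if not MIN_PERIOD_S <= period <= MAX_PERIOD_S:
            problems.append("scan period outside 5 s - 30 min")
        report.append(((src, iface, period), n, problems))
    return report
```

Running this over the full tag export of a historian yields the per-scan-class counts the load-balancing recommendation calls for.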

7.2 Lab Data Availability

Historian administrators shall use the OPC interface to make Lab data available
for historization. If the Lab systems do not provide an OPC Server, Historian
administrators shall use the RDBMS interface.


Historian administrators shall disable Compression and Exception reporting for
laboratory tags, where each event is significant in itself and not merely
representative of an underlying flow.

7.3 Historian High Availability

Data Historian Server High Availability (HA) approaches shall be deployed to
avoid single points of failure, data loss, or inaccessible data.
a) High Availability (HA) shall enhance the reliability of the Historian by
providing alternate sources of the same time-series data for users.
b) HA shall keep collecting data and keep process data available to users
during planned outages such as routine maintenance, Operating System
updates, software upgrades, and hardware upgrades, as well as unplanned
failures such as software failures, hardware failures, and network
failures, and shall avoid losing data or rendering it inaccessible.
c) HA shall enhance the reliability of the Historian by deploying a minimum
of two servers for collecting time-series data.
d) HA shall be used for load distribution to balance server traffic among a
group of servers by distributing user connections to the servers.
Preferably, a new connection would be directed at a server in the group
that had the lowest load.
e) HA shall be used to segregate users by class of service. Process
operators have immediate needs for data and must have access to any
available server, including some reserved for them exclusively. Users
with moderate needs for data should have access to any server except
those reserved for process operators. Users who run intensive data-mining
operations consume a large amount of server resources but can run slowly
or be deferred; they shall have access only to servers that do not impact
the needs of the process operators and moderate-need users. Segregation
by class of service directs each user connection to an available server
that meets that user's needs without impacting higher classes of service.
When the server that users are connected to becomes unavailable, their
connection shall be re-established to another server if possible.
f) HA shall establish a large geographic separation between the redundant
servers. The geographic diversity can help in risk management and
disaster recovery planning.
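Items (d) and (e) above can be sketched as a simple connection director; the server names, allowed classes, and load figures below are hypothetical, and a real deployment would query live server status rather than a static table:

```python
# Sketch: direct a user connection to the least-loaded available server
# that serves that user's class of service. All data here is illustrative.

SERVERS = [
    # (name, classes allowed to connect, current load 0.0-1.0, available)
    ("HIST-A", {"operator"},               0.40, True),
    ("HIST-B", {"operator", "engineer"},   0.20, True),
    ("HIST-C", {"engineer", "datamining"}, 0.70, True),
    ("HIST-D", {"datamining"},             0.10, False),
]

def pick_server(user_class, servers=SERVERS):
    """Return the least-loaded available server that serves this class of user."""
    candidates = [s for s in servers if s[3] and user_class in s[1]]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s[2])[0]
```

For example, an operator is directed to HIST-B (lower load than HIST-A), while a data-mining user is directed to HIST-C because HIST-D is unavailable.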


7.4 Backup of Historian

Process data historians shall be backed up regularly and shall be tested to restore
the original data in case of data loss or corruption. PDRM recommended
procedures for backing up the Historian Server are as follows:
a) Configure a Daily Backup Task. This daily task backs up the Server
to a single Server backup directory. Files are overwritten and accumulated
in this directory; the accumulated files correspond to a full backup of
the Server. Note that this backup directory reflects only the latest state
of the Server.
b) All historian systems include a script to configure a daily backup that
runs as a Windows task, henceforth referred to as the “scheduled backup
task”. The scheduled backup task performs an incremental, verified backup
each day. It places the backup files in the directory specified by the
Windows task, referred to as the “scheduled backup directory”. This
directory holds only the most recent verified backup, so the Historian
Administrator shall copy each day's verified backup to a safe location.
The historian system can be accessed as usual while the scheduled backup
task is running.
c) By default, the backup task uses Microsoft's Volume Shadow Copy
Services (VSS) to enable access to the historian systems during backups.

d) Back Up the Scheduled Backup Directory. Backing up the files in the
backup directory is a crucial step in safeguarding the historian Server.
The backup directory contains only the most recent backup; as new backup
files are copied in, the old backup files are overwritten. Backups of the
backup directory provide the backup history needed to restore the data.


e) Historian administrator shall avoid manual backups and use Saudi
Aramco recommended third-party applications to automate the backup
process, choosing a combination of full and incremental backups.

f) Historian administrator shall store backed-up data in a disaster
recovery center located in a different area.
g) Historian administrator shall maintain a backup history of two weeks to
a month and keep it ready to be restored at any time.
h) While historian systems are running, they cannot be backed up with
standard operating system commands such as copy (Windows) or cp (UNIX),
because the historian opens its databases with exclusive read/write
access and the copy commands will fail outright. The historian prevents
operating system access because much of the information needed to back up
its databases is in memory, and a simple file copy would most likely
produce a corrupt backup.
i) Historian administrator shall not include the historian archives folder
in the daily system backup. The archives consist of a large number of
very large files that undergo frequent small changes.
j) Historian administrator shall use historian backup scripts that are
designed to back up the archive files efficiently.
k) Historian administrator shall make sure that there is enough space on
the disk where the historian creates the backup files, and check the disk
space regularly.
l) Historian administrator shall run a trial backup and restore to make sure
everything works correctly, and test the backups this way periodically.
m) Historian administrator shall turn on interface buffering for the
interfaces wherever possible to avoid losing incoming data while the
backups are running.


n) After a new historian server installation or an upgrade, Historian
administrator shall shut down the Server and make a complete backup of
all server directories and archives.
o) Historian administrator shall immediately make a backup after a major
change to historian, such as a major edit of the points or user database,
rather than waiting for the automated backup.

In the case of the OSIsoft PI system:
p) By default, PI backup task uses Microsoft's Volume Shadow Copy
Services (VSS) to enable access to the PI Server during backups.
q) PI Server uses incremental backups; there is no need to specify a cutoff
date or a number of archives to be backed up. The PI Backup Subsystem
backs up all archives that have been modified since the last backup.
Typically, only one or two archives need to be backed up, depending on
whether an archive shift occurred.
r) To establish a full backup of the PI Server, change to the PI\adm
directory and type the following command:

piartool -backup backupdir -numarch num -arcdir -wait

s) Here backupdir is the full path to the backup directory, and num is the
number of archives.
t) The piartool -backup command shall not be used to start a backup
directly. Instead, use the PI Server backup script pisitebackup.bat,
which in turn runs the necessary piartool -backup commands. To change
which files are backed up, edit pisitebackup.bat.
u) Files to be backed up are: archive and annotation files, configuration
files, log files, batch files, and other important files, depending on
the type of historian system in use.
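The routine disk-space and retention checks described above (items g and k) can be sketched as follows; the retention window and free-space threshold are illustrative assumptions, and a production deployment would use the vendor's backup scripts rather than this sketch:

```python
# Sketch: pre-backup free-space check and retention pruning for a backup
# directory. Thresholds are illustrative; tune them to the archive size.

import os
import shutil
import time

RETENTION_DAYS = 30                 # "two weeks to a month" per item (g)
MIN_FREE_BYTES = 50 * 1024**3       # assumed 50 GB free-space requirement

def check_backup_space(backup_dir):
    """Return True if the backup volume has enough free space."""
    usage = shutil.disk_usage(backup_dir)
    return usage.free >= MIN_FREE_BYTES

def prune_old_backups(backup_dir, now=None):
    """Delete backup files older than the retention window; return names removed."""
    now = now or time.time()
    cutoff = now - RETENTION_DAYS * 86400
    removed = []
    for name in os.listdir(backup_dir):
        path = os.path.join(backup_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

Running the space check before the scheduled backup task, and pruning after a verified copy has been taken off-site, keeps the scheduled backup directory within the retention policy.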

8 Data Consistency

8.1 Tag Attributes

PDRM shall generate a standard table with all possible instruments used in
Saudi Aramco and their respective ranges (input required by all Saudi Aramco
process engineers). Engineering Units and instruments low and high limits shall
be classified as range attributes.
a) PDRM shall report all non-standard engineering units.


b) PDRM shall report tags which violate the low and high limits.
c) PDRM shall validate these failed tags against the SA Standard table and
alert the discrepancies.

In the case of the OSIsoft PI system:
d) Min Value of tag = Zero
e) Max Value of tag = Zero + Span
f) Span = Range (Maximum – Minimum)
g) Report the number of tags which have the OSIsoft default settings for
Span and Zero: count all tags whose Zero = 0, Span = 100, and
EngUnits <> ‘%’ (Statistical Report).
h) PDRM recommends not using the OSIsoft default values but the SA standard
settings.
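The checks in items (d) through (h) can be sketched as follows; the tag records are illustrative, and the lookup against the SA standard instrument table is omitted:

```python
# Sketch: flag tags still carrying OSIsoft default range settings
# (Zero=0, Span=100, EngUnits other than '%'), and tags whose value
# violates the configured range. Tag data is illustrative.

def has_default_range(tag):
    """True if the tag still has the vendor default Zero/Span settings."""
    return tag["zero"] == 0 and tag["span"] == 100 and tag["engunits"] != "%"

def violates_range(tag, value):
    """True if a value falls outside [Zero, Zero + Span]."""
    low = tag["zero"]
    high = tag["zero"] + tag["span"]   # Max value of tag = Zero + Span
    return not (low <= value <= high)

tags = [
    {"name": "25-FI-1001", "zero": 0,   "span": 100, "engunits": "m3/h"},
    {"name": "25-TI-1002", "zero": -50, "span": 250, "engunits": "degC"},
]
defaults = [t["name"] for t in tags if has_default_range(t)]
```

Tags listed in `defaults` would then be re-ranged from the SA standard table and any range violations reported per item (b) of 8.1.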

8.2 Data Transfer between Historians

PDRM shall identify the tags, their attributes, and the data differences
between local historians and central historians. This helps in fixing the
differences and ensures that both historian servers are collecting and
archiving the same values. This is required by iField/OSPAS, as they collect
data from the different field historians located in all Saudi Aramco oil
fields. PDRM recommends generating three reports: (i) Tag attributes matching
report, (ii) Snapshot values matching report, and (iii) Archive values
matching report.
a) Tag Attribute Matching: The tags from Server A and Server B shall be
matched, and any missing tags shall be reported. The tag attributes
shall also be matched and any mismatches reported. This option shall
generate a detailed report.
1) All tags shall be matched, and missing tags shall be identified and
logged.
2) For process tags, the 16 most important attributes (Desc, engunits,
archiving, step, excdev, excdevpercent, excmax, excmin, compdev,
compdevpercent, compmax, compmin, compressing, span, zero and
shutdown) shall be matched in both servers. Attribute mismatches
shall be reported to be fixed.
b) Snapshot Values Matching: All tags from Server A and Server B shall be
processed to compare their current values and timestamps. This ensures
all tags in both servers are collecting similar values at similar
timestamps. This option shall generate a detailed report.


1) Snapshot values: Every tag has exactly one snapshot value at any
given time, which is the most recent value collected.
2) Analog values shall be matched to 2 decimal places only.
3) Digital tags shall be matched exactly.
4) Identify and fix tags whose values do not match, and any missing tags.
c) Archive Values Matching: All tags from Server A and Server B shall be
processed to compare their archive values and timestamps. This ensures
all tags are collecting and archiving similar values with similar
timestamps. This option shall generate a detailed report.
1) Analog values shall be matched to 2 decimal places only.
2) Digital tags shall be matched exactly.
3) Only matching tags shall be processed, i.e., tags existing in both servers.
4) Get events from both servers and match them. If both the timestamp
and the value match, the event is counted as a matched event.
5) Calculate the percentage of matched events by dividing the matched
events by the number of archived events from Server A or Server B,
whichever is greater.
6) Identify and fix tags whose values do not match, and any missing tags.
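The matching rules above can be sketched as follows; events are represented as (timestamp, value) pairs, and the sample values are illustrative:

```python
# Sketch: archive-value matching between two servers. Analog values are
# compared to 2 decimal places, digital values exactly, and the matched
# percentage is computed against the larger event count.

def events_match(a, b, digital=False):
    """Compare two (timestamp, value) events per the 8.2(c) rules."""
    if a[0] != b[0]:                         # timestamps must agree
        return False
    if digital:
        return a[1] == b[1]                  # digital: exact match
    return round(a[1], 2) == round(b[1], 2)  # analog: 2 decimal places

def match_percentage(events_a, events_b, digital=False):
    """Percentage of matched events over the larger of the two counts."""
    matched = sum(
        1 for a, b in zip(events_a, events_b) if events_match(a, b, digital)
    )
    denom = max(len(events_a), len(events_b))
    return 100.0 * matched / denom if denom else 100.0

server_a = [(1, 10.123), (2, 11.404), (3, 12.0)]
server_b = [(1, 10.121), (2, 11.401), (3, 12.5)]
```

In this illustration the first two events round to the same 2-decimal value and count as matched, while the third does not, giving a match percentage of two thirds.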

9 Tag Naming Convention

The scope of tag configuration and management covers tags from the field
instruments, within plant SCADA and PI servers, up to and including the PI
System. The proposed tagging scheme is based on adopting the Saudi Aramco
Engineering Standard SAES-J-004, Instrument Symbols and Identification.
This standard is based on ANSI/ISA-5.1, a well-established industry tagging
standard followed by all Saudi Aramco facilities and projects; ANSI/ISA-5.1
is also used by all major oil and petrochemical companies worldwide.
Accordingly, this proposed tagging scheme is mandatory and shall be followed
without any deviation, to remedy the confusion created by the current
non-standardized tagging.
a) NAMING CONVENTION FOR PLANT LEVEL TAGS

For all plant level tags, we propose to prefix the plant number to the actual tag
defined according to the Saudi Aramco standards. So, final proposed tag naming
convention is given below:

<Plant Code> - <Tag name> - <Tag number>


1) <Plant Code>: Plant code as defined by Saudi Aramco.
2) <Tag Name>: As defined according to Saudi Aramco Standard
SAES-J-004.
3) <Tag Number>: A unique number that succeeds the tag name, obtained
from the Unit responsible for tag management and issuance for all
Saudi Aramco company facilities.

b) NAMING CONVENTION FOR CALCULATION AND OTHER TYPES OF TAGS

For all calculated and aggregated tags, the prefix shall be applied according
to the following convention:

<Plant Code>-<Calculated/Other Tag name>

1) <Plant Code>: Plant code as defined by Saudi Aramco.
2) <Calculated/Other Tag Name>: As defined according to Saudi Aramco
Standard SAES-J-004.
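The naming conventions above can be checked mechanically. In the sketch below the character classes (numeric plant codes, letter-only function codes, digit counts) are illustrative assumptions; the normative rules come from SAES-J-004 and ANSI/ISA-5.1:

```python
# Sketch: validate a tag against the proposed
# <Plant Code>-<Tag name>-<Tag number> convention. The patterns are
# illustrative assumptions, not the normative SAES-J-004 definition.

import re

PLANT_TAG = re.compile(
    r"^(?P<plant>\d{2,4})-"      # plant code (assumed numeric)
    r"(?P<name>[A-Z]{1,5})-"     # instrument function letters, e.g. FIC, PT
    r"(?P<number>\d{3,6})$"      # unique tag number from the issuing unit
)

def parse_plant_tag(tag):
    """Return (plant, name, number) if the tag follows the convention, else None."""
    m = PLANT_TAG.match(tag)
    if not m:
        return None
    return (m.group("plant"), m.group("name"), m.group("number"))
```

A validator like this can be run over the point database to report tags that deviate from the mandated scheme.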

10 Data Implementation Methodology

Saudi Aramco has historians holding more than 10 years of data. Initially, the
tag attributes were configured as recommended by the vendor, which, when tested
by P&CSD, turned out to be highly incompatible with Saudi Aramco needs.
a) Historian administrator shall apply Saudi Aramco compression best practices as
described in SABP-Z-001 on current and future process data historians.
b) PDRM recommendations shall be applied with cooperation from all sources,
i.e., Data Proponents, IT first-line support, and the P&CSD data historian group.
c) PI Integrators shall be provided with the Saudi Aramco best practices of tag
attribute settings (SABP-Z-001). The best practice was developed after extensive
studies carried out on various Saudi Aramco plant data including iFields.
d) PI Integrators shall follow the standard Saudi Aramco tag attribute setting while
defining new tags or interfacing tags from PLCs/DCS. This includes Plant
Personnel and IT PI Support group.


11 Archive Performance Checks

11.1 Archive Events

PDRM shall evaluate archives for the total number of events archived by
each tag in a specific period, and group the tags according to the count of
events each has archived. Historian system built-in functions (event count,
etc.) shall be utilized to generate the report. The report shall compare the
results against ideal benchmark percentages; groups that exceed the benchmark
values shall be subjected to corrections. The idea is to verify that the
number of archived events is proportionate to the size of the archives.
a) Benchmark: Per data archival system recommendations, an ideal
historian shall collect data in six loading groups. The standard for
filling archives shall be 20% for high loads, 70% for optimal loads, and
as shown for the other groups in the table below. If the actual
percentage of tags in any group exceeds the “ideal benchmark”
(recommended) value, it shall be deemed incorrect and the reasons
investigated.
b) A total event count of less than 48, or greater than 21600, indicates
that either too little data (48) or too much data (21600) is being
gathered for those tags. The reasons shall be investigated and the tags'
existence justified.
c) Analyze all tags of a historian to find, for each tag, the actual number
of archived events during a specified time period. Based on the number of
events, group them as low loads, optimal loads, high loads, super-high
loads, etc., and calculate the percentage of tags in each group. The
different load benchmarks and their error fixes follow.

Group No.   Events Archived      Ideal Benchmark   Error Fix
1           < 24                 0%                0% is bad
2           >= 24 to <= 48       5%                Low loads
3           > 48 to <= 480       70%               Optimal loads
4           > 481 to <= 2880     20%               High loads
5           > 2880               5%                Super high loads
6           > 21600              0%                0% is bad
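The load grouping above can be sketched as follows; the tag counts are illustrative, and the separate > 21600 check in group 6 would be an additional flag on the super-high group:

```python
# Sketch: classify each tag by archived-event count over the evaluation
# period and compute the percentage of tags per load group, for comparison
# against the ideal benchmark. Input data is illustrative.

BENCHMARK = {"zero": 0, "low": 5, "optimal": 70, "high": 20, "super_high": 5}

def load_group(event_count):
    """Map an archived-event count to its load group per the table."""
    if event_count < 24:
        return "zero"
    if event_count <= 48:
        return "low"
    if event_count <= 480:
        return "optimal"
    if event_count <= 2880:
        return "high"
    return "super_high"

def group_percentages(counts):
    """counts: {tag: archived event count} -> {group: % of tags}."""
    totals = {g: 0 for g in BENCHMARK}
    for n in counts.values():
        totals[load_group(n)] += 1
    return {g: 100.0 * c / len(counts) for g, c in totals.items()}
```

Groups whose actual percentage exceeds the benchmark (e.g., too many super-high-load tags) are then investigated per item (a).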

11.2 Percentage Events

DRA shall identify the percentage of time, over a given period, that each
tag's archived values were good. The idea is to evaluate archived values and
eliminate the collection of bad data.


a) DRA shall subject the selected tags to a data quality check (apply the
pctgood function) and develop the percent good/bad quality data report
shown below.
b) Based on the quality check (pctgood output), a tag shall be categorized
into one of three groups: (i) 100% good, (ii) 95% to 100% good, and
(iii) below 95% good.
c) Bad Quality: All tags whose quality check result was below 100% shall be
further checked for bad digital states (IsGood function) and their
statistics reported as shown in the report below. Tag quality is assessed
using the PctGood function. Bad digital states include “Scan Off”,
“Shutdown”, etc.
d) User shall analyze and fix the bad-quality tags.

The report shall list, for each tag: Tag Name; Point Source; Interface name;
Percent Good (100%, 95-100%, or < 95%); and the count of bad digital states
broken down into I/O Timeout, Bad Value, Scan Timeout, Shutdown, Out of
Service, PtCreated, and Other bad states, with a Total row at the bottom.
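The grouping in items (b) and (c) can be sketched as follows; the percent-good thresholds come from the text, while the function and state names are illustrative stand-ins for the historian's pctgood/IsGood facilities:

```python
# Sketch: place each tag into one of the three quality bands by its
# percent-good value (the output of a pctgood-style function), and flag
# tags that need a further bad-digital-state check.

def quality_band(pct_good):
    """Map a percent-good figure to the three bands defined in 11.2(b)."""
    if pct_good >= 100.0:
        return "100% good"
    if pct_good >= 95.0:
        return "95% to 100% good"
    return "below 95% good"

# Illustrative bad digital states; the actual set depends on the historian.
BAD_DIGITAL_STATES = {"Scan Off", "Shutdown", "I/O Timeout", "Out of Service"}

def needs_review(pct_good):
    """Tags below 100% good are checked further for bad digital states."""
    return pct_good < 100.0
```

Each tag flagged by `needs_review` would then have its bad-digital-state counts tallied into the report columns listed above.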

12 System Date and Time Synchronization

The Historian server shall use the Windows clock, including the time zone and
Daylight Saving Time (DST) settings, to track time. If the system clock is
not correct, the data is not correct either; the historian might even lose
data if the system clock is wrong.
a) Historian administrator shall check the system clock regularly, adjust
the clock toward the correct time only in small increments (for example,
one second per minute), and keep a record of all adjustments made.
b) Historian administrator shall configure the clock on a historian server and
synchronize differences in the clocks of the historian, the data systems from which
the data is being collected, and the clocks of the users on the corporate LAN or
WAN.
c) Complications arise when data is collected from legacy systems with clocks
that have been configured inaccurately or allowed to drift. Historian
administrator shall set all clocks to the correct time. If this is not
possible, the administrator shall configure the interface process to read
the current values from the legacy system and send them to the historian
with the current historian server time as the timestamp.


d) Time synchronization software, designed to keep computer clocks accurate
without error-prone human intervention, can also be implicated in moving
system clocks erroneously, with the result that events are recorded in the
future. Historian administrator shall recover from such a situation by:
(i) stopping the historian system, (ii) setting the correct system time and
the time on all connected nodes, and (iii) isolating the historian server
from the interface nodes by disconnecting it from the network, allowing the
data to buffer until the system is verified to be up and running normally.
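The small-increment adjustment rule in item (a) can be illustrated as follows; the correction rate is the example value from the text (one second per minute), and the function is a hypothetical planning sketch, since production systems normally delegate clock discipline to time-synchronization software:

```python
# Sketch: plan a gradual clock correction applied in small increments
# (one second per minute) rather than one large step, so archived
# timestamps stay monotonic.

SLEW_PER_MINUTE = 1.0  # seconds of correction applied each minute

def slew_schedule(offset_seconds, rate=SLEW_PER_MINUTE):
    """Return the per-minute corrections needed to absorb the offset.

    offset_seconds is the total correction to apply (signed); each step
    is at most `rate` seconds in the direction of the correction.
    """
    steps = []
    remaining = abs(offset_seconds)
    sign = 1 if offset_seconds > 0 else -1
    while remaining > 0:
        step = min(rate, remaining)
        steps.append(sign * step)
        remaining -= step
    return steps
```

For example, a 3.5-second correction is spread over four minutes as three one-second steps and one half-second step, and each step would be logged per item (a).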

13 Special Cases

13.1 Data Backfilling Methods

For data backfilling, User shall utilize the P&CSD PDRM solution. The
solution was developed and deployed by P&CSD to transfer data collected by
the Permanent Down-Hole Monitoring System (PDHMS) and Multi-Phase Flow
Meters (MPFM) to the data historian, where it is readily accessible by
engineers and experts.
a) STEP 1: Evaluate the data that is to be backfilled. Determine the number
and configuration of new tags, the time period covered by all tags, and the
approximate amount of data you need to import.
b) STEP 2: Create the tags to backfill. If the tags correspond to active
interfaces, make sure current data is not being sent to the tag from the
interface. One way to do this is to create the tags with the “Scan”
attribute set to “0” (zero, which is off), or to set the “Point Source”
attribute for the tags to “L” for Lab tag. These can be changed later.
(You can import data into existing tags that already contain values, but
you will not be able to compress the data.)
c) STEP 3: Check existing archive files. Use tools such as the PI Archive
Manager plug-in in System Management Tools (SMT 3). Note the start time,
end time, and filename (including the path) of all archives within the
time range of the backfill data.
d) STEP 4: Make a backup of your PI Server including all archives you plan
to reprocess.
e) STEP 5: Reprocess old archives to create primary records for the new tags.
You need to reprocess any existing non-primary archives with dates within
the range of the backfill data. This creates primary records for the new
tags in those archives. In addition, you should reprocess them as dynamic
archives (using the “-d” switch) to allow the archives to accommodate new
data.


f) STEP 6: Create additional archives, as needed. If the data to be
backfilled includes values prior to the oldest archive, create a new
dynamic archive with a Start Time at or earlier than the oldest time
stamp, and an End Time equal to the start time of the current oldest
archive.
g) STEP 7: Clear the snapshot value for the new tags by deleting the
snapshot value for each tag at the time the point was created.
h) STEP 8: Verify that the oldest value is now in the snapshot for the new
tags. At this point the PI System is ready to accept the backfill data;
next, prepare the data itself and the piconfig script for input.

Tips on Backfilling:
1) Ensure the tag attributes are set properly.
2) Always run a backfill test with a small amount of data first, and then
do the rest of the data. This way you can verify the piconfig script and
make sure that the data is importing properly.
3) Check the archive and snapshot statistics during the test to see how the
backfilling affects PI Server performance.
4) We highly recommend, whenever possible, doing backfilling jobs on an
off-line PI Server to avoid excessive burden on the main production
server. This also offers an opportunity to verify the backfill is
successful without posing risk to the real data on the production PI
Server.

13.2 Configure DCS/PLC Alarms

A historian brings together information from several sources and can perform
calculations that are not easily done elsewhere. Some sites may have alarm
philosophies that enable them to take advantage of the historian to provide
alerts on these higher-level functions. Historian administrator shall
configure the historian to provide alarm capability for its points. The
alarm package shall include the following features:
a) Current value and archived alarm states;
b) Alarm groups to organize and manage alarms;
c) A simple alarm detection program for monitoring numeric, digital, and
string points;
d) Alarm client functionality to alert operators and other personnel to selected
alarms.


e) The Alarms release provides the basic server-side functions of an alarm
system. The alarm package shall have two parts:
1) The first part is the alarm point. Alarms are displayed and archived
as digital points. A monitoring program observes updates to
numeric, digital, and string points and then tests each for configured
alarm conditions.
2) The second part is the alarm group. A set of alarm points can be
organized into alarm groups. Statistics such as the number of alarm
points and the number of unacknowledged alarms can be obtained for
each alarm group. Groups can be members of other groups to form
alarm hierarchies.
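The two-part alarm package above (alarm points and nestable alarm groups) can be sketched as follows; the class and state names are illustrative, not the historian's actual object model:

```python
# Sketch: alarm points tracking a state and acknowledgement flag, and alarm
# groups that aggregate unacknowledged counts across nested groups.

class AlarmPoint:
    def __init__(self, name, state="Normal", acknowledged=True):
        self.name = name
        self.state = state
        self.acknowledged = acknowledged

class AlarmGroup:
    def __init__(self, name):
        self.name = name
        self.members = []  # AlarmPoint or nested AlarmGroup instances

    def add(self, member):
        self.members.append(member)

    def unacknowledged(self):
        """Count unacknowledged alarms in this group and all nested groups."""
        total = 0
        for m in self.members:
            if isinstance(m, AlarmGroup):
                total += m.unacknowledged()
            elif not m.acknowledged:
                total += 1
        return total
```

Because groups can contain other groups, statistics such as the unacknowledged count roll up naturally through an alarm hierarchy.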

14 Data Security

Historian administrator shall utilize Windows integrated security to manage
Historian Server authentication through Windows and Microsoft Active Directory
(AD). This security model improves historian server security, reduces
management workload, and provides users a single sign-on experience.

14.1 Historian Identities and Mapping Methods

a) Computer security has two parts: authentication (who is the user, and how
do we confirm that the user is really who he or she claims to be?) and
authorization (once we know who the user is, what is that user allowed
to do?).
b) The Windows integrated security model relies on Windows security for
authentication but provides its own authorization to historian objects.
This is accomplished through two structures: identities, for which
permissions are defined, and mappings, which map Windows users and groups
to identities. Historian administrator shall use the identities and
mappings methods within the historian environment. These are the central
components of the historian security model: they determine which Windows
users are authenticated on the historian and what access permissions they
have there (for example, is the user allowed to create a point? Run a
backup?). Each identity represents a set of access permissions on the
Historian Server, and each mapping points from a Windows user or group to
a historian identity.


c) An Identity represents a set of access permissions on the historian
Server. Each Mapping points from a Windows user or group to an identity.
d) Members of the Windows groups that are mapped to an identity are
automatically granted the access permissions for that identity. For
example, if an identity called Engineers has read/write access to the data
for the Test Tag point, and the Active Directory (AD) group Engineering
Team is mapped to Engineers, then all members of that AD group get
read/write permission for the point data.
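The identities-and-mappings model described above can be sketched as follows; the group and identity names echo the example in the text but are illustrative, and a real deployment would read them from AD and the historian configuration:

```python
# Sketch: resolve a user's effective permissions by mapping AD groups to
# historian identities, each of which carries a permission set.

MAPPINGS = {
    # AD group          -> historian identity
    "Engineering Team":    "Engineers",
    "Operations":          "General User",
    "Historian Admins":    "Admin",
}

PERMISSIONS = {
    # identity        -> permissions granted
    "General User":      {"read"},
    "Engineers":         {"read", "write"},
    "Admin":             {"read", "write", "admin"},
}

def effective_permissions(ad_groups):
    """Union of permissions from every identity the user's AD groups map to."""
    perms = set()
    for group in ad_groups:
        identity = MAPPINGS.get(group)
        if identity:
            perms |= PERMISSIONS[identity]
    return perms
```

A member of "Engineering Team" thus receives read/write access automatically, while a user with no mapped group receives no permissions at all.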

14.2 User Access Categories

PDRM recommends utilizing the new security features and defining a common
access policy across all Saudi Aramco historians. Historian administrator
shall create three identities for historian access, described as follows:
a) General User Identity: this identity/group shall get read-only access to
all Historian data points.
b) P&CSD Engineers Identity: this category shall get read/write access to
the entire point database and the modular database, allowing them to
create and delete modules and points. However, the P&CSD Engineers
category does not get permissions for administrative tasks, such as
managing identities, users, and groups.
c) Admin User Identity: the administrator category gets read/write access to
all Historian Server resources. IT and local PI support get these rights.

14.3 OPC Server Security

Historian administrator shall configure access to interfaces by defining
trusts. Since this access is conducted without human interaction, a trust is
required.
a) Each trust shall be defined against an identity that has the required
access permissions for that interface.
b) Historian administrator shall define interface identities separately;
these shall be used when trusts are defined for each interface machine.

15 System Performance

15.1 Hardware, CPU and Memory Requirements

a) Historian administration shall use Saudi Aramco standard hardware and
operating systems.
b) Historian administration shall use an advanced reliability-monitoring
tool that measures hardware problems and changes, and calculates a
stability index indicating overall system stability over time.
c) Historian shall be compatible with virtual machines.
d) Historian shall maintain high availability.
e) Historian shall not require dedicated hardware or a dedicated location.
f) The infrastructure shall allow efficient utilization of CPU and memory.
g) Ensure up-to-date security patches are installed during the lifetime of
the system.
h) Use a UPS (uninterruptible power supply).
i) Historian shall be installed on servers, not workstations.
j) Set and maintain hourly backups and store the backup in a different
location.
k) Monitor the performance of the hardware.

15.2 Procurement Procedure

a) Proponent shall request Saudi Aramco IT to install company standard
hardware and OS.
b) Proponent shall request P&CSD for procurement and licensing of
Historian software.

Revision Summary
27 November 2012 New Saudi Aramco Engineering Procedure.
3 December 2014 Editorial revision to transfer responsibility from Process Control to Process Optimization
Solutions Standards Committee.
