Authors:
Iaroslav Plutenko
Mykhailo Hodis
24 January 2019
Contents
Introduction
Linear Algebra in modern engineering
Singular Value Decomposition as one of the powerful methods of LA
Seismic Exploration and Math
Description of the industry and underlying principles
Basics of marine operations
Conditions of the survey
Data processing
Combating Noise
Linear Algebra in play
SVD in noise attenuation
Conclusion
List of figures
References
Introduction
Linear Algebra in modern engineering
Organizing and structuring information helps us simplify complex tasks in many
spheres of everyday life. Once such structures are defined, a problem can be handled far
more easily. Linear algebra is, in essence, the study of such structures [5].
The first recorded procedure for solving several linear equations simultaneously appears in
the ancient Chinese mathematical text The Nine Chapters on the Mathematical Art [4]. Wider
use of linear systems, however, began in 1637, after René Descartes introduced coordinates
into geometry.
Linear algebra plays a considerable role in almost every area of mathematics. In the
modern presentation of geometry, basic objects such as lines, planes and rotations are defined
in terms of linear algebra. It is applied in functional analysis to spaces of functions. Because
many natural phenomena can be modeled linearly, and computations with such models are
convenient, linear algebra is also used throughout the sciences and engineering. Even for
nonlinear systems, which cannot be modeled with linear algebra directly, it often provides a
first-order approximation.
Among the wide range of real-life applications of linear algebra, here are just a few:
● Load and displacements in structures.
● Compatibility in structures.
● Finite element analysis (mechanical, electrical, and thermodynamic applications).
● Stress and strain in more than 1-D.
● Mechanical vibrations.
● Current and voltage in LCR circuits.
● Small signals in nonlinear circuits (amplifiers).
● Flow in a network of pipes.
● Control theory (governs how state space systems evolve over time, discrete and
continuous).
● Control theory (optimal controller can be found using simple linear algebra).
● Control theory (Model Predictive control is heavily reliant on linear algebra).
● Computer vision (used to calibrate camera, stitch together stereo images).
● Machine learning (Support Vector Machine).
● Machine learning (Principal Component Analysis).
● Lots of optimization techniques rely on linear algebra as soon as the dimensionality
starts to increase.
● Fit an arbitrary polynomial to some data.
SVD has proved to be a computationally viable tool for solving a wide variety of problems
arising in practical applications. Among its uses are computing pseudoinverses, solving
homogeneous equations, finding the nearest orthogonal matrix, total least squares
minimization, and separable model decomposition. The common thread is that these
applications require knowing the rank of a matrix, approximating a matrix by matrices of
lower rank, and finding orthogonal complements and orthogonal projections onto the
associated subspaces. Such computations usually have to be performed in the presence of
impurities in the data – noise – and SVD turns out to be quite effective there.
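As an illustration of the low-rank machinery behind these applications, the sketch below estimates the largest singular value of a small matrix with power iteration on AᵀA. It is a toy example for intuition only – the matrix and the iteration count are our own arbitrary choices, and library SVD routines (such as the one used later in this report) are the proper tool in practice.

```java
public class Rank1Demo {

    // y = A x, for an m x n matrix A and an n-vector x
    static double[] matVec(double[][] a, double[] x) {
        double[] y = new double[a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < x.length; j++)
                y[i] += a[i][j] * x[j];
        return y;
    }

    // y = A^T x, for an m x n matrix A and an m-vector x
    static double[] matTVec(double[][] a, double[] x) {
        double[] y = new double[a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < y.length; j++)
                y[j] += a[i][j] * x[i];
        return y;
    }

    static double norm(double[] v) {
        double s = 0;
        for (double x : v) s += x * x;
        return Math.sqrt(s);
    }

    // Power iteration v <- normalize(A^T A v) converges to the top right
    // singular vector; |A v| is then the largest singular value sigma1.
    static double topSingularValue(double[][] a) {
        double[] v = new double[a[0].length];
        v[0] = 1;
        for (int it = 0; it < 200; it++) {
            double[] w = matTVec(a, matVec(a, v));
            double nw = norm(w);
            for (int j = 0; j < v.length; j++) v[j] = w[j] / nw;
        }
        return norm(matVec(a, v));
    }

    public static void main(String[] args) {
        // A nearly rank-1 matrix: its largest singular value dominates
        double[][] a = {{2, 4}, {1, 2}, {0, 0.1}};
        System.out.printf("largest singular value ~ %.4f%n", topSingularValue(a));
    }
}
```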
Seismic Exploration and Math
Description of the industry and underlying principles
One of the important scientific and engineering areas employing extensive processing of
input data is seismic surveying.
In seismic surveying, sound waves are mechanically generated and sent into the earth.
Some of this energy is reflected back to recording sensors, measuring devices that record
accurately the strength of this energy and the time it has taken for this energy to travel through
the various layers in the earth's crust and back to the locations of the sensors. These recordings
are then taken and, using specialized seismic data processing, are transformed into visual images
of the subsurface of the earth in the seismic survey area. Just as doctors use x-rays and audio- or
sonograms to “see” into the human body indirectly, geoscientists use seismic surveying to obtain
a picture of the structure and nature of the rock layers indirectly.
Seismic surveys are conducted for a variety of reasons. They are used to check
foundations for roads, buildings and large structures, such as bridges. They can help detect
groundwater. They can be used to assess where coal and minerals are located. One of the most
common uses of seismic data is in connection with the exploration, development, and production
of oil and gas reserves to map potential and known hydrocarbon-bearing formations and the
geologic structures that surround them. Most commercial seismic surveying is conducted for this
purpose. Oil & gas exploration and production is conducted in many places on the earth's
surface, in both the onshore (land) and offshore (marine) domains. Although the principles are
identical, the operational details differ between the two domains. In this overview, only marine
operations will be addressed.
Marine seismic acquisition is the most common method for offshore exploration. In most
marine work, the sensor is a hydrophone that detects the pressure fluctuations in the water caused
by the reflected sound waves. The cable containing the hydrophones, called a streamer, is towed
or ‘streamed’ behind a moving vessel. These streamers are typically 3 to 8 kilometers long,
although they can be up to 12 kilometers long, depending on the depth of the geophysical
target being investigated.
Acquisition takes place with large seismic vessels towing one or more airgun arrays
behind the vessel. This equipment produces a seismic signal by releasing bursts of highly
pressurized air into the seawater. The receiving devices are towed behind the ship, along with
the airguns, in one or several long streamers that are conventionally 6-9 kilometers long. The
figure below shows the schematics of a marine seismic survey with one seismic vessel, two
airgun arrays and multiple streamers towed behind the vessel. [6] [10]
Towed streamer operations represent the most significant commercial activity, followed
by ocean bottom seismic survey (including arrays placed on the seafloor and arrays buried a
meter or so below the seafloor).
Figure 3. Marine seismic data acquisition set-up towed by vessel.
When energy from a sound source is released in the marine environment, pressure waves
are created in the water column. The magnitude of the pressure is called amplitude, and the
excited waves are P-waves, or compressional waves. To a first approximation, water
propagates only P-waves, and the sensors that make accurate measurements of the amplitudes
of P-waves are hydrophones. The velocity of sound in seawater and the density of seawater
can vary as a result of changes in salinity, temperature, and gas and sediment content; under
certain hydrographic conditions, layers can form that reflect P-waves and that can also trap
certain frequencies of P-waves. In this latter case, the trapping layer is called a waveguide.
Although the terms amplitude (pressure) and energy are often used interchangeably (as in “the
P-wave pressure, or the P-wave energy”), energy is proportional to the square of the amplitude.
Rocks underlying the seafloor have rigidity; water does not. When P-waves enter the
rock, they can be transmitted and reflected as in water, but they can also convert to S-waves,
or shear waves. It is impossible for P-waves to propagate in rocks without mode-converting
(converting from P-wave mode to S-wave mode) to S-waves, but most seismic surveying is
accomplished using pressure sensors in the water column, so no direct S-waves are recorded
in that situation. However, S-waves contain information of use to geoscientists that is not
contained in P-waves, so it is sometimes advantageous to record S-waves. This can be done
by placing sensors on the seafloor and capturing the S-wave energy that has been created by
the initial production of P-waves from the marine source. [2]
Within a given exploration zone, the details of a specific survey operation can vary
enormously. There are, however, two principal categories of seismic surveying. These are two-
dimensional (2D) seismic surveys and three-dimensional (3D) seismic surveys. 2D can be
described as a fairly basic survey method, which, although somewhat simplistic in its underlying
assumptions, has been and still is used very effectively to find oil & gas.
A sub-category of 2D is the site survey where ultra-high-resolution data is acquired in the
immediate vicinity of an intended well to identify both seabed and shallow subsurface hazards.
Ultra-high resolution here means that the survey is intended to provide more detailed information
about the seafloor and the conditions of the rock down to a depth of a few hundred metres
beneath the seafloor. 3D surveying is a more complex method of seismic surveying than 2D and
involves greater investment and much more sophisticated equipment than 2D surveying. Until
the beginning of the 1980s, 2D work dominated in oil & gas exploration, but 3D became the
dominant survey technique in the late 80s with the introduction of improved streamer towing and
positioning technologies.
4D surveys (or time-lapse 3D) are simply 3D surveys which are repeated over the same
area, some period of time elapsing between the initial survey and the subsequent surveys. There
might be several repeated surveys, depending on the specific oil or gas field in question. The
purpose of this type of survey is to obtain images of how the hydrocarbon reservoir is changing
over time due to production in order to maximize hydrocarbon recovery from the field. 4D
surveys have become increasingly used since the mid-1990s, and now represent a significant
percentage of overall seismic activity.
More recently, increasingly sophisticated towed streamer acquisition schemes –
multi-azimuth, wide azimuth and rich azimuth – have been developed to provide improved
subsurface imaging in geologically and geophysically challenging environments.
In 1984 the first twin streamer operation was undertaken, which effectively doubled the
data acquisition efficiency of the vessel by generating two subsurface lines per vessel sail line.
By moving to twin source/twin streamer configurations in 1985, the output was increased to four
subsurface lines per vessel sail line or pass. The next logical step of towing three streamers and
two sources behind a single vessel, thus acquiring six lines per pass, was achieved in 1990. The
number of deployed streamers has consistently increased with as many as 16 streamers having
been towed.
Multi-streamer operations require a significant amount of in-sea equipment - the 16-
streamer operation referred to above entailed 72 kilometers of cable being towed behind the
vessel.
Consequently, the back deck of the vessel becomes very busy due to the activity involved
in handling equipment including streamers, sources and the related control devices.
Organizing and operating such a set-up in a safe and efficient manner requires a very
high level of knowledge and skill. [2]
In general, seismic surveys are planned to be acquired in calm weather, to minimize the
amount of extraneous noise recorded along with the primary signals. This noise increases with
increasing sea state and most companies specify how much measured noise is acceptable during
the acquisition of the data. If the prevailing conditions lead to this level being exceeded, the
acquisition is stopped. If conditions become excessive, then the streamers and source arrays may
have to be recovered. The vessel will “ride out the storm” on location or move to more sheltered
waters, whichever is the safer and better operational option; the vessel crew's safety being the
overriding concern.
Data processing
Initially, the processing team's schedule is linked with the acquisition phase of the
operation. At the beginning of the survey, the vessels record a few test lines that either track
previous surveys or pass near wells offering check-shot data. These data are transported ashore
by helicopter or supply boat and conveyed to the processing team, which uses them to review
data quality and select parameters for beginning the process.
After the survey is finished, perhaps eight weeks after it began, acquisition and
processing contractors continue working in parallel. While the processing team begins its
massive numerical manipulations, the acquisition contractor performs a series of no less
important navigational computations. These determine the exact position, for every shot in the
survey, of the survey vessels, airgun arrays and tailbuoys at the end of the streamers, and also the
precise shape of the streamers themselves.
The navigational results are then dispatched to the processing team which merges them
with the seismic data to perform stacking. This is a pivotal moment in the six-month saga.
Throughout its task, the processing team focuses on two main goals: enhancing signal at
the expense of noise, and shifting acoustic reflectors as seen on sections to nearer their true
position. Stacking is a key signal enhancement technique, and unlike some of the other noise
reduction algorithms that processors use, it is intimately tied to how data are acquired.
Stacking is the averaging of many seismic traces that reflect from common points in the
subsurface. Each trace is assumed to contain the same signal but different random noise, so
averaging them enhances the former while minimizing the latter.
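This averaging effect can be sketched numerically. The following toy example (our own construction, not part of any production processing flow) stacks synthetic traces that share a sine-shaped signal but carry independent Gaussian noise, and shows how the residual noise shrinks with the number of stacked traces:

```java
import java.util.Random;

public class StackDemo {

    // Root-mean-square difference between a trace and the clean signal
    static double rmsError(double[] trace, double[] signal) {
        double s = 0;
        for (int i = 0; i < trace.length; i++) {
            double d = trace[i] - signal[i];
            s += d * d;
        }
        return Math.sqrt(s / trace.length);
    }

    // Average (stack) nTraces noisy copies of the same signal
    static double[] stack(double[] signal, int nTraces, double noiseAmp, long seed) {
        Random rnd = new Random(seed);
        double[] sum = new double[signal.length];
        for (int t = 0; t < nTraces; t++)
            for (int i = 0; i < signal.length; i++)
                sum[i] += signal[i] + noiseAmp * rnd.nextGaussian();
        for (int i = 0; i < sum.length; i++) sum[i] /= nTraces;
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000;
        double[] signal = new double[n];
        for (int i = 0; i < n; i++) signal[i] = Math.sin(0.05 * i); // common reflection signal
        double[] one = stack(signal, 1, 1.0, 42L);    // a single noisy trace
        double[] fifty = stack(signal, 50, 1.0, 42L); // a 50-fold stack
        System.out.printf("RMS error, 1 trace:   %.3f%n", rmsError(one, signal));
        System.out.printf("RMS error, 50 traces: %.3f%n", rmsError(fifty, signal));
        // random noise shrinks roughly by sqrt(50), about 7x, after stacking
    }
}
```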
Each trace in a set of traces having a common midpoint, called a CMP gather, is
necessarily recorded with a different airgun-hydrophone group spacing, or offset – this varies from
about 50 meters for the so-called near trace, to several kilometers for the farthest trace. As offset
increases, the two-way reflection time also increases because the sound has a longer path to
travel. This effect, called moveout, produces a downward curving hyperbola on the gather.
Moveout must be corrected for before the traces are stacked, otherwise the reflections will not
sum constructively. In the normal moveout (NMO) correction, every trace is converted to zero
offset giving all traces a consistent set of two-way times to the reflectors.
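The moveout geometry can be written down directly: the two-way time at offset x is t(x) = sqrt(t0² + x²/v²), so the NMO correction recovers t0 = sqrt(t(x)² − x²/v²). A minimal sketch of both formulas, with an assumed velocity of 1500 m/s and illustrative offsets:

```java
public class NmoDemo {

    // Two-way reflection time at a given offset, for zero-offset time t0 (s),
    // offset (m) and velocity (m/s): the moveout hyperbola
    static double travelTime(double t0, double offset, double velocity) {
        return Math.sqrt(t0 * t0 + (offset * offset) / (velocity * velocity));
    }

    // NMO correction: recover the zero-offset time from the observed time
    static double zeroOffsetTime(double t, double offset, double velocity) {
        return Math.sqrt(t * t - (offset * offset) / (velocity * velocity));
    }

    public static void main(String[] args) {
        double t0 = 2.0;   // s, zero-offset two-way time (illustrative)
        double v = 1500.0; // m/s, assumed velocity of sound in seawater
        for (double x : new double[]{50, 1000, 3000}) {
            double t = travelTime(t0, x, v);
            System.out.printf("offset %5.0f m: t = %.4f s, moveout = %.4f s%n",
                    x, t, t - t0);
        }
    }
}
```

The printed moveout grows with offset, which is exactly the downward-curving hyperbola described above; applying zeroOffsetTime to each trace flattens the gather before stacking.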
Stacking, with its accompanying velocity analysis and interpolation, is pivotal because it
cuts the amount of data manyfold. It divides the processing into two parts – up to and including
stack, and after stack. Processing before stack takes at least 70 percent of the team’s time and
resources, because of the amount of data and analysis involved. Going back to correct mistakes
and data errors or changing processing parameters and then restacking is expensive. Post-stack
processing, while no less critical to the appearance of the final product, can be less intense
because the data set is reduced, and reprocessing using different processes and parameters costs
less. [2]
Combating Noise
Noise contamination is a common occurrence in seismic reflection, producing very striking
features in the seismograms and hindering data processing and interpretation.
The attenuation of seismic noise is a challenging task. Various techniques are utilized on
each level of acquisition, starting from hardware, continuing with real time digital processing
and finishing with high-level comprehensive onshore post-processing.
Frequency filters are commonly employed, but they often do not give good results. The
character of the noise depends mainly on the type of data researchers are working with. In land
data, the most common noise is the ground roll, which has low frequencies and high amplitudes,
whereas in marine data (in shallow-water acquisition), head waves and harmonic modes are
linear, dispersive events that mask part of the reflections of interest, influencing the delimitation
of lithological layers. In this study we used the Singular Value Decomposition (SVD) method
to mitigate some types of noise in seismic reflection and to create alternatives for interpretation
based on the different frequency content present in a seismic section. This approach results in a
technique that can identify and mitigate the unwanted events in the seismograms, always trying
to preserve or enhance the signal of interest.
In general, noise is divided into two categories: coherent and random. Coherent
noise can be followed and predicted over a number of traces, while random noise is
unpredictable. In a general way, seismic data x can be decomposed into the sum of signal s and
noise n, as follows:
x = s + n
For our work we made this distinction by observer perception, which is less scientific
but sufficient for demonstration purposes.
For the current project an unprocessed, noise-contaminated sample of a Coil Shooting
survey was obtained (courtesy of Nick Moldoveanu, Global Geophysical Advisor at
Schlumberger): a file in SEG-Y format, 15 s record length, 2 ms sampling rate, 2560 traces per
shot record, 3 records. A visualization of these seismograms was shown earlier. The useful
signal is a thin tilted streak across the upper part of the image, while the entire image is heavily
contaminated by ripple-like noise.
This 233 MB file contains 7680 traces; each trace has 7681 values – samples – according
to the 2 ms sampling rate, for a total duration of 15 s. This division makes the corresponding
matrix almost square; however, we will take a smaller fragment to reduce computational time
and to demonstrate the results better. As stated earlier, there are three records in the file of
2560 traces each, but taking even fewer traces will be enough for the demo.
According to the SEG Technical Standards Committee standards, a SEG-Y file starts with
a 3200-byte textual header and a 400-byte binary file header; it may also contain optional
extended headers (not present in our file), followed by the trace data. Each trace has a 240-byte
header; the length of a trace and the type of its data are specified in the headers. Most often the
samples are 4-byte values (equivalent to the float primitive type in Java).
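A minimal sketch of this trace layout, using an in-memory buffer instead of a real file. The header bytes are left blank here, and only 4-byte big-endian IEEE floats are assumed; real SEG-Y files may instead use IBM floats or other sample formats indicated by the binary header, which a full reader must handle.

```java
import java.nio.ByteBuffer;

public class TraceParseDemo {

    static final int TRACE_HEADER_BYTES = 240;

    // Parse one trace: skip its 240-byte header, then read nSamples
    // 4-byte big-endian IEEE floats (ByteBuffer is big-endian by default,
    // which matches the SEG-Y byte order).
    static float[] readTrace(ByteBuffer buf, int nSamples) {
        buf.position(buf.position() + TRACE_HEADER_BYTES); // skip trace header
        float[] samples = new float[nSamples];
        for (int i = 0; i < nSamples; i++) samples[i] = buf.getFloat();
        return samples;
    }

    public static void main(String[] args) {
        // Build a fake in-memory trace: 240 header bytes plus 4 samples
        float[] original = {0.5f, -1.0f, 2.0f, 0.0f};
        ByteBuffer buf = ByteBuffer.allocate(TRACE_HEADER_BYTES + 4 * original.length);
        buf.position(TRACE_HEADER_BYTES);
        for (float s : original) buf.putFloat(s);
        buf.rewind();
        float[] parsed = readTrace(buf, original.length);
        System.out.println(java.util.Arrays.toString(parsed));
    }
}
```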
Software for viewing SEG-Y files and performing simple manipulations on them is
available for free under GNU and other licenses. We used SeismiGraphix, a cross-platform
standalone application written in Java that can read and view the SEG-Y format and open and
edit headers. Its author and developer, Abel Surace, started his career in the oil and gas
exploration industry and then became a software engineer developing and maintaining
applications for seismic and well data management used by several oil and gas companies in
Canada, the USA and worldwide.
To perform matrix operations on SEG-Y data, we first need to retrieve the data from the
file and put it into a matrix. There are plenty of dedicated libraries for Python, but for Java they
are scarce. In our example we read the file, place the file header(s) in memory, then read the
trace data (each trace header is also stored in memory) into an array of float primitives (4 bytes
each) and convert it into a matrix structure.
The obtained matrix undergoes Singular Value Decomposition. All vector and matrix
types and operations are provided by the org.apache.commons.math3 library. Unfortunately, in
our set-up SVD is a resource-consuming operation for large matrices – that was the main reason
we did not process the entire file and reduced it to a fragment.
After the decomposition we construct the low-rank matrices corresponding to each
singular value. The first low-rank matrix, based on the largest singular value, is subtracted from
the initial matrix. As explained earlier, swell noise corresponds to the largest singular values, so
by subtraction we literally remove it from the source data.
The resulting matrix is converted back into a SEG-Y file. Each column gets back the
trace header stored in memory, and the entire matrix is flushed into a file prepended by the file
headers. Such a file can be read by SeismiGraphix to visualize the results after each iteration.
On the second cycle, the low-rank matrix for the second singular value is built and
subtracted from the matrix obtained after the first cycle. Thus we have a matrix additionally
cleared of swell noise. This matrix is also flushed into the next file to visualize the results.
It was determined empirically that the best results are achieved at about 50 cycles. More
iterations lead to signal deterioration. In the practical method, manual operations such as
filtering and refining should be performed after each cycle, but we skipped them, using only
low-rank matrix calculations and subtractions. Our goal was to demonstrate the viability of the
mathematical method for swell noise attenuation.
In summary, the routine performs the following steps:
1. The original SEG-Y file is opened for reading.
2. The textual file header and the binary file header are read and stored in a memory buffer.
3. A loop reads the trace data. Trace headers are copied into an array of byte buffers so that
they can be retrieved later for writing. Trace data are copied into a two-dimensional
array. 1000 traces are taken to constitute the example for further work.
4. Upon completion of the loop, the array is converted to a matrix.
5. The matrix is decomposed (SVD).
6. A new loop is started (number of iterations = 51). Each iteration produces a low-rank
matrix, starting from the largest singular value. This matrix is subtracted from the
original matrix first. The result is written into a file, using the stored trace and file
headers to comply with the file format. Each consecutive iteration produces a matrix
without the first k largest singular values.
package com.la.marine;

import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.linear.RealVector;
import org.apache.commons.math3.linear.SingularValueDecomposition;
import java.io.*;
import java.nio.ByteBuffer;

public class SwellNoiseAttenuation {

    public static void main(String[] args) {
        int width = 1000;    // number of traces taken for the demo
        int height = 7681;   // samples per trace (2 ms sampling, 15 s record)
        ByteBuffer initHeaders = ByteBuffer.allocate(3600);  // 3200-byte textual + 400-byte binary file header
        ByteBuffer[] traceHeaders = new ByteBuffer[width];   // one 240-byte header per trace
        double[][] data = new double[height][width];

        try {
            DataInputStream dataInputStream = new DataInputStream(
                    new BufferedInputStream(new FileInputStream("source.segy")));
            dataInputStream.readFully(initHeaders.array());  // keep file headers for the output files
            for (int i = 0; i < width; i++) {                // read each trace: header, then samples
                traceHeaders[i] = ByteBuffer.allocate(240);
                dataInputStream.readFully(traceHeaders[i].array());
                for (int j = 0; j < height; j++) {
                    data[j][i] = dataInputStream.readFloat();
                }
            }
            dataInputStream.close();
            System.out.println("Read completed");

            RealMatrix mx = new Array2DRowRealMatrix(data, false);
            SingularValueDecomposition svd = new SingularValueDecomposition(mx);
            double[] singularValues = svd.getSingularValues();

            RealVector uk;
            RealVector vk;
            RealMatrix ak;
            RealMatrix akPrev = mx.copy();

            for (int k = 0; k < 51; k++) {
                // Rank-1 matrix for the k-th (largest remaining) singular value...
                uk = svd.getU().getColumnVector(k);
                vk = svd.getV().getColumnVector(k);
                ak = uk.outerProduct(vk).scalarMultiply(singularValues[k]);
                // ...is subtracted from the running result, removing the swell-noise component
                akPrev = akPrev.subtract(ak);

                // Flush the intermediate result into a valid SEG-Y file for visualization
                DataOutputStream dataOutputStream =
                        new DataOutputStream(
                                new FileOutputStream("from_matrix_back" + k + ".segy"));
                dataOutputStream.write(initHeaders.array());
                for (int i = 0; i < width; i++) {
                    dataOutputStream.write(traceHeaders[i].array());
                    for (int j = 0; j < height; j++) {
                        dataOutputStream.writeFloat((float) akPrev.getEntry(j, i));
                    }
                }
                dataOutputStream.close();
                System.out.println("Processed #" + k);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
As a side effect, the dead (black) traces are visually removed, which in the later
images is a useful function.
However, one can notice that the signal starts to deteriorate – instead of a sine pattern it
acquires sharp edges. So we stopped the loop at the 51st iteration, assuming it is the best result
we can achieve with pure SVD.
Further enhancement of the signal should be conducted by other methods; in fact, they
are supposed to be applied at all steps between iterations.
Conclusion
We applied Singular Value Decomposition to digital seismic data in order to remove or
attenuate strong coherent swell noise, which turned out to be represented by the large singular
values. In our experience we achieved the anticipated results, producing visualized records
largely stripped of the unwanted ripple and with a distinct signal.
The drawback of this method is that it still requires human supervision to evaluate
preliminary and intermediate results and to adjust them with other techniques. For large data
sets containing millions of records, this is hard toil that still wants automation. No doubt this
task will be solved in the near future with specialized algorithms and neural network-based
solutions, once they become faster and smarter. The seismic acquisition industry will likely
evolve towards greater onboard computing power and better communication networks to
exchange data between vessel, shore, data storage and computation centers, forming an
integrated system like those being developed for the space and military industries.
In the short term, SVD for noise attenuation will require highly customized software
that can store and manipulate intermediate results within acceptable timelines; this may require
dedicated hardware as well.
Actually, there is no need to subtract each low-rank matrix iteratively – we did it to
produce and visualize intermediate results after each iteration. A second reason is that, in our
circumstances, it gave better results than the alternative approach, perhaps due to better control
over the process, supervision of intermediate steps and work with the raw data.
When the number of discarded singular values is known in advance, the required matrix
can be obtained by re-multiplying A = UΣVᵀ with the chosen set of large singular values zeroed
out. Python libraries can do this relatively quickly, as Python is a recognized language for
researchers, scientists and engineers.
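A small self-contained sketch of this re-multiplication, using a hand-built 2x2 decomposition (the singular vectors and values below are our own illustrative choices, not data from the survey): the largest singular value stands in for the swell noise, is zeroed out, and the remaining components are summed back into the denoised matrix.

```java
public class TruncateDemo {

    // Outer product sigma * u * v^T: one rank-1 component of A = U Sigma V^T
    static double[][] rank1(double sigma, double[] u, double[] v) {
        double[][] r = new double[u.length][v.length];
        for (int i = 0; i < u.length; i++)
            for (int j = 0; j < v.length; j++)
                r[i][j] = sigma * u[i] * v[j];
        return r;
    }

    // Rebuild A keeping only the singular values with keep[k] == true.
    // u[k] and v[k] hold the k-th left and right singular vectors.
    static double[][] rebuild(double[] sigma, double[][] u, double[][] v, boolean[] keep) {
        int m = u[0].length, n = v[0].length;
        double[][] a = new double[m][n];
        for (int k = 0; k < sigma.length; k++) {
            if (!keep[k]) continue;
            double[][] r = rank1(sigma[k], u[k], v[k]);
            for (int i = 0; i < m; i++)
                for (int j = 0; j < n; j++)
                    a[i][j] += r[i][j];
        }
        return a;
    }

    public static void main(String[] args) {
        double s = Math.sqrt(0.5);
        double[] sigma = {10, 1};          // sigma1 plays the role of coherent "swell noise"
        double[][] u = {{1, 0}, {0, 1}};   // left singular vectors
        double[][] v = {{s, s}, {s, -s}};  // right singular vectors (orthonormal)
        // Drop the largest singular value, keep the rest
        double[][] denoised = rebuild(sigma, u, v, new boolean[]{false, true});
        for (double[] row : denoised) System.out.println(java.util.Arrays.toString(row));
    }
}
```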
List of figures
Figure 1. SVD components shapes.
Figure 2. SVD decomposition components influence on some unit disc.
Figure 3. Marine seismic data acquisition set-up towed by vessel.
Figure 4. Coil Shooting schematics.
Figure 5. Noise classification.
Figure 6. Visualisation of raw seismic data sample in SeismiGraphix.
Figure 7. Eigenimages with coherent pattern and random noise.
Figure 8. Workflow of noise attenuation by SVD.
Figure 9. Structure of SEGY file.
Figure 10. Result after 1st iteration.
Figure 11. Result after 2nd iteration.
Figure 12. Result after 5th iteration.
Figure 13. Result after 10th iteration.
Figure 14. Result after 23rd iteration.
Figure 15. Result after 40th iteration.
Figure 16. Normalization applied on the image from 1st iteration.
Figure 17. Results of 51st iteration (with normalization).
Figure 18. Trace view mode on the raw data image.
Figure 19. Trace view mode on the data of the last iteration.
References
[1] Bekara, M. Local singular value decomposition for signal enhancement of seismic
data. Geophysics, 2007.
[2] Borehav, D., Kingston, J., Shaw, P., Zeelst, J. 3D Marine Seismic Data
Processing. 1991.
[3] Dondurur, D. Acquisition and Processing of Marine Seismic Data.
[4] Hart, R. The Chinese Roots of Linear Algebra. JHU Press, 2007.
[5] Cherney, D., Denton, T., Thomas, R., Waldron, A. Linear Algebra.
[6] Magnussen, F. De-blending of marine seismic hydrophone and multicomponent
data. University of Oslo, June 2015.
[7] Moldoveanu, N. Attenuation of high energy marine towed-streamer noise. In: 2011
SEG Annual Meeting. Society of Exploration Geophysicists, 2011.
[8] Oropeza, V. The Singular Spectrum Analysis method and its application to seismic
data denoising and reconstruction. University of Alberta, 2010.
[9] Roodaki, A., Bouquard, G., Bouhdiche, O., et al. SVD-based Hydrophone Driven
Shear Noise Attenuation for Shallow Water OBS. In: 78th EAGE Conference and
Exhibition, 31 May 2016.
[10] Saeed, B.S. De-noising seismic data by Empirical Mode Decomposition.