
SPE-197382-MS

A Generic Method For Acoustic Processing Using Deep Learning

Saad Kisra, Schlumberger; Bassem Khadhraoui, TOTAL; Sebastien Sable, Schlumberger; Lam Lu Duc Duong,
EURECOM

Copyright 2019, Society of Petroleum Engineers

This paper was prepared for presentation at the Abu Dhabi International Petroleum Exhibition & Conference held in Abu Dhabi, UAE, 11-14 November 2019.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
A new method based on deep learning enables the extraction of formation compressional and shear
slownesses from raw waveforms acquired by an acoustic tool regardless of its conveyance system or of its
hardware configuration (number of axial receivers, waveform sampling rate, or number of time samples).
The proposed approach is very fast, fully automated, and suitable for real-time processing workflows at
the wellsite.
Over the years, a large collection of acoustic waveforms has been recorded and processed by experts in
a variety of environments. In the proposed method, we apply a convolutional neural network (also known
as ConvNet or CNN) to learn from previously processed data to estimate acoustic slownesses from raw
waveforms. Because we use an algorithm that is originally designed for visual recognition, we transform
the raw waveforms into images with enhanced characteristics that are directly associated with the acoustic
slownesses that we are aiming to predict. For monopole waveforms, we were able to improve the prediction
results using a short-term average/long-term average (STA/LTA) technique that enhances the main arrivals.
We then train a CNN model with both the input images and the expected outputs (i.e., slowness values)
on a large variety of data covering the main rock environments of interest. The CNN-trained model is
subsequently used to estimate the slowness values from unprocessed waveforms never seen by the CNN
model.
To test our method on real data, we gathered a collection of acoustic waveforms recorded by several
acoustic tools in 20 wells, drilled in different fields across the world. The wells were drilled with different
bit sizes (varying from 6 to 17.5 in.); the compressional slowness ranged from 50 to 165 μs/ft and the
shear slowness ranged from 80 to 600 μs/ft. In total, we used 96,011 data points, where each data
point pairs a waveform array with the associated slowness value calculated by an acoustics
expert. We then applied the CNN-trained model to a set of waveforms from validation wells, i.e., wells that
were not part of the training dataset.
In these wells, previously unseen by the ConvNet model, the average absolute error between the slowness
estimated using the CNN-trained model and the slowness calculated by an expert was less than 3 μs/ft, which
is comparable to the error associated with state-of-the-art processing techniques for slowness estimation.
We also discuss how our method can be extended to estimate shear slowness from dipole data using the same
ConvNet. The deep-learning-based technique for slowness estimation that we describe can run extremely
fast after the ConvNet training process is completed and provides good slowness results without prior
information about the tool configuration or the environment in which waveforms are recorded. Because
our technique is fully automated, it can also be used as an automatic quality control (QC) flag for wellsite
processing and real-time operations.

Introduction
The main objective of acoustic logging is to measure the speed of sound in the borehole formation rock and
help estimate formation properties that are key to the development and production of an oil field. Acoustic
logging is achieved through multiple conveyance systems (wireline, logging-while-drilling, etc.) and has a
wide range of applications ranging from reserve estimation to geophysical and geomechanical applications
for well drilling and completion. The process of collecting acoustic data at sonic frequency range using a
variety of transmitters is now well established. The most commonly used transmitter is the omnidirectional
monopole transmitter that can generate body waves such as compressional waves (or P-waves) and shear
waves (or S-waves) in addition to multiple borehole modes such as the pseudo-Rayleigh waves and Stoneley
waves. The monopole source is generally the preferred transmitter for exciting the P-wave and extracting
the compressional slowness of the rock. It is also one of the preferred transmitters for exciting the S-wave
and calculating the shear slowness in fast formations, that is, formations in which the shear wave travels
faster than the borehole fluid wave. In slow formations, that is, formations in which the shear wave travels
slower than the borehole fluid wave, the monopole transmitter is not suitable for extracting the shear
slowness, and the use of a dipole or a quadrupole transmitter is necessary.
The acquired acoustic waveforms are processed generally at the wellsite, either in the acquisition unit or
downhole, to produce the compressional and shear slownesses as well as other derivatives. There are many
methods for achieving this task that have been used by the industry. One of the oldest and most widely
used methods for extracting compressional and shear slownesses from waveforms generated by a monopole
transmitter is the slowness time coherence (STC) technique. This method is based on the calculation of the scalar
semblance for a large number of possible arrival times and slownesses (Kimball and Marzetta 1984). Extracting the
shear slowness from the flexural mode generated by a dipole transmitter is slightly more sophisticated
but is nevertheless considered a standard procedure (Kimball 1998). One of the major drawbacks of these
methods is that they contain many parameters that expert users must adjust depending on the different
formation types traversed by the borehole. In practice, the users must often create multiple processing zones,
for which the parameters are manually set. This process is generally time consuming and user dependent
and can lead to mistakes that require reprocessing in subsequent phases. There is therefore a strong need for an
automatic tool to provide quality control (QC) for the processing results.
Over the years, oil and gas companies have gathered an enormous collection of historical acoustic data
processed by experts to generate multiple outputs of interest, most notably the compressional and shear
slownesses that we are after. So, the principle of our proposed approach is to train deep learning models
using these historical data and to apply those models, in a fully automated manner, to the task of estimating
compressional and shear slownesses from unprocessed acoustic waveforms.

Deep Learning for Slowness Processing


Machine learning technology has been successfully adapted to an increasing number of industries ranging
from healthcare, automotive, and financial services to construction and manufacturing. One of the most
popular classes of machine learning models is the convolutional neural network (ConvNet), which has enjoyed
great success in large-scale image recognition (Goodfellow et al. 2016). We therefore considered the problem of
adapting some of these ConvNets to the automation of slowness estimation from acoustic tools. Goodfellow
et al. (2014) applied deep ConvNets (11 layers) to the task of street number recognition and showed that
the increased depth led to better performance. The deep ConvNets of Simonyan and Zisserman (2015),
called VGG (after the Visual Geometry Group at Oxford), took top places in the ImageNet Challenge
2014 with deep architectures (11 to 19 weight layers) built from very small 3×3 convolution filters.
They also made their two best-performing ConvNet models publicly available, making it easier for other
researchers to integrate them into public libraries such as TensorFlow (Abadi et al. 2015) and to provide faster
implementations that leverage the large advances in GPU computing (Simard et al. 2003).
We started by testing some of these ConvNets on variable density log (VDL) images, which are 2D black
and white images, and although the preliminary results were far from accurate, they were promising. Rather
than focusing on the best ConvNet architecture for estimating slowness from this image, we
focused our attention on the best preprocessing steps for generating an image before running the ConvNet
on it. The goal of applying preprocessing to both the training and the test datasets was to put each example
into a more canonical form to reduce the amount of variation that the model must account for. Reducing the
amount of variation in the data can reduce both the generalization error and the size of the model needed
to fit the training set. The main workflow for our preprocessing is shown in Fig. 1, and our deep learning
workflow is summarized in Fig. 2.

Figure 1—The preprocessing workflow combines filtering techniques with
harmonization steps that generate a fixed-size image for all acoustic tools.

Figure 2—The main components of our 11-layer deep learning network
consist of convolution layers, maxpool layers and fully connected layers.

Deep Learning Model Architecture


Defining the ConvNet model architecture consists of deciding the number of layers, the number of filters
in each layer, the size of the filters, and the computational operators to use as activation functions, as well as
positioning the layers in the network effectively to solve the given task. Fig. 2 shows that we use an 11-layer
deep neural network inspired by the VGG-11 model suggested by Simonyan and Zisserman (2015).
We use the rectified linear unit (ReLU) function in all hidden layers (Krizhevsky et al. 2012). For
the learning mechanism, which defines how the weights and biases in the model's filters are adjusted in
the direction that optimizes the objective function, we use stochastic gradient descent (SGD) with
momentum, which leads to faster convergence than the basic SGD algorithm (Rumelhart et al. 1986).
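The SGD-with-momentum update mentioned above can be illustrated with a minimal sketch in plain Python. The learning rate, the momentum coefficient, and the toy quadratic objective are assumptions chosen for illustration only; they are not the settings used in this work.

```python
def sgd_momentum_step(weights, grads, velocity, lr=0.05, momentum=0.9):
    """Apply one SGD-with-momentum update and return (weights, velocity)."""
    # The velocity accumulates a decaying sum of past gradients.
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def minimize_toy_objective(steps=300, lr=0.05, momentum=0.9):
    """Minimize f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2 (hypothetical objective)."""
    weights, velocity = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        # Analytical gradient of the toy objective.
        grads = [2.0 * (weights[0] - 3.0), 20.0 * (weights[1] + 1.0)]
        weights, velocity = sgd_momentum_step(weights, grads, velocity, lr, momentum)
    return weights


final_weights = minimize_toy_objective()
```

In a real network the gradients come from backpropagation over mini-batches rather than from an analytical formula.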
We also use maxpooling, which is widely used in modern ConvNets. By taking the maximum of a
cluster of values in a particular window, maxpooling can improve the statistical efficiency of the network
and make it more robust to small translations in the inputs (Goodfellow et al. 2016). In the final steps of our
deep network, we use fully connected (FC) layers with connections to all activations in the previous layer.
FC-1024 represents a fully connected layer with 1,024 output units, while FC-400 represents a fully
connected layer with 400 output units. This ConvNet scales extremely well because of the use of very
small convolution filters (3×3 pixels).
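To make the ReLU and maxpooling operations concrete, the following sketch applies them to a small feature map stored as nested lists. The 2×2 window with a stride of 2 is an assumption borrowed from the usual VGG convention; the pooling configuration of our network is not specified in the text.

```python
def relu(feature_map):
    """Replace negative activations with zero, element by element."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 window (stride 2)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c],
                feature_map[r][c + 1],
                feature_map[r + 1][c],
                feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


example_map = [
    [1.0, -2.0, 3.0, 0.0],
    [4.0, 5.0, -6.0, 7.0],
    [-1.0, 0.0, 2.0, 2.0],
    [0.0, -3.0, 1.0, -9.0],
]
pooled = max_pool_2x2(relu(example_map))
```

Each pooling stage halves both image dimensions, which is why a one-sample shift of an arrival in the input often leaves the pooled output unchanged.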
Because the mathematical form of the desired output is a single value of slowness per depth, the objective
function can simply be the mean squared error (MSE) of the difference between the predicted and the training
outputs. This can be reduced to the absolute error between the desired slowness calculated by an expert and
the one produced by the ConvNet model.
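As a small illustration of the two error measures named above, the sketch below computes the MSE and the mean absolute error between predicted and expert slownesses. The function names and the sample values are hypothetical.

```python
def mean_squared_error(predicted, expert):
    """Average of squared differences, in (µs/ft)^2."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expert)) / len(expert)


def mean_absolute_error(predicted, expert):
    """Average of absolute differences, in µs/ft."""
    return sum(abs(p - e) for p, e in zip(predicted, expert)) / len(expert)


# Hypothetical slowness values in µs/ft at three depths.
predicted_dtc = [100.0, 60.0, 82.0]
expert_dtc = [98.0, 61.0, 85.0]
mse = mean_squared_error(predicted_dtc, expert_dtc)
mae = mean_absolute_error(predicted_dtc, expert_dtc)
```

The absolute error has the practical advantage of being expressed directly in µs/ft, the same unit as the slowness logs.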

Preprocessing
When we consider the acoustic image generated by superposing individual wiggles as a 2D black and white
image (VDL), this image clearly contains multiple virtual lines connecting the peaks of amplitude across the
receiver array. These lines generally correspond to the main arrivals of interest (compressional wave, shear
wave, and Stoneley wave), and the slope approximates the slownesses of each arrival. This observation
intuitively implies that acoustic images contain features that can be used to estimate the slownesses of
interest. However, the challenge faced when processing raw waveforms in real acquisition setups is the high
complexity of the waveforms due to superposition of multiple formation and borehole arrivals. Undoubtedly,
this complexity requires a very deep and complex CNN model to achieve satisfactory results and a huge
amount of data and time to train the model. For these reasons, we looked at ways to reduce the intrinsic
complexity of the raw waveform by enhancing the features that we are trying to capture and removing the
features that are irrelevant to our objective.
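The observation that the slope of an arrival across the receiver array approximates its slowness can be sketched as a least-squares line fit of arrival time against receiver offset. This is only an illustration of the geometric idea, not the method proposed in this paper; the receiver spacing and arrival times below are hypothetical.

```python
def slowness_from_arrivals(offsets_ft, arrival_times_us):
    """Least-squares slope of arrival time versus offset, in µs/ft."""
    n = len(offsets_ft)
    mean_x = sum(offsets_ft) / n
    mean_t = sum(arrival_times_us) / n
    sxx = sum((x - mean_x) ** 2 for x in offsets_ft)
    sxt = sum(
        (x - mean_x) * (t - mean_t)
        for x, t in zip(offsets_ft, arrival_times_us)
    )
    return sxt / sxx


# Hypothetical 13-receiver array with 0.5-ft spacing and an 80 µs/ft arrival.
offsets = [0.5 * i for i in range(13)]
arrivals = [1000.0 + 80.0 * x for x in offsets]
estimated_slowness = slowness_from_arrivals(offsets, arrivals)
```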
Because all state-of-the-art processing techniques involve some level of filtering of the signal that is not
of interest, we started the preprocessing workflow by using a standard bandpass filter. For the sake of
simplicity, and without loss of generality, we applied a wideband Butterworth filter of 5 to 20 kHz to
the raw monopole waveforms, and a filter of 1 to 4 kHz to dipole waveforms. Then, based on trial and
error with different transforms, we found that the combination of preprocessing steps described in Fig. 1
reduced the error of our ConvNet-trained model, reduced the run time for training the model, or both.

STA/LTA Transform
We observed that the short-term average/long-term average (STA/LTA) transform, which is based on the
difference in the amplitude of the waveforms in two adjacent windows, enhanced the results of interest
(Khadhraoui et al. 2017). The STA/LTA transform converts the array waveforms into an image of equal size and has
been successfully adapted to calculate high-resolution compressional and shear slownesses from monopole
waveforms (Khadhraoui et al. 2018). It is calculated using the following equation:

(1)

where g represents the Hilbert envelope of the considered waveform, 0 < sw ≤ lw, and ε is a small constant
number.
Our tests have shown that a short time window of 120 µs and a long time window of 500 µs
gave the best results for the variety of tools tested. The introduction of the bandpass filter and STA/LTA
transformation to our workflow adds four parameters to our proposed approach: the two bandpass filter limits
and the sizes of the short and long time windows. Fig. 3 shows that, using these parameters, prominent
features such as P-wave arrivals are enhanced and easier to distinguish.
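For readers unfamiliar with STA/LTA, the sketch below shows one common form of the transform: the ratio of the mean envelope in a short window to the mean envelope in the adjacent long window that precedes it, with a small constant in the denominator. This classic ratio form is an assumption and is not necessarily identical to Eq. 1. The envelope is assumed to be precomputed, and the window lengths are converted to samples using the 10-µs sampling interval adopted in this work.

```python
def window_samples(window_us, dt_us=10.0):
    """Convert a window length in µs to a whole number of samples."""
    return max(1, round(window_us / dt_us))


def sta_lta(envelope, sw, lw, eps=1e-9):
    """Classic STA/LTA ratio of a 1D envelope (assumed form, see lead-in).

    At sample i, the long window covers the lw samples before i and the
    short window covers the sw samples starting at i. Samples without two
    complete windows are left at zero.
    """
    n = len(envelope)
    # Prefix sums turn every window average into a constant-time lookup.
    prefix = [0.0]
    for value in envelope:
        prefix.append(prefix[-1] + value)
    ratio = [0.0] * n
    for i in range(lw, n - sw + 1):
        sta = (prefix[i + sw] - prefix[i]) / sw
        lta = (prefix[i] - prefix[i - lw]) / lw
        ratio[i] = sta / (lta + eps)
    return ratio


# Hypothetical envelope: low-level noise followed by a stronger arrival.
sw = window_samples(120.0)  # 12 samples at 10 µs
lw = window_samples(500.0)  # 50 samples at 10 µs
envelope = [0.1] * 100 + [1.0] * 100
ratio = sta_lta(envelope, sw, lw)
onset_index = ratio.index(max(ratio))
```

Applying the function to each receiver and stacking the resulting rows gives an image with the same dimensions as the input waveform array.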

Figure 3—Example of STA/LTA Image (right panel) obtained from 13 raw waveforms (blue) on
which we superpose the STA/LTA curve for each of the waveforms (red). The resulting STA/LTA
image has a size of 13×512 pixels and shows clear enhancements of the P-wave and S-wave.

Padding and Generalization


This step allows the generation of an image of 16×512 pixels (where 16 is the number of receivers and 512
is the number of time samples) regardless of the acoustic tool. Both of these limits are large enough to cover
all commercial acoustic tools on the market today. We also resampled all monopole waveforms to a 10-µs
sampling interval, which is equivalent to a Nyquist frequency of 50 kHz, well above our range of interest. The padding
and generalization step is done differently when considering multishot processing for some acoustic tools,
as explained in the next section.
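The padding step and the Nyquist relation can be sketched as follows. Filling with zeros and appending the padding at the end of each trace and after the last receiver are assumptions, because the text does not specify the fill value or its position.

```python
def nyquist_khz(dt_us):
    """Nyquist frequency in kHz for a sampling interval given in µs."""
    sampling_rate_khz = 1000.0 / dt_us
    return sampling_rate_khz / 2.0


def pad_to_fixed_size(waveforms, n_receivers=16, n_samples=512, fill=0.0):
    """Pad a list of receiver traces to n_receivers x n_samples."""
    if len(waveforms) > n_receivers:
        raise ValueError("more receivers than the target image height")
    padded = []
    for trace in waveforms:
        if len(trace) > n_samples:
            raise ValueError("trace longer than the target image width")
        # Extend each trace in time with the fill value.
        padded.append(list(trace) + [fill] * (n_samples - len(trace)))
    # Add empty rows for the missing receivers.
    for _ in range(n_receivers - len(waveforms)):
        padded.append([fill] * n_samples)
    return padded


# Hypothetical tool with 13 receivers and 400 time samples per trace.
raw = [[float(r + 1)] * 400 for r in range(13)]
image = pad_to_fixed_size(raw)
```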

Multishot Processing
Another way of creating images with the same size regardless of the acoustic tool architecture is the
multishot processing technique, which is commonly used by acoustic experts to extract a higher resolution
acoustic slowness. This technique is especially valuable in the case of high formation heterogeneity and thin
beds. In these environments, the formation slownesses within the receiver array are generally nonconstant.
Hsu and Chang (1987) introduced a methodology that circumvents this problem by using a shorter waveform
array aperture. This method has an additional advantage for deep learning because it expands the size of
the training dataset and removes the necessity of axial padding for the waveforms since the number of
receivers used can be chosen to be lower than the minimum number of receivers for all acoustic tools under
consideration.
In the example shown in Fig. 4, we consider an acoustic tool with 13 axial receivers and a receiver
spacing of 0.5 ft, which is equal to the depth sampling rate. If we consider a multishot processing window
of 5 receivers, then the vertical resolution is 2 ft instead of 6 ft with the full array. During the ConvNet
training, we generate 9 images at the same depth corresponding to 9 different gather shots. Then we combine
the 9 predictions from these 9 images to generate a single slowness estimate at each depth.
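The bookkeeping behind this example can be sketched in a few lines: the number of sub-arrays, the aperture of a sub-array, and the combination of the per-image predictions. Using the arithmetic mean to combine the predictions is an assumption, because the combination rule is not specified in the text, and the prediction values are hypothetical.

```python
def subarray_windows(n_receivers, window):
    """Return (start, stop) receiver index pairs for every sub-array."""
    return [(start, start + window) for start in range(n_receivers - window + 1)]


def aperture_ft(n_receivers, spacing_ft):
    """Array aperture: distance between the first and last receivers."""
    return (n_receivers - 1) * spacing_ft


def combine_predictions(predictions):
    """Combine per-image slowness predictions (mean is an assumption)."""
    return sum(predictions) / len(predictions)


windows = subarray_windows(13, 5)
subarray_aperture = aperture_ft(5, 0.5)
# Hypothetical slowness predictions (µs/ft) from the 9 sub-array images.
shots = [88.0, 89.0, 90.0, 90.0, 90.0, 91.0, 90.0, 91.0, 91.0]
combined = combine_predictions(shots)
```

With a 13-receiver tool and a 5-receiver window, `windows` contains 9 index pairs and the sub-array aperture is 2 ft.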

Figure 4—Applying multishot technique to augment the size of the learning dataset, enhance the vertical resolution
of the slowness and generate an input image with a harmonized size regardless of the original number of receivers.
In the case of a 13-receiver tool, we generate 9 images for each depth, each with a size of 5×512 pixels.

Multishot processing is applied without loss of information when the effective depth sampling rate
equals or is a multiple of the receiver spacing, and when the tool is recording at a constant depth
increment. There are tools that do not satisfy this requirement, either because of their depth sampling or
because the depth increment is nonconstant (which can happen with stick-slip movements in the borehole).
When the conditions for multishot processing are not satisfied, we can use full array processing with padding
as described in the previous section.

Implementation
In our work, we used the TensorFlow library, which has been optimized for performance. By using the GPU
implementation for training, we were able to improve the processing speed by an order of magnitude, and
the learning for 12 wells took a couple of hours instead of a few days when using a CPU. On the other hand, we
needed our application to run on standard laptops, which do not necessarily have GPUs or large amounts of
memory. A key strategy is to use model compression and replace the original model with a smaller model
that requires less memory and runtime to store and evaluate (Bucilua et al. 2006). This implementation
is critical for the successful deployment of our method at the wellsite or in real-time monitoring centers,
and allows us to run our model on a large amount of data in a few seconds even on a standard laptop with
modest specifications.

Deep Learning Results on Real Data


The quality of a learned system primarily depends on the size and quality of the training set. To better
understand the behavior of our convolutional network to solve the problem under consideration, we tested
it on real data acquired in 12 wells that cover a wide range of formations where compressional slowness
ranges between 50 µs/ft and 165 µs/ft (Fig. 5), the shear slowness between 80 µs/ft and 600 µs/ft (Fig. 6),
and borehole diameter (bit size) between 6.75 and 17.5 in. For validation, we used eight wells that cover
a similarly wide range of environments. In total, this resulted in 105,316 data points for compressional
slowness from monopole data. Each data point is a pair of slowness values and the associated acoustic array
waveform that generated it.

Figure 5—Distribution of compressional slowness in training wells; each color corresponds to a different well.

Figure 6—Distribution of shear slowness in training wells; each color corresponds to a different well.

Because the expert results are considered "ground truth" in the context of machine learning, some level
of QC and data cleaning was necessary to reduce the amount of invalid data in the training of the
CNN model. For instance, many acoustic tools generate repeated waveforms and slowness values at the
bottom (or top or both) of the well because of the position of the acoustic tool in the tool string. These values
create false redundancies and were filtered out as much as possible. Also, because we focus on openhole data
only in this exercise, we removed the top log intervals where casing arrival is detected (typically around 57
µs/ft) to avoid influencing the results in fast formations where the slowness can overlap with this value. As
a result, we ended up with 97,011 data points for compressional slowness. It is important to note that these
steps are very simple and can be run automatically without any human intervention.
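The two cleaning steps described above can be sketched as simple index operations on a slowness log ordered from top to bottom. The exact-equality test for repeated values and the 2 µs/ft tolerance around the casing slowness are hypothetical choices made for this illustration; they are not the criteria used in our workflow.

```python
CASING_SLOWNESS = 57.0  # µs/ft, typical casing arrival quoted in the text


def trim_repeated_ends(values):
    """Return (start, stop) so that values[start:stop] keeps a single
    copy of any value repeated at the top or bottom of the log."""
    start, stop = 0, len(values)
    while stop - start > 1 and values[start] == values[start + 1]:
        start += 1
    while stop - start > 1 and values[stop - 1] == values[stop - 2]:
        stop -= 1
    return start, stop


def first_openhole_index(values, casing=CASING_SLOWNESS, tol=2.0):
    """Index of the first sample below the top interval that reads
    within tol of the casing slowness (tol is a hypothetical setting)."""
    first = 0
    while first < len(values) and abs(values[first] - casing) <= tol:
        first += 1
    return first


# Hypothetical compressional slowness log, top of the well first.
log = [57.0, 57.0, 57.5, 56.8, 90.0, 95.0, 110.0, 110.0, 110.0]
start, stop = trim_repeated_ends(log)
trimmed = log[start:stop]
cleaned = trimmed[first_openhole_index(trimmed):]
```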
A key consideration in building a generic model for acoustic slowness prediction is to ensure that it covers
the different formation types that are typically logged with sonic tools. Fig. 7 shows a standard "VpVs ratio
versus compressional slowness" crossplot with an overlay of the most commonly encountered formation
types that are classified from fast to very slow. This crossplot shows that our training wells are covering
these different formation types with a reasonable number of data samples. It also shows that we have some
scatter of the VpVs data, generally due to the mismatch in the vertical resolution between the monopole
compressional and the dipole shear (which has a lower frequency content).
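Because slowness is the reciprocal of velocity, the VpVs ratio plotted in this type of crossplot equals the shear slowness divided by the compressional slowness. A minimal sketch, with hypothetical sample values:

```python
def vp_vs_ratio(dtc_us_ft, dts_us_ft):
    """Vp/Vs from compressional (dtc) and shear (dts) slownesses.

    Vp = 1 / dtc and Vs = 1 / dts, so Vp / Vs = dts / dtc.
    """
    if dtc_us_ft <= 0.0:
        raise ValueError("compressional slowness must be positive")
    return dts_us_ft / dtc_us_ft


# Hypothetical depth sample: dtc = 100 µs/ft and dts = 180 µs/ft.
ratio_example = vp_vs_ratio(100.0, 180.0)
```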

Figure 7—VpVs versus compressional slowness crossplot of all 12 training wells shows that our training dataset
has a reasonable coverage of the main formation types, ranging from extremely fast formations (compressional
faster than 50 µs/ft) to very slow formations (compressional slower than 130 µs/ft and VpVs higher than 3.2).

To train a reasonable model, we run a few epochs of the machine learning until we see convergence. Each
epoch corresponds to running the CNN model on the whole training or validation dataset. Fig. 8
shows that the average training error (i.e., the difference between the predicted compressional slowness
value and the slowness calculated by a human expert) starts at 5.3 µs/ft and is reduced with each epoch
until it reaches 2 µs/ft within 10 epochs. The training error ultimately converges to 1.3 µs/ft after running
40 epochs on the 12 training wells.

Figure 8—The reduction in training error with increasing number of epochs.

Fig. 9 shows the variability of this error for each of the wells using boxplots centered around the median
error value for each well. After 40 epochs, the median training error ranges between 0.9 µs/ft for Training
Well-01 and 5 µs/ft for Training Well-10.

Figure 9—Well-by-well boxplots of the difference between the slowness
calculated by an expert and the one calculated by ConvNet on the training wells.

Next, we run our methodology on the 8 validation wells for a few epochs. Fig. 10 shows the variability
of the generalization error on all data points with the number of epochs. This error decreases sharply over
the first few epochs before it stabilizes after 20 epochs just below 3 µs/ft. In other words, we were able
to estimate the compressional slowness from raw waveforms in previously unseen wells to within 3 µs/ft,
which is comparable to many state-of-the-art acoustic processing techniques. Fig. 11 shows the distribution
of this error in all 8 validation wells; the median error is around 0 µs/ft for most of
the wells, and the first quartile is within 1 µs/ft for 6 of the wells.
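The per-well statistics summarized in such boxplots can be computed with the Python standard library, as in the sketch below. The inclusive quantile method and the sample values are assumptions made for illustration.

```python
import statistics


def error_summary(predicted, expert):
    """Quartiles and mean absolute value of the signed error (µs/ft)."""
    errors = [p - e for p, e in zip(predicted, expert)]
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    mean_abs = sum(abs(err) for err in errors) / len(errors)
    return {"q1": q1, "median": median, "q3": q3, "mean_abs": mean_abs}


# Hypothetical predictions for one well where the expert value is 100 µs/ft.
expert_values = [100.0] * 5
predicted_values = [98.0, 99.0, 100.0, 101.0, 102.0]
summary = error_summary(predicted_values, expert_values)
```

A median near zero combined with a larger mean absolute error indicates that the errors are roughly symmetric around the expert values rather than biased in one direction.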

Figure 10—The evolution of the generalization error (difference between predicted compressional slowness and computed
slowness in unseen wells) with increasing number of epochs. After sufficient time, the difference is only approximately 3 µs/ft.

Figure 11—Well-by-well boxplot of difference between the slowness
calculated by an expert and the one estimated using our method.

The error per well between the slowness log estimated by an expert and the slowness estimated using
our method is an average over depth and might be misleading in cases where there are many outliers
or where the error is concentrated in specific intervals. We therefore analyze, depth by depth, the
difference between the slowness calculated by an expert and the one estimated by our ConvNet. Fig. 12
shows a more detailed comparison from a good/typical well (Validation Well-01), while Fig. 13 shows the
detailed results for the well with the worst performance (i.e., Validation Well-04).

Figure 12—Example: the testing results in Validation Well-01. The first track is measured depth in feet and
the second track shows the Gamma Ray in GAPI. The third and fourth tracks show two orthogonal calipers
with shading between the caliper value and the bit size used to drill this section. The fifth track shows the
VDL of the fifth receiver of the monopole data and the sixth track shows the calculated slownesses using
STC processing. The last track shows comparison between the predicted compressional slowness (blue)
and the benchmark computed by an expert (black). The difference is shaded in orange and dark yellow.

As shown in the last track of both Fig. 12 and Fig. 13, the difference between the ground truth
compressional slowness and the one estimated using our ConvNet is small, and the two curves
overlay at most of the data points. The match between the two logs is sometimes
surprisingly good in zones where the CNN model is able to capture a significant change in the slowness.
Nevertheless, the largest errors are mostly in zones with large vertical heterogeneities (as demonstrated by
the high variability of the Gamma Ray curve in track 2). We should note that, in these intervals, the quality
of the calculated slowness can vary considerably based on the assumptions of the processing technique used,
as discussed in the multishot section.

Figure 13—Example: the testing results in Validation Well-04. The first track is measured depth in feet and
the second track shows the Gamma Ray in GAPI. The third and fourth tracks show two orthogonal calipers
with shading between the caliper value and the bit size used to drill this section. The fifth track shows the
VDL of the fifth receiver of the monopole data and the sixth track shows the calculated slownesses using
STC processing. The last track shows comparison between the predicted compressional slowness (blue)
and the benchmark computed by an expert (black). The difference is shaded in orange and dark yellow.

Discussion
Believing that our method can be extended to further applications such as (1) the extraction of shear slowness
from monopole data in fast formations and (2) the extraction of shear slowness from dipole waveforms, we
followed a similar approach to train our VGG-11 ConvNet with a few tweaks to the preprocessing steps. For
dipole waveforms, we knew that, because of the dispersive nature of the dominant flexural wave, STA/LTA
would not be appropriate, so we replaced it with another method based on the phase shift between consecutive
receivers. With these modifications to the preprocessing workflow, we were able to train our model on dipole
waveforms using the same training wells described above. Combining data from two orthogonal dipoles, we
were able to use more than 200,000 data points, which is double the size of the monopole dataset. Then, we
successfully ran these models on validation wells to produce shear slownesses. The difference between the
predicted shear slowness value and the value calculated by a human expert was similar to (but slightly higher
than) the results for the monopole data shown in this paper. These results will be presented in a separate paper.

Conclusions
In this work, we developed a new method based on deep learning to estimate formation slownesses from
raw acoustic waveforms. We found that by combining a few preprocessing steps with the 11-layer deep
convolutional neural network, we are able to obtain reasonably good results:

• We have successfully trained a generic convolutional network model for compressional and shear
slowness from real data. This method is generic in the sense that it can be used with all acoustic
tools regardless of their architecture.
• We described several preprocessing steps that have significantly increased the accuracy and speed
of the convolutional neural network. More specifically, we showed how the multishot processing
technique can increase the redundancy of the information while harmonizing the outputs without
the need for padding.
• The average error for the compressional slowness estimation from monopole data was less than
3 µs/ft in validation wells, which is comparable with the results of state-of-the-art processing
techniques applied by experts. The average error for the training wells used in our study is around 1.3 µs/ft.
• We have discussed how we can extend this methodology to the estimation of shear slowness from
dipole data taking into consideration its dispersive nature.
• Finally, we have shown that the prediction in previously unseen wells is completely automated
and takes only a few seconds, which makes it suitable for real-time applications. It also means that
our method can be used as an automatic QC tool for large-scale processing of historical data using
modest hardware specifications.

References
Abadi, M., Agarwal, A. et al. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.
Bucilua, C., Caruana, R., Niculescu-Mizil, A. 2006. Model Compression, Proceedings of the 12th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining. https://doi.org/10.1145/1150402.1150464
Goodfellow, I., Bengio, Y., and Courville, A. 2016. Deep Learning. Cambridge: The MIT Press.
Goodfellow, I., Bulatov, Y. et al. 2014. Multi-digit Number Recognition from StreetView Imagery Using Deep
Convolutional Neural Networks. Proc. ICLR.
Hsu, K. and Chang, S. K. 1987. Multiple-Shot Processing of Array Sonic Waveforms. Geophysics 52–1390.
https://doi.org/10.1190/1.1442250
Khadhraoui, B., Kisra, S., and Nguyen, H. M. T. 2018. A New Algorithm for High Depth Resolution Slowness Estimate
on Sonic-Array Waveforms. Presented at the SPWLA 59th Annual Logging Symposium, London, UK, 2–6 June.
SPWLA-2018-PPPP.
Khadhraoui, B., Nguyen, H. M. T., and Kisra, S. 2017. Arrival-Time Picking, and Slowness Estimate on Sonic Data. 79th
EAGE Conference and Exhibition. https://doi.org/10.3997/2214-4609.201701093
Kimball, C. 1998. Shear Slowness Measurement by Dispersive Processing of the Borehole Flexural Mode. Geophysics
63: 337–344. https://doi.org/10.1190/1.1444333
Kimball, C. and Marzetta, T. 1984. Semblance Processing of Borehole Acoustic Array Data. Geophysics 49: 274–281.
https://doi.org/10.1190/1.1441659
Krizhevsky, A., Sutskever, I., and Hinton, G. E. 2012. ImageNet Classification with Deep Convolutional Neural Networks.
Advances in Neural Information Processing Systems 25: 1106–1114.
Rumelhart, D., Hinton, G., and Williams, R. 1986. Learning Representations by Back-Propagating Errors. Nature 323
(6088): 533–536 https://doi.org/10.1038/323533a0
Simard, P. Y., Steinkraus, D., and Platt, J. 2003. Best Practices for Convolutional Neural Networks Applied to Visual
Document Analysis. Proceedings of the Seventh International Conference on Document Analysis and Recognition.
https://doi.org/10.1109/icdar.2003.1227801
Simonyan, K. and Zisserman, A. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR
2015.
