
Deep learning algorithms for gravitational waves core-collapse supernova detection

2021 International Conference on Content-Based Multimedia Indexing (CBMI) | 978-1-6654-4220-6/20/$31.00 ©2021 IEEE | DOI: 10.1109/CBMI50038.2021.9461885

M. López
Nikhef, Science Park 105, 1098 XG Amsterdam, The Netherlands
Institute for Gravitational and Subatomic Physics (GRASP), Utrecht University, Princetonplein 1, 3584 CC Utrecht, The Netherlands
Email: m.lopez@uu.nl

M. Drago
Università di Roma La Sapienza, I-00185 Roma, Italy
INFN, Sezione di Roma, I-00185 Roma, Italy
Gran Sasso Science Institute (GSSI), I-67100 L'Aquila, Italy
INFN, Laboratori Nazionali del Gran Sasso, I-67100 Assergi, Italy
Email: marco.drago@gssi.it

I. Di Palma and F. Ricci
Università di Roma La Sapienza, I-00185 Roma, Italy
INFN, Sezione di Roma, I-00185 Roma, Italy
Email: Irene.DiPalma@roma1.infn.it, fulvio.ricci@roma1.infn.it

P. Cerdá-Durán
Departamento de Astronomía y Astrofísica, Universitat de València, Dr. Moliner 50, 46100 Burjassot (Valencia), Spain
Email: pablo.cerda@uv.es

Abstract—The detection of gravitational waves from core-collapse supernova (CCSN) explosions is a challenging task, yet to be achieved, in which the connection between multiple messengers, including neutrinos and electromagnetic signals, is key. In this work, we present a method for detecting this kind of signal based on machine learning techniques. We tested its robustness by injecting signals in the real noise data taken by the Advanced LIGO-Virgo network during the second observing run, O2. We trained three newly developed convolutional neural networks using time-frequency images corresponding to injections of simulated phenomenological signals, which mimic the waveforms obtained in 3D numerical simulations of CCSNe. With this algorithm we were able to identify signals from both our phenomenological template bank and actual numerical 3D simulations of CCSNe. We computed the detection efficiency versus the source distance, obtaining that, for signal-to-noise ratios higher than 15, the detection efficiency is 70% at a false alarm rate lower than 5%. We notice also that, in the case of the O2 run, it would have been possible to detect signals emitted at 1 kpc distance, while, lowering the efficiency to 60%, the event distance reaches values up to 14 kpc.

Index Terms—Convolutional neural networks, gravitational waves, machine learning, supernovae.

I. INTRODUCTION

Since 2015, with the first detection of the gravitational waves (GWs) emitted during the merging of two black holes [1], the LIGO and Virgo GW observatories have been detecting signals from astrophysical sources [2], [3], opening a new window of observation in astronomy. The sensitivity of these GW detectors is at the limit of the capabilities of current technology, and the detection process requires extracting weak signals buried in non-stationary noise. Among the yet unobserved targets of these observatories, core-collapse supernova (CCSN) explosions are particularly challenging due to the weakness of the signals and their inherent complexity. The observation of these events would help us understand the mechanism producing these cataclysmic explosions.

At the end of their lives, massive stars (those born with masses between 8 and 100 solar masses) have accumulated, as a product of thermonuclear fusion processes, about 1.4 solar masses of elements of the iron family in a compact core. This iron core cannot support its own weight and undergoes gravitational collapse. The compressed matter eventually exceeds nuclear matter density and atomic nuclei disintegrate into free nucleons (mostly neutrons) and neutrinos, forming a proto-neutron star (PNS). At the same time, the rapid change of the compressibility of matter (dictated by the equation of state of nuclear matter) halts the collapse and produces a shock wave that, helped by the additional thermal energy deposited by the outstreaming neutrinos, disrupts the entire star, producing a characteristic electromagnetic signal known as a supernova. This is the so-called neutrino-driven mechanism [4], [5], and it is expected to be responsible for the majority (> 99%) of all CCSN explosions. Mass motions in the newly formed PNS are responsible for the emission of strong GW signals, which could be detectable at galactic distances. A combined multi-messenger detection of the GW signal together with the neutrino signal would be critical to confirm this theoretical model and would help us understand the details of the processes taking place during the explosion. In this work, we do not conduct a real multi-messenger strategy, but we perform a triggered GW search assuming that the timing information of the neutrino signal is known.

Although the phenomenon is among the most energetic in the universe, the amplitude of the gravitational wave impinging on a detector on the Earth is extremely faint. To increase the detection probability, we should increase the volume of the universe to be explored, and this can be achieved both by decreasing the detector noise and by using better-performing statistical algorithms.
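This volume argument can be made quantitative with a toy numerical sketch (the functions and numbers below are illustrative, not taken from the paper): at fixed detector noise, the strain amplitude, and hence the SNR of a given source, falls off as the inverse of the distance, so lowering the noise floor by a factor k extends the horizon distance by k and multiplies the surveyed volume by k³.

```python
def snr_at_distance(snr_ref, d_ref_kpc, d_kpc):
    """GW strain amplitude, and hence SNR at fixed noise, scales as 1/distance."""
    return snr_ref * d_ref_kpc / d_kpc

def volume_gain(noise_reduction):
    """Reducing noise by a factor k extends the horizon by k; volume grows as k**3."""
    return noise_reduction ** 3

# A hypothetical source with SNR 150 at 1 kpc drops to SNR 10 at 15 kpc.
print(snr_at_distance(150.0, 1.0, 15.0))  # → 10.0
# Halving the detector noise surveys 8 times more volume.
print(volume_gain(2.0))  # → 8.0
```

For sources distributed uniformly in volume, that cubic gain translates directly into expected detection rate, which is why both noise reduction and algorithmic sensitivity matter.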
Due to the complexity and stochasticity of the waveform, generating numerical simulations of this physical process is challenging and computationally intensive. Therefore, it is impossible to detect these sources with current state-of-the-art template-matching techniques. Current efforts to search for gravitational waves from CCSNe ([6], [7], [8], [9]) make use of pipelines (cWB [10], oLib [11] and BayesWave [12]) based on excess power, which identify signals buried in the detector's noise without taking advantage of any specific feature of the CCSN waveform. Nonetheless, we can benefit from the signal's peculiarity.
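To make the excess-power idea concrete, the following is a minimal numpy sketch (a plain short-time Fourier transform, not the cWB wavelet decomposition or any pipeline code from the paper) in which a transient with rising frequency, standing in for a CCSN g-mode track, is buried in white noise and shows up as excess power in the time-frequency map:

```python
import numpy as np

fs = 4096                      # sample rate in Hz (illustrative choice)
t = np.arange(0, 2.0, 1 / fs)  # 2 s of data
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, t.size)        # white-noise stand-in for detector noise
f_track = 100.0 + 400.0 * t                 # frequency rising from 100 Hz to 900 Hz
phase = 2 * np.pi * np.cumsum(f_track) / fs
signal = 5.0 * np.exp(-((t - 1.0) / 0.3) ** 2) * np.sin(phase)  # transient near t = 1 s
data = noise + signal

def stft_power(x, nperseg=256):
    """Spectral power in overlapping Hann-windowed segments (50% overlap)."""
    hop = nperseg // 2
    win = np.hanning(nperseg)
    frames = [x[i:i + nperseg] * win
              for i in range(0, x.size - nperseg + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2

tf_map = stft_power(data)   # shape: (time bins, frequency bins)
print(tf_map.shape)         # → (63, 129)
```

The brightest time bins cluster around the transient; real searches replace this single-resolution map with multi-resolution wavelet decompositions and coincidence between detectors.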
Fig. 1. From the top: the spectrograms of LIGO Hanford, LIGO Livingston and Virgo, shown in red, green and blue, respectively. At the bottom: the RGB image obtained by stacking the previous three spectrograms. In this case, the signal is present only in Hanford and Livingston, so that the combined signal at the bottom is yellow.

In this line of thought, [13] proposed the use of Machine Learning (ML) techniques, in particular convolutional neural networks (CNNs), focusing on the monotonic rise of the GW signal in the time-frequency plane due to the excitation of g-modes in the PNS, which is the dominant feature present in the GW spectrum. In that work, a set of phenomenological waveforms designed to mimic numerical simulations was developed and injected in Gaussian noise. Due to their inexpensive computation, the authors were able to generate large phenomenological training and validation sets, and to test their network with numerical simulations. A similar approach has been followed recently by [14], [15], [16], and in general there has been increasing interest in the GW community in the use of ML methods (see [17] for a review). Furthermore, the work of [13] has recently been extended in [18] with several improvements and employing real noise.

The main goal of this research is to generate an inexpensive set of phenomenological waveforms that mimic the physical process described above, and to build a ML algorithm for the detection task, improving the performance obtained by [13] in Gaussian noise and extending it to real noise. In this proceedings paper, which is part of a larger work published in [18], we provide technical details about the first steps towards the development of the final algorithm. In particular, we describe and compare the three different CNN architectures that were tested. Moreover, with the final algorithm we show the similarity, in terms of performance, between the phenomenological data, labeled as blind set, and the set from numerical simulations, labeled as test set.

II. DATA

We employ the data in Gaussian noise from [13] to build our CNNs (see results in sections IV-A and IV-B). Nonetheless, to extend this work we also generate a data set in real noise, as described below (see results in section IV-C).

To assess the robustness of our method, we selected data from the second observing run (O2) of the Advanced GW detectors, without relying on any neutrino information. In particular, we chose a stretch of data taken during August 2017, when Virgo joined the run [19]. The period includes about 15 days of coincidence time among the three detectors. About 2 years of time-shifted data have been used to construct noise to train and test the neural network.

To build images for our neural network algorithm we employ the wavelet transform built into the cWB algorithm (https://gwburst.gitlab.io/). Their size is 256 × 64 pixels, covering the frequency band from 0 to 2048 Hz and a time range of 2 s (the typical signal duration is under 1 s). To enhance the classification task, we exploit the coincidence of different detectors by generating RGB images. For this purpose, we use primary colours for the spectrograms of each detector: red (R) for LIGO-Hanford, green (G) for LIGO-Livingston and blue (B) for Virgo (see [13] for details). A random example of the input of the CNN is shown in Fig. 1. The signal is dominated by the presence of g-modes, whose frequency increases as the neutron star becomes more compact. We use the following data sets for our analysis of real noise in section IV-C:

1) Training set: for each distance d ∈ {0.2, 0.4, 1, 3, 4} kpc, ≈ 70000 phenomenological waveforms with random sky localization were injected (see section II of [18] for details). 75% of the set is used in the actual training while the remaining 25% is used for validation.
2) Blind set: this set contains ≈ 26000 phenomenological waveforms from the same distribution as the training set, with a uniform distance distribution in the range [0.2, 15] kpc. This set is used to quantify the detection efficiency and to test the network.
3) Test set: we perform injections using realistic CCSN waveforms from 3D numerical simulations of non-rotating progenitor stars representative of the neutrino-driven mechanism. The selection includes publicly available waveforms from the literature (details can be found in [18]). The injected waveforms are in practice completely uncorrelated with any information we have used to train the CNN. We injected about 65000 waveforms uniformly in distance and sky direction, from 100 pc to 15 kpc.

III. METHODOLOGY

A. Training methodology

As in [13], we train the network using curriculum learning, where we start training with the easiest data sets, and then

Fig. 2. Reduced versions of Resnet and Inception v4. Several tests were performed and the present architectures provided the best results.

gradually the task difficulty is increased. In our framework, the difficulty of the data sets increases with decreasing signal-to-noise ratio (SNR). The data sets are balanced, and we use 75% of the data for training and 25% for validation. In the previous paper we measured the performance of the neural network in terms of the efficiency η_CNN and the false alarm rate FAR_CNN, which are equivalent to the true positive rate and the false discovery rate, respectively, according to the standard confusion matrix. In this research we also measure the performance of our network with the area under the curve (AUC), obtained by plotting the true positive rate (TPR) against the false positive rate (FPR) and measuring the area under the resulting curve.

Fig. 3. Mini Inception-Resnet v1 and its optimized version. Several tests were performed and the present architectures provided the best results.

B. Increasing complexity of CNNs for CCSN signals

In a CNN, the input is convolved with a filter which varies according to the characteristics of the data, since it can be learned by the network. With these ideas in mind, the previous work [13] provided clear evidence that, under relatively simplified conditions, deep CNN algorithms could be more efficient at extracting GW signals from CCSNe than the current methodology. Therefore, the aim of this work is to improve the neural network developed in [13], going deeper with convolutions to increase accuracy while keeping computational complexity at a reasonable cost.

The most straightforward way of improving the performance of a deep neural network is to increase its size. Nonetheless, enlarging a network implies training a larger number of parameters and over-complicating the model, which increases the computational cost dramatically. A fundamental way of solving these issues is to move from fully connected to sparsely connected architectures.

With this aim, [20] proposed a sophisticated network topology, the so-called Inception network. The architecture is composed of blocks of convolutions, known as Inception modules. The input of each block is convolved in parallel by separate CNN layers with different kernels, and the outputs of all the convolutions are stacked. In this way, a sparse network is built without the need to choose a particular kernel size, but the computational complexity increases drastically. To prevent a high computational cost, the authors introduce dimensionality reduction, i.e. 1 × 1 convolutions that reduce the depth of the output. The idea of decreasing the number of parameters without reducing the expressiveness of the block is explored even further, and the authors introduce the factorization of convolutions to increase the computational efficiency (see [21] for details).

Another obstacle for deeper networks is the degradation problem, where, with increasing depth, accuracy saturates and then degrades rapidly. In [22] this problem is addressed by introducing a deep neural network called Residual Network, or ResNet, in which empirical results show that the degradation problem is well handled, since accuracy gains are obtained from increasing depth.

Due to the improvements in accuracy obtained with the Inception network and ResNet, the combination of these two architectures was explored in [23]. As a result, the authors developed, among others, an architecture called Inception-Resnet v1, which achieved substantial improvements in performance.

Our problem is much simpler than the task performed in [23], since we only need to discriminate between two classes: templates that contain a GW CCSN signal (event class) and templates that do not contain a GW CCSN signal (noise class). Therefore, we will build a CNN with the ideas mentioned
above, but with only a few layers. As a consequence, we have developed reduced ("mini") versions of Inception v4, Resnet and Inception-Resnet v1, using the original building blocks of those networks but adapting them to our needs.

C. Mini architectures

In this section we present the reduced architectures, namely Mini Resnet, Mini Inception v4 and Mini Inception-Resnet v1. Several variations of these architectures were explored and in this work we present the best performing ones. For the development of the networks, including the model definition, the training and the validation phases, we have used the Keras framework [24], based on the TensorFlow backend [25]. We employ the Adamax optimizer [26], with learning rate 0.001, and we train each network for 20 epochs. The activation function of all the convolutional layers is the ReLU activation function, ReLU(x) = max(0, x). We use the binary cross-entropy loss function and a sigmoid activation function for the output.

Excessive max pooling hinders the learning process, as it might discard crucial information that the subsequent layers need. Therefore, we use a minimal amount of pooling layers, although an optimized version would require further parameter reduction. In Figs. 2 and 3 we provide a scheme of these networks, whose main characteristics are the following:

• Mini Resnet: it has a single "skip connection" (represented as an arch). This is due to the fact that, when increasing the number of layers and "skip connections", the performance of the network decreased rapidly for short architectures (≤ 30 layers). This network has a total of 381390 parameters and a single epoch takes 31 s.
• Mini Inception v4: we implement the Inception-v4 A block (see [23] for details). This network has a total of 250251 parameters and a single epoch takes 26 s.
• Mini Inception-ResNet v1: we implement the Inception-ResNet A and C blocks presented in [23]. The modules Inception-ResNet-B and Reduction-B are the most expensive blocks, so they are discarded in this work (see [23] for details). This is the most complex network and the most expensive to train, as it has a total of 522346 parameters and a single epoch takes 43 s. In an optimized version, Reduction blocks should be present to lighten the computations.

Due to its high preliminary performance (see section IV-B) we shrink the number of parameters of Mini Inception-ResNet by interspersing Inception-Resnet modules with Reduction-A blocks and fine-tuning it (see Fig. 3, left panel, and [18] for details on the blocks). Because of its depth, the resulting Mini Inception-Resnet architecture is much more flexible than the one presented in the previous work [13], and it is ∼30 times more complex, as the previous network has ≈6000 parameters and the optimized Mini Inception-Resnet has ≈99000.

IV. RESULTS

A. Waveform injection in Gaussian noise: comparison of mini architectures

To train and validate the networks, we use the data set described in [13], composed of waveforms ranging in the interval SNR = [8, 40]. This choice allows for a direct comparison with the previous work and an informed decision on the choice of the architecture to optimize. In Fig. 4 we plot the efficiency η_CNN and the false alarm rate FAR_CNN as functions of the SNR for the three architectures presented in III-C. Note that these networks have not been fully optimized, but are rather a proof of concept for the final design of the full architecture.

Fig. 4. η_CNN (solid lines) and FAR_CNN (dashed lines) as functions of SNR computed during the validation process for the architectures presented in III-C.

As we can observe, Mini Inception v4 (pink) and Mini Resnet (orange) have a similar performance in terms of efficiency η_CNN, but Mini Inception v4 has a lower FAR_CNN. Due to its complexity and generalization ability, Mini Inception-Resnet v1 achieves the lowest FAR_CNN for large SNR, and it has the largest η_CNN except at low SNR. Due to its better performance we focus only on the Mini Inception-Resnet v1 network, which is optimized in the next section and used for the different tests presented in the rest of this work.

B. Waveform injection in Gaussian noise: comparison with previous results

We wish to improve the performance of Mini Inception-Resnet, with the same set as in the previous section, by minimizing FAR_CNN while maximizing η_CNN. Therefore, we add Reduction blocks and fine-tune the network, as mentioned before. On top of that, we implement a weighted binary cross-entropy, where we assign a weight w to the noise class and a weight 1 to the event class. We vary this parameter in the range w = [1.0, 3.5]. Moreover, the algorithm returns the probability θ that a certain template belongs to the event class. We want this probability to be high without dramatically decreasing η_CNN. Therefore, we define the decision threshold θ in the range [50%, 85%] and perform different experiments to tune w and θ. We found that, to minimize FAR_CNN at a relatively high η_CNN, a good penalization was w = 2.0. It is

important to note that w penalizes the learning, so if the network is learning correctly the results will be enhanced, but it will lead to poor results otherwise.

To have a clearer comparison between the different penalizations w and the results of the previous paper [13], we plot the validation results of Mini Inception-Resnet for w = {1, 2} in Fig. 5.

Fig. 5. η_CNN (solid lines) and FAR_CNN (dashed lines) as functions of SNR computed during the validation process for w = {1, 2}, with θ = 65%, and for [13], where θ = 50%.

Since we want a trade-off between η_CNN and FAR_CNN, we settle on w = 2.0 and θ = 65%. The main improvement of our network with respect to [13] is the reduction of FAR_CNN towards ∼0% for SNR in the range [15, 20], while maintaining the same η_CNN. We note also that the poor performance at low SNR is due to the fact that this architecture is susceptible to the strong presence of Gaussian white noise, as pointed out in [27].

C. Waveform injections in real detector noise: final results

In this section we present the results obtained when training on the phenomenological signals injected in real noise in the interval SNR = [1, 232], known as the training set (section II). Afterwards, we predict the labels of the testing sets: the blind set and the test set (see section II for details). The signals injected in the blind set correspond to waveforms generated by the same procedure used to generate the training set, while the injections in the test set correspond to realistic CCSN waveforms. The expectation is that our network will obtain a similar performance for both sets, since the blind set mimics the test set. As in the previous section, we choose w = 2 and the decision threshold θ = 65%.

When comparing the area under the curve (AUC) of both data sets, we note the high performance on the test set (AUC = 0.79), close to the results obtained with the blind set (AUC = 0.90). As expected, both measurements are close, due to the fact that the set of phenomenological waveforms mimics the behaviour of the waveforms simulated with numerical relativity in the test set.

Fig. 6. η_CNN as a function of the distance computed during the testing process for {w, N, θ} = {2, 30000, 65%}.

Fig. 6 also shows the resemblance between the blind set and the test set, representing η_CNN as a function of the distance to the source. As we can see, at short distances there is a difference in efficiency between the blind set and the test set of ≈ 10%, but when we increase the distance they seem to reach a lower limit at η_CNN ≈ 60%. In Fig. 7 we also plot η_CNN against SNR.

Fig. 7. η_CNN as a function of SNR computed during the testing process for {w, N, θ} = {2, 30000, 65%}.

For low SNR, the difference in efficiency η_CNN between the two cases, blind set and test set, is around 10%, while for SNR > 15 we obtain similar efficiencies. This final result assesses the robustness of this method to detect CCSN signals embedded in real detector noise.

V. CONCLUSIONS

We developed new machine learning algorithms to further improve the detectability of a GW signal from CCSNe, following the path traced in [13]. We propose three different architectures: Mini ResNet, Mini Inception v4 and Mini Inception-ResNet v1. After the first preliminary results, we decided to fine-tune Mini Inception-ResNet v1 due to its better performance.

The small size of our Mini Resnet network, which underperforms with respect to the other CNNs tested, is the result of the problems we encountered when trying to implement a medium-sized Resnet. A possible explanation is that "skip connections" improve the learning of very deep networks (> 100 layers), but not that of short networks, where the degradation problem is not as acute; this should be explored in future works. Moreover, the first preliminary results showed that Mini Inception-ResNet v1 overcame the performance of Mini Inception v4,
due to the combination of "skip connections" and Inception modules, as mentioned in [23].

Regarding the applicability of our method for GW detection, we have considered a detection threshold θ = 65%, which results in a FAR of about 5% at SNR ∼ 15. These values could be appropriate for an observation with high confidence of an event in coincidence with a neutrino signal. If the method were to be used in all-sky non-triggered searches, the range of FAR values needed to make a detection with high confidence could be achieved by using values of θ very close to 100%. The efficiency of the algorithm in this regime is something that could be explored in future work. Moreover, these results are very promising for future detections of GWs from CCSNe, because the network allows us to observe more than half of the events within 15 kpc.

The high efficiency η_CNN obtained with the test set, which uses signals from a catalogue completely different from the training set, is a good example of the generalization ability of the CNN. At present the entire data processing is rather fast, as predicting 180 s of data from the test set takes 0.27 s, while the current cWB pipeline needs around 2-5 min. We could easily increase the number of classes to be able to detect other GW sources with the same architecture. In the future, the new algorithm presented here should be compared under realistic conditions with the methods currently in use within the LIGO-Virgo collaboration to evaluate the real advantages of the method. In particular, the high speed of CNNs is an advantage for the design of new low-latency detection pipelines for CCSNe.

ACKNOWLEDGMENT

This research has made use of data, software and/or web tools obtained from the Gravitational Wave Open Science Center (https://www.gw-openscience.org/), a service of LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. LIGO is funded by the U.S. National Science Foundation. Virgo is funded, through the European Gravitational Observatory (EGO), by the French Centre National de la Recherche Scientifique (CNRS), the Italian Istituto Nazionale di Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by institutions from Belgium, Germany, Greece, Hungary, Ireland, Japan, Monaco, Poland, Portugal and Spain. ML is supported by the research programme of the Netherlands Organisation for Scientific Research (NWO). PCD acknowledges support from the grants PGC2018-095984-B-I00, PROMETEU/2019/071 and the Ramón y Cajal funding (RYC-2015-19074) supporting his research. In addition, IDP and FR acknowledge support from the Amaldi Research Center funded by the MIUR program "Dipartimento di Eccellenza" (CUP: B81I18001170001), the Sapienza School for Advanced Studies (SSAS) and the Sapienza grant RM120172AEF49A82.

REFERENCES

[1] B. A. et al., "Observation of Gravitational Waves from a Binary Black Hole Merger," Phys. Rev. Lett., vol. 116, no. 6, p. 061102, Feb. 2016.
[2] B. P. A. et al., "GWTC-1: A Gravitational-Wave Transient Catalog of Compact Binary Mergers Observed by LIGO and Virgo during the First and Second Observing Runs," Phys. Rev. X, vol. 9, no. 3, p. 031040, 2019.
[3] R. A. et al., "GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run," Oct 2020.
[4] H. Bethe, "Supernova mechanisms," Rev. Mod. Phys., vol. 62, pp. 801–866, Oct 1990. [Online]. Available: https://link.aps.org/doi/10.1103/RevModPhys.62.801
[5] H. Janka, "Neutrino emission from supernovae," pp. 1575–1604, 2017. [Online]. Available: https://doi.org/10.1007/978-3-319-21846-5_4
[6] B. A. et al., "First targeted search for gravitational-wave bursts from core-collapse supernovae in data of first-generation laser interferometer detectors," Phys. Rev. D, vol. 94, no. 10, p. 102001, Nov. 2016.
[7] B. P. A. et al., "Optically targeted search for gravitational waves emitted by core-collapse supernovae during the first and second observing runs of Advanced LIGO and Advanced Virgo," Phys. Rev. D, vol. 101, no. 8, p. 084002, 2020.
[8] B. A. et al., "All-sky search for short gravitational-wave bursts in the first Advanced LIGO run," Phys. Rev. D, vol. 95, no. 4, p. 042003, 2017.
[9] ——, "All-sky search for short gravitational-wave bursts in the second Advanced LIGO and Advanced Virgo run," Phys. Rev. D, vol. 100, no. 2, p. 024017, 2019.
[10] S. K. et al., "Method for detection and reconstruction of gravitational wave transients with networks of advanced detectors," Phys. Rev. D, vol. 93, p. 042004, Feb 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevD.93.042004
[11] R. L. et al., "Information-theoretic approach to the gravitational-wave burst detection problem," Phys. Rev. D, vol. 95, no. 10, p. 104046, 2017.
[12] N. Cornish and T. Littenberg, "BayesWave: Bayesian Inference for Gravitational Wave Bursts and Instrument Glitches," Class. Quant. Grav., vol. 32, no. 13, p. 135012, 2015.
[13] P. A. et al., "A new method to observe gravitational waves emitted by core collapse supernovae," Phys. Rev. D, vol. 98, Dec 2018.
[14] M. L. C. et al., "Detection and classification of supernova gravitational wave signals: A deep learning approach," Phys. Rev. D, vol. 102, p. 043022, Aug 2020. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevD.102.043022
[15] M. C. et al., "Improving the background of gravitational-wave searches for core collapse supernovae: A machine learning approach," Mach. Learn. Sci. Tech., vol. 1, p. 015005, 2020.
[16] A. I. et al., "Core-collapse supernova gravitational-wave search and deep learning classification," Machine Learning: Science and Technology, vol. 1, no. 2, p. 025014, Jun 2020. [Online]. Available: https://doi.org/10.1088/2632-2153/ab7d31
[17] E. C. et al., "Enhancing gravitational-wave science with machine learning," Machine Learning: Science and Technology, vol. 2, no. 1, p. 011002, Dec 2020. [Online]. Available: https://doi.org/10.1088/2632-2153/abb93a
[18] M. L. et al., "Deep learning for core-collapse supernova detection," Phys. Rev. D, vol. 103, p. 063011, Mar 2021. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevD.103.063011
[19] R. A. et al., "Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo," SoftwareX, vol. 13, Dec 2019.
[20] C. S. et al., "Going deeper with convolutions," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Jun 2015.
[21] C. S. et al., "Rethinking the Inception architecture for computer vision," Jun 2016.
[22] K. H. et al., "Deep residual learning for image recognition," Jun 2016, pp. 770–778.
[23] C. S. et al., "Inception-v4, Inception-ResNet and the impact of residual connections on learning," AAAI Conference on Artificial Intelligence, Feb 2016.
[24] F. C. et al. (2015) Keras. [Online]. Available: https://github.com/fchollet/keras
[25] A. M. et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015, software available from tensorflow.org. [Online]. Available: http://tensorflow.org/
[26] D. Kingma and J. Ba, "Adam: A method for stochastic optimization," International Conference on Learning Representations, Dec 2014.
[27] P. R. et al., "Effects of degradations on deep neural network architectures," Jul 2018.
