
"machine

learning"
"Machine
Lerningg"

"neural
network"
Even bananas!
shape shifter!
Super-Kamiokande (1997): 50,000 tons of ultra-pure water, "photographed" by 11,000 PMTs.
Smashing more than 10 million high-energy protons, producing a few trillion neutrinos every 1.3 seconds.
NUE CREATES AN ELECTRON; NUMU CREATES A MUON
Electron "showers" (hits in a cone shape) in the detector; muon creates a long "track" (line) in the detector
HOW TO IDENTIFY A NEUTRINO (IF IT WERE A CAT)
Non-ML: a very hard and tedious task

A CAT IS A COLLECTION OF SHAPES:
BODY + 2 PAWS + 2 EARS + 2 EYES + NOSE + MOUTH + ...

We can write different algorithms to do pattern recognition to identify these shapes, and then connect them to recognize a cat.
HOW TO IDENTIFY A NEUTRINO (IF IT WERE A CAT)
Non-ML: a very hard and tedious task

A NEUTRINO IS A COLLECTION OF PHYSICS OBJECTS:
SUM OF HITS IN A STRAIGHT LINE --> TRACK
CONNECT TRACKS TO ONE INTERACTION POINT

We can write different algorithms to do pattern recognition to identify these objects, and then merge them together to recognize a neutrino.
SPLIT THE IMAGE INTO FOUR EQUAL QUADRANTS
APPLY A NEURON FOR THE TOP-LEFT QUADRANT TO CONVERT THE 128 * 128 * 3 FEATURES INTO ONE SINGLE NUMBER.

Development Workflow (Machine Learning)

Image from: https://towardsdatascience.com/intuitive-deep-learning-part-2-cnns-for-computer-vision-472bbb2c8060
APPLY THE EXACT SAME NEURON FOR THE OTHER QUADRANTS
AFTER APPLYING THAT NEURON FOR ALL FOUR QUADRANTS, WE HAVE FOUR DIFFERENT NUMBERS

TAKE THE MAXIMUM OF THE FOUR NUMBERS TO GET A SINGLE NUMBER
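
A toy sketch of these three steps may help make the idea concrete. This is illustrative only, not code from the slides; it assumes the full image is 256 x 256 x 3 so that each quadrant carries 128 x 128 x 3 features, and it treats a "neuron" as a simple weighted sum:

import numpy as np

# Toy sketch of the quadrant / neuron / max-pooling steps described above.
rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))       # assumed full image; each quadrant is then 128 x 128 x 3
weights = rng.random((128, 128, 3))     # one "neuron": a weight for every feature in a quadrant

# 1) Split the image into four equal quadrants.
quadrants = [image[:128, :128], image[:128, 128:], image[128:, :128], image[128:, 128:]]

# 2) Apply the exact same neuron to each quadrant:
#    every 128 x 128 x 3 block of features becomes one single number.
activations = [float(np.sum(q * weights)) for q in quadrants]

# 3) Max pooling: take the maximum of the four numbers to get a single number.
pooled = max(activations)
print(activations, pooled)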

Plastic scintillator interspersed between passive targets.
Each triangular hit = scintillator plane with optic fiber.
Detector "lights up" when a charged particle passes through it; colors represent the energy deposited.
"Vertex": start of the neutrino interaction.

How a neutrino event looks: 128 triangular planes, viewed from the top of the detector ("x-view").
Track = line.
"Shower" = EM-like particles, many protons/pions.
Other views are rotated 60° and -60° from y.
In experimental particle physics, we use simulation (synthetic data) where we know the true information (labeled data).
Useful for tuning pattern recognition.
input layer → hidden layer → output layer

Make a DCNN (Deep Convolutional Neural Network)
Classify images from the 3 different views → location of the interaction

Apply the cat-finding algorithm:
Divide the detector into segments
Check if there is a vertex in each segment
Hybrid classification and localization (see the sketch below)
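
To make the "classification as localization" idea concrete, here is a small sketch; the z-range and segment boundaries below are made up for illustration, not the real detector geometry:

import numpy as np

# Divide a hypothetical detector z-range into 11 segments and label each event
# by the segment that contains its true vertex (illustrative sketch only).
segment_edges = np.linspace(0.0, 500.0, 12)   # 11 segments over an assumed 0-500 cm z-range

def z_to_segment(vertex_z):
    """Index (0-10) of the segment containing the true vertex z."""
    return int(np.clip(np.digitize(vertex_z, segment_edges) - 1, 0, len(segment_edges) - 2))

print(z_to_segment(137.2))   # the class label the network is trained to predict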
Using Theano along with the Lasagne framework for network design
z-direction (neutrino direction)
https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
Input layer:
Input is three 2-deep tensors containing deposited energy and hit time in the x, u, v views.
Supervised learning: use simulation where we know the "truth".

Hidden layers:
Four iterations of convolution and max-pooling layers.

https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
Hidden layers (continued):
Fully connected layer with 196 semantic outputs for each view.
Concatenated and fed to another fully connected layer with 128 outputs.
Dropout: drop connections between layers during training to reduce over-fitting.
Final fully connected layer with 11 (67) outputs for the segment (plane) classifier.
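
A minimal PyTorch sketch of a network matching this description may help to visualize it. This is not the original Theano/Lasagne code; the convolution filter counts, image sizes, and dropout rate are illustrative assumptions:

import torch
import torch.nn as nn

class ViewBranch(nn.Module):
    """One branch per view (x, u, v): four iterations of convolution + max pooling,
    then a fully connected layer with 196 outputs (filter counts are illustrative)."""
    def __init__(self, in_channels=2):               # 2-deep input: energy and hit time
        super().__init__()
        layers, ch = [], in_channels
        for out_ch in (16, 32, 64, 128):              # four conv + max-pool iterations
            layers += [nn.Conv2d(ch, out_ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
            ch = out_ch
        self.conv = nn.Sequential(*layers)
        self.fc = nn.LazyLinear(196)                   # 196 semantic outputs per view

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class VertexClassifier(nn.Module):
    def __init__(self, n_segments=11):                 # 11 segments (or 67 planes)
        super().__init__()
        self.branches = nn.ModuleList(ViewBranch() for _ in range(3))   # x, u, v views
        self.head = nn.Sequential(nn.Linear(3 * 196, 128), nn.ReLU(),   # concatenated -> 128
                                  nn.Dropout(0.5),                      # dropout against over-fitting
                                  nn.Linear(128, n_segments))           # final segment classifier

    def forward(self, views):
        feats = [b(v) for b, v in zip(self.branches, views)]
        return self.head(torch.cat(feats, dim=1))

# Dummy forward pass with made-up image sizes (the real inputs differ).
views = [torch.randn(1, 2, 64, 64) for _ in range(3)]
logits = VertexClassifier()(views)                     # shape (1, 11)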

Softmax layer:
True z-segment: labeled data / true information of which segment the vertex is generated in.
Reconstructed z-segment: a vector of softmax probabilities for the DCNN-predicted segment.

The example is for 11 segments, but we developed a 67-plane classification and 173 planes for the physics results.
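
For illustration, the reconstructed z-segment is read off the network outputs like this (a sketch with made-up scores):

import numpy as np

# Softmax over the 11 segment outputs, then pick the most probable segment.
logits = np.array([0.1, 0.3, 2.5, 0.2, 0.0, -1.0, 0.4, 0.1, 0.0, 0.2, 0.1])   # made-up scores
probs = np.exp(logits - logits.max())
probs /= probs.sum()                          # vector of softmax probabilities
reconstructed_segment = int(probs.argmax())   # compared against the true z-segment from simulation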
Signal purity has been improved by a factor of 2-3 using the ML technique compared to the track-based approach.

https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
Training and prediction in different domains
Train with labeled data: in our case it is simulation/synthetic data (let's call it the source domain)
Test with unlabeled data: in our case it is real data (let's call it the target domain)
(illustration: cat vs. not cat)

The localization problem is strongly tied to physics.
But... the physics model is not perfect!
If we knew the physics exactly, we wouldn't be doing the experiments.

Domain discrepancy arises.
Training with images of cats indoors might bias the prediction.
Find ways to reduce any biases in the algorithm that may come from training our models in one domain and applying them in another.

Use a Domain Adversarial Neural Network (DANN)

Without DANN: one classifier in the network.
Label predictor (output): minimize the loss of the label classifier so that the network can predict the label of the input.
With DANN: two classifiers in the network.
Label predictor (output): minimize the loss of the label classifier so that the network can predict the label of the input.
Domain classifier (works internally): maximize the loss of the domain classifier so that the network cannot distinguish between source and target domains.

The network develops an insensitivity to features that are present in one domain but not the other, and trains only on features that are common to both domains.
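
A common way to implement this "minimize one loss, maximize the other" training is a gradient-reversal layer between the shared features and the domain classifier. The sketch below shows the idea in PyTorch; the actual analysis used Caffe, and all layer sizes and names here are illustrative:

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward pass,
    so the domain classifier's loss is maximized with respect to the shared features
    while still being minimized with respect to its own weights."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Shared feature extractor plus the two classifiers (sizes are made up).
features = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
label_predictor = nn.Linear(128, 11)        # e.g. 11 vertex segments (labeled simulation only)
domain_classifier = nn.Linear(128, 2)       # source (simulation) vs. target (data)

x = torch.randn(4, 1, 28, 28)               # dummy batch
f = features(x)
label_logits = label_predictor(f)                              # label-classifier loss is minimized
domain_logits = domain_classifier(GradReverse.apply(f, 1.0))   # reversed gradients reach the features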
https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
Using Caffe for the deep learning network.

Blue vs. black: a model trained in the same domain (FSI active, blue curve) is better than a model trained with an out-of-domain physics model (FSI inactive, black curve).

DANN helps to recover the domain information.


Vertex Finding: classifying what appears in an image into one out of a set of predefined classes.
→ What is the segment of the vertex?

Image Semantic Segmentation: classifying each pixel in an image into one out of a set of predefined classes.
→ What type of particle does each pixel correspond to?
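
The difference is easiest to see in the output shapes (an illustrative NumPy sketch, not analysis code):

import numpy as np

n_images, height, width, n_classes = 2, 64, 64, 3

# Image classification (e.g. which segment holds the vertex): one score vector per image.
classification_scores = np.zeros((n_images, n_classes))
predicted_class = classification_scores.argmax(axis=-1)         # shape (n_images,)

# Semantic segmentation (e.g. particle type): one score vector per pixel.
segmentation_scores = np.zeros((n_images, height, width, n_classes))
predicted_pixels = segmentation_scores.argmax(axis=-1)          # shape (n_images, height, width)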


Image processing is done by the LarCV package (https://github.com/DeepLearnPhysics/larcv2.git), developed by Kazuhiro Terao et al.

Analysis framework developed for processing LArTPC image data. Supports C++ data structures, an IO interface, and data-processing machinery.
Directly manipulates the image data with or without OpenCV.
Interfaces with open-source deep learning software including Caffe (Berkeley Lab) and TensorFlow (Google).
ROOT format: easy to handle (for particle physics); can also crop images, resize images, etc.

U-ResNet (https://github.com/DeepLearnPhysics/u-resnet.git) is used to implement the semantic segmentation algorithm.
Hybrid of the U-Net (arXiv:1505.04597) and residual networks (arXiv:1512.03385, arXiv:1603.05027).
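
As a rough illustration of the hybrid, the sketch below combines residual blocks with a U-Net-style skip connection in PyTorch; it is a generic toy, not the DeepLearnPhysics u-resnet implementation:

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Basic residual block: output = ReLU(x + F(x))."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.conv2(self.act(self.conv1(x))))

class TinyUResNet(nn.Module):
    """Minimal U-Net-style encoder/decoder with residual blocks and one skip connection."""
    def __init__(self, in_ch=1, n_classes=3, width=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, width, 3, padding=1), ResBlock(width))
        self.down = nn.MaxPool2d(2)
        self.mid = ResBlock(width)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.dec = nn.Sequential(nn.Conv2d(2 * width, width, 3, padding=1), ResBlock(width))
        self.head = nn.Conv2d(width, n_classes, 1)      # per-pixel class scores

    def forward(self, x):
        e = self.enc(x)                                  # encoder features
        m = self.up(self.mid(self.down(e)))              # downsample, process, upsample
        d = self.dec(torch.cat([e, m], dim=1))           # skip connection: reuse encoder features
        return self.head(d)                              # (batch, n_classes, H, W) per-pixel logits

logits = TinyUResNet()(torch.randn(1, 1, 64, 64))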

https://doi.org/10.48550/arXiv.2103.06992
JINST 16 P07060 2021

EM-like: light blue
Non-EM-like: yellow


Traditional Reconstruction vs. Semantic Segmentation
https://arxiv.org/abs/2008.01242

Semantic Segmentation (CNN) on MicroBooNE
GoogLeNet application on NEXT
Graph Neural Net on IceCube
GoogLeNet & CNN application on NOvA

Identifies 630% more signal events than a CNN/traditional algorithm
Outperforms traditional reconstruction by between 20% and 60%.


https://arxiv.org/abs/2008.01242

ML in particle physics experiments uses huge resources:
Particle physics experiments record billions of events a year.
Neural nets perform >10⁹ floating point operations.
Small-scale GPU clusters are often used for training; large-scale GPU resources are required for evaluation.
Currently no GPU computing cluster similar to the CPU OSG exists.
Widespread use of large-scale computing clusters such as the Open Science Grid to perform evaluation on CPUs.
https://doi.org/10.3389/fdata.2022.787421

Fast/efficient ML: dataset sizes and data rates in physics experiments are uniquely massive compared to industry and other scientific domains.

https://a3d3.ai/
https://arxiv.org/abs/2008.01242

Quantitative results and careful statistical analyses.
Consideration of systematic effects and bias is crucial to particle physics.
As ML becomes more widespread in physics, there will be increased efforts to understand this.

Use of simulated datasets corresponding to real data.
Unlike most industry applications, particle physics trains on simulations.
These simulations can be tuned at will, making it possible to study network behavior under controlled modifications.
Comparing data and simulations can improve studies in domain transfer.
