Melnitchouk21 Hugs

HUGS Topical Seminar
June 16, 2021
PDF fitting:
methods and uncertainties
Wally Melnitchouk
Parton distributions in hadrons
Parton distribution functions (PDFs) are light-cone Hig
correlation functions
Z 1
q(x) = d⇠ e(τ ixP + ⇠
= 2) τ = 2W(⇠
h P | (⇠ ) +
, 0) (0) | P i
1 (a)
k+ 0 ⇠
light cone momentum fraction x = +
P single quark k
Wilson line (gauge invariance)
⇢ Z scattering
⇠
W(⇠ , 0) = exp ig d⌘ A+ (⌘ ) P
0
In A = 0
+ τ = 2
gauge, in fast-moving frame PDF has a probabilistic
interpretation as a particle density
Z 1
single quark
dx q(x) = h P | (0) +
(0) | P i ⇡ h P | †
(0) (0)scattering
|P i
1
number
number 0
⇡ z
operator
density
2
Inclusive high-energy particle production A B ! C X
A B
xa xb
a ˆ b
X C Collins, Soper, Sterman (1980s)
QCD factorization: separation of hard (perturbative, calculable)

from soft (nonperturbative, parametrized) physics
XZ
AB!CX (pA , pB ) = dxa dxb fa/A (xa , µ) fb/B (xb , µ)
a,b X (n)
n
⇥ s (µ) ⇥
ˆab!CX (xa pA , xb pB , Q/µ)
n
process-independent parton distribution functions fa/A

characterizing structure of bound state A
3
Most information on PDFs obtained from lepton-hadronHigher tw
deep-inelastic scattering (DIS)
e! ✓ ◆
d 2 4↵ 2
E cos2 ✓2
02
(a) 2 ✓ F1 F2 (b)
e
0
= 2 tan +
γ∗ d⌦dE Q4 2M ⌫
Q2 Q2 = ~q 2 ⌫ 2
X xB =
2M ⌫ ⌫ = E E0
N
H
structure function given as convolution of τ
hard
(τ = 2) = 2
Wilson coefficient with PDF ⇤
(a)
Z
X 1
dx ⇣ xB ⌘
, ↵s single
q(x, Q2quark
q
F2 (xB , Q2 ) = xB e2q Cq )
q xB x x scattering
X N
xB e2q q(xB , Q ) 2
q τ =2
⇣ xB ⌘
for leading order approximation Cq 1 single quark
x scattering
4
Precision PDFs needed to
(1) understand basic structure of QCD bound states
(2) compute backgrounds in searches for BSM physics
2
Q2 evolution feeds Q collider
low x, high Q2 (“LHC”)

from high x, low Q 2 (“JLab”)
fixed-target
x
Information on PDFs obtained from

(1) nonperturbative approaches (low-energy models, DSE, EFT)
(2) lattice QCD
(3) global QCD analysis
Global PDF analysis
Universality of PDFs allows data from many different
processes (DIS, SIDIS, weak boson/jet production in pp, Drell-Yan …)
to be analyzed simultaneously
distributions parametrized using a specific functional form,
with parameters fitted to data
xf (x, µ) = N x↵ (1 x) P (x)
p
with polynomial e.g. P (x) = 1 + x+⇥x
or Chebyshev, neural net, …
Extraction of PDFs is challenging because usually

there exist multiple solutions — “inverse problem”
PDFs are not directly measured, but inferred from
observables involving convolutions with other functions
6
Global PDF analysis
Modern global QCD analyses typically fit 1000s of data points
from high-energy scattering experiments
106
x = 0.005 (i = 20) SLAC F2p ⇥ 2i
x = 0.008
x = 0.013
HERMES
105
x = 0.018 BCDMS 100 µ = 70±
0.60 < x < 0.75
x = 0.026
NMC µ = 60±
F2p £ 2 i
0.52 < x < 0.74
x = 0.036
10 4
x = 0.05
CJ15 µ = 55±
0.60 < x < 0.73
µ = 45±
x = 0.07 0.55 < x < 0.70
x = 0.09 µ = 41±
10°1
103 x = 0.1
0.53 < x < 0.69
JLab
µ = 38± (i = 0)
x = 0.11 0.48 < x < 0.68 CJ15
x = 0.14 4 5 6 7
10 2
x = 0.18
Q2 (GeV2)
F2p £ 2 i
x = 0.23
Q2 = 1.7 GeV2 (i = 5)
0.44 > x > 0.24
x = 0.28 10 1
101 Q2 = 2.0 GeV2

0.49 > x > 0.28
x = 0.35
Q2 = 2.4 GeV2
F2n/F2d £ 2 i
0.53 > x > 0.34
x = 0.45
100 Q2 = 2.9 GeV2

0.58 > x > 0.40
x = 0.55 100
Q2 = 3.4 GeV2
0.62 > x > 0.46
10°1 x = 0.65 Q2 = 4.0 GeV2 (i = 0)

0.65 > x > 0.56
n/d JLab
F2 ⇥ 2i CJ15
x = 0.75 10°1
3 4 5 6 7 8
°2
10 W 2 (GeV2)
F2p ⇥ 2i x = 0.85 (i = 0)
Accardi et al. (“CJ15”)

10°3
2 5 10 20 50 100 200 500 PRD93, 114017 (2016)
Q2 (GeV2)
Global PDF analysis
SLAC (p)
SLAC (d)
BCDMS (p)
DIS
BCDMS (d)
NMC (p)
DY
NMC (d/p)
HERA NC (1) (e+ p) SIDIS
HERA NC (2) (e+ p)
HERA NC (3) (e+ p) SIA
HERA NC (4) (e+ p)
HERA NC (e° p)
HERA CC (e+ p)
HERA CC (e° p)
E866 (pp)
E866 (pd)
COMPASS (º + )
COMPASS (º ° )
COMPASS (K + )
COMPASS (K ° )
COMPASS (h+ )
COMPASS (h° )
ARGUS (º ± , Q = 10.0)
BABAR (º ± , Q = 10.5)
BELLE (º ± , Q = 10.5)
TASSO (º ± , Q = 12.0)
TASSO (º ± , Q = 14.0)
TASSO (º ± , Q = 22.0)
TASSO (º ± , Q = 34.0)
TASSO (º ± , Q = 44.0)
TPC (º ± , Q = 29.0)
TPC (º ± , Q = 29.0)
TPC(c) (º ± , Q = 29.0)
TPC(b) (º ± , Q = 29.0)
TOPAZ (º ± , Q = 58.0)
SLD (º ± , Q = 91.3)
SLD(c) (º ± , Q = 91.3)
SLD(b) (º ± , Q = 91.3)
ALEPH (º ± , Q = 91.2)
OPAL (º ± , Q = 91.2)
OPAL(c) (º ± , Q = 91.2)
OPAL(b) (º ± , Q = 91.2)
DELPHI (º ± , Q = 91.2)
DELPHI(b) (º ± , Q = 91.2)
ARGUS (K ± , Q = 10.0)
BABAR (K ± , Q = 10.5)
BELLE (K ± , Q = 10.5)
TASSO (K ± , Q = 14.0)
TASSO (K ± , Q = 22.0)
TASSO (K ± , Q = 34.0)
TPC (K ± , Q = 29.0)
TPC (K ± , Q = 29.0)
TOPAZ (K ± , Q = 58.0)
SLD (K ± , Q = 91.3)
SLD(c) (K ± , Q = 91.3)
SLD(b) (K ± , Q = 91.3)
ALEPH (K ± , Q = 91.2)
OPAL (K ± , Q = 91.2)
OPAL(c) (K ± , Q = 91.2)
OPAL(b) (K ± , Q = 91.2)
DELPHI (K ± , Q = 91.2)
DELPHI (K ± , Q = 91.2)
DELPHI(b) (K ± , Q = 91.2)
ALEPH (h± , Q = 91.2)
DELPHI (h± , Q = 91.2)
SLD (h± , Q = 91.3)
TPC (h± , Q = 29.0)
OPAL (h± , Q = 91.2)
OPAL(b) (h± , Q = 91.2)
OPAL(c) (h± , Q = 91.2)
Moffat, WM, Rogers, Sato
TASSO (h± , Q = 35.0)
TASSO (h± , Q = 44.0)
DELPHI(b) (h± , Q = 91.2)
arXiv:2101.04664 (“JAM20”)
SLD(c) (h± , Q = 91.3)
SLD(b) (h± , Q = 91.3)
0 1 2 3 4 °4 °3 °2 °1 0 1 2 3 4
¬2red E [residual]
Global PDF analysis
SLAC (p)
SLAC (d)
BCDMS (p)
BCDMS (d)
NMC (p)
DIS
DY
JAM20-SIDIS
NMC (d/p)
HERA NC (1) (e+ p) SIDIS
HERA NC (2) (e+ p)
HERA NC (3) (e+ p) SIA
HERA NC (4) (e+ p)
HERA NC (e° p)
HERA CC (e+ p)
HERA CC (e° p)
E866 (pp)
E866 (pd)
COMPASS (º + )
COMPASS (º ° )
COMPASS (K + )
COMPASS (K ° )
COMPASS (h+ )
COMPASS (h° )
ARGUS (º ± , Q = 10.0)
d/u
x(d¯ + ū)
BABAR (º ± , Q = 10.5)
0.8
0.6BELLE (º ± , Q = 10.5)
TASSO (º ± , Q = 12.0)
TASSO (º ± , Q = 14.0)
xuv 0.6
TASSO (º ± , Q = 22.0)
TASSO (º ± , Q = 34.0) 0.6
0.4TASSO (º ± , Q = 44.0)
TPC (º ± , Q = 29.0)
JAM 0.4
TPC (º ± , Q = 29.0)
0.4
TPC(c) (º ± , Q = 29.0)
TPC(b) (º ± , Q = 29.0) CJ15
0.2TOPAZ (º ± , Q = 58.0)
SLD (º ± , Q = 91.3) 0.2 0.2
SLD(c) (º ± , Q = 91.3)
SLD(b) (º ± , Q = 91.3) xdv NNPDF3.1
ALEPH (º ± , Q = 91.2)
OPAL (º ± , Q = 91.2)
OPAL(c) (º ± , Q = 91.2)
x(d¯ ° ū) Rs = (s + s̄)/(d¯ + ū)

OPAL(b) (º ± , Q = 91.2)
0.06
DELPHI (º ± , Q = 91.2)
µ2 = 10 GeV2
DELPHI(b) (º ± , Q = 91.2)
ARGUS (K ± , Q = 10.0) 1.5 4
BABAR (K ± , Q = 10.5)
0.04
BELLE (K ± , Q = 10.5)
TASSO (K ± , Q = 14.0) 3
TASSO (K ± , Q = 22.0)
TASSO (K ± , Q = 34.0)
TPC (K ± , Q = 29.0)
1.0
0.02 TPC (K ± , Q = 29.0)
TOPAZ (K ± , Q = 58.0)
2
SLD (K ± , Q = 91.3)
SLD(c) (K ± , Q = 91.3)
SLD(b) (K ± , Q = 91.3)
0.5 1 xg
0.00
ALEPH (K ± , Q = 91.2)
OPAL (K ± , Q = 91.2)
OPAL(c) (K ± , Q = 91.2)
OPAL(b) (K ± , Q = 91.2)
0.01
DELPHI (K ± , Q = 91.2)
DELPHI (K ± , Q = 91.2)
DELPHI(b) (K ± , Q = 91.2)
0.03 0.1 0.3 x 0.01 0.03 0.1 0.3 x 0.01 0.03 0.1 0.3 x
ALEPH (h± , Q = 91.2)
DELPHI (h± , Q = 91.2)
SLD (h± , Q = 91.3)
TPC (h± , Q = 29.0)
OPAL (h± , Q = 91.2)
OPAL(b) (h± , Q = 91.2)
OPAL(c) (h± , Q = 91.2)
Moffat, WM, Rogers, Sato
TASSO (h± , Q = 35.0)
TASSO (h± , Q = 44.0)
DELPHI(b) (h± , Q = 91.2)
arXiv:2101.04664 (“JAM20”)
SLD(c) (h± , Q = 91.3)
SLD(b) (h± , Q = 91.3)
0 1 2 3 4 °4 °3 °2 °1 0 1 2 3 4
¬2red E [residual]
Why are PDF uncertainties important?
In searches for new physics beyond the Standard Model
a major source of uncertainty on limits/discoveries is from
calculation of QCD backgrounds PDF errors!
“The PDF and αS uncertainties were calculated using the PDF4LHC prescription
[39] with the MSTW2008 68% CL NNLO [40, 41], CT10 NNLO [42, 43], and
NNPDF2.3 5f FFN [44] PDF sets, and added in quadrature to the scale uncertainty.”
Measurements of the charge asymmetry in top-quark pair production in the
dilepton final state at √s = 8 TeV with the ATLAS detector PRD 94, 032006 (2016)
drives a large part of the global PDF community (esp. LHC)
Limits understanding of nucleon structure

e.g. momentum and spin distributions of d quarks at large x
motivation for several JLab12 experiments
(MARATHON, BONuS, SoLID, …)
10
Traditionally extracted from neutron / proton
structure function ratio (where “neutron” ~ deuteron - proton),
but large nuclear uncertainties affect high-x region
cannot discriminate between predictions for d/u at x ~ 1
1
CJ12min
CJ12mid
0.8
CJ12max
2
= 100
0.6
d/u
SU(6)
0.4
DSE
0.2 helicity
0 scalar qq
0 0.2 0.4 0.6 0.8 1
x
Owens, Accardi, WM (2013)
11
Different groups use different definitions of PDF uncertainties
to take into account tensions between data sets
p
multiply uncertainties by “tolerance” factor T = 2
1
CJ15
0.8 MMHT14
CT14
JR14
CJ15: 2
= 2.7
0.6
MMHT: 2
⇡ 25 100
d/u
0.4
CT14: 2
⇡ 100
0.2 JR14: 2
=1
0
0 0.2 0.4 0.6 0.8
x
… is this a meaningful comparison?
12
Need for new technology
A major challenge has been to characterize PDF uncertainties
— in a statistically meaningful way — in the presence of
tensions among data sets
Previous attempts sought to address tensions in data sets

by introducing
“tolerance” factors (artificially inflating PDF errors)
“neural net” parametrization (instead of polynomial
parametrization), together with MC techniques
However, to address the problem in a more statistically

rigorous way, one requires going beyond the standard
2 minimization paradigm
utilize modern techniques based on Bayesian statistics!
13
Bayesian approach to fitting
Analysis of data requires estimating expectation values E
and variances V of “observables” O (= PDFs, FFs) which are
functions of parameters ~a
Z
E[O] = dn a P(~a|data) O(~a)
Z
n
⇥ ⇤2
V [O] = d a P(~a|data) O(~a) E[O]
“Bayesian master formulas"
Using Bayes’ theorem, probability distribution P given by

1
P(~a|data) = L(data|~a) ⇡(~a)
Z
in terms of the likelihood function L

14
Likelihood function
✓ ◆
1 2
L(data|~a) = exp (~a)
2
is a Gaussian form in the data, with 2
function
X ✓ data i theoryi (~a)
◆2
2
(~a) =
i
(data)
with priors ⇡(~a) and “evidence” Z

Z
Z= dn a L(data|~a) ⇡(~a)
Z tests if e.g. an n-parameter fit is statistically different

from (n+1)-parameter fit
15
Two methods generally used for computing Bayesian
master formulas:
Maximum Likelihood Monte Carlo

( 2
minimization)
16
master formulas:
Maximum Likelihood
( 2
minimization)
maximize probability distribution P by minimizing 2
for a set of best-fit parameters ~a0

E [ ~a ] = ~a0
if O is ⇡ linear in the parameters, and if probability is

symmetric in all parameters
E [ O(~a) ] ⇡ O(~a0 )
17
master formulas:
Maximum Likelihood
( 2
minimization)
variance computed by expanding O(~a) about ~a0

e.g. in 1 dimension have “master formula”
1h i2
V [O] ⇡ O(a + a) O(a a)
4
where
a2 = V [a]
18
master formulas:
Maximum Likelihood
( 2
minimization)
generalization to multiple dimensions via Hessian approach:

find set of (orthogonal) contours in parameter space around ~a0
such that L along each contour is parametrized by statistically
independent parameters — directions of contours given by
eigenvectors êk of Hessian matrix H, with elements
1 @ 2 2 (~a)
Hij =
2 @ai @aj ~a=~a0
êk
and contours parametrized as a (k)
=a (k)
a 0 = tk p ,
vk
with vk eigenvectors of H
19
master formulas:
Maximum Likelihood
( 2
minimization)
basic assumption: P factorizes along each eigendirection

Y
P( a) ⇡ Pk (tk )
k
where h ⇣
1 2 êk ⌘i
Pk (tk ) = Nk exp a 0 + tk p
2 vk
2
note: in quadratic approximation for , this becomes a
normal distribution
20
master formulas:
Maximum Likelihood
( 2
minimization)
uncertainties on O along each eigendirection

(assuming linear approximation)
 ⇣ ⌘ ⇣ êk ⌘
2
1 ê k
( Ok ) 2 ⇡ O a 0 + Tk p O a0 Tk p
4 vk vk
where Tk is finite step size in tk , with total variance

X 2
V [O] = ( Ok )
k
21
master formulas:
Monte Carlo
in practice, generally one has E[O(~a)] 6= O(E[~a])

so the maximal likelihood method will sometimes fail
Monte Carlo approach samples parameter space and
assigns weights wk to each set of parameters ak
expectation value and variance are then weighted averages

X X 2
E[O(~a)] = wk O(~ak ), V [O(~a)] = wk O(~ak ) E[O]
k k
22
master formulas:
Maximum Likelihood Monte Carlo

( 2
minimization)
fast slow
assumes Gaussianity does not rely on
Gaussian assumptions
no guarantee that global
minimum has been found includes all possible
solutions
errors only characterize
local geometry of accurate
2 function
23
Incompatible data sets
Incompatible data sets can arise because of errors
in determining central values, or underestimation
of systematic experimental uncertainties
requires some sort of modification to standard statistics
Modify the master formula by introducing a “tolerance”

factor T
V [O] ! T 2 V [O]
e.g. for one dimension
T2h i2
V [O] = O(a + a) O(a a)
4
effectively modifies the likelihood function
24
Simple example: consider observable m , and two measurements
(m1 , m1 ), (m2 , m2 )
compute exactly the 2

function
✓ ◆2 ✓ ◆2
2 m m1 m m2
= +
m1 m2
and, from Bayesian master formula, the mean value

m1 m22 + m2 m21
E[m] =
m21 + m22
and variance does not
1 m21 m22 depend on
V [m] = H = m1 m2 !
m21 + m22
25
Simple example: consider observable m , and two measurements
(m1 , m1 ), (m2 , m2 )
same m2
L(m|m1, m2)
1.5 1.5 different m2

1.0 1.0
0.5 0.5
0.0 0.0
°4 °2 0 2 4 °4 °2 0 2 4
m m
total uncertainty remains independent of degree of

(in)compatibility of data
Gaussian likelihood gives unrealistic representation
of true uncertainty
26
Realistic example: recent CJ (CTEQ-JLab) global PDF analysis
10000
0.40 1
1
0.35
0.30
8000
eigen-direction # 3
24 parameters,
Likelihood
0.25 6000
¢¬2
0.20
0.15
12
5 8
29
27
3
4000
33 data sets
0.10 17
18
2
2000
11
13 10 12
5
3428
0.05 30
8
29
27
7 31 19 3
17
18
3332 15 24
6 2
11
20
23
25
22 26 4 9 14 21
16 28
34
13
30
10
7
0.00 0 31
4
33
15
32
26
14
20
9
25
23
22
16
6
24
21
19
°100 °50 0 50 100 °100 °50 0 50 100
100
0.40 1
data sets
0.35 80
0.30
Likelihood
60
compatible
0.25
¢¬2
0.20
40
along this
12
0.15 5
29
8
27 3
0.10 17 18
2
20
e-direction
28
0.05 34
30
31
0.00 0
°10 °5 0 5 10 °10 °5 0 5 10
0.6
8 (0) a1uv (12) a2du (0) TOTAL (17) e866pp06xf
(1) a2uv (13) a4du (1) HerF2pCut (18) H2 CC em
0.4 (2) slac p (19) d0run2cone
(2) a4uv (14) a1g
(3) d0Lasy13 (20) d0 gamjet1
(3) a1dv (15) a2g
0.2 18 (4) e866pd06xf (21) CDFrun2jet
1
(4) a2dv (16) a3g
4 9 15 20 (5) BNS F2nd (22) d0 gamjet3
0.0 (5) a3dv (17) a4g
Projection
2 6 10 11 12 13 14 16 17 19 21 (6) NmcRatCor (23) d0 gamjet2

5 22
3 23 (6) a4dv (18) a6dv (7) slac d (24) d0 gamjet4
°0.2 (7) a0ud (19) off1 (8) D0 Z (25) jl00106F2d
0
(8) a1ud (20) off2 (9) H2 NC ep 3 (26) HerF2dCut
°0.4 (9) a2ud (21) ht1 (10) H2 NC ep 2 (27) BcdF2dCor
(10) a4ud (22) ht2 (11) H2 NC ep 1 (28) CDF Z
°0.6 (11) a1du (23) ht3 (12) H2 NC ep 4 (29) D0 Wasy

(13) CDF Wasy (30) H2 NC em
(14) H2 CC ep (31) jl00106F2p
°0.8 (15) cdfLasy05 (32) d0Lasy e15
7
(16) NmcF2pCor (33) BcdF2pCor
°1.0
27
9
0.6 10000
1 18
0.5
34 8
3
8000
eigen-direction #18 13
24 parameters,
0.4
Likelihood
6000
¢¬2
0.3 32
33 data sets
5
26
30
4000 10
15
33
0.2 4
1231
16
11
2 20
19
728
11
171427
29
20 18 2000
0.1 24
6 22 19
13
10
2116 9 23 25 15
22
21
25
24
23
0.0 0
°100 °50 0 50 100 °100 °50 0 50 100
0.6 100
1
0.5
data sets not

34 8
80
3
0.4
Likelihood
60
compatible
¢¬2
0.3 32
5
26 40
along this
30
33
0.2 4
12 31
2
7 28
11
17 29 14 27
20 20
0.1 18
e-direction
6 22
13 19
10
16 23
9
0.0 0
°10 °5 0 5 10 °10 °5 0 5 10
0.8 2
(0) a1uv (12) a2du (0) TOTAL (17) e866pp06xf
0.6 (2) slac p (19) d0run2cone
(2) a4uv (14) a1g
(3) a1dv (15) a2g
0.4 (4) e866pd06xf (21) CDFrun2jet
(4) a2dv (16) a3g
22 (5) BNS F2nd (22) d0 gamjet3
0.2 4 (5) a3dv (17) a4g
Projection
6 (6) NmcRatCor (23) d0 gamjet2

16
1 3 9 11 (6) a4dv (18) a6dv (7) slac d (24) d0 gamjet4
7 8 12 15 18 20 23
0.0 (7) a0ud (19) off1 (8) D0 Z (25) jl00106F2d

13 14 19
0
5
10 17
(10) a4ud (22) ht2 (11) H2 NC ep 1 (28) CDF Z

(14) H2 CC ep (31) jl00106F2p
°0.6 21 (15) cdfLasy05 (32) d0Lasy e15
°0.8
28
9
0.6 10000
1 18
0.5
34 8
3
8000
eigen-direction #18 13
24 parameters,
0.4
Likelihood
6000
¢¬2
0.3 32
33 data sets
5
26
30
4000 10
15
33
0.2 4
1231
16
11
2 20
19
728
11
171427
29
20 18 2000
0.1 24
6 22 19
13
10
2116 9 23 25 15
22
21
25
24
23
0.0 0
°100 °50 0 50 100 °100 °50 0 50 100
0.6 100
1
0.5
data sets not

34 8
80
3
0.4
Likelihood
60
compatible
¢¬2
0.3 32
5
26 40
along this
30
33
0.2 4
12 31
2
7 28
11
17 29 14 27
20 20
0.1 18
e-direction
6 22
13 19
10
16 23
9
0.0 0
°10 °5 0 5 10 °10 °5 0 5 10
0.8 2
(0) a1uv (12) a2du (0) TOTAL (17) e866pp06xf
0.6
standard Gaussian likelihood incapable of accounting for

(2) a4uv (14) a1g (2) slac p (19) d0run2cone
(3) a1dv (15) a2g
0.4 (4) e866pd06xf (21) CDFrun2jet
(4) a2dv (16) a3g
22 (5) BNS F2nd (22) d0 gamjet3
0.2 4 (5) a3dv (17) a4g
underestimated individual errors (leading to incompatible data sets)

Projection
6 (6) NmcRatCor (23) d0 gamjet2

16
1 3 9 11 (6) a4dv (18) a6dv (7) slac d (24) d0 gamjet4
7 8 12 15 18 20 23
0.0 (7) a0ud (19) off1 (8) D0 Z (25) jl00106F2d

13 14 19
0
5
10
— not designed for such scenarios!

17
(10) a4ud (22) ht2 (11) H2 NC ep 1 (28) CDF Z

(14) H2 CC ep (31) jl00106F2p
°0.6 21 (15) cdfLasy05 (32) d0Lasy e15
°0.8
29
CTEQ tolerance criteria
Eigenvector 4
40
30
BCDMSp
BCDMSd
CCFR2
CCFR3
CDFjet
NMCp
CDFw
NMCr
ZEUS
20
D0jet
E605
E866
H1b
H1a
2
10
T (⇡ 10)
distance
0
p
!10
!20
!30 eigen-direction #4 CTEQ6
for each experiment, find minimum 2

along given e-direction
from 2
distribution determine 90% CL for each experiment
along each side of e-direction, determine maximum range d±
k
allowed by the most constraining experiment
T computed by averaging over all d±
k (typically T ⇠ 5 10)
30
CTEQ tolerance criteria
Eigenvector 4
40
30
BCDMSp
BCDMSd
CCFR2
CCFR3
CDFjet
NMCp
CDFw
NMCr
ZEUS
20
D0jet
E605
E866
H1b
H1a
2
10
T (⇡ 10)
distance
0
p
!10
!20
!30 eigen-direction #4 CTEQ6
This approach is not consistent with Gaussian likelihood

no clear Bayesian interpretation of uncertainties
(ultimately, a prescription…)
31
To summarize standard maximum likelihood method…
Gradient search (in parameter space) depends how “good” the
starting point is
for ~30 parameters trying different starting points is
impractical, if do not have some information about shape
Common to free parameters initially, then freeze those
not sensitive to data ( 2 flat locally)
introduces bias, does not guarantee that flat 2
globally
Cannot guarantee solution is unique
Error propagation characterized by quadratic 2 near minimum

no guarantee this is quadratic globally (e.g. Student t-distribution?)
Introduction of tolerance modifies Gaussian statistics
32
Monte Carlo
Designed to faithfully compute Bayesian master formulas
Do not assume a single minimum, include all possible solutions

(with appropriate weightings)
Do not assume likelihood is Gaussian in parameters
Allows likelihood analysis to be extended to address tensions

among data sets via Bayesian inference
More computationally demanding compared with Hessian method
33
Monte Carlo
First group to use MC for global PDF analysis was NNPDF,
using neural network to parametrize P (x) in
Forte et al. (2002)
f (x) = N x↵ (1 x) P (x)
— ↵, are fitted “preprocessing coefficients”
Iterative Monte Carlo (IMC), developed by JAM Collaboration,

variant of NNPDF, tailored to non-neutral net parametrizations
Markov Chain MC (MCMC) / Hybid MC (HMC)

— recent “proof of principle” analysis, ideas from lattice QCD
Gbedo, Mangin-Brinet (2017)
Nested sampling (NS) — computes integrals in Bayesian master

formulas (for E, V, Z) explicitly Skilling (2004)
34
Iterative Monte Carlo (IMC)
Use traditional functional form for input distribution shape,
but sample significantly larger parameter space than possible
in single-fit analyses

fit
sampler priors fit posteriors
fit
no assumptions for exponents
original data
cross-validation to avoid
pseudo
overfitting
prior
data
training
data
validation
data iterate until convergence
as initial
guess
fit criteria satisfied
parameters from
minimization steps validation
posterior
35
e.g. of convergence (for fragmentation functions) in IMC
1.4 1.4 1.4 1.4 1.4 1.4
u+ d+ s+ c+ b+ g
1.0 1.0 1.0 1.0 1.0 1.0
initial
0.6 0.6 0.6 0.6 0.6 0.6
0.2 0.2 0.2 0.2 0.2 0.2

0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z
1.4 1.4 1.4 1.4 1.4 1.4

intermediate
1.0 º+ 1.0 1.0 1.0 1.0 1.0 º+

0.6 0.6 0.6 0.6 0.6 0.6
K+ K+
0.2 0.2 0.2 0.2 0.2 0.2
0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z
1.4 1.4 1.4 1.4 1.4 1.4

intermediate
1.0 1.0 1.0 1.0 1.0 1.0
0.6 0.6 0.6 0.6 0.6 0.6
0.2 0.2 0.2 0.2 0.2 0.2

0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z
1.4 1.4 1.4 1.4 1.4 1.4
1.0 1.0 1.0 1.0 1.0 1.0 ~ 20

final
0.6 0.6 0.6 0.6 0.6 0.6 iterations

0.2 0.2 0.2 0.2 0.2 0.2
0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z 0.2 0.4 0.6 0.8 z
Sato, Ethier, WM, Hirai, Kumano, Accardi (2016)

36
Rigorous (Bayesian) way to address incompatible data sets
is to use generalization of Gaussian likelihood
joint vs. disjoint distributions
empirical Bayes
hierarchical Bayes
others, used in different fields
37
Disjoint distributions
Instead of using total likelihood that is a product (“and”)
of individual likelihoods, e.g. for simple example of two
measurements
L(m1 m2 |m; m1 m2 ) = L(m1 |m; m1 ) ⇥ L(m2 |m; m2 )
use instead sum (“or”) of individual likelihoods

1h i
L(m1 m2 |m; m1 m2 ) = L(m1 |m; m1 ) + L(m2 |m; m2 )
2
gives rather different expectation value and variance

1
E[m] = (m1 + m2 )
2
✓ ◆2
1 m1 m2
V [m] = ( m21 + m22 ) +
2 2
depends on
separation!
38
Symmetric uncertainties m1 = m2
joint
m2 )
L(m1 m12,|m)
1.5 1.5
1.0 1.0
L(m|m
disjoint
0.5 0.5
0.0 0.0
°4 °2 0 2 4 °4 °2 0 2 4
m m N. Sato
✓ ◆2
1 m1 m2
disjoint: V [m] = ( m21 + m22 ) +
2 2
m21 m22
joint: V [m] =
m21 + m22
39
Asymmetric uncertainties m1 6= m2
3 3
L(m|m1, m2)
|m)
2 2
1 m 2
1 1
L(m
0 0
°4 °2 0 2 4 °4 °2 0 2 4
m m
N. Sato
disjoint likelihood gives broader overall uncertainty,

overlapping individual (discrepant) data
40
Empirical Bayes
Shortcoming of conventional Bayesian — still assume
prior distribution follows specific form (e.g. Gaussian)
Extend approach to more fully represent prior uncertainties,

with final uncertainties that do not depend on initial choices
In generalized approach, data uncertainties modified by

distortion parameters, whose probability distributions given
in terms of “hyperparameters” (or “nuisance parameters”)
Hyperparameters determined from data

give posteriors for both PDF and hyperparameters
41
Empirical Bayes
Simple example of EB for symmetric & asymmetric errors
10 25
8 20 experiment 1
1–æ
P(t|D)
6 15 experiment 2
Hessian
fit
4 10
Empirical
EB fit Bayes
2 5
0 0
°1.0 °0.5 0.0 0.5 1.0 °0.5 °0.4 °0.3 °0.2 °0.1 0.0 0.1
10 25
8 20
6–æ
P(t|D)
6 15
4 10
2 5
0 0
°1.0 °0.5 0.0 0.5 1.0 °0.5 °0.4 °0.3 °0.2 °0.1 0.0 0.1
10 25
8 20
12–æ
P(t|D)
6 15
4 10
2 5
0 0
°1.0 °0.5 0.0 0.5 1.0 °0.5 °0.4 °0.3 °0.2 °0.1 0.0 0.1
t t
N. Sato
42
PDFs in lattice QCD
Recent progress in extracting x dependence of PDFs in
lattice QCD from matrix element of nonlocal operator
Mq (z, Pz ) = hN (Pz )| q (0, z) z W(z, 0) q (0, 0)|N (Pz )i
Z 1
= dy eiyPz z qe(y, Pz )
1
quasi-PDF qe related to light-cone PDF via matching kernel e

C
Z 1
dy e ⇣ x ⌘
q(x, µ) = C , µ, Pz qe(y, Pz , µ)
1 |y| y
Conflicting results on sign of d¯ ū asymmetry

2.5
6
MSTW
2.0 CJ12
Lattice 4
1.5
u-d
1.0 2
0.5
0
0
d¯ > ū -0.5 d¯ < ū -2

-0.4 -0.2 0 0.2 0.4 0.6 0.8 1.0 -1 -0.5 0 0.5 1
x
Lin et al. (2015) Alexandrou et al. (2018)
43
PDFs in lattice QCD
Fit lattice observable directly within JAM framework
1 exp
0.4 exp+lat
Re Mu°d
x(u ° d)
0 lat (DFT)
exp
exp+lat 0.2
°1 lat Q2 = 4 GeV2
0.0
1
0.1
Im Mu°d
x(ū ° d)
¯
0
0.0
°1
°10 0 10 10°2 10°1 100
z/a x
Bringewatt, Sato, WM, Qiu, Steffens, Constantinou (2021)
reatively weak impact of present lattice data on

unpolarized PDF determination
44
PDFs in lattice QCD
difficult to fit current

lattice matrix element
data for unpolarized
u-d PDFs
45
PDFs in lattice QCD
1.5 0.75
x(¢u ° ¢d)
Re M¢u°¢d
1.0 0.50
0.5 0.25
Q2 = 4 GeV2
0.0 0.00
0.2
1.0 exp
x(¢ū ° ¢d)
Im M¢u°¢d
¯
exp+lat
0.0
0.0
exp
°0.2 exp+lat
-1.0 lat (DFT)
°10 0 10 10°2 10°1 100
z/a x
Bringewatt, Sato, WM, Qiu, Steffens, Constantinou (2021)
better agreement between lattice and experiment for

polarized PDFs (within larger uncertainties)
46
PDFs in lattice QCD
good fits possible to

both experimental
and lattice data for
polarized u-d PDFs
47
Outlook
New approaches being developed for global QCD analysis
— simultaneous determination of parton distributions
using Monte Carlo sampling of parameter space
— towards synthesis of lattice QCD data with global analysis
Treatment of discrepant data sets needs attention

— Bayesian perspective has clear merits
Near-term future: “universal” QCD analysis of all observables

sensitive to collinear (unpolarized & polarized) PDFs and FFs
Longer-term: apply MC technology to global QCD analysis

of transverse momentum dependent (TMD) PDFs and FFs
Outlook
Study of PDFs has brought together essential elements
of nuclear and high-energy physics
1
CJ15 u
0.8 d
d¯ + ū
0.6 d¯ ° ū
xf (x, Q2)
s
g/10
0.4
0.2
Q2 = 10 GeV2
0
10°4 10°3 10°2 0.1 0.3 0.5 0.7 0.9

x
BSM
nucleon physics
structure
neutrino astrophysics
oscillations
chiral
physics
quark-hadron
duality
49

Melnitchouk21 Hugs

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Melnitchouk21 Hugs

Uploaded by

Copyright:

Available Formats

HUGS Topical Seminar

June 16, 2021

QCD factorization: separation of hard (perturbative, calculable)

process-independent parton distribution functions fa/A

low x, high Q2 (“LHC”)

Information on PDFs obtained from

Extraction of PDFs is challenging because usually

101 Q2 = 2.0 GeV2

100 Q2 = 2.9 GeV2

10°1 x = 0.65 Q2 = 4.0 GeV2 (i = 0)

Accardi et al. (“CJ15”)

x(d¯ ° ū) Rs = (s + s̄)/(d¯ + ū)

drives a large part of the global PDF community (esp. LHC)

Limits understanding of nucleon structure

… is this a meaningful comparison?

Previous attempts sought to address tensions in data sets

However, to address the problem in a more statistically

utilize modern techniques based on Bayesian statistics!

“Bayesian master formulas"

Using Bayes’ theorem, probability distribution P given by

in terms of the likelihood function L

with priors ⇡(~a) and “evidence” Z

Z tests if e.g. an n-parameter fit is statistically different

Maximum Likelihood Monte Carlo

maximize probability distribution P by minimizing 2

for a set of best-fit parameters ~a0

if O is ⇡ linear in the parameters, and if probability is

variance computed by expanding O(~a) about ~a0

generalization to multiple dimensions via Hessian approach:

basic assumption: P factorizes along each eigendirection

uncertainties on O along each eigendirection

where Tk is finite step size in tk , with total variance

in practice, generally one has E[O(~a)] 6= O(E[~a])

expectation value and variance are then weighted averages

Maximum Likelihood Monte Carlo

Modify the master formula by introducing a “tolerance”

effectively modifies the likelihood function

compute exactly the 2

and, from Bayesian master formula, the mean value

1.5 1.5 different m2

total uncertainty remains independent of degree of

°100 °50 0 50 100 °100 °50 0 50 100

2 6 10 11 12 13 14 16 17 19 21 (6) NmcRatCor (23) d0 gamjet2

°0.6 (11) a1du (23) ht3 (12) H2 NC ep 4 (29) D0 Wasy

data sets not

6 (6) NmcRatCor (23) d0 gamjet2

0.0 (7) a0ud (19) off1 (8) D0 Z (25) jl00106F2d

°0.4 (11) a1du (23) ht3 (12) H2 NC ep 4 (29) D0 Wasy

data sets not

standard Gaussian likelihood incapable of accounting for

underestimated individual errors (leading to incompatible data sets)

6 (6) NmcRatCor (23) d0 gamjet2

0.0 (7) a0ud (19) off1 (8) D0 Z (25) jl00106F2d

— not designed for such scenarios!

°0.4 (11) a1du (23) ht3 (12) H2 NC ep 4 (29) D0 Wasy

!30 eigen-direction #4 CTEQ6

for each experiment, find minimum 2

!30 eigen-direction #4 CTEQ6

This approach is not consistent with Gaussian likelihood

Cannot guarantee solution is unique

Error propagation characterized by quadratic 2 near minimum

Introduction of tolerance modifies Gaussian statistics

Do not assume a single minimum, include all possible solutions