Lecture9 DistributionTesting

Lecture 9: Property Testing for Distributions
Problem Statement
• Let 𝒫 be a family of distributions on a domain Ω
• Given an unknown distribution p on Ω, distinguish whether:
  • p ∈ 𝒫, or
  • p is ε-far from 𝒫 (in statistical distance, defined below).
Access to the Distribution
• Oracle access: i.i.d. samples x_1, …, x_m ∼ p.
Distance Between Distributions
Statistical distance (total variation distance): Δ(p, q) = max_{A⊆Ω} |p(A) − q(A)|
For A ⊆ Ω: p(A) = Σ_{x∈A} p(x)
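Statistical Distance vs. ℓ1 Distance
• A distribution is a vector, p ∈ [0,1]^Ω with Σ_{x∈Ω} p(x) = 1
• Distance between vectors: ‖p − q‖_1 = Σ_{x∈Ω} |p(x) − q(x)|
• Claim: Δ(p, q) = ½ · ‖p − q‖_1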
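The claim can be checked numerically on a tiny domain. The sketch below is only an illustration: the function names and the example distributions are assumptions, and the brute-force search over events is exponential in the domain size.

```python
from itertools import combinations

def half_l1_distance(p, q):
    """Half the l1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def max_event_gap(p, q):
    """Brute-force max over all events A of |p(A) - q(A)| (tiny domains only)."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for size in range(len(support) + 1):
        for event in combinations(support, size):
            gap = abs(sum(p.get(x, 0.0) - q.get(x, 0.0) for x in event))
            best = max(best, gap)
    return best

# Hypothetical example distributions on {a, b, c}.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}
print(half_l1_distance(p, q), max_event_gap(p, q))
```

Both quantities agree on this example; the maximizing event is the set of elements where p exceeds q.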
Δ(p, q) = max_{A⊆Ω} |p(A) − q(A)| ≥ ½ · ‖p − q‖_1, where ‖p − q‖_1 = Σ_{x∈Ω} |p(x) − q(x)|
What event A maximizes |p(A) − q(A)|?
[Figure: p and q plotted over Ω]
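Δ(p, q) = max_{A⊆Ω} |p(A) − q(A)| ≥ ½ · ‖p − q‖_1, where ‖p − q‖_1 = Σ_{x∈Ω} |p(x) − q(x)|
Let A = {x ∈ Ω : p(x) > q(x)}.
Since Σ_{x∈Ω} (p(x) − q(x)) = 0, the positive and negative parts of p − q have equal mass, so p(A) − q(A) = ½ · ‖p − q‖_1.
[Figure: p and q plotted over Ω]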
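Δ(p, q) = max_{A⊆Ω} |p(A) − q(A)| ≤ ½ · ‖p − q‖_1, where ‖p − q‖_1 = Σ_{x∈Ω} |p(x) − q(x)|
Let A = {x ∈ Ω : p(x) > q(x)}.
Let B ⊆ Ω be any event.
Then: p(B) − q(B) ≤ p(B ∩ A) − q(B ∩ A) ≤ p(A) − q(A) = ½ · ‖p − q‖_1
(the first step drops terms with p(x) ≤ q(x), the second adds terms with p(x) > q(x); the bound on q(B) − p(B) is symmetric, using the complement of A).
[Figure: p and q plotted over Ω]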
Why Test Properties of Distributions?
• Suppose an algorithm 𝒜 is designed assuming that a distribution p it relies on satisfies p ∈ 𝒫
• Example:
  • Randomness: uniformly random
  • Noise: Gaussian
• When p ∈ 𝒫: guaranteed bound on the probability that 𝒜 errs
• What if we’re not sure?
1. Test whether p ∈ 𝒫 or p is ε-far from 𝒫
2. If tester says “p is ε-far from 𝒫”: abort
3. If tester says “p ∈ 𝒫”: run 𝒜
Testing Uniformity
• Question: is p uniform on [n], or ε-far from uniform?
• Strategy:
1. Take m samples x_1, …, x_m ∼ p
2. Count collisions: how many pairs i < j s.t. x_i = x_j?
3. Few collisions ⇒ accept, lots of collisions ⇒ reject
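Step 2 of the strategy can be sketched as follows. The function name is an assumption; counting via per-value multiplicities gives the same number as comparing all pairs i < j, but in linear time.

```python
from collections import Counter

def count_collisions(samples):
    """Number of pairs i < j with samples[i] == samples[j]."""
    counts = Counter(samples)
    # A value seen c times contributes c choose 2 colliding pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())

print(count_collisions([1, 2, 1, 3, 1, 2]))
```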
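Basic Observation
• The uniform distribution minimizes the collision probability: Pr_{x,y∼p}[x = y] = Σ_{x∈[n]} p(x)² ≥ 1/n, with equality iff p is uniform
[Figure: two histograms with vertical axes from 0 to 1, labeled “Uniform” and “Far From Uniform”]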
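Collision Probability
• Lemma: Pr_{x,y∼p}[x = y] = Σ_{x∈[n]} p(x)² = 1/n + Σ_{x∈[n]} (p(x) − 1/n)²
• Cauchy-Schwarz: Σ_{x∈[n]} (p(x) − 1/n)² ≥ (1/n) · (Σ_{x∈[n]} |p(x) − 1/n|)²
• Corollary: if p is ε-far from uniform, then Pr_{x,y∼p}[x = y] ≥ (1 + 4ε²)/n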
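A small numeric check of the collision-probability identity and the Cauchy-Schwarz step, as a sketch. The example distribution and all names are assumptions chosen for illustration.

```python
def collision_probability(p):
    """Probability that two independent samples from p coincide: sum of p(x)^2."""
    return sum(px * px for px in p)

def l1_distance_to_uniform(p):
    """Sum over x of |p(x) - 1/n|, for p given as a list over [n]."""
    n = len(p)
    return sum(abs(px - 1.0 / n) for px in p)

p = [0.4, 0.3, 0.2, 0.1]  # hypothetical distribution on n = 4 elements
n = len(p)
excess = collision_probability(p) - 1.0 / n          # collision prob. minus 1/n
squared_l2 = sum((px - 1.0 / n) ** 2 for px in p)    # squared l2 distance to uniform
cauchy_schwarz_bound = l1_distance_to_uniform(p) ** 2 / n
print(excess, squared_l2, cauchy_schwarz_bound)
```

The excess over 1/n equals the squared ℓ2 distance to uniform, and both are at least the Cauchy-Schwarz lower bound.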
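Analyzing the Tester
Threshold: accept iff the number of collisions is at most a threshold placed between the two expectations below
• Let X_{ij} indicate x_i = x_j, so E[X_{ij}] = Pr_{x,y∼p}[x = y]
• If p is uniform: E[X_{ij}] = 1/n
• If p is ε-far from uniform: E[X_{ij}] ≥ (1 + 4ε²)/n
[Figure: number line marking the expectations for “uniform” and “ε-far”]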
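Putting the pieces together, a collision-based tester might look like the sketch below. The midpoint threshold (1 + 2ε²) · C(m, 2)/n is an assumption made here for illustration, since the exact threshold formula is not recoverable from the slide; the function name is also hypothetical.

```python
from collections import Counter
from math import comb

def collision_tester(samples, n, eps):
    """Accept (True) iff the collision count is at most the assumed threshold."""
    m = len(samples)
    collisions = sum(c * (c - 1) // 2 for c in Counter(samples).values())
    expected_if_uniform = comb(m, 2) / n
    # Assumption: threshold halfway between 1/n and (1 + 4 eps^2)/n per pair.
    threshold = (1 + 2 * eps ** 2) * expected_if_uniform
    return collisions <= threshold

print(collision_tester(list(range(100)), n=1000, eps=0.5))  # no collisions at all
print(collision_tester([0] * 100, n=1000, eps=0.5))         # every pair collides
```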
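Concentration Bound
Want: Pr[|X − μ| ≥ δμ] to be small, where μ = E[X] and δ is the allowed relative deviation
• Let X = Σ_{i<j} X_{ij}
• The X_{ij} are not independent!
• Claim: if m ≥ C · √n/δ² for C large enough, then Pr[|X − μ| ≥ δμ] is at most a small constant
Chebyshev: Pr[|X − μ| ≥ t] ≤ Var[X]/t²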
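Bounding the Variance
• For a single indicator: Var[X_{ij}] ≤ E[X_{ij}²] = E[X_{ij}], since X_{ij} ∈ {0, 1}
• Independence?
  X_{ij} and X_{kl} are independent if {i, j} ∩ {k, l} = ∅
• What’s left? Pairs of indicators that share exactly one index (triplets)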
The Contribution of Triplets
• Triple collision: x_i = x_j = x_k, so E[X_{ij} · X_{jk}] = Pr[x_i = x_j = x_k] = Σ_{x∈[n]} p(x)³
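Bounding the Variance
Var[X] = Σ_{i<j} Var[X_{ij}] + Σ_{{i,j}≠{k,l}} Cov(X_{ij}, X_{kl})
• Pairs (first sum): at most Σ_{i<j} E[X_{ij}] = μ
• Triplets + quadruplets (second sum): Cov(X_{ij}, X_{kl}) ≤ Σ_{x∈[n]} p(x)³ for triplets, 0 for quadruplets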
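The triplet term can be computed exactly for a small distribution by enumerating all triples of samples. This is only a sketch; the function name and the example distributions are assumptions.

```python
from itertools import product

def triplet_covariance(p):
    """Exact Cov(X_12, X_23) for three i.i.d. samples from p, by enumeration."""
    n = len(p)
    e_joint = e_12 = e_23 = 0.0
    for x1, x2, x3 in product(range(n), repeat=3):
        weight = p[x1] * p[x2] * p[x3]
        e_12 += weight * (x1 == x2)
        e_23 += weight * (x2 == x3)
        e_joint += weight * (x1 == x2 == x3)
    return e_joint - e_12 * e_23

skewed = [0.5, 0.3, 0.2]      # hypothetical non-uniform distribution
uniform = [1 / 3, 1 / 3, 1 / 3]
print(triplet_covariance(skewed), triplet_covariance(uniform))
```

The covariance equals Σ p(x)³ − (Σ p(x)²)², which is at most Σ p(x)³. It vanishes for the uniform distribution, so overlapping indicators are only correlated when p is non-uniform.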
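How Small Is δ?
(δ is the allowed relative deviation of the collision count from its expectation)
Threshold: accept iff the number of collisions is at most the threshold
[Figure: number line marking the expected collision counts for “uniform” and “ε-far”]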
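How Small Is δ?
• Prevent flip from “no” to “yes”: if p is ε-far, a drop by a (1 − δ) factor below the expectation must still leave the collision count above the threshold
• Prevent flip from “yes” to “no”: if p is uniform, a rise by a (1 + δ) factor above the expectation must still leave the collision count at or below the threshold
• Setting δ = Θ(ε²) we get both: the two expectations differ by a factor of at least 1 + 4ε²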
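The two conditions can be checked for concrete values of ε and δ. The sketch below normalizes the uniform expectation to 1 and assumes a threshold at 1 + 2ε² (an illustrative choice, not taken from the slides); the function name is hypothetical.

```python
def deviations_are_safe(eps, delta):
    """Check both no-flip conditions, with expectations normalized so uniform = 1."""
    mu_uniform = 1.0
    mu_far = 1 + 4 * eps ** 2        # smallest possible expectation when eps-far
    threshold = 1 + 2 * eps ** 2     # assumed threshold between the two
    uniform_stays_accepted = (1 + delta) * mu_uniform <= threshold
    far_stays_rejected = (1 - delta) * mu_far > threshold
    return uniform_stays_accepted and far_stays_rejected

print(deviations_are_safe(eps=0.5, delta=0.2))
print(deviations_are_safe(eps=0.1, delta=0.03))
```

Under these assumptions δ has to shrink like ε²: the second call fails because δ = 0.03 exceeds 2ε² = 0.02.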
Overall Sample Complexity
• m = O(√n/ε⁴) samples
• Can be improved to O(√n/ε²)
Testing Identity to any Fixed Distribution
Identity Testing
• Fix a known distribution q on [n]
• To test whether p = q or Δ(p, q) ≥ ε:
1. “Discretize” the distributions into p′, q′
2. Check whether p′ = q′ by reduction to uniformity testing
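Identity Testing for Discretized Distributions
• Idea: “flatten” the distribution
Flattening map: if q(x) = k/N, map x to (x, j) where j is uniform in [k]
[Figure: a distribution on {a, b} (left) and its flattened version on (a,1), (a,2), (a,3), (b,1), (b,2) (right); vertical axes from 0 to 1 in steps of 1/5]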
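Flattening map: if q(x) = k/N, map x to (x, j) where j is uniform in [k]
• What is the flattened q? Each pair (x, j) has probability q(x) · (1/k) = 1/N, so it is uniform on N elements
• What is the flattened p? Each pair (x, j) has probability p(x)/k, so its statistical distance from the flattened q is exactly Δ(p, q)
Note: p doesn’t need to be discretized!
Sample complexity?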
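A minimal sketch of the flattening map, using exact fractions so the claims can be checked without rounding error. The probabilities q(a) = 3/5 and q(b) = 2/5 are an assumption suggested by the figure's labels; the distribution p and all function names are hypothetical.

```python
from fractions import Fraction
import random

def flatten_sample(x, grains, rng=random):
    """Map a sample x to (x, j) with j uniform in {1, ..., grains[x]}."""
    return (x, rng.randint(1, grains[x]))

def flatten_distribution(p, grains):
    """Exact distribution of flatten_sample(x) when x is drawn from p."""
    return {(x, j): p[x] / grains[x]
            for x in p for j in range(1, grains[x] + 1)}

def statistical_distance(p, q):
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys) / 2

grains = {"a": 3, "b": 2}                        # q(x) = grains[x] / 5
q = {"a": Fraction(3, 5), "b": Fraction(2, 5)}
p = {"a": Fraction(1, 5), "b": Fraction(4, 5)}   # some other distribution
flat_q = flatten_distribution(q, grains)
flat_p = flatten_distribution(p, grains)
print(flat_q)
print(statistical_distance(p, q), statistical_distance(flat_p, flat_q))
```

Note that only q's grain counts are used: the same map is applied to samples from p, which is why p itself need not be discretized.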
Discretizing the Distribution
• Idea: round probabilities to multiples of γ/n
• How?
[Figure: a distribution over Ω before and after rounding; vertical axes from 0 to 1]
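Discretizing the Distribution
• Idea: round probabilities to multiples of γ/n
• How?
Attempt #1: given i ∼ q,
Let j = ⌊q(i) / (γ/n)⌋
W.p. j(γ/n) / q(i): output i
Else: output ⊥
Then Pr_{F(q)}[i] = q(i) · j(γ/n)/q(i) = j(γ/n)
[Figure: q over Ω (left) and F(q) over Ω ∪ {⊥} (right); vertical axes from 0 to 1]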
• Problem: very small values ↦ 0
• Example: two different distributions q_1, q_2 whose probabilities are all below γ/n become indistinguishable:
  F(q_1)(i) = F(q_2)(i) = 0 for i ∈ [n], and 1 for i = ⊥
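The rounding filter of Attempt #1, and the way it sends small probabilities to 0, can be sketched as follows. The value γ = 1/2, the example distribution, and all names are assumptions for illustration; `None` stands for ⊥.

```python
from fractions import Fraction
import random

BOTTOM = None  # stands for the failure symbol ⊥

def rounding_filter(i, q, grain, rng=random):
    """Keep sample i w.p. j*grain/q(i), where j = floor(q(i)/grain); else output ⊥."""
    j = q[i] // grain
    keep_probability = j * grain / q[i]
    return i if rng.random() < keep_probability else BOTTOM

def rounded_distribution(q, grain):
    """Exact output distribution of the filter when i is drawn from q."""
    out = {i: (qi // grain) * grain for i, qi in q.items()}
    out[BOTTOM] = 1 - sum(out.values())
    return out

n = 4
grain = Fraction(1, 2) / n  # gamma / n with the illustrative choice gamma = 1/2
q = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(3, 20), 3: Fraction(1, 20)}
rounded = rounded_distribution(q, grain)
print(rounded)
```

Every output probability is a multiple of γ/n, but element 3, whose probability is below γ/n, is rounded all the way down to 0 and its mass moves to ⊥.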
• Solution:
• First “smooth” by mixing with the uniform distribution
• Ensures:
• No element has probability “too small”
• Statistical distance (roughly) preserved
• Then apply the rounding filter
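Mixing With the Uniform Distribution
Mixing filter:
• Given x ∼ p:
  • W.p. ½: output x
  • W.p. ½: output uniform element in [n]
• The output has probability ≥ 1/(2n) for any x ∈ [n]
• What happens to Δ(p, q)? Mixing both p and q with the uniform distribution U halves it: Δ(½p + ½U, ½q + ½U) = ½ · Δ(p, q)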
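A short sketch of the mixing step, with exact fractions to confirm the two properties. Elements are indexed 0, …, n−1 here, and the names and example distributions are assumptions.

```python
from fractions import Fraction
import random

def mix_sample(x, n, rng=random):
    """W.p. 1/2 keep x; w.p. 1/2 output a uniform element of {0, ..., n-1}."""
    return x if rng.random() < 0.5 else rng.randrange(n)

def mixed_distribution(p):
    """Exact distribution of mix_sample(x) when x is drawn from p (a list over [n])."""
    n = len(p)
    return [px / 2 + Fraction(1, 2 * n) for px in p]

def statistical_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

p = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]  # point mass
q = [Fraction(1, 4)] * 4                                  # uniform on 4 elements
mixed_p, mixed_q = mixed_distribution(p), mixed_distribution(q)
print(min(mixed_p), statistical_distance(p, q), statistical_distance(mixed_p, mixed_q))
```

Every element ends up with probability at least 1/(2n), at the cost of halving the distance that the tester has to detect.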
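Applying the Rounding Filter
Rounding filter, applied after mixing (so q below denotes the smoothed distribution), for a suitably small γ:
• Given i,
• Let j = ⌊q(i) / (γ/n)⌋
• W.p. j(γ/n) / q(i): output i
• Else: output ⊥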
Identity Testing
• Fix a known distribution q on [n]
• To test whether p = q or Δ(p, q) ≥ ε:
1. “Discretize” the distributions into p′, q′
2. Check whether p′ = q′ by reduction to uniformity testing
Lower Bound on Uniformity Testing
Testing Uniformity
• Question: is p uniform on [n], or ε-far from uniform?
• Strategy:
1. Take m samples x_1, …, x_m ∼ p, where m = O(√n/ε⁴)
2. Count collisions: how many pairs i < j s.t. x_i = x_j?
3. Few collisions ⇒ accept, lots of collisions ⇒ reject
Lower Bound for Uniformity
• Claim: Ω(√n) samples required for testing uniformity with ε = ½.
• Observation:
  • Uniformity is preserved under name-changes
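Label-Invariant Testers
• A tester T is label invariant if its decision is preserved under name-changes:
  Pr[T(x_1, …, x_m) accepts] = Pr[T(π(x_1), …, π(x_m)) accepts] for any permutation π of [n]
• Any label-invariant property has a label-invariant tester: apply a uniformly random permutation π to the sample labels, then run the original tester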
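The random-relabeling construction can be sketched as a wrapper around an arbitrary tester. Elements are indexed 0, …, n−1, and the wrapper name and the toy tester are assumptions for illustration.

```python
import random

def make_label_invariant(tester, n, rng=random):
    """Wrap a tester so it first applies a uniformly random permutation of [n]."""
    def wrapped(samples):
        permutation = list(range(n))
        rng.shuffle(permutation)
        return tester([permutation[x] for x in samples])
    return wrapped

# Toy tester (hypothetical): accept iff the samples contain no collision.
def no_collision_tester(samples):
    return len(set(samples)) == len(samples)

wrapped = make_label_invariant(no_collision_tester, n=10, rng=random.Random(0))
print(wrapped([1, 2, 3]), wrapped([1, 1, 3]))
```

Relabeling is a bijection, so it preserves which samples collide; a tester whose decision depends only on the collision pattern is unaffected by the wrapper.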
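Lower Bound on Uniformity
• Claim: Ω(√n) samples required for testing uniformity with ε = ½.
• Assume T is label-invariant with m = o(√n) samples
• If p is uniform on [n]: Pr[T sees a collision] = o(1)
• Let p′ be uniform on [n/2], which is ½-far from uniform on [n]
• Under p′: Pr[T sees a collision] = o(1) as well, so a label-invariant T that sees no collision cannot tell p′ from uniform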
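The birthday-style calculation behind the lower bound can be made concrete. The sketch below computes the exact probability of seeing no collision among m samples from a uniform distribution; the function name and the chosen values of n and m are assumptions.

```python
def no_collision_probability(m, domain_size):
    """Probability that m i.i.d. uniform samples from a domain are all distinct."""
    probability = 1.0
    for i in range(m):
        # The (i+1)-th sample must avoid the i distinct values seen so far.
        probability *= max(0.0, 1 - i / domain_size)
    return probability

n = 10 ** 6
m = 10  # far below sqrt(n) = 1000
print(no_collision_probability(m, n), no_collision_probability(m, n // 2))
```

With m well below √n, both the uniform distribution on [n] and the uniform distribution on half the domain produce collision-free samples almost always, so the collision pattern carries almost no information.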
End (Part II)