
What's All This About P ≠ NP?

Ken Clarkson Ron Fagin Ryan Williams


P vs. NP
A mathematical issue, not a legal one
P and NP:
Each is a set of computational problems
Each is described differently
Are they actually the same set?
A million dollar problem
A Clay Millennium Prize
Most everyone thinks P ≠ NP
the problem is to prove it
On August 6, Vinay Deolalikar proposed a proof
Taking this proposed proof seriously
People claim proofs all the time
Every couple of months on arXiv, someone claims P=NP
But:
D. is a Principal Research Scientist at HP.
Steve Cook:
This appears to be a relatively serious claim...
Dick Lipton:
...this is a serious effort...
Moshe Vardi:
This looks like a serious paper...
However:
It doesn't look like the proof goes through.
Finding flaws can take time
Four-color Theorem
Proven 1879
Bug found 1890
Proven 1976 (using a computer)
Hilbert's 21st problem
Solved 1908
Counterexample 1990
Hilbert's 16th problem, special case
Solved 1923
Gaps 1980
Solved 1991
Finding flaws in internet time
August 6: Manuscript is sent to 22 people,
including Ron Fagin, and put on webpage
7: Blog post [Greg Baker]
8: Slashdot, Lipton's blog
9: Wikipedia article about D. (deleted later)
10: Wiki for technical discussion established
Based on comment thread on Lipton's blog
About 340 edits since
Fields Medalists are involved
15: Commemorative blogpost:
The P≠NP Proof Is One Week Old
Updates in internet time
First draft, Aug 6
Overwritten several times
Second draft Aug 9 to Aug 10
Draft 2 + , Aug 9 to Aug 11
Third draft, Aug 11 to Aug 17
All drafts removed after Aug 17
D. says: the paper has been sent out for
refereeing
Three-page synopsis, Aug 13
Only current public version
Elements of the proposed proof
Finite Model Theory
Part of mathematical logic
Impact on database theory, combinatorics, and complexity
theory
Ron Fagin is the founder of FMT
Ron will introduce P vs. NP, and explain the role of FMT
Random k-SAT
Analogs in statistical physics
Ryan Williams was a key player in the on-line discussions
Post-doc in K53, IBM Raviv Fellow
Ryan gave a beautifully simple counter-argument to this part
Discovery vs. Verification
Two important tasks for a scientist are discovery
of solutions, and verification of other people's
solutions.
It is easier to check that a solution, say to a
puzzle, is correct than to find the
solution.
That is, verifying a solution is easier than
discovering it.
Example: Sudoku
The P vs. NP question asks whether
verification is easier than discovery
What is P?
Polynomial Time
The class of problems where the solution can
be discovered quickly
In time polynomial in the size of the input
Example 1: Given a number, is it even?
Example 2: Given a graph, is it connected?
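To make "discovered quickly" concrete for Example 2, here is a minimal sketch of a connectivity test using breadth-first search. The function name `is_connected` and the edge-list representation are illustrative assumptions, not something from the talk. The running time is linear in the number of vertices plus edges, so it is polynomial in the input size.

```python
from collections import deque

def is_connected(n, edges):
    """Return True if the undirected graph on vertices 0..n-1 is connected."""
    if n == 0:
        return True
    # Build adjacency lists from the edge list.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Breadth-first search from vertex 0; each vertex is visited at most once.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    # The graph is connected exactly when the search reached every vertex.
    return len(seen) == n
```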
What is NP?
Nondeterministic Polynomial time
The class of problems where the solution can
be verified quickly
In time polynomial in the size of the input
Example 1: Sudoku
A filled-in puzzle gives a quick verification.

Example 2: 3-colorability
Quick verification of 3-colorability
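A minimal sketch of what quick verification can look like for 3-colorability: given a proposed coloring, one pass over the edges suffices. The function name and the use of the letters "R", "G", "B" for colors are assumptions made for illustration.

```python
def is_valid_3_coloring(n, edges, coloring):
    """Check a proposed 3-coloring of a graph on vertices 0..n-1.

    coloring[v] is the color of vertex v and must be "R", "G", or "B".
    """
    if len(coloring) != n:
        return False
    if any(c not in ("R", "G", "B") for c in coloring):
        return False
    # Valid exactly when no edge joins two vertices of the same color.
    return all(coloring[u] != coloring[v] for u, v in edges)
```

The check takes time proportional to the number of vertices plus edges, which is why 3-colorability is in NP: the coloring itself serves as the certificate.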
Does P = NP?
For our examples (Sudoku and 3-coloring), it is
not known if they are in P.
P vs. NP
Problems in P: efficient discovery of a solution
Problems in NP: efficient verification of a solution

The problem of whether P = NP asks:
Assume it is easy to verify a solution.
Is it easy to discover a solution?

We can always discover a solution by brute-force search,
but there are exponentially many candidate solutions to check.
Can we do better?

Consider the needle in a haystack metaphor.
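The brute-force search mentioned above can be sketched for 3-colorability as follows. This is only an illustration under assumed names (`find_3_coloring`, colors written as "R", "G", "B"): it tries all 3^n colorings of an n-vertex graph, so each check is fast but the number of candidates grows exponentially.

```python
from itertools import product

def find_3_coloring(n, edges):
    """Search all 3**n colorings of vertices 0..n-1; return one that works, or None."""
    for coloring in product("RGB", repeat=n):
        # Verification of a single candidate is quick: one pass over the edges.
        if all(coloring[u] != coloring[v] for u, v in edges):
            return coloring
    return None
```

The P vs. NP question, for this problem, is whether the exponential loop over candidates can be replaced by something fundamentally faster.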


NP-complete problems
NP-complete problems are the hardest
problems in NP
Examples: Sudoku and 3-colorability
If there is a fast (polynomial-time) algorithm for
one NP-complete problem, then there is a fast
algorithm for every problem in NP!
For example, a fast algorithm for Sudoku implies
P=NP.
Why is a proof that P ≠ NP important?
A number of important problems in industry (such as
flight scheduling, chip layout, and protein folding) are
NP-complete. A proof that P ≠ NP would tell us that we
cannot expect to get optimal answers in practice.
Cryptography is based on the assumption that P ≠ NP.
Proving that P ≠ NP is a stepping stone towards
provably secure cryptography.
A proof that P ≠ NP would give us deep insight into the
nature of computation, which would have many ripple
effects.
For example, Wiles's proof of Fermat's Last Theorem led to other
fundamental advances in number theory.
Maybe P = NP?
Then the world is fundamentally different than is commonly
believed.
Bad news: P = NP would destroy the standard model
of complexity theory
Much previous research would become useless.
Good news: P = NP would probably imply that we can
solve problems efficiently that we can't now.
Bad news: P = NP would probably imply that current
cryptographic systems can be broken.
Radically new approaches to security would be needed.
The P vs. NP problem has been called
one of the deepest questions ever
asked by human beings.

The blog author who said this bet his
house against Deolalikar's proof.
SAT
Given a logical formula that is an and of ors, is
there a solution (an assignment of 0s and 1s to
the variables that makes the formula true)?
Example:
(x1 OR NOT(x2) OR x3)
AND (x2 OR NOT(x3))
AND (NOT(x1) OR NOT(x4))
A solution: x1 = 1, x2 = 0, x3 = 0, x4 = 0.
The set of such solutions is called the solution space.
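As a small illustration of the solution space, the sketch below encodes the example formula and enumerates every satisfying assignment by brute force. The encoding (a signed integer i for xi and -i for NOT(xi)) and the function names are assumptions chosen for this sketch.

```python
from itertools import product

# The example formula: (x1 OR NOT(x2) OR x3) AND (x2 OR NOT(x3)) AND (NOT(x1) OR NOT(x4)).
# Literal i means xi, literal -i means NOT(xi).
FORMULA = [[1, -2, 3], [2, -3], [-1, -4]]

def satisfies(formula, assignment):
    """assignment is a tuple of 0/1 values; assignment[i - 1] is the value of xi."""
    return all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in formula
    )

def solution_space(formula, n):
    """Return all satisfying assignments over n variables (exponential in n)."""
    return [a for a in product((0, 1), repeat=n) if satisfies(formula, a)]
```

For instance, `satisfies(FORMULA, (1, 0, 0, 0))` checks the solution given above, and `solution_space(FORMULA, 4)` lists the whole solution space.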

Cook's Theorem (1971): SAT is NP-complete.

k-SAT: each clause has exactly k members.
This problem is also NP-complete for k ≥ 3.
Strategy of Deolalikar's Proof
If k-SAT were in P, then the solution spaces for
all k-SAT formulas would have a simple
structure.
For some k-SAT formulas, the solution spaces
for these formulas do not have a simple
structure.
Therefore, k-SAT is not in P, and so P ≠ NP.

The proof Deolalikar gives for the first bullet uses
finite model theory
Existential second-order logic
3-colorability can be expressed quite informally as:
∃ a coloring (the coloring is a 3-coloring of the graph)
A little more formally as:
∃R ∃G ∃B (Every point is in exactly one of the sets R, G, or
B, and no two points that are connected by an edge are
both in R, or both in G, or both in B)
This formula can be expressed formally in existential
second-order logic (∃SO).
So 3-colorability can be expressed in ∃SO.
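One way to write the sentence out in full is sketched below. The symbol E for the edge relation is an assumption of this sketch; R, G, and B are the quantified sets of points from the description above.

```latex
\exists R\, \exists G\, \exists B\; \Bigl[\,
  \forall x\, \bigl( (R(x) \lor G(x) \lor B(x))
      \land \lnot (R(x) \land G(x))
      \land \lnot (R(x) \land B(x))
      \land \lnot (G(x) \land B(x)) \bigr)
  \;\land\;
  \forall x\, \forall y\, \bigl( E(x,y) \rightarrow
      \lnot (R(x) \land R(y))
      \land \lnot (G(x) \land G(y))
      \land \lnot (B(x) \land B(y)) \bigr)
\,\Bigr]
```

The first conjunct says every point is in exactly one of the three sets; the second says no edge has both endpoints in the same set. Only the leading quantifiers over sets are second-order; everything inside the brackets is first-order.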
Capturing NP with logic
Fagin's Theorem (1974): NP = ∃SO

Example: 3-colorability

Surprising, since it characterizes a complexity class
in terms of logic, where there is no notion of
machine, computation, polynomial, or time.
How about P?
Fagin's Theorem captures NP in terms of logic.
Can we also capture P in terms of logic?
Answer: Yes (sort of).
Capturing P with logic
There is a logic called least fixpoint logic (LFP).
It is richer than first-order logic (it involves
recursion).

Immerman-Vardi Theorem (1982): P = LFP (over
ordered structures)
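To give a feel for the recursion that LFP adds, a standard example is reachability in a graph, which is definable in LFP but not in first-order logic. The sketch below is not from the talk: it computes the least fixed point of the rule "T(x,y) holds if E(x,y), or if E(x,z) and T(z,y) for some z" by iterating from the empty relation until nothing changes. The function name is an assumption.

```python
def reachable_pairs(edges):
    """Least fixed point of: T(x, y) <- E(x, y) or (E(x, z) and T(z, y))."""
    edges = set(edges)
    closure = set()  # stage 0 of the induction: the empty relation
    while True:
        # Apply the rule once to the current stage.
        new = edges | {(x, y) for (x, z) in edges for (z2, y) in closure if z == z2}
        if new == closure:
            return closure  # fixed point reached
        closure = new
```

Each stage can only add pairs, and there are at most n^2 pairs on n points, so the iteration stops after polynomially many stages. That bound is the intuition behind the "LFP is contained in P" direction of the theorem.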
Back to Deolalikar's proof strategy
Recall that the first part of Deolalikar's proof strategy says that if k-SAT
were in P, then the solution spaces for all k-SAT formulas would
have a simple structure.
Deolalikar's proof of this first part proceeds as follows:
1. Assume that k-SAT is in P.
2. So k-SAT can be expressed in LFP, by the Immerman-Vardi
Theorem.
3. LFP implies a simple structure for solution spaces.
4. So solution spaces for k-SAT formulas have a simple structure.
Unfortunately, Deolalikar's proof of step 3 works only for a fragment of
LFP (the monadic case).
This was pointed out by Immerman on Lipton's blog.
So k-SAT is not necessarily covered in step 3.
Strategy of Deolalikar's Proof
If k-SAT were in P, then the solution spaces for all
k-SAT formulas would have a simple structure.
For some k-SAT formulas, the solution spaces for these
formulas do not have a simple structure.
Therefore, k-SAT is not in P, and so P ≠ NP.

We just saw that there was an error in Deolalikar's proof of
the first bullet.
But maybe the first bullet can be proven another way.
Ryan will now discuss the second bullet.
Strategy of Deolalikar's Proof
If k-SAT were in P, then the solution spaces for
all k-SAT formulas would have a simple
structure.
For some k-SAT formulas, the solution spaces
for these formulas do not have a simple
structure.
Therefore, k-SAT is not in P, and so P ≠ NP.

Deolalikar proposes to choose certain random
k-SAT formulas, and use known properties of
their solution spaces
Random k-SAT
Recall k-SAT: Satisfiability of Boolean formulas as AND of ORs
n variables (0-1), m clauses, each clause has k literals

F = (x1 OR NOT(x2) OR x3)
    AND (x2 OR NOT(x3) OR NOT(x4))
    AND (NOT(x1) OR NOT(x2) OR NOT(x3))

Here we have n=4, m=3, k=3


Given a formula F, is F satisfiable?
Is there a setting of variables that makes F evaluate to 1?

Random k-SAT:
Fix n, m, k, and choose m clauses at random
Study the percentage of random formulas that are satisfiable
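The experiment described above can be sketched for very small n as follows. The sampling model here is an assumption of this sketch (each clause picks k distinct variables uniformly and negates each with probability 1/2), and the function names are illustrative. Satisfiability is decided by brute force, so this is only practical for a handful of variables.

```python
import random
from itertools import product

def random_k_sat(n, m, k, rng):
    """Return m random clauses; literal i means xi, literal -i means NOT(xi)."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)  # k distinct variables
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula

def is_satisfiable(formula, n):
    """Brute-force check over all 2**n assignments."""
    for a in product((0, 1), repeat=n):
        if all(any((a[abs(l) - 1] == 1) == (l > 0) for l in c) for c in formula):
            return True
    return False

def fraction_satisfiable(n, m, k, trials, seed=0):
    """Estimate the fraction of random formulas with these parameters that are satisfiable."""
    rng = random.Random(seed)
    hits = sum(is_satisfiable(random_k_sat(n, m, k, rng), n) for _ in range(trials))
    return hits / trials
```

Calling `fraction_satisfiable` for a fixed n and k while sweeping m gives a rough picture of how satisfiability depends on the clause-to-variable ratio m/n.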
Random k-SAT
Random k-SAT:
Fix n, m, k, and choose clauses at random
(Monasson et al., Nature 1999) As the clause-to-variable
ratio increases, we see a phase transition in SAT:
random formulas switch from being almost all satisfiable to
almost all unsatisfiable

[Figure: percent of random formulas satisfiable, and relative run time, plotted against the clause-to-variable ratio m/n (0 to 10). Annotation: "Looks like this is where the hard formulas are!"]
Random k-SAT
What do the formulas undergoing this transition
from almost all satisfiable to almost all
unsatisfiable look like, on average?
(Mézard et al., Science 2002) For random k-SAT, there
are actually three phases:
1. a replica-symmetric phase where the solutions are all in one
big cluster together, then
2. a replica-symmetry-breaking satisfiable (RSB) phase with
exponentially many clusters of solutions, each cluster being
far from all the others, and finally
3. a replica-symmetry-breaking unsatisfiable phase with no
solutions.
Here the distance measure is Hamming distance:
e.g. (1,1,1,1) and (0,0,0,0) have distance 4,
(1,0,0,0) and (0,0,0,0) have distance 1
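For reference, the Hamming distance used here is just the number of positions in which two assignments differ. A minimal sketch, with an assumed function name:

```python
def hamming_distance(a, b):
    """Number of coordinates in which two equal-length 0/1 tuples differ."""
    if len(a) != len(b):
        raise ValueError("assignments must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

With this definition, `hamming_distance((1, 1, 1, 1), (0, 0, 0, 0))` is 4 and `hamming_distance((1, 0, 0, 0), (0, 0, 0, 0))` is 1, matching the examples above.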
The RSB Satisfiable Phase of k-SAT
Exponentially many clusters of solutions, each cluster
being far from all the others

Deolalikar's proof focuses on analyzing formulas
arising from this RSB satisfiable phase.
Certainly some complex-looking structure here
Can this be the reason that k-SAT is hard?
The SAT0 Objection
SAT0: Formulas that are satisfied when you set
every variable to zero.
This problem is definitely in P. Very easy.
However, we can show that for every k-SAT
formula, there is a SAT0 formula with an
isomorphic solution space. All distances
between solutions are preserved.
So whatever complex structure you may have in
the solution space of a random k-SAT formula,
there are always SAT0 formulas with analogous
structure!
The SAT0 Objection

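(0,0,…,0)

Take any k-SAT formula F and one of its solutions
(A1,…,An), where Ai ∈ {0,1} for all i.
Create the formula F' as follows:
for every Ai = 1,
change all xi in F to NOT(xi), and all NOT(xi) to xi.
Then (0,0,…,0) satisfies F', so F' is a SAT0 formula, and its solutions are
exactly the solutions of F with the coordinates where Ai = 1 flipped.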
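The construction can be checked mechanically on small formulas. In the sketch below, the literal encoding (i for xi, -i for NOT(xi)) and all function names are assumptions. `to_sat0` performs the literal flipping, and `shift` maps a solution of the original formula to the corresponding solution of the new one by flipping the same coordinates.

```python
from itertools import product

def satisfies(formula, assignment):
    """assignment[i - 1] is the 0/1 value of xi."""
    return all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in formula
    )

def solution_space(formula, n):
    """All satisfying assignments over n variables, by brute force."""
    return [a for a in product((0, 1), repeat=n) if satisfies(formula, a)]

def to_sat0(formula, solution):
    """Flip every literal on xi for each i with solution[i - 1] == 1."""
    return [
        [-lit if solution[abs(lit) - 1] == 1 else lit for lit in clause]
        for clause in formula
    ]

def shift(assignment, solution):
    """Flip the coordinates of assignment where solution has a 1 (bitwise XOR)."""
    return tuple(a ^ s for a, s in zip(assignment, solution))

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))
```

Because `shift` flips the same coordinates in every solution, two solutions differ in a given position before the shift exactly when they differ there afterwards. That is why all Hamming distances, and hence any cluster structure, carry over unchanged.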
The SAT0 Objection

(0,0,…,0)

What does this say?
The difficulty of k-SAT doesn't arise from distinguishing
satisfiable formulas with simple structure from those with
complex structure, but rather from distinguishing satisfiable
formulas from unsatisfiable formulas.
Still, this is just intuition...
The intuition is realized

Theorem (Proved by "vloodin" and Terence Tao)
Under the notion of "simple" given in the paper,
k-SAT does have simple solution spaces!

Proof Idea: First show that all SAT0 formulas have
"simple" solution spaces, then use the SAT0 objection
to translate this space over for an arbitrary k-SAT
instance.

So unfortunately the proof breaks in its current form.
Can we salvage something from it?
Terence Tao's car analogy (paraphrased):
The paper is like a lengthy blueprint for a revolutionary new car that
somehow combines a high-tech engine with advanced fuel injection to
get 200 miles to the gallon.
The LFP objections are like a discovery of serious wiring faults in the
engine, but the inventor claims this can be fixed using a weak engine.
The solution space objections are like a discovery that, according to the
blueprints, the car would run just as well if gasoline were replaced with
ordinary tap water. D.'s response to this has been roughly: "That
objection is invalid; everyone knows cars can't run on water."
The theorem (on the previous slide) is like a discovery that the fuel is in
fact being sent to a completely different component of the car than the
engine.
Can any parts of this car be salvaged?
Concluding remarks
Deolalikar's proof seems to be not only wrong,
but unfixable.
Hardness and solution space complexity seem
to be orthogonal.
New research question: can random k-SAT be
used to prove complexity results?
There is a new world of community refereeing.
Good: every part of the proof had corresponding
experts
Bad: those experts spent a great deal of time
The community is still learning how to work
effectively in this new world.
