Last week, Vinay Deolalikar presented a claimed proof that P is not
equal to NP. I'd like to discuss it because (a) it's an important
problem, (b) the paper has some interesting ideas that are relevant
to the way we design and use ontologies and the languages for
expressing them, and (c) it suggests ways of organizing ontologies
and using languages to express them. (01)
Deolalikar's paper: (02)
http://www.hpl.hp.com/personal/Vinay_Deolalikar/Papers/pnp12pt.pdf (03)
Two blogs that discuss this paper and whether the proof is correct: (04)
http://rjlipton.wordpress.com/ (05)
http://scottaaronson.com/blog/?p=456 (06)
I share some of the concerns mentioned in these blogs. But whether
or not there are flaws in Deolalikar's paper, he makes points that
are valuable in themselves, independently of their use in the proof. (07)
In the discussion at the end of this note, I quote some excerpts
from the paper and add some comments about their implications. (08)
As that discussion shows, the complexity class for many kinds
of problems can be determined by checking the problem statement.
If a problem is in an easy class, solve it quickly. But if it's
in the NPhard class, warn the user and don't attempt to solve it,
unless the user insists. (09)
For the design of languages and systems, it shows that there
is a choice between different design options: (010)
1. Restrict a language to make it impossible to state problems
that are inefficient (NP hard or worse). (011)
2. Allow the people who state a problem to use any language
they find convenient. Before trying to solve a problem,
the system can check its complexity class and warn the
user(s) before trying to solve problems that might take
excessive amounts of resources. (012)
Option #1 can make it impossible to state solvable cases of
hard problems. Option #2 is more flexible, because it makes
the determination at run time. (013)
John Sowa
__________________________________________________________________ (014)
Following are quotations from Deolalikar's paper with my comments
about their implications. The page numbers are the ones in the paper
itself, which are 4 less than the numbers by the Adobe counter. (015)
p. iii (or 3 by the Adobe reader count) is the abstract: (016)
"Our work shows that every polynomial time algorithm must fail to
produce solutions to large enough problem instances of kSAT in the
d1RSB phase." (017)
kSAT is the set of problems in satisfying a collection of Boolean
assertions with k variables. Any NPhard problem (which takes an
exponential amount of time to solve) can be reduced to some kSAT
problem that also takes exponential time. (018)
The d1RSB phase is the "dynamical one step replica symmetry breaking"
phase. Outside this phase, the problems are easy (P time). But in
this phase, the problems are hard (exponential time). (019)
"This shows that polynomial time algorithms are not capable of
solving NPcomplete problems in their hard phases, and demonstrates
the separation of P from NP." (020)
The heart of the proof shows that some kSAT problems lie in the
d1RSB phase. But as a byproduct, it also shows that large numbers
of problems are easy (P time) because they are outside that phase. (021)
p. 51 (or 55 by the Adobe reader count) (022)
"While a given CSP [constraint satisfaction problem] ... might
be NPcomplete, many instances of the CSP might be quite easy to
solve, even using fairly simple algorithms. Furthermore, such
'easy' instances lay in certain well defined regimes of the CSP,
while 'harder' instances lay in clearly separated regimes." (023)
This is good news for programmers and knowledge engineers,
because the method of proof provides important criteria: (024)
1. For distinguishing easy problems from hard problems, and (025)
2. For stating problems in such a way that they are guaranteed
to stay outside the hard phase. (026)
p. 53: (027)
"physicists have conjectured that there is a second threshold that
divides the SAT phase into two  an 'easy' SAT phase, and a 'hard'
SAT phase. In both phases, there is a solution with high probability,
but while in the easy phase one giant connected cluster of solutions
contains almost all the solutions, in the hard phase this giant
cluster shatters into exponentially many communities that are far
apart from each other in terms of least Hamming distance between
solutions that lie in distinct communities." (028)
What Deolalikar did was to prove this conjecture. Even more
importantly, he showed how to recognize most of the easy cases.
There may be some easy cases near the boundary, but most of
them are clearly distinguishable from the hard cases. (029)
p. 68: (030)
"The central concern of this work has been to understand what are
the irreducible interactions between the variables in a system 
namely, those that cannot be expressed in terms of interactions
between smaller sets of variables with conditional independencies
between them." (031)
p. 81: (032)
"This explains why polynomial time algorithms fail when interactions
between variables are O(n) at a time, without the possibility of
factoring into smaller pieces through conditional independencies." (033)
In other words, hard problems have longdistance interactions among
most of the variables. The symbol O(n) means that most variables
interact with some significant percentage of the other n variables,
including many that are in remote regions of the problem statement. (034)
"This also puts on rigorous ground the empirical observation that
even NPcomplete problems are easy in large regimes, and become
hard only when the densities of constraints increase above a
certain threshold." (035)
By "empirical observation", he means what most programmers and
knowledge engineers know: large numbers of such problems are
easy to solve, and they only become hard when the density of
longdistance interconnections becomes high. (036)
"This threshold is precisely the value where O(n) interactions
first appear in almost all randomly constructed instances." (037)
That is the criterion for distinguishing easy problems from
hard problems. (038)
p. 82: (039)
"The framework we have constructed allows us to analyze the set
of polynomial time algorithms simultaneously, since they can all
be captured by some LFP [leastfixedpoint method], instead of
dealing with each individual algorithm separately. It makes
precise the notion that polynomial time algorithms can take
into account only interactions between variables that grow
as poly(log n), not as O(n)." (040)
In other words, the easy problems are ones for which the number
of interactions among variables grows very slowly (log n) as the
the number n of variables increases. (041)
For hard problems, each variable is connected to a significant
fraction O(n) of all other variables. Such problems are also
characterized by a high density of variables in each Boolean
assertion: i.e., each assertion contains O(n) variables. (042)
In terms of graph isomorphism, there is a clear criterion for
easy problems: Easy graphs are those for which an upper bound on
the number of links to any node is determined by some polynomial
in (log n). For typical conceptual graphs and RDF graphs on the
WWW, this kind of upper bound is almost always true. That means
graph matching can be done in polynomial time. (043)
p. 86: (044)
"polynomial time algorithms succeed by successively 'breaking up'
the problem into smaller subproblems that are joined to each
other through conditional independence." (045)
This is what every programmer and knowledge engineer knows:
modularize your programs or ontologies. (046)
"Polynomial time algorithms resolve the variables in CSPs in a
certain order, and with a certain structure. This structure is
important in their study." (047)
It also suggests algorithms for determining whether a problem
is easy or hard. In other words, you can perform a simple test
on the problem description before wasting time on NP hard cases. (048)
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontologforum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontologforum/
Unsubscribe: mailto:ontologforumleave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgibin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontologforum@xxxxxxxxxxxxxxxx (049)
