« Core dump | Main | Free (as in beer) speech »

December 12, 2005

N can't be 1, so P must be 0.

American Scientist has an article about Sudoku that is both fascinating and infuriating. Fascinating because it describes the convoluted history of Sudoku, or as it is known in Japan, `Number Place'. (Yes, in English.) Infuriating because, despite claiming to be an article about computering science, the author gets the cornerstone result embarassingly wrong:

Computer science has an elaborate hierarchy for classifying problems according to difficulty, and the question of where Sudoku fits into this scheme has elicited some controversy and confusion. It is widely reported that Sudoku belongs in the class NP, a set of notoriously difficult problems; meanwhile, however, many computer programs effortlessly solve any order-3 Sudoku puzzle. There is actually no contradiction in these facts, but there is also not much help in dispelling the confusion.

Complexity classes such as NP do not measure the difficulty of any specific problem instance but rather describe the rate at which difficulty grows as a function of problem size. If we can solve an order-n Sudoku, how much harder will we have to work to solve a puzzle of order n + 1? For problems in NP, the effort needed grows exponentially.

Umm... not quite.

NP stands for `Nondeterministic Polynomial time'. A problem is in the class NP if an afirmative answer can be verified by an algorithm whose running time grows only polynomially as a function of the input size. Yes, Sudoku is in NP—to prove that the puzzle has a solution, just show me the solution—but so is the problem `Are these numbers in sorted order?' Membership in NP is a statement about how easy a problem is, not how hard.

Similarly, a problem is in the class P if it can be solved from scratch in polynomial time.

A (The? No, just a) central open question in theoretical computer science is whether P=NP—is it just as easy to solve problems from scratch as it is to verify their solutions? It is widely (but not quite universally) believed that the answer is no. Most computer scientists believe that some problems in NP require exponential time. Sudoku is one of those problems. But we have no proof (or disproof) that Sudoku requires exponential time, or even a good idea how to approach a proof (or disproof). In fact, we have a proof that we have no idea how to approach a proof (or disproof)!

Sudoku is an example of an NP-hard problem. Intuitively, a problem is NP-hard if a polynomial-time algorithm to solve that problem would imply that P=NP. More formally, problem X is NP-hard if one of two conditions holds:

Yato and Seta proved that order-n Sudoku is NP-hard [pdf] by reduction from Latin Square Completion (which was proved NP-hard by reduction from 1-in-4-SAT (which can be proved NP-hard by reduction from Circuit Satisfiability)). In fact, Yato and Seta prove that if we are given 100 solutions to an order-n Sudoku problem, it is NP-hard to determine whether there is a 101st solution, for arbitrary values of 100.

That is so cool. And the article screwed it up! Argh!

Sorry? What's that, Dave?

So what lesson do I draw for this for the theory of CS? Well they need a Richard Feynman and a Stephen Hawking! But seriously, they need to attempt to convey their results in a way which, while not toally faithful to the science, gives the reader a reason to believe that they understand what is going on. This, of course, is much hard to do as a computer scientist than as a theoretical physicist because in the former rigor is held in higher esteem than in physics where hand-wavy arguments hold a much greater standing. But certainly even theoretical physicists (except the best) have to distort their understandings to meet the general public. So my advice to the theoretical computer science community is to let the rigor go but convey the spirit in a way that convince the public they understand what is going on.

Hmmm. Maybe he's right. Maybe I should be satisfied that American Scientist is using the letters N and P at all. (Grumble.)

[via Marginal Revolution]

Comments

Well, I have to say that I more or less agree with Dave. The concepts of complexity theory are so far removed from the world of the average reader - even the average science-magazine reader - that we should be satisfied if the flavor of the result is conveyed. To such readers, does it make any difference if the word "NP" is used as a shorthand for "NP-complete"? And while it might be nice to bring in the P vs. NP problem, that's really a whole 'nother topic - the article was about Sudoku, not about the foundations of computer science. Apart from such considerations, the practical meaning of (generalized) Sudoku being NP-complete is that it's hard, and we don't know how to solve it in better than exponential time.

Give the author credit at least for explaining that hardness is not a property of problem instances, but of problem families. Given the presumed space limitations, I'm not sure one could do a whole lot better at communicating the basic meaning of the result. It's hard to write about complexity for a lay audience, even when, unlike most science writers, you're familiar with all the concepts - I've tried. (Except in my case it's generally PSPACE-completeness I'm talking about, which makes it that much harder. Um, so to speak.)

So, yeah - be satisfied that American Scientist is using the letters N and P at all. Any press is good press.

Bob


Yeah, but how much harder would it have been (or how much "less understandable by the general public") had Hayes written "...the effort needed is believed to grow exponentially"? (Yes, still not technically correct, but at least it gives readers a better idea...)

It's too bad: I really enjoy Hayes's columns.

he's really not that bad - see also
http://www.americanscientist.org/template/AssetDetail/assetid/14705/page/1
it's just a confusion between NP and NPC, but explaining the heart
of the matter - asymptotic analysis and exponential growth.

I also liked the fact that he admitted that he does not understand why the Yato and Seta
paper implies hardness for puzzles with unique solution (probably through Valiant-Vazirani, right?).

The conjecture that 3SAT (and so soduko) requires exponential time is very reasonable
given our knowledge.

If you try to be super accurate you'll end up confusing everybody.

Wait, why is any press good press? Does physics *really* benefit from the existence of pop-physics? Or is all that money just going to Random House?

Maybe physics doesn't benefit from a lot of pop physics, but computational complexity doesn't have quite the same public awareness. So in this case, any press raises that awareness.

That said, when articles have been written about my results I've taken great pains to make sure the facts are correct, if not necessarily complete.

Yeah, I am also reasonably impressed with the article. The main screweup was confusing NP and NPC. But I presume half of CS ungrads would make the same mistake. After all, what is the difference between Peoples Front of Judea, and Judean Peoples Front ? :)

In my first experience with pop press, the article came out and it contained a quote of "mine" about "flooding meaningful answers with a tsunami of solutions"... After that, I am fine with everything that is written, unless they write that I am a goat (one has to draw the line somewhere).

Actually this reminds me of how I distinguish between the physicists and the computer scientists who work in quantum computing: a physicist thinks that NP stands for "not polynomial" and a computer scientist thinks a Hamiltonian is some sort of sandwich.

Did you know that Feynman had to be convinced that the P versus NP problem was an open question!

Interestingly I think I'd have to say that this gaff is not the kind of thing I was thinking about when I was thinking of playing lose with the facts. It's the equivalent in physics of getting Newton's law wrong. There are those in physics who I have seen do the equivalent and I wish, at those moments, to have a big pie to shove in their face.

Just wanted to say that I had heard "P", "NP", "NP-hard" a bunch of times but never sat down and read what they actually meant. Now I know, thanks!

Post a comment

If you have a TypeKey or TypePad account, please Sign In