[This is an iDrone reprint, written by Matt Ball sometime in 2006]
Wanna win a million dollars? All you have to do is solve one of the seven Millennium Problems, seven of the greatest currently unsolved problems in mathematics. My favorite Millennium Problem is the question of ‘P vs. NP’. In short, the challenge is to prove whether the problems we think are “hard” (i.e. NP) really are hard, or whether they are actually “easy” (i.e. P). Sometimes you’ll see this problem expressed as ‘P = NP’ or ‘P != NP’. For a more detailed description of ‘P vs. NP’, see the Wikipedia article.
You’ve probably heard of several NP problems. There’s the famous Traveling Salesman problem, in which a salesman tries to find the shortest route that visits several cities. There’s also Sudoku, the new puzzle craze coming out of Japan. Even Tetris and Minesweeper are NP problems. All of these examples belong to a class called ‘NP-complete’, which means that finding an “easy” solution to any one of them would yield an easy solution to them all. Conversely, if you can prove that any one of them is hard, you prove that they are all hard.
Probably the simplest NP-complete problem to describe is the ‘subset sum problem’. Here’s the problem: Given a set of integers (both positive and negative), find a subset whose sum is zero (not including the empty set). Here’s a simple example:
Find a subset of (-8, -5, 2, 3, 14) whose sum is zero.
Not to keep you in suspense, the answer is (-5, 2, 3). Pretty easy, right? Sure - if there are only 5 numbers in the set! But difficulties quickly arise when you start adding a lot of numbers to the set. If I had a set of 100 numbers, it would probably take something like a year to solve it on my computer. If my set had 1000 numbers in it, it would probably take more energy than is in the known universe to find all the solutions, using the best-known algorithm. This is because the best-known algorithm to solve the subset sum problem takes roughly O(N*2^(N/2)) operations (using Big-O notation). Given a set of 1000 numbers, the solution requires roughly 1000*2^500 operations, which is a very, very large number. This is what I mean by hard.
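To make that “best-known algorithm” concrete, here is a minimal Python sketch of the meet-in-the-middle approach that achieves roughly O(N*2^(N/2)) time; the function names and structure are my own illustration, not something from the original article.

```python
from itertools import combinations

def subset_sum_zero(nums):
    """Find a non-empty subset of nums that sums to zero, or return None.

    Meet-in-the-middle: enumerate all subset sums of the first half and
    of the second half separately (about 2^(N/2) sums each), then look
    for a pair of half-sums that cancel.  Roughly O(N*2^(N/2)) time and
    memory, versus O(2^N) for brute force.
    """
    half = len(nums) // 2
    left, right = nums[:half], nums[half:]

    def subset_sums(items):
        # Map each achievable sum to one subset that produces it,
        # preferring a non-empty subset when both exist.
        sums = {0: ()}
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                s = sum(combo)
                if s not in sums or not sums[s]:
                    sums[s] = combo
        return sums

    left_sums = subset_sums(left)
    right_sums = subset_sums(right)

    for s, left_combo in left_sums.items():
        right_combo = right_sums.get(-s)
        if right_combo is not None and (left_combo or right_combo):
            return left_combo + right_combo
    return None

print(subset_sum_zero([-8, -5, 2, 3, 14]))   # (-5, 2, 3)
```

A tuned version would sort one half and search it instead of using a hash table, which is where the extra factor of N in the classic bound comes from, but the idea is the same.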
The catch is that we don’t know for sure that the best algorithm for solving the subset sum problem is O(N*2^(N/2)). Nobody has proven it either way. The first person who does will be a million dollars richer, so there’s definitely incentive to find a solution. There’s also a lot of other money to be made by quickly solving certain NP-complete problems, including circuit optimization. On the other hand, there is also incentive to prove that NP-complete problems are hard: several cryptographic algorithms are based on NP-complete problems, and if someone finds a way to solve them efficiently, all of these algorithms become worthless for cryptography.
So now that we’ve got the preliminaries taken care of, I want to describe some of my thoughts on a solution to the subset sum problem.
First, let me explain a simple algorithm to solve the subset sum problem. If the integers in the problem set aren’t too large (less than 2^N, roughly), it’s possible to record all of the achievable sums in an array, then just look to see whether zero can be reached by a non-empty subset. To build this array, you compute all possible sums that both include and exclude each integer in turn. The catch is that the number of operations grows with the span between the largest possible positive sum and the largest possible negative sum.
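Here is a tiny Python sketch of that array-based idea, written with a Python set of reachable sums instead of a literal array; it is my own illustration under the assumptions just described.

```python
def nonempty_subset_sums(nums):
    """Return every sum achievable by a non-empty subset of nums.

    For each integer we either add it to a previously achievable sum
    (include it) or leave the sums alone (exclude it).  The number of
    distinct sums -- and hence the work -- is bounded by the span
    between the most negative and most positive possible totals.
    """
    sums = set()
    for x in nums:
        sums |= {s + x for s in sums} | {x}
    return sums

print(0 in nonempty_subset_sums([-8, -5, 2, 3, 14]))   # True: (-5) + 2 + 3 == 0
```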
The process I just described can be accomplished using a discrete convolution. For each integer, create a graph that has a 1 at x = 0 and at x equal to the integer’s value, and is 0 everywhere else. This represents the two possible uses of the integer: either you include the integer and add its value, or you leave it out and add zero. The final result is the convolution of all of these graphs. Each value on the resulting graph gives the number of subsets that sum to that value.
For example, let’s take 3 integers: (2, 3, 5). Each graph in this example is expressed as an array that starts at zero; for simplicity, I’ll only use positive integers. Start with the array [1], which means there is one way to reach a sum of zero (the empty set). As we convolve in each integer, we get the following:
- Add 2: [1] * [1 0 1] = [1 0 1]
- Add 3: [1 0 1] * [1 0 0 1] = [1 0 1 1 0 1]
- Add 5: [1 0 1 1 0 1] * [1 0 0 0 0 1] = [1 0 1 1 0 2 0 1 1 0 1]
Here is the resulting set of possible sums: (0, 2, 3, 5, 7, 8, 10), with two ways to get 5. (Note: the above example is easy to compute using MATLAB’s convolution operator).
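The same arithmetic is easy to reproduce in Python with NumPy’s convolve (a stand-in for the MATLAB operator mentioned above); the code below is just my re-implementation of the worked example.

```python
import numpy as np

def indicator(k):
    # Graph for integer k (non-negative here): 1 at position 0 ("leave it
    # out") and 1 at position k ("include it"), zeros everywhere else.
    arr = np.zeros(k + 1, dtype=int)
    arr[0] = arr[k] = 1
    return arr

counts = np.array([1])                 # one way to reach 0: the empty set
for k in (2, 3, 5):
    counts = np.convolve(counts, indicator(k))

print(counts)                          # [1 0 1 1 0 2 0 1 1 0 1]
print(np.flatnonzero(counts))          # achievable sums: 0, 2, 3, 5, 7, 8, 10
```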
Now here’s a trick: it is possible to quickly compute convolutions using Fourier analysis. Stated differently, it is possible to solve the subset sum problem using waves. By the convolution theorem, the convolution of two functions is the same as the product of their frequency representations.
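As a quick numerical check of that claim, here is a sketch that computes the same counts via NumPy’s FFT: transform, multiply, inverse-transform, and round. This is my illustration of the convolution theorem, not an efficient solver; for large inputs the counts would overflow floating-point precision.

```python
import numpy as np

def fft_convolve(a, b):
    # Convolution theorem: conv(a, b) equals the inverse FFT of the
    # product of the FFTs, once both inputs are zero-padded to the
    # full output length.
    n = len(a) + len(b) - 1
    fa = np.fft.rfft(a, n)
    fb = np.fft.rfft(b, n)
    return np.rint(np.fft.irfft(fa * fb, n)).astype(int)

counts = np.array([1])
for k in (2, 3, 5):
    graph = np.zeros(k + 1, dtype=int)
    graph[0] = graph[k] = 1
    counts = fft_convolve(counts, graph)

print(counts)    # [1 0 1 1 0 2 0 1 1 0 1], same as the direct convolution
```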
Before I get too much further, I want to step back and talk about how to solve problems involving Fourier transforms.
It is possible to perform an optical Fourier transform by shining light through a narrow slit and measuring where the light hits a wall on the other side. The angle of diffraction depends on the wavelength of the light. The resulting pattern follows a sinc function (sin(x)/x), which happens to be the Fourier transform of the rectangular function represented by the slit (strictly, the intensity you observe is the square of the sinc).
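The slit picture is easy to check numerically: the discrete Fourier transform of a rectangular, slit-like window has a sinc-shaped magnitude. A small sketch, assuming NumPy and with sizes chosen arbitrarily by me:

```python
import numpy as np

N = 1024                     # number of samples across the "aperture plane"
slit = np.zeros(N)
slit[:64] = 1.0              # a rectangular "slit" 64 samples wide

magnitude = np.abs(np.fft.fftshift(np.fft.fft(slit)))

# The magnitude traces out a periodic sinc (Dirichlet kernel), which near
# the central peak looks like sin(x)/x -- the single-slit diffraction shape.
print(magnitude[N // 2])     # central peak equals the slit width (64, up to rounding)
```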
By representing each integer as a beam of light with a frequency that corresponds to the integer’s value, it could be possible to modulate several beams of light together, then take the Fourier transform of the combined beam (i.e. shine it through a diffraction grating) to determine the possible solutions. This is an example of using physics to solve a mathematical problem.
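One way to make the beams-of-light picture concrete in ordinary software is to give each integer a_k a wave exp(i*a_k*theta), multiply the per-integer factors (1 + wave) together, and average over theta: only the subset terms whose frequencies cancel to zero survive the averaging. This is purely my numerical sketch of the idea (the sample count and names are mine), and it is no faster than the convolution approach, but it shows the wave interpretation in action.

```python
import numpy as np

def count_zero_sum_subsets(nums):
    """Count the subsets of nums (including the empty set) that sum to zero.

    The product over k of (1 + exp(i*a_k*theta)) expands into one term
    per subset, oscillating at a frequency equal to that subset's sum.
    Averaging over enough evenly spaced theta values keeps only the
    terms whose frequency -- i.e. whose subset sum -- is exactly zero.
    """
    span = 2 * sum(abs(x) for x in nums) + 1     # more samples than possible sums
    theta = 2 * np.pi * np.arange(span) / span
    product = np.ones(span, dtype=complex)
    for a in nums:
        product *= 1 + np.exp(1j * a * theta)
    return int(round(product.mean().real))

# Subtract 1 to drop the empty set and count only "real" solutions.
print(count_zero_sum_subsets([-8, -5, 2, 3, 14]) - 1)   # 1, namely (-5, 2, 3)
```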
Another interesting consideration is that quantum computers also perform a Fourier transform of sorts, so it could be possible to use a quantum computer to quickly solve a subset-sum problem. I don’t know a lot more about quantum computers, so I’ll stop here before I get into trouble. Based on this possibility of using physics to solve mathematical problems, I suspect that eventually we will have computing devices that are capable of quickly solving NP-complete problems. It’ll just be a matter of working out the details.