SICP 1.2.6 Solutions
November 16, 2012 12:01
This whole section is one of the book’s several in-depth examples. As such, it tends to use concepts and require testing techniques that might go a bit beyond what has been done in other sections. But it is good to delve into a practical application now and then. Don’t be worried if some of what’s done is new to you; there will be more time to explore the methods later.
As mentioned in the introduction to the exercises, while the files remain labeled as
.scm, the method of measuring processing time is the one in Racket (
time). If your interpreter doesn’t support it, you will need to modify the test procedures.
This is just an introductory exercise, designed to introduce how primality can be tested using various means. Even here we aren’t actually testing for primality yet, but the rudiments of a brute-force method are seen.
Hardly more complicated than the previous exercise, this one actually tests for primes. It’s a reliable method, since it tests all reasonable divisors, but as the rest of the exercise shows it’s likely to take a really long time for big numbers. My routine is simple; as long as we haven’t reached the specified endpoint and the number is odd, test and then increment the start point.
The first part of testing is making sure that the routine works as intended. Therefore we run the test on the numbers 1 to 50. I expect these first few prime numbers to be reasonably well-known that you can spot if the results are correct just by looking at the resultant list.
The next step is to find the answers to the question asked after writing the procedure — the three smallest primes larger than a given number. Using what we’ve written, one approach is to estimate a range that we’ll find the answer in, and feed that to the testing procedure. For starters, I tried another set of 50 numbers, this time in the range 1000-1050. Now, that works, but it also found quite a few more than 3. But if we try to cut it down, say to 25, then when we move up to 10000, it’s found to fail. Revising the range for each new section does work, but it wastes a bit of effort to find those ranges.
Therefore I wrote a procedure that finds the endpoint of the range.
There’s still a further optimization to be made here; in order to find the range’s endpoint, we run the prime test. Then to report the results, I’m running the timed test on that same range. So each value in the range is checked twice. It’s not hard to see how to improve that. However, the timed tests show us that’s hardly important, because they took so little time. Things have changed since SICP was first published. Our computer is too fast.
In order to get useful comparisons, then, the
timed-prime-test-with-cycles was introduced. With the new version to take into account the 1000-fold leap in processor speeds, we can get numbers to check the growth of our procedure in time.
What we find (or at least what I find) is that the results do indeed match up. From 1000 up to 1000000, the pattern of √n is followed. Keep in mind that now we are actually measuring real world performance instead of ‘number of steps taken’. However, with this procedure we can see that there is little variation in how they process different numbers. In all cases, the branches taken will lead to calculating values up to the √n, so all steps will grow in that way, and processors still execute these problems using the step-by-step approach. (I would not be surprised if in another variety of Scheme, or with even a new processor design that works around this automatically, you might not see the same results that I do.)
Now we begin to look at optimizing the first approach. To limit the number of values tested, an additional routine is used to skip testing of even divisors. While we might expect it to halve the time taken, in my testing I found it took about 2/3 the time instead of 1/2. As for why, my guess is the time taken by the
next routine itself. Now instead of just incrementing the value and continuing, a test for equality with 2 is taken on every number. One must always understand what operations are going to be ‘expensive’(in terms of time) and make speeding those up a priority. In this case, it’s worth it since we do see an improvement.
Moving on, we bring in a new method of testing. This time we don’t need to go through a long list of numbers, and the steps taken in testing grow at a much lower rate (log n instead of √n). We’ve also introduced an element of randomness.
In order to have it work with all the previously created procedures, I modified
prime? instead of just
timed-prime-test. It’s much easier than rewriting new versions, especially since we again need to run many test cycles to get meaningful results.
The time measurement reveals two interesting details. For one, there is the question raised in the exercise. Does it grow at an actual log n rate? I find that it does not, and I expect the recursive routine
expmod to be the culprit. This is a bit harder to figure out; since the probes are chosen randomly, they will require an unknown number of steps in
expmod each time the test is run. However, we do find that on average, the growth is not that much worse, and certainly better than √n.
The second detail is not discussed directly in the text but of some interest, at least given the numbers I observed. In all cases, the time used is worse than with the divisor test. This goes to show that optimizing based on orders of growth requires a practical consideration of the expected values. If you knew the range of numbers were to be always below 1000000, then you might well choose the first approach; for larger values the second one. Remember that discussions of orders of growth for algorithms tend to talk about asymptotic, and sometimes even only worst-case values, and practical results may trump that.
Again we’re looking at ways to optimize our new approach. In this problem and the next, the goal might be said to be simplicity and clarity in the code. This one, while it is correct, will unfortunately not show any improvement. The “extra work” as it is termed provides a considerable savings, because as guessed
expmod is somewhat expensive in time (not so much in space, but the optimization does provide an improvement). Her method actually runs so slow that only one test cycle is needed to show how slow it runs when the numbers grow larger.
This time the approach is just as bad. Possibly worse for clarity, since it adds a repeated bit of code instead of a simple call to do what we want. In this case, a call to
square beats a second call to
square is a much faster operation. Even without knowing about the growth of
expmod we know that it’s being replaced by two calls each time it’s called recursively. In other words, that’s going to be 2calls to
expmod for each previous call. With
expmod being O(log n), the result will be O(n).
Now if we knew that
square was some particularly slow operation, and
expmod was much faster, then this approach could be worth it.
square would have to be worse than the O(n) calls to
expmod to be worse, though.
Recall that I mentioned our first test as being reliable if potentially slow. It turns out the faster Fermat test isn’t quite so reliable. However, as the footnote points out, the numbers that fool it are so sparsely found when testing very large primes that it’s reasonably safe in most circumstances to ignore the false positives.
The test I used simply measures whether
(expmod a n n) is equal to the remainder (i.e. a modulo n) for all values of
a. There are ways to improve the speed of this, but the most obvious one uses a technique we won’t see until the next section. It’s good enough for reasonably small numbers.
In order to verify it, I looked up some of the known Carmichael numbers. Bear in mind that as described in the text, our test does not actually determine definitively if a number is a Carmichael number, only if it might be one and pass the Fermat test. A proper Carmichael number is not prime.
To add the Miller-Rabin test (which is not hard to implement), I inserted it into
fast-prime?. Since this test may involve extra computation, it’s arguable whether or not it should replace the existing method. It begins to get complicated, however, since it may actually end up saving time for some numbers. This section is already long enough that I’m keeping it as is but making a slight check on it.
To test, we need to make sure it works on the values that failed for the previous Fermat test. A final test with cycles is added to see how long it actually takes. A few runs of my own suggest it does not make a big difference to the test, and I find it does not for the numbers I checked. Since this is a probabilistic test, we can’t be sure of our result without a much more involved analysis. It’s also worth pointing out that though the text says the Miller-Rabin test ’can’t be fooled’, all we have really done (in this case) is chosen enough probes to make it even less likely for the test to fail. On the other hand, with a bounded input range, it’s possible to prove that it will work; see more on the subject here.