Quirksand

SICP

SICP 1.2.5 Solutions

October 19, 2012 12:10

Ex. 1.20

Working this problem shows how lengthy a simple algorithm can become when recursion is involved. Euclid’s Method is easy to apply in steps, but the processing involved in computing it shows that more steps are required than is often apparent. Even in my solution, I’ve skipped writing some things out for the sake of brevity. Because in normal order, calculations are not done until needed, there will be some very lengthy expressions that will have to wait (and even grow) before being calculated.

With just numbers, it’s easy to substitute into the expressions:

(gcd 206 40)

(if (= 40 0) 206 (gcd 40 (remainder 206 40)))

But the result of that is a gcd procedure with another procedure call as an argument. Only a few steps later and we have this mess :

(gcd (remainder 40 (remainder 206 40))
      (remainder (remainder 206 40) (remainder 40 (remainder 206 40)))
      )

(if (= (remainder (remainder 206 40) (remainder 40 (remainder 206 40))) 0) 
    (remainder 40 (remainder 206 40))
    (gcd (remainder (remainder 206 40) (remainder 40 (remainder 206 40))) 
         (remainder (remainder 40 (remainder 206 40)) 
                    (remainder (remainder 206 40) (remainder 40 (remainder 206 40)))
                    )
          )
    )

During this process, the only part that is required to be evaluated is for the test in if. The special rule indicated by the problem for if means we don’t evaluate any of the branches immediately. Each one therefore yields a longer and longer call to gcd. Inside those tests are a number of calls, which will be calls to argument b. So at each step, we’re going to be making calls to remainder, but on the next iteration, we’ll have to redo much of that work in a new test.

It’s worth it to work through all the steps to understand how normal order can potentially lead to a great deal of unnecessary work. In this case, the result requires 18 calls if we process it that way. What follows is a more detailed derivation of that number.

As long as our test keeps failing (remainder is not 0), the new b argument is the only thing adding to the total calls to remainder. When the test succeeds, then whatever a is will be evaluated. If, then, the method requires k steps, the total calls will be those required to evaluate the b arguments k times, plus the number of calls in the a argument after k steps.

Looking at the new call (the failing branch of the if statement), it’s easy to see that each step adds to the calls in arguments a and b 1 remainder for the new argument b. Argument a for the next round will be what argument b was previously. (There’s a clear connection to the Fibonacci sequence, comparable to the book’s discussion of Lamé’s Theorem.) Here is a table of just the number of calls at each step, where ‘Calls in -’ means the number of calls required to evaluate that argument :

Step            Calls in a   Calls in b    
1                   0           0  
2                   0           1  
3                   1           2  
4                   2           4  
5                   4           7   
6                   7           12   
7                   12          20   
8                   20          33   

In case you don’t recognize, it, the number of calls in a is simply the Fibonacci sequence less 1. And that’s the same sequence as b, offset by 1. Since we want the sum of the calls in b after k steps, we make use of the fact that the sum of Fibonacci numbers up to n is Fib(n + 2)-1. Our expression works out like this:

(Calls in a at k steps) + (Total of calls in b after k steps)

Fib(k)-1 + Fib(2)-1 + Fib(3)-1 + Fib(4)-1 … Fib(k + 1)-1

Fib(k)-1 + [Fib(1) – 1] + Fib(2)-1 + Fib(3)-1 + Fib(4)-1 … Fib(k + 1)-1Insert an expression equal to 0

Fib(k)-1 + Fib(k + 3)-1 – (k + 1)

Fib(k) + Fib(k + 3) – (k + 3)

To apply it to this case, we had 5 steps in our calculation. Fib(5) = 5, fib(8) = 21, and the answer is 5 + 21 – 8 = 18. It’s not hard to understand that as the number of calls required to determine the GCD increases, the total number of calls is going to increase dramatically (exponentially, really, as the Fibonacci sequence does).

In applicative order, the arguments can be evaluated as soon as they are part of the procedure call. This means that each remainder step is only needed to be performed once; each new call will be on actual resolved numbers. It’s not hard to see that since the recursive call only has one call to remainder, there will be just one new call per step of the algorithm. Actually, one less than that, since the very first step does not have any calls to remainder in the arguments.

As a practical demonstration, I’m taking advantage of the trace function and that the Scheme interpreter (mine at least, and probably yours as well) uses applicative order. The way trace works in Dr Racket is that you first require it in your file with (require trace). After that, any function you define in the file will be traced. Being traced means that any call from then on will be logged to the screen, with the name of the procedure and the arguments it was passed.

I made both the gcd and a traceable remainder function (by simply defining it to call remainder), to show what happens for this particular call. This is the way output is displayed in Dr Racket (normally, the full file path is shown; I’ve cut it out here):

(.#<syntax:/.../1-2-5_sol.scm:59:9 gcd> 206 40)
  (.#<syntax:/.../1-2-5_sol.scm:67:9 rem-traced> 206 40)
(.#<syntax:/.../1-2-5_sol.scm:59:9 gcd> 40 6)
  (.#<syntax:/.../1-2-5_sol.scm:67:9 rem-traced> 40 6)
(.#<syntax:/.../1-2-5_sol.scm:59:9 gcd> 6 4)
  (.#<syntax:/.../1-2-5_sol.scm:67:9 rem-traced> 6 4)
(.#<syntax:/.../1-2-5_sol.scm:59:9 gcd> 4 2)
  (.#<syntax:/.../1-2-5_sol.scm:67:9 rem-traced> 4 2)
(.#<syntax:/.../1-2-5_sol.scm:59:9 gcd> 2 0)

Also within the program, next to the # is a small disclosure triangle, which gives some more detail on the call. None of what it reveals is very useful to our purposes, though. Adding a trace can be a very useful debugging or optimization method. Its usage or output may vary, but I expect it is supported in most varieties of Scheme.

Solutions 1.2.5