SICP
SICP 3.5.1 Solutions
September 6, 2020 11:26
Ex. 3.50
To fill in the blanks, let’s take a look at the original (single-stream) implementation of stream-map
from the text.
The first thing it checks is if the stream is finished, by using stream-null?
. Here we have to make that check, but will only look at one of the streams, using (car argstreams)
. It’s considered an error in Scheme if the lists provided to map
are not the same length, so we can follow that example and surmise that it is not necessary for us to check the other streams given, and just use the first one as a measure. So the first blank is filled with the stream-null?
check.
The other branch of the if
in the original stream-map
builds the mapped stream (using cons-stream
) to return, by applying proc
to the car of the stream, and then recursively mapping the cdr of the stream. We are still producing a single output stream, so cons-stream
doesn’t have to change. The only major difference, then, is that we use apply
(since argstreams
is a list already) and map
to get a list of the elements to work on. The map will put stream-car
and stream-cdr
in essentially the same place they appear in the single-stream version. The end result is our generic stream-map
procedure.
Ex. 3.51
It is not hard to understand what the stream x
produces. It has the numbers 0 to 10 in sequence, but by using show
, every time a position in the stream is checked, it will also be output to the screen using display-line
. The tricky part is in determining when the actual display of the value occurs. For that we need to figure out when show
ends up being called.
The definition of x
uses stream-map
, which we just defined in the last exercise. When does stream-map
apply its procedure to the stream members? When the input stream is not null, the procedure constructs a new stream whose first element has proc
applied to it. However, cons-stream
is a special form, not a procedure. So the application of proc doesn’t necessarily happen right away. Internally, cons-stream
uses regular cons
to create a pair, consisting of its first argument and a promise of the second argument. When the interpreter gets to this call to cons
, the first argument passed to cons-stream
to be evaluated, and therefore proc
will be applied then. This also depends on delay
not evaluating its arguments
That means the definition of x
will cause the first execution of show
on the first element of the enumerated list being mapped. So the first statement defining the stream will cause 0
to be output on a line, before we do anything with the stream itself.
The next statement is (stream-ref x 5)
, to get the element at index 5 in the stream. Since the result of the stream is just the numbers in sequence, we know the final output will be 5. But of course, show
may get called and print out more numbers to the screen before that happens. Within stream-ref
, the moment when this happens will be when stream-cdr
is called to get the next element. That results in a force
of the promise. In the case of x
, the force triggers a recursive call to stream-map
with the remainder of the input stream (the enumerated interval). As just explained, it is in the cons-stream
step inside stream-map
when show
is actually evaluated.
This means that we do not expect to apply show
to the first term (0) when calling stream-ref
, but it will be applied to successive terms. How many elements will be output via display-line
? Since show
only gets applied when stream-cdr
is called and the stream-map
to the next element in the interval occurs, we need to figure out how many times that happens. The recursive calls to stream-ref
will count down from 5. When n
in stream-ref
gets down to 1, it will call stream-cdr
the last time, and the next time, when n
is 0, it calls stream-car
(which does not evaluate show
). There are a total of 5 calls that force the promise in the cdr, and 5 numbers will be displayed.
It may help to consider an internal view of the stream. After being defined, it is a pair consisting of the first element of the stream, and the remainder of the stream, which is something produced by delay
. I will use _promise_
here to avoid getting into the internals of delay
, but all that matters is the expression attached to it has presumably not been evaluated. Shown below is pseudo-code, showing the values for the elements instead of their variable name.
While this is accurate to what we execute, there’s less extraneous information (and no loss of functionality) if we model it using the single-stream form of stream-map
, giving us this pair as the internal representation:
We have a pair that contains in its cdr a promise to do a stream-map
on the remainder of the interval. When that evaluates, stream-cdr
is called (as an argument to stream-map
), and the promise contained within it gets forced, producing the value of 1 for the car of the new stream. This is just the regular ‘enumerate interval’ stream, however, so show
is not called. Within stream-map
, when it is constructing the stream as a return value, show
is called on the cdr
and the number is printed to screen. Each time, the call to stream-ref
returns a pair like this; when it is called with n
=1 (just before the terminal call), this is what s
looks like.
The next time stream-ref
is called, with n
=0, the value 5 will be processed by show
(and thus displayed) and is now the car of that pair. That result is returned from stream-ref
.
We’re just now getting to the interesting part of this exercise, which is what happens next, involving the action of delay
. When we call stream-ref
again on x
and this time with a value of 7, we might expect that it will also output the numbers 1 through 7 before returning the result of 7. However, what we see is Assuming delay
is memoized just two lines of output: 6
, then 7
, and the final result.
Once we have evaluated the cdr the first time, the stream resembles something like the structure shown below. Although technically the code calling the promise is the same, I’ve marked that it will return an already-computed result, since it will do that instead of evaluating the expression. This return value comes from the call to stream-map
that produced the value with 1 at the head. Note that in the case of the exercise, all the sucessive promises that are already evaluated would have this ‘return’ form as well. This shows what it looks like only after the first promise is forced:
What happens is that the memoization of delay
prevents values that have already been computed from being displayed again. The mapping that executes show
only happens the very first time that element of the stream is accessed. After that, delay
is storing the value returned by stream-map
, and no longer calls show
. Instead it just gives the result. With different definitions of delay
we could see different output, although that is more the focus of the next exercise.
As an additional note on this exercise, the footnote in the text mentions that this style of programming is “odious”, as it is difficult to analyze properly and leads to confusing results. The typical term used for mixing something like display and assignment is a “side effect”, and you may hear a warning against having such in your code. The basic idea is that you want predictable and expected behavior to occur when calling a procedure, and not to perform some additional action unrelated to how it was called, or to have it modify something else in ways unclear or unknown to the caller of that procedure. It also gets at the heart of what a particular data structure is. Is the value of a member of this stream x
simply “a number” or is it “a number and the display of that number”? If we want to try and define it as the latter, it is not only a bit perplexing to put into words but becomes impractical to implement. A better design in this case would be to consider the stream just as a stream of numbers, and then display it as a separate action if that’s what we want. We should not have the stream itself try to do it for us, even if that might seem more convenient at first.
Ex. 3.52
In this exercise, we’re now looking at setting external variables as a “side effect”. In this stream, the variable sum
is an external one, and as we’ll see, setting it in the course of evaluating the terms of the stream produces confusing results. I’m assuming for now that delay
is memoized, as the text does initially. We’ll step through a few lines at a time.
So far, we’re fine. sum
is still 0, since we haven’t called accum
with any values yet.
As explained in the previous exercise, the creation of a stream with stream-map
causes the first term to be evaluated. That means accum
is called with a value of 1
, and sum
is set to 1
.
Now we’re defining a variable using stream-filter
. Checking its definition, we can see that it either continues to call stream-cdr
(forcing the contents of the stream) or it will use cons-stream
to build its result. This filter is testing for even numbers. It is first called with 1 (the stream-car
of seq
) which leads to the false branch, and the stream-cdr
of seq
is forced. That causes accum
to be run with the next value from the interval (2), and the value computed for the stream seq
is 1+2=3, which again is not even, so stream-filter
continues. The next time the cdr is forced, accum
is called with the value of 3, and the result is 6, which is even, so we create a new stream using that result and the remainder. Note that stream-filter
does not trigger an evaluation of accum
when creating its new stream, since the first element of seq
was already evaluated and stored in the stream, and calling stream-car
does not force anything.
The stream z
is similar to y
, except we’re filtering based on divisibility by 5 instead of 2. We already know that stream-filter
triggers accum
when it forces the remainder of the stream. However, since we’re also memoizing the delayed terms, all the values of seq
that were previously computed do not cause any additional calls to accum
. The first three terms of seq
are checked for divisibility by 5, and fail (as they are 1,3,6). A previously-unforced promise needs to be evaluated, so accum
is called with the value 4. That results in sum
being set to 10, and since that is divisible by 5, we do not evaluate any more terms when constructing stream z
.
This time, we will find the term in y
at index 7. Since y
consists of even terms, that will be the 8th even number in the generated sequence. So far we have evaluated seq
four times, and the memoized values are 1,3,6,10. That’s two even numbers there already. Thanks to the memoization in delay
, each newly evaluated term from the original interval is only added to sum
once, so the created stream seq
will be the sequence where each term is the previous term plus the index of the sequence seq
is the Triangular Numbers. The next even terms that follow are 28, 36, 66, 78, 120, and finally 136, the eighth even member of the sequence. All elements past the 10 will trigger accum
, and at the end sum
will be 136.
Finally, we display the entirety of the stream z
. Since the original interval was only 20 terms, this stream will be the first twenty Triangular Numbers, filtered to those that are divisible by 5. That will be the sequence 10,15,45,55,105,120,190, and 210. Since 210 is also the twentieth term in the unfiltered sequence, that is what sum
will be when complete.
If more terms had been included that were not divisible by 5, the filter would have exhausted the sequence without displaying any more values for z
, and sum could have ended up higher. For example, if the enumerated sequence went to 22, sum
would be 253 at the end even though the displayed output would not be any different.
As suggested by the text, all of this depends on delay
being memoized. If that isn’t the case, then every new access to the cdr of seq
will cause sum
to change, and that will affect the values of seq
. In fact, since the evaluation occurs each time, the elements of seq
will be different every time the remainder is accessed! If, for instance, we repeated the (stream-ref y 7)
call, we would get a new result each time.
Just to compare, we can redefine delay
to not use memoization (see the streambasic file for an implementation) and run this set of steps again. Up through the point that y
is defined, there is no difference between the two. That’s because at that point the remainder of seq
has not been forced with any repeated values. After that, when z
is defined, the results start to diverge. Every time the sequence seq
is processed, it produces a sequence like the Triangular Numbers, but with the values (except for the very first) offset by whatever sum
currently happens to be. (The first element doesn’t change because the car
of a stream does not rely on delay
.) So, when z
is created, the unfiltered values of seq
will be 1,(6+2)=8,(8+3)=11,(11+4)=15. However, seq
will change every time, so examining z
again will yield different results, and those will also be different depending on what element of y
was most recently evaluated. In short, it becomes impossible to predict what any value of seq
will be without knowing the entire history of the calls to it, and that is precisely why we would not want side effects that modify a variable without it being explicit.
I slightly sidestepped the actual question in the text here. It’s not compeletely clear what they are intending to ask. As stated, the question is what would happen if delay
had been implemented with a lambda expression “without the optimization provided by memo-proc
”. However, if we did use a literal lambda expression, we no longer have a special form. In that implementation, there would technically be no difference to the output from the memoized version. That would result only because all the values in seq
would be evaluated at the moment of creation (stream-map
would recursively create a list instead of delaying the computation of its remainder), and thus each calculation would only occur once, as in the memoized version. That may be what’s intended here, but the phrasing seems to be asking what would happen if memoization were not occurring. That would give us the utterly confusing behavior described in the previous paragraph. It seems like an oversight that the text does not explain special forms in more detail, as it appears to be a crucial element of this exercise.