SICP 3.5.3 Solutions
December 27, 2020 10:13
Exercise 3.63
In the original version of sqrt-stream, a single stream is defined. It then maps onto the exact same stream, which is named guesses. In Louis’s version the stream is mapped to a call to the procedure sqrt-stream, which constructs a stream and returns it. That means that an entirely new stream is constructed with each ‘recursive’ call. One advantage that the original has is that the same stream is being re-used in the mapping, possibly representing a savings in space when computing new values. More importantly, each time a new element of the square root stream is generated, the original version will have already calculated and stored the previous guesses (assuming memoization is used). With Louis’s procedure, all elements of each new stream have to be computed every time. The number of steps to calculate the nth element is the sum of the integers up to n, which is O(n^2). The original version, meanwhile, is O(n) with respect to calculation, since each value is computed only once and then memoized for later access.
There is still some memoization that occurs in Louis’s version at the higher level. If the result of the sqrt-stream call is itself defined as a stream, then future calls to already-computed values will be memoized and no additional calculations are required. Only if a call is made to a higher number of elements later in the stream will the extra computation result. There is no difference at this level between the two streams as both will benefit from memoization in this way. This is demonstrated in the testing by repeating the call on the stream; in both versions the second call will return without needing to call sqrt-improve.
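The step counts above can be checked with a small experiment. This is only a sketch: the counter improve-calls and the helper count-improve-calls are names made up for the illustration, and the first few definitions are stand-ins for the stream primitives that MIT Scheme already provides (they assume a memoizing delay).

```scheme
;; Minimal stream primitives (assumed stand-ins; delay memoizes its result).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))

;; Single-stream map; sufficient for the infinite streams used here.
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (stream-ref s n)
  (if (= n 0)
      (stream-car s)
      (stream-ref (stream-cdr s) (- n 1))))

;; Hypothetical counter: incremented on every call to sqrt-improve.
(define improve-calls 0)
(define (sqrt-improve guess x)
  (set! improve-calls (+ improve-calls 1))
  (/ (+ guess (/ x guess)) 2))

;; Original version: maps over the one stream named guesses.
(define (sqrt-stream x)
  (define guesses
    (cons-stream 1.0
                 (stream-map (lambda (guess) (sqrt-improve guess x))
                             guesses)))
  guesses)

;; Louis's version: every forced cdr builds a brand-new stream.
(define (sqrt-stream-louis x)
  (cons-stream 1.0
               (stream-map (lambda (guess) (sqrt-improve guess x))
                           (sqrt-stream-louis x))))

;; Reset the counter, read element n of a fresh stream, report the count.
(define (count-improve-calls make-stream n)
  (set! improve-calls 0)
  (stream-ref (make-stream 2) n)
  improve-calls)

(count-improve-calls sqrt-stream 10)        ; 10
(count-improve-calls sqrt-stream-louis 10)  ; 55 = 1 + 2 + ... + 10
```

Reading element 10 costs 10 improvements in the original and 55 in Louis's version, which matches the linear and triangular-number growth described above.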
The exercise also asks if there would be a difference if we were not memoizing calls to delay (at least that’s how I’m interpreting this; see the last section for the ambiguity on special forms for delay). I’ve already alluded to the idea that in the original, with guesses defined only once, there might be a space savings since it stays as one stream. Whether this actually saves space relative to the repeated calls to sqrt-stream in Louis’s version would depend on stream-map. As it happens, stream-map creates new streams, and every new stream created is a computed initial value and a promise for the cdr, which means both versions are a single pair. However, Louis’s version will have one extra nested call, since it will have to take the stream-cdr of a new call to sqrt-stream at each step. The overhead of making a function call might be significant, but probably not compared to the computation of the value itself, which will likely dominate the extra time that Louis’s version requires.
Below is a diagram showing what is produced for each version when the stream’s cdr is taken (as in calls to stream-ref). I’ve simplified the expression a bit to just use ‘improve’ to label the internal lambda, and indexed the iterative guesses using the variable x_i for each successive improvement. Note that while Louis’s appears to take up more space, it does not retain any references to previous elements of the stream, meaning that the previously created streams like ss1 can be deleted by the interpreter. In the original version, those streams (labeled g1, g2, etc.) need to be preserved, as the later elements of the stream refer back to them.
Exercise 3.64
This procedure needs to take one value and then compare it to the next value in the stream. It thus needs to store one value from the stream and then get the next value of the stream to determine if the two elements are within the specified tolerance. The way I solved it is by using a procedure that takes a single value and a stream. It will compare that value to the car of the stream and return the stream’s car if they are in tolerance. If not, a recursive call is made using the car that was just read as the value and the cdr of the stream as the stream to compare it to.
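The approach described above might look like the following sketch. The helper name scan is an assumption, as is the decision to skip any end-of-stream check (the streams of guesses used here are infinite); the stream primitives are minimal stand-ins for the ones MIT Scheme provides.

```scheme
;; Minimal stream primitives (assumed stand-ins).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))

(define (sqrt-stream x)
  (define guesses
    (cons-stream 1.0
                 (stream-map (lambda (g) (/ (+ g (/ x g)) 2))
                             guesses)))
  guesses)

;; Compare a stored value against the car of the remaining stream.
;; Return that car once the two are within tolerance; otherwise recur
;; with the car as the stored value and the cdr as the stream.
(define (stream-limit s tolerance)
  (define (scan previous rest)
    (let ((current (stream-car rest)))
      (if (< (abs (- current previous)) tolerance)
          current
          (scan current (stream-cdr rest)))))
  (scan (stream-car s) (stream-cdr s)))

(define (sqrt-within x tolerance)
  (stream-limit (sqrt-stream x) tolerance))
```

For example, (sqrt-within 2 0.001) stops at the fifth guess, which is already much closer to the true root than the tolerance requires, because the check only bounds the gap between consecutive guesses.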
Exercise 3.65
To express the formula as a stream we can follow the example of pi-summands and just use the next index as the argument (so that we use all whole numbers). Another approach would be to map the integers stream using division and flipping the sign when the integer is even. Either way, we use partial-sums to create the successive approximations. The accelerated streams then apply the appropriate stream transformation to the original partial sum stream.
Examining the first few results, we find that in ten iterations the original partial sum stream is still almost 0.05 away from the correct answer. Using Euler acceleration converges close to 0.69 by the second term and to the third decimal place within a few more terms. Tableau acceleration takes an extra term to ‘get started’ but within ten terms is accurate to more than ten decimal places. On many systems this quickly turns into an issue of precision, as in fact the sequence cannot compute any more valid terms once the floating-point math can no longer work with differences that are so small. That is an issue with the original pi-summands approach as well; creating a system that would operate to infinite precision is feasible but well beyond the scope of this section.
We can compare the number of terms required for a given tolerance to see just how quickly (or slowly) the three approaches arrive at the same precision. Getting to just three decimal places requires 1000 terms for the bare partial sums but only a handful for the accelerated sequences. We can compare the accelerated forms by looking at how quickly they converge to six decimal places and see that the Euler acceleration takes around 60 terms, compared to just 5 for the tableau. The unaccelerated form would take prohibitively long to calculate with that precision.
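The three streams compared above can be sketched as follows. The definitions of euler-transform, make-tableau and accelerated-sequence follow the text of the book; ln2-summands follows the pi-summands pattern; sq and the stream primitives at the top are assumed stand-ins so the block runs on its own.

```scheme
;; Minimal stream primitives (assumed stand-ins).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-ref s n)
  (if (= n 0) (stream-car s) (stream-ref (stream-cdr s) (- n 1))))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (add-streams a b)
  (cons-stream (+ (stream-car a) (stream-car b))
               (add-streams (stream-cdr a) (stream-cdr b))))
(define (sq x) (* x x))

;; 1, -1/2, 1/3, -1/4, ... built the same way as pi-summands.
(define (ln2-summands n)
  (cons-stream (/ 1.0 n)
               (stream-map - (ln2-summands (+ n 1)))))

;; Running totals of a stream.
(define (partial-sums s)
  (define sums
    (cons-stream (stream-car s)
                 (add-streams (stream-cdr s) sums)))
  sums)

(define (euler-transform s)
  (let ((s0 (stream-ref s 0))
        (s1 (stream-ref s 1))
        (s2 (stream-ref s 2)))
    (cons-stream (- s2 (/ (sq (- s2 s1))
                          (+ s0 (* -2 s1) s2)))
                 (euler-transform (stream-cdr s)))))

;; Stream of streams, each one the transform of the one before it.
(define (make-tableau transform s)
  (cons-stream s (make-tableau transform (transform s))))
(define (accelerated-sequence transform s)
  (stream-map stream-car (make-tableau transform s)))

(define ln2-stream (partial-sums (ln2-summands 1)))
(define ln2-euler (euler-transform ln2-stream))
(define ln2-tableau (accelerated-sequence euler-transform ln2-stream))
```

Reading a few elements with stream-ref shows the behavior described: the tenth partial sum is still near 0.646, while the second Euler term is already about 0.6905. Reading far into ln2-tableau eventually runs into the floating-point limits mentioned above.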
Exercise 3.66
In spite of what the text implies about ‘qualitative’ statements, the thing I find easier to do is just examine the sequence of output and determine the mathematical pattern it represents. From there, I can work backwards to try and find what the underlying explanation for such a pattern might be. Here’s what we can observe from where certain pairs show up in the sequence:
(1 j): for j = 1 is at 0; for j > 1 will be at 2j – 3
(2 j): for j = 2 is at 2; for j > 2 has a period of 4 starting at 4 (2 after (2 2))
(3 j): for j = 3 is at 6; for j > 3 has a period of 8 starting at 10 (4 after (3 3))
(4 j): for j = 4 is at 14; for j > 4 has a period of 16 starting at 22 (8 after (4 4))
(5 j): for j = 5 is at 30; for j > 5 has a period of 32 starting at 46 (16 after (5 5))
Finding where the pair (1 j) occurs is not too difficult to figure out intuitively. The first time pairs is called it will start interleaving with the stream created from pairing a 1 with all elements of the second stream. That means after the first output of (1 1) followed by (1 2), every alternate member of the output will be one of these pairs. Figuring out where (1 100) falls is as simple as multiplying by 2 and subtracting 3 for the initial offset, which gives 197.
To get a more general formula, we need to find a pattern for values of i other than 1. The first time a new value for i shows up it will be in the pair where i = j. The next occurrence with that value i comes at some later point, but from then on it will recur at a fixed interval that is not the same as it took between the first two. The reason for this difference is that the first time, the new pair is coming from the interleaving picking the ‘second’ stream (where both i and j are advancing), and later on the pairs come from the ‘first’ stream (which keeps i the same and advances only j).
Both the first appearance of the pair including i and the first appearance of the next pair with i (where j = i + 1) follow a pattern based on powers of 2, since as we observe it takes somewhere near twice as long for each successive value of i to be introduced. In fact, the pattern for when i = j first comes up is just 2^i, offset by 2. For the next pair that has i, we have to wait for one cycle of the next lower power of 2 (i.e. 2^(i-1)) to finish. Thereafter we wait longer, the full value of 2^i, and that does not change.
The result is that the index of a given term (i j) in which j > i follows an arithmetic sequence. The period at which it repeats is 2^i, and that is multiplied by an index (j – i). It will have an initial value of 2^(i-1) – 2. The resulting formula is thus:
The index of (i j) for j = i will be at 2^i – 2
The index of (i j) for j > i will be at 2^i * (j – i) + 2^(i-1) – 2
If we apply the formulas to the suggested values we see that they produce very large values for even moderately low numbers. For instance (99 100) will be at 2^99 + 2^98 – 2, which is position 950737950171172051122527404030, far too long to wait for the stream to produce! It’s more feasible to use smaller values to verify that this formula gives the correct result, which is why the testing numbers of 9 and 10 are used.
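The formula can be checked against the stream itself for small values. In this sketch pair-index and find-index are names invented for the illustration, the pairs and interleave procedures are the ones from the text (with interleave simplified for infinite streams), and the stream primitives are assumed stand-ins.

```scheme
;; Minimal stream primitives (assumed stand-ins).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))
(define integers (integers-from 1))

;; interleave without the empty-stream case (infinite streams only).
(define (interleave s1 s2)
  (cons-stream (stream-car s1)
               (interleave s2 (stream-cdr s1))))

(define (pairs s t)
  (cons-stream
   (list (stream-car s) (stream-car t))
   (interleave
    (stream-map (lambda (x) (list (stream-car s) x))
                (stream-cdr t))
    (pairs (stream-cdr s) (stream-cdr t)))))

;; Zero-based index predicted by the formula (equivalently, the number
;; of pairs that precede (i j) in the stream).
(define (pair-index i j)
  (if (= i j)
      (- (expt 2 i) 2)
      (+ (* (expt 2 i) (- j i))
         (expt 2 (- i 1))
         -2)))

;; Brute-force search of the actual stream for comparison.
(define (find-index s target)
  (let loop ((s s) (n 0))
    (if (equal? (stream-car s) target)
        n
        (loop (stream-cdr s) (+ n 1)))))

(define int-pairs (pairs integers integers))

(pair-index 1 100)                 ; 197
(find-index int-pairs (list 3 5))  ; 18, same as (pair-index 3 5)
(pair-index 99 100)                ; 950737950171172051122527404030
```

Only the small cases are searched by brute force; the large ones are evaluated from the formula alone.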
Exercise 3.67
In the original pairs procedure, the car of stream s was paired with every element of the cdr of stream t. Then this was interleaved with the result of pairing the cdrs of both streams. That ensures the car of t is only paired once (with the car of s) and then ignored. Using the stream of integers as the source for both, that avoids any pairs where the second element is smaller than the first. To produce every possible pair we just need to construct an additional stream that pairs the car of t with all elements from the cdr of s. We interleave this with the original streams to get the whole result. Then we feed it the stream of integers for both inputs.
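One way to write this down is sketched below. The name all-pairs and the order in which the three streams are interleaved are assumptions (a different nesting gives the same set of pairs in a different order); take-stream is a hypothetical helper for inspecting the first few elements.

```scheme
;; Minimal stream primitives (assumed stand-ins).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))
(define integers (integers-from 1))
(define (interleave s1 s2)               ; infinite streams only
  (cons-stream (stream-car s1)
               (interleave s2 (stream-cdr s1))))

;; Hypothetical helper: first n elements as a list.
(define (take-stream s n)
  (if (= n 0)
      '()
      (cons (stream-car s) (take-stream (stream-cdr s) (- n 1)))))

(define (all-pairs s t)
  (cons-stream
   (list (stream-car s) (stream-car t))
   (interleave
    (interleave
     ;; car of s with the rest of t: (1 2) (1 3) ...
     (stream-map (lambda (x) (list (stream-car s) x))
                 (stream-cdr t))
     ;; the additional stream: rest of s with the car of t: (2 1) (3 1) ...
     (stream-map (lambda (x) (list x (stream-car t)))
                 (stream-cdr s)))
    (all-pairs (stream-cdr s) (stream-cdr t)))))

(take-stream (all-pairs integers integers) 4)
;; ((1 1) (1 2) (2 2) (2 1))
```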
Exercise 3.68
This exercise really gets at the heart of what often makes streams confusing to deal with. At first blush, it seems that this sort of operation ought to work with no issues. However, crucial to the processing of infinite streams is the ability to delay their computation until needed (also known as lazy evaluation). If this does not happen then it is all too easy for a stream processing procedure to get into what is essentially an infinite recursive loop.
What was missed here is that there is no delayed evaluation when constructing the stream in this version of pairs. Instead, it makes a procedure call to interleave. It may seem that this would not be an issue because internally interleave will call cons-stream, thereby delaying the processing of the stream. However, since interleave is a procedure, it must evaluate the arguments that are passed to it. What are the arguments passed to it? The first is a stream-map which will construct a stream and exit normally. The second, though, is a call to pairs. That’s where the trouble happens. Every subsequent call to pairs will again lead to an attempt to evaluate arguments to interleave, which leads to a call to pairs, forming a recursive sequence that never ends if either s or t is an infinite stream (and if they do terminate it will just be an error in stream-cdr).
It is therefore necessary to ensure that recursive calls in stream-processing procedures aren’t allowing this to happen. The recursion on a stream should only come inside a special form that won’t evaluate it as an argument (for instance cons-stream, as in the original version of the pairs procedure).
Exercise 3.69
The construction of the triples can be done in a way that resembles the pairs procedure. First, one triple is created from the cars of each stream. Next, we interleave the stream created by taking the car of the first, combined with the other two streams in a way that ensures the first element is smaller than the others. What we can combine with is simply pairs on the second two streams since that gives us exactly what we want (the second element is never smaller than the first). Finally, we call triples on the cdr of each stream, again using interleave to combine them. Of course cons-stream is used to build the whole thing, so we don’t end up with the infinite recursion witnessed in Exercise 3.68.
To generate a stream of Pythagorean triples we can construct a stream of all possible triples from the integers, and then use a filter to ensure they satisfy the formula.
The use of stream-filter allows for a fairly clear expression of the requirements on the output. We create a procedure that checks if the square of the third value is equal to the sum of the squares of the first two (since the triples are ordered by size, the last value in the triple must be the ‘longest’ side of any potential triangle). Notably, the production of the actual Pythagorean triples can be very slow, as the triples stream will have more and more invalid values the longer it goes on. While I did not calculate the actual index where Pythagorean triples are expected to occur, we can guess that similarly to the pairs procedure, even a moderately low set of values might occur at a very high index in the stream.
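A sketch of the construction described in this exercise follows. One detail is an assumption rather than something stated above: the stream-cdr taken of (pairs t u), which skips the pair made from the two cars so that the first triple is not produced twice. The predicate name pythagorean? is also made up, and the stream primitives are stand-ins limited to infinite streams.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (stream-filter pred s)
  (if (pred (stream-car s))
      (cons-stream (stream-car s)
                   (stream-filter pred (stream-cdr s)))
      (stream-filter pred (stream-cdr s))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))
(define integers (integers-from 1))
(define (interleave s1 s2)
  (cons-stream (stream-car s1)
               (interleave s2 (stream-cdr s1))))
(define (pairs s t)
  (cons-stream
   (list (stream-car s) (stream-car t))
   (interleave
    (stream-map (lambda (x) (list (stream-car s) x))
                (stream-cdr t))
    (pairs (stream-cdr s) (stream-cdr t)))))

;; All (i j k) with i <= j <= k.
(define (triples s t u)
  (cons-stream
   (list (stream-car s) (stream-car t) (stream-car u))
   (interleave
    ;; car of s in front of every remaining pair from t and u
    (stream-map (lambda (p) (cons (stream-car s) p))
                (stream-cdr (pairs t u)))
    (triples (stream-cdr s) (stream-cdr t) (stream-cdr u)))))

(define (pythagorean? triple)
  (let ((i (car triple)) (j (cadr triple)) (k (caddr triple)))
    (= (+ (* i i) (* j j)) (* k k))))

(define pythagorean-triples
  (stream-filter pythagorean? (triples integers integers integers)))

(stream-car pythagorean-triples)   ; (3 4 5)
```

The first triple arrives quickly, but reading further elements takes dramatically longer, consistent with the slowdown noted above.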
Exercise 3.70
At first glance it seems as though we could take the original merge procedure and instead of taking the leading elements and comparing them, just use the weights of the leading elements and compare them. However this probably produces unwanted behavior, because it seems as though we are now having merge do both sorting and combining. The problem with this approach is that the combination of the weights is distinct from the indexes. In basic merge, when the two elements are equal, only one of them should be added to the output stream. Elements with equal weight, however, might both belong in the output stream.
That actually simplifies the procedure, as we only need to consider which weight is lower when merging. Instead of a cond we can use an if statement. I also got rid of the let to get the car of the streams since the weights are only compared once. (The car of one stream is needed again when constructing the output, so it may be argued that it is worth keeping them as let-assigned variables, but more likely than not it makes little difference either way.)
As for weighted-pairs, it almost exactly resembles the initial sketch of the pairs procedure in the text with the ‘combine-in-some-way’ procedure being merge-weighted, and an extra argument (the weighting function) added to its parameter list.
The stream in part (a) is constructed using weighted-pairs with the stream of integers as the arguments; the weighting function is just the sum of the two elements of the pair (note these are ‘pairs’ only using the terminology of the procedure; in actuality they are of list type and accessed as members of a list).
The second stream is a bit more complex to construct. We need to produce the stream of numbers that do not have 2, 3, or 5 as factors. This sounds like the inverse of the stream we produced in 3.56, but it’s not quite that — there we had integers that had only those values as factors (and also included the value 1 in the stream). The stream we want here can be made using a filter that checks if the number given is not divisible by any of the numbers (my own filter checks if it is divisible and inverts the truth-value, which also works).
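The pieces of this exercise might fit together as in the sketch below. The argument order of merge-weighted and weighted-pairs, the weight function names, and the helper take-stream are assumptions; the empty-stream cases are left out because only infinite streams are used.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (stream-filter pred s)
  (if (pred (stream-car s))
      (cons-stream (stream-car s)
                   (stream-filter pred (stream-cdr s)))
      (stream-filter pred (stream-cdr s))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))
(define integers (integers-from 1))
(define (take-stream s n)
  (if (= n 0)
      '()
      (cons (stream-car s) (take-stream (stream-cdr s) (- n 1)))))

;; Take whichever leading element has the lower weight. Elements of
;; equal weight are both kept; the one from s1 goes first.
(define (merge-weighted s1 s2 weight)
  (if (<= (weight (stream-car s1)) (weight (stream-car s2)))
      (cons-stream (stream-car s1)
                   (merge-weighted (stream-cdr s1) s2 weight))
      (cons-stream (stream-car s2)
                   (merge-weighted s1 (stream-cdr s2) weight))))

(define (weighted-pairs s t weight)
  (cons-stream
   (list (stream-car s) (stream-car t))
   (merge-weighted
    (stream-map (lambda (x) (list (stream-car s) x))
                (stream-cdr t))
    (weighted-pairs (stream-cdr s) (stream-cdr t) weight)
    weight)))

;; Part (a): ordered by i + j.
(define (sum-weight p) (+ (car p) (cadr p)))
(define pairs-by-sum (weighted-pairs integers integers sum-weight))

;; Part (b): neither element divisible by 2, 3 or 5,
;; ordered by 2i + 3j + 5ij.
(define (not-divisible-by-2-3-5? n)
  (not (or (= 0 (remainder n 2))
           (= 0 (remainder n 3))
           (= 0 (remainder n 5)))))
(define (weight-b p)
  (let ((i (car p)) (j (cadr p)))
    (+ (* 2 i) (* 3 j) (* 5 i j))))
(define filtered-integers
  (stream-filter not-divisible-by-2-3-5? integers))
(define pairs-b
  (weighted-pairs filtered-integers filtered-integers weight-b))

(take-stream pairs-by-sum 4)   ; ((1 1) (1 2) (1 3) (2 2))
(take-stream pairs-b 3)        ; ((1 1) (1 7) (1 11))
```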
Exercise 3.71
While the text is perhaps correct that this sequence can be called Ramanujan numbers, they are more typically known as the Hardy-Ramanujan numbers (see the OEIS entry) or Taxicab numbers (based on the story around them). I’ll refer to them as Hardy-Ramanujan here.
Constructing the stream of sorted cubes is a simple matter of calling weighted-pairs using the integers and a weighting function that adds the cubes of the pair’s values (though I’ll point out again that the pair is, in Scheme terms, a list, so it must extract the values using car and cadr).
What we then need is a procedure that scans through a list and checks consecutive values. One method might be to write a stream-processing function that does all of this and is specific to this stream. However, since I know that Exercise 3.72 is coming, I made a more generic procedure that allows the use of stream functions we already have. This is a version of stream map that, instead of being given one element of the stream at a time for mapping, is given a whole stream (starting at that element). It’s less useful to do this for regular list mapping, but when we are working with infinite streams representing ordered sequences, it makes more sense. Here’s the implementation, which is modeled on stream-map.
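The original listing is not reproduced in this copy, so what follows is a reconstruction from the description rather than the exact code: a single-stream stream-map in which proc receives the stream itself. The stream primitives and the small demonstration are assumed stand-ins.

```scheme
;; Minimal stream primitives (assumed stand-ins).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define the-empty-stream '())
(define (stream-null? s) (null? s))
(define (stream-ref s n)
  (if (= n 0) (stream-car s) (stream-ref (stream-cdr s) (- n 1))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))

;; Like stream-map, but proc is applied to the whole remaining stream
;; instead of only to its car.
(define (stream-map-s proc s)
  (if (stream-null? s)
      the-empty-stream
      (cons-stream (proc s)
                   (stream-map-s proc (stream-cdr s)))))

;; Demonstration: pair each integer with its successor.
(define consecutive
  (stream-map-s (lambda (s)
                  (list (stream-car s) (stream-car (stream-cdr s))))
                (integers-from 1)))

(stream-ref consecutive 2)   ; (3 4)
```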
The only difference between this and the original stream-map is that we are applying proc to the whole stream passed, not just the car of it. We can’t use this version on its own to produce our final output, though, since the map gives a value for every element of the input, and we need to only include those that fit the requirement. So the output is produced in two steps:
First the stream-map-s function is used to map each element of the stream into a list containing it and the element following it. I used the get-n-of-stream function that I’d written for reading just part of a stream and had mostly used for testing until this point. Then that stream of cube pairs is filtered, accepting only those elements whose sums of cubes match. It is mapped again to convert the pairs (or one of them, which is all we need) into the sum of the cubes, so that the output is the actual Hardy-Ramanujan numbers. There’s perhaps slightly more calculation of the cubes than is really needed here, but this approach also allows us to relatively easily alter the output (perhaps if we wanted the original pairs instead of their cubes).
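The map, filter, map pipeline described here could be assembled roughly as follows. Everything is a sketch under assumptions: take-stream stands in for get-n-of-stream (whose real signature is not shown in this write-up), cube-sum is a made-up name for the weighting function, and the stream procedures are cut down to the infinite-stream case.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-ref s n)
  (if (= n 0) (stream-car s) (stream-ref (stream-cdr s) (- n 1))))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (stream-map-s proc s)            ; proc sees the whole stream
  (cons-stream (proc s)
               (stream-map-s proc (stream-cdr s))))
(define (stream-filter pred s)
  (if (pred (stream-car s))
      (cons-stream (stream-car s)
                   (stream-filter pred (stream-cdr s)))
      (stream-filter pred (stream-cdr s))))
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))
(define integers (integers-from 1))
(define (merge-weighted s1 s2 weight)
  (if (<= (weight (stream-car s1)) (weight (stream-car s2)))
      (cons-stream (stream-car s1)
                   (merge-weighted (stream-cdr s1) s2 weight))
      (cons-stream (stream-car s2)
                   (merge-weighted s1 (stream-cdr s2) weight))))
(define (weighted-pairs s t weight)
  (cons-stream
   (list (stream-car s) (stream-car t))
   (merge-weighted
    (stream-map (lambda (x) (list (stream-car s) x)) (stream-cdr t))
    (weighted-pairs (stream-cdr s) (stream-cdr t) weight)
    weight)))

;; Hypothetical stand-in for get-n-of-stream: first n elements as a list.
(define (take-stream s n)
  (if (= n 0)
      '()
      (cons (stream-car s) (take-stream (stream-cdr s) (- n 1)))))

(define (cube-sum p)
  (+ (expt (car p) 3) (expt (cadr p) 3)))

(define cube-pairs (weighted-pairs integers integers cube-sum))

(define hardy-ramanujan
  (stream-map
   ;; step 3: turn the first pair of each match into its sum of cubes
   (lambda (two) (cube-sum (car two)))
   (stream-filter
    ;; step 2: keep neighbours whose sums of cubes agree
    (lambda (two) (= (cube-sum (car two)) (cube-sum (cadr two))))
    ;; step 1: each element together with the one that follows it
    (stream-map-s (lambda (s) (take-stream s 2)) cube-pairs))))

(stream-ref hardy-ramanujan 0)   ; 1729
(stream-ref hardy-ramanujan 1)   ; 4104
```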
Exercise 3.72
Taking advantage of the work done in 3.71, this exercise becomes almost identical to it. I wrote it up as a single function that internally defines the filtering functions, just to show an alternate approach. The filter which does the main work is this:
As expected, the make-three-pairs again uses get-n-of-stream to produce a list of three successive stream elements. Then the filter just checks if all the sums of squares are equal. Unlike in 3.71, no additional map is needed, since this time we actually want the list of pairs.
There is a bit of ambiguity in the exercise statement. It could mean the sequence of numbers that have three or more ways to write them, or only those that have exactly three ways. The sequence, assuming three or more, is also in OEIS. The second is of course a subsequence of the first, but they diverge relatively quickly. If the first option is chosen, that leads to the question of how to display any multiple pairs. My own procedure will list multiple matches with an extra entry for each successive triple (e.g. when there are four values that match, the first three will be one element, and the last three will be the next element of the stream).
Exercise 3.73
While this may seem to nearly be only an electrical engineering exercise, the details of using this as a circuit simulation aren’t as important. It can be analyzed using only the block-diagram model, and treating each block as a function. You probably do need to be able to follow the math involved, though. To discover the function arguments, we look at the inputs to each block (the arrows leading into them), and of course the output of the block is the result of the function.
Once we break it down this way, we will create a series of functions whose results are arguments to another function. The scale block on the top takes the input i and scales it by R. So we use scale-stream for that. On the bottom, the integral block takes two inputs, one of which is i scaled by 1/C, and the other is just v0, the initial capacitor voltage. The integral procedure there also needs a timestep, which must be given as an argument to the RC procedure. Finally, the top and bottom paths go to an addition block, which is just add-streams. Since the expected output is meant to be a procedure, we wrap it up in an internal definition and return that. It could also have been just a lambda returned, since we don’t really need the label, but I think it helps to name it so we can indicate what the result actually represents: in this case, the voltage ‘output’ of the blocks.
The tests showcase that the behavior of the simulation more or less resembles how this circuit would function. In general, if a constant current (represented by i) flows, there will be constant voltage across the resistor. If there’s no current, then there is no voltage across the resistor. The capacitor, on the other hand, stores charge when current flows into it (leading to a rising voltage as charge increases). If the capacitor has any stored charge, it does not go away if the current turns off; the voltage on the capacitor remains the same. I did not give precise comparison tests for the output, since the mathematical precision is likely to vary, but hopefully this exercise is not too difficult to debug, once you know how to set up the stream procedures.
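The block diagram translates into code along these lines. The integral procedure is the one given in the text of the book; the inner name voltage and the constant-current test stream ones are choices made for this sketch, and the stream primitives are assumed stand-ins.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (stream-ref s n)
  (if (= n 0) (stream-car s) (stream-ref (stream-cdr s) (- n 1))))
(define (stream-map proc s)
  (cons-stream (proc (stream-car s))
               (stream-map proc (stream-cdr s))))
(define (scale-stream s factor)
  (stream-map (lambda (x) (* x factor)) s))
(define (add-streams a b)
  (cons-stream (+ (stream-car a) (stream-car b))
               (add-streams (stream-cdr a) (stream-cdr b))))

;; Running sum of integrand * dt, starting from initial-value.
(define (integral integrand initial-value dt)
  (define int
    (cons-stream initial-value
                 (add-streams (scale-stream integrand dt) int)))
  int)

;; Returns a procedure from a current stream i and an initial capacitor
;; voltage v0 to the stream of output voltages.
(define (RC r c dt)
  (define (voltage i v0)
    (add-streams (scale-stream i r)                           ; resistor
                 (integral (scale-stream i (/ 1 c)) v0 dt)))  ; capacitor
  voltage)

;; Constant current of 1 into R = 5, C = 1 with time step 0.5.
(define ones (cons-stream 1 ones))
(define RC1 (RC 5 1 0.5))
(define v (RC1 ones 0))

(stream-ref v 0)   ; 5
(stream-ref v 2)   ; 6.0
```

With a constant current the resistor term stays at 5 while the capacitor term climbs by 0.5 per step, which is the rising voltage described above.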
Exercise 3.74
The key concept is that the expression in question is just a ‘time’-shifted version of the input stream. That way, when the sense data stream is processing a given index, the second stream at the same index will be one value ‘behind’ it, and this provides the ‘last value’ to the input. (In case it isn’t clear, the variable last-value means the previous value, not a final value.)
There is a question of what to do at the start, when there is no previous value to work with. Alyssa’s stream assumes it to be 0; modeling that behavior is one approach. My implementation copies the first element of the input stream, so that there is never a spurious sign change if the input starts negative.
This exercise could be made a touch simpler with an allowance for re-ordering the arguments, and the assumption of infinite streams as inputs. If the ‘current’ stream is allowed to be changed, then the ‘previous’ stream would be the input data itself, and the ‘current’ becomes merely the stream-cdr of the input. In this case the first output is skipped (it is meaningless anyway), and the output stream starts right away with the input compared to the second value in the input. Care would be needed if the measurement of the time of the zero crossing is required to be exact, as it would be one time-step off from the input data.
Exercise 3.75
The mistake Louis made is that his procedure is using the output of the stream (last-value) to do the averaging, instead of averaging two input values. This will cause the ‘smoothed’ output to have a lingering residue of previous changes that never really goes away but does get reduced over time. Sharp changes in the output will remain in the smoothed output and only slowly be averaged out on subsequent smoothings. To see that this is the case, consider the mathematical expression that results. Let each input value be labeled as x_n, where n is the index of the input (and the initial ‘last value’ is the first input repeated).
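The expression itself is not reproduced in this copy, so here is a reconstruction from the definitions just given. The symbol a_n for the n-th averaged value is notation introduced for this sketch. Each average in Louis's version is formed from the current input and the previous average:

```latex
\begin{aligned}
a_0 &= \tfrac{1}{2}(x_0 + x_0) = x_0 \\
a_1 &= \tfrac{1}{2}(x_1 + a_0) = \tfrac{1}{2}x_1 + \tfrac{1}{2}x_0 \\
a_2 &= \tfrac{1}{2}(x_2 + a_1) = \tfrac{1}{2}x_2 + \tfrac{1}{4}x_1 + \tfrac{1}{4}x_0 \\
a_3 &= \tfrac{1}{2}(x_3 + a_2) = \tfrac{1}{2}x_3 + \tfrac{1}{4}x_2 + \tfrac{1}{8}x_1 + \tfrac{1}{8}x_0 \\
a_n &= \sum_{k=1}^{n} \frac{x_k}{2^{\,n-k+1}} + \frac{x_0}{2^{\,n}}
\end{aligned}
```

The intended smoothing would instead be a_n = (x_n + x_{n-1})/2, which involves only two inputs.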
It’s clear that this approach doesn’t just take two successive values and smooth them together; it mixes them in with every value previously in the stream (albeit with earlier values having successively reduced influence). In this circumstance, it will likely manage to remove noise, but since it also retains the ‘residue’ of all earlier values, it’s quite likely to be in error as to when a sign change actually occurs. A particularly large but brief change in the input would cause a considerable additional error for some time afterward, when it should instead be evened out quickly.
The simplest fix, then, is to get another actual input value from the stream, giving it as one more argument to the procedure. The last value (output) continues to be required in order for the zero-crossing check to work as it did previously. The smoothing (averaging) is taken by combining with the last input instead of last-value, without changing anything else in the procedure.
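A sketch of that fix is shown below. The parameter name last-input, the behavior of sign-change-detector at exactly zero, and the test signal are assumptions; the rest follows the shape of Louis's procedure from the exercise, with last-value still holding the previous average.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (take-stream s n)
  (if (= n 0)
      '()
      (cons (stream-car s) (take-stream (stream-cdr s) (- n 1)))))

;; 1 for a negative-to-nonnegative change, -1 for the reverse, else 0.
(define (sign-change-detector current previous)
  (cond ((and (< previous 0) (>= current 0)) 1)
        ((and (>= previous 0) (< current 0)) -1)
        (else 0)))

;; last-value is the previous average (used for the crossing check);
;; last-input is the previous raw input (used for the averaging).
(define (make-zero-crossings input-stream last-value last-input)
  (let ((avpt (/ (+ (stream-car input-stream) last-input) 2)))
    (cons-stream (sign-change-detector avpt last-value)
                 (make-zero-crossings (stream-cdr input-stream)
                                      avpt
                                      (stream-car input-stream)))))

;; Test signal 4, 3, 2, 1, 0, -1, -2, ...
(define (descending-from n)
  (cons-stream n (descending-from (- n 1))))
(define sense-data (descending-from 4))

;; Seed both 'previous' arguments with the first input.
(define zero-crossings
  (make-zero-crossings sense-data
                       (stream-car sense-data)
                       (stream-car sense-data)))

(take-stream zero-crossings 7)   ; (0 0 0 0 0 -1 0)
```

The averages here are 4, 7/2, 5/2, 3/2, 1/2, -1/2, ..., so the single crossing is reported at the sixth position.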
As in the last exercise, if we allow for the assumption of infinite streams, we could also solve this by looking ahead in the stream. This would again yield an ‘offset’ in time steps, but would not require additional arguments to be added. I’m not sure if that might be considered changing the ‘structure’ of the program, however. I think it’s certainly questionable to say that adding arguments is not already changing the ‘structure’ of the program in the first place, as that sort of change is one of the most noticeable and disruptive possible.
Exercise 3.76
Unlike the previous two exercises, this function is taking a single stream as input, and produces its output directly from that. That gives greater flexibility in how to construct the output. Indeed, one solution is the same idea proposed in the previous exercises that we could probably take only the stream itself as input and look ahead. We can also manipulate the stream in other ways by using internally defined streams or functions.
A good solution here doesn’t need to be that different from the original way that the zero crossings were constructed, seeing as how those combined two elements to make a stream. Either the approach that Alyssa took (constructing a stream that feeds an extra argument to a recursive function) or something more like Eva’s approach (using stream-map with a time-shifted version of the input) should work. I ended up not using stream-map itself, but effectively re-implementing it, so that it could properly stop before the end, if the stream is finite. (It would be possible to rely on stream-map’s own cutoff by putting the shifted, ‘shorter’ stream first, but it might be considered a questionable practice to rely on the internal details of stream-map.)
Using the new smooth to produce the zero-crossing stream can then be done using either Alyssa’s function or Eva’s approach using stream-map, with the input stream argument replaced by a smoothed stream. It’s again somewhat dependent on how we want to treat that initial value. It wasn’t really possible to construct a comprehensive test for this exercise in advance, since there isn’t a very clear indication of how initial values are to be handled. Either way, it should produce results similar to the earlier tests. Consider it a chance to add your own verification steps, based on your interpretation of how it ought to work.
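One possible arrangement is sketched here, under the look-ahead interpretation and for infinite inputs only (the finite-stream cutoff discussed above is left out). The helper detect-crossings, the choice to drop the meaningless first output, and the test signal are all assumptions.

```scheme
;; Minimal stream primitives (assumed stand-ins, infinite streams only).
(define-syntax cons-stream
  (syntax-rules ()
    ((_ a b) (cons a (delay b)))))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))
(define (take-stream s n)
  (if (= n 0)
      '()
      (cons (stream-car s) (take-stream (stream-cdr s) (- n 1)))))

(define (sign-change-detector current previous)
  (cond ((and (< previous 0) (>= current 0)) 1)
        ((and (>= previous 0) (< current 0)) -1)
        (else 0)))

;; Each output is the average of an element and the one after it.
(define (smooth s)
  (cons-stream (/ (+ (stream-car s) (stream-car (stream-cdr s))) 2)
               (smooth (stream-cdr s))))

;; Compare every element with its predecessor (current vs. previous).
(define (detect-crossings s)
  (cons-stream (sign-change-detector (stream-car (stream-cdr s))
                                     (stream-car s))
               (detect-crossings (stream-cdr s))))

;; Modular composition: smooth first, then look for sign changes.
(define (make-zero-crossings input-stream)
  (detect-crossings (smooth input-stream)))

;; Test signal 4, 3, 2, 1, 0, -1, -2, ...
(define (descending-from n)
  (cons-stream n (descending-from (- n 1))))

(take-stream (make-zero-crossings (descending-from 4)) 5)
;; (0 0 0 -1 0)
```

Because nothing is seeded with an initial value, the crossing shows up earlier in the output than it would with a seeded version, which is the kind of offset in time steps to keep in mind when writing tests.
<test>
(if (not (equal? (take-stream (smooth (descending-from 4)) 3) (list 7/2 5/2 3/2)))
    (error "smooth"))
(if (not (equal? (take-stream (make-zero-crossings (descending-from 4)) 5)
                 '(0 0 0 -1 0)))
    (error "crossings"))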