SICP 2.1.4 Solutions
April 7, 2014 12:30
There’s nothing particularly difficult here. The interval is a pair, and getting the bounds is using either
cdr to get the proper bound. Technically the constructor isn’t properly specified, as we don’t know what
b is meant to signify. The intuitive solution is to have
a be the lower bound and
b be the upper bound, as it then closely matches how mathematical intervals are written. The tests simply ensure that the already-written procedures will function using the selectors we’ve provided, and implicitly use the constructor arguments in this manner.
One way of defining subtraction is in terms of negation and addition. In other words, a – b = a + (−b). If that’s our approach, then what is needed is a proper definition of what negation is. Imagine that negation is ‘flipping’ the interval about 0. What we’ll get is that the lower bound of the negated interval is the negative of the original’s upper bound (which for positive intervals is farthest from 0). The new upper bound is then the negative of the original lower bound. Negating our subtrahend reduces the problem to simple interval addition.
Does this actually make sense for what we can imagine subtraction to be as a combination of the intervals? If an interval means a value that is potentially anywhere in a certain range (as in physical resistor values), then when we subtract, we want an interval that covers the possibilities the original intervals can have. The lower bound should be the lowest value that can result from the two. The upper bound shouldn’t be higher than what the subtrahend can reduce the minuend by. Let’s look at this in a bit more detail.
If a is defined as [a0, a1], and b is defined as [b0, b1], then using the negation method,
a – b = a + (−b)
a – b = [a0, a1] + (−[b0, b1])
a – b = [a0, a1] + ([−b1, −b0])
a – b = [a0-b1, a1-b0]
b1 >= any value in b, and a0 <= any value in a; a0-b1 is the lowest value in the result.
As a1 >= any value in a, and b0 is <= any value in b, then when subtracting b from a, there won’t be any value that can be higher than a1 reduced by b0.
While this method is thus sufficient to fill our needs, there is one point at which this subtraction doesn’t correspond to subtraction of real numbers. Namely, what do we get for a – a? For a real number, we’d expect this to equal 0. But this won’t work for intervals, assuming we (sensibly) define 0 as the interval [0,0]. Subtracting a from a doesn’t give [0,0], however. It gives an interval centered on zero, with a range determined by the range of a.
It turns out we can’t actually create an interval that, added to a, will equal [0,0]. [-a0, -a1] simply doesn’t work. Since a1 >= a0, the lower bound of that interval is greater than the upper bound!. That’s one indication that not every operation we can do with real numbers will end up working just as smoothly with intervals, and we’ll see more on that later on of course.
As it turns out, the preceding definition of subtraction makes this proof a bit simpler to describe. All we need to do to include subtraction is observe that the width of a negated interval is the same as the original.
|width (a) =||a1 – a0|
|width (−a) =||(−a0) – (−a1)|
|width (−a) =||a1 – a0|
∴ width(−a) = width(a)
Now we have to show that the width of an added interval is the sum of the widths of the two values added.
a = [a0, a1]
b = [b0, b1]
a + b = [a0+b0, a1+b1]
|width (a + b) =||(a1 + b1) – (a0 + b0)|
|width (a + b) =||a1 – a0 + b1 – b0|
|width (a + b) =||a1 – a0||+||b1 – b0|
∴ width (a + b) = width(a) + width(b)
Multiplication and division do not adhere to this simple method of combining widths. It’s fairly easy to come up with counterexamples:
We’ve run into another issue with the number 0. We can’t allow 0 to be in any interval when we’re dividing, even if it’s not the center of the interval. How is the check made? By simply testing the bounds. If the upper bound is >= 0, and the lower bound is <= 0, then zero must be somewhere in that interval. We put this test before performing the math in
div-interval, and signal an error as required if the test fails.
Taking the hint from the problem statement, in that we only need to test the signs, we discover the nine cases to be taken from the set of all the combinations of the bounds having various signs. There are only nine cases rather than 16, since we know that the upper bounds can’t be lower than the lower bounds. We do have to make a decision of which sign 0 is grouped with; as long as we stay consistent about it, the choice doesn’t matter.
Previously we performed four calculations and picked the maximum (or minimum) for the result. The number of multiplications is reduced when we test for signs. To start with, we can discard results that fall on the wrong side of 0. Furthermore, we know if one interval’s bounds are both positive (or negative), that when multiplying by a number of a given sign, which of those will give the larger (or smaller) value. In the end the only case this can’t be resolved easily for is when the intervals both span 0, and in that case we need to fall back to four multiplications.
This approach implies that the calculations are an expensive operation while the tests are not. That means that we don’t have to be particularly worried about how many tests on the sign we’re performing. I implemented this in a somewhat convoluted way that finds the upper and lower bound separately. It would be just as simple to have nine sets of tests, that each yield a different call to
A simple way to construct an interval like this is to convert the percentage tolerance to a width, and use
make-center-width with that value. The
percent selector can likewise rely on
center to get the tolerance.
Something that needs to be considered here is how to deal with 0 as a center. There’s no way to calculate a proper percentage from 0, so in my functions this is considered an error.
If we consider the intervals as based on small percentage tolerances, then it’s sensible to describe the interval x as (x +/- a * x), where a is our small percentage. The product of the two intervals then looks like this:
(x +/- ax) * (y +/- by) = xy +/- axy +/- xby +/- axby
Factor out xy in the three right terms to get
xy +/- (a + b + ab) * xy
Since ab is the product of two small percentages, it’s a very small number. Ignoring it yields
xy +/- (a + b) * xy
The last expression is in a similar format to how we stated the interval x to begin with. Thus, to approximate the percentage tolerance of the product, we simply add the tolerances of the values being multiplied. The assumption that all values are positive lets us gloss over the complications of finding the actual bounds as in the cases described in Exercise 2.11.
This exercise leads into the next one, but following the suggestion with specific values lets us see how large the effect is. We are finding more cases where treating intervals like real numbers leads to not just indeterminate results, but some that are outright incorrect. As the book states, “this is a serious complaint” which will be answered in Exercise 2.16.
Before even considering Eva’s ideas, it can be shown that
par2 gives the correct answer in the physical case it’s attempting to model.
Consider a circuit with two 1 kilo-ohm, 5% resistors. The maximum current will flow when the two resistors have their minimum value. This is 950 ohms for each, which is equivalent to a single 475-ohm resistor. Going the other way, the minimum current occurs when both are at their maximum value, 1050 ohms. These two in parallel result in an equivalent resistance of 525 ohms. The correct answer for the range of equivalent resistance is [475, 525]. This is in fact what
par2 calculates, and therefore it is to be preferred.
Eva does turn out to be right about this formula, and in general, if an uncertain variable is not repeated, then the uncertainty in the result will be smaller as well. A more in-depth consideration of this topic is covered by sensitivity analysis, which seeks to determine how the output of a formula or system is affected by variations in the input.
The problem here is that the phrase ‘equivalent algebraic expressions’ is incorrect. These are not equivalent because interval arithmetic does not follow the rules of algebra, at least not as we apply it to real numbers. We have already seen that arithmetic inverses (a number such that a plus its inverse equals 0) cannot exist, and in working exercise 2.14 you may have found that multiplicative inverses (a number such that a multiplied by its inverse equals 1) also don’t exist for all values.
What we find most damaging in trying to rewrite our expressions is that the distributive property does not hold for every interval. In general, a * (b + c) is not equal to a * b + a * c. We observe the same effect Eva noticed: the duplication of a variable in a formula is going to produce a wider ‘error’ bound in the result. Producing different values from these ‘rearrangements’ is enough to throw off our algebraic manipulations.
There have been several approaches to improving interval arithmetic. A few attempts define arithmetic operations on intervals in a way that eliminates the ambiguity. Another method is to use interval constraints to define the interval so that it can be used algebraically (it is considered as a section of a function). Yet another approach is Affine arithmetic, which encompasses interval math in a more generalized system.