Quirksand

SICP

SICP 3.4.1

February 16, 2020 10:34

There is only one exercise in this first section on concurrency, and no coding is required to solve it. It’s a mental exercise, and the second part is particularly open-ended. This post will be the only one for this section. As there’s no real need to introduce the exercise, the remainder of this post will discuss my solution set and how it was created.

Depending on how you break down the steps, it’s possible to have a relatively small number of steps, and figuring out the possible result can be accomplished by running through every combination. Computing combinations and running the sequence of actions that results is actually a good programming task, and that is what I did for my solution. The file included below will generate a list of all ‘possible’ sequences. It does not actually use concurrency at the interpreter level, so it should be possible to run it with almost any flavor of Scheme.

In the first part of the exercise, there are not that many orderings that can occur. There are three distinct operations, and thus there are only six different ways in which they can be run. Not all of them will produce a unique value, either. In particular, it makes no difference if Peter follows right after Paul or if it’s the other way around, since neither of those actions depends on the current balance. We can list out the possible orders, see what the results of running them are, and come up with each final value by hand without too much difficulty. Four distinct final balances are possible, ranging from $35 to $50.

Part b is a slightly different story. Depending on how you consider it, there are dozens, if not hundreds of possible ways that the steps can be interleaved. So part of the exercise is just understanding exactly which steps will lead to a change in the final outcome when they become interleaved. When analyzing a concurrency problem, I think it’s important to consider two questions — what are the shared resources, and when and how are they being accessed? Obviously if there are no shared resources, then concurrency will not affect the outcomes at all (for instance, if every person had a private bank account). We’d also typically only be concerned if the resource is actually being modified during the concurrency period, since if the concurrent processes do share a resource but it cannot change when they are run together, they simply observe an unchanging value, and won’t produce different outcomes if run in another order. An example of this would be procedures that merely report the bank balance in different currencies, but perform no actual transactions.

With this in mind, we can break down the bank account problem. What are the shared resources? It’s really only the account balance, as it is the variable that all three processes access. It’s readily apparent that the balance is going to be modified during these processes, so all steps that access it must be tracked. In the first two cases (Peter and Paul), they first read the balance, compute a new value, and then set the balance to that value. The computation portion can be considered a private step. Thus each of them only interacts with the shared resource in two ways. Mary’s case, however, has the statement (- balance (/ balance 2)), which in fact involves two distinct accesses to the balance. Her process also goes on to set the balance, meaning there are three steps that can be interleaved.

That amounts to seven steps that can access the balance. If they could run in literally any order, we’d have 7!, or 5040 combinations of steps. However, since each successive step must be run after another, we don’t have to consider quite that many cases. Figuring out the possible sequences where the steps remain in order is a matter that can be solved using a computer program. We make a list for each sequence, and then create the set of all sequences that interleave the lists, preserving the order of the steps for any one list.

It’s not necessary to understand this part to discuss the solutions, but for those interested, here is the code that generates the interleavings:

(define (interleave-n nlists)
  (cond ((null? nlists) '())
        ((null? (car nlists)) (interleave-n (cdr nlists)))
        (else (flatmap interleave-with-first-fixed (shuffle-each-to-front nlists)) )
        )
  )
  
 (define (interleave-with-first-fixed ln)
   (if (< (length ln) 2)
       ln
       (let ((l1 (car ln))
                     )
                 (map (lambda (shorter-list) (cons (car l1) shorter-list))
                          (interleave-n (cons (cdr l1) (cdr ln)))
                          )
                 )
       )
   )

The procedure takes a list of lists, and will return all possible interleavings of the various lists. To interleave a list and preserve the order of steps in it, we select the first element from the first list, and then recursively find all interleavings that result from the remainder of that list interleaved with the other lists. If we then allow any of our n lists to be the ‘first’ list, we’ll cover all cases. To handle the selection of ‘first list’, there is a procedure shuffle-each-to-front that creates a new set of lists, formed by successively taking every element of a list and moving it to the front. (In this case each element is itself a list; those lists are not shuffled.) The final result makes use of flatmap to combine all of those lists, as they may be of varying length.

I found it slightly easier to refer to the interleaved steps by giving them more generic names — instead of ‘Peter’, ‘Paul’, and ‘Mary’, I used a, b, and c for the process, with numeric indexes to show the steps in order. In other words, the list (a1 b1 a2 b2 c1 c2 c3) represents the following interleaving: First Peter (a) and Paul (b) read the balance (a1 b1). Then Peter sets the balance (a2), Paul sets the balance (b2), and lastly Mary’s steps proceed in order (c1 c2 c3). There is an additional reason for using these more generic names, which I’ll explain below. The resultant display of the output is comparable to the interleavings listed in the next section of the text (3.4.2), which shows an interleaving of two three-step processes.

It turns out that we don’t have to just look at the list of steps and compute the output manually. Instead of merely using names for the interleaved lists, we can actually use procedures instead. We can create an external balance variable, and then turn each step into a procedure that operates on that variable. The interleaving produces a sequence of procedures, almost as if it were a (begin ... ) statement. For this exercise, I made each step a procedure with no arguments, and used global variables for all storage. When a1 reads the balance, it will compute the eventual new balance and save it in a variable a temporarily. A later process a2 can then use that to set the balance. In this way, we can define what Peter, Paul, or Mary does as a list of ‘step’ procedures that take no arguments. Then, we take one of our interleaved output lists, process its steps (using apply to execute them), and check balance once it’s done.

In the end, there are 210 possible orderings that can result from the interleaving of the three processes in this fashion. To be sure, some of these don’t have a meaningful effect (e.g. the partial sequence a1 b1 is no different from b1 a1, since neither modifies the balance). The exhaustive list also includes the orderings where no actual ‘interleaving’ occurs, leading to the results produced in part a. The whole set of outputs is reasonably small enough that we can then look over the results (if we really wanted we could sort the outputs and/or remove duplicates). We find the actual set of possible balances present a pretty wide array: the results can be as low as 25 or as high as 110! Most multiples of 5 in between those two are also potential results.

Here are some sample sequences, exemplifying ways that the interleaving affects the balance.

(a1 b1 b2 c1 c2 c3 a2) => 110

In this one, a1 grabs the original balance before everyone else, but then doesn’t get to set the balance until the others are completely finished. In effect, this erases the other processes entirely (although in the bank case, consider that both Paul and Mary were able to ‘withdraw’ money that wasn’t properly recorded). This sort of pattern is a fairly common one when interleaving occurs that alters an ‘expected’ result. Indeed the last operation to complete is the one that is most likely to determine whether a sequence has an unusual value.

(a1 c1 a2 c2 c3 b1 b2) => 25

What happened here is that while c1 took the original balance (to subtract from), c2 took the larger balance 110 that results after a is finished. That means it’s subtracting 55 (half of 110) from 100, leaving only 45 when it is done. Then process b (Paul) removes 20 more, leaving only 25. Since process c reads twice, it can potentially end up with a very different output depending on how other steps are interleaved.

(b1 c1 a1 b2 a2 c2 c3) => 45

This set represents a sort of ‘balanced’ interleaving. Each process’s first step is completed, then each process’s second step is completed, then any further steps are completed. It still produces nonsense, though, since the result of b2 is overwritten by a2, and process c is still mixing the original balance with half of a later computed balance.

There is one further wrinkle to these calculations, however. We split the statement (- balance (/ balance 2)) into two steps, both of which read the balance. Those two steps are part of an argument list to the subtraction procedure. But how do we know which of those actually happened first? The order of argument evaluation will determine this, and it is not necessarily left to right. Therefore we have to consider the case in which those arguments are evaluated in a different order.

It would be a pain to accomodate this in the interleave-n procedure, since it assumes the order of steps needs to be consistent. For this exercise, I found it easier to just run the procedure twice, and give a different meaning to the procedures c1 and c2 depending on the order of evaluation (this is why the generic names are a bit easier to deal with). In fact, we find that when the order of the argument evaluation is different, we can get a slightly different set of outcomes.

(a1 a2 b1 c1 b2 c2 c3) => 65

This result in particular, 65, is only possible with left-to-right argument evaluation. With c1 as the minuend, it can retrieve the highest amount possible (110, when a2 completes before anything else). Then c2 takes away half of a smaller balance set by b2. If c1 is instead taking half of the earlier balance, and c2 will subtract that from a later balance, you won’t get the same output. It can be seen in the demonstration file by switching the left-to-right flag that 65 won’t show up as a result when the arguments are evaluated from right to left.

Complete Solution 3.4.1.