Archives for drcabana.org

Cracker Barrel puzzle

A recent posting to the Trifunc group invited the members to solve the Cracker Barrel puzzle.

Brian Adkins, the author of the challenge, posted a solution in Haskell. Here is my Clojure solution.

The game is played on a triangular board with 15 holes, labeled 0 through 14. Initially, 14 of the holes contain pegs, one hole is empty. The goal is to eliminate pegs by jumping over them. For a full explanation of the game, see the links cited above. My numbering scheme for the game board is similar to the one shown here, except that I number from 0 to 14 rather than from 1 to 15.

I represent moves by triples I call neighborhoods. Each neighborhood (for example [0 2 5]) designates the coordinates of three contiguous holes. Corresponding to the triple of coordinates is a triple of booleans denoting the presence/absence of pegs in the holes. If the triple of booleans is either [true true false] or [false true true], then a capture is possible, and the triple of integers unambiguously determines a legal move.

Though the solve and solve-all functions are mutually recursive, I did not bother to use a trampoline. Blowing the stack is not a danger here. The stack depth is proportional to the length of the game, and the maximum length for a game is thirteen moves.
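The original listing did not survive in this archive, so here is a sketch of the approach just described. The neighborhood table assumes my numbering (0 at the apex, rows reading 0 / 1 2 / 3 4 5 / 6 7 8 9 / 10 11 12 13 14), and a board is represented as the set of occupied holes, standing in for the boolean triples described above.

;; Every triple of contiguous holes on the board.
(def neighborhoods
  [[0 1 3] [0 2 5] [1 3 6] [1 4 8] [2 4 7] [2 5 9]
   [3 4 5] [3 6 10] [3 7 12] [4 7 11] [4 8 13] [5 8 12]
   [5 9 14] [6 7 8] [7 8 9] [10 11 12] [11 12 13] [12 13 14]])

(defn moves
  "All legal moves on board. A neighborhood [a b c] yields a move when
   the pattern is peg-peg-hole (jump a over b into c) or hole-peg-peg
   (jump c over b into a)."
  [board]
  (for [[a b c] neighborhoods
        [from over to] [[a b c] [c b a]]
        :when (and (board from) (board over) (not (board to)))]
    [from over to]))

(defn jump [board [from over to]]
  (-> board (disj from over) (conj to)))

(declare solve-all)

(defn solve
  "Returns a winning sequence of moves from board, or nil.
   A win is the usual goal: one peg left."
  [board]
  (if (= 1 (count board))
    ()
    (solve-all board (moves board))))

(defn solve-all [board ms]
  (when-let [[m & more] (seq ms)]
    (if-let [soln (solve (jump board m))]
      (cons m soln)
      (solve-all board more))))

user> (solve (disj (set (range 15)) 0))   ;; hole 0 starts empty
;; => a sequence of thirteen [from over to] moves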

Lazy coins

Here’s a problem I ran across on the CompSci Stack Exchange. Suppose you have a biased coin that comes up heads with probability p. You don’t know what p is, but assume it is neither 0 nor 1. How can you use the coin to simulate a fair coin, i.e. one that produces heads with probability very close to 1/2?

The procedure is surprisingly simple. To produce one “fair” toss, you toss the biased coin n times, and count the number of heads. If the count is even, the result is heads; if odd, tails. The value of n needed will depend on the bias of the coin, but typically something like n=20 will suffice.

Lazy sequences provide a simple and elegant way to build probabilistic models of coin tossing.
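The code listings from this post did not survive archiving, so what follows are minimal sketches of what the text describes; the names coin, sample-mean, and unbias, and the :h/:t tokens, are my assumptions. A coin of bias p is an infinite lazy sequence of tosses:

(defn coin
  "An infinite, lazy sequence of tosses of a coin that comes up
   heads with probability p."
  [p]
  (map #(if (< % p) :h :t) (repeatedly rand)))

user> (take 10 (coin 1/2))
(:h :t :t :h :h :t :h :t :t :h)    ;; illustrative
user> (take 10 (coin 0.9))
(:h :h :h :t :h :h :h :h :h :h)    ;; illustrative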

As seen above, eyeballing a few tosses is not going to suffice to distinguish fair coins from unfair. We’ll need to collect some statistics.
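Along these lines (sample-mean is an assumed name):

(defn sample-mean
  "The fraction of heads among the first n tosses."
  [n tosses]
  (/ (count (filter #{:h} (take n tosses)))
     (float n)))

user> (sample-mean 100000 (coin 1/2))
0.50138    ;; illustrative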

Here’s how we use repeated tosses to simulate a fairer coin.
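A sketch of unbias, following the block-parity recipe described above:

(defn unbias
  "Group the tosses into blocks of n; a block containing an even
   number of heads counts as one head, an odd number as one tail."
  [n tosses]
  (map #(if (even? (count (filter #{:h} %))) :h :t)
       (partition n tosses)))

user> (for [n (range 1 9)]
        (sample-mean 10000 (unbias n (coin 0.7))))
(0.3061 0.5829 0.4677 0.5113 0.4943 0.5021 0.4989 0.5012)   ;; illustrative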

Notice the oscillation in the sequence of sample means. To understand where it’s coming from, imagine flipping a coin that always comes up heads, an evil coin. The parity of the number of heads observed is exactly correlated with the parity of the blocksize n in our unbias function. This causes our scheme to fail miserably for evil coins, as below.
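Using the definitions sketched above:

(def evil (repeat :h))   ;; always heads

user> (sample-mean 1000 (unbias 2 evil))
1.0    ;; every block of two holds an even number of heads
user> (sample-mean 1000 (unbias 3 evil))
0.0    ;; every block of three holds an odd number of heads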

With a highly biased (but not evil) coin we see much the same oscillation for small blocksize n. If we track the sample means over a contiguous range of blocksizes n, high sample means alternate with low sample means. Regardless of these oscillations, as n grows, the sample means do converge to 1/2. Can we prove it?

It is convenient to model the coin as a random variable X which assumes values 1 and -1 with probabilities p and q=1-p respectively. Flipping the coin repeatedly results in a sequence of independent, identically distributed random variables X_i.

Consider the random variable Y_n defined as:
Y_n = \frac 12 + \frac 12 \left(\prod_{i=1}^n X_i\right)

Notice that Y_n takes on values 0 and 1, depending on the parity of the number of heads seen in n coin tosses, exactly as in the unbias function.

The expected value of Y_n is:
E\left(Y_n\right) = E\left(\frac 12 + \frac 12 \left(\prod_{i=1}^n X_i\right)\right) = \frac 12 + \frac 12 \,E\left(\prod_{i=1}^n X_i\right)

Because the X_i are statistically independent, we can exchange the order of product and expectation:
E\left(\prod_{i=1}^n X_i\right) = \prod_{i=1}^n E\left(X_i\right) = \prod_{i=1}^n (p-q) = (p-q)^n

Since |p-q| < 1, the term (p-q)^n converges to zero as n increases. In particular E(Y_n) converges to 1/2.

I'll close by pointing out a couple of potential pitfalls of using lazy sequences to model coin tossing. Both are easily avoided. Here's the first. A given coin is a fixed sequence. Each time I look at the first 10 elements they are the same. Make new coins as required.
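For example (an illustrative session, using the coin sketched earlier):

user> (def c (coin 0.4))
#'user/c
user> (take 10 c)
(:t :h :t :t :t :h :t :t :h :t)
user> (take 10 c)
(:t :h :t :t :t :h :t :t :h :t)    ;; identical: lazy seqs cache their elements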

Don't do this: The sequence fair is now a million elements long and won't be garbage collected as long as you hold a reference to its head. If you must name a coin, try to do it inside a let block so that it will be garbage collected when the let block falls out of scope.
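A reconstruction of the two versions:

;; Don't do this at the top level:
(def fair (unbias 20 (coin 0.9)))
(sample-mean 1000000 fair)   ;; fair now pins a million cached elements

;; Better: let the head fall out of scope when you are done.
(let [fair (unbias 20 (coin 0.9))]
  (sample-mean 1000000 fair))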

A thread-safe fsm?

Today I wanted to play around with building a small thread-safe finite state machine. I’m mostly interested in the thread safety angle, so the machine is kept extremely simple. Clojure’s protocol feature turned out to be just what the doctor ordered.

The fsm consists of three things: the current state, the input alphabet, and a transition function which maps the current state and an input symbol to the next state. The fsm has two ‘methods’: state, which returns the current state, and act which acts on the input symbol and current state to compute the next state. Here is how act and state are implemented.
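The original definitions are not preserved here; this is a sketch of the shape they must have had. The protocol name FSM, the atom-based state cell, and swap! as the synchronization mechanism are my assumptions.

(defprotocol FSM
  (state [fsm] "Returns the current state.")
  (act [fsm input] "Advance the machine on input; returns the machine."))

(defn make-machine
  "A machine is a transition function plus a current state held in
   an atom; swap! makes each transition atomic."
  [init transition]
  (let [current (atom init)]
    (reify FSM
      (state [this] @current)
      (act [this input]
        (swap! current transition input)
        this))))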

Let’s make things more concrete by building a machine. Our input alphabet will contain only a single symbol, 1. The machine will count the parity mod 4 of the number of times it has seen the symbol 1. It ignores input other than the symbol 1. We’ll call the machine blue, in honor of the Sweetwater Blue Ale I had while building it.
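A sketch of blue, assuming keyword states ::0 through ::3 (the names that appear later in the post):

(defn mod4-transition
  "Counts, mod 4, the number of times the symbol 1 has been seen.
   Any other input leaves the state unchanged."
  [state input]
  (if (= 1 input)
    ({::0 ::1, ::1 ::2, ::2 ::3, ::3 ::0} state)
    state))

(def blue (make-machine ::0 mod4-transition))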

Playing with it at the repl we see that at first glance everything appears to work.
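Something like this (an illustrative session):

user> (state blue)
:user/0
user> (state (act blue 1))
:user/1
user> (state (act blue :ignored))
:user/1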

But is it thread safe? Let’s beat on it with some threads.
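A sketch of the test: hit the machine with the symbol 1 from many threads at once, 100000 acts in all, then inspect the final state.

(defn pound
  "Run n-threads futures, each acting on machine n-acts times;
   wait for them all, then return the final state."
  [machine n-threads n-acts]
  (let [fs (doall (repeatedly n-threads
                              #(future (dotimes [_ n-acts]
                                         (act machine 1)))))]
    (doseq [f fs] @f)
    (state machine)))

user> (pound (make-machine ::0 mod4-transition) 10 10000)
:user/0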

Well, 100000 mod 4 is 0, so it got the right answer. And blue passed various similar tests, which I’ll spare you. But the thought occurs that maybe my test is not a very good test. So let’s test the test. Yes, I’m kind of goofing on you, but not entirely. The way to test the test is to build a flawed version of the act function, one with a deliberate concurrency bug, and see whether the test catches it.
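A sketch of the sabotage: read the state, compute the next state, then blindly overwrite it with reset!. Two threads interleaving between the read and the write will lose updates.

(defn make-racy-machine
  "Like make-machine, but with a deliberate race between reading
   the current state and writing the next one."
  [init transition]
  (let [current (atom init)]
    (reify FSM
      (state [this] @current)
      (act [this input]
        (let [next (transition @current input)]
          (reset! current next))   ;; the race: @current may be stale by now
        this))))

(def red (make-racy-machine ::0 mod4-transition))

user> (pound red 10 10000)
:user/1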

Using the new, flawed definition, we build another machine and test again. Notice the final state is ::1, which is incorrect. The red machine started in state ::0, so after seeing the symbol 1 exactly 100000 times it should have been in state ::0. This suggests the beat-on-it-with-threads test has some merit, and that having passed it repeatedly gives the thread safety of the blue machine some plausibility.

Clojure Programming Studio: a review

On Sept 13-15 I had the pleasure of attending the Pragmatic Studio’s Clojure Programming course in Reston, VA. This is a brief review.

First, the external elements. The course was held at the Learning Tree Reston Education Center. The facility was spacious, modern, and well run. There was good food, plenty of drinks and snacks, easy internet access, etc. All of the myriad tasks necessary to organize and run the class were well and transparently executed. I experienced no issues whatsoever. Nicely done.

The instructors, Rich Hickey and Stuart Halloway, need no introduction to anyone seriously interested in Clojure. Rich is the author of the Clojure language; Stu is the author of Programming Clojure and the CEO of Relevance. Both are gifted instructors and natural raconteurs. They knew their material, knew how to present it, and seemed to enjoy doing so. The class consisted of approximately thirty students, some of whom traveled from as far as Sweden to attend the course. The students struck me as genuinely interested in the material, judging from hallway conversations and longer discussions at the Monday night beer and pizza social.

The structure of the course alternated lectures with lab exercises. Rich and Stu alternated as lecturers, though both were present throughout. The one not lecturing typically ran the slide show and provided occasional color commentary. Both encouraged questions at all times. A sufficiently good question might elicit a so-called “rant”, a thing much to be valued. If you ever get the chance, try to get Rich going on object oriented programming, or perhaps TDD.

The course seemed designed for an experienced programmer interested in taking a fast but serious look at what Clojure has to offer. Because of the depth of the instructors’ knowledge of the material, I suspect even fairly advanced Clojure programmers would enjoy and benefit from the course.

Bottom line, was it worth the time and expense involved? I thought so, absolutely. I came away satisfied that my early impressions of Clojure and its future were correct. And I had a great time.

Standard disclosure: I am not financially affiliated with the Pragmatic Studio, or with Rich or Stu. I have received no compensation of any sort for these remarks, nor will I.

Nonstandard disclosure: The course fees and hotel accommodations were paid for by my lovely wife Sharon, who gave me this opportunity as a birthday present. During the days I was gone, she heroically dealt with her job, our two children, their school and dance schedules, and with our two dogs. I am grateful.

bind, unit, and all that

I’ve given in to the unholy urge to write a monad tutorial. Here are some remarks that might help you decide whether it’s worth your time to read on.

My intended readers are programmers interested in functional programming, not mathematicians. No category theory is required.

All the code will be in Clojure. The level of Clojure expertise required is intermediate. The reader should be comfortable defining functions, especially functions that return functions as values, and should understand the comp operator. That said, there is nothing in the code that exploits features unique to Clojure. Everything I’m going to show you could easily be done in Scheme or (I suspect) Ruby or Python.

My goal will be to explain what a monad is. That might seem too obvious to mention, but bear with me. Suppose I were writing about regular expressions. I might come up with several very different tutorials. One might discuss the languages recognized by finite state automata. Another might concern the symbolic technology used to describe regexes. Another tutorial might show how to use regexes to solve some particular class of problems. Someone new to regular expressions and looking for the third tutorial might well find the first confusing.

This tutorial will explain what a monad is, with as little mathematical formalism as possible. It does not attempt to show how to use monads to solve a variety of practical problems, nor does it make any use of the clojure.contrib.monads library. I try to keep things as concrete as I can, but a certain level of abstraction is inherent in the topic. I have put much effort into explaining the meaning of the so-called monad laws as clearly and intuitively as the subject and my ability allow.

Here’s the plan. We’ll start by solving a simple but not entirely trivial programming problem, using pure functional programming. We will not use mutable state, nor macros. A fairly natural solution falls out; that solution contains the elements that make up a monad. We’ll discuss what those elements are, what the monad laws are, and what those laws mean.

So much for the prelude; let’s get started.

I will take as given that the function is the fundamental unit of modularity in functional programming, and that functions interact via composition. Here is our starting point.
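The starting-point definitions are missing from this archive; judging from the names used later in the post (sq and dbl), they were a pair of simple numeric functions:

(defn sq  [x] (* x x))
(defn dbl [x] (* 2 x))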

The following two expressions are equivalent; both are examples of the composition of functions. The first example is more idiomatic, but we will be using the comp operator and the latter form extensively.
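Presumably along these lines:

(sq (dbl 3))         ;=> 36
((comp sq dbl) 3)    ;=> 36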

Our goal will be to build an enhanced composition operator, call it comp*, that traces the flow of data through composed functions. The comp* operator should compute the value one would expect, and also present intermediate results as data flows into and out of the functions that make up the computation. Here’s an example of what I have in mind.
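The original example output is lost; the intended behavior was along these lines, where v-sq and v-dbl are the traced versions of sq and dbl built below, and the exact trace format is my guess:

((comp* v-sq v-dbl) 3)
;=> ["dbl(3) => 6\nsq(6) => 36\n" 36]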

The first problem we have to tackle is that Clojure functions do not have access to their names (they may not even have names), and so cannot provide their names to the proposed comp* operator. One way to get around that is to create enhanced functions that do know their names.
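A sketch of verbose (the trace format is an assumption; v-sq and v-dbl are the names used later in the post):

(defn verbose
  "Lift a plain numeric function f into one that also reports, as a
   string, its name and what it was applied to."
  [f fname]
  (fn [x]
    (let [y (f x)]
      [(str fname "(" x ") => " y "\n") y])))

(def v-sq  (verbose sq "sq"))
(def v-dbl (verbose dbl "dbl"))

user> (v-dbl 3)
["dbl(3) => 6\n" 6]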

The type signature of (verbose f) is not the same as that of f. We are going to need to keep track of type signatures, so let’s introduce some terminology. I’ll refer to three types of functions, imaginatively called type 0, type 1, and type 2, depending on type signature, as follows:

  • type 0 has signature Number->Number
  • type 1 has signature Number->[String, Number]
  • type 2 has signature [String, Number]->[String, Number]

By using verbose to create type 1 functions we have traded one problem for another. The type 0 functions sq and dbl could be composed with one another, but did not know their own names. The type 1 functions v-sq and v-dbl know their own names but they cannot be composed with one another. Their type signatures don’t allow it.

We seem to be at an impasse, since our goal is precisely to compose type 1 functions. Strictly speaking that is not possible, but let’s see what we can do. The way forward is to define a new function, which we will call bind.

Bind takes a type 1 function as input, and returns a type 2 function. Notice that unlike type 1 functions, type 2 functions can be composed.
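Here is a sketch of bind, consistent with the identity used in the proofs later in the post (Bg [t x] = [t : F.g x, S.g x]):

(defn bind
  "Lift a type 1 function f into a type 2 function that threads the
   accumulated trace string along, appending f's report to it."
  [f]
  (fn [[trace x]]
    (let [[report y] (f x)]
      [(str trace report) y])))

user> ((comp (bind v-sq) (bind v-dbl)) ["" 3])
["dbl(3) => 6\nsq(6) => 36\n" 36]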

The output traces the computation, but the calling syntax is ugly and puts too much burden on the user. We don’t want to have to call bind explicitly, nor to wrap the final argument by hand. And it would be nice to be able to compose an arbitrary number of functions. Not a problem.
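A sketch of unit and comp*, the two functions the next paragraph refers to:

(defn unit
  "Wrap a bare value for entry into the traced world: no trace yet."
  [x]
  ["" x])

(defn comp*
  "Compose any number of type 1 functions, tracing the computation."
  [& fs]
  (fn [x]
    ((apply comp (map bind fs)) (unit x))))

user> ((comp* v-sq v-dbl) 3)
["dbl(3) => 6\nsq(6) => 36\n" 36]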

That completes the first part of this tutorial. We solved the problem we set out to solve. Notice that we did so without using mutable state, without using macros, and without modifying the source code of the type 1 functions whose compositions we set out to trace. The entire solution was based on the creation of two functions, unit and bind, which we then used to implement some syntactic sugar, comp*.

In solving our toy problem, we built a monad. What is a monad? It is essentially a pair of functions, bind and unit, that satisfy three conditions known as the monad laws. The principal goal of this post is to present the monad laws in the simplest way possible. But we will start with a somewhat complicated version of the monad laws. Why?

Clojure has an excellent monad library called clojure.contrib.monads. Its author, Konrad Hinsen, has written some fine monad tutorials which I hope this article will complement. I want this presentation to be useful to someone who might want to use Konrad’s work. That means I need to bridge the gap between my exposition and his. I will start by considering the monad laws as Konrad presents them in part 2 of his Monad Tutorial for Clojure Programmers.
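Konrad states the laws along these lines (paraphrased from memory; see his tutorial for his exact wording):

(m-bind (m-result x) function) is equivalent to (function x)

(m-bind monadic-expression m-result) is equivalent to monadic-expression

(m-bind (m-bind monadic-expression function1) function2) is equivalent to
(m-bind monadic-expression (fn [x] (m-bind (function1 x) function2)))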

Yow! That is quite something to get your head around. I promised a simple version of the monad laws. Here’s my first cut, with a simpler version still to come.
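Stated as Clojure equalities (each is restated and proved one at a time below):

1. (= (comp (bind f) unit) f)
2. (= (comp (bind unit) f) f)
3. (= (comp (bind f) (bind g)) (bind (comp (bind f) g)))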

Comparing Konrad’s presentation and mine, it is evident that we are not using the same terminology. Allow me to translate. The function I call bind, Konrad calls m-bind. What I call unit, Konrad calls m-result. I use f and g, he uses function, function1, and function2. But those are just names, and not terribly important.

The important difference between Konrad’s version of the monad laws and mine stems from the fact that I define bind as a function of a single argument. My bind takes a function as an argument, and returns a function. Konrad’s version of bind takes two arguments, a function f and a value x, and it returns a value computed using f and x. The returned value is generally not a function. The monad laws describe the contracts that bind and unit must fulfill. Not surprisingly, the arity of bind affects the written formulation of the monad laws. By reducing the arity of bind, I get a simpler set of laws.

Why did I choose one way and Konrad the other? Because we had different purposes in mind. Konrad wanted to build a monad library, I wanted to present the monad laws in the simplest manner possible.

I chose to write bind in a point-free style precisely because that style allows a simpler formulation of the monad laws. Konrad chose his version of bind because it is better suited than mine to the business of building a monad library. In particular, his choice to reverse the usual order of arguments and go with (m-bind [x function] …) instead of (m-bind [function x]…) was brilliant. I hope to explore that idea in another posting.

I promised you a simple version of the monad laws. We’re not there yet. To get there, we are going to do two things. One is that we will abandon Clojure’s prefix notation in favor of infix notation. The second amounts to a change of variables. Notice the repetition of the pattern (comp (bind f) g) in my version of the monad laws. This is a sign of a missing abstraction. We’ll write (comp (bind f) g) as f * g, and write (comp f g) as f . g.

Then we can write the monad laws as follows. Here parentheses serve only to indicate operator precedence, and bind has precedence over composition.
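With f * g abbreviating (comp (bind f) g) and f . g abbreviating (comp f g), they read:

f * unit = f
unit * f = f
bind (f * g) = (bind f) . (bind g)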

Together, the first two monad laws require that unit be an identity element with respect to the * operator. Just as 0 is an identity element with respect to addition and for every x, x+0 = 0+x = x, so unit is an identity element with respect to *, and for every f, f*unit = unit*f = f. Identity elements are sometimes called unit elements, which is why I prefer the name unit over the more commonly used return, or Konrad’s m-result. I think return is an especially unfortunate choice of name, because the word return is already overloaded with other associations.

Take a hard look at that third monad law. Does it seem familiar? You have probably seen that pattern before. Compare the third monad law to the law of logarithms:
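\log(a \times b) = \log(a) + \log(b)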

Both log and bind follow the general pattern foo(a ⊕ b) = foo(a) ⊗ foo(b). Functions that have this property are called homomorphisms. The logarithm function is typically the first and certainly the most important homomorphism that most of us encounter. But I promised no category theory, and the spirit of that promise demands that I not pursue abstract algebra either. Let’s take a different approach, and think about type signatures.

We invented bind in order to get around the fact that type 1 functions cannot be composed. Recall that if f and g are type 1, then (bind f) and (bind g) are of type 2, and type 2 functions can be composed with one another. Notice that (bind f) can also be composed with g, resulting in a type 1 function. So there are two ways to use bind, f, and g to create a type 2 function. One is (comp (bind f) (bind g)). The other is (bind (comp (bind f) g)). The third monad law demands that these two ways of building a type 2 function result in exactly the same function.

Okay, we have defined bind and unit. We have contemplated the monad laws. Now it’s time to bite the bullet and show that our particular bind and unit in fact satisfy the monad laws. In each of the cases below, the functions f and g are assumed to be of type 1.

The first monad law states that (= (comp (bind f) unit) f).

Asserting the equality of the functions f and (comp (bind f) unit) is equivalent to asserting that for any and every type-appropriate argument n, (f n) is equal to ((comp (bind f) unit) n). Here’s the proof.
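Using the sketched definitions of unit and bind:

((comp (bind f) unit) n)
= ((bind f) (unit n))
= ((bind f) ["" n])                          ;; definition of unit
= [(str "" (first (f n))) (second (f n))]    ;; definition of bind
= [(first (f n)) (second (f n))]             ;; "" is an identity for str
= (f n)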

The second monad law asserts that (= (comp (bind unit) f) f). We need to show that for any f and any type appropriate n, (f n) is equal to ((comp (bind unit) f) n). Here we go.
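Writing (f n) as [t x]:

((comp (bind unit) f) n)
= ((bind unit) (f n))
= ((bind unit) [t x])
= [(str t (first (unit x))) (second (unit x))]   ;; definition of bind
= [(str t "") x]                                 ;; definition of unit
= [t x]
= (f n)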

Keep in mind that this sort of gruntwork is not part of the day to day use of monads. Only the monad designer needs to establish that the monad laws hold.

We have just one more law to go. I tried doing the next proof in the style of the first two proofs, but found the prefix notation tough sledding. I’m going to switch to infix notation. I love prefix for programming, but infix seems better suited to doing algebra.

Here are the notational shortcuts I’ll be using:

  • Bg denotes (bind g)
  • f x denotes that the function f is applied to x
  • f . g denotes (comp f g)
  • f . g x denotes ((comp f g) x); notice composition has precedence over application.
  • Function application is right associative: f g h x denotes f (g (h x)).
  • Notice that f g h x is the same as f.g.h x
  • x : y denotes the string concatenation (str x y)
  • F x denotes (first x)
  • S x denotes (second x)
  • commas are whitespace
  • [a, b] denotes a pair with elements a and b

Let’s describe the action of bind in the infix notation:
Instead of ((bind g) [t x]) we’ll write Bg [t x].

We’ll make heavy use of this identity, which is the definition of bind written in the new notation: Bg [t x] = [t : F.g x, S.g x ]
Make sure that makes sense to you, or just skip the next section entirely.

The third monad law states that (comp (bind f) (bind g)) = (bind (comp (bind f) g)). We’ll write that as Bf . Bg = B(Bf . g).

To show this equality holds, we’ll show that for arbitrary [t x],
Bf . Bg [t x] = (B(Bf .g)) [t x] ;; call this ML3

First consider the left hand side of ML3.
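Expanding twice with the bind identity:

Bf . Bg [t x]
= Bf [t : F.g x, S.g x]
= [t : F.g x : F.f.S.g x, S.f.S.g x]   ;; call this A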

Now consider the right hand side of ML3:
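Write h for the type 1 function Bf . g, so the right hand side is Bh [t x]. Then:

B(Bf . g) [t x]
= Bh [t x]
= [t : F.h x, S.h x]   ;; call this B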

Hold that thought, while we figure out what h x is.
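h x
= Bf . g x
= Bf [F.g x, S.g x]
= [F.g x : F.f.S.g x, S.f.S.g x]   ;; call this C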

Now we know what h x is, and can substitute for F.h x and S.h x in expression B. In particular,
expression C shows us what h x is, and we see that
F.h x = F.g x : F.f.S.g x
and
S.h x = S.f.S.g x.
Substituting these into expression B gives us
[t : F.h x, S.h x] =
[t : F.g x : F.f.S.g x, S.f.S.g x] ;; Call this D

Notice that expressions A and D are identical, which establishes ML3. We’re done. We showed that bind and unit satisfy the monad laws.

A remark by Donald Knuth comes to mind: “Beware of bugs in the above code; I have only proved it correct, not tried it”. I am not as brave as Professor Knuth, so I wrote some code to test whether the monad laws actually hold. The idea is to randomly generate functions, perform the appropriate compositions, and then test whether the laws hold. It’s actually pretty easy to do; you can get the code here. This post has already run longer than I expected; I’ll say no more about testing.

So what is a monad? It’s a pair of functions, bind and unit, which satisfy the monad laws. Bind and unit do not have fixed definitions like sine and cosine. Instead, they vary from monad to monad. But in all cases they must satisfy the monad laws. There is a bit more to it, in that the monad laws presumably operate on some specific set of functions. Here I waved my hands and called those functions type 1 functions. A more formal treatment would specify those functions more precisely, and consider a monad to be a triple, say bind, unit, and a type constructor, or some such. I chose to omit such considerations in favor of simplicity.

But in the context of programming, what are monads for? It is unfortunate that the term design pattern has been turned into a synonym for meta-boilerplate, because design pattern might have been a fair description of the methodological role of monads. The choice to write or use a monad is the choice to organize one’s code in a coherent, principled, and flexible fashion. My one example is hardly enough to show that. But perhaps it may help a programmer new to monads to better understand what the fuss is about, and to get started reading other material.

Speaking of further reading, most treatments of programming with monads default to the use of Haskell. But the Clojure community is fortunate to have some very good monad tutorials. Check out Konrad Hinsen’s and Jim Duey’s articles.

Credit where credit is due. I stole the use of the name unit, and the idea for my tracing example, from You Could Have Invented Monads, by Dan Piponi. I think this is a fantastic article, but it does require some knowledge of Haskell.

I hope you found this informative; thank you for your attention.

The expressiveness of functional programming

I’d like to revisit the discrete convolution problem described in my previous post, to consider the expressiveness of functional programming. Here I am interested in the power of code to communicate intent; the bare metal speed of the code is secondary.

The various comments and proposed implementations in this Google Groups thread are what got me thinking about this. My thanks to all the contributors. I’ll begin by quoting some imperative code posted to the thread:
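The quoted code is not reproduced in this archive. A representative imperative version, my reconstruction rather than the original post, looks something like this:

(defn convolve
  "Imperative-style convolution: allocate an output array, then loop
   over both inputs, accumulating products in place."
  [xs ys]
  (let [n   (+ (count xs) (count ys) -1)
        out (double-array n)]
    (dotimes [i (count xs)]
      (dotimes [j (count ys)]
        (aset out (+ i j)
              (+ (aget out (+ i j))
                 (* (nth xs i) (nth ys j))))))
    (seq out)))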

The code is brief, and the style is familiar to most programmers. But it is all how, and no what. I look at it, and I do not see what it means. This proposed solution to an inherently mathematical problem makes no use of mathematical constructs other than addition and subtraction. I think this is a shame, since convolution has a mathematical structure. A convolution is a sequence of inner products:
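\left(f * g\right)_n = \sum_k f_k \, g_{n-k}

The functional version is likewise missing from this archive; here is a sketch consistent with the remarks that follow (conv, inner-product, and tails are the names mentioned below; the bodies are reconstructions):

(defn inner-product [xs ys]
  (reduce + (map * xs ys)))

(defn tails
  "The successive rests of coll: (tails [1 2 3]) => ([1 2 3] (2 3) (3))."
  [coll]
  (lazy-seq
    (when (seq coll)
      (cons coll (tails (rest coll))))))

(defn conv
  "Each output element is the inner product of the reversed first
   vector with a successive tail of the (front-padded) second."
  [f g]
  (let [pad (repeat (dec (count f)) 0)
        rf  (reverse f)]
    (map #(inner-product rf %)
         (take (+ (count f) (count g) -1)
               (tails (concat pad g))))))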

My proposed solution is longer than the imperative solution, but not much longer. And I should not have had to write inner-product; the inner product is so fundamental a mathematical operation that it should be part of every language.

I believe conv describes convolution more clearly than does convolve, because it does so in terms of operations on mathematical structures rather than as the outcome of changes to the contents of computer memory.

It also seems to me that convolve overspecifies the solution. The introduction of loops and mutation imposes a sequential structure on the solution that is not present in the underlying mathematics. For some fascinating thoughts along those lines I highly recommend Guy Steele’s ICFP 2009 presentation. It is an eye-opener.

While I’m ranting, I don’t think I should have had to write the tails function either, at least in so naked a form. The notion of an iterated function deserves a place in a functional programming language. I think Clojure is missing something along the lines of nest-list (stolen from Mathematica’s NestList).
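Something along these lines, mirroring Mathematica's NestList (the definition is a sketch):

(defn nest-list
  "x, (f x), (f (f x)) ... n applications in all, lazily."
  [f x n]
  (lazy-seq
    (cons x (when (pos? n)
              (nest-list f (f x) (dec n))))))

user> (nest-list inc 0 4)
(0 1 2 3 4)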

Postscript: a few days later, while browsing core.clj, I noticed that Clojure already has the function iteration feature I wanted, it is called iterate. The implementation is quite pretty, much nicer than what I came up with:
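Approximately (from the core.clj of that era, metadata elided):

(defn iterate
  "Returns a lazy sequence of x, (f x), (f (f x)) etc."
  [f x]
  (cons x (lazy-seq (iterate f (f x)))))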

Discrete convolution of finite vectors

Isaac Hodes posted a request to both StackOverflow and the Clojure mailing list, asking for some help implementing a discrete convolution function in Clojure.

I am trying to create a lazy/functional/efficient/Clojuresque function to carry out convolution on two lists/vectors of (ideally BigDecimals but that may be inefficient) doubles.

Right now, it seems as though this is something Clojure cannot do.

Let’s cut right to the code. Here’s my implementation:
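The listing itself was lost in archiving; here is a reconstruction consistent with the discussion below: lazy, and leaning on drop, which will matter shortly.

(defn inner-product [xs ys]
  (reduce + (map * xs ys)))

(defn convolve
  "Slide the reversed first vector along the (front-padded) second,
   taking an inner product at each offset, lazily."
  [xs ys]
  (let [pad    (repeat (dec (count xs)) 0)
        padded (concat pad ys)
        rxs    (reverse xs)]
    (map (fn [n] (inner-product rxs (drop n padded)))
         (range (+ (count xs) (count ys) -1)))))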

So, how does this implementation work? And just what does the convolve function compute? Let’s consider the latter question first.

Recall how you multiply polynomials. Suppose we want to multiply 1+x by 2+3x+5x^2. Using the standard rules of algebra, we find the answer by gathering like terms. The order-0 term is 1*2=2. The order-1 term is 1*3x + x*2 = 5x, the order-2 term is x*3x + 1*5x^2 = 8x^2, and the order-3 term is x*5x^2 = 5x^3. It is notationally convenient to drop the variables and represent the polynomials as vectors of coefficients in standard form. In the case above, we multiply [1 1] by [2 3 5] and get [2 5 8 5]. The convolve function performs exactly the computation required to multiply the polynomials.

So, how does it work? My hope is that some (crude) pictures will tell the story; you be the judge. Let’s consider a particular example, (convolve [1 2 3] [4 5 6 7 ])
whose value is (4 13 28 34 32 21). Here’s the picture:

3---2---1
--------4---5---6---7    ;; 1*4=4

----3---2---1
--------4---5---6---7    ;; 2*4+1*5=13

--------3---2---1
--------4---5---6---7    ;; 3*4+2*5+1*6=28

------------3---2---1    
--------4---5---6---7    ;; 3*5+2*6+1*7=34

----------------3---2---1
--------4---5---6---7    ;; 3*6+2*7=32

--------------------3---2---1
--------4---5---6---7    ;; 3*7=21

As requested, the code is lazy:

user> (class (convolve [1 2 3] [4 5 6 7 ]))
clojure.lang.LazySeq

But does it perform? Let’s take a look.

user> (def foo2 (into [] (range 100)))
#'user/foo2

user> (def foo3 (into [] (range 1000)))
#'user/foo3

user> (def foo4 (into [] (range 10000)))
#'user/foo4

user> (time (do (doall (convolve [1 2 3] foo2 )) nil))
"Elapsed time: 2.827 msecs"
nil

user> (time (do (doall (convolve [1 2 3] foo3 )) nil))
"Elapsed time: 76.275023 msecs"
nil

user> (time (do (doall (convolve [1 2 3] foo4 )) nil))
"Elapsed time: 4101.002754 msecs"

Ouch. The time appears to be quadratic in the length of the second argument. We can do better than that with just a minor tweak. The problem is the performance of the drop function. Let’s get rid of it. Here’s the new version of the code.
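In the same spirit as the reconstruction above: each output element now walks one step down the successive rests of the padded sequence, instead of re-scanning it with (drop n ...) every time.

(defn convolve [xs ys]
  (let [pad (repeat (dec (count xs)) 0)
        rxs (reverse xs)]
    (map #(inner-product rxs %)
         (take (+ (count xs) (count ys) -1)
               (iterate rest (concat pad ys))))))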

And here are the new performance numbers, which look much better. A million points in under three seconds, with no dropping into java-isms. Sweet.

user> (def foo5 (into [] (range 100000)))
#'user/foo5

user> (def foo6 (into [] (range 1000000)))
#'user/foo6

user> (time (do (doall (convolve [1 2 3] foo3 )) nil))
"Elapsed time: 6.909401 msecs"
nil

user> (time (do (doall (convolve [1 2 3] foo4 )) nil))
"Elapsed time: 62.894431 msecs"
nil

user> (time (do (doall (convolve [1 2 3] foo5 )) nil))
"Elapsed time: 255.064899 msecs"
nil

user> (time (do (doall (convolve [1 2 3] foo6)) nil))
"Elapsed time: 2784.131159 msecs"
nil

British Lottery Puzzle

Have I mentioned that I’m a sucker for probability puzzles? I found this one at ibbly.com:

Two teams from opposite ends of the country have been drawn to play each other in a competition. They need to determine which team plays at home and do so fairly at a distance. They decide to try to generate a 50:50 probability from the following week’s UK national lottery draw.

The lottery draw consists of drawing 7 balls from 49.

To make the situation clear, there are 49 balls, labeled 1-49. The balls are thoroughly mixed, and 7 are drawn at random.

The author of ibbly.com points out that a very simple (and incorrect) approach to generating a 50:50 distribution is use the parity of the first ball drawn. Let’s simulate the process in Clojure.

(defn balls
  "Random shuffle of the integers 1 ... n, inclusive"
  [n]
  (shuffle (range 1 (inc n))))  ;;shuffle introduced in version 1.2

(defn lottery [] (take 7 (balls 49)))

(defn trial []
  (first (lottery)))

(defn trials [n]
  (let [outcomes (take n (repeatedly trial))
        odds (count (filter odd? outcomes))]
    (float (/ odds n))))

The results of the simulation confirm the expected bias, given that 25 of the 49 balls are odd:

user> (trials 100000)
0.51156
user> (float (/ 25 49))
0.5102041

So, how do we use the lottery drawing to simulate a fair coin? The author of ibbly.com proposes a somewhat complicated method:

We know that there are C(49,6)=13,983,816 ways of drawing 6 balls from 49. This is an even number so there's definitely a way of describing a 50% probability, even if it involves writing down 6,991,908 combinations explicitly. But can we find a more succinct way?

It turns out that such an approach is possible. A bit of trial and error shows that there's exactly a 50% chance that the lowest of the six main balls is one of {2,3,4,5,6,15,16,17,21,36,39,41,43,44}.

I'm not even going to think about that. There is a much simpler way to simulate the coin toss. Here's the code, with an explanation following.

(defn trial-adjusted []
  (let [draw (lottery)
        fst (first draw)
        snd (second draw)]
    (if (= 1 fst) snd fst)))

We go back to the parity approach, but we fix the sample space. The original approach uses 25 odd balls and 24 even ones, introducing a bias in favor of odd. We simply throw out one of the odd balls by disregarding it. I chose "1", but it does not matter which of the odd balls is thrown out. Simulation confirms the bias is gone.

user> (binding [trial trial-adjusted] (trials 100000))
0.50057

The Matchbook Problem

I have been reading a lovely book by Paul Nahin, Digital Dice. The book is subtitled Computational Solutions to Practical Probability Problems. I don’t know how practical the problems are, but they are a lot of fun.

Probability theory is rife with seemingly simple problems which turn out to be less simple than they first appear. When possible, I like to approach such problems from two sides: analytically and computationally. If I can come up with two solutions, I can test one against the other. If the solutions don’t agree, at least one of them is wrong. Nahin’s book encourages this approach, and provides some fine practice material. For example, Nahin offers this problem, taken from a 1970s textbook:

A pipe smoker has two booklets of matches in his pocket, each containing 40 matches initially. Whenever a match is required, he picks one of the booklets at random, removing one match. Write a program using random numbers to simulate the situation 100 times and determine the average number of matches that can be removed until one booklet is completely empty.

This is easily simulated in terms of coin tosses. We repeatedly flip a fair two-sided coin; heads represents one book of matches, tails the other. We’ll tally the outcome of each coin toss, and the trial is over as soon as we see the 40th head or the 40th tail, whichever comes first. The outcome of the trial is the total number of coin tosses. What we want to know is the average number of coin tosses per trial.

We’ll do the simulation in Clojure. First we’ll need to model a coin toss:

(defn coin-toss []
  (if (zero? (rand-int 2))
    [1 0]
    [0 1]))

Here’s how we use the coin to simulate one trial.

(def *N* 40)

(defn trial
  ([] (trial 0 0))
  ([heads, tails]
     (if (or (= *N* tails) (= *N* heads))
       (+ heads tails)
       (let [[h t] (coin-toss)]
         (recur (+ heads h) (+ tails t))))))

We need some code to conduct repeated trials, gather the results, and compute the average.

(defn average [coll]
  (float (/ (reduce + coll) (count coll))))

(defn trials [n]
  (average (take n (repeatedly trial))))

With a simulation in hand, let's think about solving the problem mathematically. For fixed N (40, in our case), we will repeatedly flip a coin until we see the Nth head or the Nth tail. Clearly a trial will require at least N coin tosses. The most tosses required will be 2N - 1. That happens when we manage to toss N-1 heads and N-1 tails, for a total of 2N - 2 tosses. The next toss will be decisive, no matter what the outcome, ending the trial at 2N - 1 tosses.

So every trial ends in N + K tosses, where 0 <= K < N. Write P(N, K) for the probability that the trial ends after exactly N+K tosses. The expected value we are trying to compute is the sum of P(N,K) * (N+K), taken over 0 <= K < N. The task boils down to computing P(N,K).

How can we calculate P(N,K)? Suppose the trial ends with N heads and K tails. The final toss had to have been heads. Then we must have had K tails out of the first N+K-1 tosses. How many ways can that happen? The answer is well known: (N+K-1)!/[(N-1)! K!]. What is the probability of any one of those outcomes? Given a fair coin it is 1/2^(N+K).
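In symbols, counting for the moment only the trials that end in heads:

P(N,K) = \binom{N+K-1}{K} \, \frac{1}{2^{N+K}}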

Let's implement the calculation so we can compare it to the results of the earlier simulation.

(defn factorial [n]
  (reduce * (range 1 (inc n))))

(defn choose [n k]
  (/ (factorial n)
     (factorial (- n k))
     (factorial k)))

(defn prob [n k]
  (/ (choose (dec (+ n k)) k)
     (clojure.contrib.math/expt 2 (+ n k))))

(defn expected-outcome []
  (let [n *N*
        probs (for [k (range n)] (prob n k))
        tosses (range n (+ n n))]
    (float (reduce + (map * probs tosses)))))

Now that we have both the simulation code and the calculation code, the only thing needed is a bit of scaffolding to compare the two approaches.

(def *num-trials* 10000)

(defn comparison [n]
  (binding [*N* n]
     [*N* (trials *num-trials*) (expected-outcome)]))

The comparison function is parameterized so we can vary the problem a bit. First let's try the original N=40.

user> (comparison 40)
[40 72.7887 36.442886]

The simulation and the calculation differ by a factor of about two. Why? Look back to the line that reads "Suppose the trial ends with N heads and K tails". I left out half the cases, the ones that end with N tails and K heads. The thing to do is correct the prob function:

(defn prob [n k]
  (/ (choose (dec (+ n k)) k)
     (clojure.contrib.math/expt 2 (dec (+ n k)))))

After that correction the comparison function shows that the simulation and the calculation agree nicely across a range of values of N:

user> (map comparison (range 20 60 5 ))
([20 34.9941 34.985172] [25 44.3824 44.386242] [30 53.7716 53.84531] [35 63.3641 63.348217] [40 72.8602 72.88577] [45 82.5092 82.4516] [50 92.0544 92.04108] [55 101.6011 101.65071])

In my experience, when the simulation and the calculation disagree, the simulation is usually more trustworthy. Simulations tend to be straightforward, while analytic approaches can be tricky to get right.

Cartesian product

I was browsing the 6-24 Clojure-log archive and ran across an invitation from Craig Andera:

I’ve knocked out a quick definition of something that computes the Cartesian cross product of multiple sequences. But as always when I write Clojure, I have to assume someone else can do it in less code. Anyone care to take a look? http://gist.github.com/451734

So I took a look. As luck would have it, that morning I had been thinking about the reduce function, and reduce was exactly the tool needed for Craig’s challenge. Here’s my version.

(defn cross-prod [& colls]
  (->> colls
       (reduce #(for [x %1 y %2] [x y]))
       (map flatten)))
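A quick check (an illustrative session):

user> (cross-prod [1 2] [3 4] [5 6])
((1 3 5) (1 3 6) (1 4 5) (1 4 6) (2 3 5) (2 3 6) (2 4 5) (2 4 6))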

Antoine de Saint-Exupery famously remarked that “perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away”. I think he would have appreciated Clojure.