Monthly Archives: March 2008

20 Days of Clojure: Day 11

Ok, so now we’ve covered basic clojure as a Lisp with immutable data structures and built-in support for streaming (lazy) interfaces. I was playing around with Erlang right before I found clojure because I was looking for a language with direct support for concurrency; the lispiness and the JVM hosting were the deciding factors for me.

I’m just getting started with clojure’s concurrency support, so you’ll have to bear with me as I might struggle a bit. Yesterday, I showed you a lazy prime number sieve built using the seq interface. Today, I’m going to try to parallelize it. Since I don’t really know what I’m doing, I’ll take you through my thinking.

There are three ways in clojure to deal with memory in threads: dynamic vars, software transactional memory, and agents. I get dynamic vars and agents because I use things like them all the time. Dynamic vars are basically thread-local storage and agents are a kind of ActiveObject [pdf]. Thread-local storage isn’t going to work for me since I need to gather answers from my worker threads, not have each one work independently (there may be dynamic vars in the solution, but they’re not the way you communicate between threads).

ActiveObject is a pattern that serializes access to an object so that only one thing happens to it at a time. Individual calls are put in a queue which is serviced by a thread assigned to the object, and that thread is the only one to touch the object. In clojure, there’s actually a thread pool that services all agents. Now that I think about it, maybe this will work. I kind of wanted to play with STM, but I think I understand agents better, and I feel I can still make this streamable.

Here’s my naive idea. I’m going to make an agent per prime, then ask them all if a number is divisible by them, then wait for them all and if it is, make a new agent and go on. That seems really dumb unless agents are super-cheap. Let’s see.

I’ll start with something simple:

(defn divisible? [x y] (= (rem x y) 0))

I’ll define my agents to be maps that look like this at the start {:prime 2} and then define this:

(defn check-divisible [agent x]
    (merge agent {:last x :notdivisible (not (divisible? x (:prime agent)))}))

Here’s a quick Repl session so far (no real agents yet):

user=> (check-divisible {:prime 2} 3)
{:last 3, :prime 2, :notdivisible true}
user=> (check-divisible {:prime 2} 4)
{:last 4, :prime 2, :notdivisible false}
user=> (check-divisible {:prime 2} 5)
{:last 5, :prime 2, :notdivisible true}
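
Pushed through a real agent, the same check looks roughly like this (a sketch using send, which the 2008-era syntax in this post spells !, with the same arguments):

(def a (agent {:prime 2}))   ; an agent whose state is the map above
(send a check-divisible 5)   ; queue check-divisible to run against its state
(await a)                    ; block until the queued action has run
@a                           ; the state now has :last 5 and :notdivisible true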

Ok, now I’ll start. Here’s something weird — I just did a CPU snapshot running the sequential prime sieve from yesterday and it looks like this:

[CPU snapshot image]

I don’t know what to make of that — I know purely functional programs are easy to parallelize, but this sieve seems particularly serial in implementation. Anyway, I’ll press on:

Here’s my agent-sieve:

(defn agent-sieve [a ints]
    (let [p (next-prime a ints)]
      ; wrap each new prime's state in an agent so the rest of the code can send to and deref it
      (lazy-cons p (agent-sieve (cons (agent {:prime p}) a) (iterate inc (inc p))))))

I keep stuffing a new agent list into the rest of a lazy-cons — just need to write next-prime:

(defn next-prime [agents ints]
    (let [x (first ints)]
      (check-divisible-by-agents (take-nth 2 agents) x)
      (check-divisible-by-agents (take-nth 2 (drop 1 agents)) x)
      (apply await agents)
      (if (every? (fn [a] (or (not= (:last @a) x) (:notdivisible @a))) agents)
        x
        (next-prime agents (rest ints)))))

Ok, so I’m starting to see problems — for one, I really want to short-circuit as soon as one agent’s prime divides the candidate, but I also want to split up the work. So I decide to make a check-divisible-by-agents that chains along the agents and stops when it finds one whose prime divides the candidate. I start two chains, each with half of the agents, and wait — this means that if one chain finds a divisor, the other won’t know and will keep going (which is bad). I press on; here’s the chain:

(defn check-divisible-by-agents [agents x]
    (if (not= (count agents) 0)
      (! (first agents)
        (fn [a y rest-agents]
          (let [newstate (check-divisible a y)]
            (if (:notdivisible newstate)
              (check-divisible-by-agents rest-agents y))
            newstate))
        x (rest agents))))

So, I run a function on the first agent in the list (with !) that sets that agent’s new state, including whether its prime divides x. If it doesn’t divide, the function recursively calls the check on the rest of the agents.

I’m not getting primes yet — but it’s close. I don’t think await works like I think it does — even though I am chaining ! inside of !, I think that doesn’t count for await. Here’s next-prime rewritten:

(defn next-prime [agents ints]
    (let [x (first ints)]
      (check-divisible-by-agents (take-nth 2 agents) x)
      (check-divisible-by-agents (take-nth 2 (drop 1 agents)) x)
      (if (every?
            (fn [a]
              (await a)
              (or (not= (:last @a) x)
                  (:notdivisible @a)))
            agents)
        x
        (next-prime agents (rest ints)))))

Ok, I get primes now.

To tell the truth, I can’t believe I got this working at all, but running (time) on the serial and parallel versions gives me this:

sequential: “Elapsed time: 9.517 msecs”
parallel: “Elapsed time: 251.025 msecs”
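
For reference, the sequential number comes from a timing call shaped roughly like this (forcing 50 primes is just an assumption; the exact call isn’t shown):

(time (doall (take 50 (primes))))  ; doall forces the lazy seq so (time) measures real work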

So, obviously, this is not a good way to do this. I don’t know yet if it’s because I’m using too many agents or the extra work I’m doing because I’m not short-circuiting. Anyway, that’s enough for now.

Update: After a night’s rest, I think I know what’s going on with the other thread in the sequential version. My guess is the GC. The only other alternative I can come up with is that under the hood clojure parallelizes some things I’m using. I started a thread on parallelizing prime sieves on the clojure google group.

(clojure day (March 20) in Northampton, MA is coming soon)

20 Days of Clojure: Day 10

Woo hoo — halfway through. Well, not yet, but I will be at the end of this entry.

Yesterday, we saw sequences. Now, here’s a way to use them: the Prime Sieve of Eratosthenes. Here it is in Scheme from the SICP lectures:

 (define (sieve s)
  (cons-stream (head s)
   (sieve (filter
    (lambda (x)
     (not
      (divisible? x (head s))))
    (tail s)))))

 (define primes
  (sieve (integers-from 2)))

And here is a complete clojure program (all undefined functions are defined as part of the seq interface):

 (defn sieve [s]
  (lazy-cons (first s)
    (sieve (filter
     (fn [x]
      (not
       (= (rem x (first s)) 0)))
     (rest s)))))

 (defn primes []
  (sieve (iterate inc 2)))

 (prn (take 10 (primes)))

Here’s how it works. The function sieve takes a sequence and returns a sequence (in this case we pass in all integers greater than or equal to two).

Working from the inside, this filter:

    (filter
     (fn [x]
      (not
       (= (rem x (first s)) 0)))
     (rest s))

Filters the rest of the passed-in stream based on whether the items are divisible by the first, so only the ones not divisible are passed through. So for (iterate inc 2) we get a stream of all numbers not divisible by two.

The first of that stream is 3, and it is passed into sieve, so the resulting stream is all those not divisible by two (already filtered) or three. Once the stream is returned (lazily), the heads are lazy-cons’d on (that means that you get a stream with a given first and the rest set to the given stream). The first head was 2, the second was 3, the next one is 5 (the first number not divisible by 2 and 3), and so on.

(clojure day (March 20) in Northampton, MA is coming soon)

Update: It may look like sieve is an infinitely recursive function, but it’s not. lazy-cons is a macro that doesn’t evaluate its rest argument until rest is called on its return value. I haven’t looked at the implementation, but I imagine it’s implemented similarly to how cons-stream in the SICP lecture is.
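
On current versions of clojure, lazy-cons no longer exists; here is a sketch of the same sieve written with lazy-seq (same behavior):

 (defn sieve [s]
  (lazy-seq
   (cons (first s)
    (sieve (filter
     (fn [x]
      (not
       (= (rem x (first s)) 0)))
     (rest s))))))

 (defn primes []
  (sieve (iterate inc 2)))

 (prn (take 10 (primes)))  ; prints (2 3 5 7 11 13 17 19 23 29)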

Update: This prime sieve has been corrected here.

20 Days of Clojure: Day 9

News from Rich on an optimization he found for vectors:

I re-read Okasaki’s paper yesterday and woke up today with an idea for speeding up the Clojure vector conj and pop. They are now 2-5x faster than they were. The trick is to keep the tail of the vector out of the tree until it is a full (32-element) node. Note that this greatly speeds up vector construction, which typically consists of repeatedly conj’ing onto [].

Today, I’m going to look at sequences. I think it’s helpful to go back to the SICP — this time, the lecture on stream machines:

[embedded video]

This is just a small part of the whole SICP stream machine lecture (and Streams Part II).

Sequences in clojure are lazy like the streams described in the video, and if you watch the whole lecture, you will see a rich stream library built up that is also present in clojure.

This means that many operations can be incredibly efficient, but seem like they are operating on gigantic data structures at once (since my day job is image processing, this is especially intriguing to me — maybe more on that later). To really see this, you need to play with infinite sequences. In clojure, it’s easy to create an infinite sequence with cycle — (cycle [1 2 3]) returns a sequence which continuously yields 1, 2, 3, 1, 2, 3, etc. If you use sequence operations on it, you can get a feel for how laziness works because it’s obvious that these functions could not possibly be consuming the whole sequence. Here’s a clojure Repl script.

user=> (def cyc123 (cycle [1 2 3]))
#<Var: user/cyc123>
user=> (every? (fn [x] (= 1 x)) cyc123)
false
user=> (take 5 cyc123)
(1 2 3 1 2)
user=> (take 5 (filter (fn [x] (= x 1)) cyc123))
(1 1 1 1 1)
user=> (take 5 (drop 1 cyc123))
(2 3 1 2 3)
user=> (take 5 (interleave cyc123 cyc123))
(1 1 2 2 3)
user=> (take 5 (take-nth 3 cyc123))
(1 1 1 1 1)
user=> (take 10 (for [x cyc123] (< x 3) x)) ;old syntax
user=> (take 10 (for [x cyc123 :when (< x 3)] x))
(1 2 1 2 1 2 1 2 1 2)
user=> (mapcat (fn [x y] (list x y)) cyc123 '(10 11 12))
(1 10 2 11 3 12)

More on this tomorrow.

20 Days of Clojure: Day 8

Rich Hickey comments on yesterday’s post on the clojure google group:

An interesting point of comparison with purely-functional solutions such as that which Okasaki writes about is that the Clojure vector can be log32N specifically because I can use mutation – when cloning, i.e. the tree is log32N deep and a node can be copied in O(1) time due to arraycopy + indexed array set. And yes, the vectors are optimized for lookup – in practice, log32N blows logN out of the water.

Exactly — PersistentVector’s internals are not purely functional. New objects (during construction only) make an array copy of Object arrays from existing vectors, then make changes to the copied array before returning. As Rich points out, this makes some operations O(1). And for comparison, a clojure vector of 32k elements is only 2 levels deep, but if implemented with a log2N structure would be 15 deep (with average access in maybe 7 operations). Since access by index is a very common operation, you can see that this would make a big difference.

Today, I want to look at PersistentHashMap.java which is the implementation of clojure hash-maps. There are actually four implementations of the map interface (hash-map, sorted-map, struct-map, and array-map), but hash-map is what the reader will give you when you build up a map with {}.

A comment on the top of the source file says that it’s a persistent version of Phil Bagwell’s Hash Array Mapped Trie. According to that paper, it looks like the bits of hashes of the key are used to traverse a trie. Let’s see how it’s done in clojure.

The first thing I like to look at is how lookup is done, that gives me a sense of the structure. So, to start with, here’s entryAt:

 public IMapEntry entryAt(Object key){
  return root.find(RT.hash(key), key);
 }

Simple enough — it uses the root node to start the traversal. The root node is an INode, and clojure uses polymorphism to get different behavior for different nodes. There are five types of nodes: EmptyNode, LeafNode, HashCollisionNode, BitmapIndexedNode, and FullNode. An empty hash-map has its root set to an instance of EmptyNode, which is an implementation of the Null Object design pattern, and find() on it returns null. (By the way, the entire structure is a Composite design pattern, and by using each node type to handle part of the traversal, I guess he’s also using a Chain of Responsibility. I mention this because there are many OO design patterns used in the Java implementation of clojure, some by name, so if you are going to look at the code, you’ll want to be familiar with these patterns.)

If I assoc() onto a hash-map, root.assoc() is called which is supposed to return a new root. EmptyNode.assoc() simply creates a LeafNode with the hash, key and value, so a map with one key has its root set to a LeafNode.

If I call find() on a LeafNode, that’s pretty simple too. If this is the node I am looking for, it returns this, otherwise null.

If I assoc onto a LeafNode, here is what can happen:

  1. If the key and value match the LeafNode, then it is just returned — nothing is created since you haven’t changed anything.
  2. If just the key is the same as this node, a new LeafNode is created with the same key and the new value and returned (remember, this is a persistent data-structure, so we do not change the LeafNode, we create new nodes to build up a new hash-map).
  3. If the hash is the same, but the key is different, then a new HashCollisionNode is created that holds onto this LeafNode and a new one that has the same key and the new value. HashCollisionNodes simply hold onto an array of LeafNodes whose keys have the same hash.
  4. If the new key doesn’t have the same hash (the common case), then a BitmapIndexedNode will be created. At first it contains just the old LeafNode in its array of nodes, and then assoc() is called on it with the new key and value.

Case #4 is where we start to build up the trie. The hash of the key is going to be used to traverse the trie later, and a BitmapIndexedNode knows if it contains the LeafNode you are looking for. A BitmapIndexedNode has a bitmap of 32 bits. The bit corresponding to the bottom five bits of the hash of the LeafNode is turned on to indicate that the LeafNode is to be found in this branch (it can have 32 branches). For example if the bottom five bits of the key hash are 01100, then the 12th bit of the bitmap is turned on, and the node is put at this branch.

If the bitmap bit for the new leaf node is off, then a new BitmapIndexedNode is created with a new bitmap with the new bit turned on, and the node is put in its array which will be one bigger than the one we started with. If we kept doing this and hit each bit, the 32nd assoc would result in a FullNode — which is a simple case of a BitmapIndexedNode, except it knows that if it had a bitmap, all bits would be set. Again, remember, we create a new node because we never change nodes in a persistent data structure.

If the new leaf node has a bottom five bits that match a node already in this BitmapIndexedNode (the corresponding bit is already turned on), then we assoc() this key/value onto the node with the same five bits.

I haven’t mentioned this yet, but assoc takes a shift argument — so far, that has always been 0 and that meant to use the bottom five bits (use a 0 shift) of the hash in any BitmapIndexedNodes. On this assoc call, we’re now going to pass in (shift+5) so that the next 5 bits will be used on any BitmapIndexedNodes created under here. Later when we traverse, we’ll use the bottom five bits on the first BitmapIndexedNode we see, then the next five bits on the next one and so on. When we’ve used up all 32 bits of the hash, we should be at our LeafNode or a HashCollisionNode (meaning all 32 bits matched).
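
To make that bit twiddling concrete, here is a small sketch in clojure (the helper names are mine, not from the Java source):

 (defn level-bits [hash shift]
  (bit-and (bit-shift-right hash shift) 0x1f))  ; the five bits used at this level

 (defn bitmap-bit [hash shift]
  (bit-shift-left 1 (level-bits hash shift)))   ; the bitmap bit to test or set

 (level-bits 2r01100 0)  ; 12, the example from above
 (bitmap-bit 2r01100 0)  ; 4096, i.e. bit 12 turned on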

So here’s find():

  1. If the node is an EmptyNode, it returns null.
  2. If the node is a LeafNode, it returns itself if the key matches, otherwise null.
  3. If the node is a HashCollisionNode, it calls find() on each LeafNode it holds, if any return themselves, then that is returned, otherwise null.
  4. If the node is a BitmapIndexedNode, it first checks to see if it has a node where the five bits being used matches and then calls find() on it, if it doesn’t, it returns null.
  5. If the node is a FullNode, it acts like a BitmapIndexedNode with all bits of the bitmap set, so it just calls find() on the node corresponding to the five bits of the hash at its level.


The data structure is persistent because it never changes constructed nodes. During an assoc, all non-traversed paths will be shared, and only the path that changed will be new.
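
At the interface level, the persistence looks like this at the REPL (using hash-map explicitly, since that is the implementation being discussed):

 (def m (hash-map :a 1 :b 2))
 (def m2 (assoc m :a 99))
 (get m :a)   ; 1, the original map is untouched
 (get m2 :a)  ; 99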

I should also mention that FullNode is purely an optimization — a BitmapIndexedNode could be used in its place, but it saves some branch calls to have them sprinkled throughout the map. Using a bitmap in BitmapIndexedNode is also an optimization. An INode array of 32 elements could always have been constructed, with null signifying that there was no node in that branch, but that would be very wasteful in space; by using the bitmap, BitmapIndexedNodes have an INode array of exactly the length they need.

20 Days of Clojure: Day 7

Ok, I’m going to try to describe how vectors are implemented in clojure. The implementation is in:

   src/jvm/clojure/lang/PersistentVector.java

if you want to see the code. I’m going to try to go through the important bits here.

The PersistentVector object has three members: two ints named cnt and shift, and an Object[] named root. A simple PersistentVector built from this line:

   (def v1 [1 2 3])

would look like this (using curly braces for root because it is a Java array):

   cnt: 3
   shift: 0
   root: { 1, 2, 3 }

cnt is the count of elements, and (count v1) simply returns it, so it is O(1). I’ll explain shift later; root is the object array. When shift is 0, this line:

   (nth v1 1)

Simply resolves to root[1], and returns 2, which in this simple case is also O(1). If I do this:

   (def v2 (assoc v1 1 4))

which returns a new vector, but with the value at index 1 set to 4, you get another PersistentVector that looks like this:

   cnt: 3
   shift: 0
   root: { 1, 4, 3 }

The 1 and the 3 are shared between the v1 and v2 arrays. If I do this:

   (conj v1 5)

I’ll get yet another PersistentVector that looks like this:

   cnt: 4
   shift: 0
   root: { 1, 2, 3, 5 }

with the 1, 2, and 3 shared with v1 (and the 1 and 3 shared with v2). This is all very simple until you conj onto a vector of 32 elements. When the root array has 32 elements, adding one more element (the 33rd) returns a new PersistentVector that looks like this (assume the starting array held 1, 2, 3, … 32):

   cnt: 33
   shift: 5
   root: { {1, 2, 3, …, 32 }, { 33 } }

Root is a Java Object[] with the 0th element set to the Object[] from the input vector to conj (not a clone, but the actual one), and the next element is an array of just the new value. If I conj onto that, I get a new Vector:

   cnt: 34
   shift: 5
   root: { {1, 2, 3, …, 32 }, { 33, 34 } }

Now, I can explain shift. When a method that takes an index is called, a bit shift is done on it to determine how many levels deep in the structure we need to go to get to the part of the datastructure that has the element at that index. For instance, here is the main Java for nth(i):

      Object[] arr = root;
      for(int level = shift; level > 0; level -= 5)
         arr = (Object[]) arr[(i >>> level) & 0x01f];
      return arr[i & 0x01f];

so, when i < 32, then i >>> level is 0, and arr will be root[0] (the array of 32 elements). Then we return arr[i & 0x01f] (which is i % 32), to get the ith element in that array.
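
A quick REPL check of that arithmetic (bit-shift-right behaves like Java’s >>> here because the index is non-negative):

 (bit-shift-right 5 5)  ; 0, so for i = 5 at shift 5 we stay at root[0]
 (bit-and 5 0x1f)       ; 5, i.e. 5 mod 32: slot 5 of that array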

When i == 32, then (i >>> level) is 1, arr is root[1], and then we return arr[i%32] which is the 0th element. Now, if I do an assoc to set the 0th element to 100, I get this PersistentVector:

   cnt: 34
   shift: 5
   root: { {100, 2, 3, 4, …, 32 }, { 33, 34 } }

assoc calls a recursive function (doAssoc). First it clones root so that the new root is an array of two elements, each an object array. Then it determines that index 0 is in the 0th branch and does an assoc on that, decrementing the shift by 5 (the index is passed along unchanged). This recursive call clones the array at root[0]. Since shift is now 0, it is at the base case of the recursion, and so it sets root[0][0] to 100. All of the numbers and the entire array at root[1] are shared with the starting vector. Here is the Java code for that (doAssoc is called with the arguments shift, root, the index, and the new value) — the return is the new root:

   private static Object[] doAssoc(int level, Object[] arr, int i, Object val){
      Object[] ret = arr.clone();
      if(level == 0)
         {
         ret[i & 0x01f] = val;
         }
      else
         {
         int subidx = (i >>> level) & 0x01f;
         ret[subidx] = doAssoc(level - 5, (Object[]) arr[subidx], i, val);
         }
      return ret;
   }

By now, you might have realized that we are building up a tree. If I keep conjing on numbers in order starting at 1, eventually I will conj on 65. When I do, I get this:

   cnt: 65
   shift: 5
   root: { {1, 2, 3, 4, …, 32 }, { 33, 34, 35, … 64 }, { 65 } }

Graphically, it looks like this (for 1..32 elements)

+-----------+
|{1,2,3,…32}|
+-----------+

and then for up to 32², it looks like this:

             +-------+
             | {   } |
             +--+-+--+
                | |
      +---------+ +-------+
      |                   |
+-----+-----+     +-------+------+
|{1,2,3,…32}|     |{33,34,35,…64}|
+-----------+     +--------------+

And we get another level at 32²+1 (1025) elements, and another at 32k+1. This explains how index access is O(log32N). Another important point is that not only is a PersistentVector immutable from an interface perspective, it is also internally immutable: each individual Object[] that makes up the tree is never changed once a vector is constructed. If one needs to change in a new vector, it is cloned and the changed clone is used in the new PersistentVector — the benefit of this is that the vector never needs to lock. This may seem like a small point, and I believe it is not strictly a requirement of persistence, but this extra work makes vector very suitable for concurrent applications.
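
A quick interface-level check of the 65-element picture at the REPL (just a sketch; the internal tree isn’t visible from clojure code):

 (def v (apply vector (range 1 66)))  ; 65 elements, as in the structure above
 (count v)   ; 65
 (nth v 32)  ; 33, reached through the second branch in the tree described above
 (nth v 64)  ; 65, the lone element of the last array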

This implementation is different from the one described in the Okasaki paper from yesterday. Rich appears to be trading away O(1) consing for better index performance, which was O(log2N) in that paper.

(clojure day (March 20) in Northampton, MA is coming soon)

Update: Made a few corrections

20 Days of Clojure: Day 6

So far, I’ve been working on the “clojure is a dialect of Lisp” part of the description of clojure. One of the things Lisps allow is purely functional programming, but to enforce it you need immutable data types. One of the things that bothered me as a C programmer is that I could never understand how that could work for anything more complicated than a string. Then I saw persistent data structures. Their existence in clojure is what made me think that purely functional programming could work in real programs.

There’s a clojure page on data structures, but more importantly, check out clojure TV on collections.

If you’re interested in how these data structures are implemented, you want to read Okasaki. I would start here, with his description of a persistent random access list, which I think is one of the more accessible ones.

Tomorrow, I’m going to try to expand on this by describing how vectors are implemented in clojure.

20 Days of Clojure: Day 5

This SICP lecture shows some of the power of Lisp by building a symbolic differentiator in a very small amount of code, and is also the first one to introduce the idea that code is data and data is code (watch from the beginning to 29:00 — at 13:00 he starts talking about using Lisp as the data format):

Here is the first symbolic differentiator he writes (but, in clojure)

    (defn atom? [x]
        (not (instance? clojure.lang.IPersistentCollection x)))

    (defn const? [expr val] (and (atom? expr) (not= expr val)))
    (defn same-var? [expr val] (and (atom? expr) (= expr val)))
    (defn arith? [expr sym] (and (not (atom? expr)) (= (first expr) sym)))
    (defn sum? [expr] (arith? expr '+))
    (defn prod? [expr] (arith? expr '*))

    (defn a1 [expr] (frest expr))
    (defn a2 [expr] (first (rrest expr)))
    (defn m1 [expr] (frest expr))
    (defn m2 [expr] (first (rrest expr)))

    (defn deriv [expr val]
        (cond
            (const? expr val) 0
            (same-var? expr val) 1
            (sum? expr)
                (list '+ (deriv (a1 expr) val) (deriv (a2 expr) val))
            (prod? expr)
                (list '+ (list '* (m1 expr) (deriv (m2 expr) val))
                      (list '* (m2 expr) (deriv (m1 expr) val)))
            :else nil))
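
For a quick check, here is what it gives, unsimplified, just as in the lecture (note that frest and rrest were later removed from clojure; on a current version you would build a1/a2/m1/m2 from fnext and nnext instead):

    (deriv '(+ x 3) 'x)  ; (+ 1 0)
    (deriv '(* x y) 'x)  ; (+ (* x 0) (* y 1))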

The things to note:

  1. Quote syntax, which uses '
  2. first and rest, which are car and cdr. Clojure also supports combinations like frest (cadr) and rrest (cddr)
  3. In clojure, extra parens are usually not necessary — look at how cond is implemented to use variable arguments rather than a single list argument (which would require extra parens)
  4. My first usage of Java interop (using instance? to find out if an expression is atomic or a collection)

20 Days of Clojure: Day 4

Continuing from yesterday, we made our first function that takes a function and returns a function. Go to 43:00 in this video to see why (you can stop at 54:00):

In this example, we take the first-class functions idea further and define a derivative function (that takes and returns functions) and a newton function (which takes a function and uses fixed-point). Here it is in clojure (you can find fixed-point and other functions in yesterday’s post):

    (defn square [x] (* x x))

    (defn deriv [f]
        (let [dx 0.000001]
            (fn [x] (/ (- (f (+ x dx)) (f x)) dx))))

    (defn newton [f guess]
        (let [df (deriv f)]
            (fixed-point (fn [x] (- x (/ (f x) (df x)))) guess)))

    (defn sqrt [x] (newton (fn [y] (- x (square y))) 1))
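
A quick check at the REPL (the results are only approximate because dx is small but finite):

    ((deriv square) 3)  ; roughly 6.0
    (sqrt 9)            ; roughly 3.0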

(clojure day (March 20) in Northampton, MA is coming soon)

20 Days of Clojure: Day 3

Ok, so first we made a very hard-coded sqrt, then we improved it by adding the concept of fixed points, which made the algorithm easier to see in the code. Here’s how we can improve it even further. Go to 32:30 in this video (stop at the break at 43:00):

First we define an average-damp function. This function takes a function as its argument and returns another function, which takes a number and returns the average of that number and the result of calling the function argument on the number. It’s easier to say that in clojure:

    (defn average-damp [f]
        (fn [x] (average x (f x))))

f is a function; average-damp creates a new function that takes x (a number), and that new function averages x and f(x).
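
For example (a sketch, using the average function from the Day 2 post below):

    ((average-damp (fn [x] (* x x))) 10)  ; 55, the average of 10 and 100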

Now that I have that, I define sqrt like this (you can find average and fixed-point in yesterday’s post):

    (defn sqrt [x] (fixed-point (average-damp (fn [y] (/ x y))) 1))

Tomorrow, we’ll use this to implement another way of finding sqrt.

20 Days of Clojure: Day 2

Yesterday, we built a simple sqrt using Heron of Alexandria’s algorithm. The problem with the code is that it had some generically useful ideas, but it was all hard-coded to only work with sqrt. Also, it was not clear what exactly the algorithm was.

In the SICP lecture they fix it by passing functions around since Lisp treats functions as first-class data-types. Here’s the fix explained (23:30 in the video, stop at 32:30):

See the video for the Scheme implementation. Here it is in clojure.

    (defn sum ([] 0)
        ([x] x)
        ([x & more] (+ x (apply sum more) )))

    (defn average
        ([] 0)
        ([x] x)
        ([x & more]
            (/ (+ x (apply sum more)) (+ 1 (count more)))))

    (defn abs [x] (if (< x 0) (- x) x))

    (defn close-enough? [u v]
        (let [tolerance 0.00001]
            (< (abs (- u v)) tolerance)))

    (defn fixed-point [f start]
        (loop [old start new (f start)]
            (if (close-enough? old new)
                new
                (recur new (f new)))))

    (defn sqrt [x] (fixed-point (fn [y] (average (/ x y) y)) 1 ))

    (prn (sqrt 9))

A fixed point of a function f is a value y such that f(y) = y. This improved code, which implements the same algorithm as before, uses a generic fixed-point function that takes a function and a guess and tries to find the fixed point by setting the next guess to f(guess). You can see it all in the video. The things in clojure you need to know:

  1. fn [arg1 arg2] is lambda (λ), so where fixed-point takes a function as the first argument, we create one on the fly to be (average (/ x y) y), where x is the argument to sqrt (closed over by our function) and y is the argument defined by fn.
  2. Since clojure compiles to Java and uses Java calling conventions, it does not have automatic tail-recursion. Instead, you use the recur special operator, which creates a constant-space loop back to the last loop or function definition (a small sketch follows this list). Loop/recur is explained on the Functional Programming page of the clojure site (at the bottom).
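
Here is loop/recur on its own, as a minimal sketch:

    (loop [i 0 acc 0]
        (if (= i 5)
            acc
            (recur (inc i) (+ acc i))))  ; 10, with no stack growth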

But, even this can be improved, as we’ll see tomorrow.