Category Archives: Clojure

The Most Consequential Clojure I Ever Wrote

In my congratulations to Rich Hickey, I mentioned that clojure influenced my code, but that I never wrote any professionally. I did, however, apply to a job opening at FogCreek/Trello with clojure code. They gated the application process with a programming question.

I call this code the most consequential clojure code I ever wrote because it got me an interview at FogCreek, which ultimately led to me working at Trello for almost seven years, during which we were acquired by Atlassian.

Since the question has been publicly answered many times, and is no longer used, I’ll reproduce it here:

Find a string of characters that contains only letters from “acdegilmnoprstuw” such that the hash(the_string) is 910897038977002.

If hash is defined by the following pseudo-code:

hash(s) {
  h = 7
  letters = "acdegilmnoprstuw"
  for (i = 0; i < s.length; i++) {
    h = h * 37 + letters.indexOf(s[i])
  }
  return h
}

For example, if we were trying to find the string where hash(the_string) was 680131659347, the answer would be “leepadg”.

I chose to use clojure because I didn’t know how big the value of h was going to get and wanted access to clojure’s implementation of BigInteger, which is seamless. All integer math in clojure promotes up to whatever size integer it needs. Once I thought of using clojure, I realized that it would be seen as a fun choice by the developers reviewing the code since FogCreek had a reputation of hiring developers that were interested in weird programming languages (see wasabi).

If you don’t want any hints on how to solve this, stop here.

So, they are asking you to invert the hash() function. In general, hash functions should not be invertible, but this hash is made by using simple lossless integer arithmetic with * and +, which both have inverses (/ and -). Each step of the function is invertible, so the function itself is invertible.

I decided to start with just implementing their pseudocode in clojure.

(defn fchash [arg]
  (let [letters "acdegilmnoprstuw"]
    (loop [h 7 s arg]
      (if-let [c (first s)]
        (recur (+ (* h 37) (.indexOf letters (int c))) (rest s))
        h))))

And to make sure this was correct, I evaluated (fchash "leepadg") to make sure I got 680131659347.

There are a few tricks to inverting the code.

First, you need to figure out when the loop is done. You don’t know how long the string is supposed to be, so you need to detect somehow that you are done. For this, note that the hash is originally seeded with the number 7, so we expect that we’ll eventually get back to this number in the inverted version.

Second, you need to figure out what number was added after the hash was multiplied by 37. To get that, you can use mod 37: the number added is an index into the letters string, so it must be less than 16, and since 16 is less than 37, taking the hash mod 37 recovers exactly the number that was added.
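Putting the two tricks together, here is a sketch of the inversion (my own code and naming here, not what I submitted):

```clojure
(def letters "acdegilmnoprstuw")

(defn unhash [h]
  (loop [h h acc ()]
    (if (= h 7)                       ; back at the seed, so we're done
      (apply str acc)
      (let [idx (rem h 37)]           ; the index added last (always < 16)
        (recur (quot (- h idx) 37)    ; undo the multiply-and-add
               (cons (nth letters idx) acc))))))
```

As a check, (unhash 680131659347) gives back "leepadg".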

That should be enough to attempt an answer (in whatever language). If you want to see what I submitted, look here: Clojure inverse hash.

Congratulations Rich Hickey

Rich Hickey, creator of Clojure, has announced his retirement from commercial software development. It looks like he’ll still be active in clojure development, but as an independent developer.

I met Rich in the 90’s when I took his Advanced C++ continuing education course at NYU. I was running a C development team, and we were adopting C++, so a few of us took the class. The most memorable part was the last few sessions, where he described an object-oriented GUI design built around a dynamic object system (à la Self or JavaScript) using his functor library.

The next 15 years of my career were dominated by C++ where my code was heavily influenced by what I learned in this class.

In 2007, when I saw his presentation at the NYC Lisp group, I reached out to see if he wanted to present to the Western MA Developer Group. Since Clojure was still relatively new, he was willing to come to present to us.

We had about 30-40 people there. One of our members, Chas Emerick, hosted the event. He went on to be a prolific contributor to the clojure ecosystem and co-author of O’Reilly’s Clojure Programming book.

I helped promote the event by writing my 20 Days of Clojure series. For the time, that was a lot of clojure content.

He came in March 2008 and blew the doors off with an elegant, concurrency-safe ant simulation.

Here is my original write-up of the meeting.

I still keep in touch with many of the developers that were there that day and we still talk about it. I can see the influence in their work.

The most important clojure code I wrote was the code I used to apply to FogCreek/Trello. They gated their application with a programming question, and I answered it in clojure because it looked like I would need something like a BigInteger in my answer, and clojure makes that easy. I also knew that the FogCreek/Trello team liked functional programming.

We might not all have adopted clojure (we opted for F# at Atalasoft and one of our engineers went on to become a Microsoft F# MVP), but our career trajectories were changed by that day when Rich opened our minds to what was possible with modern functional programming.

Thank you Rich and congratulations.

20 Days of Clojure: Day 20

Last night was our Clojure event with Rich Hickey. BTW, if you were there, regular meetings of the Western Mass Developers Group are every other Thursday at Panera Bread in Hadley (7pm) — most of the time, we just sit around and talk — very informal — hope to see you there.

Back to the presentation — hopefully the slides/video/etc will be up somewhere soon — Rich focused on the concurrency features of Clojure (Vars, Refs, and Agents). First he showed us the state of concurrency support in Java/C# today (locking) and identified the problem as having direct references to mutable objects. Clojure offers only immutable atoms and datastructures, so that’s how he addresses one part of the problem. However, since change is how the world works, Clojure models change as a change in what a reference points to. So, if you view the world through refs, it will appear to change. Then he introduced the controlled ways in which refs can change.

1. Vars are essentially thread-local variables. They are references whose changes can only be seen in a single thread — each thread gets an independent and isolated view of the variable. They are a very simple way to take code that looks like it’s manipulating global state and give each thread its own isolated copy.

2. Refs are mutable references to immutable objects. They can only be changed inside a transaction and all changes to all refs in a transaction appear to have happened at the time the transaction ends (to other threads). Inside the transaction, the changes are immediate. When two transactions are open and change the same ref, the second one to finish is automatically retried (so you should have no side-effects inside of transactions). All Refs read inside of a transaction will return the value that they had at the start of the transaction (so are consistent with each other).

3. Agents (which I have explained before) are for asynchronous changes to a reference. Each agent holds a value, and you send actions to it to change that value. Only one action happens to an agent at a time, and you can read the current state of the agent at any time. Rich spent a little time cautioning us against comparing them to Erlang’s actors. They are a different concept with different tradeoffs. Agents have the advantage that they can be read conveniently from any thread (just dereference), whereas actors require sending a message to read (which is asynchronous). Clearly, Erlang’s benefit is location transparency (for distributed code) — which is what it was designed for. Rich hinted that Clojure might have an Actor concept, but it would not be unified with Agents.
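Here is how I would sketch all three in a few lines (my own examples, not from the talk; ! is the agent send operator in the code of this era — current clojure calls it send):

```clojure
;; 1. Vars: binding gives this thread its own value of x.
;; (In today's clojure, x would need to be declared ^:dynamic.)
(def x 1)
(binding [x 2]
  (prn x))                          ; prints 2, in this thread only
(prn x)                             ; prints 1

;; 2. Refs: both changes commit together, or retry together.
(def checking (ref 100))
(def savings  (ref 0))
(dosync
  (alter checking - 25)
  (alter savings  + 25))            ; no thread ever sees money "in flight"

;; 3. Agents: actions are applied asynchronously, one at a time.
(def counter (agent 0))
(! counter (fn [v n] (+ v n)) 5)    ; queue up (+ 0 5)
(await counter)                     ; block until our action has run
(prn @counter)                      ; prints 5
```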

What was new to me with agents is that there are two types of message sending mechanisms — send and send-off (which appear to be undocumented right now) — Rich used send-off which dedicates a thread to the agent (rather than run the action from a thread-pool). This is how you have to do it if agents block at all (which ants do because they have a small sleep).

Then, he showed us an ant simulation — I think he will make the code available. In short, there is a square world of cells, each one holds a Ref to what is in the cell (food, ant, pheromones, home base). Ants are represented by agents, which are sent a behave message. Behave picks a random action for the ant to do (move, turn, pick up food, drop food) and then mutates the world of cells in a transaction, then it sends itself another behave action. There is an agent responsible for painting the world, and another which evaporates pheromones in the background.

Anyway, the demo was impressive — since painting makes a copy of the world in a transaction, it has a completely consistent view of the world. Refs make sure that all changes to them are in transactions, so you have language support for where you need cooperation (contrasted to locking, which is not enforced). Agents also help you model the real world in a way that a coordinated single-looped simulation (calling behave on ants in a loop, then paint over and over again) could not.

And clojure’s agent mutating mechanism (sending a function to it) means that agents don’t have to have any knowledge of the messages that might be sent to them (again contrasted to Erlang Actors). Finally, various messages can take different times to complete, and that would be represented in the simulation — some ants might complete several small actions in the time it took another to complete one (which would not be the case in a behave loop).

I’ll have more on this when the slides, code, etc are available.

20 Days of Clojure: Day 19

Ok, yesterday I decided to try to implement a defclass macro which would hobble multimethods and force you to have some kind of class inheritance for polymorphism. Today, I’ll try to implement that macro.

First, I made this macro, which I think will be useful

  (defmacro sym2key [s]
    `(keyword (name (quote ~s))))

Works like this:

  user=> (sym2key rect)
  :rect

(I wrote the above this morning, and after several hours, I didn’t get too far)

I had to cheat a lot, but I finally got something — here is my final OO minilanguage

  (defclass shape
    (defabstractmeth area))

  (defclass rect
    (defctor (fn [w h] (setprop w) (setprop h)))
    (defmeth area (fn [] (* (:w this) (:h this)))))

  (defclass circle
    (defctor (fn [r] (setprop r)))
    (defmeth area (fn [] (* (:r this) (:r this) (. Math PI)))))

I got that to be processed by these macros:

  (defmacro sym2key [s]
    `(keyword (name (quote ~s))))

  (defmacro defclass [name & body]
    (cons 'do
      (loop [classbody# body parts# nil]
        (if classbody#
          (recur (rest classbody#)
                 (cons (concat (first classbody#) `(~name)) parts#))
          parts#))))

  (defmacro defabstractmeth [name class] `(defmulti ~name :type))

  (defmacro setprop [x] { (sym2key x) x })

  (defmacro defctor [fndef name]
    (let [arglist# (second fndef)
          fnbody# (map second (nthrest fndef 2))
          obj# (reduce merge
                 {:type `(sym2key ~name)}
                 (map assoc (repeat {})
                   (map eval
                     (map sym2key fnbody#)) fnbody#))]
      `(defn ~name ~arglist# ~obj#)))

  (defmacro defmeth [meth fndef name]
    (let [arglist# (conj (second fndef) 'this)
          fnbody# (first (nthrest fndef 2))
          namesym# (eval `(sym2key ~name))]
      `(defmethod ~meth ~namesym# ~arglist# ~fnbody#)))

This code shows it in action:

  (prn (rect 10 20))
  (prn (area (rect 10 20)))
  (prn (circle 10))
  (prn (area (circle 10)))


  {:type :rect, :h 20, :w 10}
  200
  {:type :circle, :r 10}
  314.1592653589793

I am way too tired to explain this — suffice it to say, this is kind of a crazy way to make something, but it sure beats not being able to make it. My ctor implementation is so crazy that you should just ignore it — defclass and defmeth are worth looking at (ctor is a major hack — it assumes only setprop calls and turns them into a map).

20 Days of Clojure: Day 18

I just got around to watching Clojure TV on sequences, which I guess I should have done earlier, but I guess I was too lazy — thanks, I’m here all week.

Ok — since clearly, day 20 is going to be about Clojure Day in Northampton, I only have two days to explore more of clojure. I haven’t even gotten to multimethods, which are very cool, but pretty simple to understand. I want to keep working with macros.

The hardest thing about macros is that, never having used a language that has them, I haven’t started to think in macros yet. It kind of reminds me of Clint Eastwood in Firefox, who has to keep remembering to think in Russian in order to fly his stolen super-secret Soviet jet.

Ok, since I think in OO, let me try to do some OO in clojure via macros — maybe I get to use multimethods after all (let’s hobble them with conventional OO polymorphism)

  (defclass rect
    (defctor (fn [w h] (setprop w) (setprop h)))
    (defmeth area (fn [] (* w h))))

  (defclass circle
    (defctor (fn [r] (setprop r)))
    (defmeth area (fn [] (* r r (. Math PI)))))

I want that to turn into

  (defn rect [w h] {:type :rect :w w :h h})
  (defmulti area :type)
  (defmethod area :rect [r]
    (* (:w r) (:h r)))

  (defn circle [r] {:type :circle :r r})
  (defmulti area :type)
  (defmethod area :circle [c]
    (* (. Math PI) (* (:r c) (:r c))))

I’m not even sure this is possible — I just checked whether it’s legal to defmulti the same name twice, and it effectively isn’t, because the second defmulti kills the definition for rect. Luckily, I have a precedent to fall back on — in Java/C#-style OO, area is only polymorphic if it came from a superclass. I can change the class structure to this:

  (defclass shape
    (defabstractmeth area))

  (defclass rect :extends shape
    (defctor (fn [w h] (setprop w) (setprop h)))
    (defmeth area (fn [] (* w h))))

  (defclass circle :extends shape
    (defctor (fn [r] (setprop r)))
    (defmeth area (fn [] (* r r (. Math PI)))))

This is so ugly, I don’t even know if I want to go through with this. I’ll try for tomorrow.

20 Days of Clojure: Day 17

(clojure day in Northampton, MA is Thursday, March 20th)

I know I’ll reveal myself as being tragically unhip for not using emacs or aquamacs or whatever (I’m a VI man after all), but I started this TextMate Bundle for clojure. I’ll grow it as time goes by and probably move it to its own page or some repository for bundles once I figure this bundling stuff out.

It does some rudimentary syntax coloring and automatically expands defn[Tab], def[Tab], let[Tab], and if[Tab].

If you use TextMate and have some ideas for things to add, please send me a note.

20 Days of Clojure: Day 16

Ok, time to look at macros. Being new to lisps, I’m especially new to macros. I have been meaning to read On Lisp for a few months, so like the SICP, I’ll show you stuff from the macros chapter and then try to write them in clojure.

In On Lisp (Chapter 7), Paul Graham writes:

The expression generated by the macro will be evaluated in place of the original macro call. A macro call is a list whose first element is the name of the macro. What happens when we type the macro call into the toplevel? Lisp notices that [it] is the name of a macro, and

  1. builds the expression specified by the definition, then
  2. evaluates that expression in place of the original macro call

Macros simply allow you to write code that gets executed at compile time to generate code, which is then run. There really isn’t anything quite like them in popular non-Lisp languages, though there are a couple of .NET languages with language-altering macros. They are much more powerful than the preprocessor macros in C, and C++ template metaprogramming can sometimes get you the effect, but templates were not designed for this, so it is too cumbersome most of the time.
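Before looking at real examples, here is a tiny macro of my own to make Graham’s two steps concrete — the call is rewritten into an if at compile time, and then the if is what gets evaluated:

```clojure
;; unless evaluates its body only when test is false.
(defmacro unless [test & body]
  `(if (not ~test) (do ~@body)))

(unless false (prn "ran"))                    ; prints "ran"

(macroexpand-1 '(unless false (prn "ran")))
;; returns the generated code, something like:
;;   (if (not false) (do (prn "ran")))
```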

We have been using macros in clojure throughout this series. In fact, since Lisps evaluate all of their arguments before calling a function, whenever you see code where all of the arguments are not evaluated, you are either looking at something built-in to the language (e.g. if) or a macro (e.g. cond). You can look at boot.clj to see the definitions of them. For example, here’s lazy-cons:

  (defmacro lazy-cons [x & body]
    (list 'fnseq x (list* `fn [] body)))

Remember, we used lazy-cons to make lazy sequences. lazy-cons acts like cons, except it does not evaluate the rest-expr until rest is called on the resulting list. We can use macroexpand-1 (which returns the unevaluated macro expansion of a macro call) to see how this works:

  user=> (defn rfn [] (prn "Called") '(3))
  #<Var: user/rfn>
  user=> (rfn)
  user=> (def lazylist (lazy-cons 1 (rfn)))
  #<Var: user/lazylist>
  user=> (first lazylist)
  user=> (rest lazylist)
  user=> (rest lazylist)
  user=> (macroexpand-1 '(lazy-cons 1 (rfn)))
  (fnseq 1 (clojure/fn [] (rfn)))
  user=> (first (eval (macroexpand-1 '(lazy-cons 1 (rfn)))))
  user=> (rest (eval (macroexpand-1 '(lazy-cons 1 (rfn)))))

So, a lazy-cons is just shorthand for using a fnseq and wrapping the rest in a function that will be called whenever rest is called on the fnseq. That’s a typical use of a macro — codifying common uses to make your programs shorter. And, lazy-cons cannot be implemented as a function, because if it were, its arguments would need to be evaluated, which defeats the purpose.
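To make that last point concrete, here is a hypothetical function version (my own illustration, reusing the fnseq from the expansion above):

```clojure
;; By the time eager-cons runs, its second argument has already been
;; evaluated by the caller, so the laziness is lost.
(defn eager-cons [x r]
  (fnseq x (fn [] r)))
```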

Ok, let’s write some macros (Documentation on the clojure macro page and the clojure reader page). On Lisp, section 7.3 (Defining Simple Macros) presents this one for while.

  (defmacro while (test &body body)
    `(do () ((not ,test)) ,@body))

Here it is in clojure:

  (defmacro while [test & body]
    `(loop [] (if ~test (do ~@body (recur)))))

In clojure, you don’t have many mutable objects — here’s how you use while with a thread-local var binding:

  (def x 10)
  (prn (macroexpand '(while (> x 0) (prn x) (set! x (dec x)))))
  (binding [x 10] (while (> x 0) (prn x) (set! x (dec x))))

Here’s the output:

  (loop [] (if (> x 0) (do (prn x) (set! x (dec x)) (recur))))

20 Days of Clojure: Day 15

Christophe Grand made two observations about my parallel prime code, which improve it quite a bit.

First, since primes-in-list is lazy, you have to (touch) it to get it to be completely created in the agent thread. That leaves you with this:

  (defn par-primes-in-list [potential-primes known-primes]
    (let [x 10 x-times-2 (* x 2)]
      (! w1 (fn [w pp kp] (touch (primes-in-list pp kp)))
         (take x potential-primes) known-primes)
      (! w2 (fn [w pp kp] (touch (primes-in-list pp kp)))
         (take x (drop x potential-primes)) known-primes)
      (await w1 w2)

      (let [new-primes (concat @w1 @w2)]
        (if (= (count new-primes) 0)
          (recur (drop x-times-2 potential-primes) known-primes)
          (let [real-primes
                (if (> (first new-primes) x-times-2)
                  new-primes
                  (primes-in-list new-primes []))]
            (lazy-cat real-primes
                      (par-primes-in-list
                        (drop x-times-2 potential-primes)
                        (into known-primes real-primes))))))))

But, the bigger difference is picking the number of potential-primes to evaluate intelligently. I had experimented with trying to do this dynamically, but without the fix above, anything you do will probably result in primes being found lazily in the main thread. Here’s Christophe’s code:

    (defn par-primes-in-list [potential-primes known-primes]
      (let [p (first potential-primes) x (quot p 2)]
        (! w1 (fn [w pp kp] (touch (lazy-primes-in-list pp kp)))
           (take x potential-primes) known-primes)
        (! w2 (fn [w pp kp] (touch (lazy-primes-in-list pp kp)))
           (take (- p x) (drop x potential-primes)) known-primes)
        (await w1 w2)

        (let [new-primes (concat @w1 @w2)]
          (lazy-cat new-primes
                    (par-primes-in-list
                      (drop p potential-primes)
                      (into known-primes new-primes))))))

Christophe’s observation:

I use the property that no number in the interval [n .. 2*n-1] is divisible by n (except n itself) so it’s safe to process the whole interval with the current set of known primes. Hence this interval can be divided into m intervals (where m is the number of cores) for parallel processing.
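Christophe’s property is easy to spot-check (my own quick check, not from his note). For n = 11, the only number in [11 .. 21] divisible by 11 is 11 itself:

```clojure
(filter #(zero? (rem % 11)) (range 11 22))
;; => (11)
```

So once 11 is known, the whole interval can be tested against the current set of known primes, in parallel.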

The one issue with Christophe’s code is that it’s part lazy and part eager, so its applicability depends on whether you are trying to get to a specific number the fastest or just generate primes as fast as possible. Since the number of primes to try grows as it searches, it lazily generates bigger and bigger lists of primes. I tried to find a good value to ask for where it would perform well, and chose 20390, which is big enough to show a difference and doesn’t make Christophe’s code do too much extra. Here are the results:

Christophe’s: “Elapsed time: 87266.119 msecs”
Mine (with a touch added): “Elapsed time: 108841.354 msecs”
Sequential: “Elapsed time: 151075.292 msecs”

20 Days of Clojure: Day 14

Ok, I’m finally getting somewhere. To start with, Rich pointed out that I could use recur to make my last prime finder stack safe and faster. Here it is:

  (defn primes-in-list [potential-primes known-primes]
    (let [p (first potential-primes)]
      (if (nil? p)
        '()
        (if (is-prime? p known-primes)
          (lazy-cat (list p)
                    (primes-in-list (rest potential-primes) (conj known-primes p)))
          (let [rest-primes (rest potential-primes)]
            (recur rest-primes known-primes))))))

Here’s how I parallelized it:

  (def w1 (agent 0))
  (def w2 (agent 0))

  (defn par-primes-in-list [potential-primes known-primes]
    (let [x 10 x-times-2 (* x 2)]
      (! w1 (fn [w pp kp] (primes-in-list pp kp)) (take x potential-primes) known-primes)
      (! w2 (fn [w pp kp] (primes-in-list pp kp)) (take x (drop x potential-primes)) known-primes)
      (await w1 w2)

      (let [new-primes (concat @w1 @w2)]
        (if (= (count new-primes) 0)
          (recur (drop x-times-2 potential-primes) known-primes)
          (let [real-primes
                (if (> (first new-primes) x-times-2)
                  new-primes
                  (primes-in-list new-primes []))]
            (lazy-cat real-primes
                      (par-primes-in-list
                        (drop x-times-2 potential-primes)
                        (into known-primes real-primes))))))))

Basically, I send lists to the agents, wait for them, and then combine their answers. The original runs at about 100% CPU and calculates 8000 primes in 21.6 seconds; this version runs at about 125% CPU and calculates them in 17.1 seconds, or in about 80% of the time.