Software Development

Strictly Professional Podcast

I took part in a Western MA Developers podcast last week. It was a lot of fun and definitely captures the spirit of how our conversations usually go. We cover a broad range of topics including the bug tracker market, refunds in software, whether to be Rich or King in a venture, and a philosophical discussion of Rich Hickey's Time is the new Memory concept.

Ben Fry Speaking in Northampton

Ben Fry, the creator of Processing and the author of Visualizing Data is coming to speak in Northampton. The event is sponsored by Atalasoft and Snowtide.

When: May 5th @ 6:30
Where: Snowtide Offices, 243 King St (Potpourri Mall, just across from Stop and Shop) Northampton, MA

I'll post more details soon.

Unit testing on the iPhone

Thanks to Google, there's a unit-testing framework for the iPhone. There's not much more to say about it -- the instructions are crystal clear and it worked exactly as described. It's compatible with OCUnit (the Objective-C unit test framework in XCode), so once you set it up, you can just create test cases the way you would for any ObjC project.

One quirk -- it instructs you to add a build step that runs the unit-tests during build time and shows the failures as compiler errors that you can then use XCode to track down. That's nice, but I have found that you don't really have enough of an environment to successfully run every kind of test -- they run fine if you run them in the simulator. The main problem I have is with setting up my database in my Documents folder -- I get errors at build-time that work just fine at run-time.

Get iPhone programming tips in your inbox with my Beginner iPhone Programming Tips newsletter.

Ant Colony in F#

My colleague, Rick, is playing around with F#. He made this ant colony in F# that's based on Rich's clojure one. If you know clojure, and you want to see what F# is all about, you might want to take a look.

20 Days of Clojure: Day 20

Last night was our Clojure event with Rich Hickey. BTW, if you were there, regular meeting of the Western Mass Developers Group are every other Thursday at Panera Bread in Hadley (7pm) --- most of the time, we just sit around and talk -- very informal -- hope to see you there.

Back to the presentation -- hopefully the slides/video/etc will be up somewhere soon -- Rich focussed on the concurrency features of Clojure (Vars, Refs, and Agents). First he showed us the state of concurrency support in Java/C# today (locking) and identified the problem as having direct references to mutable objects. Clojure offers only immutable atoms and datastructures so that's how he addresses one part of the problem. However, since the mutability is how the world works, Clojure models change as a change to what a reference points to. So, if you view the world through refs, it will appear to change. Then he introduced the controlled ways in which refs can change.

1. Vars are essentially thread local variables. They are a reference whose changes can only be seen in a single thread -- each thread gets an independent and isolated view of the variable. Like thread-local variables -- they are a very simple way to take code that looks like it's manipulating global state and give each thread its own isolated copy.

2. Refs are mutable references to immutable objects. They can only be changed inside a transaction and all changes to all refs in a transaction appear to have happened at the time the transaction ends (to other threads). Inside the transaction, the changes are immediate. When two transactions are open and change the same ref, the second one to finish is automatically retried (so you should have no side-effects inside of transactions). All Refs read inside of a transaction will return the value that they had at the start of the transaction (so are consistent with each other).

3. Agents (which I have explained before) are for asynchronous changes to a reference. Each agent hold a value and you send actions to it to change the value. Only one action will happen to an agent at a time and you can read the current state of the agent any time. Rich spent a little time to caution us against any comparison to Erlang's actors. They are a different concept with different tradeoffs. Agents have the advantage that they can be read from any thread conveniently (just dereference) whereas actors require sending a message to read (which is asynchronous). Clearly, Erlang's benefit is location transparency (for distributed code) -- which is what it was designed for. Rich hinted that Clojure might have an Actor concept, but it would not be unified with Agents.

What was new to me with agents is that there are two types of message sending mechanisms -- send and send-off (which appear to be undocumented right now) -- Rich used send-off which dedicates a thread to the agent (rather than run the action from a thread-pool). This is how you have to do it if agents block at all (which ants do because they have a small sleep).

Then, he showed us an ant simulation -- I think he will make the code available. In short, there is a square world of cells, each one holds a Ref to what is in the cell (food, ant, pheromones, home base). Ants are represented by agents, which are sent a behave message. Behave picks a random action for the ant to do (move, turn, pick up food, drop food) and then mutates the world of cells in a transaction, then it sends itself another behave action. There is an agent responsible for painting the world, and another which evaporates pheromones in the background.

Anyway, the demo was impressive -- since painting makes a copy of the world in a transaction, it has a completely consistent view of the world. Refs make sure that all changes to them are in transactions, so you have language support for where you need cooperation (contrasted to locking, which is not enforced). Agents also help you model the real world in a way that a coordinated single-looped simulation (calling behave on ants in a loop, then paint over and over again) could not.

And clojure's agent mutating mechanism (sending a function to it), means that agents don't have to have any knowledge of the messages that might be sent to it (again contrasted to Erlang Actors). Finally, various messages can take different times to complete and that would be represented in the simulation --- some ants might complete several small actions in the time it took another to complete one (which would not be the case in a behave loop).

I'll have more on this when the slides, code, etc are available.

20 Days of Clojure: Day 19

Ok, yesterday I decided to try to implement a defclass macro which would hobble multimethods and force you to have some kind of class inheritance for polymorphism. Today, I'll try to implement that macro.

First, I made this macro, which I think will be useful

  (defmacro sym2key [s]
    `(keyword (name (quote ~s)))    
  )


Works like this:

  user=> (sym2key rect)
  :rect


(I wrote the above this morning, and after several hours, I didn't get too far)

I had to cheat a lot, but I finally got something -- here is my final OO minilanguage

  (defclass shape
    (defabstractmeth area)
  )
  (defclass rect
    (defctor (fn [w h] (setprop w) (setprop h) ))
    (defmeth area (fn [] (* (:w this) (:h this))))
  )
  (defclass circle
    (defctor (fn [r] (setprop r) ))
    (defmeth area (fn [] (* (:r this) (:r this) (. Math PI))))
  )


I got that to be processed by these macros:

  (defmacro sym2key [s]
    `(keyword (name (quote ~s)))  
  )

  (defmacro defclass [name & body]
    (cons 'do
    (loop [classbody# body parts# nil]
      (if classbody#
        (recur (rest classbody#) (cons (concat (first classbody#) `(~name)) parts#))
        parts#)))
      
  )

  (defmacro defabstractmeth [name class] `(defmulti ~name :type) )

  (defmacro setprop [x] { (sym2key x) x })

  (defmacro defctor [fndef name]
    (let [  arglist# (second fndef)
        fnbody# (map second (nthrest fndef 2) )
        obj# (reduce merge
            {:type `(sym2key ~name)}
            (map assoc (repeat {})
              (map eval
                (map sym2key fnbody#)) fnbody#))
      ]
      
      `(defn ~name ~arglist# ~obj#)
    )
  )

  (defmacro defmeth [meth fndef name]
    (let [  arglist# (conj (second fndef) 'this)
        fnbody# (first (nthrest fndef 2))
        namesym# (eval `(sym2key ~name))

      ]
      
      `(defmethod ~meth ~namesym# ~arglist# ~fnbody#)
    )
  )

This code shows it in action:

  (prn (rect 10 20))
  (prn (area (rect 10 20)))
  (prn (circle 10))
  (prn (area (circle 10)))

Outputs:

  {:type :rect, :h 20, :w 10}
  200
  {:type :circle, :r 10}
  314.1592653589793

I am way too tired to explain this -- suffice to say, this is kind a crazy way to make something, but it sure beats not being able to make it. My ctor implementation is so crazy, that you should just ignore it -- defclass and defmeth are worth looking at (ctor is a major hack -- assumes only setprop calls and turns them into a map).

20 Days of Clojure: Day 18

I just got around to watching Clojure TV on sequences, which I guess I should have done earlier, but I guess I was too lazy --- thanks, I'm here all week.

Ok -- since clearly, day 20 is going to be about Clojure Day in Northampton, I only have two days to explore more of clojure. I haven't even gotten to multimethods, which are very cool, but pretty simple to understand. I want to keep working with macros.

The hardest thing about thinking about macros is that not using a language that has one, I haven't started to think in macros yet. Kind of reminds me of Clint Eastwood in Firefox who has to keep remembering to think in Russian in order to fly his stolen super-secret Soviet jet.

Ok, since I think in OO, let me try to do some OO in clojure via macros -- maybe I get to use multimethods after all (let's hobble them with conventional OO polymorphism)

  (defclass rect
    (defctor (fn [w h] (setprop w) (setprop h) ))
    (defmeth area (fn [] (* w h)))
  )

  (defclass circle
    (defctor (fn [r] (setprop r)))
    (defmeth area (fn [] (* r r (. Math PI))))
  )

I want that to turn into

  (defn rect [w h] {:type :rect :w w :h h})
  (defmulti area :type)
  (defmethod area :rect [r]
    (* (:w r) (:h r)))

  (defn circle [r] {:type :circle :r r})
  (defmulti area :type)
  (defmethod area :circle [c]
    (* (. Math PI) (* (:r c) (:r c))))


I'm not even sure this is possible -- I just checked to see if it's legal to defmulti the same name, and it's not really because it kills the definition for rect. Luckily, I have a precedent to fall back on -- in Java/C# style OO, area is only polymorphic if it came from a superclass. I can change the class structure to this:

  (defclass shape
    (defabstractmeth area)
  )

  (defclass rect :extends shape
    (defctor (fn [w h] (setprop w) (setprop h) ))
    (defmeth area (fn [] (* w h)))
  )

  (defclass circle :extends shape
    (defctor (fn [r] (setprop r)))
    (defmeth area (fn [] (* r r (. Math PI))))
  )

This is so ugly, I don't even know if I want to go through with this. I'll try for tomorrow.

20 Days of Clojure: Day 17

(clojure day in Northampton, MA is Thursday, March 20th)

viman
I know I'll reveal myself as being tragically unhip for not using emacs or aquamacs or whatever (I'm a VI man after all), but I started this TextMate Bundle for clojure. I'll grow it as time goes by and probably move it to its own page or some repository for bundles once I figure this bundling stuff out.

I do some rudimentary syntax coloring and automatically expand defn[Tab], def[Tab], let[Tab], and if[Tab].

If you use TextMate and have some ideas for things to add, please send me a note.

20 Days of Clojure: Day 16

Ok, time to look at macros. Being new to lisps, I'm especially new to macros. I have been meaning to read On Lisp for a few months, so like the SICP, I'll show you stuff from the macros chapter and then try to write them in clojure.

In, On Lisp (Chapter 7), Paul Graham writes:

The expression generated by the macro will be evaluated in place of the original macro call. A macro call is a list whose first element is the name of the macro. What happens when we type the macro call into the toplevel? Lisp notices that [it] is the name of a macro, and

  1. builds the expression specified by the definition, then
  2. evaluates that expression in place of the original macro call
Macros simply allow you to write code that gets executed at compile time to generate code, which is then run. There really isn't anything quite like them in popular non-lisp languages, though there are a couple of .NET languages with language altering macros. They are much more powerful than the preprocessor macros in C, and C++ template meta-programming can sometimes get you the effect, but templates were not designed for this, so it is too cumbersome most of the time.

We have been using macros in clojure throughout this series. In fact, since Lisps evaluate all of their arguments before calling a function, whenever you see code where all of the arguments are not evaluated, you are either looking at something built-in to the language (e.g. if) or a macro (e.g. cond). You can look at boot.clj to see the definitions of them. For example, here's lazy-cons:

  (defmacro lazy-cons [x & body]
    (list 'fnseq x (list* `fn [] body)))


Remember, we used lazy-cons to make lazy sequences. lazy-cons acts like cons, except it does not evaluate the rest-expr until rest is called on the resulting list. We can use macroexpand-1 (which returns the unevaluated macro expansion of a macro call) to see how this works:

  user=> (defn rfn [] (prn "Called") '(3))
  #<Var: user/rfn>
  user=> (rfn)
  "Called"
  (3)
  user=> (def lazylist (lazy-cons 1 (rfn)))
  #<Var: user/lazylist>
  user=> (first lazylist)
  1
  user=> (rest lazylist)
  "Called"
  (3)
  user=> (rest lazylist)
  (3)
  user=> (macroexpand-1 '(lazy-cons 1 (rfn)))
  (fnseq 1 (clojure/fn [] (rfn)))
  user=> (first (eval (macroexpand-1 '(lazy-cons 1 (rfn)))))
  1
  user=> (rest (eval (macroexpand-1 '(lazy-cons 1 (rfn)))))
  "Called"
  (3)

So, a lazy-cons is just a short hand to using a fnseq and wrapping the rest in a function that will be called whenever rest is called on the fnseq. That's a typical use of a macro -- codifying common uses to make your programs shorter. And, lazy-cons cannot be implemented as a function, because if it were, its arguments would need to be evaluated, which defeats the purpose.

Ok, let's write some macros (Documentation on the clojure macro page and the clojure reader page). On Lisp, section 7.3 (Defining Simple Macros) presents this one for while.

  (defmacro while (test &body body)
    `(do () ((not ,test)) ,@body))

Here it is in clojure:

  (defmacro while [test & body]
    `(loop [] (if ~test (do ~@body (recur)))))

In clojure, you don't have many mutable objects -- here's how you use while with a thread-local var binding:

  (def x 10)
  (prn (macroexpand '(while (> x 0) (prn x) (set! x (dec x)))))
  (binding [x 10] (while (> x 0) (prn x) (set! x (dec x))))

Here's the output:

  (loop [] (if (> x 0) (do (prn x) (set! x (dec x)) (recur))))
  10
  9
  8
  7
  6
  5
  4
  3
  2
  1

20 Days of Clojure: Day 15

Christophe Grand made two observations about my parallel prime code, which improve it quite a bit.

First, since primes-in-list is lazy, you have to (touch) it to get it to be completely created in the agent thread. That leaves you with this:

  (defn par-primes-in-list [potential-primes known-primes]
    (let [ x 10 x-times-2 (* x 2)]
      (! w1 (fn[w pp kp] (
touch (primes-in-list pp kp))) (take x potential-primes) known-primes)
      (! w2 (fn[w pp kp] (
touch (primes-in-list pp kp))) (take x (drop x potential-primes)) known-primes)
      (await w1 w2)

      (let [new-primes (concat @w1 @w2) ]
        (if (= (count new-primes) 0)
          (recur (drop x-times-2 potential-primes) known-primes)
          (let [real-primes
            (if (> (first new-primes) x-times-2) new-primes (primes-in-list new-primes []))]
            (lazy-cat real-primes (par-primes-in-list
              (drop x-times-2 potential-primes)
              (into known-primes real-primes))))))))

But, the bigger difference is picking the number of potential-primes to evaluate intelligently. I had experimented with trying to do this dynamically, but without the fix above, anything you do will probably result in primes being found lazily in the main thread. Here's Christophe's code:

    (defn par-primes-in-list [potential-primes known-primes]
        (let [p (first potential-primes) x (quot p 2)]
            (! w1 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))
                (take x potential-primes) known-primes)
            (! w2 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))
                (take (- p x) (drop x potential-primes)) known-primes)
            (await w1 w2)

            (let [new-primes (concat @w1 @w2) ]
                (lazy-cat new-primes
                    (par-primes-in-list
                         (drop p potential-primes)
                         (into known-primes new-primes))))))

Christophe's observation:

I use the property that no number in the interval [n .. 2*n-1] is divisible by n (except n itself) so it's safe to process the whole interval with the current set of known primes. Hence this interval can be divided into m intervals (where m is the number of cores) for parallel processing.

The one issue with Christophe's code it part lazy and part eager, so it's applicability depends on whether you are trying to get to a specific number the fastest or just generate primes as fast as possible. Since the number of primes to try grows as it searches, it lazily generates bigger and bigger lists of primes. I tried to find a good value to ask for where it would perform well, and chose 20390 which is big enough to show a difference and doesn't make Christophe's code do too much extra. Here are the results:

Christophe's: "Elapsed time: 87266.119 msecs"
Mine (with a touch added): "Elapsed time: 108841.354 msecs"
Sequential: "Elapsed time: 151075.292 msecs"

20 Days of Clojure: Day 14

Ok, I'm finally getting somewhere. To start with, Rich pointed out that I could use recur to make my last prime finder stack safe and faster. Here it is:

  (defn primes-in-list [potential-primes known-primes]
    (let [p (first potential-primes)]
      (if (nil? p) '()
        (if (is-prime? p known-primes)
          (lazy-cat
            (list p)
            (primes-in-list (rest potential-primes) (conj known-primes p)))
          (let [rest-primes (rest potential-primes)]
            (recur rest-primes known-primes))))))

Here's how I parallelized it:

  (def w1 (agent 0))
  (def w2 (agent 0))

  (defn par-primes-in-list [potential-primes known-primes]
    (let [ x 10 x-times-2 (* x 2)]
      (! w1 (fn[w pp kp] (primes-in-list pp kp)) (take x potential-primes) known-primes)
      (! w2 (fn[w pp kp] (primes-in-list pp kp)) (take x (drop x potential-primes)) known-primes)
      (await w1 w2)

      (let [new-primes (concat @w1 @w2) ]
        (if (= (count new-primes) 0)
          (recur (drop x-times-2 potential-primes) known-primes)
          (let [real-primes
            (if (> (first new-primes) x-times-2) new-primes (primes-in-list new-primes []))]
            (lazy-cat real-primes (par-primes-in-list
              (drop x-times-2 potential-primes)
              (into known-primes real-primes))))))))

Basically, I send lists to the agents, wait for them, and then combine their answers. The original runs at about 100% of CPU, calculates 8000 primes in 21.6 seconds and this runs at about 125%, and calculates the primes in 17.1 seconds, or in about 80% of the time.

20 Days of Clojure: Day 13

My brain still hurts from thinking about parallelizing prime number generation. I think the key is still to find primes two at a time in parallel, meaning that it's only useful if you want a list, which is necessary to find the nth prime anyway.

So, I am starting with a rewrite of prime number generation -- this one is simpler for me to understand and for some reason runs three times faster.

  (defn divisible? [x y]
      (= (rem x y) 0)
  )

  (defn is-prime? [p known-primes]
      (every? (fn [kp] (not (divisible? p kp))) known-primes)
  )

  (defn primes-in-list [potential-primes known-primes]
      (let [p (first potential-primes)]
          (if (is-prime? p known-primes)
              (lazy-cons
                  p
                  (primes-in-list (rest potential-primes) (conj known-primes p)))
              (primes-in-list (rest potential-primes) known-primes)
          )
      )
  )

  (def primes (primes-in-list (iterate inc 2) []))
  (time (reduce + (take 1000 primes)))
  (time (reduce + (take 1000 primes)))

It's also lazy. Not sure why it's so much faster. Now that I can provide a list of potentials, I'll be able to run two lists in parallel, and then I have to reduce them in the main thread. When I get the lists back, I need to make sure that they aren't divisible by each other (since they won't be in each others list of known primes).

20 Days of Clojure: Day 12

So, Rich thinks I'm on a fool's errand, but I'm going to keep going with this.

I'm sorry to say I can't think of a less promising candidate for parallelization :( The 3 things you would want in something to parallelize are independent data/work, moderate-to-course-grained work units and/or complex coordination logic that would be simplified with threads. (Note - wanting it to run faster is not enough :) Here you have none of those - the data/work is interdependent (more later), the work is very fine grained (seq walk and a remainder test), and the logic was pretty elegant.

So, in order to have something to parallelize, I need to get one of those things. Since I am trying to make a list of primes, I'll generate them two at a time. Each check is independent. There's probably better ways to do it, but to keep my life simple, I'll just generate pairs of primes lazily, but in parallel (each member of the pair in a separate thread).

Hmmm. I'm too tired to figure this out right now. So, for some content today, I can explain a mistake that Rich pointed out in my original prime sieve.

I defined primes as a function that returns a lazy seq, but a key benefit of lazy seqs is that they cache values -- so by creating a new lazy seq for each call to primes, I lose caching. The right way is to define primes as a seq this way:

  (def primes (sieve (iterate inc 2)))

Then, when you do this:

  (time (reduce + (take 1000 primes)))
  (time (reduce + (take 1000 primes)))

You get this:

  "Elapsed time: 1080.253 msecs"
  "Elapsed time: 5.36 msecs"

20 Days of Clojure: Day 11

Ok, so now we've covered some basic clojure as lisp with immutable datastructures and built-in support for streaming (lazy) interfaces. I was playing around with Erlang right before I found clojure because I was looking for a language with direct support for concurrency, the lispiness and the JVM hosting were the deciding factor for me.

I'm just getting started with clojure's concurrency support, so you'll have to bear with me as I might struggle a bit. Yesterday, I showed you a lazy prime number sieve built using the seq interface. Today, I'm going to try to parallelize it. Since I don't really know what I'm doing, I'll take you through my thinking.

There are three ways in clojure to deal with memory in threads: dynamic vars, software transactional memory, and agents. I get dynamic vars and agents because I use things like them all the time. Dynamic vars are basically thread-local storage and agents are a kind of ActiveObject [pdf]. Thread local storage isn't going to work for me since I need to gather answers from my worker threads, not have each work independently (there may be dynamic vars in the solution, but it's not the way you communicate between threads).

ActiveObject is a pattern that serializes access to an object so that one thing happens to it at a time. Individual calls are put in a queue which is serviced by a thread assigned to the object, and that thread is the only one to touch the object. In clojure, there's actually a thread pool that services all agents. Now, that I think about it -- maybe this will work. I kind of wanted to play with STM's, but I think I understand this better and I feel I can still make it streamable.

Here's my naive idea. I'm going to make an agent per prime, then ask them all if a number is divisible by them, then wait for them all and if it is, make a new agent and go on. That seems really dumb unless agents are super-cheap. Let's see.

I'll start with something simple:

 (defn divisible? [x y] (= (rem x y) 0))

I'll define my agents to be maps that look like this at the start {:prime 2} and then define this:

  (defn check-divisible [agent x]
    (merge agent {:last x :notdivisible (not (divisible? x (:prime agent)))}))

Here's a quick Repl session so far (no real agents yet)

  user=> (check-divisible {:prime 2} 3)
  {:last 3, :prime 2, :notdivisible true}
  user=> (check-divisible {:prime 2} 4)
  {:last 4, :prime 2, :notdivisible false}
  user=> (check-divisible {:prime 2} 5)
  {:last 5, :prime 2, :notdivisible true}

Ok, now I'll start. Here's something weird -- I just did a CPU snapshot running the sequential prime sieve from yesterday and it looks like this:

cpu

I don't know what to make of that -- I know purely functional programs are easy to parallelize, but this sieve seems particularly serial in implementation. Anyway, I'll press on:

Here's my agent-sieve

  (defn agent-sieve [a ints]
    (let [p (next-prime a ints)]
      (lazy-cons p (agent-sieve (cons {:prime p} a) (iterate inc (inc p))))))

I keep stuffing a new agent list into the rest of a lazy-cons -- just need to write next-prime:

  (defn next-prime [agents ints]
    (let [x (first ints)]
      (check-divisible-by-agents (take-nth 2 agents) x)
      (check-divisible-by-agents (take-nth 2 (drop 1 agents)) x)
      (apply await agents)
      (if (every? (fn [a] (or (not= (:last @a) x) (:notdivisible @a))) agents)
        x
        (next-prime agents (rest ints)))))

Ok, so I'm starting to see problems -- for one, I really want to short-circuit if an agent is divisible, but I also want to split up the work. So I decide to make a check-divisible-by-agents that will chain along each agent and stop when it finds an agent that divides into the prime. I start two chains each with half of the agents and wait -- this will mean that if one is done, the other won't know and it will keep going (which is bad). I press on, here's the chain

  (defn check-divisible-by-agents [agents x]
    (if (not= (count agents) 0)
      (! (first agents)
        (fn [a y rest-agents]
          (let [newstate (check-divisible a y)]
            (if (:notdivisible newstate)
              (check-divisible-by-agents rest-agents y))
            newstate))
        x (rest agents))))

So, I run a function on the first agent in the list (with !) that sets the new state to whether it was divisible. If it's not divisible then I recursively call the check on the rest of the agents.

I'm not getting primes yet -- but it's close. I don't think await works like I think it does -- even though I am chaining ! inside of !, I think that doesn't count for await. Here's is next-prime rewritten.

  (defn next-prime [agents ints]
    (let [x (first ints)]
      (check-divisible-by-agents (take-nth 2 agents) x)
      (check-divisible-by-agents (take-nth 2 (drop 1 agents)) x)
      (if (every?
          (fn [a]
            (await a)
            (or (not= (:last @a) x)
              (:notdivisible @a)))
        agents)
          x
          (next-prime agents (rest ints)))))


Ok, I get primes now

To tell the truth, I can't believe I got this working at all, but running (time) on the serial and parallel versions give me this:

sequential: "Elapsed time: 9.517 msecs"
parallel: "Elapsed time: 251.025 msecs"

So, obviously, this is not a good way to do this. I don't know yet if it's because I'm using too many agents or the extra work I'm doing because I'm not short-circuiting. Anyway, that's enough for now.

Update: After a night's rest, I think I know what's going on with the other thread in the sequential version. My guess is the GC. The only other alternative I can come up with is that under the hoods clojure parallelizes some things I'm using. I started a thread on parallelizing prime sieves on the clojure google group.

(clojure day (March 20) in Northampton, MA is coming soon)

20 Days of Clojure: Day 10

Woo hoo -- half way through. Well, not yet, but at the end of this entry.

Yesterday, we saw sequences. Now, here's a way to use them: the Prime Sieve of Eratosthenes. Here it is in Scheme from the SICP lectures:

 (define (sieve s)
  (cons-stream (head s)
   (sieve (filter
    (lambda (x)
     (not
      (divisible? x (head s))))
    (tail s)))))

 (define primes
  (sieve (integers-from 2)))

And here is a complete clojure program (all undefined functions are defined as part of the seq interface)

 (defn sieve [s]
  (lazy-cons (first s)
    (sieve (filter
     (fn [x]
      (not
       (= (rem x (first s)) 0)))
     (rest s)))))

 (defn primes []
  (sieve (iterate inc 2)))

 (prn (take 10 (primes)))

Here's how it works. The function sieve is a sequence and takes a sequence (in this case we pass in all numbers greater than or equal to two).

Working from the inside, this filter:

    (filter
     (fn [x]
      (not
       (= (rem x (first s)) 0)))
     (rest s))

Filters the rest of the passed in stream based on whether the items are divisible by the first, so only the ones not divisible are passed. So for (iterate inc 2) we get a stream of all numbers not divsible by two.

The first of that stream is 3, and it is passed into sieve, so the resulting stream is all those not divisible by two (already filtered) or three. Once the stream is returned (lazily), the heads are lazy-cons'd on (that means that you get a stream with a given first and the rest set to the given stream). The first head was 2, the second was 3, the next one is 5 (the first number not divisible by 2 and 3), and so on.

(clojure day (March 20) in Northampton, MA is coming soon)

Update: It may look like sieve is an infinitely recursive function, but it's not. lazy-cons is a macro that doesn't evaluate its argument until rest is called on its return value. I haven't looked at the implementation, but I imagine its implemented similarly to how cons-stream in the SICP lecture is.

Update: This prime sieve has been corrected here.

20 Days of Clojure: Day 9

News from Rich on a optimization he found for vectors:

I re-read Okasaki's paper yesterday and woke up today with an idea for speeding up the Clojure vector conj and pop. They are now 2-5x fasterthan they were. The trick is to keep the tail of the vector out of the tree until it is a full (32-element) node. Note that this greatlyspeeds up vector construction, which typically consists of repeatedly conj'ing onto [].

Today, I'm going to look at sequences. I think it's helpful to go back to the SICP -- this time, the lecture on stream machines:



This is just a small part of the whole
SICP stream machine lecture (and Streams Part II).

Sequences in clojure are lazy like stream as described in the video, and if you watch the whole lecture, you will see a rich stream library built up that is present in clojure.

This means that many operations can be incredibly efficient, but seem like they are operating on gigantic data structures at once (since my day job is image processing, this is especially intriguing to me -- maybe more on that later). To really see this, you need to play with infinite sequences. In clojure, it's easy to create an infinite sequence with cycle -- (cycle [1 2 3]) returns a sequence which continuously yields 1, 2, 3, 1, 2, 3, etc. If you use sequence operations on it, you can get a feel for how laziness works because it's obvious that these functions could not possibly be consuming the whole sequence. Here's a clojure Repl script.

user=> (def cyc123 (cycle [1 2 3]))
#<Var: user/cyc123>
user=> (every? (fn [x] (= 1 x)) cyc123)
false
user=> (take 5 cyc123)
(1 2 3 1 2)
user=> (take 5 (filter (fn [x] (= x 1)) cyc123))
(1 1 1 1 1)
user=> (take 5 (drop 1 cyc123))
(2 3 1 2 3)
user=> (take 5 (interleave cyc123 cyc123))
(1 1 2 2 3)
user=> (take 5 (take-nth 3 cyc123))
(1 1 1 1 1)
user=> (take 10 (for [x cyc123] (< x 3) x))
(1 2 1 2 1 2 1 2 1 2)
user=> (mapcat (fn [x y] (list x y)) cyc123 '(10 11 12))
(1 10 2 11 3 12)

More on this tomorrow.

20 Days of Clojure: Day 8

Rich Hickey comments on yesterday's post on the clojure google group:

An interesting point of comparison with purely-functional solutions such as that which Okasaki writes about is that the Clojure vector can be log32N specifically because I can use mutation - when cloning, i.e. the tree is log32N deep and a node can be copied in O(1) time due to arraycopy + indexed array set. And yes, the vectors are optimized for lookup - in practice, log32N blows logN out of the water.

Exactly -- PersistentVector's internals are not purely functional. New objects (during construction only) make an array copy of Object arrays from existing vectors, then make changes to the copied array before returning. As Rich points out, this makes some operations O(1). And for comparison, a clojure vector of 32k elements is only 2 levels deep, but if implemented with a log2N structure would be 15 deep (with average access in maybe 7 operations). Since access by index is a very common operation, you can see that this would make a big difference.

Today, I want to look at PersistentHashMap.java which is the implementation of clojure hash-maps. There are actually four implementations of the map interface (hash-map, sorted-map, struct-map, and array-map), but hash-map is what the reader will give you for when you build up a map with {}.

A comment on the top of the source file says that it's a persistent version of
Phil Bagwell's Hash Array Mapped Trie. According to that paper, it looks like the bits of hashes of the key are used to traverse a trie. Let's see how it's done in clojure.

The first thing I like to look at is how lookup is done, that gives me a sense of the structure. So, to start with, here's entryAt:

 public IMapEntry entryAt(Object key){
  return root.find(RT.hash(key), key);
 }

Simple enough -- it uses the root node to start the traversal. The root node is an INode, and clojure uses polymorphism to get different behavior for different nodes. There are five types of nodes,
EmptyNode, LeafNode, HashCollisionNode, BitmapIndexedNode, and FullNode. An empty hash-map has its root set to an instance of EmptyNode, which is an implementation of the Null Object design pattern, and find() on it returns null (by the way, the entire structure is a Composite design pattern, and by using each node type to handle part of traversal, I guess he's also using a Chain of Responsibility -- I mention this because there are many OO design patterns used in the Java implementation of clojure -- some by name, so if you are going to look at the code, you'll want to be familiar with these patterns).

If I assoc() onto a hash-map, root.assoc() is called which is supposed to return a new root. EmptyNode.assoc() simply creates a
LeafNode with the hash, key and value, so a map with one key has its root set to a LeafNode.

If I call find() on a
LeafNode, that's pretty simple too. If this is the node I am looking for, it returns this, otherwise null.

If I assoc onto a
LeafNode, here is what can happen:
  1. If the key and value match the LeafNode, then it is just returned -- nothing is created since you haven't changed anything.
  2. If just the key is the same as this node, a new LeafNode is created with the same key and the new value and returned (remember, this is a persistent data-structure, so we do not change the LeafNode, we create new nodes to build up a new hash-map).
  3. If the hash is the same, but the key is different, then a new HashCollisionNode is created that holds onto this LeafNode and a new one that has the same key and the new value. HashCollisionNodes simply hold onto an array of LeafNodes whose keys have the same hash.
  4. If the new key doesn't have the same hash (the common case), then a BitmapIndexedNode will be created. At first it contains just the old LeafNode in its array of nodes, and then assoc() is called on it with the new key and value.
Case #4 is where we start to build up the trie. The hash of the key is going to be used to traverse the trie later, and a BitmapIndexedNode knows if it contains the LeafNode you are looking for. A BitmapIndexedNode has a bitmap of 32 bits. The bit corresponding to the bottom five bits of the hash of the LeafNode is turned on to indicate that the LeafNode is to be found in this branch (it can have 32 branches). For example if the bottom five bits of the key hash are 01100, then the 12th bit of the bitmap is turned on, and the node is put at this branch.

If the bitmap bit for the new leaf node is off, then a new
BitmapIndexedNode is created with a new bitmap with the new bit turned on, and the node is put in its array which will be one bigger than the one we started with. If we kept doing this and hit each bit, the 32nd assoc would result in a FullNode -- which is a simple case of a BitmapIndexedNode, except it knows that if it had a bitmap, all bits would be set. Again, remember, we create a new node because we never change nodes in a persistent data structure.

If the new leaf node has a bottom five bits that match a node already in this
BitmapIndexedNode (the corresponding bit is already turned on), then we assoc() this key/value onto the node with the same five bits.

I haven't mentioned this yet, but assoc takes a shift argument -- so far, that has always been 0 and that meant to use the bottom five bits (use a 0 shift) of the hash in any
BitmapIndexedNodes. On this assoc call, we're now going to pass in (shift+5) so that the next 5 bits will be used on any BitmapIndexedNodes created under here. Later when we traverse, we'll use the bottom five bits on the first BitmapIndexedNode we see, then the next five bits on the next one and so on. When we've used up all 32 bits of the hash, we should be at our LeafNode or a HashCollisionNode (meaning all 32 bits matched).

So here's find():
  1. If the node is an EmptyNode, it returns null.
  2. If the node is a LeafNode, it returns itself if the key matches, otherwise null.
  3. If the node is a HashCollisionNode, it calls find() on each LeafNode it holds, if any return themselves, then that is returned, otherwise null.
  4. If the node is a BitmapIndexedNode, it first checks to see if it has a node where the five bits being used matches and then calls find() on it, if it doesn't, it returns null.
  5. If the node is a FullNode, it acts like a BitmapIndexedNode with all bits of the bitmap set, so it just calls find() on the node corresponding the five bits in the hash at its level.

The data structure is persistent because it never changes constructed nodes. During an assoc, all non-traversed paths will be shared, and only the path that changed will be new.

I should also mention that
FullNode is purely an optimization -- a BitmapIndexedNode could be used in its place, but it saves some branch calls to have them sprinkled throughout the map. Also, using a bitmap in BitmapIndexedNode is also an optimization. An INode array of 32 elements could have always been constructed with null signifying that there was no nodes in that branch, but that would be very wasteful in space, and so by using the bitmap, BitmapIndexedNodes have an INode array of the exact length they need.

20 Days of Clojure: Day 7

Ok, I'm going to try to describe how vectors are implemented in clojure. The implementation is in:

   src/jvm/clojure/lang/PersistentVector.java

if you want to see the code. I'm going to try to go through the important bits here.

The PersistentVector object has three members: two ints named cnt and shift, and an Object[] named root. A simple PersistentVector built from this line:

   (def v1 [1 2 3])

would look like this (using curly braces for root because it is a Java array):

   cnt: 3
   shift: 0
   root: { 1, 2, 3 }

cnt is the count of elements and (count v1) simply returns it and is O(1) -- I'll explain shift later, and root is the object array. When shift is 0, this line:

   (nth v1 1)

Just simply resolves to root[1], and returns 2, and in this simple case is also O(1). If I do this:

   (def v2 (assoc v1 1 4))

which returns a new vector, but with the value at index 1 set to 4, you get another PersistentVector that looks like this:

   cnt: 3
   shift: 0
   root: { 1, 4, 3 }

The 1 and the 3 are shared between the v1 and v3 array. If I do this:

   (conj v1 5)

I'll get yet another PersistentVector that looks like this:

   cnt: 4
   shift: 0
   root: { 1, 2, 3, 5 }

with the 1, 2, and 3 shared with v1 (and the 1 and 3 shared with v2). This is all very simple until, you conj onto a vector of 32 elements. When the root array has 32 elements, then adding one more element (33) returns a new PeristentVector that looks like this (assume the starting array had 1, 2, 3, ... 32)

   cnt: 33
   shift: 5
   root: { {1, 2, 3, ..., 32 }, { 33 } }

Root is a Java Object[] with the 0th element set to the Object[] from the input vector to conj (not a clone, but the actual one), and the next element is an array of just the new value. If I conj onto that, I get a new Vector:

   cnt: 34
   shift: 5
   root: { {1, 2, 3, ..., 32 }, { 33, 34 } }

Now, I can explain shift. When a method that takes an index is called, a bit shift is done on it to determine how many levels deep in the structure we need to go to get to the part of the datastructure that has the element at that index. For instance, here is the main Java for nth(i):

      Object[] arr = root;
      for(int level = shift; level > 0; level -= 5)
         arr = (Object[]) arr[(i >>> level) & 0x01f];
      return arr[i & 0x01f];

so, when i < 32, then i >>> level is 0, and arr will be root[0] (the array of 32 elements). Then we return arr[i & 0x01f] (which is i % 32), to get the ith element in that array.

When i == 32, then (i >>> level) is 1, arr is root[1], and then we return arr[i%32] which is the 0th element. Now, if I do an assoc to set the 0th element to 100, I get this PersistentVector:

   cnt: 34
   shift: 5
   root: { {100, 2, 3, 4, ..., 32 }, { 33, 34 } }

assoc calls a recursive function (doAssoc). First it clones root so that the new root is an array of two elements, each an object array. Then it determines that index 0 is in the 0th branch and does an assoc on that, decrementing the shift by 5 and setting the index to (index % 32). This recursive call clones the array at root[0]. Since shift is now 0, it is at the base case of the recursion, and so it sets root[0][0] to 100. All of the numbers and the entire array at root[1] is shared with the starting vector. Here is the Java code for that (doAssoc is called with the arguments shift, root, the index, and the new value) -- the return is the new root:

   private static Object[] doAssoc(int level, Object[] arr, int i, Object val){
      Object[] ret = arr.clone();
      if(level == 0)
         {
         ret[i & 0x01f] = val;
         }
      else
         {
         int subidx = (i >>> level) & 0x01f;
         ret[subidx] = doAssoc(level - 5, (Object[]) arr[subidx], i, val);         }
      return ret;
   }

By now, you might have realized that we are building up a tree. If I keep conjing on numbers in order starting at 1, eventually I will conj on 65. If so, I get this

   cnt: 65
   shift: 5
   root: { {1, 2, 3, 4, ..., 32 }, { 33, 34, 35, ... 64 }, { 65 } }

Graphically, it looks like this (for 1..32 elements)

+-------------+
|{1,2,3,...32}|
+-------------+


and then for up to 322, it looks like this:

             +-------+
             | {   } |
             +--+-+--+
                | |
       +--------+ +----------+
       |                     |
+------+------+     +--------+-------+
|{1,2,3,...32}|     |{33,34,35,...64}|
+-------------+     +----------------+


And, we get another level at 322+1 (1025) elements, and another at 32k+1. This explains how index access is O(log32N). Another important point, is that not only is a PersistentVector immutable from an interface perspective, but it's internally immutable, and each individual Object[] that makes up the tree is never changed in a constructed vector. If it needs to change in a new vector, it is cloned and the changed clone is used in the new PersistentVector -- the benefit of this is that the vector never needs to lock. This may seem like a small point, but I believe that this is not a requirement of Persistence -- but this extra work makes vector very suitable for concurrent applications.

This implementation is different from the one described in the Okasaki paper from yesterday. Rich appears to be trading away O(1) consing for better index performance, which was O(log2N) in that paper.

(clojure day (March 20) in Northampton, MA is coming soon)

Update: Made a few corrections

20 Days of Clojure: Day 6

So far, I've been working on the "clojure is a dialect of Lisp" part of the description of clojure. One of the things Lisps allow is purely functional programming, but to enforce it you need immutable datatypes. One of the things that bothered me as a C programmer, is that I could never understand how it worked for anything more complicated than a string. Then I saw persistent data-structures. Their existence in clojure is what made me think that purely functional programming could work in real programs.

The clojure page on data structures, but more importantly check out clojure TV on collections.

If you're interested in how these data structures are implemented, you want to read Okasaki. I would start here with his description of a persistent random access list, which I think is one of the more accessible.

Tomorrow, I'm going to try to expand on this by describing how vectors are implemented in clojure.

20 Days of Clojure: Day 5

This SICP lecture shows some of the power of Lisp by building a symbolic differentiator in a very small amount of code, and is also the first one to introduce the idea that code is data and data is code (watch from the beginning to 29:00 -- at 13:00 he starts talking about using Lisp as the data format):



Here is the first symbolic differentiator he writes (but, in clojure)

    (defn atom? [x]
        (not (instance? clojure.lang.IPersistentCollection x)))

    (defn const? [expr val] (and (atom? expr) (not= expr val)))
    (defn same-var? [expr val] (and (atom? expr) (= expr val)))
    (defn arith? [expr sym] (and (not (atom? expr)) (= (first expr) sym)))
    (defn sum? [expr] (arith? expr '+))
    (defn prod? [expr] (arith? expr '*))

    (defn a1 [expr] (frest expr))
    (defn a2 [expr] (first (rrest expr)))
    (defn m1 [expr] (frest expr))
    (defn m2 [expr] (first (rrest expr)))

    (defn deriv [expr val] (
        cond
            (const? expr val) 0
            (same-var? expr val) 1
            (sum? expr)
                (list '+ (deriv (a1 expr) val) (deriv (a2 expr) val))
            (prod? expr)
                (list '+ (list '* (m1 expr) (deriv (m2 expr) val))
                    (list '* (m2 expr) (deriv (m1 expr) val))
                )
            (:else) nil
        )
    )

The things to note:
  1. Quote syntax, which uses '
  2. first and rest which are car and cdr. Clojure also supports combinations like frest (cadr) and rrest (cddr)
  3. In clojure, extra parens are usually not necessary -- look at how cond is implemented to use variable arguments rather than a single list argument (which would require extra parens)
  4. My first usage of Java interop (using instance? to find out if an expression is atomic or a collection)

20 Days of Clojure: Day 4

Continuing from yesterday, we made our first function that takes a function and returns a function. Go to 43:00 in this video to see why (you can stop at 54:00):



In this example, we take the first-order functions idea further and define a derivative function (that takes and returns functions) and a newton function (which takes a function and uses fixed-point). Here it is in clojure (you can find fixed-point and other functions in yesterday's post):

    (defn square [x] (* x x))

    (defn deriv [f]
        (let [dx 0.000001]
            (fn [x] (/ (- (f (+ x dx)) (f x)) dx))))

    (defn newton [f guess]
        (let [df (deriv f)]
            (fixed-point (fn [x] (- x (/ (f x) (df x)))) guess)))

    (defn sqrt [x] (newton (fn [y] (- x (square y))) 1))

(clojure day (March 20) in Northampton, MA is coming soon)

20 Days of Clojure: Day 3

Ok, so first we made a very hard-coded sqrt, then we improved it by adding the concept of fixed points, which made the algorithm easier to see in the code. Here's how we can improve it even further. Go to 32:30 in this video (stop at the break at 43:00):



First we define an average-damp function. This function takes a function as its argument and returns another function which takes a number and returns the average of that number and result of calling the function argument on the number. It's easier to say that in clojure:

    (defn average-damp [f]
        (fn [x] (average x (f x))))

f is a function, average-damp creates a new function that takes x (a number) and that new function averages x and f(x).

Now that I have that, I define sqrt like this (you can find average and fixed-point in yesterday's post):

    (defn sqrt [x] (fixed-point (average-damp (fn [y] (/ x y))) 1))

Tomorrow, we'll use this to implement another way of finding sqrt.

20 Days of Clojure: Day 2

Yesterday, we built a simple sqrt using Heron of Alexandria's algorithm. The problem with the code is that it had some generically useful ideas, but it was all hard-coded to only work with sqrt. Also, it was not clear what exactly the algorithm was.

In the SICP lecture they fix it by passing functions around since Lisp treats functions as first-class data-types. Here's the fix explained (23:30 in the video, stop at 32:30):



See the video for the Scheme implementation. Here it is in clojure.

    (defn sum ([] 0)
        ([x] x)
        ([x & more] (+ x (apply sum more) )))

    (defn average
        ([] 0)
        ([x] x)
        ([x & more]
            (/ (+ x (apply sum more)) (+ 1 (count more)))))

    (defn abs [x] (if (< x 0) (- x) x))

    (defn close-enough? [u v]
        (let [tolerance 0.00001]
            (< (abs (- u v)) tolerance)))

    (defn fixed-point [f start]
        (loop [old start new (f start)]
            (if (close-enough? old new)
                new
                (recur new (f new)))))

    (defn sqrt [x] (fixed-point (fn [y] (average (/ x y) y)) 1 ))

    (prn (sqrt 9))

A fixed point of a function f is a value y such that f(y) = y. This improved code, which implements the same algorithm as before, uses a generic fixed-point function that takes a function and a guess and tries to find the fixed point by setting the next guess to f(guess). You can see it all in the video. The things in clojure you need to know:

  1. fn [arg1 arg2] is lambda (λ), so where fixed-point takes a function as the first argument, we create one on the fly to be (average (/ x y) y) where x is the argument to sqrt (closed around our function) and y is the argument defined by fn.
  2. Since clojure compiles to Java and uses Java calling conventions, it does not have automatic tail-recursion. Instead, you use the recur special operator which will create a constant space loop back to the last loop or function definition. Loop/recur is explained on the Functional Programming page of the clojure site (at the bottom).
But, even this can be improved, as we'll see tomorrow.

20 Days of Clojure: Day 1

Ok, this might be hard to keep up, but nothing ventured ...

Rich Hickey is going to be coming to Northampton on March 20th to talk about clojure. In preparation for that, I am going to blog every day in March about clojure -- then, hopefully we can post the results of his talk.

If you don't have clojure yet -- go get it. Start here to learn basic clojure syntax (there is also a download link). Clojure is trivial to get running and you definitely want to play with it a little to understand the rest of this. The quick description of clojure is that it's a dialect of Lisp that runs on the JVM. It has support for immutable and persistent data-structures and a concurrent programming model.

Today I'm going to start with some of the work I did to learn clojure through translating SICP lectures. In the first lecture, the first interesting program starts at 56:20 in this video:



Starting there presumes you know some basic Lisp syntax. This is the program he eventually writes to calculate the square root (in Scheme):

    (define (improve guess x)
        (average guess (/ x guess)))

    (define (good-enough? guess x)
        (< (abs (- (square guess) x)) .001))

    (define (try guess x)
        (if (good-enough? guess x)
            guess
            (try (improve guess x) x)))

    (define (sqrt x) (try 1 x))

I have left out implementations for square and average here. This code and the algorithm is explained in the video. Here it is in clojure (I defined average and square -- you could just use Java, but I am trying to keep it all in clojure for now)

    (defn sum ([] 0)
        ([x] x)
        ([x & more] (+ x (apply sum more) )))

    (defn average
        ([] 0)
        ([x] x)
        ([x & more]
            (/ (+ x (apply sum more)) (+ 1 (count more)))))

    (defn abs [x] (if (< x 0) (- x) x))

    (defn square [x] (* x x))

    (defn improve [guess x]
        (average guess (/ x guess)))

    (defn good-enough? [guess x]
        (< (abs (- (square guess) x)) 0.001))

    (defn try-sqrt [guess x]
        (if (good-enough? guess x)
            guess
            (try-sqrt (improve guess x) x)))

    (defn sqrt [x] (try-sqrt 1 x))

    (prn (sqrt 9))

If you take this and put it in a file named sqrt.clj, you can run it like this (after downloading clojure):

    java -cp clojure.jar clojure.lang.Script sqrt.clj

And it will print out:

    65537/21845

Explanation of the differences between the two examples:
  1. clojure uses defn to define functions and puts arguments in []
  2. sum and average implementations show how clojure can overload on arity (number of arguments)
  3. try is reserved for exception handling, so I renamed the try function to try-sqrt
  4. the answer is in clojure's rational number format, since all of the computation was adding and averaging integers and rationals, the result is a rational number and clojure preserves that.
Tomorrow, I'll improve this so to make it more generic. A lot of what I used in this can is explained in the clojure Functional Programming support page.

Book Review: Collective Intelligence, Part I

51AnWLR89xL._AA240_
I'm at about the halfway point in Programming Collective Intelligence by Toby Segaran, and even if the second half was blank, I'd be telling everyone to read this book. It's been a while since I've read a technical book this good. It excels because of a combination of three factors:
  1. Useful algorithms
  2. Shown in real-life situations
  3. With real data that you can get yourself
The basic idea of the book is to show different ways of using user-generated datasets to extract information and make decisions. Each chapter tackles a different problem and shows a few ways to approach it. All of the examples are in Python, but even if you don't know Python, you should be fine if you know any structured language. Practically every example either uses some dataset that he tells you how to get, or a public API that is free to access.

This book is the first I've read that relies on the current maturity level of the web. A few years ago, it wouldn't have been possible to provide so many relevant examples. In a few years, the API's might be out-of-date -- but don't let that stop you -- the algorithms are timeless.

The choice of Python is a good one for a few reasons. Firstly, it's pretty clear to read even if you don't know it. Secondly, list-comprehensions are a pretty good way to think about data crunching (which these algorithms do a lot), and, finally, there are many freely available Python libraries to deal with the public APIs he uses.

To give you an idea of Toby's approach, I'm going to run through my favorite chapter so far, the one on Optimization. The quick description of the problem domain should be familiar to every programmer -- solve NP complete problems like the traveling salesperson. Basically, Optimization techniques can be used in situations where (1) you can't analytically solve the problem (2) the solution set is too large to exhaustively search (3) you can express the cost of a possible solution such that better solutions have a lower cost than worse ones and (4) solutions that are "near" each other have similar values for cost.

Toby's first example is how a family can choose flights so that they arrive at the same airport from all over the country at nearly the same time and depart for home a few days later at nearly the same time. The cost function is a combination of the actual costs of the tickets and the difference in time between the earliest and latest flights. He goes on to provide a dataset of flights, describes setting up optimization problems in general, and shows the python code to implement the cost function.

Then he introduces a few optimization algorithms, implements them, and discusses their relative strengths and weaknesses. The next section applies these techniques to real data on Kayak.com using the standard Python DOM parser, minidom. The final section shows how to apply these techniques to other problems like pairing students in a dorm and network visualization. Other chapters get a similarly exhaustive treatment.

I really hope that this is the future of technical books.

API UI

I saw this article on Reddit that applies principles from human-factors design to API design. I think that's a great idea, but this suggestion sounds weird to me.

Now we could use progressive disclosure to help reduce the complexity of that JButton class: put the expert stuff in an object returned by a getExpertKnobs() method, and the graphics subsystem hooks in an object returned by a getIntegrationHooks() method, and you would be left with a button API that had just a handful of methods—the basic methods we all need.

This brings the number of methods in a button from about 100 to about 10, which sounds great until you try to subclass. Overriding a method that's in the object returned by the getExpertKnobs method would be a little tricky. It's likely that you don't even know the real class of that object because those methods are likely to return references to interfaces, plus the component code is in charge of constructing the objects, so there would have to be a way plug into that too, so that your subclass is constructed. I agree that the interface would be much simpler for most users, but it is much more complex for subclassers.

There's also the issue of what would be in the expertKnobs object. Some of the methods in JButton that would go in there actually come from JComponent or Component. In fact, as we progressively go down the class hierarchy, we are adding methods into the class that might belong together, which means that there is probably an expertKnobs class for each component class.  But, the getExpertKnobs() method is probably introduced early in the hierarchy -- what does it return? Will I have to downcast to get to the button's expert knobs?

I don't have a solution -- but these issues probably drove the design of JButton more than the idea that they just sloppily added a 100 methods without giving a thought to the complexity of the class.

But the point is taken, and I like the idea of using hard won knowledge from other domains in software design. 

Another recent blog on this topic was Steve Hawley's Principle of Least Astonishment  essay -- the last topic, on layering APIs, has his approach to this problem.

UML Cheatsheet

Here is a download for a UML Cheatsheet I made for my UML presentation this week. It's released under this creative commons license. When I'm done with the slides, I will post them here as well.

Update: Here are the slides:

WinFX Generation

Been looking around WinFX and specifically XAML recently. Not sure I think writing XML is a great way to write a GUI (but billions of web pages clearly prove me wrong). My first impression is that one of the benefits is going to be automated creation of XAML -- not from tools, but from our own programs.

It's been 10 years since HTML was a household word and I still feel like it's easier to hand write it than use a WYSIWYG tool. I also never really liked using dialog editors. Even good ones like Delphi have too many limitations, especially when it comes to layout. So, I'm not expecting much from the tool support for XAML. But since it's just a text format, it should be pretty easy to generate from a simpler format (or from models). I'm looking forward to playing around with that idea.

Presentation Links

Here are some links I've collected on giving presentations. I have been revisiting them lately in preparation for my talk next month.

Apple XCode CPlusTest Port

C++ (and Cocoa) unit testing is built into XCode on OSX. I am writing an application right now that I plan to port to Windows. When I wanted to see how portable my code was, I decided to get all of the unit tests passed on Windows.

Unfortunately, XCode does not use CppUnit, which is already portable. Here is some code I wrote that you can use to run your XCode generated unit tests on Windows.

UML Primer at WMass Dot Net Users Group

I'll be giving the next presentation at the Western Mass Dot Net Users Group on the basics of UML. Details:

Presentation:
Get an introduction to the Unified Modeling Language. Learn the basics of Class, Sequence, and Use Case diagrams.  The emphasis will be on UML sketching, but UML code generation will be discussed.

This session will be platform independent.  Please bring an open mind, your ideas, and experience to share with the group.  Software developers, project managers, students and anyone interested in software development on any platform are encouraged to attend.

Meeting Details:
Tuesday, February 7 at 6:00 PM
Fazzi Associates, Inc.
243 King St. Suite 236, Northampton

Servware

Despite his rants against C++, Stevey's Drunken Blog Rants, is one of my favorite new blogs. Actually, since it hasn't been updated in about nine months, and was written mostly in 2004, it's not really new. It was rediscovered by social bookmarking sites (I found it on reddit.com). I was reminded of it while thinking about resuscitating this site.

There are a lot of great articles, but this one about server-side service software is one of my favorites (given that I've spent a lot of time thinking about the subject). Another great article on this topic is Paul Graham's The Other Road Ahead.

Western Mass .Net Users Group

The old .NET SAPs is rechristened under the name: Western Mass .NET Users Group  (and a spiffy new website). Tomorrow's meeting topic is XML.

Speaking of XML, I saw this recently: Don't Invent XML Languages by Tim Bray. In addition to pointing to this great list of XML Languages, he introduces "The Big Five" XML Languages and where they should be used.

Suppose you’ve got an application where a markup language would be handy, and you’re wisely resisting the temptation to build your own. What are you going to do, then?

The smartest thing to do would be to find a way to use one of the perfectly good markup languages that have been designed and debugged and have validators and authoring software and parsers and generators and all that other good stuff. Here’s a radical idea: don’t even think of making your own language until you’re sure that you can’t do the job using one of the Big Five: XHTML, DocBook, ODF, UBL, and Atom.

The Chameleon Developer

Chameleon camouflaged against some leaves

I was talking to a developer friend today about developers who don't get the importance of communication. Look through any want-ads for developers and right after the alphabet soup of technologies you have to know is the same requirement—must have “excellent written and verbal communication skills”. You'd be hard pressed to find a programmer job without those requirements.

But, expressing yourself clearly is only half of communication. The more important aspect is listening to and understanding other people. This is what separates good and great developers. And the most important skill in understanding is empathy—not only understanding someone, but being able to be them.

To be a successful developer, you have to learn more than just the software aspect of your business. Developers who understand this are more valuable and valued more. If you want to do this effectively, you have to think like a saleperson, a marketer, a CEO, your customers, the press, etc. You have to think like them because they aren't all going to try to think like you (but if they're going to try, by all means, help them).

Another part of empathy is thinking about how they will interpret your words. For instance, avoid rhetorical tics like “think about it” and “you're not seeing the big picture”. Not only are they annoying, but it puts people on the defensive.

If you understand—not only what someone is saying, but why—you'll have a much better chance of expressing yourself. So, the next time you have a communication gap with a non-developer, stop talking and ask yourself how much you even understand about them.

Software Addins as a Business

Animated demo of Highlighter
I've written a couple of add-ins for CityDesk, and of course, it's tempting to think about trying to make some money off of them. I'm already a partner in another add-ins business (Spheresoft, which makes MS Office addins -- check out Highlighter if you view realtime data in Excel), so I know a little about what goes into the development, support, marketing, sales, licensing, etc. of doing this kind of thing.

Add-ins are a great business for small companies, because they are much easier to implement than a full-blown product and the software you are adding on to will help with marketing. Another add-in that Spheresoft will be publishing soon, changes the nature of spreadsheet modeling, but we'd never want to compete with Excel on thousands of other tiny features, so it makes more sense for this to be an add-in.

But, with any business, you have to decide how much effort you are willing to put in and what you are expecting back. The market of people who have Office is huge and I have a reasonable expectation of making a good enough return on Office add-ins to warrant the time spent writing and supporting them. 

I'm not sure I can do the same for add-ins for Fogcreek products. I'm glad that people find them useful, and I do them because I need them as well. I would never make enough money to recoup support costs I would have to undertake if I charged for them.

So, they will remain free and pretty much unsupported for now.

Code Generation from Requirements

This article on Domain Specific Languages from Martin Fowler got me thinking on code generation in general. A project I'm working on right now has many opportunities for generating code—not from a domain specific language, but from the requirements documentation and other source information.

I am working on a project now with a very detailed requirements document. For instance, practically all of the strings shown to the user are in there. Since the project uses string resources, there's no reason to manually copy and paste—the string resource is easily generated from the document.

The project also has a lot of images.  And images are associated with each other (some images are the small version of another, or meant to be the same object but from another angle). Luckily, the artist used a directory and naming convention that makes these associations easy to figure out, and generate an XML resource descriptor and object building code from.

Tying together this information (each image has a string descriptor) is also easy, given the detail in the document and the naming convention.

Actual logical code generation is almost possible as well, as the project is a kind of complicated finite state machine. In this case the document doesn't go far enough to be a source. A Domain Specific Language might help here, and leaves me with source that might be able to be maintained more easily that a Java representation.

This brings up a good lesson for those writing requirements, even if you don't know how to program. If you follow conventions closely, it might be a good starting point for generating code.

Furl

I've been playing around with the Furl Beta. It's looking pretty good. Essentially, it's a personal online cache of web pages that you can view, search, and share. Better than bookmarking because you can search the contents of it later, and it's accessible from any browser. You can even provide RSS feeds to your Furl content.  Collaborative research on the internet just got a whole lot easier.

One drawback, there does not appear to be any way to make the cache private, so I can't possibly use it, because there is no way I want a public database of every site I find interesting or useful. I definitely would share some of it with some people, but totally public is not an option.

Perhaps there will be a pay service for a private space.

Update: My bad. Just mark the item private when you Furl it.  Duh.  Didn't see that on the Furl dialog and still don't see how to make a public item private (except to change its topic to personal, but then I lose the utility of topics). Anyway, now the biggest drawback is that the first dialog comes up at the wrong size and there is no scrollbar, so you can't see the buttons.

Informational Integrity

Any program that accepts user input will need to separate good input from bad, and to make sure little of the latter gets recorded.

So begins the CHECKS Pattern Language of Informational Integrity by Ward Cunningham. I read this article years ago and loved it—mostly because it jibed with my experience for writing usable business software. The theme of the language is to make sure that data entered by the user is correct, and accomplishes this with various kinds of validation and data representation techniques.

I find that two of the patterns are invaluable for usable user interfaces—Echo Back and Visible Implication. The section they appear in starts with:

A person reaches through a program's interface to manipulate the domain model. Although the interface is itself a program (an interface model and graphical machinery), its purpose is to enable the direct manipulation of the domain model as transparently as possible. The user interface is programmed to create the illusion of control in the mind of the user. To this end it must provide sufficient clues of the model's state so that sensible operation is the norm.

These ideas are so important to the usability of software and so difficult to implement in HTML based software, because there is little domain knowledge accessible from the client side. For instance, Echo Back requires that a field, once entered, is transformed into a domain object and injected into the model. It is then read back from the model and displayed as understood by the model. Visible Implication further requires that any cascading consequences are also displayed. This kind of interface is the norm in the Fat Client or Client-Server world, and mostly absent in the web-based software world. As a consequence, it is harder for users to enter data.

We've all had the experience of filling out a huge form on a web page and submitting it, only to be told that we've made a mistake and have to go back. That's Deferred Validation used in isolation of the other patterns, and was never considered a good UI decision until the web made it commonplace. And yes, I know that you can get part of the way there with javascript, but only simple validations are possible. Imagine correcting a pay date in a maintainable and configurable way as described in the example for Echo Back.

I appreciate the revolution that the web has brought us, but something better is needed for software with high usability or productivity requirements (as opposed to high learnability or deployability requirements). Philip Brittan describes an architecture we worked on that can recapture the connection between UI code and the domain model, while still maintaining the benefits of web based software.

HTML DBScript

If you want to add the results of Database queries to HTML files before they are published (and keep them as static pages), you can use my new utility, HTML DBScript. It will work on any HTML, but is designed to fit well with CityDesk's "Before Copy" publishing location feature.

It's in Beta right now, but well tested and has decent error reporting. Let me know if you have any comments by using the Contact Page.

Update: v1.0 is released. FogCreek is featuring a link to it on their CityDesk News Page.

ArgoUML

I've been playing around with ArgoUML, an open-source UML modeling and code generation program. So far, the major drawback for me is the non-standard UI elements because of Swing—I really hate Swing File dialogs—even the “Windows” look and feel ones.

My first impressions are very good. I have spent seven years with Rational Rose and like it very much, but it is out of my price range right now. I haven't used any other products and I am only interested in packages with some kind of code generation and deep UML knowledge (that rules out generic drawing packages). And, price is an issue. So far, Argo meets all of my basic requirements.

One of the features I think I will like is what they call cognitive support. It manifests itself in a list of suggestions about your model. For instance, in playing around, I have a class with no non-static members or associations. In the to-do list, I see a suggestion to make this a Singleton. Argo also helped me understand some of its conventions by suggesting names for the packages I made. In general, Argo understands UML, and does a great job of helping you create it.

The code generation is in the early stages, and in inadequate for use on a project unless you are using an editor that will seamlessly fix the code into your standard tabbing style. Even when I use UML as a sketch, I like to generate the code as a sanity check to make sure the UML says what I think it says—it's good enough for that.

Large multi-person projects might find the lack of external packages (packages each stored in their own file) a deal breaker. Rational has very good support for this, and it's essential when you try to use UML as a programming language, since having one big file becomes a impediment to collaborative development. This extends to property sets as well, which I don't see how to share between models.

Here's my first model:

UML model of a bowling scorer

Created to mimic the design in this NUnit presentation.

UML in an Agile World

Although UML is usually not associated with agile processes, there are two ways I think it can play a major role in an agile development project.

The first is a full adoption of a UML toolset, including full round-tripping through the entire implementation. In this case, UML becomes part of your programming language. It maps very well onto the constructs of Object-Oriented implementation languages, and there is nothing about it that will stop you from doing all of the things you do in an agile process (Simple Design, Test First, etc.) The benefit is that you have a up-to-date UML document at the end of the implementation (why you might want this, I'll save for a future entry). I have done this with Rational Rose for C++ with great success, and if you want UML diagrams that actually match the implementation, this is the way to go.

The other use is to not even attempt to keep UML documents consistent with the implementation as it progresses, and just use UML to convey design ideas. This is more precise than a text description, and certainly a good supplement to one. Design Patterns by the GoF is a good example of a book that uses UML to convey design ideas. Martin Fowler's Patterns of Enterprise Application Architecture uses it extensively as well, and he says this about the UML in his books.

Most UML diagrams shown in books, such as mine, are sketches. Their emphasis is on selective communication rather than complete specification. Hence my sound-bite "comprehensiveness is the enemy of comprehensibility"


Unless UML is the implementation language (where completeness is essential), then there is no need to include every detail in the drawing. In fact, inclusion communicates importance in this context.

Unit Testing Old Code

If you are new to unit testing, but the source base is old, the sheer size of code to test will seem daunting. Here are some strategies for adding unit testing to old code.
  • Test code before refactoring or optimizing – The purpose of refactoring and optimizing is to make the code do the exact same thing but in a different way. It is perfectly suited to being unit tested since tests you write at the beginning of the change need to pass at the end.  And, the unit tests will let you make more aggressive changes.
  • Test code while learning what it does – When you're trying to learn a body of code, a unit test is a good way to check your assumptions about how the code works. This strategy also works well in training a new developer—explain the code by writing new tests for it together.
  • Before fixing a bug, write a failing test – This is the perfect time to add a test since reliably reproducing the bug is a good first step. The beauty of this is that now the bug won't return.
In general, I would not make a project out of writing unit tests for old code (except during the period where you are installing the xUnit framework and getting used to it). Unit testing works best when it isn't the end, but instead a means to accomplishing something else.

DotNetSaps: Unit Testing

It was my turn to present to the DotNetSaps. Here are the slides and code to score bowling games that I wrote during the presentation.  Get NUnit, FitNesse and look at XProgramming.com for more information on unit testing.

Note on the code: This code is not done. There are at least three bugs in it, but it passes all of the tests. This means that there are more tests to write.  Here are some that would fail:

public void testNonStrikeAfterDoubles()
{
  game.bowl(10);
  game.bowl(10);
  game.bowl(5);
  game.bowl(5);
  assertEquals(game.getScore(), 55);
}

public void testSimpleEndOfGame()
{
  for (int i = 0; i < 10; ++i) {
    game.bowl(0);
    game.bowl(0);
  }
  assert(game.getIsOver());
}

public void testEndOfGameStrikeInTenth()
{
  game = new BowlingGame();
  for (int i = 0; i < 10; ++i) {
    game.bowl(10);
  }
  assert(!game.getIsOver());
}

Not to mention scoring the tenth frame correctly. Fixing those tests and finding others is left as an exercise to the reader.

Nielsen's HomePage Usability Rules: A Self-Assessment

Homepage Usability book cover
Jakob Nielsen has been supporting his book, Homepage Usability: 50 Websites Deconstructed, with a few of articles on homepage design. I took the opportunity to make some changes to my own site after reading them.

The first article, Top Ten Web Design Mistakes of 2003, lists what's been annoying him recently. I did well on this one—I only needed to add some ALT text to my images (Tip #5 is Overly detailed ALT Text—oops, I actually didn't have any) and make pages stop linking to themselves (#10). The latter improvement is not directly supported in my CMS tool (CityDesk) and took a fair amount of scripting, but armed with CityDesk keywords and my keyword organizing utility, I got it done.

That article linked to The Ten Most Violated Homepage Design Guidelines. I didn't do too badly on that one either, but I decided to differently color visited links and make it more clear where you are on the site (#3). I rewrote my tag line to be more informative (#5) and made my Use a liquid layout that lets users adjust the homepage size), since that's a pet-peeve of mine.
Finally, I read Top Ten Guidelines for Homepage Usability. By this point, most of the tips were redundant. I haven't added a search yet, but I'll do that soon.

The book has over a hundred guidelines sure to improve any site.

The NeverLost UI—Design for the Disengaged User

Neverlost
Last week I had my first experience with a GPS based guidance system, the NeverLost from Hertz. I was struck by some of the good UI choices and want to highlight them here. For a more in-depth analysis, see this evaluation. I will limit this entry to my current blog topic of usability vs. learnability  as it applies to the disengaged user.

The designers of NeverLost certainly had to design for both usability and learnability. Few will have the time or inclination to read the manual once they get to their rental car, and many will be first time or occasional users. More importantly though, the NeverLost must be easy to use. I'm sure the designers of NeverLost understood the possibility of their device contributing to an accident, and let that be the overriding concern in all of their choices. The NeverLost is a good example of designing for disengaged users, that is, users that are doing something else while using your software. The principles of the NeverLost design have wide applicability (e.g. software for call centers or traders).

Here are some of the basic principles they have adhered to:
  1. Constraining choices rather than interrupting on errors: A good choice in any application, the NeverLost takes it to an extreme. Typing is hard (you choose letters by navigating over a pick-list), so the search interface only shows letters that will actually return a result. So, for example, if you pick the letters “H-O-L-I-D” in the Yellow Pages search, only the letter “A” appears on the pick list, because the “Holiday Inn” is the only item in their list that starts with those letters. Searches never fail to return a result, and you can correct errors immediately. Of course, a keyboard or a touchscreen might have made this easier, but they would also add to the cost.
  2. Speech rather than text: Essential for this kind of system since you really should not be looking at it while driving.
  3. The display is easy to glance at: While driving, the display is simply the major routes with the current one highlighted and the car showing the driving direction.  It's very easy to see that you are on the right track.  All other information is via speech.
  4. Anticipation rather than waiting for input:  We made two deviations from the directions. The first was missing our exit, and the NeverLost immediately calculated a new route and let us know what it was doing. The second was getting off at an exit to make a rest-stop—in that case, the NeverLost let us get back to the route ourselves without recalculating. Both choices were exactly what we wanted, so we didn't need to fiddle with the interface.
  5. Mistakes are easy to correct: The NeverLost keeps a list of the destinations you have input and lets you easily go back to them. A Cancel button lets you back out of all choices. Having these features lets the NeverLost stay out of your way most of the time, because it knows that you can revert to an old state when you need to.
All of these are good ideas for any interface, and if you imagine that your user is distracted, your user interfaces will be even easier to use when they aren't. Of course, a speech interface is not right for all applications, but sounds usually are, and can help to alert users when their attention is required.
When conducting usability tests, keep in mind the environment your users are in, and try to match it. For example, don't gather users to your pristine environment. Go to them if you want to see how they really use your software and the distractions they encounter.

Usable and Learnable

Usable interfaces are all about productivity for those that use your software every day. They feature a UI that anticipates you, but gets out of your way and ensures that your mistakes are recoverable.

Novices require more hand-holding. They may either be new to the application and not yet a power-user, or they may be an occasional user, or they may be watching a demo. In each of these cases, the application must be easy to learn and understand.

When you prepare your user profiles for a UI Design or a product requirements document, it may seem that these two user types are hopelessly irreconcilable, and sometimes you'll specify two completely separate UI's (e.g. wizards in addition to the “normal” UI). Sometimes this is appropriate, but on his blog, Philip Brittan discusses a product we worked on, where features supporting both types were implemented with the same UI.

Usability vs. Learnability

I just needed to look up something in User Interface Design for Programmers by Joel Spolsky and I was reminded of a great chapter on usability testing (only in the print version). One of the insights is the difference between “usability” and “learnability”. He revisited it in another post.

For casual users, learnability and simplicity are more important than usability and power. In that sentence, by “learnability,” I mean, the ability for novices to figure out how to get tasks done rapidly. By “usability,” I mean only the ability to do tasks in a convenient and ergonomic way without making mistakes and without needing to do repetitive tasks. A data entry system that minimizes keystrokes by prefilling things and automatically jumping from field to field is more usable for experienced users, but it's harder to learn because it behaves unexpectedly to a novice.

The book is worth picking up if you are planning a formal usability test, just to see what kind of things to look for, and what to expect to accomplish. Keep in mind, that if your testers are seeing the UI for the first time, you are testing learnability, not usability.

FitNesse and Requirements in XP

In XP, requirements gathering is an ongoing conversation validated with acceptance testing. It makes sense, therefore, that any document that results from gathering requirements be easy to edit and give explicit instructions for validation. FitNesse takes this idea to the extreme by using a Wiki front-end to make the requirements document the living, growing entity it needs to be, and it builds in FIT acceptance testing, so that tables representing acceptance tests can be added and run directly from the document.

To use FitNesse, developers add Fixtures to their projects using the FIT framework. These Fixtures can be are added to FitNesse using path and fixture directives. Each page can embed an acceptance test using syntax like the following:

|!-CalculatorColumnFixture-!|
|button|display()|
| |0|
|1|1|
|+|1|
|2|2|
|=|3|


This example tests a calculator by pressing its buttons. CalculatorColumnFixture is a class extending ColumnFixture in FIT.  It has a member variable called button and a method called display().  For each row, button is set to the value in the left column and the return value of display() is checked against the value in the right column. In addition to this table, any text describing the functionality of the calculator can appear on the page. Advanced Fixtures are available for more complex interfaces.

The beauty of FitNesse is that it is implemented as a web-server that serves up the Wiki with the requirements documentation and acceptance tests. It adds FIT directives to the normal Wiki editing syntax and includes version tracking for each page.

Combining the aspects of collaborative document growing and automated acceptance tests, FitNesse is a great addition to the requirements process.

Keyword Picker for CityDesk

I use CityDesk to manage this site. One of the features is that each page can be tagged with keywords. The problem is, however, that the keywords are typed into a TextBox, not chosen from a List. I am having problems remembering all of the keywords I have decided to use, but I want to be consistent, because I plan on making pages for each keyword when I have enough blog entries (to help organize the archive page).

I wrote a small program to help me do this.  If you have the same problem, check it out.

Update
: Cool. Five minutes after posting this, it's already in TK's CityDesk Help Reference—which, by the way, has some good stuff for CityDesk users.

FIT for Testing

Over at Martin Fowler's blog, I found a link to Framework for Integrated Test (FIT) from Ward Cunningham. I'm just starting to read it now, but I wanted to pass it along because it looks pretty good. Fowler writes:

But I wonder if a language designed for programming is really the right language for writing tests. The point about tests is that they operate by example. They don't try to cover how to handle any value, instead they describe specific scenarios and responses. I wonder if this implies a different kind of programming language is required. Perhaps this is the truly startling innovation in FIT.

More on this later.

Document Growing

If your documentation system is easy to use, add to and edit, you can grow your documentation. Esther Derby has an exampleWikis are particularly suited to this style of collaborative document growing.

Update: Martin Fowler wants us to be more precise with the word “refactoring”. Since the above article uses it in precisely the way he's talking about changing, I thought I'd add this note.

Fluid Communication

Sharing information in your organization is much easier if you use some kind of knowledge base software. For large enterprises, there are plenty of choices, most aimed at the help desk or call center, but they might be overkill and expensive if all you are trying to do is make a central repository of the bits of information in the collective mind of your company. Here are three low-cost ideas for implementing a simple knowledge repository.
  • News Group Software —  There are plenty of easy to install and use newsgroup applications. I've had good experiences with Snitz, which is free and full-featured.
    • Advantages: Familiar interface, every edit is marked with author, support for alerting via e-mail is common
    • Disadvantages: Knowledge is in discussion/serial format, hard to edit old entries, linking to other entries can be hard, requires ability to install software on the server, may not support attachments
  • Wiki Software — A Wiki is a website where every page is editable by the reader. The best known public example is the WikiPedia, but the concept started at the Portland Pattern Repository. It's a powerful idea, but depending on the exact software you use, it can be hard for some people to use. Here's a list of wiki implementations.
    • Advantages: Everything is editable, linking is easy, free implementations are available, pro versions track users and edits
    • Disadvantages: Can be hard to use, requires ability to install software on the server, may not support attachments
  • Content Management Software — CMS tools can be as expensive as KnowledgeBase tools, but for ease of use and quality of the resulting site, they cannot be beat. I use CityDesk for this site and others (Note: as of 2007, I use RapidWeaver). It averages about $100 a user for contributors and $300 for the site designer, but for small sites, a free version is available.
    • Advantages: Complete control of resulting site, linking is easy, everything is editable, very easy to use, attachments usually supported
    • Disadvantage: Must set up templates, edits not usually logged

For some knowledge bases, a combination of these ideas can work very well—a news group for requests and a Wiki or CMS for official information, or a Wiki for internal use and a CMS for customer facing pages that need to look pretty.

Book Review: VB for Testers

Book cover of VB for Testers
I was somewhat skeptical about Visual Basic for Testers [amazon] because I thought it was going to be focused on automated GUI testing. I have no interest in reinventing WinRunner or TestComplete as a giant list of SendKeys calls. Luckily, neither did Mary Romero Sweeney. She concludes the first chapter with the advice that “Visual Basic should not be considered a substitute for the major test tools but simply a powerful adjunct to them”. Earlier in the chapter, in a section titled Automated Software Testing in the Real World (page 8), she justifies the use of VB and other general-purpose languages for testing with this:

Using Visual Basic and other programming languages on a test project are some of the real-world techniques we end up using when our high-priced automated testing tools just can't get to that important information. This is not meant as a criticism of the automated test tools on the market today [...] However, by default, these tools can't possibly keep up with every need on every test project. Most testing organizations find that at some point they need to resort to using programming-experienced personnel to write code to supplement their testing.

I have to confess that had she not been clear about this goal, I may have abandoned the book here. Confident that this book was going to offer something to add to my arsenal of testing techniques, I read on.

The next few chapters are an introduction to VB focusing on the features that a tester would be interested in (getting data from a database, automating a COM object, manipulating the registry). Experienced VB programmers will likely skip over these chapters. If you want to skim these chapters, I recommend hunting down the asides marked “Tester's Tip” and those set off with dotted lines and a bold centered title. Some of the latter of these are good software development process tips. Ms. Sweeney rightly realized that she was addressing beginning programmers and sought to instill good practice in them from the start. She offers tips on accessibility (p. 52), tab order (p. 54), setting up a directory structure (p. 62), and naming conventions (p. 77). These are all important concepts and it's never too early to learn them.

It's obvious that Ms. Sweeney actually uses VB for testing, because the examples are suited to VB's strengths, not just a hodgepodge of VB features. She spends most of her time on database, COM, registry, file I/O and other Windows API features, revisiting them in later chapters. This is the gap between unit testing (which is best written in the same language of the application) and automated GUI testing (written using off-the-shelf tools). These features are hard to test with a recorder and often best tested in the language you expect your customers to use, which in many cases is VB. If your application exposes a COM interface, for instance, it would be foolish not to use VB.

Chapter 10, Testing the Web with Visual Basic, begins with an explanation that there are tools (some free) for testing websites, but also that there is more you can do with VB. One useful example in this chapter is a testing web browser that exposes the internals of the site. I could see this being useful, for example, for verifying that specific headers are present without constantly viewing source. And since you use the IE control, you can be assured that the page will be rendered exactly as it would in IE. Taking the idea further, the browser could be a flight recorder for functional testing—logging exactly what you've done on a site, so that if you see a bug, it would be easy to reproduce.

The one critique I have of the book is that while the examples are great for learning the features of VB, they are not really testing scripts. In real testing scripts, there would not be visual confirmation—testing scripts run best without a GUI or intervention from the user, only logging information when there is something wrong. The examples are visual because of the visual nature of VB development and the fact that when learning a new language, it's easier to understand if you can see what's going on.  I would have liked to see the idea of self-verification explored more. That being said, Ms. Sweeney says in the introduction that this is not a software testing automation book or a VB manual—that it is enough of both to get started on using VB for automation, and readers are expected to be somewhat familiar with automation practice. She recommends Software Test Automation by Fewster and Graham [amazon] and Automated Software Testing: Introduction, Management, and Performance by Dustin, et al. [amazon] for learning automation practice.

Also, realizing the need to at least mention .NET, this book tacks on two chapters from a VB.NET book. They are not specifically about testing and serve to introduce a VB programmer to the many differences in between VB and VB.NET. It is somewhat of an afterthought, and might be useful to get your feet wet, but I would have liked to see some of the “Testing Tips” or other asides from the earlier chapters.

The book ends with advice directly from some professional test automators and genuinely useful appendices. Appendix D collects some interesting essays for further reading.

If you are in test automation, and running up against the limitations of the available tools, this book is great for learning how to fill that gap. Also, any tester who is interested in learning how to program will find the advice invaluable and the examples relevant to their work. The fact that Ms. Sweeney and her contributors are professional test automators imparting hard-won advice makes this book all the more useful.

Using jUnit for Monitoring

Since jUnit (and the xUnit family) is just a verification mechanism, it's a natural fit for service monitoring. This article from JDJ describes using jUnit and JMX to build a service monitoring framework:

This article walks you through the process of setting up a basic service monitor and event handler for a common J2EE n-tier system. Developers of J2EE systems will be able to use JMX4ODP to create testing suites to help them develop more reliable systems. J2EE application administrators will be able to use JMX4ODP to simplify and regulate the management of deployed systems.

Why jUnit?

JUnit bills itself as a regression-testing suite for code objects, but it's not much of a leap to see it as a tool for distributed system diagnostics. JUnit runs tests by instantiating objects, invoking their methods with known inputs, and checking the output against expected returns.The article is very Java and J2EE focused, but the concepts are applicable to any service monitoring project.

Source Control for One

Eric Sink, whose company, SourceGear, publishes a source control product called SourceGear Vault, announced that they are now offering a single user license for $49.

He has an excellent article about why to use source control on projects with one developer and a follow up.

To his list of reasons, I'd add these:
  • It enables Daily Builds.  You can't build on a clean machine from the source on your drive. You need there to be an official, working version of the source.
  • It enables bug fixes to released versions. Using labeling and branching features, you can fix a bug in a released version to a user without also giving them every change you've made since then. This is important no matter how many developers you have.
  • It enables binary search bug fixing. Outlined in this JoelOnSoftware article, it's simply a way of finding a bug by systematically trying each archived build until you find the one that introduced the bug.  Then you check the log entries to see what you did.
Eric sells the benefits a little short, I think, but any company with one developer is unlikely to shell out a lot for source control, so the price is probably right, and there's always that hope for a second developer.

Unit Testing with NUnit

I'm preparing a presentation on unit testing in .NET (using NUnit). Unit Testing is generally non-implementation language specific. The original SmallTalk framework and the popular jUnit Framework for Java have nearly identical designs. The same is true for CppUnit (for C++) and every other unit testing framework I've looked at. Since .NET languages support all of the language features necessary to implement this design, I assumed that it would be the same—and for NUnit 1.0, that was the case.

NUnit 2.0 takes it further. The developers correctly realized that .NET offers some features that can automate some of the tasks in unit testing, and took advantage of them. This “breaks” consistency with other unit testing frameworks, but for a good reason, and it's close enough to the others to not be a big problem.

The basic architecture of the xUnit Frameworks provides a base class called TestCase which you extend to contain the test methods. The framework requires you to write another class that builds a list of these classes into a TestSuite. In jUnit, this class typically looks like this:

import junit.framework.*;

public class AllTests
{
  public static Test suite() {
    TestSuite suite =
      new TestSuite("All tests");

    suite.addTest(
      new TestSuite(ReaderTest.class)
    );

    suite.addTest(
      new TestSuite(CommandTest.class)
    );

    return suite;
  }
}


Note that the method AllTests.suite() must be edited each time a test class is added. If you forget, you will get a false test success. If you're writing tests first, you will notice this immediately—but it would still be nice not to have to write or maintain this class. NUnit 2.0 uses custom attributes so that the framework can discover all Test classes and call them automatically, and takes care of this manual task. In NUnit, instead of extending a TestCase, you use a custom attribute to mark the class as a TestFixture and add test methods to it (either marked with the attribute Test or named with the prefix “test”).  It looks like this:

namespace NUnit.Tests
{
  using System;
  using NUnit.Framework;

  [TestFixture]
  public class SuccessTests
  {
    [Test] public void Add()
    { /* ... */ }

    public void TestSubtract()
    { /* ... */ }
  }
}


There is no need to write an AllTests class, as .NET has features that allows the framework to discover this class automatically and run it. Kent Beck calls this an “idiomatic design”, one that takes advantage of language features, instead of porting the design from a language with less features, and is a good lesson for any type of language port.

Getting Started with Extreme Programming

Extreme Programming combines several practices into a well-defined software process. The goal of XP is to deliver features to the end user as fast as possible. Some of the practices are focused on quality, some on maintaining releasability, some on development speed and some on planning.

XP may seem daunting to you at first because it's probably very different from how you work now. It's also common for development teams to doubt the benefits of some of the practices and resist adopting them. Therefore, when getting started with XP, you will find it easier to migrate there in small changes, each one building upon the last and gaining confidence and trust in the process the whole way. If some of the practices aren't right for your organization, you can still benefit from the others.

There are three XP practices which are easy to adopt and will provide immediate benefit. In addition, they don't rely on other practices.

The first, Coding Standard, simply states that you should have a consistent naming and formatting convention for your code. Many organizations already do this and some languages (Java, Eiffel, C#) already have recommended coding standards. Your coding standard should not only be internally consistent, but also consistent with industry norms. This practice enables the Metaphor and Common Code Ownership practices.

The next, Unit Testing, is far less commonly used in non-XP shops, but no other XP practice is as easy to adopt and has as much immediate benefit as it does. Even if you are skeptical of XP, I highly recommend that you try Unit Testing and I will be spending a lot of time in this blog discussing why. Unit Testing enables Refactoring, Simple Design, Common Code Ownership, Continuous Integration, and other good practices.

Lastly, instituting Daily Builds will make your functional testing and deployment more consistent. This is also easy to adopt and has many obvious benefits. This practice enables Continuous Integration, Small Releases, and Customer Tests practices.

XP practices work best when combined, but it's better to be successful at using some of them than fail when trying to use them all at once.

Once you've mastered these, pick among the enabled practices to take your process further. Practices focused on getting code to users will be most beneficial (Continuous Integration, Simple Design and Small Releases).

Automated Software Process Checklist

Here is a list of software processes that should be automated

1. Source control
2. Bug tracking
3. Unit Testing
4. Daily Build and Test
5. Deployment
6. Functional Testing (Black box testing)
7. Source Code Generation (from other sources -- e.g. Databases)

I'll be talking more about these soon.  The order is determined by what will give you the most benefit for the least additional work.

For some projects, it's just inexcusable to not have automated deployment, and it will make sense to tackle that first. For most projects, it's the development processes which are manual or semi-manual and need to be dealt with first.