20 Days of Clojure: Day 15

Christophe Grand made two observations about my parallel prime code, which improve it quite a bit.

First, since primes-in-list is lazy, you have to (touch) it to get it to be completely created in the agent thread. That leaves you with this:

  (defn par-primes-in-list [potential-primes known-primes]
    (let [ x 10 x-times-2 (* x 2)]
      (! w1 (fn[w pp kp] (
touch (primes-in-list pp kp))) (take x potential-primes) known-primes)
      (! w2 (fn[w pp kp] (
touch (primes-in-list pp kp))) (take x (drop x potential-primes)) known-primes)
      (await w1 w2)

      (let [new-primes (concat @w1 @w2) ]
        (if (= (count new-primes) 0)
          (recur (drop x-times-2 potential-primes) known-primes)
          (let [real-primes
            (if (> (first new-primes) x-times-2) new-primes (primes-in-list new-primes []))]
            (lazy-cat real-primes (par-primes-in-list
              (drop x-times-2 potential-primes)
              (into known-primes real-primes))))))))

But, the bigger difference is picking the number of potential-primes to evaluate intelligently. I had experimented with trying to do this dynamically, but without the fix above, anything you do will probably result in primes being found lazily in the main thread. Here’s Christophe’s code:

    (defn par-primes-in-list [potential-primes known-primes]
        (let [p (first potential-primes) x (quot p 2)]
            (! w1 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))
                (take x potential-primes) known-primes)
            (! w2 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))
                (take (- p x) (drop x potential-primes)) known-primes)
            (await w1 w2)

            (let [new-primes (concat @w1 @w2) ]
                (lazy-cat new-primes
                         (drop p potential-primes)
                         (into known-primes new-primes))))))

Christophe’s observation:

I use the property that no number in the interval [n .. 2*n-1] is divisible by n (except n itself) so it’s safe to process the whole interval with the current set of known primes. Hence this interval can be divided into m intervals (where m is the number of cores) for parallel processing.

The one issue with Christophe’s code it part lazy and part eager, so it’s applicability depends on whether you are trying to get to a specific number the fastest or just generate primes as fast as possible. Since the number of primes to try grows as it searches, it lazily generates bigger and bigger lists of primes. I tried to find a good value to ask for where it would perform well, and chose 20390 which is big enough to show a difference and doesn’t make Christophe’s code do too much extra. Here are the results:

Christophe’s: “Elapsed time: 87266.119 msecs”
Mine (with a touch added): “Elapsed time: 108841.354 msecs”
Sequential: “Elapsed time: 151075.292 msecs”