© 2006-2015 Andrew Cooke (site) / post authors (content).

Optimising Clojure

From: andrew cooke <andrew@...>

Date: Wed, 31 Aug 2011 08:54:22 -0300

I had some Clojure code that needed optimising.  Eventually I achieved a 5x
speedup.  This is how.

Finding Something to Measure

First, I needed something to measure, so that I could tell how effective my
changes were.  This was complicated by three things: the need to generate test
data to process; the delayed action of the JIT; garbage collection.

Generating test data is quite expensive, so I needed to carefully tune
parameters so that my processing used more time than the test data generation
(not necessary for the simple timings, but important for profiling, since I
could find no simple way to delay profiling to start after the data was
ready).  Then I placed the processing in a loop, repeating it multiple times
(re-using the same test data).  Each pass through the loop was timed - the
successive times let me see when the JIT engaged (typically the first
calculation was 50% slower) and also reduce the effect of GC by selecting the
lowest value.
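
A minimal sketch of that kind of timing loop (not the original code; "process"
and "test-data" are hypothetical stand-ins for the real processing and data):

```clojure
(defn time-ms
  "Run (f) once and return the elapsed wall-clock time in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

(defn repeated-timings
  "Call (f) n times, returning a vector of the elapsed times in ms."
  [n f]
  (vec (repeatedly n #(time-ms f))))

;; hypothetical stand-ins for the real test data and processing
(def test-data (vec (range 100000)))
(defn process [data] (reduce + (map #(* % %) data)))

(let [times (repeated-timings 10 #(process test-data))]
  (println "all passes:" times)          ; early passes show any JIT warm-up
  (println "best:" (apply min times)))   ; lowest value is least disturbed by GC
```

The data is built once, outside the timed function, so only the processing is
measured on each pass.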


Once I had a good test running in my IDE, I needed to make standalone Java
classes to profile.  The simplest way to do this seems to be to use a build
tool called Leiningen - https://github.com/technomancy/leiningen - which was
very easy to use.  You just download and run it, create an empty project, and
modify the project descriptor file (which I then copied into my existing
project).
My file contents are:

  (defproject compressive "0.0.0"
    :description "set-related experiments"
    :aot [#"com.isti.compset.*"]
    :main com.isti.compset.stack
    :dependencies [[org.clojure/clojure "1.3.0-beta1"]
                   [org.clojure/clojure-contrib "1.2.0"]])


  - aot indicates which classes need compiling
  - main indicates the location of the "main" file
  - dependencies are downloaded automatically

I also added the (-main) method and put (:gen-class) in my ns at the top of
the file.
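
A minimal sketch of what that looks like (the namespace name is taken from the
project file; the body of -main is a placeholder, not the original code):

```clojure
(ns com.isti.compset.stack
  (:gen-class))   ; ask the AOT compiler to emit a Java class for this namespace

(defn -main
  "Entry point used when the compiled class is run with java."
  [& args]
  ;; placeholder body - the real version would run the timed processing loop
  (println "running with args:" args))
```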

With all that, "lein compile" generated a set of classes that could be run
with java:

  java -cp classes:lib/clojure-1.3.0-beta1.jar \
    -agentlib:hprof=cpu=samples,depth=10,file=hprof com.isti.compset.stack

giving a profile in the file "hprof".

This profile was dominated by expected methods related to generating
and consuming sequences, and adding and multiplying generic numbers.

Vector of Doubles

The central loop in my code sums various spectra and the square of their
values.  The spectra were stored as "vec" instances.  I hoped to remove the
use of generic numbers by replacing these with vectors of doubles and, if
necessary, adding some type annotations to various functions.

So I replaced "vec" with

  (defn d-vec [collection]
    (apply conj (vector-of :double) collection))

However, this did not have the desired effect - the code was
significantly slower and the trace was dominated by calls to "Vec.count".

I asked for help on Stack Overflow - http://stackoverflow.com/q/7223297/181772
- but didn't get any very useful response, so I removed these changes.

Array of Doubles

At this point I was a little disheartened.  The traces didn't have much useful
information (that I could see, even using a nice little tool called PerfAnal

So I looked at my inner loop and decided that I would simply rewrite it as I
thought it should be written, relying on experience rather than profiling.  I
also found this page http://clojure.org/java_interop on the use of arrays

I made the following changes:

 - replace the vector in which I was accumulating results with a native array
   of doubles (using double-array)

 - mutate the accumulator rather than creating a new instance each loop

 - use additional indices to identify the "active range" of the accumulator
   (which may get smaller over time, due to restricted data ranges in
   spectra).    This had not been necessary when I was generating new
   immutable instances - the instances simply decreased in size.

 - change the spectra from vectors to arrays.

 - replace some maps and reduces with explicit loops that mutate data.
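
A rough sketch of the shape these changes give (all names are hypothetical,
and it only sums values, not their squares).  It has no type hints, which
matches this stage of the work:

```clojure
;; Hypothetical names throughout; deliberately unhinted at this stage.
(defn make-acc
  "Accumulator: a mutable array of doubles plus the active range [lo, hi)."
  [n]
  {:sum (double-array n) :lo 0 :hi n})

(defn add-spectrum!
  "Add spectrum (an array covering indices [s-lo, s-hi)) into the accumulator
  in place, and shrink the active range to the overlap."
  [acc spectrum s-lo s-hi]
  (let [sum (:sum acc)
        lo  (max (:lo acc) s-lo)
        hi  (min (:hi acc) s-hi)]
    (loop [i lo]
      (when (< i hi)
        ;; mutate the existing array rather than building a new vector
        (aset sum i (+ (aget sum i) (aget spectrum (- i s-lo))))
        (recur (inc i))))
    (assoc acc :lo lo :hi hi)))
```

The array keeps its full length throughout; only the lo/hi indices record that
the useful range has narrowed.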

This was, initially, a complete disaster.  Running time increased by roughly a

Type Annotations

Rather than reverting these changes, which seemed justified from experience, I
looked again at profiling.  The profile was now dominated by methods related
to Java reflection.

This is a known issue with Clojure and has a simple solution - add

  (set! *warn-on-reflection* true)

and remove all warnings by adding type hints.  I did not enable this for all
code - only for the central loop and related functions.  And I used the
"^doubles" hint which indicates a native array of doubles.
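
A minimal sketch of a hinted inner loop (the function name and arguments are
hypothetical, not the original code):

```clojure
(set! *warn-on-reflection* true)

(defn add-into!
  "Add src into acc in place over the index range [lo, hi).
  Both arguments are native arrays of doubles."
  [^doubles acc ^doubles src lo hi]
  (let [hi (long hi)]
    (loop [i (long lo)]
      (when (< i hi)
        ;; with ^doubles on both arrays, aget and aset resolve at compile time
        (aset acc i (+ (aget acc i) (aget src i)))
        (recur (inc i))))
    acc))
```

Removing either ^doubles hint should bring the reflection warnings back for
the aget / aset calls on that array.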

This, finally, gave a significant speedup - about 5x faster than the initial


 - Clojure code is simplest, and reasonably efficient, when you stay within
   Clojure's own types (vec, seq etc).

 - Profiling with hprof can give broad information, but I, at least, couldn't
   understand it in detail (I normally can read profiling data!) - there
   seemed to be information missing, or I simply do not understand how Clojure

 - It was fairly simple to change the code to use typed, mutable data in the
   inner loop, which gave significant speedups.

 - Dynamic interop with Java is appallingly slow, but easy to fix once you see
   that it is happening.

 - More generally, Clojure works fairly well for exploratory data processing.
   It's nicely high-level and easy to work with, and can be optimised where
   necessary.  I have also implemented this particular algorithm in C and on
   GPUs, and I am sure it is faster in both cases, but (without doing any
   formal measurements) the code feels close enough to C for me to experiment
   with new ideas quickly.  I doubt this would have been possible in Python,
   for example (even with numpy - the inner loop is annoyingly hard to

More on (vector-of :double)

From: andrew cooke <andrew@...>

Date: Wed, 31 Aug 2011 22:50:48 -0300

Justin Kramer at http://stackoverflow.com/q/7223297/181772 suggested using

  (into (vector-of :double) collection)

to create vectors of doubles.  This seemed like a good idea for the immutable
spectra (it's probably not clear above but I introduced arrays of doubles in
two places - the mutable accumulator used to sum spectra, and the individual
spectra themselves).  However, when I tried it, I found that there was no way
to add type hints for these vectors.  That led to warnings of dynamic code
when the contents were used.

So using arrays of doubles is helping in two ways - it's supporting fast,
mutable access and also helping the compiler infer (from the ^doubles hint for
the array) the type of the content.
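
A small sketch of the two representations side by side (function names are
hypothetical).  Only the array version has a hint that tells the compiler what
the elements are:

```clojure
(defn sum-vec
  "Sum a (vector-of :double); elements arrive through the generic
  collection interface, so there is nothing to hint."
  [v]
  (reduce + 0.0 v))

(defn sum-array
  "Sum a native array of doubles; the ^doubles hint makes aget a typed read."
  [^doubles a]
  (areduce a i total 0.0 (+ total (aget a i))))

(println (sum-vec (into (vector-of :double) [1 2 3])))
(println (sum-array (double-array [1 2 3])))
```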


PS I posted this on HN where it gained some views, but no useful comments.
