Population Genetics


PCA computations for genome-wide SNP data and other very large data sets. The covariance matrix formation is parallelised for multiprocessor machines and scientific computing clusters.
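Why the covariance step parallelises well can be shown with a small sketch. It assumes the individual-by-individual covariance matrix is a sum of per-SNP outer products, so blocks of SNPs can be processed independently and their partial sums added. The function names and the `map_fn` parameter are hypothetical and not taken from the tool itself; the built-in serial `map` stands in for a worker pool's map.

```python
def centre(row):
    """Subtract the row mean (one SNP across all individuals)."""
    mean = sum(row) / len(row)
    return [x - mean for x in row]


def block_cross_product(block):
    """Sum of outer products row * row^T over the SNP rows in one block."""
    n = len(block[0])
    acc = [[0.0] * n for _ in range(n)]
    for row in block:
        for a in range(n):
            for b in range(n):
                acc[a][b] += row[a] * row[b]
    return acc


def covariance(snps, block_size=2, map_fn=map):
    """Covariance between individuals, accumulated block by block.

    snps[j][i] is the genotype of individual i at SNP j.
    map_fn is an assumption: any map-like callable may be passed in.
    """
    centred = [centre(row) for row in snps]
    blocks = [centred[i:i + block_size]
              for i in range(0, len(centred), block_size)]
    n = len(snps[0])
    total = [[0.0] * n for _ in range(n)]
    # Each block is independent of the others; only this sum is shared.
    for part in map_fn(block_cross_product, blocks):
        for a in range(n):
            for b in range(n):
                total[a][b] += part[a][b]
    return [[value / len(snps) for value in row] for row in total]


genotypes = [[0, 1, 2], [2, 2, 0], [1, 0, 1], [0, 2, 2]]
cov = covariance(genotypes, block_size=2)
```

Because the partial sums simply add, the block size changes only how the work is divided, not the result (up to floating-point rounding).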


A parallel EM approach to fitting the "Admixture Model" of Pritchard et al. (2000).
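For orientation, here is a minimal serial sketch of EM for an admixture-style likelihood. It assumes biallelic SNPs coded as 0, 1 or 2 allele copies, ancestry proportions `Q[i][k]` for each individual, and allele frequencies `F[k][j]` for each population. The update rules are the usual EM updates for this binomial mixture and are an assumption here, not taken from the tool's source; `admixture_em` and its parameters are hypothetical names.

```python
import math
import random


def admixture_em(G, K, iters=30, seed=1, eps=1e-6):
    """G[i][j] in {0, 1, 2}: allele copies of individual i at SNP j."""
    rng = random.Random(seed)
    I, J = len(G), len(G[0])
    # Random start: each row of Q sums to 1, F lies strictly inside (0, 1).
    Q = []
    for _ in range(I):
        w = [rng.random() + 0.1 for _ in range(K)]
        Q.append([x / sum(w) for x in w])
    F = [[rng.uniform(0.2, 0.8) for _ in range(J)] for _ in range(K)]
    history = []  # log-likelihood at the start of each iteration
    for _ in range(iters):
        q_new = [[0.0] * K for _ in range(I)]
        num = [[0.0] * J for _ in range(K)]
        den = [[0.0] * J for _ in range(K)]
        ll = 0.0
        for i in range(I):  # the terms for each individual are independent
            for j in range(J):
                g = G[i][j]
                p1 = sum(Q[i][k] * F[k][j] for k in range(K))
                p0 = sum(Q[i][k] * (1.0 - F[k][j]) for k in range(K))
                ll += g * math.log(p1) + (2 - g) * math.log(p0)
                for k in range(K):
                    # Expected allele copies of each type drawn from population k.
                    a = g * Q[i][k] * F[k][j] / p1
                    b = (2 - g) * Q[i][k] * (1.0 - F[k][j]) / p0
                    num[k][j] += a
                    den[k][j] += a + b
                    q_new[i][k] += (a + b) / (2.0 * J)
        history.append(ll)
        Q = q_new
        # Clamp frequencies away from 0 and 1 so the logarithms stay finite.
        F = [[min(max(num[k][j] / den[k][j], eps), 1.0 - eps)
              if den[k][j] > 0 else F[k][j]
              for j in range(J)] for k in range(K)]
    return Q, F, history


G = [[2, 2, 2, 0, 0, 0], [2, 2, 1, 0, 0, 1], [2, 1, 2, 0, 1, 0],
     [0, 0, 0, 2, 2, 2], [0, 1, 0, 2, 2, 1], [1, 0, 0, 2, 1, 2]]
Q, F, history = admixture_em(G, K=2)
```

The loop over individuals touches shared state only through the running sums `num`, `den` and `ll`, which is the property a parallel version could plausibly exploit.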

Shell utilities

lines and slice

Rapid extraction of lines and columns from tabular text files. Especially useful for large files, such as genome-wide genotype data. Lines may be requested in any order and will be retrieved with a single pass through the file.
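One way to serve line requests in any order with a single pass is sketched below. It assumes 1-based line numbers; `fetch_lines` is a hypothetical name and does not reflect the real command-line interface of the tools.

```python
import io


def fetch_lines(handle, wanted):
    """Return the requested lines, in the order requested.

    handle: any iterable of text lines, such as an open file.
    wanted: 1-based line numbers, in any order, repeats allowed.
    """
    needed = set(wanted)
    if not needed:
        return []
    last = max(needed)
    found = {}
    for number, line in enumerate(handle, start=1):
        if number in needed:
            found[number] = line.rstrip("\n")
        if number >= last:
            break  # nothing requested lies beyond this point
    # Requests past the end of the file are silently skipped.
    return [found[n] for n in wanted if n in found]


table = io.StringIO("id g1 g2\nA 0 1\nB 2 2\nC 1 0\n")
rows = fetch_lines(table, [3, 1, 3])
```

Only the requested lines are held in memory, and reading stops at the highest requested line, which matters for files the size of genome-wide genotype data.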

treesync

Rapid, content-blind syncing of directory trees. Like rsync, but compares file names only: files with the same name are assumed to have the same content. Useful for maintaining backups of large collections of files, for example an MP3 library.

~> treesync --help
Usage: treesync tree1 tree2

  -h, --help        show this help message and exit
  --debug           show debug messages and pass exceptions
  -v, --verbose     show informational messages
  -q, --quiet       do not show log messages on console
  --log=FILE        append logging data to FILE
  --loglevel=LEVEL  set log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
  -n, --dry-run     Just print shell commands; don't actually do anything
  -d, --delete      Delete files from receiver that do not exist in sender
  --delete-only     Delete files from receiver that do not exist in sender,
                    and do not copy files.
~> treesync -dv /path/to/source/root /path/to/destination/root
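The name-only comparison can be sketched as a set difference over relative paths. This illustrates the idea under stated assumptions and is not treesync's actual code: `relative_files` and `plan_sync` are hypothetical names, and the `delete` argument merely mirrors the `-d` option shown above.

```python
import os


def relative_files(root):
    """Set of file paths under root, relative to root."""
    names = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            names.add(os.path.relpath(full, root))
    return names


def plan_sync(sender, receiver, delete=False):
    """Decide what to copy and delete by comparing names only.

    File contents, sizes and timestamps are never inspected.
    """
    src = relative_files(sender)
    dst = relative_files(receiver)
    to_copy = sorted(src - dst)                      # missing in receiver
    to_delete = sorted(dst - src) if delete else []  # extra in receiver
    return to_copy, to_delete
```

Since no file is ever opened, the cost is that of two directory walks, which is what makes the approach quick for large collections.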


Org-babel [a component of Emacs Org-mode]

Active source code within Org-mode documents: a new environment for reproducible computational research and literate programming.


A buffer organisation tool: provides an Org-mode front-end to the ibuffer machinery.


Minimalist appearance for Emacs: no mode line, no scroll bars, no menu, nothing.



Intelligent playlist generation and music library navigation, designed especially for MP3 players running the open-source Rockbox operating system.