Software
Population Genetics
shellfish
Principal component analysis (PCA) for genome-wide SNP data and other very large data sets. Formation of the covariance matrix is parallelised for multiprocessor machines and scientific computing clusters.
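The idea of forming the covariance matrix in parallel can be sketched as follows. This is not shellfish's own code: it is a minimal pure-Python illustration that assumes the matrix of interest is the individuals-by-individuals covariance, that genotypes arrive as one row per SNP, and that each SNP is only mean-centred (no allele-frequency scaling). The names `chunks`, `partial_gram`, `covariance` and `leading_component` are hypothetical. Each block of SNPs contributes an independent partial sum, which is what makes the computation easy to distribute.

```python
import math
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def partial_gram(block):
    """Return (SNP count, sum of outer products of mean-centred SNP rows)."""
    n = len(block[0])  # number of individuals
    gram = [[0.0] * n for _ in range(n)]
    for row in block:
        mean = sum(row) / n
        centred = [g - mean for g in row]
        for i in range(n):
            for j in range(i, n):  # upper triangle only
                gram[i][j] += centred[i] * centred[j]
    for i in range(n):  # mirror into the lower triangle
        for j in range(i):
            gram[i][j] = gram[j][i]
    return len(block), gram


def covariance(snp_rows, block_size=1000, mapper=map):
    """Sum per-block partial matrices; `mapper` may be a parallel map."""
    total, n_snps = None, 0
    for count, part in mapper(partial_gram, chunks(snp_rows, block_size)):
        n_snps += count
        if total is None:
            total = part
        else:
            for i, row in enumerate(part):
                for j, value in enumerate(row):
                    total[i][j] += value
    if total is None:
        raise ValueError("no SNP rows supplied")
    return [[value / n_snps for value in row] for row in total]


def leading_component(cov, iterations=200):
    """Approximate the top eigenvector of `cov` by power iteration."""
    n = len(cov)
    v = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v
```

Because `partial_gram` is a top-level function, a parallel map such as the `imap` method of a `multiprocessing.Pool` can be passed as `mapper` to spread the blocks over several processes. The default built-in `map` runs the same computation serially.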
psi
Parallel EM algorithm for fitting the "admixture model" of Pritchard et al. (2000).
Shell utilities
lines and slice
Rapid extraction of lines and columns from tabular text files. Especially useful for large files, such as genome-wide genotype data. Lines may be requested in any order and will be retrieved with a single pass through the file.
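The single-pass behaviour described above can be sketched in Python. This is an illustration of the technique only, not the actual implementation or command-line interface of lines and slice; the function names, the 1-based numbering and the whitespace-delimited columns are all assumptions. Requested lines are collected as they stream past and then returned in the order asked for, so the input is read once and never beyond the last line needed.

```python
def extract_lines(lines, wanted):
    """Return the requested 1-based lines, in the order requested.

    `lines` is any iterable of strings (for example an open file);
    it is consumed once, and only up to the highest line requested.
    """
    needed = set(wanted)
    if not needed:
        return []
    last = max(needed)
    found = {}
    for number, line in enumerate(lines, start=1):
        if number in needed:
            found[number] = line
        if number >= last:
            break  # nothing further is needed, so stop reading
    missing = needed - found.keys()
    if missing:
        raise IndexError("input has no line(s) %s" % sorted(missing))
    return [found[number] for number in wanted]


def extract_columns(lines, columns, sep=None):
    """Yield the requested 1-based columns of each line, in the order requested."""
    for line in lines:
        fields = line.split(sep)  # sep=None splits on runs of whitespace
        yield [fields[c - 1] for c in columns]
```

For a file on disk, `extract_lines(open(path), [500000, 3, 42])` would read the file once up to line 500000 and return the three lines in that order; `path` here is a placeholder.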
treesync
Rapid, content-blind syncing of directory trees. Like rsync, but compares file names only: if two files have the same name, their content is assumed to be the same. Useful for maintaining backups of large collections of files, for example an MP3 library.
~> treesync --help
Usage: treesync tree1 tree2
Options:
-h, --help show this help message and exit
--debug show debug messages and pass exceptions
-v, --verbose show informational messages
-q, --quiet do not show log messages on console
--log=FILE append logging data to FILE
--loglevel=LEVEL set log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
-n, --dry-run Just print shell commands; don't actually do anything
-d, --delete Delete files from receiver that do not exist in sender
--delete-only Delete files from receiver that do not exist in sender,
and do not copy files.
~> treesync -dv /path/to/source/root /path/to/destination/root
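The name-only comparison behind this can be sketched as a pair of set differences. The code below is not treesync's source: `relative_files` and `plan_sync` are hypothetical names, and the sketch assumes that a file is identified by its path relative to the tree root. It only plans the work, in the spirit of `--dry-run`, and copies or deletes nothing.

```python
import os


def relative_files(root):
    """Return the set of file paths under `root`, relative to `root`."""
    names = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            names.add(os.path.relpath(full, root))
    return names


def plan_sync(sender, receiver, delete=False):
    """Compare two sets of relative paths by name only.

    Returns (to_copy, to_delete): paths present only in the sender,
    and, if `delete` is true, paths present only in the receiver.
    File contents, sizes and timestamps are never inspected.
    """
    to_copy = sorted(sender - receiver)
    to_delete = sorted(receiver - sender) if delete else []
    return to_copy, to_delete
```

Calling `plan_sync(relative_files(source), relative_files(destination), delete=True)` would then list what a run with `-d` might copy and remove, under the assumptions above. Since only names are compared, a file that has changed in place is never picked up, which is the price paid for the speed.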
Emacs
Org-babel [a component of Emacs Org-mode]
Active source code within Org-mode documents: a new environment for reproducible computational research and literate programming.
org-buffers
A buffer organisation tool: provides an Org-mode front-end to the ibuffer machinery.
minimal
Minimalist appearance for Emacs: no mode line, no scroll bars, no menu, nothing.