Software



Popgen

popgen is an R package that implements some statistical and population genetics methods.

A poster describing the package, presented at the SMBE 2003 conference, can be downloaded here.

The package is available from CRAN


AnalyzeFMRI

AnalyzeFMRI is an R package that provides I/O, visualisation and analysis of functional Magnetic Resonance Imaging (fMRI) datasets stored in the ANALYZE image format. The package supersedes the packages AnalyzeIO and AnalyzeRead (described below).

More functionality will be added gradually.

An S-PLUS version will appear when I get some more time (although I'm more likely to do it soon if someone wants to use it!).

Please let me know if you use this package and have suggestions for functionality that you would find useful.

The package is available from CRAN

You can download an example dataset from fmri.null.dataset.tar.gz
The gzipped tar archive contains the four files described below:

    [1] jm2.null.mc.img - null fMRI dataset with dimensions 64x64x21x190
    [2] jm2.null.mc.hdr - header file for [1]
    [3] jm2.null.mc.mask.img - mask for [1] which is 1 for voxels inside the brain and 0 outside.
    [4] jm2.null.mc.mask.hdr - header file for [3]
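
For readers unfamiliar with the .hdr/.img pairing above: in the ANALYZE 7.5 format the .hdr file is a fixed 348-byte record describing the raw voxel data in the .img file. The Python sketch below is not part of AnalyzeFMRI. It builds a synthetic header in memory with the dimensions listed for [1] and reads the dimensions back, then computes the .img size those dimensions imply. The function names are hypothetical, and the 16-bit voxel type is an assumption, since the data type of the example dataset is not stated here.

```python
import struct

def make_header(dims, datatype=4, bitpix=16, byte_order="<"):
    """Build a minimal synthetic 348-byte ANALYZE 7.5 header (illustration only)."""
    hdr = bytearray(348)
    struct.pack_into(byte_order + "i", hdr, 0, 348)        # sizeof_hdr
    dim = [len(dims)] + list(dims) + [0] * (7 - len(dims))
    struct.pack_into(byte_order + "8h", hdr, 40, *dim)     # dim[0] = number of dimensions
    struct.pack_into(byte_order + "h", hdr, 70, datatype)  # 4 = signed 16-bit (assumed)
    struct.pack_into(byte_order + "h", hdr, 72, bitpix)    # bits per voxel
    return bytes(hdr)

def read_header(hdr):
    """Return (dims, datatype, bitpix) parsed from raw ANALYZE 7.5 header bytes."""
    if len(hdr) < 348:
        raise ValueError("header shorter than 348 bytes")
    # sizeof_hdr must read as 348; try both byte orders to find the right one.
    for byte_order in ("<", ">"):
        if struct.unpack_from(byte_order + "i", hdr, 0)[0] == 348:
            break
    else:
        raise ValueError("not an ANALYZE 7.5 header")
    dim = struct.unpack_from(byte_order + "8h", hdr, 40)
    datatype, = struct.unpack_from(byte_order + "h", hdr, 70)
    bitpix, = struct.unpack_from(byte_order + "h", hdr, 72)
    return tuple(dim[1:1 + dim[0]]), datatype, bitpix

def expected_img_bytes(dims, bitpix):
    """Size of the matching .img file implied by the header."""
    n_voxels = 1
    for d in dims:
        n_voxels *= d
    return n_voxels * bitpix // 8

hdr = make_header((64, 64, 21, 190))
dims, datatype, bitpix = read_header(hdr)
print(dims, expected_img_bytes(dims, bitpix))
```

With dimensions 64x64x21x190 the image holds 16,343,040 voxels, so comparing the size implied by a header with the actual .img file size is a quick sanity check after unpacking the archive.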


fastICA

fastICA is an R, Splus (5.x, 6.x) and C implementation of the FastICA algorithms developed by Aapo Hyvarinen et al. at the Neural Networks Research Centre, Laboratory of Computer and Information Science, Helsinki University of Technology. Independent Component Analysis (ICA) is a method of decomposing a multi-dimensional dataset into a set of statistically independent non-Gaussian variables. The method is based upon a generative model in which measured signals are constructed from linear mixtures of unknown latent variables or sources. These sources are assumed to be statistically independent and non-Gaussian. ICA attempts to unmix the measured signals and recover the sources.

The method relies upon the fact that linear mixtures of independent variables tend to be more Gaussian in distribution than the variables themselves (by the Central Limit Theorem). Thus in order to recover the independent sources we should maximise some measure of non-Gaussianity. The FastICA algorithm (as its name suggests) is designed to provide a computationally quick method of estimating the unobserved independent components. The algorithm iteratively maximises an approximation to the negentropy of the projected data. Negentropy is based on the information-theoretic quantity of (differential) entropy, which measures the "randomness" of an observed variable. Since Gaussian variables have the largest entropy among all random variables of equal variance, entropy can be used to define a measure of non-Gaussianity, i.e. negentropy. In practice this quantity can be time-consuming to calculate, which led to the development of the fast and robust approximations implemented in the FastICA algorithm. The algorithm has been implemented by the original authors in MATLAB.
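
To make the iteration concrete, here is a minimal Python sketch of the one-unit fixed-point update on a two-dimensional toy problem. It is not the code from the fastICA package. The uniform sources, the mixing coefficients, the sample size and the function names are all assumptions chosen for illustration, and tanh is used as the nonlinearity. The data are first whitened, then the update w+ = E[z g(w'z)] - E[g'(w'z)] w is applied and renormalised until the direction stops changing.

```python
import math
import random

def whiten(x1, x2):
    """Centre two signals, then rotate and rescale them to unit covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [a - m1 for a in x1]
    x2 = [b - m2 for b in x2]
    c11 = sum(a * a for a in x1) / n
    c22 = sum(b * b for b in x2) / n
    c12 = sum(a * b for a, b in zip(x1, x2)) / n
    # Angle of the principal axes of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * a + s * b for a, b in zip(x1, x2)]
    p2 = [-s * a + c * b for a, b in zip(x1, x2)]
    sd1 = math.sqrt(sum(a * a for a in p1) / n)
    sd2 = math.sqrt(sum(b * b for b in p2) / n)
    return [a / sd1 for a in p1], [b / sd2 for b in p2]

def one_unit_fastica(z1, z2, w=(1.0, 0.0), maxit=200, tol=1e-8):
    """Fixed-point iteration with g = tanh on whitened 2-D data."""
    n = len(z1)
    w1, w2 = w
    for _ in range(maxit):
        e1 = e2 = eg = 0.0
        for a, b in zip(z1, z2):
            t = math.tanh(w1 * a + w2 * b)
            e1 += a * t
            e2 += b * t
            eg += 1.0 - t * t          # g'(u) = 1 - tanh(u)^2
        # w+ = E[z g(w'z)] - E[g'(w'z)] w, then renormalise to unit length.
        n1 = e1 / n - (eg / n) * w1
        n2 = e2 / n - (eg / n) * w2
        norm = math.hypot(n1, n2)
        n1, n2 = n1 / norm, n2 / norm
        # The sign of w may flip between iterations, so compare |w_new . w_old|.
        converged = abs(abs(n1 * w1 + n2 * w2) - 1.0) < tol
        w1, w2 = n1, n2
        if converged:
            break
    return w1, w2

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def demo(n=2000, seed=0):
    """Mix two independent uniform sources and try to recover them."""
    rng = random.Random(seed)
    s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]   # assumed mixing matrix
    x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]
    z1, z2 = whiten(x1, x2)
    w1, w2 = one_unit_fastica(z1, z2)
    y1 = [w1 * a + w2 * b for a, b in zip(z1, z2)]
    # In two dimensions the second component is the orthogonal direction.
    y2 = [-w2 * a + w1 * b for a, b in zip(z1, z2)]
    return s1, s2, y1, y2

s1, s2, y1, y2 = demo()
print([[round(abs(correlation(y, s)), 3) for s in (s1, s2)] for y in (y1, y2)])
```

Each recovered signal should correlate strongly with exactly one source, up to sign and ordering, which ICA cannot determine. The packaged implementations handle any number of components and offer other nonlinearities. This sketch covers only the simplest case.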

In the absence of a generative model for the data the algorithm can be used to find Projection Pursuit directions. Projection Pursuit is a technique for finding 'interesting' directions/projections in multi-dimensional datasets. These projections are useful for visualising the dataset and in density estimation and regression. Interesting directions are those along which the projected data show the least Gaussian distribution, and these are the directions the FastICA algorithm seeks.
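
The idea of an 'interesting' direction can be illustrated with a brute-force Python sketch. This is not how FastICA searches: instead of the negentropy approximation it scans projection angles in one-degree steps and scores each by absolute excess kurtosis, a simple stand-in index of non-Gaussianity. The toy data (a bimodal variable hidden at 30 degrees in otherwise Gaussian noise) and all function names are assumptions made for the example.

```python
import math
import random

def excess_kurtosis(values):
    """Fourth standardised moment minus 3 (zero for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / (var * var) - 3.0

def most_non_gaussian_angle(xs, ys, step_deg=1):
    """Scan directions in [0, 180) degrees; return the angle with largest |kurtosis|."""
    best_angle, best_score = 0, -1.0
    for angle in range(0, 180, step_deg):
        c = math.cos(math.radians(angle))
        s = math.sin(math.radians(angle))
        score = abs(excess_kurtosis([c * x + s * y for x, y in zip(xs, ys)]))
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle, best_score

def make_data(n=4000, true_angle_deg=30, seed=1):
    """Bimodal structure along one direction, Gaussian noise along the other."""
    rng = random.Random(seed)
    # Two clusters at +/-0.9 with noise chosen so the variance is close to 1.
    u = [rng.choice((-0.9, 0.9)) + rng.gauss(0.0, 0.436) for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c = math.cos(math.radians(true_angle_deg))
    s = math.sin(math.radians(true_angle_deg))
    xs = [c * a - s * b for a, b in zip(u, v)]
    ys = [s * a + c * b for a, b in zip(u, v)]
    return xs, ys

xs, ys = make_data()
angle, score = most_non_gaussian_angle(xs, ys)
print(angle, round(score, 3))
```

The angle found should lie close to the 30 degrees used to build the data. An exhaustive scan like this is only practical in two dimensions, which is one reason a fast iterative search is useful in higher dimensions.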

The fastICA package contains both R/Splus and C code to implement the FastICA algorithm. The R/Splus code is included for clarity whereas the C code allows the method to be run much faster. When the package is compiled the code is linked to optimized BLAS routines if they are present on your machine. If not, unoptimized BLAS routines are compiled separately; the resulting code is still faster than the R code but not as fast as it could be. Most of the C code included in the package was written by Chris Heaton, who is a summer research student at the Department of Statistics, University of Oxford (see this report).

The R package is available from CRAN

The Splus library source code is available from the following link: fastICA_S_1.0-2.tar.gz (see Installation instructions)

A pre-compiled binary for Splus 6.0 for Windows has been made available by Professor B D Ripley at http://www.stats.ox.ac.uk/pub/MASS3/Winlibs/fastICA.zip

A standalone C version of the code is also available: fastiCa.tgz. The code is essentially the same as that used in the R and Splus packages described above but uses the ranlib RNG library. Please read the README file included in the directory for instructions on compilation. I have successfully compiled this code on my Linux machine, but that's about it so far. The code is distributed under the GPL license (for details see the file COPYING). A static executable of this code is available at fastiCa

GWApower

GWApower is an R package for assessing the power of genome-wide association studies using commercially available genotyping chips. The package encapsulates extensive simulation results generated by our program HAPGEN and described fully in the paper:

Spencer, C., Su, Z., Donnelly, P. and Marchini, J. (2008) Designing Genome-Wide Association Studies: sample size, power, and the choice of genotyping chip. submitted.

Download : GWApower_1.1.tar.gz


JMisc

An R package where I put all my miscellaneous stuff [Source: JMisc_0.1-2.tar.gz] [Windows pre-compiled binary: JMisc_0.1-2.zip]

Currently includes