An example ancestral recombination graph for the Kreitman (1983) data explaining the data with the minimum of 7 recombinations. The ARG was constructed by beagle and visualised by Graphviz.
haplotypebound is a reimplementation of the
Hudson-Kaplan bound based on finding a maximum number of non-overlapping
intervals flanked by a pair of incompatible segregating sites, and of the
haplotype bound proposed by S. Myers and R. Griffiths. This program is
compiled with make haplotypebound. The program takes a data
file as argument. It can be compiled with heuristics for speeding up the
haplotype bound computation, and the three relevant parameters – the
maximum size of sets of sites considered, the maximum interval width
spanned by a set of sites, and the increase in number types that adding an
extra site to an existing set of sites results in that has to be exceeded
for further site additions to the set to be explored – are specified
by options. Default values for these three arguments are infinity,
infinity, and 0, i.e. if none are specified the program will compute
the exact haplotype bound without applying heuristics. The haplotype bound
is computed on a reduced version of the input data set, so the interval
width spanned in the original data by some sets of sites considered may
exceed the specified maximum interval width. It is also available as
an executable that can be run from a Windows
command prompt. This executable was compiled using
the Minimalistic GNU for Windows
runtime headers for Win32 platforms.
beagle implements a branch and bound
method for determining the exact minimum number of recombinations required
for a data set. This program is compiled with make beagle. The
program takes a data file as argument. The program can generate the
sequence of evolutionary events that it found with a minimum number of
recombinations, or output a description of the corresponding ancestral
recombination graph in dot format (used by Graphviz), GML format (used by VGJ),
and GDL format (used by aiSee). This
program is available as an executable that can be run from a Windows
command prompt. This executable was compiled using the Minimalistic GNU for Windows runtime
headers for Win32 platforms.
kwarg is a heuristic alternative to
beagle. It is not guaranteed to find a minimal recombination
history, but aims to find a history with a low number of
recombinations. It can handle much larger data sets than
beagle, and has the same output options as. The program uses
a highly customisable score
function to guide the heuristic search. It is compiled
with make kwarg. It is also available as an executable that can be
run from a Windows command prompt. This executable was compiled using
the Minimalistic GNU for Windows
runtime headers for Win32 platforms.
eagl uses the composite method proposed by S. Myers
and R. Griffiths to combine exact minimums computed for small
segments of the data with haplotype bounds for larger segments. This
program is compiled with make eagl. The program takes a data
file as argument. The expansion of segments for which the exact minimum
can be bounded either by an absolute value, by a timing constraint, or by
querying the user. This program is available as an executable that can
be run from a Windows command prompt. This executable was compiled
using the Minimalistic GNU for
Windows runtime headers for Win32 platforms. greven expands on the branch
and bound method to perform a complete enumeration of all ancestral states
that can be encountered in evolutionary histories with a bound on the
number of recombinations. This program is compiled with make
greven. Note that even for small data sets the number of ancestral
states enumerated – that are all stored in memory until the
completion of the program – becomes quite large. This
program is available as an executable that can be run from a Windows
command prompt. This executable was compiled using the Minimalistic GNU for Windows runtime
headers for Win32 platforms.cob builds and solves the ancestral
state based equation system for data set probability under the
coalescent, using a discretised model of recombination. The program allows
the equation system to be restricted to only ancestral states that can
occur in evolutionary histories with a minimal or near-minimal number of
recombinations. To compile the program it is necessary to have the UMFPACK library
and a BLAS library
installed. You will probably have to modify the Makefile to
correctly refer to the location of these libraries. When this has been
successfully done the program can be compiled with make
cob. As for the greven program, the space requirements
rise steeply with data set size.To compile all programs except cob, make sure that you
have gcc, flex, bison, and python properly installed, then simply
run make. To compile all programs, make sure that the libraries
required for cob are installed in the places indicated in the
Makefile and then run make all+. The
distribution also contains various python scripts, mainly used for our
experimentation with the programs. If you have any problems with the
software, please email lyngsoe@stats.ox.ac.uk. If
possible, please include an example data set causing the problem.
The relevant citations for this software are Minimum Recombination Histories by Branch and Bound by R.B. Lyngsø, Y.S. Song & J. Hein, Proceedings of Workshop on Algorithms in Bioinformatics 2005, Lecture Notes in Computer Science 3692, pp. 239–250; Counting All Possible Ancestral Configurations of Sample Sequences in Population Genetics by Y.S. Song, R.B. Lyngsø & J. Hein, IEEE/ACM Transactions on Computational Biology and Bioinformatics 3, 239–251; Accurate Computation of Likelihoods in the Coalescent with Recombination via Parsimony by R.B. Lyngsø, Y.S. Song & J. Hein, Proceedings of International Conference on Research in Computational Molecular Biology, Lecture Notes in Computer Science 4955, pp. 463–477. Slides for a presentation of the methods are also available.
In case you wondered, the program suite is named after Section 26 at
the banks of the Duckburg River. The first program in the suite was
beagle that was named after the infamous Beagle Boys
that go by the same acronym as branch & bound. So initially the
program suite was called the beagle suite. When coming to Duckburg the
Beagle Boys initially moved to Section 26 at the Duckburg River, which
thus with a properly far fetching imagination can be consider the Beagle
(Boys') suite.