META

META is a program for the meta-analysis of genome-wide association studies. It is designed to synthesize the evidence from different association studies. In particular, the program works seamlessly with the output of SNPTEST [1]. This program was used in the meta-analysis of the genome-wide association studies of smoking.

Contributors

The following people have contributed to the development of the software for META:

Jason Liu, Jonathan Marchini

Download

Pre-compiled versions of the program and example files can be downloaded from the links below. We've supplied both static and dynamic versions of the Linux executables. If you intend to run META on a machine running an old kernel then you probably want to use the dynamic version. If you have any problems getting the program to work on your machine please contact me.

Platform                           File
Linux (x86_64) Static Executable   meta_v1.3.1_x86_64_static.tgz
Linux (x86_64) Dynamic Executable  meta_v1.3.1_x86_64_dynamic.tgz
Linux (x86_64) Static Executable   meta_v1.3.2_x86_64_static.tgz
Linux (x86_64) Dynamic Executable  meta_v1.3.2_x86_64_dynamic.tgz

Please fill out the registration form to receive emails about updates to this software.

To unpack the files use a command like

tar zxvf meta_vX.X_x86_64.tgz

This will create an executable called META and a directory /example that contains the following example data files:

example1.txt Results of a study, containing the information described in Input File Formats.
example2.txt A sample file in the same format as example1.txt.

Version History


Version  Release Time  Description
1.0      11-3-2010     First version made available:
                         • Gzipped input support
                         • Input files fully compatible with SNPTEST output
                         • Based on fixed-effects model
1.1      20-9-2010     Changes from META v1.0:
                         • Random-effects model is available
                         • Standard input files support
1.2      20-11-2010    Changes from META v1.1:
                         • Optimized the program structure
1.3      07-06-2011    Changes from META v1.2:
                         • Abandoned the boost library, but gzipped input is still supported
1.3.1    16-08-2011    Changes from META v1.3:
                         • Changed the way column information is read from SNPTEST output
1.3.2    22-08-2011    Changes from META v1.3.1:
                         • When using a standard input file, an optional column named "chr" is now allowed, which means that SNPs in one input file can come from different chromosomes. If the "chr" column is not specified, META assumes all SNPs in the input file are from the same chromosome.

Input File Formats

META reads plain text files as input, with each line of each file representing the information for one SNP. Although the format is quite flexible, the following columns must be provided:

rsid SNP id.
pos Base-pair position of SNP.
allele_A non-coded allele (a.k.a. non-effect allele, non-reference allele).
allele_B coded allele (a.k.a. effect allele, reference allele).
info imputation quality score (RSQR_HAT column in MACH; INFO column in PLINK; PROPER_INFO column in SNPTEST).
P_value p-value of each SNP.
beta effect size of each SNP.
se standard error of effect size.

Some other columns can also be provided, e.g. chr, the chromosome number (1-22), and coded_af, the coded allele frequency. If the chr column is specified, the input file can contain SNPs from different chromosomes; otherwise, all SNPs are assumed to come from the same chromosome. Note that for --method 3 (z-statistics combination method), beta and se are not required; only the direction of the effect is needed. An example input file is given below (this is not a real data set):

chr rsid pos allele_A allele_B P_value info beta se
1 rs16969968 76669980 A G 0.027185 0.99025 -0.12571 0.056914
1 rs518425 76670868 A G 0.012406 0.98888 -0.15238 0.060954
2 rs514743 76671282 A T 0.91281 0.9997 0.0061483 0.056075
3 rs615470 76673043 C T 0.90384 0.99988 0.0067651 0.055996
6 rs12899226 76674493 G T 0.69283 0.99717 -0.050464 0.12883
6 rs660652 76674887 A G 0.90419 0.99943 -0.0067418 0.056007
6 rs472054 76675049 A G 0.90419 0.99943 -0.0067418 0.056007
15 rs8029939 76675404 C T 0.96537 0.91413 -0.027428 0.63172
15 rs578776 76675455 A G 0.013069 0.98698 0.1537 0.061882
15 rs6495307 76677376 C T 0.77301 0.99926 0.016023 0.055553
15 rs12910984 76678682 A G 0.0032279 0.98707 -0.19692 0.066864
15 rs1051730 76681394 A G 0.030083 0.96504 -0.12551 0.058043
...
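
As an illustration only (this is not part of META), the format above could be read with a few lines of Python. The function name parse_cohort and the dictionary layout are assumptions made for this sketch; the required column names are the ones listed in this section.

```python
REQUIRED_COLUMNS = ["rsid", "pos", "allele_A", "allele_B",
                    "info", "P_value", "beta", "se"]
NUMERIC_COLUMNS = ["info", "P_value", "beta", "se"]


def parse_cohort(lines):
    """Parse whitespace-delimited lines (header first) into a list of dicts."""
    rows = iter(lines)
    header = next(rows).split()
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))

    records = []
    for line in rows:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or incomplete lines such as "..."
        record = dict(zip(header, fields))
        record["pos"] = int(record["pos"])
        for name in NUMERIC_COLUMNS:
            record[name] = float(record[name])
        records.append(record)
    return records


example = [
    "chr rsid pos allele_A allele_B P_value info beta se",
    "1 rs16969968 76669980 A G 0.027185 0.99025 -0.12571 0.056914",
    "15 rs1051730 76681394 A G 0.030083 0.96504 -0.12551 0.058043",
]
records = parse_cohort(example)
print(records[0]["rsid"], records[0]["beta"])
```

Because columns are looked up by name from the header, the column order in the file does not matter and optional columns such as chr are simply carried along as strings.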


META can use the output files of SNPTEST as its input files because all the information mentioned above is already included in the output of SNPTEST. See SNPTEST Mode for how to read them, and the SNPTEST documentation for details of its output.

Output Summaries

The output file of META contains one line for each SNP, preceded by a header line that names each column. The following table gives a description of each of the entries in this file.

chr Chromosome number (if you specified the chromosome in input files).
rsid SNP id (taken from input files).
pos Base-pair position of SNP (taken from input files).
allele_A non-coded allele (taken from input files).
allele_B coded allele (taken from input files).
P_value combined p-value.
beta combined effect size.
se combined standard error of effect size.
Q Cochran's Q statistic.
P_heterogeneity p-value for heterogeneity.
I2 percentage of total variation across studies that is due to heterogeneity.
P_cohort_1, ..., P_cohort_n p-values of cohort 1, ..., cohort n.
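
For readers unfamiliar with the heterogeneity columns, the following Python sketch shows the textbook definitions of Cochran's Q and I2 computed from per-cohort effect sizes and standard errors. It is an illustration under the usual inverse-variance weighting, not META's source code, and the function name heterogeneity is hypothetical. P_heterogeneity is conventionally obtained by comparing Q with a chi-square distribution with k-1 degrees of freedom (k cohorts); that step is omitted here.

```python
def heterogeneity(betas, ses):
    """Return (Q, I2 in percent) for per-cohort effect sizes and standard errors."""
    weights = [1.0 / se ** 2 for se in ses]          # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q: weighted squared deviations of each cohort from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2: share of Q in excess of its expectation (df), floored at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Two hypothetical cohorts with clearly different effect sizes
q, i2 = heterogeneity([0.0, 0.2], [0.05, 0.05])
print(q, i2)
```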

Running META

To run META and see the parameters it requires, simply type:

./meta

META will read gzipped or uncompressed files as input. Output files will be gzipped if the main input data file is gzipped.

The following command shows the simplest use of META:

./meta --method 1 --cohort example1.txt example2.txt --output meta.txt

or if the input files are gzipped:

./meta --method 1 --cohort example1.txt.gz example2.txt.gz --output meta.txt

Three different meta-analysis methods are implemented. Here 1 selects the inverse variance method based on the fixed-effects model. For other methods, see Options. Hence, this command combines the p-values at each SNP in example1.txt and example2.txt and saves the result to meta.txt. The SNPs in the output file are the union of the SNPs in the input files, so the number of cohorts used to combine p-values can differ from SNP to SNP, as some SNPs are found only in some cohorts (due to different genotyping platforms, imputation quality, etc.).
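
The inverse variance method with a fixed-effects model can be sketched in Python as follows. This is the standard textbook calculation, shown for orientation only; it is not taken from META's source, the function name fixed_effects is hypothetical, and it assumes that the effect sizes of all cohorts are already expressed relative to the same coded allele.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects combination of per-cohort estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p_value


# The first cohort uses the rs16969968 row of the example input file;
# the second cohort's values are made up for illustration.
beta, se, p_value = fixed_effects([-0.12571, -0.10], [0.056914, 0.060])
print(beta, se, p_value)
```

A SNP that is present in only one cohort passes through unchanged, since the single weight cancels in the ratio.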

Threshold Imputation Quality

The imputation quality score is the ratio of the empirically observed variance of the allele dosage to the expected binomial variance p(1-p) at Hardy-Weinberg equilibrium, where p is the observed allele frequency from HapMap [2]. By default, META combines p-values at SNPs with an imputation quality score ≥ 0.5. This can be changed by setting --threshold. For example, to produce a result based on SNPs with imputation quality score ≥ 0.9, use the command:

./meta --method 1 --threshold 0.9 --cohort example1.txt example2.txt --output meta.txt
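
The effect of the threshold on which cohorts contribute to each SNP can be pictured with the Python sketch below. It assumes that filtering is applied per cohort, so that a SNP with a low score in one cohort can still be analysed using the remaining cohorts; the documentation above does not state this explicitly, and the function name and the info values are hypothetical.

```python
def contributing_cohorts(cohorts, threshold=0.5):
    """Map each rsid to the indices of cohorts where it passes the threshold."""
    usable = {}
    for index, records in enumerate(cohorts):
        for record in records:
            if record["info"] >= threshold:
                usable.setdefault(record["rsid"], []).append(index)
    return usable


# Hypothetical info scores for two cohorts
cohort1 = [{"rsid": "rs1051730", "info": 0.97},
           {"rsid": "rs578776", "info": 0.99}]
cohort2 = [{"rsid": "rs1051730", "info": 0.45},
           {"rsid": "rs6495307", "info": 0.99}]

usable = contributing_cohorts([cohort1, cohort2], threshold=0.5)
print(usable)
```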

Specify sample size of each cohort

To use the z-statistics combination method (--method 3), the sample size of each cohort is required. In our example, the sample sizes of example1.txt and example2.txt are 100 and 120 respectively. They are specified with the --sample-size option, as in the following command:

./meta --method 3 --sample-size 100 120 --cohort example1.txt example2.txt --output meta.txt
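
A common form of the z-statistics combination weights each cohort's signed z-score by the square root of its sample size. The Python sketch below follows that convention; the exact weighting used inside META is an assumption here, and the function name and the p-values are hypothetical. Directions are given as +1 or -1 (the sign of the effect of the coded allele).

```python
import math
from statistics import NormalDist


def z_combination(p_values, directions, sample_sizes):
    """Combine two-sided p-values using sqrt(sample size) weights."""
    normal = NormalDist()
    weighted_sum = 0.0
    for p, direction, n in zip(p_values, directions, sample_sizes):
        # |z| is recovered from the two-sided p-value (requires 0 < p <= 1),
        # then signed by the direction of the effect
        z = -normal.inv_cdf(p / 2.0) * (1 if direction > 0 else -1)
        weighted_sum += math.sqrt(n) * z
    z_combined = weighted_sum / math.sqrt(sum(sample_sizes))
    p_combined = math.erfc(abs(z_combined) / math.sqrt(2.0))
    return z_combined, p_combined


# Sample sizes 100 and 120 as in the command above; p-values are made up
z, p = z_combination([0.03, 0.04], [-1, -1], [100, 120])
print(z, p)
```

Only the p-value and the sign of the effect enter the calculation, which is why beta and se are not required for this method.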

Specify genomic control lambda for each cohort

The test statistics of each cohort may be inflated (due to population structure, for example). Therefore, the genomic inflation factor lambda of each cohort should be checked prior to the meta-analysis, and these lambdas should be supplied to the meta-analysis procedure to adjust the standard errors of the effect sizes. This is done with the --lambda option. For example, the following command can be used when the genomic control lambdas of example1.txt and example2.txt are 1.05 and 1.08 respectively:

./meta --method 1 --lambda 1.05 1.08 --cohort example1.txt example2.txt --output meta.txt
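
The usual genomic control recipe is sketched below in Python: lambda is estimated as the median of the observed 1-df chi-square statistics divided by the median of the null distribution (about 0.455), and dividing each chi-square statistic by lambda is equivalent to multiplying each standard error by the square root of lambda. Whether META applies exactly this adjustment is an assumption; the function names are hypothetical.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455)
MEDIAN_CHI2_1DF = NormalDist().inv_cdf(0.75) ** 2


def estimate_lambda(p_values):
    """Estimate the genomic inflation factor from two-sided p-values."""
    normal = NormalDist()
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / MEDIAN_CHI2_1DF


def adjust_se(se, lam):
    """Inflate a standard error so that chi-square = (beta/se)^2 shrinks by lam."""
    return se * math.sqrt(lam)


# Standard error from the example input file, adjusted with lambda = 1.05
print(adjust_se(0.056914, 1.05))
```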

Select SNPs of interest

With the --rsid option, the analysis can be restricted to specific SNPs. For example:

./meta --method 1 --cohort example1.txt example2.txt --rsid rs1051730 rs16969968 --output meta.txt

will output the meta analysis result of two SNPs only: rs1051730 and rs16969968.

Select a subregion of SNPs

The --interval option specifies the lower and upper boundaries of the region of interest, in base-pair positions. See the following example:

./meta --method 1 --cohort example1.txt example2.txt --interval 76500000 77000000 --output meta.txt

The output file will contain the results of SNPs in the region [76500000, 77000000).

Select Best SNPs

META can also report the best SNPs in terms of the combined P_value, using the --top-snp option. For example,

./meta --method 1 --cohort example1.txt example2.txt --top-snp 5 --output meta.txt

will write the top 5 SNPs (in ascending order of combined p-value) to meta.txt.
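
The same selection can be reproduced on an existing result table. The Python sketch below picks the n records with the smallest combined p-value, in ascending order; the function name top_snps and the p-values are hypothetical.

```python
import heapq


def top_snps(results, n):
    """Return the n records with the smallest combined p-value, best first."""
    return heapq.nsmallest(n, results, key=lambda record: record["P_value"])


# Hypothetical combined p-values for three SNPs
results = [
    {"rsid": "rs578776", "P_value": 0.013},
    {"rsid": "rs12910984", "P_value": 0.0032},
    {"rsid": "rs514743", "P_value": 0.91},
]
best = top_snps(results, 2)
print([record["rsid"] for record in best])
```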

SNPTEST Mode

META is able to directly read the output of SNPTEST (http://www.stats.ox.ac.uk/~marchini/software/gwas/snptest.html) as input using the --snptest option. The simplest example of its use is:

./meta --snptest --method 1 --cohort example1.txt example2.txt --output meta.txt

NOTE that if you want to use this option, you can only specify one model with option --frequentist in SNPTEST. 

Options

A complete set of options is given in the following table:

Parameter      Type                      Description
--method       Number (1 to 3)           Three different methods used to combine p-values:
                                           1 = inverse variance method (based on fixed-effects model);
                                           2 = inverse variance method (based on random-effects model);
                                           3 = z-statistics combination method (based on fixed-effects model).
--cohort       Files                     A vector of formatted files.
--output       File                      Output file.
--snptest      File                      Optional, use the output of SNPTEST as input files.
--sample-size  Numbers                   Optional, a vector of sample sizes for each cohort.
                                         To use the z-statistics combination method (--method 3), sample sizes have to be given.
--lambda       Numbers                   Optional, a vector of genomic control lambdas for each cohort.
--threshold    Number (between 0 and 1)  Optional, define a threshold of imputation quality score (between 0 and 1), default value = 0.5.
--rsid         RSIDs                     Optional, RSIDs of SNPs of interest.
--interval     Two numbers               Optional, define a subset of SNPs by position (in base pairs) in the range start ≤ position ≤ end.
--top-snp      Number                    Optional, define the number of most significant SNPs that will be output.

Perl Script

A Perl script called meta.pl is also provided to ease the use of META. Once you have downloaded it, you can type "./meta.pl --help" in the terminal for brief help or "perldoc meta.pl" for the full documentation. A simple use of the script is given below, for the case where the cohort files example1.txt and example2.txt are stored in examples/cohorts1/ (note that there is no "chr" information in the input files):


./meta.pl --method 1 --dir examples/cohorts1/ --output meta1.txt

Another example (the "chr" information is included in the input files) is given below:

./meta.pl --method 1 --dir examples/cohorts2/ --output meta2.txt

References

[1] J. Z. Liu et al. (2010) Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nature Genetics, 42: 436-440.
[2] J. Marchini, B. Howie, S. Myers, G. McVean and P. Donnelly (2007) A new multipoint method for genome-wide association studies via imputation of genotypes. Nature Genetics, 39: 906-913.
[3] P. de Bakker et al. (2008) Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Human Molecular Genetics, 17: R122-R128.

Contact Information

If you have any questions regarding the use of this program please send an email to Dr Jason Liu (jsliu < at > stats < dot > ox <dot > ac < dot > uk)