GWAS DESCRIPTION

Citation: Liu et al., Nature Genetics 2015

Sample size: 35,197 unaffected and 35,346 affected individuals (20,155 Crohn's disease and 15,191 ulcerative colitis)

Genotyping platform and processing: ImmunoChip array. Genotypes were called using optiCall for 192,402 autosomal variants before quality control.

Quality control: We removed variants with a missing data rate >2% across the whole dataset or >10% in any one batch, and variants that failed (at a false discovery rate <10E-5 in either the whole dataset or at least two batches) any of the following tests: (1) Hardy-Weinberg equilibrium in controls; (2) differential missingness between cases and controls; (3) differences in allele frequency across batches in controls, Crohn's disease, or ulcerative colitis. We also removed non-coding variants that were present in the 1000 Genomes pilot stage but were neither in the subsequent phase I integrated variant set (March 2012 release) nor in releases 2 or 3 of HapMap, as these mostly represent false positives from the 1000 Genomes pilot, which often genotype poorly. Where a variant failed in exactly one batch, we set all genotypes to missing for that batch (to be reimputed later) and included the site if it passed in the remaining batches.

We removed individuals that had >2% missing data or a significantly higher or lower inbreeding coefficient (F; significance defined as FDR<0.01). We also removed duplicated or related individuals (PI_HAT>=0.4, calculated from the linkage disequilibrium pruned dataset described below) by sequentially removing the individual with the largest number of related samples until no related samples remained.

We projected all remaining samples onto principal component axes generated from HapMap 3, and classified their ancestry using a Gaussian mixture model fitted to the European (CEU+TSI), African (YRI+LWK+ASW+MKK), East Asian (CHB+JPT), and South Asian (GIH) HapMap samples.
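The sequential relatedness pruning described in the quality-control step can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `prune_related`, the input format (tuples of two sample IDs and their PI_HAT), and the tie-breaking rule (smallest sample ID first) are assumptions made for illustration only.

```python
from collections import defaultdict


def prune_related(pairs, threshold=0.4):
    """Greedily remove samples until no related pair remains.

    pairs: iterable of (id_a, id_b, pi_hat) tuples (hypothetical input format).
    threshold: pairs with pi_hat >= threshold count as duplicated or related.
    Returns the set of removed sample IDs.
    """
    # Build an undirected graph whose edges are related pairs.
    neighbours = defaultdict(set)
    for a, b, pi_hat in pairs:
        if a != b and pi_hat >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    removed = set()
    while True:
        # Samples that still have at least one related partner.
        candidates = {s: n for s, n in neighbours.items() if n}
        if not candidates:
            break
        # Pick the sample with the most relatives; ties go to the
        # smallest ID (an assumption, the source does not specify this).
        worst = max(sorted(candidates), key=lambda s: len(candidates[s]))
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)
        removed.add(worst)
    return removed


if __name__ == "__main__":
    example = [("A", "B", 0.50), ("A", "C", 0.45),
               ("B", "C", 0.10), ("D", "E", 1.00)]
    print(sorted(prune_related(example)))
```

Removing the most-connected sample first tends to keep more individuals than dropping one member of every related pair independently, which is presumably why a sequential scheme was used.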
We removed all samples that were classified as non-European, or that lay more than 8 standard deviations from the European cluster. After quality control, there were 67,852 European-derived samples with valid diagnoses (healthy control, Crohn's disease, or ulcerative colitis), and 161,681 genotyped variants available for downstream analyses.

Imputation: Imputation was performed separately in each ImmunoChip autosomal high-density region (185 total) using the 1000 Genomes phase I integrated haplotype reference panel. To prevent edge effects, we extended each side of the high-density regions by 50 kbp. Two imputations were performed sequentially using the software and parameters described below. The first imputation was performed immediately after quality control, and its major results were manually inspected. The second imputation was performed after removing variants that failed the manual cluster plot inspection. We used SHAPEIT (versions: first imputation, v2.r644; second imputation, v2.r769) to pre-phase the genotypes, followed by IMPUTE2 (versions: first, 2.2.2; second, 2.3.0) to perform the imputation. The reference panels were downloaded from the IMPUTE2 website (first, March 2012 release; second, December 2013 release). After the second imputation, there were 388,432 variants with good imputation quality (INFO>0.4). These include 99.9% of variants with MAF>=0.05, 99.3% of variants with 0.05>MAF>=0.01, and 63.0% of variants with MAF<0.01, with similar success rates for coding and non-coding variants, making it unlikely that missing variants substantially affected our fine-mapping conclusions.
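The region extension and post-imputation filter described above can be sketched in a few lines of Python. The function names, the 1-based coordinate convention, the clamping of the start position at 1, and the bin labels are assumptions for illustration; only the 50 kbp flank, the INFO>0.4 threshold, and the MAF cut-offs come from the text.

```python
def pad_region(start, end, flank=50_000):
    """Extend a high-density region by `flank` bp on each side.

    Assumes 1-based inclusive coordinates; the start is clamped at 1
    so that padding never runs off the beginning of the chromosome.
    """
    return max(1, start - flank), end + flank


def passes_info(info, threshold=0.4):
    """Keep a variant only if its imputation INFO score exceeds the threshold."""
    return info > threshold


def maf_bin(maf):
    """Assign a variant to one of the three MAF bins reported in the text."""
    if maf >= 0.05:
        return "MAF>=0.05"
    if maf >= 0.01:
        return "0.05>MAF>=0.01"
    return "MAF<0.01"


if __name__ == "__main__":
    # Hypothetical region coordinates and variant scores, for illustration only.
    print(pad_region(1_000_000, 1_200_000))
    variants = [("v1", 0.95, 0.20), ("v2", 0.35, 0.004), ("v3", 0.70, 0.03)]
    for name, info, maf in variants:
        print(name, passes_info(info), maf_bin(maf))
```

Note that the filter is strict (INFO>0.4), so a variant with INFO exactly 0.4 would be excluded under this reading of the text.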