Statistical analyses

Validation of the long-term average depression phenotype

We assessed construct validity by examining the association between the 14-year depression measure and established correlates of depression available in our sample: cigarette smoking (pack-years), physical activity (METs per week), household characteristics, and score on a phobic anxiety scale. We expected depression to be associated with greater likelihood of smoking, less physical activity, lower occupational and socioeconomic status, and higher degree
of phobic anxiety. Details are described in Appendix 22010.

Traditional GWAS

Genome-wide association analyses were first conducted separately for each NHS GWA substudy. A linear regression (using ProbABEL; Aulchenko et al. 2010) was performed on the long-term average depression score assuming an additive genetic model, adjusting for age, disease status, and the top three or four principal components-derived eigenvectors
to address residual population stratification (depending on the sample, as detailed in Table S2). SNPs with minor allele frequency less than 2% or imputation quality R2 less than 0.5 were excluded on a per-substudy basis. Meta-analysis using the METAL program was performed for each SNP across the four NHS GWA substudies, combining allelic effects with inverse variance weighting (Willer et al. 2010). We used a genome-wide significance threshold of P < 5 × 10⁻⁸. Our sample provides 80% power to detect a genetic effect size of 0.1 (corresponding to R2 of 0.006) at a minor allele frequency of 0.15, under an additive genetic model.

Agnostic genome-wide polygenic scoring in NHS (NHS-GWAS-PS)

Genome-wide PS based on agnostic priors can provide a genetic risk score even when few of the causal genetic loci have been consistently identified
in the literature. Following previously established methods, we first restricted the analysis to 1,584,339 SNPs with high imputation quality (R2 > 0.95) that were available across all four NHS GWA substudies. We next used the PLINK pruning procedure (200-SNP sliding window, pairwise r2 threshold of 0.25, and successive shift forward by five SNPs) to remove redundant SNPs, leaving a total of 97,883 independent SNPs. Next, we performed a cross-validation procedure to obtain an unbiased estimate of the prediction performance. In the PS calculations, each time we used three of the four NHS GWA substudies as the “training” set to construct a polygenic risk score, which was then tested in the one remaining subsample (“testing” set). The procedure was conducted in three steps: (1) SNP-depression associations (beta weights) were first extracted from each of the three substudies in the training set.
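As an illustration only, the leave-one-substudy-out design and step (1) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' pipeline: the substudy labels, SNP identifiers, and beta values below are hypothetical placeholders, and the per-substudy GWAS results are assumed to be already available as simple dictionaries.

```python
# Hypothetical labels for the four NHS GWA substudies (not the real names).
SUBSTUDIES = ["substudy_A", "substudy_B", "substudy_C", "substudy_D"]

# Toy per-substudy GWAS results: SNP id -> beta weight (made-up numbers).
gwas_results = {
    "substudy_A": {"rs1": 0.020, "rs2": -0.010},
    "substudy_B": {"rs1": 0.015, "rs2": -0.005},
    "substudy_C": {"rs1": 0.030, "rs2": 0.000},
    "substudy_D": {"rs1": 0.010, "rs2": -0.020},
}


def leave_one_substudy_out(substudies):
    """Yield (training, testing) pairs, holding out each substudy once."""
    for held_out in substudies:
        training = [s for s in substudies if s != held_out]
        yield training, held_out


def extract_training_betas(results, training):
    """Step (1): collect the per-SNP beta weights of each training substudy."""
    return {name: results[name] for name in training}


folds = list(leave_one_substudy_out(SUBSTUDIES))
for training, testing in folds:
    betas = extract_training_betas(gwas_results, training)
    print(f"test on {testing}; beta weights taken from {sorted(betas)}")
```

Each substudy serves as the testing set exactly once, and its own association results never contribute to the weights used to score it, which is what keeps the estimate of prediction performance unbiased.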