Materials and Methods
We used data from the UKB, a population-based prospective cohort study of 502 419 UK residents aged 40–69 years recruited between 2006 and 2010 from 22 assessment centres across the UK. A wide range of phenotyping assessments, biochemical assays, genome-wide genotyping and ongoing longitudinal follow-up data are available for most study participants. For the purposes of the current analyses, we restricted our sample to 46% of the study participants (n = 231 336) with detailed linked electronic medical records from their primary care general practitioners. These primary care data include medication prescriptions for a time period ranging from as early as 1978 until 2019, thus allowing a detailed assessment of duration and dose of statin intake both before and after baseline assessments. We excluded 15 137 individuals with missing genetic data and 155 individuals with a history of ICH at baseline (defined as presence of the illness code 'brain haemorrhage', Figure 1).
Figure 1. Study concept and study population. (A) Individuals respond differently to statins according to their genetic profile.[35] Assuming that there is no selection pressure, drug response is assorted randomly within a population and can be used to explore causal effects of the drug on outcomes using observational data. (B) Flow chart of the study participants. Individuals without genetic data or with a history of ICH at baseline were excluded.
The UKB has institutional review board approval from the Northwest Multi-Center Research Ethics Committee (Manchester, UK). All participants provided written informed consent. We accessed the data following approval of an application by the UKB Ethics and Governance Council (Application No. 36993).
Preparation of the UK Biobank Primary Care Data
We extracted data on statin prescriptions and LDL measurements from the UKB primary care data. To obtain statin exposure metrics, we harmonized the dosages of different statins on the basis of comparison factors from trials evaluating statin efficacy and calculated, for each participant, the mean statin dose across the different prescriptions, expressed as the equivalent atorvastatin dose.[3,32–34] The data extraction and quality control processes are described in detail in the Supplementary material. According to the 2018 AHA guidelines on cholesterol management, statin intensity was categorized as low (<10 mg), medium (≥10 mg and <40 mg) and high (≥40 mg) on the basis of the atorvastatin equivalent dose.
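The harmonization and intensity categories described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's extraction pipeline: the multipliers in `HYPOTHETICAL_FACTORS` are placeholders rather than the trial-derived comparison factors, the function names are invented for this example, and the unweighted mean across prescriptions is an assumption. Only the intensity cut-offs (<10 mg, ≥10 mg and <40 mg, ≥40 mg) are taken from the text.

```python
from statistics import mean

# Placeholder multipliers converting 1 mg of a statin into atorvastatin-equivalent mg.
# HYPOTHETICAL values for illustration only; the study's comparison factors
# come from statin efficacy trials and are given in its Supplementary material.
HYPOTHETICAL_FACTORS = {
    "atorvastatin": 1.0,
    "simvastatin": 0.5,
    "rosuvastatin": 2.0,
}


def atorvastatin_equivalent(drug, dose_mg):
    """Convert one prescribed dose to atorvastatin-equivalent mg."""
    return dose_mg * HYPOTHETICAL_FACTORS[drug.lower()]


def mean_equivalent_dose(prescriptions):
    """Unweighted mean equivalent dose over (drug, dose_mg) pairs (an assumption)."""
    if not prescriptions:
        raise ValueError("at least one prescription is required")
    return mean(atorvastatin_equivalent(drug, dose) for drug, dose in prescriptions)


def statin_intensity(equivalent_dose_mg):
    """Apply the cut-offs stated in the text: <10 low, 10 to <40 medium, >=40 high."""
    if equivalent_dose_mg < 10:
        return "low"
    if equivalent_dose_mg < 40:
        return "medium"
    return "high"


# Example: one participant with two recorded prescriptions.
example = [("simvastatin", 40), ("atorvastatin", 40)]
dose = mean_equivalent_dose(example)
print(dose, statin_intensity(dose))
```

A duration-weighted mean would be an equally plausible reading of "mean statin dose"; the choice depends on details given only in the Supplementary material.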
Polygenic Score for Estimating On-statin LDL Response
We used data from the Genomic Investigation of Statin Therapy Consortium, a two-stage genome-wide association study (GWAS) for on-statin LDL cholesterol response among 40 914 statin-treated subjects of European ancestry (30 246 from 10 randomized controlled trials and 10 668 from 11 observational studies), to construct a polygenic score of LDL lowering following statin intake. There was no participant overlap between those studies and the UKB. Following a previously described approach, we used a set of 35 single nucleotide polymorphisms (SNPs), selected on the basis of associations with on-statin LDL lowering at P < 5 × 10⁻⁵ and clumped at r² < 0.001 on the basis of the European reference panel of the 1000 Genomes Project. We then calculated a genetic score with the imputed genotype data of the UKB. To confirm that the observed effects were specifically due to genetically predicted on-statin LDL and not off-statin LDL, in sensitivity analyses we tested the association between each of the 35 SNPs included in the score and LDL levels measured at the UKB baseline assessment among those who had never used statins (linear regression models adjusted for age, sex, principal components (PC) 1–10 of population structure, kinship and genotyping assay) and removed the SNPs that were associated with off-statin LDL at P < 0.0014 (0.05/35, Bonferroni correction). All SNPs for the genetic and the alternative score are provided in Supplementary Table 3.
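As an illustration of the scoring and filtering steps above, the sketch below computes an additive score as a weighted sum of effect-allele dosages and drops SNPs that pass a Bonferroni threshold for off-statin LDL. It assumes a standard additive scoring scheme; the study computed its score with dedicated genetics software, and the SNP identifiers, weights and P-values used here are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) over the SNPs in `weights`."""
    return sum(weights[snp] * dosages[snp] for snp in weights)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold, e.g. 0.05 / 35 for the 35-SNP score."""
    return alpha / n_tests


def drop_off_statin_snps(weights, off_statin_p, alpha=0.05):
    """Keep only SNPs NOT associated with off-statin LDL below the threshold."""
    threshold = bonferroni_threshold(alpha, len(weights))
    return {snp: w for snp, w in weights.items() if off_statin_p[snp] >= threshold}


# Hypothetical two-SNP example (identifiers and numbers are made up).
weights = {"snp_a": 0.10, "snp_b": -0.20}
dosages = {"snp_a": 2.0, "snp_b": 1.0}
off_statin_p = {"snp_a": 0.30, "snp_b": 0.0001}

full_score = polygenic_score(dosages, weights)
alternative_weights = drop_off_statin_snps(weights, off_statin_p)
alternative_score = polygenic_score(dosages, alternative_weights)
print(full_score, alternative_score)
```

Note that the threshold in `drop_off_statin_snps` is derived from the number of SNPs tested, which reproduces 0.05/35 ≈ 0.0014 when all 35 SNPs are passed in.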
Validation of Statin Response Genetic Scores on LDL Trajectories
To test the relevance assumption of MR, we aimed to confirm the effect of the genetic score on on-statin LDL levels by exploring associations with longitudinal LDL level changes in the primary care data among statin users. Only participants with at least one LDL measurement before and one measurement after their first recorded statin prescription (off- and on-statin LDL) were included in this analysis (n = 40 633, 53% of statin users). To account for multiple LDL values per participant over time, we used a mixed model clustered by participant with LDL levels as the outcome and the genetic score, time and their interaction as the exposures. The models were further adjusted for age, sex, statin equivalency dose, PC1–10, race, kinship and genotyping assay.
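One way to write down the mixed model described above is given below. The text states only that the model was clustered by participant, so the random intercept $u_i$ and the normality assumptions are illustrative choices rather than the authors' confirmed specification; the symbols are introduced here for the example.

```latex
\mathrm{LDL}_{ij} = \beta_0 + \beta_1\,\mathrm{GS}_i + \beta_2\,t_{ij}
  + \beta_3\,(\mathrm{GS}_i \times t_{ij})
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_{i} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $i$ indexes participants and $j$ their repeated LDL measurements, $\mathrm{GS}_i$ is the genetic score, $t_{ij}$ is time, and $\mathbf{x}_i$ collects the adjustment covariates (age, sex, statin equivalency dose, PC1–10, race, kinship and genotyping assay). Under this formulation, the interaction coefficient $\beta_3$ captures whether LDL trajectories over time differ by genetic score.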
Influence of the Genetic Scores on Baseline LDL and Lipid Particle Metabolites
To explore whether a higher genetic score for on-statin LDL lowering mimics an exposure to higher statin intake, we compared associations of a higher score and a higher statin dose with the entire spectrum of 228 lipid particle metabolites among statin users, as measured by nuclear magnetic resonance at baseline. We constructed linear regression models with each metabolite as outcome and the genetic scores or statin equivalent dose as exposure. The models were further adjusted for age and sex; those with the genetic score as exposure were further adjusted for PC1–10, race, kinship and genotyping assay. We corrected for multiple hypothesis testing with the Bonferroni method, setting a significance threshold at 0.05/228. Correlations between the estimates derived for the genetic scores and those derived for statin equivalent dose across the lipid traits were tested with Pearson's correlation.
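The two summary steps above, a Bonferroni threshold of 0.05/228 and a Pearson correlation between per-metabolite estimates, can be sketched with the standard library alone. The function names and the example estimates are hypothetical; the regression models that would produce the estimates are not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of estimates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def bonferroni_significant(p_values, n_tests=228, alpha=0.05):
    """Flag P-values below alpha / n_tests (0.05/228 in the text)."""
    threshold = alpha / n_tests
    return [p < threshold for p in p_values]


# Hypothetical per-metabolite effect estimates for the score and for statin dose.
score_betas = [-0.12, -0.08, 0.03, 0.01]
dose_betas = [-0.30, -0.22, 0.05, 0.02]
print(pearson_r(score_betas, dose_betas))
print(bonferroni_significant([1e-6, 1e-3]))
```

A high correlation between the two sets of estimates would indicate that the score and statin dose shift the lipid profile in a similar pattern, which is the comparison the paragraph describes.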
UKB participants' records have been linked with inpatient hospital codes, primary care data and death registry for longitudinal follow-up. Incident ICH was defined as events occurring after baseline, documented in either hospital admissions or death registry data by the following International Classification of Diseases (ICD) codes: ICD-9 431.X and ICD-10 I61. These criteria were aligned with the diagnostic algorithm for stroke in the UKB (https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/alg_outcome_stroke.pdf) that captured events up to December 2018. We manually applied the same criteria to capture events occurring thereafter up to the end of follow-up (June 2020). Types of intracranial haemorrhage other than ICH were not studied. As positive controls, we also tested associations of the genetic scores with incident myocardial infarction (MI) and peripheral artery disease (PAD), which were defined on the basis of the following ICD-10 codes: I21.X, I22.X, I23.X, I24.1, I25.2 (for MI) and I70.0, I70.00, I70.01, I70.2, I70.20, I70.21, I70.8, I70.80, I70.9, I70.90, I73.8, I73.9 (for PAD).
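The case definition above can be illustrated with a small sketch that matches ICD codes by prefix and keeps only events after baseline. The record layout, function names and dates are assumptions made for this example; the actual UKB algorithm is the one documented at the URL cited in the text.

```python
from datetime import date

# Code prefixes taken from the text: ICD-10 I61 and ICD-9 431.X.
ICH_PREFIXES = {10: ("I61",), 9: ("431",)}


def normalize(code):
    """Strip dots and whitespace so 'I61.9' and 'I619' compare equal."""
    return code.replace(".", "").strip().upper()


def is_ich_code(code, icd_version):
    """True if the code falls under the ICH prefixes for the given ICD version."""
    if icd_version not in ICH_PREFIXES:
        raise ValueError("icd_version must be 9 or 10")
    return normalize(code).startswith(ICH_PREFIXES[icd_version])


def first_incident_ich(events, baseline, end_of_follow_up):
    """Earliest ICH event strictly after baseline and up to the end of follow-up.

    `events` is an iterable of (event_date, code, icd_version) tuples, a
    hypothetical layout for hospital admission and death registry records.
    """
    dates = [
        event_date
        for event_date, code, version in events
        if baseline < event_date <= end_of_follow_up and is_ich_code(code, version)
    ]
    return min(dates) if dates else None


# Hypothetical participant: one ischaemic stroke code and one ICH code.
records = [
    (date(2012, 3, 1), "I63.9", 10),
    (date(2015, 7, 15), "I61.0", 10),
]
print(first_incident_ich(records, date(2009, 1, 10), date(2020, 6, 1)))
```

The same prefix-matching approach would extend to the MI and PAD code lists, with the caveat that some of those entries (for example I24.1) are exact codes rather than code families.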
Effect of the Genetic Score on ICH and Cardiovascular End Points
To explore the effects of on-statin genetically predicted LDL response on risk for incident ICH, we used Cox proportional hazards models adjusted for previously published risk factors for ICH[1,2] and genetic covariates: age, sex, BMI, smoking status, history of diabetes, systolic blood pressure, mean statin dose, mean LDL levels, use of anticoagulation and antiplatelet drugs at baseline, PC1–10, race, kinship and genotyping assay. As positive controls, we explored associations between on-statin genetically predicted LDL response and risk for MI and PAD using similar Cox models, additionally adjusting for history of hypertension, hypercholesterolaemia, MI, stroke or PAD, but without adjusting for antiplatelet and anticoagulation intake. To test the independence and exclusion restriction assumptions of MR and exclude the possibility that any associations are driven by pleiotropic effects of the score independent of on-statin LDL lowering, we tested the same associations among non-statin users. Because statin users are, due to indication bias, at higher risk for MI and PAD, the selection of the population on the basis of statin use could have introduced collider bias. To address this issue, in a sensitivity analysis we used inverse probability weighting to confirm our findings for MI and PAD. Specifically, in the full UKB cohort, we constructed a linear regression model with statin use as outcome and age, sex, BMI, smoking status, hypertension, systolic blood pressure, history of diabetes, intake of diabetes drugs, hypercholesterolaemia, LDL, history of MI, stroke or PAD and the genetic score as covariates. For statin users, we then used the inverse of the fitted values of that model as weights in the respective Cox models to account for the probability of statin prescription in an individual.
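The weighting step in the sensitivity analysis can be sketched as follows, assuming the fitted values from the statin-use model are already available. The function names and numbers are hypothetical. Because a linear model for a binary outcome can yield fitted values at or below zero or above one, this sketch rejects such values; the text does not state how, or whether, such cases were handled in the study.

```python
def inverse_probability_weights(fitted_values):
    """Return 1 / fitted value for each statin user.

    Raises ValueError for fitted values outside (0, 1], which cannot be
    interpreted as a probability of statin prescription.
    """
    weights = []
    for fitted in fitted_values:
        if not 0.0 < fitted <= 1.0:
            raise ValueError(f"fitted value {fitted} is not in (0, 1]")
        weights.append(1.0 / fitted)
    return weights


def weighted_mean(values, weights):
    """Weighted average, shown only to illustrate the effect of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical fitted probabilities of statin use for three statin users.
fitted = [0.8, 0.5, 0.2]
weights = inverse_probability_weights(fitted)

# Users who were unlikely to be prescribed a statin receive larger weights,
# so they count more in any weighted analysis (here, a weighted event rate).
events = [1, 0, 0]
print(weights, weighted_mean(events, weights))
```

In the study these weights enter the Cox models directly; the weighted mean above is only a stand-in to show how low-probability statin users are up-weighted.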
For SNP extraction, genetic score calculation, SNP association tests and relationship inference we used PLINK, bcftools and KING.[37–40] For data extraction, curation, preparation and figure generation, we used RStudio 2021.09.0 with R v.4.1.1 on Mac OS X (aarch64-apple-darwin20) with the packages coxphw, data.table, dplyr, FSA, ggplot2, gmodels, lmerTest, lme4, PheWAS, readr, readxl, stringr, survival, survivalAnalysis, survminer, tidyr and writexl. Figure 1 was partly created with BioRender.com. The analysis plan followed the STROBE-MR statement for the usage of MR in observational studies.
The data that support the findings of this study are available from the UKB on submission of a research proposal. The summary statistics of the GWAS for on-statin LDL response used to create the tested polygenic risk score are publicly available.[31,35]
Brain. 2022;145(8):2677-2686. © 2022 Oxford University Press