Population-scale genomics (N = 937 939) allowed assembly of a large FVL/PTGM double-heterozygous cohort while mitigating ascertainment bias.
FVL/PTGM DH appeared to be more common than FVL homozygosity and conferred a similar risk of venous thrombosis.
The factor V Leiden (FVL; rs6025) and prothrombin G20210A (PTGM; rs1799963) polymorphisms are 2 of the most well-studied genetic risk factors for venous thromboembolism (VTE). However, double heterozygosity (DH) for FVL and PTGM remains poorly understood, with previous studies showing marked disagreement regarding thrombosis risk conferred by the DH genotype. Using multidimensional data from the UK Biobank (UKB) and FinnGen biorepositories, we evaluated the clinical impact of DH carrier status across 937 939 individuals. We found that 662 participants (0.07%) were DH carriers. After adjustment for age, sex, and ancestry, DH individuals experienced a markedly elevated risk of VTE compared with wild-type individuals (odds ratio [OR] = 5.24; 95% confidence interval [CI], 4.01-6.84; P = 4.8 × 10−34), which approximated the risk conferred by FVL homozygosity. A secondary analysis restricted to UKB participants (N = 445 144) found that effect size estimates for the DH genotype remained largely unchanged (OR = 4.53; 95% CI, 3.42-5.90; P < 1 × 10−16) after adjustment for commonly cited VTE risk factors, such as body mass index, blood type, and markers of inflammation. In contrast, the DH genotype was not associated with a significantly higher risk of any arterial thrombosis phenotype, including stroke, myocardial infarction, and peripheral artery disease. In summary, we leveraged population-scale genomic data sets to conduct, to our knowledge, the largest study to date on the DH genotype and were able to establish far more precise effect size estimates than previously possible. Our findings indicate that the DH genotype may occur as frequently as FVL homozygosity and may confer a similarly increased risk of VTE.
Introduction
The risk of venous thromboembolism (VTE) is influenced by a range of factors, including germ line genetics, body habitus, inflammatory state, and exposure to known provocations, such as smoking, cancer, trauma, and immobility. Among the genetic risk factors, factor V Leiden (FVL; rs6025) and the 3′ UTR prothrombin gene mutation G20210A (PTGM; rs1799963) are 2 well-established common variants associated with a moderately increased risk of VTE.1-4 In those of European ancestry, FVL heterozygosity is estimated to affect 1 in 20 individuals and PTGM heterozygosity 1 in 50 individuals.5-10 Other populations experience lower but substantial carrier rates for both variants.8,11-13 Modern studies suggest that heterozygosity for either FVL or PTGM confers an approximately two- to threefold increased risk of VTE.14-16
Given the frequencies of FVL and PTGM in the general population, it is estimated that ∼1 in 1000 individuals possesses 1 copy of each variant allele, a state known as double heterozygosity (DH).17 DH patients represent a significant quandary in hematology and genetic counseling, as the clinical implications of the DH genotype remain poorly understood despite the passage of decades since FVL and PTGM were originally discovered. Prior attempts to characterize the impact of the DH genotype on VTE risk have led to extremely divergent effect size estimates that vary by nearly one order of magnitude and often involve high statistical uncertainty.18-23 This outcome may be attributed to study limitations such as small sample size, heterogeneous case and control groups, and the inability to adjust for potential confounders. However, perhaps the greatest limiting factor in most studies was the presence of an inherent ascertainment bias whereby analyses were restricted to patients with thrombosis presenting to medical attention and/or their family members. This referral-based approach has made it challenging to accurately infer the phenotype of DH carriers in the broader population and to predict the consequences of DH carrier status in individuals without a family or personal history of thrombosis.
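The ∼1 in 1000 figure follows from multiplying the two carrier frequencies quoted above. The short sketch below reproduces that arithmetic; it assumes the two variants are inherited independently and treats the quoted 1 in 20 and 1 in 50 frequencies as exact, so it is only an approximation. The variable names are illustrative.

```python
# Carrier frequencies quoted in the text for individuals of European ancestry.
fvl_het = 1 / 20    # FVL heterozygosity
ptgm_het = 1 / 50   # PTGM heterozygosity

# Assumption: the two loci are independent, so the joint frequency
# is the product of the individual frequencies.
dh_expected = fvl_het * ptgm_het

print(f"Expected DH frequency: {dh_expected:.4f} "
      f"(about 1 in {round(1 / dh_expected)})")
```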
To address these limitations and establish reliable risk estimates for the DH genotype across a range of thrombotic disorders, we harnessed data from nearly 1 million participants in the UK Biobank (UKB)24 and FinnGen25 data sets. To our knowledge, we were able to generate the first effect size estimates for the DH genotype derived from unselected data at the population level and achieved superior statistical confidence compared with prior approximations. Our findings indicate that when present in the same individual, FVL and PTGM function in an additive manner to markedly increase the risk of venous but not arterial thrombosis.
Methods
Study population and phenotypes
We used data from 2 large prospective cohort studies, the UKB and FinnGen. The UKB is a population-scale biorepository that contains deep phenotyping and genomic data from ∼500 000 participants in the UK.24 The UKB includes comprehensive clinical annotation, laboratory values, primary physical and biometric measures, and proteomic, metabolomic, genotyping, whole-exome sequencing, and whole-genome sequencing data. These features allow for multidimensional analysis of genotyping data for discovery and hypothesis-driven investigations. Similarly, the FinnGen study combines genotyping and registry-based outcome data for ∼500 000 participants in Finland, representing a diversity of inpatient, outpatient, blood donor, and prospective cohort samples.25 Our study included genotype data from 453 733 individuals included in the FinnGen Data Freeze 11. The UKB was accessed under application number 17488. Ethical approval for local analysis of UKB data was granted by the Massachusetts General Hospital Institutional Review Board. The Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa approved the FinnGen study protocol. The full ethics statement for FinnGen is provided in the supplemental Methods, available on the Blood website.
The thrombotic phenotypes studied included VTE, myocardial infarction (MI), stroke (CVA), and peripheral artery disease (PAD). Briefly, we used the standardized VTE phenotype developed previously by the Broad Institute,26 which included ICD-9/10 codes covering upper and lower extremity deep vein thromboses, phlebitis and thrombophlebitis, and pulmonary embolism, while excluding entities such as thrombophlebitis of the superficial vessels and unspecified coagulation defects. Disease phenotypes were predefined and generated through a combination of medical history, billing codes in electronic health records, and death registry records as previously described.24,26 A detailed description of each clinical phenotype is contained in the supplemental Methods.
Genotyping and quality control
Details of genotyping, imputation, and quality control have been described previously for UKB.24 Briefly, carriers of the FVL variant (rs6025) were identified by direct genotyping of participants using the Affymetrix UKB Axiom (450 000 samples) and Affymetrix UK BiLEVE Axiom (50 000 samples) arrays. Genetic data were then imputed to the Haplotype Reference Consortium panel and UK10K + 1000 Genomes panels (imputed version 3). We removed samples that were outliers for heterozygosity or missingness, samples with putative sex chromosome aneuploidy, samples with a mismatch between self-reported and genetically inferred sex, and samples from participants who had revoked their consent. In the UKB, the FVL variant (rs6025) status was directly genotyped, whereas the PTGM variant (rs1799963) status was imputed with a quality (INFO score) threshold of 0.95. Ancestry for all samples was inferred using principal component analysis as previously described.27
FinnGen samples were genotyped using a FinnGen ThermoFisher Axiom custom array (Thermo Fisher Scientific, San Diego, CA); in addition, a subset of samples from legacy cohorts were previously genotyped using Illumina and Affymetrix arrays (Illumina Inc, San Diego, CA, and Thermo Fisher Scientific, Santa Clara, CA) as detailed previously.25 Genotype imputation was performed using a population-specific SISu v4 imputation reference panel comprising 8557 whole genomes based on a publicly available protocol. Carriers of the FVL and PTGM variants in FinnGen were identified from the imputed data using an INFO score threshold of 0.996. As part of central quality control in the FinnGen study, only participants of genetically inferred Finnish ancestry were included in the analyses.
Sensitivity analyses
We performed a sensitivity analysis using unrelated subsets of each cohort after removing 1 sample for each pair of third-degree or closer relatives in the UKB and FinnGen. Summary data from each cohort of unrelated individuals were combined in a random-effects meta-analysis. Within the UKB cohort, an additional Firth logistic regression model restricted to UKB participants (N = 445 144) was generated with extended covariate adjustments, including body mass index (BMI), blood group, smoking status, platelet count, and C-reactive protein (CRP).
Statistical methods
In the primary analysis, odds ratios (ORs) and confidence intervals (CIs) for specific binary phenotypes were estimated within each biorepository data set using Firth bias-reduced logistic regression (R package logistf version 1.23).28,29 The models were adjusted for age, sex, and the first 10 principal components of genetic ancestry. The results were subsequently combined using a random-effects meta-analysis. The more conservative random-effects model was chosen because of the considerable variability in previous estimates of VTE risk associated with the DH genotype, as well as differences in sample ascertainment and baseline participant health status between UKB and FinnGen. The significance threshold was set at alpha = 0.01 (0.05/5) to reveal any evidence of significant association after correcting for multiple comparisons among the 5 genotypes included in the model. Statistical analyses were performed using R v4.0 for UKB data and R v4.3.1 for FinnGen data and meta-analyses. Survival curves evaluating the cumulative incidence of VTE were generated using the Kaplan-Meier method. Hazard ratios and P values for VTE risk reported with survival curves were derived using a univariate Cox proportional hazard model.
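As an illustration of the pooling step, the sketch below combines cohort-level log ORs with a random-effects model and applies the 0.05/5 threshold described above. It is written in Python rather than R for brevity and assumes the DerSimonian-Laird estimator of between-study variance; the text does not state which estimator was used. The function name and the two cohort estimates are hypothetical and are not values from this study.

```python
import math

def random_effects_meta(log_ors, ses):
    """Pool log odds ratios with a DerSimonian-Laird random-effects model."""
    w = [1 / se**2 for se in ses]                        # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran Q
    df = len(log_ors) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_star = [1 / (se**2 + tau2) for se in ses]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0        # heterogeneity fraction
    return pooled, se_pooled, tau2, i2

# Hypothetical cohort-level estimates (NOT taken from the study).
log_ors = [math.log(5.0), math.log(5.5)]
ses = [0.18, 0.20]

pooled, se_pooled, tau2, i2 = random_effects_meta(log_ors, ses)
z = pooled / se_pooled
p_value = math.erfc(abs(z) / math.sqrt(2))               # two-sided normal P value

alpha = 0.05 / 5                                         # Bonferroni threshold for 5 genotypes
lo = math.exp(pooled - 1.96 * se_pooled)
hi = math.exp(pooled + 1.96 * se_pooled)
print(f"Pooled OR = {math.exp(pooled):.2f} (95% CI, {lo:.2f}-{hi:.2f})")
print(f"Significant at alpha = {alpha}: {p_value < alpha}")
```

With only 2 cohorts the between-study variance is poorly estimated, which is one reason a random-effects model can be regarded as the more conservative choice when it widens the pooled CI.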
This work was approved by the institutional review boards of all the participating centers.
Results
Study population
After conducting sample-level quality control and filtering for patients with complete genotyping data, our analyses included 937 939 individuals (484 206 from the UKB and 453 733 from the FinnGen study) (Figure 1). Of these, 44 051 participants (4.7%) had experienced at least 1 VTE episode (cases), whereas 893 888 had not (controls). Among the UKB participants, the mean ± standard deviation age was 57 ± 8.1 years, with 54.2% being female and 87.9% having European ancestry (supplemental Tables 1 and 2). For FinnGen project participants, the mean ± standard deviation age was 53 ± 18 years, with 56% being female and 100% having European (Finnish) ancestry (supplemental Table 3). Participants across the genotype distribution exhibited similar demographic and baseline clinical characteristics, which is consistent with the minimal bias in the enrollment of individuals according to genotype (supplemental Tables 2 and 3). Combining data from both biorepositories for our primary analyses, we found that 38 186 (4.1%) individuals were heterozygous for FVL and 14 150 (1.5%) were heterozygous for PTGM, whereas 662 participants (0.07%) carried the DH genotype (Figure 1). In comparison, 593 participants (0.06%) were homozygous for FVL, and 69 individuals (<0.01%) were homozygous for PTGM. Consistent with prior reports,30-32 the FVL and PTGM alleles were more common in those of European ancestry, although the South Asian and admixed populations also contained a substantial number of variant carriers (supplemental Table 1).
Study design. The risk of VTE was assessed by generating independent Firth logistic regression models in data sets from the FinnGen project (N = 454 149 participants) and the UKB (N = 484 206 participants). The results were then combined using a random-effects meta-analysis (N = 938 255). The number of VTE cases and the genotype breakdown of the participants at each stage were noted. hom, homozygous; het, heterozygous.
Risk of VTE in DH genotype carriers
We used Firth logistic regression modeling in each data set followed by a random-effects meta-analysis to determine the risk of VTE associated with the genotypes of interest. We computed the I2 statistic to be 0.4, suggesting a low level of statistical heterogeneity between the 2 cohorts being meta-analyzed. After adjusting for sex, age, and ancestry, we found that DH genotype carriers experienced a significantly elevated risk of VTE compared with individuals who are wild-type for both alleles (OR = 5.24; 95% CI, 4.01-6.84; P = 4.8 × 10−34) (Figure 2; supplemental Table 4). By comparison, lower VTE risks were associated with single heterozygosity for PTGM (OR = 1.86; 95% CI, 1.57-2.21; P = 8.6 × 10−13) and FVL (OR = 2.28; 95% CI, 2.03-2.56; P = 1.9 × 10−43) (Figure 2). Homozygosity for FVL (N = 593) was found to be associated with the greatest risk of VTE (OR = 6.19; 95% CI, 4.63-7.88; P = 1.1 × 10−49). In contrast, PTGM homozygosity affected the fewest participants (N = 69) and was associated with a trend toward an increased risk of VTE, which was not statistically significant after adjustment for multiple comparisons (OR = 3.07; 95% CI, 1.26-7.43; P = .014). To assess the impact of plasma lipid concentration on VTE risk, we included the polygenic risk score (PRS) for low-density lipoprotein (LDL) levels33 as a covariate in an expanded Firth regression model (supplemental Table 5). Interestingly, higher plasma LDL, as predicted by PRS, did not appear to influence the risk of VTE (OR = 0.99; 95% CI, 0.97-1.00; P = .09). We next evaluated the potential impact of kinship on our findings by performing a sensitivity analysis restricted to unrelated participants (N = 555 138) (supplemental Table 6). This subgroup included 26 333 VTE cases and 489 DH genotype individuals. Our analysis of the unrelated subgroup yielded effect size estimates similar to those obtained for the overall cohort for all studied genotypes, and the risk of VTE associated with the DH genotype was virtually unchanged (OR = 5.66; 95% CI, 3.66-8.77; P = 7.5 × 10−15).
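A reported OR and 95% CI can be converted back to a standard error, test statistic, and approximate P value on the log scale. The sketch below does this for the primary DH estimate (OR = 5.24; 95% CI, 4.01-6.84). It assumes a symmetric Wald-type interval and a normal approximation, which is a simplification of the Firth and meta-analytic procedures actually used, so the result should only be read as a consistency check. The function name is illustrative.

```python
import math

def wald_from_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z, and a two-sided normal P value from an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log OR
    z = math.log(or_est) / se
    p = math.erfc(abs(z) / math.sqrt(2))                        # two-sided P value
    return se, z, p

# Primary-analysis DH estimate as reported in the text.
se, z, p = wald_from_ci(5.24, 4.01, 6.84)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.1e}")
```

The same function can be applied to any of the other genotype estimates to compare the back-calculated P value with the one reported.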
Risk of VTE according to genotype in 938 355 individuals. Effect size estimates, 95% CI, and 2-sided P values obtained using a random-effects meta-analysis (N = 938 355) are shown. The third and fourth columns display the number of VTE cases and controls stratified by genotype carrier status. For each comparison, the adjusted OR and 95% CI are plotted. Noncarriers of both FVL and PTGM were set as the reference group (OR = 1.0).
To better characterize the biological interaction between FVL and PTGM, we next sought to determine whether the DH genotype was associated with an additive or multiplicative (synergistic) increase in VTE risk relative to single heterozygosity for either allele. To evaluate this question, we introduced an interaction term in our analysis. Revised modeling and meta-analysis demonstrated a statistically significant interaction between the 2 thrombophilia alleles (OR = 1.3; 95% CI, 1.05-1.6; P = .017). However, this relatively modest effect size estimate suggests that, despite statistical evidence for a multiplicative interaction, the FVL and PTGM variants likely behave in an additive manner for the purposes of clinical care.
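For orientation, the joint effect expected under a purely multiplicative or a purely additive model can be computed from the single-heterozygote ORs reported above and set beside the observed DH estimate. The sketch below uses point estimates only, ignores their CIs, and treats ORs as approximations of relative risk, so it is a rough illustration rather than a reanalysis; the interaction OR in the text comes from a fitted model and need not match this back-of-the-envelope ratio. Variable names are illustrative.

```python
# Point estimates from the primary meta-analysis reported above.
or_fvl = 2.28    # FVL heterozygous
or_ptgm = 1.86   # PTGM heterozygous
or_dh = 5.24     # double heterozygous

# Joint OR expected if the two effects multiply (no interaction on the OR scale).
expected_multiplicative = or_fvl * or_ptgm

# Joint OR expected if the excess risks add (no interaction on the additive scale).
expected_additive = or_fvl + or_ptgm - 1

# Ratio of observed to multiplicative expectation; values above 1 suggest
# a more-than-multiplicative joint effect.
ratio_to_multiplicative = or_dh / expected_multiplicative

print(f"Expected under multiplicative model: {expected_multiplicative:.2f}")
print(f"Expected under additive model:       {expected_additive:.2f}")
print(f"Observed DH OR:                      {or_dh:.2f}")
print(f"Observed / multiplicative:           {ratio_to_multiplicative:.2f}")
```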
Refined VTE risk estimates for PTGM, FVL, and DH genotypes
In addition to basic demographic parameters, the UKB makes available a wide range of detailed clinical data that encompass many known risk factors for VTE. To refine our VTE risk estimates, we performed a secondary Firth's logistic regression analysis within the UKB data set (N = 445 144) and adjusted for BMI, blood type, smoking history, and CRP level. Notably, we found that the risk of VTE in DH carriers remained largely unchanged after accounting for these widely recognized risk factors (OR = 4.53; 95% CI, 3.42-5.90; P < 1 × 10−16), a finding that appeared consistent across the studied genotypes (Table 1). This result indicates that the germ line variants we evaluated probably exerted most of their effects independently of other clinical contributors. Furthermore, our analysis reproduced many known associations between VTE and previously described clinical and demographic risk factors, including age, sex, BMI, smoking, non-O blood groups, and CRP. In contrast, platelet count was not found to contribute to VTE risk, consistent with previous reports.34,35
Risk of VTE in the UKB according to genotype after adjustment for common risk factors
| | Carriers/total (%) | Carriers/noncarriers, cases | Carriers/noncarriers, controls | OR (95% CI) | P value |
|---|---|---|---|---|---|
| Genotype | | | | | |
| PTGM heterozygous | 8 854/425 690 (2.0) | 584/16 589 | 8 270/16 589 | 1.72 (1.58-1.88) | <1 × 10−16 |
| PTGM homozygous | 45/416 881 (0.01) | 3/16 589 | 42/16 589 | 2.18 (0.59-5.76) | .21 |
| FVL heterozygous | 18 786/435 622 (4.3) | 1 541/16 589 | 17 245/16 589 | 2.18 (2.06-2.31) | <1 × 10−16 |
| FVL homozygous | 210/417 046 (0.05) | 43/16 589 | 167/16 589 | 7.14 (5.01-9.97) | <1 × 10−16 |
| DH | 413/417 249 (0.1) | 64/16 589 | 349/16 589 | 4.53 (3.42-5.90) | <1 × 10−16 |
| Other factors | | | | | |
| Male sex | 191 246/417 249 (46) | 7 589/9 064 | 183 657/216 939 | 0.95 (0.91-0.97) | 6.2 × 10−5 |
| Age∗ | N/A | 60.1 ± 7.2 | 56.9 ± 8.1 | 1.05 (1.05-1.06) | <1 × 10−16 |
| BMI† | N/A | 29.2 ± 5.6 | 27.3 ± 4.7 | 1.06 (1.06-1.07) | <1 × 10−16 |
| Blood group O | 181 172/417 249 (43) | 5 992/10 661 | 175 180/225 416 | 0.72 (0.69-0.74) | <1 × 10−16 |
| Smoking | 188 934/417 249 (45) | 8 528/8 125 | 180 406/220 190 | 1.15 (1.11-1.18) | <1 × 10−16 |
| Platelet count‡ | N/A | 252.8 ± 63.8 | 253.1 ± 59.7 | 1.00 (1.00-1.00) | .69 |
| CRP§ | N/A | 3.5 ± 5.6 | 2.5 ± 4.3 | 1.02 (1.01-1.02) | <1 × 10−16 |
Data are represented for 484 206 individuals.
An extended Firth logistic regression model was generated using the clinical annotation and phenotyping data. The third and fourth columns display the number of VTE cases and controls stratified by genotype and other covariates, respectively. For each comparison, noncarriers of both FVL and PTGM were used as the reference value (OR = 1.0).
N/A, not applicable.
∗Per 1-year increase.
†Per 1 kg/m2 increase.
‡Per 1000/μL increase.
§Per 1 mg/dL increase.
Risk of arterial thrombosis associated with the DH genotype
The role of PTGM and FVL in arterial thrombosis risk remains poorly understood. We therefore assessed the potential contributions of the PTGM, FVL, and DH genotypes to the 3 subtypes of arterial thrombosis, including MI, CVA, and PAD. Adjusting for age, sex, and ancestry, we did not detect a significant association after correcting for multiple comparisons between any of the 3 genotypes and MI or CVA across the 938 987 individuals (Figure 3; supplemental Table 7). DH carriers experienced a modest trend toward an increased risk of MI (OR = 1.38; 95% CI, 1.00-1.90; P = .04) that did not withstand adjustment for multiple testing, whereas no associations were identified between DH carrier status and CVA or PAD. In contrast, heterozygous FVL and PTGM carriers experienced a small but statistically significant increased risk of PAD (OR = 1.17; 95% CI, 1.11-1.24; P = 1.3 × 10−7 and OR = 1.14; 95% CI, 1.04-1.27; P = .007, respectively). In the expanded models that included hypertension, LDL cholesterol (as reflected by LDL PRS), and hemoglobin A1c as covariates, only the association between PAD and heterozygous FVL carrier status remained significant (supplemental Table 8).
Associations between selected genotypes and disease risk across 4 venous and arterial thrombosis phenotypes. A random-effects meta-analysis was used to generate effect size estimates for thrombosis risk across the 5 genotypes in 938 355 biorepository participants. For each comparison, noncarriers of both FVL and PTGM were used as the reference value (OR = 1.0). Closed circles represent the effect size estimates that are statistically significant after correction for multiple comparisons.
Evaluation of VTE risk over time
To assess the risk of VTE associated with the DH genotype over time, we performed Kaplan-Meier analysis on participants in the UKB data set (N = 470 899). After excluding patients with a history of VTE at the time of enrollment (prevalent cases), we compared the rates of incident VTE among wild-type, FVL heterozygous, PTGM heterozygous, and DH individuals. We found that VTE incidence was significantly higher among PTGM individuals (log-rank P = 8.5 × 10−15), FVL individuals (log-rank P < 2 × 10−16), and DH individuals (log-rank P = 7.6 × 10−5) than among wild-type participants (Figure 4). Cox proportional hazards modeling demonstrated a higher rate of incident VTE for the DH genotype (hazard ratio [HR] = 2.70; 95% CI, 1.65-4.40; P = 7.6 × 10−5) compared with either FVL (HR = 1.98; 95% CI, 1.82-2.15; P < 2 × 10−16) or PTGM (HR = 1.68; 95% CI, 1.47-1.91; P = 8.48 × 10−15). Correspondingly, patients homozygous for PTGM (N = 48) and FVL (N = 198) demonstrated an increased risk of incident VTE, although this trend did not achieve significance for PTGM homozygous individuals (supplemental Table 9).
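The Kaplan-Meier estimator behind such curves multiplies, at each event time, the fraction of at-risk participants who remain event-free. The minimal sketch below implements the estimator in plain Python on a small toy data set; the function name and the follow-up values are hypothetical and unrelated to the UKB data, and the study's own curves were produced in R.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if VTE occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events occurring exactly at time t
        n_leaving = 0  # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Toy follow-up data in years (hypothetical, for illustration only).
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    # Cumulative incidence is the complement of event-free survival.
    print(f"t = {t}: survival = {s:.2f}, cumulative incidence = {1 - s:.2f}")
```

Running the function separately on each genotype group gives the genotype-stratified curves; comparing groups then requires a log-rank test or Cox model, which this sketch does not cover.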
Kaplan-Meier analysis of VTE risk in the UKB (N = 472 754). Genotype-stratified Kaplan-Meier curves were generated to assess the time-dependent risk of VTE from the date of UKB enrollment. HR and P values were derived from univariate Cox proportional hazards modeling. The median follow-up was 4070 days.
Discussion
Leveraging 2 large-scale genomic data sets comprising nearly 1 million individuals, we conducted the largest study to date of FVL and PTGM double heterozygosity. This approach allowed us to assess with a much higher level of precision the VTE risk experienced by DH carriers, a longstanding question in the field.36 We found that the DH genotype may occur as frequently as FVL homozygosity and confer a similarly increased risk of VTE.
Our work helps to place DH individuals in context by providing direct comparisons to other FVL and PTGM genotypes assessed using the same data set and methodology. The DH genotype occurred in 662 (0.07%) participants, consistent with the prior prevalence estimates of ∼0.1%.17 However, to our surprise, DH individuals appeared to be as common as those in the more widely studied and discussed FVL homozygous state (0.06%). Few if any investigations have directly evaluated the differential prevalence of FVL homozygous and DH genotypes in the same unselected national-level data set, as this type of resource has only recently become available. Moreover, the prior literature on VTE risk associated with DH carrier status is characterized by a broad range of point estimates and CIs that overlap significantly with those of other genotypes.18-23,37 Assessments of double heterozygosity have included studies that found it to be practically indistinguishable from other FVL and PTGM genotypes21 as well as reports suggesting that DH individuals experience a lower risk of VTE than those with FVL single heterozygosity.19 In contrast, we found that the DH genotype leads to a markedly increased risk of VTE compared with having either variant alone, and our pooled analytical estimates featured far greater statistical precision than has previously been possible. The heightened risk of VTE associated with all 3 genotypes (FVL heterozygous, PTGM heterozygous, and DH) appeared to persist even into later life among UKB participants. This observation challenges the view that common germ line thrombophilias become clinically irrelevant in older individuals and complements a recent report by Méan et al38 that found a high rate of FVL and PTGM heterozygosity in 240 consecutive elderly patients with their first VTE.
The risk of arterial thrombosis associated with single heterozygous FVL and PTGM mutations remains controversial. Although some prior studies have demonstrated a statistically significant increase in CVA and MI risk among those who were singly heterozygous for FVL or PTGM, the observed effect sizes were small and of unclear clinical importance.39-41 Similarly, a consensus on whether DH carriers experience a clinically meaningful increased risk of arterial thrombosis has yet to be established.42 Both the UKB and FinnGen data sets featured broad clinical annotation that allowed us to address these questions. Our results indicate that the risk for the 3 subtypes of arterial thrombosis (MI, CVA, and PAD) was not significantly higher in DH individuals after adjusting for multiple comparisons. Interestingly, we found that DH carriers had a small and nominally significantly increased risk of MI, a finding that should be interpreted with caution. Although we cannot rule out an association, any effect is likely to be modest and further studies are required to provide clarity. On balance, our data do not support the practice of routinely testing for FVL and PTGM in patients with arterial thrombosis, particularly CVA.
Several important limitations should be considered when interpreting our results. First, we did not evaluate genetic modifiers for VTE risk outside of the FVL and PTGM. This is an important topic that will be addressed in subsequent investigations. Second, our models were limited to data available through the UKB and FinnGen studies, which precluded us from considering the possible use of anticoagulation by participants. On balance, this concern is mitigated because our analysis focused solely on incident (first event) thrombosis, thereby reducing the probability that the participants’ VTE risk had been modulated by prior exposure to anticoagulation. Third, we were unable to account for the presence of active cancer or the use of oral contraceptives among the participants due to a lack of data. Fourth, clinical annotation within the UKB and FinnGen data sets did not allow us to meaningfully evaluate more granular phenotype classifications such as subtypes of MI or specific symptoms of PAD. Fifth, our results could be subject to differential enrollment patterns by genotype, such as the possibility that DH individuals tend to develop VTE early in life and are subsequently more or less likely to participate in large-scale biobanking studies. Nevertheless, we were unable to detect any differences in baseline characteristics according to genotype that would suggest biased enrollment. Moreover, the robustness of our findings is supported by the similar risk estimates obtained in the UKB and FinnGen, whose participants had different age ranges and prevalence of common diseases.25,43
Our approach also included several strengths that allowed us to overcome the key limitations affecting prior studies. First, we used 2 large state-of-the-art genomic data sets to rapidly assemble the largest cohort of DH individuals to date. The total size of our cohort and the consistency of the results across each subcohort highlight the generalizability of these findings to external populations. Second, rather than relying on patients presenting to thrombophilia clinics and their family members, we were able to mitigate ascertainment bias by drawing both cases and controls from the same cross-sectional, population-level cohorts in which enrollment was agnostic to thrombosis history. This strategy prevented the selection of controls with possible unmeasured genetic risk factors for thrombosis (as may be expected in the relatives of DH patients who presented with VTE), which could have the effect of artificially suppressing effect size estimates. Indeed, even in our large data set, we noted a modest increase in the observed OR for VTE in DH individuals when only unrelated participants (N = 649 644) were considered. Third, to our knowledge, our study is the first to adjust for a broad range of common thrombosis risk factors to more closely approximate the “true” risk conferred by the DH genotype. Fourth, we were able to replicate many common VTE risk factors, such as age, BMI, and smoking status, a sign that clinical annotation in the data set is robust. Similarly, we found that the VTE risks associated with FVL and PTGM single heterozygosity broadly align with prior work.1,9,19,44
In conclusion, to our knowledge, we performed the first study using population-scale genomic data sets to evaluate the impact of FVL and PTGM double heterozygosity on thrombosis risk. These results contribute to an improved understanding of 2 commonly encountered genetic risk factors for thrombosis and could serve as a model for future studies focused on the contribution of germ line genetics to coagulation disorders.
Acknowledgments
The authors express their gratitude to all UK Biobank and FinnGen study participants, as this work would not have been possible without their contributions. The authors also thank Walter Dzik for providing his valuable insights during the drafting of the manuscript. The authors acknowledge the following biobanks for contributing samples to FinnGen: Auria Biobank, THL Biobank, Helsinki Biobank, Biobank Borealis of Northern Finland, Finnish Clinical Biobank Tampere, Biobank of Eastern Finland, Central Finland Biobank, Finnish Red Cross Blood Service Biobank, Terveystalo Biobank, and Arctic Biobank. All Finnish biobanks are members of the BBMRI.fi infrastructure. Finnish Biobank Cooperative is the coordinator of BBMRI-ERIC operations in Finland.
P.K.B. is supported by National Institutes of Health (NIH), National Heart, Lung, and Blood Institute (NHLBI) grants NIH 1R01 HL166246 and NIH 1R03 HL162761 and the CSL Behring Heimburger Award. P.T.E is supported by NIH, NHLBI grants 1R01HL092577, 5R01HL139731, and 1R01HL157635, American Heart Association grant 18SFRN34110082, and European Union grant MAESTRIA 965286. J.T.R. was supported by a fellowship grant from the Sigrid Jusélius Foundation. T.N. was supported by grants from the Finnish Foundation for Cardiovascular Research, the Sigrid Jusélius Foundation, and the Research Council of Finland (321531 and 354447). The FinnGen project is funded by 2 grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and the following industry partners: AbbVie, AstraZeneca United Kingdom Limited, Biogen, Bristol Myers Squibb (and Celgene Corporation and Celgene International II Sàrl), Genentech Inc, Merck Sharp & Dohme LLC, Pfizer, GlaxoSmithKline Intellectual Property Development Ltd, Sanofi US Services Inc, Maze Therapeutics Inc, Janssen Biotech Inc, Novartis Pharma AG, and Boehringer Ingelheim International GmbH.
Authorship
Contribution: J.R. and P.K.B. conceived the project and drafted the manuscript; J.R., A.H., P.K.B., and J.T.R. analyzed the data, drafted the manuscript, and contributed to the interpretation of the results; S.J.J., T.N., A.P., M.D., S.S.-C., S.H.C., and P.T.E. helped design and oversee the statistical analysis; K.A.B. assisted in overseeing the clinical validity of the analyses; and all authors read and assisted in editing the manuscript.
Conflict-of-interest disclosure: P.T.E. receives sponsored research support from Bayer AG, IBM Research, Bristol Myers Squibb, Pfizer, and Novo Nordisk, and also served on advisory boards or consulted for MyoKardia and Bayer AG. The remaining authors declare no competing financial interests.
A complete list of the members of the FinnGen study group appears in the supplemental Data and Methods.
Correspondence: Pavan K. Bendapudi, Center for Life Science, Room 906, 3 Blackfan Circle, Boston, MA 02115; email: pbendapudi@mgb.org.
Author notes
This work used publicly available large-scale genomic data sets. Raw analysis data and code are available to the scientific community upon reasonable written request from the corresponding author, Pavan K. Bendapudi (pbendapudi@mgb.org).
The Finnish Biobank data can be accessed through Fingenious services (https://site.fingenious.fi/en/) managed by Finnish Biobank Cooperative.
The online version of this article contains a data supplement.
There is a Blood Commentary on this article in this issue.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.