The goal of this study was to determine whether statistical modeling of population data for a phenotypic marker could reflect a major locus gene defect. Identifying mutations in the HFE gene makes it possible to assess the association between transferrin saturation (TS) subpopulations and HFE mutations. Data were analyzed from 27 895 white patients who attended a health appraisal clinic and who had TS and common mutations of HFE determined. Mixture distribution modeling of TS was performed, and the proportion of HFE mutations in TS subpopulations was assessed on a probability basis. Three subpopulations of TS were identified, consistent with Hardy-Weinberg conditions for major locus effects. For men, 72% of the subpopulation with the highest mean TS had HFE gene mutations; they were primarily homozygotes or compound heterozygotes. Seventy-three percent of the subpopulation with moderate mean TS also had HFE gene mutations; they were predominantly simple heterozygotes. Sixty-seven percent of the subpopulation with the lowest mean TS were wild-type homozygotes. Similar results were observed for women. These results suggest that statistical modeling of population clinical laboratory test data can reveal the influence of a major locus gene defect and perhaps can be applied to aspects of body metabolism other than iron. (Blood. 2003;102:4563-4566)

Homozygous hemochromatosis, a common inherited susceptibility to iron overload, occurs in 0.3% to 0.5% of white persons of western European descent.1,2  Most cases of hemochromatosis in white persons are attributable to common missense mutations of the HFE gene. The 2 common missense mutations are C282Y (exon 4, nt 845G>A) and H63D (exon 2, nt 187C>G), and 80% to 90% of patients of northern European ancestry with typical hemochromatosis phenotype are C282Y homozygotes.3,4  Transferrin saturation (TS) generally is regarded as the best single screening test for hemochromatosis.5 

We previously used mixture modeling of TS measured in the second and third National Health and Nutrition Examination Surveys in the United States (NHANES II, NHANES III) and in population data from Australia to study possible genetic influences on iron metabolism in white populations.6-8  In each study, we found TS subpopulations consistent with Hardy-Weinberg predictions for a major hemochromatosis locus that leads to altered TS values in affected homozygotes and in heterozygotes. In one study, serum ferritin concentrations were measured simultaneously, permitting the determination of 3 subpopulations of TS with progressively increasing mean age-adjusted serum ferritin concentration values, consistent with increasing body iron stores.8 

For our previous analyses of population data, HFE gene mutation status was unavailable. In this new study, we analyzed a large data set from patients attending a health appraisal clinic. Transferrin saturation, serum ferritin, and HFE mutations were measured for each patient. The goal was to verify that statistical distribution methodology could be used to analyze laboratory test values to predict the presence of major locus mutations that affect body metabolism.

Sources of data

The primary source of data was a study conducted at the Kaiser Permanente San Diego Health Appraisal Clinic from 1998 to 2001.9,10  All patients older than 20 years of age who were registered with the clinic were apprised of a research project in which DNA analysis for HFE mutations and measurements of serum ferritin concentration would be added to the tests usually performed. We analyzed data from the 30 966 non-Hispanic white men and women, at least 20 years of age, who consented to participate in the Kaiser Permanente study and for whom HFE genotype data and health and demographic data were obtained. The term white is used in this paper to designate non-Hispanic patients who selected the category white to indicate their sole ancestry. For the purposes of the present study, we did not analyze data from Asian, black, or Hispanic clinic patients because HFE mutations are less common in these populations than among white persons. Details of the laboratory analysis of samples for HFE C282Y and H63D mutations, hematologic measurements, serum iron concentration, total iron-binding capacity, transferrin saturation, and serum ferritin concentration have been described previously.9 

Statistical modeling of transferrin saturations

For analysis, we selected transferrin saturation and serum ferritin values from 15 294 nonpregnant white women and 15 415 white men at least 20 years of age. Patients were excluded if they had abnormally low values for hemoglobin or mean corpuscular volume (MCV) because associated conditions might have altered transferrin saturation and serum ferritin values.11-17  We did not exclude patients with MCV values above the reference range because hemochromatosis probands have mean MCV values significantly higher than wild-type patients.18  Data sets for transferrin saturation modeling consisted of 13 805 men and 14 090 women (Table 1).

Table 1.

Exclusions from the total sample of white participants at least 20 years of age


Exclusion                                                     Men     Women
Total sample                                               15 415    15 294
   Samples excluded on the basis of low Hb or MCV <80 fL    1 258       909
   Transferrin saturation not measured                        352       295
Sample for transferrin saturation modeling                 13 805    14 090
   Serum ferritin not measured                                533       585
Sample for serum ferritin analysis                         13 272    13 505
 


Hemoglobin (Hb) exclusions: men, Hb < 13.5 g/dL; women, Hb < 12.0 g/dL.
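The sample sizes in Table 1 follow from successive subtraction of the exclusions. The short check below reproduces that arithmetic from the counts in the table; the dictionary and function names are illustrative only.

```python
# Counts copied from Table 1.
total = {"men": 15415, "women": 15294}
low_hb_or_mcv = {"men": 1258, "women": 909}
ts_not_measured = {"men": 352, "women": 295}
ferritin_not_measured = {"men": 533, "women": 585}


def sample_sizes(sex):
    """Return (TS modeling sample, serum ferritin sample) for one sex."""
    ts_sample = total[sex] - low_hb_or_mcv[sex] - ts_not_measured[sex]
    ferritin_sample = ts_sample - ferritin_not_measured[sex]
    return ts_sample, ferritin_sample


for sex in ("men", "women"):
    print(sex, sample_sizes(sex))

# Combined transferrin saturation modeling sample for both sexes.
combined_ts_sample = sample_sizes("men")[0] + sample_sizes("women")[0]
print("combined", combined_ts_sample)
```

The combined figure matches the 27 895 patients quoted for the primary analysis.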

The research hypothesis was that HFE gene mutations influence the distribution of transferrin saturation, resulting in genetically based subpopulations. Details of mixture-modeling techniques as applied to transferrin saturation distributions, including parameter estimation, statistical tests, and confidence intervals for proportions, have been described elsewhere.8,19-23 
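As a rough illustration of mixture modeling, the sketch below fits normal mixtures with unequal variances by a plain expectation-maximization (EM) loop and forms likelihood ratio statistics between nested fits. It is not the authors' implementation: the cited references describe their methods, and the reference list includes software for grouped data. The simulated values, starting means, and function names here are assumptions chosen only so that the example runs. No P value is computed, because the null distribution of a mixture likelihood ratio statistic is nonstandard.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Normal density; broadcasts over NumPy arrays."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fit_normal_mixture(x, init_means, n_iter=300):
    """Fit a normal mixture with unequal variances by EM.

    Returns (weights, means, sds, log-likelihood). The number of
    components equals the number of starting means supplied.
    """
    x = np.asarray(x, dtype=float)
    means = np.array(init_means, dtype=float)
    k = means.size
    sds = np.full(k, x.std() / k)      # heuristic starting SDs (assumption)
    weights = np.full(k, 1.0 / k)      # equal starting proportions
    for _ in range(n_iter):
        # E-step: probability that each value belongs to each component.
        dens = weights * normal_pdf(x[:, None], means, sds)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update proportions, means, and SDs.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        sds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk)
        sds = np.maximum(sds, 1e-6)    # guard against a collapsing component
    mix = (weights * normal_pdf(x[:, None], means, sds)).sum(axis=1)
    return weights, means, sds, np.log(mix).sum()


# Simulated, well-separated data (hypothetical; not the clinic data).
rng = np.random.default_rng(0)
ts = np.concatenate([
    rng.normal(25.0, 5.0, 4000),
    rng.normal(45.0, 5.0, 750),
    rng.normal(75.0, 8.0, 250),
])

fit1 = fit_normal_mixture(ts, [ts.mean()])
fit2 = fit_normal_mixture(ts, [25.0, 50.0])
fit3 = fit_normal_mixture(ts, [25.0, 45.0, 75.0])

# Likelihood ratio statistics: 3 components versus 2, and versus 1.
lr_3_vs_2 = 2.0 * (fit3[3] - fit2[3])
lr_3_vs_1 = 2.0 * (fit3[3] - fit1[3])
print("proportions:", np.round(fit3[0], 3))
print("means:", np.round(fit3[1], 1), "SDs:", np.round(fit3[2], 1))
print("LR 3 vs 2:", round(lr_3_vs_2, 1), "LR 3 vs 1:", round(lr_3_vs_1, 1))
```

EM can settle in a local optimum, so in practice several sets of starting values are usually tried and the fit with the highest log-likelihood is kept.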

Comparison of mean serum ferritin concentrations within transferrin saturation subpopulations

In a previous investigation of population data, we found evidence of subpopulations of patients, determined on the basis of transferrin saturation, with significantly different iron stores based on mean serum ferritin concentrations. Confirmatory analyses were performed in this new study. Considering the transferrin saturation data sets used for mixture modeling, 533 men and 585 women did not have measurements of serum ferritin concentration. After removing values for these patients, the data sets to analyze serum ferritin concentration consisted of 13 272 men and 13 505 women. Because the distributions of serum ferritin concentration were markedly skewed, the square root transformation was applied to each data set. In addition, because serum ferritin concentration tends to increase with age,24,25  we applied methods for age standardization to age 60 years. The probability that an individual serum ferritin concentration value belonged to each of the 3 transferrin saturation subpopulations was computed.8  We used the Parametric Trend Test26  to compare the mean square root of serum ferritin concentration within each of 3 transferrin saturation subgroups, taking into account the unequal sample sizes for subgroups of patients.

Genetic analysis of the genotype data

With rare exceptions27,28  the C282Y and H63D mutations occur in trans; therefore, we observed 3 gametes, defined by the presence or absence of the mutations, and 6 genotypes in the sample. The 6 genotypes are denoted by the mutation that is present or as wild type (wt) if neither mutation exists on the chromosome, giving wt/wt, wt/H63D, wt/C282Y, H63D/C282Y, H63D/H63D, and C282Y/C282Y. The absence of chromosomes carrying both mutations confirms that the mutations are in virtually complete linkage disequilibrium.

Genotype data were analyzed to determine whether the observed genotype frequencies fit the expectations under the Hardy-Weinberg equilibrium model. The Hardy-Weinberg model assumes that the 2 alleles (or mutations) in a genotype are independent and that the genotype frequencies are the product of their constituent allele or mutation frequencies. Deviations from the Hardy-Weinberg model can result from an underlying structure in the population (eg, inbreeding or selection bias) or, more commonly, errors in the data. The C282Y and H63D mutation frequencies in the population were estimated separately from the data for 13 805 men and 14 090 women, and expected genotype frequencies were then calculated from the mutation frequencies, assuming Hardy-Weinberg equilibrium. The Kolmogorov-Smirnov goodness-of-fit test was applied to compare observed and expected genotype frequencies.
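The calculation of expected genotype counts can be sketched as follows. The sketch uses the genotype totals for men from the Total column of Table 3, estimates the frequency of each of the 3 alleles by chromosome counting, and multiplies allele frequencies under the Hardy-Weinberg assumption. Function and variable names are illustrative, and the goodness-of-fit test itself is not reproduced here.

```python
# Genotype totals for men, from the Total column of Table 3.
observed = {
    ("wt", "wt"): 8534,
    ("H63D", "wt"): 3289,
    ("C282Y", "wt"): 1341,
    ("H63D", "H63D"): 336,
    ("C282Y", "H63D"): 246,
    ("C282Y", "C282Y"): 59,
}


def allele_frequencies(genotype_counts):
    """Estimate allele frequencies by counting chromosomes."""
    n_chromosomes = 2 * sum(genotype_counts.values())
    counts = {}
    for (a, b), n in genotype_counts.items():
        # Each patient contributes 2 chromosomes; homozygotes add 2 to one allele.
        counts[a] = counts.get(a, 0) + n
        counts[b] = counts.get(b, 0) + n
    return {allele: c / n_chromosomes for allele, c in counts.items()}


def expected_counts(genotype_counts):
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    n = sum(genotype_counts.values())
    p = allele_frequencies(genotype_counts)
    expected = {}
    for a, b in genotype_counts:
        # p^2 for homozygotes, 2pq for heterozygotes.
        prob = p[a] * p[b] if a == b else 2.0 * p[a] * p[b]
        expected[(a, b)] = n * prob
    return expected


freqs = allele_frequencies(observed)
expected = expected_counts(observed)
print({allele: round(f, 3) for allele, f in freqs.items()})
for genotype, count in observed.items():
    print(genotype, count, round(expected[genotype], 1))
```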

Association of modeled TS subpopulations and HFE genotypes

For each sex, the frequency and proportion of genotypes occurring within transferrin saturation subpopulations were estimated. Given the total sample size, the number of expected observations within each transferrin saturation interval was calculated. As described previously,8  within a given transferrin saturation interval, each observation was then assigned a probability, and these probabilities were used to assign transferrin saturation values to a given subgroup according to the expected proportions within transferrin saturation subpopulations.

Statistical modeling of transferrin saturations

The primary analysis was performed on transferrin saturations for 27 895 patients. Results of statistical mixture modeling indicated that the fit of the data to a mixture of 3 normal populations with unequal variances was significantly better than the fit to either a mixture of 2 normal populations or a single normal population for white men (likelihood ratio statistics, 157 [P < .01] and 1704 [P < .01], respectively) and women (likelihood ratio statistics, 120 [P < .01] and 649 [P < .01], respectively). Figure 1 shows the distribution of transferrin saturation for men and women. The fitted subpopulations are superimposed over the histograms of the observed data. Table 2 gives mixture-model parameter estimates for transferrin saturation subpopulations and shows that for each sex, transferrin saturation subpopulations were identified with increasing means for men and women (Trend test, P < .0001 for each). We also compared the age-standardized mean serum ferritin concentrations for the TS subpopulations (Table 2). Trend tests demonstrated an increase in mean serum ferritin concentration, standardized to age 60 years, with increasing mean transferrin saturation for men (P < .0001) and women (P < .0001).

Figure 1.

Observed and modeled distributions of transferrin saturation values are shown for each ethnicity and sex. Distribution of transferrin saturation values: (A) 13 805 white men and (B) 14 090 white women. A histogram of the observed data is shown. Dashed lines represent the fitted normal distributions representing each subpopulation. Solid line represents the overall fitted mixture distribution.

Table 2.

Parameter estimates for transferrin saturation and age-standardized serum ferritin concentration



         Transferrin saturation, %        Serum ferritin concentration, ng/mL
         Mixture distribution models      Subpopulation estimates
Sex      Population    Mean      SD            N     Mean    95% CI
Men            86.2    25.8    7.26       11 419    134.6    132.3, 136.9
               13.0    42.2    7.72        1 735    171.6    166.4, 176.9
                0.9    71.5   10.83          118    361.0    285.6, 445.2
Women          86.3    21.8    6.96       11 637     74.0    72.3, 75.7
               12.9    36.8    7.39        1 754     94.1    90.3, 98.0
                0.8    61.6    10.4          114    156.3    123.2, 193.2
 


 

For patients younger than 60 years, serum ferritin concentration values were standardized to those expected at age 60.
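Given the fitted parameters in Table 2, the probability that a single transferrin saturation value belongs to each subpopulation can be computed with Bayes' rule. The sketch below does this for men. It is an illustration under the assumption of normal components, not the authors' exact assignment procedure, which is described in their earlier work; function names are hypothetical.

```python
import math

# Men, from Table 2: (proportion in %, mean TS in %, SD).
MEN = [(86.2, 25.8, 7.26), (13.0, 42.2, 7.72), (0.9, 71.5, 10.83)]


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def membership_probabilities(ts, components):
    """Posterior probability of each subpopulation for one TS value.

    The proportions are renormalized implicitly, so rounding in the
    published percentages does not matter.
    """
    weighted = [w * normal_pdf(ts, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]


for ts in (25.0, 45.0, 80.0):
    probs = membership_probabilities(ts, MEN)
    print(ts, [round(p, 3) for p in probs])
```

A value near 25% is assigned almost entirely to the lowest subpopulation, a value near 45% mostly to the moderate one, and a value near 80% mostly to the highest one.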

Genetic analysis of the genotype data

The observed frequencies of the H63D and C282Y mutations were 0.152 and 0.062, respectively, in men and 0.149 and 0.064, respectively, in women. The data show strong agreement with the Hardy-Weinberg equilibrium model, which would be expected for a large, randomly selected sample from a large, randomly mating population.10,29  There were no significant differences between observed and expected genotype frequencies (Kolmogorov-Smirnov statistic = 0.167; P = 1.0 for men and women).

Association of modeled TS subpopulations and HFE genotypes

The frequency of genotypes within each TS subpopulation is given in Table 3. For men, 72% of patients in the subpopulation with the highest mean TS had HFE gene mutations; 60% of them were homozygotes or compound heterozygotes, and 40% were simple heterozygotes (Table 3). Seventy-three percent of patients in the subpopulation with moderate mean TS had HFE gene mutations; 71% of them were simple heterozygotes for HFE mutations, and 29% were homozygotes or compound heterozygotes. In the subpopulation with the lowest mean TS, only 33% of patients had HFE mutations; 94% of them were simple heterozygotes, and 6% were homozygotes or compound heterozygotes for HFE mutations. Similar results were observed in analyses of data from women.
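The percentages quoted above for men can be recomputed from the genotype counts in Table 3. In the sketch below, 4 single-digit cells that are absent from the table as reproduced here are back-calculated from the row and column totals; they are marked "inferred" in the code and should be treated as assumptions rather than reported values. All names are illustrative.

```python
# Men, Table 3: counts in subpopulations 1 (lowest), 2 (moderate), 3 (highest).
MEN = {
    "wt/wt": (8016, 485, 33),
    "H63D/wt": (2941, 321, 27),
    "C282Y/wt": (724, 610, 7),       # subpopulation 3 inferred from row total
    "H63D/H63D": (139, 195, 2),      # subpopulation 3 inferred from row total
    "C282Y/H63D": (68, 169, 9),      # subpopulation 3 inferred from row total
    "C282Y/C282Y": (7, 11, 41),      # subpopulation 1 inferred from totals
}
SIMPLE_HETEROZYGOTES = ("H63D/wt", "C282Y/wt")


def summarize(counts, column):
    """Percentages for one subpopulation (column index 0, 1, or 2)."""
    total = sum(row[column] for row in counts.values())
    mutated = total - counts["wt/wt"][column]
    simple = sum(counts[g][column] for g in SIMPLE_HETEROZYGOTES)
    return {
        "any_mutation": 100.0 * mutated / total,
        "simple_het_among_mutated": 100.0 * simple / mutated,
        "homo_or_compound_among_mutated": 100.0 * (mutated - simple) / mutated,
    }


for label, column in (("lowest", 0), ("moderate", 1), ("highest", 2)):
    rounded = {k: round(v) for k, v in summarize(MEN, column).items()}
    print(label, rounded)
```

With these counts the column sums equal the subpopulation totals given in Table 3 for men (11 895, 1791, and 119).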

Table 3.

Frequency of genotypes within transferrin saturation subpopulations




                           Subpopulation
Genotype                1         2         3      Total
Men
 wt/wt               8016       485        33       8534
 H63D/wt             2941       321        27       3289
 C282Y/wt             724       610                 1341
 H63D/H63D            139       195                  336
 C282Y/H63D            68       169                  246
 C282Y/C282Y                     11        41         59
 Total             11 895      1791       119     13 805
Women
 wt/wt               8271       445        43       8759
 H63D/wt             2934       277        23       3234
 C282Y/wt             709       706                 1424
 H63D/H63D            146       195                  345
 C282Y/H63D            81       182                  267
 C282Y/C282Y           13        14        34         61
 Total             12 154      1822       114     14 090
 



 

As we gain understanding of the genetic influences on the body's metabolic processes, it is likely that we will find that routine laboratory measurements performed in clinical medicine are influenced by genetic mutations and polymorphisms in the population. It is also possible that studying the distributions of the results of laboratory measurements in populations can provide evidence for as-yet-unidentified major mutations that influence metabolic pathways of the body. An example of this approach is a community-based prevalence study of hypertension, in which values for angiotensinogen (AGT) collected from Nigerian families were analyzed. Although the AGT genetic mutation status of family members was unknown, a mixture of 2 or 3 distributions fit the AGT values significantly better than a single distribution, and a mixture of 3 distributions provided the best fit to the data, suggesting a major genetic effect.30  We have examined mixture modeling of a phenotypic marker to reveal a major locus effect in the context of iron overload in the white population. Before the HFE gene was identified, we conducted a series of studies based on the postulate that the hemochromatosis mutation in white patients would influence the distribution of transferrin saturation in the population and would permit the identification of distinct subpopulations based on hemochromatosis genotype. We amassed considerable evidence consistent with the hypothesis6-8  but had not had the opportunity to verify our hypothesis by studying a population in which transferrin saturations and HFE genotypes were determined simultaneously in all participants. The present study represents our first opportunity to attempt to verify our hypothesis in a large population in which phenotyping and genotyping were performed.10 

Under the hypothesis that the distribution of transferrin saturation reflects several populations based on individual genotype for hemochromatosis, in the present study we applied a statistical distribution methodology to transferrin saturations in a large data set and examined the relationship between the model results and the actual genotyping of HFE mutations. Using this approach, 3 transferrin saturation subpopulations were identified in white men and women. The proportions of these subpopulations are consistent with Hardy-Weinberg criteria for a major, common genetic influence on iron metabolism. For men and women, the subpopulations with higher mean TS values consisted predominantly of those with HFE gene mutations. In the subpopulation with the highest TS, patients with HFE mutations were predominantly homozygotes or compound heterozygotes. In the subpopulation with moderate TS, patients with HFE mutations were predominantly simple heterozygotes. In contrast, most of the patients in the subpopulation with the lowest TS were HFE wild-type homozygotes.

Our study has several limitations. We were unable to exclude patients with elevations in erythrocyte protoporphyrin or liver function tests or those with positive serum hepatitis B surface antigen or serum hepatitis C antibody because these laboratory tests were not performed for all participants. Patients with abnormalities in these test results might have had elevations in TS not attributable to a genetic defect and might have been part of the proportion with wild-type HFE found in the subpopulations with the highest and moderate TS. The first 10 000 patients were tested for the S65C HFE mutation.9  Because not all patients were tested, we did not exclude those with other HFE mutations, such as S65C, who might also have been classified as having wild-type HFE in the upper TS subpopulation.

Even with these limitations, however, modeling demonstrated subpopulations of TS corresponding to distributions of HFE genotypes in this population. Our findings thus provide confirmation that mutations in a major locus influence the distribution of transferrin saturation in the US white population. Our findings also raise the possibility that mixture-modeling procedures might be used to predict the presence of major loci influencing many other laboratory tests, including those measuring metals, electrolytes, enzyme activities, metabolic breakdown products, and components of blood and plasma.

Prepublished online as Blood First Edition Paper, August 7, 2003; DOI 10.1182/blood-2003-04-1278.

Supported in part by National Institutes of Health grants and contracts HL-508203, N01-HC-05180 (C.E.M.), UH1 HL03679-03, and N01-HC-05186; Howard University General Clinical Research Center grants M01-RR10284 (V.R.G.), RR00833, and DK53505-4 (E.B.); Centers for Disease Control grant DK535-02; and the Stein Endowment Fund (E.B.).

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

This is manuscript no. 15507-MEM from The Scripps Research Institute.

1. Adams PC, Gregor JC, Kertesz AE, et al. Screening blood donors for hereditary hemochromatosis: decision and analysis model based on a 30-year database. Gastroenterology. 1995;109:177-188.
2. Witte DL, Crosby WH, Edwards CQ, et al. Practice Guideline Development Task Force of the College of American Pathologists: hereditary hemochromatosis. Clin Chim Acta. 1996;245:139-200.
3. Feder JN, Gnirke A, Thomas W, et al. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat Genet. 1996;13:399-408.
4. Burke W, Thomson E, Khoury MJ, et al. Hereditary hemochromatosis: gene discovery and its implications for population-based screening. JAMA. 1998;280:172-178.
5. Powell LW, Jazwinska E, Halliday JW. Primary iron overload. In: Brock JH, Halliday JW, Pippard MJ, et al, eds. Iron Metabolism in Health and Disease. Philadelphia, PA: WB Saunders; 1994:247-254.
6. McLaren CE, Gordeuk VR, Looker AC, et al. Prevalence of heterozygotes for hemochromatosis in the white population of the United States. Blood. 1995;86:2021-2027.
7. McLaren CE, McLachlan GJ, Halliday JW, et al. Distribution of transferrin saturation in an Australian population: relevance to the early diagnosis of hemochromatosis. Gastroenterology. 1998;114:543-549.
8. McLaren CE, Li KT, Gordeuk VR, et al. Relationship between transferrin saturation and iron stores in the African American and US caucasian populations: analysis of data from the third National Health and Nutrition Examination Survey. Blood. 2001;98:2345-2351.
9. Beutler E, Felitti V, Gelbart T, Ho N. The effect of HFE genotypes on measurements of iron overload in patients attending a health appraisal clinic. Ann Intern Med. 2000;133:329-337.
10. Beutler E, Felitti VJ, Koziol JA, Ho NJ, Gelbart T. Penetrance of 845G>A (C282Y) HFE hereditary haemochromatosis mutation in the USA. Lancet. 2002;359:211-218.
11. Finch CA, Deubelbeiss K, Cook JD, et al. Ferrokinetics in man. Medicine. 1970;49:17-53.
12. Hershko C, Graham G, Bates GW, Rachmilewitz EA. Non-specific serum iron in thalassaemia: an abnormal serum iron fraction of potential toxicity. Br J Haematol. 1978;40:255-263.
13. Cartwright GE, Wintrobe MM. Chemical, clinical, and immunological studies on the products of human plasma fractionation, XXXIX: the anemia of infection: studies on the iron-binding capacity of serum. J Clin Invest. 1949;28:86-98.
14. Jandl JH. Blood. Boston, MA: Little, Brown & Co; 1987.
15. Bainton DF, Finch CA. The diagnosis of iron deficiency anemia. Am J Med. 1964;37:62-70.
16. Edwards CQ, Griffen LM, Kaplan J, Kushner JP. Twenty-four-hour variation of transferrin saturation in treated and untreated haemochromatosis homozygotes. J Intern Med. 1989;226:173-179.
17. Gordeuk VR, Brittenham GM, McLaren GD, Spagnuolo PJ. Hyperferremia in immunosuppressed patients with acute nonlymphocytic leukemia and the risk of infection. J Lab Clin Med. 1986;108:466-472.
18. Barton JC, Bertoli LF, Rothenberg BE. Peripheral blood erythrocyte parameters in hemochromatosis: evidence for increased erythrocyte hemoglobin content. J Lab Clin Med. 2000;135:96-104.
19. Gordeuk VR, McLaren CE, Looker AC, Hasselblad V, Brittenham GM. Distribution of transferrin saturations in the African-American population. Blood. 1998;91:2175-2179.
20. Hasselblad V. Manual for the grouped DISFIT software, version 1.1. Durham, NC: Duke University; 1992.
21. McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. New York, NY: John Wiley & Sons; 1997.
22. McLachlan GJ, Basford KE. Mixture Models: Inference and Applications to Clustering. New York, NY: Marcel Dekker; 1988.
23. Crump KS, Howe R. A preview of methods for calculating confidence limits in low dose extrapolation. In: Krewski D, ed. Toxicological Risk Assessment. Boca Raton, FL: CRC Press; 1985:188-203.
24. Expert Scientific Working Group. Summary of a report on assessment of the iron nutritional status of the United States population. Am J Clin Nutr. 1985;42:1318-1330.
25. Zacharski LR, Ornstein DL, Woloshin S, Schwartz LM. Association of age, sex, and race with body iron stores in adults: analysis of NHANES III data. Am Heart J. 2000;140:98-104.
26. Cuzick J. Trend tests. In: Kotz S, Johnson N, eds. Encyclopedia of Statistical Sciences. Vol 9. New York, NY: John Wiley & Sons; 1988:336-342.
27. Spriggs EL, Harris PE, Best LG. Hemochromatosis mutations C282Y and H63D in “cis” phase. Clin Genet. 2001;60:68-72.
28. Lucotte G, Champenois T, Semonin O. A rare case of a patient heterozygous for the hemochromatosis mutation C282Y and homozygous for H63D. Blood Cells Mol Dis. 2001;27:892-893.
29. Waalen J, Felitti V, Gelbart T, Ho NJ, Beutler E. Prevalence of hemochromatosis-related symptoms in homozygotes for the C282Y mutation of the HFE gene. Mayo Clin Proc. 2002;77:522-530.
30. Guo X, Rotimi C, Cooper R, et al. Evidence of a major gene effect for angiotensinogen among Nigerians. Ann Hum Genet. 1999;63:293-300.