Abstract
Background: Anemia remains a common comorbidity in persons living with HIV/AIDS (PLWHA) and is a strong risk factor for disease progression and death. Although highly active antiretroviral therapy can improve anemia and reduce its prevalence, HIV-related anemia remains common. Because anemia in PLWHA is associated with increased morbidity and mortality, it is crucial to understand its prevalence, risk factors, and impact on prognosis. Large-scale real-world studies of bone marrow cytology and morphology in PLWHA with anemia are currently lacking in China. We therefore analyzed a cohort of PLWHA in China to assess the prevalence of anemia and its relationship with prognosis, and to identify related risk factors among demographic characteristics, comorbidities, and hematological and bone marrow indicators.
Methods: We conducted a retrospective study of 618 PLWHA admitted to Zhongnan Hospital of Wuhan University from January 2016 to October 2020. Patients were categorized by hemoglobin (HGB) level into an anemia group (n = 474; male: HGB < 120 g/L, female: HGB < 110 g/L) and a non-anemia group (n = 144; male: HGB ≥ 120 g/L, female: HGB ≥ 110 g/L). We analyzed demographic characteristics, comorbidities, peripheral blood cells, lymphocyte subpopulations, bone marrow cytology, and bone marrow morphology. Within this cohort, we explored key factors influencing the development of anemia and constructed a binary classification model for anemia status from these factors. Before analysis, randomly missing values were imputed with an iterative method based on random forests. Because variable scales differed substantially, continuous variables were standardized after imputation. The dataset was then randomly divided into a training set and a validation set. For model development, minimax concave penalty (MCP)-penalized logistic regression was used for feature selection, and a logistic regression model was built on the selected features. Compared with traditional “black-box” machine learning models, this approach yields explicit regression coefficients for the selected features, which enhances interpretability. It identifies key variables influencing anemia status from multi-dimensional inputs and offers clear guidance for targeted clinical investigations.
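To illustrate why an MCP penalty can select a small set of features, the following minimal Python sketch shows the standard MCP penalty and its univariate thresholding rule for a standardized predictor. It is not the code used in this study: the function names are hypothetical, and the tuning values (lam, gamma = 3.0) are assumptions chosen only for illustration, since the abstract does not report them.

```python
def mcp_penalty(beta, lam, gamma=3.0):
    """MCP value for one coefficient (lam and gamma are illustrative assumptions)."""
    b = abs(beta)
    if b <= gamma * lam:
        # Lasso-like near zero, with the penalty rate tapering off as |beta| grows.
        return lam * b - b * b / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large effects are not shrunk further.
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma=3.0):
    """Univariate MCP solution for a standardized predictor (requires gamma > 1)."""
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    if abs(z) > gamma * lam:
        return z  # large signals pass through unshrunk
    soft = max(abs(z) - lam, 0.0)  # small signals are set exactly to zero
    sign = 1.0 if z >= 0 else -1.0
    return sign * soft / (1.0 - 1.0 / gamma)
```

Under these assumptions, weak signals are zeroed out (which performs the feature selection), while strong signals are left essentially unpenalized, in contrast to the uniform shrinkage of a lasso penalty.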
Findings: Feature selection identified four candidate variables: ESR, basophilic stippling cells, CD4+ lymphocyte percentage (CD4+%), and eosinophils (EO). The final logistic regression model was: log(p/(1 − p)) = −3.0155 − 3.7364 × ESR − 0.6533 × CD4+% − 4.8914 × basophilic stippling + 0.6857 × EO (ESR, p < 2 × 10⁻¹⁶; CD4+%, p = 0.0013; basophilic stippling, p = 4.0 × 10⁻⁹; EO, p = 0.034). On the validation set, the model achieved a classification accuracy of 85%, compared with 77.5% for a conventional logistic regression model using all variables.
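As a worked illustration of how the reported equation maps predictor values to a probability, the sketch below applies the inverse logit to the published coefficients. Several points are assumptions rather than details stated in this abstract: the inputs are taken to be on the scale used for fitting (continuous variables were standardized, but the means and standard deviations are not reported here), the dictionary keys and function name are hypothetical, and the abstract does not state which outcome class p refers to.

```python
import math

# Intercept and coefficients as reported in the abstract.
INTERCEPT = -3.0155
COEFS = {
    "ESR": -3.7364,
    "CD4_pct": -0.6533,
    "basophilic_stippling": -4.8914,
    "EO": 0.6857,
}


def predicted_probability(values):
    """Return p from log(p / (1 - p)) = intercept + sum(coef * value).

    `values` maps each key in COEFS to a predictor value that is assumed to be
    already on the scale used when the model was fitted.
    """
    logit = INTERCEPT + sum(COEFS[name] * values[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-logit))


# Example: a patient whose four predictors all sit at zero on the fitted scale.
baseline = {name: 0.0 for name in COEFS}
p_baseline = predicted_probability(baseline)
```

With every predictor at zero, the log-odds equal the intercept, so p_baseline depends only on −3.0155; moving any single predictor shifts the log-odds by its coefficient per unit change.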
Conclusion: MCP-penalized logistic regression distilled a 4-marker panel (ESR, CD4+%, basophilic stippling, EO) that classified anemia status in PLWHA more accurately than a conventional logistic regression model using all variables. This transparent, low-dimensional model supports targeted laboratory work-up and may inform individualized management of anemia in HIV care.