Key Points
We assess how clinical factors (age, sex, platelet count) and training on other cancers influence machine-learning-based ovarian cancer detection
We have developed a highly sensitive machine-learning model dedicated to early ovarian cancer detection
Ovarian cancer (OC) presents a diagnostic challenge, often resulting in poor patient outcomes. Platelet RNA sequencing, which reflects the host's response to disease, shows promise for earlier OC detection. This study examines how sex, age, platelet count, and training on cancer types other than OC affect the classification accuracy achieved with the previous platelet-only training dataset. A total of 339 samples from healthy donors and 1396 samples from cancer patients, spanning 18 cancer types (including 135 OC cases), were analyzed. Logistic regression was applied to verify our classifiers' performance and interpretability. Models were tested at 100% specificity and 100% sensitivity levels. Incorporating patients' age as an additional feature alongside gene expression increased sensitivity from 68.6% to 72.6%. Models trained on data from both sexes and on female-only data achieved sensitivities of 68.6% and 74.5%, respectively. Training solely on OC data reduced late-stage sensitivity from 69.1% to 44.1% but increased early-stage sensitivity from 66.7% to 69.7%. This study highlights the potential of platelet RNA profiling for OC detection and the importance of clinical variables in refining classification accuracy. Incorporating age with gene expression data may enhance OC diagnostic accuracy. Including male samples in training degrades classifier performance. Data from diverse cancer types improve advanced-stage cancer detection but impair early-stage diagnosis.
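The two operating points named above (100% specificity and 100% sensitivity) can be illustrated with a minimal sketch. It assumes each sample has a classifier score where higher means "more likely OC" and a binary label (1 = OC, 0 = control); the function names and the toy data are hypothetical and are not taken from the study's code or results.

```python
def sensitivity_at_full_specificity(scores, labels):
    """Fraction of positives scoring strictly above every negative.

    Placing the threshold at the highest negative score yields zero
    false positives, i.e. 100% specificity on this sample set.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    threshold = max(negatives)
    detected = sum(1 for s in positives if s > threshold)
    return detected / len(positives)


def specificity_at_full_sensitivity(scores, labels):
    """Fraction of negatives scoring strictly below every positive.

    Placing the threshold at the lowest positive score yields zero
    false negatives, i.e. 100% sensitivity on this sample set.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    threshold = min(positives)
    rejected = sum(1 for s in negatives if s < threshold)
    return rejected / len(negatives)


# Toy scores (hypothetical): three controls and three OC cases.
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.90, 0.30]
toy_labels = [0, 0, 1, 1, 1, 0]

sens = sensitivity_at_full_specificity(toy_scores, toy_labels)
spec = specificity_at_full_sensitivity(toy_scores, toy_labels)
```

In this toy example one OC case scores below the highest control, so it is missed at the 100% specificity threshold; likewise one control scores above the lowest OC case and is misclassified at the 100% sensitivity threshold.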