Leveraging Machine Learning to Identify Proteomic Biomarkers of Tibial Bone Stress Reinjury
Date
2024
Authors
Dinh, Ethan
Publisher
University of Oregon
Abstract
Bone stress injuries (BSIs) are experienced by up to 20% of runners, with one in five of these athletes sustaining an additional BSI [1], [2]. Yet the underlying biological factors contributing to refracture risk remain elusive. In this study, we present a longitudinal exploration of 1500 serum proteins in 30 female recreational runners diagnosed with BSIs. Serum protein levels were measured at five distinct timepoints over a year of healing, during which additional BSIs were observed in 10 individuals (33%). To enhance bioinformatic pipelines for statistical analysis, a Python package was developed to process the platform-specific file type received for the serum protein data. These new file structures were then analyzed via open-source libraries including MetaboAnalystR and ExpressAnalystR.
To identify proteomic signatures that distinguish individuals with an additional BSI from those with a single BSI at the earliest timepoint, we used sparse partial least squares-discriminant analysis (sPLS-DA). Notably, this analysis revealed 10 significant markers yielding a predictive accuracy of 95% through leave-one-out cross-validation and identified Fumarylacetoacetase (FAAA) and Trypsin-2 as the most prevalent predictors. Time-course differential expression analysis highlighted 106 significant proteins that were differentially expressed between the two groups, mapping to immune and blood clotting pathways.
These findings provide new opportunities for targeted therapeutic interventions by pinpointing specific biomarkers and disrupted biological pathways in individuals who developed additional BSIs. Ongoing work will evaluate linear and non-linear models with tibial bone structural data, self-reported pain metrics, and return-to-play data, with the goal of better understanding the individual time course of healing and BSI risk, ultimately informing personalized strategies for injury prevention, assessment, and rehabilitation.
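The leave-one-out evaluation of a sparse discriminant model mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis pipeline, which used MetaboAnalystR. It is a simplified single-component sparse PLS-DA written in plain Python, run on synthetic data that only mirrors the cohort sizes (30 samples, 10 in the positive class). The function names (`fit_sparse_component`, `loo_accuracy`), the number of retained features (`keep=5`), and the simulated effect size are all assumptions made for illustration.

```python
import random


def project(row, means, weights):
    """Score one sample on the sparse component (features centered first)."""
    return sum(w * (x - m) for x, m, w in zip(row, means, weights))


def fit_sparse_component(X, y, keep):
    """Fit one sparse PLS-DA component for binary labels (hypothetical helper).

    Returns (feature means, sparse weights, decision threshold).
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # First PLS direction: covariance of each centered feature with the label.
    loadings = [
        sum((X[i][j] - means[j]) * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    # Soft-threshold so that only the `keep` largest loadings stay non-zero.
    ranked = sorted((abs(v) for v in loadings), reverse=True)
    cutoff = ranked[keep] if keep < p else 0.0
    weights = [
        (abs(v) - cutoff) * (1.0 if v > 0 else -1.0) if abs(v) > cutoff else 0.0
        for v in loadings
    ]
    # Decision threshold: midpoint between the two class mean scores.
    scores = [project(row, means, weights) for row in X]
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return means, weights, threshold


def predict(row, model):
    means, weights, threshold = model
    return 1 if project(row, means, weights) > threshold else 0


def loo_accuracy(X, y, keep):
    """Leave-one-out CV; feature selection is refit inside every fold."""
    correct = 0
    for i in range(len(X)):
        model = fit_sparse_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:], keep)
        correct += predict(X[i], model) == y[i]
    return correct / len(X)


# Synthetic data (assumption): 50 features, the first 3 shifted in class 1.
random.seed(0)
y = [1] * 10 + [0] * 20
X = [
    [random.gauss(3.0 if (label == 1 and j < 3) else 0.0, 1.0) for j in range(50)]
    for label in y
]
accuracy = loo_accuracy(X, y, keep=5)
print(f"LOO accuracy: {accuracy:.2f}")
```

Refitting the feature selection inside each fold matters: choosing markers on all samples and only then cross-validating would leak information from the held-out sample and inflate the reported accuracy.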
Description
50 pages
Keywords
Sparse Partial Least Squares Discriminant Analysis, Block HSIC Lasso, Genetic Programming, ParetoGP, Differential Gene Expression