MACHINE LEARNING AND WEARABLE SENSORS FOR THE ESTIMATION OF BIOMECHANICAL VARIABLES OUTSIDE THE LABORATORY By SETH R. DONAHUE A DISSERTATION Presented to the Department of Human Physiology and the Division of Graduate Studies of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy June 2022 DISSERTATION APPROVAL PAGE Student: Seth R. Donahue Title: Machine Leaning and Wearable Sensors for the Estimation of Biomechanical Variables Outside the Laboratory This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Department of Human Physiology by: Michael Hahn, Ph.D. Chairperson Andrew Karduna, Ph.D. Core Member Nicole Swann, Ph.D. Core Member Daniel Lowd, Ph.D. Institutional Representative and Krista M. Chronister, PhD Vice Provost for Graduate Studies Original approval signatures are on file with the University of Oregon Division of Graduate Studies Degree awarded June 2022 ii © 2022 Seth R. Donahue This work is under a Creative Commons Attribution -NoDerivs (United States) License. iii DISSERTATION ABSTRACT Seth R. Donahue Doctor of Philosophy Department of Human Physiology June 2022 Title: Machine Learning and Wearable Sensors for the Estimation of Biomechanical Variables Outside the Laboratory The miniaturization of sensors and their availability for biomechanical analysis outside of the laboratory has opened whole new areas of research. Wearable sensors have been developed to measure ground reaction forces, and inertial measurement units have been developed for the measurement of acceleration and angular velocity. The purpose of this dissertation was to develop methodologies for the measurement and estimation of biomechanical variables, outside of the laboratory. As these sensors can provide vast amounts of data, it is natural to leverage the strengths of machine learning models, which have been used to find patterns in large datasets to assist in the task of estimating biomechanical variables using wearable sensor data as input. This dissertation is divided into five distinct, but related projects all linked to the identification of gait events and machine learning applications for human locomotion data, both in and out of the laboratory. The first two projects were focused on identification of gait events and transitions between locomotion modes, while Projects 3 - 5 were focused on gait event detection and estimation of biomechanical parameters during running outside the laboratory. Project 1: Validation of a supervised machine learning algorithm for steady state locomotion, and dynamic transitions between those iv locomotion modes. Project 2: Deployment of an unsupervised machine learning and heuristic gait event detection algorithms for the identification of gait events, across environmentally constrained and internally driven locomotion transitions Projects 3 - 5 resulted in the development of methodologies for biomechanical analysis. We utilized both heuristic and machine learning methodologies for the estimation of biomechanical variables in these scenarios. Project 3: Estimation of gait events and contact times from inertial measures on the foot and the sacrum in a semi- uncontrolled environment. Project 4: Implementation of a recurrent neural network for the estimation of whole ground reaction force waveforms and the calculation of discrete kinetic variables from these waveforms in a semi-uncontrolled environment. Project 5: Synthesis and application of the previous two chapters, gait event detection and estimation of ground reaction force waveforms on data collected in a real-world environment during a 5-mile free run. v CURRICULUM VITAE NAME OF AUTHOR: Seth R. Donahue GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED: University of Oregon, Eugene, OR University of Montana, Missoula, MT DEGREES AWARDED: Doctor of Philosophy, Human Physiology, 2022, University of Oregon Master of Science, Health and Human Performance, 2017, University of Montana Bachelor of Arts, Mathematics, 2015, University of Montana AREAS OF SPECIAL INTEREST: Biomechanical analysis outside the laboratory Wearable Sensors Machine Learning Locomotion Transitions PROFESSIONAL EXPERIENCE: Graduate Research Assistant, University of Oregon Dept. Human Physiology (Neuromechanics Laboratory), 2017-present Graduate Employee, University of Oregon Dept. Human Physiology, 2017-present Graduate Research Assistant, University of Montana Dept. Health and Human Performance (Biomechanics Laboratory), 2015-2017 GRANTS, AWARDS, AND HONORS: Jan Broekhoff Fellowship, Department of Human Physiology, University of Oregon, 2021 Jan Broekhoff Fellowship, Department of Human Physiology, University of Oregon, 2020 Jan Broekhoff Fellowship, Department of Human Physiology, University of vi Oregon, 2019 Jan Broekhoff Fellowship, Department of Human Physiology, University of Oregon, 2018 Martin Sharkey Research Award, Department of Health and Human Performance, University of Montana, 2017 PUBLICATIONS: S. R. Donahue, L. Jin, and M. E. Hahn, “User Independent Estimations of Gait Events With Minimal Sensor Data,” IEEE J. Biomed. Heal. Informatics, vol. 25, no. 5, pp. 1583–1590, May 2021. S. R. Donahue and M. E. Hahn, “Feature Identification With a Heuristic Algorithm and an Unsupervised Machine Learning Algorithm for Prior Knowledge of Gait Events,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, pp. 108–114, 2022. vii ACKNOWLEDGMENTS Thank you to Dr. Mike Hahn, my gratitude for your guidance, mentorship, and friendship throughout the past 5 years is unending. I have never felt that I would make it, through grad school. You believed in me when I didn’t, you gave me opportunities I never thought I would be possible. I am so indebted to you, not only as a mentor but as a friend as well. You are the best mentor anyone could ask for, you give so much to your students, and push us to succeed professionally and personally. I will always look fondly back upon these years in the BSSC, these have been some of the best of my life. Thank you. To my committee, thank you for your support and pivoting with me when necessary mid-dissertation. Thank you for your guidance and feedback throughout this process. Thank you for investing in me as a researcher, for your time, and willingness to provide feedback to me for this project. This project would not have been possible without your help. Thank you to the current and former members of the BSSC, you all were the best people possible to work with. Dr. Katie Farina: Thanks for being my friend and colleague, angsty music and inverse dynamics were the best. To Dr. Mike McGeehan: Your level head and example always made me want to be a better person and researcher, thank you for inviting me to Thanksgiving back in 2018 with you and Kate. To Dr. Evan Day: Thank you for your questions about what I am doing and why I am doing it, still done know. Your snarky feedback always made my work better. To Emily Karolidis: I am so glad Mike brought you on board, I remember that first chat we had thinking that you were going to get your Ph.D. someday, and here you are. Thank you for all the viii laughs. To Rachel Robinson: Having you on the longitudinal running study has been amazing, thank you for all your hard work in learning the tips and tricks of data collections outside the lab. To Aida Chebbi: The work we have been doing has been amazing, your willingness to come in and learn a whole new skillset, work with this team and truly contribute to the group, thank you. To my wife, Matlyn: You have also believed in me and encouraged me when I didn’t. You have loved me through this process, thank you. These last few months have been crazy, thank you for your steadfast love and support while I have finished up this degree. Thank you for listening to my crazy ideas, to my frustrations and joys throughout this process. These last two years would have been impossible if it wasn’t for you, I would have been lost if it weren’t for you keeping me grounded and focused on what matters. Thank you for sharing your life with me. To the rest of my family: Thank you for your unwavering support of me chasing after my dreams, for the yearly trips to Oregon, all the chats, and the facetimes. Thank you. Finally, thank you to the University of Oregon Human Physiology Department for the support in so many aspects of my time at UO. I cannot think of better people to have been surrounded by during my graduate studies. ix DEDICATION This dissertation is dedicated to my wife and family for their unconditional love, none of this could have happened without you. x TABLE OF CONTENTS Chapter Page I. INTRODUCTION .................................................................................................... 1 Background and Significance ................................................................................ 1 Introduction ...................................................................................................... 1 Gait Event Identification .................................................................................. 2 Machine Learning Algorithms for Gait Event Identification .......................... 3 Running Outside the Laboratory ...................................................................... 4 General and Specific Aims .................................................................................... 7 Organization of the Dissertation ............................................................................ 9 II. GENERAL METHODOLOGY .............................................................................. 12 Subjects .................................................................................................................. 12 Study Design and Experimental Protocol .............................................................. 14 Chapter III ........................................................................................................ 14 Chapter IV ........................................................................................................ 14 Chapters V - VI ................................................................................................ 15 Chapter VII ...................................................................................................... 15 Data Collection ...................................................................................................... 15 Chapter III ........................................................................................................ 15 Chapter IV ........................................................................................................ 16 Chapters V - VI ................................................................................................ 16 Data and Statistical Analysis ................................................................................. 16 Chapter III ........................................................................................................ 16 xi Chapter Page Chapter IV ........................................................................................................ 18 Chapters V ....................................................................................................... 18 Chapters VI ...................................................................................................... 19 Chapter VII ...................................................................................................... 23 III. USER INDEPENDENT ESTIMATIONS OF GAIT EVENTS WITH MINIMAL SENSOR DATA ...................................................................................... 26 Introduction ............................................................................................................ 26 Methods.................................................................................................................. 29 Instrumentation and Data Collection ............................................................... 29 Model Overview .............................................................................................. 31 Traditional Protocols for Machine Learning.................................................... 32 Analysis Methodology ..................................................................................... 33 Results .................................................................................................................... 34 Discussion .............................................................................................................. 39 Conclusion ............................................................................................................. 42 Bridge ..................................................................................................................... 44 IV. FEATURE IDENTIFICATION WITH A HUERISTIC ALGORITHM AND AN UNSERVISED MACHINE LEARNING ALGORITHM FOR PRIOR KNOWLEDGE OF GAIT EVENTS ........................................................................... 45 Introduction ............................................................................................................ 45 Methods.................................................................................................................. 49 Results .................................................................................................................... 53 Discussion .............................................................................................................. 60 xii Chapter Page Conclusion ............................................................................................................. 63 Bridge ..................................................................................................................... 64 V. VALIDATION OF RUNNING GAIT EVENT DETECTION ALGORITHM IN A SEMI-UNCONTROLLED ENVIRONMENT .......................... 65 Introduction ............................................................................................................ 65 Methods.................................................................................................................. 67 Data Processing ................................................................................................ 68 Algorithmic Descriptions ................................................................................. 70 Results .................................................................................................................... 70 Discussion .............................................................................................................. 75 Bridge ..................................................................................................................... 82 VI. ESTIMATION OF GROUND REACTION FORCE DURING FIXED PACE RUNS OUTSIDE THE LABORATORY ............................................ 83 Introduction ............................................................................................................ 83 Methods.................................................................................................................. 85 Data Processing ................................................................................................ 86 Machine Learning Algorithms ......................................................................... 87 Data Analysis ................................................................................................... 88 Results .................................................................................................................... 89 Discussion .............................................................................................................. 97 Bridge ..................................................................................................................... 103 xiii Chapter Page VII. RUNNING IN THE REAL WORLD: ESTIMATION OF GAIT EVENTS AND KINETIC WAVEFORMS WITH WEARABLE SENSORS AND MACHINE LEARNING .................................................................................... 104 Introduction ............................................................................................................ 104 Methods.................................................................................................................. 106 Data Processing ................................................................................................ 107 Descriptions of Gait Event Detection Algorithms .......................................... 109 Machine Learning Methods ............................................................................. 109 Results .................................................................................................................... 112 Gait Event and Contact Time Estimation from IMUs ..................................... 113 BD LSTM Gait Event and Contact Time ........................................................ 115 Discrete Kinetic Variables ............................................................................... 115 Discussion .............................................................................................................. 119 VIII. CONCLUSION ................................................................................................... 134 Summary of Results and Findings ......................................................................... 134 Limitations ............................................................................................................. 137 Recommendations for Future Work....................................................................... 140 REFERENCES CITED ................................................................................................ 142 xiv LIST OF FIGURES Figure Page 2.1 LSTM Cell layout and mapping of operations. ..................................................... 21 3.1 The Full BP-AR-HMM ......................................................................................... 32 3.2 Exemplar plots of the output from the BP-AR-HMM with foot acceleration of a single representative subject ........................................................................... 35 3.3 Changes in the duration of state 7 and state 3 for all subjects’ transitions between walking and running ................................................................................ 37 4.1 Data collection protocol ......................................................................................... 51 4.2 Gait event identifiers from the BP-AR-HMM ....................................................... 56 4.3 Representative foot ground contacts for Level Ground Walking and Running ........................................................................................... 57 4.4 Representative foot ground contacts for Stair Ascent and Descent ............................................................................................................ 58 4.5 Representative foot ground contacts Ramp Ascent and Descent ............................................................................................................ 59 5.1 Flow chart of the data post processing steps and the algorithmic methods for the identification of gait events from the force sensing insole data and data from multi-axial IMUs at both anatomical locations .............................. 69 5.2 Acceleration waveforms from the IMUs mounted on the right foot with superimposed foot contacts identified from the force measuring insoles ........................................................................... 72 5.3 Acceleration waveforms from IMUs mounted on the sacrum with superimposed foot contacts identified from the force measuring insoles for Level Ground............................................................. 73 5.4 Time differences in the identification of gait events between measured forces and estimated IMU gait events .................................................................... 74 5.5 Differences in force measured and IMU estimated contact time across the range of average velocity trials ........................................................................ 75 xv Figure Page 5.6 Bland-Altman plot displaying the average differences between the estimated contact durations and the measured gait events from the force sensing insoles .................................................................... 76 5.7 Time differences between the IMU estimated foot contact and force measured foot contact ................................................................................... 77 6.1 Machine learning methodology ............................................................................. 88 6.2 Complete analysis of contact time ......................................................................... 93 6.3 Kinetic variables across velocities ......................................................................... 94 6.4 Regression analysis of estimated kinetic variables with mean and 95% confidence intervals .............................................................. 95 6.5 Bland Altman plot with offset and 95% Limits of Agreement. Each data point represents an average velocity trial .............................................. 96 7.1 Outlined course each participant ran...................................................................... 111 7.2 Gait event differences estimated from the foot mounted IMUs ............................ 116 7.3 Estimated contact time from foot mounted IMUs ................................................. 120 7.4 Regression analysis of estimated kinetic variables with mean and 95% confidence intervals .............................................................. 121 7.5 Estimated contact time from the BD-LSTM .......................................................... 122 7.6 Kinetic variables across the range of running velocities ....................................... 129 7.7 Stance average GRF. Linear regression and Bland-Altman plots ......................... 130 7.8 Peak Force. Linear regression and Bland-Altman plots ........................................ 131 7.9 Impulse. Linear regression and Bland-Altman plots ............................................. 132 7.10 Average loading rate. Linear regression and Bland-Altman plots ....................... 133 xvi LIST OF TABLES Table Page 2.1 Subject Characteristics (Mean ± SD) for Specific Aim 1 ...................................... 12 2.2 Subject Characteristics (Mean ± SD) for Specific Aim 2 ...................................... 13 2.3 Subject Characteristics (Mean ± SD) for Specific Aims 3 and 4 .......................... 13 2.4 Subject Characteristics (Mean ± SD) for Specific Aim 5 ...................................... 14 3.1 Subject Gait Cycle Accuracy (% of Occurrences) and Duration of States Output by the BP-AR-HMM Across Subjects ............................................ 38 3.2 Average Time Prior to Gait Events When Either State 4 or 1 Occurs ................... 39 4.1 Heuristic Algorithm Accuracy and Lead Time ...................................................... 54 4.2 BP-AR-HMM Algorithm Accuracy and Lead Time ............................................. 55 4.3 Primary States Output From the BP-AR-HMM .................................................... 55 5.1 Distribution of Average Running Velocities ......................................................... 68 5.2 Exemplar Paces for Participant with a 7-Minute per Mile 5-km Race Pace ............................................................................................. 68 5.3 Root Mean Square Error for the Identification of Gait Events and Estimation of Foot Contact ................................................................. 71 6.1 Distribution of Average Running Velocities ......................................................... 85 6.2 Example paces and timing for 400 m run .............................................................. 86 6.3 Stance and Waveform Root Mean Square Error .................................................... 90 6.4 LSTM Estimated Gait Event Error ........................................................................ 91 6.5 LSTM Estimated Spatial-Temporal and Kinetic Variables Root Mean Square Error ........................................................................................ 92 7.1 Participant Characteristics and Mile Pace.............................................................. 107 7.2 Root Mean Square Error Table for IMU Gait Event Detection ............................. 112 xvii Table Page 7.3 Root Mean Squared Error from BD-LSTM Output ............................................... 113 7.4 BD-LSTM Stance Phase Ground Reaction Force Waveform RMSE ................... 114 7.5 Measured Kinetic Variables from Force Insoles ................................................... 117 7.6 Kinetic Variable RMSE from BD-LSTM Estimated Waveforms ......................... 118 xviii CHAPTER I INTRODUCTION Background and Significance Introduction Biomechanical analysis of human locomotion outside of the laboratory has been made possible with the development of wearable sensors, for both clinical and sport applications [1]–[3]. Collection and analysis of biomechanical data in the laboratory has been well established for the measurement and analysis of human locomotion. Traditional biomechanical protocols require motion analysis in the laboratory either across inground force plates or on an instrumented treadmill [4]–[6]. These methods require expensive equipment, and a high level of technical expertise, while comparatively wearable sensors are less expensive, lightweight and require less technical expertise than laboratory-based methods. From both laboratory-based analyses, and real-world/ecological analyses there are three broad categories of measurements; spatial temporal, kinematic and kinetic. Examples of spatial temporal parameters relating to gait are initial contact (IC), the instance the foot comes into contact with the ground, toe off (TO), the last instance the foot is in contact with the ground, and contact time as the difference between the TO and IC. These form the basis of biomechanical analysis, via gait cycle segmentation. Examples of kinematic variables are joint angles and angular velocity. Kinetic variables include stance average ground reaction force (GRF), stance average GRFs, impulse, peak GRFs, and loading rates. For the purposes of this dissertation, analysis will be focused on the identification of gait events in all manuscript chapters (III – VII), and estimation of ground reaction forces in the final two manuscript chapters (VI and VII). Gait Event Identification for General Locomotion The gold standard for identification of gait events in the laboratory are vertical GRF using force plates and marker trajectories measured using optical motion capture systems [7]. Wearable sensors for the measurement of forces outside the lab currently include in-shoe force or pressure sensors. Insoles utilized for the identification of gait events outside of the laboratory have been developed and validated for gait event detection, and measurement of kinetic variables [8]–[10]. Force or pressure sensing insole systems can be expensive, and they are often not viable for use in assistive devices or consistent daily use. For the purposes of this dissertation, force sensing insoles were used as the standard comparator against which the IMU based estimates were examined. Typically, there are 9 individual sensors in a multi-axial IMU, 3-D accelerometers (linear acceleration), 3-D gyroscopes (angular velocity) and 3-D magnetometers (magnetic field). Multi-axial inertial sensors have been utilized extensively for the identification of gait events and locomotion mode, with varying success [11]–[18]. Accelerometers have the potential to be used for gait event detection but require the use of gyroscopes and extensive mathematical modelling, and Kalman Filters to correct the orientation of the senor consistent identification of gait events [19], [20]. The use of gyroscopes for identification of gait events may be more consistent as they are orientation invariant, and unaffected by gravity [21]. While these methodologies have been 2 previously explored in the laboratory, they are yet to be validated in a real-world environment. Machine Learning Algorithms for Identification of Gait Events Machine learning algorithms have been used to identify gait events and classify locomotion modes, such as walking, ramp/stair ascent/descent. These methods include linear discriminant analysis (LDA), and support vector machines (SVM) that utilized simple rules in high dimensional feature spaces to divide the different classes [22]–[25]. More sophisticated implementations of machine learning algorithms for gait event identification are hidden Markov models (HMM), and time history dependent variations of these (e.g. dynamic Bayesian networks and auto regressive HMM), while other algorithms include artificial neural networks (ANN), and phase variable classification [17], [25]–[28]. Although more computationally intensive, these algorithms provide improved accuracy compared to more simplistic machine learning methods. Machine learning algorithms are favored over simpler heuristic methods as they have the potential flexibility to transfer learning between environments. Some issues with these approaches are they have not yet been tested on data in an ecological environment, and they typically require large sensor arrays and high sampling frequencies that are not practical for real- world applications. One solution to the problem of large sensor arrays and high sampling frequencies are novel algorithmic techniques have been developed particularly for analysis of multiple related time series and minimal input data. An example of these algorithms are switching linear dynamical systems, developed previously for economic analysis and the 3 description of other dynamical processes [29]. These algorithms are unique as behaviors do not have to be known a priori but can be added when a new feature is discovered. A switching linear dynamical system, the Beta Process Autoregressive Hidden Markov Model (BP-AR-HMM) was used in the first part of this dissertation [30]. The beta process is a feature selection algorithm, that can generate new features based upon the given data and share these features across related time series [29]. The transitions between these features are governed by the AR-HMM model, which considers the previous time steps and generates transition probabilities for efficient transitions states. This type of feature identification algorithm is advantageous as it generates states from the input data, does not rely on user defined states, and it can be utilized as both an unsupervised and supervised machine learning algorithm. The first two Specific Aims of this dissertation sought to identify features prior to gait events from sensors with minimal sampling frequency with the BP-AR-HMM and a heuristic algorithm. The final three Specific Aims of this dissertation focused on the estimation of contact time from IMUs, and the mapping of GRF waveforms from IMUs while running outside of the laboratory. Running Outside the Laboratory Over the past three decades biomechanics researchers have studied running in different simulated environments. Examples of these are: running on different surfaces [31], [32], different simulated gravities [33], positive and negative slopes [34]–[36] and on uneven terrains [37]. These studies have developed the gold standards for human running performance analysis. Inertial sensors have been used for the estimations of leg 4 joint angles, and measurements of stride kinematics and segmental accelerations, during competitive events [38], [39]. The overarching purpose of these studies was to develop an understanding of human running biomechanics across a range of locomotion velocities and in different simulated environments. There have been further inquiries utilizing inertial data collected in the laboratory for estimation of gait events [40]–[42], discrete kinetic variables[43], [44] and kinetic waveforms [45]–[47]. Initial comparisons between the laboratory standard and wearable sensors for the identification of gait events has been successful [41], [42], [48]. There have been two general methods for the estimation of gait events from inertial sensors: heuristic rule sets, and machine learning. Identification of gait events from IMUs for running locomotion has generally relied on heuristic rules. Sensors mounted on the shank and foot have relied on the identification of peak accelerations for the estimation of Initial Contact (IC), [40], [41], [49]. Peak impact accelerations during running for IMUs mounted on the foot, are greater than 50 m s-2, these peak accelerations are consistent enough for the identification of IC while running. Multi-axial IMUs, mounted on the waist of the participants tend to rely on accelerations in either the vertical or anteroposterior directions for the identification of IC [40], [41], [44], [48]. From the identification of IC, it is then possible to detect TO with specific rules within a temporal window. The use of machine learning algorithms for the identification of gait events during running has shown promise as well in the laboratory, for a data driven approach to the identification of foot contacts [50]. Machine learning for mapping of IMU data to GRF waveforms for the analysis of running biomechanics has not yet been achieved from data collected in an outdoor training or real-world environment. The estimation of temporally dependent waveforms 5 from recurrent neural networks (RNNs), particularly, Long Short Term Memory networks (LSTMs) have been used in biomechanics as gait is cyclical [45]–[47], [51]. They have been used for estimation of GRFs during an identified stance phase from inertial data [46], [47], from data normalized to the duration of a step [52]. More recently, there have been other efforts in this space to estimate partial GRF waveforms, that can be linked together to form waveforms of indefinite length [45]. From these waveforms simple rules can be made for the identification of IC and TO. These studies have focused on the estimation of contact time, peak GRF, impulse and loading rate. Each of the previous studies, with the exception of [45], have only estimated GRF during stance phase as identified by a researcher. Furthermore, the work in this space has been limited to estimation of these variables while running on level ground with the exception of [45], which included treadmill running on an incline and decline. We look to expand upon these methods by validating level ground running at set paces as well as free running for the estimation of GRFs. General and Specific Aims This dissertation set out to develop validated methodologies for the identification of gait events and estimation of GRF waveforms from wearable sensors across a range of locomotion modes and running velocities from data collected outside of the laboratory. The practical applications of these data and methods are the accurate identification of foot contact from mobile sensors, including those used for control systems on powered assistive devices with minimal sensor data and more accurate and relevant training tools for runners. Towards these objectives five specific aims were addressed. The first two 6 Specific Aims focused on the estimation of features prior to gait events during steady state locomotion and transitions between locomotion modes with minimally sampled data. The last three Specific Aims focused on estimation of gait events and GRF waveforms while running in a semi-uncontrolled and an uncontrolled environment. Specific Aim 1: Test the utility of a machine learning algorithm to label foot acceleration data for classification of human locomotion. The Beta Process – Auto Regressive – Hidden Markov Model (BP-AR-HMM) can be either a supervised or unsupervised machine learning algorithm that generates a subject independent feature set for related time series. We utilized steady state locomotion data, walking and running as well as dynamic transitions between the two to test this algorithm. Features output from the supervised BP-AR-HMM were expected to be able to identify waveform characteristics prior to IC and TO. To complete this aim, data collected in a prior study conducted my Dr. Li Jin, a former graduate student in the BSSC were used as input to the BP-AR-HMM. Specific Aim 2: Develop a system for the identification of features prior to gait events in an ecological environment with both an unsupervised machine learning and a heuristic algorithm. We tested the utility of the unsupervised BP-AR-HMM to label steady state and dynamic locomotion data, as well as a rules-based algorithm for the identification of features prior to the occurrence of gait events. Steady state walking, running, ramp and stair ascent/descent as well as transitions between level ground walking and all other locomotion modes in an ecological environment. This aim sought to build a participant 7 independent labelling system of gait for the identification of gait events prior to their occurrence with both a machine learning and heuristic model. Specific Aim 3: Develop and validate two algorithms for the estimation of contact times at set paces on a track. We developed two algorithms based on previous work for the estimation of gait events from a wide range running velocities and participant skill levels. We collected data in a semi-uncontrolled environment, at set running paces on a square track, to estimate gait events from two IMUs located bilaterally on the dorsum of each foot as well as the one IMU clipped to the back of the waistband. Each of the algorithms used peak identification for the estimation of IC and temporal windows for the identification of TO. The IMU estimated gait events were compared to events derived from instrumented force sensing insoles as a standard. Specific Aim 4: Estimate whole ground reaction force waveforms from inertial sensors during fixed pace runs on a track. We implemented a machine learning algorithm for the mapping of inertial data to GRF waveforms using data collected from participants running at set paces and with a variety of running experience. This analysis used data from Aim 3. The purpose of this study was to build a machine learning model for the estimation of gait events and contact time, as well as the estimation of discrete GRF variables from an estimated waveform, including stance average GRFs, peak GRFs, impulse, and average loading rate in a semi-uncontrolled environment. 8 Specific Aim 5: Estimate whole ground reaction force waveforms from inertial sensors during a free run in a real-world environment. The focus of this aim was to validate the algorithms developed in Aims 3 and 4 in a real-world running environment. Specifically, this study compared the estimation of whole GRF waveforms from the same three sensor set up used in Aims 3 and 4 for participants running at their own pace on a 5-mile course, which included running up and down hill. Through the completion of these aims, methodologies for the identification of gait events outside of the laboratory have been improved. This knowledge may be applied to the identification of gait events in assistive devices, wherein minimizing sampling frequencies makes it less burdensome to monitor patients in their own ecological environment. Other aspects of this work may be applied to development of a model for the quantification of training load throughout a runner’s preparation for a race, or for a novice runner to improve their form. The larger study mentioned throughout Specific Aims 3-5 is the Longitudinal Running Study, the overarching purpose of this study is to develop biomechanical tools to monitor running performance throughout the course of their training. Organization of Dissertation This dissertation is written in a journal style format, where chapters III-VII have been or will be submitted for publication to peer-reviewed journals. The following section explains how these chapters fit together into a coherent body of work. A bridge statement explaining the flow of studies is included at the conclusion of Chapters III-VII. 9 The current chapter (Chapter I) provides essential background information regarding wearable sensors and machine learning algorithms for the detection of features prior to gait events, gait events and the estimation of whole force waveforms from wearable sensors. This chapter establishes the basis and need for the research presented in the dissertation. Chapter II will detail the methodology used for each study. Chapter III describes methods, model development, and validation procedures related to Specific Aim 1. Chapter IV describes the development of methods for outdoor data collections with wearable sensors and machine learning algorithms for gait event detections in a real- world environment to complete Specific Aim 2. Chapter V describes the development of algorithms for the estimation of gait events using two different anatomical locations for running in a semi-uncontrolled environment for completion of Specific Aims 3. Chapter VI corresponds to the model development, methods and analysis of running in a semi- uncontrolled environment with data estimated from an optimized machine learning model, for completion of Specific Aim 4. Finally, Chapter VII synthesizes the methods and models of chapters IV, V and VI for the analysis for the completion of Specific Aim 5. Chapter VIII summarizes the notable results of the overall body of work, reiterating the key findings while acknowledging limitations and outlining future directions for work in this area of research. This dissertation includes co-authored work, some which has already been submitted for publication in peer-reviewed journals. Chapter III has been submitted and accepted to IEEE Journal of Biomedical and Health Informatics, DOI: 10.1109/JBHI.2020.3028827. Chapter IV has also been submitted and accepted to IEEE Transactions on Neural Systems and Rehabilitation Engineering, DOI: 10 10.1109/TNSRE.2021.3131953. Chapters V-VII will be submitted for publication to an appropriate journal. For all work in this dissertation, with the exception of the study design and data collection of Chapter III (completed as part of Dr. Li Jin’s dissertation work), Seth Donahue was the primary investigator, responsible for study design, data analysis, interpretation, and dissemination. Dr. Michael E. Hahn advised on all aspects of this dissertation. 11 CHAPTER II GENERAL METHODOLOGY Subjects Chapters III and IV of this dissertation describe the development and validation of novel methods for the estimation of gait events across and between locomotion modes, both inside and outside the laboratory. Chapters V-VII of this dissertation describe the development of both heuristic and machine learning algorithms for the estimation of gait events in both a semi-uncontrolled and uncontrolled environment. The current chapter provides a summary of the methods developed and used for data collection and analysis for the various Aims. To address Specific Aim 1 (Chapter III), ten middle aged able-bodied subjects participated in the study (Table 2.1). To be included participants had to be between the ages of 40 and 65 and not have sustained a lower limb injury in the past year. These data were collected as apart of Dr. Li Jin’s dissertation. Table 2.1: Subject Characteristics (Mean ± SD) for Specific Aim 1 Sex Age (yr) Height (cm) Mass (Kg) Total (n = 10) 50.7 ± 6.0 173.4 ± 11.4 69.7 ± 1.0 To address Specific Aim 2, sixteen able bodied participants with no injuries to their lower extremities in the past 6 months were recruited (Table 2.2). For Chapter IV, written informed consent was obtained from each participant and study protocols were approved by the University of Oregon Institutional Review Board (IRB protocol #08202018.018). 12 Table 2.2: Subject Characteristics (Mean ± SD) for Specific Aim 2 Sex Age (yr) Height (cm) Mass (Kg) Male (n = 8) 33.75 ± 17.36 179.39 ± 6.78 70.57 ± 8.21 Female (n = 8) 28.50 ± 12.35 167.32 ± 5.02 62.67 ± 4.75 Total (n = 16) 31.13 ± 14.81 173.36 ± 8.49 66.32 ± 7.62 To address Specific Aims 3 and 4, fifteen runners were recruited as part of a larger ongoing study. These participants were recreationally trained, with a minimum training volume of 5 miles of running per week. For Chapters V and VI, written informed consent was obtained from each participant and study protocols were approved by the University of Oregon Institutional Review Board (IRB protocol # 10062020.007). Table 2.3: Subject Characteristics (Mean ± SD) for Specific Aims 3 and 4 Sex Age (yr) Height (cm) Mass (Kg) Male (n = 9) 26.33 ± 13.44 179.44 ± 4.22 71.33 ± 7.00 Female (n = 6) 19.83 ± 0.41 163.00 ± 15.41 64.50 ± 2.17 Total (n = 15) 23.73 ± 10.69 172.87 ± 12.83 68.60 ± 6.46 To address Specific Aim 5, sixteen runners were recruited as a part of larger ongoing study. Participants were recreationally trained runners. They were asked to complete a 5-mile course, which had similar characteristics to their normal training routes, distances, and elevation changes. For Chapters VII, written informed consent was obtained from each participant and study protocols were approved by the University of Oregon Institutional Review Board (IRB protocol # 10062020.007, MODCR: 00000269a). 13 Table 2.4: Subject Characteristics (Mean ± SD) for Specific Aim 5 Sex Age (yr) Height (cm) Mass (kg) Pace (min:secs) Male (n = 8) 23.57 ± 6.00 166.29 ± 7.74 70.86 ± 8.38 7:53±0:46 Female (n = 5) 23.40 ± 3.58 168.40 ± 4.56 56.20 ± 4.21 8:08±0:36 Total (n = 13) 23.15 ± 4.88 167.77 ± 30.86 65.00 ± 9.70 7:56±0:40 Study Design and Experimental Protocol Chapter III Participants walked and ran on an instrumented treadmill (Bertec, Inc., Columbus, OH) with four steady state trials, and two dynamic walk-to-run and run-to-walk transitions. There were two steady state walking trials (1.4 and 1.6 m s-1), two steady state running trials (2.6 and 3.0 m s-1), and a walk-to-run trial from 1.8 – 2.4 m s-1 with an acceleration of 0.1 m s-2 and run-to-walk trial from 2.4 -1.8 m s-1, with a deceleration of 0.1 m s-2. Each trial was 90 seconds long, with the middle 10 seconds of each trial being analyzed. Chapter IV Participants took part in two days of data collection. The first day consisted of locomotion transitions in a real-world environment, transitioning between level ground walking and each of the following: ramps, stairs and running. On the second day of data collection participants were asked to complete nine average velocity running trials from 1.3 – 3.0 m s-1 on an 80 m straightaway. 14 Chapters V-V1 As a part of a larger ongoing study, participants were asked to complete average running velocity trials on a single lap around a 400 m track. They ran at three paces slower than their race pace, at their race pace and at a push pace. For example, if their race pace for a 5 km race was 6:00 they would be asked to run at 7:30, 7:00 and 6:30, then at 6:00, after their race pace they were given the choice to continue to a 5:30 pace. Each participant was provided a minimum rest period of two minutes or self-selected by the participants between each time trial. Chapter VII As a part of a larger ongoing study, participants were asked to complete a 5-mile run on a course around the University of Oregon and Prefontaine Trail. This course included incline, decline and level ground running. Data Collection Chapter III Participants were outfitted with a bilateral marker set consisting of 43 retro- reflective markers defining nine segments (forefoot, rearfoot, shank, thigh, pelvis). Three-dimensional marker data were collected at 200 Hz using an 8-camera motion capture system (Motion Analysis Corp., Santa Rosa, CA). Ground reaction force data were collected at 1000 Hz using a force-instrumented treadmill (Bertec, Inc., Columbus, OH). 15 Chapter IV Participants were equipped with four IMUs (Vicon, Centennial, CO): one on the waistband, one on the lateral aspect of the thigh, at approximately the midpoint between the greater trochanter and the knee, one on the lateral aspect of the leg, at approximately the midpoint between the knee and the lateral malleolus and one on the dorsal aspect of the foot. Three participants were equipped with the Noraxon foot switch DTS system (Noraxon, Scottsdale, AZ; sampling rate of 500 Hz), which measured pressure at the foot-shoe interface. The other 13 participants were equipped with Loadsol insoles (Novel, Minneapolis, MN; sampling rate of 100 Hz), which measured normal force between the foot-shoe interface. Chapters V - VII Participants were equipped with three IMUs, with a sampling rate (Casio, Tokyo, Japan; sampling rate of 200 Hz). Force sensing insoles (Novel, Minneapolis, MN; sampling rate of 100 Hz) were used for the measurement of normal force data between the foot-shoe interface. They were also equipped with a GPS watch (Garmin Forerunner 130 and 135; Garmin, Olathe, KS; sampling rate of 1 Hz) from which distance, pace, elevation, and slope were measured. Data and Statistical Analysis Chapter III 16 All data analyses were completed with custom Matlab scripts (Mathworks, Natick, MA). The GRF data were filtered with a low pass fourth order Butterworth filter (35 Hz cut off frequency). Ground contact state was determined when vertical GRF values were above a threshold of 50 N. Motion trajectory data from a single marker on the dorsum of the left foot were extracted from a larger data set [53]. For this study and throughout the dissertation, x defined the medio-lateral axis, y defined the antero- posterior axis, and z defined the longitudinal axis. Idealized three-dimensional (3D) acceleration values were calculated by taking the second derivative of the filtered marker position data. The foot contact information and acceleration data were then downsampled to 20 Hz and both input into the Beta Process Auto-Regressive Hidden Markov Model (BP-AR-HMM). The BP-AR-HMM is an algorithm developed to share features across related time series. The key element of this algorithm, the beta process, adds and remove states that can be shared across all data sets input. For example, if two states describe the same feature they will be combined into a single state and shared. This development of dynamic features is crucial for use in semi-supervised or unsupervised machine learning, or as data sets grow so large that they cannot be feasibly annotated. The HMM assumed that a set of distributions and transitions between dynamic features follow a Markov process. This process was modified with the presence of the auto-regressive model, which considers temporal relationships in the data. The vector auto regression approach provided an estimation of current state, based upon the previous time steps, the available states and distributions based on the hidden Markov model. A sticky parameter was used to increase the probability of self-transitions within a state to identify biomechanically 17 significant states [30]. Output data from the BP-AR-HMM were states of varying duration, transitions between different states were indicative of transitions between different aspects of the gait cycle. Performance evaluation of the BP-AR-HMM included the critical timing of states and how consistently the features were used across all locomotion modes. Accuracy in this study was defined as the number of features identified prior to the impending gait events divided by the number of measured gait events for both the heuristic and machine learning algorithms. Chapter IV All data analyses were completed with custom Matlab scripts. Foot-shoe normal force data were considered the standard reference for identification of measured gait events. The inertial signals and force data were time-synced using ‘foot-stomps’ before and after each trial. Inertial data from the IMUs on the participant were time synced and down sampled to 100 Hz, then synchronized with the insole force sensors off-line. Foot contact information and ground reaction force data were converted into indicator variables, indicating cyclic motion or not. Then data were labeled with specific locomotion mode. Input into the heuristic algorithm was single axis angular velocity data about the x-axis of the IMU (sagittal plane of the participant), sampled at 100 Hz. Input into the BP-AR-HMM consisted of single axis angular velocity data about the x-axis of the IMU, down sampled to 25 Hz. There were four rules utilized for the identification of gait events from the heuristic algorithm and six state from the BP-AR-HMM. Chapter V 18 All data analyses were completed with custom Matlab scripts (Mathworks, Natick, MA). Foot-shoe normal force data were considered the standard reference for identification of measured gait events. A Kalman filter was applied for the adjustment of orientation of the local IMU coordinate system to gravity [54], [55]. Initialization of the Kalman filter occurs with the IMU still, and an estimation of the orientation of the sensor using accelerometer signal. During motion, orientation of the sensor is estimated by calculating the change in orientation from the rate gyroscopes. Each stance phase when the resultant accelerations were near that of gravity, the orientation of the sensor was reset with acceleration. Identification of gait events with the force sensing insole utilized a threshold of 50 N. Initial contact from the foot mounted IMUs were estimated with peak detection algorithms, and TO was detected with either a peak acceleration or a threshold during a variable length temporal window. From the sacral mounted IMU, IC was detected with a peak finding algorithm and TO was detected with a static temporal window, either detecting maximum acceleration or maximal slope of the vertical acceleration. Pearson correlation coefficients (r2) were used to compare the estimated contact times to the measured force insole contact times. A strong correlation was defined as r2 ≥ 0.8, a moderate correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 0.5. The relationships between the estimated and measured waveform variables are presented in both linear regression and Bland-Altman plots with 95% confidence intervals (CIs)/ Limits of Agreement (LoA) respectively. Chapter VI 19 All data analyses were completed with custom Matlab scripts (Mathworks, Natick, MA). An extension of Chapter V, the analysis in this chapter shifted the data analysis to include the use of an LSTM for the estimation of GRF waveforms. The variables calculated from these waveforms were contact time, stance average GRFs, peak GRF, impulse, and average loading rate. We chose to use an LSTM for the estimation of GRF waveforms since GRFs patterns during gait are cyclical and therefore time dependent. The LSTM was initially developed to overcome the vanishing gradient or the long term dependency problem within machine learning [51]. Within the LSTM, there are specific feedback connections, which facilitates retention of about previous data in the sequence by the LSTM. The output of the LSTM depends on three different things, the cell state, hidden state, and the input data at the current time step. The cell state and hidden state are governed by specific gates in the LSTM cell shown in Figure 2.1. The forget gate decides which parts of the memory are useful, based on the previous hidden state and new input data. The previous hidden state and new input data are fed into a neural network with sigmoid activation functions. If the output from this network was approximately 0, that portion of the input data was deemed irrelevant, and when the output is closer to 1 it was deemed relevant and kept. Outputs from the forget gate were then pointwise multiplied by the previous cell state, so that portions of the cell state that were deemed irrelevant were multiplied by approximately 0, and thus have less influence on the next steps. The next step determined if any new information should be added to the cell state, given the new input data and the previous hidden state that has been modified by the 20 forget gate. A new cell state was estimated using a neural network, with a hyperbolic tangent activation function. This network had the same input data. The purpose of this gate was to estimate the new cell state based upon the input data and the previous hidden state. The input gate was the next gate and together with the estimated cell state, generate a state vector. The input gate had a sigmoid activated neural network, that functioned as a filter identifying which components of the new memory vector were important. Thus, if the output from the input gate were near zero, the model will not update that element of the cell state. The output from the estimated cell state network and the input network are then point multiplied and added to the cell state. Figure 2.1: LSTM cell layout and mapping of operations. The final step was the output gate, which is similar to the forget gate, in that it takes in the previous hidden state, and input data. The output gate is a neural network with a sigma activation function. The cell state was passed through a hyperbolic tangent filter. The steps described above are repeated for each input. For example, for a 400- 21 sample input, the above steps would be repeated 400 times. The output was also a hidden state; therefore, it needed to be processed through a regression layer to output a waveform. The loss function in this work was computed as the half mean squared error. There were two steps in the validation of these machine learning protocols, first a hyperparameter optimization, followed by a Leave One Out Cross Validation (LOOCV). We optimized the hyperparameters of the LSTM with a Bayesian Optimization [56], [57], this algorithm attempts to minimize the error in the loss function of the LSTM in a bounded domain, however the specifics of this algorithm are beyond the scope of this dissertation. Input data during the optimization were randomly split into a training set of 70% of the total data, a validation set of 15% of the total data and a test set of 15% of the data. The hyperparameters optimized were the initial learning rate, gradient decay factor, squared gradient decay factor, L2Regularization and number of hidden units. The only hyperparameter that influenced the outcome of the algorithm was the number of hidden units, as all others converged to the default Matlab input. The range for the number of hidden units in the optimization was [10 50]. Optimization of these data was tested on temporal windows of varying length between 1 and 5 seconds in duration at half second intervals. The optimal network was determined by model performance on these windows. The optimal temporal window length was determined by examination of the RMSE from the optimal network output. This network was then used in a LOOCV, where a single participant was left out of the training set to be utilized as the test set. From the LOOCV we determined the most accurate estimation of waveforms by the identification of the temporal window duration that had the lowest RMSE for the waveform and discrete kinetic variables. 22 Estimated waveforms from the LOOCV were analyzed and presented in this work. Initial analysis involved identification of the optimal temporal window. This was done via inspection of the estimated GRF variables in Bland-Altman plots. Initial Contact (IC) was identified by the first instance of force >5% BW and toe off (TO) was determined by the last instance of force greater than >5% BW. Contact time was determined by taking the temporal difference between these two discrete events. Stance average GRFs, impulse, peak GRFs, and average loading rate were the GRF variables calculated in this work, from the estimated force waveforms. Average loading rate was calculated by identifying the impact peak and then averaging the slope in the middle 60% of the region between IC and the impact peak [58]. Chapter VII All data analyses were completed with custom Matlab scripts (Mathworks, Natick, MA). We utilized variations from previous chapters for the estimation of gait events from an IMU and the estimation of GRF waveforms from bi-directional LSTMs (BD-LSTMs). These BD-LSTMs take in data both forward, say from time points 0-99, and backwards, from 99-0 and utilize the methods described above. Post-processing consisted of the same protocols as Chapters V and VI, with additions due to the uncontrolled nature of this data collection. Data from the GPS signal were upsampled from 1 Hz to 100 Hz, by creating a matrix where each original data row was increased to be 100 rows of identical data. Occasionally, ground reaction force data exhibited baseline drift. These were corrected for, however the magnitude of the adjusted force was not 23 corrected for, as there were less than 500 footfalls that needed correction, having minimal effect out of the >85,000 analyzed footfalls. Gait event detection algorithms used in this chapter were similar to those used in Chapter V. Identification of foot contact from the force data used the same rules as in Chapter VI. The identification of gait with IMUs required the addition of detection of pre-IC similar to chapter IV. A temporal window began 50 ms after the identified minima and ended 450 ms after identified minima, and the largest peak > 50 m s-2 in the resultant accelerations was identified as IC. The rules for identification of TO were the same as in Chapter V. Optimized networks were taken from Chapter VI and used for a LOOCV of the data in this chapter. A BD-LSTM with a regression output of 1 second was utilized for this study. Foot contacts < 100 ms and > 500 ms were removed from the data set, as they represented non-cyclical movement (i.e., not a running pattern). Further, any data < 5% BW were set to 0 BW. Initial contact and toe off from the estimated waveforms were identified in the same manner as Chapter VI. Foot contacts that could not be matched to the measured GRF data were removed from the data set. From the estimated waveforms, stance phase RMSE, stance average GRFs, peak GRFs, impulse and average loading rate were all calculated in the same fashion as Chapter VI. Velocities from the GPS were set to the nearest 0.25 m s-1 for velocities ranging from 2.25 – 5.25 m s-1, all other velocities were removed from the analysis. Any velocity < 2.25 m s-1 was considered to be a walking step, and there were not enough foot contacts for analysis > 5.25 m s-1. Slope was calculated from the altitude data and binned into three different groupings of level ground, incline, and decline. 24 Incline foot strikes were identified as slopes > 5° and decline foot strikes were identified as slopes < -5°. We noted errors of up to 4° in the GPS data, which was the reason for delineating the slopes as more than ±5°, and designating all other foot strikes to be level ground. Running velocity and slope combinations were included in the analysis if the participant had more than 10 footfalls for either the IMU or estimated variables. Linear models, RMSE, and bias were calculated for the variables in this study using the same approach as in Chapter VI. 25 CHAPTER III USER INDEPENDENT ESTIMATIONS OF GAIT EVENTS WITH MINIMAL SENSOR DATA Published as Donahue, S.R; Jin, L; Hahn, M. E; User Independent Estimations of Gait Events with Minimal Sensor Data. IEEE Journal of Biomedical and Health Informatics. 2021, 25(5), 1583-1590. The experimental work was performed by L. Jin. The writing and analysis are entirely mine. M.E. Hahn provided editorial assistance and guidance. Introduction In recent years the development of assistive devices and their control systems have improved dramatically; from passive prosthetic limbs that provide minimal adaptability to specialized performance limbs and adaptive assistive devices which support the user in a variety of ecological environments [59]–[63]. These adaptive limbs rely on input from mechanical sensors such as Inertial Measurement Units (IMUs), accelerometers, goniometers, load cells, and neuromotor input derived from electromyography. Current control systems take these sensory data and are able to identify gait events, classify steady state locomotion, and provide prediction of user intent for the next step [5-7]. This study aims to implement an unsupervised machine learning algorithm on minimally sampled accelerometer data, to identify gait events, and further assess the feasibility of the output for building a model for the classification of locomotion and prediction of locomotion transitions. 26 The control algorithms used to estimate gait events with wearable sensors (e.g. IMU’s) have classically been threshold based with a variety of sensor locations [12], [65], [66]. Other groups have used adaptive algorithms such as Hidden Markov Models (HMM) for the development of rules for detection of gait events [18], [67]. These algorithms are accurate for a constrained task however, they tend to be inflexible. The expert user defines the states, and the transitions between the states. Another method that involves gait event estimation is the identification of gait phase. Gait phase identification has informed the classification of both steady state locomotion and transitions between locomotion states. The algorithms used for these analyses traditionally are variants of linear discriminant analysis, support vector machines, artificial neural networks and hidden Markov models [22], [24], [68]–[73], and a variety of other classification algorithms [11], [14], [24], [27], [74], [75]. Algorithms utilizing the temporal nature of gait data to optimize classification and prediction have been successful [76], e.g. Dynamic Bayesian Networks consider the previous data and states to determine the current state and make a prediction of what the next state will be. Subject independent algorithms, which provide generalized classification of gait and transition prediction for both able-bodied subjects and prosthesis users have shown promise as well [77], [78]. These algorithms have been highly accurate in the laboratory setting, with great than 95% accuracy for the identification of gait events. Previous algorithms represent solutions from the understanding of the expert user. The potential set of features in any given data can be extensive, therefore expert knowledge of the user, coupled with observations of the dataset can define important features or characteristics of the waveforms. Often, for threshold-based models high 27 sampling rates are required for accurate classification of gait. Furthermore, these algorithms require large sensor arrays, i.e. multiple mechanical and neuromuscular sensors, for classification. In most cases these algorithms identify gait events at a steady state velocities, and do not consider how the subject may be accelerating through space or transitioning between locomotion states. These characteristics limit the applicability of these approaches to a real-world environment, because of their inflexibility to a variety of situations and for the user to manage the sensors. Gait event detection in the past has primarily been concerned with environmentally constrained locomotion e.g. ramps and stairs [79]–[81]. The primary locomotion states and transitions studied with both subject dependent and independent classifiers have typically been environmentally constrained transitions from level ground walking to ramp or stair ascent or descent [22], [64], [74], [77], [82]. In addition to these standard locomotion states and transitions, recent work has examined classification of running in prosthetic leg systems [61], [83]. However, it remains that little progress has been made on the classification of walk to run (WRT) and run to walk (RWT) transitions. In the laboratory setting, walking and running can be differentiated via ground reaction force (GRF) waveforms [5], [84], [85]. Walking is defined as a pendular gait with a brief double support phase, minimal ballistic propulsion, and a ground reaction force waveform with two distinct peaks [85]–[87]. Running is defined as a bouncing gait with an aerial phase, ballistic propulsion and ground reaction force waveforms that have a single large peak, and occasionally a transient peak in the first 25% of foot contact [86]– [88]. While these general descriptions clearly distinguish the two locomotion forms, it remains that an efficient parsing of gait (specifically initial contact (IC), toe-off (TO), 28 mid-stance and phases of swing) during steady state walking and running, and the transitions between the two gaits has not been attained in near real-time. Recent advances in assistive devices and prosthetic limbs require new flexible control algorithms to determine locomotion states, discrete gait events and transitions between. An unsupervised classification algorithm could provide efficient locomotion mode detection, which may be translated to a near real-time control application in assistive devices. In this paper, we propose the use of an unsupervised learning algorithm, the Beta Process - Auto Regressive - Hidden Markov Model (BP-AR-HMM) [30], [89]. The BP- AR-HM is a data driven feature extraction algorithm used to derive common features from related time series data. For this experiment we input calculated acceleration data for the algorithm to derive features (or states), to then estimate gait events. These states could also provide a framework for the classification of gait and potentially a basis for the prediction of locomotion transitions. The primary purpose of this study was to derive output from the BP-AR-HMM to provide biomechanically relevant classifications for estimation of gait events such as initial contact and toe off for steady state and dynamic locomotion with a group of able-bodied middle-aged adults. The secondary purpose was to explore the possibility of developing a framework for classification of gait and the development of a probabilistic model for prediction of locomotion transitions between walking and running from the output of the BP-AR-HMM. Methods Instrumentation and Data Collection 29 Ten middle aged able-bodied subjects participated in the study (50.7 ± 6.0 years, 173.4 ± 11.4 cm, 69.7 ± 14.9 kg; 5 female). Ground reaction force (GRF) and passive marker trajectory data were collected at 1200 Hz and 120 Hz, respectively. Marker data were collected with an 8-camera system (Motion Analysis Corp., Santa Rosa, CA). The protocol consisted of two different transitions: (walk to run) WRT and (run to walk) RWT on a force-instrumented treadmill (Bertec, Inc., Columbus, OH). The WRT protocol began with walking at 1.8 m s-1 for 30 seconds, then the treadmill was constantly accelerated at 0.1 m s-2 to the velocity of 2.4 m s-1. Subjects were asked to transition to running gait whenever they felt ready during the acceleration stage. After transitioning to a running gait, they ran at 2.4 m s-1 for another 30 seconds. The RWT protocol was the inverse of the WRT, starting at 2.4 m s-1 for 30 seconds, then the treadmill was constantly decelerated at 0.1 m s-2 to the velocity of 1.8 m s-1. The subjects were then asked to perform four steady state locomotion trials. Walking trials were performed at 1.4 and 1.6 m s-1 and running at 2.6 and 3.0 m s-1 for 90 seconds. Data from the middle 10 seconds of each trial were extracted for analysis. All data analysis was completed with custom Matlab scripts (Mathworks, Natick, MA) The GRF data were filtered with a low pass fourth order Butterworth filter (50 Hz cut off frequency). Ground contact state was determined when the vertical GRF values were above a threshold of 50 N. These GRF signals were used as the reference for foot contact by the BP-AR-HMM. Motion trajectory data from a single marker on the dorsum of the left foot were extracted from a larger data set [53]. Idealized, three-dimensional (3D) acceleration values were calculated by taking the second derivative of the raw marker position data. In this case idealized means the data from the accelerometer does 30 not contain a signal from the acceleration due to gravity. The foot contact information and acceleration data were then down-sampled to 20 Hz in Matlab to minimize the amount of data needed for successful classification. The BP-AR-HMM algorithm was tested on an initial set of 58 trials, from all 10 participants, to provide user-independent classification [53]. Two trials were removed from the original data set of 60 trials because the GRF data were incomplete for the present analysis. Model Overview The BP-AR-HMM is an unsupervised machine learning algorithm that provides a spatial-temporal statistical model for related time series data. More detailed descriptions of the model can be found in the work of Fox and colleagues [31,33]. A brief overview of the model components is provided here. The beta process (BP) provides a sparse feature sharing framework which is a probabilistic binary feature inclusion model (Figure 1.1). Vector auto-regression (AR) provides an estimation of what the current state is, based upon the previous time step(s), current state data, and available transition distributions. The HMM assumes that the set of distributions and transitions between dynamic features follow a Markov process. The assumption of the Markov process is the probability of state change from i to j has a set probability πij, which does not depend on prior state. To ensure biomechanically relevant states, a sticky parameter was used as described by Fox et al, [90]. The purpose of the sticky parameter increases the probability of self- transitions to capture biomechanically relevant states [90]. Finally, the reversible jump 31 Markov Chain Monte Carlo (MCMC) algorithm is used to ensure the correct features are utilized by determining the transition probabilities between states [30]. Figure 3.1: The full BP-AR-HMM approach is shown here, adapted from [30] where  and  are the hyper parameters that govern the Markovian state switching process to produce (i) which is the collection of transition weights. From this the transition distribution can be constructed with the feature vector from the beta process to generate transition behaviors indicated from fi. The representation of a single is from a finite Dirichlet distribution of dimension Ki of k such that fik = 1. The BP-AR-HMM generates a user independent statistical model, based upon minimally sampled data that does not require real time feature extraction or windowing. It also allows subject independent classification of data sets with a variety of transitions. Traditional Protocols for Machine Learning Classic machine learning paradigms have been based upon algorithms that have shown exemplary results for high dimensional pattern matching, taking n-dimensional 32 from a number of sensors and identifying a state or gait phase. These algorithms have required a large volume of data in order to generate classifications and predictions. Many of these algorithms bypass gait event identification for the use of gait phase identification. These phases are defined by the user, using features extracted from the data as described in the introduction. Features can include the mean, variance, zero crossing, minimum, maximum, frequency content, amongst others [74], [91], [92]. These features are then used for classification of current phase and locomotion state. Data are then split into training (70%), validation (15%) and test (15%) sets. The initial training of the model is completed on the training data set. The validation data is used as a real-time training check of the generalizability of the model; the model does not use this data for training. The test data is utilized after the training of the model is complete to test the accuracy of the model on an unseen data. The BP-AR-HMM is different from these traditional methods of feature extraction and classification. The only thing we define for input into the algorithm is foot contact with the ground. This is then input into the model as a guide and to assist in limiting the state output. The rules determined by the BP-AR-HMM are derived from the acceleration data. Inputting the foot contact data decreases the initial number of states in the instantiation of the MCMC algorithm. Instead of breaking up the data set into training, validation and testing sets, a Reversible MCMC algorithm was used to determine probability distributions for the states derived by the Beta Process. Analysis Methodology 33 Data input into the BP-AR-HMM were three-dimensional acceleration data, along with a Boolean indicating stance and swing phase from treadmill force data. This information was utilized in the initialization of the model and limits the number of states derived by the BP-AR-HMM. Gait events were determined from the GRF data. The temporal differences between the first occurrence of a BP-AR-HMM estimated state and the measured initial contact (IC) and toe off (TO) of the left foot were used to calculate the lead time and estimation of a given gait event. The duration of a state was calculated from the first instance of the state to the first instance of a different state. The durations of the state output for each of the steps were averaged across all subjects, and was used to build a framework from which the classification of gait and an initial examination of the development of a probabilistic model of locomotion transitions could be completed. The utility of the probabilistic model for gait transitions was examined with transition data using the three steps prior to the transition step, the transition step (defined as the first step in the new gait pattern) and the three steps post transition. The transition step and gait events were determined by GRF data. All analyses were performed post hoc. Results There were 9 dynamic states derived by the BP-AR-HMM. Five states occurred in > 98% of the gait cycles (Table 3.1). Three states were strictly associated with foot contact, five states were associated with swing phase, and one state was associated with both stance and swing phase. Of the 797 gait cycles analyzed, state 2 and state 5 occurred in every cycle. The next most consistent occurrence was state 4, which was associated with terminal swing phase and initial foot contact. States 2 and 5 were captured for each 34 walking gait cycle, and from running gait there were 4 states (2-5) captured in each cycle. The average duration of these states, except 7 and 3, remained approximately the same for all walking and running gait cycles (Table 3.2). Figure 3.2: Exemplar plots of the state output from the BP-AR-HMM with foot acceleration of a single representative subject. States 1 and 4 consistently can be used for the identification of gait events, as they occur prior to the identified gait event from the force data. State 4 occurs when the acceleration has a negative slope in the medial lateral and vertical components of the acceleration at the end of swing phase, and a positive slope after initial contact. State 1 occurs at the first instance of an increase in the acceleration in the medial lateral component of the acceleration after middle stance phase. The major differences between walking and running are shown in the durations of state 7. In the walk to run transition, as the velocity of the treadmill was increased State 7 no longer appears, panel D, step S+2. In all panels the accelerations are shown as a variation of a dashed line. In each of the panels each step is labeled as the steps prior to the transition (S-2, S-1), the transition step (S-0) or the steps following the transition (S+1, S+2). Panels A and B show data from the WRT; panels C and D show data from the RWT. 35 There were no states unique to either walking or running. However, state 7 was unique to slower locomotion. It represents the portion of stance phase in which there is approximately zero acceleration in all three directions (Figure 3.2). State 7 occurred in 28.9% of running gait cycles, with a higher frequency in the RWT transition than the WRT. State 9, which represents an approximate mid-swing phase only occurred in 26.8% of gait cycles and could be considered a misclassification of state 6. Classification of impending gait events were represented by states 1 and 4. When state 1 occurred the foot was moving into the push-off phase of stance, for walking and running. State 1 occurred 0.14 ± 0.03 seconds before TO (Table 3.2). State 4, representing terminal swing into the initial braking phase of stance, consistently occurred 0.13 ± 0.02 seconds before IC. Both states occurred in >99% of gait cycles. The average duration of each state output from the BP-AR-HMM can be indicative of which locomotion mode the subject is using. For walking, the state with the longest average duration during foot contact was state 7 which occurred in 99.10% of walking foot contacts with an average duration of 0.29 ± 0.12 seconds. The average duration of state 7 for running gaits was 0.06 ± 0.02 seconds, occurring in 28.94% of running steps (Figure 3.2). State 3 had the second largest difference in duration between the two locomotion modes, at 0.09 seconds. The average duration of state 3 in running gait cycles was 0.14 ± 0.04 s, and was 0.05 ± 0.01 s in walking gait. State 3 occurred in 97.90% of walking and running gait cycles. Figure 3.3 shows the relationship between these two state durations during each of the transitions. 36 In Figure 3.2, foot contacts are shown in panels A and C, with walking and running foot contacts differentiated by their height. In panels B and D, the state output from the BP-AR-HMM is shown. In both panels state 1 and 4 are identified first, indicative of when either an initial contact or toe off event will occur. In panel B the decrease in the duration of state 7 provides evidence of a gait transition from a run to a walk. In panel D the increased duration of state 7 and the decreased duration of state 3 is indicative of a walk to run transition. Figure 3.3: Changes in the duration of state 7 and state 3 for all subjects’ transitions between walking and running (RWT and WRT). The steps before the transition step are denoted by S-3, S-2, S-1, the transition step is S-0 and the steps of the new locomotion type are S+1, S+2, S+3. The duration of each of the states were averaged across subjects, for each step. 37 Table 3.1: Gait Cycle Accuracy (% of Occurrences) and Duration of States Output by the BP-AR-HMM Across Subjects Gait Cycle Gait Phase Average State Duration (s); Mean ± SD Occurrences (%) All Walking Running All Walking Running State Walking Running N1 = 797 N2 = 334 N3 = 463 N1 = 797 N2 = 334 N3 = 463 1 Push Off Push off 98.53 97.30 99.35 0.09 ± 0.04 0.07 ± 0.05 0.098 ± 0.038 2 Early swing Early Swing 100.00 100.00 100.00 0.10 ± 0.03 0.10 ± 0.03 0.10 ± 0.03 3 Early Swing Early Swing 99.14 97.90 100.00 0.11 ± 0.05 0.05 ± 0.01 0.14 ± 0.04 Terminal Terminal 4 Swing /Early 99.88 99.70 100.00 0.20 ± 0.04 0.20 ± 0.04 0.20 ± 0.04 Swing Stance 5 Mid-Swing Mid-Swing 100.00 100.00 100.00 0.06 ± 0.02 0.05 ± 0.01 0.08 ±0.02 6 Mid Swing Mid Swing 92.17 97.01 88.34 0.08 ± 0.04 0.10 ± 0.04 0.06 ±0.02 7 Stance Early Stance 59.07 99.10 28.94 0.22 ± 0.12 0.29 ± 0.08 0.06 ±0.02 Early /mid 8 Early Stance 94.24 94.61 93.95 0.08 ± 0.03 0.10 ± 0.03 0.07 ± 0.03 Stance 9 Mid Swing Mid Swing 26.83 34.13 21.60 0.07 ± 0.04 0.06 ± 0.03 0.07 ± 0.05 Table 3.2: Average Time Prior to Gait Events when Either State 4 or 1 Occurs. Velocity INITIAL (m s-1 Toe Off (S) ) CONTACT (S) 1.4 0.11 ± 0.01 0.12 ± 0.03 1.6 0.11 ± 0.01 0.13 ± 0.05 2.6 0.11 ± 0.02 0.14 ± 0.01 3.0 0.12 ± 0.02 0.13 ± 0.02 WRT 0.10 ± 0.02 0.16 ± 0.04 RWT 0.21 ± 0.02 0.14 ± 0.03 Average 0.13 ± 0.02 0.14 ± 0.03 38 Discussion In this study, we have demonstrated that a data driven, semi-unsupervised machine learning algorithm can consistently, with minimal inter-participant variability, identify gait events from minimal sensor data for steady state and dynamic locomotion. The parsing of gait with single sensor data sampled at 20 Hz represents an important step in the development of transition-predictive control algorithms for assistive devices. The minimally sampled data from a single sensor provides an approach to reduce the computational load in control systems for actuated prosthetic limbs. The classification accuracy in both steady state and dynamic locomotion trials by the output of the BP-AR- HMM is comparable to other gait classification studies. Identification of gait events with threshold algorithms has been presented thoroughly [7], [8], [93]. The timing of the identification of gait events is critical in the overall structure of the control system for decision making. Other groups have reported minimal lead time or detection of the event after it has occurred using accelerometers at various anatomical locations [11], [12], [66], [94]. Mannini and colleagues used a HMM for gait event detection, with 100% accuracy and identification of the gait event approximately 40 ms prior to the gait event [17]. However, their work uses states that were identified by the user, while in our study states were derived in an unsupervised fashion with the Beta Process. Groups that have used finite state machines, which require user defined states, have shown high gait event identification accuracy for able-bodied subjects [95], [96]. The limitations of these studies are large sensor arrays and high sample frequency. Another limitation is the limited velocities at which the subjects were asked to move. In the present work we have shown that IC and TO can be identified across various locomotion velocities and two 39 different gaits with adequate lead time for a control system to adjust orientation or stiffness parameters of an assistive device in preparation for the next step [14]. The secondary purpose of this study was to determine if the output of the BP-AR- HMM could be used for gait classification and whether a probabilistic model for the identification of locomotion transitions would be feasible. Classification approaches for multiple gaits by a control algorithm have been reported to be accurate to the range of 95 – 99% [11], [27], [74], [97], [98]. In Table 3.1, state 7 is indicative of walking stance phases and also present in 29% of running steps (slow running foot contacts and primarily identified during the WRT and RWT trials). State 7 captures the time of approximately zero acceleration when the foot is in mid stance. As gait velocity increases, the duration of state 7 will decrease until it altogether disappears. A rule-based classification utilizing the duration of state 7 could provide input about what gait type the subject is using. State 3 shows a similar pattern but the differences in the durations are not as stark. Results presented in Figure 3.3 indicate transition occurrence as the patterns in the accelerations of the foot change. From these data, it appears the BP-AR-HMM output can provide relevant information about the current locomotion state and when a transition may occur. The information about the previous stance phases could then be used to estimate what type of gait will be used next, either a walking or running gait, as shown in Figure 3.3. In short, by assessing changes in the presence and duration of state output over the course of multiple stance phases, a probabilistic model of gait transitions could be generated. 40 When there are multiple types of locomotion, recent work has made use of heuristic algorithms which utilize user defined patterns in linear acceleration and angular velocity signals [14], [18], [27], [99]. Other algorithms such as linear discriminant analysis, support vector machines and artificial neural networks have taken the user defined states and generated optimized rules for the classification of gait, with varying degrees of success [22], [70], [74]. This approach is similar to the process of heuristic models and rules developed in other machine learning methodologies. Unlike the aforementioned methodologies, the BP-AR-HMM optimizes the feature set using split- merge, merge or remove features to ensure sparsity in the feature space [30], [89]. Finally, algorithms that use time history for the classification of gait are typically based on user defined locomotion types. Previous work has used a Dynamic Bayesian network (DBN) [13]. The DBN requires expert user input into the algorithm for classification of gait parameters and locomotion transitions. A characteristic which is advantageous for consistency in the output of the algorithm is use of the latent temporal structure of cyclical gait data. Both the DBN and the BP-AR-HMM use these patterns to provide information about the gait cycle. This is valuable in control systems because the additional information could be used to provide more accurate predictions of locomotion transitions. The use of an instrumented force treadmill in this study limits the applicability of these results to real world control systems, because of the constraints of steady state velocity and the use of constant accelerations during transitions on the treadmill. Additionally, the calculated acceleration data provides less information than a full IMU, which would include additional data from rate gyros and magnetometers. Data were not 41 filtered between differentiation steps which may have magnified small errors in the data. These data are not directly comparable to acceleration data from an IMU as the orientation of the local coordinate systems will differ. There will also be a difference via acceleration due to gravity, which is not included in this data set. However, the given data are sufficient to test this algorithm on the patterned nature of gait. Utilization of the rate gyro to measure angular velocity of the foot may lead to more accurate classification of gait events and more robust predictive models for locomotion transitions. The use of offline analysis and able-bodied subjects shows that the output from this algorithm has the potential to be implemented in an assistive device. In the present study, motion data from the foot dorsum were chosen to represent a potential location for an IMU embedded in a prosthetic limb. Additionally, the unilateral methods used in this study were intended to represent how measures could be made within a prosthetic limb. Conclusion In the present study, limited sensor data from a single idealized accelerometer was used for the identification of gait events prior to their occurrence. Use of this data set has approached the minimal information required from a sensor to determine when gait events will occur. The utilization of minimally sampled data could drastically reduce computational load while maintaining similar gait event identification accuracy. Further, we have built a plausible framework for the classification of locomotion state and transitions between walking and running gaits. In future work we plan to test and validate this algorithm on a variety of environmentally constrained locomotion transitions. This 42 will be followed by development of probabilistic models for predicting locomotion transitions in various ecological settings. 43 Bridge The goal of this chapter was to investigate the effectiveness of a supervised machine learning algorithm for the identification of gait events during steady state walking, running and the transitions between the two locomotion modes. We utilized the BP-AR-HMM which derives features from the input data and shared them across that data. We have shown that we can identify features prior to IC and TO from the BP-AR- HMM. Furthermore, we have shown that the temporal duration of the features changes during the walk-to-run locomotion transitions. Data in this chapter support further investigation of the BP-AR-HMM, and its ability to build a set of features for the identification of gait events outside the laboratory from minimally sampled data. Chapter IV will utilize the same algorithm to identify gait events prior to their measured occurrence for level ground walking, ramp and stair ascent/descent and spontaneous transitions between walking and running. 44 CHAPTER IV FEATURE IDENTIFICATION WITH A HEURISTIC ALGORITHM AND AN UNSUPERVISED MACHINE LEARNING ALGORITHM FOR PRIOR KNOWLEDGE OF GAIT EVENTS Published in IEEE Transactions on Neural Systems and Rehabilitation, Donahue, S.R; Hahn, M. E; Feature Identification with a Heuristic Algorithm and an Unsupervised Machine Learning Algorithm for Prior Knowledge of Gait Events. 2022, 30, 108-114. Experimental work was performed entirely by me, and the writing is entirely mine. M.E. Hahn provided mentorship and aided in study design, general oversight, and editing and finalizing the final manuscript. Introduction Human locomotion occurs across a range of velocities and environments that are not easily replicated in the laboratory. Understanding how able-bodied humans move naturally through a real-world environment can inform interventions and device design for individuals with orthopedic or neurologic conditions limiting full ambulatory function. Proper identification of gait events is crucial for gait analysis in the clinical setting, for remote monitoring of patient activity levels and the control of prosthetic devices [1], [2], [3]. The quantification and data driven extraction of features shared across human locomotion modes in a real-world environment can provide researchers with succinct representations of these data. Identification of key gait events, initial contact (IC) and toe off (TO), form the basis of our understanding of different locomotion 45 modes, commonly used in the analysis of gait [4], [5], [6]. Recently, studies have utilized mobile sensor arrays for the quantification of human locomotion through varied environments [54]. These mobile sensor arrays have allowed for the development of rules-based algorithms for the identification of gait events. Developments in algorithmic methods, particularly machine learning, have improved the potential of mobile sensor technology for analysis of gait in real-world environments [102]–[104]. Of particular interest, unsupervised machine learning algorithms for the analysis of multiple time series are being developed and tested for gait event detection [105], [106]. While these algorithms have been shown to be accurate for gait event identification in the laboratory, they have not been tested on data collected from a real-world environment, nor have they been compared to simple heuristic identifiers with minimal sensor data from the same data set. The gold standards for identification of gait events in a real-world environment have been force and pressure based insole systems [93], [107]. These systems can be expensive, are prone to failure, and are often not feasible for use in assistive devices or long-term use. Additional options for mobile sensing are inertial sensors, such as accelerometers and gyroscopes. These sensors have been utilized extensively for the identification of gait events and locomotion mode, with varying success [1], [10], [11], [12], [13], [14], [15], [16]. One of the major limitations of accelerometers regards excess vibrations experienced during initial foot contact which may limit consistency of accelerometry data when compared to gyroscopic data. Additionally, the orientation of the gyroscope is unaffected by gravity [93], therefore gyroscopes may provide more consistent performance than other inertial quantities for the identification of locomotion 46 and gait events. Previous methodologies have been primarily focused on level ground walking at the subject’s preferred velocity, or have examined multiple locomotion modes, including ramps, stairs and running, but not all modalities in the same study [1], [12], [14], [17], [18], [19], [20]. There is currently no consensus about the placement of sensors for gait event detection, though it is common to use an array of sensors on the lower limb, including the foot, shank and sacrum [1], [3], [10], [14], [15], [16], [17], [20], [21], [22]. Recent work has employed a single sensor mounted on the shank or foot for gait event identification using a set of heuristic rules derived by the expert user, [66], [99], [110], [111], [113], a combination of user defined phases and supervised machine learning [17], [22], [25], [26], [27], [28] or frequency analysis [14], [109] to detect common gait events. Heuristic algorithms are advantageous because of their computational simplicity. However, they also require an intimate knowledge of the waveforms, for a variety of locomotion modes. In contrast, unsupervised machine learning algorithms have the potential to input the data with minimal modification by the researcher. Indeed, there have been rare attempts to use unsupervised machine learning algorithms for identification of gait events [105], [106], which have performed at the same accuracy as the supervised and heuristic algorithms. These methods have been focused on determining precisely when a gait event is going to occur. In this study we look to identify features prior to the occurrence of the gait event. A key limitation of most gait event detection and locomotion classification algorithms is the reliance on data collected in a controlled environment, such as a laboratory or clinic. Current algorithms also typically require a large volume of data (large sensor arrays with high sampling frequencies) to determine and differentiate 47 patterns in locomotion. High density data from multiple inputs, requires computationally intensive methods and feature extraction for the identification of gait events. In real world applications, the identification of gait events with minimal sensor data is ideal as it will reduce the computational load and preserve the battery life of a device. In the modern renaissance of machine learning, novel algorithms have been introduced. They have been built specifically for analysis of multiple related time series data (e.g. IMU data collected from the same anatomical location on multiple subjects undergoing the same protocol). One example is switching linear dynamical systems, developed previously for economic analysis and the description of dynamical processes [29]. This class of algorithms allows for conditions where not all possible features are known a priori but can be added when a new feature is discovered. From this set of novel algorithms, we propose use of the Beta Process Auto Regressive Hidden Markov Model (BP-AR-HMM) for the identification of gait events using a single stream of angular velocity data from a sensor on the foot sampled at 25 Hz. Minimizing the data input into the algorithm will reduce the number of required sensors and the computational complexity needed to identify gait events. This algorithm has been used previously for the identification of unique features in time series from similar data sources (e.g. position data from motion of different subjects) [30]. The approach is used to discover dynamics shared between time series and builds a global set of features that describe all possible behaviors in the data set. The result is a feature identification algorithm that is not constrained by user-determined states and is completely unsupervised. It is currently unknown how well these methods are suited for locomotion data collected in a real-world environment with minimal input data. There has yet to be a 48 direct comparison between a heuristic, peak finding algorithm and an unsupervised learning algorithm on data collected in a real-world environment. It is critical to note that the approach used in this work seeks to identify features that indicate an impending gait event, as the knowledge prior to the gait event is useful when designing a controller for application in an assistive device [1]. The purpose of this study was to implement and compare two different algorithms, a heuristic algorithm and an unsupervised machine learning algorithm for the detection of gait events prior to their actual occurrence using minimally sampled angular velocity data collected in a real-world environment. Methods This study was approved by the University of Oregon Institutional Review Board (protocol #: 08202018.018). All participants provided written consent prior to involvement in the study. Data were collected from sixteen able bodied participants (8 male and 8 female; 31.1 ± 14.8 years, 173.4 ± 8.5 cm, 66.7 ± 7.7 kg). All analyses were performed offline in custom Matlab programs (Mathworks, Natick, MA). Each subject’s dominant foot was equipped with an IMU (Vicon, Centennial, CO) mounted on the dorsal aspect of the foot (Figure 2.1, Panel A). Foot ground contact information was recorded with two different systems. The first three participants were equipped with the Noraxon footswitch Telemyo DTS system (Noraxon, Scottsdale, AZ), while the other 13 participants were equipped with Loadsol inserts (Novel Electronics, St. Paul, MN). The first day of the study consisted of multiple locomotion mode transitions with steady state locomotion between modes (Figure 2.1, Panel C). On the second day of the study, transitions between walking and running were examined with a protocol based upon the 49 work of Long and Srinivasan [117] the range of target average velocities ranged from 1.3 m s-1 to 3.0, m s-1 (1.30, 1.51, 1.73, 1.94, 2.15, 2.36, 2.58, 2.79, 3.00 m s-1) encompassing velocities typical in human locomotion. Participants were asked to cover a distance of 90 m, during which they received verbal feedback at the half-way time point and when five seconds remained in the trial (Figure 4.1) [117]. The condition order of these trials was randomized to minimize any learning effects (Figure 4.1, Panel C). The IMU data were first down sampled to a common frequency of 100 Hz, then synchronized with the insole force sensors. Ground reaction force data were converted into an indicator variable, indicating either no discernable movement pattern, or consistent pattern such as stance or swing phase. The data were further labeled with each different locomotion mode. Input into the rules-based algorithm was the single axis angular velocity data about the x-axis of the IMU, approximately sagittal plane of the participant, sampled at 100 Hz. The data were then filtered with a fourth order zero lag Butterworth filter (fc = 6 Hz). Input into the BP-AR-HMM consisted of the single axis angular velocity data, further down sampled to 25 Hz (Figure 4.1, Panel B). There were four rules utilized for the identification of gait events from the raw waveform. There are two prominent peaks that were identified as consistent features for the identification of gait events. For the identification of an impending IC a minimum must be < -100 rad s-1. The feature utilized to identify impending TO was a peak > 150 rad s-1. The two other rules to minimize misclassification of peaks were 1) temporal differences between minima or maxima must be ≥ 0.1 s; 2) if the difference between IC and TO < 0.1s, ignore the next peak. This heuristic set results in a peak finding algorithm, with minimal constraints, and is computationally simple compared to the BP-AR-HMM. 50 Figure 4.1: Data Collection Setup and Protocol Panel A: The experimental set up with the IMU outlined on the dorsum of the participant’s dominant foot and the insole for foot contact identification. Panel B: Flow of the data from collection to identification of patterns for analysis. Panel C: The real-world environment course, unlabeled sections indicate level ground locomotion. Day 1 contained two environmentally constrained transitions and walk to run and run to walk transitions (course in red). Day 2 of the trial consisted of only walk to run and run to walk transitions (course in blue). 51 The rules for gait event identification were the first instance of a state from a subset of states output from the BP-AR-HMM. The identification of IC was the first occurrence of a state that was not 5, 6 or 13. The rules for the identification of TO were the first instance of either state 2, 6 or 16. The BP-AR-HMM is a machine learning algorithm that provides a spatial- temporal statistical model and classifications for related time series data [30], [89]. Briefly, the Beta Process provides a sparse subject independent feature set, to be used by the Auto-Regressive Hidden Markov Model (AR-HMM). The key element of this algorithm, the beta process, gives the algorithm the ability to add and remove states. For example, if two states describe the same feature they will be combined into a single state. This development of dynamic features is crucial for use in semi-supervised or unsupervised machine learning. The HMM assumed that the set of distributions and transitions between dynamic features follow a Markov process. This process however was modified with the presence of the AR model, which does not assume temporal independence. The vector auto regression approach provided an estimation of current state, based upon the previous time steps, the available states and distributions based on the HMM. A sticky parameter was used in the BP-AR-HMM to ensure the capture of biomechanically relevant states by encouraging self-transitions [30]. Performance evaluation of the BP-AR-HMM included the critical timing of states and how consistently the features were used across all locomotion modes. Each stance and swing phase were identified via the force data from force-measuring insert. The locomotion mode of each step was identified, and gait events were identified using rules generated 52 from the states output from the BP-AR-HMM. Each gait event was identified as the first instance of force being higher than 20 N. Classical machine learning methodologies rely on three different groupings of the data, a training data set (~70%), validation set (~15%), and test set (~15%). In this paradigm, the machine learning algorithms are trained on the training set, and rules are developed for classification or prediction. During training the validation set is tested with the partially trained algorithm to ensure the algorithm is not being overfit to the training data. If the training accuracy of the algorithm is higher than the validation and testing accuracy the algorithm is considered to be overfit. After training, the test data are used as a measure of accuracy on a novel dataset (i.e. the test set). These test sets are typically a single novel subject, as in the Leave One Out Cross Validation (LOOCV) procedure. The BP-AR-HMM utilized a reversible jump Markov Chain Monte Carle method, which does not require the use of LOOCV, The details of this algorithm are beyond the scope of this paper, but can be found here: [118]. Results The output from for the identification of features that occur prior to identified gait events with heuristic rules resulted in a 94.87% accuracy of all gait events (TO and IC). Accuracy in this study was calculated as the number of features identified, divided by the number of measured gait events. Figure 4.2 demonstrates the consistency with which the state output indicate impending gait events. The average temporal difference between measured IC from the insoles and the peak identified prior to IC was 186.32 ± 86.70 ms, and the difference between measured TO and estimated TO was 63.96 ± 46.30 ms (Table 53 4.1). The only feature that occurred consistently after the measured gait event was IC for stair ascent. Figures 4.3, 4.4 and 4.5 (Panel A) show where the estimated IC and TO events are identified based upon the aforementioned rules. The accuracies presented in Tables 4.1 and 4.2 for IC and TO are shown out of all of the measured footfalls. The output from the BP-AR-HMM resulted in 16 states that described the entirety of the data input. Of these states, 12 were considered viable features for the description of Table 4.1: Heuristic Algorithm Accuracy and Lead Time Overall accuracy of the Heuristic algorithm and time differences between the features identified for the prior identification of gait events relative to observed gait events. All of the timings for the identified features are prior to the measured event. locomotion in a real-world environment. From these, the identification of IC and TO was approximately 99.6% across all locomotion modes (Table 4.2). Nine of the 12 states occurred in >65% of gait cycles (Table 4.3). Each state described a specific pattern in the angular velocity waveform, and transitions between the states were temporally dependent. For the current paper, analysis was constrained to states considered to be the most descriptive of gait phases (Table 4.3). States such as 16, 11 and 6 tended to occur during the periods described by IC and TO, therefore they are counted as occurring in both stance and swing phase of a given gait cycle. States 5 and 6 were considered to best describe swing phase and the absence of these states was considered to be the feature that indicated impending foot contact. States 2 and 16 were considered to be the best states 54 that described pre-TO and TO in the majority of locomotion modes. These states are presented in Figures 4.3-4.5, showing representative waveforms of all steady state locomotion modes. Table 4.2: BP-AR-HMM Algorithm Accuracy and Lead Time Accuracy of a BP-AR-HMM gait event identifier and time differences between the identified features relative to observed gait events. All of the timings for identified features are prior to the measured event. Table 4.3: Primary States Output From the BP-AR-HMM The percentage of the stance and swing phases in which each state occurs. We used simple rules based upon features derived by the BP-AR-HMM to identify gait events, prior to or just after their occurrence. Figure 4.3 shows which states were optimal for the identification of gait events prior to the event. Features across locomotion mode were found to have temporal consistency for the prior identification of gait events (Table 4.3). Across locomotion modes, the states occurring most consistently were states 2 and 16 (Tables 4.1 and 4.2). State 6 generally occurred throughout swing 55 phase, however it was occasionally used to detect impending TO, as needed (Table 4.3). Specifically, if states 2 or 16 were not observed then the first occurrence of state 6 was used to identify impending TO (Table 4.2). The states used to identify impending IC from the BP-AR-HMM was the first occurrence of a state that was not state 5, 6 for most locomotion modes (Figure 4.2). The notable difference was with running, where the rule for identifying IC was the first instance when the state was not 13. Figure 4.2: Gait Event identifiers from the BP-AR-HMM. These graphs present five time points before and after a gait event and the proportion of footfalls when a selected state output from the BP-AR-HMM occurs. The states that identify swing phase and for all locomotion modes other than running are states 5 and 6. The absence of state 5, 6 and 13 are indicative of IC. The identification of TO can be achieved with first occurrence states 2, 16. 56 Figure 4.3: Representative foot ground contacts for Level Ground Walking (A & C) and Running (B & D). Panels A and B show the angular velocity plotted in Black with the identified foot contacts in Blue. Panels C and D show output from the BP-AR-HMM in Green plotted with the angular velocity in Black. The states used for identification of gait events are labeled in panels C and D. 57 Figure 4.4. Representative foot ground contacts for Stair Ascent (A & C) and Descent (B & D). Panels A and B show the angular velocity plotted in Black with the identified foot contacts in Blue. Panels C and D show output from the BP-AR-HMM in Green plotted with the angular velocity in Black. The states used for identification of gait events are labeled in panels C and D. 58 Figure 4.5: Representative foot ground contacts Ramp Ascent (A & C) and Descent (B & D). Panels A and B show the angular velocity plotted in Black with the identified foot contacts in Blue. Panels C and D show output from the BP-AR-HMM in Green plotted with the angular velocity in Black. The states used for identification of gait events are labeled in panels C and D. 59 Discussion The purpose of this study was to implement and compare two algorithms, a heuristic algorithm and an unsupervised machine learning algorithm for identification of features and states for the detection of gait events prior to their actual occurrence using minimally sampled angular velocity data collected in a real-world environment. In this study we built two classifiers able to provide prior identification of gait events across multiple locomotion modes; including level ground walking, running, stairs and ramps from a single angular velocity waveform. The BP-AR-HMM was more accurate in its identification of gait events than the heuristic algorithm across all locomotion modes. From the output of the BP-AR-HMM, identification of impending TO was most consistently achieved by identifying the first occurrence of either state 2 or 16, and identification of impending IC was most consistently achieved by the identification of states 5 or 6. The BP-AR-HMM accuracy for estimation of gait events was >97% across locomotion modes, which was greater than the 94% overall accuracy of the heuristic algorithm. Utilizing a single data stream from a gyroscope in a real-world setting, we were able to achieve accuracy at levels similar to others who have used larger sensor arrays to collect data in a controlled environment [111], [119]. Although the heuristic algorithm was less accurate than the BP-AR-HMM the results remain similar to previous work with the addition of data from multiple locomotion modes collected in a real-world environment [66], [110]. The need to reposition the limb during swing phase is consistent across all locomotion modes and velocities. This allowed for the identification of TO with simple 60 rules, as the limb pushed off from the ground in rapid plantar flexion into swing phase, which had a similar pattern across locomotion modes. In contrast, during stance phase there were distinct differences in the angular velocity waveform across locomotion modes (Figures 4.3, 4.4 and 4.5). These differences led to differing state output from the BP-AR-HMM, indicating that this algorithm may be useful for classification and prediction of locomotion mode. Previous studies developed gait event detection algorithms for up to six different locomotion modes (walking, ramps, stairs, ascent and descent), with >95% accuracy [18], [66], [91], [108], [111]. To build real world applications for this type of gait event detection system, it is important to anticipate when a gait event is going to occur. One other study implemented an algorithm with capacity to provide predictions of gait events using an Adaptive Neuro-Fuzzy Bayesian algorithm from three IMUs on the lower limb of the subject, collected at 100 Hz over level ground and ramp walking [108]. That approach was able to predict gait events based upon the previous gait event but did not provide any lead time for the calculation of the event and it required up to 100 ms to make a decision if a gait event was going to occur [108]. Another study developed an HMM using data from a foot mounted gyroscope with a detection accuracy of 100% and an average delay of 43 ms for level ground walking [17], and another group developed an HMM using data from a gyroscope with mean detection delay of 60 ms [93]. We have improved upon these results by identifying features prior to the occurrence of the gait event prior. A further improvement is that the current study utilizes less data than the previous studies. 61 Other comparable work includes that of Figueiredo et al [111], which examined multiple locomotion modes with a heuristic model based upon data collected from a gyroscope on the foot of a healthy subject and then input into a finite state machine for the detection of gait events. Their approach was reported to detect IC across multiple locomotion modes at a rate of > 96.98%, and TO at a rate of > 95.89% [111]. We achieved slightly higher accuracy with > 97.26% using the BP-AR-HMM across gait events and locomotion modes. Another study used a heuristic approach with a single IMU on the shank, identifying gait events with the following time gaps: level ground IC with 16 ± 9 ms and TO at –16 ± 15.9 ms, ramp ascent IC at 18.8 ± 11.6 ms and TO at - 17.2 ± 21.3 ms [120]. In the present study, the features using a heuristic algorithm had a lead time of > 100 ms, with the exception of stair ascent, for IC and > 60 ms for TO (Table 4.1). The lead time for the BP-AR-HMM identification of gait events was > 40 ms for both IC and TO (Table 4.2). Previous studies utilizing combinations of machine learning and EMG have identified gait events with varying accuracy [102]–[104]. These approaches have shown the efficacy of machine learning and neural networks for the identification of gait events with EMG for level ground walking, with accuracy levels of 87.4% [104], 96.1% [102], and 94.9% [103]. These values are comparable to our level ground walking accuracy with the output from the BP-AR-HMM and the heuristic rules generated. It should be noted that these previous studies were focused on determining when a gait event had occurred. In contrast, in our approach the BP-AR-HMM and the heuristic model identified the key portions of gait prior to the actual event with data collected in a real-world environment. 62 Limitations of this study include but are not limited to post hoc, offline data analysis; therefore, we have not yet demonstrated the ability to detect gait events in real time. Although the BP-AR-HMM output has a higher accuracy than the heuristic algorithm, a simpler algorithm might be easier to include in a control algorithm as it is less computationally intensive, therefore reducing the temporal delay between the sensor data input and an identified gait event. Additionally, although the data were collected in a real-world environment, all subjects completed the same course, which is not representative of the natural variation experienced in daily locomotion. Conclusion In this study we have shown the efficacy of both a heuristic algorithm and an unsupervised machine learning algorithm for the identification of features prior to gait events based on a single stream of minimally sampled angular velocity data. This leads us to believe that in practice a near real time system can be implemented for subject independent identification of gait events across locomotion modes and velocities with either a heuristic, user defined rule set, or features extracted with an unsupervised machine learning algorithm. The key to the BP-AR-HMM used in this study is the Beta Process, allowing the model to generate and remove states and share them across all input time series, thus allowing the construction of a robust subject independent model. For the identification of gait events, this model was at least as accurate as previous approaches, while using less sensor data. The outcome of this work indicates that the model used may be useful in the classification and prediction of locomotion mode. Such an approach could be adapted for use in the control of assistive devices. 63 Bridge The goal of this chapter was to investigate the effectiveness of an unsupervised machine learning algorithm for the identification of gait events during dynamic locomotion in an ecological environment. We utilized the BP-AR-HMM as an unsupervised machine learning algorithm, to derive features and share them across the input data. We have also developed a heuristic algorithm for the identification of features prior to gait events which will be crucial for the measurement of gait events in a real- world environment. The methodology of this chapter is the basis for subsequent chapters. While the modality changes in the next chapters to running, this work sets the basis for it. 64 CHAPTER V VALIDATION OF RUNNNING GAIT EVENT DETECTION ALGORITHMS IN A SEMI-UNCONTROLLED ENVIRONMENT This work is currently in preparation for submission to Sensors. Dr. Michael E. Hahn, provided mentorship including assistance with study design, data interpretation, and editing and finalizing the final manuscript. Introduction The biomechanical analysis of running outside of the laboratory can be a useful tool for real-time or session by session feedback during training for coaches and athletes [121], [122]. The standard in human locomotion research for identification of gait events is the use of three-dimensional (3D) motion capture and ground reaction force data. These systems typically utilize multiple cameras and force plates in a controlled laboratory environment. Despite being the gold standard, these methods require expensive equipment, large indoor facilities and technical expertise [107], [123], [124], thus limiting their practical use in clinical or sporting environments. Previous studies have shown the efficacy of utilizing mobile sensing units for the detection of gait events inside and outside the laboratory [93], [125]. However, there has been a lack of validation of these algorithms for gait event estimation outside the laboratory. This work expands upon current validation techniques for the estimation of gait events and foot contact duration during running in a semi-uncontrolled environment. 65 Over the past decade there has been extensive work to understand Inertial Measurement Unit (IMU) data in the development of techniques to evaluate human locomotion and identify gait events for running in a controlled laboratory setting [20], [41], [44], [49], [126]–[129]. There are typically 9 sensors in a standard IMU: tri-axial accelerometers (linear acceleration), tri-axial rate gyroscopes (angular velocity) and tri- axial magnetometers (magnetic field). These sensors have been used for the measurement of biomechanical variables in several environments and over different durations in and out of the laboratory [122], [130], [131]. However, data from IMUs cannot be used for analysis of biomechanical variables without thorough pre-processing of the data and specific algorithms [38]. The bases of these algorithms rely explicitly on expertise of the researcher to understand segmental accelerations and angular velocities throughout the gait cycle, and then developed rules for the estimation of foot contact, via gait events: initial contact (IC) and toe off (TO). Some of these approaches have been developed specifically for running, with sensors on the foot, shank and the sacrum [126]–[128], [132], [133]. Application of these analyses have led to measurement of trail running, marathons, and training runs for athletes ranging in skill from recreational to competitive collegiate athletes [39], [44], [130], [134]. Prior work in this research space has shown that consistent features can be extracted from data in the laboratory and in the real-world, with modest efforts made to validate the estimated biomechanical outcomes in the real- world environment [135], [136]. Force plates have been used previously to validate identification of gait events [41], [126], as have force-instrumented treadmills [42], [48], [131]. However, the validations of these methods did not account for the possible range of velocities and skill levels in an outdoor semi-uncontrolled environment. 66 The purpose of this work was to validate identification of gait events using data from IMUs in a semi-uncontrolled environment. Second, this study sought to expand the range of velocities tested with these algorithms and expand the range of participant skill, from novice runners who run < 5 miles per week to runners who can run a 5k race in sub- 15-minute time. We expected the estimation of gait events and foot contact time to be more accurate for the foot mounted IMUs in comparison to the sacral mounted IMU across the range of velocities. Portable insoles developed for the measurement of in-shoe forces [10] were used as the standard for validation of the identification of IC and TO in a real-world environment. The algorithm developed in this study will be considered valid if the RMSE in overall contact time across the range of velocities is less than 0.04 seconds across the range of velocities (representing < 5-6% total contact time at jogging/running velocities). Methods Data were collected from 15 participants, (9 male, 6 female, age: 23.6 years, height: 178.3 cm, mass: 73.5 kg) as part of a larger study (Table 5.1). All analyses were performed using custom Matlab programs (Mathworks, Natick, MA). Multi-axis IMUs (Casio, Tokyo, Japan) were mounted on the dorsal aspect of the participants’ feet and approximately on the sacrum (clipped to the back of the participants waistband). These sensors recorded 3D linear accelerations and angular velocities at 200 Hz. Inertial data were post processed with a Kalman filter to orient the vertical axis of the local (IMU) coordinate system to gravity. Foot-shoe normal force data were recorded with Loadsol insole force sensors (Novel Electronics, St. Paul, MN) at 100 Hz. Standard GPS data 67 Table 5.1: Distribution of Average were measured with a Garmin Running Velocities Average Running Number of Forerunner (Kansas City, KS). Participants Velocity (m s-1) Participants 2.23 1 performed progressively faster 400 m running 2.33 1 2.43 1 trials (four to five, with the fastest velocity 2.55 4 2.68 6 being optional), with participant selected rest 2.82 6 2.98 6 durations, on a square practice track, based on 3.15 7 3.35 9 self-reported race pace for a 5k. An example 3.57 9 3.83 9 of the paces run by a participant are shown in 4.12 8 4.47 8 Table 5.2. The total range of velocities run by 4.87 7 5.36 7 participants was 2.4 – 5.4 m s -1. The participants monitored their pace with a standard wrist-mounted Garmin GPS display. If a participant missed their pace by more than 2 seconds, they were asked to repeat that pace after suitable rest. These velocities represent typical training and race paces for the majority of recreational and high-level distance runners [44], [137]. Table 5.2: Exemplar Paces for Participant with a 7-Minute per Mile 5-km Race Pace EXAMPLE PACES Minutes per Mile Pace 1 8:30 Pace 2 8:00 Pace 3 7:30 Pace 4 (Race Pace) 7:00 Pace 5 (Optional) 6:30 Data Processing Foot-shoe normal force data measured from force sensing insoles were considered the standard reference for identification of measured gait events [10]. Inertial signals and 68 force data were time-synced using controlled unilateral ‘foot-stomps’ before and after each trial. The IMU data were then down-sampled to match the force sensing insole sampling frequency (100 Hz) and filtered using a 4th order low-pass zero-lag Butterworth filter (fc = 35 Hz) (Figure 5.1). This filter was chosen as it was more conservative in the reduction of noise for the accurate identification of gait events, using peak accelerations [44]. Force data < 50 N were set to zero. Figure 5.1: Flow chart of the data post processing steps and the algorithmic methods for the identification of gait events from the force sensing insole data, and data from multi- axial IMUs at both anatomical locations. 69 Algorithmic Descriptions Algorithms used in this study are briefly described here. A more thorough treatment of the algorithms can be found in Figure 5.1. Identification of gait events with foot-shoe normal force data utilized a threshold of 50 N. Gait event estimation from the IMU data utilized distinct spatial and temporal rules. Identification of gait events from the dorsal mounted IMUs estimated initial contact by identifying peaks in the resultant acceleration. The spatial rule for ICfoot was a minimum resultant acceleration of 50 m s -2. The temporal rule for determining ICfoot was a minimum duration of 500 ms between estimated consecutive ICfoot [40]. Identification of TOfoot was performed by searching a specific temporal window beginning 100 ms after estimated ICfoot, ending at the half- width of the estimated stride time. In this window TOfoot was either identified as the local maxima of vertical acceleration or the first instance when the vertical acceleration was greater than three times gravity [40], [128]. Gait event detection using the sacral IMU utilized the anterior posterior accelerations. The spatial rule for the identification of ICsacrum was local minima with a maximum value of 5 m s -2 in the posterior direction. The temporal rule for ICsacrum was a minimum temporal difference of 200 ms between the identified ICsacrum [40]. The identification of TOsacrum with a search window was either with the maximum acceleration in the anterior direction, or the maximum positive slope of the acceleration in the anterior direction [41]. Exemplar output of these algorithms are presented in Figures 5.2 and 5.3. Results The gait events estimated using data from foot mounted IMUs were more accurate than the those estimated using the sacral mounted IMU when compared to the force 70 measurement standard identification (Figure 5.4, Table 5.3). Foot mounted IMUs had a larger RMSE than the sacral mounted IMU for the identification of gait events in the slowest running velocity conditions. Across the range of velocities, the algorithms identified ICfoot and ICsacrum with similar accuracy (Figure 5.4, Table 5.3). The identification of TOsacrum was more accurate at slower velocities than TOfoot. However, at running velocities > 3.57 m s-1 the algorithm identified TOsacrum after the force measured TO and with larger RMSEs than TOfoot (Figure 5.4). Table 5.3: Root Mean Square Error for the Identification of Gait Events and Estimation of Foo t Contact Foot Mounted Sacral Mounted Velocity Initial Toe Contact Initial Toe Contact (m s-1) Contact (s) Off (s) Time (s) Contact (s) Off (s) Time (s) 2.24 0.074 0.092 0.025 0.045 0.059 0.043 2.33 0.026 0.060 0.045 0.038 0.021 0.030 2.44 0.024 0.044 0.026 0.035 0.027 0.026 2.55 0.033 0.064 0.054 0.039 0.036 0.024 2.68 0.024 0.040 0.038 0.028 0.028 0.490 2.82 0.024 0.034 0.029 0.030 0.032 0.030 2.98 0.018 0.039 0.035 0.024 0.032 0.027 3.16 0.024 0.036 0.032 0.029 0.042 0.029 3.35 0.024 0.035 0.025 0.039 0.051 0.036 3.58 0.018 0.023 0.022 0.019 0.042 0.037 3.83 0.020 0.031 0.028 0.024 0.045 0.036 4.13 0.019 0.024 0.023 0.020 0.047 0.039 4.47 0.021 0.026 0.021 0.020 0.044 0.036 4.88 0.018 0.026 0.022 0.020 0.035 0.030 5.36 0.020 0.038 0.035 0.020 0.033 0.026 In these data, 0.01s is equivalent to a single sample difference between the IMU estimation of gait events and contact time and the values derived from the force data. 71 Estimation of foot contact duration with data from the foot mounted IMU showed an overestimation of foot contact duration at slower running velocities (< 2.55 m s-1). Estimated foot contact duration from the dorsal mounted IMUs generally had a smaller RMSE across the range of velocities than the sacral IMU estimated contact duration (Table 5.3 and Figure 5.5). The algorithm for the sacral mounted IMU data consistently underestimated the duration of foot contact at slower velocities, < 3.5 m s-1, and overestimated the duration of foot contact at faster velocities > 3.5 m s-1 (Figure 5.5). Figure 5.2: Acceleration waveforms from the IMUs mounted on the right foot (orange and blue waveforms), with superimposed foot contacts (dashed black square waves) identified from the force measuring insoles. The estimated ICfoot are shown in the filled circles and estimated TOfoot is shown in the filled squares. The search windows used for the identification of TOfoot are shown in the solid black rectangles. Panel A shows data from a 2.24 m s-1 run and panel B shows data from a 5.36 m s-1 run. Analysis of foot contact by trial mean were examined using Bland Altman plots. These plots present a comparison between the average difference between IMU estimated foot contacts and force measured foot contacts (Figure 5.6). The offset of the foot IMU 72 estimate was 0.004s with [-0.005 0.013] 95% Limits of Agreement (LoA) and the sacral IMU estimate offset was 0.001s with [-0.018 0.021] 95% LoA. These results show more variability at the slower velocities, and longer foot contacts, for both the sacral and foot mounted IMUs (Figure 5.6). We used a linear model to examine the relationship between the IMU estimated foot contacts and the force measured foot contact as well (Figure 5.7). Regression analysis of the sacral estimation of foot contact resulted in an r2 value of 0.73, a moderate correlation, and a slope of 0.60, indicating an underestimation of the foot contact duration . Regression analysis from the foot IMU estimated contact duration resulted in an r2 value of 0.91, a strong correlation, and a slope of 1.15, indicating a slight overestimation of the foot contact duration. Figure 5.3: Acceleration waveforms from IMUs mounted on the sacrum (blue waveforms), with superimposed foot contacts (dashed black square waves) identified from the force measuring insoles. The estimated ICsacrum are shown in the filled circles and the estimated TOsacrum are shown in the filled squares. The search windows used for the identification of TOsacrum are shown in the solid black rectangles. Panel A shows data from a 2.24 m s-1 run and panel B shows data from a 5.36 m s-1 run. 73 Figure 5.4: Time differences in the identification of gait events between measured forces and estimated IMU gait events. Negative time differences indicate that the IMU estimated gait event occurred prior to the measured gait event. The identification of TOsacrum had larger error rates due to the temporal windowing of the data, and the wider range of velocities used than previous work. 74 Figure 5.5: Differences in force measured and IMU estimated contact time across the range of average velocity trials. The sample sizes for each velocity are shown in Table 5.1. Discussion The purpose of this work was to validate the identification of gait events using data from IMUs in a semi-uncontrolled environment. We collected inertial data from two IMUs attached bilaterally on each foot and one approximately on the sacrum, from participants of varying running skill levels. We developed and implemented two algorithms for the identification of gait events from an IMU, based upon previous work [40], [128], [133]. The output of these algorithms, IMU estimated gait events were validated against the standard of a force sensing insole. The main findings of the work are summarized briefly here: 1) Estimated and measured contact times generally decreased across the range of running velocities; 2) Identification of gait events from the foot mounted IMUs was more accurate than identification of gait events from the sacral 75 mounted IMU; 3) Foot contact was identified with an average RMSE of < 0.04 s across the range of average running velocities for the foot and sacral mounted IMUs. Figure 5.6: Bland-Altman plot displaying the average differences between the estimated contact durations from the IMUs and the measured gait events from the force sensing insoles from the Sacral IMU in Panel A and the Foot Mounted IMU in Panel B. Each dot represents an average velocity trial by a participant. Differences greater than 0 are an overestimation of contact time by the IMU. Differences less than 0 are an underestimation of foot contact by the IMU. 76 Figure 5.7: Time differences between the IMU estimated foot contact and force measured foot contact. Sacral mounted IMU data generally underestimate foot contact, with a slope of 0.60. Foot mounted IMU data generally overestimate foot contact duration at faster velocities slope of 1.15. The foot mounted IMU was more accurate across the range of velocities for the estimation of foot contact in comparison to the sacral mounted IMU. It is necessary to address the accuracy of the force sensing insoles and their measurement of contact time with respect to velocity, as this measure was considered our standard. Foot contact time measured from the insoles followed a similar pattern to 77 [137]; a decrease in contact time with an increase in velocity [138]. Foot contact durations as measured by the force insoles were consistently longer in duration than those reported in [137]. A contributing factor in this was the participants recruited for these studies. Our study recruited participants from a range of running skills, from the truly novice, to endurance athletes able to run at 15 min 5 km race pace. The samples tested in previous studies consisted of middle distance and sprint athletes. The time between ICfoot and the force sensing insole approached zero difference as velocity increased (Figure 5.5). We used an algorithmic method most similar to [9], which proposed the identification of peak resultant acceleration across the range of velocities, and this has become a common approach for the identification of ICfoot [41], [139]. The work of [41] utilized a force plate and single foot strikes for the identification of gait events, reporting differences in ICfoot ranging from -7.3 to 3.3 ms. In our study error in the estimate of ICfoot ranged from -63 to -5 ms (Figure 5.4). They additionally reported TO differences ranging from – 53 ms to -32 ms, compared to our range of TOfoot differences from -78 to 2 ms (Figure 5.4) [41]. We have improved on the identification of TOfoot, by incorporating their rule for the identification of TOfoot as a secondary rule to the identification of peak vertical acceleration, in a temporal search window [40], [41]. Our work had more variability in the range of TOfoot identification, however, only at the three slowest running velocities, as there was only one participant who ran at these average running velocities. Foot mounted IMUs overestimated contact time at running velocities < 2.52 m s-1 (Figures 5.4 and 5.6). The slope of the regression line for this comparison is 1.15, providing further evidence of overestimation of foot contact time overall (Figure 5.7). 78 Another study reported an offset of -0.047 s with 95% LoA of [-0.059 0.154] [40], compared with our findings of 0.004s with a 95% LoA [-0.005 0.013], showing an overall improvement. It should be noted that Benson et al. reported challenges in the identification of TOfoot for one of their participants, which may have contributed to the larger offset [40]. Our algorithm was developed on a wider range of velocities, participant running abilities and on data collected in a semi-uncontrolled environment. The set of algorithms we have developed captured the gait events more effectively than previous work. The overestimation of foot contact at the slower running velocities would likely be remedied by the inclusion of a greater number of less experienced runners. Estimations of contact time from the sacral mounted IMUs between the velocities of 2.52 and 3.16 m s-1 matched the measured contact times (Figure 5.5). However, the RMSE TOsacrum at velocities >3.57 m s -1 is greater than 0.04 (Table 5.3). This stems from difficulty in estimating the temporal window in which to identify TOscrum. Identification of the temporal duration of the window in which to estimate TOsacrum is related to aerial time, which in this study ranged from 40 to 100 ms. Differing window lengths were tested to accurately identify TOsacrum. The most accurate of these resulted from window termination 20 ms before the next ICsacrum (Figure 5.3). A dynamic temporal window for the estimation of TOsacrum could be a way to further improve estimation accuracy of TOsacrum and foot contact. The study by [48] reported an average underestimation of foot contact from sacral mounted IMUs from -0.017 to -0.001s, while the current findings show the average contact time differences from -0.011 to 0.027s. Specifically, at average running velocities < 2.56 m s-1 the sacral IMU underestimated foot contact time, and at average running velocities >3.16 m s-1 contact time was overestimated. Another study, 79 [40], reported a foot contact offset from a sacral mounted IMU of 0.029s with a 95% LoA [-0.069 0.010]. Our analysis had a smaller offset of 0.001s with 95% LoA of [-0.018 0.021]. There were multiple limitations in this study. First, the temporal synchronization between the IMUs and the force sensing insoles presented an initial challenge. While participants were running, the IMU clock and the force sensing insole clock could be off by a 0.01s. These errors resulted in zeros being added or removed during swing the phase from the force sensing insole data. These errors were partially accounted for by foot- stomp synchronization between each trial. We did not want to artificially decrease the RMSE in the results of this work by removing data due to imperfect synchronization. This in turn led to cumulative errors between the measured and estimated gait events. These errors would be more concerning if there were large differences in the estimation of contact time across the range of velocities. As it is, we feel that it is important to include the data in full, to fully represent the performance of the techniques used. A single participant’s data heavily influenced the error in the model. This participant ran less than 5 miles per week and was truly a novice runner. When the data from this participant were removed from the data set, ICfoot ranged from -20 ms to -1 ms, compared to the original -63 ms to 5 ms (a ~60% improvement) from velocities of 2.67 m s-1 to 5.4 m s-1 and TOfoot temporal differences ranged from -30 ms to -1 ms, compared to the original values of -78 ms to 2 ms (a ~75% improvement) across the range of velocities. Further, exclusion of this participant from the analysis reduces offset in the foot mounted IMU in the estimation of foot contact from an offset of 4 ms to an offset of 1 ms, while the 95% LoA remains the same. The slope of the linear regression also 80 decreased from 1.15 to 1.10 when this participant’s data are removed from analysis, indicating reduced overestimation of contact time in the model. We chose to include this participant as a representative example of the truly novice runner. We expect that if more runners from a wide range of running levels were included in the data set, we would see decreased variability in the identification of IC and TO from the minimal and maximal running velocities included in this work. In conclusion, our results demonstrate the validity of two different gait event detection algorithms for a range of running velocities and skill levels in a semi- uncontrolled environment. We used data from a wider range of participant skill levels, and a wider range of running velocities than previous studies. We have demonstrated the utility of these algorithms for identification of foot contacts in a semi-uncontrolled environment. The use of a gait event detection system in a real-world environment needs to be validated for a broader set of conditions before we can estimate other biomechanical variables from these devices. The next steps in this research are the estimation of ground reaction force waveforms from the force sensing insole data, and testing of these algorithms in a truly uncontrolled environment. 81 Bridge The goal of the previous chapter was to investigate the effectiveness of two rules- based algorithms for the identification of foot contacts in a semi-uncontrolled environment. Given the consistency of these results we were able to make the corrections to the force data necessary to time-align initial contacts from the IMU and force sensing insoles. In the next chapter, we move forward with the estimation of whole GRF waveforms across the range of velocities in a semi-uncontrolled environment. During these trials participants ran at approximately a set pace, with minor fluctuations in running velocity, such as at the beginning and the end of the trial. 82 CHAPTER VI VALIDATION OF RUNNNING GAIT EVENT DETECTION ALGORITHMS IN A SEMI-UNCONTROLLED ENVIRONMENT This work is currently in preparation for submission to Journal of Biomechanics. Dr. Michael E. Hahn, provided mentorship including assistance with study design, data interpretation, and editing and finalizing the final manuscript. Introduction Wearable sensors are being used extensively for the collection of human running biomechanical data outside of the laboratory [40], [123], [130], [140], [141]. The primary wearable sensors recently used in locomotion biomechanics have been multi-axial inertial measurement units (IMUs), which measure linear acceleration and angular velocity data. Previously, IMUs have been used to estimate gait events, contact time, and other spatial temporal variables [40], [41], [44], [48]. Additionally, specific kinetic variables have been estimated from IMUs, such as joint moments, peak vertical force, impulse and loading rate [43], [44], [47], [140]. Other wearable sensors utilized for biomechanical monitoring or clinical evaluation are insole force sensors. These sensors measure force between the foot and shoe, and have been validated as a measure of vertical ground reaction forces (GRF) on a treadmill [10]. Wearable sensors have the capability to also be incorporated into other sensor systems for the estimation of specific external loading variables and internal tissue loading [142], [143], and for overall feedback during training [122], [144]. 83 Previous studies estimating kinetic variables associated with external loading have used either statistical or physical models [44], [88], [145]. In recent years, machine learning techniques have been used instead of physical or statistical modelling, having become a popular set of tools for biomechanical analysis and estimation of kinematic and kinetic variables. Previous work using machine learning algorithms have estimated or predicted gait events from IMU data [18], [146]–[148], with IMUs located either bilaterally on the feet or on the sacrum. Machine learning algorithms, including artificial neural networks (ANNs), recurrent neural nets (RNNs), among other techniques, have also been used to estimate kinetic variables, such as vertical impulse, loading rate and peak GRFs [43], [45]–[47], [149]. A few drawbacks for these studies include the biomechanical expertise required for estimation of these variables, significant preprocessing of raw data, and identification of stance phase before data can be parsed into a usable form. Furthermore, these machine learning models are not yet ready for implementation outside of the laboratory, as they have been built using data from controlled laboratory settings and do not capture the variability of human movement that occurs out of the laboratory in response to surface differences and changes in velocity [150]. We propose the use of RNNs, specifically Long Short Term Memory networks (LSTMs) to map IMU data onto GRF waveforms measured with force sensing insoles. The LSTM approach was specifically developed for time series data, and mapping between two different waveforms [151]. The purpose of this study was to implement a machine learning algorithm for the mapping of inertial data to kinetic waveforms from participants running on a track across a range of velocities and participant skill levels. 84 Methods This study was approved by the University of Oregon Institutional Review Board (protocol #: 10062020.007). Data were collected from 15 participants (Table 6.1), (9 male, 6 female, age: 23.6 years, height: 178.3 cm, mass: 73.5 kg) as part of a larger, ongoing project. Participant 5-km race times Table 6.1: Distribution of Average Running Velocities ranged from 30:00 - 15:30 minutes (Tables 6.1 Average Number of Running Participants and Table 6.2). All analyses were performed in Velocity (m s-1) 2.23 1 custom Matlab programs (Mathworks, Natick, 2.33 1 2.43 1 MA). Multi-axial IMUs (Casio, Tokyo, Japan) 2.55 4 2.68 6 were mounted bilaterally on the dorsal aspect 2.82 6 2.98 6 of each participants feet and approximately on 3.15 7 3.35 9 the sacrum (clipped on the back of each 3.57 9 3.83 9 participant’s waistband). These sensors 4.12 8 4.47 8 recorded 3D linear accelerations and angular 4.87 7 5.36 7 velocities at 200 Hz. Data were post processed with a Kalman filter to orient the local (IMU) coordinate system vertical to gravity (Figure 6.1 Panel B). The use of multiple inertial sensors has been suggested to lead to improved estimation of spatial temporal and kinetic variables, compared to a single inertial sensor [44], [152], [153]. Foot-shoe normal force data were recorded with Loadsol insole force sensors (Novel Electronics, St. Paul, MN) at 100 Hz. 85 Participants performed four to five 400 m running trials on a square track, at paces based upon their self-reported 5 km race pace (a total of 4 or 5 paces, with the Table 6.2: Example Paces and Timing for 400 m run fastest pace being Example Paces Minutes per mile Pace 1 8:30 optional). An exemplar Pace 2 8:00 Pace 3 7:30 set of paces is shown in Pace 4 7:00 Pace 5 (optional) 6:30 Table 6.2. The total range of velocities run by participants in this study was 2.4 – 5.4 m s-1. Each participant monitored their pace with Garmin GPS, (Kansas City, KS). If they missed their time trial by more than 2 seconds, they would be asked to repeat the trial, after sufficient rest. These velocities represent typical training and race paces for the majority of recreational and high-level distance runners [44], [137]. Data Processing Foot-shoe normal force data were measured from force sensing insoles, considered the standard reference for identification of measured gait events and kinetic variables in this study [10]. The IMU data were down sampled to 100 Hz to match the force sensing insole sampling frequency and then filtered with a 4th order low-pass zero- lag Butterworth filter (fc = 35 Hz). Kinetic data were filtered with a 2nd order low-pass zero-lag Butterworth filter (fc = 20 Hz). The inertial signals and force data were time- synced using foot-stomps before and after each trial. Internal clock drift between the insoles and the IMUs was rectified by an iterative corrections algorithm. Force data <5% body weight (BW) were set to 0 BW. The estimated kinetic waveforms output from the machine learning algorithm were filtered with the same filter as the measured kinetic 86 waveforms. Estimated foot ground contacts less than 0.050 s were set to 0 BW, as foot contacts shorter than 0.050 s were considered noise, having small magnitudes and not consistent with measured foot contacts observed in running locomotion. Machine Learning Architecture The overall structure of the machine learning algorithm is shown in (Figure 6.1 Panel A). The mathematical details of LSTMs can be found elsewhere [51], [154]. The steps for development and testing of the machine learning models were two-fold; first the network architecture was optimized using the Bayesian Optimization for Deep Learning [155], and then the network was tested using Leave One Out Cross Validation (LOOCV). Bayesian Optimization for Deep Learning requires user specification of the hyperparameters, which are then optimized. The Bayesian Optimization was tested on the data set with 70% Training, 15% Validation and 15% Test segmentation of the data (Figure 6.1). The optimal network architecture was determined by performance on the Test data set. The hyperparameters optimized included the initial learning rate, gradient decay factor, squared gradient decay factor, L2Regularization and number of hidden units. The only hyperparameter that influenced the outcome of the algorithm was the number of hidden units. The range of the number of hidden units used in the optimization was [10 - 50]. Through the Bayesian Optimization process, the optimal number of hidden units was determined to be 42. This value was used for the LOOCV process. All other hyperparameters converged to the default Matlab input. Additional preliminary work showed that a 4 second temporal window was the most accurate for the estimation of ground reaction force waveforms. 87 Figure 6.1. Panel A, machine learning methodology and throughput, specifically the input data, machine learning protocol and output. Calculated output contact time and kinetic variables are shown here. Panel B instrumentation on the foot, with the Kalman corrected coordinate system on the IMU. Panel C measured and estimated ground reaction forces from a participant with the smallest RMSE for a participant, Panel D shows the largest RMSE measured and estimated ground reaction forces from the same participant. Data Analysis Estimated waveforms from the LOOCV were analyzed and are presented in this work. Initial analysis involved identification of the optimal temporal window. This was done via inspection of the estimated kinetic variables in Bland-Altman plots. Initial Contact (IC) was identified by the first instance of force >5% BW, and toe off (TO) was determined by the last instance of force greater than >5% BW. Contact time was determined by taking the temporal difference between these two discrete events. Stance average GRFs, impulse, peak GRFs, and average loading rate were the kinetic variables 88 calculated in this work, from the estimated force waveforms. Average loading rate was calculated by identifying the impact peak and then averaging the slope in the middle 60% of the region between IC and the impact peak [58]. Pearson correlation coefficients (r2) were used to compare the estimated force data output from the LSTM to the measured insole force data. Seventy-five data points were used, with fifteen participants running five velocities, and each data point representing a 400 m time trial on a square track. Differences between the model estimated variable and measured waveform variable are presented in both linear regression and Bland-Altman plots with 95% confidence intervals (CIs) and Limits of Agreement (LoA), respectively. A strong correlation was defined as r2 ≥ 0.8, a moderate correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 0.5. Differences between measured and estimated gait events are presented as well as, root mean square error (RMSE) for each contact time, and kinetic variable. Results The data presented are the trial means from each subject and velocity from the LOOCV analysis. Waveform RMSE ranged from 0.191 – 0.309 BW, while the individual stance phase RMSE ranged from 0.189 to 0.288 BW (Table 6.3). Exemplar data for the minimal and maxima RMSE outputs for a participant are shown in (Figure 6.1 panel C). Estimated IC was identified prior to measured IC and IC differences ranged from 0.013 - 0.020 s per trial (Table 6.4). The identification of TO differences ranged from - 0.012 - 0.041 s. At velocities <3.16 m s-1, TO was estimated prior to the measured gait event. 89 However, at velocities > 3.16 m s-1, the estimation of TO occurred after the measured gait event (Table 6.4). Estimated and measured contact time had good agreement at average running velocities < 3.16 m s-1, however at average running velocities >3.16 m s-1, contact time was overestimated (Figure 6.2 Panel A). The Pearson’s Correlation Coefficient between the estimated and measured contact time showed a moderate Table 6.3: Stance and Waveform Root Mean Square Error correlation; r 2 = 0.795 Average Stance RMSE Waveform RMSE Velocity (m s-1) (BW) (BW) (Figure 6.2 Panel B). 2.24 0.230 ± 0.000 0.238 ± 0.025 2.33 0.189 ± 0.000 0.191 ± 0.029 Bias in the estimate of 2.44 0.199 ± 0.000 0.202 ± 0.028 2.55 0.253 ± 0.080 0.271 ± 0.100 contact time was 0.020 2.68 0.268 ± 0.056 0.266 ± 0.074 2.82 0.267 ± 0.019 0.274 ± 0.057 with 95% LoA: [- 2.98 0.281 ± 0.020 0.309 ± 0.059 3.16 0.262 ± 0.038 0.304 ± 0.056 0.011 0.051] (Figure 3.35 0.288 ± 0.039 0.305 ± 0.071 3.58 0.248 ± 0.033 0.268 ± 0.062 6.2 Panel C). Contact 3.83 0.230 ± 0.049 0.265 ± 0.076 4.13 0.242 ± 0.053 0.281 ± 0.072 time RMSE ranged 4.47 0.233 ± 0.048 0.275 ± 0.073 4.88 0.243 ± 0.043 0.284 ± 0.089 from 0.089 s to 0.021s 5.36 0.240 ± 0.052 0.287 ± 0.101 (Table 6.5). Estimated output from the measured stance average GRFs showed a consistent underestimation at velocities > 3.16 m s-1 (Figure 6.3 Panel A). There was a weak correlation between the estimated stance average GRF and the measured stance average GRF; r2 = 0.408 (Figure 6.4 Panel A). The agreement between the estimated stance average GRFs and the measured stance average GRFs were offset by -0.092 BW and 95% LoA [-0.351 0.167] BW (Figure 6.5 Panel A). The stance average ground reaction force RMSE ranged from 0.063 - 0.402 BW (Table 6.5). The measured stance impulse decreased as the average running velocity increased. At all but the slowest velocity (2.24 90 Table 6.4: LSTM Estimated Gait Event Error Average Velocity IC Difference (s) TO Difference (s) (m s-1) 2.24 0.013 ± 0.000 0.041 ± 0.000 2.33 0.019 ± 0.000 0.014 ± 0.000 2.44 0.019 ± 0.000 0.015 ± 0.000 2.55 0.015 ± 0.003 0.007 ± 0.003 2.68 0.017 ± 0.005 0.014 ± 0.018 2.82 0.015 ± 0.006 0.015 ± 0.017 2.98 0.015 ± 0.003 0.007 ± 0.017 3.16 0.014 ± 0.006 0.001 ± 0.011 3.35 0.019 ± 0.004 -0.003 ± 0.022 3.58 0.020 ± 0.005 -0.011 ± 0.012 3.83 0.018 ± 0.005 -0.007 ± 0.012 4.13 0.015 ± 0.008 -0.012 ± 0.009 4.47 0.017 ± 0.008 -0.009 ± 0.009 4.88 0.017 ± 0.003 -0.006 ± 0.008 5.36 0.017 ± 0.005 -0.006 ± 0.005 m s-1) and the two fastest velocities (4.88 and 5.36 m s-1) the impulse was overestimated by the LSTM output (Figure 6.3, Panel B). Estimated impulse had a weak correlation with measured impulse; r2 = 0.385 (Figure 6.4 Panel B). The agreement between the estimated impulse and the measured impulse bias was 0.007 BW*s and 95% LoA [-0.051 0.065] BW*s (Figure 6.5 Panel B). The RMSE across the range of velocities ranged from 0.159 to 0.266 BW*s (Table 6.5). The estimated peak forces across the range of velocities were similar to the measured peak forces, except for at the slowest velocity (2.24 m s-1) (Figure 6.3 Panel C). Estimated peak GRFs were moderately correlated with the measured peak ground reactions forces (r2 = 0.614) (Figure 6.4 Panel C). The agreement between the measured and estimated peak GRFs had an offset of 0.029 BW with 95% LoA [-0.322 0.381] BW (Figure 6.5 Panel C). The average RMSE of peak vertical GRFs ranged from 7.798 to 91 18.132 (BW) (Table 6.5). The estimated average force loading rate was overestimated compared to measured loading rate across the range of velocities. Estimated loading rate was weakly correlated with measured loading rate (r2 = 0.405) (Figure 6.4 Panel D). The agreement between the measured and estimated loading rate had an offset of -6.116 BW s-1, with LoA [-20.475 8.243] BW s-1 (Figure 6.5 Panel D). The average RMSE for loading rate over each velocity ranged from [0.07 0.218] BW s-1 (Table 6.5). Table 6.5: LSTM Estimated Spatial-Temporal and Kinetic Variables RMSE Average Contact Time Stance Impulse Peak GRF Loading Rate Velocity (s) Average (BW) (BW* s) (BW) (BW s-1) (m s-1) 2.24 0.090 ± 0.000 0.063 ± 0.000 0.103 ± 0.000 0.266 ± 0.000 13.592 ± 0.000 2.33 0.021 ± 0.000 0.070 ± 0.000 0.032 ± 0.000 0.159 ± 0.000 15.575 ± 0.000 2.44 0.023 ± 0.000 0.070 ± 0.000 0.032 ± 0.000 0.183 ± 0.000 12.559 ± 0.000 2.55 0.028 ± 0.002 0.127 ± 0.082 0.048 ± 0.024 0.227 ± 0.081 9.535 ± 3.032 2.68 0.024 ± 0.001 0.122 ± 0.069 0.041 ± 0.016 0.221 ± 0.053 8.733 ± 1.564 2.82 0.024 ± 0.001 0.108 ± 0.048 0.042 ± 0.012 0.256 ± 0.053 8.695 ± 1.839 2.98 0.025 ± 0.006 0.121 ± 0.037 0.044 ± 0.012 0.208 ± 0.034 7.798 ± 1.242 3.16 0.023 ± 0.006 0.154 ± 0.073 0.041 ± 0.011 0.221 ± 0.018 9.963 ± 4.378 3.35 0.034 ± 0.014 0.150 ± 0.068 0.047 ± 0.013 0.213 ± 0.038 11.383 ± 5.400 3.58 0.039 ± 0.016 0.196 ± 0.091 0.047 ± 0.016 0.225 ± 0.056 14.725 ± 4.691 3.83 0.031 ± 0.012 0.177 ± 0.086 0.037 ± 0.018 0.230 ± 0.107 14.085 ± 3.744 4.13 0.033 ± 0.013 0.196 ± 0.093 0.040 ± 0.022 0.237 ± 0.130 14.825 ± 4.458 4.47 0.032 ± 0.014 0.198 ± 0.093 0.037 ± 0.019 0.226 ± 0.118 15.561 ± 5.319 4.88 0.029 ± 0.008 0.218 ± 0.102 0.034 ± 0.014 0.237 ± 0.118 15.608 ± 3.667 5.36 0.028 ± 0.007 0.198 ± 0.085 0.032 ± 0.014 0.229 ± 0.121 18.132 ± 5.392 92 Figure 6.2: Complete analysis of contact time. Panel A shows contact time trends across the range of velocities. The measured contact times are in black and the estimated contact times are in red. Regression analysis and 95% confidence intervals of contact time is in Panel B. Panel C presents a Bland-Altman plot of the difference between the estimated and measured contact times, and 95% limits of agreement. 93 Figure 6.3: Kinetic variables across velocities. Measured (black) variables calculated directly from the force sensing insoles. Estimated (red) calculated from the LSTM estimated waveform. The trends in the estimated data follow those of the measured data, there is however an offset between the two. 94 Figure 6.4: Regression analysis of estimated kinetic variables with mean and 95% confidence intervals. Each data point represents an average velocity trial. The color of a trial represents the average running velocity of the trial. 95 Figure 6.5: Bland Altman plot with offset and 95% Limits of Agreement. Each data point represents an average velocity trial. The color of a trial represents the average running velocity of the trial. 96 Discussion The purpose of this study was to implement a machine learning algorithm for the mapping of inertial data to kinetic waveforms from participants running on a track across a range of velocities and participant skill levels. We estimated GRF waveforms with three inertial sensors from participants running in a real-world training scenario; 400 m repeats at prescribed paces. Three specific findings can be summarized: (1) we estimated four- second GRF waveforms from the IMU data of the same duration, (2) estimations of contact time from the output waveform were accurate, but were overestimated at average running velocities > 3.16 m s-1, and (3) estimates of single kinetic variables matched the overall trends of the measured input data, however the model tended to underestimate kinetic variables (stance average forces, peak force and average loading rate) at running velocities > 3.16 m s-1 (Figures 6.3 and 6.5). The estimation of gait events from IMUs have been reported with a wide variety of algorithmic methods, as these are the most critical variables for parsing biomechanical waveforms [14], [70], [156]. In the present study, IC difference between estimated and measured gait events ranged from 0.013 - 0.020s across a range of running velocities, indicating that IC was estimated to occur prior to the measured IC. This temporal difference may have been due to the iterative corrections algorithm that was utilized to match estimated IC with measured IC. Estimation of TO differences ranged from -0.012 – 0.041s, indicating a range of slightly early and slightly late estimation. This appears to be specific to running velocity, as estimation of TO at velocities < 3.16 m s-1 occurred before measured TO, and at running velocities > 3.16 m s-1 TO was estimated to occur after measured TO. It follows that contact time was overestimated at velocities >3.35 m s- 97 1. Previous work reported an RMSE of 0.011s and r2 = 0.665 from a quantile regression forest [43]. Our results show a threefold increase in the RMSE to 0.032s but a stronger correlation r2 = 0.795. Greater error in our estimates likely came from greater variability in the average running velocity throughout a trial and the inclusion of accelerations and decelerations within a running trial. Stance phase ground reaction force RMSE was comparable to ranges presented in previous work (RMSE of 0.39 BW) [47], with our estimated waveforms resulting in average RMSE of 0.245 BW for all running velocities. Another study reported an RMSE ranging from 0.13 – 0.17 BW between velocities of 2.7 and 4.5 m s-1 [45], using an algorithm that is closest in nature to ours, as they estimated portions of waveforms that could be concatenated into full GRF waveforms. The performance of our algorithm was similar to these previously reported values, with an RMSE ranging from 0.189 - 0.288 BW (Table 6.3), across a wider range of velocities and at non-constant velocities. The primary difference between our current study and previous work in this area is that previous studies estimated whole GRF waveforms in the laboratory at steady state running velocities on a treadmill, and only estimated stance phase or a segment of the waveform. Ours is the first study to produce a model for estimation of a full GRF waveform with multiple stance and swing phases from data collected outside of the laboratory. In our study, measured stance average GRFs generally increased with velocity (Figure 6.3 Panel A), however not linearly as expected [137]. Estimated stance average GRFs were underestimated when compared to measured stance average GRFs (Figure 6.2 Panel A and Figure 6.5 Panel A). Divergence between estimated and measured values 98 occurred over the same range of velocities (3.16 to 5.36 m s -1) that contact time was overestimated (Figure 6.2 and Figure 6.5 Panel A). Generally, an increase in contact time will cause a decrease in stance average GRFs. This is compounded with the underestimation of peak GRF values at faster average running velocities (Figure 6.5 Panel C). Faster running velocities revealed a greater bias in estimated stance average GRFs. For comparison, the physical model developed by [157], presented an average RMSE ranging from 0.681 – 1.302 BW for running velocities from 2 – 5 m s-1. Regardless, our results show notable improvement on this work, with an RMSE for estimated stance average GRF ranging from 0.063 to 0.218 BW (Table 6.5). Estimation of impulse is the most mathematically complex variable presented in this work and it also has the poorest agreement between estimated and measured values. Impulse was expected to decrease as velocity increased [145], [158], which matches our results. Estimated impulse from a quantile regression forest was reported to have a strong correlation (r = 0.974) and an RMSE of 0.004 BW*s for running velocities between 3.8 and 5.4 m s-1 [43]. Our results differ, with a weak correlation of r2 = 0.385 and an average RMSE across velocities 0.044 BW*s. These differences can be related, in part, to the discussion of errors above for both contact time and stance average GRFs. Another key difference is the variation in experience levels among our participants when compared to highly trained Division 1 endurance athletes. Beyond these differences, impulse is highly susceptible to errors in both the estimation of contact time and GRF magnitude, both of which had detectable bias in the current model. As expected, peak force increased with running velocity (Figure 6.3 Panel C). This measure has been related to estimation of external load while running [145], [158]. 99 Estimation of peak GRFs across the range of running velocities was the most accurate output from the current model. Previous research reported that the relationship between peak GRFs estimated by an ANN at three different velocities (ranging from 2.78 – 3.89 m s-1) had a moderate correlation for peak GRF; r2 = 0.72 and 95% LoA [-0.17 0.18] BW, with a bias of 0.01 BW, on average [47]. In contrast, our model had a slightly weaker correlation (r2 = 0.614) and LoA [-0.322 0.381] BW, with an average RMSE of 0.223 BW. Although our model resulted in similar correlations, we also have twice as much variability represented by our 95% LoA range. Further investigation revealed an outlier from the peak GRF analysis, in which the value was overestimated by approximately 50% for a single participant. This observation indicate that the force-measuring insoles were moving between the foot and the shoe for this participant. Measured average loading rate generally increased with running velocity, as expected (Figure 6.3 Panel D) [158]. Wouda et al. reported an ANN-estimated loading rate with correlation of r2 = 0.57, LoA of [-16 10] BW s-1 and a bias of -2.9 BW s-1 [47]. Our results showed a correlation of r2 = 0.405, with LoA [-20.450 8.243] BW s-1 and a bias of -6.116 BW s-1, demonstrating less agreement and a larger bias than the previous work. This could be due in part to differences in data collection protocols and the sensitivity of the force sensing insoles to error in the calculation of loading rate [9]. Estimated average loading rate was underestimated at velocities > 3.16 m s-1, possibly due to errors in the estimation of gait events. Identification of IC prior to the measured gait event decreases the estimated average loading rate. We attempted to mitigate this by estimating average loading rate between 10 and 40% contact time. However, as contact 100 time was overestimated and GRFs were underestimated, underestimation of average loading rate across the range of velocities still occurred. There are several limitations in this work. The force sensing insoles occasionally lost connection during trials, which led to different calibration files for the same participant. Force sensing insoles rely heavily on the calibration process prior to the data collection, and if they move between the foot and the shoe the force values will be affected. These sources of error likely contributed to the variability within our data and affected the machine learning model for the estimation of GRF waveforms. The methodology presented in this work is transferable to real-world running. However, we hesitate to recommend the algorithm in current form as a tool for the analysis of training and the translation of this work into the real-world environment. Overestimation of contact time with increased running velocity is an example of the limited transferability of the algorithm to novel environments. Building a machine learning model for a single participant or a small subset of participants with similar running ability would substantially reduce the model’s estimation error. We had a single participant run at the slowest velocities, and this participant’s data did not follow the expected kinetic trends. However, this participant’s data provide a good benchmark to demonstrate how these methods capture running performance of a truly novice runner. In conclusion, the mapping of GRF waveforms from IMU data collected in a real- world environment has been shown to be feasible, with limitations. We have presented conservative results from an LSTM model of GRF waveform estimation by reporting data from a LOOCV analysis. We used three IMUs for the mapping of inertial to kinetic data for a variety of participants ranging in skill from truly novice runner (30:00 101 estimated 5km race time) to more highly trained runners (15:30 5km race time) running 400m on a square track. This work has improved upon much of the relevant literature for estimation of spatial-temporal and kinetic measures from the estimated ground reaction force waveforms. Future studies investigating the effects of different amounts of data input, and potentially the inclusion of a wider range of running velocities should improve estimations from similar machine learning algorithms. Additionally, it would be valuable to identify biases in the reported variables by comparing measurement of force data from a force-instrumented treadmill to those measured by force sensing insoles, across a range of velocities and inclinations matching the training environment of experienced runners. 102 Bridge The goal of the previous chapter was to investigate the performance of an LSTM for mapping could estimate a whole GRF waveform from inertial data in a semi- uncontrolled environment. From these data we were able to estimate contact time as well as discrete kinetic variables from three inertial sensors. The purpose of the next chapter was to utilize multiple aspects from the previous methodologies for the estimation gait events from foot-mounted IMUs and GRF waveform from a free run on a 5-mile course. We included the use of GPS for measurements of pace and slope the participant ran on. This next chapter synthesizes the gait event identification, and GRF waveform estimation from previous chapters and applies them to this uncontrolled setting. 103 CHAPTER VII RUNNING IN THE REAL WORLD: ESTIMATION OF GAIT EVENTS AND GROUND REACTION FORCE WAVEFROMS WITH WEARABLE SENSORS AND MACHINE LEARNING This work is currently in preparation for submission to Nature. Dr. Michael E. Hahn provided mentorship including assistance with study design, data interpretation, and editing and finalizing the final manuscript. Introduction Biomechanical analysis of running outside the laboratory has become possible, due to advances in wearable sensor and machine learning technologies [1], [159]. Laboratory based technologies such as motion capture and instrumented force plates have been the traditional method with which to measure biomechanical data, including spatial- temporal, kinematic and kinetic variables. These tools require significant investment and high levels of training to collect these data. Wearable technologies are an alternative to laboratory based methods and have become more widely available for the monitoring of running biomechanics in uncontrolled environments [144], [160]. Examples of these are inertial measurement units (IMUs), GPS, and in-shoe force or pressure sensors, which can be used to measure or estimate biomechanical data [10], [40], [147], [161]. Earlier research has utilized IMUs for the estimation of gait events, foot ground contact time [40], [44], [48], [162]–[164], and estimation of specific kinetic variables with statistical or machine learning models [43], [47], [149]. Furthermore, in-shoe force sensors measure 104 normal force between the foot and shoe during foot contact have been validated as a measure of vertical ground reaction forces (GRFs) on an instrumented force treadmill [10]. There are typically 9 sensors in a multi-axial IMU: tri-axial accelerometers (linear accelerations), tri-axial rate gyroscopes (angular velocity), and tri-axial magnetometers (magnetic field). Data from IMUs need specific processing and algorithms for extraction of meaningful biomechanical variables [38]. Some approaches have been developed specifically for running, with sensors on the foot, shank, and sacrum [39], [44], [48], [134]. These algorithmic techniques have demonstrated that consistent features can be extracted from inertial data for identification of foot contacts in the laboratory and in real- world environments. However, these algorithms are yet to be validated against a kinetic measure in a free running real-world environment, with uncontrolled running velocities and different positive and negative grades. Machine learning models have been implemented for estimation and prediction of gait events [65], [165], of single kinetic variables [43], [47], [149], and single stance phase GRFs during running, or portions of running GRF waveforms [45]–[47], [166]. These studies have been constrained to the laboratory, with either in-ground force plates or instrumented force measuring treadmills for GRF measurement. Recurrent neural networks (RNNs) show promise for estimation of kinetic variables, specifically Long Short-Term Memory networks (LSTMs). These network structures were designed for the analysis of temporally related data [151]. Human gait data are ideal for these types of algorithms, as locomotion is a cyclic activity and therefore, temporally related. However, we must be cautious with the application of machine learning algorithms for evaluation 105 of running performance outside of the laboratory, as it has been well established that gait parameters, kinematics and kinetics are different between treadmill running and overground running of different durations [130], [139], [167]–[169]. It is currently unknown how these algorithms will perform with data collected over the course of an entire run over different grades and velocities. Consequently, the purpose of this study was to test two specific methods for the biomechanical analysis of running in an unconstrained environment: 1) a heuristic algorithm for the estimation of foot contacts from IMU data; 2) a machine learning algorithm, Bi-Directional LSTM (BD-LSTM), for estimation of normal GRFs between the foot and shoe; foot contacts and discrete GRF variables. We expect gait event detection from both algorithms to have similar accuracies across the range of running velocities and grades in this study. Specifically, we expect an RMSE of 0.04s, or 6% error, in the estimation of foot contact from the IMU data which are similar to results in [40]. Output waveforms from the machine learning algorithm, estimated stance phases would have an RMSE of 0.030 BW, and estimated discrete kinetic variables would have moderate correlations with measured variables in a manner akin to previous work [45]. Methods This study was approved by the University of Oregon Institutional Review Board (protocol #: 10062020.007). Data were collected from 16 participants (Table 7.1), (8 male, 8 female, age: 23.15 years, height: 167.77 cm, mass: 65.00 kg) as part of a larger ongoing study. Three participants were excluded from the analysis, due to GPS 106 malfunctions. All analyses were performed in custom Matlab programs (Mathworks, Natick, MA). Multi-axial IMUs (Casio, Tokyo, JPN) were mounted bilaterally on the dorsal aspect of each participant’s foot and approximately on the sacrum (clipped on the back of each participant’s waistband). The use of multiple inertial sensors has been suggested to Table 7.1: Participant Characteristics and Mile Pace improve estimation of GENDER Age Mass Height Pace (Kg) (cm) (mins:secs) spatial temporal and M 19 68 175 7:24 F 21 63 168 7:59 kinetic variables, compared M 21 73 183 7:15 to a single inertial sensor F 19 55 173 7:24 M 34 68 185 8:17 [44], [152], [153]. These F 23 52 163 7:51 M 18 68 183 7:33 multi-axial IMUs recorded M 20 68 170 7:13 F 27 54 173 8:57 3D linear accelerations and M 28 88 191 9:10 M 26 70 69 7:13 angular velocities at 200 F 27 57 165 8:31 M 18 61 183 8:31 Hz. Acceleration data were post-processed with a Kalman filter to orient the local (IMU) coordinate system vertical to gravity. Foot-shoe normal force data were recorded with Loadsol insole force sensors (Novel Electronics, St. Paul, MN) at 100 Hz. Participants were asked to run a five-mile course around the University of Oregon and surrounding parks (Figure 7.1). Participants also wore a Garmin (Kansas City, KS) GPS watch (either Garmin Forerunner 130 and 135 to record elevation and running velocity). Data Processing Force-sensing insole and IMU data were synced with ‘foot stomps’ before and after each run. The IMU data were down sampled to 100 Hz to match the force sensing 107 insoles and filtered with a 4th order low pass zero-lag Butterworth filter (fc = 35 Hz). Internal clock drift between the IMUs and force sensing insoles were resolved with an iterative corrections algorithm. The kinetic data were normalized to participant bodyweight (BW) and were filtered with a 4th order low-pass zero-lag Butterworth filter (fc = 20 Hz). Post-hoc corrections to force insole data due to a drifting baseline were required (< 1% of the measured footfalls needed this adjustment). Making these corrections entailed identifying swing phases during a period in which the forces had a drifting baseline and setting the swing phases to 0 BW. This approach is described in greater detail in Chapter II of the dissertation. Additionally, force data <5% BW was set to 0 BW. Elevation and velocity measured by the GPS at (sf = 1 Hz), were filtered with a zero-lag 10 sample moving average filter. Velocities from GPS data were set to the nearest 0.25 m s-1 for velocities ranging from 2.25 – 5.25 m s-1, and the upper limit was set by the number of footfalls available for analysis, discussed below. Running velocities < 2.25 m s-1 are typically walking velocities and the walk to run transition typically occurs at around 2.00 – 2.10 m s-1. Grade was calculated from the elevation data and binned into three different groupings. Incline foot strikes were identified at measured grades of > 5°, and decline foot strikes were identified as measured grades of < -5°, with level ground foot strikes between 5° and -5°. The [-5°, 5°] limit was set due to observed noise of ± 4° throughout the run during portions of the course with no physically discernible grade. Data from the GPS were then time-synced to the beginning and end of the IMU and kinetic data. For each combination of velocity and grade to be included in the analysis from a participant there must have been a total of 10-foot contacts meeting this criterion throughout the course. 108 Descriptions of Gait Event Detection Algorithm Gait event estimation from IMU data utilized heuristic rules similar to previous work [40], [41], [147]. Initial contact from the IMUs on the dorsal aspect of the foot (IC- IMU) was identified with two rules. First was the identification of minimum angular velocity in the x-axis of the IMU with minimum of 500 ms between identified minima. Second, a temporal window relative to each minimum, ranging from 5 ms post minima to 45 ms post minima was searched for a resultant acceleration > 50 m s-2. If this condition was satisfied then this peak was set to be ICIMU [40]–[42]. Identification of TOIMU was performed by searching a specific temporal window beginning 100 ms after ICIMU and ending at the half-width of the estimated stride time. In this window TOIMU was either identified as the local maxima of vertical acceleration or the first instance that vertical acceleration was > 3g [40], [128]. Identification of gait events with foot-shoe normal force data utilized a 5% BW cutoff; the first instance of force > 5% BW was identified as IC, and TO was identified as the last instance of force <5% BW. We then removed foot contacts that could not be matched to the IMU and force sensing insole measures. If ICIMU was not within half contact time of the IC from the force sensing insole it was removed from the analysis. Machine Learning Methods We utilized a BD-LSTM with 19 hidden units and a regression output. A more thorough description of the network architecture can be found here [151]. Input into the BD-LSTM were 1-second temporal windows of inertial data: 3-D accelerations, angular velocities, and their respective resultants, from three anatomical locations (dorsum of both feet, and the waistband at approximately the sacrum). Output from the BD-LSTM 109 were 1-second intervals of estimated GRF data. The algorithm was evaluated with a Leave One Out Cross Validation (LOOCV) with 12 participants in the training data and 1 participant in the test data, repeated for each participant. The estimated force data were then filtered with a 2nd order low-pass zero-lag Butterworth filter (fc = 15 Hz). Estimated data tended to be noisier than the input GRF waveforms. This was accounted for with a lower cutoff frequency in the filter. Errant estimated GRF data were removed by setting estimated force <5% BW to 0 BW, and by removal of false “foot-contacts” generated by the model that were <100 ms or > 500 ms. Foot contacts shorter than 100 ms were not consistent with measured foot contacts during running and foot contacts longer than 500 ms tended to occur during periods of quiet standing (e.g. participant was at a cross walk). We observed that the swing phase estimation error approaches 0 as most of the errant data are corrected for using the steps described above. Initial contact from the machine learning output (ICLSTM) was identified by the first instance of force >5% BW and toe off (TOLSTM) was identified by the last instance of force greater than >5% BW. To be sure we were matching foot contact correctly during analysis, if ICLSTM was not within a half contact time of the measured IC, it was removed from the analysis. From the model output GRF waveforms, stance average GRFs, peak GRFs, impulse and average loading rate (ALR) were calculated. Average loading rate was calculated by identifying the impact peak and then averaging the force/time slope in the middle 60% of the region between IC and the impact peak [58]. Root Mean Squared Error (RMSE), linear models and bias analyses were used to assess estimated contact time and the kinetic variables. Differences between the model estimated variable and measured variable waveform are presented in both linear 110 regression and Bland-Altman plots with 95% confidence intervals (CIs) or Limits of Agreement (LoA), respectively. Pearson correlation coefficients (r2) were calculated to show agreement between estimated and measured data. A strong correlation was defined as r2 ≥ 0.8, a moderate correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 0.5. Figure 7.1: Outlined course each participant ran. It is clear there is error in the GPS, specifically at the beginning of the course. This may have been due to the number of buildings and the chosen GPS technology. 111 Results There were 90,537 foot strikes measured with the force sensing insoles. Using the rules-based algorithm described above, data from the foot mounted IMU provided a total of 90,063 (88,364 analyzed) foot strikes, and the BD-LSTM estimated 90,579 (85,406 analyzed) foot strikes. The average pace for the 5 mile course (Figure 7.1) was 7:56 ± 0:40 (min:secs) (Table 7.1). Two participants ran different courses than initially planned, each longer than 5 miles. Table 7.2: Root Mean Square Error Table for IMU Gait Event Detection RUNNING Level Ground Decline Incline VELOCITY (M S-1) IC (s) TO (s) TC (s) IC (s) TO (s) TC (s) IC (s) TO (s) TC (s) 2.25 0.018 ± 0.034 ± - - - - 0.010 ± 0.054 ± 0.059 ± 0.002 0.017 0.000 0.000 0.000 2.50 0.017 ± 0.034 ± 0.032 ± 0.015 ± 0.046 ± 0.044 ± 0.016 ± 0.026 ± 0.020 ± 0.003 0.017 0.016 0.006 0.012 0.015 0.007 0.009 0.012 2.75 0.017 ± 0.028 ± 0.030 ± 0.019 ± 0.033 ± 0.032 ± 0.015 ± 0.022 ± 0.023 ± 0.003 0.008 0.011 0.006 0.013 0.014 0.006 0.008 0.009 3.00 0.019 ± 0.028 ± 0.027 ± 0.018 ± 0.034 ± 0.032 ± 0.017 ± 0.027 ± 0.027 ± 0.005 0.009 0.009 0.006 0.011 0.007 0.006 0.009 0.011 3.25 0.019 ± 0.027 ± 0.026 ± 0.018 ± 0.025 ± 0.024 ± 0.017 ± 0.024 ± 0.027 ± 0.005 0.009 0.009 0.005 0.009 0.010 0.005 0.007 0.010 3.50 0.018 ± 0.024 ± 0.025 ± 0.019 ± 0.028 ± 0.026 ± 0.017 ± 0.026 ± 0.026 ± 0.004 0.006 0.009 0.007 0.008 0.010 0.006 0.008 0.011 3.75 0.018 ± 0.025 ± 0.025 ± 0.019 ± 0.027 ± 0.027 ± 0.020 ± 0.025 ± 0.022 ± 0.004 0.009 0.011 0.005 0.012 0.010 0.005 0.009 0.010 4.00 0.020 ± 0.028 ± 0.028 ± 0.022 ± 0.025 ± 0.029 ± 0.020 ± 0.024 ± 0.021 ± 0.005 0.011 0.012 0.011 0.009 0.011 0.006 0.011 0.010 4.25 0.020 ± 0.028 ± 0.029 ± 0.017 ± 0.025 ± 0.025 ± 0.019 ± 0.020 ± 0.026 ± 0.007 0.014 0.012 0.011 0.011 0.014 0.007 0.009 0.011 4.50 0.019 ± 0.026 ± 0.033 ± 0.011 ± 0.035 ± 0.029 ± - - - 0.008 0.013 0.014 0.007 0.008 0.012 4.75 0.018 ± 0.042 ± 0.036 ± - - - - - - 0.000 0.000 0.000 5.00 0.021 ± 0.026 ± 0.033 ± 0.017 ± 0.010 ± 0.024 ± - - - 0.000 0.025 0.022 0.000 0.000 0.000 5.25 0.051 ± 0.045 ± 0.066 ± - - - - - - 0.000 0.000 0.000 112 Results are presented with the minima and the maxima difference or RMSE for each variable. Specific RMSEs for estimated variables across velocities and grades can be found in Tables 7.2-7.6. Pearson correlation coefficients are presented as well as the slope of the regression line. Bland-Altman plots show mean difference in the estimated and measured variable with the 95% Limits of Agreement (LoA) (Figures 7.3 – 7.4 and 7.7-7.10). Each marker in these figures represents a minimum of 10 footfalls for each velocity and grade (positive, negative, and level ground). Table 7.3: RMSE from BD-LSTM Output RUNNING Initial Contact Toe Off Contact Time VELOCITY (M S-1) Level Level Level Decline Incline Decline Incline Decline Incline Ground Ground Ground 2.25 0.020 ± 0.023 ± 0.030 ± 0.025 ± 0.038 ± 0.025 ± - - - 0.014 0.006 0.019 0.012 0.022 0.003 2.50 0.021 ± 0.039 ± 0.025 ± 0.030 ± 0.060 ± 0.030 ± 0.033 ± 0.047 ± 0.027 ± 0.006 0.000 0.008 0.012 0.000 0.015 0.017 0.000 0.006 2.75 0.020 ± 0.021 ± 0.019 ± 0.024 ± 0.023 ± 0.024 ± 0.028 ± 0.022 ± 0.027 ± 0.006 0.014 0.005 0.010 0.005 0.009 0.013 0.005 0.012 3.00 0.020 ± 0.019 ± 0.019 ± 0.022 ± 0.021 ± 0.022 ± 0.027 ± 0.024 ± 0.026 ± 0.006 0.008 0.005 0.009 0.006 0.008 0.014 0.012 0.011 3.25 0.020 ± 0.018 ± 0.019 ± 0.022 ± 0.025 ± 0.020 ± 0.027 ± 0.027 ± 0.024 ± 0.007 0.008 0.006 0.009 0.006 0.007 0.015 0.012 0.009 3.50 0.021 ± 0.021 ± 0.018 ± 0.023 ± 0.024 ± 0.021 ± 0.029 ± 0.027 ± 0.027 ± 0.007 0.008 0.006 0.009 0.009 0.009 0.016 0.012 0.011 3.75 0.021 ± 0.023 ± 0.016 ± 0.024 ± 0.029 ± 0.016 ± 0.030 ± 0.035 ± 0.021 ± 0.010 0.011 0.009 0.009 0.014 0.005 0.015 0.017 0.007 4.00 0.021 ± 0.024 ± 0.018 ± 0.026 ± 0.032 ± 0.019 ± 0.034 ± 0.039 ± 0.026 ± 0.009 0.011 0.010 0.011 0.011 0.008 0.016 0.017 0.009 4.25 0.021 ± 0.024 ± 0.023 ± 0.027 ± 0.030 ± 0.014 ± 0.034 ± 0.039 ± 0.021 ± 0.009 0.010 0.011 0.013 0.010 0.002 0.018 0.017 0.004 4.50 0.019 ± 0.022 ± 0.029 ± 0.043 ± 0.039 ± 0.051 ± - - - 0.005 0.010 0.007 0.024 0.011 0.019 4.75 0.022 ± 0.026 ± 0.039 ± - - - - - - 0.000 0.000 0.000 5.00 0.025 ± 0.025 ± 0.040 ± - - - - - - 0.007 0.009 0.010 5.25 0.029 ± 0.036 ± 0.040 ± - - - - - - 0.000 0.000 0.000 Gait Event and Contact Time Estimation from IMUs The RMSE of the estimated ICIMU ranged from 0.017 – 0.051s for level ground running, 0.011 – 0.022s for decline running, and 0.015 – 0.020s for incline running 113 (Table 7.2 and Figure 7.2 Panel A). Estimation of TOIMU RMSE ranged from 0.024 – 0.044s for level ground running, 0.024 – 0.046s for decline running, and 0.020 – 0.053s for incline running (Table 7.2 and Figure 7.2 Panel B). The RMSE of estimated contact times ranged from 0.024 – 0.066s for level ground running, 0.023 – 0.046 for decline running, and 0.020 – 0.059 for incline running (Table 7.2 and Figure 7.2 Panel C). Linear regression and the analysis of bias in the estimate can be found in (Figure 7.3). Table 7.4: BD-LSTM Stance Phase Ground Reaction Force Waveform RMSE RUNNING Level Ground VELOCITY Incline (BW) Decline (BW) (BW) (M S-1) 2.25 0.315 ± 0.102 - 0.253 ± 0.081 2.50 0.313 ± 0.066 0.300 ± 0.000 0.265 ± 0.058 2.75 0.304 ± 0.057 0.332 ± 0.004 0.268 ± 0.067 3.00 0.294 ± 0.065 0.292 ± 0.082 0.272 ± 0.073 3.25 0.289 ± 0.059 0.321 ± 0.035 0.277 ± 0.065 3.50 0.299 ± 0.070 0.354 ± 0.070 0.274 ± 0.066 3.75 0.310 ± 0.072 0.389 ± 0.120 0.229 ± 0.036 4.00 0.339 ± 0.085 0.391 ± 0.165 0.309 ± 0.082 4.25 0.358 ± 0.115 0.394 ± 0.172 0.269 ± 0.007 4.50 0.397 ± 0.180 0.576 ± 0.371 - 4.75 0.376 ± 0.000 - - 5.00 0.427 ± 0.010 - - 5.25 0.635 ± 0.000 - - Stride frequency was observed to change across velocities and minimally with grades. Stride frequency ranged between 72.460- 91.681 strides min-1 for level running, between 79.624 – 86.299 strides min-1 for decline running, and from 76.395 – 87.016 strides min-1 for incline running (Figure 7.4). Measured stance average GRFs ranged from 1.291 – 1.515 BW for level ground running, from 1.294 – 1.590 BW for decline running, and from 1.253 – 1.356 BW for incline running (Figure 7.6 Panel A). Peak GRFs ranged from 2.253 to 2.66 BW for during level ground running, from 2.168 to 114 2.845 BW for decline running, and from 2.187 – 2.367 BW for incline running (Figure 7.6 Panel B). Impulse during stance phase ranged from 0.333 – 0.398 BW*s for level ground running, from 0.340 – 0.403 BW*s for decline running, and from 0.342 – 0.370 BW*s for incline running (Figure 7.6 Panel C). Measured ALR ranged from 31.919 – 58.307 BW s-1 for level ground running, from 38.220 – 57.359 BW s-1 for decline running, and from 29.969 – 48.234 BW s-1 for incline running (Figure 7.6 Panel D). BD-LSTM Gait Event and Contact Time The RMSE of the estimation of ICLSTM ranged from 0.019 – 0.029 s for level ground, from 0.018 – 0.039 s for decline running, and from from 0.016 – 0.025 s for incline running (Table 7.3). Estimation of TOLSTM ranged from 0.021 – 0.036 s for level ground running, from 0.021 – 0.059 s for decline running, and from 0.014 – 0.030 s for incline running (Table 7.3). Contact time estimation ranged from 0.024 – 0.066 s for level ground, from 0.023 – 0.046 s for decline running, and from 0.020 – 0.059 s for incline running (Table 7.3 and Figure 7.4). Kinetic Variables from Machine Learning Output Stance phase GRF whole waveform RMSE ranged from 0.298 BW to 0.635 BW across all running velocities and grades (Table 7.4). Stance average GRF RMSEs ranged from 0.113 – 0.313 BW for level ground running, from 0.112 – 0.337 BW for decline running, and from 0.089 – 0.158 BW for incline running (Table 7.5 and Figure 7.7). Peak force RMSE ranged from 0.187 – 0.392 BW for level ground running, from 0.197 – 0.508 BW for decline running, and from 0.114 – 0.212 BW incline running (Table 7.5 and Figure 7.8). Impulse RMSE ranged from 0.024 – 0.066 BW*s for level ground 115 running, from 0.035 – 0.086 BW*s for decline running, and from 0.022 – 0.038 BW*s for incline running (Table 7.5 and Figure 7.9). Average loading rate RMSE ranged from 14.963 – 34.680 Bw s-1 for level ground running, from 18.346 – 45.121 BW s-1 for decline running, and from 12.241 – 21.642 Bw s-1 for incline running (Table 7.5 and Figure 7.10). Figure 7.2: Gait event differences estimated from the foot mounted IMUs (Panels A and B). Panels C and D show the estimated and measured contact times. 116 TABLE 7.5: Measured Kinetic Variables from Force Insoles Stance Average Ground Reaction Force (BW) Peak Ground Reaction Force (BW) RUNNING VELOCITY (M S-1) Level Decline Incline Level Decline Incline Ground Ground 2.25 1.350 ± 0.075 - 1.230 ± 0.021 2.318 ± 0.146 - 2.157 ± 0.052 2.50 1.314 ± 0.079 1.294 ± 0.000 1.230 ± 0.021 2.253 ± 0.152 2.168 ± 0.000 2.157 ± 0.052 2.75 1.327 ± 0.076 1.350 ± 0.084 1.277 ± 0.097 2.282 ± 0.165 2.249 ± 0.185 2.223 ± 0.203 3.00 1.294 ± 0.081 1.316 ± 0.085 1.257 ± 0.086 2.254 ± 0.151 2.248 ± 0.187 2.195 ± 0.163 3.25 1.291 ± 0.089 1.320 ± 0.066 1.253 ± 0.085 2.254 ± 0.154 2.279 ± 0.135 2.187 ± 0.151 3.50 1.314 ± 0.091 1.352 ± 0.118 1.264 ± 0.084 2.298 ± 0.162 2.338 ± 0.216 2.212 ± 0.146 3.75 1.317 ± 0.100 1.355 ± 0.133 1.280 ± 0.084 2.308 ± 0.186 2.367 ± 0.255 2.242 ± 0.156 4.00 1.337 ± 0.109 1.375 ± 0.141 1.278 ± 0.075 2.341 ± 0.206 2.401 ± 0.268 2.236 ± 0.130 4.25 1.338 ± 0.118 1.386 ± 0.178 1.305 ± 0.063 2.346 ± 0.230 2.431 ± 0.344 2.294 ± 0.128 4.50 1.335 ± 0.114 1.351 ± 0.245 - 2.353 ± 0.234 2.394 ± 0.435 - 4.75 1.457 ± 0.000 - - 2.514 ± 0.000 - - 5.00 1.515 ± 0.150 1.590 ± 0.000 - 2.669 ± 0.260 2.846 ± 0.000 - 5.25 1.449 ± 0.000 - - 2.505 ± 0.000 - - Impulse (BW*s) Average Loading Rate (BW s-1) Level Decline Incline Level Decline Incline Ground Ground 2.25 0.356 ± 0.023 - 0.365 ± 0.024 40.380 ± - 26.897 ± 8.429 4.971 2.50 0.373 ± 0.031 0.354 ± 0.000 0.365 ± 0.024 31.919 ± 46.045 ± 26.897 ± 1.564 0.000 4.971 2.75 0.369 ± 0.026 0.365 ± 0.024 0.370 ± 0.024 35.168 ± 50.658 ± 29.969 ± 5.992 11.145 4.459 3.00 0.357 ± 0.029 0.366 ± 0.026 0.361 ± 0.029 34.657 ± 38.220 ± 30.215 ± 5.588 6.578 3.581 3.25 0.352 ± 0.027 0.353 ± 0.024 0.354 ± 0.028 36.008 ± 45.744 ± 30.755 ± 6.455 12.826 4.304 3.50 0.350 ± 0.028 0.357 ± 0.028 0.353 ± 0.027 39.088 ± 47.238 ± 32.381 ± 5.242 13.109 4.213 3.75 0.346 ± 0.029 0.348 ± 0.029 0.348 ± 0.027 39.984 ± 51.401 ± 36.196 ± 5.073 10.231 4.405 4.00 0.348 ± 0.030 0.347 ± 0.029 0.342 ± 0.026 43.588 ± 55.461 ± 38.833 ± 5.613 9.529 4.716 4.25 0.346 ± 0.032 0.347 ± 0.036 0.342 ± 0.026 46.055 ± 55.107 ± 43.496 ± 8.777 11.352 8.523 4.50 0.342 ± 0.032 0.340 ± 0.041 - 47.118 ± 47.399 ± - 6.174 17.226 4.75 0.398 ± 0.000 - - 40.092 ± - - 0.000 5.00 0.374 ± 0.042 0.403 ± 0.000 - 57.365 ± 57.359 ± - 14.135 0.000 5.25 0.333 ± 0.000 - - 58.307 ± - - 0.000 117 Table 7.6: Kinetic Variable RMSE from BD-LSTM Estimated Waveforms Stance Average Ground Reaction Force (BW) Peak Ground Reaction Force (BW) RUNNING VELOCITY (M S-1) LG Decline Incline LG Decline Incline 2.25 0.160 ± 0.068 - 0.089 ± 0.013 0.222 ± 0.087 - 0.117 ± 0.042 2.50 0.149 ± 0.063 0.191 ± 0.000 0.106 ± 0.021 0.233 ± 0.097 0.430 ± 0.000 0.115 ± 0.021 2.75 0.153 ± 0.047 0.132 ± 0.003 0.122 ± 0.048 0.201 ± 0.091 0.218 ± 0.042 0.182 ± 0.106 3.00 0.134 ± 0.055 0.123 ± 0.061 0.121 ± 0.049 0.187 ± 0.097 0.197 ± 0.128 0.169 ± 0.102 3.25 0.141 ± 0.058 0.138 ± 0.068 0.118 ± 0.044 0.194 ± 0.097 0.205 ± 0.109 0.154 ± 0.089 3.50 0.153 ± 0.070 0.185 ± 0.061 0.124 ± 0.056 0.208 ± 0.114 0.268 ± 0.126 0.163 ± 0.119 3.75 0.162 ± 0.079 0.227 ± 0.126 0.106 ± 0.023 0.220 ± 0.129 0.351 ± 0.194 0.138 ± 0.038 4.00 0.186 ± 0.088 0.212 ± 0.151 0.158 ± 0.059 0.255 ± 0.132 0.316 ± 0.241 0.213 ± 0.110 4.25 0.204 ± 0.100 0.233 ± 0.159 0.139 ± 0.017 0.259 ± 0.155 0.345 ± 0.251 0.136 ± 0.031 4.50 0.245 ± 0.171 0.337 ± 0.315 - 0.311 ± 0.261 0.508 ± 0.416 - 4.75 0.247 ± 0.000 - - 0.392 ± 0.000 - - 5.00 0.298 ± 0.076 - - 0.380 ± 0.252 - - 5.25 0.313 ± 0.000 - - 0.223 ± 0.000 - - Impulse (BW*s) Average Loading Rate (BW s-1) LG Decline Incline LG Decline Incline 2.25 0.024 ± 0.004 - 0.022 ± 0.010 19.208 ± 4.419 - 13.043 ± 2.744 2.50 0.038 ± 0.015 0.086 ± 0.000 0.024 ± 0.011 14.963 ± 4.868 28.065 ± 0.000 12.242 ± 4.931 2.75 0.035 ± 0.008 0.036 ± 0.006 0.030 ± 0.012 15.399 ± 4.307 24.423 ± 5.967 12.915 ± 3.642 3.00 0.031 ± 0.011 0.035 ± 0.011 0.030 ± 0.014 16.549 ± 4.926 18.346 ± 6.540 13.225 ± 3.691 3.25 0.031 ± 0.011 0.036 ± 0.013 0.029 ± 0.015 18.065 ± 4.870 22.735 ± 4.258 15.267 ± 4.550 3.50 0.031 ± 0.009 0.040 ± 0.011 0.031 ± 0.011 18.830 ± 4.966 27.728 ± 5.564 15.812 ± 4.568 3.75 0.034 ± 0.010 0.043 ± 0.011 0.032 ± 0.008 20.028 ± 4.578 30.510 ± 2.926 13.566 ± 2.308 4.00 0.039 ± 0.012 0.037 ± 0.013 0.039 ± 0.014 22.185 ± 5.708 32.335 ±13.041 19.349 ± 9.038 4.25 0.039 ± 0.014 0.043 ± 0.025 0.035 ± 0.009 25.270 ± 5.889 32.324 ± 21.642 ± 3.445 13.928 4.50 0.047 ± 0.018 0.047 ± 0.018 - 30.374 ± 45.121 ± - 19.786 18.762 4.75 0.058 ± 0.000 - - 31.505 ± 0.000 - - 5.00 0.066 ± 0.009 - - 33.656 ± 9.727 - - 5.25 0.058 ± 0.000 - - 34.680 ± 0.000 - - 118 Discussion The purpose of this study was to test two specific methods for the biomechanical analysis of running in an unconstrained environment: 1) a heuristic algorithm for the estimation of foot contacts from IMU data; 2) a machine learning algorithm, BD-LSTM, for estimation of normal GRFs between the foot and shoe; foot contacts and discrete GRF variables. The specific findings of the study are summarized briefly here: 1) contact time with foot-mounted IMUs was estimated with an average RMSE of 0.030 s, 2) BD-LSTM output waveforms estimated contact times with RMSE of 0.031 s, 3) BD-LSTM output waveform step-by-step average for all combinations of velocities and grades had an RMSE of 0.33 BW per step. Throughout the discussion, it is assumed that the greater ranges of RMSEs, lower Pearson Correlation Coefficients and wider 95% LoA are due to the unconstrained running environment of this study, in comparison to running in a controlled laboratory environment. We observed a decrease in contact time with increased running velocity, for level ground, incline and decline foot contacts (Figure 7.2 Panel 3). We noted minimal differences in measured contact times between level ground, incline, and decline (Figure 7.2 Panel E). Comparison of stride frequencies for running velocities from 2.5 – 4.5 m s-1 between the current study and a treadmill study showed minimal differences ranged from [-2.604 -4.643] strides min-1 [138]. There were negligible differences between level ground running, decline and incline stride frequencies (Figure 7.5). This finding is not surprising, as velocity has been shown to have a larger effect on stride frequency [35], [170]. However, we have shown that stance average GRFs, peak GRFs and ALR increased with running velocity (Figure 7.6), following the same trends previously 119 Figure 7.3: Estimated contact time from foot mounted IMUs. Linear regression and Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients, and the slope of the regression line are presented in panels A-D. The Bland- Altman plots present differences between the estimated and measured contact time. The average difference and the 95% LoA are shown in panels E-H. 120 Figure 7.4: Estimated contact time from the BD-LSTM. Linear regression and Bland- Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients, and the slope of the regression line are presented in panels A-D. The Bland-Altman plots present differences between the estimated and measured contact time. The average difference and the 95% LoA are shown in panels E-H. 121 Figure 7.5: Stride Frequency measured from the force sensing insoles, across the range of velocities, with plotted with incline, decline and level ground. reported [137], [138]. The current study had measured impulses ranged from 0.333 – 0.403 BW*s (Table 7.5), compared to another study that reported impulse across different velocities and grades on a treadmill ranging from 0.300 – 0.340 BW*s [45]. The ranges of ALR in our study (31.919 – 58.307 BW s-1) were similar to previous work (30.100 – 64.700 BW s-1) across a range of velocities and grades [45]. Differences in ALR during decline running showed an increase of 9.298 BW s-1, and decrease of 2.048 BW s-1 during incline running (Figure 7.6), which is similar to values reported previously [35]. Our approach to estimating gait events used both acceleration and angular velocity data, which diverges from previous work, as most studies have made use of only one type of data, either accelerations or angular velocities [21], [40]–[42], [126], [147], [171], [172]. Differences between ICIMU and measured IC in the current study occurred in the expected range (-0.020- 0.020s), due to the iterative corrections algorithm used 122 (Figure 7.2 Panel A). A previous study in a controlled laboratory environment reported identification of ICIMU across a small range of velocities (8 – 11 km h -1), with a range of RMSE 0.004 – 0.008 s [42]. This is a smaller average RMSE than our study across a range of velocities, and grades (RMSE range 0.011 - 0.051 s). The same study reported a larger RMSE range for identification of TOIMU: 0.008 – 0.011 s, than ICIMU from their work [42], while our study presented TOIMU RMSE range from 0.020 - 0.053 s. Machine learning estimation of gait events allows for flexibility in the identification of ICLSTM and TOLSTM instead of relying on specific heuristics, as presented above. We have shown minimal differences between the IMU estimated contact time and the BD-LSTM estimated contact time; RMSE ranges for ICIMU, 0.011 – 0.051 s, compared to ICLSTM, 0.016 - 0.039 s. There was a larger RMSE in the lower bound of ICLSTM, but a narrower range of RMSEs across the range of running velocities. Estimation of TOIMU had an RMSE range of 0.020 - 0.053 s, while TOLSTM RMSE was 0.014 - 0.059 s (Tables 7.2 and 7.3). The estimation of TO with inertial sensors has shown more variability than estimation of IC in many different studies, including the current study [40]–[42], [48]. Contact time estimated from both the foot mounted IMUs and the BD-LSTM decreased with increased running velocity (Figures 7.3 and 7.4). Contact time estimation from foot mounted IMUs in this study had an RMSE from 0.020 – 0.066 s. Contact time estimated from the BD-LSTMs had an RMSE that ranged from 0.021- 0.040 s, an improvement over the IMU estimated contact time. Foot contact durations for IMU estimates had an r2 = 0.460, and the BD-LSTM estimated foot contact durations had an r2 = 0.524 (Figures 7.3 Panel E, and Figure 7.4 Panel E). Despite better agreement in the output from the BD-LSTM, there was more bias in the estimation of contact time from 123 the BD-LSTM compared to the IMU estimated contact time, (Level Ground: 0.010s vs. 0.005 s), and this trend continued with the different grade conditions (Figures 7.3 and 7.4 Panels E - F). Another study reported an r2 = 0.665 for estimated contact times from a Quantile Regression Forest, while the current study presented an r2 = 0.524 across all foot contacts [43]. For external comparison, the our model showed a reduced bias in the estimations of contact time compared to [40]. They reported an offset of -0.016s with 95% LoA [-0.058 0.027 s], while the current model resulted in an offset of 0.005 s with 95% LoA [–0.035 0.044 s] across all IMU estimated foot contacts, and an offset of 0.010 s with 95% LoA [–0.025 0.044 s] across all BD-LSTM estimated foot contacts. Our approach resulted in narrower limits of agreement for both the IMU and BD-LSTM estimated contact times. The previous study used raw data for analysis, compared to the use of averages of each combination of velocity and grade presented in this work. The accuracy in the estimation of GRF waveforms during stance phase using the BD-LSTM varied across running velocities. Level ground running had the largest RMSE range (0.288 – 0.635 BW), compared to decline running (0.291 – 0.576 BW) and incline running (0.229 – 0.309 BW). It should be noted however, that level ground running also had the largest range of velocities (Table 7.3). We observed that the BD-LSTM underestimated the GRF across the full range of velocities and grades. Stance phase RMSE ranged from 0.229 BW to 0.635 BW for all velocities and grades, compared to previously estimated stance phase RMSE from 0.12 to 0.20 BW derived from kinetic waveforms estimated from a machine learning algorithm for treadmill running at different velocities and inclinations [45]. 124 Stance average forces mirrored the estimation of the whole waveform errors. Correlation of stance average GRFs reported in the previous chapter was r2 = 0.408 across running velocities. However, the current analysis yielded a much lower r2 = 0.105. The current study had 95% LoA [-0.332 0.105] BW and a mean difference of -0.113 BW for all foot contacts (Figure 7.7). This is slightly more bias than reported in the previous chapter (mean difference = -0.092 BW). For external comparison, an LSTM was developed for the estimation of stance average GRFs reported an RMSE between 0.340 and 0.630 BW [47], while the current study reported a stance average GRF RMSE between 0.089 – 0.313 BW. Estimated peak force had similar patterns to the estimated stance average GRF (Figure 7.8). The correlation of peak force reported in the previous chapter was r2 = 0.614 for level ground steady state running velocities, while the current work resulted in an r2 = 0.332 across all foot contacts, with worse performance in the estimation of peak force during incline running foot contacts (r2 =0.264). Previous work reported a moderate correlation between the estimated and measured peak GRFs, from data collected on a force instrumented treadmill (r2 = 0.665) [43]. For further external comparison, a BD- LSTM that utilized only foot contact information from a single sensor on the sacrum resulted in an r2 = 0.62, with 95% LoA [-0.17 0.18] BW and a bias of 0.01 BW. In another study, the 95% LoA ranged from [-0.503 0.220] BW with a bias of -0.142 BW [47]. The major difference between the previous work and ours was that they estimated single stance phase vertical GRFs on a treadmill, while we estimated entire waveforms in a free running environment, which is inherently more variable than in the controlled laboratory setting. 125 Impulse had the best performance of the calculated discrete kinetic variables from the estimated GRF waveform, as it was the least effected by underestimation of the force waveform magnitude. Linear regression showed that estimated and measured impulse were moderately correlated, r2 = 0.571 for all foot contacts (Figure 7.9 Panel A). Impulse was underestimated by 0.040 BW*s for all foot contacts, which equates to approximately 6 - 8% error in the estimation of impulse across the range of locomotion velocities and grades. More precise estimation of contact time from the BD-LSTM increased the accuracy in the calculation of impulse. In the previous chapter, estimated impulse was weakly correlated with the measured impulse, r2 = 0.385, with a bias of 0.007 BW*s and 95% LoA [-0.051 0.065] BW*s, while for the current study we observed an r2= 0.571, a bias of -0.020 BW*s and 95% LoA [-0.060 0.021] BW*s. A previous study estimated impulse with a mean absolute error of approximately 0.030 BW*s across velocities and grades running on a treadmill [45], compared to the current study with RMSE across running velocities and grades ranging from 0.022 to 0.086 BW*s. The RMSEs for impulse in the present study tended to be larger for decline running compared to level ground and incline running, due to the more pronounced impact peak observed in that condition. Estimated ALR was weakly correlated with measured ALR, with r2 = 0.160 for level ground running, r2 = 0.210 for decline running, and r2 = 0.492 for incline running. In comparison, from Chapter VI, correlation between the estimated and measured ALR during level ground running on a track surface yielded an r2 = 0.614. Estimated ALR was more accurate when there were more footfalls available in the waveform (e.g., a longer temporal input). For comparison, in another study using data collected in a laboratory 126 environment across a range of velocities, loading rate was moderately correlated to measured loading rate, with an r2 = 0.57, a bias of -2.9 BW s-1 95% and LoA [-16 10] BW s-1 [47], while the current study presents a 95% LoA [34.002 7.917] BW s -1 with a bias of -13.042 BW s-1. Estimated ALR has been reported to have larger percent errors than other estimated kinetic variables [45]. In the present study, this may be due to the inherent differences in loading rate between decline, incline, and level running. The ALR is typically much larger for decline running than it is for incline or level ground running [35]. There are various limitations in the collection of running data outside of the laboratory, some of which have been highlighted above. There were two corrections made to the force data in this study. The first of these was an iterative corrections algorithm to resolve differences between the internal clocks of the IMU and the force sensing insoles. Second, throughout the study, approximately 500 footfalls were observed to have a drifting baseline. The drifting baseline may have been a result of the force sensing insole moving between the foot and the shoe during the highly dynamic running activities being tested. There was also a small error in the synchronization of GPS data to IMU and GRF data. Our protocol involved synchronization of the data by matching the sudden increase in velocity measured by the GPS to the beginning of the run, and the periods in which the runner had minimal velocity (e.g., while waiting at cross walks). This protocol greatly improved our data analysis capacity, but unfortunately there did remain a slight offset in the GPS data, for each footfall. This can be rectified by the use of on-board GPS with the IMUs or force sensing insoles, which will lead the to the GPS being time synced with either the IMU or the GPS data. 127 The BD-LSTM shows great promise in its ability to estimate running GRF waveforms. However, the algorithm appears to have limited transferability to a novel participant, as evidenced by the inferior performance in the estimation of kinetic variables, especially at faster running velocities (due to the limited data available for model input at higher running velocities). Performance might be improved upon by the addition of more data at faster running velocities. In conclusion, while this study has limitations in its transferability to biomechanical analysis of running in the real world, it is the first study to our knowledge to make kinetic measures during an uncontrolled run outside of the laboratory. Further, this is the first study to estimate IMU contact times and validate them against a kinetic standard, as well as estimate kinetic waveforms via machine learning. We have shown that the estimation of gait events and contact time using IMU data matches the estimation of a machine learning algorithm. Future studies focusing on building models for training load, single participant machine learning models, and the inclusion of GPS data into the input of the machine learning algorithm may reduce the underestimations of the stance phase GRFs at faster running velocities. 128 Figure 7.6: Kinetic variables across the range of running velocities, with level ground, decline and incline foot contacts shown. 129 Figure 7.7: Stance Average GRF. Linear regression and Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of the regression line are presented in panels A-D. The Bland-Altman plots present differences between the estimated and measured stance average GRF. The average difference and the 95% LoA are shown in panels E-H. 130 Figure 7.8: Peak Force. Linear regression and Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of the regression line are presented in panels A-D. The Bland-Altman plots present differences between the estimated and measured stance average GRF. The average difference and the 95% LoA are shown in panels E-H. 131 Figure 7.9: Impulse. Linear regression and Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of the regression line are presented in panels A-D. The Bland-Altman plots present differences between the estimated and measured stance average GRF. The average difference and the 95% LoA are shown in panels E-H. 132 Figure 7.10: Average Loading Rate. Linear regression and Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of the regression line are presented in panels A-D. The Bland-Altman plots present differences between the estimated and measured stance average GRF. The average difference and the 95% LoA are shown in panels E-H. 133 CHAPTER VIII CONCLUSION Summary of Results and Findings This dissertation set out to develop validated methodologies for the identification of gait events and estimation of GRF waveforms from wearable sensors across a range of locomotion modes and running velocities from data collected outside of the laboratory. Motivation for this work was the potential of wearable sensors and machine learning to explore whole new areas of gait biomechanics outside of the laboratory. We first validated a supervised machine learning model for the identification of features occurring prior to gait events from motion capture data collected on a treadmill (Chapter III). Then an unsupervised machine learning model and a heuristic algorithm were implemented with inertial and GRF data collected outside of the laboratory for feature identification prior to gait events (Chapter IV). The dissertation then shifted to identification of gait events and estimation of GRF waveforms during running performance in a semi- uncontrolled and uncontrolled environment. In Chapter V, we developed an algorithm for the identification of gait events in a semi-uncontrolled environment from IMUs on the foot and on the sacrum and validated the results against force sensing insoles. In Chapter VI we implemented an LSTM for the estimation of GRF waveforms. We were able to estimate gait events, contact time, and calculated discrete GRF variables form the output waveform. Finally, in Chapter VII we applied the techniques developed in the previous chapters to estimate gait events from the IMUs with heuristic rules, and estimate GRF waveforms from data collected in a free running environment. 134 Chapter III demonstrated an in-laboratory validation of a supervised BP-AR- HMM for the identification of consistent features prior to gait events across a number of steady state walking and running velocities as well as the dynamic transitions between them. We have shown the feature output of the BP-AR-HMM can be used to identify when locomotion transitions may occur. We utilized minimal sensor data from an idealized accelerometer for the identification of gait events prior to their occurrence. The use of minimally sampled data would greatly reduce the computational load for identification of features prior to gait events collected over long periods time. This is a crucial aspect of this work, as minimizing the computational load will increase the battery life of an assistive device. The success of this algorithm and the methods developed are a solid foundation for translation of this work outside of the laboratory. Chapter IV demonstrated the efficacy of feature identification prior to gait events on an outdoor course with both a heuristic and an unsupervised machine learning algorithm. Two major methodologies were developed in this chapter, first the introduction of a heuristic algorithm to identify features prior to a gait event, and the collection of data outside of the laboratory with naturally occurring transitions between locomotion modes. This work was based on the input of a single stream of 25 Hz and 100 Hz angular velocity data across locomotion modes and transitions between. These models were at least as accurate as previous methodologies, while using minimally sampled data from a single source. We believe this work could be the basis for a near real-time system for participant independent identification of gait events across locomotion modes and velocities with a heuristic or an unsupervised machine learning algorithm. 135 Chapter V validated identification of IC and TO during running gait. The methodologies from Chapter IV were applied to the sports performance setting, as we implemented and synthesized ideas from the previous chapters for the validation of the identification of gait events and contact times from IMUs and compared the results to the standard of a force insole. We have shown the use of foot mounted IMUs for identification of foot contacts is superior to that of the sacral mounted IMUs. Results from this study suggest we will be able to estimate gait events from data collected in a semi-uncontrolled running environment across a range of locomotion velocities and participant skill levels. Chapter VI detailed the use of a machine learning algorithm for the estimation of GRF waveforms. The data presented from this study were outputs from a LOOCV, to provide a conservative result of this methodology. The estimated GRF waveform for the calculation of contact time was strongly correlated to the measured contact times. The calculated discrete variables generally matched the trends in the measured data with a bias for underestimation of the variables. The analysis of running performance in a semi- uncontrolled environment provided a strong enough basis for the application of this work into a real-world environment. Chapter VII describes the collection and analysis of gait events and GRF variables while running in a free running environment. The gait event detection algorithm was modified slightly to include features from the angular velocity data for the identification of pre-IC, from Chapter IV. We evaluated multiple temporal windows durations for the estimation of GRF waveforms and identified a 1 second window. Estimated contact times from the BD-LSTM were similar to those estimated from the IMUs on both feet. There 136 was significant underestimation of GRF waveforms. The results of this study provide insight into the measurement and analysis of GRF waveforms during running performance in an uncontrolled environment. In summary, the methods and results of these studies show the potential of wearable sensors and machine learning for the analysis of biomechanics of individuals outside of the laboratory. We have shown the accuracy of these models for the estimation of features prior to gait events, using data from a single sensor on the foot (Chapters III and IV) both in an out of the laboratory environment. We have identified gait events and estimated ground reaction forces in both a semi-uncontrolled and uncontrolled environment (Chapters IV, V, VI and VII). Features for prior knowledge of gait events were found to be crucial in Chapter VII during running in an uncontrolled environment. These methods can be used for the identification of gait events with minimal sensor data in assistive devices or for the monitoring of training load for runners with wearable sensors. Limitations One of the most significant limitations of this work is that all the calculations and estimations were made offline. Application of these algorithms for near real-time feedback would be advantageous for analysis of biomechanical data in the field (e.g. for clinicians, coaches and athletes). Specifically, the running data collected would be ideal for feedback for participants. Furthermore, models developed in this dissertation are not directly transferable to real world applications. However, they build a framework for the collection of biomechanical analysis outside of the laboratory with wearable sensors. 137 Errors in the synchronization of the sensors included: drift between the IMU clock and the force sensing insole clock, these led to larger differences in the estimation of gait events. These errors resulted in zeros being added or removed during swing the phase from the force sensing insole data. These errors were present specifically in Chapters IV and V as there were no adjustments made for the initial gait event detection algorithms. We did not want to artificially decrease the error in the results of this work by removing data due to imperfect synchronization or adjusting the data for the identification of gait events, therefore we did not match the ICs, but maintained a maximum difference of 0.002s. This in turn led to small cumulative errors between the measured and estimated gait events. One limitation unique to Chapter III is that data from motion capture markers were not directly comparable to measured acceleration from an IMU. The calculated accelerations were similar to the output of a Kalman filter with a zero-velocity update with gravity removed from the vertical direction of the filtered acceleration signal. Therefore, the models developed here are not directly applicable to any clinical work. However, implementation of the BP-AR-HMM for the identification of features prior to gait events were successful and therefore used in the next chapter. One limitations of Chapter IV is that we did not complete any real-time testing for gait event detection, or the implementation of these algorithms into a control system. This limits the clinical applicability of the work, as it remains unknown how real-time implementation would affect the performance of the BP-AR-HMM. Additionally, these data were collected on a set course and not in an unconstrained environment, therefore, we do not know how applicable these models are for real world biomechanical analysis. 138 A limitation of Chapter V is that we chose to include a single participant’s data that heavily influenced the analysis. Removal of this participant from the data set reduced the bias of the estimation of contact time from an overestimation of 4 ms to 1 ms. The algorithms in this chapter are not yet validated in a real-world training scenario. As participants were only asked to run at set paces on a level ground track, there were no changes in grade or terrain, which may affect accuracy of the identification of gait events. Additionally, there was a discrepancy between the internal clocks of the IMUs and the force sensing insoles. We chose not to adjust these discrepancies as we didn’t want to artificially lower the differences the estimated and measured gait events. Therefore, the identification of gait events in practice may perform better than indicated in this work. Limitations of Chapters VI and VII were in part due to the limited transferability of the machine learning algorithms for use in training or clinical scenarios. There is yet too much error for participant independent algorithms for the estimation of GRF waveforms to be applied in a real-world scenario. The models were biased to running velocities < 4.0 m s-1, therefore at faster running velocities, the BD-LSTM underestimated the GRF waveforms at these velocities. For the most accurate estimations of GRF waveforms, we would recommended at this point that the participant’s own data be included in the training of the algorithm or that participant-specific algorithms should be developed for the estimation of GRF waveforms. Additionally, approximately 500 footfalls were adjusted due to a drifting baseline. Improvements in the technology for the measurement of GRF data between the foot and the shoe are needed, specifically with an increase in sampling frequency for more effective measurement of loading rates, as well as more accurate measurement of foot contacts. Finally, there were slight errors in the 139 synchronization of the GPS to IMU and force data. This could be rectified with the utilization of an embedded GPS with the IMU. Recommendations for Future Work This dissertation provides a platform upon which to implement algorithms for the identification of gait events in an assistive device with minimal sensor data, and the application of these methods for the creation of a training load quantification algorithm for running coaches and athletes through the estimation of GRF waveforms. Additionally, we have laid the foundation for methods of data collection outside of the laboratory, with force sensing insoles, IMUs and GPS. Future work could explore the use of these algorithms in an onboard controller for an assistive device, first in a controlled environment such as the laboratory but also in an uncontrolled environment with the user freely moving with the device. The heuristic algorithm is simple to implement for identification of gait events for long term monitoring of clinical populations. This would be a necessary step to understand how those who are using assistive devices are interacting with them outside of the clinic. If we understand how the patient is moving throughout their daily life, targeted interventions from the clinic may help them interact more efficiently with their environment. From the second half of the dissertation, models for the quantification of training load could easily be derived from these data and improved upon with the addition of more data and the inclusion of more participants. Other work could explore methods for the implementation of participant-specific models, instead of the subject independent models developed in this work. There is evidence, yet unpublished, from similar work on 140 the treadmill in the laboratory showing the importance of development of participant- specific models for the quantification of training load. The use of GPS and IMU data for input into a machine learning algorithm may lead to better results for estimation of ground reaction force waveforms, as velocity has been shown to have a greater effect on discrete ground reaction force waveform calculations. There is a need to understand how novice runners perform and learn to run consistently. Long term monitoring of novice runners would inform the creation of new models for running feedback. These models of novice runners could be crucial in understanding the development of running related injuries. Future work could also explore the differences between running in the laboratory and running in a real-world environment with the force loading insoles and provide corrections to the measured force data to match exactly what we measure on the in-lab force instrumented treadmill. Lastly error state Kalman filters could be implemented for the tracking of the IMU trajectories in 3-D space, which would allow for the estimation of join kinematics and provide estimations of joint kinetic values with the inclusion of force sensor data. 141 REFERENCES CITED [1] B. J. Horsley et al., Does Site Matter? Impact of Inertial Measurement Unit Placement on the Validity and Reliability of Stride Variables During Running: A Systematic Review and Meta-analysis, vol. 51, no. 7. Springer International Publishing, 2021. [2] A. Ancillao, S. Tedesco, J. Barton, and B. O’flynn, “Indirect measurement of ground reaction forces and moments by means of wearable inertial sensors: A systematic review,” Sensors (Switzerland), vol. 18, no. 8, 2018. [3] I. Weygers, M. Kok, M. Konings, H. Hallez, H. De Vroey, and K. Claeys, “Inertial sensor-based lower limb joint kinematics: A methodological systematic review,” Sensors (Switzerland), vol. 20, no. 3, pp. 1–23, 2020. [4] D. A. Winter, “Kinematic and kinetic patterns in human gait: Variability and compensating effects,” Hum. Mov. Sci., vol. 3, no. 1–2, pp. 51–76, 1984. [5] R. Cavahna, GA; Saibene, F P; Margaria, “Mechanical work in running,” J. Appl. Physiol., vol. 19, no. 2, pp. 249–256, 1964. [6] M. H. Dickinson, C. T. Farley, R. J. Full, M. A. R. Koehl, R. Kram, and S. Lehman, “How Animals Move: An Integrative View,” Science (80-. )., vol. 288, no. 5463, pp. 100–106, Apr. 2000. [7] S. Sprager and M. B. Juric, Inertial sensor-based gait recognition: A review, vol. 15, no. 9. 2015. [8] S. Chen, J. Lach, B. Lo, and G. Z. Yang, “Toward Pervasive Gait Analysis With Wearable Sensors: A Systematic Review,” IEEE J. Biomed. Heal. Informatics, vol. 20, no. 6, pp. 1521–1537, 2016. [9] A. Peebles, L. Maguire, K. Renner, and R. Queen, “Validity and Repeatability of Single-Sensor Loadsol Insoles during Landing,” Sensors, vol. 18, no. 12, p. 4082, 2018. [10] K. E. Renner, D. S. Blaise Williams, and R. M. Queen, “The reliability and validity of the Loadsol® under various walking and running conditions,” Sensors (Switzerland), vol. 19, no. 2, pp. 1–14, 2019. [11] I. P. I. Pappas, T. Keller, S. Mangold, M. R. Popovic, V. Dietz, and M. Morari, “A Reliable Gyroscope-Based Gait-Phase Detection Sensor Embedded in a Shoe Insole,” IEEE Sens. J., vol. 4, no. 2, pp. 268–274, 2004. 142 [12] R. C. González, A. M. López, J. Rodriguez-Uría, D. Álvarez, and J. C. Alvarez, “Real-time gait event detection for normal subjects from lower trunk accelerations,” Gait Posture, vol. 31, no. 3, pp. 322–325, 2010. [13] S. Khandelwal and N. Wickström, “Evaluation of the performance of accelerometer-based gait event detection algorithms in different real-world scenarios using the MAREA gait database,” Gait Posture, vol. 51, pp. 84–90, 2017. [14] D. Joshi, B. H. Nakamura, and M. E. Hahn, “A novel approach for toe off estimation during locomotion and transitions on ramps and level ground,” IEEE J. Biomed. Heal. Informatics, vol. 20, no. 1, pp. 153–157, 2016. [15] N. C. Bejarano, E. Ambrosini, A. Pedrocchi, G. Ferrigno, M. Monticone, and S. Ferrante, “A novel adaptive, real-time algorithm to detect gait events from wearable sensors,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 3, pp. 413–422, 2015. [16] P. C. Formento, R. Acevedo, S. Ghoussayni, and D. Ewins, “Gait event detection during stair walking using a rate gyroscope,” Sensors (Switzerland), vol. 14, no. 3, pp. 5470–5485, 2014. [17] A. Mannini, V. Genovese, and A. M. Sabatini, “Online decoding of hidden markov models for gait event detection using foot-mounted gyroscopes,” IEEE J. Biomed. Heal. Informatics, vol. 18, no. 4, pp. 1122–1130, 2014. [18] A. Mannini and A. M. Sabatini, “Gait phase detection and discrimination between walking-jogging activities using hidden Markov models applied to foot motion data from a gyroscope,” Gait Posture, vol. 36, no. 4, pp. 657–661, 2012. [19] L. Ojeda and J. Borenstein, “Non-GPS Navigation for Security Personnel and First Responders,” J. Navig., vol. 60, no. 3, pp. 391–407, Sep. 2007. [20] A. M. Zaferiou et al., “Quantifying performance on an outdoor agility drill using foot-mounted inertial measurement units,” PLoS One, vol. 12, no. 11, pp. 1–15, 2017. [21] J. M. Jasiewicz et al., “Gait event detection using linear accelerometers or angular velocity transducers in able-bodied and spinal-cord injured individuals,” Gait Posture, vol. 24, no. 4, pp. 502–509, 2006. [22] J. D. Miller, M. S. Beazer, and M. E. Hahn, “Myoelectric walking mode classification for transtibial amputees,” IEEE Trans. Biomed. Eng., vol. 60, no. 10, pp. 2745–2750, 2013. 143 [23] H. L. Bartlett and M. Goldfarb, “A phase variable approach for IMU-based locomotion activity recognition,” IEEE Trans. Biomed. Eng., vol. 65, no. 6, pp. 1330–1338, 2018. [24] X. Zhang and H. Huang, “A real-time, practical sensor fault-tolerant module for robust EMG pattern recognition,” J. Neuroeng. Rehabil., vol. 12, no. 1, p. 18, 2015. [25] R. B. Woodward, J. A. Spanias, and L. J. Hargrove, “User intent prediction with a scaled conjugate gradient trained artificial neural network for lower limb amputees using a powered prosthesis,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, vol. 2016-Octob, pp. 6405–6408, 2016. [26] A. J. Young and L. J. Hargrove, “A Classification Method for User-Independent Intent Recognition for Transfemoral Amputees Using Powered Lower Limb Prostheses,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 24, no. 2, pp. 217–225, Feb. 2016. [27] D. Quintero, D. J. Lambert, D. J. Villarreal, and R. D. Gregg, “Real-Time continuous gait phase and speed estimation from a single sensor,” 1st Annu. IEEE Conf. Control Technol. Appl. CCTA 2017, vol. 2017-Janua, pp. 0–5, 2017. [28] D. J. Villarreal, H. A. Poonawala, and R. D. Gregg, “A Robust Parameterization of Human Gait Patterns Across Phase-Shifting Perturbations,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 3, pp. 265–278, 2017. [29] E. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “Bayesian nonparametric inference of switching dynamic linear models,” Signal Process. IEEE Trans., vol. 59, no. 4, pp. 1569–1585, 2011. [30] E. B. Fox, M. C. Hughes, E. B. Sudderth, and M. I. Jordan, “Joint modeling of multiple time series via the beta process with application to motion capture segmentation,” Ann. Appl. Stat., vol. 8, no. 3, pp. 1281–1313, 2014. [31] A. E. Kerdok, A. A. Biewener, T. A. McMahon, P. G. Weyand, and H. M. Herr, “Energetics and mechanics of human running on surfaces of different stiffnesses,” J. Appl. Physiol., vol. 92, no. 2, pp. 469–478, 2002. [32] D. P. Ferris, M. Louie, and C. T. Farley, “Running in the real world : adjusting leg stiffness for different surfaces,” no. January, 1998. [33] R. Kram, A. Domingo, and D. P. Ferris, “Effect of reduced gravity on the preferred walk-run transition speed.,” J. Exp. Biol., vol. 200, no. Pt 4, pp. 821–6, 1997. 144 [34] S. Kipp, A. M. Grabowski, and R. Kram, “What determines the metabolic cost of human running across a wide range of velocities?,” J. Exp. Biol., vol. 221, no. 18, 2018. [35] J. S. Gottschall and R. Kram, “Ground reaction forces during downhill and uphill running,” J. Biomech., vol. 38, no. 3, pp. 445–452, Mar. 2005. [36] C. S. Whiting, S. P. Allen, J. W. Brill, and R. Kram, “Steep (30°) uphill walking vs. running: COM movements, stride kinematics, and leg muscle excitations,” Eur. J. Appl. Physiol., vol. 120, no. 10, pp. 2147–2157, 2020. [37] A. S. Voloshina and D. P. Ferris, “Biomechanics and energetics of running on uneven terrain,” J. Exp. Biol., vol. 218, no. 5, pp. 711–719, 2015. [38] L. C. Benson, C. A. Clermont, E. Bošnjak, and R. Ferber, “The use of wearable devices for walking and running gait analysis outside of the lab: A systematic review,” Gait Posture, vol. 63, no. April, pp. 124–138, 2018. [39] J. Reenalda, E. Maartens, L. Homan, and J. H. (Jaap. Buurke, “Continuous three dimensional analysis of running mechanics during a marathon by means of inertial magnetic measurement units to objectify changes in running mechanics,” J. Biomech., vol. 49, no. 14, pp. 3362–3367, 2016. [40] L. C. Benson, C. A. Clermont, R. Watari, T. Exley, and R. Ferber, “Automated accelerometer-based gait event detection during multiple running conditions,” Sensors (Switzerland), vol. 19, no. 7, pp. 1–19, 2019. [41] S. Mo and D. H. K. Chow, “Accuracy of three methods in gait event detection during overground running,” Gait Posture, vol. 59, no. June 2017, pp. 93–98, 2018. [42] D. K. Chew, K. J. H. Ngoh, D. Gouwanda, and A. A. Gopalai, “Estimating running spatial and temporal parameters using an inertial sensor,” Sport. Eng., vol. 21, no. 2, pp. 115–122, 2018. [43] R. S. Alcantara, E. M. Day, M. E. Hahn, and A. M. Grabowski, “Sacral acceleration can predict whole-body kinetics and stride kinematics across running speeds,” PeerJ, vol. 9, pp. 1–18, 2021. [44] E. M. Day, R. S. Alcantara, M. A. McGeehan, A. M. Grabowski, and M. E. Hahn, “Low-pass filter cutoff frequency affects sacral-mounted inertial measurement unit estimations of peak vertical ground reaction force and contact time during treadmill running.,” J. Biomech., vol. 119, p. 110323, Apr. 2021. 145 [45] R. S. Alcantara, W. B. Edwards, G. Y. Millet, and A. M. Grabowski, “Predicting continuous ground reaction forces from accelerometers during uphill and downhill running: a recurrent neural network solution,” PeerJ, vol. 10, p. e12752, 2022. [46] W. R. Johnson, A. Mian, M. A. Robinson, J. Verheul, D. G. Lloyd, and J. A. Alderson, “Multidimensional Ground Reaction Forces and Moments from Wearable Sensor Accelerations via Deep Learning,” IEEE Trans. Biomed. Eng., vol. 68, no. 1, pp. 289–297, 2021. [47] F. J. Wouda et al., “Estimation of Vertical Ground Reaction Forces and Sagittal Knee Kinematics During Running Using Three Inertial Sensors,” Front. Physiol., vol. 9, pp. 1–14, Mar. 2018. [48] R. Watari, B. Hettinga, S. Osis, and R. Ferber, “Validation of a torso-mounted accelerometer for measures of vertical oscillation and ground contact time during treadmill running,” J. Appl. Biomech., vol. 32, no. 3, pp. 306–310, 2016. [49] J. Sinclair, S. J. Hobbs, L. Protheroe, C. J. Edmundson, and A. Greenhalgh, “Determination of gait events using an externally mounted shank accelerometer,” J. Appl. Biomech., vol. 29, no. 1, pp. 118–122, 2013. [50] P. Robberechts et al., “Predicting gait events from tibial acceleration in rearfoot running: A structured machine learning approach,” Gait Posture, vol. 84, pp. 87– 92, 2021. [51] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997. [52] E. Dorschky, M. Nitschke, C. F. Martindale, A. J. van den Bogert, A. D. Koelewijn, and B. M. Eskofier, “CNN-Based Estimation of Sagittal Plane Walking and Running Biomechanics From Measured and Simulated Inertial Sensor Data,” Front. Bioeng. Biotechnol., vol. 8, no. June, pp. 1–14, 2020. [53] L. Jin, “Kinematic and Kinetic Analysis of Walking and Running Across Speeds and Transitions Between Locomotion States,” University of Oregon, 2018. [54] L. V. Ojeda et al., “Estimating stair running performance using inertial sensors,” Sensors (Switzerland), vol. 17, no. 11, pp. 1–14, 2017. [55] M. Kok, J. D. Hol, and T. B. Schön, “Using Inertial Sensors for Position and Orientation Estimation,” 2017. [56] A. D. Bull and M. L. Oct, “Convergence rates of efficient global optimization algorithms,” vol. 26, pp. 1–30. 146 [57] M. A. Gelbart, J. Snoek, and R. P. Adams, “Bayesian Optimization with Unknown Constraints,” pp. 1–14. [58] T. Ueda, H. Hobara, Y. Kobayashi, T. A. Heldoorn, M. Mochimaru, and H. Mizoguchi, “Comparison of 3 Methods for Computing Loading Rate during Running,” Int. J. Sports Med., vol. 37, no. 13, pp. 1087–8090, 2016. [59] S. K. Au, J. Weber, and H. Herr, “Powered ankle-foot prosthesis improves walking metabolic economy,” IEEE Trans. Robot., vol. 25, no. 1, pp. 51–66, 2009. [60] M. Grimmer et al., “A powered prosthetic ankle joint for walking and running,” Biomed. Eng. Online, vol. 15, no. S3, p. 141, 2016. [61] B. E. Lawson, H. A. Varol, A. Huff, E. Erdemir, and M. Goldfarb, “Control of stair ascent and descent with a powered transfemoral prosthesis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 21, no. 3, pp. 466–473, 2013. [62] A. D. Segal et al., “The effects of a controlled energy storage and return prototype prosthetic foot on transtibial amputee ambulation,” Hum. Mov. Sci., vol. 31, no. 4, pp. 918–931, 2012. [63] R. D. Bellman, M. A. Holgate, and T. G. Sugar, “SPARKy 3: Design of an active robotic ankle prosthesis with two actuated degrees of freedom using regenerative kinetics,” Proc. 2nd Bienn. IEEE/RAS-EMBS Int. Conf. Biomed. Robot. Biomechatronics, BioRob 2008, pp. 511–516, 2008. [64] B. H. Nakamura and M. E. Hahn, “Myoelectric Activation Pattern Changes in the Involved Limb of Individuals With Transtibial Amputation During Locomotor State Transitions,” Arch. Phys. Med. Rehabil., vol. 98, no. 6, pp. 1180–1186, 2017. [65] M. Hanlon and R. Anderson, “Real-time gait event detection using wearable sensors,” Gait Posture, vol. 30, no. 4, pp. 523–527, 2009. [66] G. Pacini Panebianco, M. C. Bisi, R. Stagni, and S. Fantozzi, “Analysis of the performance of 17 algorithms from a systematic review: Influence of sensor position, analysed variable and computational approach in gait timing estimation from IMU measurements,” Gait Posture, vol. 66, no. April, pp. 76–82, 2018. [67] E. Guenterberg, A. Y. Yang, H. Ghasemzadeh, R. Jafari, R. Bajcsy, and S. S. Sastry, “A method for extracting temporal parameters based on hidden markov models in body sensor networks with inertial sensors,” IEEE Trans. Inf. Technol. Biomed., vol. 13, no. 6, pp. 1019–1030, 2009. [68] L. H. Smith, T. A. Kuiken, and L. J. Hargrove, “Use of probabilistic weights to enhance linear regression myoelectric control,” J. Neural Eng., vol. 12, no. 6, p. 066030, 2015. 147 [69] L. J. Hargrove, E. J. Scheme, K. B. Englehart, and B. S. Hudgins, “Multiple binary classifications via linear discriminant analysis for improved controllability of a powered prosthesis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 18, no. 1, pp. 49–57, 2010. [70] A. Miller, “Gait event detection using a multilayer neural network,” Gait Posture, vol. 29, no. 4, pp. 542–545, 2009. [71] J. Bae and M. Tomizuka, “Gait phase analysis based on a Hidden Markov Model,” Mechatronics, vol. 21, no. 6, pp. 961–970, 2011. [72] A. J. Young, A. M. Simon, N. P. Fey, and L. J. Hargrove, “Intent recognition in a powered lower limb prosthesis using time history information,” Ann. Biomed. Eng., vol. 42, no. 3, pp. 631–641, 2014. [73] J. Y. Jung, W. Heo, H. Yang, and H. Park, “A neural network-based gait phase classification method using sensors equipped on lower limb exoskeleton robots,” Sensors (Switzerland), vol. 15, no. 11, pp. 27738–27759, 2015. [74] H. Huang, F. Zhang, L. J. Hargrove, Z. Dou, D. R. Rogers, and K. B. Englehart, “Continuous locomotion-mode identification for prosthetic legs based on neuromuscular - Mechanical fusion,” IEEE Trans. Biomed. Eng., vol. 58, no. 10 PART 1, pp. 2867–2875, 2011. [75] P. Zhou, M. Lowery, and K. B. Englehart, “Decoding a new neural machine interface for control of artificial limbs,” J. Neurophysiol., vol. 98, pp. 2974–2982, 2007. [76] A. J. Young, A. M. Simon, N. P. Fey, and L. J. Hargrove, “Classifying the intent of novel users during human locomotion using powered lower limb prostheses,” Int. IEEE/EMBS Conf. Neural Eng. NER, pp. 311–314, 2013. [77] A. J. Young, A. Simon, and L. J. Hargrove, “An intent recognition strategy for transfemoral amputee ambulation across different locomotion modes,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, pp. 1587–1590, 2013. [78] L. Kilmartin, R. K. Ibrahim, E. Ambikairajah, and B. Celler, “Optimising recognition rates for subject independent gait pattern classification,” IET Irish Signals Syst. Conf., pp. 1–6, 2009. [79] B. Purcell, J. Channells, D. James, and R. Barrett, “Use of accelerometers for detecting foot-ground contact time during running,” in BioMEMS and Nanotechnology II, 2006, vol. 6036, no. June 2014, p. 603615. 148 [80] J. Leitch, J. Stebbins, G. Paolini, and A. B. Zavatsky, “Identifying gait events without a force plate during running: A comparison of methods,” Gait Posture, vol. 33, no. 1, pp. 130–132, 2011. [81] P. C. Formento, R. Acevedo, S. Ghoussayni, and D. Ewins, “Gait Event Detection during Stair Walking Using a Rate Gyroscope,” pp. 5470–5485, 2014. [82] M. T. Farrell and H. Herr, “A method to determine the optimal features for control of a powered lower-limb prostheses,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, pp. 6041–6046, 2011. [83] J. K. Hitt et al., “Bionic Running for Unilateral Transtibial Military Amputees,” 27th Army Sci. Conf., p. 8, 2010. [84] G. Cavagna, F. P. Saibene, and R. Margaria, “External work in walking.,” J. Appl. Physiol., vol. 18, no. 3, pp. 1–9, 1963. [85] D. P. Ferris, M. Louie, and C. T. Farley, “Running in the real world: Adjusting leg stiffness for different surfaces,” Proc. R. Soc. B Biol. Sci., vol. 265, no. 1400, pp. 989–994, 1998. [86] H. Geyer, A. Seyfarth, and R. Blickhan, “Compliant leg behaviour explains basic dynamics of walking and running,” Proc. R. Soc. B Biol. Sci., vol. 273, no. 1603, pp. 2861–2867, 2006. [87] L. Jin and M. E. Hahn, “Modulation of lower extremity joint stiffness, work and power at different walking and running speeds,” Hum. Mov. Sci., vol. 58, no. January, pp. 1–9, 2018. [88] R. Blickhan, “The spring-mass model for running and hopping,” Journal of Biomechanics, vol. 22, no. 11–12. pp. 1217–1227, 1989. [89] E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “Sharing Features among Dynamical Systems with Beta Processes,” Adv. Neural Inf. Process. Syst. 23 24th Annu. Conf. Neural Inf. Process. Syst. 2010., pp. 388–396, 2009. [90] E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “An HDP-HMM for systems with state persistence,” Proc. 25th Int. Conf. Mach. Learn. - ICML ’08, pp. 312–319, 2008. [91] D. Joshi and M. E. Hahn, “Terrain and Direction Classification of Locomotion Transitions Using Neuromuscular and Mechanical Input,” Ann. Biomed. Eng., vol. 44, no. 4, pp. 1275–1284, 2016. 149 [92] J. A. Spanias, A. M. Simon, K. A. Ingraham, and L. J. Hargrove, “Effect of additional mechanical sensor data on an EMG-based pattern recognition system for a powered leg prosthesis,” Int. IEEE/EMBS Conf. Neural Eng. NER, vol. 2015- July, pp. 639–642, 2015. [93] J. Taborri, E. Palermo, S. Rossi, and P. Cappa, “Gait partitioning methods: A systematic review,” Sensors (Switzerland), vol. 16, no. 1, pp. 40–42, 2016. [94] B. D. Robertson and G. S. Sawicki, “Exploiting elasticity: Modeling the influence of neural control on mechanics and energetics of ankle muscle-tendons during human hopping,” J. Theor. Biol., vol. 353, pp. 121–132, 2014. [95] N. C. Bejarano, E. Ambrosini, A. Pedrocchi, G. Ferrigno, M. Monticone, and S. Ferrante, “A novel adaptive, real-time algorithm to detect gait events from wearable sensors,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 3, pp. 413–422, 2015. [96] R. Williamson and B. J. Andrews, “Gait event detection for FES using accelerometers and supervised machine learning,” IEEE Trans. Rehabil. Eng., vol. 8, no. 3, pp. 312–319, 2000. [97] J. A. Spanias, A. M. Simon, E. J. Perreault, and L. J. Hargrove, “Preliminary results for an adaptive pattern recognition system for novel users using a powered lower limb prosthesis,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, vol. 2016-Octob, pp. 5083–5086, 2016. [98] E. Sejdic, K. A. Lowry, J. Bellanca, M. S. Redfern, and J. S. Brach, “A comprehensive assessment of gait accelerometry signals in time, frequency and time-frequency domains,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 3, pp. 603–612, 2014. [99] M. A. Azhar, D. Gouwanda, and A. A. Gopalai, “Development of an intelligent real-time heuristic-based algorithm to identify human gait events,” 2014 IEEE- EMBS Int. Conf. Biomed. Heal. Informatics, BHI 2014, pp. 573–576, 2014. [100] H. Huang, F. Zhang, Y. L. Sun, and H. He, “Design of a robust EMG sensing interface for pattern classification,” J. Neural Eng., vol. 7, no. 5, p. 056005, 2010. [101] R. Caldas, M. Mundt, W. Potthast, F. Buarque de Lima Neto, and B. Markert, “A systematic review of gait analysis methods based on inertial sensors and adaptive algorithms,” Gait Posture, vol. 57, no. February, pp. 204–210, 2017. [102] F. Di Nardo, C. Morbidoni, G. Mascia, F. Verdini, and S. Fioretti, “Intra-subject approach for gait-event prediction by neural network interpretation of EMG signals,” Biomed. Eng. Online, vol. 19, no. 1, pp. 1–20, 2020. 150 [103] C. Morbidoni, A. Cucchiarelli, S. Fioretti, and F. Di Nardo, “A deep learning approach to EMG-based classification of gait phases during level ground walking,” Electron., vol. 8, no. 8, 2019. [104] N. Nazmi, M. A. Abdul Rahman, S. I. Yamamoto, and S. A. Ahmad, “Walking gait event detection based on electromyography signals using artificial neural network,” Biomed. Signal Process. Control, vol. 47, pp. 334–343, 2019. [105] S. Hotta, A. Inomata, Y. Sasamoto, S. Washizawa, and B. Caulfield, “Unsupervised gait detection using biomechanical restrictions,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, pp. 4009–4013, 2017. [106] M. Yuwono, S. W. Su, Y. Guo, B. D. Moulton, and H. T. Nguyen, “Unsupervised nonparametric method for gait analysis using a waist-worn inertial sensor,” Appl. Soft Comput. J., vol. 14, no. PART A, pp. 72–80, 2014. [107] V. Agostini, G. Balestra, and M. Knaflitz, “Segmentation and classification of gait cycles,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 5, pp. 946–952, 2014. [108] U. Martinez-Hernandez, A. Rubio-Solis, G. Panoutsos, and A. A. Dehghani-Sanij, “A combined Adaptive Neuro-Fuzzy and Bayesian strategy for recognition and prediction of gait events using wearable sensors,” IEEE Int. Conf. Fuzzy Syst., 2017. [109] S. Khandelwal and N. Wickström, “Gait Event Detection in Real-World Environment for Long-Term Applications: Incorporating Domain Knowledge Into Time-Frequency Analysis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 24, no. 12, pp. 1363–1372, 2016. [110] P. Catalfamo, S. Ghoussayni, and D. Ewins, “Gait event detection on level ground and incline walking using a rate gyroscope,” Sensors, vol. 10, no. 6, pp. 5683– 5702, 2010. [111] J. Figueiredo, P. Félix, L. Costa, J. C. Moreno, and C. P. Santos, “Gait Event Detection in Controlled and Real-Life Situations: Repeated Measures from Healthy Subjects,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 10, pp. 1945–1956, 2018. [112] J. C. Pérez-Ibarra, H. Williams, A. A. G. Siqueira, and H. I. Krebs, “Real-Time Identification of Impaired Gait Phases Using a Single Foot-Mounted Inertial Sensor: Review and Feasibility Study,” Proc. IEEE RAS EMBS Int. Conf. Biomed. Robot. Biomechatronics, vol. 2018-Augus, pp. 1157–1162, 2018. [113] S. A.M., M. C., S. S., and C. F., “Assessment of walking features from foot inertial sensing,” IEEE Trans. Biomed. Eng., vol. 52, no. 3, pp. 486–494, 2005. 151 [114] L. Yan, T. Zhen, J. L. Kong, L. M. Wang, and X. L. Zhou, “Walking gait phase detection based on acceleration signals using voting-weighted integrated neural network,” Complexity, vol. 2020, 2020. [115] Ł. Kidziński, S. Delp, and M. Schwartz, “Automatic real-time gait event detection in children using deep neural networks,” PLoS One, vol. 14, no. 1, pp. 1–11, 2019. [116] He Huang, T. A. Kuiken, and R. D. Lipschutz, “A Strategy for Identifying Locomotion Modes Using Surface Electromyography,” IEEE Trans. Biomed. Eng., vol. 56, no. 1, pp. 65–73, Jan. 2009. [117] L. L. Long and M. Srinivasan, “Walking, running, and resting under time, distance, and average speed constraints: optimality of walk-run-rest mixtures.,” J. R. Soc. Interface, vol. 10, no. 81, Apr. 2013. [118] P. J. Green and B. S. Tw, “Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination,” Biometrika, vol. 82, no. 4, pp. 711–732, 1995. [119] U. Martinez-Hernandez and A. A. Dehghani-Sanij, “Adaptive Bayesian inference system for recognition of walking activities and prediction of gait events using wearable sensors,” Neural Networks, vol. 102, pp. 107–119, 2018. [120] H. F. Maqbool, M. A. Bin Husman, M. I. Awad, A. Abouhossein, N. Iqbal, and A. A. Dehghani-Sanij, “A Real-Time Gait Event Detection for Lower Limb Prosthesis Control and Evaluation,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 9, pp. 1–1, 2016. [121] M. Norris, R. Anderson, and I. C. Kenny, “Method analysis of accelerometers and gyroscopes in running gait: A systematic review,” Proc. Inst. Mech. Eng. Part P J. Sport. Eng. Technol., vol. 228, no. 1, pp. 3–15, 2014. [122] M. R. Paquette, C. Napier, R. W. Willy, and T. Stellingwerff, “Moving Beyond Weekly ‘Distance’: Optimizing Quantification of Training Load in Runners,” J. Orthop. Sport. Phys. Ther., vol. 0, no. 0, pp. 1–20, Aug. 2020. [123] C. E. Milner and M. R. Paquette, “A kinematic method to detect foot contact during running for all foot strike patterns,” J. Biomech., vol. 48, no. 12, pp. 3502– 3505, 2015. [124] D. A. Winter, “Kinematic and kinetic patterns in human gait: Variability and compensating effects,” Hum. Mov. Sci., vol. 3, no. 1–2, pp. 51–76, Mar. 1984. [125] H. Prasanth et al., “Wearable sensor-based real-time gait detection: A systematic review,” Sensors, vol. 21, no. 8, pp. 1–28, 2021. 152 [126] K. G. Aubol and C. E. Milner, “Foot contact identification using a single triaxial accelerometer during running,” J. Biomech., vol. 105, p. 109768, 2020. [127] M. Zrenner, A. Küderle, N. Roth, U. Jensen, B. Dümler, and B. M. Eskofier, “Does the position of foot-mounted imu sensors influence the accuracy of spatio- temporal parameters in endurance running?,” Sensors (Switzerland), vol. 20, no. 19, pp. 1–21, 2020. [128] C. Strohrmann, H. Harms, C. Kappeler-Setz, and G. Troster, “Monitoring Kinematic Changes With Fatigue in Running Using Body-Worn Sensors,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 5, pp. 983–990, Sep. 2012. [129] J. R. Rebula, L. V. Ojeda, P. G. Adamczyk, and A. D. Kuo, “Measurement of foot placement and its variability with inertial sensors,” Gait Posture, vol. 38, no. 4, pp. 974–980, 2013. [130] C. A. Clermont, L. C. Benson, W. B. Edwards, B. A. Hettinga, and R. Ferber, “New Considerations for Wearable Technology Data: Changes in Running Biomechanics During a Marathon,” J. Appl. Biomech., vol. 35, no. 6, pp. 401–409, Dec. 2019. [131] L. C. Benson, C. A. Clermont, and R. Ferber, “New Considerations for Collecting Biomechanical Data Using Wearable Sensors: The Effect of Different Running Environments,” Front. Bioeng. Biotechnol., vol. 8, no. February, pp. 1–8, 2020. [132] H. Lee, “Task-dependent modulation of multi-dimensional human ankle stiffness,” in Powered Prostheses, Elsevier, 2020, pp. 41–56. [133] J. B. Lee, R. B. Mellifont, and B. J. Burkett, “The use of a single inertial sensor to identify stride, step, and stance durations of running gait,” J. Sci. Med. Sport, vol. 13, no. 2, pp. 270–273, 2010. [134] M. Giandolini, N. Horvais, J. Rossi, G. Y. Millet, P. Samozino, and J. B. Morin, “Foot strike pattern differently affects the axial and transverse components of shock acceleration and attenuation in downhill trail running,” J. Biomech., vol. 49, no. 9, pp. 1765–1771, 2016. [135] B. Auvinet, E. Gloria, G. Renault, and E. Barrey, “Runner’s stride analysis: Comparison of kinematic and kinetic analyses under field conditions,” Sci. Sport., vol. 17, no. 2, pp. 92–94, 2002. [136] A. J. Wixted, D. C. Billing, and D. A. James, “Validation of trunk mounted inertial sensors for analysing running biomechanics under field conditions, using synchronously collected foot contact data,” Sport. Eng., vol. 12, no. 4, pp. 207– 212, 2010. 153 [137] P. G. Weyand, R. F. Sandell, D. N. L. Prime, and M. W. Bundle, “The biological limits to running speed are imposed from the ground up,” J. Appl. Physiol., vol. 108, no. 4, pp. 950–961, 2010. [138] R. K. Fukuchi, C. A. Fukuchi, and M. Duarte, “A public dataset of running biomechanics and the effects of running speed on lower extremity kinematics and kinetics,” PeerJ, vol. 2017, no. 5, p. 3298, 2017. [139] C. A. Clermont, L. C. Benson, S. T. Osis, D. Kobsar, and R. Ferber, “Running patterns for male and female competitive and recreational runners based on accelerometer data,” J. Sports Sci., vol. 37, no. 2, pp. 204–211, 2019. [140] L. Marotta, J. H. Buurke, B. J. F. van Beijnum, and J. Reenalda, “Towards machine learning-based detection of running-induced fatigue in real-world scenarios: Evaluation of IMU sensor configurations to reduce intrusiveness,” Sensors, vol. 21, no. 10, 2021. [141] M. R. Ryan, C. Napier, D. Greenwood, and M. R. Paquette, “Comparison of different measures to monitor week-to-week changes in training load in high school runners,” Int. J. Sport. Sci. Coach., vol. 16, no. 2, pp. 370–379, 2021. [142] W. B. Edwards, “Modeling Overuse Injuries in Sport as a Mechanical Fatigue Phenomenon,” Exerc. Sport Sci. Rev., vol. 46, no. 4, pp. 224–231, 2018. [143] E. S. Matijevich, L. R. Scott, P. Volgyesi, K. H. Derry, and K. E. Zelik, “Combining wearable sensor signals, machine learning and biomechanics to estimate tibial bone force and damage during running,” Hum. Mov. Sci., vol. 74, no. October, p. 102690, 2020. [144] B. Vanwanseele, T. Op De Beéck, K. Schütte, and J. Davis, “Accelerometer Based Data Can Provide a Better Estimate of Cumulative Load During Running Compared to GPS Based Parameters,” Front. Sport. Act. Living, vol. 2, no. October, pp. 1–7, 2020. [145] K. P. Clark and P. G. Weyand, “Are running speeds maximized with simple-spring stance mechanics?,” J. Appl. Physiol., vol. 117, no. 6, pp. 604–615, 2014. [146] S. R. Donahue, L. Jin, and M. E. Hahn, “User Independent Estimations of Gait Events With Minimal Sensor Data,” IEEE J. Biomed. Heal. Informatics, vol. 25, no. 5, pp. 1583–1590, May 2021. [147] S. R. Donahue and M. E. Hahn, “Feature Identification With a Heuristic Algorithm and an Unsupervised Machine Learning Algorithm for Prior Knowledge of Gait Events,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, no. Ic, pp. 108–114, 2022. 154 [148] P. Robberechts et al., “Predicting gait events from tibial acceleration in rearfoot running: A structured machine learning approach,” Gait Posture, vol. 84, no. October 2020, pp. 87–92, 2021. [149] M. Mundt et al., “Estimation of Gait Mechanics Based on Simulated and Measured IMU Data Using an Artificial Neural Network,” Front. Bioeng. Biotechnol., vol. 8, no. February, pp. 1–16, 2020. [150] C. R. Hollis, R. M. Koldenhoven, J. E. Resch, and J. Hertel, “Running biomechanics as measured by wearable sensors: effects of speed and surface,” Sport. Biomech., vol. 20, no. 5, pp. 521–531, 2021. [151] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997. [152] R. D. Gurchiek, N. Cheney, and R. S. McGinnis, “Estimating biomechanical time- series with wearable sensors: A systematic review of machine learning techniques,” Sensors (Switzerland), vol. 19, no. 23, 2019. [153] M. I. M. Refai, B. J. F. Van Beijnum, J. H. Buurke, and P. H. Veltink, “Portable Gait Lab: Estimating 3D GRF Using a Pelvis IMU in a Foot IMU Defined Frame,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 28, no. 6, pp. 1308–1316, 2020. [154] N. Neverova et al., “Learning Human Identity from Motion Patterns,” IEEE Access, vol. 4, pp. 1810–1820, 2016. [155] B. J. Snoek, H. Larochelle, and R. P. Adams, “Practical bayesian optimization of machine learning algorithms,” Adv. Neural Inf. Process. Syst., vol. 25, 2012. [156] C. M. O’Connor, S. K. Thorpe, M. J. O’Malley, and C. L. Vaughan, “Automatic detection of gait events using kinematic data,” Gait Posture, vol. 25, no. 3, pp. 469–474, 2007. [157] N. J. Nedergaard et al., “The feasibility of predicting ground reaction forces during running from a trunk accelerometry driven mass-spring-damper model,” PeerJ, vol. 2018, no. 12, pp. 1–17, 2018. [158] T. Keller, A. Weisberger, J. Ray, S. Hasan, R. Shiavi, and D. Spengler, “Relationship between vertical ground reaction force and speed during walking, slow jogging, and running,” Clin. Biomech., vol. 11, no. 5, pp. 253–259, Jul. 1996. [159] E. Halilaj, A. Rajagopal, M. Fiterau, J. L. Hicks, T. J. Hastie, and S. L. Delp, “Machine learning in human movement biomechanics: Best practices, common pitfalls, and new opportunities,” J. Biomech., vol. 81, pp. 1–11, 2018. 155 [160] D. Kiernan et al., “Accelerometer-based prediction of running injury in National Collegiate Athletic Association track athletes,” J. Biomech., vol. 73, pp. 201–209, 2018. [161] S. P. Messier et al., “A 2-Year Prospective Cohort Study of Overuse Running Injuries: The Runners and Injury Longitudinal Study (TRAILS),” Am. J. Sports Med., vol. 46, no. 9, pp. 2211–2221, 2018. [162] Y. S. Lee, C. S. Ho, Y. Shih, S. Y. Chang, F. J. Róbert, and T. Y. Shiang, “Assessment of walking, running, and jumping movement features by using the inertial measurement unit,” Gait Posture, vol. 41, no. 4, pp. 877–881, 2015. [163] V. Guimarães, I. Sousa, and M. V. Correia, “Orientation-invariant spatio-temporal gait analysis using foot-worn inertial sensors,” Sensors, vol. 21, no. 11, pp. 1–19, 2021. [164] M. Falbriard, F. Meyer, B. Mariani, G. P. Millet, and K. Aminian, “Drift-Free Foot Orientation Estimation in Running Using Wearable IMU,” Front. Bioeng. Biotechnol., vol. 8, no. February, pp. 1–11, 2020. [165] A. Mannini and A. M. Sabatini, “Gait phase detection and discrimination between walking-jogging activities using hidden Markov models applied to foot motion data from a gyroscope,” Gait Posture, vol. 36, no. 4, pp. 657–661, 2012. [166] D. S. Komaris et al., “Predicting Three-Dimensional Ground Reaction Forces in Running by Using Artificial Neural Networks and Lower Body Kinematics,” IEEE Access, vol. 7, pp. 156779–156786, 2019. [167] B. M. Nigg, R. W. De Boer, and V. Fisher, “A kinematic comparison of overground and treadmill running.,” Med. Sci. Sports Exerc., vol. 27, no. 1, pp. 98–105, Jan. 1995. [168] P. O. Riley, J. Dicharry, J. Franz, U. Della Croce, R. P. Wilder, and D. C. Kerrigan, “A kinematics and kinetic comparison of overground and treadmill running,” Med. Sci. Sports Exerc., vol. 40, no. 6, pp. 1093–1100, 2008. [169] A. Psarras, D. Mertyri, and P. Tsaklis, “Biomechanical Analysis of Ankle during the Stance Phase of Gait on Various Surfaces: A Literature Review,” Hum. Mov., vol. 17, no. 3, pp. 140–147, 2016. [170] B. C. Heiderscheit, E. S. Chumanov, M. P. Michalski, C. M. Wille, and M. B. Ryan, “Effects of step rate manipulation on joint mechanics during running,” Med. Sci. Sports Exerc., vol. 43, no. 2, pp. 296–302, 2011. 156 [171] M. Grimmer, K. Schmidt, J. E. Duarte, L. Neuner, G. Koginov, and R. Riener, “Stance and swing detection based on the angular velocity of lower limb segments during walking,” Front. Neurorobot., vol. 13, no. July, pp. 1–15, 2019. tw [172] C. Fadillioglu, B. J. Stetter, S. Ringhof, F. C. Krafft, S. Sell, and T. Stein, “Automated gait event detection for a variety of locomotion tasks using a novel gyroscope-based algorithm,” Gait Posture, vol. 81, no. November 2019, pp. 102– 108, 2020. 157