MACHINE LEARNING AND WEARABLE SENSORS FOR THE ESTIMATION 
OF BIOMECHANICAL VARIABLES OUTSIDE THE LABORATORY 
 
 
 
 
 
 
 
 
 
 
 
By 
 
SETH R. DONAHUE  
 
 
 
 
 
 
 
 
 
 
 
 
A DISSERTATION 
 
Presented to the Department of Human Physiology  
and the Division of Graduate Studies of the University of Oregon 
in partial fulfillment of the requirements 
for the degree of 
Doctor of Philosophy  
 
June 2022 
 
 
 
 
 
 
 
  
DISSERTATION APPROVAL PAGE 
 
Student: Seth R. Donahue 
 
Title: Machine Leaning and Wearable Sensors for the Estimation of Biomechanical 
Variables Outside the Laboratory 
 
This dissertation has been accepted and approved in partial fulfillment of the 
requirements for the Doctor of Philosophy degree in the Department of Human 
Physiology by: 
 
Michael Hahn, Ph.D. Chairperson 
Andrew Karduna, Ph.D.   Core Member 
Nicole Swann, Ph.D. Core Member 
Daniel Lowd, Ph.D. Institutional Representative 
 
and 
 
Krista M. Chronister, PhD Vice Provost for Graduate Studies  
 
Original approval signatures are on file with the University of Oregon Division of 
Graduate Studies  
 
Degree awarded June 2022 
ii 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© 2022 Seth R. Donahue  
 
This work is under a Creative Commons  
Attribution -NoDerivs (United States) License. 
 
  
iii 
  
DISSERTATION ABSTRACT 
 
Seth R. Donahue 
 
Doctor of Philosophy 
 
Department of Human Physiology 
 
June 2022 
 
Title: Machine Learning and Wearable Sensors for the Estimation of Biomechanical 
Variables Outside the Laboratory 
 
 
 
The miniaturization of sensors and their availability for biomechanical analysis 
outside of the laboratory has opened whole new areas of research. Wearable sensors have 
been developed to measure ground reaction forces, and inertial measurement units have 
been developed for the measurement of acceleration and angular velocity. The purpose of 
this dissertation was to develop methodologies for the measurement and estimation of 
biomechanical variables, outside of the laboratory. As these sensors can provide vast 
amounts of data, it is natural to leverage the strengths of machine learning models, which 
have been used to find patterns in large datasets to assist in the task of estimating 
biomechanical variables using wearable sensor data as input.  
This dissertation is divided into five distinct, but related projects all linked to the 
identification of gait events and machine learning applications for human locomotion 
data, both in and out of the laboratory. The first two projects were focused on 
identification of gait events and transitions between locomotion modes, while Projects 3 - 
5 were focused on gait event detection and estimation of biomechanical parameters 
during running outside the laboratory. Project 1: Validation of a supervised machine 
learning algorithm for steady state locomotion, and dynamic transitions between those 
iv 
  
locomotion modes. Project 2: Deployment of an unsupervised machine learning and 
heuristic gait event detection algorithms for the identification of gait events, across 
environmentally constrained and internally driven locomotion transitions 
Projects 3 - 5 resulted in the development of methodologies for biomechanical 
analysis.  We utilized both heuristic and machine learning methodologies for the 
estimation of biomechanical variables in these scenarios. Project 3: Estimation of gait 
events and contact times from inertial measures on the foot and the sacrum in a semi-
uncontrolled environment. Project 4: Implementation of a recurrent neural network for 
the estimation of whole ground reaction force waveforms and the calculation of discrete 
kinetic variables from these waveforms in a semi-uncontrolled environment. Project 5: 
Synthesis and application of the previous two chapters, gait event detection and 
estimation of ground reaction force waveforms on data collected in a real-world 
environment during a 5-mile free run.     
 
 
v 
  
CURRICULUM VITAE 
 
NAME OF AUTHOR:  Seth R. Donahue  
 
 
GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED: 
 
 University of Oregon, Eugene, OR 
 University of Montana, Missoula, MT 
  
 
DEGREES AWARDED: 
 
 Doctor of Philosophy, Human Physiology, 2022, University of Oregon 
 Master of Science, Health and Human Performance, 2017, University of Montana 
 Bachelor of Arts, Mathematics, 2015, University of Montana 
  
 
AREAS OF SPECIAL INTEREST: 
 
 Biomechanical analysis outside the laboratory 
 Wearable Sensors  
 Machine Learning  
 Locomotion Transitions 
 
 
PROFESSIONAL EXPERIENCE: 
 
Graduate Research Assistant, University of Oregon Dept. Human Physiology 
(Neuromechanics Laboratory), 2017-present 
 
Graduate Employee, University of Oregon Dept. Human Physiology, 2017-present 
 
Graduate Research Assistant, University of Montana Dept. Health and Human 
Performance (Biomechanics Laboratory), 2015-2017 
 
 
GRANTS, AWARDS, AND HONORS: 
 
Jan Broekhoff Fellowship, Department of Human Physiology, University of 
Oregon, 2021 
 
Jan Broekhoff Fellowship, Department of Human Physiology, University of 
Oregon, 2020 
 
Jan Broekhoff Fellowship, Department of Human Physiology, University of 
vi 
  
Oregon, 2019 
 
Jan Broekhoff Fellowship, Department of Human Physiology, University of 
Oregon, 2018 
 
Martin Sharkey Research Award, Department of Health and Human Performance, 
University of Montana, 2017 
 
 
 
PUBLICATIONS: 
 
S. R. Donahue, L. Jin, and M. E. Hahn, “User Independent Estimations of Gait 
Events With Minimal Sensor Data,” IEEE J. Biomed. Heal. Informatics, vol. 25, no. 5, 
pp. 1583–1590, May 2021. 
 
S. R. Donahue and M. E. Hahn, “Feature Identification With a Heuristic 
Algorithm and an Unsupervised Machine Learning Algorithm for Prior Knowledge of 
Gait Events,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, pp. 108–114, 2022. 
 
 
vii 
  
ACKNOWLEDGMENTS 
 
Thank you to Dr. Mike Hahn, my gratitude for your guidance, mentorship, and 
friendship throughout the past 5 years is unending. I have never felt that I would make it, 
through grad school.  You believed in me when I didn’t, you gave me opportunities I 
never thought I would be possible.  I am so indebted to you, not only as a mentor but as a 
friend as well.  You are the best mentor anyone could ask for, you give so much to your 
students, and push us to succeed professionally and personally. I will always look fondly 
back upon these years in the BSSC, these have been some of the best of my life.  Thank 
you.  
To my committee, thank you for your support and pivoting with me when 
necessary mid-dissertation.  Thank you for your guidance and feedback throughout this 
process.  Thank you for investing in me as a researcher, for your time, and willingness to 
provide feedback to me for this project. This project would not have been possible 
without your help. 
Thank you to the current and former members of the BSSC, you all were the best 
people possible to work with.  Dr. Katie Farina:  Thanks for being my friend and 
colleague, angsty music and inverse dynamics were the best.  To Dr. Mike McGeehan:  
Your level head and example always made me want to be a better person and researcher, 
thank you for inviting me to Thanksgiving back in 2018 with you and Kate.  To Dr. Evan 
Day:  Thank you for your questions about what I am doing and why I am doing it, still 
done know. Your snarky feedback always made my work better.  To Emily Karolidis: I 
am so glad Mike brought you on board, I remember that first chat we had thinking that 
you were going to get your Ph.D. someday, and here you are.  Thank you for all the 
viii 
  
laughs.  To Rachel Robinson: Having you on the longitudinal running study has been 
amazing, thank you for all your hard work in learning the tips and tricks of data 
collections outside the lab.  To Aida Chebbi: The work we have been doing has been 
amazing, your willingness to come in and learn a whole new skillset, work with this team 
and truly contribute to the group, thank you. 
To my wife, Matlyn:  You have also believed in me and encouraged me when I 
didn’t.  You have loved me through this process, thank you. These last few months have 
been crazy, thank you for your steadfast love and support while I have finished up this 
degree.  Thank you for listening to my crazy ideas, to my frustrations and joys throughout 
this process. These last two years would have been impossible if it wasn’t for you, I 
would have been lost if it weren’t for you keeping me grounded and focused on what 
matters. Thank you for sharing your life with me. To the rest of my family:  Thank you 
for your unwavering support of me chasing after my dreams, for the yearly trips to 
Oregon, all the chats, and the facetimes.  Thank you.  
Finally, thank you to the University of Oregon Human Physiology Department for 
the support in so many aspects of my time at UO. I cannot think of better people to have 
been surrounded by during my graduate studies.  
 
ix 
  
 
DEDICATION 
 
 
This dissertation is dedicated to my wife and family for their unconditional love, none of 
this could have happened without you. 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
x 
  
TABLE OF CONTENTS 
Chapter Page 
 
 
I. INTRODUCTION .................................................................................................... 1 
 Background and Significance ................................................................................ 1 
 Introduction ...................................................................................................... 1 
 Gait Event Identification .................................................................................. 2 
 Machine Learning Algorithms for Gait Event Identification .......................... 3 
 Running Outside the Laboratory ...................................................................... 4 
 General and Specific Aims .................................................................................... 7 
 Organization of the Dissertation ............................................................................ 9 
II. GENERAL METHODOLOGY .............................................................................. 12 
 Subjects .................................................................................................................. 12 
 Study Design and Experimental Protocol .............................................................. 14 
 Chapter III ........................................................................................................ 14 
 Chapter IV ........................................................................................................ 14 
 Chapters V - VI ................................................................................................ 15 
 Chapter VII ...................................................................................................... 15 
 Data Collection ...................................................................................................... 15 
 Chapter III ........................................................................................................ 15 
 Chapter IV ........................................................................................................ 16 
 Chapters V - VI ................................................................................................ 16 
 Data and Statistical Analysis ................................................................................. 16 
 Chapter III ........................................................................................................ 16 
xi 
  
Chapter Page 
 
 Chapter IV ........................................................................................................ 18 
 Chapters V ....................................................................................................... 18 
 Chapters VI ...................................................................................................... 19 
 Chapter VII ...................................................................................................... 23 
III. USER INDEPENDENT ESTIMATIONS OF GAIT EVENTS WITH 
MINIMAL SENSOR DATA  ...................................................................................... 26 
 Introduction ............................................................................................................ 26 
 Methods..................................................................................................................    29 
 Instrumentation and Data Collection ............................................................... 29 
 Model Overview .............................................................................................. 31 
 Traditional Protocols for Machine Learning....................................................    32 
 Analysis Methodology ..................................................................................... 33 
 Results .................................................................................................................... 34 
 Discussion .............................................................................................................. 39 
 Conclusion ............................................................................................................. 42 
 Bridge ..................................................................................................................... 44 
IV. FEATURE IDENTIFICATION WITH A HUERISTIC ALGORITHM 
AND AN UNSERVISED MACHINE LEARNING ALGORITHM FOR PRIOR   
KNOWLEDGE OF GAIT EVENTS ........................................................................... 45 
 Introduction ............................................................................................................ 45 
 Methods.................................................................................................................. 49 
 Results .................................................................................................................... 53 
 Discussion .............................................................................................................. 60 
xii 
  
Chapter Page 
 
 Conclusion ............................................................................................................. 63 
 Bridge ..................................................................................................................... 64 
V. VALIDATION OF RUNNING GAIT EVENT DETECTION  
ALGORITHM IN A SEMI-UNCONTROLLED ENVIRONMENT .......................... 65 
 Introduction ............................................................................................................ 65 
 Methods.................................................................................................................. 67 
 Data Processing ................................................................................................ 68 
 Algorithmic Descriptions ................................................................................. 70 
 Results .................................................................................................................... 70 
 Discussion .............................................................................................................. 75 
 Bridge ..................................................................................................................... 82 
VI.  ESTIMATION OF GROUND REACTION FORCE DURING  
FIXED PACE RUNS OUTSIDE THE LABORATORY ............................................ 83 
 Introduction ............................................................................................................ 83 
 Methods.................................................................................................................. 85 
 Data Processing ................................................................................................ 86 
 Machine Learning Algorithms ......................................................................... 87 
 Data Analysis ................................................................................................... 88 
 Results .................................................................................................................... 89 
 Discussion .............................................................................................................. 97 
 Bridge ..................................................................................................................... 103 
 
 
xiii 
  
Chapter Page 
 
VII.  RUNNING IN THE REAL WORLD: ESTIMATION OF GAIT 
EVENTS AND KINETIC WAVEFORMS WITH WEARABLE SENSORS  
AND MACHINE LEARNING .................................................................................... 104 
 Introduction ............................................................................................................ 104 
 Methods.................................................................................................................. 106 
 Data Processing ................................................................................................ 107 
 Descriptions of Gait Event Detection Algorithms  .......................................... 109 
 Machine Learning Methods ............................................................................. 109 
 Results ....................................................................................................................  112 
 Gait Event and Contact Time Estimation from IMUs ..................................... 113 
 BD LSTM Gait Event and Contact Time ........................................................ 115 
 Discrete Kinetic Variables ............................................................................... 115 
 Discussion .............................................................................................................. 119 
VIII. CONCLUSION ................................................................................................... 134 
 Summary of Results and Findings ......................................................................... 134 
 Limitations .............................................................................................................  137 
 
 Recommendations for Future Work....................................................................... 140 
 
REFERENCES CITED ................................................................................................ 142 
  
xiv 
  
LIST OF FIGURES 
 
Figure Page 
 
 
2.1 LSTM Cell layout and mapping of operations. ..................................................... 21 
3.1 The Full BP-AR-HMM  ......................................................................................... 32 
3.2 Exemplar plots of the output from the BP-AR-HMM with foot acceleration 
 of a single representative subject ........................................................................... 35 
3.3 Changes in the duration of state 7 and state 3 for all subjects’ transitions  
 between walking and running ................................................................................ 37 
 
4.1 Data collection protocol ......................................................................................... 51 
4.2 Gait event identifiers from the BP-AR-HMM ....................................................... 56 
4.3 Representative foot ground contacts for Level Ground 
  Walking and Running ........................................................................................... 57 
4.4 Representative foot ground contacts for Stair Ascent  
 and Descent ............................................................................................................ 58 
4.5 Representative foot ground contacts Ramp Ascent   
 and Descent ............................................................................................................ 59 
5.1 Flow chart of the data post processing steps and the algorithmic methods for 
 the identification of gait events from the force sensing insole data 
 and data from multi-axial IMUs at both anatomical locations .............................. 69 
5.2 Acceleration waveforms from the IMUs mounted on the right foot  
 with superimposed foot contacts identified  
 from the force measuring insoles ........................................................................... 72 
5.3 Acceleration waveforms from IMUs mounted on the sacrum 
 with superimposed foot contacts identified from the  
 force measuring insoles for Level Ground............................................................. 73 
5.4 Time differences in the identification of gait events between measured  
 forces and estimated IMU gait events .................................................................... 74 
5.5 Differences in force measured and IMU estimated contact time across 
 the range of average velocity trials ........................................................................ 75 
xv 
  
Figure Page 
 
5.6 Bland-Altman plot displaying the average differences between the 
 estimated contact durations and the measured gait  
 events from the force sensing insoles .................................................................... 76 
5.7 Time differences between the IMU estimated foot contact and 
 force measured foot contact ................................................................................... 77 
6.1 Machine learning methodology ............................................................................. 88 
6.2 Complete analysis of contact time ......................................................................... 93 
6.3 Kinetic variables across velocities ......................................................................... 94 
6.4 Regression analysis of estimated kinetic variables  
 with mean and 95% confidence intervals .............................................................. 95 
6.5 Bland Altman plot with offset and 95% Limits of Agreement.  
 Each data point represents an average velocity trial .............................................. 96 
 
7.1 Outlined course each participant ran...................................................................... 111 
7.2 Gait event differences estimated from the foot mounted IMUs ............................ 116 
7.3 Estimated contact time from foot mounted IMUs ................................................. 120 
7.4 Regression analysis of estimated kinetic variables  
 with mean and 95% confidence intervals .............................................................. 121 
7.5 Estimated contact time from the BD-LSTM .......................................................... 122 
7.6 Kinetic variables across the range of running velocities ....................................... 129 
7.7 Stance average GRF. Linear regression and Bland-Altman plots ......................... 130 
7.8 Peak Force. Linear regression and Bland-Altman plots ........................................ 131 
7.9 Impulse. Linear regression and Bland-Altman plots ............................................. 132 
7.10 Average loading rate. Linear regression and Bland-Altman plots ....................... 133 
 
 
 
xvi 
  
LIST OF TABLES 
 
Table Page 
 
2.1 Subject Characteristics (Mean ± SD) for Specific Aim 1 ...................................... 12 
2.2 Subject Characteristics (Mean ± SD) for Specific Aim 2 ...................................... 13 
2.3 Subject Characteristics (Mean ± SD) for Specific Aims 3 and 4 .......................... 13 
2.4 Subject Characteristics (Mean ± SD) for Specific Aim 5 ...................................... 14 
3.1 Subject Gait Cycle Accuracy (% of Occurrences) and Duration of  
 States Output by the BP-AR-HMM Across Subjects ............................................ 38 
3.2 Average Time Prior to Gait Events When Either State 4 or 1 Occurs ................... 39 
4.1 Heuristic Algorithm Accuracy and Lead Time ...................................................... 54 
4.2 BP-AR-HMM Algorithm Accuracy and Lead Time ............................................. 55 
4.3 Primary States Output From the BP-AR-HMM .................................................... 55 
5.1 Distribution of Average Running Velocities ......................................................... 68 
5.2 Exemplar Paces for Participant with a 7-Minute per  
 Mile 5-km Race Pace ............................................................................................. 68 
5.3 Root Mean Square Error for the Identification of Gait  
 Events and Estimation of Foot Contact ................................................................. 71 
6.1 Distribution of Average Running Velocities ......................................................... 85 
6.2 Example paces and timing for 400 m run .............................................................. 86 
6.3 Stance and Waveform Root Mean Square Error .................................................... 90 
6.4 LSTM Estimated Gait Event Error ........................................................................ 91 
6.5 LSTM Estimated Spatial-Temporal and Kinetic Variables  
 Root Mean Square Error ........................................................................................ 92 
7.1 Participant Characteristics and Mile Pace.............................................................. 107 
7.2 Root Mean Square Error Table for IMU Gait Event Detection ............................. 112 
xvii 
  
Table                                                                                                                             Page 
7.3 Root Mean Squared Error from BD-LSTM Output ............................................... 113 
7.4 BD-LSTM Stance Phase Ground Reaction Force Waveform RMSE ................... 114 
7.5 Measured Kinetic Variables from Force Insoles ................................................... 117 
7.6 Kinetic Variable RMSE from BD-LSTM Estimated Waveforms ......................... 118 
xviii 
 
 
CHAPTER I 
 
INTRODUCTION 
 
 
Background and Significance 
Introduction 
 Biomechanical analysis of human locomotion outside of the laboratory has been 
made possible with the development of wearable sensors, for both clinical and sport 
applications [1]–[3]. Collection and analysis of biomechanical data in the laboratory has 
been well established for the measurement and analysis of human locomotion. Traditional 
biomechanical protocols require motion analysis in the laboratory either across inground 
force plates or on an instrumented treadmill [4]–[6]. These methods require expensive 
equipment, and a high level of technical expertise, while comparatively wearable sensors 
are less expensive, lightweight and require less technical expertise than laboratory-based 
methods. From both laboratory-based analyses, and real-world/ecological analyses there 
are three broad categories of measurements; spatial temporal, kinematic and kinetic. 
Examples of spatial temporal parameters relating to gait are initial contact (IC), the 
instance the foot comes into contact with the ground, toe off (TO), the last instance the 
foot is in contact with the ground, and contact time as the difference between the TO and 
IC. These form the basis of biomechanical analysis, via gait cycle segmentation. 
Examples of kinematic variables are joint angles and angular velocity. Kinetic variables 
include stance average ground reaction force (GRF), stance average GRFs, impulse, peak 
GRFs, and loading rates. For the purposes of this dissertation, analysis will be focused on 
 
 
  
the identification of gait events in all manuscript chapters (III – VII), and estimation of 
ground reaction forces in the final two manuscript chapters (VI and VII).  
 
Gait Event Identification for General Locomotion 
The gold standard for identification of gait events in the laboratory are vertical 
GRF using force plates and marker trajectories measured using optical motion capture 
systems [7]. Wearable sensors for the measurement of forces outside the lab currently 
include in-shoe force or pressure sensors. Insoles utilized for the identification of gait 
events outside of the laboratory have been developed and validated for gait event 
detection, and measurement of  kinetic variables [8]–[10]. Force or pressure sensing 
insole systems can be expensive, and they are often not viable for use in assistive devices 
or consistent daily use. For the purposes of this dissertation, force sensing insoles were 
used as the standard comparator against which the IMU based estimates were examined.  
Typically, there are 9 individual sensors in a multi-axial IMU, 3-D accelerometers 
(linear acceleration), 3-D gyroscopes (angular velocity) and 3-D magnetometers 
(magnetic field). Multi-axial inertial sensors have been utilized extensively for the 
identification of gait events and locomotion mode, with varying success [11]–[18]. 
Accelerometers have the potential to be used for gait event detection but require the use 
of gyroscopes and extensive mathematical modelling, and Kalman Filters to correct the 
orientation of the senor consistent identification of gait events [19], [20]. The use of 
gyroscopes for identification of gait events may be more consistent as they are orientation 
invariant, and unaffected by gravity [21]. While these methodologies have been 
2 
  
previously explored in the laboratory, they are yet to be validated in a real-world 
environment.  
 
Machine Learning Algorithms for Identification of Gait Events   
Machine learning algorithms have been used to identify gait events and classify 
locomotion modes, such as walking, ramp/stair ascent/descent. These methods include 
linear discriminant analysis (LDA), and support vector machines (SVM) that utilized 
simple rules in high dimensional feature spaces to divide the different classes [22]–[25]. 
More sophisticated implementations of machine learning algorithms for gait event 
identification are hidden Markov models (HMM), and time history dependent variations 
of these (e.g. dynamic Bayesian networks and auto regressive HMM), while other 
algorithms include artificial neural networks (ANN), and phase variable classification 
[17], [25]–[28]. Although more computationally intensive, these algorithms provide 
improved accuracy compared to more simplistic machine learning methods. Machine 
learning algorithms are favored over simpler heuristic methods as they have the potential 
flexibility to transfer learning between environments. Some issues with these approaches 
are they have not yet been tested on data in an ecological environment, and they typically 
require large sensor arrays and high sampling frequencies that are not practical for real-
world applications.  
 One solution to the problem of large sensor arrays and high sampling frequencies 
are novel algorithmic techniques have been developed particularly for analysis of 
multiple related time series and minimal input data. An example of these algorithms are 
switching linear dynamical systems, developed previously for economic analysis and the 
3 
  
description of other dynamical processes [29]. These algorithms are unique as behaviors 
do not have to be known a priori but can be added when a new feature is discovered. A 
switching linear dynamical system, the Beta Process Autoregressive Hidden Markov 
Model (BP-AR-HMM) was used in the first part of this dissertation [30]. The beta 
process is a feature selection algorithm, that can generate new features based upon the 
given data and share these features across related time series [29]. The transitions 
between these features are governed by the AR-HMM model, which considers the 
previous time steps and generates transition probabilities for efficient transitions states. 
This type of feature identification algorithm is advantageous as it generates states from 
the input data, does not rely on user defined states, and it can be utilized as both an 
unsupervised and supervised machine learning algorithm.  
The first two Specific Aims of this dissertation sought to identify features prior to 
gait events from sensors with minimal sampling frequency with the BP-AR-HMM and a 
heuristic algorithm. The final three Specific Aims of this dissertation focused on the 
estimation of contact time from IMUs, and the mapping of GRF waveforms from IMUs 
while running outside of the laboratory.  
 
Running Outside the Laboratory  
Over the past three decades biomechanics researchers have studied running in 
different simulated environments. Examples of these are: running on different surfaces 
[31], [32], different simulated gravities [33], positive and negative slopes [34]–[36] and 
on uneven terrains [37]. These studies have developed the gold standards for human 
running performance analysis. Inertial sensors have been used for the estimations of leg 
4 
  
joint angles, and measurements of stride kinematics and segmental accelerations, during 
competitive events [38], [39]. The overarching purpose of these studies was to develop an 
understanding of human running biomechanics across a range of locomotion velocities 
and in different simulated environments. There have been further inquiries utilizing 
inertial data collected in the laboratory for estimation of gait events [40]–[42], discrete 
kinetic variables[43], [44] and kinetic waveforms [45]–[47]. Initial comparisons between 
the laboratory standard and wearable sensors for the identification of gait events has been 
successful [41], [42], [48]. There have been two general methods for the estimation of 
gait events from inertial sensors: heuristic rule sets, and machine learning.  
Identification of gait events from IMUs for running locomotion has generally 
relied on heuristic rules. Sensors mounted on the shank and foot have relied on the 
identification of peak accelerations for the estimation of Initial Contact (IC), [40], [41], 
[49]. Peak impact accelerations during running for IMUs mounted on the foot, are greater 
than 50 m s-2, these peak accelerations are consistent enough for the identification of IC 
while running. Multi-axial IMUs, mounted on the waist of the participants tend to rely on 
accelerations in either the vertical or anteroposterior directions for the identification of IC 
[40], [41], [44], [48]. From the identification of IC, it is then possible to detect TO with 
specific rules within a temporal window. The use of machine learning algorithms for the 
identification of gait events during running has shown promise as well in the laboratory, 
for a data driven approach to the identification of foot contacts [50].  
Machine learning for mapping of IMU data to GRF waveforms for the analysis of 
running biomechanics has not yet been achieved from data collected in an outdoor 
training or real-world environment. The estimation of temporally dependent waveforms 
5 
  
from recurrent neural networks (RNNs), particularly, Long Short Term Memory 
networks (LSTMs) have been used in biomechanics as gait is cyclical [45]–[47], [51]. 
They have been used for estimation of GRFs during an identified stance phase from 
inertial data [46], [47], from data normalized to the duration of a step [52]. More recently, 
there have been other efforts in this space to estimate partial GRF waveforms, that can be 
linked together to form waveforms of indefinite length [45]. From these waveforms 
simple rules can be made for the identification of IC and TO. These studies have focused 
on the estimation of contact time, peak GRF, impulse and loading rate. Each of the 
previous studies, with the exception of [45], have only estimated GRF during stance 
phase as identified by a researcher. Furthermore, the work in this space has been limited 
to estimation of these variables while running on level ground with the exception of [45], 
which included treadmill running on an incline and decline. We look to expand upon 
these methods by validating level ground running at set paces as well as free running for 
the estimation of GRFs.  
 
General and Specific Aims 
 This dissertation set out to develop validated methodologies for the identification 
of gait events and estimation of GRF waveforms from wearable sensors across a range of 
locomotion modes and running velocities from data collected outside of the laboratory. 
The practical applications of these data and methods are the accurate identification of 
foot contact from mobile sensors, including those used for control systems on powered 
assistive devices with minimal sensor data and more accurate and relevant training tools 
for runners. Towards these objectives five specific aims were addressed. The first two 
6 
  
Specific Aims focused on the estimation of features prior to gait events during steady 
state locomotion and transitions between locomotion modes with minimally sampled 
data. The last three Specific Aims focused on estimation of gait events and GRF 
waveforms while running in a semi-uncontrolled and an uncontrolled environment. 
 
Specific Aim 1: Test the utility of a machine learning algorithm to label foot 
acceleration data for classification of human locomotion. The Beta Process – Auto 
Regressive – Hidden Markov Model (BP-AR-HMM) can be either a supervised or 
unsupervised machine learning algorithm that generates a subject independent feature set 
for related time series. We utilized steady state locomotion data, walking and running as 
well as dynamic transitions between the two to test this algorithm. Features output from 
the supervised BP-AR-HMM were expected to be able to identify waveform 
characteristics prior to IC and TO. To complete this aim, data collected in a prior study 
conducted my Dr. Li Jin, a former graduate student in the BSSC were used as input to the 
BP-AR-HMM. 
 
Specific Aim 2: Develop a system for the identification of features prior to gait events in 
an ecological environment with both an unsupervised machine learning and a heuristic 
algorithm. We tested the utility of the unsupervised BP-AR-HMM to label steady state 
and dynamic locomotion data, as well as a rules-based algorithm for the identification of 
features prior to the occurrence of gait events. Steady state walking, running, ramp and 
stair ascent/descent as well as transitions between level ground walking and all other 
locomotion modes in an ecological environment. This aim sought to build a participant 
7 
  
independent labelling system of gait for the identification of gait events prior to their 
occurrence with both a machine learning and heuristic model.   
 
Specific Aim 3: Develop and validate two algorithms for the estimation of contact times 
at set paces on a track. We developed two algorithms based on previous work for the 
estimation of gait events from a wide range running velocities and participant skill levels. 
We collected data in a semi-uncontrolled environment, at set running paces on a square 
track, to estimate gait events from two IMUs located bilaterally on the dorsum of each 
foot as well as the one IMU clipped to the back of the waistband. Each of the algorithms 
used peak identification for the estimation of IC and temporal windows for the 
identification of TO. The IMU estimated gait events were compared to events derived 
from instrumented force sensing insoles as a standard.  
 
Specific Aim 4: Estimate whole ground reaction force waveforms from inertial sensors 
during fixed pace runs on a track. We implemented a machine learning algorithm for the 
mapping of inertial data to GRF waveforms using data collected from participants 
running at set paces and with a variety of running experience. This analysis used data 
from Aim 3.  The purpose of this study was to build a machine learning model for the 
estimation of gait events and contact time, as well as the estimation of discrete GRF 
variables from an estimated waveform, including stance average GRFs, peak GRFs, 
impulse, and average loading rate in a semi-uncontrolled environment.  
 
8 
  
Specific Aim 5: Estimate whole ground reaction force waveforms from inertial sensors 
during a free run in a real-world environment. The focus of this aim was to validate the 
algorithms developed in Aims 3 and 4 in a real-world running environment. Specifically, 
this study compared the estimation of whole GRF waveforms from the same three sensor 
set up used in Aims 3 and 4 for participants running at their own pace on a 5-mile course, 
which included running up and down hill.  
Through the completion of these aims, methodologies for the identification of gait 
events outside of the laboratory have been improved. This knowledge may be applied to 
the identification of gait events in assistive devices, wherein minimizing sampling 
frequencies makes it less burdensome to monitor patients in their own ecological 
environment. Other aspects of this work may be applied to development of a model for 
the quantification of training load throughout a runner’s preparation for a race, or for a 
novice runner to improve their form.  The larger study mentioned throughout Specific 
Aims 3-5 is the Longitudinal Running Study, the overarching purpose of this study is to 
develop biomechanical tools to monitor running performance throughout the course of 
their training.  
 
Organization of Dissertation 
 This dissertation is written in a journal style format, where chapters III-VII have 
been or will be submitted for publication to peer-reviewed journals. The following 
section explains how these chapters fit together into a coherent body of work. A bridge 
statement explaining the flow of studies is included at the conclusion of Chapters III-VII. 
9 
  
 The current chapter (Chapter I) provides essential background information 
regarding wearable sensors and machine learning algorithms for the detection of features 
prior to gait events, gait events and the estimation of whole force waveforms from 
wearable sensors. This chapter establishes the basis and need for the research presented in 
the dissertation. Chapter II will detail the methodology used for each study. Chapter III 
describes methods, model development, and validation procedures related to Specific 
Aim 1. Chapter IV describes the development of methods for outdoor data collections 
with wearable sensors and machine learning algorithms for gait event detections in a real-
world environment to complete Specific Aim 2. Chapter V describes the development of 
algorithms for the estimation of gait events using two different anatomical locations for 
running in a semi-uncontrolled environment for completion of Specific Aims 3. Chapter 
VI corresponds to the model development, methods and analysis of running in a semi-
uncontrolled environment with data estimated from an optimized machine learning 
model, for completion of Specific Aim 4. Finally, Chapter VII synthesizes the methods 
and models of chapters IV, V and VI for the analysis for the completion of Specific Aim 
5. Chapter VIII summarizes the notable results of the overall body of work, reiterating the 
key findings while acknowledging limitations and outlining future directions for work in 
this area of research. 
 This dissertation includes co-authored work, some which has already been 
submitted for publication in peer-reviewed journals. Chapter III has been submitted and 
accepted to IEEE Journal of Biomedical and Health Informatics, DOI: 
10.1109/JBHI.2020.3028827. Chapter IV has also been submitted and accepted to IEEE 
Transactions on Neural Systems and Rehabilitation Engineering, DOI: 
10 
  
10.1109/TNSRE.2021.3131953. Chapters V-VII will be submitted for publication to an 
appropriate journal. For all work in this dissertation, with the exception of the study 
design and data collection of Chapter III (completed as part of Dr. Li Jin’s dissertation 
work), Seth Donahue was the primary investigator, responsible for study design, data 
analysis, interpretation, and dissemination. Dr. Michael E. Hahn advised on all aspects of 
this dissertation.  
11 
  
CHAPTER II 
GENERAL METHODOLOGY 
Subjects 
Chapters III and IV of this dissertation describe the development and validation of 
novel methods for the estimation of gait events across and between locomotion modes, 
both inside and outside the laboratory. Chapters V-VII of this dissertation describe the 
development of both heuristic and machine learning algorithms for the estimation of gait 
events in both a semi-uncontrolled and uncontrolled environment. The current chapter 
provides a summary of the methods developed and used for data collection and analysis 
for the various Aims.   
To address Specific Aim 1 (Chapter III), ten middle aged able-bodied subjects 
participated in the study (Table 2.1). To be included participants had to be between the 
ages of 40 and 65 and not have sustained a lower limb injury in the past year. These data 
were collected as apart of Dr. Li Jin’s dissertation.  
 
Table 2.1: Subject Characteristics (Mean ± SD) for Specific Aim 1 
Sex Age (yr) Height (cm)  Mass (Kg)  
Total (n = 10) 50.7 ± 6.0 173.4 ± 11.4 69.7 ± 1.0 
 
 
To address Specific Aim 2, sixteen able bodied participants with no injuries to 
their lower extremities in the past 6 months were recruited (Table 2.2). For Chapter IV, 
written informed consent was obtained from each participant and study protocols were 
approved by the University of Oregon Institutional Review Board (IRB protocol 
#08202018.018). 
12 
  
 
Table 2.2: Subject Characteristics (Mean ± SD) for Specific Aim 2 
Sex Age (yr) Height (cm)  Mass (Kg)  
Male (n = 8) 33.75 ± 17.36 179.39 ± 6.78 70.57 ± 8.21 
Female (n = 8) 28.50 ± 12.35 167.32 ± 5.02 62.67 ± 4.75 
Total (n = 16) 31.13 ± 14.81 173.36 ± 8.49 66.32 ± 7.62 
 
 
To address Specific Aims 3 and 4, fifteen runners were recruited as part of a 
larger ongoing study. These participants were recreationally trained, with a minimum 
training volume of 5 miles of running per week. For Chapters V and VI, written informed 
consent was obtained from each participant and study protocols were approved by the 
University of Oregon Institutional Review Board (IRB protocol # 10062020.007). 
 
Table 2.3: Subject Characteristics (Mean ± SD) for Specific Aims 3 and 4 
Sex Age (yr) Height (cm)  Mass (Kg)  
Male (n = 9) 26.33 ± 13.44 179.44 ± 4.22  71.33 ± 7.00 
Female (n = 6)  19.83 ± 0.41 163.00 ± 15.41  64.50 ± 2.17 
Total (n = 15) 23.73 ± 10.69 172.87 ± 12.83  68.60 ± 6.46 
 
 
To address Specific Aim 5, sixteen runners were recruited as a part of larger 
ongoing study. Participants were recreationally trained runners. They were asked to 
complete a 5-mile course, which had similar characteristics to their normal training 
routes, distances, and elevation changes. For Chapters VII, written informed consent was 
obtained from each participant and study protocols were approved by the University of 
Oregon Institutional Review Board (IRB protocol # 10062020.007, MODCR: 
00000269a). 
13 
  
 
 
Table 2.4: Subject Characteristics (Mean ± SD) for Specific Aim 5  
Sex Age (yr) Height (cm)  Mass (kg)  Pace 
(min:secs) 
Male (n = 8) 23.57 ± 6.00 166.29 ± 7.74  70.86 ± 8.38 7:53±0:46 
Female (n = 5)  23.40 ± 3.58 168.40 ± 4.56  56.20 ± 4.21 8:08±0:36 
Total (n = 13) 23.15 ± 4.88 167.77 ± 30.86  65.00 ± 9.70 7:56±0:40 
 
 
Study Design and Experimental Protocol 
Chapter III  
 Participants walked and ran on an instrumented treadmill (Bertec, Inc., Columbus, 
OH) with four steady state trials, and two dynamic walk-to-run and run-to-walk 
transitions. There were two steady state walking trials (1.4 and 1.6 m s-1), two steady 
state running trials (2.6 and 3.0 m s-1), and a walk-to-run trial from 1.8 – 2.4 m s-1 with an 
acceleration of 0.1 m s-2 and run-to-walk trial from 2.4 -1.8 m s-1, with a deceleration of 
0.1 m s-2. Each trial was 90 seconds long, with the middle 10 seconds of each trial being 
analyzed.   
 
Chapter IV 
 Participants took part in two days of data collection. The first day consisted of 
locomotion transitions in a real-world environment, transitioning between level ground 
walking and each of the following: ramps, stairs and running. On the second day of data 
collection participants were asked to complete nine average velocity running trials from 
1.3 – 3.0 m s-1 on an 80 m straightaway.  
14 
  
 
Chapters V-V1 
 As a part of a larger ongoing study, participants were asked to complete average 
running velocity trials on a single lap around a 400 m track. They ran at three paces 
slower than their race pace, at their race pace and at a push pace. For example, if their 
race pace for a 5 km race was 6:00 they would be asked to run at 7:30, 7:00 and 6:30, 
then at 6:00, after their race pace they were given the choice to continue to a 5:30 pace. 
Each participant was provided a minimum rest period of two minutes or self-selected by 
the participants between each time trial.  
 
Chapter VII 
 As a part of a larger ongoing study, participants were asked to complete a 5-mile 
run on a course around the University of Oregon and Prefontaine Trail. This course 
included incline, decline and level ground running.  
 
Data Collection 
Chapter III 
 Participants were outfitted with a bilateral marker set consisting of 43 retro-
reflective markers defining nine segments (forefoot, rearfoot, shank, thigh, pelvis). 
Three-dimensional marker data were collected at 200 Hz using an 8-camera motion 
capture system (Motion Analysis Corp., Santa Rosa, CA). Ground reaction force data 
were collected at 1000 Hz using a force-instrumented treadmill (Bertec, Inc., Columbus, 
OH).   
15 
  
 
Chapter IV 
 Participants were equipped with four IMUs (Vicon, Centennial, CO): one on the 
waistband, one on the lateral aspect of the thigh, at approximately the midpoint between 
the greater trochanter and the knee, one on the lateral aspect of the leg, at approximately 
the midpoint between the knee and the lateral malleolus and one on the dorsal aspect of 
the foot. Three participants were equipped with the Noraxon foot switch DTS system 
(Noraxon, Scottsdale, AZ; sampling rate of 500 Hz), which measured pressure at the 
foot-shoe interface. The other 13 participants were equipped with Loadsol insoles (Novel, 
Minneapolis, MN; sampling rate of 100 Hz), which measured normal force between the 
foot-shoe interface.  
 
  Chapters V - VII 
 Participants were equipped with three IMUs, with a sampling rate (Casio, Tokyo, 
Japan; sampling rate of 200 Hz). Force sensing insoles (Novel, Minneapolis, MN; 
sampling rate of 100 Hz) were used for the measurement of normal force data between 
the foot-shoe interface. They were also equipped with a GPS watch (Garmin Forerunner 
130 and 135; Garmin, Olathe, KS; sampling rate of 1 Hz) from which distance, pace, 
elevation, and slope were measured. 
 
Data and Statistical Analysis 
Chapter III 
16 
  
 All data analyses were completed with custom Matlab scripts (Mathworks, 
Natick, MA). The GRF data were filtered with a low pass fourth order Butterworth filter 
(35 Hz cut off frequency). Ground contact state was determined when vertical GRF 
values were above a threshold of 50 N. Motion trajectory data from a single marker on 
the dorsum of the left foot were extracted from a larger data set [53]. For this study and 
throughout the dissertation, x defined the medio-lateral axis, y defined the antero-
posterior axis, and z defined the longitudinal axis. Idealized three-dimensional (3D) 
acceleration values were calculated by taking the second derivative of the filtered marker 
position data. The foot contact information and acceleration data were then downsampled 
to 20 Hz and both input into the Beta Process Auto-Regressive Hidden Markov Model 
(BP-AR-HMM). 
The BP-AR-HMM is an algorithm developed to share features across related time 
series. The key element of this algorithm, the beta process, adds and remove states that 
can be shared across all data sets input. For example, if two states describe the same 
feature they will be combined into a single state and shared. This development of 
dynamic features is crucial for use in semi-supervised or unsupervised machine learning, 
or as data sets grow so large that they cannot be feasibly annotated. The HMM assumed 
that a set of distributions and transitions between dynamic features follow a Markov 
process. This process was modified with the presence of the auto-regressive model, 
which considers temporal relationships in the data. The vector auto regression approach 
provided an estimation of current state, based upon the previous time steps, the available 
states and distributions based on the hidden Markov model. A sticky parameter was used 
to increase the probability of self-transitions within a state to identify biomechanically 
17 
  
significant states [30]. Output data from the BP-AR-HMM were states of varying 
duration, transitions between different states were indicative of transitions between 
different aspects of the gait cycle. Performance evaluation of the BP-AR-HMM included 
the critical timing of states and how consistently the features were used across all 
locomotion modes. Accuracy in this study was defined as the number of features 
identified prior to the impending gait events divided by the number of measured gait 
events for both the heuristic and machine learning algorithms. 
 
Chapter IV 
 All data analyses were completed with custom Matlab scripts. Foot-shoe normal 
force data were considered the standard reference for identification of measured gait 
events. The inertial signals and force data were time-synced using ‘foot-stomps’ before 
and after each trial. Inertial data from the IMUs on the participant were time synced and 
down sampled to 100 Hz, then synchronized with the insole force sensors off-line. Foot 
contact information and ground reaction force data were converted into indicator 
variables, indicating cyclic motion or not. Then data were labeled with specific 
locomotion mode. Input into the heuristic algorithm was single axis angular velocity data 
about the x-axis of the IMU (sagittal plane of the participant), sampled at 100 Hz. Input 
into the BP-AR-HMM consisted of single axis angular velocity data about the x-axis of 
the IMU, down sampled to 25 Hz. There were four rules utilized for the identification of 
gait events from the heuristic algorithm and six state from the BP-AR-HMM. 
 
Chapter V 
18 
  
All data analyses were completed with custom Matlab scripts (Mathworks, 
Natick, MA). Foot-shoe normal force data were considered the standard reference for 
identification of measured gait events. A Kalman filter was applied for the adjustment of 
orientation of the local IMU coordinate system to gravity [54], [55]. Initialization of the 
Kalman filter occurs with the IMU still, and an estimation of the orientation of the sensor 
using accelerometer signal. During motion, orientation of the sensor is estimated by 
calculating the change in orientation from the rate gyroscopes. Each stance phase when 
the resultant accelerations were near that of gravity, the orientation of the sensor was 
reset with acceleration.  
 Identification of gait events with the force sensing insole utilized a threshold of 50 
N. Initial contact from the foot mounted IMUs were estimated with peak detection 
algorithms, and TO was detected with either a peak acceleration or a threshold during a 
variable length temporal window. From the sacral mounted IMU, IC was detected with a 
peak finding algorithm and TO was detected with a static temporal window, either 
detecting maximum acceleration or maximal slope of the vertical acceleration.  
 Pearson correlation coefficients (r2) were used to compare the estimated contact 
times to the measured force insole contact times. A strong correlation was defined as r2 ≥ 
0.8, a moderate correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 0.5. The 
relationships between the estimated and measured waveform variables are presented in 
both linear regression and Bland-Altman plots with 95% confidence intervals (CIs)/ 
Limits of Agreement (LoA) respectively.  
Chapter VI 
19 
  
  All data analyses were completed with custom Matlab scripts (Mathworks, 
Natick, MA). An extension of Chapter V, the analysis in this chapter shifted the data 
analysis to include the use of an LSTM for the estimation of GRF waveforms. The 
variables calculated from these waveforms were contact time, stance average GRFs, peak 
GRF, impulse, and average loading rate.  
 We chose to use an LSTM for the estimation of GRF waveforms since GRFs 
patterns during gait are cyclical and therefore time dependent. The LSTM was initially 
developed to overcome the vanishing gradient or the long term dependency problem 
within machine learning [51]. Within the LSTM, there are specific feedback connections, 
which facilitates retention of about previous data in the sequence by the LSTM. The 
output of the LSTM depends on three different things, the cell state, hidden state, and the 
input data at the current time step.  
 The cell state and hidden state are governed by specific gates in the LSTM cell 
shown in Figure 2.1. The forget gate decides which parts of the memory are useful, based 
on the previous hidden state and new input data. The previous hidden state and new input 
data are fed into a neural network with sigmoid activation functions. If the output from 
this network was approximately 0, that portion of the input data was deemed irrelevant, 
and when the output is closer to 1 it was deemed relevant and kept. Outputs from the 
forget gate were then pointwise multiplied by the previous cell state, so that portions of 
the cell state that were deemed irrelevant were multiplied by approximately 0, and thus 
have less influence on the next steps.  
 The next step determined if any new information should be added to the cell state, 
given the new input data and the previous hidden state that has been modified by the 
20 
  
forget gate. A new cell state was estimated using a neural network, with a hyperbolic 
tangent activation function. This network had the same input data. The purpose of this 
gate was to estimate the new cell state based upon the input data and the previous hidden 
state. The input gate was the next gate and together with the estimated cell state, generate 
a state vector. The input gate had a sigmoid activated neural network, that functioned as a 
filter identifying which components of the new memory vector were important. Thus, if 
the output from the input gate were near zero, the model will not update that element of 
the cell state. The output from the estimated cell state network and the input network are 
then point multiplied and added to the cell state.  
 
 
Figure 2.1: LSTM cell layout and mapping of operations.  
 
 The final step was the output gate, which is similar to the forget gate, in that it 
takes in the previous hidden state, and input data. The output gate is a neural network 
with a sigma activation function. The cell state was passed through a hyperbolic tangent 
filter. The steps described above are repeated for each input. For example, for a 400-
21 
  
sample input, the above steps would be repeated 400 times. The output was also a hidden 
state; therefore, it needed to be processed through a regression layer to output a 
waveform. The loss function in this work was computed as the half mean squared error.  
 There were two steps in the validation of these machine learning protocols, first a 
hyperparameter optimization, followed by a Leave One Out Cross Validation (LOOCV). 
We optimized the hyperparameters of the LSTM with a Bayesian Optimization [56], [57], 
this algorithm attempts to minimize the error in the loss function of the LSTM in a 
bounded domain, however the specifics of this algorithm are beyond the scope of this 
dissertation. Input data during the optimization were randomly split into a training set of 
70% of the total data, a validation set of 15% of the total data and a test set of 15% of the 
data. The hyperparameters optimized were the initial learning rate, gradient decay factor, 
squared gradient decay factor, L2Regularization and number of hidden units. The only 
hyperparameter that influenced the outcome of the algorithm was the number of hidden 
units, as all others converged to the default Matlab input. The range for the number of 
hidden units in the optimization was [10 50]. Optimization of these data was tested on 
temporal windows of varying length between 1 and 5 seconds in duration at half second 
intervals. The optimal network was determined by model performance on these windows. 
The optimal temporal window length was determined by examination of the RMSE from 
the optimal network output. This network was then used in a LOOCV, where a single 
participant was left out of the training set to be utilized as the test set. From the LOOCV 
we determined the most accurate estimation of waveforms by the identification of the 
temporal window duration that had the lowest RMSE for the waveform and discrete 
kinetic variables.  
22 
  
   Estimated waveforms from the LOOCV were analyzed and presented in this 
work. Initial analysis involved identification of the optimal temporal window. This was 
done via inspection of the estimated GRF variables in Bland-Altman plots. Initial Contact 
(IC) was identified by the first instance of force >5% BW and toe off (TO) was 
determined by the last instance of force greater than >5% BW. Contact time was 
determined by taking the temporal difference between these two discrete events. Stance 
average GRFs, impulse, peak GRFs, and average loading rate were the GRF variables 
calculated in this work, from the estimated force waveforms. Average loading rate was 
calculated by identifying the impact peak and then averaging the slope in the middle 60% 
of the region between IC and the impact peak [58].  
 
Chapter VII 
  All data analyses were completed with custom Matlab scripts (Mathworks, 
Natick, MA). We utilized variations from previous chapters for the estimation of gait 
events from an IMU and the estimation of GRF waveforms from bi-directional LSTMs 
(BD-LSTMs). These BD-LSTMs take in data both forward, say from time points 0-99, 
and backwards, from 99-0 and utilize the methods described above. Post-processing 
consisted of the same protocols as Chapters V and VI, with additions due to the 
uncontrolled nature of this data collection. Data from the GPS signal were upsampled 
from 1 Hz to 100 Hz, by creating a matrix where each original data row was increased to 
be 100 rows of identical data. Occasionally, ground reaction force data exhibited baseline 
drift. These were corrected for, however the magnitude of the adjusted force was not 
23 
  
corrected for, as there were less than 500 footfalls that needed correction, having minimal 
effect out of the >85,000 analyzed footfalls.  
 Gait event detection algorithms used in this chapter were similar to those used in 
Chapter V. Identification of foot contact from the force data used the same rules as in 
Chapter VI. The identification of gait with IMUs required the addition of detection of 
pre-IC similar to chapter IV. A temporal window began 50 ms after the identified minima 
and ended 450 ms after identified minima, and the largest peak > 50 m s-2 in the resultant 
accelerations was identified as IC. The rules for identification of TO were the same as in 
Chapter V.  
 Optimized networks were taken from Chapter VI and used for a LOOCV of the 
data in this chapter. A BD-LSTM with a regression output of 1 second was utilized for 
this study. Foot contacts < 100 ms and > 500 ms were removed from the data set, as they 
represented non-cyclical movement (i.e., not a running pattern). Further, any data < 5% 
BW were set to 0 BW. Initial contact and toe off from the estimated waveforms were 
identified in the same manner as Chapter VI.  
 Foot contacts that could not be matched to the measured GRF data were removed 
from the data set. From the estimated waveforms, stance phase RMSE, stance average 
GRFs, peak GRFs, impulse and average loading rate were all calculated in the same 
fashion as Chapter VI. Velocities from the GPS were set to the nearest 0.25 m s-1 for 
velocities ranging from 2.25 – 5.25 m s-1, all other velocities were removed from the 
analysis. Any velocity < 2.25 m s-1 was considered to be a walking step, and there were 
not enough foot contacts for analysis > 5.25 m s-1. Slope was calculated from the altitude 
data and binned into three different groupings of level ground, incline, and decline. 
24 
  
Incline foot strikes were identified as slopes > 5° and decline foot strikes were identified 
as slopes < -5°. We noted errors of up to 4° in the GPS data, which was the reason for 
delineating the slopes as more than ±5°, and designating all other foot strikes to be level 
ground. 
 Running velocity and slope combinations were included in the analysis if the 
participant had more than 10 footfalls for either the IMU or estimated variables. Linear 
models, RMSE, and bias were calculated for the variables in this study using the same 
approach as in Chapter VI. 
25 
  
CHAPTER III 
USER INDEPENDENT ESTIMATIONS OF GAIT EVENTS WITH MINIMAL 
SENSOR DATA  
 
Published as Donahue, S.R; Jin, L; Hahn, M. E; User Independent Estimations of 
Gait Events with Minimal Sensor Data. IEEE Journal of Biomedical and Health 
Informatics. 2021, 25(5), 1583-1590.  The experimental work was performed by L. Jin.  
The writing and analysis are entirely mine. M.E. Hahn provided editorial assistance and 
guidance. 
 
Introduction 
 In recent years the development of assistive devices and their control systems 
have improved dramatically; from passive prosthetic limbs that provide minimal 
adaptability to specialized performance limbs and adaptive assistive devices which 
support the user in a variety of ecological environments [59]–[63]. These adaptive limbs 
rely on input from mechanical sensors such as Inertial Measurement Units (IMUs), 
accelerometers, goniometers, load cells, and neuromotor input derived from 
electromyography. Current control systems take these sensory data and are able to 
identify gait events, classify steady state locomotion, and provide prediction of user intent 
for the next step [5-7]. This study aims to implement an unsupervised machine learning 
algorithm on minimally sampled accelerometer data, to identify gait events, and further 
assess the feasibility of the output for building a model for the classification of 
locomotion and prediction of locomotion transitions.  
26 
  
The control algorithms used to estimate gait events with wearable sensors (e.g. 
IMU’s) have classically been threshold based with a variety of sensor locations [12], 
[65], [66]. Other groups have used adaptive algorithms such as Hidden Markov Models 
(HMM) for the development of rules for detection of gait events [18], [67]. These 
algorithms are accurate for a constrained task however, they tend to be inflexible. The 
expert user defines the states, and the transitions between the states. Another method that 
involves gait event estimation is the identification of gait phase. Gait phase identification 
has informed the classification of both steady state locomotion and transitions between 
locomotion states. The algorithms used for these analyses traditionally are variants of 
linear discriminant analysis, support vector machines, artificial neural networks and 
hidden Markov models [22], [24], [68]–[73], and a variety of other classification 
algorithms [11], [14], [24], [27], [74], [75]. Algorithms utilizing the temporal nature of 
gait data to optimize classification and prediction have been successful [76], e.g. 
Dynamic Bayesian Networks consider the previous data and states to determine the 
current state and make a prediction of what the next state will be. Subject independent 
algorithms, which provide generalized classification of gait and transition prediction for 
both able-bodied subjects and prosthesis users have shown promise as well [77], [78]. 
These algorithms have been highly accurate in the laboratory setting, with great than 95% 
accuracy for the identification of gait events.  
Previous algorithms represent solutions from the understanding of the expert user. 
The potential set of features in any given data can be extensive, therefore expert 
knowledge of the user, coupled with observations of the dataset can define important 
features or characteristics of the waveforms.  Often, for threshold-based models high 
27 
  
sampling rates are required for accurate classification of gait. Furthermore, these 
algorithms require large sensor arrays, i.e. multiple mechanical and neuromuscular 
sensors, for classification. In most cases these algorithms identify gait events at a steady 
state velocities, and do not consider how the subject may be accelerating through space or 
transitioning between locomotion states. These characteristics limit the applicability of 
these approaches to a real-world environment, because of their inflexibility to a variety of 
situations and for the user to manage the sensors.   
Gait event detection in the past has primarily been concerned with 
environmentally constrained locomotion e.g. ramps and stairs [79]–[81].  The primary 
locomotion states and transitions studied with both subject dependent and independent 
classifiers have typically been environmentally constrained transitions from level ground 
walking to ramp or stair ascent or descent [22], [64], [74], [77], [82]. In addition to these 
standard locomotion states and transitions, recent work has examined classification of 
running in prosthetic leg systems [61], [83]. However, it remains that little progress has 
been made on the classification of walk to run (WRT) and run to walk (RWT) transitions.   
In the laboratory setting, walking and running can be differentiated via ground 
reaction force (GRF) waveforms [5], [84], [85]. Walking is defined as a pendular gait 
with a brief double support phase, minimal ballistic propulsion, and a ground reaction 
force waveform with two distinct peaks [85]–[87]. Running is defined as a bouncing gait 
with an aerial phase, ballistic propulsion and ground reaction force waveforms that have a 
single large peak, and occasionally a transient peak in the first 25% of foot contact [86]–
[88]. While these general descriptions clearly distinguish the two locomotion forms, it 
remains that an efficient parsing of gait (specifically initial contact (IC), toe-off (TO), 
28 
  
mid-stance and phases of swing) during steady state walking and running, and the 
transitions between the two gaits has not been attained in near real-time. Recent advances 
in assistive devices and prosthetic limbs require new flexible control algorithms to 
determine locomotion states, discrete gait events and transitions between. An 
unsupervised classification algorithm could provide efficient locomotion mode detection, 
which may be translated to a near real-time control application in assistive devices. 
In this paper, we propose the use of an unsupervised learning algorithm, the Beta 
Process - Auto Regressive - Hidden Markov Model (BP-AR-HMM) [30], [89]. The BP-
AR-HM is a data driven feature extraction algorithm used to derive common features 
from related time series data. For this experiment we input calculated acceleration data 
for the algorithm to derive features (or states), to then estimate gait events. These states 
could also provide a framework for the classification of gait and potentially a basis for 
the prediction of locomotion transitions. The primary purpose of this study was to derive 
output from the BP-AR-HMM to provide biomechanically relevant classifications for 
estimation of gait events such as initial contact and toe off for steady state and dynamic 
locomotion with a group of able-bodied middle-aged adults. The secondary purpose was 
to explore the possibility of developing a framework for classification of gait and the 
development of a probabilistic model for prediction of locomotion transitions between 
walking and running from the output of the BP-AR-HMM. 
 
Methods 
Instrumentation and Data Collection  
29 
  
 Ten middle aged able-bodied subjects participated in the study (50.7 ± 6.0 years, 
173.4 ± 11.4 cm, 69.7 ± 14.9 kg; 5 female). Ground reaction force (GRF) and passive 
marker trajectory data were collected at 1200 Hz and 120 Hz, respectively. Marker data 
were collected with an 8-camera system (Motion Analysis Corp., Santa Rosa, CA).  The 
protocol consisted of two different transitions: (walk to run) WRT and (run to walk) 
RWT on a force-instrumented treadmill (Bertec, Inc., Columbus, OH). The WRT 
protocol began with walking at 1.8 m s-1 for 30 seconds, then the treadmill was constantly 
accelerated at 0.1 m s-2 to the velocity of 2.4 m s-1. Subjects were asked to transition to 
running gait whenever they felt ready during the acceleration stage. After transitioning to 
a running gait, they ran at 2.4 m s-1 for another 30 seconds. The RWT protocol was the 
inverse of the WRT, starting at 2.4 m s-1 for 30 seconds, then the treadmill was constantly 
decelerated at 0.1 m s-2 to the velocity of 1.8 m s-1. The subjects were then asked to 
perform four steady state locomotion trials. Walking trials were performed at 1.4 and 1.6 
m s-1 and running at 2.6 and 3.0 m s-1 for 90 seconds. Data from the middle 10 seconds of 
each trial were extracted for analysis. All data analysis was completed with custom 
Matlab scripts (Mathworks, Natick, MA) 
The GRF data were filtered with a low pass fourth order Butterworth filter (50 Hz 
cut off frequency). Ground contact state was determined when the vertical GRF values 
were above a threshold of 50 N. These GRF signals were used as the reference for foot 
contact by the BP-AR-HMM. Motion trajectory data from a single marker on the dorsum 
of the left foot were extracted from a larger data set [53]. Idealized, three-dimensional 
(3D) acceleration values were calculated by taking the second derivative of the raw 
marker position data. In this case idealized means the data from the accelerometer does 
30 
  
not contain a signal from the acceleration due to gravity. The foot contact information 
and acceleration data were then down-sampled to 20 Hz in Matlab to minimize the 
amount of data needed for successful classification.   
The BP-AR-HMM algorithm was tested on an initial set of 58 trials, from all 10 
participants, to provide user-independent classification [53]. Two trials were removed 
from the original data set of 60 trials because the GRF data were incomplete for the 
present analysis. 
 
Model Overview 
 The BP-AR-HMM is an unsupervised machine learning algorithm that provides a 
spatial-temporal statistical model for related time series data. More detailed descriptions 
of the model can be found in the work of Fox and colleagues [31,33]. A brief overview of 
the model components is provided here. The beta process (BP) provides a sparse feature 
sharing framework which is a probabilistic binary feature inclusion model (Figure 1.1). 
Vector auto-regression (AR) provides an estimation of what the current state is, based 
upon the previous time step(s), current state data, and available transition distributions. 
The HMM assumes that the set of distributions and transitions between dynamic features 
follow a Markov process. The assumption of the Markov process is the probability of 
state change from i to j has a set probability πij, which does not depend on prior state. To 
ensure biomechanically relevant states, a sticky parameter was used as described by Fox 
et al, [90]. The purpose of the sticky parameter increases the probability of self-
transitions to capture biomechanically relevant states [90]. Finally, the reversible jump 
31 
  
 
 
 
Markov Chain Monte Carlo (MCMC) algorithm is used to ensure the correct features are 
 
utilized by determining the transition probabilities between states [30]. 
  
Figure 3.1:  The full BP-AR-HMM approach is shown here, adapted from [30] where  
and  are the hyper parameters that govern the Markovian state switching process to 
produce (i) which is the collection of transition weights. From this the transition 
distribution can be constructed with the feature vector from the beta process to 
generate transition behaviors indicated from fi. The representation of a single  is from 
a finite Dirichlet distribution of dimension Ki of k such that fik = 1. 
 
The BP-AR-HMM generates a user independent statistical model, based upon 
minimally sampled data that does not require real time feature extraction or windowing. 
It also allows subject independent classification of data sets with a variety of transitions. 
 
Traditional Protocols for Machine Learning 
Classic machine learning paradigms have been based upon algorithms that have 
shown exemplary results for high dimensional pattern matching, taking n-dimensional 
32 
  
from a number of sensors and identifying a state or gait phase. These algorithms have 
required a large volume of data in order to generate classifications and predictions. Many 
of these algorithms bypass gait event identification for the use of gait phase 
identification. These phases are defined by the user, using features extracted from the 
data as described in the introduction. Features can include the mean, variance, zero 
crossing, minimum, maximum, frequency content, amongst others [74], [91], [92].  These 
features are then used for classification of current phase and locomotion state. Data are 
then split into training (70%), validation (15%) and test (15%) sets. The initial training of 
the model is completed on the training data set.  The validation data is used as a real-time 
training check of the generalizability of the model; the model does not use this data for 
training.  The test data is utilized after the training of the model is complete to test the 
accuracy of the model on an unseen data.  
The BP-AR-HMM is different from these traditional methods of feature 
extraction and classification. The only thing we define for input into the algorithm is foot 
contact with the ground. This is then input into the model as a guide and to assist in 
limiting the state output. The rules determined by the BP-AR-HMM are derived from the 
acceleration data. Inputting the foot contact data decreases the initial number of states in 
the instantiation of the MCMC algorithm. Instead of breaking up the data set into 
training, validation and testing sets, a Reversible MCMC algorithm was used to 
determine probability distributions for the states derived by the Beta Process.    
 
Analysis Methodology 
33 
  
Data input into the BP-AR-HMM were three-dimensional acceleration data, along 
with a Boolean indicating stance and swing phase from treadmill force data. This 
information was utilized in the initialization of the model and limits the number of states 
derived by the BP-AR-HMM. Gait events were determined from the GRF data. The 
temporal differences between the first occurrence of a BP-AR-HMM estimated state and 
the measured initial contact (IC) and toe off (TO) of the left foot were used to calculate 
the lead time and estimation of a given gait event. The duration of a state was calculated 
from the first instance of the state to the first instance of a different state. The durations of 
the state output for each of the steps were averaged across all subjects, and was used to 
build a framework from which the classification of gait and an initial examination of the 
development of a probabilistic model of locomotion transitions could be completed. The 
utility of the probabilistic model for gait transitions was examined with transition data 
using the three steps prior to the transition step, the transition step (defined as the first 
step in the new gait pattern) and the three steps post transition. The transition step and 
gait events were determined by GRF data. All analyses were performed post hoc. 
 
Results 
There were 9 dynamic states derived by the BP-AR-HMM. Five states occurred in 
> 98% of the gait cycles (Table 3.1). Three states were strictly associated with foot 
contact, five states were associated with swing phase, and one state was associated with 
both stance and swing phase. Of the 797 gait cycles analyzed, state 2 and state 5 occurred 
in every cycle. The next most consistent occurrence was state 4, which was associated 
with terminal swing phase and initial foot contact. States 2 and 5 were captured for each 
34 
  
walking gait cycle, and from running gait there were 4 states (2-5) captured in each cycle. 
The average duration of these states, except 7 and 3, remained approximately the same 
for all walking and running gait cycles (Table 3.2). 
   
Figure 3.2:  Exemplar plots of the state output from the BP-AR-HMM with foot 
acceleration of a single representative subject. States 1 and 4 consistently can be used for 
the identification of gait events, as they occur prior to the identified gait event from the 
force data. State 4 occurs when the acceleration has a negative slope in the medial lateral 
and vertical components of the acceleration at the end of swing phase, and a positive 
slope after initial contact. State 1 occurs at the first instance of an increase in the 
acceleration in the medial lateral component of the acceleration after middle stance 
phase. The major differences between walking and running are shown in the durations of 
state 7. In the walk to run transition, as the velocity of the treadmill was increased State 7 
no longer appears, panel D, step S+2. In all panels the accelerations are shown as a 
variation of a dashed line. In each of the panels each step is labeled as the steps prior to 
the transition (S-2, S-1), the transition step (S-0) or the steps following the transition 
(S+1, S+2). Panels A and B show data from the WRT; panels C and D show data from 
the RWT.  
35 
  
 
 
 There were no states unique to either walking or running. However, state 7 was 
unique to slower locomotion. It represents the portion of stance phase in which there is 
approximately zero acceleration in all three directions (Figure 3.2). State 7 occurred in 
28.9% of running gait cycles, with a higher frequency in the RWT transition than the 
WRT. State 9, which represents an approximate mid-swing phase only occurred in 26.8% 
of gait cycles and could be considered a misclassification of state 6. 
   Classification of impending gait events were represented by states 1 and 4. When 
state 1 occurred the foot was moving into the push-off phase of stance, for walking and 
running. State 1 occurred 0.14 ± 0.03 seconds before TO (Table 3.2). State 4, 
representing terminal swing into the initial braking phase of stance, consistently occurred 
0.13 ± 0.02 seconds before IC. Both states occurred in >99% of gait cycles. 
  The average duration of each state output from the BP-AR-HMM can be 
indicative of which locomotion mode the subject is using. For walking, the state with the 
longest average duration during foot contact was state 7 which occurred in 99.10% of 
walking foot contacts with an average duration of 0.29 ± 0.12 seconds. The average 
duration of state 7 for running gaits was 0.06 ± 0.02 seconds, occurring in 28.94% of 
running steps (Figure 3.2). State 3 had the second largest difference in duration between 
the two locomotion modes, at 0.09 seconds. The average duration of state 3 in running 
gait cycles was 0.14 ± 0.04 s, and  was 0.05 ± 0.01 s in walking gait. State 3 occurred in 
97.90% of walking and running gait cycles. Figure 3.3 shows the relationship between 
these two state durations during each of the transitions. 
36 
  
   In Figure 3.2, foot contacts are shown in panels A and C, with walking and 
running foot contacts differentiated by their height. In panels B and D, the state output 
from the BP-AR-HMM is shown. In both panels state 1 and 4 are identified first, 
indicative of when either an initial contact or toe off event will occur. In panel B the 
decrease in the duration of state 7 provides evidence of a gait transition from a run to a 
walk. In panel D the increased duration of state 7 and the decreased duration of state 3 is 
indicative of a walk to run transition.  
 
Figure 3.3:  Changes in the duration of state 7 and state 3 for all subjects’ transitions 
between walking and running (RWT and WRT). The steps before the transition step are 
denoted by S-3, S-2, S-1, the transition step is S-0 and the steps of the new locomotion 
type are S+1, S+2, S+3. The duration of each of the states were averaged across subjects, 
for each step.  
 
 
 
 
37 
  
Table 3.1: Gait Cycle Accuracy (% of Occurrences) and Duration of States Output by 
the BP-AR-HMM Across Subjects 
Gait Cycle  
 Gait Phase Average State Duration (s); Mean ± SD 
Occurrences (%) 
All Walking Running All Walking Running 
State Walking Running 
N1 = 797 N2 = 334 N3 = 463 N1 = 797 N2 = 334 N3 = 463 
1 Push Off Push off 98.53 97.30 99.35 0.09 ± 0.04 0.07 ± 0.05 0.098 ± 0.038 
2 Early swing Early Swing 100.00 100.00 100.00 0.10 ± 0.03 0.10  ± 0.03 0.10 ± 0.03 
3 Early Swing Early Swing 99.14 97.90 100.00 0.11  ± 0.05 0.05 ± 0.01 0.14  ± 0.04 
Terminal 
Terminal 
4 Swing /Early 99.88 99.70 100.00 0.20  ± 0.04 0.20  ± 0.04 0.20  ± 0.04 
Swing 
Stance  
5 Mid-Swing Mid-Swing 100.00 100.00 100.00 0.06  ± 0.02 0.05  ± 0.01 0.08  ±0.02 
6 Mid Swing Mid Swing 92.17 97.01 88.34 0.08  ± 0.04 0.10  ± 0.04 0.06  ±0.02 
7 Stance Early Stance 59.07 99.10 28.94 0.22  ± 0.12 0.29  ± 0.08 0.06  ±0.02 
Early /mid 
8 Early Stance 94.24 94.61 93.95 0.08  ± 0.03 0.10  ± 0.03 0.07  ± 0.03 
Stance 
9 Mid Swing Mid Swing 26.83 34.13 21.60 0.07  ± 0.04 0.06  ± 0.03 0.07  ± 0.05 
 
 
 Table 3.2: Average Time Prior to Gait Events when 
Either State 4 or 1 Occurs. 
 
Velocity INITIAL 
(m s-1
Toe Off (S) 
) CONTACT (S) 
 
1.4 0.11 ± 0.01 0.12 ± 0.03 
1.6 0.11 ± 0.01 0.13 ± 0.05 
 
2.6 0.11 ± 0.02 0.14 ± 0.01 
3.0 0.12 ± 0.02 0.13 ± 0.02 
 
WRT 0.10 ± 0.02 0.16 ± 0.04 
RWT 0.21 ± 0.02 0.14 ± 0.03 
 
Average 0.13 ± 0.02 0.14 ± 0.03 
  
 
 
38 
  
Discussion 
  In this study, we have demonstrated that a data driven, semi-unsupervised 
machine learning algorithm can consistently, with minimal inter-participant variability, 
identify gait events from minimal sensor data for steady state and dynamic locomotion. 
The parsing of gait with single sensor data sampled at 20 Hz represents an important step 
in the development of transition-predictive control algorithms for assistive devices.  
  The minimally sampled data from a single sensor provides an approach to reduce 
the computational load in control systems for actuated prosthetic limbs. The classification 
accuracy in both steady state and dynamic locomotion trials by the output of the BP-AR-
HMM is comparable to other gait classification studies. Identification of gait events with 
threshold algorithms has been presented thoroughly [7], [8], [93].  The timing of the 
identification of gait events is critical in the overall structure of the control system for 
decision making. Other groups have reported minimal lead time or detection of the event 
after it has occurred using accelerometers at various anatomical locations [11], [12], [66], 
[94]. Mannini and colleagues used a HMM for gait event detection, with 100% accuracy 
and identification of the gait event approximately 40 ms prior to the gait event [17]. 
However, their work uses states that were identified by the user, while in our study states 
were derived in an unsupervised fashion with the Beta Process. Groups that have used 
finite state machines, which require user defined states, have shown high gait event 
identification accuracy for able-bodied subjects [95], [96]. The limitations of these 
studies are large sensor arrays and high sample frequency. Another limitation is the 
limited velocities at which the subjects were asked to move. In the present work we have 
shown that IC and TO can be identified across various locomotion velocities and two 
39 
  
different gaits with adequate lead time for a control system to adjust orientation or 
stiffness parameters of an assistive device in preparation for the next step [14].   
 The secondary purpose of this study was to determine if the output of the BP-AR-
HMM could be used for gait classification and whether a probabilistic model for the 
identification of locomotion transitions would be feasible. Classification approaches for 
multiple gaits by a control algorithm have been reported to be accurate to the range of 95 
– 99% [11], [27], [74], [97], [98]. In Table 3.1, state 7 is indicative of walking stance 
phases and also present in 29% of running steps (slow running foot contacts and 
primarily identified during the WRT and RWT trials). State 7 captures the time of 
approximately zero acceleration when the foot is in mid stance. As gait velocity 
increases, the duration of state 7 will decrease until it altogether disappears. A rule-based 
classification utilizing the duration of state 7 could provide input about what gait type the 
subject is using. State 3 shows a similar pattern but the differences in the durations are 
not as stark. Results presented in Figure 3.3 indicate transition occurrence as the patterns 
in the accelerations of the foot change. From these data, it appears the BP-AR-HMM 
output can provide relevant information about the current locomotion state and when a 
transition may occur.  
 The information about the previous stance phases could then be used to estimate 
what type of gait will be used next, either a walking or running gait, as shown in Figure 
3.3. In short, by assessing changes in the presence and duration of state output over the 
course of multiple stance phases, a probabilistic model of gait transitions could be 
generated.  
40 
  
 When there are multiple types of locomotion, recent work has made use of 
heuristic algorithms which utilize user defined patterns in linear acceleration and angular 
velocity signals [14], [18], [27], [99]. Other algorithms such as linear discriminant 
analysis, support vector machines and artificial neural networks have taken the user 
defined states and generated optimized rules for the classification of gait, with varying 
degrees of success [22], [70], [74]. This approach is similar to the process of heuristic 
models and rules developed in other machine learning methodologies. Unlike the 
aforementioned methodologies, the BP-AR-HMM optimizes the feature set using split-
merge, merge or remove features to ensure sparsity in the feature space [30], [89]. 
Finally, algorithms that use time history for the classification of gait are typically based 
on user defined locomotion types. Previous work has used a Dynamic Bayesian network 
(DBN) [13]. The DBN requires expert user input into the algorithm for classification of 
gait parameters and locomotion transitions. A characteristic which is advantageous for 
consistency in the output of the algorithm is use of the latent temporal structure of 
cyclical gait data. Both the DBN and the BP-AR-HMM use these patterns to provide 
information about the gait cycle. This is valuable in control systems because the 
additional information could be used to provide more accurate predictions of locomotion 
transitions. 
 The use of an instrumented force treadmill in this study limits the applicability of 
these results to real world control systems, because of the constraints of steady state 
velocity and the use of constant accelerations during transitions on the treadmill. 
Additionally, the calculated acceleration data provides less information than a full IMU, 
which would include additional data from rate gyros and magnetometers. Data were not 
41 
  
filtered between differentiation steps which may have magnified small errors in the data. 
These data are not directly comparable to acceleration data from an IMU as the 
orientation of the local coordinate systems will differ. There will also be a difference via 
acceleration due to gravity, which is not included in this data set. However, the given data 
are sufficient to test this algorithm on the patterned nature of gait. Utilization of the rate 
gyro to measure angular velocity of the foot may lead to more accurate classification of 
gait events and more robust predictive models for locomotion transitions. The use of 
offline analysis and able-bodied subjects shows that the output from this algorithm has 
the potential to be implemented in an assistive device. In the present study, motion data 
from the foot dorsum were chosen to represent a potential location for an IMU embedded 
in a prosthetic limb. Additionally, the unilateral methods used in this study were intended 
to represent how measures could be made within a prosthetic limb. 
 
Conclusion 
 In the present study, limited sensor data from a single idealized accelerometer was 
used for the identification of gait events prior to their occurrence. Use of this data set has 
approached the minimal information required from a sensor to determine when gait 
events will occur. The utilization of minimally sampled data could drastically reduce 
computational load while maintaining similar gait event identification accuracy. Further, 
we have built a plausible framework for the classification of locomotion state and 
transitions between walking and running gaits. In future work we plan to test and validate 
this algorithm on a variety of environmentally constrained locomotion transitions. This 
42 
  
will be followed by development of probabilistic models for predicting locomotion 
transitions in various ecological settings. 
43 
  
Bridge 
 The goal of this chapter was to investigate the effectiveness of a supervised 
machine learning algorithm for the identification of gait events during steady state 
walking, running and the transitions between the two locomotion modes. We utilized the 
BP-AR-HMM which derives features from the input data and shared them across that 
data. We have shown that we can identify features prior to IC and TO from the BP-AR-
HMM. Furthermore, we have shown that the temporal duration of the features changes 
during the walk-to-run locomotion transitions. Data in this chapter support further 
investigation of the BP-AR-HMM, and its ability to build a set of features for the 
identification of gait events outside the laboratory from minimally sampled data. Chapter 
IV will utilize the same algorithm to identify gait events prior to their measured 
occurrence for level ground walking, ramp and stair ascent/descent and spontaneous 
transitions between walking and running.  
44 
  
CHAPTER IV 
FEATURE IDENTIFICATION WITH A HEURISTIC ALGORITHM AND AN 
UNSUPERVISED MACHINE LEARNING ALGORITHM FOR PRIOR 
KNOWLEDGE OF GAIT EVENTS  
 
 Published in IEEE Transactions on Neural Systems and Rehabilitation, Donahue, S.R; 
Hahn, M. E; Feature Identification with a Heuristic Algorithm and an Unsupervised 
Machine Learning Algorithm for Prior Knowledge of Gait Events. 2022, 30, 108-114.   
Experimental work was performed entirely by me, and the writing is entirely mine. M.E. 
Hahn provided mentorship and aided in study design, general oversight, and editing and 
finalizing the final manuscript.  
 
Introduction 
 Human locomotion occurs across a range of velocities and environments that are 
not easily replicated in the laboratory. Understanding how able-bodied humans move 
naturally through a real-world environment can inform interventions and device design 
for individuals with orthopedic or neurologic conditions limiting full ambulatory 
function. Proper identification of gait events is crucial for gait analysis in the clinical 
setting, for remote monitoring of patient activity levels and the control of prosthetic 
devices [1], [2], [3]. The quantification and data driven extraction of features shared 
across human locomotion modes in a real-world environment can provide researchers 
with succinct representations of these data. Identification of key gait events, initial 
contact (IC) and toe off (TO), form the basis of our understanding of different locomotion 
45 
  
modes, commonly used in the analysis of gait [4], [5], [6]. Recently, studies have utilized 
mobile sensor arrays for the quantification of human locomotion through varied 
environments [54]. These mobile sensor arrays have allowed for the development of 
rules-based algorithms for the identification of gait events. Developments in algorithmic 
methods, particularly machine learning, have improved the potential of mobile sensor 
technology for analysis of gait in real-world environments [102]–[104]. Of particular 
interest, unsupervised machine learning algorithms for the analysis of multiple time series 
are being developed and tested for gait event detection [105], [106]. While these 
algorithms have been shown to be accurate for gait event identification in the laboratory, 
they have not been tested on data collected from a real-world environment, nor have they 
been compared to simple heuristic identifiers with minimal sensor data from the same 
data set.  
 The gold standards for identification of gait events in a real-world environment 
have been force and pressure based insole systems [93], [107]. These systems can be 
expensive, are prone to failure, and are often not feasible for use in assistive devices or 
long-term use. Additional options for mobile sensing are inertial sensors, such as 
accelerometers and gyroscopes. These sensors have been utilized extensively for the 
identification of gait events and locomotion mode, with varying success [1], [10], [11], 
[12], [13], [14], [15], [16]. One of the major limitations of accelerometers regards excess 
vibrations experienced during initial foot contact which may limit consistency of 
accelerometry data when compared to gyroscopic data. Additionally, the orientation of 
the gyroscope is unaffected by gravity [93], therefore gyroscopes may provide more 
consistent performance than other inertial quantities for the identification of locomotion 
46 
  
and gait events. Previous methodologies have been primarily focused on level ground 
walking at the subject’s preferred velocity, or have examined multiple locomotion modes, 
including ramps, stairs and running, but not all modalities in the same study [1], [12], 
[14], [17], [18], [19], [20].   
 There is currently no consensus about the placement of sensors for gait event 
detection, though it is common to use an array of sensors on the lower limb, including the 
foot, shank and sacrum [1], [3], [10], [14], [15], [16], [17], [20], [21], [22]. Recent work 
has employed a single sensor mounted on the shank or foot for gait event identification 
using a set of heuristic rules derived by the expert user, [66], [99], [110], [111], [113], a 
combination of user defined phases and supervised machine learning [17], [22], [25], 
[26], [27], [28] or frequency analysis [14], [109] to detect common gait events. Heuristic 
algorithms are advantageous because of their computational simplicity. However, they 
also require an intimate knowledge of the waveforms, for a variety of locomotion modes. 
In contrast, unsupervised machine learning algorithms have the potential to input the data 
with minimal modification by the researcher. Indeed, there have been rare attempts to use 
unsupervised machine learning algorithms for identification of gait events [105], [106], 
which have performed at the same accuracy as the supervised and heuristic algorithms. 
These methods have been focused on determining precisely when a gait event is going to 
occur. In this study we look to identify features prior to the occurrence of the gait event. 
 A key limitation of most gait event detection and locomotion classification 
algorithms is the reliance on data collected in a controlled environment, such as a 
laboratory or clinic. Current algorithms also typically require a large volume of data 
(large sensor arrays with high sampling frequencies) to determine and differentiate 
47 
  
patterns in locomotion. High density data from multiple inputs, requires computationally 
intensive methods and feature extraction for the identification of gait events. In real world 
applications, the identification of gait events with minimal sensor data is ideal as it will 
reduce the computational load and preserve the battery life of a device. 
 In the modern renaissance of machine learning, novel algorithms have been 
introduced. They have been built specifically for analysis of multiple related time series 
data (e.g. IMU data collected from the same anatomical location on multiple subjects 
undergoing the same protocol). One example is switching linear dynamical systems, 
developed previously for economic analysis and the description of dynamical processes 
[29]. This class of algorithms allows for conditions where not all possible features are 
known a priori but can be added when a new feature is discovered. From this set of novel 
algorithms, we propose use of the Beta Process Auto Regressive Hidden Markov Model 
(BP-AR-HMM) for the identification of gait events using a single stream of angular 
velocity data from a sensor on the foot sampled at 25 Hz. Minimizing the data input into 
the algorithm will reduce the number of required sensors and the computational 
complexity needed to identify gait events. This algorithm has been used previously for 
the identification of unique features in time series from similar data sources (e.g. position 
data from motion of different subjects) [30]. The approach is used to discover dynamics 
shared between time series and builds a global set of features that describe all possible 
behaviors in the data set. The result is a feature identification algorithm that is not 
constrained by user-determined states and is completely unsupervised. 
 It is currently unknown how well these methods are suited for locomotion data 
collected in a real-world environment with minimal input data. There has yet to be a 
48 
  
direct comparison between a heuristic, peak finding algorithm and an unsupervised 
learning algorithm on data collected in a real-world environment. It is critical to note that 
the approach used in this work seeks to identify features that indicate an impending gait 
event, as the knowledge prior to the gait event is useful when designing a controller for 
application in an assistive device [1]. The purpose of this study was to implement and 
compare two different algorithms, a heuristic algorithm and an unsupervised machine 
learning algorithm for the detection of gait events prior to their actual occurrence using 
minimally sampled angular velocity data collected in a real-world environment.  
 
Methods  
 This study was approved by the University of Oregon Institutional Review Board 
(protocol #: 08202018.018). All participants provided written consent prior to 
involvement in the study. Data were collected from sixteen able bodied participants (8 
male and 8 female; 31.1 ± 14.8 years, 173.4 ± 8.5 cm, 66.7 ± 7.7 kg). All analyses were 
performed offline in custom Matlab programs (Mathworks, Natick, MA). Each subject’s 
dominant foot was equipped with an IMU (Vicon, Centennial, CO) mounted on the 
dorsal aspect of the foot (Figure 2.1, Panel A). Foot ground contact information was 
recorded with two different systems. The first three participants were equipped with the 
Noraxon footswitch Telemyo DTS system (Noraxon, Scottsdale, AZ), while the other 13 
participants were equipped with Loadsol inserts (Novel Electronics, St. Paul, MN). The 
first day of the study consisted of multiple locomotion mode transitions with steady state 
locomotion between modes (Figure 2.1, Panel C). On the second day of the study, 
transitions between walking and running were examined with a protocol based upon the  
49 
  
work of Long and Srinivasan [117] the range of target average velocities ranged from 1.3 
m s-1 to 3.0, m s-1 (1.30, 1.51, 1.73, 1.94, 2.15, 2.36, 2.58, 2.79, 3.00 m s-1) encompassing 
velocities typical in human locomotion. Participants were asked to cover a distance of 90 
m, during which they received verbal feedback at the half-way time point and when five 
seconds remained in the trial (Figure 4.1) [117]. The condition order of these trials was 
randomized to minimize any learning effects (Figure 4.1, Panel C).  
 The IMU data were first down sampled to a common frequency of 100 Hz, then 
synchronized with the insole force sensors. Ground reaction force data were converted 
into an indicator variable, indicating either no discernable movement pattern, or 
consistent pattern such as stance or swing phase. The data were further labeled with each 
different locomotion mode. Input into the rules-based algorithm was the single axis 
angular velocity data about the x-axis of the IMU, approximately sagittal plane of the 
participant, sampled at 100 Hz. The data were then filtered with a fourth order zero lag 
Butterworth filter (fc  = 6 Hz). Input into the BP-AR-HMM consisted of the single axis 
angular velocity data, further down sampled to 25 Hz (Figure 4.1, Panel B).  
 There were four rules utilized for the identification of gait events from the raw 
waveform. There are two prominent peaks that were identified as consistent features for 
the identification of gait events. For the identification of an impending IC a minimum 
must be < -100 rad s-1. The feature utilized to identify impending TO was a peak > 150 
rad s-1. The two other rules to minimize misclassification of peaks were 1) temporal 
differences between minima or maxima must be ≥ 0.1 s; 2) if the difference between IC 
and TO < 0.1s, ignore the next peak. This heuristic set results in a peak finding algorithm,  
with minimal constraints, and is computationally simple compared to the BP-AR-HMM.   
50 
  
 
 Figure 4.1: Data Collection Setup and Protocol Panel A: The experimental set up 
with the IMU outlined on the dorsum of the participant’s dominant foot and the insole for 
foot contact identification. Panel B: Flow of the data from collection to identification of 
patterns for analysis. Panel C: The real-world environment course, unlabeled sections 
indicate level ground locomotion. Day 1 contained two environmentally constrained 
transitions and walk to run and run to walk transitions (course in red). Day 2 of the trial 
consisted of only walk to run and run to walk transitions (course in blue). 
  
51 
  
 The rules for gait event identification were the first instance of a state from a 
subset of states output from the BP-AR-HMM.  The identification of IC was the first 
occurrence of a state that was not 5, 6 or 13.  The rules for the identification of TO were 
the first instance of either state 2, 6 or 16.   
The BP-AR-HMM is a machine learning algorithm that provides a spatial-
temporal statistical model and classifications for related time series data [30], [89]. 
Briefly, the Beta Process provides a sparse subject independent feature set, to be used by 
the Auto-Regressive Hidden Markov Model (AR-HMM). The key element of this 
algorithm, the beta process, gives the algorithm the ability to add and remove states. For 
example, if two states describe the same feature they will be combined into a single state. 
This development of dynamic features is crucial for use in semi-supervised or 
unsupervised machine learning. The HMM assumed that the set of distributions and 
transitions between dynamic features follow a Markov process. This process however 
was modified with the presence of the AR model, which does not assume temporal 
independence. The vector auto regression approach provided an estimation of current 
state, based upon the previous time steps, the available states and distributions based on 
the HMM. A sticky parameter was used in the BP-AR-HMM to ensure the capture of 
biomechanically relevant states by encouraging self-transitions [30]. Performance 
evaluation of the BP-AR-HMM included the critical timing of states and how 
consistently the features were used across all locomotion modes. Each stance and swing 
phase were identified via the force data from force-measuring insert. The locomotion 
mode of each step was identified, and gait events were identified using rules generated 
52 
  
from the states output from the BP-AR-HMM. Each gait event was identified as the first 
instance of force being higher than 20 N. 
 Classical machine learning methodologies rely on three different groupings of the 
data, a training data set (~70%), validation set (~15%), and test set (~15%). In this 
paradigm, the machine learning algorithms are trained on the training set, and rules are 
developed for classification or prediction. During training the validation set is tested with 
the partially trained algorithm to ensure the algorithm is not being overfit to the training 
data. If the training accuracy of the algorithm is higher than the validation and testing 
accuracy the algorithm is considered to be overfit. After training, the test data are used as 
a measure of accuracy on a novel dataset (i.e. the test set). These test sets are typically a 
single novel subject, as in the Leave One Out Cross Validation (LOOCV) procedure. The 
BP-AR-HMM utilized a reversible jump Markov Chain Monte Carle method, which does 
not require the use of LOOCV, The details of this algorithm are beyond the scope of this 
paper, but can be found here: [118]. 
 
Results  
 The output from for the identification of features that occur prior to identified gait 
events with heuristic rules resulted in a 94.87% accuracy of all gait events (TO and IC). 
Accuracy in this study was calculated as the number of features identified,  divided by the 
number of measured gait events. Figure 4.2 demonstrates the consistency with which the 
state output indicate impending gait events. The average temporal difference between 
measured IC from the insoles and the peak identified prior to IC was 186.32 ± 86.70 ms, 
and the difference between measured TO and estimated TO was 63.96 ± 46.30 ms (Table 
53 
  
4.1). The only feature that occurred consistently after the measured gait event was IC for 
stair ascent. Figures 4.3, 4.4 and 4.5 (Panel A) show where the estimated IC and TO 
events are identified based upon the aforementioned rules. The accuracies presented in 
Tables 4.1 and 4.2 for IC and TO are shown out of all of the measured footfalls.   
 
 The output from the BP-AR-HMM resulted in 16 states that described the entirety 
of the data input. Of these states, 12 were considered viable features for the description of 
Table 4.1: Heuristic Algorithm Accuracy and Lead Time  
 
 
 
 
 
 
 
 Overall accuracy of the Heuristic algorithm and time differences between the 
features identified for the prior identification of gait events relative to observed gait 
events. All of the timings for the identified features are prior to the measured event.   
locomotion in a real-world environment. From these, the identification of IC and TO was 
approximately 99.6% across all locomotion modes (Table 4.2). Nine of the 12 states 
occurred in >65% of gait cycles (Table 4.3). Each state described a specific pattern in the 
angular velocity waveform, and transitions between the states were temporally 
dependent.  For the current paper, analysis was constrained to states considered to be the 
most descriptive of gait phases (Table 4.3). States such as 16, 11 and 6 tended to occur 
during the periods described by IC and TO, therefore they are counted as occurring in 
both stance and swing phase of a given gait cycle. States 5 and 6 were considered to best 
describe swing phase and the absence of these states was considered to be the feature that 
indicated impending foot contact. States 2 and 16 were considered to be the best states 
54 
  
that described pre-TO and TO in the majority of locomotion modes. These states are 
presented in Figures 4.3-4.5, showing representative waveforms of all steady state 
locomotion modes. 
 
Table 4.2: BP-AR-HMM Algorithm Accuracy and Lead Time  
 
 
 
 
 
 
 
 
 Accuracy of a BP-AR-HMM gait event identifier and time differences between the identified 
features relative to observed gait events. All of the timings for identified features are prior to the 
measured event.   
Table 4.3: Primary States Output From the BP-AR-HMM 
 
 
 
 
 
 The percentage of the stance and swing phases in which each state occurs.  
 We used simple rules based upon features derived by the BP-AR-HMM to 
identify gait events, prior to or just after their occurrence. Figure 4.3 shows which states 
were optimal for the identification of gait events prior to the event. Features across 
locomotion mode were found to have temporal consistency for the prior identification of 
gait events (Table 4.3). Across locomotion modes, the states occurring most consistently 
were states 2 and 16 (Tables 4.1 and 4.2). State 6 generally occurred throughout swing 
55 
  
phase, however it was occasionally used to detect impending TO, as needed (Table 4.3). 
Specifically, if states 2 or 16 were not observed then the first occurrence of state 6 was 
used to identify impending TO (Table 4.2). The states used to identify impending IC from 
the BP-AR-HMM was the first occurrence of a state that was not state 5, 6 for most 
locomotion modes (Figure 4.2). The notable difference was with running, where the rule 
for identifying IC was the first instance when the state was not 13.  
 
Figure 4.2:  Gait Event identifiers from the BP-AR-HMM. These graphs present five 
time points before and after a gait event and the proportion of footfalls when a selected 
state output from the BP-AR-HMM occurs. The states that identify swing phase and for 
all locomotion modes other than running are states 5 and 6. The absence of state 5, 6 and 
13 are indicative of IC. The identification of TO can be achieved with first occurrence 
states 2, 16.  
 
 
 
56 
  
  
Figure 4.3:  Representative foot ground contacts for Level Ground Walking (A & C) and 
Running (B & D). Panels A and B show the angular velocity plotted in Black with the 
identified foot contacts in Blue. Panels C and D show output from the BP-AR-HMM in 
Green plotted with the angular velocity in Black. The states used for identification of gait 
events are labeled in panels C and D. 
57 
  
 
Figure 4.4.  Representative foot ground contacts for Stair Ascent (A & C) and Descent 
(B & D). Panels A and B show the angular velocity plotted in Black with the identified 
foot contacts in Blue. Panels C and D show output from the BP-AR-HMM in Green 
plotted with the angular velocity in Black. The states used for identification of gait events 
are labeled in panels C and D. 
58 
  
Figure 4.5:  Representative foot ground contacts Ramp Ascent (A & C) and Descent (B 
& D).  Panels A and B show the angular velocity plotted in Black with the identified foot 
contacts in Blue. Panels C and D show output from the BP-AR-HMM in Green plotted 
with the angular velocity in Black. The states used for identification of gait events are 
labeled in panels C and D.  
59 
  
Discussion 
 The purpose of this study was to implement and compare two algorithms, a 
heuristic algorithm and an unsupervised machine learning algorithm for identification of 
features and states for the detection of gait events prior to their actual occurrence using 
minimally sampled angular velocity data collected in a real-world environment. In this 
study we built two classifiers able to provide prior identification of gait events across 
multiple locomotion modes; including level ground walking, running, stairs and ramps 
from a single angular velocity waveform. The BP-AR-HMM was more accurate in its 
identification of gait events than the heuristic algorithm across all locomotion modes. 
From the output of the BP-AR-HMM, identification of impending TO was most 
consistently achieved by identifying the first occurrence of either state 2 or 16, and 
identification of impending IC was most consistently achieved by the identification of 
states 5 or 6. 
 The BP-AR-HMM accuracy for estimation of gait events was >97% across 
locomotion modes, which was greater than the 94% overall accuracy of the heuristic 
algorithm. Utilizing a single data stream from a gyroscope in a real-world setting, we 
were able to achieve accuracy at levels similar to others who have used larger sensor 
arrays to collect data in a controlled environment [111], [119]. Although the heuristic 
algorithm was less accurate than the BP-AR-HMM the results remain similar to previous 
work with the addition of data from multiple locomotion modes collected in a real-world 
environment [66], [110]. 
 The need to reposition the limb during swing phase is consistent across all 
locomotion modes and velocities. This allowed for the identification of TO with simple 
60 
  
rules, as the limb pushed off from the ground in rapid plantar flexion into swing phase, 
which had a similar pattern across locomotion modes. In contrast, during stance phase 
there were distinct differences in the angular velocity waveform across locomotion 
modes (Figures 4.3, 4.4 and 4.5). These differences led to differing state output from the 
BP-AR-HMM, indicating that this algorithm may be useful for classification and 
prediction of locomotion mode. 
 Previous studies developed gait event detection algorithms for up to six different 
locomotion modes (walking, ramps, stairs, ascent and descent), with >95% accuracy [18], 
[66], [91], [108], [111]. To build real world applications for this type of gait event 
detection system, it is important to anticipate when a gait event is going to occur. One 
other study implemented an algorithm with capacity to provide predictions of gait events 
using an Adaptive Neuro-Fuzzy Bayesian algorithm from three IMUs on the lower limb 
of the subject, collected at 100 Hz over level ground and ramp walking [108]. That 
approach was able to predict gait events based upon the previous gait event but did not 
provide any lead time for the calculation of the event and it required up to 100 ms to 
make a decision if a gait event was going to occur [108]. Another study developed an 
HMM using data from a foot mounted gyroscope with a detection accuracy of 100% and 
an average delay of 43 ms for level ground walking [17], and another group developed an 
HMM using data from a gyroscope with mean detection delay of 60 ms [93]. We have 
improved upon these results by identifying features prior to the occurrence of the gait 
event prior. A further improvement is that the current study utilizes less data than the 
previous studies. 
61 
  
 Other comparable work includes that of Figueiredo et al [111], which examined 
multiple locomotion modes with a heuristic model based upon data collected from a 
gyroscope on the foot of a healthy subject and then input into a finite state machine for 
the detection of gait events. Their approach was reported to detect IC across multiple 
locomotion modes at a rate of > 96.98%, and TO at a rate of > 95.89% [111]. We 
achieved slightly higher accuracy with > 97.26% using the BP-AR-HMM across gait 
events and locomotion modes. Another study used a heuristic approach with a single 
IMU on the shank, identifying gait events with the following time gaps: level ground IC 
with 16 ± 9 ms and TO at –16 ± 15.9 ms, ramp ascent IC at 18.8 ± 11.6 ms and  TO at -
17.2 ± 21.3 ms [120]. In the present study, the features using a heuristic algorithm had a 
lead time of > 100 ms, with the exception of stair ascent, for IC and > 60 ms for TO 
(Table 4.1). The lead time for the BP-AR-HMM identification of gait events was > 40 ms 
for both IC and TO (Table 4.2). Previous studies utilizing combinations of machine 
learning and EMG have identified gait events with varying accuracy [102]–[104]. These 
approaches have shown the efficacy of machine learning and neural networks for the 
identification of gait events with EMG for level ground walking, with accuracy levels of 
87.4% [104], 96.1% [102], and 94.9% [103]. These values are comparable to our level 
ground walking accuracy with the output from the BP-AR-HMM and the heuristic rules 
generated. It should be noted that these previous studies were focused on determining 
when a gait event had occurred. In contrast, in our approach the BP-AR-HMM and the 
heuristic model identified the key portions of gait prior to the actual event with data 
collected in a real-world environment. 
62 
  
 Limitations of this study include but are not limited to post hoc, offline data 
analysis; therefore, we have not yet demonstrated the ability to detect gait events in real 
time. Although the BP-AR-HMM output has a higher accuracy than the heuristic 
algorithm, a simpler algorithm might be easier to include in a control algorithm as it is 
less computationally intensive, therefore reducing the temporal delay between the sensor 
data input and an identified gait event. Additionally, although the data were collected in a 
real-world environment, all subjects completed the same course, which is not 
representative of the natural variation experienced in daily locomotion. 
 
Conclusion 
 In this study we have shown the efficacy of both a heuristic algorithm and an 
unsupervised machine learning algorithm for the identification of features prior to gait 
events based on a single stream of minimally sampled angular velocity data. This leads us 
to believe that in practice a near real time system can be implemented for subject 
independent identification of gait events across locomotion modes and velocities with 
either a heuristic, user defined rule set, or features extracted with an unsupervised 
machine learning algorithm. The key to the BP-AR-HMM used in this study is the Beta 
Process, allowing the model to generate and remove states and share them across all input 
time series, thus allowing the construction of a robust subject independent model. For the 
identification of gait events, this model was at least as accurate as previous approaches, 
while using less sensor data. The outcome of this work indicates that the model used may 
be useful in the classification and prediction of locomotion mode. Such an approach 
could be adapted for use in the control of assistive devices.  
63 
  
Bridge 
 The goal of this chapter was to investigate the effectiveness of an unsupervised 
machine learning algorithm for the identification of gait events during dynamic 
locomotion in an ecological environment. We utilized the BP-AR-HMM as an 
unsupervised machine learning algorithm, to derive features and share them across the 
input data. We have also developed a heuristic algorithm for the identification of features 
prior to gait events which will be crucial for the measurement of gait events in a real-
world environment. The methodology of this chapter is the basis for subsequent chapters.  
While the modality changes in the next chapters to running, this work sets the basis for it.   
 
64 
  
CHAPTER V 
VALIDATION OF RUNNNING GAIT EVENT DETECTION ALGORITHMS IN A 
SEMI-UNCONTROLLED ENVIRONMENT  
 
This work is currently in preparation for submission to Sensors.  Dr. Michael E. Hahn, 
provided mentorship including assistance with study design, data interpretation, and 
editing and finalizing the final manuscript. 
 
Introduction 
 The biomechanical analysis of running outside of the laboratory can be a useful 
tool for real-time or session by session feedback during training for coaches and athletes 
[121], [122]. The standard in human locomotion research for identification of gait events 
is the use of three-dimensional (3D) motion capture and ground reaction force data. 
These systems typically utilize multiple cameras and force plates in a controlled 
laboratory environment. Despite being the gold standard, these methods require 
expensive equipment, large indoor facilities and technical expertise [107], [123], [124], 
thus limiting their practical use in clinical or sporting environments. Previous studies 
have shown the efficacy of utilizing mobile sensing units for the detection of gait events 
inside and outside the laboratory [93], [125]. However, there has been a lack of validation 
of these algorithms for gait event estimation outside the laboratory. This work expands 
upon current validation techniques for the estimation of gait events and foot contact 
duration during running in a semi-uncontrolled environment. 
65 
  
Over the past decade there has been extensive work to understand Inertial 
Measurement Unit (IMU) data in the development of techniques to evaluate human 
locomotion and identify gait events for running in a controlled laboratory setting [20], 
[41], [44], [49], [126]–[129]. There are typically 9 sensors in a standard IMU: tri-axial 
accelerometers (linear acceleration), tri-axial rate gyroscopes (angular velocity) and tri-
axial magnetometers (magnetic field). These sensors have been used for the measurement 
of biomechanical variables in several environments and over different durations in and 
out of the laboratory [122], [130], [131]. However, data from IMUs cannot be used for 
analysis of biomechanical variables without thorough pre-processing of the data and 
specific algorithms [38]. The bases of these algorithms rely explicitly on expertise of the 
researcher to understand segmental accelerations and angular velocities throughout the 
gait cycle, and then developed rules for the estimation of foot contact, via gait events: 
initial contact (IC) and toe off (TO). Some of these approaches have been developed 
specifically for running, with sensors on the foot, shank and the sacrum [126]–[128], 
[132], [133]. Application of these analyses have led to measurement of trail running, 
marathons, and training runs for athletes ranging in skill from recreational to competitive 
collegiate athletes [39], [44], [130], [134]. Prior work in this research space has shown 
that consistent features can be extracted from data in the laboratory and in the real-world, 
with modest efforts made to validate the estimated biomechanical outcomes in the real-
world environment [135], [136]. Force plates have been used previously to validate 
identification of gait events [41], [126], as have force-instrumented treadmills [42], [48], 
[131]. However, the validations of these methods did not account for the possible range 
of velocities and skill levels in an outdoor semi-uncontrolled environment.  
66 
  
The purpose of this work was to validate identification of gait events using data 
from IMUs in a semi-uncontrolled environment. Second, this study sought to expand the 
range of velocities tested with these algorithms and expand the range of participant skill, 
from novice runners who run < 5 miles per week to runners who can run a 5k race in sub-
15-minute time. We expected the estimation of gait events and foot contact time to be 
more accurate for the foot mounted IMUs in comparison to the sacral mounted IMU 
across the range of velocities. Portable insoles developed for the measurement of in-shoe 
forces [10] were used as the standard for validation of the identification of IC and TO in a 
real-world environment. The algorithm developed in this study will be considered valid if 
the RMSE in overall contact time across the range of velocities is less than 0.04 seconds 
across the range of velocities (representing < 5-6% total contact time at jogging/running 
velocities). 
 
Methods 
Data were collected from 15 participants, (9 male, 6 female, age: 23.6 years, 
height: 178.3 cm, mass: 73.5 kg) as part of a larger study (Table 5.1). All analyses were 
performed using custom Matlab programs (Mathworks, Natick, MA). Multi-axis IMUs 
(Casio, Tokyo, Japan) were mounted on the dorsal aspect of the participants’ feet and 
approximately on the sacrum (clipped to the back of the participants waistband). These 
sensors recorded 3D linear accelerations and angular velocities at 200 Hz. Inertial data 
were post processed with a Kalman filter to orient the vertical axis of the local (IMU) 
coordinate system to gravity. Foot-shoe normal force data were recorded with Loadsol 
insole force sensors (Novel Electronics, St. Paul, MN) at 100 Hz. Standard GPS data  
67 
  
Table 5.1: Distribution of Average were measured with a Garmin 
Running Velocities  
Average Running Number of Forerunner (Kansas City, KS). Participants 
Velocity (m s-1) Participants 
2.23 1 performed progressively faster 400 m running 
2.33 1 
2.43 1 trials (four to five, with the fastest velocity 
2.55 4 
2.68 6 being optional), with participant selected rest 
2.82 6 
2.98 6 durations, on a square practice track, based on 
3.15 7 
3.35 9 self-reported race pace for a 5k. An example 
3.57 9 
3.83 9 of the paces run by a participant are shown in 
4.12 8 
4.47 8 Table 5.2. The total range of velocities run by 
4.87 7 
5.36 7 participants was 2.4 – 5.4 m s
-1. The 
participants monitored their pace with a standard wrist-mounted Garmin GPS display. If 
a participant missed their pace by more than 2 seconds, they were asked to repeat that 
pace after suitable rest. These velocities represent typical training and race paces for the 
majority of recreational and high-level distance runners [44], [137].  
 
 
Table 5.2: Exemplar Paces for Participant with a 7-Minute 
 per Mile 5-km Race Pace  
 EXAMPLE PACES Minutes per Mile 
Pace 1 8:30 
 
Pace 2 8:00 
 Pace 3 7:30 
Pace 4 (Race Pace) 7:00 
 Pace 5 (Optional) 6:30 
 
Data Processing 
Foot-shoe normal force data measured from force sensing insoles were considered 
the standard reference for identification of measured gait events [10]. Inertial signals and 
68 
  
force data were time-synced using controlled unilateral ‘foot-stomps’ before and after 
each trial. The IMU data were then down-sampled to match the force sensing insole 
sampling frequency (100 Hz) and filtered using a 4th order low-pass zero-lag Butterworth 
filter (fc = 35 Hz) (Figure 5.1). This filter was chosen as it was more conservative in the 
reduction of noise for the accurate identification of gait events, using peak accelerations 
[44]. Force data < 50 N were set to zero. 
 
 
Figure 5.1: Flow chart of the data post processing steps and the algorithmic methods for 
the identification of gait events from the force sensing insole data, and data from multi-
axial IMUs at both anatomical locations.  
 
69 
  
Algorithmic Descriptions 
Algorithms used in this study are briefly described here. A more thorough 
treatment of the algorithms can be found in Figure 5.1. Identification of gait events with 
foot-shoe normal force data utilized a threshold of 50 N. Gait event estimation from the 
IMU data utilized distinct spatial and temporal rules. Identification of gait events from 
the dorsal mounted IMUs estimated initial contact by identifying peaks in the resultant 
acceleration. The spatial rule for ICfoot was a minimum resultant acceleration of 50 m s
-2. 
The temporal rule for determining ICfoot was a minimum duration of 500 ms between 
estimated consecutive ICfoot [40]. Identification of TOfoot was performed by searching a 
specific temporal window beginning 100 ms after estimated ICfoot, ending at the half-
width of the estimated stride time. In this window TOfoot was either identified as the local 
maxima of vertical acceleration or the first instance when the vertical acceleration was 
greater than three times gravity [40], [128]. Gait event detection using the sacral IMU 
utilized the anterior posterior accelerations. The spatial rule for the identification of 
ICsacrum was local minima with a maximum value of 5 m s
-2 in the posterior direction. The 
temporal rule for ICsacrum was a minimum temporal difference of 200 ms between the 
identified ICsacrum [40]. The identification of TOsacrum with a search window was either 
with the maximum acceleration in the anterior direction, or the maximum positive slope 
of the acceleration in the anterior direction [41]. Exemplar output of these algorithms are 
presented in Figures 5.2 and 5.3.   
Results 
The gait events estimated using data from foot mounted IMUs were more accurate 
than the those estimated using the sacral mounted IMU when compared to the force 
70 
  
measurement standard identification (Figure 5.4, Table 5.3). Foot mounted IMUs had a 
larger RMSE than the sacral mounted IMU for the identification of gait events in the 
slowest running velocity conditions. Across the range of velocities, the algorithms 
identified ICfoot and ICsacrum with similar accuracy (Figure 5.4, Table 5.3). The 
identification of TOsacrum was more accurate at slower velocities than TOfoot. However, at 
running velocities > 3.57 m s-1 the algorithm identified TOsacrum after the force measured 
TO and with larger RMSEs than TOfoot (Figure 5.4).  
 
Table 5.3: Root Mean Square Error for the Identification of Gait Events and Estimation 
of Foo t Contact 
Foot Mounted Sacral Mounted 
Velocity Initial Toe  Contact Initial Toe  Contact 
(m s-1) Contact (s) Off (s) Time (s) Contact (s) Off (s) Time (s) 
2.24 0.074 0.092 0.025 0.045 0.059 0.043 
2.33 0.026 0.060 0.045 0.038 0.021 0.030 
2.44 0.024 0.044 0.026 0.035 0.027 0.026 
2.55 0.033 0.064 0.054 0.039 0.036 0.024 
2.68 0.024 0.040 0.038 0.028 0.028 0.490 
2.82 0.024 0.034 0.029 0.030 0.032 0.030 
2.98 0.018 0.039 0.035 0.024 0.032 0.027 
3.16 0.024 0.036 0.032 0.029 0.042 0.029 
3.35 0.024 0.035 0.025 0.039 0.051 0.036 
3.58 0.018 0.023 0.022 0.019 0.042 0.037 
3.83 0.020 0.031 0.028 0.024 0.045 0.036 
4.13 0.019 0.024 0.023 0.020 0.047 0.039 
4.47 0.021 0.026 0.021 0.020 0.044 0.036 
4.88 0.018 0.026 0.022 0.020 0.035 0.030 
5.36 0.020 0.038 0.035 0.020 0.033 0.026 
In these data, 0.01s is equivalent to a single sample difference between the IMU 
estimation of gait events and contact time and the values derived from the force data.  
 
 
71 
  
Estimation of foot contact duration with data from the foot mounted IMU showed 
an overestimation of foot contact duration at slower running velocities (< 2.55 m s-1). 
Estimated foot contact duration from the dorsal mounted IMUs generally had a smaller 
RMSE across the range of velocities than the sacral IMU estimated contact duration 
(Table 5.3 and Figure 5.5). The algorithm for the sacral mounted IMU data consistently 
underestimated the duration of foot contact at slower velocities, < 3.5 m s-1, and 
overestimated the duration of foot contact at faster velocities > 3.5 m s-1 (Figure 5.5).   
 
Figure 5.2: Acceleration waveforms from the IMUs mounted on the right foot (orange 
and blue waveforms), with superimposed foot contacts (dashed black square waves) 
identified from the force measuring insoles. The estimated ICfoot are shown in the filled 
circles and estimated TOfoot is shown in the filled squares. The search windows used for 
the identification of TOfoot are shown in the solid black rectangles. Panel A shows data 
from a 2.24 m s-1 run and panel B shows data from a 5.36 m s-1 run.   
 
 Analysis of foot contact by trial mean were examined using Bland Altman plots. 
These plots present a comparison between the average difference between IMU estimated 
foot contacts and force measured foot contacts (Figure 5.6). The offset of the foot IMU 
72 
  
estimate was 0.004s with [-0.005 0.013] 95% Limits of Agreement (LoA) and the sacral 
IMU estimate offset was 0.001s with [-0.018 0.021] 95% LoA. These results show more 
variability at the slower velocities, and longer foot contacts, for both the sacral and foot 
mounted IMUs (Figure 5.6). We used a linear model to examine the relationship between 
the IMU estimated foot contacts and the force measured foot contact as well (Figure 5.7). 
Regression analysis of the sacral estimation of foot contact resulted in an r2 value of 0.73,  
a moderate correlation, and a slope of 0.60, indicating an underestimation of the foot 
contact duration . Regression analysis from the foot IMU estimated contact duration 
resulted in an r2 value of 0.91, a strong correlation, and a slope of 1.15, indicating a slight 
overestimation of the foot contact duration.  
 
Figure 5.3: Acceleration waveforms from IMUs mounted on the sacrum (blue 
waveforms), with superimposed foot contacts (dashed black square waves) identified 
from the force measuring insoles. The estimated ICsacrum are shown in the filled circles 
and the estimated TOsacrum are shown in the filled squares. The search windows used for 
the identification of TOsacrum are shown in the solid black rectangles. Panel A shows data 
from a 2.24 m s-1 run and panel B shows data from a 5.36 m s-1 run.   
 
73 
  
 
Figure 5.4: Time differences in the identification of gait events between measured forces 
and estimated IMU gait events. Negative time differences indicate that the IMU 
estimated gait event occurred prior to the measured gait event. The identification of 
TOsacrum had larger error rates due to the temporal windowing of the data, and the wider 
range of velocities used than previous work. 
 
 
74 
  
 
Figure 5.5: Differences in force measured and IMU estimated contact time across the 
range of average velocity trials. The sample sizes for each velocity are shown in Table 
5.1. 
 Discussion 
 The purpose of this work was to validate the identification of gait events using 
data from IMUs in a semi-uncontrolled environment. We collected inertial data from two 
IMUs attached bilaterally on each foot and one approximately on the sacrum, from 
participants of varying running skill levels. We developed and implemented two 
algorithms for the identification of gait events from an IMU, based upon previous work 
[40], [128], [133]. The output of these algorithms, IMU estimated gait events were 
validated against the standard of a force sensing insole. The main findings of the work are 
summarized briefly here: 1) Estimated and measured contact times generally decreased 
across the range of running velocities; 2) Identification of gait events from the foot 
mounted IMUs was more accurate than identification of gait events from the sacral 
75 
  
mounted IMU; 3) Foot contact was identified with an average RMSE of < 0.04 s across 
the range of average running velocities for the foot and sacral mounted IMUs.   
 
Figure 5.6: Bland-Altman plot displaying the average differences between the estimated 
contact durations from the IMUs and the measured gait events from the force sensing 
insoles from the Sacral IMU in Panel A and the Foot Mounted IMU in Panel B. Each dot 
represents an average velocity trial by a participant. Differences greater than 0 are an 
overestimation of contact time by the IMU. Differences less than 0 are an 
underestimation of foot contact by the IMU.  
 
 
76 
  
 
Figure 5.7:  Time differences between the IMU estimated foot contact and force 
measured foot contact. Sacral mounted IMU data generally underestimate foot contact, 
with a slope of 0.60. Foot mounted IMU data generally overestimate foot contact 
duration at faster velocities slope of 1.15. The foot mounted IMU was more accurate 
across the range of velocities for the estimation of foot contact in comparison to the 
sacral mounted IMU.  
 
 It is necessary to address the accuracy of the force sensing insoles and their 
measurement of contact time with respect to velocity, as this measure was considered our 
standard.   Foot contact time measured from the insoles followed a similar pattern to 
77 
  
[137]; a decrease in contact time with an increase in velocity [138]. Foot contact 
durations as measured by the force insoles were consistently longer in duration than those 
reported in [137]. A contributing factor in this was the participants recruited for these 
studies. Our study recruited participants from a range of running skills, from the truly 
novice, to endurance athletes able to run at 15 min 5 km race pace. The samples tested in 
previous studies consisted of middle distance and sprint athletes.  
 The time between ICfoot and the force sensing insole approached zero difference 
as velocity increased (Figure 5.5). We used an algorithmic method most similar to [9], 
which proposed the identification of peak resultant acceleration across the range of 
velocities, and this has become a common approach for the identification of ICfoot [41], 
[139]. The work of [41] utilized a force plate and single foot strikes for the identification 
of gait events, reporting differences in ICfoot ranging from -7.3 to 3.3 ms. In our study 
error in the estimate of ICfoot ranged from -63 to -5 ms (Figure 5.4). They additionally 
reported TO differences ranging from – 53 ms to -32 ms, compared to our range of 
TOfoot differences from -78 to 2 ms (Figure 5.4) [41]. We have improved on the 
identification of TOfoot, by incorporating their rule for the identification of TOfoot as a 
secondary rule to the identification of peak vertical acceleration, in a temporal search 
window [40], [41]. Our work had more variability in the range of TOfoot identification, 
however, only at the three slowest running velocities, as there was only one participant 
who ran at these average running velocities.  
 Foot mounted IMUs overestimated contact time at running velocities < 2.52 m s-1 
(Figures 5.4 and 5.6). The slope of the regression line for this comparison is 1.15, 
providing further evidence of overestimation of foot contact time overall (Figure 5.7).  
78 
  
Another study reported an offset of -0.047 s with 95% LoA of [-0.059 0.154] [40], 
compared with our findings of 0.004s with a 95% LoA [-0.005 0.013], showing an 
overall improvement. It should be noted that Benson et al. reported challenges in the 
identification of TOfoot for one of their participants, which may have contributed to the 
larger offset [40]. Our algorithm was developed on a wider range of velocities, 
participant running abilities and on data collected in a semi-uncontrolled environment. 
The set of algorithms we have developed captured the gait events more effectively than 
previous work. The overestimation of foot contact at the slower running velocities would 
likely be remedied by the inclusion of a greater number of less experienced runners. 
 Estimations of contact time from the sacral mounted IMUs between the velocities 
of 2.52 and 3.16 m s-1 matched the measured contact times (Figure 5.5). However, the 
RMSE TOsacrum at velocities >3.57 m s
-1 is greater than 0.04 (Table 5.3). This stems from 
difficulty in estimating the temporal window in which to identify TOscrum. Identification 
of the temporal duration of the window in which to estimate TOsacrum is related to aerial 
time, which in this study ranged from 40 to 100 ms. Differing window lengths were 
tested to accurately identify TOsacrum. The most accurate of these resulted from window 
termination 20 ms before the next ICsacrum (Figure 5.3). A dynamic temporal window for 
the estimation of TOsacrum could be a way to further improve estimation accuracy of 
TOsacrum and foot contact. The study by [48] reported an average underestimation of foot 
contact from sacral mounted IMUs from -0.017 to -0.001s, while the current findings 
show the average contact time differences from -0.011 to 0.027s. Specifically, at average 
running velocities < 2.56 m s-1 the sacral IMU underestimated foot contact time, and at 
average running velocities >3.16 m s-1 contact time was overestimated. Another study, 
79 
  
[40], reported a foot contact offset from a sacral mounted IMU of 0.029s with a 95% LoA 
[-0.069 0.010]. Our analysis had a smaller offset of 0.001s with 95% LoA of [-0.018 
0.021].  
 There were multiple limitations in this study. First, the temporal synchronization 
between the IMUs and the force sensing insoles presented an initial challenge. While 
participants were running, the IMU clock and the force sensing insole clock could be off 
by a 0.01s. These errors resulted in zeros being added or removed during swing the phase 
from the force sensing insole data. These errors were partially accounted for by foot-
stomp synchronization between each trial. We did not want to artificially decrease the 
RMSE in the results of this work by removing data due to imperfect synchronization. 
This in turn led to cumulative errors between the measured and estimated gait events. 
These errors would be more concerning if there were large differences in the estimation 
of contact time across the range of velocities. As it is, we feel that it is important to 
include the data in full, to fully represent the performance of the techniques used.  
A single participant’s data heavily influenced the error in the model. This 
participant ran less than 5 miles per week and was truly a novice runner. When the data 
from this participant were removed from the data set, ICfoot ranged from -20 ms to -1 ms, 
compared to the original -63 ms to 5 ms (a ~60% improvement) from velocities of 2.67 m 
s-1 to 5.4 m s-1 and TOfoot temporal differences ranged from -30 ms to -1 ms, compared to 
the original values of -78 ms to 2 ms (a ~75% improvement) across the range of 
velocities. Further, exclusion of this participant from the analysis reduces offset in the 
foot mounted IMU in the estimation of foot contact from an offset of 4 ms to an offset of 
1 ms, while the 95% LoA remains the same. The slope of the linear regression also 
80 
  
decreased from 1.15 to 1.10 when this participant’s data are removed from analysis, 
indicating reduced overestimation of contact time in the model. We chose to include this 
participant as a representative example of the truly novice runner. We expect that if more 
runners from a wide range of running levels were included in the data set, we would see 
decreased variability in the identification of IC and TO from the minimal and maximal 
running velocities included in this work. 
 In conclusion, our results demonstrate the validity of two different gait event 
detection algorithms for a range of running velocities and skill levels in a semi-
uncontrolled environment. We used data from a wider range of participant skill levels, 
and a wider range of running velocities than previous studies. We have demonstrated the 
utility of these algorithms for identification of foot contacts in a semi-uncontrolled 
environment. The use of a gait event detection system in a real-world environment needs 
to be validated for a broader set of conditions before we can estimate other biomechanical 
variables from these devices. The next steps in this research are the estimation of ground 
reaction force waveforms from the force sensing insole data, and testing of these 
algorithms in a truly uncontrolled environment.  
81 
  
Bridge 
 The goal of the previous chapter was to investigate the effectiveness of two rules-
based algorithms for the identification of foot contacts in a semi-uncontrolled 
environment. Given the consistency of these results we were able to make the corrections 
to the force data necessary to time-align initial contacts from the IMU and force sensing 
insoles. In the next chapter, we move forward with the estimation of whole GRF 
waveforms across the range of velocities in a semi-uncontrolled environment. During 
these trials participants ran at approximately a set pace, with minor fluctuations in 
running velocity, such as at the beginning and the end of the trial. 
82 
  
CHAPTER VI 
VALIDATION OF RUNNNING GAIT EVENT DETECTION ALGORITHMS IN A 
SEMI-UNCONTROLLED ENVIRONMENT  
 
This work is currently in preparation for submission to Journal of Biomechanics.  Dr. 
Michael E. Hahn, provided mentorship including assistance with study design, data 
interpretation, and editing and finalizing the final manuscript. 
  
Introduction 
 Wearable sensors are being used extensively for the collection of human running 
biomechanical data outside of the laboratory [40], [123], [130], [140], [141]. The primary 
wearable sensors recently used in locomotion biomechanics have been multi-axial inertial 
measurement units (IMUs), which measure linear acceleration and angular velocity data. 
Previously, IMUs have been used to estimate gait events, contact time, and other spatial 
temporal variables [40], [41], [44], [48]. Additionally, specific kinetic variables have 
been estimated from IMUs, such as joint moments, peak vertical force, impulse and 
loading rate [43], [44], [47], [140]. Other wearable sensors utilized for biomechanical 
monitoring or clinical evaluation are insole force sensors. These sensors measure force 
between the foot and shoe, and have been validated as a measure of vertical ground 
reaction forces (GRF) on a treadmill [10]. Wearable sensors have the capability to also be 
incorporated into other sensor systems for the estimation of specific external loading 
variables and internal tissue loading [142], [143], and for overall feedback during training 
[122], [144].  
83 
  
 Previous studies estimating kinetic variables associated with external loading 
have used either statistical or physical models [44], [88], [145]. In recent years, machine 
learning techniques have been used instead of physical or statistical modelling, having 
become a popular set of tools for biomechanical analysis and estimation of kinematic and 
kinetic variables. Previous work using machine learning algorithms have estimated or 
predicted gait events from IMU data [18], [146]–[148], with IMUs located either 
bilaterally on the feet or on the sacrum. Machine learning algorithms, including artificial 
neural networks (ANNs), recurrent neural nets (RNNs), among other techniques, have 
also been used to estimate kinetic variables, such as vertical impulse, loading rate and 
peak GRFs [43], [45]–[47], [149]. A few drawbacks for these studies include the 
biomechanical expertise required for estimation of these variables, significant 
preprocessing of raw data, and identification of stance phase before data can be parsed 
into a usable form. Furthermore, these machine learning models are not yet ready for 
implementation outside of the laboratory, as they have been built using data from 
controlled laboratory settings and do not capture the variability of human movement that 
occurs out of the laboratory in response to surface differences and changes in velocity 
[150].  
We propose the use of RNNs, specifically Long Short Term Memory networks 
(LSTMs) to map IMU data onto GRF waveforms measured with force sensing insoles. 
The LSTM approach was specifically developed for time series data, and mapping 
between two different waveforms [151]. The purpose of this study was to implement a 
machine learning algorithm for the mapping of inertial data to kinetic waveforms from 
participants running on a track across a range of velocities and participant skill levels.   
84 
  
 
Methods 
This study was approved by the University of Oregon Institutional Review Board 
(protocol #: 10062020.007). Data were collected from 15 participants (Table 6.1), (9 
male, 6 female, age: 23.6 years, height: 178.3 cm, mass: 73.5 kg) as part of a larger, 
ongoing project. Participant 5-km race times Table 6.1: Distribution of Average 
Running Velocities 
ranged from 30:00 - 15:30 minutes (Tables 6.1 Average Number of 
Running Participants 
and Table 6.2). All analyses were performed in Velocity (m s-1) 
2.23 1 
custom Matlab programs (Mathworks, Natick, 2.33 1 
2.43 1 
MA). Multi-axial IMUs (Casio, Tokyo, Japan) 2.55 4 
2.68 6 
were mounted bilaterally on the dorsal aspect 2.82 6 
2.98 6 
of each participants feet and approximately on 3.15 7 
3.35 9 
the sacrum (clipped on the back of each 3.57 9 
3.83 9 
participant’s waistband). These sensors 4.12 8 
4.47 8 
recorded 3D linear accelerations and angular 4.87 7 
5.36 7 
velocities at 200 Hz. Data were post processed 
with a Kalman filter to orient the local (IMU) coordinate system vertical to gravity 
(Figure 6.1 Panel B). The use of multiple inertial sensors has been suggested to lead to 
improved estimation of spatial temporal and kinetic variables, compared to a single 
inertial sensor [44], [152], [153]. Foot-shoe normal force data were recorded with 
Loadsol insole force sensors (Novel Electronics, St. Paul, MN) at 100 Hz.  
85 
  
Participants performed four to five 400 m running trials on a square track, at 
paces based upon their self-reported 5 km race pace (a total of 4 or 5 paces, with the 
Table 6.2: Example Paces and Timing for 400 m run fastest pace being 
Example Paces  Minutes per mile  
Pace 1  8:30 optional). An exemplar 
Pace 2  8:00 
Pace 3  7:30 set of paces is shown in 
Pace 4  7:00 
Pace 5 (optional)  6:30 Table 6.2. The total  
range of velocities run by participants in this study was 2.4 – 5.4 m s-1. Each participant 
monitored their pace with Garmin GPS, (Kansas City, KS). If they missed their time trial 
by more than 2 seconds, they would be asked to repeat the trial, after sufficient rest. 
These velocities represent typical training and race paces for the majority of recreational 
and high-level distance runners [44], [137].  
 
Data Processing  
Foot-shoe normal force data were measured from force sensing insoles, 
considered the standard reference for identification of measured gait events and kinetic 
variables in this study [10]. The IMU data were down sampled to 100 Hz to match the 
force sensing insole sampling frequency and then filtered with a 4th order low-pass zero-
lag Butterworth filter (fc = 35 Hz). Kinetic data were filtered with a 2nd order low-pass 
zero-lag Butterworth filter (fc = 20 Hz). The inertial signals and force data were time-
synced using foot-stomps before and after each trial. Internal clock drift between the 
insoles and the IMUs was rectified by an iterative corrections algorithm. Force data <5% 
body weight (BW) were set to 0 BW. The estimated kinetic waveforms output from the 
machine learning algorithm were filtered with the same filter as the measured kinetic 
86 
  
waveforms. Estimated foot ground contacts less than 0.050 s were set to 0 BW, as foot 
contacts shorter than 0.050 s were considered noise, having small magnitudes and not 
consistent with measured foot contacts observed in running locomotion.  
 
Machine Learning Architecture  
The overall structure of the machine learning algorithm is shown in (Figure 6.1 
Panel A). The mathematical details of LSTMs can be found elsewhere [51], [154]. The 
steps for development and testing of the machine learning models were two-fold; first the 
network architecture was optimized using the Bayesian Optimization for Deep Learning 
[155], and then the network was tested using Leave One Out Cross Validation (LOOCV). 
Bayesian Optimization for Deep Learning requires user specification of the 
hyperparameters, which are then optimized. The Bayesian Optimization was tested on the 
data set with 70% Training, 15% Validation and 15% Test segmentation of the data 
(Figure 6.1). The optimal network architecture was determined by performance on the 
Test data set. The hyperparameters optimized included the initial learning rate, gradient 
decay factor, squared gradient decay factor, L2Regularization and number of hidden 
units. The only hyperparameter that influenced the outcome of the algorithm was the 
number of hidden units. The range of the number of hidden units used in the optimization 
was [10 - 50]. Through the Bayesian Optimization process, the optimal number of hidden 
units was determined to be 42. This value was used for the LOOCV process. All other 
hyperparameters converged to the default Matlab input. Additional preliminary work 
showed that a 4 second temporal window was the most accurate for the estimation of 
ground reaction force waveforms.  
87 
  
 
Figure 6.1.  Panel A, machine learning methodology and throughput, specifically the 
input data, machine learning protocol and output. Calculated output contact time and 
kinetic variables are shown here. Panel B instrumentation on the foot, with the Kalman 
corrected coordinate system on the IMU. Panel C measured and estimated ground 
reaction forces from a participant with the smallest RMSE for a participant, Panel D 
shows the largest RMSE measured and estimated ground reaction forces from the same 
participant.   
 
Data Analysis 
 Estimated waveforms from the LOOCV were analyzed and are presented in this 
work. Initial analysis involved identification of the optimal temporal window. This was 
done via inspection of the estimated kinetic variables in Bland-Altman plots. Initial 
Contact (IC) was identified by the first instance of force >5% BW, and toe off (TO) was 
determined by the last instance of force greater than >5% BW. Contact time was 
determined by taking the temporal difference between these two discrete events. Stance 
average GRFs, impulse, peak GRFs, and average loading rate were the kinetic variables 
88 
  
calculated in this work, from the estimated force waveforms. Average loading rate was 
calculated by identifying the impact peak and then averaging the slope in the middle 60% 
of the region between IC and the impact peak [58].  
Pearson correlation coefficients (r2) were used to compare the estimated force 
data output from the LSTM to the measured insole force data. Seventy-five data points 
were used, with fifteen participants running five velocities, and each data point 
representing a 400 m time trial on a square track. Differences between the model 
estimated variable and measured waveform variable are presented in both linear 
regression and Bland-Altman plots with 95% confidence intervals (CIs) and Limits of 
Agreement (LoA), respectively. A strong correlation was defined as r2 ≥ 0.8, a moderate 
correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 0.5. Differences between 
measured and estimated gait events are presented as well as, root mean square error 
(RMSE) for each contact time, and kinetic variable.  
 
Results 
The data presented are the trial means from each subject and velocity from the 
LOOCV analysis. Waveform RMSE ranged from 0.191 – 0.309 BW, while the individual 
stance phase RMSE ranged from 0.189 to 0.288 BW (Table 6.3). Exemplar data for the 
minimal and maxima RMSE outputs for a participant are shown in (Figure 6.1 panel C). 
Estimated IC was identified prior to measured IC and IC differences ranged from 0.013 - 
0.020 s per trial (Table 6.4). The identification of TO differences ranged from - 0.012 - 
0.041 s. At velocities <3.16 m s-1, TO was estimated prior to the measured gait event. 
89 
  
However, at velocities > 3.16 m s-1, the estimation of TO occurred after the measured gait 
event (Table 6.4). Estimated and measured contact time had good agreement at average  
running velocities < 3.16 m s-1, however at average running velocities >3.16 m s-1, 
contact time was overestimated (Figure 6.2 Panel A). The Pearson’s Correlation 
Coefficient between the estimated and measured contact time showed a moderate  
Table 6.3: Stance and Waveform Root Mean Square Error correlation; r
2 = 0.795 
Average Stance RMSE Waveform RMSE 
Velocity (m s-1) (BW) (BW) (Figure 6.2 Panel B). 
2.24 0.230 ± 0.000 0.238 ± 0.025 
2.33 0.189 ± 0.000 0.191 ± 0.029 Bias in the estimate of 
2.44 0.199 ± 0.000 0.202 ± 0.028 
2.55 0.253 ± 0.080 0.271 ± 0.100 contact time was 0.020 
2.68 0.268 ± 0.056 0.266 ± 0.074 
2.82 0.267 ± 0.019 0.274 ± 0.057 with 95% LoA: [-
2.98 0.281 ± 0.020 0.309 ± 0.059 
3.16 0.262 ± 0.038 0.304 ± 0.056 0.011 0.051] (Figure 
3.35 0.288 ± 0.039 0.305 ± 0.071 
3.58 0.248 ± 0.033 0.268 ± 0.062 6.2 Panel C). Contact 
3.83 0.230 ± 0.049 0.265 ± 0.076 
4.13 0.242 ± 0.053 0.281 ± 0.072 time RMSE ranged 
4.47 0.233 ± 0.048 0.275 ± 0.073 
4.88 0.243 ± 0.043 0.284 ± 0.089 from 0.089 s to 0.021s 
5.36 0.240 ± 0.052 0.287 ± 0.101 
(Table 6.5).   
Estimated output from the measured stance average GRFs showed a consistent 
underestimation at velocities > 3.16 m s-1 (Figure 6.3 Panel A). There was a weak 
correlation between the estimated stance average GRF and the measured stance average 
GRF; r2 = 0.408 (Figure 6.4 Panel A). The agreement between the estimated stance 
average GRFs and the measured stance average GRFs were offset by -0.092 BW and 
95% LoA [-0.351 0.167] BW (Figure 6.5 Panel A). The stance average ground reaction 
force RMSE ranged from 0.063 - 0.402 BW (Table 6.5). The measured stance impulse 
decreased as the average running velocity increased. At all but the slowest velocity (2.24  
90 
  
Table 6.4: LSTM Estimated Gait Event Error 
Average Velocity IC Difference (s) TO Difference (s) 
 (m s-1) 
2.24 0.013 ± 0.000 0.041 ± 0.000 
2.33 0.019 ± 0.000 0.014 ± 0.000 
2.44 0.019 ± 0.000 0.015 ± 0.000 
2.55 0.015 ± 0.003 0.007 ± 0.003 
2.68 0.017 ± 0.005 0.014 ± 0.018 
2.82 0.015 ± 0.006 0.015 ± 0.017 
2.98 0.015 ± 0.003 0.007 ± 0.017 
3.16 0.014 ± 0.006 0.001 ± 0.011 
3.35 0.019 ± 0.004 -0.003 ± 0.022 
3.58 0.020 ± 0.005 -0.011 ± 0.012 
3.83 0.018 ± 0.005 -0.007 ± 0.012 
4.13 0.015 ± 0.008 -0.012 ± 0.009 
4.47 0.017 ± 0.008 -0.009 ± 0.009 
4.88 0.017 ± 0.003 -0.006 ± 0.008 
5.36 0.017 ± 0.005 -0.006 ± 0.005 
 
m s-1) and the two fastest velocities (4.88 and 5.36 m s-1) the impulse was overestimated 
by the LSTM output (Figure 6.3, Panel B). Estimated impulse had a weak correlation  
 with measured impulse; r2 = 0.385 (Figure 6.4 Panel B). The agreement between the 
estimated impulse and the measured impulse bias was 0.007 BW*s and 95% LoA [-0.051 
0.065] BW*s (Figure 6.5 Panel B). The RMSE across the range of velocities ranged from 
0.159 to 0.266 BW*s (Table 6.5).  
The estimated peak forces across the range of velocities were similar to the 
measured peak forces, except for at the slowest velocity (2.24 m s-1) (Figure 6.3 Panel C). 
Estimated peak GRFs were moderately correlated with the measured peak ground 
reactions forces (r2 = 0.614) (Figure 6.4 Panel C). The agreement between the measured 
and estimated peak GRFs had an offset of 0.029 BW with 95% LoA [-0.322 0.381] BW 
(Figure 6.5 Panel C). The average RMSE of peak vertical GRFs ranged from 7.798 to 
91 
  
18.132 (BW) (Table 6.5). The estimated average force loading rate was overestimated 
compared to measured loading rate across the range of velocities. Estimated loading rate 
was weakly correlated with measured loading rate (r2 = 0.405) (Figure 6.4 Panel D). The  
agreement between the measured and estimated loading rate had an offset of -6.116 BW 
s-1, with LoA [-20.475 8.243] BW s-1 (Figure 6.5 Panel D). The average RMSE for  
loading rate over each velocity ranged from [0.07 0.218] BW s-1 (Table 6.5). 
Table 6.5: LSTM Estimated Spatial-Temporal and Kinetic Variables RMSE 
Average Contact Time Stance Impulse Peak GRF Loading Rate  
Velocity  (s) Average (BW) (BW* s) (BW) (BW s-1) 
(m s-1) 
2.24 0.090 ± 0.000 0.063 ± 0.000 0.103 ± 0.000 0.266 ± 0.000 
13.592 ± 0.000 
2.33 0.021 ± 0.000 0.070 ± 0.000 0.032 ± 0.000 0.159 ± 0.000 
15.575 ± 0.000 
2.44 0.023 ± 0.000 0.070 ± 0.000 0.032 ± 0.000 0.183 ± 0.000 
12.559 ± 0.000 
2.55 0.028 ± 0.002 0.127 ± 0.082 0.048 ± 0.024 0.227 ± 0.081 
9.535 ± 3.032 
2.68 0.024 ± 0.001 0.122 ± 0.069 0.041 ± 0.016 0.221 ± 0.053 
8.733 ± 1.564 
2.82 0.024 ± 0.001 0.108 ± 0.048 0.042 ± 0.012 0.256 ± 0.053 
8.695 ± 1.839 
2.98 0.025 ± 0.006 0.121 ± 0.037 0.044 ± 0.012 0.208 ± 0.034 
7.798 ± 1.242 
3.16 0.023 ± 0.006 0.154 ± 0.073 0.041 ± 0.011 0.221 ± 0.018 
9.963 ± 4.378 
3.35 0.034 ± 0.014 0.150 ± 0.068 0.047 ± 0.013 0.213 ± 0.038 
11.383 ± 5.400 
3.58 0.039 ± 0.016 0.196 ± 0.091 0.047 ± 0.016 0.225 ± 0.056 
14.725 ± 4.691 
3.83 0.031 ± 0.012 0.177 ± 0.086 0.037 ± 0.018 0.230 ± 0.107 
14.085 ± 3.744 
4.13 0.033 ± 0.013 0.196 ± 0.093 0.040 ± 0.022 0.237 ± 0.130 
14.825 ± 4.458 
4.47 0.032 ± 0.014 0.198 ± 0.093 0.037 ± 0.019 0.226 ± 0.118 
15.561 ± 5.319 
4.88 0.029 ± 0.008 0.218 ± 0.102 0.034 ± 0.014 0.237 ± 0.118 
15.608 ± 3.667 
5.36 0.028 ± 0.007 0.198 ± 0.085 0.032 ± 0.014 0.229 ± 0.121 
18.132 ± 5.392 
 
 
 
92 
  
 
Figure 6.2: Complete analysis of contact time. Panel A shows contact time trends across 
the range of velocities. The measured contact times are in black and the estimated contact 
times are in red.  Regression analysis and 95% confidence intervals of contact time is in 
Panel B. Panel C presents a Bland-Altman plot of the difference between the estimated 
and measured contact times, and 95% limits of agreement.  
93 
  
Figure 6.3: 
Kinetic variables 
across 
velocities. 
Measured 
(black) variables 
calculated 
directly from the 
force sensing 
insoles. 
Estimated (red) 
calculated from 
the LSTM 
estimated 
waveform. The 
trends in the 
estimated data 
follow those of 
the measured 
data, there is 
however an 
offset between 
the two. 
 
94 
  
Figure 6.4:  
Regression  
analysis of 
estimated 
kinetic 
variables with 
mean and 95% 
confidence 
intervals. Each 
data point 
represents an 
average 
velocity trial. 
The color of a 
trial represents 
the average 
running 
velocity of the 
trial.  
95 
  
Figure 6.5: 
Bland 
Altman plot 
with offset 
and 95% 
Limits of 
Agreement. 
Each data 
point 
represents 
an average 
velocity 
trial. The 
color of a 
trial 
represents 
the average 
running 
velocity of 
the trial. 
 96 
  
Discussion  
The purpose of this study was to implement a machine learning algorithm for the 
mapping of inertial data to kinetic waveforms from participants running on a track across 
a range of velocities and participant skill levels. We estimated GRF waveforms with three 
inertial sensors from participants running in a real-world training scenario; 400 m repeats 
at prescribed paces. Three specific findings can be summarized: (1) we estimated four-
second GRF waveforms from the IMU data of the same duration, (2) estimations of 
contact time from the output waveform were accurate, but were overestimated at average 
running velocities > 3.16 m s-1, and (3) estimates of single kinetic variables matched the 
overall trends of the measured input data, however the model tended to underestimate 
kinetic variables (stance average forces, peak force and average loading rate) at running 
velocities > 3.16 m s-1 (Figures 6.3 and 6.5).  
The estimation of gait events from IMUs have been reported with a wide variety 
of algorithmic methods, as these are the most critical variables for parsing biomechanical 
waveforms [14], [70], [156]. In the present study, IC difference between estimated and 
measured gait events ranged from 0.013 - 0.020s across a range of running velocities, 
indicating that IC was estimated to occur prior to the measured IC. This temporal 
difference may have been due to the iterative corrections algorithm that was utilized to 
match estimated IC with measured IC. Estimation of TO differences ranged from -0.012 
– 0.041s, indicating a range of slightly early and slightly late estimation. This appears to 
be specific to running velocity, as estimation of TO at velocities < 3.16 m s-1 occurred 
before measured TO, and at running velocities > 3.16 m s-1 TO was estimated to occur 
after measured TO. It follows that contact time was overestimated at velocities >3.35 m s-
97 
  
1. Previous work reported an RMSE of 0.011s and r2 = 0.665 from a quantile regression 
forest [43]. Our results show a threefold increase in the RMSE to 0.032s but a stronger 
correlation r2 = 0.795. Greater error in our estimates likely came from greater variability 
in the average running velocity throughout a trial and the inclusion of accelerations and 
decelerations within a running trial. 
Stance phase ground reaction force RMSE was comparable to ranges presented in 
previous work (RMSE of 0.39 BW) [47], with our estimated waveforms resulting in 
average RMSE of 0.245 BW for all running velocities. Another study reported an RMSE 
ranging from 0.13 – 0.17 BW between velocities of 2.7 and 4.5 m s-1  [45], using an 
algorithm that is closest in nature to ours, as they estimated portions of waveforms that 
could be concatenated into full GRF waveforms. The performance of our algorithm was 
similar to these previously reported values, with an RMSE ranging from 0.189 - 0.288 
BW (Table 6.3), across a wider range of velocities and at non-constant velocities. The 
primary difference between our current study and previous work in this area is that 
previous studies estimated whole GRF waveforms in the laboratory at steady state 
running velocities on a treadmill, and only estimated stance phase or a segment of the 
waveform. Ours is the first study to produce a model for estimation of a full GRF 
waveform with multiple stance and swing phases from data collected outside of the 
laboratory.  
In our study, measured stance average GRFs generally increased with velocity 
(Figure 6.3 Panel A), however not linearly as expected [137]. Estimated stance average 
GRFs were underestimated when compared to measured stance average GRFs (Figure 6.2 
Panel A and Figure 6.5 Panel A). Divergence between estimated and measured values 
98 
  
occurred over the same range of velocities (3.16 to 5.36 m s -1) that contact time was 
overestimated (Figure 6.2 and Figure 6.5 Panel A). Generally, an increase in contact time 
will cause a decrease in stance average GRFs. This is compounded with the 
underestimation of peak GRF values at faster average running velocities (Figure 6.5 
Panel C). Faster running velocities revealed a greater bias in estimated stance average 
GRFs. For comparison, the physical model developed by [157], presented an average 
RMSE ranging from 0.681 – 1.302 BW for running velocities from 2 – 5 m s-1. 
Regardless, our results show notable improvement on this work, with an RMSE for 
estimated stance average GRF ranging from 0.063 to 0.218 BW (Table 6.5).  
Estimation of impulse is the most mathematically complex variable presented in 
this work and it also has the poorest agreement between estimated and measured values. 
Impulse was expected to decrease as velocity increased [145], [158], which matches our 
results. Estimated impulse from a quantile regression forest was reported to have a strong 
correlation (r = 0.974) and an RMSE of 0.004 BW*s for running velocities between 3.8 
and 5.4 m s-1 [43]. Our results differ, with a weak correlation of r2 = 0.385 and an average 
RMSE across velocities 0.044 BW*s. These differences can be related, in part, to the 
discussion of errors above for both contact time and stance average GRFs. Another key 
difference is the variation in experience levels among our participants when compared to 
highly trained Division 1 endurance athletes. Beyond these differences, impulse is highly 
susceptible to errors in both the estimation of contact time and GRF magnitude, both of 
which had detectable bias in the current model.  
As expected, peak force increased with running velocity (Figure 6.3 Panel C). 
This measure has been related to estimation of external load while running [145], [158]. 
99 
  
Estimation of peak GRFs across the range of running velocities was the most accurate 
output from the current model. Previous research reported that the relationship between 
peak GRFs estimated by an ANN at three different velocities (ranging from 2.78 – 3.89 
m s-1) had a moderate correlation for peak GRF; r2 = 0.72 and 95% LoA [-0.17 0.18] BW, 
with a bias of 0.01 BW, on average [47]. In contrast, our model had a slightly weaker 
correlation (r2 = 0.614) and LoA [-0.322 0.381] BW, with an average RMSE of 0.223 
BW. Although our model resulted in similar correlations, we also have twice as much 
variability represented by our 95% LoA range. Further investigation revealed an outlier 
from the peak GRF analysis, in which the value was overestimated by approximately 
50% for a single participant. This observation indicate that the force-measuring insoles 
were moving between the foot and the shoe for this participant.  
 Measured average loading rate generally increased with running velocity, as 
expected (Figure 6.3 Panel D) [158]. Wouda et al. reported an ANN-estimated loading 
rate with correlation of r2 = 0.57, LoA of [-16 10] BW s-1 and a bias of -2.9 BW s-1 [47]. 
Our results showed a correlation of r2 = 0.405, with LoA [-20.450 8.243] BW s-1 and a 
bias of -6.116 BW s-1, demonstrating less agreement and a larger bias than the previous 
work. This could be due in part to differences in data collection protocols and the 
sensitivity of the force sensing insoles to error in the calculation of loading rate [9]. 
Estimated average loading rate was underestimated at velocities > 3.16 m s-1, possibly 
due to errors in the estimation of gait events. Identification of IC prior to the measured 
gait event decreases the estimated average loading rate. We attempted to mitigate this by 
estimating average loading rate between 10 and 40% contact time. However, as contact 
100 
  
time was overestimated and GRFs were underestimated, underestimation of average 
loading rate across the range of velocities still occurred. 
There are several limitations in this work. The force sensing insoles occasionally 
lost connection during trials, which led to different calibration files for the same 
participant. Force sensing insoles rely heavily on the calibration process prior to the data 
collection, and if they move between the foot and the shoe the force values will be 
affected. These sources of error likely contributed to the variability within our data and 
affected the machine learning model for the estimation of GRF waveforms. The 
methodology presented in this work is transferable to real-world running. However, we 
hesitate to recommend the algorithm in current form as a tool for the analysis of training 
and the translation of this work into the real-world environment. Overestimation of 
contact time with increased running velocity is an example of the limited transferability 
of the algorithm to novel environments. Building a machine learning model for a single 
participant or a small subset of participants with similar running ability would 
substantially reduce the model’s estimation error. We had a single participant run at the 
slowest velocities, and this participant’s data did not follow the expected kinetic trends. 
However, this participant’s data provide a good benchmark to demonstrate how these 
methods capture running performance of a truly novice runner.  
In conclusion, the mapping of GRF waveforms from IMU data collected in a real-
world environment has been shown to be feasible, with limitations. We have presented 
conservative results from an LSTM model of GRF waveform estimation by reporting 
data from a LOOCV analysis. We used three IMUs for the mapping of inertial to kinetic 
data for a variety of participants ranging in skill from truly novice runner (30:00 
101 
  
estimated 5km race time) to more highly trained runners (15:30 5km race time) running 
400m on a square track. This work has improved upon much of the relevant literature for 
estimation of spatial-temporal and kinetic measures from the estimated ground reaction 
force waveforms. Future studies investigating the effects of different amounts of data 
input, and potentially the inclusion of a wider range of running velocities should improve 
estimations from similar machine learning algorithms. Additionally, it would be valuable 
to identify biases in the reported variables by comparing measurement of force data from 
a force-instrumented treadmill to those measured by force sensing insoles, across a range 
of velocities and inclinations matching the training environment of experienced runners.  
102 
  
Bridge 
 The goal of the previous chapter was to investigate the performance of an LSTM 
for mapping could estimate a whole GRF waveform from inertial data in a semi-
uncontrolled environment. From these data we were able to estimate contact time as well 
as discrete kinetic variables from three inertial sensors. The purpose of the next chapter 
was to utilize multiple aspects from the previous methodologies for the estimation gait 
events from foot-mounted IMUs and GRF waveform from a free run on a 5-mile course. 
We included the use of GPS for measurements of pace and slope the participant ran on. 
This next chapter synthesizes the gait event identification, and GRF waveform estimation 
from previous chapters and applies them to this uncontrolled setting.  
103 
  
CHAPTER VII 
RUNNING IN THE REAL WORLD: ESTIMATION OF GAIT EVENTS AND 
GROUND REACTION FORCE WAVEFROMS WITH WEARABLE SENSORS AND 
MACHINE LEARNING  
 
This work is currently in preparation for submission to Nature.  Dr. Michael E. Hahn 
provided mentorship including assistance with study design, data interpretation, and 
editing and finalizing the final manuscript. 
 
Introduction 
Biomechanical analysis of running outside the laboratory has become possible, 
due to advances in wearable sensor and machine learning technologies [1], [159]. 
Laboratory based technologies such as motion capture and instrumented force plates have 
been the traditional method with which to measure biomechanical data, including spatial-
temporal, kinematic and kinetic variables. These tools require significant investment and 
high levels of training to collect these data. Wearable technologies are an alternative to 
laboratory based methods and have become more widely available for the monitoring of 
running biomechanics in uncontrolled environments [144], [160]. Examples of these are 
inertial measurement units (IMUs), GPS, and in-shoe force or pressure sensors, which 
can be used to measure or estimate biomechanical data [10], [40], [147], [161]. Earlier 
research has utilized IMUs for the estimation of gait events, foot ground contact time 
[40], [44], [48], [162]–[164], and estimation of specific kinetic variables with statistical 
or machine learning models [43], [47], [149]. Furthermore, in-shoe force sensors measure 
104 
  
normal force between the foot and shoe during foot contact have been validated as a 
measure of vertical ground reaction forces (GRFs) on an instrumented force treadmill 
[10].  
There are typically 9 sensors in a multi-axial IMU: tri-axial accelerometers (linear 
accelerations), tri-axial rate gyroscopes (angular velocity), and tri-axial magnetometers 
(magnetic field). Data from IMUs need specific processing and algorithms for extraction 
of meaningful biomechanical variables [38]. Some approaches have been developed 
specifically for running, with sensors on the foot, shank, and sacrum [39], [44], [48], 
[134]. These algorithmic techniques have demonstrated that consistent features can be 
extracted from inertial data for identification of foot contacts in the laboratory and in real-
world environments. However, these algorithms are yet to be validated against a kinetic 
measure in a free running real-world environment, with uncontrolled running velocities 
and different positive and negative grades. 
Machine learning models have been implemented for estimation and prediction of 
gait events [65], [165], of single kinetic variables [43], [47], [149], and single stance 
phase GRFs during running, or portions of running GRF waveforms [45]–[47], [166]. 
These studies have been constrained to the laboratory, with either in-ground force plates 
or instrumented force measuring treadmills for GRF measurement. Recurrent neural 
networks (RNNs) show promise for estimation of kinetic variables, specifically Long 
Short-Term Memory networks (LSTMs). These network structures were designed for the 
analysis of temporally related data [151]. Human gait data are ideal for these types of 
algorithms, as locomotion is a cyclic activity and therefore, temporally related. However, 
we must be cautious with the application of machine learning algorithms for evaluation 
105 
  
of running performance outside of the laboratory, as it has been well established that gait 
parameters, kinematics and kinetics are different between treadmill running and 
overground running of different durations [130], [139], [167]–[169]. It is currently 
unknown how these algorithms will perform with data collected over the course of an 
entire run over different grades and velocities.  
Consequently, the purpose of this study was to test two specific methods for the 
biomechanical analysis of running in an unconstrained environment: 1) a heuristic 
algorithm for the estimation of foot contacts from IMU data; 2) a machine learning 
algorithm, Bi-Directional LSTM (BD-LSTM), for estimation of normal GRFs between 
the foot and shoe; foot contacts and discrete GRF variables. We expect gait event 
detection from both algorithms to have similar accuracies across the range of running 
velocities and grades in this study. Specifically, we expect an RMSE of 0.04s, or 6% 
error, in the estimation of foot contact from the IMU data which are similar to results in 
[40]. Output waveforms from the machine learning algorithm, estimated stance phases 
would have an RMSE of 0.030 BW, and estimated discrete kinetic variables would have 
moderate correlations with measured variables in a manner akin to previous work [45]. 
 
Methods 
This study was approved by the University of Oregon Institutional Review Board 
(protocol #: 10062020.007). Data were collected from 16 participants (Table 7.1), (8 
male, 8 female, age: 23.15 years, height: 167.77 cm, mass: 65.00 kg) as part of a larger 
ongoing study. Three participants were excluded from the analysis, due to GPS 
106 
  
malfunctions. All analyses were performed in custom Matlab programs (Mathworks, 
Natick, MA). Multi-axial IMUs (Casio, Tokyo, JPN) were mounted bilaterally on the 
dorsal aspect of each participant’s foot and approximately on the sacrum (clipped on the 
back of each participant’s waistband). The use of multiple inertial sensors has been 
suggested to  
Table 7.1: Participant Characteristics and Mile Pace  improve estimation of 
GENDER Age Mass Height Pace 
(Kg) (cm) (mins:secs) spatial temporal and 
M 19 68 175 7:24 
F 21 63 168 7:59 kinetic variables, compared 
M 21 73 183 7:15 
to a single inertial sensor  
F 19 55 173 7:24 
M 34 68 185 8:17 
[44], [152], [153]. These 
F 23 52 163 7:51 
M 18 68 183 7:33 multi-axial IMUs recorded 
M 20 68 170 7:13 
F 27 54 173 8:57 3D linear accelerations and 
M 28 88 191 9:10 
M 26 70 69 7:13 angular velocities at 200 
F 27 57 165 8:31 
M 18 61 183 8:31 Hz. Acceleration data were 
post-processed with a Kalman filter to orient the local (IMU) coordinate system vertical 
to gravity. Foot-shoe normal force data were recorded with Loadsol insole force sensors 
(Novel Electronics, St. Paul, MN) at 100 Hz. Participants were asked to run a five-mile 
course around the University of Oregon and surrounding parks (Figure 7.1). Participants 
also wore a Garmin (Kansas City, KS) GPS watch (either Garmin Forerunner 130 and 
135 to record elevation and running velocity).  
Data Processing 
Force-sensing insole and IMU data were synced with ‘foot stomps’ before and 
after each run. The IMU data were down sampled to 100 Hz to match the force sensing 
107 
  
insoles and filtered with a 4th order low pass zero-lag Butterworth filter (fc = 35 Hz). 
Internal clock drift between the IMUs and force sensing insoles were resolved with an 
iterative corrections algorithm. The kinetic data were normalized to participant 
bodyweight (BW) and were filtered with a 4th order low-pass zero-lag Butterworth filter 
(fc = 20 Hz). Post-hoc corrections to force insole data due to a drifting baseline were 
required (< 1% of the measured footfalls needed this adjustment). Making these 
corrections entailed identifying swing phases during a period in which the forces had a 
drifting baseline and setting the swing phases to 0 BW. This approach is described in 
greater detail in Chapter II of the dissertation. Additionally, force data <5% BW was set 
to 0 BW. Elevation and velocity measured by the GPS at (sf = 1 Hz), were filtered with a 
zero-lag 10 sample moving average filter. Velocities from GPS data were set to the 
nearest 0.25 m s-1 for velocities ranging from 2.25 – 5.25 m s-1, and the upper limit was 
set by the number of footfalls available for analysis, discussed below. Running velocities 
< 2.25 m s-1 are typically walking velocities and the walk to run transition typically 
occurs at around 2.00 – 2.10 m s-1. Grade was calculated from the elevation data and 
binned into three different groupings. Incline foot strikes were identified at measured 
grades of > 5°, and decline foot strikes were identified as measured grades of < -5°, with 
level ground foot strikes between 5° and -5°. The [-5°, 5°] limit was set due to observed 
noise of ± 4° throughout the run during portions of the course with no physically 
discernible grade. Data from the GPS were then time-synced to the beginning and end of 
the IMU and kinetic data. For each combination of velocity and grade to be included in 
the analysis from a participant there must have been a total of 10-foot contacts meeting 
this criterion throughout the course.  
108 
  
Descriptions of Gait Event Detection Algorithm 
 Gait event estimation from IMU data utilized heuristic rules similar to previous 
work [40], [41], [147]. Initial contact from the IMUs on the dorsal aspect of the foot (IC-
IMU) was identified with two rules. First was the identification of minimum angular 
velocity in the x-axis of the IMU with minimum of 500 ms between identified minima. 
Second, a temporal window relative to each minimum, ranging from 5 ms post minima to 
45 ms post minima was searched for a resultant acceleration > 50 m s-2. If this condition 
was satisfied then this peak was set to be ICIMU [40]–[42]. Identification of TOIMU was 
performed by searching a specific temporal window beginning 100 ms after ICIMU and 
ending at the half-width of the estimated stride time. In this window TOIMU was either 
identified as the local maxima of vertical acceleration or the first instance that vertical 
acceleration was > 3g [40], [128]. Identification of gait events with foot-shoe normal 
force data utilized a 5% BW cutoff; the first instance of force > 5% BW was identified as 
IC, and TO was identified as the last instance of force <5% BW. We then removed foot 
contacts that could not be matched to the IMU and force sensing insole measures. If 
ICIMU was not within half contact time of the IC from the force sensing insole it was 
removed from the analysis.  
Machine Learning Methods 
 We utilized a BD-LSTM with 19 hidden units and a regression output. A more 
thorough description of the network architecture can be found here [151]. Input into the 
BD-LSTM were 1-second temporal windows of inertial data: 3-D accelerations, angular 
velocities, and their respective resultants, from three anatomical locations (dorsum of 
both feet, and the waistband at approximately the sacrum). Output from the BD-LSTM 
109 
  
were 1-second intervals of estimated GRF data. The algorithm was evaluated with a 
Leave One Out Cross Validation (LOOCV) with 12 participants in the training data and 1 
participant in the test data, repeated for each participant. The estimated force data were 
then filtered with a 2nd order low-pass zero-lag Butterworth filter (fc = 15 Hz). Estimated 
data tended to be noisier than the input GRF waveforms. This was accounted for with a 
lower cutoff frequency in the filter. Errant estimated GRF data were removed by setting 
estimated force <5% BW to 0 BW, and by removal of false “foot-contacts” generated by 
the model that were <100 ms or > 500 ms. Foot contacts shorter than 100 ms were not 
consistent with measured foot contacts during running and foot contacts longer than 500 
ms tended to occur during periods of quiet standing (e.g. participant was at a cross walk). 
We observed that the swing phase estimation error approaches 0 as most of the errant 
data are corrected for using the steps described above. Initial contact from the machine 
learning output (ICLSTM) was identified by the first instance of force >5% BW and toe off 
(TOLSTM) was identified by the last instance of force greater than >5% BW. To be sure 
we were matching foot contact correctly during analysis, if ICLSTM was not within a half 
contact time of the measured IC, it was removed from the analysis. From the model 
output GRF waveforms, stance average GRFs, peak GRFs, impulse and average loading 
rate (ALR) were calculated. Average loading rate was calculated by identifying the 
impact peak and then averaging the force/time slope in the middle 60% of the region 
between IC and the impact peak [58].  
Root Mean Squared Error (RMSE), linear models and bias analyses were used to 
assess estimated contact time and the kinetic variables. Differences between the model 
estimated variable and measured variable waveform are presented in both linear 
110 
  
regression and Bland-Altman plots with 95% confidence intervals (CIs) or Limits of 
Agreement (LoA), respectively. Pearson correlation coefficients (r2) were calculated to 
show agreement between estimated and measured data. A strong correlation was defined 
as r2 ≥ 0.8, a moderate correlation as 0.5 ≤ r2 ≤ 0.8 and a weak correlation as 0.3 ≤ r2 ≤ 
0.5.  
 
Figure 7.1: Outlined course each participant ran. It is clear there is error in the GPS, 
specifically at the beginning of the course. This may have been due to the number of 
buildings and the chosen GPS technology.  
111 
  
Results 
There were 90,537 foot strikes measured with the force sensing insoles. Using the 
rules-based algorithm described above, data from the foot mounted IMU provided a total 
of 90,063 (88,364 analyzed) foot strikes, and the BD-LSTM estimated 90,579 (85,406 
analyzed) foot strikes. The average pace for the 5 mile course (Figure 7.1) was 7:56 ± 
0:40 (min:secs) (Table 7.1). Two participants ran different courses than initially planned, 
each longer than 5 miles.  
Table 7.2: Root Mean Square Error Table for IMU Gait Event Detection  
RUNNING Level Ground Decline Incline 
VELOCITY  
(M S-1) IC (s) TO (s) TC (s) IC (s) TO (s) TC (s) IC (s) TO (s) TC (s) 
2.25 0.018 ± 0.034 ± - - - - 0.010 ± 0.054 ± 0.059 ± 
0.002 0.017 0.000 0.000 0.000 
2.50 0.017 ± 0.034 ± 0.032 ± 0.015 ± 0.046 ± 0.044 ± 0.016 ± 0.026 ± 0.020 ± 
0.003 0.017 0.016 0.006 0.012 0.015 0.007 0.009 0.012 
2.75 0.017 ± 0.028 ± 0.030 ± 0.019 ± 0.033 ± 0.032 ± 0.015 ± 0.022 ± 0.023 ± 
0.003 0.008 0.011 0.006 0.013 0.014 0.006 0.008 0.009 
3.00 0.019 ± 0.028 ± 0.027 ± 0.018 ± 0.034 ± 0.032 ± 0.017 ± 0.027 ± 0.027 ± 
0.005 0.009 0.009 0.006 0.011 0.007 0.006 0.009 0.011 
3.25 0.019 ± 0.027 ± 0.026 ± 0.018 ± 0.025 ± 0.024 ± 0.017 ± 0.024 ± 0.027 ± 
0.005 0.009 0.009 0.005 0.009 0.010 0.005 0.007 0.010 
3.50 0.018 ± 0.024 ± 0.025 ± 0.019 ± 0.028 ± 0.026 ± 0.017 ± 0.026 ± 0.026 ± 
0.004 0.006 0.009 0.007 0.008 0.010 0.006 0.008 0.011 
3.75 0.018 ± 0.025 ± 0.025 ± 0.019 ± 0.027 ± 0.027 ± 0.020 ± 0.025 ± 0.022 ± 
0.004 0.009 0.011 0.005 0.012 0.010 0.005 0.009 0.010 
4.00 0.020 ± 0.028 ± 0.028 ± 0.022 ± 0.025 ± 0.029 ± 0.020 ± 0.024 ± 0.021 ± 
0.005 0.011 0.012 0.011 0.009 0.011 0.006 0.011 0.010 
4.25 0.020 ± 0.028 ± 0.029 ± 0.017 ± 0.025 ± 0.025 ± 0.019 ± 0.020 ± 0.026 ± 
0.007 0.014 0.012 0.011 0.011 0.014 0.007 0.009 0.011 
4.50 0.019 ± 0.026 ± 0.033 ± 0.011 ± 0.035 ± 0.029 ± - - - 
0.008 0.013 0.014 0.007 0.008 0.012 
4.75 0.018 ± 0.042 ± 0.036 ± - - - - - - 
0.000 0.000 0.000 
5.00 0.021 ± 0.026 ± 0.033 ± 0.017 ± 0.010 ± 0.024 ± - - - 
0.000 0.025 0.022 0.000 0.000 0.000 
5.25 0.051 ± 0.045 ± 0.066 ± - - - - - - 
0.000 0.000 0.000 
112 
  
Results are presented with the minima and the maxima difference or RMSE for 
each variable. Specific RMSEs for estimated variables across velocities and grades can 
be found in Tables 7.2-7.6. Pearson correlation coefficients are presented as well as the 
slope of the regression line. Bland-Altman plots show mean difference in the estimated 
and measured variable with the 95% Limits of Agreement (LoA) (Figures 7.3 – 7.4 and 
7.7-7.10). Each marker in these figures represents a minimum of 10 footfalls for each 
velocity and grade (positive, negative, and level ground).  
Table 7.3: RMSE from BD-LSTM Output  
RUNNING Initial Contact Toe Off Contact Time 
VELOCITY 
(M S-1) Level Level Level Decline Incline Decline Incline Decline Incline 
Ground Ground Ground 
2.25 0.020 ± 0.023 ± 0.030 ± 0.025 ± 0.038 ± 0.025 ± 
- - - 
0.014 0.006 0.019 0.012 0.022 0.003 
2.50 0.021 ± 0.039 ± 0.025 ± 0.030 ± 0.060 ± 0.030 ± 0.033 ± 0.047 ± 0.027 ± 
0.006 0.000 0.008 0.012 0.000 0.015 0.017 0.000 0.006 
2.75 0.020 ± 0.021 ± 0.019 ± 0.024 ± 0.023 ± 0.024 ± 0.028 ± 0.022 ± 0.027 ± 
0.006 0.014 0.005 0.010 0.005 0.009 0.013 0.005 0.012 
3.00 0.020 ± 0.019 ± 0.019 ± 0.022 ± 0.021 ± 0.022 ± 0.027 ± 0.024 ± 0.026 ± 
0.006 0.008 0.005 0.009 0.006 0.008 0.014 0.012 0.011 
3.25 0.020 ± 0.018 ± 0.019 ± 0.022 ± 0.025 ± 0.020 ± 0.027 ± 0.027 ± 0.024 ± 
0.007 0.008 0.006 0.009 0.006 0.007 0.015 0.012 0.009 
3.50 0.021 ± 0.021 ± 0.018 ± 0.023 ± 0.024 ± 0.021 ± 0.029 ± 0.027 ± 0.027 ± 
0.007 0.008 0.006 0.009 0.009 0.009 0.016 0.012 0.011 
3.75 0.021 ± 0.023 ± 0.016 ± 0.024 ± 0.029 ± 0.016 ± 0.030 ± 0.035 ± 0.021 ± 
0.010 0.011 0.009 0.009 0.014 0.005 0.015 0.017 0.007 
4.00 0.021 ± 0.024 ± 0.018 ± 0.026 ± 0.032 ± 0.019 ± 0.034 ± 0.039 ± 0.026 ± 
0.009 0.011 0.010 0.011 0.011 0.008 0.016 0.017 0.009 
4.25 0.021 ± 0.024 ± 0.023 ± 0.027 ± 0.030 ± 0.014 ± 0.034 ± 0.039 ± 0.021 ± 
0.009 0.010 0.011 0.013 0.010 0.002 0.018 0.017 0.004 
4.50 0.019 ± 0.022 ± 0.029 ± 0.043 ± 0.039 ± 0.051 ± 
- - - 
0.005 0.010 0.007 0.024 0.011 0.019 
4.75 0.022 ± 0.026 ± 0.039 ± 
- - - - - - 
0.000 0.000 0.000 
5.00 0.025 ± 0.025 ± 0.040 ± 
- - - - - - 
0.007 0.009 0.010 
5.25 0.029 ± 0.036 ± 0.040 ± 
- - - - - - 
0.000 0.000 0.000 
 
Gait Event and Contact Time Estimation from IMUs  
 The RMSE of the estimated ICIMU ranged from 0.017 – 0.051s for level ground 
running, 0.011 – 0.022s for decline running, and 0.015 – 0.020s for incline running 
113 
  
(Table 7.2 and Figure 7.2 Panel A). Estimation of TOIMU RMSE ranged from 0.024 – 
0.044s for level ground running, 0.024 – 0.046s for decline running, and 0.020 – 0.053s 
for incline running (Table 7.2 and Figure 7.2 Panel B). The RMSE of estimated contact 
times ranged from 0.024 – 0.066s for level ground running, 0.023 – 0.046 for decline 
running, and 0.020 – 0.059 for incline running (Table 7.2 and Figure 7.2 Panel C). Linear 
regression and the analysis of bias in the estimate can be found in (Figure 7.3). 
Table 7.4: BD-LSTM Stance Phase Ground Reaction Force 
Waveform RMSE   
RUNNING 
Level Ground 
VELOCITY Incline (BW) Decline (BW) 
(BW) 
(M S-1) 
2.25 0.315 ± 0.102 - 0.253 ± 0.081 
2.50 0.313 ± 0.066 0.300 ± 0.000 0.265 ± 0.058 
2.75 0.304 ± 0.057 0.332 ± 0.004 0.268 ± 0.067 
3.00 0.294 ± 0.065 0.292 ± 0.082 0.272 ± 0.073 
3.25 0.289 ± 0.059 0.321 ± 0.035 0.277 ± 0.065 
3.50 0.299 ± 0.070 0.354 ± 0.070 0.274 ± 0.066 
3.75 0.310 ± 0.072 0.389 ± 0.120 0.229 ± 0.036 
4.00 0.339 ± 0.085 0.391 ± 0.165 0.309 ± 0.082 
4.25 0.358 ± 0.115 0.394 ± 0.172 0.269 ± 0.007 
4.50 0.397 ± 0.180 0.576 ± 0.371 - 
4.75 0.376 ± 0.000 - - 
5.00 0.427 ± 0.010 - - 
5.25 0.635 ± 0.000 - - 
 
  Stride frequency was observed to change across velocities and minimally with 
grades. Stride frequency ranged between 72.460- 91.681 strides min-1 for level running, 
between 79.624 – 86.299 strides min-1 for decline running, and from 76.395 – 87.016 
strides min-1 for incline running (Figure 7.4). Measured stance average GRFs ranged 
from 1.291 – 1.515 BW for level ground running, from 1.294 – 1.590 BW for decline 
running, and from 1.253 – 1.356 BW for incline running (Figure 7.6 Panel A). Peak 
GRFs ranged from 2.253 to 2.66 BW for during level ground running, from 2.168 to 
114 
  
2.845 BW for decline running, and from 2.187 – 2.367 BW for incline running (Figure 
7.6 Panel B). Impulse during stance phase ranged from 0.333 – 0.398 BW*s for level 
ground running, from 0.340 – 0.403 BW*s for decline running, and from 0.342 – 0.370 
BW*s for incline running (Figure 7.6 Panel C). Measured ALR ranged from 31.919 – 
58.307 BW s-1 for level ground running, from 38.220 – 57.359 BW s-1 for decline 
running, and from 29.969 – 48.234 BW s-1 for incline running (Figure 7.6 Panel D). 
BD-LSTM Gait Event and Contact Time   
 The RMSE of the estimation of ICLSTM ranged from 0.019 – 0.029 s for level 
ground, from 0.018 – 0.039 s for decline running, and from from 0.016 – 0.025 s for 
incline running (Table 7.3). Estimation of TOLSTM ranged from 0.021 – 0.036 s for level 
ground running, from 0.021 – 0.059 s for decline running, and from 0.014 – 0.030 s for 
incline running (Table 7.3). Contact time estimation ranged from 0.024 – 0.066 s for level 
ground, from 0.023 – 0.046 s for decline running, and from 0.020 – 0.059 s for incline 
running (Table 7.3 and Figure 7.4).  
Kinetic Variables from Machine Learning Output 
 Stance phase GRF whole waveform RMSE ranged from 0.298 BW to 0.635 BW 
across all running velocities and grades (Table 7.4). Stance average GRF RMSEs ranged 
from 0.113 – 0.313 BW for level ground running, from 0.112 – 0.337 BW for decline 
running, and from 0.089 – 0.158 BW for incline running (Table 7.5 and Figure 7.7). Peak 
force RMSE ranged from 0.187 – 0.392 BW for level ground running, from 0.197 – 
0.508 BW for decline running, and from 0.114 – 0.212 BW incline running (Table 7.5 
and Figure 7.8). Impulse RMSE ranged from 0.024 – 0.066 BW*s for level ground 
115 
  
running, from 0.035 – 
0.086 BW*s for decline 
running, and from 0.022 – 
0.038 BW*s for incline 
running (Table 7.5 and 
Figure 7.9). Average 
loading rate RMSE 
ranged from 14.963 – 
34.680 Bw s-1 for level 
ground running, from 
18.346 – 45.121 BW s-1 
for decline running, and 
from 12.241 – 21.642 Bw 
s-1 for incline running 
(Table 7.5 and Figure 
7.10). 
 
 
 
 
 
 
 
 
 
 
 Figure 7.2: Gait event differences estimated from the 
 foot mounted IMUs (Panels A and B). Panels C and D 
 show the estimated and measured contact times.  
116 
  
TABLE 7.5: Measured Kinetic Variables from Force Insoles   
Stance Average Ground Reaction Force (BW) Peak Ground Reaction Force (BW) 
RUNNING 
VELOCITY 
(M S-1) Level Decline Incline Level Decline Incline 
Ground Ground 
2.25 1.350 ± 0.075 - 1.230 ± 0.021 2.318 ± 0.146 - 2.157 ± 0.052 
2.50 1.314 ± 0.079 1.294 ± 0.000 1.230 ± 0.021 2.253 ± 0.152 2.168 ± 0.000 2.157 ± 0.052 
2.75 1.327 ± 0.076 1.350 ± 0.084 1.277 ± 0.097 2.282 ± 0.165 2.249 ± 0.185 2.223 ± 0.203 
3.00 1.294 ± 0.081 1.316 ± 0.085 1.257 ± 0.086 2.254 ± 0.151 2.248 ± 0.187 2.195 ± 0.163 
3.25 1.291 ± 0.089 1.320 ± 0.066 1.253 ± 0.085 2.254 ± 0.154 2.279 ± 0.135 2.187 ± 0.151 
3.50 1.314 ± 0.091 1.352 ± 0.118 1.264 ± 0.084 2.298 ± 0.162 2.338 ± 0.216 2.212 ± 0.146 
3.75 1.317 ± 0.100 1.355 ± 0.133 1.280 ± 0.084 2.308 ± 0.186 2.367 ± 0.255 2.242 ± 0.156 
4.00 1.337 ± 0.109 1.375 ± 0.141 1.278 ± 0.075 2.341 ± 0.206 2.401 ± 0.268 2.236 ± 0.130 
4.25 1.338 ± 0.118 1.386 ± 0.178 1.305 ± 0.063 2.346 ± 0.230 2.431 ± 0.344 2.294 ± 0.128 
4.50 1.335 ± 0.114 1.351 ± 0.245 - 2.353 ± 0.234 2.394 ± 0.435 - 
4.75 1.457 ± 0.000 - - 2.514 ± 0.000 - - 
5.00 1.515 ± 0.150 1.590 ± 0.000 - 2.669 ± 0.260 2.846 ± 0.000 - 
5.25 1.449 ± 0.000 - - 2.505 ± 0.000 - - 
 
Impulse (BW*s) Average Loading Rate (BW s-1) 
 
Level Decline Incline Level Decline Incline 
Ground Ground 
2.25 0.356 ± 0.023 - 0.365 ± 0.024 40.380 ± - 26.897 ± 
8.429 4.971 
2.50 0.373 ± 0.031 0.354 ± 0.000 0.365 ± 0.024 31.919 ± 46.045 ± 26.897 ± 
1.564 0.000 4.971 
2.75 0.369 ± 0.026 0.365 ± 0.024 0.370 ± 0.024 35.168 ± 50.658 ± 29.969 ± 
5.992 11.145 4.459 
3.00 0.357 ± 0.029 0.366 ± 0.026 0.361 ± 0.029 34.657 ± 38.220 ± 30.215 ± 
5.588 6.578 3.581 
3.25 0.352 ± 0.027 0.353 ± 0.024 0.354 ± 0.028 36.008 ± 45.744 ± 30.755 ± 
6.455 12.826 4.304 
3.50 0.350 ± 0.028 0.357 ± 0.028 0.353 ± 0.027 39.088 ± 47.238 ± 32.381 ± 
5.242 13.109 4.213 
3.75 0.346 ± 0.029 0.348 ± 0.029 0.348 ± 0.027 39.984 ± 51.401 ± 36.196 ± 
5.073 10.231 4.405 
4.00 0.348 ± 0.030 0.347 ± 0.029 0.342 ± 0.026 43.588 ± 55.461 ± 38.833 ± 
5.613 9.529 4.716 
4.25 0.346 ± 0.032 0.347 ± 0.036 0.342 ± 0.026 46.055 ± 55.107 ± 43.496 ± 
8.777 11.352 8.523 
4.50 0.342 ± 0.032 0.340 ± 0.041 - 47.118 ± 47.399 ± - 
6.174 17.226 
4.75 0.398 ± 0.000 - - 40.092 ± - - 
0.000 
5.00 0.374 ± 0.042 0.403 ± 0.000 - 57.365 ± 57.359 ± - 
14.135 0.000 
5.25 0.333 ± 0.000 - - 58.307 ± - - 
0.000 
 
 
117 
  
Table 7.6: Kinetic Variable RMSE from BD-LSTM Estimated Waveforms  
Stance Average Ground Reaction Force (BW) Peak Ground Reaction Force (BW) 
RUNNING 
VELOCITY 
(M S-1) LG Decline Incline LG Decline Incline 
2.25 0.160 ± 0.068 - 0.089 ± 0.013 0.222 ± 0.087 - 0.117 ± 0.042 
2.50 0.149 ± 0.063 0.191 ± 0.000 0.106 ± 0.021 0.233 ± 0.097 0.430 ± 0.000 0.115 ± 0.021 
2.75 0.153 ± 0.047 0.132 ± 0.003 0.122 ± 0.048 0.201 ± 0.091 0.218 ± 0.042 0.182 ± 0.106 
3.00 0.134 ± 0.055 0.123 ± 0.061 0.121 ± 0.049 0.187 ± 0.097 0.197 ± 0.128 0.169 ± 0.102 
3.25 0.141 ± 0.058 0.138 ± 0.068 0.118 ± 0.044 0.194 ± 0.097 0.205 ± 0.109 0.154 ± 0.089 
3.50 0.153 ± 0.070 0.185 ± 0.061 0.124 ± 0.056 0.208 ± 0.114 0.268 ± 0.126 0.163 ± 0.119 
3.75 0.162 ± 0.079 0.227 ± 0.126 0.106 ± 0.023 0.220 ± 0.129 0.351 ± 0.194 0.138 ± 0.038 
4.00 0.186 ± 0.088 0.212 ± 0.151 0.158 ± 0.059 0.255 ± 0.132 0.316 ± 0.241 0.213 ± 0.110 
4.25 0.204 ± 0.100 0.233 ± 0.159 0.139 ± 0.017 0.259 ± 0.155 0.345 ± 0.251 0.136 ± 0.031 
4.50 0.245 ± 0.171 0.337 ± 0.315 - 0.311 ± 0.261 0.508 ± 0.416 - 
4.75 0.247 ± 0.000 - - 0.392 ± 0.000 - - 
5.00 0.298 ± 0.076 - - 0.380 ± 0.252 - - 
5.25 0.313 ± 0.000 - - 0.223 ± 0.000 - - 
 
Impulse (BW*s) Average Loading Rate (BW s-1) 
 
LG Decline Incline LG Decline Incline 
2.25 0.024 ± 0.004 - 0.022 ± 0.010 19.208 ± 4.419 - 13.043 ± 2.744 
2.50 0.038 ± 0.015 0.086 ± 0.000 0.024 ± 0.011 14.963 ± 4.868 28.065 ± 0.000 12.242 ± 4.931 
2.75 0.035 ± 0.008 0.036 ± 0.006 0.030 ± 0.012 15.399 ± 4.307 24.423 ± 5.967 12.915 ± 3.642 
3.00 0.031 ± 0.011 0.035 ± 0.011 0.030 ± 0.014 16.549 ± 4.926 18.346 ± 6.540 13.225 ± 3.691 
3.25 0.031 ± 0.011 0.036 ± 0.013 0.029 ± 0.015 18.065 ± 4.870 22.735 ± 4.258 15.267 ± 4.550 
3.50 0.031 ± 0.009 0.040 ± 0.011 0.031 ± 0.011 18.830 ± 4.966 27.728 ± 5.564 15.812 ± 4.568 
3.75 0.034 ± 0.010 0.043 ± 0.011 0.032 ± 0.008 20.028 ± 4.578 30.510 ± 2.926 13.566 ± 2.308 
4.00 0.039 ± 0.012 0.037 ± 0.013 0.039 ± 0.014 22.185 ± 5.708 32.335 ±13.041 19.349 ± 9.038 
4.25 0.039 ± 0.014 0.043 ± 0.025 0.035 ± 0.009 25.270 ± 5.889 32.324 ± 21.642 ± 3.445 
13.928 
4.50 0.047 ± 0.018 0.047 ± 0.018 - 30.374 ± 45.121 ± - 
19.786 18.762 
4.75 0.058 ± 0.000 - - 31.505 ± 0.000 - - 
5.00 0.066 ± 0.009 - - 33.656 ± 9.727 - - 
5.25 0.058 ± 0.000 - - 34.680 ± 0.000 - - 
118 
  
Discussion  
The purpose of this study was to test two specific methods for the biomechanical 
analysis of running in an unconstrained environment: 1) a heuristic algorithm for the 
estimation of foot contacts from IMU data; 2) a machine learning algorithm, BD-LSTM, 
for estimation of normal GRFs between the foot and shoe; foot contacts and discrete GRF 
variables. The specific findings of the study are summarized briefly here: 1) contact time 
with foot-mounted IMUs was estimated with an average RMSE of 0.030 s, 2) BD-LSTM 
output waveforms estimated contact times with RMSE of 0.031 s, 3) BD-LSTM output 
waveform step-by-step average for all combinations of velocities and grades had an 
RMSE of 0.33 BW per step. Throughout the discussion, it is assumed that the greater 
ranges of RMSEs, lower Pearson Correlation Coefficients and wider 95% LoA are due to 
the unconstrained running environment of this study, in comparison to running in a 
controlled laboratory environment.  
 We observed a decrease in contact time with increased running velocity, for level 
ground, incline and decline foot contacts (Figure 7.2 Panel 3). We noted minimal 
differences in measured contact times between level ground, incline, and decline (Figure 
7.2 Panel E). Comparison of stride frequencies for running velocities from 2.5 – 4.5 m s-1 
between the current study and a treadmill study showed minimal differences ranged from 
[-2.604 -4.643] strides min-1 [138]. There were negligible differences between level 
ground running, decline and incline stride frequencies (Figure 7.5). This finding is not 
surprising, as velocity has been shown to have a larger effect on stride frequency [35], 
[170]. However, we have shown that stance average GRFs, peak GRFs and ALR 
increased with running velocity (Figure 7.6), following the same trends previously 
119 
  
Figure 7.3: Estimated contact time from foot mounted IMUs. Linear regression and 
Bland-Altman plots are presented for all foot contacts (A & E), followed by level ground 
(B & F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation 
Coefficients, and the slope of the regression line are presented in panels A-D. The Bland-
Altman plots present differences between the estimated and measured contact time. The 
average difference and the 95% LoA are shown in panels E-H. 
 
120 
  
 
Figure 7.4:  Estimated contact time from the BD-LSTM. Linear regression and Bland-
Altman plots are presented for all foot contacts (A & E), followed by level ground (B & 
F), decline (C & G) and incline foot contacts (D & H). Pearson Correlation Coefficients, 
and the slope of the regression line are presented in panels A-D. The Bland-Altman plots 
present differences between the estimated and measured contact time. The average 
difference and the 95% LoA are shown in panels E-H. 
121 
  
 
Figure 7.5: Stride Frequency measured from the force sensing insoles, across the range 
of velocities, with plotted with incline, decline and level ground. 
reported [137], [138]. The current study had measured impulses ranged from 0.333 – 
0.403 BW*s (Table 7.5), compared to another study that reported impulse across 
different velocities and grades on a treadmill ranging from 0.300 – 0.340 BW*s [45]. The 
ranges of ALR in our study (31.919 – 58.307 BW s-1) were similar to previous work 
(30.100 – 64.700 BW s-1) across a range of velocities and grades [45]. Differences in 
ALR during decline running showed an increase of 9.298 BW s-1, and decrease of 2.048 
BW s-1 during incline running (Figure 7.6), which is similar to values reported previously 
[35].  
 Our approach to estimating gait events used both acceleration and angular 
velocity data, which diverges from previous work, as most studies have made use of only 
one type of data, either accelerations or angular velocities [21], [40]–[42], [126], [147], 
[171], [172]. Differences between ICIMU and measured IC in the current study occurred in 
the expected range (-0.020- 0.020s), due to the iterative corrections algorithm used 
122 
  
(Figure 7.2 Panel A). A previous study in a controlled laboratory environment reported 
identification of ICIMU across a small range of velocities (8 – 11 km h
-1), with a range of 
RMSE 0.004 – 0.008 s [42]. This is a smaller average RMSE than our study across a 
range of velocities, and grades (RMSE range 0.011 - 0.051 s). The same study reported a 
larger RMSE range for identification of TOIMU: 0.008 – 0.011 s, than ICIMU from their 
work [42], while our study presented TOIMU RMSE range from 0.020 - 0.053 s. Machine 
learning estimation of gait events allows for flexibility in the identification of ICLSTM and 
TOLSTM instead of relying on specific heuristics, as presented above. We have shown 
minimal differences between the IMU estimated contact time and the BD-LSTM 
estimated contact time; RMSE ranges for ICIMU, 0.011 – 0.051 s, compared to ICLSTM, 
0.016 - 0.039 s. There was a larger RMSE in the lower bound of ICLSTM, but a narrower 
range of RMSEs across the range of running velocities. Estimation of TOIMU had an 
RMSE range of 0.020 - 0.053 s, while TOLSTM RMSE was 0.014 - 0.059 s (Tables 7.2 
and 7.3). The estimation of TO with inertial sensors has shown more variability than 
estimation of IC in many different studies, including the current study [40]–[42], [48].  
Contact time estimated from both the foot mounted IMUs and the BD-LSTM 
decreased with increased running velocity (Figures 7.3 and 7.4). Contact time estimation 
from foot mounted IMUs in this study had an RMSE from 0.020 – 0.066 s. Contact time 
estimated from the BD-LSTMs had an RMSE that ranged from 0.021- 0.040 s, an 
improvement over the IMU estimated contact time. Foot contact durations for IMU 
estimates had an r2 = 0.460, and the BD-LSTM estimated foot contact durations had an r2 
= 0.524 (Figures 7.3 Panel E, and Figure 7.4 Panel E). Despite better agreement in the 
output from the BD-LSTM, there was more bias in the estimation of contact time from 
123 
  
the BD-LSTM compared to the IMU estimated contact time, (Level Ground: 0.010s vs. 
0.005 s), and this trend continued with the different grade conditions (Figures 7.3 and 7.4 
Panels E - F). Another study reported an r2 = 0.665 for estimated contact times from a 
Quantile Regression Forest, while the current study presented an r2 = 0.524 across all foot 
contacts [43]. For external comparison, the our model showed a reduced bias in the 
estimations of contact time compared to [40]. They reported an offset of -0.016s with 
95% LoA [-0.058 0.027 s], while the current model resulted in an offset of 0.005 s with 
95% LoA [–0.035 0.044 s] across all IMU estimated foot contacts, and an offset of 0.010 
s with 95% LoA [–0.025 0.044 s] across all BD-LSTM estimated foot contacts. Our 
approach resulted in narrower limits of agreement for both the IMU and BD-LSTM 
estimated contact times. The previous study used raw data for analysis, compared to the 
use of averages of each combination of velocity and grade presented in this work. 
 The accuracy in the estimation of GRF waveforms during stance phase using the 
BD-LSTM varied across running velocities. Level ground running had the largest RMSE 
range (0.288 – 0.635 BW), compared to decline running (0.291 – 0.576 BW) and incline 
running (0.229 – 0.309 BW). It should be noted however, that level ground running also 
had the largest range of velocities (Table 7.3). We observed that the BD-LSTM 
underestimated the GRF across the full range of velocities and grades. Stance phase 
RMSE ranged from 0.229 BW to 0.635 BW for all velocities and grades, compared to 
previously estimated stance phase RMSE from 0.12 to 0.20 BW derived from kinetic 
waveforms estimated from a machine learning algorithm for treadmill running at 
different velocities and inclinations [45].  
124 
  
Stance average forces mirrored the estimation of the whole waveform errors. 
Correlation of stance average GRFs reported in the previous chapter was r2 = 0.408 
across running velocities. However, the current analysis yielded a much lower r2 = 0.105. 
The current study had 95% LoA [-0.332 0.105] BW and a mean difference of -0.113 BW 
for all foot contacts (Figure 7.7). This is slightly more bias than reported in the previous 
chapter (mean difference = -0.092 BW). For external comparison, an LSTM was 
developed for the estimation of stance average GRFs reported an RMSE between 0.340 
and 0.630 BW [47], while the current study reported a stance average GRF RMSE 
between 0.089 – 0.313 BW. 
Estimated peak force had similar patterns to the estimated stance average GRF 
(Figure 7.8). The correlation of peak force reported in the previous chapter was r2 = 0.614 
for level ground steady state running velocities, while the current work resulted in an r2 = 
0.332 across all foot contacts, with worse performance in the estimation of peak force 
during incline running foot contacts (r2 =0.264). Previous work reported a moderate 
correlation between the estimated and measured peak GRFs, from data collected on a 
force instrumented treadmill (r2 = 0.665) [43]. For further external comparison, a BD-
LSTM that utilized only foot contact information from a single sensor on the sacrum 
resulted in an r2 = 0.62, with 95% LoA [-0.17 0.18] BW and a bias of 0.01 BW. In 
another study, the 95% LoA ranged from [-0.503 0.220] BW with a bias of -0.142 BW 
[47]. The major difference between the previous work and ours was that they estimated 
single stance phase vertical GRFs on a treadmill, while we estimated entire waveforms in 
a free running environment, which is inherently more variable than in the controlled 
laboratory setting. 
125 
  
Impulse had the best performance of the calculated discrete kinetic variables from 
the estimated GRF waveform, as it was the least effected by underestimation of the force 
waveform magnitude. Linear regression showed that estimated and measured impulse 
were moderately correlated, r2 = 0.571 for all foot contacts (Figure 7.9 Panel A). Impulse 
was underestimated by 0.040 BW*s for all foot contacts, which equates to approximately 
6 - 8% error in the estimation of impulse across the range of locomotion velocities and 
grades. More precise estimation of contact time from the BD-LSTM increased the 
accuracy in the calculation of impulse. In the previous chapter, estimated impulse was 
weakly correlated with the measured impulse, r2 = 0.385, with a bias of 0.007 BW*s and 
95% LoA [-0.051 0.065] BW*s, while for the current study we observed an r2= 0.571, a 
bias of -0.020 BW*s and 95% LoA [-0.060 0.021] BW*s. A previous study estimated 
impulse with a mean absolute error of approximately 0.030 BW*s across velocities and 
grades running on a treadmill [45], compared to the current study with RMSE across 
running velocities and grades ranging from 0.022 to 0.086 BW*s. The RMSEs for 
impulse in the present study tended to be larger for decline running compared to level 
ground and incline running, due to the more pronounced impact peak observed in that 
condition. 
Estimated ALR was weakly correlated with measured ALR, with r2 = 0.160 for 
level ground running, r2 = 0.210 for decline running, and r2 = 0.492 for incline running. 
In comparison, from Chapter VI, correlation between the estimated and measured ALR 
during level ground running on a track surface yielded an r2 = 0.614. Estimated ALR was 
more accurate when there were more footfalls available in the waveform (e.g., a longer 
temporal input). For comparison, in another study using data collected in a laboratory 
126 
  
environment across a range of velocities, loading rate was moderately correlated to 
measured loading rate, with an r2 = 0.57, a bias of -2.9 BW s-1 95% and LoA [-16 10] 
BW s-1 [47], while the current study presents a 95% LoA [34.002 7.917] BW s
-1 with a 
bias of -13.042 BW s-1. Estimated ALR has been reported to have larger percent errors 
than other estimated kinetic variables [45]. In the present study, this may be due to the 
inherent differences in loading rate between decline, incline, and level running. The ALR 
is typically much larger for decline running than it is for incline or level ground running 
[35].  
There are various limitations in the collection of running data outside of the 
laboratory, some of which have been highlighted above. There were two corrections 
made to the force data in this study. The first of these was an iterative corrections 
algorithm to resolve differences between the internal clocks of the IMU and the force 
sensing insoles. Second, throughout the study, approximately 500 footfalls were observed 
to have a drifting baseline. The drifting baseline may have been a result of the force 
sensing insole moving between the foot and the shoe during the highly dynamic running 
activities being tested. There was also a small error in the synchronization of GPS data to 
IMU and GRF data. Our protocol involved synchronization of the data by matching the 
sudden increase in velocity measured by the GPS to the beginning of the run, and the 
periods in which the runner had minimal velocity (e.g., while waiting at cross walks). 
This protocol greatly improved our data analysis capacity, but unfortunately there did 
remain a slight offset in the GPS data, for each footfall. This can be rectified by the use of 
on-board GPS with the IMUs or force sensing insoles, which will lead the to the GPS 
being time synced with either the IMU or the GPS data. 
127 
  
The BD-LSTM shows great promise in its ability to estimate running GRF 
waveforms. However, the algorithm appears to have limited transferability to a novel 
participant, as evidenced by the inferior performance in the estimation of kinetic 
variables, especially at faster running velocities (due to the limited data available for 
model input at higher running velocities). Performance might be improved upon by the 
addition of more data at faster running velocities.  
 In conclusion, while this study has limitations in its transferability to 
biomechanical analysis of running in the real world, it is the first study to our knowledge 
to make kinetic measures during an uncontrolled run outside of the laboratory. Further, 
this is the first study to estimate IMU contact times and validate them against a kinetic 
standard, as well as estimate kinetic waveforms via machine learning. We have shown 
that the estimation of gait events and contact time using IMU data matches the estimation 
of a machine learning algorithm. Future studies focusing on building models for training 
load, single participant machine learning models, and the inclusion of GPS data into the 
input of the machine learning algorithm may reduce the underestimations of the stance 
phase GRFs at faster running velocities. 
128 
  
Figure 7.6: 
Kinetic 
variables across 
the range of 
running 
velocities, with 
level ground, 
decline and 
incline foot 
contacts 
shown.   
 
 
129 
  
 
 
 
Figure 7.7: Stance Average GRF. Linear regression and Bland-Altman plots are 
presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & 
G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of 
the regression line are presented in panels A-D. The Bland-Altman plots present 
differences between the estimated and measured stance average GRF. The average 
difference and the 95% LoA are shown in panels E-H. 
 
 
130 
  
 
Figure 7.8: Peak Force. Linear regression and Bland-Altman plots are presented for all 
foot contacts (A & E), followed by level ground (B & F), decline (C & G) and incline 
foot contacts (D & H). Pearson Correlation Coefficients and the slope of the regression 
line are presented in panels A-D. The Bland-Altman plots present differences between 
the estimated and measured stance average GRF. The average difference and the 95% 
LoA are shown in panels E-H. 
131 
  
 
Figure 7.9: Impulse. Linear regression and Bland-Altman plots are presented for all foot 
contacts (A & E), followed by level ground (B & F), decline (C & G) and incline foot 
contacts (D & H). Pearson Correlation Coefficients and the slope of the regression line 
are presented in panels A-D. The Bland-Altman plots present differences between the 
estimated and measured stance average GRF. The average difference and the 95% LoA 
are shown in panels E-H. 
 
132 
  
 
Figure 7.10: Average Loading Rate. Linear regression and Bland-Altman plots are 
presented for all foot contacts (A & E), followed by level ground (B & F), decline (C & 
G) and incline foot contacts (D & H). Pearson Correlation Coefficients and the slope of 
the regression line are presented in panels A-D. The Bland-Altman plots present 
differences between the estimated and measured stance average GRF. The average 
difference and the 95% LoA are shown in panels E-H. 
 
133 
  
CHAPTER VIII 
CONCLUSION  
Summary of Results and Findings 
This dissertation set out to develop validated methodologies for the identification 
of gait events and estimation of GRF waveforms from wearable sensors across a range of 
locomotion modes and running velocities from data collected outside of the laboratory. 
Motivation for this work was the potential of wearable sensors and machine learning to 
explore whole new areas of gait biomechanics outside of the laboratory. We first 
validated a supervised machine learning model for the identification of features occurring 
prior to gait events from motion capture data collected on a treadmill (Chapter III). Then 
an unsupervised machine learning model and a heuristic algorithm were implemented 
with inertial and GRF data collected outside of the laboratory for feature identification 
prior to gait events (Chapter IV). The dissertation then shifted to identification of gait 
events and estimation of GRF waveforms during running performance in a semi-
uncontrolled and uncontrolled environment. In Chapter V, we developed an algorithm for 
the identification of gait events in a semi-uncontrolled environment from IMUs on the 
foot and on the sacrum and validated the results against force sensing insoles. In Chapter 
VI we implemented an LSTM for the estimation of GRF waveforms. We were able to 
estimate gait events, contact time, and calculated discrete GRF variables form the output 
waveform. Finally, in Chapter VII we applied the techniques developed in the previous 
chapters to estimate gait events from the IMUs with heuristic rules, and estimate GRF 
waveforms from data collected in a free running environment.  
134 
  
 Chapter III demonstrated an in-laboratory validation of a supervised BP-AR-
HMM for the identification of consistent features prior to gait events across a number of 
steady state walking and running velocities as well as the dynamic transitions between 
them. We have shown the feature output of the BP-AR-HMM can be used to identify 
when locomotion transitions may occur. We utilized minimal sensor data from an 
idealized accelerometer for the identification of gait events prior to their occurrence. The 
use of minimally sampled data would greatly reduce the computational load for 
identification of features prior to gait events collected over long periods time. This is a 
crucial aspect of this work, as minimizing the computational load will increase the battery 
life of an assistive device. The success of this algorithm and the methods developed are a 
solid foundation for translation of this work outside of the laboratory.  
 Chapter IV demonstrated the efficacy of feature identification prior to gait events 
on an outdoor course with both a heuristic and an unsupervised machine learning 
algorithm. Two major methodologies were developed in this chapter, first the 
introduction of a heuristic algorithm to identify features prior to a gait event, and the 
collection of data outside of the laboratory with naturally occurring transitions between 
locomotion modes. This work was based on the input of a single stream of 25 Hz and 100 
Hz angular velocity data across locomotion modes and transitions between. These models 
were at least as accurate as previous methodologies, while using minimally sampled data 
from a single source. We believe this work could be the basis for a near real-time system 
for participant independent identification of gait events across locomotion modes and 
velocities with a heuristic or an unsupervised machine learning algorithm.  
135 
  
 Chapter V validated identification of IC and TO during running gait. The 
methodologies from Chapter IV were applied to the sports performance setting, as we 
implemented and synthesized ideas from the previous chapters for the validation of the 
identification of gait events and contact times from IMUs and compared the results to the 
standard of a force insole. We have shown the use of foot mounted IMUs for 
identification of foot contacts is superior to that of the sacral mounted IMUs. Results 
from this study suggest we will be able to estimate gait events from data collected in a 
semi-uncontrolled running environment across a range of locomotion velocities and 
participant skill levels.  
 Chapter VI detailed the use of a machine learning algorithm for the estimation of 
GRF waveforms. The data presented from this study were outputs from a LOOCV, to 
provide a conservative result of this methodology. The estimated GRF waveform for the 
calculation of contact time was strongly correlated to the measured contact times. The 
calculated discrete variables generally matched the trends in the measured data with a 
bias for underestimation of the variables. The analysis of running performance in a semi-
uncontrolled environment provided a strong enough basis for the application of this work 
into a real-world environment.  
 Chapter VII describes the collection and analysis of gait events and GRF variables 
while running in a free running environment. The gait event detection algorithm was 
modified slightly to include features from the angular velocity data for the identification 
of pre-IC, from Chapter IV. We evaluated multiple temporal windows durations for the 
estimation of GRF waveforms and identified a 1 second window. Estimated contact times 
from the BD-LSTM were similar to those estimated from the IMUs on both feet. There 
136 
  
was significant underestimation of GRF waveforms. The results of this study provide 
insight into the measurement and analysis of GRF waveforms during running 
performance in an uncontrolled environment.  
 In summary, the methods and results of these studies show the potential of 
wearable sensors and machine learning for the analysis of biomechanics of individuals 
outside of the laboratory. We have shown the accuracy of these models for the estimation 
of features prior to gait events, using data from a single sensor on the foot (Chapters III 
and IV) both in an out of the laboratory environment. We have identified gait events and 
estimated ground reaction forces in both a semi-uncontrolled and uncontrolled 
environment (Chapters IV, V, VI and VII). Features for prior knowledge of gait events 
were found to be crucial in Chapter VII during running in an uncontrolled environment. 
These methods can be used for the identification of gait events with minimal sensor data 
in assistive devices or for the monitoring of training load for runners with wearable 
sensors.  
 
Limitations 
One of the most significant limitations of this work is that all the calculations and 
estimations were made offline. Application of these algorithms for near real-time 
feedback would be advantageous for analysis of biomechanical data in the field (e.g. for 
clinicians, coaches and athletes). Specifically, the running data collected would be ideal 
for feedback for participants. Furthermore, models developed in this dissertation are not 
directly transferable to real world applications. However, they build a framework for the 
collection of biomechanical analysis outside of the laboratory with wearable sensors.  
137 
  
Errors in the synchronization of the sensors included: drift between the IMU clock 
and the force sensing insole clock, these led to larger differences in the estimation of gait 
events. These errors resulted in zeros being added or removed during swing the phase 
from the force sensing insole data. These errors were present specifically in Chapters IV 
and V as there were no adjustments made for the initial gait event detection algorithms. 
We did not want to artificially decrease the error in the results of this work by removing 
data due to imperfect synchronization or adjusting the data for the identification of gait 
events, therefore we did not match the ICs, but maintained a maximum difference of 
0.002s. This in turn led to small cumulative errors between the measured and estimated 
gait events.  
One limitation unique to Chapter III is that data from motion capture markers 
were not directly comparable to measured acceleration from an IMU. The calculated 
accelerations were similar to the output of a Kalman filter with a zero-velocity update 
with gravity removed from the vertical direction of the filtered acceleration signal. 
Therefore, the models developed here are not directly applicable to any clinical work. 
However, implementation of the BP-AR-HMM for the identification of features prior to 
gait events were successful and therefore used in the next chapter.  
One limitations of Chapter IV is that we did not complete any real-time testing for 
gait event detection, or the implementation of these algorithms into a control system. This 
limits the clinical applicability of the work, as it remains unknown how real-time 
implementation would affect the performance of the BP-AR-HMM. Additionally, these 
data were collected on a set course and not in an unconstrained environment, therefore, 
we do not know how applicable these models are for real world biomechanical analysis.  
138 
  
A limitation of Chapter V is that we chose to include a single participant’s data 
that heavily influenced the analysis. Removal of this participant from the data set reduced 
the bias of the estimation of contact time from an overestimation of 4 ms to 1 ms. The 
algorithms in this chapter are not yet validated in a real-world training scenario. As 
participants were only asked to run at set paces on a level ground track, there were no 
changes in grade or terrain, which may affect accuracy of the identification of gait events. 
Additionally, there was a discrepancy between the internal clocks of the IMUs and the 
force sensing insoles. We chose not to adjust these discrepancies as we didn’t want to 
artificially lower the differences the estimated and measured gait events. Therefore, the 
identification of gait events in practice may perform better than indicated in this work. 
Limitations of Chapters VI and VII were in part due to the limited transferability of the 
machine learning algorithms for use in training or clinical scenarios. There is yet too 
much error for participant independent algorithms for the estimation of GRF waveforms 
to be applied in a real-world scenario. The models were biased to running velocities < 4.0 
m s-1, therefore at faster running velocities, the BD-LSTM underestimated the GRF 
waveforms at these velocities. For the most accurate estimations of GRF waveforms, we 
would recommended at this point that the participant’s own data be included in the 
training of the algorithm or that participant-specific algorithms should be developed for 
the estimation of GRF waveforms. Additionally, approximately 500 footfalls were 
adjusted due to a drifting baseline. Improvements in the technology for the measurement 
of GRF data between the foot and the shoe are needed, specifically with an increase in 
sampling frequency for more effective measurement of loading rates, as well as more 
accurate measurement of foot contacts. Finally, there were slight errors in the 
139 
  
synchronization of the GPS to IMU and force data. This could be rectified with the 
utilization of an embedded GPS with the IMU.  
 
Recommendations for Future Work 
This dissertation provides a platform upon which to implement algorithms for the 
identification of gait events in an assistive device with minimal sensor data, and the 
application of these methods for the creation of a training load quantification algorithm 
for running coaches and athletes through the estimation of GRF waveforms. Additionally, 
we have laid the foundation for methods of data collection outside of the laboratory, with 
force sensing insoles, IMUs and GPS.  
Future work could explore the use of these algorithms in an onboard controller for 
an assistive device, first in a controlled environment such as the laboratory but also in an 
uncontrolled environment with the user freely moving with the device. The heuristic 
algorithm is simple to implement for identification of gait events for long term 
monitoring of clinical populations. This would be a necessary step to understand how 
those who are using assistive devices are interacting with them outside of the clinic. If we 
understand how the patient is moving throughout their daily life, targeted interventions 
from the clinic may help them interact more efficiently with their environment. 
From the second half of the dissertation, models for the quantification of training 
load could easily be derived from these data and improved upon with the addition of 
more data and the inclusion of more participants. Other work could explore methods for 
the implementation of participant-specific models, instead of the subject independent 
models developed in this work. There is evidence, yet unpublished, from similar work on 
140 
  
the treadmill in the laboratory showing the importance of development of participant-
specific models for the quantification of training load. The use of GPS and IMU data for 
input into a machine learning algorithm may lead to better results for estimation of 
ground reaction force waveforms, as velocity has been shown to have a greater effect on 
discrete ground reaction force waveform calculations. There is a need to understand how 
novice runners perform and learn to run consistently. Long term monitoring of novice 
runners would inform the creation of new models for running feedback. These models of 
novice runners could be crucial in understanding the development of running related 
injuries. 
Future work could also explore the differences between running in the laboratory 
and running in a real-world environment with the force loading insoles and provide 
corrections to the measured force data to match exactly what we measure on the in-lab 
force instrumented treadmill. Lastly error state Kalman filters could be implemented for 
the tracking of the IMU trajectories in 3-D space, which would allow for the estimation 
of join kinematics and provide estimations of joint kinetic values with the inclusion of 
force sensor data.  
141 
  
REFERENCES CITED 
[1] B. J. Horsley et al., Does Site Matter? Impact of Inertial Measurement Unit 
Placement on the Validity and Reliability of Stride Variables During Running: A 
Systematic Review and Meta-analysis, vol. 51, no. 7. Springer International 
Publishing, 2021. 
 
[2] A. Ancillao, S. Tedesco, J. Barton, and B. O’flynn, “Indirect measurement of 
ground reaction forces and moments by means of wearable inertial sensors: A 
systematic review,” Sensors (Switzerland), vol. 18, no. 8, 2018. 
 
[3] I. Weygers, M. Kok, M. Konings, H. Hallez, H. De Vroey, and K. Claeys, “Inertial 
sensor-based lower limb joint kinematics: A methodological systematic review,” 
Sensors (Switzerland), vol. 20, no. 3, pp. 1–23, 2020. 
 
[4] D. A. Winter, “Kinematic and kinetic patterns in human gait: Variability and 
compensating effects,” Hum. Mov. Sci., vol. 3, no. 1–2, pp. 51–76, 1984. 
 
[5] R. Cavahna, GA; Saibene, F P; Margaria, “Mechanical work in running,” J. Appl. 
Physiol., vol. 19, no. 2, pp. 249–256, 1964. 
 
[6] M. H. Dickinson, C. T. Farley, R. J. Full, M. A. R. Koehl, R. Kram, and S. 
Lehman, “How Animals Move: An Integrative View,” Science (80-. )., vol. 288, 
no. 5463, pp. 100–106, Apr. 2000. 
 
[7] S. Sprager and M. B. Juric, Inertial sensor-based gait recognition: A review, vol. 
15, no. 9. 2015. 
 
[8] S. Chen, J. Lach, B. Lo, and G. Z. Yang, “Toward Pervasive Gait Analysis With 
Wearable Sensors: A Systematic Review,” IEEE J. Biomed. Heal. Informatics, vol. 
20, no. 6, pp. 1521–1537, 2016. 
 
[9] A. Peebles, L. Maguire, K. Renner, and R. Queen, “Validity and Repeatability of 
Single-Sensor Loadsol Insoles during Landing,” Sensors, vol. 18, no. 12, p. 4082, 
2018. 
 
[10] K. E. Renner, D. S. Blaise Williams, and R. M. Queen, “The reliability and 
validity of the Loadsol® under various walking and running conditions,” Sensors 
(Switzerland), vol. 19, no. 2, pp. 1–14, 2019. 
 
[11] I. P. I. Pappas, T. Keller, S. Mangold, M. R. Popovic, V. Dietz, and M. Morari, “A 
Reliable Gyroscope-Based Gait-Phase Detection Sensor Embedded in a Shoe 
Insole,” IEEE Sens. J., vol. 4, no. 2, pp. 268–274, 2004. 
 
 
 
142 
  
[12] R. C. González, A. M. López, J. Rodriguez-Uría, D. Álvarez, and J. C. Alvarez, 
“Real-time gait event detection for normal subjects from lower trunk 
accelerations,” Gait Posture, vol. 31, no. 3, pp. 322–325, 2010. 
 
[13] S. Khandelwal and N. Wickström, “Evaluation of the performance of 
accelerometer-based gait event detection algorithms in different real-world 
scenarios using the MAREA gait database,” Gait Posture, vol. 51, pp. 84–90, 
2017. 
 
[14] D. Joshi, B. H. Nakamura, and M. E. Hahn, “A novel approach for toe off 
estimation during locomotion and transitions on ramps and level ground,” IEEE J. 
Biomed. Heal. Informatics, vol. 20, no. 1, pp. 153–157, 2016. 
 
[15] N. C. Bejarano, E. Ambrosini, A. Pedrocchi, G. Ferrigno, M. Monticone, and S. 
Ferrante, “A novel adaptive, real-time algorithm to detect gait events from 
wearable sensors,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 3, pp. 
413–422, 2015. 
 
[16] P. C. Formento, R. Acevedo, S. Ghoussayni, and D. Ewins, “Gait event detection 
during stair walking using a rate gyroscope,” Sensors (Switzerland), vol. 14, no. 3, 
pp. 5470–5485, 2014. 
 
[17] A. Mannini, V. Genovese, and A. M. Sabatini, “Online decoding of hidden markov 
models for gait event detection using foot-mounted gyroscopes,” IEEE J. Biomed. 
Heal. Informatics, vol. 18, no. 4, pp. 1122–1130, 2014. 
 
[18] A. Mannini and A. M. Sabatini, “Gait phase detection and discrimination between 
walking-jogging activities using hidden Markov models applied to foot motion 
data from a gyroscope,” Gait Posture, vol. 36, no. 4, pp. 657–661, 2012. 
 
[19] L. Ojeda and J. Borenstein, “Non-GPS Navigation for Security Personnel and First 
Responders,” J. Navig., vol. 60, no. 3, pp. 391–407, Sep. 2007. 
 
[20] A. M. Zaferiou et al., “Quantifying performance on an outdoor agility drill using 
foot-mounted inertial measurement units,” PLoS One, vol. 12, no. 11, pp. 1–15, 
2017. 
 
[21] J. M. Jasiewicz et al., “Gait event detection using linear accelerometers or angular 
velocity transducers in able-bodied and spinal-cord injured individuals,” Gait 
Posture, vol. 24, no. 4, pp. 502–509, 2006. 
 
[22] J. D. Miller, M. S. Beazer, and M. E. Hahn, “Myoelectric walking mode 
classification for transtibial amputees,” IEEE Trans. Biomed. Eng., vol. 60, no. 10, 
pp. 2745–2750, 2013. 
 
 
143 
  
[23] H. L. Bartlett and M. Goldfarb, “A phase variable approach for IMU-based 
locomotion activity recognition,” IEEE Trans. Biomed. Eng., vol. 65, no. 6, pp. 
1330–1338, 2018. 
 
[24] X. Zhang and H. Huang, “A real-time, practical sensor fault-tolerant module for 
robust EMG pattern recognition,” J. Neuroeng. Rehabil., vol. 12, no. 1, p. 18, 
2015. 
 
[25] R. B. Woodward, J. A. Spanias, and L. J. Hargrove, “User intent prediction with a 
scaled conjugate gradient trained artificial neural network for lower limb amputees 
using a powered prosthesis,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 
EMBS, vol. 2016-Octob, pp. 6405–6408, 2016. 
 
[26] A. J. Young and L. J. Hargrove, “A Classification Method for User-Independent 
Intent Recognition for Transfemoral Amputees Using Powered Lower Limb 
Prostheses,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 24, no. 2, pp. 217–225, 
Feb. 2016. 
 
[27] D. Quintero, D. J. Lambert, D. J. Villarreal, and R. D. Gregg, “Real-Time 
continuous gait phase and speed estimation from a single sensor,” 1st Annu. IEEE 
Conf. Control Technol. Appl. CCTA 2017, vol. 2017-Janua, pp. 0–5, 2017. 
 
[28] D. J. Villarreal, H. A. Poonawala, and R. D. Gregg, “A Robust Parameterization of 
Human Gait Patterns Across Phase-Shifting Perturbations,” IEEE Trans. Neural 
Syst. Rehabil. Eng., vol. 25, no. 3, pp. 265–278, 2017. 
 
[29] E. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “Bayesian nonparametric 
inference of switching dynamic linear models,” Signal Process. IEEE Trans., vol. 
59, no. 4, pp. 1569–1585, 2011. 
 
[30] E. B. Fox, M. C. Hughes, E. B. Sudderth, and M. I. Jordan, “Joint modeling of 
multiple time series via the beta process with application to motion capture 
segmentation,” Ann. Appl. Stat., vol. 8, no. 3, pp. 1281–1313, 2014. 
 
[31] A. E. Kerdok, A. A. Biewener, T. A. McMahon, P. G. Weyand, and H. M. Herr, 
“Energetics and mechanics of human running on surfaces of different stiffnesses,” 
J. Appl. Physiol., vol. 92, no. 2, pp. 469–478, 2002. 
 
[32] D. P. Ferris, M. Louie, and C. T. Farley, “Running in the real world : adjusting leg 
stiffness for different surfaces,” no. January, 1998. 
 
[33] R. Kram, A. Domingo, and D. P. Ferris, “Effect of reduced gravity on the 
preferred walk-run transition speed.,” J. Exp. Biol., vol. 200, no. Pt 4, pp. 821–6, 
1997. 
 
 
144 
  
[34] S. Kipp, A. M. Grabowski, and R. Kram, “What determines the metabolic cost of 
human running across a wide range of velocities?,” J. Exp. Biol., vol. 221, no. 18, 
2018. 
 
[35] J. S. Gottschall and R. Kram, “Ground reaction forces during downhill and uphill 
running,” J. Biomech., vol. 38, no. 3, pp. 445–452, Mar. 2005. 
 
[36] C. S. Whiting, S. P. Allen, J. W. Brill, and R. Kram, “Steep (30°) uphill walking 
vs. running: COM movements, stride kinematics, and leg muscle excitations,” Eur. 
J. Appl. Physiol., vol. 120, no. 10, pp. 2147–2157, 2020. 
 
[37] A. S. Voloshina and D. P. Ferris, “Biomechanics and energetics of running on 
uneven terrain,” J. Exp. Biol., vol. 218, no. 5, pp. 711–719, 2015. 
 
[38] L. C. Benson, C. A. Clermont, E. Bošnjak, and R. Ferber, “The use of wearable 
devices for walking and running gait analysis outside of the lab: A systematic 
review,” Gait Posture, vol. 63, no. April, pp. 124–138, 2018. 
 
[39] J. Reenalda, E. Maartens, L. Homan, and J. H. (Jaap. Buurke, “Continuous three 
dimensional analysis of running mechanics during a marathon by means of inertial 
magnetic measurement units to objectify changes in running mechanics,” J. 
Biomech., vol. 49, no. 14, pp. 3362–3367, 2016. 
 
[40] L. C. Benson, C. A. Clermont, R. Watari, T. Exley, and R. Ferber, “Automated 
accelerometer-based gait event detection during multiple running conditions,” 
Sensors (Switzerland), vol. 19, no. 7, pp. 1–19, 2019. 
 
[41] S. Mo and D. H. K. Chow, “Accuracy of three methods in gait event detection 
during overground running,” Gait Posture, vol. 59, no. June 2017, pp. 93–98, 
2018. 
 
[42] D. K. Chew, K. J. H. Ngoh, D. Gouwanda, and A. A. Gopalai, “Estimating running 
spatial and temporal parameters using an inertial sensor,” Sport. Eng., vol. 21, no. 
2, pp. 115–122, 2018. 
 
[43] R. S. Alcantara, E. M. Day, M. E. Hahn, and A. M. Grabowski, “Sacral 
acceleration can predict whole-body kinetics and stride kinematics across running 
speeds,” PeerJ, vol. 9, pp. 1–18, 2021. 
 
[44] E. M. Day, R. S. Alcantara, M. A. McGeehan, A. M. Grabowski, and M. E. Hahn, 
“Low-pass filter cutoff frequency affects sacral-mounted inertial measurement unit 
estimations of peak vertical ground reaction force and contact time during 
treadmill running.,” J. Biomech., vol. 119, p. 110323, Apr. 2021. 
 
 
 
145 
  
[45] R. S. Alcantara, W. B. Edwards, G. Y. Millet, and A. M. Grabowski, “Predicting 
continuous ground reaction forces from accelerometers during uphill and downhill 
running: a recurrent neural network solution,” PeerJ, vol. 10, p. e12752, 2022. 
 
[46] W. R. Johnson, A. Mian, M. A. Robinson, J. Verheul, D. G. Lloyd, and J. A. 
Alderson, “Multidimensional Ground Reaction Forces and Moments from 
Wearable Sensor Accelerations via Deep Learning,” IEEE Trans. Biomed. Eng., 
vol. 68, no. 1, pp. 289–297, 2021. 
 
[47] F. J. Wouda et al., “Estimation of Vertical Ground Reaction Forces and Sagittal 
Knee Kinematics During Running Using Three Inertial Sensors,” Front. Physiol., 
vol. 9, pp. 1–14, Mar. 2018. 
 
[48] R. Watari, B. Hettinga, S. Osis, and R. Ferber, “Validation of a torso-mounted 
accelerometer for measures of vertical oscillation and ground contact time during 
treadmill running,” J. Appl. Biomech., vol. 32, no. 3, pp. 306–310, 2016. 
 
[49] J. Sinclair, S. J. Hobbs, L. Protheroe, C. J. Edmundson, and A. Greenhalgh, 
“Determination of gait events using an externally mounted shank accelerometer,” 
J. Appl. Biomech., vol. 29, no. 1, pp. 118–122, 2013. 
 
[50] P. Robberechts et al., “Predicting gait events from tibial acceleration in rearfoot 
running: A structured machine learning approach,” Gait Posture, vol. 84, pp. 87–
92, 2021. 
 
[51] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., 
vol. 9, no. 8, pp. 1735–1780, Nov. 1997. 
 
[52] E. Dorschky, M. Nitschke, C. F. Martindale, A. J. van den Bogert, A. D. 
Koelewijn, and B. M. Eskofier, “CNN-Based Estimation of Sagittal Plane Walking 
and Running Biomechanics From Measured and Simulated Inertial Sensor Data,” 
Front. Bioeng. Biotechnol., vol. 8, no. June, pp. 1–14, 2020. 
 
[53] L. Jin, “Kinematic and Kinetic Analysis of Walking and Running Across Speeds 
and Transitions Between Locomotion States,” University of Oregon, 2018. 
 
[54] L. V. Ojeda et al., “Estimating stair running performance using inertial sensors,” 
Sensors (Switzerland), vol. 17, no. 11, pp. 1–14, 2017. 
 
[55] M. Kok, J. D. Hol, and T. B. Schön, “Using Inertial Sensors for Position and 
Orientation Estimation,” 2017. 
 
[56] A. D. Bull and M. L. Oct, “Convergence rates of efficient global optimization 
algorithms,” vol. 26, pp. 1–30. 
 
 
146 
  
[57] M. A. Gelbart, J. Snoek, and R. P. Adams, “Bayesian Optimization with Unknown 
Constraints,” pp. 1–14. 
 
[58] T. Ueda, H. Hobara, Y. Kobayashi, T. A. Heldoorn, M. Mochimaru, and H. 
Mizoguchi, “Comparison of 3 Methods for Computing Loading Rate during 
Running,” Int. J. Sports Med., vol. 37, no. 13, pp. 1087–8090, 2016. 
 
[59] S. K. Au, J. Weber, and H. Herr, “Powered ankle-foot prosthesis improves walking 
metabolic economy,” IEEE Trans. Robot., vol. 25, no. 1, pp. 51–66, 2009. 
 
[60] M. Grimmer et al., “A powered prosthetic ankle joint for walking and running,” 
Biomed. Eng. Online, vol. 15, no. S3, p. 141, 2016. 
 
[61] B. E. Lawson, H. A. Varol, A. Huff, E. Erdemir, and M. Goldfarb, “Control of 
stair ascent and descent with a powered transfemoral prosthesis,” IEEE Trans. 
Neural Syst. Rehabil. Eng., vol. 21, no. 3, pp. 466–473, 2013. 
 
[62] A. D. Segal et al., “The effects of a controlled energy storage and return prototype 
prosthetic foot on transtibial amputee ambulation,” Hum. Mov. Sci., vol. 31, no. 4, 
pp. 918–931, 2012. 
 
[63] R. D. Bellman, M. A. Holgate, and T. G. Sugar, “SPARKy 3: Design of an active 
robotic ankle prosthesis with two actuated degrees of freedom using regenerative 
kinetics,” Proc. 2nd Bienn. IEEE/RAS-EMBS Int. Conf. Biomed. Robot. 
Biomechatronics, BioRob 2008, pp. 511–516, 2008. 
 
[64] B. H. Nakamura and M. E. Hahn, “Myoelectric Activation Pattern Changes in the 
Involved Limb of Individuals With Transtibial Amputation During Locomotor 
State Transitions,” Arch. Phys. Med. Rehabil., vol. 98, no. 6, pp. 1180–1186, 2017. 
 
[65] M. Hanlon and R. Anderson, “Real-time gait event detection using wearable 
sensors,” Gait Posture, vol. 30, no. 4, pp. 523–527, 2009. 
 
[66] G. Pacini Panebianco, M. C. Bisi, R. Stagni, and S. Fantozzi, “Analysis of the 
performance of 17 algorithms from a systematic review: Influence of sensor 
position, analysed variable and computational approach in gait timing estimation 
from IMU measurements,” Gait Posture, vol. 66, no. April, pp. 76–82, 2018. 
 
[67] E. Guenterberg, A. Y. Yang, H. Ghasemzadeh, R. Jafari, R. Bajcsy, and S. S. 
Sastry, “A method for extracting temporal parameters based on hidden markov 
models in body sensor networks with inertial sensors,” IEEE Trans. Inf. Technol. 
Biomed., vol. 13, no. 6, pp. 1019–1030, 2009. 
 
[68] L. H. Smith, T. A. Kuiken, and L. J. Hargrove, “Use of probabilistic weights to 
enhance linear regression myoelectric control,” J. Neural Eng., vol. 12, no. 6, p. 
066030, 2015. 
147 
  
[69] L. J. Hargrove, E. J. Scheme, K. B. Englehart, and B. S. Hudgins, “Multiple binary 
classifications via linear discriminant analysis for improved controllability of a 
powered prosthesis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 18, no. 1, pp. 
49–57, 2010. 
 
[70] A. Miller, “Gait event detection using a multilayer neural network,” Gait Posture, 
vol. 29, no. 4, pp. 542–545, 2009. 
 
[71] J. Bae and M. Tomizuka, “Gait phase analysis based on a Hidden Markov Model,” 
Mechatronics, vol. 21, no. 6, pp. 961–970, 2011. 
 
[72] A. J. Young, A. M. Simon, N. P. Fey, and L. J. Hargrove, “Intent recognition in a 
powered lower limb prosthesis using time history information,” Ann. Biomed. 
Eng., vol. 42, no. 3, pp. 631–641, 2014. 
 
[73] J. Y. Jung, W. Heo, H. Yang, and H. Park, “A neural network-based gait phase 
classification method using sensors equipped on lower limb exoskeleton robots,” 
Sensors (Switzerland), vol. 15, no. 11, pp. 27738–27759, 2015. 
 
[74] H. Huang, F. Zhang, L. J. Hargrove, Z. Dou, D. R. Rogers, and K. B. Englehart, 
“Continuous locomotion-mode identification for prosthetic legs based on 
neuromuscular - Mechanical fusion,” IEEE Trans. Biomed. Eng., vol. 58, no. 10 
PART 1, pp. 2867–2875, 2011. 
 
[75] P. Zhou, M. Lowery, and K. B. Englehart, “Decoding a new neural machine 
interface for control of artificial limbs,” J. Neurophysiol., vol. 98, pp. 2974–2982, 
2007. 
 
[76] A. J. Young, A. M. Simon, N. P. Fey, and L. J. Hargrove, “Classifying the intent 
of novel users during human locomotion using powered lower limb prostheses,” 
Int. IEEE/EMBS Conf. Neural Eng. NER, pp. 311–314, 2013. 
 
[77] A. J. Young, A. Simon, and L. J. Hargrove, “An intent recognition strategy for 
transfemoral amputee ambulation across different locomotion modes,” Proc. Annu. 
Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, pp. 1587–1590, 2013. 
 
[78] L. Kilmartin, R. K. Ibrahim, E. Ambikairajah, and B. Celler, “Optimising 
recognition rates for subject independent gait pattern classification,” IET Irish 
Signals Syst. Conf., pp. 1–6, 2009. 
 
[79] B. Purcell, J. Channells, D. James, and R. Barrett, “Use of accelerometers for 
detecting foot-ground contact time during running,” in BioMEMS and 
Nanotechnology II, 2006, vol. 6036, no. June 2014, p. 603615. 
 
 
 
148 
  
[80] J. Leitch, J. Stebbins, G. Paolini, and A. B. Zavatsky, “Identifying gait events 
without a force plate during running: A comparison of methods,” Gait Posture, 
vol. 33, no. 1, pp. 130–132, 2011. 
 
[81] P. C. Formento, R. Acevedo, S. Ghoussayni, and D. Ewins, “Gait Event Detection 
during Stair Walking Using a Rate Gyroscope,” pp. 5470–5485, 2014. 
 
[82] M. T. Farrell and H. Herr, “A method to determine the optimal features for control 
of a powered lower-limb prostheses,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. 
Soc. EMBS, pp. 6041–6046, 2011. 
 
[83] J. K. Hitt et al., “Bionic Running for Unilateral Transtibial Military Amputees,” 
27th Army Sci. Conf., p. 8, 2010. 
 
[84] G. Cavagna, F. P. Saibene, and R. Margaria, “External work in walking.,” J. Appl. 
Physiol., vol. 18, no. 3, pp. 1–9, 1963. 
 
[85] D. P. Ferris, M. Louie, and C. T. Farley, “Running in the real world: Adjusting leg 
stiffness for different surfaces,” Proc. R. Soc. B Biol. Sci., vol. 265, no. 1400, pp. 
989–994, 1998. 
 
[86] H. Geyer, A. Seyfarth, and R. Blickhan, “Compliant leg behaviour explains basic 
dynamics of walking and running,” Proc. R. Soc. B Biol. Sci., vol. 273, no. 1603, 
pp. 2861–2867, 2006. 
 
[87] L. Jin and M. E. Hahn, “Modulation of lower extremity joint stiffness, work and 
power at different walking and running speeds,” Hum. Mov. Sci., vol. 58, no. 
January, pp. 1–9, 2018. 
 
[88] R. Blickhan, “The spring-mass model for running and hopping,” Journal of 
Biomechanics, vol. 22, no. 11–12. pp. 1217–1227, 1989. 
 
[89] E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “Sharing Features 
among Dynamical Systems with Beta Processes,” Adv. Neural Inf. Process. Syst. 
23 24th Annu. Conf. Neural Inf. Process. Syst. 2010., pp. 388–396, 2009. 
 
[90] E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “An HDP-HMM for 
systems with state persistence,” Proc. 25th Int. Conf. Mach. Learn. - ICML ’08, 
pp. 312–319, 2008. 
 
[91] D. Joshi and M. E. Hahn, “Terrain and Direction Classification of Locomotion 
Transitions Using Neuromuscular and Mechanical Input,” Ann. Biomed. Eng., vol. 
44, no. 4, pp. 1275–1284, 2016. 
 
 
 
149 
  
[92] J. A. Spanias, A. M. Simon, K. A. Ingraham, and L. J. Hargrove, “Effect of 
additional mechanical sensor data on an EMG-based pattern recognition system for 
a powered leg prosthesis,” Int. IEEE/EMBS Conf. Neural Eng. NER, vol. 2015-
July, pp. 639–642, 2015. 
 
[93] J. Taborri, E. Palermo, S. Rossi, and P. Cappa, “Gait partitioning methods: A 
systematic review,” Sensors (Switzerland), vol. 16, no. 1, pp. 40–42, 2016. 
 
[94] B. D. Robertson and G. S. Sawicki, “Exploiting elasticity: Modeling the influence 
of neural control on mechanics and energetics of ankle muscle-tendons during 
human hopping,” J. Theor. Biol., vol. 353, pp. 121–132, 2014. 
 
[95] N. C. Bejarano, E. Ambrosini, A. Pedrocchi, G. Ferrigno, M. Monticone, and S. 
Ferrante, “A novel adaptive, real-time algorithm to detect gait events from 
wearable sensors,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 3, pp. 
413–422, 2015. 
 
[96] R. Williamson and B. J. Andrews, “Gait event detection for FES using 
accelerometers and supervised machine learning,” IEEE Trans. Rehabil. Eng., vol. 
8, no. 3, pp. 312–319, 2000. 
 
[97] J. A. Spanias, A. M. Simon, E. J. Perreault, and L. J. Hargrove, “Preliminary 
results for an adaptive pattern recognition system for novel users using a powered 
lower limb prosthesis,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, 
vol. 2016-Octob, pp. 5083–5086, 2016. 
 
[98] E. Sejdic, K. A. Lowry, J. Bellanca, M. S. Redfern, and J. S. Brach, “A 
comprehensive assessment of gait accelerometry signals in time, frequency and 
time-frequency domains,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 3, 
pp. 603–612, 2014. 
 
[99] M. A. Azhar, D. Gouwanda, and A. A. Gopalai, “Development of an intelligent 
real-time heuristic-based algorithm to identify human gait events,” 2014 IEEE-
EMBS Int. Conf. Biomed. Heal. Informatics, BHI 2014, pp. 573–576, 2014. 
 
[100] H. Huang, F. Zhang, Y. L. Sun, and H. He, “Design of a robust EMG sensing 
interface for pattern classification,” J. Neural Eng., vol. 7, no. 5, p. 056005, 2010. 
 
[101] R. Caldas, M. Mundt, W. Potthast, F. Buarque de Lima Neto, and B. Markert, “A 
systematic review of gait analysis methods based on inertial sensors and adaptive 
algorithms,” Gait Posture, vol. 57, no. February, pp. 204–210, 2017. 
 
[102] F. Di Nardo, C. Morbidoni, G. Mascia, F. Verdini, and S. Fioretti, “Intra-subject 
approach for gait-event prediction by neural network interpretation of EMG 
signals,” Biomed. Eng. Online, vol. 19, no. 1, pp. 1–20, 2020. 
 
150 
  
[103] C. Morbidoni, A. Cucchiarelli, S. Fioretti, and F. Di Nardo, “A deep learning 
approach to EMG-based classification of gait phases during level ground 
walking,” Electron., vol. 8, no. 8, 2019. 
 
[104] N. Nazmi, M. A. Abdul Rahman, S. I. Yamamoto, and S. A. Ahmad, “Walking 
gait event detection based on electromyography signals using artificial neural 
network,” Biomed. Signal Process. Control, vol. 47, pp. 334–343, 2019. 
 
[105] S. Hotta, A. Inomata, Y. Sasamoto, S. Washizawa, and B. Caulfield, 
“Unsupervised gait detection using biomechanical restrictions,” Proc. Annu. Int. 
Conf. IEEE Eng. Med. Biol. Soc. EMBS, pp. 4009–4013, 2017. 
 
[106] M. Yuwono, S. W. Su, Y. Guo, B. D. Moulton, and H. T. Nguyen, “Unsupervised 
nonparametric method for gait analysis using a waist-worn inertial sensor,” Appl. 
Soft Comput. J., vol. 14, no. PART A, pp. 72–80, 2014. 
 
[107] V. Agostini, G. Balestra, and M. Knaflitz, “Segmentation and classification of gait 
cycles,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 5, pp. 946–952, 2014. 
 
[108] U. Martinez-Hernandez, A. Rubio-Solis, G. Panoutsos, and A. A. Dehghani-Sanij, 
“A combined Adaptive Neuro-Fuzzy and Bayesian strategy for recognition and 
prediction of gait events using wearable sensors,” IEEE Int. Conf. Fuzzy Syst., 
2017. 
 
[109] S. Khandelwal and N. Wickström, “Gait Event Detection in Real-World 
Environment for Long-Term Applications: Incorporating Domain Knowledge Into 
Time-Frequency Analysis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 24, no. 
12, pp. 1363–1372, 2016. 
 
[110] P. Catalfamo, S. Ghoussayni, and D. Ewins, “Gait event detection on level ground 
and incline walking using a rate gyroscope,” Sensors, vol. 10, no. 6, pp. 5683–
5702, 2010. 
 
[111] J. Figueiredo, P. Félix, L. Costa, J. C. Moreno, and C. P. Santos, “Gait Event 
Detection in Controlled and Real-Life Situations: Repeated Measures from 
Healthy Subjects,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 10, pp. 
1945–1956, 2018. 
 
[112] J. C. Pérez-Ibarra, H. Williams, A. A. G. Siqueira, and H. I. Krebs, “Real-Time 
Identification of Impaired Gait Phases Using a Single Foot-Mounted Inertial 
Sensor: Review and Feasibility Study,” Proc. IEEE RAS EMBS Int. Conf. Biomed. 
Robot. Biomechatronics, vol. 2018-Augus, pp. 1157–1162, 2018. 
 
[113] S. A.M., M. C., S. S., and C. F., “Assessment of walking features from foot inertial 
sensing,” IEEE Trans. Biomed. Eng., vol. 52, no. 3, pp. 486–494, 2005. 
 
151 
  
[114] L. Yan, T. Zhen, J. L. Kong, L. M. Wang, and X. L. Zhou, “Walking gait phase 
detection based on acceleration signals using voting-weighted integrated neural 
network,” Complexity, vol. 2020, 2020. 
 
[115] Ł. Kidziński, S. Delp, and M. Schwartz, “Automatic real-time gait event detection 
in children using deep neural networks,” PLoS One, vol. 14, no. 1, pp. 1–11, 2019. 
 
[116] He Huang, T. A. Kuiken, and R. D. Lipschutz, “A Strategy for Identifying 
Locomotion Modes Using Surface Electromyography,” IEEE Trans. Biomed. 
Eng., vol. 56, no. 1, pp. 65–73, Jan. 2009. 
 
[117] L. L. Long and M. Srinivasan, “Walking, running, and resting under time, 
distance, and average speed constraints: optimality of walk-run-rest mixtures.,” J. 
R. Soc. Interface, vol. 10, no. 81, Apr. 2013. 
 
[118] P. J. Green and B. S. Tw, “Reversible Jump Markov Chain Monte Carlo 
Computation and Bayesian Model Determination,” Biometrika, vol. 82, no. 4, pp. 
711–732, 1995. 
 
[119] U. Martinez-Hernandez and A. A. Dehghani-Sanij, “Adaptive Bayesian inference 
system for recognition of walking activities and prediction of gait events using 
wearable sensors,” Neural Networks, vol. 102, pp. 107–119, 2018. 
 
[120] H. F. Maqbool, M. A. Bin Husman, M. I. Awad, A. Abouhossein, N. Iqbal, and A. 
A. Dehghani-Sanij, “A Real-Time Gait Event Detection for Lower Limb 
Prosthesis Control and Evaluation,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 
25, no. 9, pp. 1–1, 2016. 
 
[121] M. Norris, R. Anderson, and I. C. Kenny, “Method analysis of accelerometers and 
gyroscopes in running gait: A systematic review,” Proc. Inst. Mech. Eng. Part P J. 
Sport. Eng. Technol., vol. 228, no. 1, pp. 3–15, 2014. 
 
[122] M. R. Paquette, C. Napier, R. W. Willy, and T. Stellingwerff, “Moving Beyond 
Weekly ‘Distance’: Optimizing Quantification of Training Load in Runners,” J. 
Orthop. Sport. Phys. Ther., vol. 0, no. 0, pp. 1–20, Aug. 2020. 
 
[123] C. E. Milner and M. R. Paquette, “A kinematic method to detect foot contact 
during running for all foot strike patterns,” J. Biomech., vol. 48, no. 12, pp. 3502–
3505, 2015. 
 
[124] D. A. Winter, “Kinematic and kinetic patterns in human gait: Variability and 
compensating effects,” Hum. Mov. Sci., vol. 3, no. 1–2, pp. 51–76, Mar. 1984. 
 
[125] H. Prasanth et al., “Wearable sensor-based real-time gait detection: A systematic 
review,” Sensors, vol. 21, no. 8, pp. 1–28, 2021. 
 
152 
  
[126] K. G. Aubol and C. E. Milner, “Foot contact identification using a single triaxial 
accelerometer during running,” J. Biomech., vol. 105, p. 109768, 2020. 
 
[127] M. Zrenner, A. Küderle, N. Roth, U. Jensen, B. Dümler, and B. M. Eskofier, 
“Does the position of foot-mounted imu sensors influence the accuracy of spatio-
temporal parameters in endurance running?,” Sensors (Switzerland), vol. 20, no. 
19, pp. 1–21, 2020. 
 
[128] C. Strohrmann, H. Harms, C. Kappeler-Setz, and G. Troster, “Monitoring 
Kinematic Changes With Fatigue in Running Using Body-Worn Sensors,” IEEE 
Trans. Inf. Technol. Biomed., vol. 16, no. 5, pp. 983–990, Sep. 2012. 
 
[129] J. R. Rebula, L. V. Ojeda, P. G. Adamczyk, and A. D. Kuo, “Measurement of foot 
placement and its variability with inertial sensors,” Gait Posture, vol. 38, no. 4, pp. 
974–980, 2013. 
 
[130] C. A. Clermont, L. C. Benson, W. B. Edwards, B. A. Hettinga, and R. Ferber, 
“New Considerations for Wearable Technology Data: Changes in Running 
Biomechanics During a Marathon,” J. Appl. Biomech., vol. 35, no. 6, pp. 401–409, 
Dec. 2019. 
 
[131] L. C. Benson, C. A. Clermont, and R. Ferber, “New Considerations for Collecting 
Biomechanical Data Using Wearable Sensors: The Effect of Different Running 
Environments,” Front. Bioeng. Biotechnol., vol. 8, no. February, pp. 1–8, 2020. 
[132] H. Lee, “Task-dependent modulation of multi-dimensional human ankle stiffness,” 
in Powered Prostheses, Elsevier, 2020, pp. 41–56. 
 
[133] J. B. Lee, R. B. Mellifont, and B. J. Burkett, “The use of a single inertial sensor to 
identify stride, step, and stance durations of running gait,” J. Sci. Med. Sport, vol. 
13, no. 2, pp. 270–273, 2010. 
 
[134] M. Giandolini, N. Horvais, J. Rossi, G. Y. Millet, P. Samozino, and J. B. Morin, 
“Foot strike pattern differently affects the axial and transverse components of 
shock acceleration and attenuation in downhill trail running,” J. Biomech., vol. 49, 
no. 9, pp. 1765–1771, 2016. 
 
[135] B. Auvinet, E. Gloria, G. Renault, and E. Barrey, “Runner’s stride analysis: 
Comparison of kinematic and kinetic analyses under field conditions,” Sci. Sport., 
vol. 17, no. 2, pp. 92–94, 2002. 
 
[136] A. J. Wixted, D. C. Billing, and D. A. James, “Validation of trunk mounted inertial 
sensors for analysing running biomechanics under field conditions, using 
synchronously collected foot contact data,” Sport. Eng., vol. 12, no. 4, pp. 207–
212, 2010. 
 
 
153 
  
[137] P. G. Weyand, R. F. Sandell, D. N. L. Prime, and M. W. Bundle, “The biological 
limits to running speed are imposed from the ground up,” J. Appl. Physiol., vol. 
108, no. 4, pp. 950–961, 2010. 
 
[138] R. K. Fukuchi, C. A. Fukuchi, and M. Duarte, “A public dataset of running 
biomechanics and the effects of running speed on lower extremity kinematics and 
kinetics,” PeerJ, vol. 2017, no. 5, p. 3298, 2017. 
 
[139] C. A. Clermont, L. C. Benson, S. T. Osis, D. Kobsar, and R. Ferber, “Running 
patterns for male and female competitive and recreational runners based on 
accelerometer data,” J. Sports Sci., vol. 37, no. 2, pp. 204–211, 2019. 
 
[140] L. Marotta, J. H. Buurke, B. J. F. van Beijnum, and J. Reenalda, “Towards 
machine learning-based detection of running-induced fatigue in real-world 
scenarios: Evaluation of IMU sensor configurations to reduce intrusiveness,” 
Sensors, vol. 21, no. 10, 2021. 
 
[141] M. R. Ryan, C. Napier, D. Greenwood, and M. R. Paquette, “Comparison of 
different measures to monitor week-to-week changes in training load in high 
school runners,” Int. J. Sport. Sci. Coach., vol. 16, no. 2, pp. 370–379, 2021. 
 
[142] W. B. Edwards, “Modeling Overuse Injuries in Sport as a Mechanical Fatigue 
Phenomenon,” Exerc. Sport Sci. Rev., vol. 46, no. 4, pp. 224–231, 2018. 
 
[143] E. S. Matijevich, L. R. Scott, P. Volgyesi, K. H. Derry, and K. E. Zelik, 
“Combining wearable sensor signals, machine learning and biomechanics to 
estimate tibial bone force and damage during running,” Hum. Mov. Sci., vol. 74, 
no. October, p. 102690, 2020. 
 
[144] B. Vanwanseele, T. Op De Beéck, K. Schütte, and J. Davis, “Accelerometer Based 
Data Can Provide a Better Estimate of Cumulative Load During Running 
Compared to GPS Based Parameters,” Front. Sport. Act. Living, vol. 2, no. 
October, pp. 1–7, 2020. 
 
[145] K. P. Clark and P. G. Weyand, “Are running speeds maximized with simple-spring 
stance mechanics?,” J. Appl. Physiol., vol. 117, no. 6, pp. 604–615, 2014. 
 
[146] S. R. Donahue, L. Jin, and M. E. Hahn, “User Independent Estimations of Gait 
Events With Minimal Sensor Data,” IEEE J. Biomed. Heal. Informatics, vol. 25, 
no. 5, pp. 1583–1590, May 2021. 
 
[147] S. R. Donahue and M. E. Hahn, “Feature Identification With a Heuristic Algorithm 
and an Unsupervised Machine Learning Algorithm for Prior Knowledge of Gait 
Events,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, no. Ic, pp. 108–114, 
2022. 
 
154 
  
[148] P. Robberechts et al., “Predicting gait events from tibial acceleration in rearfoot 
running: A structured machine learning approach,” Gait Posture, vol. 84, no. 
October 2020, pp. 87–92, 2021. 
 
[149] M. Mundt et al., “Estimation of Gait Mechanics Based on Simulated and 
Measured IMU Data Using an Artificial Neural Network,” Front. Bioeng. 
Biotechnol., vol. 8, no. February, pp. 1–16, 2020. 
 
[150] C. R. Hollis, R. M. Koldenhoven, J. E. Resch, and J. Hertel, “Running 
biomechanics as measured by wearable sensors: effects of speed and surface,” 
Sport. Biomech., vol. 20, no. 5, pp. 521–531, 2021. 
 
[151] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., 
vol. 9, no. 8, pp. 1735–1780, Nov. 1997. 
 
[152] R. D. Gurchiek, N. Cheney, and R. S. McGinnis, “Estimating biomechanical time-
series with wearable sensors: A systematic review of machine learning 
techniques,” Sensors (Switzerland), vol. 19, no. 23, 2019. 
 
[153] M. I. M. Refai, B. J. F. Van Beijnum, J. H. Buurke, and P. H. Veltink, “Portable 
Gait Lab: Estimating 3D GRF Using a Pelvis IMU in a Foot IMU Defined Frame,” 
IEEE Trans. Neural Syst. Rehabil. Eng., vol. 28, no. 6, pp. 1308–1316, 2020. 
 
[154] N. Neverova et al., “Learning Human Identity from Motion Patterns,” IEEE 
Access, vol. 4, pp. 1810–1820, 2016. 
 
[155] B. J. Snoek, H. Larochelle, and R. P. Adams, “Practical bayesian optimization of 
machine learning algorithms,” Adv. Neural Inf. Process. Syst., vol. 25, 2012. 
 
[156] C. M. O’Connor, S. K. Thorpe, M. J. O’Malley, and C. L. Vaughan, “Automatic 
detection of gait events using kinematic data,” Gait Posture, vol. 25, no. 3, pp. 
469–474, 2007. 
 
[157] N. J. Nedergaard et al., “The feasibility of predicting ground reaction forces during 
running from a trunk accelerometry driven mass-spring-damper model,” PeerJ, 
vol. 2018, no. 12, pp. 1–17, 2018. 
 
[158] T. Keller, A. Weisberger, J. Ray, S. Hasan, R. Shiavi, and D. Spengler, 
“Relationship between vertical ground reaction force and speed during walking, 
slow jogging, and running,” Clin. Biomech., vol. 11, no. 5, pp. 253–259, Jul. 1996. 
 
[159] E. Halilaj, A. Rajagopal, M. Fiterau, J. L. Hicks, T. J. Hastie, and S. L. Delp, 
“Machine learning in human movement biomechanics: Best practices, common 
pitfalls, and new opportunities,” J. Biomech., vol. 81, pp. 1–11, 2018. 
 
 
155 
  
[160] D. Kiernan et al., “Accelerometer-based prediction of running injury in National 
Collegiate Athletic Association track athletes,” J. Biomech., vol. 73, pp. 201–209, 
2018. 
 
[161] S. P. Messier et al., “A 2-Year Prospective Cohort Study of Overuse Running 
Injuries: The Runners and Injury Longitudinal Study (TRAILS),” Am. J. Sports 
Med., vol. 46, no. 9, pp. 2211–2221, 2018. 
 
[162] Y. S. Lee, C. S. Ho, Y. Shih, S. Y. Chang, F. J. Róbert, and T. Y. Shiang, 
“Assessment of walking, running, and jumping movement features by using the 
inertial measurement unit,” Gait Posture, vol. 41, no. 4, pp. 877–881, 2015. 
 
[163] V. Guimarães, I. Sousa, and M. V. Correia, “Orientation-invariant spatio-temporal 
gait analysis using foot-worn inertial sensors,” Sensors, vol. 21, no. 11, pp. 1–19, 
2021. 
 
[164] M. Falbriard, F. Meyer, B. Mariani, G. P. Millet, and K. Aminian, “Drift-Free Foot 
Orientation Estimation in Running Using Wearable IMU,” Front. Bioeng. 
Biotechnol., vol. 8, no. February, pp. 1–11, 2020. 
 
[165] A. Mannini and A. M. Sabatini, “Gait phase detection and discrimination between 
walking-jogging activities using hidden Markov models applied to foot motion 
data from a gyroscope,” Gait Posture, vol. 36, no. 4, pp. 657–661, 2012. 
 
[166] D. S. Komaris et al., “Predicting Three-Dimensional Ground Reaction Forces in 
Running by Using Artificial Neural Networks and Lower Body Kinematics,” IEEE 
Access, vol. 7, pp. 156779–156786, 2019. 
 
[167] B. M. Nigg, R. W. De Boer, and V. Fisher, “A kinematic comparison of 
overground and treadmill running.,” Med. Sci. Sports Exerc., vol. 27, no. 1, pp. 
98–105, Jan. 1995. 
 
[168] P. O. Riley, J. Dicharry, J. Franz, U. Della Croce, R. P. Wilder, and D. C. 
Kerrigan, “A kinematics and kinetic comparison of overground and treadmill 
running,” Med. Sci. Sports Exerc., vol. 40, no. 6, pp. 1093–1100, 2008. 
 
[169] A. Psarras, D. Mertyri, and P. Tsaklis, “Biomechanical Analysis of Ankle during 
the Stance Phase of Gait on Various Surfaces: A Literature Review,” Hum. Mov., 
vol. 17, no. 3, pp. 140–147, 2016. 
 
[170] B. C. Heiderscheit, E. S. Chumanov, M. P. Michalski, C. M. Wille, and M. B. 
Ryan, “Effects of step rate manipulation on joint mechanics during running,” Med. 
Sci. Sports Exerc., vol. 43, no. 2, pp. 296–302, 2011. 
 
 
 
156 
  
[171] M. Grimmer, K. Schmidt, J. E. Duarte, L. Neuner, G. Koginov, and R. Riener, 
“Stance and swing detection based on the angular velocity of lower limb segments 
during walking,” Front. Neurorobot., vol. 13, no. July, pp. 1–15, 2019. 
tw 
 
[172] C. Fadillioglu, B. J. Stetter, S. Ringhof, F. C. Krafft, S. Sell, and T. Stein, 
“Automated gait event detection for a variety of locomotion tasks using a novel 
gyroscope-based algorithm,” Gait Posture, vol. 81, no. November 2019, pp. 102–
108, 2020. 
 
157