MACHINE LEARNING IN LARGE AND SMALL EARTHQUAKES: FROM RAPID LARGE EARTHQUAKE CHARACTERIZATION TO SLOW FAULT ZONE PROCESSES by JIUN-TING LIN A DISSERTATION Presented to the Department of Earth Sciences and the Division of Graduate Studies of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy December 2022 DISSERTATION APPROVAL PAGE Student: Jiun-Ting Lin Title: Machine Learning in Large and Small Earthquakes: from Rapid Large Earthquake Characterization to Slow Fault Zone Processes This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Department of Earth Sciences by: Amanda M. Thomas Chairperson and Advisor Diego Melgar Core Member and Advisor Valerie J. Sahakian Core Member Jake Searcy Core Member Thien Nguyen Institutional Representative and Krista Chronister Vice Provost for Graduate Studies Original approval signatures are on file with the University of Oregon Division of Graduate Studies. Degree awarded December 2022 ii © 2022 Jiun-Ting Lin This work is licensed under a Creative Commons Attribution License ii i DISSERTATION ABSTRACT Jiun-Ting Lin Doctor of Philosophy Department of Earth Sciences December 2022 Title: Machine Learning in Large and Small Earthquakes: from Rapid Large Earthquake Characterization to Slow Fault Zone Processes This dissertation summarizes the work of integrating machine-learning and traditional seismic analysis techniques into large and small earthquake problems. Earthquake early warning for large magnitude earthquakes is one of the most challenging problems in seismology. Here I develop an algorithm, called M-LARGE, that harnesses machine-learning, rupture simulations, and GNSS data to rapidly predict magnitude without saturation issue with an accuracy of 99%, outperforming other similar methods. I then show how M-LARGE can predict finite fault parameters and their evolution when rupture unfolds for fast and accurate ground motion forecasting. This dissertation will demonstrate how machine-learning can be used as a data mining tool to detect small magnitude seismicity buried in noisy waveforms. I will show its application to detect LFEs, a special class of small earthquakes typically occur down- dip of the seismogenic zone. The model detects more than five times the number of events than the original catalog in Vancouver Island and can apply to unseen stations, which provides a more flexible way to refine the temporal resolution of subduction zone processes. Finally, I will show how do small and slow earthquakes link to large and fast events and their implication on earthquake hazard assessment. With jointly inverted GNSS, strong iv motion, and tsunami data of the 2018 M7.1 Hawaii earthquake, I find that fast slip ruptures into the area previously hosts slow slip. The result is further validated by rupture simulations, where we find that the effective stress can be a factor that exerts a dominant control on the rupture extent. This reinforces the idea that an individual section of fault can host a variety of distinct slip behaviors, and slow slip should be considered as rupture extent for a more accurate hazard assessment. This dissertation includes previously published and unpublished co-authored material. v CURRICULUM VITAE NAME OF AUTHOR: Jiun-Ting Lin GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED: University of Oregon, Eugene National Central University, Taiwan National Central University, Taiwan DEGREES AWARDED: Doctor of Philosophy, Earth Sciences, 2022, University of Oregon Master of Science, Geophysics, 2014, National Central University Bachelor of Science, Earth Sciences, 2012, National Central University AREAS OF SPECIAL INTEREST: Seismology Machine Learning Data Science Inverse Problem PROFESSIONAL EXPERIENCE: Graduate Employee, University of Oregon, 2017–2022 Intern, Lawrence Livermore National Laboratory, 2022–2022 NASA Earth and Space Science Fellow, NASA, 2018–2021 Research Assistant, Academia Sinica, 2016–2017 Graduate Research Assistant, National Central University, 2012–2014 GRANTS, AWARDS, AND HONORS: Geo Emritus Scholarship, University of Oregon, 2021 NASA Earth and Space Science Fellowship, NASA, 2018, 2019, 2020 v i Outstanding Student Presentation Award, AGU Fall Meeting, 2020 Research Recognition Award, University of Oregon, 2019 Summer Scholarship, University of Oregon, 2018 Conference Travel Grant, University of Oregon, 2018 Dr. Juan V. C. Best Paper Award, Geological Society in Taipei, 2017 Outstanding Research Award, National Central University, 2013 Excellence Special Topic Award, National Central University, 2011 PUBLICATIONS: Zhang, H., Melgar, D., Sahakian, V., Searcy, J., & Lin, J. T. (2022). Learning source, path and site effects: CNN-based on-site intensity prediction for earthquake early warning. Geophysical Journal International, 231(3), 2186-2204. Lin, J. T., Melgar, D., Thomas, A. M., & Searcy, J. (2021). Early warning for great earthquakes from characterization of crustal deformation patterns with deep learning. Journal of Geophysical Research: Solid Earth, 126(10), e2021JB022703. Yu, W., Song, T. R., Su, J., & Lin, J. T. (2021). Rayleigh‐Love Discrepancy Highlights Temporal Changes in Near‐Surface Radial Anisotropy After the 2004 Great Sumatra Earthquake. Journal of Geophysical Research: Solid Earth, 126(12), e2021JB022896. Lin, J. T., Aslam, K. S., Thomas, A. M., & Melgar, D. (2020). Overlapping regions of coseismic and transient slow slip on the Hawaiian décollement. Earth and Planetary Science Letters, 544, 116353. Yu, W., Lin, J. T., Su, J., Song, T. R., & Kang, C. C. (2020). S coda and Rayleigh waves from a decade of repeating earthquakes reveal discordant temporal velocity changes since the 2004 Sumatra earthquake. Journal of Geophysical Research: Solid Earth, 12 Lin, J. T., Chang, W. L., Melgar, D., Thomas, A., & Chiu, C. Y. (2019). Quick determination of earthquake source parameters from GPS measurements: a study of suitability for Taiwan. Geophysical Journal International, 219(2), 1148-1162. vi i ACKNOWLEDGMENTS This work was possible because of the help and support of many people. First, I wish to express sincere appreciation to my advisors Amanda Thomas and Diego Melgar for their guidance, feedback, and their welcoming attitude to my questions and problems. Second, I would like to thank my committee and coauthors Valerie Sahakian, Jake Searcy, and Thien Nguyen for their valuable input, insightful discussion, and editorial support for this research. Third, I am thankful to my colleagues and friends I met during graduate study, specifically, Tyler Newton, Alex Babb, Geena Littel, Emily Sexton, who helped me to overcome my first year’s culture shock. Additionally, I would like to thank the colleagues and friends of both Melgar and Thomas Lab, other graduate students and ES staff. Finally, thanks to my family for their unconditional love, who shared these good and hard times with me, and their support. This work can be done, and I am here because of you. This work was mainly supported by the NASA Earth and Space Science Fellowship 80NSSC18K1420 and National Science Foundation grants 1663834 and 1835661. The work is also partially funded by NASA grants 80NSSC19K0360 and 80NSSC19K1104 and the scholarships from the Department of Earth Sciences at the University of Oregon. vi ii To my family. ix TABLE OF CONTENTS Chapter Page I. INTRODUCTION .................................................................................................... 1 1.1 Large Magnitude Earthquakes ......................................................................... 2 1.1.1 Science of EEW ...................................................................................... 2 1.1.2 From Science to Practice: Next Generation of EEW System ................. 4 1.2 Small Magnitude Earthquakes ......................................................................... 4 1.2.1 Low-Frequency Earthquakes .................................................................. 5 1.2.2 Challenges in Detecting Small Magnitude Seismicity ............................ 7 1.2.3 Machine Learning as a Data Mining Tool .............................................. 8 1.3 From Small and Slow to Large and Fast .......................................................... 8 1.3.1 Slow Slip and Fast Slip ........................................................................... 9 1.3.2 Evidence from the Nature Laboratory .................................................... 9 II. EARLY WARNING FOR GREAT EARTHQUAKES FROM CHARACTERIZATION OF CRUSTAL DEFORMATION PATTERNS WITH DEEP LEARNING .................................................................................................. 11 2.1 Introduction ...................................................................................................... 11 2.2 Data and Methods ............................................................................................ 16 2.2.1 Rupture Simulation and Synthetic Waveforms ....................................... 16 2.2.2 M-LARGE: Model Architecture ............................................................. 23 2.2.3 M-LARGE: Input Features, Output Labeling, and Model Training ....... 24 2.3 Results .............................................................................................................. 26 2.3.1 Model Performance and Comparison to Existing Methods .................... 26 2.3.2 Model Performance on Real Earthquakes ............................................... 28 x 2.4 Discussion ........................................................................................................ 30 2.4.1 Timing of Final Magnitude Estimation ................................................... 30 2.4.2 Model Uncertainty .................................................................................. 31 2.4.3 Model Performance on Imperfect Data or Complex Rupture ................. 33 2.4.4 Limitations and Future Work .................................................................. 36 2.5 Conclusion ....................................................................................................... 38 III. REAL-TIME FAULT AND GROUND MOTION PREDICTION FOR LARGE EARTHQUAKES WITH HR-GNSS AND MACHINE LEARNING .................... 39 3.1 Introduction ...................................................................................................... 39 3.1.1 Current EEW systems ............................................................................. 39 3.1.2 Limitation of the Existing EEW systems ................................................ 40 3.1.3 New M-LARGE Model .......................................................................... 42 3.2 Data .................................................................................................................. 44 3.2.1 Kinematic Rupture Simulation ............................................................... 44 3.2.2 Synthetic HR-GNSS Waveforms ............................................................ 45 3.2.3 Synthetic High-Frequency Waveforms ................................................... 45 3.3 Model ............................................................................................................... 47 3.3.1 New M-LARGE Architecture ................................................................. 47 3.3.2 Model Training and Testing .................................................................... 47 3.3.3 Finite Fault and GM Prediction .............................................................. 48 3.4 Results .............................................................................................................. 50 3.4.1 M-LARGE Performance on Testing Data .............................................. 50 3.4.2 M-LARGE Performance on Real Data ................................................... 52 3.5 Discussion ........................................................................................................ 54 x i 3.5.1 Error Estimation ...................................................................................... 54 3.5.2 Timeliness of Alerts ................................................................................ 57 3.6 Conclusion ....................................................................................................... 60 IV. DETECTING LOW-FREQUENCY EARTHQUAKES IN NOISY TIME SERIES DATA WITH MACHINE LEARNING: CASE STUDY IN SOUTHERN VANCOUVER ISLAND ......................................................................................... 62 4.1 Introduction ...................................................................................................... 62 4.2 Methods............................................................................................................ 67 4.2.1 Training Data for Phase Picks ................................................................. 67 4.2.2 Convolutional Neural Network Architecture and Training .................... 69 4.3 Results .............................................................................................................. 70 4.3.1 Assessing Model Performance ................................................................ 70 4.3.2 Application to Continuous Seismic Data ................................................ 74 4.3.3 P and S-wave Arrival Time Estimates .................................................... 78 4.4 Discussion ........................................................................................................ 80 4.4.1 Flexibility and Efficiency ....................................................................... 80 4.4.2 Implication and Future Work .................................................................. 81 4.5 Conclusion ....................................................................................................... 81 V. OVERLAPPING OF COSEISMIC AND TRANSIENT SLOW SLIP .................. 83 5.1 Introduction ...................................................................................................... 83 5.1.1 Fast and Slow Slip .................................................................................. 83 5.1.2 Tectonic Setting of Hawaii ..................................................................... 86 5.1.3 Slow Slip in Hawaii ................................................................................ 87 5.2 Data .................................................................................................................. 88 xi i 5.2.1 The Mw7.1 2018 Hawaii Earthquake ..................................................... 88 5.3 Joint Inversion Model ...................................................................................... 92 5.4 Overlap of Slow and Fast Slip ......................................................................... 96 5.4.1 Evidence from Inversion Model ............................................................. 96 5.4.2 Evidence from Rupture Modeling .......................................................... 99 5.5 Discussion ........................................................................................................ 107 5.6 Conclusion ....................................................................................................... 109 VI. CONCLUSION AND FUTURE STEPS ............................................................... 111 REFERENCES CITED ................................................................................................ 114 xi ii LIST OF FIGURES Figure Page 2.1 Map view of the Chilean subduction zone, example rupture scenario, and resulting HR-GNSS waveforms ............................................................................. 15 2.2 Source parameters of the 36800 rupture scenarios ................................................ 21 2.3 Comparison between synthetic data and PGD-Mw scaling ................................... 22 2.4 M-LARGE model architecture .............................................................................. 24 2.5 Model performance on testing data set and on real events .................................... 27 2.6 M-LARGE performance on real Chilean earthquakes compared to GFAST ........ 29 2.7 Warning time ratios and STF analysis ................................................................... 30 2.8 Model performance on the testing data set ............................................................ 32 2.9 M-LARGE prediction tests with two different station distributions ..................... 33 2.10 Example plot for the time to corrected prediction (𝜏!), centroid time (𝜏!"#$), and peak time (𝜏%) for four different cases. .................................................................. 35 3.1 Map of the Chilean Subduction Zone, rupture scenario, synthetic HR-GNSS waveforms, and model predictions ........................................................................ 43 3.2 Source parameters of the 346400 rupture scenarios in this study .......................... 44 3.3 Example of 3-components synthetic high-frequency waveforms and comparison of synthetic PGA with GMPE................................................................................ 46 3.4 Example shake map based on the actual fault, simplified fault, and the M-LARGE predicted fault ........................................................................................................ 49 3.5 M-LARGE predictions on the Mw, centroid longitude, centroid latitude, length, and width with 60 and 300 second of data ............................................................. 51 3.6 M-LARGE performance on the 2010 Mw8.8 Maule earthquake real data ........... 53 3.7 Final MMI misfits at the 525 virtual stations for the simplified and M-LARGE predicted fault in different Mw groups .................................................................. 55 xi v 3.8 M-LARGE performance on perfect data using 510 s long time series .................. 56 3.9 M-LARGE predicted GM on real earthquakes ...................................................... 58 3.10 Warning time of M-LARGE model testing on synthetic data ............................. 60 4.1 Map view of the study area .................................................................................... 66 4.2 Example of LFE and noise data at station TWKB ................................................. 68 4.3 Performance analysis for our selected model (size=2) .......................................... 73 4.4 Example of S wave detections of a known LFE .................................................... 75 4.5 Example of S wave detections of an unknown LFE which is not in the original catalog .................................................................................................................... 75 4.6 Model performance on multiple stations ............................................................... 76 4.7 Model performance on unused stations ................................................................. 77 4.8 Distribution of arrival time misfits in different groups .......................................... 79 5.1 Map of southeastern Hawai'i island ....................................................................... 85 5.2 Data and fitting from the joint inversion model in Figure 5.1 ............................... 91 5.3 Tsunami waveforms and empirical tsunami correction ......................................... 92 5.4 Snapshots for the kinematic rupture model in Figure 5.1 ...................................... 95 5.5 Overlapping areas of coseismic slip and SSEs from our inversion tests ............... 98 5.6 Model setup and time evolution of slip rate within the slow slip zone (SSZ) and the co-seismic zone (CSZ) ..................................................................................... 102 5.7 Different CSE rupture scenarios observed in this study ........................................ 105 5.8 The percentage penetration of CSE into the SSZ for different σcsz and stress ratios (σssz/σcsz) ................................................................................................................. 107 xv LIST OF TABLES Table Page 5.1 Velocity model used in this study .......................................................................... 96 5.2 Constraints for the grid-searched inversions .......................................................... 99 5.3 List of parameter values used ................................................................................ 103 xv i CHAPTER I INTRODUCTION Large and small earthquakes are two challenging problems in seismology. In this dissertation, I will show what are the current limitations and how we integrate machine- learning and traditional seismic analysis techniques into these difficult tasks. The dissertation is composed of three main topics, with the summary work from two published papers presenting in Chapter II and V, and two soon to be published manuscripts in Chapter III and IV. In Chapter II and III, I will first introduce the challenges of earthquake early warning (EEW) systems for large magnitude earthquakes, the science behind EEW, and our solution that harnesses machine-learning, rupture simulations, and Global Navigation Satellite System (GNSS) data for rapid large earthquake characterization and ground motion forecasting. Second, in Chapter IV, I will show how machine-learning can be used as a data mining tool to detect low-frequency earthquakes (LFEs), a special class of small magnitude seismicity typically occur down-dip of the seismogenic zone, the challenges of identifying small magnitude seismicity buried in noisy waveforms, and how can the model help to refine the temporal resolution of subduction zone processes. For the third topic, I will show how do small and slow earthquakes link to large and fast events and their implication on earthquake hazard assessment in Chapter V. Finally, I will show the conclusion and the 1 future work in Chapter VI. 1.1 Large Magnitude Earthquakes Large magnitude earthquakes are one of the most catastrophic natural hazards threatening human safety and global economics. To mitigate the impact, EEW systems aim to provide warning to the public and automatic systems before strong shaking onsets (Given et al., 2014). However, despite more than thirty years since their first deployment, problems in magnitude underestimation and limitations in near-field seismic data exist, making current EEW systems challenging in providing accurate and timely warnings for large Mw7.5+ earthquakes. For example, the 2004 Mw9.2 Sumatra earthquake was initially classified as a Mw8.0 earthquake until few days later, when observations of Earth’s free oscillations revised the magnitude to Mw9.2 (Park et al., 2005; Stein & Okal, 2005). Additionally, the 2011 Mw9.0 Tohoku-Oki earthquake, a catastrophic tsumanigenic earthquake was first estimated to be a Mw8.1, which resulted in underestimating of both the shaking and tsunami intensity (Colombelli et al., 2013; Hoshiba et al., 2014). 1.1.1 Science of EEW Magnitude underestimation, or saturation (Geller, 1976) is a result of two factors. The first is observational; it is produced by limitations of seismic monitoring equipment. Seismometers are inertial systems and, in the near-field, strong ground shaking, produces rotations and tilts that accompany the usual translational motions, leading to poor characterization of long period and large ground motions necessary for proper magnitude assessment. The second reason is physical. Large earthquakes take a finite amount of time, 2 usually tens of seconds to minutes, to release their moment. During the process, it is a matter of debate that when does the rupture differentiate between small and large earthquakes and be distinguishable from observed data (Mori & Kanamori, 1996; Rydelek & Horiuchi, 2006; Meier et al., 2016, 2017; Melgar & Hayes, 2017). Whether earthquakes are deterministic is not only a scientific question but also the heart of the EEW algorithms. The determinism of earthquake rupture is postulated early in Olson & Allen, (2005). In this model, the first few seconds of rupture contain information on the final fate of an earthquake. If so, an EEW algorithm should focus on the first few seconds of the waveform and predict the final rupture characteristics. The opposite view, that earthquake is non-deterministic (Rydelek & Horiuchi, 2006), suggests that the initial rupture shows no differences between small and large earthquakes, and one must simply wait for the entire source process to conclude before any assessment is possible. Although this non-determinism is recently revisited by Meier et al., (2016, 2017), most arguments are based on the band-limited seismic-based observations and limited dataset. Another study shows a compromise model of weak-determinism (Melgar & Hayes, 2017). In this view, large earthquakes and very large earthquakes are not distinguishable at their nucleation stage from global navigation satellite system (GNSS) observations; however, as the ruptures evolving, the waveforms have observably different behavior for different magnitude events within the first tens of seconds and well before the rupture has finished. Even though this view is again challenged by a weaker form of determinism that different magnitude earthquakes have similar growth rate and cannot be separated earlier than half or one-third of the duration (Meier et al., 2021), weak-determinism seems to be a consensus that an EEW algorithm should rely on. 3 1.1.2 From Science to Practice: Next Generation of EEW System Although the earthquake rupture determinism remains debated, an algorithm that seeks to progressively update the source prediction when large rupture unfolds has become a more feasible solution for practical purpose. For such goal, an EEW algorithm must meet the following criteria: 1) the model is flexible that it does not rely on an a priori weak- deterministic parameter, which is still unknown, to predict the final source characteristic; 2) the model can predict unsaturated result for large magnitude events; and 3) the model has to rely on the fastest observations i.e. near-field data, while still capable of recovering actual motions from large earthquakes. Taking these into consideration, in Chapter II and Chapter III, I will show our work with applying machine learning, rupture simulations, and unsaturated GNSS data to rapidly and accurately determine large earthquakes magnitude and other source parameters that can be further utilized for earthquake hazard forecasting. 1.2 Small Magnitude Earthquakes Small magnitude earthquakes are another challenge in seismology. As opposed to large earthquakes that radiate significant energy in the form of seismic waves and can be recorded by distant instruments, small earthquakes release less energy and produce weaker seismic waves that are, sometimes, difficult to detect. Understanding small magnitude earthquakes is important because all large earthquakes start from a small one, and they have shorter recurrence intervals (from days to months) than large earthquakes (from tens of years to hundreds of years), providing more feasible ways to understand earthquake 4 hazards given the short instrument recording history. However, identifying small earthquakes usually requires manually checking and picking from human experts, which involves tremendous work hours and effort. Furthermore, because of the low signal-to- noise ratio, only a few of them can be identified manually from the raw waveforms. Template matching, or cross-correlation technique, a method that calculates the similarity between two waveforms, has become an alternative tool and has successfully been applied to detect small magnitude events (e.g., Gibbons et al., 2006), and earthquake swarms (e.g., Shelly et al., 2007). This technique is based on the idea that earthquakes repeatedly break the same faults at roughly the same location and generates repeating waveforms that can be identified by multiple stations. Here in this dissertation, I will focus on the LFEs, which are a special class of small seismicity, that usually occur down-dip of the seismogenic zone (Obara, 2002; Katsumata & Kamaya, 2003). In the next section, I will introduce the background of LFEs, and our machine-learning method to identify them. 1.2.1 Low-Frequency Earthquakes LFEs are a special class of small magnitude earthquake (i.e., usually = +!,-"#. )* (2.3) +$,-"#. 𝐺 //<𝑟)*> = 𝑟)*𝐾/<𝑟)*> (2.4) where 𝐾/ is the modified Bessel function of the second kind and H is the Hurst exponent. We set 𝐻 = 0.4 based on a recent analysis of large earthquakes between 1990 and 2019 (Melgar & Hayes, 2019), which is slightly lower than the value of H=0.7 proposed when stochastic slip models were first employed (Graves & Pitarka, 2010; Mai & Beroza, 2002). 𝑟)*is the inter-subfault distance given by 𝑟 1 1)* = C(𝑟0/𝑎0) + (𝑟2/𝑎2) (2.5) where 𝑟0 and 𝑟2 are the along-strike, and along-dip distance, respectively. The along-strike and along-dip correlation lengths, 𝑎0 and 𝑎2, control the predominant asperity size in the resulting slip pattern (Mai & Beroza, 2002) and scale with indirectly with magnitude as a function of the fault length and width. 18 𝑎0 = 2.0 + 3 𝐿 (2.6) 4 𝑎2 = 1.0 + 3𝑊 (2.7) 4 Once all the parameters of the correlation matrix are defined the covariance matrix is obtained by 𝐶F56 = 𝜎)𝐶)*𝜎* (2.8) where 𝜎 is the standard deviation of slip which is usually defined as a fraction of mean slip. Here we set 𝜎=0.9 (LeVeque et al., 2016). Now we can obtain a randomly generated slip pattern with the statistics as defined above by summing the eigenvectors of the covariance matrix according to the K-L expansion (LeVeque et al., 2016) by 𝑠 = 𝜇 + ∑8793 𝑧7C𝜆7𝑣7 (2.9) where 𝑠 is a column vector containing the values of slip at each of the subfaults for a particular realization; 𝜇 is the expected mean slip pattern. We set it to be a vector with enough homogenous slip over the selected subfaults to match the target magnitude. 𝑁 is the maximum number of summed eigenvectors. We use a reasonably large number of 100 which should give enough variation of slip complexity (Melgar et al., 2016; LeVeque et al., 2016). 𝑧7is a scalar randomly selected from a presumed gaussian distribution with zero mean and standard deviation of 1. 𝜆7 and 𝑣7 denotes the eigenvalue and eigenvector of the covariance matrix. 19 With the stochastic slip pattern in hand, the second step is to define the rupture kinematics. Here we follow common best practices and a full treatment of this can be found in Graves & Pitarka (2010, 2015). We set the rupture speed to 0.8 of the local shear wave velocity at the subfault depth plus some stochastic perturbation to destroy perfectly circular rupture fronts. The hypocenter is randomly selected from the subfaults that are involved in the rupture to ensure both unilateral and bilateral ruptures. Rise times are defined to be proportional to the square root of local slip (Mena et al., 2010) but over the entire fault model must on average obey known rise-time magnitude scaling laws (Melgar & Hayes, 2017). We then use the Dreger slip rate function to describe the time-evolution at a particular subfault (Mena et al., 2010; Melgar et al., 2016). It is well-known that the shallow megathrust has slow rupture speeds and long rise times, so for subfaults shallower than 10km rupture speeds are set to 0.6 of shear wave speed and rise times are doubled from what is predicted by the scaling by the square root of slip. Below 15km the previously described rules are used, and between 10 and 15km depth a linear transition between the two behaviors is employed. This is similar to what is done for continental strike-slip faults (Graves & Pitarka, 2010). Similarly, the rake vector is set to 90 degrees plus some stochastic perturbations. Once the slip pattern and its complete time evolution are known, synthetic GNSS waveforms can be generated by summing all the synthetic data from participating subfaults. We use the FK package, which is a 1D frequency-wavenumber approach (Zhu & Rivera, 2002) and the LITHO1.0 velocity structure (Pasyanos et al., 2014) to generate the Green’s functions from all subfaults to given stations. We focus only on the long period 20 displacement waveforms (<0.5 Hz or 1 second sampling) since they are less sensitive to small scale crustal structure and are the dominant period of large earthquakes. Finally, to make the synthetic data more realistic, we introduce noise into the displacement waveform characteristics using a known real-time GNSS noise model (Melgar et al., 2020), which was computed from the analysis of one-year-long real HR- GNSS observations spanning a large region. For each waveform, we randomly select the percentile noise model and add it to the synthetic displacement waveforms. The addition of noise guarantees that the resulting time-domain waveform varies with each realization. In this way, we guarantee a large variability of noise and quality in the stations as is routinely seen in true real-time operations. Figure 2.2. Source parameters of the 36800 rupture scenarios. 21 We validate the synthetic data by comparing the simulated peak ground displacement (PGD) against the existing PGD-Mw scaling (Melgar et al., 2015; Ruhl, Melgar, Chung, et al., 2019) (Figure 2.3). PGD is defined by 𝑃𝐺𝐷(𝑡) = 𝑚𝑎𝑥 (2.10) where E(t), N(t), Z(t) represents the East, North and vertical component of the GNSS displacement time series starting from the earthquake origin (i.e. t=0), respectively. Figure 2.3. Comparison between synthetic data and PGD-Mw scaling. (a) Scaling from Melgar et al. (2015) and (b) Scaling from Ruhl, Melgar, Chung, et al. (2019). (a) and (b) from left to right shows the misfit of synthetic PGD and PGD-Mw scaling and its contour; standard deviation of the misfit; and distribution of waveforms in count. 22 Figure 2.3 suggests that the synthetic PGD pattern matches the scaling based on real observations at hypocentral distance ~100 km and Mw from Mw7.7 to Mw8.7. We note that misfit between modeled and expected values of PGD increases at Mw greater than Mw9.0 or hypocentral distance smaller than 10 km. This is because the PGD regressions are constructed from databases of real events; large earthquakes (i.e., Mw9.0+) and very close observations are comparatively rare in those databases (Melgar et al., 2016; Ruhl, Melgar, Chung, et al., 2019). The large misfit is also due to the limitation of point source assumption in PGD-Mw scaling laws where finiteness of large events needs to be considered. 2.2.2 M-LARGE: Model Architecture For time-dependent earthquake magnitude prediction, we employ a deep learning model, called Machine Learning Assessed Rapid Geodetic Earthquake model (M-LARGE) by linking the input time series recorded at each GNSS station to the time-dependent Mw for each rupture derived from the integration of the source time function (STF). Our model is composed of seven fully connected layers and a unidirectional long-short term memory (LSTM) recurrent layer (Hochreiter & Schmidhuber, 1997), which iteratively predicts Mw using the current and previous HR-GNSS observations across the network (Figure 2.4). We adopted this model architecture because LSTMs are ideal for processing sequential data, which allows M-LARGE to update magnitude predictions as the rupture progresses. Additionally, it does not require a priori source information (such as the hypocenter) typically required by other rapid modeling methods (e.g., Crowell et al., 2018). Note that the dense layers only connect the feature values at the same time channel rather than all the 23 features, which would include future times as well. Dropouts are applied to prevent overfitting during the training process (Srivastava et al., 2014). We use a Leaky ReLU function with a slope of 0.1 at negative values (Maas et al., 2013), which is an adaptation of the regular ReLU (i.e., slope of 0 at negative values) (Glorot et al., 2011) for the activation of dense layers. Finally, the last layer is connected to an ReLU function to output a current magnitude prediction, and the goal is to minimize the mean square error (MSE) contributed from the magnitude misfits at every epoch (Figure 2.4). Figure 2.4. M-LARGE model architecture. The figure shows the input as the time- dependent PGD values from the GNSS stations plus the station on or off (existence) codes. Blue rectangles mark the input PGD time series (i.e., 100 s) from all the available stations with their existence codes and the participating layers. 2.2.3 M-LARGE: Input Features, Output Labeling, and Model Training To train the model, we use the 36,800 synthetic ruptures described in above section and split them into training (70% or 25,760 ruptures), validation (20% or 7,360 24 ruptures), and testing data (10% or 3,680 ruptures). Note that these are the number of rupture scenarios, the actual input data consider different station recording combinations and GNSS noise. In our case, we generate more than 6 million training data from the original training data set and 8,192 testing data from the testing data set. We use PGD time series at a total of 121 stations (Figure 2.1) as the input features. The PGD time series are first clipped at a minimum of 0.01 m and scaled logarithmically. This is done so that during this rescaling process, the zero-valued data do not diverge to negative infinity. For the model output, we use the time integration from the real STF, convert it to the moment magnitude scale, and rescale this by multiplying the value by 0.1 for computational efficiency. Both the input and output time series are decimated to 5 s sampling so that we obtain Mw updates in 5 s increments. To increase the variability of the data, we apply data augmentation by introducing realistic HR-GNSS noise as described above. We also randomly discard some GNSS records to simulate station outages and network variability yielding more than 6 million earthquake and station scenarios used for 50,000 training steps. To distinguish between existing and nonoperational stations, we add an additional “station existence” feature channel for every site. We set the value to zero to simulate a station outage and set it to 0.5 if the station is working normally. This last value is arbitrary, but it is a good practice to set it to the same scale of other features (i.e., in our feature, 0.5 is 3.16 m in the original scale, so that their values are comparable) (Figure 2.4). Finally, we save the training weights every 5 epochs and select the model which has the minimum validation loss as the final model, and test it with the simulated testing data and real earthquakes occurred in the region. 25 2.3 Results 2.3.1 Model Performance and Comparison to Existing Methods The performance of M-LARGE on the testing data set is shown in Figure 2.5. To quantify how well the model performs on testing data, we define a correct prediction as one within ±0.3 units of the target magnitude (i.e., time-dependent magnitude) and calculate the model accuracy. Within these bounds, the model performs well with a high accuracy of 95% after 60 s, and increases to 99% by 120 s. The standard deviation of the magnitude misfits are 0.15, 0.1, and 0.09 at 60, 120, and 360 s, respectively. The accuracy difference from 120 to 360 s is not significant; however, this additional time improves some underestimations of long source duration events with magnitude greater than Mw8.5 (Figure 2.5). Our main point of comparison for assessing whether M-LARGE is an improvement will be GFAST (Crowell et al., 2016), which predicts magnitude from GNSS observations, and it is also one of the most stable GNSS EEW methods currently operating in the U.S. EEW system (i.e., ShakeAlert). It uses the PGD observations from HR-GNSS time series. When a hypocenter is confirmed by a seismic method, the magnitude is calculated based on the PGD-Mw scaling relationship (Crowell et al., 2016; Melgar et al., 2015; Ruhl, Melgar, Chung, et al., 2019). To ensure the data contain PGD information and not noise, a 3 km/s travel-time filter is added into the algorithm, and the model only predicts Mw when at least 4 stations have valid information. 26 Figure 2.5. Model performance on testing data set and on real events. (a) (from left to right) snapshots of M-LARGEs performance at 60, 120, and 360 s, respectively. Gray dots show the Mw predictions compared to the time-dependent magnitude. Black dashed line represents the 1:1 line; shaded area represents the ±0.3 magnitude range. Colored markers denote the M-LARGE-predicted Mw and their final Mw for 5 real events is shown in Figure 2.1. The red dots with labels are two specific scenarios, which will be discussed in the next section. (b) Comparison of the GFAST algorithm (blue) and M-LARGE (red) predicted magnitudes at 60,120, and 360 s for different magnitude bins. Model accuracies at 60, 120, and 360 s are shown in text. The thick and thin green dashed lines show the 1:1 and ±0.3 reference for each magnitude bin. Note that for GFAST, we remove those predictions with Mw = 0 due to the four- station minimum and only show the data that have nonzero values. In this example, over 30% of the testing events have Mw=0 prediction at 60 s, these predictions require additional time to converge compared to our model. Despite this removal, we find that GFAST has a lower accuracy of 62% at 60 s, which slowly increases to 78% by 120 s and to 80% by 360 s. In comparison to M-LARGE's maximum accuracy of 99%, GFAST's 27 accuracy saturates at 86% by 215 s. The standard deviation of the magnitude predictions of GFAST are also larger, which are 0.22, 0.21, and 0.22 at 60, 120, and 360 s, respectively, about 2 times more scatter than the M-LARGE performance. To summarize, M-LARGE reaches 80% accuracy 5 times faster than GFAST and has half the scatter on average. GFAST is not the only GNSS modeling approach; there are other proposed algorithms that utilize near-field GNSS data to rapidly estimate earthquake magnitude. To further compare with M-LARGE, we also run the Global Positioning System-based centroid moment tensor (GPSCMT) method, which utilizes the near-field static offset term from the GNSS records to calculate magnitude, moment tensor, and centroid location (Lin et al., 2019; Melgar et al., 2012). Unlike the GFAST approach, GPSCMT does not require hypocenter information, instead it grid-searches every predefined centroid location and solves for the moment tensor. We take the same subfault meshes used by M-LARGE as the potential centroid locations for the GPSCMT algorithm. Our result show that the accuracies are only 41%, 28%, and 28% at 60, 120, and 360 s, respectively, less accurate and overall, much scattered than our M-LARGE model. 2.3.2 Model Performance on Real Earthquakes To further assess the performance of M-LARGE, we apply the model to five large historical events in the Chilean Subduction Zone with HR-GNSS records, which were not used for training (Figure 2.1). For the 2010 Mw8.8 Maule earthquake, the model only takes 40 s to reach the ±0.3 magnitude unit criteria. M-LARGE also successfully predicts the final magnitude of the 2014 Mw8.1 Iquique and the 2015 Mw8.3 Illapel earthquakes at 20 and 60 s, respectively. For the 2014 Mw7.7 Iquique aftershock and the 2016 Mw7.6 28 Melinka earthquakes, both the M-LARGE predictions slightly overshoot the true magnitude at 30 s, but soon correct downward to their actual magnitude ranges. Compared to the performance of GFAST, which underestimates the Mw7.7 Iquique aftershock by 0.3 magnitude units and overestimates the Melinka earthquake by 0.6 magnitude units, our model shows more robust results on those smaller magnitude events where large GNSS noise dominated the data across the whole network spanning a 3,000 km long subduction zone. We also note that the performance is bounded by the delay times prior to the P-wave arrival at the closest stations. For example, the Maule earthquake, where most of the presently operating closest stations did not exist in 2010, the first arrival time was 17 s. Considering these delay times, useful predictions are made as soon as the signals are recorded but the lowest uncertainties are anticipated after ~30 s. Figure 2.6. M-LARGE performance on real Chilean earthquakes compared to GFAST. (a) The 2010 Maule Mw8.8 earthquake. Red and blue lines show the M-LARGE and GFAST prediction, respectively. Black dashed line and gray shaded areas represent the true Mw and the ±0.3 magnitude unit range. Thin dashed lines show the peak ground displacement waveforms from the 2010 Maule earthquake (Figure 2.1). Magenta line represents the event source time function from the USGS finite fault product. Hatched dark gray area is the time period prior to the arrival of the P-wave at the closest site where no information on the rupture is available. (b–e) Same as (a) but for the 2014 Mw8.1 Iquique earthquake, the 2015 Mw8.3 Illapel earthquake, the 2014 Mw7.7 Iquique aftershock, and the 2016 Mw7.6 Melinka earthquake, respectively. 29 2.4 Discussion 2.4.1 Timing of Final Magnitude Estimation Although the timeliness of the final magnitude assessment is intimately tied to the evolution of the STFs (i.e., whether the event grows faster or slower), we find that frequently the time to correct predictions do not follow the exact STF behavior. The reason for this time variation is mainly because of the ± 0.3 tolerance. A successful prediction can occur earlier than the actual source duration and at the lower bound of the magnitude tolerance resulting in an earlier prediction. While the effect of the magnitude tolerance depends on the shape of the individual STF, which is nontrivial to our stochastic simulations; however, by simply assuming the STF as a triangular function (i.e., rise and fall-off rates are the same), we can estimate the time of Mw-0.3 being 71% of the original Mw duration time based on the scaling of Duputel et al. (2013). Figure 2.7. Warning time ratios and STF analysis. Panel (a) shows the duration (𝜏2), time to the correct prediction (𝜏!), and the ratio between these two for each magnitude bin. Texts indicate the number of samples for each bin. Panel (b) shows the STF of 36,800 rupture scenarios color coded by Mw. Thick lines denote the averaged STF of different magnitude bins. Inset shows the zoom-in view of the averaged source time functions. 30 Our model result shows an even shorter magnitude determining time, which is about 25%–50% of the source duration (Figure 2.7). This advance in time is when we consider sources that follow a nonsymmetric flat and long tail Dreger-STF, which have growth patterns that can frequently be seen in worldwide databases (Figure 2.7b) (Mena et al., 2010). Thus, our model can provide practical earlier warning while updating its magnitude as time progresses; this is only possible when the real-time STF can be accurately measured. 2.4.2 Model Uncertainty The uncertainty in magnitude prediction is important for practical EEW systems. A probabilistic output layer could potentially be an estimator of the model confidence; however, in our regression-type model, such an output layer is not straightforward. Here, we analyze the model performance on the testing data set to estimate uncertainty. Assuming that the distribution of the testing data set is complete, we calculate the model accuracy as a function of time (i.e., length of PGD data used) and its magnitude (Figure 2.8). We find that generally high accuracies occurred at the right-hand side of the estimated duration curve, suggesting a final magnitude is more likely to be determined after the source termination. On the other hand, the low accuracies at the beginning of the prediction suggest that the initial rupture signals are not good indicators of final magnitude, which is consistent with previous source studies (Goldberg et al., 2018; Ide, 2019; Meier et al., 2017; Melgar & Hayes, 2019; Rydelek & Horiuchi, 2006). We also note that for very large events (Mw9.2+), high accuracies can occur at the very early stage, prior to the source duration. This is due to a large slip influencing the beginning of the STF, and since the largest 31 possible magnitude is limited by the finite fault geometry (i.e., in our case, Mw9.6), the possibility that an Mw9.2+ event grows into a larger event is limited. Figure 2.8. Model performance on the testing data set. Panel (a) shows the prediction accuracy (i.e., number of successful prediction/total samples) as a function of peak ground displacement time and Mw, where a successful prediction is defined as when the predicted and final Mw misfit is smaller than 0.3. Dashed line shows the estimated duration from Duputel et al. (2013). Panel (b) same as (a), but define a successful prediction is when the predicted and time-dependent Mw misfit is smaller than 0.3. Note that the time-dependent Mw is the integration from the STF at the current time. 32 2.4.3 Model Performance on Imperfect Data or Complex Rupture To understand how M-LARGE performs on imperfect data, we test M-LARGE on two different recording scenarios of the same rupture from the testing data set. One scenario has poor station coverage whereas the other has excellent station coverage (i.e., red dots in Figure 2.5) (Figure 2.9). Figure 2.9. M-LARGE prediction tests with two different station distributions. (a) Rupture scenario of an Mw8.7 earthquake with the station distribution of Case1 (blue hexagon) and Case2 (red triangle). Red star denotes the hypocenter. Black dashed lines show the 50 s rupture time contours. (b) M-LARGE predictions for Case 1 (blue line), Case 2 (red line), and the actual Mw (dashed line) calculated from the STF (gray area). (c) Peak ground displacement data of Case 1 sorted by latitude, red star denotes the hypocenter latitude. Panel (d) similar to (c), but data of Case 2. 33 In the poor station coverage example (i.e., Case 1 in Figure 2.9), almost all the near- field data are missing and M-LARGE fails to estimate the true magnitude. It is not until far-field stations begin recording data that M-LARGE upgrades its moment estimate closer to the lower bound of the actual magnitude. In contrast, for the good station coverage example, abundant near-field data are used to accurately characterize the rupture process and M-LARGE predicts the actual magnitude successfully (Figure 2.9). This suggests that data sparsity in the near-field plays an important role for the accuracy and timeliness of the predictions. The clear implication is that having more stations closer to the source improves M-LARGE's performance. To understand M-LARGEs performance as a function of source complexity, we choose four different characteristic source time function shapes (i.e., symmetric, bimodal, early, and late skewed) and analyze the results (Figure 2.10). We use the time-to-peak STF (peak time 𝜏%) and centroid time (𝜏!"#$) as proxies to measure the model performance on different STF shapes. We define the time to corrected prediction (𝜏!) as the time when the model successfully predicts the final magnitude with a misfit smaller than 0.3, and finally we calculate the correlation coefficient (CC) between these metrics and find that the 𝜏! − 𝜏% has the weakest correlation with CC = 0.66 compared to 𝜏! − 𝜏!"#$ of CC = 0.9 and 𝜏! − 𝜏2:-;$)<# of CC = 0.85. This suggests that the 𝜏%, a proxy of STF's shape, does not significantly affect the timing and accuracy of the M-LARGE estimations. On the other hand, the performance relies on the actual moment release because it is trained to map the STF directly. 34 Figure 2.10. Example plot for the time to corrected prediction (𝜏!), centroid time (𝜏!"#$), and peak time (𝜏%) for four different cases. Red and black lines show the M-LARGE predictions and final magnitudes, respectively. Black dashed lines denote the ±0.3 magnitude unit range. Panel (a) shows the case of late rupturing, where the source focuses at the end of the rupture, (b) shows the case with early rupturing, where the source focuses at the beginning of the rupture, (c) nearly symmetric (triangular) source time function, and (d) shows the case of two rupture asperities. 35 2.4.4 Limitations and Future Work We have shown that M-LARGE has the ability to learn complex rupture patterns from the crustal deformation data. Also, it significantly outperforms other similar algorithms. However, we note that it still has some limitations, and these should be targets for potential improvements in the future. First, once M-LARGE is trained, the model is not global in scope, it is presently limited by the simulated earthquakes, waveforms, and network geometry for a specific region. In machine learning, whether a model can be applied to other data not seen in training, such as that from a different subduction zone, is called generalization. It is evident that the approach we have followed here is tied to the specific network geometry and subduction zone and will not immediately generalize to other tectonic settings. For M-LARGE to be useful elsewhere, it will need to be retrained to another specific geometry and perceived possible set of ruptures for that tectonic environment. M-LARGE, like any geodesy-based technique, is only limited in large magnitude events typically in the M7+ range (e.g., Crowell et al., 2013; Melgar et al., 2015). Damaging shaking during earthquakes can also occur at significantly smaller magnitudes (e.g., Minson et al., 2021). Thus, our model is not meant to replace seismic methods but rather to work in tandem with them. Saturation is a persistent concern for EEW and by combining networks, data types, and algorithms EEW systems can respond to a wider variety of events. We note that for the 2010 Mw8.8 Maule earthquake example, there is a 17 s gap without recording due to the lack of near-field stations. This performance could be sped up by ~10 s if the information delay introduced by the travel times could be reduced, that is if 36 station coverage were expanded offshore. The model performance is strongly reliant on the training data set behaving according to what is seen in world databases. As a result, an outlier event with a unique rupture may still prove challenging. More simulated events that incorporate rupture variability would improve M-LARGE's performance by making it resilient to complex rupture scenarios. The model architecture and hyperparameters are selected arbitrarily; however, the scale of hyperparameters is comparable to those in similar studies (e.g., Ross, Meier, Hauksson, et al., 2018; Zhu & Beroza, 2019). We do not find significant model improvement from tuning the hyperparameters we used, probably because the model has already reached its accuracy limit (i.e., 99%) based on the current architecture. Any further improvement will require a different model design. We find that the logarithmic scaling function of PGD features has better performance against the commonly adopted linear scaling. This is consistent with the existence of log-linear PGD and magnitude relationships (Crowell et al., 2016; Melgar et al., 2015; Ruhl, Melgar, Chung, et al., 2019) making the input and output pairs less complicated during model training. Lastly, the earthquake magnitude is not the only important factor for EEW. In fact, the source location, rupture length, width, and slip are equally important for an accurate ground motion prediction or tsunami amplitude forecasts. In this Chapter, we have successfully demonstrated that M-LARGE is capable of learning Mw directly from raw observations. In the next Chapter, we will show our improved M-LARGE model that can predict other source parameters that can be further utilized to forecast earthquake hazards. 37 2.5 Conclusion Developing frameworks to provide timely warning during the largest magnitude earthquakes remains an outstanding scientific and technological challenge. EEW systems continue to expand and have proliferated to many countries across the globe (Allen & Melgar, 2019). Despite this, how these systems will perform in rare but high consequence, large magnitude earthquakes is uncertain. Here, we have combined the knowledge of where great earthquakes will occur, their average expected rupture characteristics, state of the art sensor technology, and deep learning to rapidly characterize large magnitude earthquakes from their crustal deformation patterns. In simulated real-time testing, the resulting EEW algorithm, M-LARGE, has a significantly better performance than current GNSS algorithms and can be retrained to apply to any specific region capable of generating large events. As such, M-LARGE represents a new approach to EEW that, if made operational, can work in tandem with the current technologies and provide accurate, unsaturated Mw estimation and fast alerts that will lead to increased resilience. This is the first demonstration of our approach, future work to achieve generalization of the method is still needed and will include more training and testing data, interacting with existing EEW methods, and creating new data for different tectonic environments and GNSS network configurations. 38 CHAPTER III REAL-TIME FAULT AND GROUND MOTION PREDICTION FOR LARGE EARTHQUAKES WITH HR-GNSS AND MACHINE LEARNING From Lin, J. T., Melgar, D., Thomas, A. M., & Searcy, J. (in prep.) Real-time fault and ground motion prediction for large earthquakes with HR-GNSS and machine- learning. Journal of Geophysical Research: Solid Earth. I am the primary contributor for this manuscript writing, model building, results, and analysis. Diego Melgar contributes to the work by developing code for earthquake simulations; Amanda Thomas for providing writing assistance; Valerie Sahakian for ground motion simulations; Jake Searcy for initial model constructing. 3.1 Introduction 3.1.1 Current EEW systems The ultimate goal of EEW systems is to provide seconds to minutes of warning to public and automated systems before strong shaking occurs at their location (Given et al., 2014). In Chapter II, I have shown that our proposed machine learning model, M-LARGE, can rapidly and accurately predict earthquake moment magnitude without saturation issue 39 with an accuracy of 99%, outperforming other similar algorithms. To provide actual ground motion (GM) prediction, there are further steps to be considered. The most commonly adopted EEW system relies on the two steps approach: 1) quick estimation of point source parameters and 2) forecasting the upcoming strong ground motion with pre-built ground motion prediction equations (GMPE) (Nakamura, 1988; Allen & Kanamori, 2003; Allen, 2007). This method has been proved effective in Japan (Nakamura, 1988; Kamigaichi et al., 2009), west coast of the United States (Allen & Kanamori, 2003; Given et al., 2014), Mexico (Espinosa-Aranda et al., 1995, 2009), Taiwan (Wu & Kanamori, 2005; Hsiao et al., 2009), Turkey (Wenzel et al., 2014), and in many other seismic active regions (e.g., Böse et al., 2007; Zollo et al., 2009). 3.1.2 Limitation of the Existing EEW systems One of the challenges of the existing EEW systems is that large earthquakes have finiteness, which can contribute to source errors, resulting in underestimation of shaking and tsunami intensity (e.g., Hoshiba et al., 2011; Hoshiba & Ozaki, 2014). Later studies proposed finite fault methods to mitigate the issue (Böse et al., 2012; Hutchison et al., 2020). For example, the FinDer algorithm utilizes pattern matching technique to search for fault parameters with seismic observations and pre-built shaking patterns (Böse et al., 2012; Böse et al., 2018), or methods that forecast shaking intensity directly from the observed waveforms (e.g., Kodera et al., 2018; Cochran et al., 2019). However, although these methods are more reliable than the original oversimplified point source methods, they have either shorter warning time or limited on-site warning which requires dense network coverage (Allen & Melgar, 2019). 40 Another challenge for the EEW system is that large earthquakes have strong, and low-frequency shaking (i.e., down to permanent offset), making the recording difficult to unravel the true shaking emitted from large earthquakes (Boore & Bommer, 2005; Larson, 2009; Bock & Melgar, 2016). GNSS data, on the other hand, provides unsaturated near- field displacement data, which is ideal for building EEW systems for large events. One of the most stable and fastest methods, GFAST, utilizes the relationship of PGD and Mw to characterize the source soon after earthquake initiation (Crowell et al., 2013, 2016). Similarly, G-larmS and BEFOREs solve slip distribution on an assumed fault plane and can provide unsaturated Mw estimation (Grapenthin et al., 2014; Minson et al., 2014). The biggest limitations for these methods are that the fault dimensions and geometries are pre- assumed or derived from seismic-based EEW systems (Murray et al., 2018). Thus, any uncertainties in fault geometry can cause errors in source estimation. Recently, a number of machine learning (ML) approaches have emerged to provide faster and more accurate EEW. These include using waveforms data to estimate magnitude and other source parameters (Lomax et al., 2019; Mousavi & Beroza, 2020), or directly forecast shaking intensity from single or multiple station records in a region (e.g., Jozinović et al., 2020; Münchmeyer et al., 2021; Zhang et al., 2022). ML approaches are known to be a powerful tool for automatically extracting information from data, but their generalizability is also limited by the training data. Fundamentally, because of lacking large magnitude events for model training, the limitation of these models for large EEW can be expected (Jozinović et al., 2020; Mousavi & Beroza, 2020; Münchmeyer et al., 2021). 41 3.1.3 New M-LARGE Model In Chapter II, I have shown our M-LARGE model that solves the aforementioned limitations from the methodology, recording instrument, and lack of events. Here, we further expand the M-LARGE algorithm toward rapid finite fault parameters prediction. The new algorithm predicts Mw, centroid location, and fault size simultaneously, which can be further used for strong ground motion forecasting through existing GMPEs. We apply the new M-LARGE model to the Chilean Subduction Zone and test it with synthetic rupture data and 5 large historical Chilean earthquakes occurred in the past 12 years (Figure 3.1). With testing on simulated stochastic strong ground motion waveforms and real data, the results show that the timeliness of our M-LARGE EEW alerts is significant, outperforming other similar algorithms. 42 Figure 3.1. Map of the Chilean Subduction Zone, rupture scenario, synthetic HR-GNSS waveforms, and model predictions. (a) Synthetic Mw9.18 earthquake ID:024513 from Lin et al. (2021). Red star show the hypocenter of the synthetic earthquake. Triangles are the HR-GNSS stations used in this study. Black stars and focal mechanisms mark the 5 real large earthquakes occurred in the past 12 years (b) Synthetic 3-components HR-GNSS waveforms. Red, blue, green lines represent East-West, North-South, and vertical component, respectively. (c) Model prediction of Mw, centroid longitude, centroid latitude, fault length, and fault width as the rupture unfolds (red solid lines) and their actual values (black dashed lines). 43 3.2 Data 3.2.1 Kinematic Rupture Simulation We take the Chilean Subduction Zone rupture and waveform dataset from Lin et al. (2021), which has 36,800 rupture scenarios with magnitude ranges from Mw7.2-Mw9.4 with a small perturbation. The detailed methodology of rupture scenario is described in Melgar et al. (2016) and in Lin et al. (2021). Unlike the natural of fewer large Mw events, the magnitudes of the scenarios are uniformly distributed to guarantee enough large earthquakes exposed to the model. Additionally, to cover events that occur at the edge of the subduction zone, we generate 309,600 smaller events following the same procedures as in Lin et al (2021). Combining with the original dataset, a total of 346,400 events are used in this study (Figure 3.2). Figure 3.2. Source parameters of the 346400 rupture scenarios in this study. 44 3.2.2 Synthetic HR-GNSS Waveforms Once the kinematic ruptures have been created, the synthetic 3-components HR- GNSS waveforms can be generated by the summation of waveforms from all subfaults based on the Green’s function approach. We use the FK package (Zhu & Rivera, 2002) and the velocity structure from LITHO1.0 (Pasyanos et al., 2014) to generate the Green’s functions from all subfaults to the selected HR-GNSS stations currently operating in Chile (Báez et al., 2018). The HR-GNSS waveforms are in 1-Hz sampling rate, which is high enough to cover the dominant period of our large earthquakes. For computational efficiency, we only generate the waveform when the closest rupture is smaller than 10° to the station. To simulate realistic data, we apply the empirical HR-GNSS noise (Melgar et al., 2020) into the raw synthetic waveforms during the data augmentation step for model training. 3.2.3 Synthetic High-Frequency Waveforms The ultimate goal for EEW system is to forecast GM before its onset. Additional to the 1-Hz waveforms, we simulate high-frequency 100-Hz broadband data at 525 virtual stations distributed along the coast with an averaged spacing of 0.4° to understand the timeliness of the warning time from the expected shaking. Up to this frequency, the high- rate data cannot be generated by the aforementioned Green’s function approach because they are sensitive to unconstrained small-scale crustal structure. Instead, we apply the hybrid stochastic approach, where waveforms are controlled by modeled amplitude spectrum and random phases (Graves & Pitarka, 2010, 2015). Because these simulations are more computationally expensive, we randomly select the virtual stations with the 45 subfault-station distance smaller than 600 km to generate broadband data. We simulate 100-Hz broadband data for 10,000 ruptures and validate their peak ground acceleration values (PGA) with the existing GMPE for the Chilean Subduction Zone (Montalva et al., 2017) (Figure 3.3). Figure 3.3. Example of 3-components synthetic high-frequency waveforms and comparison of synthetic PGA with GMPE. (a) Synthetic for rupture ID:024513 (Figure 3.1). Triangles represent the selected virtual stations color coded by their PGA values calculated from synthetic timeseries data. (b) Comparison of the synthetic high-frequency data and the existing GMPE (Montalva et al., 2017) for different Mw and distance bins. 46 3.3 Model 3.3.1 New M-LARGE Architecture We expand the original M-LARGE architecture (Lin et al., 2021) to predict a total of 5 parameters: Mw, centroid longitude, centroid latitude, length, and width. We do not predict the centroid depth because depth is usually not well constrained by the observations and may contribute as a noise source to other parameters. Instead, we adopt the depth from the subduction zone geometry when the centroid longitude and latitude are determined. Unlike the original M-LARGE model where PGD timeseries are used (Lin et al., 2021), we use the raw 3-components displacement waveforms as the model inputs to solve for a more complex problem. Similarly to Lin et al. (2021), we decimate the timeseries to 5 second sampling so that the model predictions updates for every 5 second period up to a maximum of 510 second. Lastly, we scale the outputs by their maximum value to prevent one parameter dominates the other during the training. 3.3.2 Model Training and Testing We split the total of 346,400 rupture scenarios into 70%, 20%, and 10% for model training, validation, and testing, respectively. With these rupture scenarios, we create a generator that simulates real-world data with GNSS noise and station outages following the same procedure as in Lin et al. (2021). As a result, each scenario can have many numbers of unique observations. In our case, we create a total of 7.68 million realistic recordings from 242,480 ruptures (i.e. 70% of the total 346,400 ruptures) during the 2,000 training epochs, with 300 steps per epoch and a batch size of 128. 47 3.3.3 Finite Fault and GM Prediction The goal of M-LARGE is rapid GM forecasting. We take the two steps approach, similar to the existing methods (e.g., Nakamura, 1988; Allen & Kanamori, 2003; Allen, 2007). The main improvements are that our model predicts the finite fault plane based on the centroid location, which is more suitable for earthquakes with large centroid- hypocenter difference, and the fault size is directly predicted from observations. Once the 5 parameters are determined, they can be further utilized to create a finite fault plane with the strike and dip taken from subduction zone geometry based on the centroid location (Figure 3.4). We then apply the existing Chilean GMPE (Montalva et al., 2017) to predict the PGA and in Modified Mercalli Intensity (MMI) scale. We note that due to the simplification of the fault geometry curvature, the simplified fault may introduce some errors in the MMI prediction. However, as we will discuss later, the MMI difference between actual and simplified rupture is small and can be ignored. Furthermore, although misfits are presented in the GMPE (Montalva et al., 2017), we do not aim to develop a new GMPE for the most accurate GM forecasting, instead we focus more on how would the source parameters be quickly and accurately characterized, which will lead to long and accurate GM alerts when a good GMPE is available. 48 Figure 3.4. Example shake map based on the actual fault, simplified fault, and the M- LARGE predicted fault. Gray circles mark the location of 525 virtual stations used to construct the shake map (some stations are truncated). For this example, the difference in shake maps is insignificant. 49 3.4 Results 3.4.1 M-LARGE Performance on Testing Data The performance of M-LARGE on 10,000 unexposed testing dataset is shown in Figure 3.5. We define a successful prediction as the misfit of predicted and actual value smaller than 10% of the parameter range. For example, the range of Mw is set to be 0.3 Mw scale, a value from Mw of 6.5~9.5. We note that this is the definition only for evaluating the model performance such that the accuracy of the predictions can be calculated. Although the accuracy of each parameter may not be comparable (i.e. length has higher accuracy than the others due to the larger range), and the effect of imbalance distribution of parameters, it is an important metric to evaluate each parameter and their time evolution. Overall, the model is better at predicting parameters along-strike than along-dip direction with an accuracy of 96.2% and 99% for latitude and length, respectively, and 93.8% and 89.4% for longitude and width, respectively. On average, M-LARGE predicts the source parameters with a high accuracy of 95.5%. 50 Figure 3.5. M-LARGE predictions on the Mw, centroid longitude, centroid latitude, length, and width with 60 and 300 second of data. Colored dots show the predicted and true values at their current time. Magenta lines mark the boundary of accuracy estimation, defined by +/- 10% of the parameter range (i.e. maximum-minimum value of the parameter). 51 3.4.2 M-LARGE Performance on Real Data Additional to the simulated earthquakes, we test the model on the 2010 Mw8.8 Maule earthquake, one of the largest historical earthquakes in Chilean Subduction Zone. The M-LARGE performance at 10, 30, 60 and 510 s after the origin is shown in Figure 3.6. At 10 s, the “initial stage” of the rupture, none of the data are available at the time, the M- LARGE predicts the parameters based on the noise inputs (Figure 3.6a). At 30, and 60s, the “growing stage” of the rupture, a few of near-field stations have received the P or initial S-waves. At the times, M-LARGE corrects the centroid location closer to the actual answer and expanding the fault when more data and more stations are available (Figure 3.6b-3.6c). At 510 s, the “final stage”, which the maximum time window that M-LARGE is tested. Most stations have received the rupture signals and the predicted source parameters does not change significantly after 50 s (red lines in Figure 3.6d), which is similar to the result of Lin et al. (2021). 52 Figure 3.6. M-LARGE performance on the 2010 Mw8.8 Maule earthquake real data. Gray triangles mark the Chilean HR-GNSS stations that were not operating or unavailable during the earthquake. Colored triangles show the available stations color coded by their vertical displacement value. Shaded map show the PGA prediction based on predicted fault (blue rectangle) by the Chilean GMPE (Montalva et al., 2017). Black dash line and red solid line on the map shows the expected P and S arrival, respectively. (a) Initial stage of the rupture (10 s), where none of the stations have recordings at this time. (b)-(c) Growing stage of the rupture (30 s and 60 s). (d) Final stage of the rupture (510 s i.e., the maximum time for model testing). 53 3.5 Discussion 3.5.1 Error Estimation Our M-LARGE model predicts 5 source parameters as a representation of actual fault. Whether they are enough to reconstruct the finite fault as well as the GM forecasting is our main discussion. To understand the misfit introduced from such simplification, we calculate the MMI misfits between actual and simplified rupture (Figure 3.7). The misfit is obtained by 𝜀$ = 𝑌V$ − 𝑌$ (3.1) where 𝜀$ is the misfit at time 𝑡 , 𝑌V$ and 𝑌$ represents model and actual MMI value, respectively. We compare the misfits at the final time i.e., 510 s. At this time most of the ruptures should have been completed and the misfit simply represents the error of the simplification assumption. Our result shows an averaged MMI error close to zero and standard deviation of 0.12 MMI scale, which suggests that this is a reliable approach if the simplified fault plane, created from the M-LARGE predictions, can be accurately estimated. We then evaluate the M-LARGE based GM forecasting by comparing them with the actual MMI values. By Equation 3.1, we calculate the final MMI based on the M- LARGE predicted source parameters and compare them with the final MMI based on the actual slip pattern (Figure 3.7). The result shows an accurate final MMI prediction with an averaged error of -0.06 and standard deviation of 0.49 MMI scale. Considering the existing 54 0.1 MMI misfit due to the fault simplification error as described above, this is a small error, which suggests that actual GM can be accurately predicted by our M-LARGE model. The evaluation can also be discussed in the timeliness analysis, which we will compare the M- LARGE predicted MMI timeseries with the actual MMI timeseries from the high- frequency synthetic data in the later discussion. Figure 3.7. Final MMI misfits at the 525 virtual stations for the simplified and M-LARGE predicted fault in different Mw groups. We take the fault at 510 s as the final fault to generate MMI values. To prevent zero to zero comparison for distant stations, the misfit is only calculated at the stations with the true MMI value >=3 MMI scale. The averaged MMI misfits are -0.0020.116 and -0.0620.492, for simplified and M-LARGE predicted, respectively. The above result suggests that accurately predicting the 5 source parameters is crucial. To understand how we can improve the prediction accuracy, we discuss the effect of GNSS noise and data scarcity by testing the model with noise-free, full station data i.e., the perfect case (Figure 3.8). We find that with the perfect conditions, the accuracies of all 55 the parameters are close to 100% and there is no strong correlation between Mw and misfits, suggesting that the network coverage and data quality are more important than the source complexity to the M-LARGE predictions, consistent to the result of Lin et al. (2021). We note that length can be saturated as when a fault grows to 1,000+ km, this is because of the simplification of the curving subduction zone geometry, which introduces ambiguity in the along-strike and along-dip direction. Similarly, large misfit can also be seen in width due to the onshore station distribution (Figure 3.8). Figure 3.8. M-LARGE performance on perfect data using 510 s long time series. The accuracies of all the parameters are close to 100% except for the width of 92.8% due to the onshore station distribution of the GNSS network. 56 3.5.2 Timeliness of Alerts To understand the timeliness of warnings that our model could make, we compare the M-LARGE predicted GM to the real GM records collected from Ruhl, Melgar, Geng, et al. (2019). We calculate the effective warning time, defined as when the prediction is faster than the actual onset at a particular MMI level, to evaluate the performance (Figure 3.9). We find that M-LARGE can provide accurate MMI prediction and significant warning time for most of the sites. For MMI=4 as an example, it shows 100% true positive prediction of final MMI and a long warning time of ~60 s for most of the stations (Figure 3.9a). Some stations provide less or no warning time because these stations are close to the source (~100 km) located in the blind zone of the model where only a few of GNSS stations were operating during 2010 (Figure 3.6). We then calculate the warning time for all the Chilean large earthquakes (Figure 3.9b) and plot the warning time histograms for all the real data at different MMI levels (Figure 3.9). The result shows a long median warning time of 36 s for MMI=4, and 28 s for MMI=5, which is slightly better than the result of Ruhl, Melgar, Geng, et al. (2019) of 34 s and 26 s for MMI=4 and MMI=5, respectively. One big difference is that our model does not rely on the seismic triggered parameters which will potentially introduce latency and saturated information from the seismic approaches that we have introduced earlier. Thus, we anticipate that our model can have a longer and more accurate warning time compared to the similar methods. 57 Figure 3.9. M-LARGE predicted GM on real earthquakes. The five large Chilean earthquakes are shown in Figure 3.1. (a) Warning time for 2010 M8.8 Maule with MMI level of 3, 4, and 5. (b) Histograms of all the alerts from five large Chilean earthquakes with different MMI levels and cumulative distribution function (CDF). 58 We further test the model on synthetic data, which is more challenging because they can be larger (Mw9.5+) and rupture a wider area (1000+ km) than the 5 Chilean real earthquake data in the past 12 years. We test our M-LARGE model on the 10,000 ruptures from testing dataset and compare the GM between the model predictions and the real GM, assessed from the synthetic broadband data (e.g., Figure 3.3a). We find that with the rapid estimation of earthquake sources, our model yields a median warning time of 34.5 s for all the MMI levels for distance of 200 km, and this increases to 58.4 s for sites with distance of 300 km (Figure 3.10). Considering a blind zone of 100 km due to the subduction zone geometry and distribution of the GNSS stations that only a few of station has recording at the initial stage (e.g., Figure 3.6), this is a long warning time and can be utilized to forecast strong GM before their onsets. For MMI=4, our model yields a long median warning time of 40.5 s, and this changes to 25.8 s for MMI=5, similar to the real data test. Furthermore, we analysis the warning time statistic by calculating the rate of successful warning, defined as the number of effective warnings divided by the number of effective warnings and too late warnings (Figure 3.10). The result shows a high successful warning rate for most of the sites, suggesting that our model can provide long and accurate GM forecasting. 59 Figure 3.10. Warning time of M-LARGE model testing on synthetic data. Warning time is calculated by the median value of all the effective warning at a particular distance and MMI level. Successful warning is defined by the number of effective warnings divided by the number of effective and too late warning. Total number N is shown in logarithm scale. 3.6 Conclusion In this chapter, we have proposed a new M-LARGE model (Lin et al., 2021) that predicts finite fault parameters for large earthquakes from surface displacement waveforms from HR-GNSS data. We test the model with synthetic and real data in the past 12 years in the Chilean Subduction Zone and show that it successfully predicts magnitude with a high accuracy of 99% and an averaged of 95% for other source parameters. These predictions can be used to further forecast strong ground motion. Our synthetic earthquakes test suggests that a long median warning time of 34.5 s can be made at the sites with distance of 200 km, and this increases to 58.4 s for distance 300 km. Additionally, both the 60 synthetic data and real Chilean earthquake data test provide a longer warning time of 36~40.5 s at sites with MMI=4, longer than the existing G-larmS algorithm of 34 s. Most importantly, the prediction does not rely on seismic methods which will potentially introduce latency and saturated information from the seismic approaches. We conclude that the M-LARGE model is better and more suitable than the current EEW algorithms for large earthquakes. 61 CHAPTER IV DETECTING LOW-FREQUENCY EARTHQUAKES IN NOISY TIME SERIES DATA WITH MACHINE LEARNING: CASE STUDY IN SOUTHERN VANCOUVER ISLAND From Lin, J. T., Thomas, A. M., Melgar, D., & Bostock, M. G. (in prep.) Detecting low-frequency earthquakes in noisy time series data with machine learning: case study in Southern Vancouver Island. Earth Planet. Sci. Lett. The writing of this manuscript is by me and Amanda Thomas, with the direction of my analysis and results. Diego Melgar provides editorial support. Michael Bostock provides part of the data for this work. 4.1 Introduction In recent years, the geophysical community has discovered that faults globally can accommodate tectonic loading with different mechanisms that are commonly distinguished by their slip speed. Earthquake ruptures represent the fastest fault slip phenomena and slip at rates sufficient to produce high-frequency seismic waves. Whereas crystal plastic processes operating in the deep roots of faults, at elevated temperature and pressure, accommodate some of the slowest deformation that occurs at rates at or near that of tectonic 62 loading. Slip phenomena between these two endmembers, which include transient creep, slow slip, and afterslip, typically have slip velocities one or more orders of magnitude larger than the tectonic loading rate, but still far smaller than inertially limited slip velocities characteristic of seismic ruptures (Ide, Beroza, Shelly, et al., 2007; Peng & Gomberg, 2010; Bürgmann, 2018). Slow slip events are a type of transient fault slip during which the slip rate accelerates to speeds that are 1-2 orders of magnitude faster than the background tectonic loading rate (e.g., Bürgmann, 2018; Behr and Bürgmann, 2021). Slow slip events occur frequently in subduction zones around the globe (Saffer & Wallace, 2015). In the past two decades much effort has been dedicated to documenting their spatial and temporal characteristics in different tectonic environs (Obara, 2002; Rogers & Dragert, 2003; Beroza & Ide, 2011; Obara & Kato, 2016; Bürgmann, 2018, Behr & Bürgmann, 2021). Because slow slip events occur over significantly longer timescales than typical earthquakes, they generate very weak seismic waves that are both lower in amplitude than and depleted in high-frequency content (i.e., > 1 Hz) relative to typical earthquakes (e.g., Thomas et al. 2016). Obara (2002) first recognized what he dubbed non-volcanic tremor beneath the Shikoku and Kii peninsulas in Japan. He identified tremor as a low-amplitude signal with a predominant frequency content of 1-10 Hz lasting a few hours to a few days. He also recognized that tremor signals propagated with a velocity most consistent with that of S- waves and locate deep on the plate interface. Shortly thereafter nonvolcanic tremor was recognized as the seismic manifestation of deep slow slip (Rogers & Dragert, 2003). Tremor can be rapidly detected and is a useful tool for identifying and tracking SSE evolution. One of the most widely used tremor detection algorithms is that of Wech and 63 Creager (2008) which is run in real-time by the Pacific Northwest Seismic Network, identifies tremors by cross-correlating waveform envelopes and grid searching the location, which shifts the S-wave time until the summed cross-correlation functions for all the station pairs reach the maximum value (Wech, 2021). Because there are no clear P- and S-waves, the locations require a predefined grid and depth estimates are unreliable (Wech, 2021). Furthermore, detections are limited to 5 minute time windows which do not allow for analysis of shorter timescale phenomena or to resolve energy coming from multiple locations. Low-frequency earthquakes (LFEs) are more traditional seismic sources and do not suffer from the limitations above. Although the physical process responsible for their generation is still a matter of debate (Obara, 2002; Obara & Hirose, 2006; Seno & Yamasaki, 2003). It is known that LFEs are often accompany SSEs (Obara & Hirose, 2006; Ide, Shelly, & Beroza, 2007; Ito et al., 2007), and they can shed light on slow slip nucleation and evolution, and potentially earthquake hazards. For example, Yamashita et al. (2015) show that LFEs off Southern Kyushu, Japan, migrated toward shallow depth of the subduction zone where large earthquakes occurred, and may cause stress accumulating in the region that generate large and tsunamegenic earthquake in the future. Similarly, studies have shown that spatial and temporal distribution of LFEs, slow earthquakes and SSEs can link together and can be associated with large e.g., 2011 Mw9.0 Tohuku earthquake (Kato et al., 2012; Ito et al., 2013). Of all the seismic manifestations of SSEs, LFEs provide the most precise depth estimates. The location of LFEs relative to the structure of the subduction shear zone provides valuable information about which processes may be responsible for slow slip (e.g., 64 Calvert et al. 2020). Traditionally LFEs are detected by template matching approaches (Bostock et al., 2012; Chamberlain et al., 2014; Royer & Bostock, 2014; Shelly et al., 2007). This method requires an LFE waveform template which is cross-correlated through the continuous waveform data to search for similarity. When the summed cross-correlation function exceeds a threshold (e.g., eight times the median absolute deviation (Shelly et al., 2007)), the window is considered a detection. This process can be refined by stacking all the detected waveforms as a new LFE template to increase the signal-to-noise ratio. Obviously, the above approach is time consuming and computationally expensive. Most importantly, it can only detect known templates and is limited by the assumption that LFEs are repeating sources. Thomas et al. (2021) proposed a machine-learning (ML) approach that can identify LFE waveforms in noisy data from a single station. They have successfully applied this model in Parkfield, CA and shown that it identified new events that are not in the original catalog, suggesting the potential of applying such an approach. Here we take the same ML technique from Thomas et al. (2021) and apply it in Southern Vancouver Island (Figure 4.1) to detect LFEs. The detected events, without the aforementioned repeating limitations, can provide us a better idea of subduction zone processes. 65 Figure 4.1. Map view of the study area. Magenta triangles show the stations used for model training and testing; blue triangles represent the unseen stations, which are not involved during the training process, for model testing. Circles denote the LFEs locations from Royer & Bostock (2014) catalog, color coded by their depth. 66 4.2 Methods 4.2.1 Training Data for Phase Picks The first goal of this work is to develop a ML based phase picker which can distinguish LFEs from noise and make arrival time picks on records deemed to contain signal. Accomplishing this task requires obtaining training data that includes many representative examples of noise, P-, and S-waves with associated picks. We obtained manual phase picks from events that were originally identified by a combination of autocorrelation and template matching (Royer & Bostock, 2014). For each arrival time pick in the catalog, we download a 30 second window of data centered on the pick time using Obspy (Krischer et al. 2015). We interpolate any traces not sampled at 100 Hz, the sample rate we employ in this study. To distinguish earthquakes from noise, representative noise samples are included in the training data for the CNN. As such we download a similar number of noise windows (defined as the time period prior to the P arrival time pick). Noise data are randomly selected when there are no known LFEs prior and after the time with the minimum separation of 180 s. This process results in nearly 750,000 waveforms split approximately evenly between noise, P-wave, and S-wave picks. We use a Gaussian function as the network target. The Gaussian is centered on the P or S wave arrival time and has a standard deviation of 0.4 s. This small value is to allow some errors in the arrival time pick in the catalog, but it is not sufficiently large that it smears the detection resolution. For noise waveforms, we use an array with all zeros as the targets. (Figure 4.2). 67 Figure 4.2. Example of LFE and noise data at station TWKB. (a) 3-components waveforms from LFE catalog (family: 006) from Royer & Bostock (2014). Gray lines show the raw data normalized by their amplitude; red lines show the stacking of gray lines (only show a few examples here). Blue line shows the gaussian function as the possibility of the S-wave arrival, which is the target for model training. Note that the label applies to individual waveforms (i.e., gray lines), not the stacked data which is only for demonstration purposes. (b) Similar to (a) but for noise data. 68 4.2.2 Convolutional Neural Network Architecture and Training The input data to the network is three component seismic data. Since data windows are 30 s long and we employ a sample rate 100 Hz, the input data has a length of 3000 samples. Leaving the training data in the original form, with the pick in the middle, would result in the CNN learning to pick the middle sample each time. As such, similar to Thomas et al. (2021) we use a data generator during training which randomly selects subsamples of traces from the training data, called batches, and applies the following modifications to the data prior to input. First, we randomly select a start time in the first half of the trace and include only 15 seconds of data beginning at that time. This has the effect of randomly shifting the pick in time such that it can occur at any time during the window. Second, to account for variable amplitudes in the training data, we normalize the three component data with the maximum amplitude of all three components and apply a logarithmic transformation to the input data. This transformation maps each value, x, in the original traces to two numbers: the first is sign(x) while the second is the ln(abs(x)+eps) where eps=1e-6. This has the effect of scaling the features such that input amplitudes do not vary over orders of magnitude and preserving information on the sign. The data generator supplies six channels (3 components with a normalized amplitude and sign for each) in batches to the CNN during training and augments the training data by shifting the pick times. For the ML model we employ the U-Net architecture from Thomas et al. (2021). U- Nets are composed of several convolutional layers (LeCun et al., 1998) and links, which allow the raw and early information to be accessible to the later decision layers. This 69 architecture has shown to be successful in biomedical image processing (Ronneberger et al., 2015) and in seismic phase identification (e.g., Zhu and Beroza, 2018). The model contains a size factor to control the size of the neurons of convolutional layers and dropout layers to avoid overfitting. Here, we only test three network sizes (i.e., size 0.5, 1, and 2) and fix the standard deviation of arrival time label of 0.4 s since our goal is to build and test the feasibility of applying such a method in a noisy environment. We do not fine tune the hyperparameters, which have shown to have minor influence on the performance, and find that the size 2 model works the best for P-wave and S-wave detection in our case. The data partitioning is important for ML models. One modification that we make is instead of mixing all the waveforms from all the events (Thomas et al., 2021), we split the data by the event ID so that traces from the same event will not participate in both the training and testing datasets, potentially minimizing the model memorization. A total of 269,422 events are split into 80% and 20% for training and testing, respectively. On average there are 2-3 P-wave or S-wave recordings associated with one event. We set our model batch size to 32, with a total of 50 training epochs. Finally, we take the final model as the preferred model and test it with both the testing dataset and the continuous data. 4.3 Results 4.3.1 Assessing Model Performance We first perform the model on the testing dataset to evaluate the model performance (Figure 4.3). To do so we calculate the accuracy, precision, and recall for both the P- and S-wave models. We first calculate accuracy 70 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 (%) = =>?=8 × 100%, (4.1) =>?=8?@>?@8 where true positive (TP) is defined by the number of positive detection and that are actual LFEs; true negative (TN) is the number of negative detection and that are noise; false positive (FP) and false negative (FN) are the number of incorrect LFE and noise predictions, respectively. For simplicity, here we initially consider this as a binary classification problem and discuss the arrival time accuracy later. We calculate the accuracy as a function of the decision threshold for the P- and S-wave models and find that the S-wave model has slightly higher accuracy (92%) than the P-wave (90%) at threshold ~0.1. Next, we calculate precision as 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 (%) = => × 100%, (4.2) =>?@> Unlike accuracy, precision ignores the number from negative predictions, and the value simply represents the rate of positive predictions and that are actually positive. Both our P and S-model have a precision of 95% at threshold ~0.1 which means the predicted LFEs are 95% accurate, and only 5% of the detections are false detection i.e. noise. Furthermore, to understand the rate of misclassification of actual LFEs we calculate recall, or the true positive rate (TPR) 𝑅𝑒𝑐𝑎𝑙𝑙 (%) = => × 100%, (4.3) =>?@8 Recall evaluates the rate of actual LFEs and that are successfully detected. For example, both our P and S-model have a recall of 90% at threshold ~0.1, this means 90% of the LFEs can be identified, and 10% of the LFEs are misclassified as noise. 71 A receiver operating characteristic (ROC) curve is another metric to evaluate the overall model performance (Figure 4.3b). The ROC curve varies the decision thresholds of a binary classifier and examines the true positive rate (TPR) against the false positive rate (FPR). The Area Under the ROC Curve (AUC) is a more common representation of the ROC curve. AUC spans a value from 0.5 to 1, where 0.5 is randomly guessing and 1 means perfect prediction. To further validate the model, we perform three different tests: testing the model with the full testing dataset (v1); testing with only large (>M2.2) events (v2); and recording at close (<30 km) epicentral distances (v3). We randomly select data from the above criteria and pass them into the generator to generate 1,000 LFEs and 1,000 noise samples and repeat the procedure 20 times to assess the distribution of ROC curves and AUC values (Figure 4.3). In comparison to the v1 test, which had AUCs of 0.92 and 0.97 for the P- and S-wave models, respectively, we find that the model performs better when testing it using only large events with an AUC of 0.96 and 0.98, for P- and S-wave models, respectively. This suggests that the ML model performs better with the higher signal-to- noise ratio data, representative of larger LFEs. The v3 test demonstrates that the model does not perform significantly better than the v1 test. This is because most of the LFEs are located beneath the stations with depth of ~40 km (Figure 4.1) and thus the difference in horizontal distances is insignificant. 72 Figure 4.3. Performance analysis for our selected model (size=2). (a) Precision, recall, and accuracy curve as a function of decision threshold. (b) ROC curve for testing with v1: unfiltered data, v2: large events (M>2.2) only, and v3: close epicentral distance (<30km) events only. The AUC values and their standard deviations are calculated from 20 groups of 2,000 random samples, mixing with half (i.e., 1,000) of noise data, from the testing dataset. (c), (d), same as (a), (b) but for S-wave model. 73 4.3.2 Application to Continuous Seismic Data According to the metrics, we set a decision threshold at 0.1 for both the P- and S- wave model and run our ML model on 14 years of continuous waveform data from 2005 to 2018. We find that, unsurprisingly, the model is able to identify known LFEs. Figure 4.4 shows example S-wave detections of a known LFE on multiple stations. The model clearly picks the arrival at stations TWKB, LZB, PGC, and SSIB. For station SILB and VGC the model detects the event but with a few second arrival time difference. Overall the model is adept at identifying existing LFEs. Furthermore, we find that our ML model detects events that are not in the original catalog; such events are routinely detected on multiple stations (Figure 4.5). Assuming there is only one LFE in the 15 s time window and all the detections are independent, the chance that such detections are false detections is smaller than 1%. 74 Figure 4.4. Example of S wave detections of a known LFE. Family ID: 022, origin time: 2005-09-09T16:55:26.065 from testing dataset (only showing East-component). All the waveforms are normalized by their amplitude and plotted with their epicentral distance along the y-axis. Bolded lines show the model prediction. Blue dots mark the detected arrivals from the model, red stars show the actual arrivals. Figure 4.5. Example of S wave detections of an unknown LFE which is not in the original catalog. Waveforms are normalized by their amplitude (only showing East-component). Bolded lines and dots show the model prediction and the detected arrivals from the model, respectively. 75 Beyond individual detections, Figure 4.6a shows timeseries of daily detection counts for the 14 stations that were used to train the network. High LFE rates manifest across the network during times of known large magnitude SSEs while inter-SSE time periods have very low detection rates. The result shows that the detections are consistent with known tremor activity catalog (Wech, 2021) and the original Royer & Bostock (2014) catalog (Figure 4.6b). We next define robust detections as those in which a minimum of three stations detect an arrival in the same 15 s time window. Our result shows that in the ten year period from 2005-2014, the model detects more than five times the number of events than the original catalog of Royer and Bostock (2014) in a fraction of the time. This is not surprising considering those authors applied their template matching approach to times of known SSEs to minimize computing time but does indicate that there are likely undetected LFEs during large SSEs and significant inter-SSE LFE activity that may be used to better resolve slip throughout the slow slip cycle. Figure 4.6. Model performance on multiple stations. (a) daily detection number for all the 14 stations, normalized by their maximum value. (b) Cumulative number of model detection, LFE catalog (Royer & Bostock, 2014), and tremor catalog (Wech, 2021). Detection is constrained by a minimum of 3 stations detecting an arrival at the same 15 s detection time window. 76 Furthermore, we apply the trained model to 10 stations that were not used during model training (Figure 4.7). The detection counts on these stations have the same low daily detection counts during inter-SSE periods that increase abruptly during times of known SSEs for time periods when data is available. The CNN has not seen any of the LFE data from these stations, yet it can still robustly detect LFEs, and the patterns are consistent with the original catalog (Figure 4.7). This result demonstrates the CNNs ability to extrapolate learned information to new settings and that path, site effects, and noise character for the unseen stations are likely to be similar to the training stations. Figure 4.7. Model performance on unused stations. The station locations are shown in Figure 4.1. 77 4.3.3 P and S-wave Arrival Time Estimates As demonstrated in Figure 4.4, arrival time prediction can be challenging especially for low signal-to-noise ratio data. We find that by setting a decision threshold of 0.1, the model has an averaged S-wave travel time misfit of -0.2 s, with a standard deviation of 3.6 s (Figure 4.8). This slightly decreases to -0.17 s and a standard deviation of 2.6 s when setting a higher threshold of 0.5. The negative mean value is mainly because the model identifies some of the earlier P-wave arrivals. This is shown in Figure 4.8a in the 0-40 km distance groups, where the arrival time misfits show a secondary peak at -6 s, the expected P-S wave arrival time difference for the depth of ~40km. Similarly, this can be also seen for the larger magnitude events shown in Figure 4.8b where the P-wave amplitude is expected to be more obvious. We find that the misfits do not decrease when using events with epicentral distances of less than 40 km, this is likely because all the sources are deep and thus the difference in the horizontal distance is insignificant, similar to the result of v1 and v3 tests in Figure 4.3. For the P-wave model, we do not find the predicted arrival time useful because the predictions are frequently mixed with the S-wave arrivals, yielding large misfits with a standard deviation of 4.2 s. This is expected, as shown in Figure 4.2, P-wave arrivals usually have such low signal-to-noise ratio that they are rarely detected. Thus in our daily seismicity analysis presented in Figure 4.6, we do not include the detections from the P-wave model. 78 Figure 4.8. Distribution of arrival time misfits in different groups. (a) shows the misfits in different distance groups. (b) The misfits in different magnitude groups. (c) Histogram of the misfits. 79 4.4 Discussion 4.4.1 Flexibility and Efficiency Past studies have shown that traditional template matching methods are an effective tool for identifying repeating LFEs in continuous seismic data (e.g. Shelly et al. 2007, Thomas and Bostock, 2015). Despite its success, the method has several limitations: it requires templates be selected a-priori, it finds only known signals and cannot extrapolate to waveforms of similar character, it requires similar station distributions through time and is computationally intensive. The CNN we develop here has several advantages over template matching. First, it is capable of identifying new and known LFEs as demonstrated above. Second, it can be applied to new stations in the same geographic region to detect existing and new LFEs. Finally, it is computationally efficient. Once it is trained, the time complexity of the model is O(N), which is linearly dependent on the total number of data, compare to the template matching method of O(N*M) which is depending on both the number of data and templates. Our result shows the presenting arrival misfits for ~seconds that will require additional work to accurately locate LFEs, but as a test of methodology, we anticipate that the predictions can be added as an additional constraint for a more robust detection i.e., only keeps the events when both the P- and S-wave are clear and reasonably separated. 80 4.4.2 Implication and Future Work From the daily seismicity plot (Figure 4.6), we find that the seismicity emerges (e.g., June 2006 - December 2006) before a group of LFEs occurrence, associated with the 2007 SSE occurrence. Such LFEs emergence seems to be a robust feature as it can be seen in almost all the inter-SSE periods, constraining by at least 3 stations. The pre-SSE activities is also correlated with the tremor catalog (Wech, 2021), where daily tremor rate gradually increases and eventually reaches the peak when an SSE emergence. Whether these activities suggest a process of slip nucleation and acceleration, or fluid been inconstantly supplied remains unknown. Future work on event locating and validation are necessarily to help understanding the actual seismic activities and the mechanism downdip the Vancouver Island. 4.5 Conclusion LFEs activities provide a tool to track SSE evolutions, associated to earthquake hazards. Traditional methods for detecting LFEs are computationally expensive and they are usually limited by the assumption that sources repeat. Here we train a CNN to detect LFEs and identify their P- and S-wave arrivals in Southern Vancouver Island. We first consider LFEs identification as a binary classification problem. When applied to the testing dataset our model has a high accuracy of 92% for S-waves and 90% for P-waves at a decision threshold of 0.1. This is remarkable considering the low signal-to-noise ratio of the data. We applied the CNN to 14 years of continuous data and find that the model detects more than five times the number of events than the original catalog (Bostock et al. 2012; Royer & Bostock, 2014). These events comprise undetected LFEs that occur during large 81 SSEs and interSSE LFEs that were not included in the original catalog. Given the high precision of 95% and by requiring simultaneous detections on at least 3 stations the false detection rate is less than 1% suggesting the vast majority of detections are robust. A high resolution LFEs catalog would undoubtedly help to constrain the enigmatic processes responsible for deep slow slip events however further work is needed to accurately locate LFEs before such investigations can yield insightful results. 82 CHAPTER V OVERLAPPING OF COSEISMIC AND TRANSIENT SLOW SLIP From Lin, J. T., Aslam, K. S., Thomas, A. M., & Melgar, D. (2020). Overlapping regions of coseismic and transient slow slip on the Hawaiian décollement. Earth and Planetary Science Letters, 544, 116353. 5.1 Introduction 5.1.1 Fast and Slow Slip Earthquakes represent rapid release of accumulated strain along faults and typically occur on timescales of fractions of a second to minutes in the largest earthquakes. To generate high frequency seismic waves, typical earthquakes slip at velocities ranging from 10−4 to 1+ m/s (e.g., Galetzka et al., 2015; Thomas et al., 2016). In contrast, slow, transient fault slip occurs on timescales of seconds to years resulting in slip velocities that are usually at most 1-2 orders of magnitude above the plate rate (Bürgmann, 2018). Transient slow slip events were originally discovered down dip of seismogenic zone in Japan and Cascadia (Obara, 2002; Rogers and Dragert, 2003). However, in the nearly 20 years since its original discovery it is now clear that transient slow slip can also occur in the shallow portion 83 of subduction megathrusts (Yamashita et al., 2015; Saffer & Wallace, 2015; Wallace et al., 2016) and within the seismogenic zone (Ito et al., 2013; Dixon et al., 2014). Regions that regularly host transient slow slip are often observed adjacent to highly coupled regions of subduction megathrusts capable of generating large magnitude earthquakes (e.g., Dixon et al., 2014; Radiguet et al., 2016; Rolandone et al., 2018). This observation naturally raises questions as to whether the same section of fault is capable of hosting both slow and fast slip. If they are, then using only the highly coupled regions of subduction megathrusts to estimate the areal footprint of future earthquakes may fundamentally underestimate rupture extent and earthquake magnitude. Beyond estimating the extent of future earthquakes, the ability of slip to penetrate into regions hosting slow slip has important implications for ground motion and tsunami hazards. For example, in many subduction zone geometries the down dip region of subduction megathrusts is the section of fault closest to major population centers, hence increasing slip in this region will increase ground motions there (Chapman & Melbourne, 2009; Frankel et al., 2015; Ramos and Huang, 2019). In contrast, increasing slip near the trench results in larger sea floor displacements and resulting tsunamis (Satake et al., 2013; Melgar et al., 2016). Despite the aforementioned implications, to date there is a dearth of observations of regions known to host slow slip overlapping with those that host coseismic slip. Here we show that the 2018 M7.1 Hawaii earthquake (Figure 5.1) ruptured the entirety of an adjacent region that regularly hosts transient aseismic slip. 84 Figure 5.1. Map of southeastern Hawai'i island. (a) Blue, orange and green vectors show the recorded GNSS horizontal displacements from the 2018 Mw7.1 earthquake, SSE-East and SSE-West, respectively. Triangles denote the strong motion stations from the Hawaiian Volcano Observatory network. The yellow circles represent tide gauges (KAPO, HONU) and one DART buoy (DART). The colormap shows the joint GPS, strong motion, and corrected tsunami inversion for the 2018 earthquake on the same fault plane of Liu et al. (2018). Orange and green contours mark the SSE areas of Foster et al. (2013). Focal mechanisms and hypocenters (stars) are taken from the USGS NEIC, including the Mw5.7 foreshock (20180504T21:32:44) shown in black and the mainshock (red). Brown lines denote regional active faults identified by the USGS. (b) Cross section along the red line shown in Panel A (a). The black triangle indicates the Kilauea's eastern rift system. Black lines represent the décollement of Hawaiian flank and the oceanic crust inferred by Morgan et al. (2000) (reflection profile 2 in Fig. 4) and the structures modified from Montgomery-Brown et al. (2009, Figure 18). Brown rectangles show all 45 fault geometries (i.e., projected from 3D to 2D) explored in our overlap analysis described below. The red rectangle marks the fault of Liu et al. (2018) that is also shown in Figure 5.1(a). 85 After exploring the extent of overlap as a function of fault plane geometry and inversion regularization parameters we find that, for most reasonable choices, the SSEs and coseismic slip overlap nearly entirely with the majority of the SSE regions hosting 1+ m of coseismic slip. Finally, we use simplified numerical models of fault slip to explore the conditions that promote significant overlap and discuss which variations in fault mechanical properties are likely responsible for variations in slip behavior. 5.1.2 Tectonic Setting of Hawaii The south east flank of Hawaii's Kīlauea volcano is a region of complex tectonics that reflect a combination of earthquake and magmatic processes (Owen et al., 1995). Leveling, continuous, and campaign Global Navigation Satellite System (GNSS) measurements indicate that it moves seaward at rates that can exceed 10 cm/yr (Owen et al., 1995; Delaney et al., 1998). This deformation is thought to be driven by unstable gravitational spreading and accumulation of magma along Kīlauea eastern rift systems (Thurber and Gripp, 1988) and is accommodated on a décollement dipping shallowly to the NW. The décollement is thought to coincide with the interface between the underlying oceanic crust and the volcanic edifice (Hill, 1969; Furumoto and Kovach, 1979; Nakamura, 1982). As such, the décollement is likely made up of a thin (<1 km) layer of oceanic sediments based on observations of low velocity seismic waves (Thurber et al., 1989) and measurements of sediment thickness in the Pacific basin surrounding Hawaii (Nakamura, 1982). The distribution of seismicity beneath the East rift zone (ERZ) clearly illuminates a planar geometry at a depth of 7~10 km (Klein et al., 1987; Denlinger & 86 Okubo, 1995; Got and Okubo, 2003) that extends seaward, which has been well- constrained by focal mechanisms and seismic reflection data (Gillard et al., 1996; Morgan et al., 2000; Park et al., 2007). Large earthquakes have occurred on the décollement, such as the 1868 M7.9 Kao and the 1975 Mw7.7 Kalapana earthquakes. Additionally, this structure hosts both periodic and aperiodic slow slip events (SSEs). 5.1.3 Slow Slip in Hawaii Cervelli et al. (2002) first identified slow slip in GPS data recorded on Kilauea's south flank that manifest as systematic GPS displacements of up to 1.5 cm occurring over a 36 hour period. Those authors found that the displacements were best fit by a shallow, thrust source located at 4.5 km depth. Accounting for material heterogeneity increased the estimated source depth to 5 km which led those authors to conclude that the slow slip was too shallow to be on the décollement. Later, Segall et al. (2006) and Brooks et al. (2006) showed that M5.5~5.9 transient SSEs regularly occur in Hawaii and can be magmatically triggered (Brooks et al., 2008). Additionally, Segall et al. (2006) employed the location and timing of coeval, triggered microseismicity to argue that these SSEs do indeed occur on the décollement. Montgomery-Brown et al. (2009) later showed the accounting for the combined effects of topography and elastic heterogeneity can reconcile the inferred depths of the SSEs with the depths of coeval microseismicity, further supporting the idea that both occur on the décollement. Additionally, they identified new SSEs with distinct deformation patterns. Foster et al. (2013) later recognized that there are two different asperities that produce SSEs; a western asperity that hosts nearly periodic 87 SSEs that average with typical magnitudes of ~M5.8 and an eastern asperity that hosts irregular ~M5.4 slip events. 5.2 Data 5.2.1 The Mw7.1 2018 Hawaii Earthquake The May 4 Mw7.1 Hawaii earthquake is the largest earthquake on the island since the 1975 M7.7 Kalapana earthquake (Owen and Bürgmann, 2006). In addition to widespread strong shaking it produced a tsunami with a maximum wave height of 40 cm recorded at a nearby tide gauge (KAPO, Figure 5.1). Different slip models using teleseismic, strong motion, GPS, tsunami data or some combination thereof have been published in the literature (Liu et al., 2018; Bai et al., 2018; Chen et al., 2019). While these models differ in their specifics, their salient features are similar and include peak slip of approximately 3 m on a structure dipping shallowly (5~7.5°) to the NW, with an average rupture speed of approximately 1 km/s. Another common feature of these slip models is they either do not include or fail to completely reproduce near-field tsunami observations at stations HONU and KAPO (Figure 5.1). Published inversions for the Hawaii earthquake that have attempted to include the near-field tsunami data ultimately have noted large arrival time discrepancies (e.g. the difference between modeled and observed arrival times as large as ~4 min at KAPO, only 30 km from the hypocenter (Bai et al., 2018)). Near field tsunami data provide important constraints on rupture extent and total moment (Satake et al., 2013), hence reproducing these observations is important because the 2018 Hawaii earthquake occurred almost completely offshore. We first download two near-field 88 tide gauges and an additional far-field ocean-bottom pressures from the Deep-ocean Assessment and Reporting of Tsunamis (DART) sensors from the Pacific Tsunami Warning Center that will be used to validate the inversion. Additionally, we download the GNSS observations processed by the Nevada Geodetic Laboratory (Blewitt et al., 2018), estimated from the 5 minutes sampling displacement timeseries. A total of 36 3- components GNSS stations with epicentral distances less than 0.6° were selected to ensure high signal-to-noise ratio. We also download data for the 9 closest strong motion stations from the IRIS DMC (https://ds.iris.edu/ds/nodes/dmc/), remove the instrument response, integrate to velocity, apply a bandpass filter with corner frequencies of 0.08-0.4 Hz, and resample to 1 Hz. We high-pass filter the tsunami observations with a 1 hour corner to remove the long period ocean tides and re-sample the data to 15 sec (Figure 5.2). We note that large tsunami misfits may indicate that flank failure may have occurred as part of the 2018 event, similar to the 1975 Kalapana earthquake discussed by Owen and Bürgmann (2006). However, our analysis completely rules out this possibility. Careful examination of the tide gauge records reveals a small tsunami signal resulting from a Mw5.7 foreshock which occurred ~1 hr before the mainshock with very similar source location and focal mechanism (Figure 5.3). This small event also has a significant time delay. This timing issue in near-field recordings has been noted before (Romano et al., 2016) and is attributed to artificial delays introduced into modeling of the tsunami propagation by the finite resolution of the digital bathymetric model in the near- shore region where tide gauges are located. To correct for this issue, we employ a novel empirical time calibration technique to the tsunami data recorded at near-field stations. We take the tsunami from Mw5.7 foreshock recorded at both tide gauges and use it to find the 89 optimal time shift for the Mw7.1 mainshock tsunami record. The tsunami waveforms of the Mw5.7 foreshock and mainshock are very similar as they share nearly identical path and site effects. Because the Mw5.7 earthquake is a relatively small event, it can be treated as a point source. We use the W-phase moment tensor solution to calculate the expected coseismic seafloor deformation and use this as the tsunami initial condition. We model its propagation model to both tide gauges and obtain the timing misfit by cross correlation between the modeled and observed arrivals. The time shifts between synthetic and observations are 240 s and 135 s for KAPO and HONU, respectively. These delays are employed as empirical time corrections for the mainshock inversion and lead to good fits to the data. 90 Figure 5.2. Data and fitting from the joint inversion model in Figure 5.1. (a) shows the GPS fitting, where the blue and red vectors represent horizontal coseismic displacement data and model, respectively. The circles and squares are color coded to show the vertical component of data and model, respectively. Triangles denote the strong motion stations. (b) A zoom in view of dashed area shown in (a). Red box outlines the surface projection of the fault plane. (c) Fitting of strong motion data (integrated to velocity), where the black and red lines represent normalized data and model, respectively. Texts mark the peak amplitude. (d) Model fitting of tsunami data. Station locations are shown in Figure 5.1. The top two panels show the model fitting and the comparison of tsunami without correction. Note that DART is not used in the joint inversion model, but for model validation purpose. 91 Figure 5.3. Tsunami waveforms and empirical tsunami correction. The blue (T1) and red (T2) areas mark the tsunami data from foreshock and mainshock, respectively. Inset T1 v.s. T1 model shows the tsunami model (blue dashed line) and the shifted model (red line) of the foreshock. The shifts are then used to correct the unmodeled error in the mainshock tsunami. Inset T1 v.s. T2 shows the normalized tsunami of foreshock (blue) and mainshock (red). 5.3 Joint Inversion Model We use the Mudpy (Melgar & Bock, 2015) with Laplacian regularized slips in the inversion. The fault geometry is taken from (Liu et al., 2018). We divide it into 18 by 13 subfaults with total length of 45 km and 39 km for the strike and dip direction, respectively. We use a frequency-wavenumber approach (Zhu & Rivera, 2002) and an average 1-D local 92 velocity model from Klein (1981) to calculate the static and strong motion Green’s functions (GFs) (Table 5.1). The Greens functions (GFs) of tsunami data are calculated in a two-step procedure following Melgar and Bock (2015). First, we obtain the vertical seafloor coseismic displacement produced from unit slip on each subfault. This is considered the initial condition for tsunami propagation. In a second step we propagate the tsunami to each tide gauge using the GeoClaw shallow water solver (Berger et al., 2011). We use the SRTM-15 topography and bathymetry. GeoClaw uses adaptive mesh refinement (AMR) and we allowed 4-Levels of AMR with an inner 15 arc-second and an outer 10 arc-min grid. We force the tsunami calculation to use the finest grid spacing as tsunami waves get close to the stations when subtle bathymetry changes become significant. We apply different weights to different dataset depending on data norm and level of uncertainty. The strong motion data has the highest weight in the inversion with 1/0.6, with respect to tsunami of 1/1.4. We set the weight of horizontal GPS as 1 and vertical GPS as 1/2, because of the larger uncertainty in GPS vertical component. We also jointly invert the GPS and strong motion data and grid-search the rupture velocity which best fit the strong motion waveforms (Figure 5.2). Note that although GPS is not sensitive to rupture velocity, it can still constrain the final moment in the joint inversion process. Similarly, for the inversion of SSEs, we take the GPS data from Foster et al. (2013), a study based on template matching the previously identified events. We invert the averaged GPS observations for all the identified East and West SSE families using the same inversion method that we invert for earthquake slip. In our preferred model, slip occurs on an asperity located southeast of the hypocenter (Figure 5.1) with a maximum slip of 3.2 m at a depth of 3.5 km resulting in a Mw7.1 event. Our preferred slip pattern also shows that the 2018 93 event ruptured southwest of the hypocenter toward the SSE area. We find a preferred rupture velocity of 1.2 km/s, consistent with the results of previous studies (Liu et al., 2018; Bai et al., 2018). Considering the 3.7 km/s shear wave velocity at this depth, this is a relatively slow rupture speed. Analysis of rupture progression indicates the rupture front reached the main asperity 10 s after initiation, gradually ruptured to the southwest beginning at 17 s, and ceased after 32 s (Figure 5.4). We also test the joint inversion with corrected and uncorrected tsunami waveforms, and the result shows that the model cannot explain tsunami data without empirical time corrections. Although the fault geometries of the 2018 earthquake employed in previous studies (e.g., Liu et al., 2018; Bai et al., 2018; Chen et al., 2019) seem to be distinct from the décollement, i.e., they have a shallower preferred depth of ~6 km and a steeper fault dip of ~7° or 20° according to the USGS W-phase moment tensor, this possibility has been ruled out by Chen et al. (2019). As they show, the fault depth is not very sensitive to the data they used in their joint inversion (i.e., teleseismic, GPS and strong motion) since the variance reduction (VR) only changes by up to 2% when varying the hypocentral depth between 5 and 9 km. Thus, the preferred shallower fault plane is only a plausible solution and we can only resolve the geometry within some range based on reasonable constraints (Table 5.2) from grid-searching, not a global minimum of the models. Also, the initial dip angle of 20° from USGS was refined by a later analysis of Love Wave radiation patterns to between 2.5-7.5° (Lay et al., 2018), suggesting the 2018 earthquake took place on a fault plane with a geometry similar to the décollement. 94 Figure 5.4. Snapshots for the kinematic rupture model in Figure 5.1. Subplots show the progression of the earthquake in different time stages. Red stars represent the earthquake epicenter. White vectors denote the direction and amplitude of the slip. 95 Table 5.1: Velocity model used in this study. Thickness(km) Vs(km/s) Vp(km/s) Density(g/cm^3) Qs Qp 4.5 2.37 4.23 2.12 300 600 5.5 3.74 6.65 2.89 300 600 5.0 3.85 6.85 2.96 300 600 1.25 4.25 7.56 3.19 300 600 83.75 4.58 8.15 3.38 300 600 5.4 Overlap of Slow and Fast Slip 5.4.1 Evidence from Inversion Model The results presented above suggest that there is significant overlap between the 2018 earthquake and SSE slip distributions so we now explore to what extent coseismic slip overlaps with SSEs. The relatively simple faulting environment and very dense modern geophysical observations when compared to subduction zone settings (e.g., Yamashita et al., 2015; Wallace et al., 2016; Ito et al., 2013; Dixon et al., 2014) make Hawaii Island an ideal place to explore this question. Owing to the large magnitude and associated fault dimension (i.e. >30 km), all studies of the 2018 earthquake agree the event, like the SSEs, ruptured the décollement (Liu et al., 2018; Bai et al., 2018; Chen et al., 2019). In order to determine the extent of overlap, we re-invert GPS data from Foster et al. (2013) for the SSE slip distribution; and jointly invert GPS, strong motion, and tsunami records for the earthquake slip distribution using the same inversion method introduced in the supplementary material. Our checkerboard tests show good resolution for the earthquake 96 inversion, and satisfactory resolution in the SSE source region. Because of the uncertainties in both the location and geometry of the décollement, we test a range of fault geometries with strikes of 229°-249°, dips of 3°-11°, and depth of 5-9 km; a total of 45 possible fault geometries. We also grid search Laplacian spatial regularization for each geometry. Rather than determining a best fit fault plane and slips we consider all possible models that have a reasonable VR and moment magnitude when compared to far-field moment tensor solutions and results from different sources of joint inversion (Bai et al., 2018; Liu et al., 2018; Chen et al., 2019), defined in Table 5.2. Note that for SSEs, only the VRGPS is considered. For the earthquake, the joint inversion has regularized all the dataset together and VRGPS above 70 should give satisfactory models that will be used for our overlap analysis discussed below. We test a total of 45 geometries and compute their overlap on each tested geometry (Figure 5.1b). For each geometry, each event (i.e., EQ, East and West SSE) has 50 models resulting from different regularization parameters. We select only the reasonable solutions from those 50 models based on the filtering criteria of Table 5.2. We note that the smearing effect in the inversion, thus, to focus only on the confident slip, we compute the top 50% of slip contour for each model, and calculate the overlap between EQ and SSEs (Figure 5.5). We also compare the results of setting the slip contour of top 30% to 80%, or compute the averaged value with only the overlaps are certain (i.e., non-zero averaging), the overlapping area are slightly different but the relatively features are similar to the Figure 5.5. This suggests that when the fault geometry is similar to that of the Hawaiian décollement, which is the most reasonable choice, this overlap feature of EQ and SSEs is hard to be rejected. 97 Figure 5.5. Overlapping areas of coseismic slip and SSEs from our inversion tests. (a) The light gray circles show the overlapping areas of all the possible solutions. Colored circles show the averaged of the light gray circles with numbers denoting the averaged overlapping area (in km2). Dashed black boxes mark the preferred décollement dip and depth. (b)- (d) show examples of overlapping calculations with (229°, 7°, 5 km), (244°, 3°, 7 km) and (229°, 7°, 9 km) for strike dip and depth, respectively. The red, blue and black lines mark the top 50% of slip from the earthquake, East SSE and West SSE inversion, respectively. Gray areas in the insects show the solutions that fit the given VR, Mw and maximum slip constraints (Table 5.2). 98 Table 5.2. Constraints for the grid-searched inversions. Earthquake SSE-West SSE-East Mw 6.980 VR>80 VR>50 Slipmax(m) 1 0.6), and is shown in Figure 5.7(a). A rupture nucleates within the CSZ (Figure 5.7(a), blue curves), and then expands within CSZ (green curves). The propagation speed with which the rupture expands is observed to vary in each simulation. Rupture begins decelerating at the CSZ-SSZ interface as it penetrates into the SSZ. The rupture deceleration is marked in Figure 5(a). During penetration of the rupture into the SSZ, no secondary nucleating patch within the SSZ is observed, and thus the rupture can be considered as a single rupture, penetrating from the CSZ to the SSZ. The second rupture scenario occurs at an intermediate effective stress ratio (0.6-0.4), and is shown in Figure 5.7(b). Rupture nucleates and expands within the CSZ (Figure 5.7(b), blue and green curves). Once it hits the CSZ-SSZ interface, a small secondary patch within the SSZ nucleates. The secondary patch is labeled in Figure 5.7(b). This secondary patch nucleation results in the larger penetration of the rupture as compared to the first case. The final rupture scenario occurs at a lower effective stress ratio and is shown in Figure 5.7(c). A primary rupture nucleates within the CSZ, while before the co- seismic rupture hits the CSZ-SSZ interface, we observe nucleation occurring on a secondary patch within the SSZ. Both the primary and secondary nucleation can be observed in Figure 5.7(c). Since the effective stress is significantly lower within the SSZ, the elastic stress changes from the primary slip patch assist in the nucleation of the secondary pulse. Once the rupture front from the CSZ reaches the CSZ-SSZ interface, the pre-existing secondary rupture facilitates rupture into the SSZ, which in most of our 10 4 simulations results in a 100% penetration (or complete penetration) of the CSE into the SSZ. Figure 5.7. Different CSE rupture scenarios observed in this study. Each curve represents a time snapshot of slip rate as a function of distance along fault, not equally spaced in time. Note that the time step decreases as the slip speed increases. Different colors represent different stages of the CSE rupture. The blue color represents the early stage of a CSE. This is the stage when rupture initiation occurs. The green color shows the middle stage of a CSE. This is the stage when the rupture grows and expands along the fault, reaches the CSZ-SSZ interface. The red color shows the final stage of a CSE, this stage marks the onset of the rupture termination. (a) The first CSE rupture scenario (with stress ratio = 0.7, σCSZ = 10 MPa) where no secondary nucleating patch within the SSZ is observed. (b) The second rupture scenario (with stress ratio = 0.4, σCSZ = 5 MPa) where a small secondary patch within the SSZ nucleates when the primary rupture reaches the CSZ-SSZ interface. (c) The third rupture scenario (with stress ratio = 0.2, σCSZ = 5 MPa) where a secondary patch nucleation is observed before the primary rupture reaches the CSZ-SSZ interface. 10 5 Figure 5.8(a) and Figure 5.8(b) shows the percent penetration of coseismic rupture into the SSZ for different 𝜎!0M levels and different values of the effective stress ratio. The percent penetration is defined as the ratio of the length of the SSZ ruptured by the CSE to the total length of the SSZ. Figure 5.8(a) shows penetration of a CSE for 𝑉00M -8! = 10 m/s, while Figure 5.8(b) shows penetration of CSE for 𝑉00M! = 10-6 m/s respectively. The minimum CSE penetration is observed to be 37% in Figure 5.8(a) and 48% in Figure 5.8 (b). Minimum penetration occurs at high 𝜎!0M (= 10 MPa), and a high effective stress ratio of 0.8. For 𝜎!0M values of 5 MPa and 2.5 MPa (shown in Figure 5.8) by triangles, and circles respectively, we observe lesser rupture penetration for higher effective stress ratios, comparatively more rupture penetration for smaller stress ratios, while complete rupture penetration occurs at the smallest stress ratio used in this study. For a 𝜎!0M value of 10 MPa (shown as squares in Figure 5.8), we observe lesser rupture penetration for higher stress ratios, but complete rupture penetration occurs at relatively higher stress ratio as compared to smaller stress levels. This is also indicated in Figure 5.8 by squares at 100% penetration levels for stress ratios of 0.2, 0.3 and 0.4. Our results suggest that the rupture penetration mainly depends on the stress ratio, and even slower coseismic ruptures may result in a complete CSE penetration under favorable stress conditions. This further suggests that the rupture penetration within the SSZ is largely assisted by the secondary pulse nucleation that helps the rupture to move farther within the SSZ. The overall behavior of the rupture penetration as a function of stress ratio remains the same for the different 𝑉00M! values used in this study i.e. a larger rupture penetration occurs for smaller stress ratios, while comparatively lesser rupture penetration occurs at higher stress ratios. Additionally, we observe that higher 𝑉00M! values result in larger rupture penetrations for similar 𝜎!0M and 10 6 stress ratios. Finally, the average rupture speed does not appear to be a significant factor in controlling percent penetration. Figure 5.8. The percentage penetration of CSE into the SSZ for different σcsz and stress ratios (σssz/σcsz). (a) Shows the percentage penetration of CSE into the SSZ when Vcssz=10−8 m/s. (b) same as (a), but when Vcssz=10−6 m/s. 5.5 Discussion Our analysis presented above clearly documents that the slip distribution of the 2018 M7.1 Hawaii earthquake overlaps with the slip distributions of regularly occurring slow slip events. As mentioned above, overlapping regions of slow and coseismic slip have been either observed or inferred by previous authors (Ito et al., 2013; Tsang et al., 2015; Clark et al., 2019). However, these studies are either not based on modern geodetic data, do not thoroughly explore inversion parameters, or do not explore the overlap in detail. For example, Ito et al. (2013) solved for a fault on subduction zone interface with assuming homogeneous slip to fit ocean-bottom pressure gauge data. The inversion results we show 10 7 here explore the role of fault geometry and regularization constraints on the resulting coseismic and slow slip distributions and summarize the reasonable models. Additionally, the checkerboard test demonstrates that our joint inversion for earthquake is robust, and has overall good resolution. We note that the inversion for SSEs has low resolution in up- dip area far offshore. However, since most of the overlapping slip regions occur where both the earthquake and SSEs have good resolution (i.e. mid-Western fault plane), we contend that it is a robust feature, and the 2018 Hawai'i earthquake ruptured a significant fraction of the region that regularly hosts slow slip events. One limitation of the inversion practices we employed is that they have limited spatial resolution and hence we cannot, through inversion alone, rule out the possibility that the SSEs and coseismic slip occurred on nearly parallel fault planes closely separated in depth. However, we consider this unlikely. Rupture of a shallower high angle splay (Figure 5.1b) is a possibility, but owing to its large spatial extent (width usually >30 km), all studies of the 2018 earthquake argue or assume that it ruptured the low-angle décollement (Liu et al., 2018; Bai et al., 2018; Chen et al., 2019). While the locations of the SSEs were also originally unclear, the work of Segall et al. (2006) and Montgomery- Brown et al. (2009) demonstrate that they too coincide with and are likely hosted by the décollement. In addition to the results we present here, a previous study that employed detailed trilateration measurements has shown that the 1975 Kalapana earthquake ruptured predominantly to the South-West of its hypocenter (Owen & Bürgmann, 2006). If the same region hosting SSEs today is persistent and existed in 1975, then this rupture likely completely ruptured the SSE area, providing another piece of evidence for overlapping regions of slow and fast slip. 10 8 While the specific mechanism that gives rise to slow, transient fault slip is still not known, that mechanism must allow for slip events that accelerate to slip rates 1-2 orders of magnitude higher than the background loading rates but not favor further increases in slip rate. Despite this, our observations indicate that it is possible to have coseismic rupture in regions that regularly host slow slip. Our model setup is motivated by geologic interpretations of the Hawaiian décollement and observed along-strike variation between coseismic and slow slip zones implying that frictional properties and other material properties should be relatively homogenous. Our simulation results suggest that lower effective stress ratios, i.e., larger contrasts between effective stress between the SSZ and the CSZ, promote penetration of the CSE into the SSZ resulting in larger overlap. Additionally, the effective stresses within the SSZ and the elastic stress changes from the primary slip patch (within the CSZ) control the overlap characteristics of the CSZ-SSZ such that even coseismic ruptures with slower overall rupture velocities, as observed in the 2018 Hawaii earthquake, can easily rupture the adjacent SSZ. While more realistic transitions in effective stress between the CSZ and SSZ would undoubtedly influence our estimates of penetration, the general result that coseismic rupture is capable of rupturing well into regions hosting slow slip is also supported by the modeling results of Ramos & Huang (2019). 5.6 Conclusion We provide strong evidence of overlapping fast and slow slip on the Hawaiian décollement. Joint inversion of GPS, strong motion and tide gauge tsunami observations 10 9 clearly reveals the slip pattern of the 2018 Mw7.1 Hawai'i earthquake. After thoroughly exploring inversion parameters we find that the earthquake likely ruptured an adjacent region regularly hosts transient slow slip events. Furthermore, we performed quasi- dynamic modeling in 2D with rate and state friction to simulate how coseismic ruptures interact with adjacent regions that host slow slip. The unique tectonic setting of the Hawaiian décollement suggests differences in effective stress are likely a key control on slip diversity. Our modeling results suggest that fast slip can easily penetrate into the slow slip zone, as we observe. Our result reinforces the idea that an individual section of fault is capable of hosting a variety of distinct slip behaviors, and provides observational and theoretical evidence that a fault undergoes slow slip behavior can also rupture seismically. This behavior may also be a common feature in other tectonic environments such as subduction zones. As such, earthquake magnitude forecasts should include adjacent slow slip regions in estimates of areal extent and magnitude. 11 0 CHAPTER VI CONCLUSION AND FUTURE STEPS This dissertation presents the work of applying machine-learning and traditional seismic analysis techniques to large and small earthquake problems. In Chapter II, I have shown that by harnessing machine-learning, rupture simulations, and HR-GNSS data, earthquake magnitude can be determined from complex surface deformation patterns. The model, called M-LARGE, can rapidly predict earthquake moment magnitude with an accuracy of 99% for large (Mw7.0+) magnitude earthquakes without saturation issue, outperforming other similar methods. In Chapter III, I then showed that additionally to the earthquake magnitude, our M-LARGE model can predict the evolution of centroid location and fault size during the rupture. With an accuracy of 99% for magnitude, and an averaged of 95% for other source parameters, earthquake rupture extends can be predicted and further be utilized for ground motion forecasting. The tests on simulated data and real data showed a long median warning time of 36~40.5 s at sites with MMI=4, longer than the existing methods. For sites with distance of 200 km, our model performed a warning time of 34.5 s, and 58.4 s for distance of 300 km. We concluded that our proposed model can work in tandem with the existing EEW models and provide accurate and unsaturated source information and fast alerts that will mitigate earthquake hazards. 11 1 In Chapter IV, I have shown how machine-learning can be used as a data mining tool to detect small magnitude seismicity with low signal-to-noise ratio. We demonstrated that our CNN model could predict LFEs in Southern Vancouver Island, with a high accuracy of 92% and 90% for S-wave, and P-wave detection at threshold ~0.1, respectively. We applied the model to 14 years of continuous data and found that the model detected more than five times the number of events than the original catalog. Given the high precision of 95% and constraining by simultaneous detections on at least 3 stations, and the similar feature of the tremor catalog, which was built independently, the detected LFEs were robust. We found that the LFEs occurred frequently during the inter-SSE periods that were undetected due to the limitations of previous methods. This can help to understand the processes of deep slow slip events; however, further work is needed to accurately locate LFEs before such investigations can yield insightful results. In Chapter V, I have demonstrated how small and slow earthquakes link to large and fast events and their implication on earthquake hazard assessment. With jointly inverted GNSS, strong motion, and tsunami data of the 2018 M7.1 Hawaii earthquake, we found strong evidence of overlap between the fast slip and previously identified slow slip on a same section of fault. The result was further validated by rupture simulations, where we found that the effective stress can be a dominant factor that controlled the rupture extent. This finding supported the idea that slow slip should be considered as rupture extent for a more accurate hazard assessment. In this dissertation I have shown the work that integrated machine-learning techniques to solve the previously challenging problems in large and small earthquake seismology. Our solutions can provide faster and more accurate warning in advance for 11 2 large earthquake to mitigate hazards. For future steps, this can also be made to rapid tsunami early warning. Tsunami waves travel much slower than the seismic waves but generates large and catastrophic damages from region to global scale. Our unsaturated and rapid source prediction provides great potential to forecast the tsunami height and inundation way before their onsets. For LFE detection, our machine-learning model shown great promise for detecting events. Since it is more efficient and flexible than the traditional approach, we anticipate this can be running in real-time providing additional controls for time dependent subduction zone activities. Future work will be also focused on improving the spatial resolution of the LFEs before insightful results can be made. Finally, our overlapping investigation on Hawaiian decollement provided one of the rare real-world constraints. We anticipate this may also be a common feature in other tectonic environments as more observations are available (e.g., the constraint from LFE detections in Chapter IV). Understanding the rupture extent is helpful for building more realistic rupture simulations that helps the coseismic model (e.g., large earthquake EEW in Chapter II & III) for better early warning solutions. 11 3 REFERENCES CITED Allen, R. M. (2007). The ElarmS earthquake early warning methodology and application across California. In Earthquake early warning systems (pp. 21-43). Springer, Berlin, Heidelberg. Allen, R. M., & Kanamori, H. (2003). The potential for earthquake early warning in southern California. Science, 300(5620), 786-789. Allen, R. M., & Melgar, D. (2019). Earthquake early warning: Advances, scientific challenges, and societal needs. Annual Review of Earth and Planetary Sciences, 47, 361- 388. Ando, R., Takeda, N., & Yamashita, T. (2012). Propagation dynamics of seismic and aseismic slip governed by fault heterogeneity and Newtonian rheology. Journal of Geophysical Research: Solid Earth, 117(B11). Báez, J. C., Leyton, F., Troncoso, C., del Campo, F., Bevis, M., Vigny, C., et al. (2018). The Chilean GNSS network: Current status and progress toward early warning applications. Seismological Research Letters, 89(4), 1546–1554. Bai, Y., Ye, L., Yamazaki, Y., Lay, T., & Cheung, K. F. (2018). The 4 May 2018 MW 6.9 Hawaii Island earthquake and implications for tsunami hazards. Geophysical Research Letters, 45(20), 11-040. Behr, W. M., & Bürgmann, R. (2021). What’s down there? The structures, materials and environment of deep-seated slow slip and tremor. Philosophical Transactions of the Royal Society A, 379(2193), 20200218. Berger, M. J., George, D. L., LeVeque, R. J., & Mandli, K. T. (2011). The GeoClaw software for depth-averaged flows with adaptive refinement. Advances in Water Resources, 34(9), 1195-1206. Beroza, G. C., & Ide, S. (2011). Slow earthquakes and nonvolcanic tremor. Annual review of Earth and planetary sciences, 39, 271-296. Blaser, L., Krüger, F., Ohrnberger, M., & Scherbaum, F. (2010). Scaling relations of earthquake source parameter estimates with special focus on subduction environment. Bulletin of the Seismological Society of America, 100(6), 2914–2926. 11 4 Blewitt, G., Hammond, W. C., & Kreemer, C. (2018). Harnessing the GPS data explosion for interdisciplinary science. Eos, 99(10.1029), 485. Bock, Y., & Melgar, D. (2016). Physical applications of GPS geodesy: A review. Reports on Progress in Physics, 79(10), 106801. Boore, D. M., & Bommer, J. J. (2005). Processing of strong-motion accelerograms: needs, options and consequences. Soil Dynamics and Earthquake Engineering, 25(2), 93- 115. Böse, M., Heaton, T. H., & Hauksson, E. (2012). Real-time finite fault rupture detector (FinDer) for large earthquakes. Geophysical Journal International, 191(2), 803-812. Böse, M., Ionescu, C., & Wenzel, F. (2007). Earthquake early warning for Bucharest, Romania: Novel and revised scaling relations. Geophysical research letters, 34(7). Böse, M., Smith, D. E., Felizardo, C., Meier, M. A., Heaton, T. H., & Clinton, J. F. (2018). FinDer v. 2: Improved real-time ground-motion predictions for M2–M9 with seismic finite-source characterization. Geophysical Journal International, 212(1), 725- 742. Bostock, M. G., Royer, A. A., Hearn, E. H., & Peacock, S. M. (2012). Low frequency earthquakes below southern Vancouver Island. Geochemistry, Geophysics, Geosystems, 13(11). Brooks, B. A., Foster, J. H., Bevis, M., Frazer, L. N., Wolfe, C. J., & Behn, M. (2006). Periodic slow earthquakes on the flank of Kīlauea volcano, Hawaiʻi. Earth and Planetary Science Letters, 246(3-4), 207-216. Brooks, B. A., Foster, J., Sandwell, D., Wolfe, C. J., Okubo, P., Poland, M., & Myer, D. (2008). Magmatically triggered slow slip at Kilauea Volcano, Hawaii. Science, 321(5893), 1177-1177. Brown, J. R., Prejean, S. G., Beroza, G. C., Gomberg, J. S., & Haeussler, P. J. (2013). Deep low‐frequency earthquakes in tectonic tremor along the Alaska‐Aleutian subduction zone. Journal of Geophysical Research: solid earth, 118(3), 1079-1090. Bürgmann, R. (2018). The geophysics, geology and mechanics of slow fault slip. Earth and Planetary Science Letters, 495, 112-134. Calvert, A. J., Bostock, M. G., Savard, G., & Unsworth, M. J. (2020). Cascadia low frequency earthquakes at the base of an overpressured subduction shear zone. Nature communications, 11(1), 1-10. Cervelli, P., Segall, P., Johnson, K., Lisowski, M., & Miklius, A. (2002). Sudden aseismic fault slip on the south flank of Kilauea volcano. Nature, 415(6875), 1014-1018. 11 5 Chamberlain, C. J., Shelly, D. R., Townend, J., & Stern, T. A. (2014). Low-frequency earthquakes reveal punctuated slow slip on the deep extent of the Alpine fault, New Zealand. Geochemistry, Geophysics, Geosystems, 15(7), 2984-2999. Chapman, J. S., & Melbourne, T. I. (2009). Future Cascadia megathrust rupture delineated by episodic tremor and slip. Geophysical Research Letters, 36(22). Chen, K., Smith, J. D., Avouac, J. P., Liu, Z., Song, Y. T., & Gualandi, A. (2019). Triggering of the Mw 7.2 Hawaii earthquake of 4 May 2018 by a dike intrusion. Geophysical Research Letters, 46(5), 2503-2510. Chestler, S. R., & Creager, K. C. (2017). Evidence for a scale‐limited low‐frequency earthquake source process. Journal of Geophysical Research: Solid Earth, 122(4), 3099- 3114. Clark, K., Howarth, J., Litchfield, N., Cochran, U., Turnbull, J., Dowling, L., Howell, A., Berryman, K. & Wolfe, F. (2019). Geological evidence for past large earthquakes and tsunamis along the Hikurangi subduction margin, New Zealand. Marine Geology, 412, 139-172. Cochran, E. S., Bunn, J., Minson, S. E., Baltay, A. S., Kilb, D. L., Kodera, Y., & Hoshiba, M. (2019). Event detection performance of the PLUM earthquake early warning algorithm in southern California. Bulletin of the Seismological Society of America, 109(4), 1524-1541. Colombelli, S., Allen, R. M., & Zollo, A. (2013). Application of real‐time GPS to earthquake early warning in subduction and strike‐slip environments. Journal of Geophysical Research: Solid Earth, 118(7), 3448-3461. Crowell, B. W., Melgar, D., Bock, Y., Haase, J. S., & Geng, J. (2013). Earthquake magnitude scaling using seismogeodetic data. Geophysical Research Letters, 40(23), 6089-6094. Crowell, B. W., Schmidt, D. A., Bodin, P., Vidale, J. E., Baker, B., Barrientos, S., & Geng, J. (2018). G-FAST earthquake early warning potential for great earthquakes in Chile. Seismological Research Letters, 89(2A), 542–556. Crowell, B. W., Schmidt, D. A., Bodin, P., Vidale, J. E., Gomberg, J., Renate Hartog, J., et al. (2016). Demonstration of the Cascadia G-FAST geodetic earthquake early warning system for the Nisqually, Washington, earthquake. Seismological Research Letters, 87(4), 930– 943. Delaney, P. T., Denlinger, R. P., Lisowski, M., Miklius, A., Okubo, P. G., Okamura, A. T., & Sako, M. K. (1998). Volcanic spreading at Kilauea, 1976–1996. Journal of Geophysical Research: Solid Earth, 103(B8), 18003-18023. 11 6 DeMets, C., Gordon, R. G., & Argus, D. F. (2010). Geologically current plate motions. Geophysical Journal International, 181(1), 1–80. Denlinger, R. P., & Okubo, P. (1995). Structure of the mobile south flank of Kilauea Volcano, Hawaii. Journal of Geophysical Research: Solid Earth, 100(B12), 24499- 24507. Dieterich, J. H. (1978). Time-dependent friction and the mechanics of stick-slip. In Rock friction and earthquake prediction (pp. 790-806). Birkhäuser, Basel. Dieterich, J. H. (1979). Modeling of rock friction: 1. Experimental results and constitutive equations. Journal of Geophysical Research: Solid Earth, 84(B5), 2161- 2168. Dixon, T. H., Jiang, Y., Malservisi, R., McCaffrey, R., Voss, N., Protti, M., & Gonzalez, V. (2014). Earthquake and tsunami forecasts: Relation of slow slip events to subsequent earthquake rupture. Proceedings of the National Academy of Sciences, 111(48), 17039- 17044. Dragert, H., Wang, K., & James, T. S. (2001). A silent slip event on the deeper Cascadia subduction interface. Science, 292(5521), 1525-1528. Duputel, Z., Tsai, V. C., Rivera, L., & Kanamori, H. (2013). Using centroid time-delays to characterize source durations and identify earthquakes with unique characteristics. Earth and Planetary Science Letters, 374, 92-100. Espinosa-Aranda, J. M., Cuellar, A., Garcia, A., Ibarrola, G., Islas, R., Maldonado, S., & Rodriguez, F. H. (2009). Evolution of the Mexican seismic alert system (SASMEX). Seismological research letters, 80(5), 694-706. Espinosa Aranda, J. M., Jimenez, A., Ibarrola, G., Alcantar, F., Aguilar, A., Inostroza, M., & Maldonado, S. (1995). Mexico City seismic alert system. Seismological research letters, 66, 42-53. Foster, J. H., Lowry, A. R., & Brooks, B. A. (2013). Fault frictional parameters and material properties revealed by slow slip events at Kilauea volcano, Hawai ‘i. Geophysical Research Letters, 40(23), 6059-6063. Frank, W. B., Shapiro, N. M., Kostoglodov, V., Husker, A. L., Campillo, M., Payero, J. S., & Prieto, G. A. (2013). Low‐frequency earthquakes in the Mexican Sweet Spot. Geophysical Research Letters, 40(11), 2661-2666. Frankel, A., Chen, R., Petersen, M., Moschetti, M., & Sherrod, B. (2015). 2014 update of the Pacific Northwest portion of the US National Seismic Hazard Maps. Earthquake Spectra, 31(1_suppl), S131-S148. 11 7 Frankel, A., Wirth, E., Marafi, N., Vidale, J., & Stephenson, W. (2018). Broadband synthetic seismograms for magnitude 9 earthquakes on the cascadia megathrust based on 3D simulations and stochastic synthetics, part 1: Methodology and overall results. Bulletin of the Seismological Society of America, 108(5A), 2347– 2369. Furumoto, A. S., & Kovach, R. L. (1979). The Kalapana earthquake of November 29, 1975: An intra-plate earthquake and its relation to geothermal processes. Physics of the Earth and Planetary Interiors, 18(3), 197-208. Galetzka, J., Melgar, D., Genrich, J. F., Geng, J., Owen, S., Lindsey, E. O., et al. (2015). Slip pulse and resonance of the Kathmandu basin during the 2015 Gorkha earthquake, Nepal. Science, 349(6252), 1091-1095. Geller, R. J. (1976). Scaling relations for earthquake source parameters and magnitudes. Bulletin of the Seismological Society of America, 66(5), 1501-1523. Geng, J., Pan, Y., Li, X., Guo, J., Liu, J., Chen, X., & Zhang, Y. (2018). Noise characteristics of high-rate multi-GNSS for subdaily crustal deformation monitoring. Journal of Geophysical Research: Solid Earth, 123(2), 1987–2002. Gibbons, S. J., & Ringdal, F. (2006). The detection of low magnitude seismic events using array-based waveform correlation. Geophysical Journal International, 165(1), 149- 166. Gillard, D., Wyss, M., & Okubo, P. (1996). Type of faulting and orientation of stress and strain as a function of space and time in Kilauea's south flank, Hawaii. Journal of Geophysical Research: Solid Earth, 101(B7), 16025-16042. Given, D.D., Cochran, E.S., Heaton, T., Hauksson, E., Allen, R., Hellweg, P., Vidale, J., and Bodin, P., 2014, Technical implementation plan for the ShakeAlert production system: An Earthquake Early Warning system for the West Coast of the United States: U.S. Geological Survey Open-File Report 2014–1097, 25 p. Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the fourteenth international conference on artificial intelligence and statistics (pp. 315–323). Goda, K., Yasuda, T., Mori, N., & Maruyama, T. (2016), New scaling relationships of earthquake source parameters for stochastic tsunami simulation. Coastal Engineering Journal, 58(3), 1650010-1. Goldberg, D. E., & Melgar, D. (2020). Generation and validation of broadband synthetic P waves in semistochastic models of large earthquakes. Bulletin of the Seismological Society of America, 110(4), 1982– 1995. 11 8 Goldberg, D. E., Melgar, D., & Bock, Y. (2019). Seismogeodetic P‐wave amplitude: No evidence for strong determinism. Geophysical research letters, 46(20), 11118-11126. Got, J. L., & Okubo, P. (2003). New insights into Kilauea's volcano dynamics brought by large‐scale relative relocation of microearthquakes. Journal of Geophysical Research: Solid Earth, 108(B7). Grapenthin, R., Johanson, I. A., & Allen, R. M. (2014). Operational real‐time GPS‐ enhanced earthquake early warning. Journal of Geophysical Research: Solid Earth, 119(10), 7944-7965. Graves, R. W., & Pitarka, A. (2010). Broadband ground-motion simulation using a hybrid approach. Bulletin of the Seismological Society of America, 100(5A), 2095–2123. Graves, R., & Pitarka, A. (2015), Refinements to the Graves and Pitarka (2010) broadband ground‐motion simulation method. Seismological Research Letters, 86(1), 75- 80. Hawthorne, J. C., & Rubin, A. M. (2013). Laterally propagating slow slip events in a rate and state friction model with a velocity‐weakening to velocity‐strengthening transition. Journal of Geophysical Research: Solid Earth, 118(7), 3785-3808. Hayes, G. P. (2017). The finite, kinematic rupture properties of great-sized earthquakes since 1990. Earth and Planetary Science Letters, 468, 94– 100. Hayes, G. P., Moore, G. L., Portner, D. E., Hearne, M., Flamme, H., Furtney, M., & Smoczyk, G. M. (2018). Slab2, a comprehensive subduction zone geometry model. Science, 362(6410), 58– 61. Hill, D. P. (1969). Crustal structure of the island of Hawaii from seismic-refraction measurements. Bulletin of the Seismological Society of America, 59(1), 101-130. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. Hoshiba, M., Iwakiri, K., Hayashimoto, N., Shimoyama, T., Hirano, K., Yamada, Y., & Kikuta, H. (2011). Outline of the 2011 off the Pacific coast of Tohoku Earthquake (Mw 9.0)—Earthquake early warning and observed seismic intensity—. Earth Planets and Space, 63(7), 547– 551. Hoshiba M., & Ozaki T. (2014). Earthquake early warning and tsunami warning of the Japan MeteorologicalAgency, and their performance in the 2011 off the Pacific coast of Tohoku earthquake (M9.0). InEarlyWarning for Geological Disasters, ed. F Wenzel, J Zschau, pp. 1–28. Berlin: Springer-Verlag Hsiao, N. C., Wu, Y. M., Shin, T. C., Zhao, L., & Teng, T. L. (2009). Development of earthquake early warning system in Taiwan. Geophysical research letters, 36(5). 11 9 Hutchison, A. A., Böse, M., & Manighetti, I. (2020). Improving early estimates of large earthquake's final fault lengths and magnitudes leveraging source fault structural maturity information. Geophysical Research Letters, 47(14), e2020GL087539. Ide, S. (2019). Frequent observations of identical onsets of large and small earthquakes. Nature, 573(7772), 112-116. Ide, S., Beroza, G. C., Shelly, D. R., & Uchide, T. (2007). A scaling law for slow earthquakes. Nature, 447(7140), 76-79. Ide, S., Shelly, D. R., & Beroza, G. C. (2007). Mechanism of deep low frequency earthquakes: Further evidence that deep non‐volcanic tremor is generated by shear slip on the plate interface. Geophysical Research Letters, 34(3). Ito, Y., Hino, R., Kido, M., Fujimoto, H., Osada, Y., Inazu, D., et al. (2013). Episodic slow slip events in the Japan subduction zone before the 2011 Tohoku-Oki earthquake. Tectonophysics, 600, 14-26. Ito, Y., Obara, K., Shiomi, K., Sekine, S., & Hirose, H. (2007). Slow earthquakes coincident with episodic tremors and slow slip events. Science, 315(5811), 503-506. Jozinović, D., Lomax, A., Štajduhar, I., & Michelini, A. (2020). Rapid prediction of earthquake ground shaking intensity using raw waveform data and a convolutional neural network. Geophysical Journal International, 222(2), 1379-1389. Kamigaichi, O., Saito, M., Doi, K., Matsumori, T., Tsukada, S. Y., Takeda, K., et al. (2009). Earthquake early warning in Japan: Warning the general public and future prospects. Seismological Research Letters, 80(5), 717. Kato, A., Obara, K., Igarashi, T., Tsuruoka, H., Nakagawa, S., & Hirata, N. (2012). Propagation of slow slip leading up to the 2011 M w 9.0 Tohoku-Oki earthquake. Science, 335(6069), 705-708. Katsumata, A., & Kamaya, N. (2003). Low‐frequency continuous tremor around the Moho discontinuity away from volcanoes in the southwest Japan. Geophysical Research Letters, 30(1), 20-1. Kawamoto, S., Hiyama, Y., Ohta, Y., & Nishimura, T. (2016). First result from the GEONET real-time analysis system (REGARD): the case of the 2016 Kumamoto earthquakes. Earth, Planets and Space, 68(1), 1-12. Klein, F. W. (1981). A linear gradient crustal model for south Hawaii. Bulletin of the Seismological Society of America, 71(5), 1503-1510. 12 0 Klein, F. W., Koyanagi, R. Y., Nakata, J. S., Tanigawa, W. R., Decker, R. W., Wright, T. L., & Stauffer, P. H. (1987). Volcanism in Hawaii. US Geol. Surv. Prof. Pap, 1350, 1019-1185. Kodera, Y., Hayashimoto, N., Moriwaki, K., Noguchi, K., Saito, J., Akutagawa, J., et al. (2020). First-year performance of a nationwide earthquake early warning system using a wavefield-based ground-motion prediction algorithm in Japan. Seismological Research Letters, 91(2A), 826– 834. Kodera, Y., Yamada, Y., Hirano, K., Tamaribuchi, K., Adachi, S., Hayashimoto, N., et al. (2018). The propagation of local undamped motion (PLUM) method: A simple and robust seismic wavefield estimation approach for earthquake early warning. Bulletin of the Seismological Society of America, 108(2), 983– 1003. Kohler, M. D., Smith, D. E., Andrews, J., Chung, A. I., Hartog, R., Henson, I., et al. (2020). Earthquake early warning ShakeAlert 2.0: Public rollout. Seismological Research Letters, 91(3), 1763–1775. Kong, Q., Trugman, D. T., Ross, Z. E., Bianco, M. J., Meade, B. J., & Gerstoft, P. (2019). Machine learning in seismology: Turning data into insights. Seismological Research Letters, 90(1), 3– 14. Krischer, L., Megies, T., Barsch, R., Beyreuther, M., Lecocq, T., Caudron, C., & Wassermann, J. (2015). ObsPy: A bridge for seismology into the scientific Python ecosystem. Computational Science & Discovery, 8(1), 014003. Larson, K. M. (2009). GPS seismology. Journal of Geodesy, 83(3), 227-233. Lay, T., Ye, L., Kanamori, H., & Satake, K. (2018). Constraining the dip of shallow, shallowly dipping thrust events using long‐period love wave radiation patterns: Applications to the 25 October 2010 Mentawai, Indonesia, and 4 May 2018 Hawaii Island earthquakes. Geophysical Research Letters, 45(19), 10-342. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436– 444. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. LeVeque, R. J., Waagan, K., González, F. I., Rim, D., & Lin, G. (2016). Generating random earthquake events for probabilistic tsunami hazard assessment. Global Tsunami Science: Past and Future (Vol. I, pp. 3671–3692). Cham: Birkhäuser. Lin, J. T., Chang, W. L., Melgar, D., Thomas, A., & Chiu, C. Y. (2019). Quick determination of earthquake source parameters from GPS measurements: A study of suitability for Taiwan. Geophysical Journal International, 219(2), 1148–1162. 12 1 Lin, J. T., Melgar, D., Thomas, A. M., & Searcy, J. (2021). Early warning for great earthquakes from characterization of crustal deformation patterns with deep learning. Journal of Geophysical Research: Solid Earth, 126(10), e2021JB022703. Liu, C., Lay, T., & Xiong, X. (2018). Rupture in the 4 May 2018 Mw 6.9 earthquake seaward of the Kilauea east rift zone fissure eruption in Hawaii. Geophysical Research Letters, 45(18), 9508-9515. Liu, Y., & Rice, J. R. (2009). Slow slip predictions based on granite and gabbro friction data compared to GPS measurements in northern Cascadia. Journal of Geophysical Research: Solid Earth, 114(B9). Lomax, A., Michelini, A., & Jozinović, D. (2019). An investigation of rapid earthquake characterization using single-station waveforms and a convolutional neural network. Seismological Research Letters, 90(2A), 517–529. Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proc. icml (Vol. 30, p. 3). Mai, P. M., & Beroza, G. C. (2002). A spatial random field model to characterize complexity in earthquake slip. Journal of Geophysical Research: Solid Earth, 107(B11), ESE-10. Meier, M. A., Ampuero, J. P., Cochran, E., & Page, M. (2021). Apparent earthquake rupture predictability. Geophysical Journal International, 225(1), 657-663. Meier, M. A., Ampuero, J. P., & Heaton, T. H. (2017). The hidden simplicity of subduction megathrust earthquakes. Science, 357(6357), 1277-1281. Meier, M. A., Heaton, T., & Clinton, J. (2016). Evidence for universal earthquake rupture initiation behavior. Geophysical Research Letters, 43(15), 7991-7996. Melgar, D., Bock, Y., & Crowell, B. W. (2012). Real-time centroid moment tensor determination for large earthquakes from local and regional displacement records. Geophysical Journal International, 188(2), 703– 718. Melgar, D., & Bock, Y. (2015). Kinematic earthquake source inversion and tsunami runup prediction with regional geophysical data. Journal of Geophysical Research: Solid Earth, 120(5), 3324-3349. Melgar, D., Crowell, B. W., Bock, Y., & Haase, J. S. (2013). Rapid modeling of the 2011 Mw 9.0 Tohoku-Oki earthquake with seismogeodesy. Geophysical Research Letters, 40(12), 2963– 2968. 12 2 Melgar, D., Crowell, B. W., Geng, J., Allen, R. M., Bock, Y., Riquelme, S., et al. (2015). Earthquake magnitude calculation without saturation from the scaling of peak ground displacement. Geophysical Research Letters, 42(13), 5197– 5205. Melgar, D., Crowell, B. W., Melbourne, T. I., Szeliga, W., Santillan, M., & Scrivner, C. (2020). Noise characteristics of operational real-time high-rate GNSS positions in a large aperture network. Journal of Geophysical Research: Solid Earth, 125(7), e2019JB019197. Melgar, D., Fan, W., Riquelme, S., Geng, J., Liang, C., Fuentes, M., et al. (2016). Slip segmentation and slow rupture to the trench during the 2015, Mw8. 3 Illapel, Chile earthquake. Geophysical Research Letters, 43(3), 961-966. Melgar, D., & Hayes, G. P. (2017). Systematic observations of the slip pulse properties of large earthquake ruptures. Geophysical Research Letters, 44(19), 9691– 9698. Melgar, D., & Hayes, G. P. (2019). Characterizing large earthquakes before rupture is complete. Science Advances, 5(5), eaav2032. Melgar, D., LeVeque, R. J., Dreger, D. S., & Allen, R. M. (2016). Kinematic rupture scenarios and synthetic displacement data: An example application to the Cascadia subduction zone. Journal of Geophysical Research: Solid Earth, 121(9), 6658– 6674. Mena, B., Mai, P. M., Olsen, K. B., Purvance, M. D., & Brune, J. N. (2010), Hybrid broadband ground-motion simulation using scattering Green’s functions: Application to large-magnitude events. Bulletin of the Seismological Society of America, 100(5A), 2143- 2162. Minson, S. E., Baltay, A. S., Cochran, E. S., McBride, S. K., & Milner, K. R. (2021). Shaking is almost always a surprise: The earthquakes that produce significant ground motion. Seismological Society of America, 92(1), 460–468. Minson, S. E., Murray, J. R., Langbein, J. O., & Gomberg, J. S. (2014). Real‐time inversions for finite fault slip models and rupture geometry based on high‐rate GPS data. Journal of Geophysical Research: Solid Earth, 119(4), 3201-3231. Montalva, G. A., Bastías, N., & Rodriguez-Marek, A. (2017). Ground-motion prediction equation for the Chilean subduction zone. Bulletin of the Seismological Society of America, 107(2), 901-911. Montgomery‐Brown, E. K., Segall, P., & Miklius, A. (2009). Kilauea slow slip events: Identification, source inversions, and relation to seismicity. Journal of Geophysical Research: Solid Earth, 114(B6). 12 3 Morgan, J. K., Moore, G. F., Hills, D. J., & Leslie, S. (2000). Overthrusting and sediment accretion along Kilauea's mobile south flank, Hawaii: evidence for volcanic spreading from marine seismic reflection data. Geology, 28(7), 667-670. Mori, J., & Kanamori, H. (1996). Initial rupture of earthquakes in the 1995 Ridgecrest, California sequence. Geophysical Research Letters, 23(18), 2437-2440. Mousavi, S. M., & Beroza, G. C. (2020). A machine-learning approach for earthquake magnitude estimation. Geophysical Research Letters, 47(1), e2019GL085976. Mousavi, S. M., Ellsworth, W. L., Zhu, W., Chuang, L. Y., & Beroza, G. C. (2020). Earthquake transformer—an attentive deep-learning model for simultaneous earthquake detection and phase picking. Nature communications, 11(1), 1-12. Mousavi, S. M., & Beroza, G. C. (2022). Deep-learning seismology. Science, 377(6607), eabm4470. Münchmeyer, J., Bindi, D., Leser, U., & Tilmann, F. (2021). The transformer earthquake alerting model: A new versatile approach to earthquake early warning. Geophysical Journal International, 225(1), 646-656. Murray, J. R., Crowell, B. W., Grapenthin, R., Hodgkinson, K., Langbein, J. O., Melbourne, T., et al. (2018). Development of a geodetic component for the US West Coast earthquake early warning system. Seismological Research Letters, 89(6), 2322– 2336. Nadeau, R. M., & Dolenc, D. (2005). Nonvolcanic tremors deep beneath the San Andreas Fault. Science, 307(5708), 389-389. Nakamura, K. (1982). Why do long rift zones develop better in Hawaiian volcanoes: a possible role of thick oceanic sediments. Arquipélago. Série Ciências da Natureza, 3, 59- 73. Nakamura, Y. (1988). On the urgent earthquake detection and alarm system (UrEDAS). In Proc. of the 9th World Conference on Earthquake Engineering (Vol. 7, pp. 673-678). Obara, K. (2002). Nonvolcanic deep tremor associated with subduction in southwest Japan. Science, 296(5573), 1679-1681. Obara, K., & Hirose, H. (2006). Non-volcanic deep low-frequency tremors accompanying slow slips in the southwest Japan subduction zone. Tectonophysics, 417(1- 2), 33-51. Obara, K., & Kato, A. (2016). Connecting slow earthquakes to huge earthquakes. Science, 353(6296), 253-257. 12 4 Olson, E. L., & Allen, R. M. (2005). The deterministic nature of earthquake rupture. Nature, 438(7065), 212-215. Owen, S. E., & Bürgmann, R. (2006). An increment of volcano collapse: Kinematics of the 1975 Kalapana, Hawaii, earthquake. Journal of Volcanology and Geothermal Research, 150(1-3), 163-185. Owen, S., Segall, P., Freymueller, J., Mikijus, A., Denlinger, R., Árnadóttir, T., Sako, M. & Bürgmann, R. (1995). Rapid deformation of the south flank of Kilauea volcano, Hawaii. Science, 267(5202), 1328-1332. Park, J., Morgan, J. K., Zelt, C. A., Okubo, P. G., Peters, L., & Benesh, N. (2007). Comparative velocity structure of active Hawaiian volcanoes from 3-D onshore–offshore seismic tomography. Earth and Planetary Science Letters, 259(3-4), 500-516. Park, J., Song, T. R. A., Tromp, J., Okal, E., Stein, S., Roult, G., et al. (2005). Earth's free oscillations excited by the 26 December 2004 Sumatra-Andaman earthquake. Science, 308(5725), 1139-1144. Pasyanos, M. E., Masters, T. G., Laske, G., & Ma, Z. (2014), LITHO1. 0: An updated crust and lithospheric model of the Earth. Journal of Geophysical Research: Solid Earth, 119(3), 2153-2173. Peng, Z., & Gomberg, J. (2010). An integrated perspective of the continuum between earthquakes and slow-slip phenomena. Nature geoscience, 3(9), 599-607. Perol, T., Gharbi, M., & Denolle, M. (2018). Convolutional neural network for earthquake detection and location. Science Advances, 4(2), e1700578. Pitarka, A., Graves, R., Irikura, K., Miyakoshi, K., & Rodgers, A. (2020). Kinematic rupture modeling of ground motion from the M7 Kumamoto, Japan earthquake. Pure and Applied Geophysics, 177(5), 2199– 2221. Radiguet, M., Perfettini, H., Cotte, N., Gualandi, A., Valette, B., Kostoglodov, V., et al. (2016). Triggering of the 2014 Mw7. 3 Papanoa earthquake by a slow slip event in Guerrero, Mexico. Nature Geoscience, 9(11), 829-833. Ramos, M. D., & Huang, Y. (2019). How the transition region along the Cascadia megathrust influences coseismic behavior: Insights from 2‐D dynamic rupture simulations. Geophysical Research Letters, 46(4), 1973-1983. Riquelme, S., Medina, M., Bravo, F., Barrientos, S., Campos, J., & Cisternas, A. (2018). W-phase real-time implementation and network expansion from 2012 to 2017: The experience in Chile. Seismological Research Letters, 89(6), 2237–2248. 12 5 Rogers, G., & Dragert, H. (2003). Episodic tremor and slip on the Cascadia subduction zone: The chatter of silent slip. science, 300(5627), 1942-1943. Rolandone, F., Nocquet, J. M., Mothes, P. A., Jarrin, P., Vallée, M., Cubas, N., et al. (2018). Areas prone to slow slip events impede earthquake rupture propagation and promote afterslip. Science advances, 4(1), eaao6596. Romano, F., Piatanesi, A., Lorito, S., Tolomei, C., Atzori, S., & Murphy, S. (2016). Optimal time alignment of tide‐gauge tsunami waveforms in nonlinear inversions: Application to the 2015 Illapel (Chile) earthquake. Geophysical Research Letters, 43(21), 11-226. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention (pp. 234-241). Springer, Cham. Ross, Z. E., Meier, M. A., & Hauksson, E. (2018). P wave arrival picking and first‐ motion polarity determination with deep learning. Journal of Geophysical Research: Solid Earth, 123(6), 5120-5129. Ross, Z. E., Meier, M. A., Hauksson, E., & Heaton, T. H. (2018). Generalized seismic phase detection with deep learning. Bulletin of the Seismological Society of America, 108(5A), 2894– 2901. Royer, A. A., & Bostock, M. G. (2014). A comparative study of low frequency earthquake templates in northern Cascadia. Earth and Planetary Science Letters, 402, 247-256. Rubin, A. M. (2008). Episodic slow slip events and rate‐and‐state friction. Journal of Geophysical Research: Solid Earth, 113(B11). Ruhl, C. J., Melgar, D., Chung, A. I., Grapenthin, R., & Allen, R. M. (2019). Quantifying the value of real‐time geodetic constraints for earthquake early warning using a global seismic and geodetic data set. Journal of Geophysical Research: Solid Earth, 124(4), 3819-3837. Ruhl, C. J., Melgar, D., Geng, J., Goldberg, D. E., Crowell, B. W., Allen, R. M., et al. (2019). A global database of strong-motion displacement GNSS recordings and an example application to PGD scaling. Seismological Research Letters, 90(1), 271–279. Ruina, A. (1983). Slip instability and state variable friction laws. Journal of Geophysical Research: Solid Earth, 88(B12), 10359-10370. Ruiz, S., & Madariaga, R. (2018). Historical and recent large megathrust earthquakes in Chile. Tectonophysics, 733, 37–56. 12 6 Rydelek, P., & Horiuchi, S. (2006). Is earthquake rupture deterministic?. Nature, 442(7100), E5-E6. Saffer, D. M., & Wallace, L. M. (2015). The frictional, hydrologic, metamorphic and thermal habitat of shallow slow earthquakes. Nature Geoscience, 8(8), 594-600. Satake, K., Fujii, Y., Harada, T., & Namegaya, Y. (2013). Time and space distribution of coseismic slip of the 2011 Tohoku earthquake as inferred from tsunami waveform data. Bulletin of the seismological society of America, 103(2B), 1473-1492. Segall, P., Desmarais, E. K., Shelly, D., Miklius, A., & Cervelli, P. (2006). Earthquakes triggered by silent slip events on Kīlauea volcano, Hawaii. Nature, 442(7098), 71-74. Segall, P., Rubin, A. M., Bradley, A. M., & Rice, J. R. (2010). Dilatant strengthening as a mechanism for slow slip events. Journal of Geophysical Research: Solid Earth, 115(B12). Seno, T., & Yamasaki, T. (2003). Low‐frequency tremors, intraslab and interplate earthquakes in Southwest Japan—from a viewpoint of slab dehydration. Geophysical Research Letters, 30(22). Shelly, D. R., Beroza, G. C., & Ide, S. (2007). Non-volcanic tremor and low-frequency earthquake swarms. Nature, 446(7133), 305-307. Shelly, D. R., & Hardebeck, J. L. (2010). Precise tremor source locations and amplitude variations along the lower‐crustal central San Andreas Fault. Geophysical Research Letters, 37(14). Skarbek, R. M., Rempel, A. W., & Schmidt, D. A. (2012). Geologic heterogeneity can produce aseismic slip transients. Geophysical Research Letters, 39(21). Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 15(1), 1929-1958. Stein, S., & Okal, E. A. (2005). Speed and size of the Sumatra earthquake. Nature, 434(7033), 581-582. Thomas, A. M., Beroza, G. C., & Shelly, D. R. (2016). Constraints on the source parameters of low‐frequency earthquakes on the San Andreas Fault. Geophysical Research Letters, 43(4), 1464-1471. Thomas, A. M., Inbal, A., Searcy, J., Shelly, D. R., & Bürgmann, R. (2021). Identification of Low‐Frequency Earthquakes on the San Andreas Fault With Deep Learning. Geophysical Research Letters, 48(13), e2021GL093157. 12 7 Thurber, C. H., & Gripp, A. E. (1988). Flexure and seismicity beneath the south flank of Kilauea volcano and tectonic implications. Journal of Geophysical Research: Solid Earth, 93(B5), 4271-4278. Thurber, C. H., Li, Y., & Johnson, C. (1989). Seismic detection of a low‐velocity layer beneath the southeast flank of Mauna Loa, Hawaii. Geophysical Research Letters, 16(7), 649-652. Tsang, L. L., Meltzner, A. J., Philibosian, B., Hill, E. M., Freymueller, J. T., & Sieh, K. (2015). A 15 year slow‐slip event on the Sunda megathrust offshore Sumatra. Geophysical Research Letters, 42(16), 6630-6638. Vallée, M., & Douet, V. (2016). A new database of source time functions (STFs) extracted from the SCARDEC method. Physics of the Earth and Planetary Interiors, 257, 149-157. van den Ende, M. P., & Ampuero, J. P. (2020). Automated seismic source characterization using deep graph neural networks. Geophysical Research Letters, 47(17), e2020GL088690. Wallace, L. M., Webb, S. C., Ito, Y., Mochizuki, K., Hino, R., Henrys, S., Schwartz, S. Y. & Sheehan, A. F. (2016). Slow slip near the trench at the Hikurangi subduction zone, New Zealand. Science, 352(6286), 701-704. Wech, A. G. (2021). Cataloging tectonic tremor energy radiation in the Cascadia subduction zone. Journal of Geophysical Research: Solid Earth, 126(10), e2021JB022523. Wech, A. G., & Creager, K. C. (2008). Automated detection and location of Cascadia tremor. Geophysical Research Letters, 35(20). Wenzel, F., Erdik, M., Köhler, N., Zschau, J., Milkereit, C., Picozzi, M., et al. (2014). EDIM: Earthquake disaster information system for the Marmara region, Turkey. In Early warning for geological disasters (pp. 103-116). Springer, Berlin, Heidelberg. Williamson, A. L., Melgar, D., Crowell, B. W., Arcas, D., Melbourne, T. I., Wei, Y., & Kwong, K. (2020). Toward near‐field tsunami forecasting along the Cascadia subduction zone using rapid GNSS source models. Journal of Geophysical Research: Solid Earth, 125(8), e2020JB019636. Wu, Y. M., & Kanamori, H. (2005). Experiment on an onsite early warning method for the Taiwan early warning system. Bulletin of the Seismological Society of America, 95(1), 347- 353. Yamashita, Y., Yakiwara, H., Asano, Y., Shimizu, H., Uchida, K., Hirano, S., et al. (2015). Migrating tremor off southern Kyushu as evidence for slow slip of a shallow subduction interface. Science, 348(6235), 676-679. 12 8 Ye, L., Lay, T., Kanamori, H., & Rivera, L. (2016). Rupture characteristics of major and great (Mw ≥ 7.0) megathrust earthquakes from 1990 to 2015: 1. Source parameter scaling relationships. Journal of Geophysical Research: Solid Earth, 121(2), 826– 844. Zhang, H., Melgar, D., Sahakian, V., Searcy, J., & Lin, J. T. (2022). Learning source, path and site effects: CNN-based on-site intensity prediction for earthquake early warning. Geophysical Journal International, 231(3), 2186-2204. Zhu, W., & Beroza, G. C. (2019). PhaseNet: a deep-neural-network-based seismic arrival-time picking method. Geophysical Journal International, 216(1), 261-273. Zhu, L., & Rivera, L. A. (2002), A note on the dynamic and static displacements from a point source in multilayered media. Geophysical Journal International, 148(3), 619-627. Zollo, A., Iannaccone, G., Lancieri, M., Cantore, L., Convertito, V., Emolo, A., et al. (2009). Earthquake early warning system in southern Italy: Methodologies and performance evaluation. Geophysical research letters, 36(5). 12 9