PREDICTING PHENOTYPES IN SPARSELY SAMPLED GENOTYPE-PHENOTYPE MAPS by ZACHARY R. SAILER A DISSERTATION Presented to the Department of Chemistry and Biochemistry and the Graduate School of University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy September 2018 DISSERTATION APPROVAL PAGE Student: Zachary R. Sailer Title: Predicting Phenotypes in Sparsely Sampled Genotype-Phenotype Maps This dissertation has been accepted and approved in partial fulfillment of the require- ments for the Doctor of Philosophy in the Department of Chemistry and Biochemistry by: Jeffrey Cina Chair Marina Guenza Member John Conery Institutional Representative Michael Harms Advisor and Janet Woodruff-Borden Vice Provost and Dean of the Graduate School Original approval signatures are on file with the University of Oregon Graduate School Degree awarded September 2018 ii c© 2018 Zachary R. Sailer This work is licensed under a Create Commons Attribution (United States) License. iii DISSERTATION ABSTRACT Zachary R. Sailer Doctor of Philosophy Department of Chemistry and Biochemistry September 2018 Title: Predicting Phenotypes in Sparsely Sampled Genotype-Phenotype Maps Naturally evolving proteins must navigate a vast set of possible sequences to evolve new functions. This process depends on the genotype-phenotype map. Much effort has been directed at measuring protein genotype phenotype maps to uncover evolu- tionary trajectories that lead to new functions. Often, these maps are too large to comprehensively measure. Sparsely measured maps, however, are prone to missing key evolutionary trajectories. Many groups turn to computational models to infer missing phenotypes. These models treat mutations as independent perturbations to the genotype-phenotype map. A key question is how to handle non-independent effects known as epistasis. In this dissertation, we address two sources of epista- sis: 1) global and 2) local epistasis. We find that incorporating global epistasis improves our predictive power, while local epistasis does not. We use our model to infer unknown phenotypes in the Plasmodium falciparum chloroquine transporter (PfCRT) genotype-phenotype map, a protein responsible for conferring drug resis- tance in malaria. From these predictions, we uncover key evolutionary trajectories that led high resistance. This dissertation includes previously published and unpub- lished co-authored material. iv CURRICULUM VITAE NAME OF AUTHOR: Zachary R. Sailer GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED: University of Oregon, Eugene California Polytechnic State University, San Luis Obispo, CA DEGREES AWARD: Doctor of Philosophy, Chemistry and Biochemistry, 2018, University of Ore- gon Bachelor of Science, Physics, 2018, California Polytechnic State University GRANTS, AWARDS, AND HONORS: Travel Award, SciPy, 2017 Travel Award, JupyterCon, 2017 Travel Award, SciPy, 2018 Art Rosen Memorial Scholar, Cal Poly SLO Physics Department, 2012 PUBLICATIONS: Sailer, Zachary R. and Harms, Michael J., Uninterpretable interactions: epistasis as uncertainty, bioRxiv (2018). Sailer, Zachary R. and Harms, Michael J., Molecular ensembles make evo- lution unpredictable, PNAS 114, 45 (2017), pp. 11938--11943. Sailer, Zachary R. and Harms, Michael J., High-order epistasis shapes evolu- tionary trajectories, PLOS Computational Biology 13, 5 (2017), pp. e1005541. Sailer, Zachary R. and Harms, Michael J., Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps, Genetics 205, 3 (2017), pp. 1079- -1088. v TABLE OF CONTENTS Chapter Page I INTRODUCTION .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 The Genotype-Phenotype Map that Led to Malarial Drug Resistance . . . . 2 Computational Approach to Inferring Missing Phenotypes . . . . . . . . . . . . . . . . 6 Epistasis Impairs Prediction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Chapter-by-chapter Breakdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Broader Impacts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 II DETECTING HIGH-ORDER EPISTASIS IN NONLINEAR GENOTYPE- PHENOTYPE MAPS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Author Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 Results and Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 III HIGH-ORDER EPISTASIS SHAPES EVOLUTIONARY TRAJECTO- RIES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Author Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Author Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 vi Chapter Page Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 IV MOLECULAR ENSEMBLES MAKE EVOLUTION UNPREDICTABLE . 56 Author Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 V UNINTERPRETABLE INTERACTIONS: EPISTASIS AS UNCERTAINTY .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 Author Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 VI PREDICTING DRUG RESISTANCE IN MALARIA VIA THE PFCRT GENOTYPE-PHENOTYPE MAP .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 Author Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 vii Chapter Page Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 VII CONCLUSIONS AND FUTURE DIRECTIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 VIII APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 REFERENCES CITED .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 viii LIST OF FIGURES Figure Page 1 John Maynard Smith’s word game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2 Incomplete PfCRT genotype-phenotype map shows epistasis. . . . . . . . . . . . . 4 3 Epistasis can be quantified using Walsh polynomials . . . . . . . . . . . . . . . . . . . . . 18 4 Nonlinearity in phenotype creates spurious high-order epistatic coeffi- cients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 5 Epistasis and nonlinear scale induce different patterns of nonadditivity. 23 6 Experimental genotype-phenotype maps exhibit nonlinear phenotypes. . 25 7 High-order epistasis is present in genotype-phenotype maps . . . . . . . . . . . . . 27 8 Nonlinear phenotypes distort measured epistatic coefficients . . . . . . . . . . . . . 29 9 Contributions of epistasis to variation in fitness . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 10 Epistasis alters evolutionary trajectories through genotype-fitness maps 46 11 Changes in trajectories are not the result of experimental uncertainty . . 47 12 Epistasis alters trajectories in dataset V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 13 The magnitude of epistasis, not its order, predicts its effects on trajec- tories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 14 Epistasis complicates predicting trajectories from the ancestral genotype 51 15 Evolution is unpredictable in protein lattice models . . . . . . . . . . . . . . . . . . . . . . 64 16 Conformational ensembles induce epistasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 17 Ensemble-induced epistasis leads to unpredictibility . . . . . . . . . . . . . . . . . . . . . . 71 18 Addition of high-order epistasis does not lead to predictability . . . . . . . . . . 73 19 Linear epistatic coefficients cannot be estimated from an incomplete, simulated genotype-phenotype map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 20 Predictive epistatic coefficients cannot be resolved from experimental genotype-phenotype maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 21 Experimental maps resemble random maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 ix Figure Page 22 Epistasis as uncertainty. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 23 A predictive, additive model can be trained on a large genotype-phenotype map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 24 We have a sparse sample of phenotypes in the transition from low to high PfCRT CQ transportation activity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 25 We can train models on measured phenotypes to predict previously unmeasured phenotypes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 26 The additive+classifier+nonlinear model captures the most variation without over fitting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 27 Only a few pathways to high CQ-transport proteins are accessible. . . . . . 110 28 Experimental genotype-phenotype maps exhibit nonlinear phenotypes. . 121 29 Nonlinear phenotypes can be transformed to linear scale to estimate high-order epistasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 30 High-order epistasis is present in genotype-phenotype maps . . . . . . . . . . . . . 123 31 Nonlinear phenotypes distort measured epistatic coefficients . . . . . . . . . . . . . 124 32 Additive coefficients are well estimated, even when nonlinearity is ne- glected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 33 Exponential fitness model leads to global nonlinearity in the -lactamase data set (III) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 34 Flow chart for removing epistasis in a genotype-fitness map . . . . . . . . . . . . . 127 35 Schematic of our resampling protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 36 Epistasis alters trajectories in dataset I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 37 Epistasis alters trajectories in dataset II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 38 Epistasis alters trajectories in dataset III . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 39 Epistasis alters trajectories in dataset IV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 x Figure Page 40 Epistasis alters trajectories in dataset V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 41 Epistasis alters trajectories in dataset VI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 42 Unpredictability from the ensemble and neutral drift trade off with effective population size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 43 Epistasis in a simulated genotype-phenotype map.. . . . . . . . . . . . . . . . . . . . . . . . 138 44 Linear epistatic coefficients cannot be estimated from an incomplete, simulated genotype-phenotype map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 45 Predictive epistatic coefficients cannot be extracted from experimental genotype-phenotype maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 46 Epistasis in experimental maps looks like random epistasis.. . . . . . . . . . . . . . 141 xi LIST OF TABLES Table Page 1 Experimental genotype-phenotype maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2 Experimental genotype-fitness maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3 Mathematical formulation of multi-state ensembles in lattice proteins. . 69 4 Published experimental genotype-phenotype maps. . . . . . . . . . . . . . . . . . . . . . . . 86 xii CHAPTER I INTRODUCTION Over several billion years, proteins have evolved and diversified to perform various functions necessary for life. This is an incredible feat, since naturally evolving proteins must navigate a vast space of possible sequences to acquire new functions [1, 2]. While sequence space is large, it is also sparse–comprised mostly of nonfunctional or inviable sequences [3]. How do proteins navigate sequence space and evolve new functions? The genotype-phenotype map is a key determinant of protein evolution. John Maynard Smith illustrated protein genotype-phenotype maps using a famous word game. [4, 5]. He imagined protein sequences as English words. To evolve between two words, evolution can only proceed through adjacent meaningful words. In the evolution from WORD to COST, there are 16 possible trajectories (Figure 1A), but only one trajectory is accessible. The distribution of meaningful words strongly shapes evolution in this space. This is analogous to how Smith imagine proteins navigate sequence space. Evolutionary trajectories are strongly shaped by the distribution of phenotype in sequence space. Ogbunugafor et al. [5] extended the word game to show how proteins might evolve a quantitative phenotype by mapping each word’s frequency from various books pulled in Google Books Ngram viewer (Figure 1B). In this space, proteins accumulate mutations that incrementally improve the phenotype over evolutionary time. The word game shows that to understand how a protein evolved, it is necessary to study the genotype-phenotype map [6, 7, 2, 8, 9, 10]. The genotype-phenotype map insight to key evolutionary questions like: why is one evolutionary trajectory taken over another? What are the biological, chemical, and physical determinants of acces- 1 Figure 1: John Maynard Smith’s wordgame. A) Panel shows the full map of English words between WORD and COST. Red word represent meaningful words English words, black words represent gibberish words. The edges represented single letter substitutions between words. The red trajectory represents the only accessible trajectory. B) Panel shows the word game with quantitative phenotypes. The colors represent the frequency of each word in Google’s Ngram viewer. sible evolutionary trajectories? Is evolution constrained? Is evolution predictable? Measuring complete protein genotype-phenotype maps, however, is often intractable. The number of genotypes grows rapidly as the number of mutations increases. For example, in the word game (Figure 1), four mutations led to a binary genotype- phenotype with 16 genotypes. The same word game with 15 mutations (a modest number) leads to a map 32,768 genotypes. The numbers compound even quicker if we consider any amino acid at each site. For example, a protein with all 20 amino acids at 15 sites leads to a genotype-phenotype map with > 1019 genotypes. Even mod- ern high-throughput techniques cannot characterize maps with this many genotypes. Further, there are many cases in which high-throughput methods are not currently possible. As a result, even small genotype-phenotype maps can be experimentally challenging to characterize. The Genotype-Phenotype Map that Led to Malarial Drug Resistance A timely example is the chloroquine resistance transporter (PfCRT) protein that evolved to confer drug resistance in plasmodium falciparum [11]. Eight mutations were identified in PfCRT, that confers resistance to chloroquine (CQ). For many decades, chloroquine was the primary malaria treatment before these chloroquine-resistance 2 parasites naturally evolved and stunted its effectiveness. This was disastrous for world health [11]. Summers et al. constructed the genotype-phenotype map from the 8 mutations. This is shown in Figure 2A [11]. Many questions immediately arose: How did the eight mutations arise? What evolutionary trajectories were the most probable (i.e. in what order did the mutations occur)? Why did this system take many decades to evolve–a stark contrast from other evolutionary systems like influenza or HIV [10]. 3 Figure 2: Incomplete PfCRT genotype-phenotype map shows epistasis. A) Panel shows known and unknown phenotypes in PfCRT’s genotype-phenotype map. Nodes represent genotypes. Edges represent single substitutions. Node color rep- resents the measured CQ uptake. White nodes represent genotypes that were not measured. B) Panel shows one possible evolutionary trajectory in PfCRT. Labels show position and mutation that occurred only each edge in the trajectory. C) Panel shows the Pobs vs. Padd curve plot for the observed phenotypes. Points represent geno- types. Error bars represent uncertainty in the measured phenotypes (two standard deviations). Red line represents the 1:1 line. D) Panel shows the quantitative effects of each mutation. Black bars represent the effect of each mutation in the trajectory. White bars represent the effect of the each mutation in the derived genotype’s genetic background. Red star highlights mutation 356T which flips sign when introduced in the trajectory versus the derived background. PfCRT plays a key role in transporting chloroquine out of the cell’s digestive vacuole (DV). CQ is a diprotic base that diffuses into the DV[12]. Here, it de- protonates in the acidic environment and becomes trapped. It accumulates in high 4 concentration, prevents the cell from detoxifying, and causes cell death [13]. To counteract the high accumulation of CQ, the parasite hijacked PfCRT to transport CQ out of the digestive vacuole. The eight mutations that arose in nature malarial strains enabled PfCRT to transport de-protonated CQ molecules across the vacuole membrane and into the intercellular space. To understand how PfCRT evolved the ability to transport CQ, Summers et al. sought to measure portions of the genotype-phenotype map. They constructed dif- ferent combinations of the 8 mutations in PfCRT (shown in color in Figure 2). They expressed these mutants on the membrane of Xenopus laevis oocytes cells. They then left the cells in saturating concentrations of protonated CQ and measured the CQ uptake over time. This took Summers et al. a few years to measure 52 genotypes–only a fraction of the entire map. Using the information from the measured 52 genotypes, Summers et al. inferred a set of evolutionary trajectories that led to high-CQ transportation PfCRT [11]. One of these trajectories is shown in Figure 2B. Interestingly, they found that every trajectory required at least one neutral step–a substitution that led to no change in CQ uptake. They suggest that this may be the reason that CQ resistance took many decades to evolve. Though a few possible trajectories were proposed, it is unclear whether these are key trajectories without knowing the complete genotype-phenotype map. It is possible that important genotypes were not measured, and therefore, key trajectories were missed. Measuring new genotypes, however, takes many weeks. At this rate, it would take over a decade to measure the complete genotype-phenotype map. On the other hand, more efficient experimental methods (such as high-throughput methods) are not currently possible. 5 Computational Approach to Inferring Missing Phenotypes To alleviate the experimental challenges, this dissertation develops a computational approach to infer missing phenotypes from sparsely-sampled genotype-phenotype maps. In the PfCRT space, our model could shave off years of experimental work. We followed simple approach proposed by R.A. Fisher in 1918. He modeled each mutation as an independent perturbation to a quantitative phenotype [14, 15]. The phenotype of any genotype is simply the sum of its mutations. This is known as an additive model. We can estimate the effects of mutations in an additive model using ordinary linear regression. The regression has the form: Pobs = Padd +  (1) where Pobs is the observed phenotypes,  represents the regression residuals, Padd is the additive model. Padd has the form Padd = βref + β1x1 + β2x2 + ... (2) where βi represents the quantitative effect of mutation i, xi is 0 if the mutation is present and 1 otherwise, and βwt is a reference phenotype. When we apply an additive model to the PfCRT genotype-phenotype map, it yields a fit with R2 = 0.65. The error in the predictions can be seen in the correlation plot, Pobs versus Padd, shown in Figure 2C. A few immediate problems appear in the correlation plot of Pobs versus Padd that we will address throughout this dissertation. Epistasis Impairs Prediction The residual term in the additive model has a special name; it is called epistasis. Epistasis has many forms and arises from different sources [15, 16, 17, 18, 19, 20, 21, 22, 23]. Formally, epistasis is defined as a mutation having a different effect in two 6 different genetic backgrounds. This appears as deviations from the additive model. Epistasis plays an profound role in shaping evolutionary trajectories [2, 24, 25, 26, 16, 27, 17, 28, 29, 30, 10, 19, 31, 32, 33, 34, 20, 35, 23]. It can determine the specific order in which mutations accumulate [25, 19], stochastically open and close pathways [36, 37], and entrench mutations over evolutionary time, [38, 33], and lead to long range history contingency in a genotype-phenotype map [39, 36]. Epistasis is can be observed in PfCRT. In Figure 2D, the effect of each mutation in the trajectory is shown as black bars in Figure 2B. Summers et al. also measured the effect of these mutations in the derived genetic background, shown as white bars. We can see that each mutation has a different effect when introduced in the trajec- tory versus in the derived background. This observation is the result of epistasis. Interestingly, the +356T switches sign. It has a positive effect in the trajectory but a negative effect in the derived background. This is an extreme case known as sign epis- tasis. This non-additivity in the PfCRT genotype-phenotype map causes the additive model to yield poor results. This dissertation addresses epistasis when predicting phenotypes. The following chapters analyze epistasis in various genotype-phenotype maps and identifies key features that improve predictive power. Finally, we apply our improved model to the PfCRT genotype-phenotype map and predict unknown phenotypes. We then collaborate with the Martin lab (Australia National University), responsible for the original PfCRT data in Summers et. al [11], to experimentally characterize 25 new genotypes. We then compare these to our predictions. In the final chapter, we layout the limitations of this approach and propose alternative avenues for future improvement. 7 Chapter-by-chapter Breakdown Chapter II addresses the problem of epistasis by dissecting epistasis into two sources: global and local epistasis. We analyze various experimental genotype-phenotype maps collected from the literature. We identify global epistasis–or nonlinearity between the genotype and phenotype–in many of these maps. It explains a small fraction of the epistasis on these maps. We then analyze these maps for local epistasis. We find extensive high-order epistasis–or specific interactions between two, three, four, or even more mutations. We employ a rigorous statistical approach to confirm the observed epistasis is not due to experimental uncertainty. From this, we conclude that high-order epistasis is a ubiquitous feature of biology. The work in this chapter was published as a research article in the journal Genetics, and co-authored with Prof. Michael J. Harms. Chapter III then explores the role that high-order epistasis plays in shaping evolu- tionary trajectories through genotype-phenotype maps. We pull various experimental genotype-phenotype maps from the literature. We enumerate evolutionary trajecto- ries through each map and calculated their relative probabilities. We then com- putationally removed local epistatic interactions and observe their effects on these trajectories. In every map, we find that removing high-order epistasis drastically changed the trajectories we observed. From this, we conclude that high-order epista- sis plays a significant role in evolution, and therefore, cannot be ignored. The work in this chapter was published as a research article in the journal, PLoS Computational Biology , and co-authored with Prof. Michael J. Harms. Chapter IV addresses one possible source of high-order epistasis, molecular ensem- bles. We use a simple toy model of proteins, called protein lattice models, to explore how the protein’s conformational ensemble shapes it’s evolutionary trajectories. We first find that a two-state protein exhibits no high-order epistasis. On the other hand, 8 we find that a 3+ state protein exhibits extensive high-order epistasis. This suggests that high-order epistasis in lattice proteins arises from the conformational ensemble. We examine the ensemble over the course of an evolutionary trajectory and find the cause of high-order epistasis. Because mutations affect each conformation in a slightly different way, the effect of a mutation changes over the course of an evolutionary tra- jectory. This leads to extensive high-order epistasis in lattice proteins. We concluded that this phenomena is likely ubiquitous in molecular systems since conformational ensembles are a common feature in biological systems. The work in this chapter was published as a research article in the journal, Proceedings of the National Academy of Sciences, and co-authored with Prof. Michael J. Harms. Chapter V addresses a major problem when including local epistasis to predict missing phenotypes in sparsely-sampled genotype phenotype maps. Using both com- putational and experimental genotype-phenotype maps, we show that adding local epistasis yields poor predictions. The problem with fitting epistasis to sparse data is that it leads to biased estimates of epistatic coefficients. As a result, the predictions are biased. Fitting global epistasis, on the other hand, improves predictions. Thus, we conclude that the best approach to predicting phenotypes is to ignore local epista- sis. The work in this chapter is currently in preparation and co-authored with Prof. Michael J. Harms. In Chapter VI, we apply our model to infer missing phenotypes in the PfCRT genotype-phenotype map. We work in collaboration with Rowena Martin’s group at Australia National University. The Martin group measured 25 genotypes not previously characterized, while we generated predictions. We address a key feature in the observed phenotype. Two mutations are necessary for CQ transport. We model this feature by including a preprocessing step that classifies genotypes as no- CQ transport or positive-CQ transport. We then fit a nonlinear additive model to positive-CQ genotypes. We predict the 204 unknown phenotype and find that many of 9 the genotypes are nonfunctional. We then compare our predictions to the genotypes measured by the Martin group and find that our model accurately predicts with a known uncertainty. Finally, we map our predictions on the full PfCRT genotype- phenotype map and infer evolutionary trajectories that lead to high-CQ uptake. We find that many trajectories are inaccessible. The work in this chapter is currently in preparation and co-authored with Robert Summers, Sarah Shafik, Alex Joule, Prof. Rowena Martin and Prof. Michael J. Harms. Broader Impacts An immediate broader impact from this work is the prediction of unknown PfCRT genotypes. We were able to draw new conclusions about evolution of PfCRT. Specif- ically, we reveal many low-transport proteins in the genotype-phenotype map that make many evolutionary trajectories inaccessible. We also observe many neutral steps in the observed evolutionary trajectories. This could explain why CQ resistance took so many decades to evolve. It also informs future experimental work directed at understanding the evolution of drug resistance. An even broader impact is the generalization of our model. By the end of the dis- sertation, we present a general model for predicting phenotypes in sparsely-sampled genotype-phenotype maps. Together, with high-throughput experimental techniques, this model has the potential to uncover massive genotype-phenotype maps that were previously inaccessible. We also address the problem of epistasis and conclude that epistasis provides a simple metric of the uncertainty. This allows us to predict large genotype-phenotype maps from very few phenotypes with known uncertainty. Fur- ther, this dissertation informs the user how many phenotypes must be measured before reaching the maximum predictive power. All software used in this dissertation is free and open source. The prediction models can be accessed and downloaded on Github (https://github.com/harmslab). 10 All of the code is written in Python and built on top of core scientific python pack- ages. Each software package includes extensive documentation and examples on how to apply to various types of experimental data, Many examples are also written in Jupyter notebooks–reproducible electronic notebooks that include code, text, and fig- ures in a single document. This was our best effort to follow an open and reproducible approach to science. 11 CHAPTER II DETECTING HIGH-ORDER EPISTASIS IN NONLINEAR GENOTYPE-PHENOTYPE MAPS Author Contributions Zachary Sailer (ZRS) and Michael Harms (MJH) conceptualized the paper. ZRS conducted the simulations and computational analysis. ZRS generated figures. ZRS and MJH wrote and edited the manuscript. Abstract High-order epistasis has been observed in many genotype-phenotype maps. These multi-way interactions between mutations may be useful for dissecting complex traits and could have profound implications for evolution. Alternatively, they could be a statistical artifact. High-order epistasis models assume the effects of mutations should add, when they could in fact multiply or combine in some other nonlinear way. A mismatch in the “scale” of the epistasis model and the scale of the underlying map would lead to spurious epistasis. In this paper, we develop an approach to estimate the nonlinear scales of arbitrary genotype-phenotype maps. We can then linearize these maps and extract high-order epistasis. We investigated seven experimental genotype- phenotype maps for which high-order epistasis had been reported previously. We find that five of the seven maps exhibited nonlinear scales. Interestingly, even after accounting for nonlinearity, we found statistically significant high-order epistasis in all seven maps. The contributions of high-order epistasis to the total variation ranged from 2.2% to 31.0%, with an average across maps of 12.7%. Our results provide 12 strong evidence for extensive high-order epistasis, even after nonlinear scale is taken into account. Further, we describe a simple method to estimate and account for nonlinearity in genotype-phenotype maps. Introduction Recent analyses of genotype-phenotype maps have revealed “high-order” epistasis—that is, interactions between three, four, and even more mutations [40, 41, 32, 42, 43, 44, 45, 46, 27, 47, 48, 49, 50, 51]. The importance of these interactions for understand- ing biological systems and their evolution is the subject of current debate [44, 52]. Can they be interpreted as specific, biological interactions between loci? Or are they misleading statistical correlations? We set out to tackle one potential source of spurious epistasis: a mismatch between the “scale” of the map and the scale of the model used to dissect epistasis [14, 53, 54, 15, 16, 7]. The scale defines how to combine mutational effects. On a linear scale, the effects of individual mutations are added. On a multiplicative scale, the effects of mutations are multiplied. Other, arbitrarily complex scales, are also possible [55, 56, 57]. Application of a linear model to a nonlinear map will lead to apparent epistasis [14, 53, 54, 15, 16, 7]. Consider a map with independent, multiplicative mutations. Analysis with a multiplicative model will give no epistasis. In contrast, analysis with a linear model will give epistatic coefficients to account for the multiplicative nonlinearity [15, 16]. Epistasis arising from a mismatch in scale is mathematically valid, but obscures a key feature of the map: its scale. It is also not parsimonious, as it uses many coefficients to describe a potentially simple nonlinear function. Finally, it can be misleading because these epistatic coefficients partition global information about the nonlinear scale into (apparently) specific interactions between mutations. Most high-order epistasis models assume a linear scale (or a multiplicative scale 13 transformed onto a linear scale) [58, 7, 44, 52]. These models sum the indepen- dent effects of mutations to predict multi-mutation phenotypes. Epistatic coefficients account for the difference between the observed phenotypes and the phenotypes pre- dicted by summing mutational effects. The epistatic coefficients that result are, by construction, on the same linear scale [58, 44, 52]. Because the underlying scale of genotype-phenotype maps is not known a priori, the interpretation of high-order epistasis extracted on a linear scale is unclear. If a nonlinear scale can be found that removes high-order epistasis, it would suggest that high-order epistasis is spurious: a highly complex description of a simple, nonlinear system. In contrast, if no such scale can be found, high-order epistasis provides a window into the profound complexity of genotype-phenotype maps. In this paper, we set out to estimate the nonlinear scales of experimental genotype- phenotype maps. We then account for these scales in the analysis of high-order epistasis. We took our inspiration from the treatment of multiplicative maps, which can be transformed into linear maps using a log transform. Along these same lines, we set out to transform genotype-phenotype maps with arbitrary, nonlinear scales onto a linear scale for analysis of high-order epistasis. We develop our methodology using simulations and then apply it to experimentally measured genotype-phenotype maps. Materials and Methods Experimental data sets We collected a set of published genotype-phenotype maps for which high-order epista- sis had been reported previously. Measuring an Lth-order interaction requires knowing the phenotypes of all binary combinations of L mutations—that is, 2L genotypes. The data sets we used had exhaustively covered all 2L genotypes for five or six mutations. These data sets cover a broad spectrum of genotypes and phenotypes. Genotypes 14 included point mutations to a single protein [25], point mutations in both members of a protein/DNA complex [32], random genomic mutations [18, 6], and binary com- binations of alleles within a biosynthetic network [59]. Measured phenotypes included selection coefficients [25, 18, 6], molecular binding affinity [32], and yeast growth rate [59]. (For several data sets, the “phenotype” is a selection coefficient. We do not differentiate fitness from other properties for our analyses; therefore, for simplicity, we will refer to all maps as genotype-phenotype maps rather than specifying some as genotype-fitness maps). All data sets had a minimum of three independent measure- ments of the phenotype for each genotype. All data sets are available in a standardized ascii text format. Nonlinear scale We described nonlinearity in the genotype-phenotype map by a power transformation (see Results) [60, 61]. The independent variable for the transformation was ~P add, the predicted phenotypes of all genotypes assuming linear and additive affects for each mutation. The estimated additive phenotype of genotype i, is given by: Pˆadd,i = j≤L∑ j=1 〈∆Pj〉xi,j (3) where 〈∆Pj〉 is the average effect of mutation j across all backgrounds, xi,j is an index that encodes whether or not mutation j is present in genotype i, and L is the number of sites. The dependent variables are the observed phenotypes ~Pobs taken from the experimental genotype-phenotype maps. We use nonlinear least-squares regression to fit and estimate the power transfor- mation from ~Padd to ~Pobs : ~Pobs ∼ τ( ~ˆPadd; λˆ, Aˆ, Bˆ) + εˆ, 15 where ε is a residual and τ is a power transform function. This is given by: ~Pobs = ( ~ˆPadd + A) λ − 1 λ(GM)λ−1 +B, where A and B are translation constants, GM is the geometric mean of ( ~ˆPadd + A) , and λ is a scaling parameter. We used standard nonlinear regression techniques to minimize d: d = (~Pscale − ~Pobs)2 + ε. We then reversed this transformation to linearize Pobs using the estimated parameters Aˆ, Bˆ, and λˆ. We did so by the back-transform: Pobs,linear = {λˆ(GM)λ−1(Pobs − Bˆ) + 1}1/λˆ − Aˆ. (4) High-order epistasis model We dissected epistasis using a linear, high-order epistasis model. These have been discussed extensively elsewhere [58, 52, 44], so we will only briefly and informally review them here. A high-order epistasis model is a linear decomposition of a genotype-phenotype map. It yields a set of coefficients that account for all variation in phenotype. The signs and magnitudes of the epistatic coefficients quantify the effect of mutations and interactions between them. A binary map with 2L genotypes requires 2L epistatic coefficients and captures all interactions, up to Lth-order, between them. This is conveniently described in matrix notation. ~P = X~β : (5) a vector of phenotypes ~P can be transformed into a vector of epistatic coefficients 16 ~β using a 2L × 2L decomposition matrix that encodes which coefficients contribute to which phenotypes. If X is invertible, one can determine ~β from a collection of measured phenotypes by ~β = X−1 ~P . (6) X can be formulated in a variety of ways [52]. Following others in the genetics literature, we use the form derived from Walsh polynomials [58, 44, 52]. In this form, X is a Hadamard matrix. Conceptually, the transformation identifies the geometric center of the genotype-phenotype map and then measures the average effects of each mutation and combination of mutations in this “average” genetic background (Figure 3). To achieve this, we encoded each mutation at each site in each genotype as -1 (wildtype) or +1 (mutant) [58, 44, 52]. This has been called a Fourier analysis,[7, 62], global epistasis [52], or a Walsh space [58, 25]. Another common approach is to use a single wildtype genotype as a reference and encode mutations as either 0 (wildtype) or 1 (mutant) [52]. 17 Figure 3: Epistasis can be quantified using Walsh polynomials. A) A genotype-phenotype map exhibiting negative epistasis. Axes are genotype at po- sition 1 (g1), genotype at position 2 (g2), and phenotype (P ). For genotypic axes, “0” denotes wildtype and “1” denotes a mutant. Phenotype is encoded both on the P -axis and as a spectrum from white to blue. The map exhibits negative epistasis: relative to wildtype, the effect of the mutations together (P11 = 2) is less than the sum of the individual effects of mutations (P10 +P01 = 1+2 = 3). B) The map can be decomposed into epistatic coefficients using a Walsh polynomial, which measures the effects of each mutation relative to the geometric center of the genotype-phenotype map (green sphere). The additive coefficients β1 and β2 (red arrows) are the average effect of each mutation in all backgrounds. The epistatic coefficient β12 (orange ar- row) the variation not accounted for by β1 and β2. Geometrically, it is the distance between the center of the map and the “fold” given by vector connecting P00 and P11. One data set (IV, Table I) has four possible states (A, G, C and T) at two of the sites. We encoded these using the WYK tetrahedral-encoding scheme[63, 32]. Each state is encoded by a three-bit state. The wildtype state is given the bits (1, 1, 1). The remaining states are encoded with bits that form corners of a tetrahedron. For example, the wildtype of site 1 is G and encoded as the (1, 1, 1) state. The remaining states are encoded as follows: A is (1,−1,−1), C is (−1, 1,−1) and T is (−1,−1, 1). 18 ID genotype phenotype L ref I scattered genomic mutations E. coli fitness 5 [18] II chromosomes in asexual fungi A. niger fitness 5 [6] III protein point mutants bacterial fitness 5 [25] IV DNA/protein point mutants DNA/protein binding affinity 5 [32] V chromosomes in asexual fungi A. niger fitness 5 [6] VI alleles in biosynthetic network S. cerevisiae haploid growth rate 6 [59] VII alleles in biosynthetic network S. cerevisiae diploid growth rate 6 [59] Table 1: All data sets have 2L genotypes except the DNA/protein interaction data set (IV), which has 128 genotypes. This occurs because the data set has 2 DNA sites (each of which have 4 possible bases) and 3 protein sites (each of which has two possible amino acids). Experimental uncertainty We used a bootstrap approach to propagate uncertainty in measured phenotypes into uncertainty in epistatic coefficients. To do so we: 1) calculated the mean and standard deviation for each phenotype from the published experimental replicates; 2) sampled the uncertainty distribution for each phenotype to generate a pseudoreplicate vector ~Ppseudo that had one phenotype per genotype; 3) rescaled ~Ppseudo using a power- transform; and 4) determined the epistatic coefficients for ~Ppseudo,scaled. We then repeated steps 2-4 until convergence. We determined the mean and variance of each epistatic coefficient after every 50 pseudoreplicates. We defined convergence as the mean and variance of every epistatic coefficient changed by < 0.1 % after addition of 50 more pseudoreplicates. On average, convergence required ≈ 100, 000 replicates per genotype-phenotype map. Finally, we used a z-score to determine if each epistatic coefficient was significantly different than zero. To account for multiple testing, we applied a Bonferroni correction to all p-values [64]. Computational methods Our full epistasis software package—written in Python3 extended with Numpy and Scipy [65]—is available for download via github (https://harmslab.github.com/epistasis). 19 We used the python package scikit-learn for all regression [66]. Plots were generated using matplotlib and jupyter notebooks [67, 68]. Results and Discussion Nonlinear scale induces apparent high-order epistasis Our first goal was to understand how a nonlinear scale, if present, would affect es- timates of high-order epistasis. To probe this question, we constructed a five-site binary genotype-phenotype map on a nonlinear scale, and then extracted epistasis assuming a linear scale. The nonlinear scale we chose was a saturating function: Pg,trans = (1 +K)Pg 1 +KPg , (7) where Pg is the linear phenotype of genotype g, Pg,trans is the transformed phenotype of genotype g, and K is a scaling constant. As K → 0, the map becomes linear. As K increases, mutations have systematically smaller effects when introduced into backgrounds with higher phenotypes. We calculated Pg for all 2L binary genotypes using the random, additive coeffi- cients shown in Figure 4A. These coefficients included no epistasis. We then trans- formed Pg onto the nonlinear Pg,trans scale using Equation 7 with the relatively shallow (K = 2) saturation curve shown in Figure 4B. Finally, we applied a linear epistasis model to Pg,trans to extract epistatic coefficients. 20 Figure 4: Nonlinearity in phenotype creates spurious high-order epistatic coefficients. A)Simulated, random, first-order epistatic coefficients. The mutated site is indicated by panel below the bar graph; bar indicates magnitude and sign of the additive coefficient. B) A nonlinear map between a linear phenotype and a saturating, nonlinear phenotype. The first-order coefficients in panel A are used to generate a linear phenotype, which is then transformed by the function shown in B. C) Epistatic coefficients extracted from the genotype-phenotype map generated in panels A and B. Bars denote coefficient magnitude and sign. Color denotes the order of the coefficient: first (βi, red), second (βij, orange), third (βijk, green), fourth (βijkl, purple), and fifth (βijklm, blue). Filled squares in the grid below the bars indicate the identity of mutations that contribute to the coefficient. We found that nonlinearity in the genotype-phenotype map induced extensive high-order epistasis when the nonlinearity was ignored (Figure 4C). We observed epistasis up to the fourth order, despite building the map with purely additive co- efficients. This result is unsurprising: the only mechanism by which a linear model can account for variation in phenotype is through epistatic coefficients [53, 54, 15]. When given a nonlinear map, it partitions the variation arising from nonlinearity into specific interactions between mutations. This high-order epistasis is mathemati- cally valid, but does not capture the major feature of the map—namely, saturation. Indeed, this epistasis is deceptive, as it is naturally interpreted as specific interac- tions between mutations. For example, this analysis identifies a specific interaction between mutations one, two, four, and five (Figure 4C, purple). But this four-way interaction is an artifact of the nonlinearity in phenotype of the map, rather than a specific interaction. 21 Nonlinear scale and specific epistatic interactions induce different patterns of non-additivity Our next question was whether we could separate the effects of nonlinear scale and high-order epistasis in binary maps. One useful approach to develop intuition about epistasis is to plot the the observed phenotypes (Pobs) against the predicted phenotype of each genotype, assuming linear and additive mutational effects (Padd) [55, 56, 7]. In a linear map without epistasis, Pobs equals Padd, because each mutation would have the same, additive effect in all backgrounds. If epistasis is present, phenotypes will diverge from the Pobs = Padd line. We simulated maps including varying amounts of linear, high-order epistasis, placed them onto increasingly nonlinear scales, and then constructed Pobs vs. Padd plots. We added high-order epistasis by generating random epistatic coefficients and then calculating phenotypes using Eq. 5. We introduced nonlinearity by transforming these phenotypes with Eq. 7. For each genotype in these simulations, we calculated Padd as the sum of the first-order coefficients used in the generating model. Pobs is the observable phenotype, including both high-order epistasis and nonlinear scale. High-order epistasis and nonlinear scale had qualitatively different effects on Pobs vs. Padd plots. Figure 5A shows plots of Pobs vs. Padd for increasing nonlinearity (left-to-right) and high-order epistasis (bottom-to-top). As nonlinearity increases, Pobs curves systematically relative to Padd. This reflects the fact that Padd is on a linear scale and Pobs is on a saturating, nonlinear scale. The shape of the curve reflects the map between the linear and saturating scale: the smallest phenotypes are underestimated and the largest phenotypes overestimated. In contrast, high-order epistasis induces random scatter away from the Pobs = Padd line. This is because the epistatic coefficients used to generate the map are specific to each genotype, moving observations off the expected line, even if the scaling relationship is taken into account. 22 Figure 5: Epistasis and nonlinear scale induce different patterns of non- additivity. A) Patterns of nonadditivity for increasing epistasis and nonlinear scale. Main panel shows a grid ranging from no epistasis, linear scale (bottom-left) to high epistasis, highly nonlinear scale (top-right). Insets in sub-panels show added non- linearity. Going from left to right: K = 0, K = 2, K = 4. Epistatic coefficient plots to right show the magnitude of the input high-order epistasis, with colors and annotation as in Fig 2C. B) Plot of Pobs against Pˆadd for the middle sub panel in panel A. Red line is the fit of the power transform to these data. C) Correlation between epistatic coefficients input into the simulation and extracted from the simulation after linearization by the power transform. Each point is an epistatic coefficient, colored by order. The Pearson’s correlation coefficient is shown in the upper-left quadrant. D) Correlation between epistatic coefficients input into the simulation and extracted from the simulation without application of the power transform. Nonlinearity can be separated from underlying high-order epistasis The Pobs vs. Padd plots suggest an approach to disentangle high-order epistasis from nonlinear scale. By fitting a function to the Pobs vs Padd curve, we describe a transfor- 23 mation that relates the linear Padd scale to the (possibly nonlinear) Pobs scale [56, 7]. Once the form of the nonlinearity is known, we can then linearize the phenotypes so they are on an appropriate scale for epistatic analysis. Variation that remains (i.e. scatter) can then be confidently partitioned into epistatic coefficients. In the absence of knowledge about the source of the nonlinearity, a natural choice is a power transform [60, 61], which identifies a monotonic, continuous function through Pobs vs. Padd. A key feature of this approach is that power-transformed data are normally distributed around the fit curve and thus appropriately scaled for regression of a linear epistasis model. We tested this approach using one of our simulated data sets. One complication is that, for an experimental map, we do not know Padd. In the analysis above, we determined Padd from the additive coefficients used to generate the space. In a real map, Padd is not known; therefore, we had to estimate Padd. We did so by measuring the average effect of each mutation across all backgrounds, and then calculating Pˆadd for each genotype as the sum of these average effects (Eq. 3). We fit the power transform to Pobs vs. Pˆ add (solid red line, Figure 5B). The curve captures the nonlinearity added in the simulation. We linearized Pobs using the fit model (Eq. 4), and then extracted high-order epistatic coefficients. The ex- tracted coefficients were highly correlated with the coefficients used to generate the map (R2 = 0.998) (Figure 5C). In contrast, applying the linear epistasis model to this map without first accounting for nonlinearity gives much greater scatter between the input and output coefficients (R2 = 0.934) (Figure 5D). This occurs because phe- notypic variation from nonlinearity is incorrectly partitioned into the linear epistatic coefficients. 24 Nonlinearity is a common feature of genotype-phenotype maps Our next question was whether experimental maps exhibited nonlinear scales. We selected seven genotype-phenotype maps that had previously been reported to exhibit high-order epistasis (Table 1) and fit power transforms to each dataset (Figure 6, S1). We expected some phenotypes to be multiplicative (e.g. datasets I, II and IV were relative fitness), while we expected some to be additive (e.g. dataset IV is a free energy). Rather than rescaling the multiplicative datasets by taking logarithms of the phenotypes, we allowed our power transform to capture the appropriate scale. The power-transform identified nonlinearity in the majority of data sets. Of the seven data sets, three were less-than-additive (II, V, VI), two were greater-than-additive (III, IV), and two were approximately linear (I, VII). All data sets gave random residuals after fitting the power transform (Figure 6, S1). Figure 6: Experimental genotype-phenotype maps exhibit nonlinear phe- notypes. Plots show observed phenotype Pobs plotted against Pˆadd (Eq. 3) for data sets I through IV. Points are individual genotypes. Error bars are experimental stan- dard deviations in phenotype. Red lines are the fit of the power transform to the data set. Pearson’s coefficient for each fit are shown on each plot. Dashed lines are Padd = Pobs. Bottom panels in each plot show residuals between the observed phenotypes and the red fit line. Points are the individual residuals. Error bars are the experimental standard deviation of the phenotype. The horizontal histograms show the distribution of residuals across 10 bins. The red lines are the mean of the residuals. 25 High-order epistasis is a common feature of genotype-phenotype maps With estimated scales in hand, we linearized the maps using Eq. 4 and re-measured epistasis (Figure S2). We used bootstrap sampling of uncertainty in the measured phenotypes to determine the uncertainty of each epistatic coefficient (see Methods), and then integrated these distributions to determine whether each coefficient was significantly different than zero. We then applied a Bonferroni correction to each p-value to account for multiple testing. Despite our conservative statistical approach, we found high-order epistasis in ev- ery map studied (Figure 7A, S3). Every data set exhibited at least one statistically significant epistatic coefficient of fourth order or higher. We even detected statisti- cally significant fifth-order epistasis (blue bar in Figure 7A, data set II). High-order coefficients were both positive and negative, often with magnitudes equal to or greater than the second-order terms. These results reveal that high-order epistasis is a robust feature of these maps, even when nonlinearity and measurement uncertainty in the genotype-phenotype map is taken into account. 26 Figure 7: High-order epistasis is present in genotype-phenotype maps. A) Panels show epistatic coefficients extracted from data sets I-IV (Table 1, data set label circled above each graph). Bars denote coefficient magnitude and sign; error bars are propagated measurement uncertainty. Color denotes the order of the coefficient: first (βi, red), second (βij, orange), third (βijk, green), fourth (βijkl, purple), and fifth (βijklm, blue). Bars are colored if the coefficient is significantly different than zero (Z-score with p-value < 0.05 after Bonferroni correction for multiple testing). Stars denote relative significance: p < 0.05 (*), p < 0.01 (**), p < 0.001 (***). Filled squares in the grid below the bars indicate the identity of mutations that contribute to the coefficient. The names of the mutations, taken from the original publications, are indicated to the left of the grid squares. B) Sub-panels show fraction of variation accounted for by first through fifth order epistatic coefficients for data sets I-IV (colors as in panel A). Fraction described by each order is proportional to area. We also dissected the relative contributions of each epistatic order to the remaining variation. To do so, we created truncated epistasis models: an additive model, a model containing additive and pairwise terms, a model containing additive through third- order terms, etc. We then measured how well each model accounted for variation in the phenotype using a Pearson’s coefficient between the fit and the data. Finally, we asked how much the Pearson coefficient changed with addition of more epistatic coefficients. For example, to measure the contribution of pairwise epistasis, we took the difference in the correlation coefficient between the additive plus pairwise model 27 and the purely additive model. The contribution of epistasis to the maps was highly variable (Figure 7B, S3). For data set I, epistatic terms explained 5.9% of the variation in the data. The contributions of epistatic coefficients decayed with increasing order, with fifth-order epistasis only explaining 0.1% of the variation in the data. In contrast, for data set II, epistasis explains 43.3% of the variation in the map. Fifth-order epistasis accounts for 6.3% of the variation in the map. The other data sets had epistatic contributions somewhere between these extremes. Accounting for nonlinear genotype-phenotype maps alters epistatic coefficients Finally, we probed to what extent accounting for nonlinearity in phenotype altered the epistatic coefficients extracted from each space. Figure 8 and S4 show correlation plots between epistatic coefficients extracted both with and without linearization. The first-order coefficients were all highly correlated between the linear and nonlinear analyses for all data sets (Figure S5). 28 Figure 8: Nonlinear phenotypes distort measured epistatic coefficients. Sub-panels show correlation plots between epistatic coefficients extracted without accounting for nonlinearity (x-axis) and accounting for linearity (y-axis) for data sets I-IV. Each point is an epistatic coefficient, colored by order. Error bars are standard deviations from bootstrap replicates of each fitting approach. For the epistatic coefficients, the degree of correlation depended on the degree of nonlinearity in the dataset. Data set I—which was essentially linear—had identi- cal epistatic coefficients whether the nonlinear scale was taken into account or not. In contrast, the other data sets exhibited scatter off of the line. Data set III was particularly noteworthy. The epistatic coefficients were systematically overestimated when the nonlinear scale was ignored. Two large and favorable pairwise epistatic terms in the linear analysis became essentially zero when nonlinearity was taken into account. These interactions—M182T/g4205a and G283S/g4205a—were both noted as determinants of evolutionary trajectories in the original publication [25]; however, our results suggest the interaction is an artifact of applying a linear model to a non- linear data set. Further ≈ 20% (six of 27) epistatic coefficients flipped sign when nonlinearity was taken into account (Figure 8, III, bottom right quadrant). 29 Overall, we found that low-order epistatic coefficients were more robust to the linear assumption than high-order coefficients. Data set IV is a clear example of this behavior. The map exhibited noticeable nonlinearity (Figure 6). The first- and second-order terms were well correlated between the linear and nonlinear analyses (Figure 8, S4, S5). Higher-order terms, however, exhibited much poorer overall cor- relation. While the R2 for second-order coefficients was 0.95, the correlation was only 0.43 for third-order. This suggests that previous analyses of nonlinear genotype- phenotype maps correctly identified the key mutations responsible for variation in the map, but incorrectly estimated the high-order epistatic effects. Discussion Our results reveal that both nonlinear scales and high-order epistasis play important roles in shaping experimental genotype-phenotype maps. Five of the seven data sets we investigated exhibited nonlinear scales, and all of the data sets exhibited high- order epistasis, even after accounting for nonlinearity. This suggests that both should be taken into account in analyses of genotype-phenotype maps. Origins of nonlinear scales We observed two basic forms of nonlinearity in these maps: saturating, less-than- additive maps and exploding, greater-than-additive maps. Many have observed less- than-additive maps in which mutations have lower effects when introduced into more optimal backgrounds [69, 17]. Such saturation has been proposed to be a key factor shaping evolutionary trajectories [69, 17, 19, 70, 71]. Further, it is intuitive that optimizing a phenotype becomes more difficult as that phenotype improves. Our nonlinear fits revealed this behavior in three different maps. The greater-than-additive maps, in contrast, were more surprising: why would mutations have a larger effect when introduced into a more favorable background? 30 For the β-lactamase genotype-phenotype map (III, Figure 6), this may be an artifact of the original analysis used to generate the data set [25]. This data set describes the fitness of bacteria expressing variants of an enzyme with activity against β-lactam antibiotics. The original authors measured the minimum-inhibitory concentration (MIC) of the antibiotic against bacteria expressing each enzyme variant. They then converted their MIC values into apparent fitness by sampling from an exponential distribution of fitness values and assigning these fitness values to rank-ordered MIC values [25]. Our epistasis model extracts this original exponential distribution (Fig- ure S6). This result demonstrates the effectiveness of our approach in extracting nonlinearity in the genotype-phenotype map. The origins of the growth in the transcription factor/DNA binding data set are less clear (IV, Figure 6). The data set measures the binding free energy of variants of a transcription factor binding to different DNA response elements. We are aware of no physical reason for mutations to have a larger effect on free energy when introduced into a background with better binding. One possibility is that the genotype-phenotype map reflects multiple features that are simultaneously altered by mutations, giving rise to this nonlinear shape. This is a distinct possibility in this data set, where mutations are known to alter both DNA binding affinity and DNA binding cooperativity [72]. Best Practice Because nonlinearity is a common feature of these maps, linearity should not be assumed in analyses of epistasis. Given a sufficient number of phenotypic obser- vations, however, the appropriate scale can be estimated by construction of a Pobs vs. Padd plot and regression of a nonlinear scale model. With this scale in hand, one can then transform the genotype-phenotype map onto a linear scale appropriate for analysis using a high-order epistasis model. Our software pipeline automates this process. It takes any genotype-phenotype map in a standard text format, fits for non- 31 linearity, and then estimates high-order epistasis. It is freely available for download (https://harmslab.github.com/epistasis). One important question is how to select an appropriate function to describe the nonlinear scale. By visual inspection, all of the data sets we studied were monotonic in Pˆadd and could be readily captured by a power transform. Other maps may be better captured with other functions. For example, inspection of a Pobs vs Pˆadd plot could reveal a non-monotonic scale, leading to a better fit with a polynomial than a power-transform. Another possibility is that external biological knowledge motivates scale choice [56]. The choice of model determines what fraction of the variation is assigned to “scale” versus “epistasis.” The more complicated the function chosen, the more variation in the data is shifted from epistasis and into scale. One could, for example, fit a completely uninformative Lth-order polynomial, which would capture all of variation as scale and none as epistasis. Scale estimation should be governed by the well- established principles of model regression: find the simplest function that captures the maximum amount of variation in the data set without fitting stochastic noise. Because epistasis is scatter off the scale line (noise), model-selection approaches like the F-test, Akaike Information Criterion, and inspection of fit residuals are a natural strategy for partitioning variation between scale and epistasis. Interpretation Another powerful aspect of this approach is that it allows explicit separation of two distinct origins of non-additivity in genotype-phenotype maps. This can be illustrated with a simple, conceptual, example. Imagine mutations to an enzyme, expressed in bacteria, that have a less-than-additive effect on bacterial growth rate. To a first approximation, this epistasis could have two origins. The first is at the level of the enzyme: maybe the mutations have a specific, negative chemical 32 interactions that alter enzyme rate. The second is at the level of the whole cell: maybe, above a certain activity, the enzyme is fast enough that some other part of the cell starts limiting growth. Mutations continue to improve enzyme activity, but growth rate does not reflect this. These two origins of less-than-additive behavior will have different effects in a Padd vs. Pobs plot: saturation of growth rate will appear as nonlinearity, interactions between mutations at the enzyme level will appear as linear epistasis. Our analysis would reveal this pattern and set up further experiments to tease apart these possibilities. This may also provide important evolutionary insights. One important question is to what extent evolutionary paths are shaped by global constraints versus specific interactions that lead to specific historical contingencies [36, 33, 19]. For example, recent work has shown specific epistatic interactions lead to sequence-level unpre- dictability, while a globally less-than-additive scale leads to predictable phenotypes in evolution [19]. Our analysis approach naturally distinguishes these origins of non- additivity, and thus these evolutionary possibilities. Prevailing magnitude epistasis [6], global epistasis [19], and diminishing-returns epistasis [17, 69, 71, 70] will all appear as nonlinear scales. In contrast, specific interactions will appear in specific coefficients in the linear epistasis model. Our detection of nonlinearity and high-order epistasis in most datasets suggests that both forms of non-additivity will be in play over evolutionary time. High-order epistasis Finally, our work reveals that high-order epistasis is, indeed, a common feature of genotype-phenotype maps. Our study could be viewed as an attempt to “explain away” previously observed high-order epistasis. To do so, we both accounted for nonlinearity in the map and propagated experimental uncertainty to the epistatic coefficients. Surprisingly—to the authors, at least—high-order epistasis was robust 33 to these corrections. High-order epistasis can make huge contributions to genotype-phenotype maps. In data set II, third-order and higher epistasis accounts for fully 31.0% of the variation in the map. The average contribution, across maps, is 12.7%. We also do not see a consistent decay in the contribution of epistasis with increasing order. In data sets II, V and VI, third-order epistasis contributes more variation to the map than second- order epistasis. This suggests that epistasis could go to even higher orders in larger genotype-phenotype maps. The generality of these results across all genotype-phenotype maps is unclear. The maps we analyzed were measured and published because they were “interesting,” either from a mechanistic or evolutionary perspective. Further, most of the maps have a single, maximum phenotype peak. The nonlinearity and high-order epistasis we observed may be common for collections of mutations that, together, optimize a function, but less common in “flatter” or more random genotype-phenotype maps. This can only be determined by characterization of genotype-phenotype maps with different structural features. The observation of this epistasis also raises important questions: What are the origins of third, fourth, and even fifth-order correlations in these data sets? What, mechanistically, leads to a five-way interaction between mutations? Does neglecting high-order epistasis bias estimates of low-order epistasis [73]? What can this epistasis tell us about the biological underpinning of these maps [74, 75, 43, 76]? The evolutionary implications are also potentially fascinating. Epistasis creates temporal dependency between mutations: the effect of a mutation depends strongly on specific mutations that fixed earlier in time [77, 78, 36, 33] How does this play out for high-order epistasis, which introduces long-range correlations across genotype- phenotype maps? Do these low magnitude interactions matter for evolutionary out- comes or dynamics? These, and questions like them, are challenging and fascinating 34 future avenues for further research. 35 CHAPTER III HIGH-ORDER EPISTASIS SHAPES EVOLUTIONARY TRAJECTORIES Author Contributions Zachary Sailer (ZRS) and Michael Harms (MJH) conceptualized the paper. ZRS conducted the simulations and computational analysis. ZRS generated figures. ZRS and MJH wrote and edited the manuscript. Abstract An important question in evolutionary biology is the extent to which past events shape later evolution. High-order epistasis—where the effect of a mutation is de- termined by interactions with two or more other mutations—provides insight into this link between past and present. If high-order epistasis determines evolutionary trajectories, it reveals that the effect of a mutation depends on multiple, past substi- tutions: the past can strongly shape the present. High-order epistasis makes small, but detectable, contributions to genotype-fitness maps, but its evolutionary conse- quences remain poorly understood. To determine the effect of high-order epistasis on evolutionary trajectories, we computationally removed high-order epistasis from ex- perimental genotype-fitness maps and then compared trajectories through maps both with and without high-order epistasis. Here we show that high-order epistasis, despite its small magnitude, strongly shapes the accessibility and probability of evolutionary trajectories. This suggests that evolutionary outcomes are profoundly dependent on multiple, past events. 36 Author Summary A key goal for evolutionary biologists is understanding why one evolutionary trajec- tory is taken rather than others. This requires understanding how individual muta- tions, as well as interactions between them, determine the availability of evolution- ary pathways. We used a robust statistical analysis to reveal interactions between up to five mutations in published datasets, meaning that the effect of a mutation can depend on the presence or absence of four other mutations. Simulations reveal that these interactions strongly shape evolutionary trajectories, and that mutations which occur early in a trajectory continue to exert profound effects on later evolu- tion. These multi-way interactions are ubiquitous in biological datasets, suggesting that past events strongly determine future outcomes. Introduction Epistasis creates historical contingency, as it means that the effect of a mutation depends on previous substitutions [25, 16, 36, 19, 9, 37]. Interactions between pairs of mutations can cause mutations to accumulate in a specific order [25, 19], stochastically open and close pathways [36, 37], and make evolution irreversible [38, 33]. The effects of high-order epistasis—interactions between three or more muta- tions—on evolution are less well understood. Statistically-significant high-order epis- tasis has been observed in multiple genotype-phenotype maps [40, 44, 42, 52, 32, 41, 22, 37, 79], even when steps are taken to minimize its contribution to epistasis mod- els [22]. Its magnitude is generally lower than the individual and pairwise epistatic effects of mutations [22]. Several studies have suggested that it can alter evolutionary outcomes [44, 37, 79], but its overall importance for evolution is not well understood. Does high-order epistasis alter evolutionary outcomes? Or are trajectories primarily shaped by the additive and pairwise epistatic effects of mutations? 37 We set out to assess the effect of high-order epistasis on evolutionary trajecto- ries through experimentally measured genotype-fitness maps. We decomposed these maps into contributions from nonlinear scale, additive effects, and epistasis at differ- ent orders ranging from second to fifth. We then calculated “truncated” maps with different orders of epistasis deleted. By comparing the fitness values and probabilities of individual evolutionary trajectories through the truncated maps, we can reveal the extent to which high-order epistasis determines evolutionary outcomes. Materials and Methods Removal of high-order epistasis We used the following protocol to remove specific orders of epistasis from genotype- fitness maps. The steps correspond directly to the pipeline shown in Fig S1, which is described in detail in Sailer et al. [22]. 1. We identified an appropriate, possibly nonlinear, scale for the map by fitting a power transformation to the genotype-fitness map: ~Fexperimental = ( ~ˆFadd + A) λ − 1 λ(GM)λ−1 +B, (8) where ~Fexperimental is the vector of the observed fitness values, ~ˆFadd is the fitness of each genotype assuming each mutation has the same, average effect in all backgrounds, A and B are translation constants, GM is the geometric mean of ( ~ˆFadd + A) , and λ is a scaling parameter. ~ˆFadd is given by: Fadd,i = j≤L∑ j=1 〈∆Fj〉xi,j where 〈∆Fj〉 is the average effect of mutation j across all backgrounds, xi,j is an index that encodes whether or not mutation j is present in genotype i, and 38 L is the number of sites. We first regressed Fˆadd, and then regressed the power transform. 2. We linearized each map by transforming each element in ~Fexperimental with the nonlinear scale and coefficients determined in step 1. For each element in ~Fexperimental, we performed: Flinear = {λˆ(GM)λ−1(Fexperimental − Bˆ) + 1}1/λˆ − Aˆ. 3. We decomposed the variation in fitness into epistatic coefficients using a linear decomposition of the form: ~β = X−1 ~Flinear, where ~β is a collection of epistatic coefficients (ranging from 0th to Lth order) and X is a design matrix that indicates which coefficients contribute to fitness in which genotype. For most of the work described, we used a Hadamard matrix for X, which uses the geometric center of the genotype-fitness map as a reference state. [58, 44, 52, 22]. To construct this matrix, we encoded each mutation within each genotype as -1 (wildtype) or +1 (mutant) [52, 22]. For the final section, we use a “local” matrix for X, which measures the effect of each mutation relative to a defined reference phenotype. To construct this matrix, we encoded each mutation within each genotype as 0 (wildtype) or 1 (mutant). These to forms of X can be readily inter-converted [52]. 4. We truncated epistasis from the linearized map by setting the epistatic coeffi- cients from orders of interest to 0, creating ~βtrunc. 5. We recalculated the linearized fitness values, with truncated epistasis by: ~Flinear,trunc = X~βtrunc. 39 6. We transformed the ~Flinear,trunc onto the original, nonlinear scale using Eq. 8, with ~Flinear,trunc in place of ~Fadd. 7. We used the final ~Ftrunc values to construct a genotype-fitness map in which orders of epistasis were selectively removed, leaving the global, nonlinear scale intact. We quantified the contribution of epistasis to each map (φ) by determining the differ- ence in the variation explained by the ith and (i−1)th orders. φ = ρ2i −ρ2i−1, where ρ2x is the squared Pearson coefficient between linear fitness values in a model truncated to order x (~Flinear,trunc−to−x) and linear fitness values determined from the original map (~Flinear). Evolutionary Trajectories We calculated the probability of a given evolutionary trajectory as series of inde- pendent, sequential fixation events. We assumed that the time to fixation for each mutation was much less than the time between mutations (the so-called strong selec- tion/weak mutation regime) [80, 25, 18]. The relative probability of an evolutionary trajectory i is the product of its required fixation events relative to all possible tra- jectories: pi = ∏ x∈Si pix→x+1∑ j∈T ∏ x∈Sj pix→x+1 , where pix→x+1 is the fixation probability for genotype x+ 1 in the x background, Si is the set of steps that compose trajectory i, and T is the set of all forward trajectories. The model assumes the mutation rate is the same for all sites, and that population size and mutation rates are fixed over the evolutionary trajectory [81, 82, 18, 25, 83]. We calculated pix→x+1 for each step using the Gillespie model [84] pix→x+1 = 1− e−sx→x+1 1− e−Nsx→x+1 = 1− e−(1−wx+1/wx) 1− e−N(1−wx+1/wx) , 40 where N is population size, s is the selection coefficient and wx and wx+1 are the relative fitnesses of the x and x+ 1 genotypes visited over the trajectory. To determine the difference between sets of trajectories in maps with and without high-order epistasis, we measured the magnitude of the difference in probability for all L! forward trajectories through each space. We did so by: θ = ∑i=L! i=1 ∣∣∣pexperimentali − ptrunci ∣∣∣ 2 , where pexperimentali is the probability of the ith trajectory within the experimental map and ptrunci is the probability of that same trajectory in a truncated map, with high-order epistasis removed. Software We implemented the epistasis and trajectory models using Python 3 extended with the numpy and scipy packages [65]. We used the python package scikit-learn to perform linear regression with truncated forms of these models [66]. Plots were generated using matplotlib and jupyter notebooks [67, 68]. Our full software package is available in the epistasis package via github (https://harmslab.github.com/epistasis). Results High-order epistasis is common in all maps Our first goal was to determine the contributions of each order of epistasis to fitness in six experimentally measured genotype-fitness maps (Table 2). Each map consisted of all possible combinations of 5 mutations (25 = 32 genotypes) in a haploid genome. The mutations in datasets I and IV arose during adaptive, experimental evolution of E. coli, and occur throughout the genome [18, 85]. The mutations in datasets II and VI each occur in single genes that confer drug resistance in E. coli and HIV, 41 respectively [25, 27]. The mutations in datasets III and V were introduced randomly into the A. niger genome [6]. Previous workers characterized components of fitness for each genotype under defined experimental conditions. For four of the datasets (I, III, IV, and V), the authors measured relative fitness using competition assays. In dataset II, the authors measured minimum inhibitory concentration in the presence of an antibiotic, and from this estimated relative fitness [25]. In dataset VI, the authors measured HIV infectivity in an ex vivo assay, then treated this activity as a proxy for fitness [27]. All datasets exhibited a single fitness peak. ID genotype organism reference I scattered genomic mutations E. coli [18] II β-lactamase enzyme point mutations E. coli [25] III chromosomes combinations A. niger [6] IV scattered genomic mutations E. coli [85] V chromosomes combinations A. niger [6] VI envelope glycoprotein point mutations HIV-1 [27] Table 2: Experimental genotype-fitness maps We previously analyzed four of these datasets, finding small magnitude, but statistically-significant, high-order epistasis in each map [22]. We used this same approach to characterize epistasis in the remaining two maps (Figure S1, Materials & Methods). We sought to account for confounding effects that could lead to spurious epistasis, which would, in turn, lead to spurious effects on evolutionary trajectories. The most important confounding effect is the scale of the map. Models of high-order epistasis sum the effects of mutations and then account for deviation from this ex- pectation by epistasis [58, 52]. But there is no a priori reason to assume mutational effects should add: they may multiply or combine on some other nonlinear scale [15, 9, 52, 22]. To account for this, we empirically determined a nonlinear scale for each map using a power-transform, and then used this to linearize each map [22]. We then decomposed the linearized maps into epistatic coefficients using Walsh polynomials [58, 44, 52]. This approach uses the geometric center of the genotype- 42 fitness map as reference state and reveals global correlations in the effects of mutations across the map. Each order of epistasis accounts for variation that is not explained by the sum of all lower-order contributions. For example, third-order coefficients account for any “leftover” variation in the fitness of triple mutants after the first- order (additive) and second-order (pairwise) effects of those mutations are taken into account. We determined the contribution of each order of epistasis to the total variation in fitness for each dataset by sequentially setting fifth-, fourth-, third-, and second- order epistatic coefficients to zero. We recalculated the fitness of each genotype using each “truncated” model. This is directly analogous to decomposing a sound wave into a sum of frequencies using a Fourier transform [52]. After decomposition, the original sound wave can be approximated by a sum of principal frequencies, followed by a reverse Fourier transform. By selectively including frequencies, one can identify those that contribute most to the final sound wave. Our analysis follows the same logic, approximating fitness (the sound wave) using a collection of epistatic coefficients (sound frequencies). We quantified the contribution of each epistatic order by measuring the change in fitness when the ith order of epistasis was included in the model. As a metric, we used φ = ρ2i − ρ2i−1, where ρ2x is the squared Pearson’s coefficient between the measured fitness of each genotype and its fitness calculated for a model truncated to the xth order. φ ranges from -1 to 1. Figure 9A shows this calculation for dataset I. As epistatic orders are added, ρ2 between the truncated model and measured fitness values improves. This allows determination of φ for each order: first-order coefficients (additive effects) account for 94.0% of variation in fitness; second-order (pairwise epis- tasis) for 3.8%; third for 1.2%, fourth for 0.9%, and fifth for 0.1%. 43 Figure 9: Contributions of epistasis to variation in fitness. Panel A: Cor- relation between observed (linearized) fitness and fitness calculated for truncated epistasis models for dataset I. Each point on the plot is a single genotype-fitness pair; the dashed line is a 1:1 line. Colors correspond to the truncation order: to first (red), second (orange), third (green), fourth (blue), and fifth (purple). ρ2 and φ for each order are shown on the plot, colored by order. Panel B summarizes the contributions of each order of epistasis to the variation in fitness for all six datasets. Dataset is indicated with roman numeral. Colors follow panel A. We then applied this analysis to all six datasets. Figure 9B summarizes these results. The total contribution of epistasis to variation in fitness ranged from 6.0% (dataset I) to 32.2% (dataset VI). Other datasets exhibited intermediate levels of epistasis, comparable in magnitude to high-order epistasis observed in similar datasets [22, 44, 79]. In all datasets, the first-order (additive) effects of mutations made the largest contribution to variation in fitness. Outside of this, there was no simple pattern in the relative contributions of the different orders. In dataset I, II and IV, the contribution of epistasis to variation decayed with increasing order. In dataset V, epistasis does not decay. In dataset VI, the addition third-order epistasis (without fourth-order epistasis) actually does a worse job of predicting fitness than second-order alone. The quantitative and qualitative differences in the contribution of epistasis across datasets allow us to study how altering epistasis alters evolutionary trajectories. 44 Epistasis alters evolutionary trajectories Our next question was how each order of epistasis altered evolutionary trajectories. We first back-transformed our truncated, linearized maps onto the original scale. This creates a genotype-fitness map without specific epistatic interactions, but on the original, possibly nonlinear, scale of the map. We calculated the relative probabilities of all L! forward trajectories through these maps, starting from the ancestral state and ending at the derived state [25, 6, 18]. Because the maps describe fitnesses of asexual organisms with large population sizes, we modeled trajectories as a series of sequential fixation events captured by a Gillespie model for haploid organisms with large population size (Materials & Methods). [80, 25, 18]. In this scheme, the probability of a trajectory is the product of the probabilities of its individual fixation events, normalized across all trajectories. We visualized these trajectories by overlaying them on the genotype-fitness map weighted by their relative probabilities. Higher probability mutations have thicker lines connecting them. Figure 10 shows this analysis for dataset I. We started with a purely additive map (top left). All trajectories are accessible with similar probabil- ities because, in this map, all mutations are individually favorable. We then added successive orders of epistasis and recalculated trajectories through each new map. The addition of second-order epistasis altered the availabilities of trajectories. The changes are most readily evident in the lower row in Figure 10, which shows the change in the probability of each edge and node in the map. The left side of the map is red (indicating loss of probability), while the right side of the map is blue (indi- cating gain of probability). Addition of each new order, moving left to right across Figure 10, alters the probability of trajectories through the map. 45 Figure 10: Epistasis alters evolutionary trajectories through genotype- fitness maps. Figure show the effects of increasing orders of epistasis on evolutionary trajectories for dataset I. Top Row: Genotype-fitness maps with increasing amounts of epistasis included, increasing from none (far left) to fifth-order (far right). Networks show all 25 genotypes, arranged from ancestral (top) to derived (bottom), colored by relative fitness from low (purple) to high (yellow). Edges show the probability of a given mutation in a given background from low (thin) to high (heavy). Mutations with no probability have no edge. The numbers above the arrows are φ, with 95% confidence intervals in brackets. Bottom row: Change in trajectory probability as each order of epistasis is added. Edges reveal loss of probability (red) or gain of probability (blue). The weight of the edge is directly proportional to the change in probability. Mutations whose probability do not change have no edge. For each node, the thickness of the ring reveals the change in probability that this genotype is visited. For genotypes whose probability goes down, the red area indicates the loss in probability. For genotypes whose probability goes up, the blue area indicates the increase in probability. The numbers below each network are θ, with 95% confidence intervals in brackets. The p-value measures whether the observed value of θ would be expected from fitting experimental noise (Fig 3). To quantify differences in the sets of trajectories with increasing epistasis, we calculated the change in the probabilities of all 120 forward trajectories through maps with different amounts of epistasis included (θ). A θ of 0.0 indicates that the set of trajectories through the spaces are identical, while a θ of 1.0 means the sets of trajectories do not overlap at all (Materials & Methods). Intermediate values indicate that some fraction of the trajectory probability density is shared between the maps. In dataset I, trajectories through the additive and second-order epistatic maps have θ = 0.390. Put another way, the addition of pairwise epistasis to the additive map shifts 39.0% of the trajectory probability density. Addition of each new order of epistasis has a smaller effect on trajectory probability: θ2→3 = 0.340, θ3→4 = 0.292, 46 and θ4→5 = 0.122. To determine confidence intervals on our estimates of φ and θ, we sampled from the fitness measurement uncertainty for each genotype, generating a collection of pseudoreplicate genotype-fitness maps (Figure S2A). We then decomposed each pseu- doreplicate map into epistatic coefficients (including refitting the scale) and remea- sured φ and θ for each epistatic order. Figure 11A shows this calculation for dataset I. From these distributions, we can determine 95% confidence intervals for φ and θ (shown as gray, bracketed values in Figure 10). Figure 11: Changes in trajectories are not the result of experimental un- certainty. Data in panel A and B are for dataset I. Panel A shows the distribution of φ and θ for 10,000 pseudoreplicates generated by sampling uncertainty in each the fitness of each genotype (Fig S2A). Colors denote order of epistasis, as in panel A. Panel B shows the epistasis extracted from datasets without epistasis, but experi- mental uncertainty (Fig S2B). We next asked whether the observed epistasis and its effect on trajectories could be the result of uncertainty in the fitness values. An epistasis model accounts for random noise as leftover variation, and thus as apparent epistasis [22]. We thus posed the following question: if the epistasis at a given order resulted only from noise, what effect would it have on φ and θ? To ask this question, we constructed 47 “null” maps with truncated epistasis, but noisy fitness values (Figure S2B). We took our truncated maps at each order and then assigned each fitness the same variance that was measured for the original, un-truncated fitness values. We sampled from this uncertainty to generate pseudoreplicates, extracted apparent epistasis—in this case, arising from noise—and then calculated φ and θ for the pseudoreplicate. This allows us to construct distributions of φ and θ for epistasis arising purely from experimental noise. We show this calculation for dataset I in Figure 10B. Unlike the experimental dis- tributions, which spread out in φ, the distributions arising from random noise cluster at low values of φ. The φ/θ distributions of second-, third-, and fourth-order epistasis minimally overlap in Figure 10A versus Figure 10B. This indicates that the signal for epistasis in the datasets is greater than expected from noise in the measured fitness values. In contrast, the φ/θ distribution for fifth-order epistasis overlaps between Fig- ure 10A and 10B: the effect of fifth-order epistasis cannot be distinguished from noise. Because we are interested in the effect of epistasis on trajectories (θ), we determined a p-value for each θ. We took the mode of θ at each order from Figure 10A, and determined its percentile on the corresponding null distribution in Figure 10B. For second-, third-, and fourth-order epistasis, this yields a p-value < 0.05. In contrast, the p-value for fifth order was 0.12. With these quantification tools in hand, we next studied the relationship between epistasis and evolutionary trajectories for the increasing levels of epistasis exhibited by the remaining five datasets. Figure S3-S8 summarize our analyses for all six datasets. It is helpful to compare dataset I (Figure 10) and dataset V (Figure 12). While epistasis accounts for 6.0% of variation in fitness for dataset I, it accounts for 32% of the variation in fitness for dataset V. The large amount of epistasis in dataset V means that epistasis at all orders has a massive effect on evolutionary trajectories through this space. The addition of fourth-order epistasis is particularly striking. With only 48 third-order epistasis and down, there are multiple paths through the space. With the addition of fourth-order epistasis, all paths but two become inaccessible. The addition of fifth-order epistasis opens the space up again, but to a different set of trajectories than what existed in the third-order space. Figure 12: Epistasis alters trajectories in dataset V. Altered trajectories in dataset V with increasing epistasis. Colors, panel layouts, and statistics are as in Fig 2. We next asked whether magnitude of epistasis or the order of epistasis was a stronger predictor of its effect on evolutionary trajectories. We plotted φ versus θ for each order for each dataset on a single plot (Figure 13A). This reveals a correlation between the magnitude of the epistasis and its effect on trajectories. In contrast, we see no correlation between the order of epistasis and its effect on evolutionary trajec- tories (Figure 13B). When epistasis contributes more than ≈5% of the variation in fitness, regardless of order, the divergence in trajectory probabilities with and without the epistasis is 40% or greater. The magnitude of epistasis—not its order—predicts its effect. 49 Figure 13: The magnitude of epistasis, not its order, predicts its effects on trajectories. Plot shows θ graphed against φ for all datasets. Points are colored by the order of epistasis: second (orange), third (green), fourth (blue), and fifth (purple). Error bars are 95% confidence intervals. Gray points are orders that could not be distinguished from experimental uncertainty (p < 0.05). High-order epistasis limits evolutionary predictability Our next question was more practical: how important is epistasis for predicting evo- lutionary trajectories in these datasets? We imagined an experiment in which we measured the effects of all mutations in the ancestral genotype. We then asked if we could take these individually measured mutational effects and predict evolutionary trajectories. To ask this question, we re-analyzed the epistasis present in all six datasets, this time using the ancestral genotype as the reference state. In this formulation, the first- order coefficients are the effect of each mutation by itself in the ancestral background, the second-order coefficients are the difference in the effects of mutations introduced in pairs versus separately, and third-order coefficients are the difference in the fitness of genotypes combining three mutations versus two mutations that cannot be explained by the first- and second-order coefficients. (This has been called the “biochemical” or “local” model of high-order epistasis [52].) We describe this further in the Materials & Method section. To characterize the effect of epistasis on our ability to predict evolutionary tra- 50 jectories of increasing length, we calculated the probability of all possible forward trajectories of a defined number of steps starting from the ancestral genotype, and then repeated this probability calculation using maps truncated to various orders of epistasis. The difference in the actual and truncated map trajectory probability dis- tributions measures our predictive power for evolutionary trajectories. We show these results in Figure 14 for all six datasets. In each panel, we plot inclusion of increasing orders of epistasis left-to-right (starting from additive and going to fifth-order) and increasing trajectory length bottom-to-top (starting from one-step and going to five- step). The overlap between the trajectory distribution for the truncated and real map for each epistasis/trajectory-length is shown as a color ranging from white (perfect prediction) to red (poor prediction). Figure 14: Epistasis complicates predicting trajectories from the ancestral genotype. Panels show trajectory prediction accuracy (color) for different amounts of epistasis included in the model (x-axis) and for different length trajectories away from the ancestral genotype (y-axis). Accuracy is measured as the difference in the probability distributions for trajectories through the truncated and original maps, ranging from 0.0% (red, poor accuracy) to 100% (white, perfect accuracy). Panels A-F correspond to datasets I-VI. 51 We found that all orders of epistasis were important for predicting evolutionary trajectories. Dataset IV (panel D) illustrates behavior seen across all datasets, so we will use it as a specific example. In this dataset, additive coefficients are inadequate to capture even two-step trajectories: the trajectory probability distribution for two-step mutations only overlaps by 53.0% for the truncated and real maps. The prediction gets worse for longer trajectories, dropping to 39.4% for three steps, 30.5% for four steps, and 0.0% for five steps. The overlap for the final step is 0.0% because the additive model does not predict that the five-mutation genotype will be more fit than the four-mutation genotype. Trajectories in the additive map therefore do not proceed to this final genotype. Adding pairwise epistasis to the model allows perfect “prediction” of the two-step trajectories, as we have perfect knowledge of the fitness values of all possible single and double mutants. But the three-step and four-step trajectories are predicted worse with pairwise epistasis included than with the additive map. The three-step overlap is 25.9%, while the four-step and five-step trajectory overlap is 0.0. The four-mutation and five-mutation genotypes are predicted to have low fitness. Adding third-order epistasis—now imagining that we characterized all possible single, double, and triple mutants in the ancestral genotype—allows us to “predict” trajectories up to three steps long; however, it fails for four- and five-step trajectories. The overlap is 25.1% and 45.2% respectively. Even the addition of fourth-order epistasis is insufficient to capture the five-step trajectories: the overlap for five-step trajectories is 0.0%. Dataset IV is a particularly clean example, but all six datasets exhibit similar behavior (Figure 14). Neglecting epistasis leads to poor predictions of trajectories starting from the ancestral genotype. The lower-order the truncation, the worse the prediction as more mutations accumulate. Third- and fourth-order epistasis had an appreciable effect on all datasets. Fifth-order epistasis had an effect in four of the six datasets. Like the analysis using the global model above, high-order epistasis relative 52 to the ancestral genotype potently alters evolutionary trajectories. Discussion Our analysis reveals that high-order epistasis can strongly shape evolutionary trajec- tories. Removal of three-, four-, and five-way interactions between mutations signif- icantly alters the probabilities of trajectories through genotype-fitness maps (Figure 10, Figure 12). This result is robust to uncertainty in the measured fitness values (Figure 11) and appears to be a general pattern in many maps (Figure 13). Finally, neglecting high-order epistasis leads to poor predictions of evolutionary trajectories through these maps (Figure 14). In the majority of datasets, low-order models provide useful estimates of fitness. For datasets I-IV, ignoring three-way and higher-interactions yields fitness values within 15% of the actual map (Figure 9B). Dataset I would be particularly close, yielding fitness values within 2.5% of the actual map. This is consistent with other analyses of high-order epistasis in other datasets, which suggest that additive and pairwise epistatic effects can often provide sufficient information to predict multi- mutation fitness values to within 5-10% [40, 44, 42, 52, 32, 41, 22, 37]. While low-order models can often describe fitness with some degree of precision, low-order models are inadequate to describe evolutionary trajectories in any of the datasets. Even in dataset I, third- and fourth-order interactions potently shape evo- lutionary trajectories. The probability distributions of trajectories with and without fourth-order epistasis differ by 29.2%. And, as the magnitude of epistasis increases, its effect on trajectories grows (Figure 13A). In some instances, addition of high-order interactions completely shifts the set of trajectories available (Figure 12). The effect of high-order epistasis on evolutionary trajectories is profound. We can build this intuition by imagining predicting evolutionary trajectories. If we start with knowledge of the individual effects of mutations in the ancestral background we can 53 predict the first move perfectly, but not the second move. Pairwise epistasis means the effect of the second mutation is modulated by the presence of the first. We might try to overcome this difficulty by measuring the effect of each mutation and each pair of mutations, thereby accounting for pairwise epistasis. But our results reveal this is still insufficient to predict trajectories past the second step. There are three-way interactions that alter the effect of the third mutation, even after accounting for the first- and second-order effects of mutations. This continues all the way to fifth-order in these five-site datasets. This has two implications. First, this adds to the growing recognition of extensive contingency in evolution [86, 36, 19, 33]. The effect of an event today is contingent on a whole collection of previous events. Remarkably, we found that this contingency is mediated by epistasis at all orders, including up to five-way interactions between mutations. Second, this work implies that measuring the individual effects of many mutations in a single genetic background, despite revealing a local fitness landscape [34, 37, 20], will be of limited utility for understanding evolution past the first few moves. We expect the effect of high-order epistasis on trajectories will be amplified in larger maps that have more mutations. In a larger map, more mutations com- pete for fixation—each modulated by high-order interactions with previous substi- tutions—leading to even greater contingency on specific substitutions that occurred in the past. Further, the small maps we studied artificially limit the effects of high- order epistasis, as larger maps could, potentially, have even higher-order interactions. But even if no epistasis above fifth-order is present, trajectories will have more steps in a larger map; therefore, a fifth-order interaction could alter the relative probabilities of many more future moves in a larger space. One open question is the effect of recombination on this radical contingency. We studied trajectories in which mutations fixed sequentially. This means our results are 54 directly applicable to asexual organisms and loci in tight linkage, such as mutations to individual genes. Once recombination comes into play, other dynamics become possible. While recombination can completely overcome pairwise epistasis [87], it is unclear whether this result will apply to higher-order interactions. High-order epistasis appears to be a ubiquitous feature of experimental genotype- fitness (and genotype-phenotype) maps [40, 44, 42, 52, 32, 41, 22, 37]. The origins of this epistasis remain unknown. Further, epistasis may go to much higher-order than yet observed, leading to extremely long-term memory in evolution. The observation of cryptic epistasis between genetic backgrounds that appear similar, but in which mutations have radically different effects, may point to high-order epistasis between mutations in diverging backgrounds[88, 34]. Whatever the origins or order may be, our work reveals that combinations of early substitutions continue to have an effect as future mutations accumulate: the past continues to press upon the present. 55 CHAPTER IV MOLECULAR ENSEMBLES MAKE EVOLUTION UNPREDICTABLE Author Contributions Zachary Sailer (ZRS) and Michael Harms (MJH) conceptualized the paper. ZRS conducted the simulations and computational analysis. ZRS generated figures. ZRS and MJH wrote and edited the manuscript. Abstract Evolutionary prediction is of deep practical and philosophical importance. Here we show, using a simple computational protein model, that protein evolution remains unpredictable even if one knows the effects of all mutations in an ancestral protein background. We performed a virtual deep mutational scan—revealing the individual and pairwise epistatic effects of every mutation to our model protein—and then used this information to predict evolutionary trajectories. Our predictions were poor. This is a consequence of statistical thermodynamics. Proteins exist as ensembles of similar conformations. The effect of a mutation depends on the relative probabilities of conformations in the ensemble, which in turn depend on the exact amino acid sequence of the protein. Accumulating substitutions alter the relative probabilities of conformations, thereby changing the effects of future mutations. This manifests itself as subtle, but pervasive, high-order epistasis. Uncertainty in the effect of each mutation accumulates and undermines prediction. Because conformational ensembles are an inevitable feature of proteins, this is likely universal. 56 Significance A long-standing goal in evolutionary biology is predicting evolution. Here we show that the architecture of macromolecules fundamentally limits evolutionary predictabil- ity. Under physiological conditions, macromolecules like proteins flip between mul- tiple structures, forming an ensemble of structures. A mutation affects all of these structures in slightly different ways, redistributing the relative probabilities of struc- tures in the ensemble. As a result, mutations that follow the first mutation have a different effect than they would if introduced before. This implies that knowing the effects of every mutation in an ancestor would be insufficient to predict evolutionary trajectories past the first few steps, leading to profound unpredictability in evolution. We therefore conclude that detailed evolutionary predictions are not possible given the chemistry of macromolecules. Introduction Is evolution predictable? This is a fundamental question in evolutionary biology, both for philosophical [89, 90, 91, 36] and practical reasons [34, 92]. Deep mutational scanning experiments provide an intriguing new avenue to think about evolutionary prediction. These experiments reveal the effects of huge numbers of mutations and thus provide rich information about local adaptive landscapes [1, 93, 20]. This leads to a simple question: if we know the effect of every mutation in an ancestral genotype, can we predict future evolutionary trajectories? If not, what limits our ability to make predictions? To pose this question, we attempted to predict the evolution of simple physical protein model given a virtual deep mutational scan. In this context, prediction is knowing which mutations would accumulate, in what order, given knowledge of the effects of the mutations in the ancestral background.We attempted an “easy” pre- 57 diction, reasoning that it could act as a starting point for more difficult scenarios involving more complex evolutionary processes. To maximize predictive success, we studied adaptive trajectories in which the environment was stable, there was consis- tent directional selection, and mutations were fixed by selection rather than drift. We studied the evolution of improved thermodynamic stability—a shared feature of all folded proteins and a target of natural selection in many contexts [94, 95, 96]. This is a useful phenotype for a number of reasons. First, because stability is a ther- modynamic quantity, studying it may reveal features common to the evolution of other thermodynamic properties such as allostery and ligand binding. Second, biological systems—from molecules to ecosystems—are ultimately physical; therefore, insights at the physical level may provide insights for higher levels of biological organization [39, 97]. Surprisingly, we found that our predictions were quite poor. We even added all pairwise epistatic effects of mutations to our predictive model, requiring a massive virtual deep mutational scan of all possible pairs of mutations. Even this did not allow robust predictions of evolutionary trajectories. We find that the unpredictabil- ity arises directly from the thermodynamic ensemble of conformations populated by macromolecules—revealing a profound link between protein physics and the evolu- tionary process. Materials and Methods All of our analyses are are contained in Python scripts and Jupyter notebooks avail- able on Github (https://github.com/harmslab/notebooks-epistasis-ensembles). Full details are given in the SI Appendix. For the protein lattice model simulations, we extended the latticeproteins package originally written by Prof. Jesse Bloom (https://github.com/harmslab/latticeproteins) [98] using Miyazawa and Jernigan contact energies (from Table V in their paper) and 58 reduced temperature units [99]. We randomly generated 12-site protein sequences and then evolved each sequence until the fraction folded was ≈ 0.7. We calculated the probability of a given evolutionary trajectory as a series of independent, sequential fixation events using a strong-selection, weak-mutation model [80]. In this model, the fixation probability for going from genotype x to x+ 1 is: pix→x+1 = 1− e−(wx+1/wx−1), where wx and wx+1 are the relative fitnesses of the x and x + 1 genotypes (SI Ap- pendix). To quantify the difference between the predicted and actual trajectories, we cal- culated the magnitude of the difference in probability of all observed trajectories through each space [35] We predicted the ∆G◦N for each genotype and then used this to predict evolution- ary trajectories. For the additive model, the predicted stability of a genotype with a set of {M} mutations was: ∆Gˆ◦N,{M} = ∆G ◦ N,anc + ∑ i∈{M} ∆∆G◦N,i where ∆G◦N,anc is the stability of the ancestor and ∆∆G◦N,i is the effect of the ith mutation in the ancestral background. For the pairwise epistatic model, we added epistatic coefficients: ∆Gˆ◦N,{M} = ∆G ◦ N,anc + ∑ i∈{M} ∆∆G◦N,i + ∑ i threshold 0 else We then construct and train a logistic model to classify the additive phenotypes Padd given the observe phenotype classes Pobs,class: Pobs,class = Padd,class =  1 if α0β0 + ∑L i αi(βixi) + ξ > 0 0 else where βi are the additive coefficients determined in 18, αi are the odds that genotypes with mutation i are inactive, and ξ is the error in our model (which follows a logistic distribution) [161, 162]. Nonlinear model We addressed global epistasis in the genotype-phenotype map by identifying a non- linear function T that captures global curvature in the relationship between ~Pobs and ~Padd, ~Pobs = T (~Padd) + ~ε. (21) where ~ε are the fit residuals. We fit a 2nd-order spline to the Pobs vs. Padd curve [21] in the PfCRT map. We then used this transform to linearize the map and extract linear epistatic coefficients. 117 Software All software is free and open source. The prediction models can be accessed and down- loaded on Github (https://github.com/harmslab/gpseer). All of the code is written in Python and built on top of core scientific python packages [65, 128, 129, 67]. We have written extensive documentation and examples of how to apply to various types of experimental data (https://gpseer.readthedocs.io). The software takes a list of geno- types with their measured phenotypes and, from that, trains the model. The code is written with a modular application programming interface, allowing users to try each layer of the model—classifier, nonlinear fit, and epistasis—with simple Python scripts (or integrated into a programming environment such as Jupyter). Because it is built on the existing epistasis package (https://github.com/harmslab/epistasis) [22], it can also handle arbitrary genotype alphabets (it builds the alphabet from the input data). Finally, all statistical tests and cross-validation are implemented within the software, allowing for any researcher to fit and select between different predictive models of the genotype-phenotype map. 118 CHAPTER VII CONCLUSIONS AND FUTURE DIRECTIONS This dissertation addresses a key question when predicting genotype-phenotype maps. How should we handle non-additivity–or epistasis? We identified two sources of epis- tasis: 1) global epistasis and 2) local epistasis. Each source must be treated sepa- rately. First, global epistasis is important feature to capture. If a global feature, like nonlinearity, can be resolved from a sparsely-sampled genotype phenotype map, it can inform prediction. This dissertation shows various ways to infer global epistasis. Local epistasis, on the other hand, should be treated as uncertainty. In general, spe- cific epistatic interactions cannot be accurately estimated from a sparsely-sampled genotype-phenotype map. As a consequence, using local epistasis yields biased pre- dictions. A major limitation of this approach is when local epistasis is a large contributor to the variation in a genotype-phenotype map. In this scenario, the model will not predict phenotypes accurately. However, this model will estimate how much local epistasis is present within a heavy computational cost. We can apply this model and know immediately if we will be able to predict phenotypes accurately. A major benefit of this approach is that it can predict massive genotype-phenotype maps from very sparse data. The model is extremely lightweight and computationally optimized. It also generalizes to predict any arbitrary genotype-phenotype map. Further, the interpretation of the model parameters are intuitive. Each mutation has an average quantitative effect on the phenotype. We can resolve which mutations are key to function and use this information direct further analysis. A key follow up to this dissertation is: can we do better than a nonlinear additive 119 model? Recently, new approaches have been proposed that pull information from outside the genotype-phenotype map. For example, some groups are exploring vari- ational auto-encoders (VAE) as a viable approach. VAEs use a deep neural network to decompose a large set of aligned genotypes, identify latent key features, and re- compose these features to predict unknown phenotypes. The interpretation of such models are less clear, since the latent variable is not interpretable. However, they offer a promising future for phenotypic prediction. 120 CHAPTER VIII APPENDIX Figure 28: Experimental genotype-phenotype maps exhibit nonlinear phe- notypes. Plots show observed phenotype Pobs plotted against Pˆadd (Eq. 3) for data sets V through VII. Points are individual genotypes. Error bars are experimental standard deviations in phenotype. Red lines are the fit of the power transform to the data set. Pearson’s coefficient for each fit are shown on each plot. Dashed lines are Padd = Pobs. Bottom panels in each plot show residuals between the observed phenotypes and the red fit line. Points are the individual residuals. Errorbars are the experimental standard deviation of the phenotype. The horizontal histograms show the distribution of residuals across 10 bins. The red lines are the mean of the residuals. Datasets VI and VII form distinct clusters because each map has a single, large-effect mutation. The two clusters correspond to genotypes with and without the large-effect mutation. 121 Figure 29: Nonlinear phenotypes can be transformed to linear scale to estimate high-order epistasis. Flowchart shows the steps for estimating high- order epistasis in nonlinear genotype-phenotype maps. The plots beneath the chart show this pipeline for data set II. In step 1, a power transform function is used to fit the Pobs versus Pˆadd plot and estimate the map’s scale. In step 2, the inverse of the fitted transform is used to back-transform Pobs to a linear scale, Plinear. In step 3, a linear, high-order epistasis model is used to fit the variation in Plinear. In the left plot, points are individual genotypes, red line is the resulting fit and dashed line is the Padd = Pobs. In the middle plot, the blue line is the new scale of Pobs after back transforming. In the right plot, bars represent additive and epistatic coefficients extracted from the linear phenotypes. Error bars are propagated measurement uncertainty. Color denotes the order of the coefficient: first (βi, red), second (βij, orange), third (βijk, green), fourth (βijkl, purple), and fifth (βijklm, blue). Bars are colored if the coefficient is significantly different than zero (Z-score with p-value <0.05 after Bonferroni correction for multiple testing). Stars denote relative significance: p < 0.05 (*), p < 0.01 (**), p < 0.001 (***). Filled squares in the grid below the bars indicate the identity of mutations that contribute to the coefficient. 122 Figure 30: High-order epistasis is present in genotype-phenotype maps. A) Panels show epistatic coefficients extracted from data sets V-VII (Table 1, data set label circled above each graph). Bars denote coefficient magnitude and sign; error bars are propagated measurement uncertainty. Color denotes the order of the coefficient: first (βi, red), second (βij, orange), third (βijk, green), fourth (βijkl, purple), and fifth (βijklm, blue). Bars are colored if the coefficient is significantly different than zero (Z-score with p-value <0.05 after Bonferroni correction for multiple testing). Stars denote relative significance: p < 0.05 (*), p < 0.01 (**), p < 0.001 (***). Filled squares in the grid below the bars indicate the identity of mutations that contribute to the coefficient. The names of the mutations, taken from the original publications, are indicated to the left of the grid squares. B) Sub-panels show fraction of variation accounted for by first through fifth order epistatic coefficients for data sets I-IV (colors as in panel A). Fraction described by each order is proportional to area. 123 Figure 31: Nonlinear phenotypes distort measured epistatic coefficients. Sub-panels show correlation plots between epistatic coefficients extracted without accounting for nonlinearity (x-axis) and accounting for linearity (y -axis) for data sets V-VII. Each point is an epistatic coefficient, colored by order. 124 Figure 32: Additive coefficients are well estimated, even when nonlinearity is neglected. Sub-panels show correlation plots between both additive and epistatic coefficients extracted without accounting for nonlinearity (x-axis) and accounting for linearity (y-axis) for data sets I-VII. Each point is an epistatic coefficient, colored by order. Error bars are standard deviations from bootstrap replicates of each fitting approach. 125 Figure 33: Exponential fitness model leads to global nonlinearity in the β-lactamase data set (III). A) A recapitulation of the map used in the original publication [? ]. We first rank-ordered the genotypes according to the measured property (the minimum inhibitory concentration of a β-lactam antibiotic against a clonal population of bacteria expressing that protein). This gave us 13 classes of genotypes, as some genotypes had equivalent MIC values. We then drew 3,000 random fitness values from the distribution W = 1+x, where x is an exponential distribution centered around x¯ = 0.1. We took the top 13 values from this distribution and assigned them, in value order, to each of the 32 β-lactamase genotypes. Panel A shows the average and standard deviation of the fitness values W assigned to each of these ranks if we repeat the protocol above 1,000 times. B) Best fit for the power- transform for data set III. Solid red line denotes the best fit (nonlinear). This fit successfully pulls out the original distribution of W . 126 127 Figure 34: Flow chart for removing epistasis in a genotype-fitness map. This chart describes the pipeline we used to truncate epistasis from genotype-fitness maps. The data shown are for dataset II. Networks (left) show all 25 genotypes, arranged from ancestral (top) to derived (bottom), colored by relative fitness from 1.0 (purple) to 1.30 (yellow). The correlation plots (middle) show the fitness of each genotype plotted against the fitness of that genotype assuming each mutation has a linear, additive effect on fitness (Fadd). Y-axes correspond to: the experimentally measured fitness (Fexperimental, panel 2); the experimentally measured fitness linearized using the red scale in panel 2 (Flinear, panel 3); fitness values with third-, fourth- and fifth-order epistasis removed, on the linear scale from panel 3 (Flinear,trunc, panel 5); and fitness values with truncated epistasis on the red nonlinear scale from panel 2 (Ftrunc, panel 6). The right-most panels show the fraction of variation explained by first- (red), second- (orange), third- (green), and fourth-order (purple) epistatic coefficients. The area occupied by each color indicates its contribution to the fitness on the linear scale. 128 Figure 35: Schematic of our resampling protocol. Two-mutation maps are shown throughout, colored by fitness from low (purple) to high (yellow). We sampled from two maps: the original map with uncertainty (A, red) and a “null” map in which epistasis was removed, but experimental uncertainty maintained (B, blue). We used the same sampling protocal on each (“Start”). We generated pseudoreplicates (s1, s2, ...sn) from uncertainty (Gaussian curves above the color spectrum in A and B). We then truncated the pseudoreplicate to ith and (i−1)th order epistasis and calculated φ and θ for each pseudoreplicate: {(φ1, θ1), (φ2, θ2), ...(φn, θn). We can then plot and compare these distributions on φ/θ axes. 129 dd Figure 36: Epistasis alters trajectories in dataset I. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. 130 Figure 37: Epistasis alters trajectories in dataset II. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. 131 Figure 38: Epistasis alters trajectories in dataset III. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. 132 Figure 39: Epistasis alters trajectories in dataset IV. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. 133 Figure 40: Epistasis alters trajectories in dataset V. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. 134 Figure 41: Epistasis alters trajectories in dataset VI. A) Colors, panel layouts, and statistics are as in Fig 2. B) Colors, panel layouts, and statistics are as in Fig 1A. C-D): Colors and panel layouts are as in Fig 3. Effect of population size on predictability We explored how population size changed the effect of ensemble-induced epistasis on evolutionary predictability. We repeated our evolutionary predictions using the variable population size variant of Gillespie’s fixation model: pix→x+1 = 1− e−(wx+1/wx−1) 1− e−Ne·(wx+1/wx−1) , (22) where Ne is the effective population size [80]. Because the total number of accessible trajectories becomes very large without selection to winnow out low fitness trajec- tories, we limited our simulations to subsets of trajectories for this calculation. We selected these subsets by making only three mutational steps from each genetic back- ground rather than all possible moves. For each genotype, we calculated ∆∆G◦ for all possible mutations. We then randomly selected three of these mutations, weighted by their relative probability given our fitness function. We introduced each of these individually, creating a 3-way branching set of trajectories. We repeated this proto- col for seven steps, yielding a final, discrete set of 37 = 2, 187 trajectories. We then attempted to predict the probabilities of trajectories within this sampled set, rather than completely open-ended evolutionary trajectories. Fig S1A shows the overlap between the predicted and actual trajectory probabil- ities after seven steps for effective population sizes ranging from one to one million. At a high population size (Ne = 1 × 106), we recapitulate our results from an infi- nite population model (Fig 3D). At the lowest population size (Ne = 1), we exactly predict the probabilities of all evolutionary trajectories. At first blush, this looks like evolution has become predictable, but this is not the case. At Ne = 1, drift dominates and all trajectories have equal probabilities. (This is the population genetics equiva- lent of infinite temperature). As a result, evolution is completely unpredictable. Our model can correctly predict that all trajectories are equally accessible because Eq. 22 135 reduces to one for all w: pix→x+1 no longer depends on the ∆∆G◦ values we calculate. By Ne = 10, we already observe a small amount of divergence between our predicted and true trajectories. We can more rigorously characterize the role of drift versus population size by calculating the Shannon entropy of all trajectories through the map: S = − ∑ i∈{T} pilog(pi), (23) where pi represents the probability of an individual trajectory, summed over the set of all trajectories {T}. When all trajectories have equal probability (Ne = 1), the entropy is high. As specific trajectories begin to increase in probability, the entropy drops. This is shown in Fig S1B. For Ne = 1, the entropy is 7.69. For Ne = 1× 106, the average entropy is 4.55. A comparison of Fig S1A and S1B reveals an inverse correlation between the ef- fect of the ensemble on predictability and the effect of drift. As the effect of drift decreases, predictability due to the ensemble increases. Put another way: the more selection favors specific trajectories, the more ensemble-induced epistasis makes evo- lution unpredictable. The less selection favors specific trajectories, more drift make evolution unpredictable. There is no sweet spot in population size at which both drift and ensemble-induced epistasis have negligible effects on predictability. 136 Figure 42: Unpredictability from the ensemble and neutral drift trade off with effective population size. Panel A shows a jitter plot of divergence between the true trajectories and trajectories predicted using a pairwise epistasis model using a full ensemble lattice model. Each series shows the divergence after seven steps using the effective population size shown. Panel B shows the Shannon entropy for the “true” trajectories from the simulations on the left as a function of population size. Gray points represent θ and S for individual maps. Red bars indicate the means. 137 Figure 43: Epistasis in a simulated genotype-phenotype map. Panel A shows simulated genotype-phenotype map using Hadamard coefficients. Each node is a genotype; each edge is a single point mutation. Colors indicate quantitative phenotypes. Panel B shows the correlation between the observed phenotypes and the additive model, with fit residuals shown below the plot. Panel C shows the magnitude of epistasis in the map (top sub panel) and the values of all model coefficients (bottom sub panel). Colors indicate additive components (red), pairwise components (green), and high-order components (blue). In the bottom sub panel, each bar shows the value of a single model coefficient: the red bars correspond to the 8 additive coefficients, the green bars to the 28 pairwise coefficients, and the blue bars to the 220 high-order coefficients. 138 Figure 44: Linear epistatic coefficients cannot be estimated from an in- complete, simulated genotype-phenotype map. Panels show the fitting and prediction performaces of different epistasis models (Hadamard or biochemical) and regression methods (ordinary least-squares, lasso, or ridge). Each subpanel shows the fit scores versus the fraction of data used to train the model, from 10% to 90%, for different epistasis models. The dashed gray line indicates the amount of additive variation in the map (80%). Colors indicate model: additive (red), pairwise epistasis (green), and high-order epistasis (blue). Dashed lines indicate ρ2train and solid lines indicate ρ2test. 139 Figure 45: Predictive epistatic coefficients cannot be extracted from ex- perimental genotype-phenotype maps. Each row shows ρ2train (black) and ρ2test (red) for a different experimental map (indicated to the left of each row) as epistatic orders are added to the model. The x-axis shows the results for various epistasis models (Hadamard or biochemical) and regression methods (ordinary least-squares, lasso, or ridge). Points are, from left to right: additive, pairwise, and high-order epistasis. Points and lines indicate the mean of 1,000 pseudoreplicate samples. Error bars are standard deviation of pseudoreplicate results. The dashed lines indicate the fraction of the variation in the map explained by the additive model. 140 Figure 46: Epistasis in experimental maps looks like random epistasis. Panels show the epistatic coefficients extracted from 29 decoy, simulated genotype- phenotype maps and 1 experimental genotype-phenotype map (dataset VIII). All maps have a fraction of epistasis equal to 26%. Epistasis was added to the 29 decoy maps by drawing a random epistasis parameter from a normal distribution for each phenotype. Panel 20 shows the results for the experimental map. In each panel, the top subpanel indicates the magnitude of epistasis in each map and the bottom subpanel indicates the values of all model coefficients. Colors indicate additive com- ponents (red), pairwise components (green), and high-order components (blue). In the bottom sub panel, each bar shows the value of a single model coefficient: the red bars correspond to the 5 additive coefficients, the green bars to the 10 pairwise coefficients, and the blue bars to the 17 high-order coefficients. 141 REFERENCES CITED [1] Douglas M. Fowler, Carlos L. Araya, Sarel J. Fleishman, Elizabeth H. Kellogg, Jason J. Stephany, David Baker, and Stanley Fields. High-Resolution Mapping of Protein Sequence-Function Relationships. Nature Methods, 7(9):741–746, September 2010. [2] Mark A. DePristo, Daniel M. Weinreich, and Daniel L. Hartl. Missense Mean- derings in Sequence Space: A Biophysical View of Protein Evolution. Nat Rev Genet, 6(9):678–687, September 2005. [3] Erich Bornberg-Bauer and Hue Sun Chan. Modeling Evolutionary Landscapes: Mutational Stability, Topology, and Superfunnels in Sequence Space. PNAS, 96(19):10689–10694, September 1999. [4] John Maynard Smith. Natural Selection and the Concept of a Protein Space. Nature, 225(5232):563–564, February 1970. [5] C. Brandon Ogbunugafor and Daniel L. Hartl. A New Take on John May- nard Smith’s Concept of Protein Space for Understanding Molecular Evolution. PLOS Computational Biology, 12(10):e1005046, October 2016. [6] J. Arjan G. M. de Visser, Su-Chan Park, and Joachim Krug. Exploring the Effect of Sex on Empirical Fitness Landscapes. The American Naturalist, 174(s1):S15–S30, July 2009. [7] Ivan G. Szendro, Martijn F. Schenk, Jasper Franke, Joachim Krug, and J. Arjan G. M. de Visser. Quantitative analyses of empirical fitness landscapes. Journal of Statistical Mechanics: Theory and Experiment, 2013(01):P01005, 2013. [8] Frank J. Poelwijk, Daniel J. Kiviet, Daniel M. Weinreich, and Sander J. Tans. Empirical Fitness Landscapes Reveal Accessible Evolutionary Paths. Nature, 445(7126):383–386, January 2007. [9] J. Arjan G. M. de Visser and Joachim Krug. Empirical Fitness Landscapes and the Predictability of Evolution. Nat Rev Genet, 15(7):480–490, July 2014. [10] Lizhi Ian Gong, Marc A. Suchard, and Jesse D. Bloom. Stability-Mediated Epistasis Constrains the Evolution of an Influenza Protein. eLife, 2:e00631, May 2013. 142 [11] Robert L. Summers, Anurag Dave, Tegan J. Dolstra, Sebastiano Bellanca, Rosa V. Marchetti, Megan N. Nash, Sashika N. Richards, Valerie Goh, Robyn L. Schenk, Wilfred D. Stein, Kiaran Kirk, Cecilia P. Sanchez, Michael Lanzer, and Rowena E. Martin. Diverse mutational pathways converge on saturable chloro- quine transport via the malaria parasite’s chloroquine resistance transporter. Proceedings of the National Academy of Sciences, 111(17):E1759–E1767, April 2014. [12] Rhys Hayward, Kevin J. Saliba, and Kiaran Kirk. The pH of the digestive vacuole of Plasmodium falciparum is not associated with chloroquine resistance. Journal of Cell Science, 119(6):1016–1025, March 2006. [13] David A. Fidock, Takashi Nomura, Angela K. Talley, Roland A. Cooper, Sergey M. Dzekunov, Michael T. Ferdig, Lyann M. B. Ursos, Amar bir Singh Sidhu, Bronwen Naudé, Kirk W. Deitsch, Xin-zhuan Su, John C. Woot- ton, Paul D. Roepe, and Thomas E. Wellems. Mutations in the P. falciparum Digestive Vacuole Transmembrane Protein PfCRT and Evidence for Their Role in Chloroquine Resistance. Molecular Cell, 6(4):861–871, October 2000. [14] R.A. Fisher. The Correlation between Relatives on the Supposition of Mendelian Inheritance. Philosophical Transactions of the Royal Society of Ed- inburgh, (52):399–433, 1918. [15] Heather J. Cordell. Epistasis: what it means, what it doesn’t mean, and statis- tical methods to detect it in humans. Human Molecular Genetics, 11(20):2463– 2468, October 2002. [16] Patrick C. Phillips. Epistasis — the Essential Role of Gene Interactions in the Structure and Evolution of Genetic Systems. Nat Rev Genet, 9(11):855–867, November 2008. [17] Hsin-Hung Chou, Hsuan-Chao Chiu, Nigel F. Delaney, Daniel Segrè, and Christopher J. Marx. Diminishing Returns Epistasis Among Beneficial Mu- tations Decelerates Adaptation. Science, 332(6034):1190–1192, March 2011. [18] Aisha I. Khan, Duy M. Dinh, Dominique Schneider, Richard E. Lenski, and Tim F. Cooper. Negative Epistasis Between Beneficial Mutations in an Evolving Bacterial Population. Science, 332(6034):1193–1196, March 2011. [19] Sergey Kryazhimskiy, Daniel P. Rice, Elizabeth R. Jerison, and Michael M. Desai. Global Epistasis Makes Adaptation Predictable despite Sequence-Level Stochasticity. Science, 344(6191):1519–1522, June 2014. 143 [20] Karen S. Sarkisyan, Dmitry A. Bolotin, Margarita V. Meer, Dinara R. Usman- ova, Alexander S. Mishin, George V. Sharonov, Dmitry N. Ivankov, Nina G. Bozhanova, Mikhail S. Baranov, Onuralp Soylemez, Natalya S. Bogatyreva, Pe- ter K. Vlasov, Evgeny S. Egorov, Maria D. Logacheva, Alexey S. Kondrashov, Dmitry M. Chudakov, Ekaterina V. Putintseva, Ilgar Z. Mamedov, Dan S. Tawfik, Konstantin A. Lukyanov, and Fyodor A. Kondrashov. Local Fitness Landscape of the Green Fluorescent Protein. Nature, 533(7603):397–401, May 2016. [21] Jakub Otwinowski, David M. McCandlish, and Joshua B. Plotkin. Inferring the shape of global epistasis. Proceedings of the National Academy of Sciences, page 201804015, July 2018. [22] Zachary R. Sailer and Michael J. Harms. Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps. Genetics, 205(3):1079–1088, March 2017. [23] Zachary R. Sailer and Michael J. Harms. Molecular ensembles make evolution unpredictable. Proceedings of the National Academy of Sciences, 114(45):11938– 11943, November 2017. [24] Daniel M. Weinreich, Richard A. Watson, and Lin Chao. Perspective: Sign Epistasis and Genetic Costraint on Evolutionary Trajectories. Evolution, 59(6):1165–1174, June 2005. [25] Daniel M. Weinreich, Nigel F. Delaney, Mark A. DePristo, and Daniel L. Hartl. Darwinian Evolution Can Follow Only Very Few Mutational Paths to Fitter Proteins. Science, 312(5770):111–114, July 2006. [26] Shimon Bershtein, Michal Segal, Roy Bekerman, Nobuhiko Tokuriki, and Dan S. Tawfik. Robustness–epistasis Link Shapes the Fitness Landscape of a Randomly Drifting Protein. Nature, 444(7121):929–932, December 2006. [27] Jack da Silva, Mia Coetzer, Rebecca Nedellec, Cristina Pastore, and Donald E. Mosier. Fitness Epistasis and Constraints on Adaptation in a Human Immun- odeficiency Virus Type 1 Protein Region. Genetics, 185(1):293–303, May 2010. [28] Sergey Kryazhimskiy, Jonathan Dushoff, Georgii A. Bazykin, and Joshua B. Plotkin. Prevalence of Epistasis in the Evolution of Influenza A Surface Pro- teins. PLoS Genet, 7(2):e1001301, February 2011. [29] Merijn L. M. Salverda, Eynat Dellus, Florien A. Gorter, Alfons J. M. Debets, John van der Oost, Rolf F. Hoekstra, Dan S. Tawfik, and J. Arjan G. M. de Visser. Initial Mutations Direct Alternative Pathways of Protein Evolution. PLOS Genet, 7(3):e1001321, March 2011. 144 [30] Michael S. Breen, Carsten Kemena, Peter K. Vlasov, Cedric Notredame, and Fyodor A. Kondrashov. Epistasis as the Primary Factor in Molecular Evolution. Nature, 490(7421):535–538, October 2012. [31] Danielle M. Tufts, Chandrasekhar Natarajan, Inge G. Revsbech, Joana Projecto-Garcia, Federico G. Hoffmann, Roy E. Weber, Angela Fago, Hideaki Moriyama, and Jay F. Storz. Epistasis Constrains Mutational Pathways of Hemoglobin Adaptation in High-Altitude Pikas. Mol Biol Evol, page msu311, November 2014. [32] Dave W. Anderson, Alesia N. McKeown, and Joseph W. Thornton. Intermolec- ular Epistasis Shaped the Function and Evolution of an Ancient Transcription Factor and Its DNA Binding Sites. eLife Sciences, page e07864, June 2015. [33] Premal Shah, David M. McCandlish, and Joshua B. Plotkin. Contingency and Entrenchment in Protein Evolution under Purifying Selection. PNAS, page 201412933, June 2015. [34] Charlotte M. Miton and Nobuhiko Tokuriki. How Mutational Epistasis Impairs Predictability in Protein Evolution and Design. Protein Science, 25(7):1260– 1272, July 2016. [35] Zachary R. Sailer and Michael J. Harms. High-order epistasis shapes evolution- ary trajectories. PLOS Computational Biology, 13(5):e1005541, May 2017. [36] Michael J. Harms and Joseph W. Thornton. Historical Contingency and Its Biophysical Basis in Glucocorticoid Receptor Evolution. Nature, 512(7513):203– 207, August 2014. [37] Nicholas C. Wu, Lei Dai, C. Anders Olson, James O. Lloyd-Smith, and Ren Sun. Adaptation in Protein Fitness Landscapes Is Facilitated by Indirect Paths. eLife, 5:e16965, July 2016. [38] Jamie T. Bridgham, Eric A. Ortlund, and Joseph W. Thornton. An Epistatic Ratchet Constrains the Direction of Glucocorticoid Receptor Evolution. Nature, 461(7263):515–519, September 2009. [39] Michael J. Harms and Joseph W. Thornton. Evolutionary Biochemistry: Re- vealing the Historical and Physical Causes of Protein Properties. Nature re- views. Genetics, 14(8):559–571, August 2013. [40] Marylyn D. Ritchie, Lance W. Hahn, Nady Roodi, L. Renee Bailey, William D. Dupont, Fritz F. Parl, and Jason H. Moore. Multifactor-Dimensionality Re- duction Reveals High-Order Interactions among Estrogen-Metabolism Genes in Sporadic Breast Cancer. The American Journal of Human Genetics, 69(1):138– 147, July 2001. 145 [41] Shozo Yokoyama, Ahmet Altun, Huiyong Jia, Hui Yang, Takashi Koyama, Da- vide Faggionato, Yang Liu, and William T. Starmer. Adaptive Evolutionary Paths from UV Reception to Sensing Violet Light by Epistatic Interactions. Science Advances, 1(8):e1500162, September 2015. [42] Jiya Sun, Fuhai Song, Jiajia Wang, Guangchun Han, Zhouxian Bai, Bin Xie, Xuemei Feng, Jianping Jia, Yong Duan, and Hongxing Lei. Hidden Risk Genes with High-Order Intragenic Epistasis in Alzheimer’s Disease. J. Alzheimers Dis., 41(4):1039–1056, 2014. [43] Ting Hu, Yuanzhu Chen, Jeff W. Kiralis, Ryan L. Collins, Christian Wejse, Giorgio Sirugo, Scott M. Williams, and Jason H. Moore. An Information-Gain Approach to Detecting Three-Way Epistatic Interactions in Genetic Association Studies. J Am Med Inform Assoc, pages amiajnl–2012–001525, February 2013. [44] Daniel M Weinreich, Yinghong Lan, C Scott Wylie, and Robert B. Heckendorn. Should Evolutionary Geneticists Worry about Higher-Order Epistasis? Current Opinion in Genetics & Development, 23(6):700–707, December 2013. [45] Yinhua Wang, Carolina Díaz Arenas, Daniel M. Stoebel, and Tim F. Cooper. Genetic Background Affects Epistatic Interactions between Two Beneficial Mu- tations. Biology Letters, page rsbl20120328, August 2012. [46] Mats Pettersson, Francois Besnier, Paul B. Siegel, and Örjan Carlborg. Repli- cation and Explorations of High-Order Epistasis Using a Large Advanced In- tercross Line Pedigree. PLoS Genet, 7(7):e1002180, July 2011. [47] Tomoaki Matsuura, Yasuaki Kazuta, Takuyo Aita, Jiro Adachi, and Tetsuya Yomo. Quantifying Epistatic Interactions among the Components Constituting the Protein Translation System. Molecular Systems Biology, 5(1):297, January 2009. [48] Marcin Imielinski and Calin Belta. Exploiting the Pathway Structure of Metabolism to Reveal High-Order Epistasis. BMC Systems Biology, 2:40, 2008. [49] Chia-Ti Tsai, Juey-Jen Hwang, Marylyn D. Ritchie, Jason H. Moore, Fu-Tien Chiang, Ling-Ping Lai, Kuan-Lih Hsu, Chuen-Den Tseng, Jiunn-Lee Lin, and Yung-Zu Tseng. Renin–angiotensin System Gene Polymorphisms and Coronary Artery Disease in a Large Angiographic Cohort: Detection of High Order Gene– gene Interaction. Atherosclerosis, 195(1):172–180, November 2007. 146 [50] Jianfeng Xu, James Lowey, Fredrik Wiklund, Jielin Sun, Fredrik Lindmark, Fang-Chi Hsu, Latchezar Dimitrov, Baoli Chang, Aubrey R. Turner, Wennan Liu, Hans-Olov Adami, Edward Suh, Jason H. Moore, S. Lilly Zheng, William B. Isaacs, Jeffrey M. Trent, and Henrik Grönberg. The Interaction of Four Genes in the Inflammation Pathway Significantly Predicts Prostate Cancer Risk. Cancer Epidemiol Biomarkers Prev, 14(11):2563–2568, January 2005. [51] Daniel Segrè, Alexander DeLuna, George M. Church, and Roy Kishony. Mod- ular Epistasis in Yeast Metabolism. Nat Genet, 37(1):77–83, January 2005. [52] Frank J. Poelwijk, Vinod Krishna, and Rama Ranganathan. The Context- Dependence of Mutations: A Linkage of Formalisms. PLOS Computational Biology, 12(6):e1004771, June 2016. [53] Kenneth J. Rothman, Sander Greenland, and Alexander M. Walker. Concepts of Interaction. Am. J. Epidemiol., 112(4):467–470, January 1980. [54] Wayne N. Frankel and Nicholas J. Schork. Who’s Afraid of Epistasis? Nat Genet, 14(4):371–373, December 1996. [55] Darin R. Rokyta, Paul Joyce, S. Brian Caudle, Craig Miller, Craig J. Beisel, and Holly A. Wichman. Epistasis between Beneficial Mutations and the Phenotype- to-Fitness Map for a ssDNA Virus. PLOS Genet, 7(6):e1002075, June 2011. [56] Martijn F. Schenk, Ivan G. Szendro, Merijn L. M. Salverda, Joachim Krug, and J. Arjan G. M. de Visser. Patterns of Epistasis between Beneficial Mutations in an Antibiotic Resistance Gene. Mol Biol Evol, 30(8):1779–1787, January 2013. [57] François Blanquart. Properties of Selected Mutations and Genotypic Land- scapes under Fisher’s Geometric Model. Evolution, 68(12):3537–54, 2014. [58] Robert B. Heckendorn and Darrell Whitley. Predicting Epistasis from Mathe- matical Models. Evolutionary Computation, 7(1):69–101, March 1999. [59] David W. Hall, Matthew Agan, and Sara C. Pope. Fitness Epistasis among 6 Biosynthetic Loci in the Budding Yeast Saccharomyces Cerevisiae. J Hered, 101(suppl 1):S75–S84, January 2010. [60] G. E. P. Box and D. R. Cox. An Analysis of Transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2):211–252, 1964. [61] R. J. Carroll and David Ruppert. On Prediction and the Power Transformation Family. Biometrika, 68(3):609–615, January 1981. [62] Johannes Neidhart, Ivan G. Szendro, and Joachim Krug. Exact Results for Am- plitude Spectra of Fitness Landscapes. Journal of Theoretical Biology, 332:218– 227, September 2013. 147 [63] Chun-Ting Zhang and Ren Zhang. Analysis of Distribution of Bases in the Coding Sequences by a Digrammatic Technique. Nucl. Acids Res., 19(22):6313– 6317, November 1991. [64] Herve Abdi. The Bonferonni and Sidak Corrections for Multiple Comparisons. Encyclopedia of measurement and statistics, 3:103–107, 2007. [65] S. van der Walt, S. C. Colbert, and G. Varoquaux. The NumPy Array: A Struc- ture for Efficient Numerical Computation. Computing in Science Engineering, 13(2):22–30, March 2011. [66] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Pas- sos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825– 2830, 2011. [67] J. D. Hunter. Matplotlib: A 2d Graphics Environment. Computing in Science Engineering, 9(3):90–95, May 2007. [68] F. Perez and B. E. Granger. IPython: A System for Interactive Scientific Computing. Computing in Science Engineering, 9(3):21–29, May 2007. [69] R. C. MacLean, G. G. Perron, and A. Gardner. Diminishing Returns From Ben- eficial Mutations and Pervasive Epistasis Shape the Fitness Landscape for Ri- fampicin Resistance in Pseudomonas Aeruginosa. Genetics, 186(4):1345–1354, December 2010. [70] Nobuhiko Tokuriki, Colin J. Jackson, Livnat Afriat-Jurnou, Kirsten T. Wyganowski, Renmei Tang, and Dan S. Tawfik. Diminishing Returns and Tradeoffs Constrain the Laboratory Optimization of an Enzyme. Nat Com- mun, 3:1257, December 2012. [71] Sarah Perin Otto and Marcus W. Feldman. Deleterious Mutations, Variable Epistatic Interactions, and the Evolution of Recombination. Theoretical Popu- lation Biology, 51(2):134–147, April 1997. [72] Alesia N. McKeown, Jamie T. Bridgham, Dave W. Anderson, Michael N. Mur- phy, Eric A. Ortlund, and Joseph W. Thornton. Evolution of DNA Specificity in a Transcription Factor Family Produced a New Gene Regulatory Module. Cell, 159(1):58–68, September 2014. [73] Jakub Otwinowski and Joshua B. Plotkin. Inferring Fitness Landscapes by Regression Produces Biased Estimates of Epistasis. PNAS, 111(22):E2301– E2309, March 2014. 148 [74] Joseph Lehár, Andrew Krueger, Grant Zimmermann, and Alexis Borisy. High- order Combination Effects and Biological Robustness. Molecular Systems Biol- ogy, 4(1):215, January 2008. [75] Ting Hu, Nicholas A. Sinnott-Armstrong, Jeff W. Kiralis, Angeline S. Andrew, Margaret R. Karagas, and Jason H. Moore. Characterizing Genetic Interactions in Human Disease Association Studies Using Statistical Epistasis Networks. BMC Bioinformatics, 12:364, 2011. [76] Matthew B. Taylor and Ian M. Ehrenreich. Higher-Order Genetic Interactions and their Contribution to Complex Traits. Trends in Genetics, 31(1):34–40, January 2015. [77] Mark A Bedau and Norman H Packard. Evolution of Evolvability via Adapta- tion of Mutation Rates. Biosystems, 69(2–3):143–162, May 2003. [78] Michael M. Desai. Reverse Evolution and Evolutionary Memory. Nat Genet, 41(2):142–143, February 2009. [79] Adam C. Palmer, Erdal Toprak, Michael Baym, Seungsoo Kim, Adrian Veres, Shimon Bershtein, and Roy Kishony. Delayed Commitment to Evolutionary Fate in Antibiotic Resistance Fitness Landscapes. Nature Communications, 6:7385, June 2015. [80] John H. Gillespie. Molecular Evolution Over the Mutational Landscape. Evo- lution, 38(5):1116–1129, 1984. [81] H. Allen Orr. The Population Genetics of Adaptation: The Adaptation of Dna Sequences. Evolution, 56(7):1317–1330, July 2002. [82] Guy Sella and Aaron E. Hirsh. The Application of Statistical Physics to Evo- lutionary Biology. PNAS, 102(27):9541–9546, May 2005. [83] Hirotogu Akaike. Information Theory and an Extension of the Maximum Like- lihood Principle. In Emanuel Parzen, Kunio Tanabe, and Genshiro Kitagawa, editors, Selected Papers of Hirotugu Akaike, Springer Series in Statistics, pages 199–213. Springer New York, 1998. [84] John H. Gillespie. Population Genetics: A Concise Guide. JHU Press, Decem- ber 2010. [85] Kenneth M. Flynn, Tim F. Cooper, Francisco B.-G. Moore, and Vaughn S. Cooper. The Environment Affects Epistatic Interactions to Alter the Topology of an Empirical Fitness Landscape. PLOS Genet, 9(4):e1003426, April 2013. 149 [86] Zachary D. Blount, Christina Z. Borland, and Richard E. Lenski. Historical Contingency and the Evolution of a Key Innovation in an Experimental Popu- lation of Escherichia Coli. PNAS, 105(23):7899–7906, October 2008. [87] James F. Crow. On Epistasis: Why It Is Unimportant in Polygenic Direc- tional Selection. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 365(1544):1241–1244, April 2010. [88] Mark Lunzer, G. Brian Golding, and Antony M. Dean. Pervasive Cryptic Epis- tasis in Molecular Evolution. PLoS Genet, 6(10):e1001162, October 2010. [89] Jacques Monod. On Chance and Necessity. In Francisco Jose Ayala and Theodo- sius Dobzhansky, editors, Studies in the Philosophy of Biology, pages 357–375. Macmillan Education UK, 1974. [90] Stephen Jay Gould. Wonderful Life: The Burgess Shale and the Nature of Life. New York: Norton, 1989. [91] Simon Conway Morris. Evolution: Like Any Other Science It Is Pre- dictable. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1537):133–145, January 2010. [92] Michael Lässig, Ville Mustonen, and Aleksandra M. Walczak. Predicting Evo- lution. Nature Ecology & Evolution, 1:0077, February 2017. [93] Ryan T. Hietpas, Jeffrey D. Jensen, and Daniel N. A. Bolon. Experimental Illumination of a Fitness Landscape. Proceedings of the National Academy of Sciences, 108(19):7896–7901, October 2011. [94] M. Michael Gromiha, Motohisa Oobatake, and Akinori Sarai. Important Amino Acid Properties for Enhanced Thermostability from Mesophilic to Thermophilic Proteins. Biophysical Chemistry, 82(1):51–67, November 1999. [95] Darin M. Taverna and Richard A. Goldstein. Why Are Proteins Marginally Stable? Proteins: Structure, Function, and Bioinformatics, 46(1):105–109, January 2002. [96] Rafael Couñago, Stephen Chen, and Yousif Shamoo. In Vivo Molecular Evo- lution Reveals Biophysical Origins of Organismal Fitness. Molecular Cell, 22(4):441–449, May 2006. [97] Tobias Sikosek and Hue Sun Chan. Biophysics of Protein Evolution and Evolutionary Protein Biophysics. Journal of The Royal Society Interface, 11(100):20140419, November 2014. 150 [98] Jesse D. Bloom, Claus O. Wilke, Frances H. Arnold, and Christoph Adami. Stability and the Evolvability of Function in a Model Protein. Biophysical Journal, 86(5):2758–2764, May 2004. [99] Sanzo Miyazawa and Robert L. Jernigan. Estimation of Effective Interresidue Contact Energies from Protein Crystal Structures: Quasi-Chemical Approxi- mation. Macromolecules, 18(3):534–552, March 1985. [100] Kit Fun Lau and Ken A. Dill. A Lattice Statistical Mechanics Model of the Conformational and Sequence Spaces of Proteins. Macromolecules, 22(10):3986– 3997, 1989. [101] Leonid A. Mirny, Victor I. Abkevich, and Eugene I. Shakhnovich. How Evo- lution Makes Proteins Fold Quickly. Proceedings of the National Academy of Sciences, 95(9):4976–4981, April 1998. [102] Gregorio Weber. Energetics of Ligand Binding to Proteins. In John T. Edsall C.B. Anfinsen and Frederic M. Richards, editors, Advances in Protein Chem- istry, volume 29, pages 1–83. Academic Press, 1975. [103] Elan Z. Eisenmesser, Oscar Millet, Wladimir Labeikovsky, Dmitry M. Korzh- nev, Magnus Wolf-Watz, Daryl A. Bosco, Jack J. Skalicky, Lewis E. Kay, and Dorothee Kern. Intrinsic Dynamics of an Enzyme Underlies Catalysis. Nature, 438(7064):117–121, November 2005. [104] Hesam N. Motlagh, James O. Wrabl, Jing Li, and Vincent J. Hilser. The Ensemble Nature of Allostery. Nature, 508(7496):331–339, April 2014. [105] K. Gunasekaran, Buyong Ma, and Ruth Nussinov. Is Allostery an Intrinsic Property of All Dynamic Proteins? Proteins, 57(3):433–443, November 2004. [106] Joseph A. Marsh and Sarah A. Teichmann. Protein Flexibility Facilitates Qua- ternary Structure Assembly and Evolution. PLOS Biology, 12(5):e1001870, May 2014. [107] Frederic Rousseau and Joost Schymkowitz. A Systems Biology Perspective on Protein Structural Dynamics and Signal Transduction. Current Opinion in Structural Biology, 15(1):23–30, February 2005. [108] Patrick A. Alexander, Yanan He, Yihong Chen, John Orban, and Philip N. Bryan. A Minimal Sequence Code for Switching Protein Structure and Func- tion. Proceedings of the National Academy of Sciences, 106(50):21149–21154, December 2009. 151 [109] Erik D. Nelson and Nick V. Grishin. Long-Range Epistasis Mediated by Structural Change in a Model of Ligand Binding Proteins. PLOS ONE, 11(11):e0166739, November 2016. [110] Mingyang Lu, Mohit Kumar Jolly, Ryan Gomoto, Bin Huang, José Onuchic, and Eshel Ben-Jacob. Tristability in Cancer-Associated microRNA-TF Chimera Toggle Switch. J Phys Chem B, 117(42):13164–13174, October 2013. [111] Sylvain Bessonnard, Laurane De Mot, Didier Gonze, Manon Barriol, Cynthia Dennis, Albert Goldbeter, Geneviève Dupont, and Claire Chazaud. Gata6, Nanog and Erk Signaling Control Cell Fate in the Inner Cell Mass through a Tristable Regulatory Network. Development, 141(19):3637–3648, October 2014. [112] Sergey Kryazhimskiy, Gašper Tkačik, and Joshua B. Plotkin. The Dynamics of Adaptation on Correlated Fitness Landscapes. Proceedings of the National Academy of Sciences, 106(44):18638–18643, March 2009. [113] Daniel M. Weinreich. High-Throughput Identification of Genetic Interactions in HIV-1. Nat Genet, 43(5):398–400, May 2011. [114] Shozo Yokoyama, Jinyi Xing, Yang Liu, Davide Faggionato, Ahmet Altun, and William T. Starmer. Epistatic Adaptive Evolution of Human Color Vision. PLOS Genet, 10(12):e1004884, December 2014. [115] Anna I. Podgornaia and Michael T. Laub. Pervasive degeneracy and epistasis in a protein-protein interface. Science, 347(6222):673–677, February 2015. [116] Michael B. Doud and Jesse D. Bloom. Accurate Measurement of the Effects of All Amino-Acid Mutations on Influenza Hemagglutinin. Viruses, 8(6):155, June 2016. [117] Evan A. Boyle, Johan O. L. Andreasson, Lauren M. Chircus, Samuel H. Stern- berg, Michelle J. Wu, Chantal K. Guegler, Jennifer A. Doudna, and William J. Greenleaf. High-throughput biochemical profiling reveals sequence determi- nants of dCas9 off-target binding and unbinding. Proceedings of the National Academy of Sciences, page 201700557, May 2017. [118] Thomas A. Hopf, John B. Ingraham, Frank J. Poelwijk, Charlotta P. I. Schärfe, Michael Springer, Chris Sander, and Debora S. Marks. Mutation effects pre- dicted from sequence co-variation. Nature Biotechnology, 35(2):128–135, Febru- ary 2017. [119] Yvonne H. Chan, Sergey V. Venev, Konstantin B. Zeldovich, and C. Robert Matthews. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nature Communications, 8:14614, March 2017. 152 [120] Tyler N. Starr, Lora K. Picton, and Joseph W. Thornton. Alternative evolution- ary histories in the sequence space of an ancient protein. Nature, 549(7672):409– 413, September 2017. [121] Júlia Domingo, Guillaume Diss, and Ben Lehner. Pairwise and higher-order genetic interactions during the evolution of a tRNA. Nature, 558(7708):117– 121, June 2018. [122] Daniel M. Weinreich, Yinghong Lan, Jacob Jaffe, and Robert B. Heckendorn. The Influence of Higher-Order Epistasis on Biological Fitness Landscape To- pography. Journal of Statistical Physics, 172(1):208–225, July 2018. [123] Gideon Schreiber and Alan R. Fersht. Energetics of protein-protein interactions: Analysis ofthe Barnase-Barstar interface by single mutations and double mutant cycles. Journal of Molecular Biology, 248(2):478–486, January 1995. [124] Amnon Horovitz. Double-Mutant Cycles: A Powerful Tool for Analyzing Pro- tein Structure and Function. Folding and Design, 1(6):R121–R126, December 1996. [125] Ákos Nyerges, Bálint Csörgő, Gábor Draskovits, Bálint Kintses, Petra Szili, Györgyi Ferenc, Tamás Révész, Eszter Ari, István Nagy, Balázs Bálint, Bálint Márk Vásárhelyi, Péter Bihari, Mónika Számel, Dávid Balogh, Henri- etta Papp, Dorottya Kalapis, Balázs Papp, and Csaba Pál. Directed evolution of multiple genomic loci allows the prediction of antibiotic resistance. Proceed- ings of the National Academy of Sciences, page 201801646, June 2018. [126] Frank J. Poelwijk, Michael Socolich, and Rama Ranganathan. Learning the pattern of epistasis linking genotype and phenotype in a protein. bioRxiv, page 213835, November 2017. [127] Luca Ferretti, Daniel Weinreich, Fumio Tajima, and Guillaume Achaz. Evolu- tionary constraints in fitness landscapes. Heredity, page 1, July 2018. [128] Wes McKinney. Data Structures for Statistical Computing in Python. pages 51–56, 2010. [129] Matt Newville, Renee Otten, Andrew Nelson, Antonino Ingargiola, Till Sten- sitzki, Dan Allan, Austin Fox, Michał, Glenn, Yoav Ram, MerlinSmiles, Li Li, Christoph Deil, Stuermer, Alexandre Beelen, Oliver Frost, gpasquev, Allan L. R. Hansen, Alexander Stark, Tim Spillane, Shane Caldwell, Anthony Polloreno, andrewhannum, colgan, Robbie Clarken, Kostis Anagnostopoulos, Jose Bor- reguero, deep 42-thought, Ben Gamari, and Anthony Almarza. lmfit/lmfit-py 0.9.11, June 2018. 153 [130] Ben Lehner. Molecular mechanisms of epistasis within and between genes. Trends in Genetics, 27(8):323–331, August 2011. [131] Jakub Otwinowski. Biophysical inference of epistasis and the effects of mu- tations on protein stability and function. arXiv:1802.08744 [q-bio], February 2018. arXiv: 1802.08744. [132] Kristina Crona, Alex Gavryushkin, Devin Greene, and Niko Beerenwinkel. In- ferring genetic interactions from comparative fitness data. eLife, 6. [133] Matteo Figliuzzi, Hervé Jacquier, Alexander Schug, Oliver Tenaillon, and Mar- tin Weigt. Coevolutionary Landscape Inference and the Context-Dependence of Mutations in Beta-Lactamase TEM-1. Molecular Biology and Evolution, 33(1):268–280, January 2016. [134] Adam J. Riesselman, John B. Ingraham, and Debora Susan Marks. Deep gener- ative models of genetic variation capture mutation effects. bioRxiv, page 235655, December 2017. [135] Sam Sinai, Eric Kelsic, George M. Church, and Martin A. Nowak. Variational auto-encoding of protein sequences. arXiv:1712.03346 [cs, q-bio], December 2017. arXiv: 1712.03346. [136] Sheng Wang, Siqi Sun, Zhen Li, Renyu Zhang, and Jinbo Xu. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model. PLOS Computational Biology, 13(1):e1005324, January 2017. [137] Jianzhu Ma, Michael Ku Yu, Samson Fong, Keiichiro Ono, Eric Sage, Barry Demchak, Roded Sharan, and Trey Ideker. Using deep learning to model the hierarchical structure and function of a cell. Nature Methods, 15(4):290–298, April 2018. [138] Sandipan Dutta, Jean-Pierre Eckmann, Albert Libchaber, and Tsvi Tlusty. Green function of correlated genes in a minimal mechanical model of protein evolution. Proceedings of the National Academy of Sciences, 115(20):E4559– E4568, May 2018. [139] R. C. Lewontin and Ken-ichi Kojima. The Evolutionary Dynamics of Complex Polymorphisms. Evolution, 14(4):458–472, 1960. [140] Inna S. Povolotskaya and Fyodor A. Kondrashov. Sequence space and the on- going expansion of the protein universe. Nature, 465(7300):922–926, June 2010. [141] N. H. Barton and B. Charlesworth. Why Sex and Recombination? Science, 281(5385):1986–1990, September 1998. 154 [142] Thomas MacCarthy and Aviv Bergman. Coevolution of robustness, epistasis, and recombination favors asexual reproduction. Proceedings of the National Academy of Sciences, 104(31):12801–12806, July 2007. [143] Paul E O’Maille, Arthur Malone, Nikki Dellas, B Andes Hess, Lidia Smentek, Iseult Sheehan, Bryan T Greenhagen, Joe Chappell, Gerard Manning, and Joseph P Noel. Quantitative Exploration of the Catalytic Landscape Separating Divergent Plant Sesquiterpene Synthases. Nature Chemical Biology, 4(10):617– 623, October 2008. [144] Elena R. Lozovsky, Thanat Chookajorn, Kyle M. Brown, Mallika Imwong, Philip J. Shaw, Sumalee Kamchonwongpaisan, Daniel E. Neafsey, Daniel M. Weinreich, and Daniel L. Hartl. Stepwise Acquisition of Pyrimethamine Resis- tance in the Malaria Parasite. PNAS, 106(29):12025–12030, July 2009. [145] Marna S. Costanzo, Kyle M. Brown, and Daniel L. Hartl. Fitness Trade-Offs in the Evolution of Dihydrofolate Reductase and Drug Resistance in Plasmodium Falciparum. PLOS ONE, 6(5):e19636, May 2011. [146] Leah E. Cowen and Susan Lindquist. Hsp90 Potentiates the Rapid Evolution of New Traits: Drug Resistance in Diverse Fungi. Science, 309(5744):2185–2189, September 2005. [147] Matthew T. G. Holden, Edward J. Feil, Jodi A. Lindsay, Sharon J. Peacock, Nicholas P. J. Day, Mark C. Enright, Tim J. Foster, Catrin E. Moore, Lau- rence Hurst, Rebecca Atkin, Andrew Barron, Nathalie Bason, Stephen D. Bentley, Carol Chillingworth, Tracey Chillingworth, Carol Churcher, Louise Clark, Craig Corton, Ann Cronin, Jon Doggett, Linda Dowd, Theresa Feltwell, Zahra Hance, Barbara Harris, Heidi Hauser, Simon Holroyd, Kay Jagels, Keith D. James, Nicola Lennard, Alexandra Line, Rebecca Mayes, Sharon Moule, Karen Mungall, Douglas Ormond, Michael A. Quail, Ester Rabbinow- itsch, Kim Rutherford, Mandy Sanders, Sarah Sharp, Mark Simmonds, Kim Stevens, Sally Whitehead, Bart G. Barrell, Brian G. Spratt, and Julian Parkhill. Complete genomes of two clinical Staphylococcus aureus strains: Evidence for the rapid evolution of virulence and drug resistance. Proceedings of the National Academy of Sciences, 101(26):9786–9791, June 2004. [148] Douglas D. Richman, Terri Wrin, Susan J. Little, and Christos J. Petropoulos. Rapid evolution of the neutralizing antibody response to HIV type 1 infection. Proceedings of the National Academy of Sciences, 100(7):4144–4149, April 2003. [149] Jesse D. Bloom, Lizhi Ian Gong, and David Baltimore. Permissive Secondary Mutations Enable the Evolution of Influenza Oseltamivir Resistance. Science, 328(5983):1272–1275, June 2010. 155 [150] Robert L. Summers, Megan N. Nash, and Rowena E. Martin. Know your enemy: understanding the role of PfCRT in drug resistance could lead to new antimalarial tactics. Cellular and Molecular Life Sciences, 69(12):1967–1995, June 2012. [151] D. J. Krogstad, I. Y. Gluzman, D. E. Kyle, A. M. Oduola, S. K. Martin, W. K. Milhous, and P. H. Schlesinger. Efflux of chloroquine from Plasmodium fal- ciparum: mechanism of chloroquine resistance. Science, 238(4831):1283–1285, November 1987. [152] C. A. Homewood, D. C. Warhurst, W. Peters, and V. C. Baggaley. Lysosomes, pH and the Anti-malarial Action of Chloroquine. Nature, 235(5332):50–52, January 1972. [153] Andrew F. G. Slater. Chloroquine: Mechanism of drug action and resistance in plasmodium falciparum. Pharmacology & Therapeutics, 57(2):203–235, January 1993. [154] Rowena E. Martin, Rosa V. Marchetti, Anna I. Cowan, Susan M. Howitt, Stefan Bröer, and Kiaran Kirk. Chloroquine Transport via the Malaria Parasite’s Chloroquine Resistance Transporter. Science, 325(5948):1680–1682, September 2009. [155] Heather J. Cordell. Detecting Gene–gene Interactions that Underlie Human Diseases. Nat Rev Genet, 10(6):392–404, June 2009. [156] Jakub Otwinowski and Ilya Nemenman. Genotype to Phenotype Mapping and the Fitness Landscape of the E. Coli Lac Promoter. PLoS ONE, 8(5):e61570, May 2013. [157] Zachary R. Sailer and Michael J. Harms. Uninterpretable interactions: epistasis as uncertainty. bioRxiv, page 378489, July 2018. [158] Christopher K Williams and Carl Edward Rasmussen. Gaussian processes for machine learning. the MIT Press, 2(3):4, 2006. [159] Collin B. Erickson, Bruce E. Ankenman, and Susan M. Sanchez. Comparison of Gaussian process modeling software. European Journal of Operational Research, 266(1):179–192, April 2018. [160] Ian Jolliffe. Principal Component Analysis. In International Encyclopedia of Statistical Science, pages 1094–1096. Springer, Berlin, Heidelberg, 2011. [161] D. R. Cox. The Regression Analysis of Binary Sequences. Journal of the Royal Statistical Society. Series B (Methodological), 20(2):215–242, 1958. 156 [162] Frank E. Harrell. Ordinal Logistic Regression. In Regression Modeling Strate- gies, Springer Series in Statistics, pages 311–325. Springer, Cham, 2015. 157