LÉVY MOTION AND COLD ATOMS by WESLEY W. ERICKSON A DISSERTATION Presented to the Department of Physics and the Graduate School of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy June 2020 DISSERTATION APPROVAL PAGE Student: Wesley W. Erickson Title: Lévy Motion and Cold Atoms This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Department of Physics by: Michael Raymer Chair Daniel A. Steck Advisor Eric Corwin Core Member Marina Guenza Institutional Representative and Kate Mondloch Interim Vice Provost and Dean of the Graduate School Original approval signatures are on file with the University of Oregon Graduate School. Degree awarded June 2020 ii ©c 2020 Wesley W. Erickson This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs (United States) License. iii DISSERTATION ABSTRACT Wesley W. Erickson Doctor of Philosophy Department of Physics June 2020 Title: Lévy Motion and Cold Atoms Lévy processes are a universal model for characterizing the behavior of extreme events and anomalously diffusive systems. They are also important in modeling the transport of laser-cooled atoms. This dissertation contributes to the understanding of Lévy processes themselves, and to their application in laser- cooled atomic dynamics. Lévy processes are models of systems that contain extreme events. However, since any particular event could, upon closer inspection, be composed of multiple smaller events, what makes an event “extreme”? To answer this question it is useful to consider the history of the event, by studying Lévy processes conditioned to have a fixed final state at some later time. We find that events that greatly exceed a particular threshold are more likely to be composed of many small events and a single large event, rather than a series of comparably sized events. Analysis of the source of this threshold helps clarify the Gaussian limit of Lévy processes, iv and serves as the foundation for a useful distinction between “short” and “long” steps. These ideas also suggest possible experimental techniques that may be employed for cold atoms. Studies of anomalous diffusion in laser-cooled atoms have typically focused on the expansion profiles of clouds of cold atoms. However, the results of experiments and theoretical models have not been in close agreement, suggesting that existing models are still incomplete. To address this, it is important to develop alternative experimental approaches. One such approach is to use a single atom as a probe of the diffusion, which allows for the collection of boundary crossing statistics, such as the time between when an atom enters and leaves an imaging region. Through simulations, we find that distributions of these statistics develop peaks that correspond to atomic Lévy flights, in direct contrast to featureless power-law distributions for atoms undergoing Brownian motion. We find that characterizing these distributions gives information on the anomalous- diffusion exponent and typical velocities. Furthermore, these distributions serve as the basis for a method to directly detect Lévy flights at the level of a single atom. v CURRICULUM VITAE NAME OF AUTHOR: Wesley W. Erickson GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED: University of Oregon, Eugene, OR Reed College, Portland, OR DEGREES AWARDED: Doctor of Philosophy, Physics, 2020, University of Oregon Master of Science, Physics, 2014, University of Oregon Bachelor of Arts, Physics, 2012, Reed College AREAS OF SPECIAL INTEREST: Laser-cooled Atoms Computational Physics Stochastic Processes PROFESSIONAL EXPERIENCE: Graduate Employee, University of Oregon, 2017-2020 Graduate Teaching Fellow, University of Oregon, 2012-2017 Senior Reactor Operator, Reed Research Reactor, 2010-2012 Reactor Operator, Reed Research Reactor, 2009-2010 GRANTS, AWARDS AND HONORS: International High Performance Computing Summer School, Toronto, ON, Summer 2015 Research Experience for Undergraduates, University of Arkansas, Summer 2010 vi PUBLICATIONS: Wesley W. Erickson and Daniel A. Steck, “The Anatomy of an Extreme Event: What Can We Infer About the History of a Heavy-Tailed Random Walk?”, submitted for publication (arXiv:2002.03849), (2020). vii ACKNOWLEDGEMENTS I owe a great many thanks to a great many people who have helped me throughout the completion of this dissertation. While I would like to thank everyone by name, below I only mention several, but you all have my sincerest appreciation. First and foremost, I would like to thank my advisor Daniel Steck, who introduced me to truly fascinating and beautiful ideas about the natural world, many of which will continue to hold my curiosity for years to come. More importantly, Dan has been an endlessly optimistic and supportive mentor, and his patience and encouragement were essential as I completed this dissertation. I also want to thank my committee members, Michael Raymer, Eric Corwin, and Marina Guenza, whose feedback and challenging questions helped me clarify and strengthen this work, as well as Steven van Enk, who provided several useful ideas for framing some of the results on conditioned processes. Next, I want to thank all the members of Steck Lab, Eryn Cook, Jonathan Mackrory, Paul Martin, and Richard Wagner, who were all welcoming when I joined the lab, generous in showing me the ropes, and often up for an interesting conversation. Also many thanks to the friends who made my graduate school experience full of wonderful memories: Mayra Amezcua, Sripoorna “SP” Bharadwaj, Gabriel Barello, George de Coster, Herbie Grotewohl, Erik Keever, Sofiane Merkouche, Chris Newby, Dileep Reddy, Rudy Resch, and Richard Wagner. We had many exciting events and adventures, both in physics but just as often outside it, and graduate school was all the better for it. viii I also wish to thank the many curious and supportive members of my family. There are too many to name, but relaxing with you all during the holidays made it easy to return determined. But I do want to especially thank my parents, who have stuck with me for quite a long education. Your unwavering support has made everything easier, I can’t thank you enough. And last but certainly not least, I want to thank my girlfriend Jun, who has been thoughtful and encouraging at every turn. Thank you for everything! ix TABLE OF CONTENTS Chapter Page I. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1. From Random walks to a Universal Stochastic Process . . . . . . 2 1.2. Lévy Statistics and Extreme Events . . . . . . . . . . . . . . . . 3 1.3. Anomalous Diffusion of Laser Cooled Atoms . . . . . . . . . . . . 5 1.4. Organization of this Dissertation . . . . . . . . . . . . . . . . . . 10 II. STOCHASTIC PROCESSES . . . . . . . . . . . . . . . . . . . . . . . 12 2.1. Wiener Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.1.1. Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . 14 2.1.2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.1.3. Fundamental Properties . . . . . . . . . . . . . . . . . . . . 20 2.1.4. Qualitative Behavior . . . . . . . . . . . . . . . . . . . . . . 21 2.1.5. Stochastic Differential Equations . . . . . . . . . . . . . . . . 23 2.1.6. Master Equations . . . . . . . . . . . . . . . . . . . . . . . . 29 2.1.7. Sampling and Simulation . . . . . . . . . . . . . . . . . . . . 30 2.1.8. Boundary Crossing Statistics . . . . . . . . . . . . . . . . . . 32 2.2. Lévy Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.2.1. Lévy Process Properties and Definitions . . . . . . . . . . . 41 2.2.2. Stable Lévy Processes . . . . . . . . . . . . . . . . . . . . . . 43 x Chapter Page 2.2.3. Generalized Central Limit Theorem . . . . . . . . . . . . . . 45 2.2.4. Sampling and Simulation . . . . . . . . . . . . . . . . . . . . 49 2.2.5. Boundary Crossing Statistics . . . . . . . . . . . . . . . . . . 51 III. CONDITIONED RANDOM PROCESSES . . . . . . . . . . . . . . . . 55 3.1. Brownian Bridges . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.1.1. Stretched Brownian Bridge . . . . . . . . . . . . . . . . . . . 56 3.1.2. Midpoint Brownian Bridge . . . . . . . . . . . . . . . . . . . 58 3.1.3. First Passage Times . . . . . . . . . . . . . . . . . . . . . . . 59 3.2. Lévy Bridges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.2.1. Failure of Stretched Lévy Bridges . . . . . . . . . . . . . . . 63 3.2.2. Lévy Midpoint Density . . . . . . . . . . . . . . . . . . . . . 66 3.2.3. Lévy Bridge Probability Bifurcations . . . . . . . . . . . . . 67 3.2.4. Conditioned Sampling . . . . . . . . . . . . . . . . . . . . . 80 3.2.5. Jump Detection . . . . . . . . . . . . . . . . . . . . . . . . . 83 3.2.6. Corrected Stretched Lévy Bridges . . . . . . . . . . . . . . . 84 3.2.7. Conditioned First Passage . . . . . . . . . . . . . . . . . . . 86 3.2.8. Inferring α from a Sample Path . . . . . . . . . . . . . . . . 87 IV. ANOMALOUS DIFFUSION IN SISYPHUS COOLING . . . . . . . . . 91 4.1. Sisyphus Cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.2. Steady-State Momentum Distribution . . . . . . . . . . . . . . . 96 4.3. Brownian Regime . . . . . . . . . . . . . . . . . . . . . . . . . . 100 xi Chapter Page 4.4. Lévy Regime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.5. Power Law Extent . . . . . . . . . . . . . . . . . . . . . . . . . . 104 4.6. Boundary Crossing Statistics . . . . . . . . . . . . . . . . . . . . 107 4.6.1. Brownian Regime . . . . . . . . . . . . . . . . . . . . . . . . 108 4.6.2. Lévy Regime . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 V. CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 APPENDIX: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 A.1. Moments of Momentum Distribution . . . . . . . . . . . . . . . . 117 A.2. Hypergeometric Inversion Formula . . . . . . . . . . . . . . . . . 121 A.3. Equivalence of Discrete and Langevin Methods . . . . . . . . . . 123 REFERENCES CITED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 xii LIST OF FIGURES Figure Page 2.1. Plots of 10 sample paths of a Wiener process. Both plots show the same sample paths (indicated by colors), while scale of each plot is chosen to indicate the qualitative self-similarity at at different scales. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.2. Schematic diagram illustrating first passage times. The process starts at d and ends when it hits the dashed line at x = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.3. Plots of the first passage time distributions for a Wiener process with a single boundary. The top plot is scaled normally, while the bottom plot is log-log scaled to emphasize the t−3/2 power law tail. . . . . . . . . . . 36 2.4. Schematic diagram illustrating escape times. The process starts at d and ends when it hits either dashed line at x = 0 of x = L. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.5. Escape time plots for a Wiener process with two absorbing boundaries. The top plot is shown for a = L/100, which has a clear power law extending over a wide range. The bottom plot is for a = L/2 and has no observable power law. . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.6. Plot of 10 sample paths of the α-stable Lévy process for 3 values of α. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.7. Trajectories from a single truncated Cauchy random walk with steps drawn from Eq. (2.77) with a = 105. Each plot moving to the right contains the entire motion of the previous plot, highlighted in green. For n a the motion appears like a Cauchy random walk, while for n a the motion appears Brownian. . . . . . . . . . . . . . . . . . . . 48 2.8. Demonstration of the applicability of the gCLT even for step distributions with a finite variance. Number of steps taken is indicated by the line color: 101 (blue), 102 (green), 103 (red), 104 (teal), 105 (purple) 106 (yellow). The second plot is scaled such that the distributions overlap with Eq. (2.79) (second plot, black), while the third plot is scaled such that the distributions overlap with Eq. (2.78) (third plot, black). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 xiii Figure Page 2.9. Schematic diagram illustrating the leapover distance l, as well as the distinction between first passage times τp and first hitting times τh for a Lévy process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 2.10. Leapover distributions (Eq. (2.86)) into the interval (−2L, 0) with L = 1 and x0 = 0.1 (red) and x0 = 10.0 (blue). . . . . . . . . . . . . . . . . 54 3.1. Examples of Brownian bridges generated through iterated sampling of the midpoint distribution using Eq. (3.7). The top plot has bridges with final displacement L = 0, while the bottom figure has bridges with L = 10. . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3.2. Schematic diagram illustrating the first passage time for the Brownian bridge. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.3. Examples of stretched Lévy bridges generated through Eq. (3.9) with T = 1 and L = 0. Some of the bridges have noticeable drift, indicating that the stretch method for Lévy bridges does not work properly. . . . 64 3.4. Test showing the failure the equality shown in Eq. (3.10) using the first passage time for F . The LHS of Eq. (3.10) is shown in blue, while the RHS is shown in red. Simulations are for α = 1, with the passage boundary d = 10, time step ∆t = 10−4, bridge constrained time T = 1, and N = 106 trajectories. . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.5. Simulated Lévy bridges with α = 1 and for L = 0 (top) and L = 10 (bottom). Each bridge is generated through 10 iterations of Eq. (3.12) and has 210 steps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.6. Simulated Lévy bridges with α = 1.9 and for L = 2. Each bridge is generated through 10 iterations of Eq. (3.12) and has 210 steps. . . . . . 69 3.7. Plot of the midstep distribution for α = 1 for various L. . . . . . . . . . 70 3.8. Plot of the extrema of the midstep distribution (Eq. (3.14)) for the α = 1 Cauchy case where maxima are shown in red and minima in blue. For comparison, the extrema for both the unshifted form (top plot) and a symmeterized form (bottom plot) are shown. The bifurcation point is easily visible at L = tσ. . . . . . . . . . . . . . . . . 72 3.9. Bifurcation plot for α = 1.0 . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.10. Variation of the midstep density (Eq. (3.12)) with Lévy index α and arrival point L. Curves highlighting maxima (solid/red) and minima (dashed/blue) are superimposed. . . . . . . . . . . . . . . . . . . . . . . 75 xiv Figure Page 3.11. Bifurcation length Lb plotted as a function of α. . . . . . . . . . . . . . 76 3.12. Bifurcation closeup for α = 1.99 . . . . . . . . . . . . . . . . . . . . . . . 78 3.13. Variation of boundaries between “small” and “large” steps with α. Curves indicate bifurcation lengths Lb for which the center of the midstep density has vanishing curvature (red/solid), the midstep distribution develops side peaks (blue/dashed), and side peaks are equal in height to the center peak (green/dot-dashed). Inset: magnified view for α > αc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.14. Typical sample paths of Lévy bridges for α = 1.9, illustrating the qualitative transition with L. Each path was generated through 10 recursive subsamplings from the midpoint distribution (Eq. (3.12)). . . 81 3.15. Typical sample paths of Lévy bridges for α = 1.0. Each path was generated through 10 recursive subsamplings from the midpoint distribution (Eq. (3.12)). . . . . . . . . . . . . . . . . . . . . . . . . . . 82 3.16. Sample path of an unconditioned α-stable Lévy process, α = 1.9. Simulated path has time steps ∆t = T/500 For the top plot, increments ∆x > σ(∆t/T )1/α emphasized (bold/red). For the bottom plot, increments ∆x > Lb(∆t/T ) 1/α emphasized (bold/red). . . . . . . . . . 85 3.17. Simulated conditioned first passage time distributions for α = 1.99999 and d = L/2 are shown for L = 0.1Lb (blue/squares) and L = 2Lb (red/triangles). Exact densities for α = 2 [1] for the same L values are shown for comparison in each case (blue/solid and red/dashed, respectively). Simulations averaged 107 paths, with ∆t = 2−14T . . . . . 88 3.18. Long jump probability as a function of α. . . . . . . . . . . . . . . . . . 89 3.19. Effectiveness of the jump rate method (red) of alpha estimation when compared to the maximum-likelihood method (blue). Mean differences between the estimated and true parameter are shown with solid lines, while dashed lines indicate the standard deviation. . . . . . . . . . . . . 90 4.1. Simulated trajectories of Eq. (4.7) for various values of the diffusion constant D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.2. Plot of Tsallis distribution in Eq. (4.20) with D = 0.1 (red), D = 0.2 (blue), and D = 0.6 (green). . . . . . . . . . . . . . . . . . . . . . . . . . 99 xv Figure Page 4.3. The solid red line shows the fraction of particles with |p| > 1 in the steady state distribution as a function of the diffusion parameter D. The dashed black line is a guide line with slope 1. . . . . . . . . . . . . . 100 4.4. Comparison of position distributions for Sisyphus simulations using Eq. (4.7) and Gaussian diffusion with a diffusion constant given by Eq. (4.24). For D = 0.01, a Gaussian simulation (green) and Sisyphus simulation (blue) are shown. For D = 0.19, a Gaussian simulation (red) and a Sisyphus simulation (yellow) are shown. Simulation parameters are ∆t = 0.1, T = 104, N = 220. . . . . . . . . . . . . . . . . . . . . . . . 102 4.5. Comparison of position distributions for Sisyphus simulations using Eq. (4.7) and anomalous diffusion with a diffusion constant given by Eq. (4.30). For D = 0.3, an α-stable Lévy simulation (brown) and Sisyphus simulation (light blue) are shown. For D = 0.5, an α-stable Lévy simulation (pink) and Sisyphus simulation (orange) are shown. For D = 0.7, an α-stable Lévy simulation (gray) and Sisyphus simulation (green) are shown. Also shown are associated power laws given by Eq. (4.34). Simulation parameters are ∆t = 0.1, T = 104, N = 220. . . (. . ). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 4.6. Plot of log x110 as a function of D for various values of T . . . . . . . 107x0 4.7. Schematic for escape time simulations. . . . . . . . . . . . . . . . . . . . 108 4.8. Escape time distributions for D = 0.1, T = 104. Sisyphus simulations using Eq. (4.7) (solid) and theoretical curves using Eq. (2.66) (dashed) are shown. . . . . . . . . . . 109 4.9. Schematic for transit time distributions. The transit time is given by τtr = t2 − t1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 4.10. Transit time distributions for D = 0.7, T = 104. . . . . . . . . . . . . . . 110 4.11. Escape-time anatomy for D = 0.7, T = 104. . . . . . . . . . . . . . . . . 111 4.12. Selected trajectories for D = 0.7, T = 104, with region size B = 39.909. The appearance of curved trajectories is due to the semi-log scaling; unscaled, these curved trajectories appear relatively straight. . . . . . . 112 4.13. Plot of τmax extracted from transit-time distributions as a function of the region size. Parameters used: D = 0.7, T = 104. . . . . . . . . . . . 113 xvi Figure Page 4.14. Plot of τmin extracted from transit-time distributions as a function of the region size. Parameters used: D = 0.7, T = 104. . . . . . . . . . . . 113 xvii CHAPTER I INTRODUCTION Universality refers to an observed tendency for a broad variety of different systems to display remarkably similar behavior. For example, given the right circumstances, both the motion of a molecule suspended in a fluid [2], and the motion of a star gravitationally interacting with other stars [3], can be effectively modeled by Brownian motion, despite the vastly different scales, composition, and forces involved. While the microscopic details of these systems differ completely, these details are indiscernible when the system is viewed at larger scales as the behaviors converge to the universal ones. This example also alludes to the property of scale invariance, where a system at one scale behaves similarly to itself when viewed at a larger scale. It is this property that is often present in universal systems, as the same mechanism that relates different systems can also relate a system to itself. These universal and self-similar relations often arise as power-law scaling relationships, characterized by universal exponents that are independent of the system parameters. Universal techniques are generalizable to many systems, and the resulting models are more convenient for analysis, even if the true physical system is complex or difficult to measure directly. As there are many classes and subclasses of systems that exhibit universal properties, it is essential to demonstrate concrete examples of universal processes as well as specific applications of these principles to interesting and useful physical systems. In this thesis we will discuss the Lévy processes, stochastic processes constructed from a general type of random walk, which is a natural model for 1 extreme events and anomalous diffusion. As Lévy processes are a generalizable mathematical tool, we will touch on many subjects where they are applicable, before focusing on how they can be applied in the physics of laser-cooled atomic systems. 1.1. From Random walks to a Universal Stochastic Process A convenient starting point for the development of random walks is the year 1900, when Louis Bachelier completed his thesis “The Theory of Speculation,” which studied the pricing of stock options by introducing the first mathematical model of what would later be known as Brownian motion [4, 5]. Only five years later, the first use of the term random walk appeared in a letter to Nature in discussion of a model of the migratory behavior of mosquitoes, which used the same statistics as those in Bachelier’s thesis [6]. And, later that same month, Einstein’s paper on Brownian motion was published [2], explaining that the persistent irregular motion of particles suspended in a fluid (as earlier observed with pollen grains by Robert Brown [7]) could be explained via atomic theory and thermal molecular motion, again with the same random-walk statistics. It is fascinating that such similar behavior was observed in the dynamics of stock prices, animal migratory behavior, and diffusive atomic motion. As might be expected, the commonality of these systems is not accidental: each system used some form of the central limit theorem as the mechanism causing convergence to Brownian motion in the asymptotic limit of long-times. However, while the long time limit was used here, it is natural to ask what the limit looks like at short times. Within 3 years of Einstein’s results, Perrin published experimental results confirming Einstein’s theory, and thus the atomic nature of matter [8]. 2 His experiment involved tracing the trajectories of particles of resin suspended in solution. Additionally, he speculated on the actual trajectory of these particles, and noted “. . . how near the mathematicians are to the truth, in refusing, by logical instinct, to admit the pretended geometrical demonstrations, which are regarded as experimental evidence for the existence of a tangent at each point of a curve” [8]. In other words, a discrete random walk is only a partial sample of the “true” mathematical path, an idea which foreshadows the continuous limit of Brownian motion, known as the Wiener process. The Wiener process is the continuous limit of a Gaussian random walk, and though it does serve as a model for physical Brownian motion, it has some peculiarities. The process has no inertia, it is nowhere differentiable, it is fractal- like and scale-free, and it maintains the issues of Gaussian diffusion that allow for arbitrarily large velocities. So, while the Wiener process can serve as natural model for diffusion, it has a more important role as a mathematical tool and as a foundation for many other stochastic processes. More complex processes can be constructed using the Wiener process (e.g. the Ornstein–Uhlenbeck process, which is in a sense a Wiener process with inertia), and it is omnipresent in stochastic equations (e.g. Langevin equations and Feynman-Kac formulas can be written using the Wiener process). 1.2. Lévy Statistics and Extreme Events In the previous section we discussed the broad applicability of Brownian motion and the Wiener process, both of which are intrinsically Gaussian random processes. However, despite their effectiveness for modeling certain systems, they are often applied beyond their strict regime of applicability. The central limit 3 theorem that underpins the prevalence of Gaussian statistics is not applicable in the random variables with correlations, non-stationary distributions, or heavy tailed distributions, and applying it in these situations can be highly problematic, though it remains a common practice. One such example is in the Black-Scholes model [9] used in the pricing of financial derivatives. The model assumes that the log of stock prices fluctuate as stochastic Wiener processes with a fixed variance. That this model is heavily used, even today, may be somewhat surprising considering that stock prices often have strong correlations, are non-stationary, and are heavy tailed [10]. As such, the Black-Scholes pricing model has sometimes been considered by financial experts as putting “the wrong numbers in the wrong formula to get the right price” [11]. On the other hand, the Black-Scholes model may not accurately predict the “right price” after all. The author of the book The Black Swan [12], where the title refers to events that are rare and difficult to predict, was highly critical of the uses of Black-Scholes models and Gaussian risk management by financial institutions, and wrote “...the financial ecology is swelling into gigantic, incestuous, bureaucratic banks (often Gaussianized in their risk measurement)—when one falls, they all fall.” This portent of disaster was quite timely considering the devastating 2007-2008 financial crisis that began in the months after the book was published. Other misapplications of Gaussian statistics exist, typically with the error being an underestimate of the importance of large but rare events. With the potential failures of Gaussian statistics in mind, it is important to consider alternative models that can account for extreme events. While there is no perfect model, extreme events are commonly associated with probability distributions with “heavy” power-law tails. Lévy processes (specifically, stable 4 Lévy processes [13, 14, 15]) are a natural extension to Wiener processes to incorporate heavy-tailed statistics. Lévy statistics have found a wide range of uses, including modeling atmospheric dynamics, heart beat rhythm [16], stellar dynamics, and even quantum decay [17]. One particularly interesting result is that Lévy flights have been shown to be an optimal foraging strategy under certain conditions [18], partially explaining how the observed flight patterns of the albatross can be modeled with Lévy flights [19], random walks with heavy tailed step distributions. Based on the examples discussed here, “rare” and “extreme” events of Lévy statistics are actually quite prevalent. 1.3. Anomalous Diffusion of Laser Cooled Atoms Lévy processes also play a particularly important role in the understanding of the diffusion of cold atoms, though as we will see, detailed investigations of anomalous diffusion did not occur until relatively late in the development of laser cooling systems. In this section we overview some of the history of laser-cooling systems, highlighting studies involving diffusive properties of these systems, as well as how Lévy statistics are an effective description for more recently studied regimes of anomalous diffusion. One of the early and most effective cooling systems developed for neutral atoms is “optical molasses” [20, 21]. This cooling system relies on both radiation pressure [22], where atoms interacting with light experience a force in the direction the light is propagating, as well as the Doppler effect, where an atom in motion will observe a shift in frequency of incident light. A cooling effect is achieved by setting up counter-propagating beams that are red-detuned from the atomic 5 resonance of the atom species to be cooled. If an atom in motion is interacting with both beams, the beam that opposes the atom’s motion will be blue-shifted into resonance and therefore interact more strongly. This in turn increases the radiation pressure from the beam that opposes the atom’s motion, effectively damping the motion. As damping the motion reduces the kinetic energy, this can be reasonably be considered a cooling system for classical definitions of temperature1. This general cooling effect is known as Doppler cooling, while the setup with counter-propagating cooling beams is referred to as optical molasses as it damps motion in all directions akin to the viscous damping in molasses. The motion of atoms in optical molasses does not damp completely away, however, as photons emitted through spontaneous emission give the atom random momentum kicks, ultimately driving diffusive motion. For simple Doppler cooling, the competing cooling effect and the heating effect of spontaneous emission reach an equilibrium temperature known as the Doppler temperature [23]. This model suggests that a simple diffusion model is appropriate to describe atomic motion in optical molasses, and that the atomic motion can be used to characterize laser cooled atoms, especially for the purpose of measuring temperatures [21]. However, it became apparent that such simple models were inadequate for atomic species with hyperfine structure, when temperatures were measured far below the Doppler temperature. This led to the development of more sophisticated quantum models to explain the observed sub-Doppler cooling effect. The primary effect leading to the unexpectedly cold temperatures was found to be Sisyphus cooling, which is also the cooling force considered in this dissertation (see Chapter IV for more details on Sisyphus cooling). The more detailed quantum 1In the case of a single atom, various definitions of temperature may not be well defined, but these “cooling” systems still damp the atom’s motion. 6 models also led to the development of new cooling techniques such as velocity- selective coherent population trapping [24]. Interestingly, the cooling through velocity-selective coherent population trapping is a diffusive process rather than a friction force. The technique involves optically pumping slowly moving atoms into a dark state, where the atoms effectively decouple from the driving lasers due to destructive quantum interference. While in the dark state the atoms no longer experience heating due to spontaneous emission, and so remain moving slowly. Atoms which are not in a dark state still experience the typical cooling and heating effects, until their momentum happens to be small enough such that the atom is pumped into the dark state, where atoms tend to stick. Because the dark state corresponds to small atomic momentum, the atoms are effectively cooled. In addition to being an interesting diffusion-based cooling mechanism, this also led to the first use of Lévy statistics in the description of laser-cooled atoms, as the time that the atom remains in the dark state is random, with a power-law- tailed distribution [25]. Significant work has been done on analyzing the Lévy statistics for this type of cooling mechanism [26]. Our focus here, however, will be on Sisyphus cooling. The full quantum models for Sisyphus cooling can be complex [27, 28, 29]. As semiclassical models [30, 31] were developed, they were placed on a rigorous footing through both perturbative calculations [32] as well as through numerical simulations [31]. Semiclassical approximations treat the laser fields and atomic position classically, while maintaining the quantum nature of the atomic state. These models were often simplified further by a process of adiabatic elimination, where changes in the atomic state are assumed to occur much more quickly than motion of the atom, which allowed for investigation of the diffusive behavior of 7 Sisyphus cooling. These semiclassical models were used to develop a detailed theory of optical molasses, including the velocity distribution (a particular heavy- tailed distribution known as the Tsallis distribution [33]), as well as a more thorough investigation of the spatial diffusion properties [30]. These earlier studies still primarily focused on regimes of normal diffusion; the anomalous diffusion that we will discuss shortly was referred to as décrochage, or disintegration, and was generally considered undesirable as the density of confined atoms would rapidly drop in these regimes. The rational here was due to a major motivation behind laser cooling at the time, which was to reach Bose- Einstein condensation. This required high atomic densities through the use of magneto optical traps [34]; the dense atoms would then be “boiled” off through evaporative cooling techniques [35] to reach condensation, so high atom counts were critical. However, it was found in Ref. [30] the diffusion constant diverges in the regime of large detuning and low intensity laser fields, and upon closer investigation it was found that this divergence was a transition to a regime that could be described by anomalous diffusion and the onset of Lévy flights [31]. Unlike the velocity-selective coherent population trapping, these Lévy flights2 occur in space, leading to anomalous diffusion. While regular diffusion has the √ property that the standard deviation of the diffusing particles will grow as t, anomalous diffusion refers to any process where the growing width of an ensemble distribution scales as tµ for some µ 6= 1/2. While the onset of anomalous diffusion had been observed in Sisyphus cooling [30], more statistical signatures of anomalous diffusion and Lévy-flight 2Technically these are known as Lévy walks, which specify that the particle velocity is always finite. The distinction is not completely trivial (see [36]), but Lévy processes can be used to describe Lévy flight behavior in the same sense that a Gaussian diffusion can be used to describe physical processes even though Gaussian diffusion has no maximum velocity. 8 statistics were observed by studying the motion of a single trapped ion [37]. These results were in qualitative agreement with the theoretical work in Ref. [31], with the onset of anomalous diffusion occurring for shallow optical potential depth, though the optical potential depth for the onset of the diffusion did not match theoretical values. More recent experiments used clouds of atoms to investigate anomalous diffusion in Sisyphus cooling [38], which again found the onset of anomalous diffusion for shallow optical potentials. However, this cloud based experiment characterized the anomalous diffusion exponent using multiple methods, with results that suggested the presence of multiple distinct anomalous diffusion exponents. For simple models of anomalous diffusion, the parameter that describes the power law growth of the standard deviation should match the parameter that describes the power-law tail of the position distribution. However, this experiment found disagreement between these measured exponents. Theoretical work studying the onset of anomalous diffusion in more depth [39] also found that the anomalous diffusion should be characterized by a single universal exponent, indicating that current models may be missing some key component. Moreover, a phase transition to a regime where the anomalous diffusion exponent is fixed at α = 3/2, known as the Obukov-Richardson phase, has been predicted but not yet observed experimentally. This phase corresponds to integrated Brownian motion, and is in the limit where the cooling force is negligible but the heating effect due to spontaneous emission is still present. It has been proposed that some of these differences between theory and experiment may be due to loss effects via inelastic collisions and molecule formation [39], especially since the losses themselves may contribute to the 9 appearance of anomalous diffusion [40]. To circumvent such potential issues, and to enable alternative experimental probes of Lévy dynamics of laser-cooled atoms, in this dissertation I investigate the possibility of using experiments with single atoms to probe the stochastic dynamics. Single-atom experiments would open up the possibility of studying statistics not accessible to experiments with atomic clouds, such as boundary-crossing and escape statistics. Measurements of these statistics would complement existing measurements, which have been limited to characterizations of the scaling behavior of ensemble densities. 1.4. Organization of this Dissertation This dissertation is organized as follows: Chapter II introduces some basic concepts of stochastic processes, including the Wiener and Lévy processes and some important statistics. Building on this foundation, Chapter III explores conditioned stochastic processes, first introducing the well known Brownian bridge in Section 3.1, before focusing on the less common Lévy bridge in Section 3.2. The new results of this dissertation begin in Section 3.2.3, where I find that a conditioned displacement of the final state for the Lévy bridge experiences a bifurcation when the conditioned length exceeds a certain threshold. The displacement at which this bifurcation occurs defines a novel length scale which naturally characterizes many features of the conditioned Lévy process. Furthermore, I show that it has interesting uses such as detecting extreme events (Section 3.2.5), generating Lévy bridges (Section 3.2.6), explaining properties of first passage times (Section 3.2.7), and parameter inference of a Lévy process from data (Section 3.2.8). Finally, Chapter IV reviews existing theory on the onset of anomalous diffusion in Sisyphus cooling. Original work begins in Section 4.5 with 10 a formula that characterizes the extent of power-law scaling which is useful in the design of experiments studying Lévy behavior with laser-cooled atoms. New results continue in Section 4.6, where through simulations I establish the existence of a peak in the boundary-crossing statistics that is connected to the underlying Lévy- type behavior, and derive scaling relations for its extent. These results serve as a starting point for development of future single-atom experiments. 11 CHAPTER II STOCHASTIC PROCESSES This chapter reviews basic, well-known background material for handling stochastic processes, which serves as a foundation for understanding the conditioned processes investigated in Chapter III, and for understanding the dynamics of laser-cooled atoms in Chapter IV. There are a variety of good general references for this material [14, 15, 41, 42, 43], as well as some with specific focus on Lévy processes [16, 44, 45, 46]. Understanding the transport properties of cold atoms in Sisyphus cooling requires modeling of random processes at several different levels. First, the underlying physical process involves spontaneous emission, an inherently stochastic process that gives random momentum kicks to the cooled atoms. Next, the motion of the atom itself is the time integral of the momentum, and so is a sum of all the incremental random kicks. Finally, the overall diffusive process incorporates all possible trajectories each atom could take. These levels are each described by different machineries in the language of stochastic processes: The underlying physical process is modeled as an Itō stochastic differential equation for the momentum; the motion of the atom is the stochastic integral of the momentum equation; and the aggregate diffusive process is the evolution of a probability distribution through a master equation. And so, in this chapter we discuss some of the essential aspects of stochastic processes necessary for the understanding of cold-atom transport. The subject of stochastic processes is a broad one. It covers the analysis of random systems, lending itself well to describing many physical systems like 12 molecular diffusion and electrical noise, as well as less physical systems like fluctuations in the stock market and animal birth-death rates [14]. Despite this breadth in application, there are only a few fundamental random processes from which more complex stochastic systems can be constructed. In Section 2.1 we will discuss the Wiener process, an idealized mathematical description of Brownian motion. It is the one of the most widely used stochastic processes due to a universality among stochastic processes, where seemingly disparate systems ultimately have identical limiting behavior. The source of the mechanism that causes this universality for Wiener process is the central limit theorem, one of the most celebrated results of probability theory. We will also discuss the basics of stochastic differential equations, some important examples, as well as several statistics for the Wiener process. Next, in Section 2.2 we will discuss Lévy processes. These can be thought of as an extension to the Wiener processes by relaxing a continuity requirement to allow for systems involving rare and dramatic events, which appear as discontinuous “jumps” in the system’s evolution. We will also discuss a generalization of the central limit theorem that naturally leads a certain class of Lévy processes known as the α-stable Lévy processes. We will show that these processes are particularly important for models of anomalous diffusion, despite their apparently pathological behaviors like discontinuous sample paths which lead to the possibility for Lévy processes to jump across an absorbing region without being absorbed. 2.1. Wiener Processes The Wiener process is a continuous-time random walk that is an idealized mathematical description of Brownian motion. While true physical processes 13 differ somewhat from the Wiener processes, the Wiener process is a convenient mathematical object, and remains an excellent approximation for many systems exhibiting Brownian motion. To show why this is, we will start by deriving the central limit theorem (Section 2.1.1), which underpins the the importance of the Wiener process. Following this we will give an informal definition of the Wiener process as a limit of a discrete random walk (Section 2.1.2), as well some basic properties (Section 2.1.3). Finally, we will introduce several tools of stochastic calculus, including stochastic differential equations (Section 2.1.5), master equations (Section 2.1.6), methods for sampling stochastic processes (Section 2.1.7), and certain key statistics for the Wiener process (Section 2.1.8). 2.1.1. Central Limit Theorem One of the most interesting and useful results in the study of probability and statistics is the central limit theorem (CLT), where the sum of many random numbers from an unknown distribution has an asymptotically Gaussian distribution. This somewhat counter-intuitive result, where the combination of minimally described random numbers results in a precise statistical description, is broadly applicable. It explains quite generally why so many measured statistics appear Gaussian and is an important component of the explanation for regular diffusive behavior in cold atoms discussed in both the introduction and in Section 4.3. While there are many ways of presenting the CLT, we will show a brief proof using a random walk representation similar to proofs in Refs. [43, 47], which will be a useful formalism for motivating the Wiener process in the following section. A simple, one-dimensional random walk is a series of random steps ∆xi drawn from a 14 distrib∑ution f∆x, such that after n steps the random walker has a position given by x nn = i=1 ∆xi. Using this notation, a basic statement of the CLT is as follows: If one takes n independent random variables ∆xi drawn from an arbitrary distribution f∆x with a known mean µ = 〈∆x〉 and variance σ2 = 〈(∆x−µ)2〉, the sum of these numbers, xn, has the asymptotic, specific distribution given by the Gaussian 1 2 2 fxn(x) = √ e−(x−µ) /2nσ . (2.1) 2πnσ2 The proof for this is simplified if we use the shifted and scaled variables (∆√xi − µ)∆zi = (2.2) nσ which have zero mean and variance 1/n. The motivation for this choice will become clear later, but we are essentially scaling out the expected linear growth of variance in n. Using these variables ∆zi, the analogous sum to xn is ∑n zn = ∆zi. (2.3) i=1 As this is the sum of independent random variables, the probability distribution will be given by an iterated convolution of the individual distributions. Since each step has the same distribution, the probability distribution for zn is given by fzn(x) = [f∆z ∗ f∆z ∗ . . . ∗ f∆z](x), (2.4) 15 where we have n− 1 convolutions. From the convolution theorem, we can write the Fourier transform for fzn as a product of Fourier transforms of f∆x, which is F [fzn ](k) = F [f∆z](k)F [f∆z](k) . . .F [f∆z](k), (2.5) where we have defined the Fourier transform by ∫ ∞ F [f∆z](k) = dx f (x)eikx∆z . (2.6) −∞ We now can expand the exponential as a sum to find ∑∞ ∫ F 1 ∞ [f ](k) = (ik)m dx f (x)xm∆z ∆z , (2.7) m! m=0 −∞ where we see that the integral is just the mth moment of f∆z. Since the mean is zero and the variance is 1/n this expansion can be written as 2 F k[f∆z](k) = 1− +O(k3). (2.8) 2n Putting this into our expression in Eq. (2.5) gives us ( )n F − k 2 [f ](k) = 1 +O(k3zn ) . (2.9)2n We now can see that terms of order km are proportional to n−m/2, which justifies keeping only to lowest order in n in the limit n → ∞. Then, in this limit we can use the relation ex = lim (1 + x/n)nn→∞ to find F [f −k2/2zn ](k) = e , (2.10) 16 which through the inverse Fourier transform gives us √1 −x2f /2zn(x) = e . (2.11) 2π Finally, by unscaling xn as in Eq. (2.2) to we find the desired result in Eq. (2.1). 2.1.2. Definitions The CLT gives good reason to expect many systems to exhibit Gaussian behavior above some length or time scale, even if the underlying process is not fundamentally Gaussian. The Wiener process, however, represents an underlying process that explicitly consists only of infinitesimal Gaussian increments at all scales. They can be used as an ideal mathematical object that is conveniently manipulated to find exact results for many statistics, or as a particularly useful model for many real-world systems when the true underlying non-Gaussian behavior is experimentally inaccessible or irrelevant. Informally, a Wiener process W (t) can be defined as the continuous time analog of a discrete Gaussian random walk with n steps occurring at time intervals ∆t = t/n. The final position of such a walk is given by ∑n xn = ∆xi, (2.12) i where the ∆xi are independently drawn from a Gaussian probability distribution 1 2 2 f −x /2σ∆x (x) = √ e . (2.13)i 2πσ2 17 As the sum of independent Gaussian random variables is also a Gaussian random variable, we have the distribution for final position given by √ 1 2f (x) = e−x /2nσ2xn . (2.14) 2πnσ2 To properly translate this discrete walk into a continuous form, we would like the final position xn to correspond to the final position W (t) as we take the limit n → ∞. That is, we want 1 2 fW (t)(x) = √ e−x /2t (2.15) 2πt to match Eq. (2.14), as t ∝ n. However, as we increase n we see that we must scale σ such that the distribution for fxn does not diverge and maintains the appropriate scaling behavior. The choice for the variance of the individual steps ∆xi that maintains Eq. (2.14) is σ2 = ∆t. Taking this assumption we now can define the Wiener process as the limit ∑n W (t) = lim ∆x →∞ i . (2.16) n i=∑1 Note that similarly, we have W (t) = lim n σ2n→∞ i=1 . Notationally we can write Eq. (2.16) as a stochastic integral ∫ t W (t) = dW (s), (2.17) 0 where dW is an infinitesimal increment of the Wiener process. The Wiener increment dW can be thought of as a Gaussian random variable with a variance dt. This is because the variance of W (t) must grow linearly in time to be consistent with the CLT and with fW (t), so each of the increments dW must 18 accordingly have variance dt. While we have not proven this explicitly, this relation is known as Itō’s lemma [15] and can be written as dW 2 = dt. (2.18) Using the independence of Wiener increments, as well as Itō’s lemma, we can write the variance of the Wiener process as 〈∫ t ∫ t 〉 〈∫ t 〉 ∫ t 〈W 2(t)〉 = dW (s) dW (s′) = dW 2(s) = dt = t, (2.19) 0 0 0 0 which demonstrates some convenient manipulations commonly used to simplify stochastic integrals. Note that we are using the angle brackets to indicate an ensemble average. One remaining tool for Itō calculus is a generalization of the chain rule. For a function y(x, t) the typical chain rule can be written as ( ) ( ) ∂y ∂y dy = dx+ dt (2.20) ∂x ∂t which can be thought of as a Taylor expansion of y(x + dx, t + dt), dropping terms higher than first order. However, if dx contains a term proportional to a Wiener increment, we can see that we must keep terms containing dW 2 since they are actually first order in dt. This is done simply by adding the next term from the Taylor expansion, giving us the Itō chain rule ( ) ( ) ( ) ∂y ∂y 1 ∂2y dy = dx+ dt+ dx2. (2.21) ∂x ∂t 2 ∂x2 19 This covers a basic introduction to Wiener processes, which are a natural tool due to the CLT. In the next section we discuss some essential properties of the Wiener process. 2.1.3. Fundamental Properties The Wiener process has several fundamental mathematical properties, which we briefly discuss here; more detailed explanations can be found in Ref. [13, 44, 47]. 1. Stationary Increments. Also known as time-homogeneity, this states that that all Wiener process increments only depend on the current state and not on any other variables. That is, all elements of {W (t0 + t) −W (t0) : t ≥ 0} are independent of t0. 2. Independence of increments. For any choice of n ≥ 1 and 0 ≤ t0 < t1 < . . . < tn, all increments from W (t0),W (t1) − W (t0), . . . ,W (tn) − W (tn−1) must be statistically independent. 3. Infinite Divisibility. Processes that have both stationary and independent increments are called infinitely divisible, so this property also holds for the Wiener process. As the term indicates, the process W (t) can be split into a sum of N independent identically distributed random variables. 4. Martingale. The martingale property is the requirement that for all t > t0 > 0, the expectation value 〈W (t − t0)〉 = W (t0). This is essentially a statement that there is no deterministic drift. 20 5. Continuity. The Wiener process must satisfy the standard continuity requirement: For all t and for all  > 0 there exists a δ > 0 such that |W (t+ δ)−W (t)| < . 2.1.4. Qualitative Behavior In addition to the formal properties, there are many peculiar qualitative behaviors of the Wiener process that are important to understand. Sample Path Behavior. As stochastic processes depend on random elements with multiple possible outcomes, there are many solutions for a given stochastic process. Each possible solution to a stochastic process is known as a sample path. As an example we have plotted 10 sample paths for the Wiener process in Fig. 2.1. As we can see, the individual sample paths can vary greatly, and consistent results will only occur only in distributions of possible solutions. However, a single sample path is still useful for a qualitative understanding of the motion, as the sample paths can actually represent the microscopic behavior of a physical process. Furthermore, for real-world data an ensemble is not always available, and the information available from analyzing a single sample path can still be valuable. Boundary Crossing Behavior. The fractal-like nature of the Wiener process has some unusual properties when considering the crossing of a boundary. First, if a Wiener process crosses some threshold, one might näıvely assert that there must then be a well defined first crossing point. However, this interpretation has issues for a Wiener process considered at a finite resolution. While one resolution may have a crossing time at t1, any of the possible refinements (i.e. higher resolution sample paths that hit the same points and have consistent 21 2 0 -2 0 t 1 6 2 0 -2 -6 0 1 t 9 FIGURE 2.1. Plots of 10 sample paths of a Wiener process. Both plots show the same sample paths (indicated by colors), while scale of each plot is chosen to indicate the qualitative self-similarity at at different scales. statistics) of this process may have different crossing times at a distinct time t2 6= t1 or even multiple crossing times. In fact, a true Wiener process that crosses a boundary at least once will (almost surely) have an infinite number of crossings in the vicinity, and any given crossing for a finite resolution sample path will inevitably have an arbitrarily large number of nearby crossings when the sample paths are refined. Fortunately, the assertion that there must be a well defined first crossing point is actually correct for the true Wiener process [48], which allows us to 22 Wo(t) Wo(t) define well behaved distributions like the first passage time, which we explore in Section 2.1.8. Still, there are some cases that are somewhat problematic. For example, an excursion is defined as a process that starts and ends at the origin. For a Wiener process that starts at the origin, the first passage time at the origin is simply t = 0 since every crossing will have an infinite number of nearby crossings. This can be problematic because excursions can play a useful role in analyzing random walk behavior (see for instance Ref. [39], where this issue is mitigated by considering boundaries with a size  > 0 and take the limit  → 0 as needed). Fractal and Scale Free Behavior. Fractals are a broad class of self-similar geometric objects, where the object at one scale has similarities to itself at other scales. Since the Wiener process is fundamentally random, it is incredibly unlikely for a Wiener sample path to have any exact self-similarity. However, the Wiener process does have statistical self-similarity which firmly places it as an example of a random fractal. This self-similar behavior can be seen in Fig. 2.1, where each plot shows the same set of sample paths, but shown over different interval lengths and at different scales. Following the variance scaling of the Wiener process, the √ time axis is scaled by s while the length axis is scaled by s, where s = 3. Thus, with the appropriate scaling the qualitative appearance of these plots is similar, even if the individual trajectories are quite different. 2.1.5. Stochastic Differential Equations As we have seen, the Wiener process is a useful description of many random processes due to the CLT. However, it also is useful in describing a class of random systems beyond basic Brownian motion through appropriate stochastic differential 23 equations. Thus the Wiener process is not only fundamental because of the CLT, it also is a key building block for many other random processes. The Itō stochastic differential equation, given by dx = h(x, t)dt+ g(x, t)dW, (2.22) where h(x, t) is the drift coefficient and g(x, t) is the diffusion coefficient, describes the evolution of a process x(t) that can be computed through a stochastic integral. As is true for all SDEs, there are many possible solutions to this equation. We will now discuss two important examples that can be written in the form of Eq. (2.22) that are useful for the analysis and interpretation of the diffusion in cold atoms in Chapter IV. 2.1.5.1. Ornstein–Uhlenbeck Process The Ornstein–Uhlenbeck process (a Wiener process plus linear damping) is a common model for the momentum of a massive particle that results in physically observed Brownian motion. As such, the integrated Ornstein–Uhlenbeck process asymptotically approaches a Wiener process, but the particles in this model have inertia and do not instantaneously change direction as happens for the true Wiener process. It is also a successful model for some regimes cold atom diffusion, so it is a natural system to use for comparison to diffusive regimes that have non-linear damping. 24 The Ornstein–Uhlenbeck process (dp) and the corresponding particle position (x) is described by the pair of stochastic differential equations dp = −γp dt+ σp dW, (2.23) p dx = dt. (2.24) m where γ is a damping constant and σp is a momentum diffusion constant. For convenience we use the scaled variables t→ t/γ, p→ pm, along with the definition √ σ = σp/m γ, which gives the dimensionless equations dp = −p dt+ σ dW, (2.25) dx = pdt. (2.26) The typical method to solve this system is to notice that the homogeneous form of Eq. (2.25) (i.e. σ = 0) has the solution p(t) = p e−t0 , (2.27) which in turn motivates the transformation y = pet. (2.28) To find the increment dy, we apply the Itō chain rule in Eq. (2.21) resulting in ( ) ( ) dy = et (−p dt+ σ dW ) + pet dt (2.29) = etσ dW. (2.30) 25 We can now integrate this to find y(t), ∫ t y(t) = p0 + σ e sdW (s), (2.31) 0 where p0 is an integration constant. Finally, we transform back via Eq. (2.28) to find ∫ t p(t) = p e−t0 + σ e s−tdW (s). (2.32) 0 This has a nice form with a transient p0e −t that will vanish at long times, and a second term which is simply the integral of a Gaussian random variable (and thus is also a Gaussian with a variance that depends on t). As the mean is clearly zero, we just need to calculate the variance of this second term to fully characterize it, which can by found by 〈( ∫ ) 〉t 2 ( ∫ t )( ∫ t ) σ es−tdW (s) = σ es−t ′ 〈 〉 dW (s) σ e s −tdW (s′) (2.33) ( ∫0 t ) 0 02 ∫ t σ es−tdW (s) = σ2〈 〉 e −2t e2sds (2.34) ( ∫0 ) ( 0t 2 2− )σ es t σdW (s) = 1− e−2t . (2.35) 0 2 While this characterizes the momentum, we still have not solved for the position evolution. To find the position as a function of time, we must integrate Eq. (2.26): ∫ t ∫ t ∫ s′ ′ x(t) = x0 + p0e −sds+ σ ds′e−s esdW (s). (2.36) 0 0 0 The second term is easy to integrate giving ∫ t p −s −t0e ds = p0(1− e ). (2.37) 0 26 Since the third term in Eq. (2.36) is a Gaussian random variable with zero mean, we only need to compute the variance σ2x which is given by 〈[∫ t ∫ s ] [∫1 t ∫ s ]〉3 σ2 = σ2 ds e−s1 es2x 1 dW (s ) ds −s3 s4 2 3e e dW (s4) . (2.38) 0 0 0 0 Since the stochastic increments dW are independent random variables, all contributions to the integral will vanish in the average unless the increments dW (s2) and dW (s4) are the same increment. This allows us to rewrite this integral with the replacement s4 → s2 giving us ∫ t ∫ t ∫ min(s1,s3) σ2 = σ2 ds (2s2−s1−s3)x 1 ds3 e ds2, (2.39) 0 0 0 where we have dropped all uncorrelated products of dW . Notice that the limit for the s2 integral is the minimum of s1 and s3, as the integration limits must be truncated to the range where each dW (s2) has a complementary dW (s4). To continue simplifying this integral, it helps to think about it as an integral over a subsection of a t × t × t cube, where we only include the volume of a “quarter-pyramid”; the subsection of the cube selected by the integration limits is equivalent to the following integral: ∫ t ∫ t ∫ t σ2x = σ 2 ds ds ds e2s2−s1−s32 3 1 . (2.40) 0 s2 s2 Evaluating this, we have ( ) σ2 2 − 3 −t − 1= σ t + 2e e−2tx . (2.41)2 2 27 This gives us the expected asymptotic linear growth in the variance as well as an exponentially damped transient. However, it is interesting that there are two different transients proportional to e−t and to e−2t. Finally, we can combine the results from Eqs. (2.37) and (2.41) into the probability density √ ( )1 (x− x − p (1− e−t 20 0 ))f(x, t) = exp − , (2.42) 2π(σx(t))2 2(σx(t)) 2 which is just a Gaussian with a particular time-dependent mean and variance. 2.1.5.2. Integrated Brownian Motion One other commonly used process that will be relevant for later analysis is integrated Brownian motion. This is equivalent to the Ornstein–Uhlenbeck Process with a vanishing damping coefficient, and so we have the equations dp = σ dW, (2.43) dx = pdt, (2.44) which we can see leads to ∫ t dx = σ W (s)dt. (2.45) 0 28 This has the well known variance scaling of 〈x2〉 ∼ t3, which can be seen via ∫ t x(t) = σ ∫ W∫ (s)ds (2.46)s=0t s x(t) = σ ∫ dW∫(u)ds (2.47)s=0 u=0t t x(t) = σ ∫ dW (u) ds (2.48)u=0 s=ut x(t) = σ (t− u)dW (u), (2.49) u=0 where we have changed the order of integration. Since we have a sum of Gaussian random variables, we only need to compute the variance ∫ t ∫ t 〈x(t)2〉 = σ2 ∫ (t− u)dW (u) (t− u)dW (u) (2.50)0 0t 〈x(t)2〉 = σ2 (t− u)2dt (2.51) 0 σ2〈x(t)2〉 = t3, (2.52) 3 which gives us the expected scaling behavior. 2.1.6. Master Equations Master equations are differential equations that describe the evolution of a probability density. While the Itō SDEs can be thought of as describing the evolution of an individual particle, master equations can be used to describe either the probability of different possible outcomes for a single particle, or evolution of the density of an ensemble of particles. In both cases the trajectory of individual particles is not readily apparent and so this picture is more suited as a macroscopic description of a system. 29 The equivalent master equation to the Itō SDE in Eq. (2.22) is known as the Fokker–Planck equation 1 ∂tρ(x, t) = −∂x[h(x, t)ρ(x, t)] + ∂2x[D(x, t)ρ(x, t)], (2.53)2 where D(x, t) = g(x, t)2 is the diffusion coefficient. As a simple example, we can look at the solution in the case of the pure Wiener process, where we have h(x, t) = 0, D(x, t) = σ2. The Fokker Planck equation then reduces to the heat equation ∂ σ2 ∂ ρ(x, t) = ρ(x, t). (2.54) ∂t 2 ∂x2 The propagator [the solution given ρ(x, t = 0) = δ(x)] is just the Gaussian kernel ρG(x, t) = √ 1 −x2/2tσ2e (2.55) 2tσ2 which allows us to express the solution for an arbitrary initial distribution ρ(x, t = 0) as a convolution: ∫ ∞ ρ(x, t) = ρG(x− x′, t)ρ(x, t = 0)dx′. (2.56) −∞ 2.1.7. Sampling and Simulation This section outlines standard numerical methods used for integrating stochastic differential equations involving a Wiener process. These methods are used for producing Brownian bridge statistics in Chapter III, as well as the diffusion simulations of the Sisyphus cooling force in Chapter IV. Methods of 30 numerically solving ODEs often involve discretizing a continuous process into finite time steps ∆t and incrementally solving for the state at each discrete time n∆t. These methods can be employed for the numerical simulations of SDEs, but there are some additional challenges compared to simulations for ODEs. First, since there are a multitude of solutions, to properly sample an SDE we must collect a series of sample paths until the statistics settle and the behavior is well characterized. This can be substantially more computationally expensive than simulations for ODEs. Second, the addition of Gaussian noise at each discrete time step leads to poorer performance than similar algorithms for ODEs. Finally, certain numerical algorithms for SDEs only have weak convergence, where the simulation only has correct results for ensemble averages, while the individual paths may not reflect the true process. If the behavior of the individual paths is important (as is the case for statistics like first passage times in Section 2.1.8), then we require the algorithm to have strong convergence. Despite these additional concerns, some algorithms for SDEs are very similar to algorithms for ODEs. To numerically simulate an Itō SDE of the form dx = h(x, t)dt+ g(x, t)dW (2.57) the simplest approach is to use the Euler–Maruyama method [14, 49]. This is done by approximating the state x(t) at discrete time steps ∆t by xn ≈ x(t0 + n∆t) where the xn are iteratively generated through √ xn+1 = f(xn, n∆t)∆t+ g(xn, n∆t) ∆t Zn. (2.58) 31 The Zn are independent Gaussian random numbers with zero mean and unit variance. While similar to the Euler method for ODEs, the single-path error √ for the Euler–Maruyama method is of order ∆t. This is worse than the linear error for the Euler method, or ∆t4 for the (fourth order) Runge–Kutta method, so a small time increment can be necessary. However, with the availability of powerful graphics processors, the Euler–Maruyama method was adequate for our simulations with sufficiently small time steps. More complex and higher order methods for SDEs discussed in Refs. [49, 50, 51] may be useful if the Euler– Maruyama method has problematic convergence. 2.1.8. Boundary Crossing Statistics So far we have discussed stochastic differential equations, which describe the evolution of a stochastic trajectory, as well as the Fokker–Planck equation, which describes the evolution of a probability distribution for an ensemble of stochastic trajectories. Both of these describe the evolution of the system in time, while now we move to some more descriptive statistics that can be derived from these models. In this section we derive crossing probabilities and first-passage distributions for the Wiener process. These are used in Section 4.6.1 for modeling escape times of laser-cooled atoms. The first statistic we consider is the crossing probability. We define the crossing probability Pcross(t) as the probability that a particle has crossed a boundary located a distance d from the origin at a time t. To derive the crossing probability one can imagine tagging particles as they cross a boundary and calculate the fraction of tagged particles to find Pcross; while this is a useful interpretation, to derive Pcross, we will use a more convenient approach by defining 32 an absorbing boundary and calculating the survival probability Psurv. Then the crossing probability is just the probability that the particles did not survive, or Pcross = 1− Psurv. To calculate the survival probability we employ a trick known as the method of images1, where a required boundary condition (in this case the absorbing boundary) is satisfied by adding equal but oppositely weighted “anti-particle” on the opposite side of the boundary, offset by the same distance d. This guarantees that at all times t the absorbing boundary condition f(d, t) = 0 is satisfied. Since we know the evolution of both particles [Eq. (2.55)], the solution to the probability density is just the superposition 1 ( )2 f(x, t) = √ e−x /2tσ2 − e−(x−2d)2/2tσ2 , (2.59) 2tσ2 which completely accounts for the absorbing boundary when restricted to the domain x < d. The survival probability is just the integral ∫ d Psurv(t) = dx f(x, t), (2.60) −∞ which has the solution ( ) d Psurv(t) = erf √ , (2.61) 2tσ2 which in turn gives us ( ) d Pcross(t) = erfc √ , (2.62) 2tσ2 1There are many other ways to derive the survival probability. A “brute force” method using the backwards Fokker-Planck equation is shown in Ref. [15], while a clever method using the reflection principle is shown in Ref. [41]. 33 x d 0 0 τd t FIGURE 2.2. Schematic diagram illustrating first passage times. The process starts at d and ends when it hits the dashed line at x = 0. where erf and erfc are the error function and complementary error function, respectively. In the limit t → ∞, we see that Pcross → 1, showing that a random walker will always eventually cross a given boundary. This can be related to the “gambler’s ruin problem”—even with fair odds, a persistent gambler with fixed size bets will eventually go bankrupt, and be unable to continue playing. It also is sensible that as d → 0, we also have Pcross → 1, with the interpretation that if a random walker starts at a boundary, it must cross it immediately. A related concept to the crossing probability is the first passage time, defined as the first time a particle crosses a threshold, as illustrated in Fig. 2.2. Since a particle crossing the boundary gets “tagged” or absorbed only the first time it crosses, we can see that the distribution of first passage times fτ is then just thed derivative of the crossing probability ∂ f (t) = P (t′)∣∣∣∣ √ d −d2 2τ ′ cross = e /2tσ . (2.63)d ∂t 3 2t′=t 2πt σ 34 This distribution is shown in Fig. 2.3, which shows the same function plotted with linear scaling (top plot) and log-log scaling (bottom plot). The cutoff at short times is due to the low probability of the particle to exit immediately due to the Gaussian nature of the increment (exponentially damped tails). The density maximum is located at d2/3σ2, which serves as a transition to a t−3/2 power law, as indicated on the log-log plot. An extension of the first passage time for a single boundary is the escape- time distribution from an interval of length L, when the particle is offset from one of the boundaries by a distance d and offset from the other boundary by L − d, as illustrated in Fig. 2.4. The derivation follows similarly to that of the first passage time, starting by calculating the escape probability Pescape. This again uses the method of images, but since there are two boundaries, the images themselves also require images, and so on. The result is ∑∞ ( ) P (t) = (−1)j sgn(j + 0+ |j√L+ d|escape ) erfc , (2.64) 2 j=−∞ 2tσ and we can see that as L → ∞ this reduces to the crossing probability in Eq. (2.62) when one boundary has a negligible effect. Now we can define the escape time distribution fτe by ∣ ∂ ∣ fτe(t) = P ′ ∣ ′ escape(t )∣ (2.65)∂t t′=t and find 1 ∑∞ 2 2 fτe(t) = √ (−1)j sgn(j + 0+) |jL+ d|e−|jL+d| /2tσ . (2.66) 2πσ2t3 j=−∞ 35 a2−5 /310 fτ (t)p 10−5 10−5 10−5 10−5 100 100 104 104 104 104 104 τp a2/3 10−3 10−4 fτp(t) 10−5 ∼ t−3/2 10−6 10−7 10−8 10−9 10−10 10−11 10−12 102 103 104 105 106 107 108 109 τp FIGURE 2.3. Plots of the first passage time distributions for a Wiener process with a single boundary. The top plot is scaled normally, while the bottom plot is log-log scaled to emphasize the t−3/2 power law tail. 36 fτp(t) fτp(t) x L d 0 0 τe t FIGURE 2.4. Schematic diagram illustrating escape times. The process starts at d and ends when it hits either dashed line at x = 0 of x = L. This is plotted in Fig. 2.5 for two values of d. For d = L/100 we see that the power law present for the first passage times still occurs at τ 2 2e > d /3σ but now has a cutoff at (L − d)2/3σ2. This essentially corresponds to the time it takes for the width of the particle density function to expand to a size L − d so that the further absorbing boundary has a significant impact on the particle’s survival. The power law also retains t−3/2 scaling, despite the presence of an additional absorbing boundary. Note that this discussion is true for d < L/2; for d > L/2 the roles of L − d and d are swapped. The plot for d = L/2 shows no visible power law since, as we will see later in this section, the power law is caused by the interaction with a single absorbing boundary, while when d = L/2 the particle is equally likely to interact with either boundary. An important relation apparent in the above first-passage distributions is the power-law time dependence. For both the single-boundary first-passage distribution in Eq. (2.63) and the two-boundary escape-time distribution in Eq. (2.66), the asymptotic time behavior scales as t−3/2. These are two examples of a universal scaling relation attributed to Sparre Andersen [48]. This scaling 37 a2− /3 (L−a) 2/3 10 3 10−4 fτe (t) ∼t−3/2 10−5 10−6 10−7 10−8 10−9 10−10 10−11 10−12 102 103 104 105 106 107 108 109 τe (L/2)2/3 10−3 10−4 fτe (t) 10−5 10−6 10−7 10−8 10−9 10−10 10−11 10−12 102 103 104 105 106 107 108 109 τe FIGURE 2.5. Escape time plots for a Wiener process with two absorbing boundaries. The top plot is shown for a = L/100, which has a clear power law extending over a wide range. The bottom plot is for a = L/2 and has no observable power law. 38 P (τe) P (τe) behavior was proved originally for discrete random walks [52, 53], but it also holds in the continuous limit [16]. Surprisingly, these universal relations are for all random walks with symmetric step distributions and are not limited to Gaussian processes. 39 2.2. Lévy Processes Lévy processes are a natural extension of Wiener processes that encompass continuous-time random walks with drifts, heavy tailed step distributions, and most dramatically, discontinuous sample paths. They are defined to include all possible independent and identically distributed noise processes. Similarly to Wiener processes, Lévy processes are important in understanding a systems from a wide range of domains, including ecology [19], financial systems [13] such as option pricing [54, 55] or security returns [56], transport in fluid flows [57] and chaotic systems [58], optimal stochastic search strategies [18, 59] which can explain animal foraging behavior [60, 61], and some more obscure areas like the statistics of cricket games [62] or the process by which humans learn to balance sticks [63]. Also as we will discuss in more detail in Chapter IV, they have an important role in laser- cooled atoms [26, 31, 37, 38, 39, 64, 65, 66]. Since the CLT implies a certain universality of the Wiener process, it is reasonable to suspect there may be a natural extension to the CLT that implies a similar universality for the Lévy processes. This intuition turns out to be mostly correct: one extension to the CLT is known as the generalized central limit theorem (gCLT), and like the CLT it characterizes the how the distribution of sums of random variables converges to a limiting Lévy random process, with a couple caveats: First, the gCLT applies to sums of random variables with power- law-tailed step distributions, or Lévy flights in the language of random walks. Second, the limiting distribution of the gCLT is an α-stable distribution, which corresponds to the α-stable Lévy processes, a subset of the Lévy processes rather than a generic Lévy process. It is worth noting that many of the cited uses of the 40 Lévy processes predominately use α-stable Lévy processes, so the gCLT and the α-stable Lévy processes are particularly useful. With this in mind, we will start this section by introducing the Lévy process in general (Section 2.2.1), but we will quickly restrict ourselves to using the α- stable Lévy processes (Section 2.2.2) because of the broad applicability due to the gCLT. We will then cover methods for simulation of α-stable Lévy processes (Section 2.2.4), and discuss boundary crossing statistics, including leapover distributions and first passage distributions (Section 2.2.5). 2.2.1. Lévy Process Properties and Definitions A Lévy process L(t) can be defined as stochastic process that satisfies the following properties2 [44]: 1. Stationary Increments. Also known as time-homogeneity, this states that that all Lévy process increments only depend on the current state and not on any other variables. That is, all elements of {L(t0 + t) − L(t0) : t ≥ 0} are independent of t0. 2. Independence of increments. For any choice of n ≥ 1 and 0 ≤ t0 < t1 < . . . < tn, all increments from L(t0),L(t1)− L(t0), . . . ,L(tn)− L(tn−1) must be statistically independent. 3. Stochastic Continuity. For any  > 0 P (|L(s + t) − L(s)| > ) → 0 as t → 0. This is a limited form of continuity that allows for the presence of 2There are other formulations of Lévy processes that are sometimes included or excluded in definitions of the Lévy process. This list does not include the convention of a fixed origin at L(0) = 0. 41 discontinuous jumps, as long as they occur at a random time. Deterministic discontinuities are still excluded by this form of continuity. 4. Cadlag. This requirement states that Lévy processes must be right continuous with left limits. This simply means that the limiting value for the process at points of discontinuity is taken to be the value after the jump. As we can see, the formal definition of the Lévy process has similar properties to the Wiener process (Section 2.1.2), with the most exceptional difference being a relaxation of the Wiener process’ continuity requirement, which is replaced by stochastic continuity. This allows for jump discontinuities as long as the probability for a jump goes to zero as the time interval for the jump goes to zero. The other difference is the lack of the martingale property which allows for a Lévy process to have deterministic drift. Given this definition, it can be shown that any Lévy process has a distribution f(x, t) that can be written as the Fourier transform [44]: [ ( ∫ ( ))] F b 2 ′ [f ] (k, t) = exp (t− t ) ika− k2 + dx′W (x′) eikx −1−ikx′Θ(1−x′2)0 0 . 2 (2.67) This is known as the Lévy–Khintchine representation, with the drift coefficient a, the diffusion rate b, and a jump transition density W0. While a and b are constants, the transition density (or Lévy measure) W0(x) represents the expected number of jumps of size x per unit time and satisfies ∫ ∞ dx (1 ∧ x2)W0(x) <∞, (2.68) −∞ (x=6 0) 42 where we use the notation 1 |x| > 1(1 ∧ x2) =  (2.69)x2 |x| < 1. The interpretation we gave for the parameters in Eq. (2.67) is more clear if one considers each term in the exponential separately. Recall that a distribution for the sum of random variables is just a convolution of the individual distributions, which in Fourier space is equivalent to a product of the Fourier transforms of the individual distributions. Therefore, by reversing this chain of logic, each term in the exponential directly corresponds to an independent random process: the ika term corresponds to a linear drift at, the −b2k2/2 term becomes a Wiener process bW (t), and the W0 term is the jump process PW0(t) (also known as a compound Poisson process). Thus, the Lévy process can be written as L(t) = at+ bW (t) + PW0(t). (2.70) This representation is known Lévy–Itō decomposition. This decomposition allows all Lévy processes to be described as the sum of a drift term, a Wiener process, and a jump process, which is convenient and initially seems quite intuitive. However, we will see in Chapter III that this decomposition does not capture some of the strong similarities between Wiener processes and the stable Lévy processes. 2.2.2. Stable Lévy Processes We will now define the symmetric α-stable Lévy process, which as we will see in Section 2.2.3 is an important limiting process due to the gCLT. The symmetric 43 α-stable Lévy process is a subset of the Lévy processes, and so can be defined by specifying choices for the drift coefficient a, the diffusion rate b, and the transition density W0. The α-stable processes are pure power law jump processes, meaning both the drift and diffusion rate vanish, while the transition density is a pure power law A W0(x) = |x|α+1 (2.71) with the stability parameter α and jump rate parameter A. Note that α must satisfy 0 < α < 2 for Eq. (2.68) to hold. While these choices of a, b and W0 along with Eq. (2.67) are sufficient to define the α-stable process, a more convenient form uses the scale parameter σ, defined via the relation A = sin(πα/2)Γ(α + 1)σα/π. (2.72) Using this, Eq. (2.67) can be simplified into the form ∫ 1 ∞ f (x; t) = dk cos(kx)e−tσ αkα α (2.73) π 0 where again σ is the scale parameter and α is the stability parameter with 0 < α < 2. In general there is no simple expression for the integral in Eq. (2.73), but this form is still convenient as it can be easily compared to the Fourier transform of a standard Gaussian distribution. We can see that as α → 2 the distribution is exactly the Gaussian 1 2 2 f2(x; t) = √ e−x /4σ t (2.74) 4πσ2t 44 √ which has a standard deviation 2tσ. The only other α with a simple expression is α = 1 which reduces to the Cauchy distribution given by σt f1(x; t) = . (2.75) π (x2 + σ2t2) We can see that the Cauchy distribution has asymptotic |x|−2 tails, and similar behavior is present for all for all α < 2, where the densities have heavy tails with power-law behavior scaling as |x|−(1+α), matching the transition density in Eq. (2.71). As an example, sample paths for selected α are shown in Fig. 2.6. Long jumps associated with rare events in the tails of the distribution are present for all α, but the jumps tend to be increasingly important for smaller alpha, a reflection of the heavier tails. 2.2.3. Generalized Central Limit Theorem The generalized Central Limit Theorem (gCLT) is an expansion of the CLT to account for sums of random variables drawn from a distribution an infinite variance, where the infinite variance is due to the presence of power law tails. More precisely, if a step distribution has power law tails that decay like f (x) ∼ |x|−(α+1)∆x with 0 < α < 2, then the gCLT states that a sum of these random variables will approach an α-stable distribution, defined by ∫ 1 ∞ α α fα(x) = dk cos(kx− kµ)e−σ k . (2.76) π 0 The gCLT can be motivated with a few observations: First, we can see that the step distribution for α ≥ 2 has a finite second moment, and so would approach 45 3 ao=o1.5 0 -3 0 t 1 3 ao=o1.0 0 -3 0 t 1 3 ao=o0.5 0 -3 0 t 1 FIGURE 2.6. Plot of 10 sample paths of the α-stable Lévy process for 3 values of α. 46 Lo(t) Lo(t) Lo(t) a Gaussian distribution due to the regular CLT as expec∫ted. However, when α < 2 the requirements for the CLT would not be satisfied as dxf∆x(x)x 2 diverges. But while the second moment diverges∫for 0 < α < 2, the distribution remains sensible as a probability distribution since dxf∆x(x) is still finite. Thus we can anticipate the existence of some sort of limiting distribution for fxn even if the standard CLT fails. Proving the gCLT is similar to proving the CLT, and can be seen in more detail in Refs. [17, 41]. Again, similar to the CLT, the gCLT gives strong grounding for the universality of the α-stable Lévy process. As for actually applying the gCLT in practice it is important to understand how the processes typically converge which we discuss in the next section. 2.2.3.1. Slow Convergence to Gaussian One concern about applying the gCLT to a system is that it is often difficult to guaranteed that a power law tail holds indefinitely at all scales. However, it turns out that even with a power law that only spans a finite range, the gCLT is still useful for modeling behavior. The purpose of this section is to help build intuition for how stochastic processes can be described by the asymptotic Gaussian and stable Lévy distributions at finite times. It is particularly useful for interpretation of many of the simulations shown in Chapter IV. To see how systems can approach the asymptotic limits, we look at an example of applying both the CLT and the gCLT to a system with heavy but truncated tails. 47 n = 10 n = 102 n = 103 n = 104 n = 105 n = 106 n = 107 n = ∞ underlying dynamics gCLT regime CLT regime FIGURE 2.7. Trajectories from a single truncated Cauchy random walk with steps drawn from Eq. (2.77) with a = 105. Each plot moving to the right contains the entire motion of the previous plot, highlighted in green. For n a the motion appears like a Cauchy random walk, while for n a the motion appears Brownian. Consider a random walk with n independent steps drawn from a truncated Cauchy distribution defined by  1 1 |∆x| > a f (∆x) = 2 tan −1(a) 1 + (∆x)2 tc (2.77) 0 |∆x| < a where the 1/2 tan−1(a) is just a normalization factor and a is a parameter determining the truncation cutoff. This distribution has a finite second moment σ0 = −1 + a/ tan−1(a), and so in principle the CLT is applicable. However, the extent of the tails can be large (controlled by a), limiting the rate of convergence to a Gaussian. In practice there can be a large range of n where the gCLT should be applied instead. To see this qualitatively, we have plotted the trajectory of a single random walker with a = 105 for various values of n, shown in Fig. 2.7. When n  a, the motion appears to be Brownian motion (Brownian motion shown as n = ∞ for comparison), however, for n < a we notice that the motion appears like a Cauchy random walk with large jumps. 48 x(t) To see this more quantitatively, we can compare how the CLT limiting distribution, given by 1 2 2 fCLT(x, t) = √ e−x /2σ(t) , (2.78) 2πσ(t)2 compares to the gCLT limiting distribution γ(t) fgCLT(x, t) = (2.79) π (γ(t)2 + x2) √ where σ(t) = σ0 t and γ(t) = t. The latter should be valid only as a → ∞. To see the comparison, in Fig. 2.8 we show the position distribution of 105 random walks with steps drawn from Eq. (2.77) with a = 105 for various numbers of steps. For n  a the simulated distribution tends to overlap with the Lévy distribution, while for n  a the simulated distribution tends to overlap with the Gaussian distribution. This “slow” convergence to Gaussian behavior due to the presence of power laws has been noted before [67, 68]. 2.2.4. Sampling and Simulation This section covers standard methods for numerical simulation of stable Lévy processes, as well as how to sample from stable Lévy distributions. These techniques are used for all simulations in Chapter III that involve stable Lévy processes. To simulate an α-stable Lévy process, we write the process as a discrete random walk with positions given by the relation x 1/αn+1 = xn + (Kα) (∆t) Xα(n) (2.80) 49 10-1 102 100 101 10-2 10-1 100 10-3 10-2 10-1 10-4 10-3 10-2 10-5 -3 10-410 10-6 10 -4 10-5 10-5 10-7 10-6 10-6 10-8 10-7 10-7 10-9 10-8 10-8 10-10 10-9 10-9 100 101 102 103 104 105 106 107 10-6 10-5 10-4 10-3 10-2 10-1 100 101 102 103 104 105 106 10-3 10-2 10-1 100 101 10√2 103 104 105 106 107 xn xn /t xn / t FIGURE 2.8. Demonstration of the applicability of the gCLT even for step distributions with a finite variance. Number of steps taken is indicated by the line color: 101 (blue), 102 (green), 103 (red), 104 (teal), 105 (purple) 106 (yellow). The second plot is scaled such that the distributions overlap with Eq. (2.79) (second plot, black), while the third plot is scaled such that the distributions overlap with Eq. (2.78) (third plot, black). where Kα is an anomalous diffusion coefficient and Xα(n) is an α-stable random variable. As for generating the stable random variables Xα, many numerical tools and packages often do not have built in methods. However, there is a strai(ghtfor)ward formula that transforms a uniform random number U on the interval −π , π and 2 2 an exponential random variable W with mean 1 into a stable random variable through the transformation ( )(1−α)/α X = (1 + ζ2)1/2α sin(α(U + ξ)) cos(U)− α(U + ξ) , (2.81) (cos(U))1/α W where ζ = −β tan(πα/2) (2.82) and ξ = arctan(−ζ)/α. (2.83) 50 P(xn ) tP(xn ) √ tP(xn ) This is valid for all 0 < α < 1 and 1 < α < 2 [69]. When α = 1 the alternative formula ( 2 (π ) ( ))− πW cos(U)X = + βU tan(U) β log (2.84) π 2 π + 2βU is used, while when α = 2 the variable X is a Gaussian random variable. While this method is straightforward, a more numerically efficient algorithm is also outlined in Ref. [69]. 2.2.5. Boundary Crossing Statistics In this section we will briefly discuss some important and counterintuitive boundary crossing statistics for the Lévy processes. In Section 2.1.8 we discussed some of these statistics for the Wiener process, which included crossing probabilities and first passage times, however, results for Lévy processes are much more limited. 2.2.5.1. First Passage Times The derivations for first-passage distributions in Section 2.1.8 relied on the method of images. However, it has been shown that this approach does not work for the stable Lévy processes [70], and ultimately leads to erroneous results. The breakdown of the method of images is due to the discontinuous nature of Lévy processes. In the Brownian case, the image method works by guaranteeing zero net probability flux across an absorbing boundary by effectively forcing the boundary to have zero net probability at all times (see Section 2.1.8). Meanwhile, in the Lévy case, a particle can effectively leap across a boundary, so that zero net probability at a single point does not prevent a particle from crossing the boundary. 51 x l x0 τp τh t FIGURE 2.9. Schematic diagram illustrating the leapover distance l, as well as the distinction between first passage times τp and first hitting times τh for a Lévy process. Because of this difficulty, there is no simple expression for the first passage distributions for Lévy processes, and instead we only have the asymptotic τ−3/2 behavior for a single boundary, as was already guaranteed through the universal Sparre Andersen scaling [70]. 2.2.5.2. Leapovers As discussed previously, the discontinuous nature of Lévy processes leads to interesting behavior when assuming absorbing boundary conditions. For example, consider an α-stable Lévy process starting at some x0 > 0 with an absorbing point at x = 0. While a Wiener process would remain positive until it eventually is absorbed, a Lévy process can hop between negative and positive values over the absorbing point at the origin many times before it is absorbed [70, 71]. Again, this is because of the discontinuous jumps permitted for Lévy processes. This behavior also is applicable to an absorbing region, rather than an absorbing point. For an absorbing half-line region (−∞, 0], a Lévy process again starting at some x0 > 0 will (almost surely) penetrate some distance l into the absorbing region before it is absorbed. This leapover length is illustrated in 52 Fig. 2.9. Also illustrated in this figure is the difference in between the first hitting time τh and the first passage time τp. While the first passage time is marked the moment the particle crosses the boundary, the first hitting time is not marked until the particle diffuses “into” the boundary.3 The leapover length can be characterized by the leapover distribution fl. For an absorbing interval (−∞, 0] and a particle starting at x0 > 0, the leapover distribution is given by [72, 73] ( ) α/21 πα x fl(l) = sin 0 . (2.85) π 2 lα/2(x0 + l) For a finite absorbing region [−2L, 0] and a particle starting at x0 > 0, the leapover distribution is given by α/2 1 (πα) |L2 − (x0 + L)2| 1 fl(x) = sin (2.86) π 2( |L)2(− (x+ L)2| α/2)|x0 − x| 2 −α/2 ∫α− 1 πα (x+ L) |x0+L|/L− sin 1− du (u2 − 1)α/2−1. πL 2 L2 1 Plots for Eq. (2.86) are shown in Fig. 2.10. While the particle starts with x0 > 0, it is somewhat remarkable that the probability is peaked near x = −2L. This can be explained by the possibility that the particle jumps across the region entirely, at which point it is more likely to enter the region near x = −2L on the subsequent step. 3The criteria for a particle “hitting” the boundary can be quite subtle. While a δ absorbing point can always be approximated by an absorbing region with a finite width, the appropriate width of the region for a given time scale is not obvious. This is discussed more in Chapter III. 53 5 0 -2 l 0 5 0 -2 l 0 5 0 -2 l 0 FIGURE 2.10. Leapover distributions (Eq. (2.86)) into the interval (−2L, 0) with L = 1 and x0 = 0.1 (red) and x0 = 10.0 (blue). 54 fLo(l) fLo(l) fLo(l) CHAPTER III CONDITIONED RANDOM PROCESSES So far in our investigation of random processes we have assumed a stochastic process with initial condition and predicted some expected behavior through use of Ito SDEs, Fokker–Planck equations, and first passage distributions. However, in certain circumstances both the initial and final condition are known, while the history of how the transition occurred is unknown. As an example, consider a particle within the boundaries of some imaging system, such that scattered light from the particle is detected by a single photodetector. A sudden drop in the photodetector signal indicates that the particle must have escaped the imaging boundaries. However, the path that the particle took is unknown. This example is particularly relevant to some simulations we discuss in Chapter IV, but to investigate this type of system generically, we study the conditioned evolution of the process. Studies of conditioned evolution lends insight into the likely history of an atom in this hypothetical experiment. And so, in this chapter we will investigate conditioned random processes for both Wiener processes and Lévy processes. The most basic conditioned process is a stochastic bridge, where a process has known initial conditions and a known final condition after some evolution time. The stochastic bridges represent possible behavior of the particle between measurements, and they are important for both understanding the qualitative behavior of the motion as well as for computation of a variety of relevant statistics. 55 In Section 3.1, we introduce some well-known properties of the conditioned Wiener processes (Brownian bridges). Next, in Section 3.2, we introduce the conditioned Lévy process and briefly discuss existing literature, before moving on to some of the main results of this dissertation: In Section 3.2.3 we show that a bifurcation arises in the most probable history of a conditioned Lévy process, with the onset of the bifurcation occurring when the final displacement of the conditioned processes far exceeds a certain value. The value at which the bifurcation arises defines a novel length scale (the “bifurcation length”) which we show has applications for detecting extreme events (Section 3.2.5), generating Lévy bridges (Section 3.2.6), explaining properties of first passage times (Section 3.2.7), and parameter inference of a Lévy process from data (Section 3.2.8). 3.1. Brownian Bridges The most basic conditioned process is the the Brownian bridge—a continuous-time Gaussian stochastic process with a fixed starting location, a given evolution time, and a fixed final location. It also is one of the most widely used examples of a conditioned process and has been thoroughly studied [1] and applied in diverse areas, such as financial mathematics [74, 75], models of animal movements [76], Monte Carlo methods in quantum mechanics [77, 78], random interfaces and potentials [79, 80, 81], and extreme-value statistics [82]. 3.1.1. Stretched Brownian Bridge There are multiple useful ways of expressing a Brownian bridge, but a common and convenient definition is the “stretched” Wiener path. The standard Brownian bridge B(t) is defined on the time interval [0, 1] with both starting and 56 ending points at x = 0, and can be written as B(t) = W (t)− tW (1), (3.1) where the bridge is compressed back to the origin by gradually subtracting off the final position of an unconstrained Wiener process. In this definition, the bridge is essentially a Wiener process plus a linear drift, such that the linear drift and the Wiener process cancel out at the final time. Surprisingly, this “stretch” method for generating bridges generates the correct increment statistics of a Wiener process. That is, the total drift needed to cancel the Wiener process at the final time is spread evenly over the time interval, so that the statistics of the infinitesimal increment end up being the same as the pure Wiener process. This aspect of the Brownian bridge can be seen by looking at the infinitesimal increment dB(t) = dW (t)− dtW (1). (3.2) Since the constituent parts of dB are Gaussian noise, we know that dB must also be Gaussian and will be characterized by its mean and variance. Then, 〈dB(t)〉 = 0 since both dW (t) and W (1) are symmetric, while 〈dB(t)2〉 = 〈dW (t)2〉+ 2dt〈dW (t)W (1)〉+ dt2〈W (1)2〉 (3.3) which simplifies to 〈dB(t)2〉 = dt when we drop terms of higher order than dt in the continuum limit. Since the variance is dt, the statistics in this limit are identical to that of the Wiener process. 57 We also can extend the standard bridge to stretch to an arbitrary final time T at position L. In this case the stretched Brownian bridge is given by t B(t) = W (t)− (W (T )− L) , (3.4) T and calculating the statistics of the increment is similar to the standard bridge case, though we now have a non zero mean increment 〈dB(t)〉 = dt L , while the T variance remains 〈dB(t)2〉 = dt in the continuum limit. 3.1.2. Midpoint Brownian Bridge While the “stretch” method is convenient for generating Brownian bridges, it does not follow obviously from the definition of a conditioned stochastic bridge. A more easily motivated method is to generate a Brownian bridge by iteratively sampling the midpoint of the time interval using conditioned distributions. To do this, we need to find the probability density for the position of the particle at time T/2, given that the particle has a known final position L at the final time T . For this intermediate position we use the notation x1/2 := x(T/2). Finding the density of x1/2, which we refer to as the the midpoint density, is a straightforward application of Bayes’ Theorem, P (A|B) = P (B|A)P (A)/P (B). Since we are interested in the probability density that the particle passes through (x1/2;T/2) (condition A) conditioned on the particle passing (L;T ) (condition B), the midpoint density can be written as | fW (L;T |x1/2;T/2)fW (x1/2;T/2)fW (x1/2;T/2 L;T ) = (3.5) fW (L;T ) 58 where fW (x, t) is the probability density for a Wiener process to be at position x at time t. Since the propagator only depends on the relative displacement and duration, the first term can be simplified to find fW (L− x1/2;T/2)fW (x1/2;T/2) fW (x1/2;T/2|L;T ) = (3.6) fW (L;T ) and by using Eq. (2.55) we find that this factors into ( ) fW (x1/2;T/2|L;T ) = fW x1/2 − L/2, T/4 (3.7) which is just a Gaussian centered at x = L/2. This matches the intuition that at time T/2 the most likely position should be halfway between the starting and ending positions (〈x1/2〉 = L/2), while the variance of the overall position of the process is somewhat constrained due to the conditioning (at time T/2 the free Wiener process has a variance T/2, while we see that the bridge has a reduced variance of T/4). To use Eq. (3.7) to generate an entire bridge, we start by generating sample points for bridges at time T/2. Once this is done, the process is bisected into two bridges, at which point this process can be repeated to generate sample points in each subinterval of duration T/4. By iterating this sampling process we can generate bridges to arbitrarily fine resolution. This process was used to generate the Bridges shown in Fig. 3.1. 3.1.3. First Passage Times Similarly to the free Wiener process, we can also develop first passage statistics for the Brownian bridge. For a particle conditioned to travel from 59 2 0 -2 0 t 1 12 0 -2 0 t 1 FIGURE 3.1. Examples of Brownian bridges generated through iterated sampling of the midpoint distribution using Eq. (3.7). The top plot has bridges with final displacement L = 0, while the bottom figure has bridges with L = 10. 60 Bo(t) Bo(t) x L d x0 0 τd T t FIGURE 3.2. Schematic diagram illustrating the first passage time for the Brownian bridge. position x = 0 to x = L in a time t, the distribution of first passage times across a boundary at x = d is given by [47] √ √d t −(dt−Lx)2fτ (x) = e /(2tx(t−x)). (3.8)d 2πx3(t− x) This process is illustrated in Fig. 3.2, which makes it clear that for L > d the process is guaranteed to have a first passage time. For L < d it is possible that no first passage occurs, and so Eq. (3.8) becomes unnormalized. 3.2. Lévy Bridges Similar to the Brownian bridge, a Lévy bridge Bα(t) is defined as a Lévy process conditioned to have a particular final position. Unlike the Brownian bridge, however, there is relatively little literature related to Lévy bridges, which is somewhat surprising considering the many applications of Lévy statistics (as discussed in the introduction and in Section 2.2). Refs. [83, 84] characterize several functionals of Lévy bridges, through as the definition used in these references 61 constrains the bridge to return to the origin, the results have limited relation to this chapter. There are also some preliminary investigations into Lévy bridges that are not constrained to the origin in Ref. [85]; however, while the formulation was made for general Lévy processes, the primary results were specifically for constrained Gamma processes and Wiener processes; the results were also for computation of certain pricing statistics. The most relevant reference to the work in this chapter is Ref. [86], which demonstrated the construction of stable Lévy bridges for α = 1/2 with the skewness parameter β = 1 (i.e. not a symmetric stable Lévy process). This reference mentions in passing that the frequency of long jumps varies as a scaling parameter is adjusted, which has some relation to our results in Section 3.2.5 and in Section 3.2.8, though the interpretation is quite different. It is also worth mentioning that some basic properties and constructions for more general Markovian bridges have also been considered [87, 88, 89], though as these processes do not even require stochastic continuity (see Section 2.2.1), they are unlikely to find practical use for describing spatial diffusion. While Lévy bridges are defined similarly to Brownian bridges, much of the intuition developed for Brownian bridges does not carry over when studying conditioned Lévy processes. This has led to a somewhat dubious definition of a Lévy bridge [90], where the bridge is not even time symmetric. To address this directly, we begin this section with an explicit example of how a näıve extension of the Brownian bridge “stretch” method fails for Lévy bridges (Section 3.2.1). We then develop a proper Lévy bridge using conditioned distributions leading to a midpoint density that can be used to iteratively sample points of a Lévy bridge to arbitrary accuracy (Section 3.2.2). 62 The midpoint distribution also provides surprising insight into the behavior or Lévy processes. We find that unlike in the Brownian case, the Lévy midpoint distribution bifurcates when the bridge exceeds a critical length Lb (Section 3.2.3). This bifurcation length provides insight into the transition between the Lévy and Gaussian regimes, criteria for detecting “long” jumps (Section 3.2.5), correcting the stretched bridge method (Section 3.2.6), inferring the stability index α from a given sample path (Section 3.2.8), and in general can be useful for the analysis of extreme events. 3.2.1. Failure of Stretched Lévy Bridges One of the first peculiarities of Lévy bridges is that the “stretch” method we used for the Wiener bridge does not generalize in an obvious way to the Lévy case [83]. This is also briefly indicated in Ref. [85], where the stretched Lévy bridge definition is briefly considered before being rejected. The reason for the failure of the stretch method is somewhat subtle, since in Section 3.1.1 we saw that the method works for the Brownian case because the increment dt is small compared to dt1/2 in the continuous limit. This argument initially seems like it should hold for the Lévy case as well, as dt1/α is still small compared to dt for 0 < α < 2. However, the reasoning breaks since in some sense the drift term cannot simply be considered order dt due to the heavy tails of the final position. Additional details related to this argument can be seen in [83, 85]. To show that the stretched bridges do not generate correct statistics, we define the stretched Lévy bridge as t Bα-stretch(t) = Lα(t)− (Lα(T )− L) . (3.9) T 63 1 0 -1 0 t 1 FIGURE 3.3. Examples of stretched Lévy bridges generated through Eq. (3.9) with T = 1 and L = 0. Some of the bridges have noticeable drift, indicating that the stretch method for Lévy bridges does not work properly. Looking at bridges created using this approach in Fig. 3.3, we can immediately see that they are somewhat peculiar. Roughly speaking, bridges should “appear” to be identical to the free underlying process, except that they happen to end at a particular location. Instead, we have that even though the bridges have B(T ) = 0 for final position, many of the bridges appear to have a strong drift present over the entire bridge. While the bridge does return to the origin as required (indicating that the mean increment is zero), this is apparently not sufficient, as the typical step still has a significant drift which causes the bridge to appear biased. To check this more rigorously, we can consider an arbitrary function F [X ] of a path X . The distribution of this function over all unconstrained Lévy processes fF [Lα](x) should match the distribution over constrained processes fF [B ′α-stretch|L=L ] when integrated over all L′ and weighted by the probability of the final position 64 Lo(t) -4 10 stretch normal -6 10 -3 0 10 10 t FIGURE 3.4. Test showing the failure the equality shown in Eq. (3.10) using the first passage time for F . The LHS of Eq. (3.10) is shown in blue, while the RHS is shown in red. Simulations are for α = 1, with the passage boundary d = 10, time step ∆t = 10−4, bridge constrained time T = 1, and N = 106 trajectories. fα(L ′;T ). That is the equality ∫ ∞ f ′ ′F [L ′α](x) = dL fF [B f (L ;T ) (3.10)α-stretch|L=L ] α −∞ should hold if Bα-stretch is a proper bridge. We can test this numerically by choosing F to be some easily observable statistic. As an example we choose F to be the first passage time, with a plot of the corresponding distributions shown∫ in Fig. 3.4. The distribution fF [Lα](x) is∞ shown in blue while the distribution ′ ′−∞ dL fF [B ′ f (L ;T ) is shown inα-stretch|L=L ] α red, clearly indicating significant deviations. The stretch method has a large excess of short first passage times compared to the correct distribution. 65 f o(t) t 3.2.2. Lévy Midpoint Density While the stretched bridges do not work for Lévy bridges, the midpoint sampling approach from Section 3.1.2 does apply, since it follows directly from the definition of the Lévy bridge. As we did for the Brownian bridge, the Lévy bridge arrival point is specified as x(T ) = L for some arrival time T > 0. Then, the intermediate position x1/2 has the midpoint density fα(x1/2;T/2| fα(L;T |x1/2;T/2)fα(x1/2;T/2) L;T ) = , (3.11) fα(L;T ) where the fα(x1; t/2) and fα(x2; t) factors are simply the free α-stable density from Eq. (2.73). As in the Brownian case, the fα(L;T |x1/2;T/2) term can be rewritten as fα(L − x1/2|T/2) since a Lévy process has stationary increments. This gives us the midstep density | fα(x1/2;T/2) fα(L−x1/2;T/2)fα(x1/2;T/2 x=L;T ) = , (3.12) fα(L;T ) written in terms of a product of the unconditioned density fα(x; t). Similar to the Brownian case, the distribution is symmetric about x1/2 = L/2. However, unlike the Brownian case, these terms do not factor, which leads to a number of important properties, most notably that x1/2 = L/2 is not always a maximum of the midstep distribution. To use Eq. (3.12) for sampling Lévy bridges, we can use the same approach as the Brownian case. Once x1/2 at time T/2 is sampled, the bridge is effectively bisected into two bridges each of duration T/2. This allows for the midpoints of the new bridges to be sampled, a process that may be iterated to sample the Lévy 66 bridge to any desired time resolution. Bridges generated using this method with α = 1 are shown in Fig. 3.5 for both L = 0 and L = 10. We can see that the character of the bridges is completely different from the failed stretched bridges in Fig. 3.3, and no notable drifts are present for either L. This is even true for L = 10 which is somewhat surprising: a Brownian bridges with L = 10 does have significant drifts as seen in Fig. 3.1, while the Lévy bridge with L = 10 instead induces a long jump rather than a drift. This hints at some counterintuitive behavior and an interesting transition, since the Gaussian case should behave similarly to the Lévy case when α is large. In Fig. 3.6 we have simulated bridges for α = 1.9 and L = 2, which seem to have the drift behavior that was present in the Brownian bridges in Fig. 3.1, so we can see that the behavior of inducing a long jump for L > 0 is not completely guaranteed for all L and α. To explain this behavior we will need to consider the midstep distribution in more depth. 3.2.3. Lévy Bridge Probability Bifurcations We discussed in the previous section how the midstep distribution [Eq. (3.12)] is symmetric about x1/2 = L/2; however, there is no guarantee that the midstep distribution is maximized at x1/2 = L/2 like it is for α = 2. In fact there is a transition with L where the maximum at x1/2 = L/2 becomes a minimum. This transition in the conditional probability density marks the shift from a unimodal to bimodal distribution. We refer to this shift as a probability bifurcation,1 and it leads to many interesting and counter-intuitive effects that we will investigate throughout the rest of this chapter. 1The use of the term probability bifurcation has been used in a similar context in Ref. [91]. 67 1 0 -1 0 t 1 12 0 -2 0 t 1 FIGURE 3.5. Simulated Lévy bridges with α = 1 and for L = 0 (top) and L = 10 (bottom). Each bridge is generated through 10 iterations of Eq. (3.12) and has 210 steps. 68 Bo(t) Bo(t) 3 0 -1 0 t 1 FIGURE 3.6. Simulated Lévy bridges with α = 1.9 and for L = 2. Each bridge is generated through 10 iterations of Eq. (3.12) and has 210 steps. 3.2.3.1. Cauchy Case While the midstep distribution cannot be written in closed form for all α, the expression does simplify nicely for the Cauchy limit, making it a convenient example with analytically solvable results. For α = 1 the density f1 is a Cauchy distribution, given by 1 f1(x; t) = (( ) ) , (3.13)2 πtσ x + 1 tσ and so we can write Eq. (3.12) as | f1(x1/2;T/2) f1(L−x1/2;T/2)f1(x1/2;T/2 x=L;T ) = . (3.14) f1(L;T ) A brief look at this distribution shows that it cannot be factored into a shifted and scaled f1 as was possible for the Gaussian case. Furthermore, while the Gaussian case had a spatial shape that was independent of L, the conditional density for α = 1 has a shape that is strongly L-dependent. To show this we 69 Bo(t) 1.5 L=0.2sT L=2sT L=6sT 0 -5 -3 -1 0 1 3 5 x1/2/sT FIGURE 3.7. Plot of the midstep distribution for α = 1 for various L. have plotted Eq. (3.14) for various L in Fig. 3.7. For L = 0.2 we can see that the distribution has a single peak at x1/2 = L/2; this has a simple intuitive interpretation: if a particle travels from x = 0 to L in time T , then the most probable intermediate position at T/2 is L/2. This interpretation was completely correct for the Gaussian case, and we see that it also holds true in the Cauchy case for sufficiently small L. However, we also see the clear dependence of the conditional density shape on L and the splitting of the single peak into two peaks for larger values of L. This demonstrates that there is a bifurcation point Lb in the Cauchy limit such that the midstep distribution for L > Lb becomes bimodal. The bifurcation point Lb can be found by noting that the curvature of the midpoint distribution evaluated at x1/2 = L/2 must change signs when the maximum becomes a local minimum. This gives us the condition (( ∂2 ) ) ∣∣∣2f1(x1/2;T/2|x=Lb;T ) ∣∣ = 0, (3.15)∂x1/2 x1/2=L/2 70 (x1/2o-oL/2)/soTo which due to the symmetry of f1 can be simplified to the condition f ′′1 (Lb/2;T/2)f1(Lb/2;T/2)− f ′1(Lb/2;T/2)2 = 0 (3.16) where the primes indicate derivatives with respect to the first argument. This equation can be solved analytically, and we find that the bifurcation point for the Cauchy case is given by Lb = σT. (3.17) Another way to solve for the bifurcation length is by finding the maxima and minim( a of the midpoint di)stribution as a function of L using the condition ∂ f1(x1/2;T/2|x=L;T ) = 0 and solving for x∂x 1/2. The 3 real solutions are1/2  x1/2 = L/2 L < tσ (3.18)[L± (L2 − σ2T 2)1/2]/2 L > σT where there is a clear transition at the bifurcation point Lb. The solutions in Eq. (3.18) are plotted in Fig. 3.8, showing that the most probable midpoint as a function of L. For L < Lb we see that the most likely midpoint is L/2 as in the Gaussian case. This means that the most likely single step refinement would be into two steps with a size equal to L/2. However, for L > Lb the maximum splits in a pitchfork bifurcation. Further, we can see that for L  Lb the peaks are well separated, with maxima approaching asymptotes x1/2 ∼ 0, L. The interpretation here is that the long jump L tends to break down into two steps: one large step of order L and one small step. This means that a bridge with sufficiently large overall transition length L will tend to maintain this as a single jump discontinuity. 71 4 o=L x 1/2 L/2 x o=1/2 0 0 1 L/sT 4 2 -2 0 1 4 L/sT FIGURE 3.8. Plot of the extrema of the midstep distribution (Eq. (3.14)) for the α = 1 Cauchy case where maxima are shown in red and minima in blue. For comparison, the extrema for both the unshifted form (top plot) and a symmeterized form (bottom plot) are shown. The bifurcation point is easily visible at L = tσ. 72 (x1/2o-L/2)/sT x1/2o/sT ao=o1.0 0 4 (x o-oL/2)/soTo1-3 1/2 ooo¿ ooÅ 3 FIGURE 3.9. Bifurcation plot for α = 1.0 Another useful way to visualize this transition is with a waterfall chart, as shown in Fig. 3.9, which allows for observation of the change in curvature and the shifts in the positions of the extrema as a function of L. So far we have only investigated the Cauchy case with α = 1, but since bifurcation occurs for α = 1 but does not occur for α = 2, we know there must be some transition as α → 2. To see this transition we must look at the bifurcation in the general case. 3.2.3.2. General Case In this section we will explore the midpoint density in the general case, and we will see that similar structural changes to those observed in the Cauchy case occur for all α < 2. While we cannot write down analytic expressions for the densities, the general character of the midpoint density can be seen in the waterfall charts in Fig. 3.10. For α = 1.5 we see that there is a single pitchfork bifurcation, 73 L/soTo1oo¿oooÅ similar to the Cauchy case, where two maxima and a minimum are born from a single maximum. We also can see that the point of bifurcation Lb is not simply σT as it was for the Cauchy case, but is somewhat larger. However, if we take α closer to the Gaussian limit (both α = 1.99 and α = 1.99999 in Fig. 3.10), we begin to see additional structure beyond what was present for both the Cauchy case and for α = 1.5. As L is increased, instead of the maximum splitting in a pitchfork bifurcation, we see that a pair of side peaks form via tangent bifurcations. As L is increased further, these side peaks begin to dominate the central peak, until finally, a central minimum forms through reverse-pitchfork bifurcation. While the transition for these examples is more complicated, we still see that for all α, a sufficiently large L results in similar midpoint behavior, where the density is asymptotically bimodal with well separated peaks. To analyze this behavior in more detail, we will first look at the how the central peak’s bifurcation length Lb generalizes for all α < 2. To do this, we will first take a closer look at the more complex bifurcation structure seen for larger α by finding the critical value of α where the more complex structure first arises, as well as characterizing the bifurcation length for where the side peaks appear. 3.2.3.3. Characteristic Bifurcation Length To characterize the bifurcation point for all α < 2, we define the length Lb as the value of L for which the curvature of the midpoint density (Eq. (3.12)) at x1/2 = L/2 changes sign (Fig. 3.13). That is, we can simply generalize the condition 74 ao=o1.5 0 15 1 -10 (x1/2o-oL/2)/soTo ooo¿ ooÅ 10 ao=o1.99 0 15 -10 (x1/2o-oL/2)/soTo 1oo¿oooÅ 10 ao=o1.99999 0 15 -10 (x1/2o-oL/2)/soTo 1oo¿oooÅ 10 FIGURE 3.10. Variation of the midstep density (Eq. (3.12)) with Lévy index α and arrival point L. Curves highlighting maxima (solid/red) and minima (dashed/blue) are superimposed. 75 L/soTo1ooo¿ ooÅ L/soTo1oo¿oooÅ L/soTo1ooo¿ ooÅ 10 9 8 7 6 5 4 3 2 1 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 a FIGURE 3.11. Bifurcation length Lb plotted as a function of α. for the Cauchy case in Eq. (3.15) for all α, giving us the condition (( ∂2 ) ) ∣∣∣2fα(x1/2;T/2|x=Lb;T ) ∣∣ = 0, (3.19)∂x1/2 x1/2=L/2 which again due to the symmetry of fα can be simplified to f ′′α(Lb/2;T/2)fα(Lb/2;T/2)− f ′ 2α(Lb/2;T/2) = 0. (3.20) This more general condition can be be solved numerically, giving us the bifurcation point Lb as a function of α as shown in Fig. 3.11. The bifurcation point Lb can roughly be considered the transition between “long” and “short” jumps, where for L  Lb the displacement is more likely to be a composite of multiple small jumps, while for L Lb the displacement is more likely to be due to a single large jump. Fig. 3.11 also is important in illustrating the transition to the Gaussian limit as α −→ 2. As we can see, the bifurcation length diverges in this limit, so 76 o/os oo1/aLb oT that for the Gaussian (α = 2) case, any final step L is effectively a “short step.” This matches our intuition for Wiener processes which explicitly forbid any jump discontinuities. To analyze this divergence further, we can make use of the the asymptotic density fα(x; t = 1) ∼ f2(x; 1) + δ|x|δ−3, (3.21) which is valid for large |x| and small δ := 2 − α, and plug it into Eq. (3.12) [92]. After some simplifications, we find that Lb diverges as √ √ ( ) √ ( ) [ ( )] ≈ − −πδ 2 √ 2 2 Lb 2tσ 2W−1 ≈ πδ πδ 2σ t − log + log − log , (3.22) 2 2 2 which can be approximated by Lb ∼ [−4σ2T log(πδ2/2)]1/2. (3.23) The nature of this divergence is somewhat peculiar, as even very close to the Gaussian limit α = 2, Lb remains relatively small. This becomes especially noticeable for large α, for instance for α = 1.99999 we have Lb ≈ 10.188473, even though as α→ 2 we must have that Lb →∞. 3.2.3.4. Critical α, Alternative Bifurcation Lengths As we saw in Fig. 3.10, for larger α the midpoint density does not exhibit a simple bifurcation to a bimodal density; rather, the first sign of a transition is the formation of side peaks through tangent bifurcations, then the side peaks grow until they overcome the central peak, and then finally the central peak becomes a 77 Ls Lp Lb 3 ao=o1.99 2 1 maxima minima 0 -1 -2 -3 5 6 7 8 L/sToo1ooo¿ ooÅ FIGURE 3.12. Bifurcation closeup for α = 1.99 local minimum. Thus, to characterize this more complex transition there are three useful values for L along the transition: 1. The side peak bifurcation length Ls, the value of L where the side peaks first originate. 2. The equal peak weight length Lp, the value of L where the side peaks have the same peak probability as the central peak. 3. Characteristic bifurcation length Lb, the value of L where the central maximum becomes a minimum. These are shown in Fig. 3.12 for α = 1.99, and as a function of α in Fig. 3.13 for α > αc. Interestingly, all the important lengths diverge in a similar manner as Eq. (3.23), so we use Lb as it is the most straightforward to calculate. To find the critical value αc for which these transitions first appear, it is helpful to consider the diagram explicitly showing the shoulder bifurcation points 78 x/sToo1oo¿oooÅ 10 5 0 0 0.5 1 1.5 2 a 10 8 6 4 1.8 1.9 2 a FIGURE 3.13. Variation of boundaries between “small” and “large” steps with α. Curves indicate bifurcation lengths Lb for which the center of the midstep density has vanishing curvature (red/solid), the midstep distribution develops side peaks (blue/dashed), and side peaks are equal in height to the center peak (green/dot- dashed). Inset: magnified view for α > αc. 79 o/os oo1/a o/os oo1/aLb oT Lb oT ao=oac (Ls) and the reverse pitchfork bifurcation (Lb), as shown in Fig. 3.12. As α is lowered, the difference between the Ls and Lb will decrease, until they finally meet when α = αc and merge together. The condition for this type of transition is that both the second and fourth derivatives vanish: (( ) ∣∂2 ) ∣∣2fαc(x1( /2 ;T/2|x=Lb;T ) ∣∣ = 0, (3.24)∂x1/2 ( ) )∂4 | ∣∣∣ x1/2=Lb/2 4fαc(x1/2;T/2 x=Lb;T ) ∂x ∣∣ = 0. (3.25)1/2 x1/2=Lb/2 Numerically solving these equations for the critical value gives αc ≈ 1.7999233 (3.26) which is curiously close to, but not equal to 1.8. 3.2.4. Conditioned Sampling The importance of the bifurcation length can be seen by analyzing Lévy bridges of different lengths. We will see that above the bimodal transition (L  Lb), the typical conditioned history is biased towards containing only a single large event, while below the transition (L  Lb) the tendency is towards a composite of smaller events.In Fig. 3.14 we show conditioned Lévy processes for various L and α, with the choices of L are scaled in terms of the bifurcation length Lb to illustrate the transition. For the α = 1.9 the bias towards preserving long jumps when L exceeds the bifurcation length is easily evident. For L = 1.5Lb each sample path has 80 3 1.5oLb 1.0oLb 0.5oLb -2 0 t/T 1 5 1.5oLb 1.0oLb 0.5oLb -2 0 t/T 1 9 1.5oLb 1.0oLb 0.5oLb -2 0 t/T 1 FIGURE 3.14. Typical sample paths of Lévy bridges for α = 1.9, illustrating the qualitative transition with L. Each path was generated through 10 recursive subsamplings from the midpoint distribution (Eq. (3.12)). 81 x/soToo1ooo¿oooÅ x/soToo1oooo¿ ooÅ x/soToo1oooo¿ ooÅ 15 10.0oLb 0.5oLb -5 0 t/T 1 FIGURE 3.15. Typical sample paths of Lévy bridges for α = 1.0. Each path was generated through 10 recursive subsamplings from the midpoint distribution (Eq. (3.12)). a single long jump, while for L = 0.5Lb nearly every sample path looks akin to Brownian motion, with a just a handful of larger jumps. The reason for this behavior is that in each stage of sampling of the midstep distribution Eq. (3.12), if the displacement L exceeds Lb then the entire length L is likely to be preserved as a single jump in one of the two subintervals. Then, upon subsampling of the subinterval containing this jump, Lb is effectively smaller due to the smaller time interval. This means the jump length L tends to exceed Lb by an ever increasing margin, making it progressively less likely to be split into smaller jumps. The case for α = 1 and α = 1.5 is not quite as clear. While L > Lb indicates that refinement is likely to have a large jump of order L persist under refinement, it does not preclude the possibility that additional jumps may form, which ultimately make the qualitative difference between L < Lb and L > Lb less distinct when L is close to Lb. Ultimately through, the same behavior still holds as long as L is much greater than Lb, as can be see in Fig. 3.15. 82 x/soToo1oooo¿ ooÅ 3.2.5. Jump Detection As we discussed previously, the conditioned subsampling shows that the bifurcation length Lb is intimately related to the long jumps of an α-stable process. We found that an observed final displacement |x(T )|  Lb most likely corresponds to a single, similarly large jump discontinuity. Meanwhile, a smaller final displacement |x(T )| ∼ Lb is more likely to be a composite event comprising multiple smaller jumps. Perhaps most surprising is not this behavior itself, as Lévy processes are well known to have long jumps, but that it implies the existence of a distinct length scale Lb. Through the Lévy–Khintchine representation of Lévy processes (Section 2.2.1), as well as the pure power law jump distribution for the stable Lévy processes (Section 2.2.2), we saw that the stable Lévy processes are inherently scale invariant. It thus seems like a contradiction for there to be an intrinsic length scale Lb. So how is this possible? The answer is subtle: in a sense, conditioning introduces a time scale T , which in turn induces a natural length scale σT 1/α. If this were the scale of the bifurcation length, the result would seem underwhelming as it came from natural consequence of the conditioning. However, the bifurcation length Lb differs from σT 1/α, and not by a little: Lb can be arbitrarily large as it diverges as α → 2. This suggests that it truly is a distinct scale that captures a different quality than the natural scale σT 1/α, namely the scale of the large visible jumps present in the Lévy process. To show this, we can consider comparing Lb to the individual displacements of (finite) sample paths. As any given sample path is effectively a collection of bridges between a set of known values, the midpoint density gives insight into 83 the likely behavior of the process in between sampled values, and the bifurcation length Lb becomes useful criteria for detecting “long” jumps. To demonstrate the usefulness of this Lb criteria, it’s useful to consider the other natural alternative criteria. While the typical natural criteria for detecting “long” jump outliers is the standard deviation, as this diverges for Lévy processes we instead use the natural scale σT 1/α. In Fig. 3.16 we show a Lévy process with with highlights on all increments that have ∆x > σ∆t1/α. Interestingly, the majority of steps are highlighted, and it does not seem particularly useful for identifying extreme events. On the other hand, if we highlight increments that exceed the bifurcation length Lb as in Fig. 3.16, we find that the intuitive “jumps” are clearly highlighted, while the other motion appears almost Brownian in character. This is somewhat peculiar, as the Lévy–Itō decomposition (Section 2.2.1) shows that stable Lévy processes are distinct from the Wiener process. However, while the processes are distinct, analysis using the bifurcation length has led to more grounded reasons for the strong similarities that arise between Wiener processes and Lévy processes. 3.2.6. Corrected Stretched Lévy Bridges In Section 3.2.1 we saw that stretched Lévy Bridges did not satisfy the properties we want in a Lévy bridge, and generated incorrect statistics. However, since we have a criteria that can detect if a jump is “long”, it becomes possible to deal with excessive stretches through a rejection algorithm. The basic idea is to generate unconditioned Lévy bridges until one of them is sufficiently close to the intended final position, and then stretch it as before. The key is that the 84 80 60 40 20 0 -10 0 100 200 300 400 500 t 3 2 1 0 -0.5 0 t/T 1 FIGURE 3.16. Sample path of an unconditioned α-stable Lévy process, α = 1.9. Simulated path has time steps ∆t = T/500 For the top plot, increments ∆x > σ(∆t/T )1/α emphasized (bold/red). For the bottom plot, increments ∆x > Lb(∆t/T ) 1/α emphasized (bold/red). 85 x/soToo1ooo¿oooÅ Xo(t) bifurcation length can serve as the criteria for “sufficiently close”, allowing for more efficient stretched bridge generation. The algorithm is as follows: take an unconditioned Lévy sample path L′, and choose some threshold Lthresh. If the difference between the final position of the generated path and the intended final position is smaller than the threshold, that is |L′(T ) − L| < Lthresh, then the bridge is accepted; otherwise it is rejected and a new candidate bridge is generated. In the limit that Lthresh → 0 the bridge is exact, though such a bridge would take arbitrarily long to generate. A näıve approach for this rejection algorithm would use σ∆t1/α as the scale for Lthresh, but using the α-dependent bifurcation length Lb as the scale for Lthresh is much more efficient and should still produce accurate Lévy bridges. This method has been confirmed to work through numerical simulations [93]. 3.2.7. Conditioned First Passage In this section we analyze the first passage distributions for conditioned Lévy processes. As we have seen, conditioned Lévy bridges have a particularly sensitive transition as α −→ 2, which we will see is especially true for first-passage times. An intuitive picture of the conditioned first-passage time follows from the qualitative appearance of the sample paths for L = 1.5Lb in Fig. 3.14. In this figure a long jump is consistently present among the paths, but not at any particular time. This can be regarded as an outcome of recursively sampling the midpoint density [Eq. (3.12)]. For L  Lb, a large jump likely persists under sampling iterations, but due to the symmetry of the midpoint distribution, the jump is equally likely to be associated with any time subinterval. Since the first- 86 passage time is likely due to the long jump, the first-passage time should be uniformly distributed. Fig. 3.17 confirms this intuition with simulations of the first passage density. For L = 2Lb the first passage density is indeed uniform. A small change from α = 1.99999 to the Gaussian case yields a remarkably different distribution, one that is approximately Gaussian and centered at t ≈ T/2. The Gaussian result follows intuitively from since the most likely bridges in this regime are concentrated around the ballistic path to the endpoint. For a smaller overall jump (L = 0.1Lb), the first-passage-time densities in the α = 1.99999 and Gaussian cases match closely. This is consistent with the observation that for L Lb, the conditioned Lévy bridges are qualitatively similar to Brownian bridges. Nevertheless, the rare but important long jumps generate remarkably non-Gaussian behavior, even close to the Gaussian limit. This result also suggests that boundary-crossing statistics can be a particularly sensitive probe for the presence of anomalous diffusion, which is investigated more in Chapter IV. 3.2.8. Inferring α from a Sample Path Calculating the value of α for a sample path is a useful way to characterize heavy tailed random walks or anomalous diffusive motion. For collections of many sample paths, there are two common approaches to calculating the exponent from the distributions of particle positions. First is the dynamic approach, where the width of the distribution will grow as ∼ t1/α, which can be measured by fitting a distribution, calculating the FWHM, or calculating measures of self-similarity. The other approach is by analyzing the shape of the distribution, as the tails of the distribution scale as ∼ x−α−1. 87 (scaled !1/2) L=0.1Lb L=2Lb 0 t/T 1 FIGURE 3.17. Simulated conditioned first passage time distributions for α = 1.99999 and d = L/2 are shown for L = 0.1Lb (blue/squares) and L = 2Lb (red/triangles). Exact densities for α = 2 [1] for the same L values are shown for comparison in each case (blue/solid and red/dashed, respectively). Simulations averaged 107 paths, with ∆t = 2−14T . On the other hand, it is also possible to analyze individual trajectories. For a given trajectory with N steps ∆xi occurring every ∆t, the most straightforward method to infer the value of α is by using maximum-likelihood estimation. For a uniform prior α ∈ (1, 2) the maximum likelihood is found by maximizing the function [94] P (α|∆xi∆xN) ∝ P (∆x1|α) . . . P (∆xN |α) (3.27) which can be written as P (α|∆xi) ∝ fα(∆x1; ∆t) . . . fα(∆xN ; ∆t). (3.28) Typically this is calculated via the log likelihood log (P (α|∆xi)) = log (fα(∆x1; ∆t)) + . . .+ log (fα(∆xN ; ∆t)) + C, (3.29) 88 first passage density 1 0.5 0 1.0 1.5 2 a FIGURE 3.18. Long jump probability as a function of α. where C is the log of the proportionality constant which is not relevant for the maximization. While this approach is straightforward and well grounded in theory, the downside is that it requires many evaluations of fα which can be computationally costly. In this section we infer α through an alternative method by analyzing the sample path using the concept of “jumps”. We look at each step ∆x and consider it a jump if ∆x > Lb, which can be used to create the jump rate statistic jα defined as the number of jumps in the sample path over the total number of steps. This proposed jump rate can be compared to the theoretical jump rate Jα, which is defined as ∫ ∞ Jα := dx fα(x; t), (3.30) Lb and this function is plotted for reference in Fig. 3.18. We must now numerically solve for the α such that Jα = jα, (3.31) 89 Po(L>l ) a 0.4 0 -0.4 1 1.1 1.3 1.5 1.7 1.9 2 a FIGURE 3.19. Effectiveness of the jump rate method (red) of alpha estimation when compared to the maximum-likelihood method (blue). Mean differences between the estimated and true parameter are shown with solid lines, while dashed lines indicate the standard deviation. which should hold for the correct choice of α when N → ∞. This approach is less computationally intensive since there is no need to evaluate fα at n points for each value of α considered. Instead, one only needs to evaluate the inequality ∆xi > Lb at n points and compare it to Jα for each α considered. For this approach, only two lookup tables are needed, one for Lb and one for Jα, compared to a lookup table for every fα. Results from a test of the jump based inference method compared to the maximum-likelihood method are shown in Fig. 3.19; while the maximum-likelihood method still has superior accuracy, the jump method still has reasonable results. 90 ag-a CHAPTER IV ANOMALOUS DIFFUSION IN SISYPHUS COOLING Optical molasses refers to a laser cooling technique for atoms isolated in vacuum, where counter-propagating laser fields create a region of space where the motion of atoms is strongly damped, as if the atoms were embedded in viscous molasses. Laser cooling of atoms in optical molasses is a complex process that can involve multiple cooling mechanisms, including Doppler cooling, Sisyphus cooling, and polarization-gradient cooling. Also present are lattice effects of the cooling beams, as well as spontaneous emission, where photons are released in a random direction, causing a heating effect. Atom losses, caused by molecule formation, spontaneous emission to non-resonant energy levels, or simply diffusion out of the cooling region, can also play a significant role. Altogether these effects drive a complex diffusion process. Despite the complexity, under some reasonable choices of parameters it is possible to tune the optical molasses into a regime where the Sisyphus force dominates and can be approximated as a simple stochastic differential equation (Section 4.1). In this approximation, the system has a steady-state momentum distribution with heavy tails known as a Tsallis distribution (Section 4.2), indicating the possibility of anomalous diffusion. When the optical lattice depth is sufficiently deep, the power laws tails of the momentum distribution have only a minor impact, and the particle position can asymptotically be represented by a simple diffusion equation (Section 4.3). But, for sufficiently shallow optical lattices, the diffusion constant diverges, which leads to Lévy walks behavior and the onset of anomalous diffusion (Section 4.4). 91 To detect the transition between these regimes, previous experiments have made use of cloud-expansion experiments (for instance, Refs. [21, 30, 38]). However, as mentioned in the Introduction, this led to some conflict between experimental and theory thought to be in part due to atom losses [39]. To mitigate the effect of losses, single-atom experiments are particularly useful, and have also been demonstrated to be effective at detecting the onset of anomalous diffusion [37]. This chapter begins by introducing the semiclassical Sisyphus cooling force in Section 4.1 as well as some of its known properties, including the steady state momentum distribution (Section 4.2) and diffusive behavior in both the Brownian regime (Section 4.3) and the Lévy regime (Section 4.4). Original work begins in Section 4.5 with a formula that characterizes the extent of power-law scaling in the Lévy regime. Next, in Section 4.6, we propose a scheme for detecting the onset of anomalous diffusion through the use of boundary-crossing statistics. This leads to one of our primary results, where through simulations we demonstrate the existence of a peak in boundary-crossing statistics that is connected to the Lévy behavior of the underlying process. To show this connection, we derive scaling relations that characterizing the peak. These scaling relations also serve as a useful starting point for future single-atom experiments. 4.1. Sisyphus Cooling Sisyphus cooling is a laser cooling mechanism that arises due to coherent pumping between ground-state Zeeman sublevels. The details of Sisyphus cooling are complex, but we will summarize the basic mechanism (for more details, see Refs. [27, 95]). For cooling in one dimension, Sisyphus cooling can be 92 created by setting up two counter-propagating beams with perpendicular linear polarizations (lin⊥lin configuration). The total electric field created by these counter-propagating beams has an elliptical polarization that varies in space. This polarization gradient in turn causes oscillations in the ground state energies. For laser frequencies ω detuned below the atomic frequency ω0 by an amount ∆, optical pumping will occur from the upper ground state to the lower ground state. This alone would reduce the energy of the atom by an amount ∼ h̄∆. However, this cycle can be repeated: for an atom in motion along the polarization gradient, if it has already been pumped into the lower ground state, it will effectively be moving up a potential gradient due to the oscillations in the ground state energies. Moving up the potential gradient has a corresponding loss in kinetic energy. Finally, at some point the atom will have moved far enough such that the lower ground state becomes the upper ground state, and the atom will again be optically pumped from the (now) upper ground state into the (now) lower ground state. Roughly speaking, kinetic energy of order h̄∆ will be dissipated each cycle, with the cycle period limited by both the optical pumping rate and the time to climb the potential well (related to laser wavelength and the velocity of the atom). While we have only depicted a simple model for Sisyphus cooling, the full treatment is an inherently quantum-mechanical process. However, it has been shown in Ref. [32] that fully quantum treatments are well approximated by the semiclassical models discussed in Ref. [27], which has also been confirmed with comparisons to fully quantum models [31]. 93 The formalism we use to describe Sisyphus cooling follows closely from Ref. [96], which gives the semiclassical momentum evolution by the pair of SDEs − ᾱp √ dp = + 2D(p) dW (t), (4.1) 1 + (p/p )2c p dx = dt. (4.2) m The parameters ᾱ is the Sisyphus friction coefficient, pc is a characteristic momentum for the system, and D2 D(p) = D1 + (4.3) 1 + (p/pc) is the momentum dependent diffusion parameter. The parameters ᾱ, pc, D1, and D2 can be directly related to physical parameters as in Ref. [31], but there is a more convenient form which we will derive momentarily. First, it turns out that the asymptotic behavior for this system does not depend on D2, so for simplicity we can set D2 = 0 [31]. We will now transform into a non-dimensionalized form by defining new coordinates via p = p′ pc (4.4) t = t′/α (4.5) x = x′pc/mᾱ, (4.6) 94 where the primes denote dimensionless variables. With these substitutions and dropping the primes for simplicity, the equations of motion become p √ dp = − + 2DdW (t) (4.7) 1 + p2 dx = p dt, (4.8) where we have defined the parameter D by D1 D = . (4.9) ᾱp2c The relation of D to physical parameters is given by ER D = C (4.10) U0 where the recoil energy E = h̄2k2R /2m, U0 is the optical potential depth, and the dimensionless constant C depends on the particular1 atomic transition [39]. It is important to note that this model has several implicit approximations. First, we are ignoring the effect of the potential lattice induced by the counter propagating beams. This is valid as long as the atom’s kinetic energy is large compared to the depth of the lattice wells. Second, we have neglected the Doppler cooling force. This is valid for large detunings and slower atom velocities. However, it is important to note that Doppler cooling will become relevant for sufficiently large velocities. This sets an upper limit on the velocity, beyond which the model in Eq. (4.7) begins to break down [31]. 1Values for the constant C vary, but tend to be of of order 10. For a Jg = 1/2 → Je = 3/2 transition C = 12.3. [31, 39]. 95 Simulations of Eq. (4.7) are shown in Fig. 4.1 for various values of D. Qualitatively D has a significant effect: for D = 0.01 we can see that the trajectories are akin to Brownian motion, while for D = 0.19 the motion is mostly Brownian in appearance, though there are occasionally some longer jumps present. For D = 0.3 and D = 0.5 the jumps are obvious and tend to dominate the motion. 4.2. Steady-State Momentum Distribution A straightforward statistic that characterizes the Sisyphus force is the steady-state momentum distribution. Following Refs. [28, 30], a convenient way to find the steady state momentum distribution is by transforming Eq. (4.7) to the Fokker–Plank equation. Recalling from Section 2.1.6 that any stochastic differential of the form dx = g(x, t)dt+ h(x, t)dW, (4.11) has a corresponding Fokker-Plank equation given by ∂ ∂ 1 ∂2 ( ) f(x, t) = − (g(x, t)f(x, t)) + h(x, t)2f(x, t) . (4.12) ∂t ∂x 2 ∂x2 To find the steady state momentum distribution, we set ∂ f(p, t) = 0, g(p, t) = ∂t √ − p 2 , and h(p, t) = 2D. Rewriting Eq. (4.12) we have1+p ∂2 2 ∂ f(p) = (g(p)f(p)) , (4.13) ∂p2 2D ∂p where we have dropped the t dependence of both f(p, t) and g(p, t). Integrating by p gives ∂ 2 f(p) = (g(p)f(p)) + C0 (4.14) ∂p 2D 96 12 10 D = 0.01 8 6 4 2 0 −2 −4 −6 0 100 200 300 400 500 600 700 800 900 1000 t 150 D = 0.19 100 50 0 −50 −100 0 100 200 300 400 500 600 700 800 900 1000 t 400 300 D = 0.3 200 100 0 −100 −200 −300 −400 −500 −600 0 100 200 300 400 500 600 700 800 900 1000 t 2000 1500 D = 0.5 1000 500 0 −500 −1000 −1500 0 100 200 300 400 500 600 700 800 900 1000 t FIGURE 4.1. Simulated trajectories of Eq. (4.7) for various values of the diffusion constant D. 97 x(t) x(t) x(t) x(t) for some integration constant C0. The force g(p) vanishes at p = 0, and since the distribution f(p) is an even function, ∂pf(p) must also vanish at p = 0, thus forcing C0 = 0. Integrating once more, we have ∫ f(p) ∫1 2 p df = dp′g(p′), (4.15) f(0) f 2D 0 which has the solution ( ∫ 2 p ) f(p) = f(0) exp dp′g(p′) . (4.16) 2D 0 Performing the final integral ∫ p ∫ p ′ p 1 ( )dp g(p′) = − dp′ = − ln 1 + p2 , (4.17) 2 0 0 1 + p 2 we have ( 1 ( )) ( )−1/2D f(p) = f(0) exp − ln 1 + p2 = f(0) 1 + p2 . (4.18) 2D Finally, f(0) is just a normalization constant, which is given by √1 Γ( 1 ) f(0) = 2D π Γ( 1 − 1 (4.19)) 2D 2 Putting these together, we have the steady state momentum distribution √1 Γ( 1 ) ( ) f(p) = 2D 2 −1/2D 1 + p . (4.20) π Γ( 1 − 1) 2D 2 98 7 0 -4 p 4 FIGURE 4.2. Plot of Tsallis distribution in Eq. (4.20) with D = 0.1 (red), D = 0.2 (blue), and D = 0.6 (green). This is known as a Tsallis distribution. This distribution is normalizable for 0 < D < 1 and is plotted in Fig. 4.2 for various values of D. We can see that for larger values of D the distribution has particularly heavy tails. The momentum distribution can also be helpful for interpreting the momentum diffusion parameter D, which relates the relative magnitude of the damping and heating effects. In Fig. 4.3 we plot the fraction of particles that have momentum |p| > 1, which shows that D roughly tracks the fraction of particles in the steady state momentum distribution with |p| > 1. This is useful as the threshold p ∼ 1 is approximately where the Sisyphus force becomes nonlinear. For smaller D, linear damping effects dominate, causing a larger fraction of particles to be in the linear F (p) ∼ −p regime of the force. Conversely, larger D is dominated by the heating effects, causing more particles to be affected by the F (p) ∼ −1/p regime of the force. While we derived the Tsallis distribution using the Fokker–Planck equation, it is also possible to do so via the Itō SDE. This is done through calculating all 99 fo(p) 1 0 0 D 1 FIGURE 4.3. The solid red line shows the fraction of particles with |p| > 1 in the steady state distribution as a function of the diffusion parameter D. The dashed black line is a guide line with slope 1. of the moments of the distribution as shown in Appendix A.1, and then using an appropriate inversion as shown in Appendix A.2. Interestingly, having both derivations yields an inversion formula for a particular hypergeometric function. 4.3. Brownian Regime For certain values of the momentum diffusion parameter D, through a process of adiabatic elimination of the momentum variable, it is possible to show that position distribution is described asymptotically by a diffusion equation, with a diffusion constant given by [30, 39] ∫ [∫ 1 ∞ ∞ ]2 K = dp eV (p)/D dp′ e−V (p)/D ′2 Z p , (4.21)D −∞ p where Z is given by √ ( − )/ ( )Z 1 D 1= π Γ Γ , (4.22) 2D 2D 100 Po(o|opo|o>1) and V (p) is an effective potential for the Sisyphus force in momentum space, given by ∫ V (p) = dpF (p) = (1/2) ln(1 + p2). (4.23) This integral is finite for 0 < D < 1/5, and in this case K2 has the simplified expression D(4D − 1) K2 = . (4.24) (2D − 1)(3D − 1)(5D − 1) We can see that this diverges as D → 1/5, indicating the onset of the Lévy regime and the failure of the standard diffusion equation. In Fig. 4.4 we show simulations of atoms cooled via the Sisyphus force compared to Gaussian diffusion simulation. For D = 0.01 the position distributions for both simulations are nearly identical. However, for D = 0.19 we see the presence of power laws in the tails of the position distribution for Sisyphus cooling. While this is in the regime of Gaussian asymptotics, we see a slow convergence to a Gaussian distribution similar to that seen in the truncated Cauchy case (Section 2.2.3.1). This also seems consistent with the “jumps” visible in Fig. 4.1 for D = 0.19. 4.4. Lévy Regime When the diffusion constant diverges for D > 1/5, the Sisyphus cooling behavior is remarkably different from the Brownian regime and leads to long correlated motion that prevents the development of asymptotic Gaussian behavior. A qualitative explanation is that when D is large enough, the atom is more likely to escape out of the linear regime of the Sisyphus force. For large enough p, the asymptotic force goes as −1/|p|, with the magnitude of the damping diminishing 101 FIGURE 4.4. Comparison of position distributions for Sisyphus simulations using Eq. (4.7) and Gaussian diffusion with a diffusion constant given by Eq. (4.24). For D = 0.01, a Gaussian simulation (green) and Sisyphus simulation (blue) are shown. For D = 0.19, a Gaussian simulation (red) and a Sisyphus simulation (yellow) are shown. Simulation parameters are ∆t = 0.1, T = 104, N = 220. for larger p. This gives rise to long, correlated motions that can be modeled as Lévy flights. The derivation to show that this behavior leads to Lévy flights nontrivial and is outlined in Refs. [39, 64]. The general idea is that the microscopic behavior can be split the motion into a series of correlated “jumps” χ and jump durations τ . The jumps and jump durations (also called waiting times) are defined to be the displacement and time elapsed between momentum zero-crossings, respectively. It was shown [31] that χ and τ have distributions with asymptotic behavior given by fχ(x) ∼ |x|−(4/3+1/3D) (4.25) fτ (t) ∼ |t|−(3/2+1/2D) , (4.26) 102 as well as correlations that scale as χ ∼ τ 3/2. (4.27) While the jumps and waiting times are correlated, the asymptotic behavior can be found by approximating the joint jump-wait distribution as the uncorrelated product gχ,τ (x, t) ∼ fχ(x)fτ (t). (4.28) While this ignores the correlations, ultimately the correlations are not relevant in the asymptotic limit. The Lévy behavior follows from this assumption after applying a specialized Montroll–Weiss equation, which relates continuous time step distributions to the position probability density [39]. At this point the process can be asymptotically described by an α-stable Lévy process with 1 +D α = , (4.29) 3D and an anomalous diffusion coefficient given by 1 (π(3)α− 1)α−1Kα = Z , (4.30)sin πα 32α−1 |Γ(α)2| 2 where again Z is given by ( Z √ 1− )/ ( ) D 1 = π Γ Γ . (4.31) 2D 2D 103 This is the connection that justifies the modeling of diffusion in Sisyphus cooling with stable Lévy processes. 4.5. Power Law Extent Interestingly, as D increases the position distribution goes through several phases. Close to D = 0 the position distribution is Gaussian, but then it begins to develop heavy tails for finite simulation time. For D < 1/5 these tails eventually decay at long times, approaching asymptotically Gaussian behavior. However, for D > 1/5 the tails persist and grow in extent as the simulation time increases. To investigate this behavior, we want to estimate the start and end of the power law behavior. The onset of the power laws for D > 1/5 is given by 1 x0 := (KαT ) − α , (4.32) which is the effective width of the Gaussian-like portion of the distribution. Meanwhile, the cutoff of the power law for D > 1/5 is given by √ x := DT−3/21 , (4.33) which is when correlations between χ and τ become important (for a finite simulation time T , there is an effective maximum jump length). If we assume a uniform position distribution with a magnitude a, followed by the power law decay f(x) = bx−(α+1), (4.34) 104 extending from x0 to x1, we have the normalization condition ∫ x1 1 = ax + b x−(α+1)0 dx. (4.35) x0 This leads to b 1 = ax α0 − (x1 − xα0 ) , (4.36)α and for continuity we require bx−α−10 = a. (4.37) This in turn gives us the relation b 1 = bx−α0 − (xα1 − xα0 ) , (4.38)α allowing us to solve for b αxαxα b = 0 1− . (4.39)αxα + xα1 1 xα0 This can be used to plot the power laws as shown in Fig. 4.5. We can see from Fig. 4.4 and Fig. 4.5, that as D is increased, the position distribution begins Gaussian (D = 0.01), develops tails (D = 1.9), has long tails (D = 3.0), and then the tails begin to shorten (D = 0.5, D = 0.7), in terms of orders of magnitude spanned. To illustrate this, we look at the numb(er ) of order of magnitudes the power law extends over, specifically a plot of log x110 ,x0 as shown in Fig. 4.6. Looking at these plots, we find an interesting dependence: For large enough T , a peak forms for a smaller value of D. This occurs because the central portion of the distribution grows faster for larger D than tail of the √ distribution which grows with D. This implies that if visible power laws are desired, D should be tuned to be slightly beyond the D = 0.2 threshold. This 105 FIGURE 4.5. Comparison of position distributions for Sisyphus simulations using Eq. (4.7) and anomalous diffusion with a diffusion constant given by Eq. (4.30). For D = 0.3, an α-stable Lévy simulation (brown) and Sisyphus simulation (light blue) are shown. For D = 0.5, an α-stable Lévy simulation (pink) and Sisyphus simulation (orange) are shown. For D = 0.7, an α-stable Lévy simulation (gray) and Sisyphus simulation (green) are shown. Also shown are associated power laws given by Eq. (4.34). Simulation parameters are ∆t = 0.1, T = 104, N = 220. 106 4 t=104 3 t=103 2 t=102 1 t=101 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 D ( ) FIGURE 4.6. Plot of log x110 as a function of D for various values of T .x0 choice has two benefits: First, at any reasonable “long” time we expect further extent of power-law behavior, and as we increase the time further, the power laws grow in extent more quickly. 4.6. Boundary Crossing Statistics The beginning of Chapter III considered a simple experimental setup with a single atom and a single photodetector. As a signal on the photodetector indicates the presence of the atom in the imaging region, changes in the signal correspond to the atom entering or leaving the imaging region. By measuring the times at which the atom enters or leaves the imaging region, we can therefore measure boundary- crossing statistics. In this section we investigate the measurement of boundary-crossing statistics for atoms undergoing Sisyphus cooling. In Section 4.6.1 we look at simulations 107 orders of magnitude spanned x L d 0 0 τe t FIGURE 4.7. Schematic for escape time simulations. of escape times (Section 2.1.8) for Sisyphus cooling in the Brownian regime, which allows for comparison to exact formula. Next in Section 4.6.2 we look at simulations in the Lévy regime for transit times, a statistic that can be more easily translated into an experimental system. 4.6.1. Brownian Regime Simulations of most boundary-crossing statistics in the Brownian regime behave similarly to theoretical predictions for Brownian motion. As an example we show the escape time distributions (see Section 2.1.8 and Fig. 4.7) for D = 0.1 for various region sizes in Fig. 4.8. 4.6.2. Lévy Regime For the Lévy regime we look at simulations of what we call transit time, τtr. Transit times are the time it takes for a particle to cross a particular region, as shown in Fig. 4.9. For these simulations the particle is started at the center of the region, and is allowed to diffuse until it exits the region. Eventually, the particle 108 0.01 B = 0.028µm B = 0.279µm 0.0001 B = 2.794µm B = 27.937µm 1e-06 1e-08 1e-10 1e-12 1e-14 1e-16 0.001 0.01 0.1 1 10 100 1000 10000 100000 τe/ms FIGURE 4.8. Escape time distributions for D = 0.1, T = 104. Sisyphus simulations using Eq. (4.7) (solid) and theoretical curves using Eq. (2.66) (dashed) are shown. will reenter the region, and the transit time is then measured as the elapsed time between when the particle enters the region and when it leaves. Fig. 4.10 shows simulated transit time distributions for D = 0.7 for various region sizes. In any particular distribution as in Fig. 4.11, there is a broad background until it is cutoff at some large time τcutoff . These distributions appear similar to the Brownian escape times, except that there is a peak present that extends from some time τmin to some time τmax. This peak is somewhat peculiar since there is no obvious velocity scale for the system due to the power-law tail in the Tsallis distribution. To understand this peak, it is helpful to look at sample paths with particular transit times, shown on a semi-log scale in Fig. 4.12. Since this shows trajectories for all transit times, it gives a clearer understanding of what typical trajectories look like for a given transit time. For very short times, the typical particle 109 P (τe) x L x0 0 t0 t1 t2 t FIGURE 4.9. Schematic for transit time distributions. The transit time is given by τtr = t2 − t1. FIGURE 4.10. Transit time distributions for D = 0.7, T = 104. 110 FIGURE 4.11. Escape-time anatomy for D = 0.7, T = 104. trajectory enters the region, turns around, and leaves the region. However, for longer times the trajectories seem to traverse across the region in a single jump (i.e., without getting “stuck” and changing directions). Finally, for long times, the particles occasionally become trapped, and are more or less equally likely to leave the region at either boundary. The transit times corresponding to particles that take the longest to cross (τmax) are the slowest particles that cross over the entire region in a single jump. Due to the correlations between jump length χ and jump duration τ given by χ ∼ τ 3/2, (4.40) the slowest particles have shorter jumps, and so the slowest particles that cross the region in a single jump must have χ ∼ B. Therefore, τmax ∼ B2/3. (4.41) 111 40 20 0 -20 -40 0.0001 0.001 0.01 0.1 1 10 t/ms FIGURE 4.12. Selected trajectories for D = 0.7, T = 104, with region size B = 39.909. The appearance of curved trajectories is due to the semi-log scaling; unscaled, these curved trajectories appear relatively straight. On the other hand, the transit times corresponding to particles that take the shortest time to cross (τmin), must be due to the fastest particles that cross over the entire region in a single jump. Since the velocity of the fastest particles does not depend on the region size we have the relation τmin ∼ B/ 〈vmax〉 . (4.42) These scaling arguments for τmax and τmin are confirmed through simulations in Fig. 4.13 and Fig. 4.14 respectively. Further, with the observations of long, single- direction trajectories for transits occurring between these bounds in Fig. 4.12, we can conclude that the presence of these peaks can be used in the direct detection of jumps due to Lévy flight behavior. 112 x(t)/µm FIGURE 4.13. Plot of τmax extracted from transit-time distributions as a function of the region size. Parameters used: D = 0.7, T = 104. FIGURE 4.14. Plot of τmin extracted from transit-time distributions as a function of the region size. Parameters used: D = 0.7, T = 104. 113 CHAPTER V CONCLUSION This dissertation began with a discussion of universality and the idea that distinctly different systems often approach the same behavior. In some sense, the hope was that by understanding the universal limits, we can ignore the complexities of the individual systems. However, much discussion was devoted to the subtleties of applying these tools, and that there is nuanced process of building connections between abstract universal rules and concrete examples. For example, from the development of fundamental stochastic processes, we saw that the asymptotic behavior guaranteed through the CLT may turn on slowly, such that the asymptotic behavior suggested by the gCLT could be employed even though the preconditions were not satisfied. This led to a sort of phase transition between Lévy flights and Brownian motion where both universal behaviors were present, but at different scales. We also saw that common techniques in the Brownian regime, such as the method of images or the generation of stretched bridge, break down for heavy tailed processes; and yet, the seemingly related Sparre Anderson scaling somehow still holds for both Brownian and Lévy regimes. Developing intuition is particularly important for heavy tailed processes, since many abstract results are prone to error or controversy. We continue to see the application of ill-advised Gaussian statistics in finance, even when more general models are available [13]. As pointed out in Ref. [97], there are several examples of erroneous results in the Lévy flight literature due to poor assumptions involving absorbing boundary conditions for Lévy flights. Even whether animals truly 114 follow Lévy flight patterns, or if such search strategies are actually optimal is not completely clear [19, 59, 60]. Altogether, these examples suggest it is important to develop intuitive ways of approaching Lévy processes. One of the more significant advances on this front was further development of conditioned Lévy processes and Lévy bridges. In particular, the observation that the Lévy process has bifurcations in the midstep distribution, unlike the Brownian case, and that these bifurcations lead to a new length scale that allows for discrimination between singularly large events and composites of many smaller events. This interpretation gives a better intuition for why sample paths appear by eye to have discrete “jumps”, an observation unexplained by the more formal Lévy–Khintchine representation. Since this scale naturally is visible by eye, it seems likely that it would also arise naturally in some physical systems. Finally, we investigated Sisyphus cooling, which represented many of the themes discussed throughout the prior chapters. One of the more interesting outcomes of studying boundary-crossing statistics was a surprising peak in the transit time distribution, which corresponds to the presence of Lévy flights. Furthermore, the scaling relationships developed to explain this peak have relations both to the underlying physical process as well as to the asymptotic behavior. These statistics and relations are useful for the design of future single- atom experiments. 115 APPENDIX 116 A.1. Moments of Momentum Distribution In the main text (Section 4.2) we calculated the moments of p for Sisyphus cooling (described by Eq. (4.7)) through the typical approach of computing ∫ ∞ 〈pm〉 = dpπ(p)pm. (A.1) −∞ However, another way of analyzing this system is via the Langevin approach. We follow the method described in Jacobs’ Stochastic Processes for Physicists pg. 41. As we have we have a nonlinear term in our force, we expect that our solution for 〈p2〉 will depend on all higher moments of p; however, we will see that we will still be able to solve for the moments exactly. First note that the force is anti-symmetric in p. This means that for symmetric initial conditions we expect that 〈pn〉 = 0 for all odd moments. To find the even moments, we will need to find the increment for d(pn) which can be done using Ito’s formula, ( ) ( ) ( ) ∂f ∂f 1 ∂2f dy = dx+ dt+ dx2. (A.2) ∂x ∂t 2 ∂x2 This gives us n(n− 1) d(pn) = (npn−1)dp+ pn−2dp2 (A.3) 2 into which we can insert dp and dp2 to find ( ) ( ) d(pn n−1 − p n(n− 1)) = (np ) dt+ σdW + pn−2 σ2dt, (A.4) 1 + p2 2 117 where we have made use of dW 2 = dt. Taking the expectation gives us 〈 〉 n 2 d(〈pn〉) = − p n(n− 1)σn dt+ 〈pn−2〉dt, (A.5) 1 + p2 2 which can be divided through by dt and reordered into d(〈 〈 〉pn〉) n(n− 1)σ2 n = 〈pn−2〉 − pn . (A.6) dt 2 1 + p2 n Setting d〈p 〉 = 0 to find the steady state, our equation for the moments becomes dt 〈 〉 n 〈pn−2〉 p= An (A.7) 1 + p2 were we have defined the coefficient A = 2n − 2 . It is tempting to distribute the(n 1)σ expectation brackets, but 〈 〉 p2 6 〈p 2〉 = (A.8) 1 + p2 1 + 〈p2〉 since p is not a Gaussian random variable. However, we can expand this term as a series: 〈 〉 n ∑Np = 〈p2〉 − 〈p4〉+ 〈p6〉+ . . . ≈ (−1)k+1〈p2k〉. (A.9) 1 + p2 k=n/2 In doing this, we are assuming the existance and finiteness of the higher moments, and that moments greater than 〈p2N〉 are negligible. This truncation has also made this problem tractable, as we have the relation ∑N 〈pn−2〉 ≈ A k+1 2kn (−1) 〈p 〉 (A.10) k=n/2 118 which holds for n = 2, 4, . . . , 2(N − 1), 2N . Thus there are N equations and N unknowns (all even moments from 〈p2〉 to 〈p2N〉), so this system can be solved. To find a recursive formula, we write a version of Eq. A.10 shifted by n→ n+ 2: ∑N 〈pn〉 = An+2 (−1)k+1〈p2k〉. (A.11) k=n/2+1 Now, we we write another version of Eq. A.10, pulling out one term from the sum:  ∑ N〈pn−2〉 = A 〈pn〉 − (−1)k+1〈p2k〉n . (A.12) k=n/2+1 Solving for the sum, we have ∑N (−1)k+1〈p2k〉 = 〈 1pn〉 − 〈pn−2〉, (A.13) An k=n/2+1 and substituting this into for Eq. A.11 we have ( ) 〈 n〉 〈 n〉 − 1p = A p 〈pn−2n+2 〉 . (A.14) An Now we can solve for 〈pn〉: 〈pn〉 An+2= 〈pn−2〉 (A.15) An(An+2 − 1) Noting that 〈p0〉 = 1, this is a general formula for the moments ∏n/2 〈pn〉 A2k+2= A2k(A2k+2 − , (A.16) 1) k=1 119 and substituting for An gives ∏n/2 〈 n〉 (1− 2k)σ 2 p = . (A.17) (1 + 2k)σ2 − 2 k=1 Finally, this product can be rewritten as ( ) ( ) n n+1 3 1 〈pn〉 i Γ Γ −= √ (2 2 σ)2 . (A.18) π Γ n + 3 − 1 2 2 σ2 The first 4 even moments are 〈 2〉 (1!!)σ 2 p = (2− 3σ2) 〈 4〉 (3!!)σ 4 p = (2− 3σ2)(2− 5σ2) (5!!)σ6〈p6〉 = (2− 3σ2)(2− 5σ2)(2− 7σ2) 6 〈 (7!!)σp8〉 = , (2− 3σ2)(2− 5σ2)(2− 7σ2)(2− 9σ2) where the double factorial (!!) is defined for odd integers N by N !! = N · (N − 2) · (N − 4) . . . 5 · 3 · 1. √ For σ = 2D we have 〈 2〉 (1!!)Dp = (1− 3D) 2 〈p4〉 (3!!)D= (1− 3D)(1− 5D) (5!!)D4〈p6〉 = (1− 3D)(1− 5D)(1− 7D) 〈 8〉 (7!!)D 6 p = . (1− 3D)(1− 5D)(1− 7D)(1− 9D) 120 A.2. Hypergeometric Inversion Formula As an aside, we also can use the moments derived in Appendix A.1 to find an alternative form for the momentum distribution. As we already solved for the momentum distribution in Section 4.2, this amounts to deriving a formula for an inverse Fourier transform of a particular hypergeometric function. We first construct the moment generating function, which is just a power series with the general form ∑∞ 1 Mp(x) = 〈pn〉xn, (A.19) n! n=0 where the coefficients are the moments of f(p). Thus, the moments can be generated from Mp(x) by the relation ∣ 〈pn〉 = (∂n ∣xMp(x)) . (A.20)x=0 Substituting in the moments for f(p) from Eq. (A.18) and noting that the odd moments vanish), the moment generating function is ∑∞ 1 Mp(t) = 1 + 〈p2n〉t2n. (A.21) (2n)! n=1 Substituting in for 〈p2n〉 we get ∑∞ ( ( ) ( )(−1)n Γ n+ 1 Γ 3 − 12M 2 2 σ 2np(x) = 1 + √ ) π Γ n+ 3 − x . (A.22)1 Γ(2n+ 1) n=1 2 σ2 121 It turns out that this series is equivalent to the confluent hypergeometric function given by ( ) 3 1 2 Mp(x) = 0F 1 − ,− x . (A.23) 2 σ2 4 To find the distribution f(p) from the moment generating function, we can first perform a Wick rotation (x → ix), which provides the characteristic function for f(p). Finally, we take the inverse Fourier transform to recover f(p): [ ( )] 3 1 x2 f(p) = F−1[Mp(ix)](p) = F −1 0F 1 − , . (A.24)2 σ2 4 Combining this result with Eq. (4.20), we have the following inversion formula [ ( )] 3 1 x2 1 Γ( 1− 2 ) ( )1 − − √ σ 2 −1/σ2F 0F 1 , (p) = 1 − 1 1 + p , (A.25)2 σ2 4 π Γ( 2 )σ 2 which can be verified numerically. 122 A.3. Equivalence of Discrete and Langevin Methods The discrete random walk takes a step ∆p in momentum every time step ∆t. The direction of the step is determined by ( ) 1 1 P± = 1± F (p)∆p , (A.26) 2 2D where P+ is the probability of a positive step and P− is the probability of a negative step, and F (p) = −p/(1 + p2) is the Sisyphus cooling force. Notice that the size of the time step ∆t must be related to the simulation parameters as it determins the scattering rate Γ. The correct relation is Γ = 1/∆t = ∆p2/2D. Meanwhile the size of the momentum step is what determines the accuracy of the simulation, with a smaller ∆p giving better consistency with the Langevin equation1. To show this is equivalent to the Langevin equation, we can write this discrete step process as a sum of two Poisson processes dp = ∆p dN+ −∆p dN−, (A.27) with the ensemble average of dN± given by 〈〈N±〉〉 = Γ± = P±/∆t. This approximation is valid as long as t  ∆t, the regime where the discrete momentum kicks are not resolvable. We can now approximate our possion process as a drift term and a Wiener process by making the substituion dN± → 1Since photon emission and absorption events are discrete, ∆p is actually physically determined, and the discrete method has some grounds to be a more accurate model. 123 √ Γ± dt+ Γ± dW and then simplifying: √ √ dp = ∆p (Γ+ − Γ−) dt+ ∆p√Γ+ dW1 + ∆p Γ− dW2 (A.28)∆p dp = ((P+ − P−) dt)+ ∆p |Γ+|+ |Γ−| dW (A.29)∆t ∆p 1 ∆p dp = F (p)∆p dt+ √ dW (A.30) ∆t 2D ∆t ∆p2 √ dp = F (p) dt+ 2DdW (A.31) 2D∆t √ dp = F (p) dt+ 2DdW (A.32) This is the same as Eq. (4.7). 124 REFERENCES CITED [1] Borodin, Handbook of Brownian Motion 2nd Ed (Birkhäuser, 2002). [2] A. Einstein, Ann. Phys 19, 11 (1905). [3] A. Shukurov, Dynamics and Evolution of Galactic Nuclei (Taylor & Francis, 2014). [4] L. Bachelier, Annales scientifiques de l’École normale supérieure 17, 21 (1900). [5] M. Sewell, UCL Research 11, 4 (2011). [6] B. Hughes, Random Walks and Random Environments I (Oxford University Press, 1995). [7] R. Brown, The Miscellaneous Botanical Works of Robert Brown (The Ray Society, 1866). [8] J. Perrin, Brownian Movement and Molecular Reality (Taylor and Francis, 1910). [9] F. Black and M. Scholes, The Journal of Political Economy 81, 637 (1973). [10] K. Kiyono, Z. R. Struzik, and Y. Yamamoto, Physical Review Letters 96, 068701 (2006). [11] R. Rebonato, Volatility and Correlation: In the Pricing of Equity, FX and Interest-Rate Options (John Wiley & Sons, 1999). [12] N. N. Taleb, The Black Swan: The Impact of the Highly Improbable (Random House, 2007). [13] R. Cont and P. Tankov, Financial Modelling with Jump Processes (Chapman & Hall, 2004). [14] C. Gardiner, Stochastic Methods. A Handbook for the Natural and Social Sciences, 4th ed. (Springer, 2009). [15] K. Jacobs, Stochastic Processes for Physicists: Understanding Noisy Systems (Cambridge University Press, 2010). [16] M. F. Shlesinger, G. M. Zaslavsky, and U. Frisch, eds., Lévy Flights and Related Topics in Physics: Proceedings of the International Workshop Held at Nice, France, 27-30 June 1994 , Lecture Notes in Physics No. 450 (Springer-Verlag, 1995). 125 [17] V. V. Uchaikin and V. M. Zolotarev, Chance and Stability: Stable Distributions and Their Applications (De Gruyter, 1999). [18] R. Metzler, T. Koren, B. van den Broek, G. J. L. Wuite, and M. A. Lomholt, Journal of Physics A: Mathematical and Theoretical 42, 434005 (2009). [19] G. M. Viswanathan, V. Afanasyev, S. V. Buldyrev, E. J. Murphy, P. A. Prince, and H. E. Stanley, Nature 381, 413 (1996). [20] T. W. Hänsch and A. L. Schawlow, Optics Communications 13, 68 (1975). [21] S. Chu, L. Hollberg, J. E. Bjorkholm, A. Cable, and A. Ashkin, Physical Review Letters 55, 48 (1985). [22] A. Ashkin, Physical Review Letters 24, 156 (1970). [23] V. S. Letokhov and B. D. Pavlik, Soviet Journal of Experimental and Theoretical Physics 45, 698 (1977). [24] A. Aspect, E. Arimondo, R. e a1 Kaiser, N. Vansteenkiste, and C. Cohen-Tannoudji, Physical Review Letters 61, 826 (1988). [25] F. Bardou, J. P. Bouchaud, O. Emile, A. Aspect, and C. Cohen-Tannoudji, Physical review letters 72, 203 (1994). [26] F. Bardou, A. Aspect, J.-P. Bouchaud, and C. Cohen-Tannoudji, Lévy Statistics and Laser Cooling: How Rare Events Bring Atoms to Rest (Cambridge University Press, 2002). [27] C. Cohen-Tannoudji and J. Dalibard, Laser Cooling beyond the Doppler Limit by Polarization Gradients: Simple Theoretical Models (World Scientific, 1989). [28] C. Cohen-Tannoudji, in Light Induced Kinetic Effects on Atoms, Ions and Molecules (ETS Editrice, 1991). [29] J. Dalibard and C. Cohen-Tannoudji, Journal of the Optical Society of America. B, Optical physics 2, 1707 (1985). [30] T. W. Hodapp, C. Gerz, C. Furtlehner, C. I. Westbrook, W. D. Phillips, and J. Dalibard, Applied Physics B 60, 135 (1995). [31] S. Marksteiner, K. Ellinger, and P. Zoller, Physical Review A 53 (1996). [32] Dalibard and Cohen-Tannoudji, Journal of Physics B: Atomic and Molecular Physics 18, 1661 (1985). [33] E. Lutz, Physical Review A 67, 10.1103/PhysRevA.67.051402 (2003). 126 [34] E. L. Raab, M. Prentiss, A. Cable, S. Chu, and D. E. Pritchard, Physical Review Letters 59, 2631 (1987). [35] W. Ketterle and N. J. Van Druten, Advances in atomic, molecular, and optical physics 37, 181 (1996). [36] V. Zaburdaev, S. Denisov, and J. Klafter, Reviews of Modern Physics 87, 483 (2015). [37] H. Katori, S. Schlipf, and H. Walther, Physical Review Letters 79, 2221 (1997). [38] Y. Sagi, M. Brook, I. Almog, and N. Davidson, Physical Review Letters 108, 10.1103/PhysRevLett.108.093002 (2012). [39] E. Barkai, E. Aghion, and D. A. Kessler, Physical Review X 4, 10.1103/PhysRevX.4.021036 (2014). [40] E. G. Flekkø y, Physical Review E 95, 10.1103/PhysRevE.95.012139 (2017). [41] D. A. Steck, Stochastic Processes, http://steck.us/teaching (2019). [42] G. A. Pavliotis, Stochastic Processes and Applications , Texts in Applied Mathematics, Vol. 60 (Springer New York, New York, NY, 2014). [43] J. Klafter and I. M. Sokolov, First Steps in Random Walks: From Tools to Applications (Oxford University Press, 2011). [44] O. E. Barndorff-Nielsen, S. I. Resnick, and T. Mikosch, eds., Lévy Processes (Birkhäuser Boston, Boston, MA, 2001). [45] J. Bertoin, Lévy Processes, Vol. 121 (Cambridge university press Cambridge, 1996). [46] J. Bertoin, Handbook of statistics 19, 117 (2001). [47] D. A. Steck, Quantum and Atom Optics, http://steck.us/teaching (2019). [48] S. Redner, A Guide to First-Passage Processes (Cambridge University Press, 2001). [49] P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations (Springer, 1995). [50] K. Burrage, P. Burrage, D. J. Higham, P. E. Kloeden, and E. Platen, Physical Review E 74, 068701 (2006). [51] K. Debrabant and A. Kvæ rnø, BIT Numerical Mathematics 57, 153 (2017). [52] E. S. Andersen, Mathematica Scandinavica 1, 263 (1953). 127 [53] E. S. Andersen, Skand. Aktuarietikskr 36, 123 (1953). [54] S. Song, J. Jeong, and J. Song, Journal of the Korean Statistical Society 40, 227 (2011). [55] C. Li, Option Pricing with Generalized Continuous Time Random Walk Models , Ph.D. thesis, Queen Mary, University of London (2016). [56] L. Wu, SSRN Electronic Journal 10.2139/ssrn.871760 (2006). [57] T. H. Solomon, E. R. Weeks, and H. L. Swinney, Physical Review Letters 71, 3975 (1993). [58] M. F. Shlesinger, G. M. Zaslavsky, and J. Klafter, Nature 363, 31 (1993). [59] V. V. Palyulin, A. V. Chechkin, and R. Metzler, Proceedings of the National Academy of Sciences 111, 2931 (2014). [60] S. Benhamou, Ecology 88, 1962 (2007). [61] M. A. Lewis, P. K. Maini, and S. V. Petrovskii, eds., Dispersal, Individual Movement and Spatial Ecology , Lecture Notes in Mathematics, Vol. 2071 (Springer Berlin Heidelberg, 2013). [62] H. V. Ribeiro, S. Mukherjee, and X. H. T. Zeng, Physical Review E 86, 10.1103/PhysRevE.86.022102 (2012). [63] J. L. Cabrera and J. G. Milton, Chaos: An Interdisciplinary Journal of Nonlinear Science 14, 691 (2004). [64] D. A. Kessler and E. Barkai, Physical Review Letters 108, 10.1103/PhysRevLett.108.230602 (2012). [65] G. Afek, J. Coslovsky, A. Courvoisier, O. Livneh, and N. Davidson, Physical Review Letters 119, 10.1103/PhysRevLett.119.060602 (2017). [66] E. Aghion, D. A. Kessler, and E. Barkai, Physical Review Letters 118, 10.1103/PhysRevLett.118.260601 (2017). [67] J.-i. Inoue and N. Sazuka, Physical Review E 76, 021111 (2007). [68] R. Mantegna, Phys. Rev. Lett. 73 (1994). [69] J. M. Chambers, C. L. Mallows, and B. W. Stuck, J. Am. Stat. Assoc. 71, 6 (1976). [70] A. V. Chechkin, R. Metzler, V. Y. Gonchar, J. Klafter, and L. V. Tanatarov, Journal of Physics A: Mathematical and General 36, L537 (2003). 128 [71] T. Koren, A. Chechkin, and J. Klafter, Physica A: Statistical Mechanics and its Applications 379, 10 (2007). [72] B. o. Dybiec, E. Gudowska-Nowak, and A. Chechkin, Journal of Physics A: Mathematical and Theoretical 49, 504001 (2016). [73] R. M. Blumenthal, R. K. Getoor, and D. B. Ray, Trans. Am. Math. Soc. 99, 16 (1961). [74] D. C. Brody, L. P. Hughston, and A. Macrina, in Advances in Mathematical Finance, edited by J. J. Benedetto, A. Aldroubi, I. Daubechies, C. Heil, J. McClellan, M. Unser, M. V. Wickerhauser, D. Cochran, H. G. Feichtinger, M. Kunt, W. Sweldens, M. Vetterli, M. C. Fu, R. A. Jarrow, J.-Y. J. Yen, and R. J. Elliott (Birkhäuser Boston, Boston, MA, 2007) pp. 231–257. [75] B. Moskowitz and R. Caflisch, Mathematical and Computer Modelling 23, 37 (1996). [76] J. S. Horne, E. O. Garton, S. M. Krone, and J. S. Lewis, Ecology 88, 2354 (2007). [77] H. Gies, K. Langfeld, and L. Moyaerts, Journal of High Energy Physics 2003, 018 (2003). [78] J. B. Mackrory, T. Bhattacharya, and D. A. Steck, Physical Review A 94, 042508 (2016), arXiv:1606.00150 . [79] D. S. Dean, A. Iorio, E. Marinari, and G. Oshanin, Physical Review E 94, 032131 (2016). [80] P. Levitz, D. S. Grebenkov, M. Zinsmeister, K. M. Kolwankar, and B. Sapoval, Physical Review Letters 96, 180601 (2006). [81] F. Mori, S. N. Majumdar, and G. Schehr, Physical Review Letters 123, 200201 (2019). [82] A. Perret, A. Comtet, S. N. Majumdar, and G. Schehr, Physical Review Letters 111, 240601 (2013). [83] F. B. Knight, Astérisque 263, 19 (1996). [84] P. J. Fitzsimmons and R. K. Getoor, Stochastic Processes and their Applications 58, 17 (1995). [85] E. Hoyle, L. P. Hughston, and A. Macrina, Stochastic Processes and their Applications 121, 856 (2011). 129 [86] E. Hoyle, L. P. Hughston, and A. Macrina, Stable-1/2 Bridges and Insurance: A Bayesian approach to non-life reserving (2010). [87] L. Chaumont and G. Uribe Bravo, The Annals of Probability 39, 609 (2011). [88] P. Fitzsimmons, J. Pitman, and M. Yor, in Seminar on Stochastic Processes, 1992 , edited by E. Çinlar, K. L. Chung, M. J. Sharpe, R. F. Bass, and K. Burdzy (Birkhäuser Boston, Boston, MA, 1993) pp. 101–134. [89] N. Privault, Annales de Institut Henri Poincare (B) Probability and Statistics 40, 599 (2004). [90] A. Janicki and A. Weron, Simulation and Chaotic Behavior of Alpha-Stable Stochastic Processes (CRC Press, 1993). [91] C. Chiarella, European Journal of Political Economy 7, 14 (1991). [92] V. Nagaev and M. Shkol’Nik, Theory of Probability and its Applications 33, 7 (1989). [93] W. W. Erickson and D. A. Steck, arXiv:2002.03849 [cond-mat, physics:nlin] (2020), arXiv:2002.03849 [cond-mat, physics:nlin] . [94] P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg, N. Wermuth, and S. Zeger, in Springer (1999) p. 262. [95] A. Aspect, J. Dalibard, A. Heidmann, C. Salomon, and C. Cohen-Tannoudji, Physical review letters 57, 1688 (1986). [96] D. A. Kessler and E. Barkai, Physical Review Letters 105, 10.1103/PhysRevLett.105.120602 (2010). [97] B. Dybiec, E. Gudowska-Nowak, and P. Hänggi, Physical Review E 73, 10.1103/PhysRevE.73.046104 (2006) 130