THE MOTION OF THE SPHERICAL COW: NUMERICAL SIMULATIONS OF THE DECELERATION OF A MULTI-ATOM SYSTEM BY A MOTION THROUGH A GASEOUS ENVIRONMENT by JULIA ABDOULLAEVA A THESIS Presented to the Department of Chemistry and Biochemistry and the Robert D. Clark Honors College in partial fulfillment of the requirements for the degree of Bachelor of Science May 2024 An Abstract of the Thesis of Julia Abdoullaeva for the degree of Bachelor of Science in the Department of Chemistry and Biochemistry to be taken May 2024 Title: The Motion of the Spherical Cow: Numerical Simulations of the Deceleration of a Multi- Atom System by a Motion Through a Gaseous Environment Approved: Jeffrey Cina Ph.D. _ Primary Thesis Advisor Particle collision theory is applied in experimental settings such as mass spectroscopy and other particle fragmentation techniques to gain insight into observed behaviors. The energetic interactions that occur between large macromolecules and ions during a collision are complex and require several analytical tools and techniques to fully understand. The research described in this thesis raises the question of whether it is possible to develop a numerical simulation that simplifies this collision model and allows us to form a general understanding of particle interactions during a collision. Several computational methods are implemented in the development of this program including Monte Carlo, a binning procedure, and a double rolling procedure. The numerical simulation will aim to calculate an appropriate background gas particle distribution and relevant trajectories such as the monomer’s center of mass velocity. 2 Acknowledgements First I would like to thank my primary thesis advisor, Dr. Jeffrey Cina, for inviting me to work on this thought provoking research question and for guiding my learning. This research has played a huge role in sparking my passion for theoretical physical chemistry, and for inspiring me to pursue graduate studies. I am very thankful for all of the time and effort that was spent educating me. I would like to thank Dr. Rebecca Altman for being my CHC representative on this thesis. I would like to thank the Prell Lab at the University of Oregon for providing us with the experimental basis for the numerical simulation. I would like to thank the Pathway Oregon Program and the Summit Scholarship Program for funding my undergraduate education. 3 Table of Contents Introduction 6 Computational Methods 11 Monte Carlo Method 11 Bins Procedure 12 Double-Roll Procedure 14 Results and Discussion 17 Conclusions and Future Directions 25 Appendix 26 Bibliography 38 4 List of Figures Figure 1. Simulation setup ............................................................................................................ 10 Figure 2. Distribution of particle velocities .................................................................................. 17 Figure 3. Bin Distribution ............................................................................................................. 18 Figure 4. Modified Bin Distribution ............................................................................................. 18 Figure 5. Distribution of Velocities in the nth Bin ....................................................................... 19 Figure 6. Analytical Convergence to the Boltzmann Distribution at 1000 velocities .................. 20 Figure 7. Analytical Convergence to the Boltzmann Distribution at 10,000 velocities ............... 21 Figure 8. Analytical Convergence to the Boltzmann Distribution at 100,000 velocities ............. 22 Figure 9. Analytical Convergence to the Boltzmann Distribution at 1,000,000 velocities. ......... 23 Figure 10. Monomer center of mass velocity ............................................................................... 24 5 Introduction Numerical simulations in physical chemistry are essential tools for gaining a better understanding of experimental outcomes. The numerical simulation described in this thesis was developed in order to create a simple model of protein behavior in a gaseous environment. The original experiment was developed by the Prell Lab at the University of Oregon to determine mass fragmentation patterns of a protein structure.1 Electrospray ionization (ESI) is an analytical technique in chemistry that aims to transfer a folded protein into the gas phase through ion mobility. 1 Ion mobility is a research technique that aims to promote separation of ionized molecules, as well as their identification. This technique is particularly relevant because of its ability to preserve elements of the initial structure along with the global size measurement. 1 The global size measurement is a way to describe the size of a complex using the mass to charge ratio. The application of these two techniques in conjucture is referred to as native ion mobility- mass spectrometry (IM-MS). 1 IM-MS is relevant to experiments in bioanalytical chemistry, as it can be used as a technique to determine structural biology through a perturbation of the complex using gas-phase activation along with IM-MS. 1 Gas-phase activation is a technique used to fragment gas molecules through collisions with molecular ions. In the experiment of interest, IM-MS was implemented alongside gas-phase activation methods to determine the quaternary and tertiary structures of a protein complex. 1 IM-MS begins by introducing ions into an isothermal and isobaric gas chamber that is influenced by an electrical field. 1 As collisions occur, the technique promotes the degeneration of the protein complex to measure overall fragmentation patterns of the protein. 1 The numerical simulation takes foundational elements from the processes that occur within the IM-MS experiment to develop features of the 6 simulation. Although we aren’t using the results from the simulation to make experimental insights, it is important to use realistic principles to develop a simple model. Collision-induced unfolding (CIU) as well as collision-induced dissociation (CID) are combined in tandem to induce collisions between gas phase particles and the protein complex. 1 During the applications of CIU and CID, protein ions are accelerated to reach a high kinetic energy, and several thousand protein ions are forced to collide with the neutral gas atom. The collisions will eventually lead to the heating of the protein ions. 1 Once the protein ions surpass the maximum threshold for structure stability, the protein will begin dissociation or unfolding. 1 When a protein complex undergoes CID, it can provide information about the complex stoichiometry, as well as the identity of subunits. 1 Application of CIU allows experimenters to determine other important features such as the overall structure, binding patterns, and specificities of the ligand binding site. 1 In 1989, an article titled β€˜Theory of Collisional Activation of Macromolecules. Impulsive Collisions of Organic Ions’ was published by chemists at the University of Oslo describing computational methods for collisional activation. 2 The experiment that is initially described involves the guidance of ions towards a gaseous cell. 1 Energy transfers that occur between the ions and the gas cell result in ion fragmentation due to a technique referred to as collisional activation (CA). 2 Collisional activation is performed by positioning a collision cell directly between two mass analyzers. 2 The first mass analyzer (MS1) will transmit the selected ion, and the second mass analyzer (MS2) will scan and record the fragmented ions. 2 Collisional activation can be described using gas-phase kinetics, and the following model is used to show the time-dependent process of collisional activation and the resultant reaction that occurs because of the kinetic energy transfer. 2 This model denotes molecules A and B in various stages of 7 collisional activation. The ionic complex AB+ undergoes a collision with a gas atom, G, to form the energized complex (A-B+)*. (1) 𝐴𝐡 + 𝐺 β†’ (𝐴 βˆ’ 𝐡 )βˆ— + πΊβˆ— π‘˜(𝐸), 𝑃(𝐸) (2) (𝐴 βˆ’ 𝐡 ) βˆ— 𝐴 + 𝐡 β†’ Once the collisions have occurred, the energy distribution of the ions are referred to as the function P(E). 2 The distribution of P(E) is dependent on the transfer of energy between the ion beam and the internal degrees of freedom of the gas cell. 2 It is concluded that the rate of fragmentation during the experiment is dependent upon the distribution of P(E) and the rate constants k(E). 2 The ion fragmentation rate is determined by the unimolecular rate constants k(E) as well as the internal energy distribution P(E). 2 The theory of impulsive collision transfer (ICT) is a mathematical model used to describe energy transfers due to particle collisions in the experiment. 2 ICT defines the macromolecular ion as a sum of atoms that have masses m 2a,j. The summation of ma,j is defined as the total mass of the ion m 2ion. Prior to the collision and subsequent energy transfer, the macromolecular ion has a relative velocity in the lab frame of reference and is traveling on the defined x-axis within the mass spectrometer. 2 The lab frame of reference referes to the laboratory setting where the experiment of interest is performed. The kinetic energy and momentum of the ion are defined by equations 3 and 4. 2 1 1 (3) 𝐸 = π‘š , 𝑣 = π‘š 𝑣 2 2 (4) 𝑝 = π‘š , 𝑣 = π‘š 𝑣 In order to increase the quantity of fragmented ions, the amount of energy that is channeled into the system is increased by transferring a large amount of energy into the macromolecular ion. 2 8 The center of mass energy of the macromolecular ion can be expressed using the following equation. 2 This equation indicates the maximum energy that can be used for the transfer to the internal degrees of freedom of the system. 2 (5) π‘šπΈ = 𝐸 π‘š + π‘š The foundational theoretical concepts from the impulsive collision transfer (ICT) model are the basis for the development of the numerical simulation. The numerical simulation simplifies the treatment from a macromolecular ion to a unit referred to as the β€œspherical cow”. The spherical cow is a concept in theoretical physics that is used to describe highly simplified models. In the numerical simulation, the particle is modeled as a mono-unit spherical cow that participates in linear, one-dimensional motion through a gaseous environment. As described by the ICT model, the monomer will experience frequent collisions via gas atoms that result in changes of the total energy and center of mass velocity of the monomer. In my simulations, the focus is on the decrease over time of the monomer’s center of mass velocity. An important feature of the numerical simulation that is modeled after the gas-phase activation procedure in IM-MS, is a theoretical distribution of gas particles. The gas particle distribution is modeled after the infamous Boltzmann distribution which can be written in the following form where E is the energy of the system, kB is the Boltzmann constant, and T is the temperature of the system. 3 (6) 𝑃 ∝ 𝑒 The Boltzmann distribution is applied in later calculations, along with factors accounting for the relative velocity of the monomer and a gas atom. Generally, this distribution is used to determine the behavior of the gas particles and how they interact with the monomer. We can define two types of collisions that may occur during the simulation. A backside collision occurs when a gas particle hits the monomer from behind, this results in energy transfer 9 that alters the center of mass velocity. Similarly, a frontside collision is when the gas particle hits the monomer from the front. This will also result in an alteration of the center of mass velocity. Since the monomer is moving in one dimension, a backside collision will visually occur on the left side of the monomer, and a frontside collision will occur on the right side of the monomer. This can be visualized using the diagram modeled in Figure 1.These principles are used to develop the computational methods described below that can perform calculations and obtain results for further analysis. Figure 1. Simulation setup 10 Computational Methods Monte Carlo Method A prevalent concept in computational chemistry that is often applied to large statistical ensembles is a method referred to as Monte Carlo. 4 Monte Carlo procedures have a wide range of applications in research such as macromolecular structures and polymers. The method uses conditional probability to determine the appropriate statistical distributions for a system of study. 4 In the case of this simulation, a Monte Carlo method was developed in order to create a Boltzmann distribution of background gas particles that exist within an appropriate range of velocities. The appropriate range of velocities will be defined as any velocity between -vmax and +vmax where the variable vmax can be computed as follows where Tg and mg are the temperatures of the background gas and the mass of the background gas particles, respectively. 2𝑇 (7) 𝑣 = πΏπ‘œπ‘”[1000] π‘š My application of the Monte Carlo method was developed using the Mathematica software, and follows a procedure that is programmed to test each velocity. Randomized velocities are generated through a While/If loop structure where the initial step implements a RandomReal function. RandomReal is a Mathematica function that will generate a pseudorandom number between 0 and 1. The RandomReal function can be combined alongside other functions to generate an appropriate randomized number for the situation at hand. In the case of the Boltzmann distribution, we can combine the RandomReal function with a function that generates both negative and positive numbers, and multiply it by vmax to create an output that exists within the specified range. The program will then take the velocity, and generate a corresponding 11 probability using the Boltzmann distribution. The procedure will implement a condition for the calculated probability; if the probability is greater than a pseudorandom number generated by the RandomReal function, the condition is met and the gas particle velocity can be appended to a list of velocities. If the condition is not met, the program will restart the procedure until we reach the indicated number of gas particle velocities. The program is set to acquire one million gas particle velocities in order to have a complete understanding of the nature of this distribution. Bins Procedure Once the Monte Carlo method has generated the correct amount of gas particle velocities, we then implemented a procedure to categorize each velocity into a corresponding bin. The program will generate 200 bins, each with specific criteria for binned velocities. The width of each bin is defined as twice the value of vmax divided by the number of bins. A Do loop structure is utilized to test each velocity and determine the correct bin placement using the Floor function. The Floor function takes a number as an input and outputs the greatest integer less than or equal to the input. In this procedure, the Floor function is used to take each gas particle velocity and output the number of the corresponding bin. The function is written as follows where vsb[[nn]] is the specified gas particle velocity, vmax is the maximum gas particle velocity, and dv is the width of each bin. (8) 𝑣𝑠𝑏 [𝑛𝑛] + 𝑣𝑏𝑖𝑛𝑠 = πΉπ‘™π‘œπ‘œπ‘Ÿ + 1 𝑑𝑣 The Do loop will test each velocity using this procedure. The output of this function is the number of the corresponding bin that will be populated. It is important to note that the bin does not contain the numerical value of the velocity itself, instead it contains an integer which 12 represents the number of velocities that corresponds to that bin. Once every velocity is tested we obtain a rough representation of the Boltzmann distribution within the ranges of -vmax and +vmax. We can then take the rough distribution and create a modifed bin distribution by normalizing each value. This is accomplished by creating a new list of values that takes each bin and divides that value by the total number of elements in the original list of velocities. We can now implement a function that tests each modified bin for the middle value. We can define a function that calculates the middle of each bin using the following syntax where the variable n indicates the bin number. (9) π‘‘π‘£π‘šπ‘›π‘[𝑛 ] ≔ βˆ’π‘£ + + 𝑑𝑣(𝑛 βˆ’ 1) 2 This function is used in tandem with a normalized Boltzmann distribution function which is written as follows in equation 10; π‘š ( [ ]) (10) 𝑃(π‘šπ‘›π‘[𝑛]) = 𝑒 𝑑𝑣 2πœ‹ βˆ— 𝑇𝑔 The value for the middle of each bin is inputted into the normalized Boltzmann distribution, and the value obtained from the normalized bin distribution is placed into a list that when plotted is a representation of the distribution of velocities in each bin. We can then perform an analytical comparison between the original Boltzmann distribution and the normalized bin distribution to show a convergence to the Boltzmann distribution as we increasingly populate the velocity bins. This analytical convergence to the Boltzmann distribution is shown as we go from a distribution as small as 1000 velocities to the ideal range of 1,000,000 velocities. The convergence is an important factor to keep track of because we want to ensure the simulation is behaving properly. By performing the procedure at a wide range of velocities, we ensure that the simulation is actually converging to the ideal model. 13 Double-Roll Procedure Now that we have a functioning distribution of gas particle velocities, we need to determine how the gas particles will interact with the monomer. The double roll procedure will determine when and what kind of collisions occur. There are two types of collisions that are relevant to the numerical simulation; a backside collision and a forward collision. This distinction is important to note because we will use slightly different probabilities and probability densities to describe the behavior of each type of collision. In the case of a backside collision we can define a probability for a collision between the monomer and a gas particle as follows where  is the linear density of the background gas particles, dt is the change in time,  is a variable defined as the mass of the gas particles divided by twice the temperature of the gas, and v1 is the monomer velocity. 1 𝑣 (11) 𝑝 (𝑣 , ∞) = π›Ώπ‘‘πœŒ{ 𝑒 βˆ’ π‘’π‘Ÿπ‘“π‘[βˆšπ›Όπ‘£ ]} 2βˆšπœ‹π›Ό 2 At the beginning of the timestep, the simulation uses the RandomReal function to select a number between 0 and 1. In the case where the randomized number is greater than the probability function, there is no collision that occurs during that timestep, and the procedure restarts. In the case where a collision does occur, we need to determine what type of collision occurs and ensure that the gas particle exists within an appropriate range of velocities. This part of the procedure is performed through a guess-and-select procedure using the probability density for backside collisions. The probability density is defined as follows where vg is the gas particle velocity. 𝛼 (12) 𝑀 𝑣 , 𝑣 = 𝛿𝑑 𝑣 βˆ’ 𝑣 π‘ˆπ‘›π‘–π‘‘π‘†π‘‘π‘’π‘[𝑣 βˆ’ 𝑣 ]𝜌 𝑒 πœ‹ 14 The probability density will take its maximum value when vg is equal to vg,bar. vg,bar is defined as follows in equation 13; 1 2 (13) 𝑣 , = 𝑣 + 𝑣 + 2 𝛼 The following function will serve as a weighting function which accepts values between zero and unity. This can accept or reject values for vg within the range of -vg,max and vg,max. 𝑀 (𝑣 , 𝑣 ) 𝑣 βˆ’ 𝑣 (14) = π‘ˆπ‘›π‘–π‘‘π‘†π‘‘π‘’π‘(𝑣 βˆ’ 𝑣 ) 𝑒 ( , ) 𝑀 (𝑣 , 𝑣 , ) 𝑣 , βˆ’ 𝑣 After the weighting function determines the type of collision that occurs, we calculate what is referred to as the bump in velocity. This refers to the velocity changes that the monomer experiences as a result of frequent collisions, and is used to track the monomer’s center of mass velocity. The velocity bump is calculated at the end of each successfully completed timestep, and can be expressed as follows where mg is the mass of one gas particle, m is the mass of the monomer, and vmf[[jj]] is the previous monomer velocity that is used for the updated calculation. 2π‘š (15) 𝑣 = (𝑣 βˆ’ π‘£π‘šπ‘“ [𝑗𝑗] ) π‘š + π‘š The velocity bump is added to the previous velocity to calculate the overall change in center of mass velocity. The probability for head on collisions is defined as follows. (16) 1 𝑣𝑝 (𝑣 , βˆ’βˆž) = π›Ώπ‘‘πœŒ{ 𝑒 + π‘’π‘Ÿπ‘“π‘[βˆ’βˆšπ›Όπ‘£ ]} 2βˆšπœ‹π›Ό 2 Head on collisions will only occur if the rolled number between zero and unity is less than the probability for head on collisions to occur. The corresponding probability density is defined as follows. 15 𝛼 (17) 𝑀 𝑣 , 𝑣 = π›Ώπ‘‘πœŒ 𝑣 βˆ’ 𝑣 π‘ˆπ‘›π‘–π‘‘π‘†π‘‘π‘’π‘(𝑣 βˆ’ 𝑣 ) 𝑒 πœ‹ The probability density has a maximum value at the value vg,dbar which can be written using the following syntax. 1 2 (18) 𝑣 , = 𝑣 βˆ’ 𝑣 + 2 𝛼 The weighting funcztion for head on collisions, is written as follows, and is used to select gas particle velocities within the appropriate range for head on collisions. 𝑀 (𝑣 , 𝑣 ) 𝑣 βˆ’ 𝑣 (19) = π‘ˆπ‘›π‘–π‘‘π‘†π‘‘π‘’π‘ 𝑣 βˆ’ 𝑣 𝑒 ( , ) 𝑀 (𝑣 , 𝑣 , ) 𝑣 βˆ’ 𝑣 , This part of the procedure will also calculate the bump of velocity after each successful timestep, and the updated velocity will be appended to the list of center of mass monomer velocities. Once the procedure has been performed the desired amount of times, we can plot the list of monomer velocities to view changes in the center of mass velocity throughout the simulation. 16 Results and Discussion The Monte Carlo procedure allows us to generate a distribution of gas particle velocities, as shown below in Figure 2. The velocity list, where the gas particle velocities are stored, has 1000 list elements appended to it, each with a varying distribution of velocities ranging anywhere from -3 vg,max d/t to 3 vg,max d/t . Figure 2. Distribution of particle velocities As described earlier, each velocity is binned according to its numerical value, and is represented in Figure 3 as the unmodified bin distribution. The bin distribution is later modified as is shown in Figure 4, in order to create a normalized representation of the bin distribution for later comparisons with the Boltzmann distribution. 17 Figure 3. Bin Distribution Bin Distribution 20 15 10 5 0 0 50 100 150 200 Bin Figure 4. Modified Bin Distribution Modified Bin Distribution 0.020 0.015 0.010 0.005 0.000 0 50 100 150 200 Bin 18 Modified Number of Corresponding Velocities Number of Velocities The probability distribution described in Figure 5, shows the probability of a gas particle velocity exisiting in the nth bin. The distribution is modeled after the Boltzmann distribution, we can see that the greatest probability for a gas particle existing in a bin, is in the center bin of the overall distribution. As you go farther from the center bin, the probability decreases and approaches zero. Figure 5. Distribution of Velocities in the nth Bin Distribution of Velocities in the nth Bin 0.014 0.012 0.010 0.008 0.006 0.004 0.002 0.000 0 50 100 150 200 Bin In Figures 6-9, we plot the modified bin distribution together with the probability distribution described in Figure 5 at varying numbers of velocities in order to show analytical convergence. We begin at 1000 velocities in Figure 6 and eventually reaching 1,000,000 velocities in Figure 9. As the number of velocities is increased, we see that the modified bin distribution eventually begins to converge to the probability distribution. This indicates that the binning distribution closely follows a Boltzmann probability distribution, which is the desired 19 Probability of Existing in the Velocity of the Nth Bin result for this computation. The Boltzmann distribution is the ideal model for the simulation because the Boltzmann distribution describes the behavior of an infinite collection of gas particle velocities. If the simulation was performed to obtain an infinite collection of gas particles, the distribution would ideally perfectly match a Boltzmann distribution. Figure 6. Analytical Convergence to the Boltzmann Distribution at 1000 velocities 20 Figure 7. Analytical Convergence to the Boltzmann Distribution at 10,000 velocities 21 Figure 8. Analytical Convergence to the Boltzmann Distribution at 100,000 velocities 22 Figure 9. Analytical Convergence to the Boltzmann Distribution at 1,000,000 velocities. The double-roll procedure described earlier allows us to plot the monomer’s center of mass velocity which is graphed in Figure 10. The monomer has an initial velocity of 5.0 d/t, and within a duration of 200,000 timesteps we see the monomer’s velocity slow down to approximately 3.5 d/t in a stepwise way. Although it is expected that the monomer’s center of mass velocity slows down, the stepwise pattern is unexpected. Initial predictions for the center of mass velocity is that it will slow down in a smooth pattern, but the results show that the velocity oscillates between a specific range of values and then rapidly drops down. For the purposes of this simulation, the current results are valid, but are open to further testing and development to investigate this pattern. 23 Figure 10. Monomer center of mass velocity Center of Mass Velocity 3.9 3.8 3.7 3.6 3.5 0 50 000 100 000 150 000 200 000 TimeStep 24 Monomer Velocity Conclusions and Future Directions The results gained from this numerical simulation have allowed us to paint an initial picture of what occurs when a single-unit spherical cow interacts with a distribution of gas particle velocities. These initial results indicate that the Monte Carlo procedure and the binning procedure create a desired distribution of gas particle velocities that is modeled after a modified version of the Boltzmann distribution. We can also say that the monomer’s center of mass velocity slows down throughout the course of the simulation. The numerical simulation has several future goals that need to be achieved in order to gain a more conclusive picture of particle collisions in a mass spectroscopy experiment. The next step that needs to be taken is a simulation of two harmonically bound particles, and how they experience heating and cooling due to energy transfers from collisions with the gas particles. The end goal of this program is to simulate a spherical cow with many components, in order to get a more conclusive picture. 25 Appendix 26 27 28 29 30 31 32 33 34 35 36 37 Bibliography 1. Donor, M.; Shepherd, S. O.; Prell, J. S. Rapid Determination of Activation Energies for Gas-Phase Protein Unfolding and Dissociation in a Q-IM-ToF Mass Spectrometer. Journal of the American Society for Mass Spectrometry 2020, 31 (3), 602–610. https://doi.org/10.1021/jasms.9b00055. 2. Einar Uggerud, and Peter J Derrick. β€œTheory of Collisional Activation of Macromolecules. Impulsive Collisions of Organic Ions.” The Journal of Physical Chemistry, vol. 95, no. 3, 1 Feb. 1991, pp. 1430– 3. Leonard Kollender Nash. Elements of Statistical Thermodynamics. Mineola, New York, Dover Publ., Cop, 2006. 4. Allen, Michael P, and Dominic J Tildesley. Computer Simulation of Liquids. Oxford,Oxford University Press, 2017. 38