Fickas, Stephen
Schwartz, Samuel
2024-12-19
2024-12-19
2024-12-19
https://hdl.handle.net/1794/30280

This dissertation presents empirically driven quantitative and qualitative analyses of software projects in two large ecosystems of research production in the United States: national laboratories and universities. It is grounded in the fields of software engineering and software repository mining.

In the 2002 paper "What makes good research in software engineering," the authors identified several categories of software engineering research questions and gave examples. Three of these categories are:
- Method or Means of Development (e.g., How can we do/create X?)
- Generalization or Characterization (e.g., What, exactly, do we mean by X? What are the important characteristics of X? What are the varieties of X, and how are they related?)
- Design, Evaluation, or Analysis of a Particular Instance (e.g., How does X compare to Y? What is the current state of S/practice of P?)

This work interrogates these three categories as applied to software engineering projects that fall under the umbrella of the emerging field of research software engineering. Focusing on the United States' national laboratories and universities because of their high levels of publicly available research output, we ask the following research questions (RQs):
RQ1: How can we find open source software repositories connected to universities and national laboratories? (Method of Development)
RQ2: Given our methodology, what is the current state of affairs? Just how many open source software repositories and projects affiliated with universities and national laboratories are out there? (Analysis of a Particular Instance/Domain)
RQ3: What are the properties, characteristics, and varieties of software projects with a nexus to these research institutions? (Generalization or Characterization, Analysis of a Particular Instance/Domain)
RQ4: How do the characteristics of repositories in the university ecosystem compare with the characteristics of repositories in the national laboratory ecosystem? (Analysis of a Particular Instance/Domain)
RQ5: How does the code in these research projects relate to and depend on other projects in the ecosystem? (Generalization or Characterization)

In this work we contextualize these questions with background information and answer each in turn.

en-US
All Rights Reserved.
Empirical Quantitative Analyses of Research Software Engineering Projects in Scientific Computing.
Electronic Thesis or Dissertation