Understanding and Adapting Tree Ensembles: A Training Data Perspective

Brophy, Jonathan

Files

Brophy_oregon_0171A_13478.pdf (11.81 MB)

Date

2023-03-24

Authors

Brophy, Jonathan

Publisher

University of Oregon

Abstract

Despite the impressive success of deep-learning models on unstructured data (e.g., images, audio, text), tree-based ensembles such as random forests and gradient-boosted trees are hugely popular, remain the preferred choice for tabular or structured data, and are regularly used to win challenges on data-competition websites such as Kaggle and DrivenData. For all their predictive performance, however, tree-based ensembles lack certain characteristics, which may limit their further adoption, especially in safety-critical or privacy-sensitive domains such as weather forecasting or predictive medical modeling. This dissertation investigates three shortcomings currently facing tree-based ensembles (lack of explainable predictions, limited uncertainty estimation, and inefficient adaptability to changes in the training data) and posits that numerous improvements can be made by analyzing the relationships between the training data and the resulting learned model. By studying the effects of one or many training examples on tree-based ensembles, we develop solutions that (1) increase their predictive explainability, (2) provide accurate uncertainty estimates for individual predictions, and (3) efficiently adapt learned models to accurately reflect updated training data. This dissertation includes previously published coauthored material.

Keywords

gradient-boosted trees, influence estimation, machine unlearning, random forests, tree ensembles, uncertainty estimation

URI

https://hdl.handle.net/1794/28085

Collections

Theses and Dissertations
Computer Science Theses and Dissertations
