A Multi-Task Weak Supervision Framework for Internet Measurements
Date
2021
Authors
Knofczynski, Jared
Publisher
University of Oregon
Abstract
The ability of machine learning (ML) systems to learn and identify patterns in data is of growing importance to researchers in all fields, and especially in the domain of Internet measurements. As our reliance on the Internet grows, ML solutions to networking problems remain invaluable for sustaining the performance of networked systems around the globe. One key issue networking researchers face is a lack of labeled training data, particularly at scale. Traditional labeling strategies, such as crowdsourcing or manual annotation by subject matter experts, are less viable in networking domains: labeling Internet measurement data often requires significant domain expertise that crowdsourced labeling resources do not possess, and the vast quantities of networking data make large-scale manual annotation infeasible. Additionally, many networking applications of ML require running multiple tasks concurrently, and training an isolated model for each task causes training times to grow multiplicatively as the number of tasks increases. This reliance on isolated models also means that potentially useful information may be discarded when one model deems it irrelevant to its own task, even though it could be useful to other models training on the same dataset. Given these challenges, we propose ARISE, a multi-task weak supervision framework for Internet measurements. ARISE uses weak supervision strategies in the form of labeling functions to label vast quantities of networking data, and it shares information between tasks during training to decrease training times, improve classification accuracy, and reduce the influence of hidden biases potentially contained in the training data.
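To illustrate what labeling functions look like in general, the following is a minimal Python sketch and not ARISE's actual implementation. The measurement fields (`rtt_ms`, `loss_pct`), the thresholds, the class names, and the majority-vote combiner are all assumptions made for illustration; weak supervision systems typically combine votes with a learned label model rather than a plain majority vote.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical label values; ABSTAIN means "this heuristic has no opinion".
ABSTAIN = -1
NORMAL = 0
CONGESTED = 1

# A measurement is modeled as a dict of hypothetical numeric fields.
Measurement = Dict[str, float]
LabelingFunction = Callable[[Measurement], int]


def lf_high_rtt(m: Measurement) -> int:
    # Assumed heuristic: a very high round-trip time suggests congestion.
    return CONGESTED if m["rtt_ms"] > 200.0 else ABSTAIN


def lf_high_loss(m: Measurement) -> int:
    # Assumed heuristic: noticeable packet loss suggests congestion.
    return CONGESTED if m["loss_pct"] > 2.0 else ABSTAIN


def lf_low_rtt_no_loss(m: Measurement) -> int:
    # Assumed heuristic: a fast, loss-free path looks normal.
    return NORMAL if m["rtt_ms"] < 50.0 and m["loss_pct"] == 0.0 else ABSTAIN


def weak_label(m: Measurement, lfs: List[LabelingFunction]) -> int:
    """Combine labeling-function votes by simple majority (a stand-in
    for a learned label model). Returns ABSTAIN on no votes or a tie."""
    votes = [v for v in (lf(m) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


LFS: List[LabelingFunction] = [lf_high_rtt, lf_high_loss, lf_low_rtt_no_loss]

samples: List[Measurement] = [
    {"rtt_ms": 250.0, "loss_pct": 3.5},  # two functions vote CONGESTED
    {"rtt_ms": 20.0, "loss_pct": 0.0},   # one function votes NORMAL
    {"rtt_ms": 120.0, "loss_pct": 0.5},  # every function abstains
]
labels = [weak_label(s, LFS) for s in samples]
```

Each labeling function encodes one piece of domain expertise as a cheap, noisy rule, so large volumes of measurements can be labeled without manual annotation. Samples on which every function abstains stay unlabeled.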
Description
1 page.
Keywords
Machine learning, Internet measurements, Weak supervision, Networking, Computer Science