Browsing by Author "Rejaie, Reza"
Item Open Access
A Longitudinal Assessment of Website Complexity (University of Oregon, 2018-09-06) Mostafavi, Seyed Hooman; Rejaie, Reza
Nowadays, most people use several websites daily for purposes such as social networking, shopping, and reading news, which shows the significance of these websites in our lives. As a result, businesses can profit considerably by designing high-quality websites that attract more people. An important aspect of a good website is its page load time. Many studies have analyzed this aspect of websites from different perspectives. In this thesis, we characterize and examine the complexity of a wide range of popular websites in order to discover trends in their complexity metrics over the past six years, such as the number, size, and type of their objects and the number and type of servers contacted to deliver those objects. Moreover, we analyze the correlation between these metrics and page load times.

Item Open Access
Characterizing Online Social Media: Topic Inference and Information Propagation (University of Oregon, 2018-10-31) Rezayidemne, Seyedsaed; Rejaie, Reza
Word-of-mouth (WOM) communication is a well-studied phenomenon in the literature, and content propagation in Online Social Networks (OSNs) is a form of WOM that has become prevalent in recent years, especially with the widespread growth of online communities and online social networks. The basic piece of information in most OSNs is a post (e.g., a tweet on Twitter or a post on Facebook). A post can contain different types of content such as text, photo, or video, or a mixture of two or more of them. There are also various ways to enrich the text by mentioning other users, using hashtags, and adding URLs to external content. The goal of this study is to investigate what factors contribute to the propagation of messages in Google+. To answer this question, a multidimensional study will be conducted.
On the one hand, this question can be viewed as a natural language processing problem in which the topic or sentiment of posts drives message dissemination. On the other hand, propagation can be an effect of graph properties, i.e., the popularity of message originators (node degree) or the activity of communities. Other aspects of this problem are time, external content, and external events. All of these factors are studied carefully to find the attribute(s) most highly correlated with the propagation of posts.

Item Open Access
Characterizing the Structure of Twitter Network Through Socially-Aware Clustering of Users (University of Oregon, 2020) Tan, Eugene; Rejaie, Reza
Popular online social networks (OSNs) such as Twitter form a networked system where millions of users interconnect and exchange information. Characterizing the structural properties of the resulting "relationship graph" among the OSN users is very informative but inherently challenging because of its huge size and complex connectivity patterns. This project explores a novel "socially-aware" approach to classify Twitter users and thus partition the structure of the Twitter relationship graph. To this end, we consider the top 10K most-followed Twitter users, called Twitter elite, and show that these users form coherent and socially meaningful communities, called Twitter elite communities. We define a "social interest vector" for each regular (i.e., non-elite) Twitter user, where each element of this vector captures the user's relative level of interest in a specific elite community based on the fraction of her followings in that elite community. We then rely on this multi-dimensional measure of a user's social interest to cluster millions of randomly selected Twitter users.
We collect profile information, lists of friends and followers, and available tweets for selected Twitter users in each cluster to assess (i) whether the resulting clusters of users are socially coherent, (ii) the relative degree of connectivity between different pairs of clusters, and (iii) the key social attributes of each cluster. Overall, our analysis will illustrate whether elite communities can serve as "landmarks" to meaningfully classify regular Twitter users and characterize the structure of the Twitter network.

Item Open Access
Examining the Complexity of Popular Websites (University of Oregon, 2015-08-18) Tian, Ran; Rejaie, Reza
A significant fraction of today's Internet traffic is associated with popular websites such as YouTube, Netflix, or Facebook. In recent years, major Internet websites have become more complex as they incorporate a larger number and more diverse types of objects (e.g., video, audio, code) along with more elaborate ways of delivering them from multiple servers. These changes not only affect the loading time of pages but also determine the pattern of the resulting traffic on the Internet. In this thesis, we characterize the complexity of major Internet websites through large-scale measurement and analysis. We identify thousands of the most popular Internet websites from multiple locations and characterize their complexity. We examine the effect of relative popularity ranking and business type on the complexity of websites. Finally, we compare and contrast our results with a similar study conducted four years earlier and report on the observed changes in different aspects.

Item Open Access
Gathering Information about Network Infrastructure from DNS Names and Its Applications (University of Oregon, 2015-01-14) Alur, Abhijit; Rejaie, Reza
DNS (Domain Name System) names contain a wide variety of information, such as geographic location, speed of the interface, type of interface, etc.
However, extracting this information is challenging because it does not have a consistent format across different ISPs (Internet service providers) or even within a particular ISP. We present a new tool, GINIE, which extracts useful information and some common dictionary words from a DNS name. We use three ISPs and a CAIDA (Center for Applied Internet Data Analysis) dataset to demonstrate these capabilities. Information extracted with GINIE provides valuable insight into the infrastructure of the three ISPs and shows the availability and type of information in a collection of DNS names from many ISPs that exist in a typical dataset. The embedded information from DNS names can be used (with some additional active measurements) to infer the geo-aware topology of an ISP.

Item Open Access
Investigating the Mutual Impact of the P2P Overlay and the AS-level Underlay (University of Oregon, 2013-07-11) Rasti Ekbatani, Hassan; Rejaie, Reza
During the past decade, the Internet has witnessed a dramatic increase in the popularity of Peer-to-Peer (P2P) applications. This has caused significant growth in the volume of P2P traffic. This trend has been particularly alarming for the Internet Service Providers (ISPs) that need to cope with the associated cost but have limited control over routing or managing P2P traffic. To alleviate this problem, researchers have proposed mechanisms to reduce the volume of external P2P traffic for individual ISPs. However, prior studies have not examined the global effect of P2P applications on the entire network, namely the traffic that a P2P application imposes on individual underlying Autonomous Systems (ASs). Such a global view is particularly important because of the large number of geographically scattered peers in P2P applications. This dissertation examines the global effect of P2P applications on the underlying AS-level Internet.
Toward this end, we first leverage a large number of complete overlay snapshots from a large-scale P2P application, namely Gnutella, to characterize the connectivity and evolution of its overlay structure. We also conduct a case study on the performance of BitTorrent and its correlation with peer- and group-level properties. Second, we present and evaluate respondent-driven sampling as a promising technique to collect unbiased samples for characterizing peer properties in large-scale P2P overlays without requiring the overlay's complete snapshot. Third, we propose a new technique that leverages the geographical location of peers in an AS to determine its geographical footprint and identify the cities where its Points-of-Presence (PoPs) are likely to be located. Fourth, we present a new methodology to characterize the effect of a given P2P overlay on the underlying ASs. Our approach relies on large-scale simulation of BGP routing over AS-level snapshots of the Internet to identify the load imposed on each transit AS. Using our methodology, we characterize the impact of the Gnutella overlay on the AS-level underlay over a 4-year period. Our investigation provides valuable insights into the global impact of a large-scale P2P overlay on individual ASs. This dissertation includes my previously published and co-authored material.

Item Open Access
Longitudinal Analysis of Major Video Streaming Services in the US (University of Oregon, 2020) Hooshmand, Donna; Rejaie, Reza
This study relies on several years of NETFLOW data for traffic exchanged between the University of Oregon network (UOnet) and the Internet to perform a longitudinal analysis of the characteristics of popular Internet applications. We develop techniques to identify connections related to video streams from their NETFLOW records. We then investigate how the fraction of UOnet traffic associated with (i.e., popularity of) major video streaming applications (e.g., 
YouTube, Netflix, Amazon Prime), the basic characteristics of their video (e.g., bandwidth and duration), and their delivery mechanism have evolved over the past few years. Our empirical findings will offer valuable insights into important practical aspects of video streaming services and their evolution over time.

Item Open Access
Measurement-based Characterization of Large-Scale Networked Systems (University of Oregon, 2017-05-01) Motamedi, Reza; Rejaie, Reza
As the Internet has grown to represent arguably the largest “engineered” system on earth, network researchers have shown increasing interest in measuring this large-scale networked system. In the process, structures such as the physical Internet or the many different (logical) overlay networks that this physical infrastructure enables have been the focus of numerous studies. Many of these studies have been fueled by the ease of access to “big data”. Moreover, they benefited from advances in the study of complex networks. However, an important missing aspect in typical applications of complex network theory to the study of real-world distributed systems has been a general lack of attention to domain knowledge. On the one hand, missing or superficial domain knowledge can negatively affect a study's “input”; that is, limitations or idiosyncrasies of the measurement methods can render the resulting graphs difficult to interpret if not meaningless. On the other hand, lacking or insufficient domain knowledge can result in specious “output”; that is, popular graph abstractions of real-world systems are incapable of accounting for “details” that are important from an engineering perspective. In this thesis, we take a closer look at measurement-based characterization of a few real-world large-scale networked systems and focus on the role that domain knowledge plays in gaining a thorough understanding of these systems' key properties and behavior.
More specifically, we use domain knowledge to (i) design context-aware measurement strategies that capture the relevant information about the system of interest, (ii) analyze the captured view of the networked system bearing in mind the abstraction imposed by the chosen graph representation, and (iii) scrutinize the results derived from the analysis of the graph-based representations by investigating the root causes underlying these findings. The main technical contribution of our work is twofold. First, we establish concrete connections between the amount and level of domain knowledge needed and the quality of the measurements collected from networked systems. Second, we provide concrete evidence for the role that domain knowledge plays in the analysis of views inferred from measurements collected from large-scale networked systems.

Item Open Access
Measuring the Evolving Internet in the Cloud Computing Era: Infrastructure, Connectivity, and Performance (University of Oregon, 2020-02-27) Yeganeh, Bahador; Rejaie, Reza
The advent of cloud computing as a means of offering virtualized computing and storage resources has radically transformed how modern enterprises run their business and has also fundamentally changed how today's large cloud providers operate. For example, as these large cloud providers offer an increasing number of ever-more bandwidth-hungry cloud services, they end up carrying a significant fraction of today's Internet traffic. In response, they have started to build out and operate their private backbone networks and have expanded their service infrastructure by establishing a presence in a growing number of colocation facilities at the Internet's edge. As a result, more and more enterprises across the globe can directly connect (i.e., peer) with any of the large cloud providers so that much of the resulting traffic will traverse these providers' private backbones instead of being exchanged over the public Internet.
Furthermore, to reap the benefits of the diversity of these cloud providers' service offerings, enterprises are rapidly adopting multi-cloud deployments in conjunction with multi-cloud strategies (i.e., end-to-end connectivity paths between multiple cloud providers). While prior studies have focused mainly on various topological and performance-related aspects of the Internet as a whole, little to no attention has been given to how these emerging cloud-based developments impact connectivity and performance in today's cloud traffic-dominated Internet. This dissertation presents the findings of an active measurement study of the cloud ecosystem of today's Internet. In particular, the study explores the connectivity options available to modern enterprises and examines the performance of the cloud traffic that utilizes the corresponding end-to-end paths. The study's main contributions include (i) studying the locality of traffic for major content providers (including cloud providers) from the edge of the network, (ii) capturing and characterizing the peering fabric of a major cloud provider, (iii) characterizing the performance of different multi-cloud strategies and associated end-to-end paths, and (iv) designing a cloud measurement platform and decision support framework for the construction of optimal multi-cloud overlays.

Item Open Access
On the Multi-Fractal Nature of Observed IP Addresses in Measured Internet Traffic (University of Oregon, 2023-07-06) OConnor, Walton; Rejaie, Reza
We examine the presence of multifractal properties in the spatial structure of observed IPv4 addresses in measured Internet traffic. A collection of traffic samples from a variety of network settings is assembled, and their spatial structures are evaluated for multifractal properties using the method of moments approach. We show that all collected traces have properties consistent with multifractal scaling, but that the scaling behaviors vary by trace.
We propose mechanisms that may give rise to these behaviors, and then discuss a number of ways in which our empirical finding concerning the spatial structure of observed IP addresses in measured network traffic can be utilized in practice, including its use in modern dataplane network monitor settings, both as a metric to monitor and as a means to increase hardware utilization efficiency.

Item Open Access
The Applications of Machine Learning Techniques in Networked Systems (University of Oregon, 2020-12-08) Jamshidi, Soheil; Rejaie, Reza
Many large networked systems, ranging from the Internet to ones deployed atop the Internet (e.g., Amazon), play critical roles in our daily lives. In these systems, individual nodes (e.g., a computer) establish a physical or virtual connection/relationship to form a networked system and exchange data. An important task in these systems is the timely and accurate detection of security or management events, e.g., a denial-of-service attack on campus. Machine learning (ML) models offer a promising data-driven method to learn the "signature" of these events from past instances and use it to detect future events. While ML models have been very successful in other domains (e.g., image processing), there are clear challenges in using them for event detection in networked systems, including (i) the limited availability of large-scale labeled datasets, (ii) the subtle and changing signature of the target event, (iii) selecting and capturing proper traffic features for (re)training, and (iv) the "black-box" nature of ML models. This dissertation presents three different applications of ML models for event detection based on exchanged messages in networked systems that tackle the above challenges. First, we develop an ML-based method to identify incentivized Amazon reviews. To this end, we present a heuristic-based signature to identify explicitly incentivized reviews (EIRs) and characterize related reviews, products, and reviewers.
We use EIRs to train an ML model for detecting implicitly incentivized reviews. Second, we examine how the casting and training strategies of unsupervised ML (and statistical) models affect their accuracy and overhead (and thus feasibility) for forecasting network data streams. In particular, we study the impact of the size, selection, and recency of the training data on accuracy and overhead. Third, we design and evaluate anomaly detection mechanisms based on an unsupervised ML-based method that takes input data streams from network traffic, end-system, and application load. Furthermore, we leverage model interpretation to identify the most important input data streams and deploy model extraction to infer the rules that represent model behavior. Overall, these three case studies result in numerous insightful findings on a range of practical issues that arise in deploying ML models for event detection in networked systems.