The workshop started with a focus on measurements. A large portion of the submitted papers presented and discussed measurement data, and these submissions provided a good basis for a better understanding of the situation, covering different angles and aspects of network traffic and different kinds of networks.
Changes in Internet traffic due to the COVID-19 pandemic affected different networks in various ways. Yet all networks saw some form of change, be it a reduction in traffic, an increase in traffic, a change in workday and weekend diurnal patterns, or a change in traffic classes. Traffic volume, directionality ratios, and traffic origins and destinations were radically different from before COVID-19.
At a high level, traffic from home networks increased significantly, while different trends were observed for traffic in mobile networks. Either the traffic increased as well -- for instance, in locations where use of residential ISP services is less common -- or it decreased as a result of reduced population mobility. Where mobile traffic decreased, the trend was the opposite of what was observed in residential ISPs.
While diurnal congestion at interconnect points as well as in certain last-mile networks was reported, mainly in March, no persistent congestion was observed. Further, a downward trend in download throughput to certain cloud regions was measured, which can probably be explained by the increased use of cloud services. This gives another indication that the scaling of shared resources in the Internet is working well enough to handle even larger changes in traffic, as experienced during the first nearly global lockdown of the COVID-19 pandemic.
The global pandemic has significantly accelerated the growth of data traffic worldwide. Based on the measurement data of one ISP, three IXPs, a metropolitan educational network, and a mobile operator, it was observed at the beginning of the workshop [Feldmann2020] that, overall, the network was able to handle the situation well despite a significant and sudden increase in the traffic growth rate in March and April. That is, after the lockdown was implemented in March, a traffic increase of 15-20% was observed at the ISP as well as at the three IXPs. This traffic growth, which would typically occur over a year, took place over a few weeks -- a substantial increase. At DE-CIX Frankfurt, the world's largest Internet Exchange Point in terms of data throughput, the year 2020 saw the largest increase in peak traffic within a single year since the IXP was founded in 1995. Additionally, mobile traffic receded slightly. In access networks, the growth rate of upstream traffic also exceeded the growth in downstream traffic, reflecting increased adoption and use of videoconferencing and other remote work and school applications.
Most traffic increases happened outside of pre-pandemic peak hours. Before the first COVID-19 lockdowns, the main time of use was in the evening hours during the week, whereas, since March, it has been spread more equally across the day. That is, the increase in usage has mainly occurred outside the previous peak usage times (e.g., during the day while working from home). This means that, for the first time, network utilization on weekdays resembled that on weekends. The effects of the increased traffic volume could easily be absorbed, either by using existing reserve capacity or by quickly switching additional bandwidth. This is one reason why the Internet was able to cope well with the pandemic during the first lockdown period.
Some of the lockdowns were lifted or relaxed around May 2020. As people were allowed to resume some of their daily activities outside of their home again, as expected, there was a decrease in the traffic observed at the IXPs and the ISP; instead, mobile traffic began to grow again.
The composition of data traffic has changed since the beginning of the pandemic: the use of videoconferencing services and virtual private networks (VPNs) for access to company resources from the home environment has risen sharply. In ISP and IXP networks, it was observed [Feldmann2020] that traffic associated with web conferencing, video, and gaming increased significantly in March 2020 as a result of the increasing user demand for solutions like Zoom or Microsoft Teams. For example, the relative traffic share of many "essential" applications like VPN and conferencing tools increased by more than 200%.
Also, as people spent more hours at home, they tended to watch videos or play games, thus increasing entertainment traffic demands. At the same time, the traffic share for other traffic classes decreased substantially, e.g., traffic related to education, social media, and, for some periods, content delivery networks (CDNs). In April and June, web conferencing traffic was still high compared to the pre-pandemic scenario, while a slight decrease in CDN and social media traffic was observed. During these months, many people were still working from home, but restrictions had been lifted or relaxed, which likely led to an increase in in-person social activities and a decrease in online social activities.
Changes in traffic have been observed at university campus networks as well, especially due to the necessary adoption of remote teaching. The Politecnico di Torino (Italy) deployed its in-house solution for remote teaching, which caused the outgoing traffic to grow by 2.5 times, driven by more than 600 daily online classes. Incoming traffic instead decreased by a factor of 10 due to the cessation of any in-person activity. Based on their measurements, this change in traffic and network usage did not, however, lead to noticeable performance impairments, nor was significantly poor performance observed for students in remote regions of Italy. Outgoing traffic also increased due to other remote working solutions, such as collaboration platforms, VPNs, and remote desktops.
Similar changes were observed by measuring REDIMadrid [Feldmann2020], a European educational and research network that connects 16 independent universities and research centers in the metropolitan region of Madrid. A drop of up to 55% in traffic volume on working days during the pandemic was observed. Similar to findings for ISP/IXP networks, it was observed that working days and weekend days are becoming more similar in terms of total traffic. The hourly traffic patterns reveal a traffic increase between 9 pm and 7 am. This could be due to users working more frequently at unusual times but could also potentially be caused by overseas students (mainly from Latin America and East Asia as suggested by the Autonomous System (AS) numbers from which these connections came) who accessed university network resources from their home countries.
Given the fact that the users of the academic network (e.g., students and research staff) had to leave campus as a response to lockdown measures, the traffic in-and-out (i.e., ingress and egress) ratio also changed drastically. Prior to the lockdown, the incoming traffic volume was much larger than the outgoing traffic volume. This changed to a more balanced ratio. This change of traffic asymmetry can be explained by the nature of remote work. On the one hand, users connected to the network services mainly to access resources, hence the increase in outgoing traffic. On the other hand, all external (i.e., Internet-based) resources requested during work were no longer accessed from the educational network but from the users' homes.
Mobile network data usage appeared to decline following the imposition of localized lockdown measures as these reduced typical levels of mobility and roaming.
A measurement study of the O2 UK cellular network evaluated how the changes in people's mobility impacted traffic patterns. By analyzing cellular network signaling information regarding users' device mobility activity, the authors observed a decrease of 50% in mobility (according to different mobility metrics) in the UK during the lockdown period. They found no correlation between this reduction in mobility and the number of confirmed COVID-19 cases, which suggests that only the enforced government order was effective in significantly reducing mobility. This reduction was more significant in densely populated urban areas than in rural areas. For London specifically, it could be observed from the mobile network data that approximately 10% of residents temporarily relocated during the lockdown.
These mobility changes had immediate implications for the traffic patterns of the cellular network. The downlink data traffic volume aggregated for all bearers (including conversational voice) decreased for the entire UK by up to 25% during the lockdown period. This correlates with the reduction in mobility that was observed countrywide, which likely resulted in people relying more on residential broadband Internet access to run download-intensive applications such as video streaming. The observed decrease in the radio cell load, with a reduction of approximately 15% across the UK after the stay-at-home order was enacted, further corroborates the drop in cellular connectivity usage.
The total uplink data traffic volume, on the other hand, experienced little change (between -7% and +1.5%) during lockdown. This was mainly due to the increase of 4G voice traffic (i.e., Voice over LTE (VoLTE)) across the UK that peaked at 150% after the lockdown compared to the national median value before the pandemic, thus compensating for the decrease in data traffic in the uplink.
Finally, it was also observed that mobility changes have a different impact on network usage in geodemographic area clusters. In densely populated urban areas, a significantly higher decrease of mobile network usage (i.e., downlink and uplink traffic volume, radio load, and active users) was observed compared to rural areas. In the case of London, this was likely due to the geodemographics of the central districts, which include many seasonal residents (e.g., tourists) and business and commercial areas.
Traffic at points of network interconnection noticeably increased, but most operators reacted quickly by adding capacity [Feldmann2020]. The amount of increase varied, with some networks that hosted popular applications such as videoconferencing experiencing traffic growth of several hundred to several thousand percent. At the IXP level, it was observed that port utilization increased. This phenomenon is mostly explained by higher traffic demand from residential users.
Measurements of interconnection links at major US ISPs by the Center for Applied Internet Data Analysis (CAIDA) and the Massachusetts Institute of Technology (MIT) found some evidence of diurnal congestion around the March 2020 time frame [Clark2020], but most of this congestion disappeared in a few weeks, which suggests that operators indeed took steps to add capacity or otherwise mitigate the congestion.
Cloud infrastructure played a key role in supporting the bandwidth-intensive videoconferencing and remote learning tools that made social distancing practical during the COVID-19 pandemic. Network congestion between cloud platforms and access networks could impact the quality of experience of these cloud-based applications. CAIDA leveraged web-based speed test servers to take download and upload throughput measurements from virtual machines in public cloud platforms to various access ISPs in the United States [Mok2020].
The key findings included the following:
Persistent congestion events were not widely observed between cloud platforms and these networks, particularly for large-scale ISPs, but we could observe large diurnal download throughput variations in peak hours from some locations to the cloud.
There was evidence of persistent congestion in the egress direction to regional ISPs serving suburban areas in the US. Their users could have suffered from poor video streaming or file download performance from the cloud.
The macroscopic analysis over 3 months (June-August 2020) revealed downward trends in download throughput from ISPs and educational networks to certain cloud regions. We believe that increased use of the cloud in the pandemic could be one of the factors that contributed to the decreased performance.
The last mile is the centerpiece of broadband connectivity, where poor last-mile performance generally translates to poor quality of experience. In a recent Internet Measurement Conference (IMC '20) research paper, Fontugne et al. investigated last-mile latency using traceroute data from Reseaux IP Europeens (RIPE) Atlas probes located in 646 ASes and looked for recurrent performance degradation [Fontugne2020-1]. They found that, in normal times, Atlas probes experience persistent last-mile congestion in only 10% of ASes, but they recorded 55% more congested ASes during the COVID-19 outbreak. This deterioration caused by stay-at-home measures is particularly marked in networks with a very large number of users and in certain parts of the world. They found Japan to be the most impacted country in their study, looking specifically at the Nippon Telegraph and Telephone (NTT) Corporation Open Computer Network (OCN) but noting similar observations for several Japanese networks, including Internet Initiative Japan (IIJ) (AS2497).
From mid-2020 onward, however, they observed better performance than before the pandemic. In Japan, this was partly due to the deployments originally planned for accommodating the Tokyo Olympics, and, more generally, it reflects the efforts of network operators to cope with these exceptional circumstances. The pandemic has demonstrated that the Internet's adaptive design and proficient community can keep it operational during such unprecedented events. Also, judging from the numerous research and operational reports recently published, the pandemic is apparently shaping a more resilient Internet; as Nietzsche wrote, "What does not kill me makes me stronger".
The type of traffic needed by the users also changed in 2020. Upstream traffic increased due to the use of videoconferences, remote schooling, and similar applications. The National Cable & Telecommunications Association (NCTA) and Comcast reported that while downstream traffic grew 20%, upstream traffic grew by as much as 30-37% [NCTA2020]. Vodafone reported that upstream traffic grew by 100% in some markets [Vodafone2020].
Ericsson's ConsumerLab surveyed users regarding their usage and experiences during the crisis. Some of the key findings in [ConsumerlabReport2020] were as follows:
9 in 10 users increased Internet activities, and time spent connected increased. In addition, 1 in 5 started new online activities; many in the older generation felt that they were helped by video calling; parents felt that their children's education was helped; and so on.
Network performance was, in general, found satisfactory. 6 in 10 were very satisfied with fixed broadband, and 3 in 4 felt that mobile broadband was the same or better compared to before the crisis. Consumers valued resilience and quality of service as the most important responsibility for network operators.
Smartphone application usage changed, with the fastest growth in apps related to COVID-19 tracking and information, remote working, e-learning, wellness, education, remote health consultation, and social shared experience applications. The biggest decreases were in travel and booking, ride hailing, location, and parking applications.
Some of the behaviors are likely permanent changes [ConsumerlabReport2020]. The adoption of video calls and other new services by many consumers, such as the older generation, is likely going to have a long-lasting effect. Surveys in various organizations point to a likely long-term increase in the number of people interested in remote work [WorkplaceAnalytics2020].
The second and third days of the workshop focused on open discussions of arising operational and architectural issues and the conclusions that could be reached from previous discussions and other issues raised in the position papers.
Measurements from Fastly confirmed that Internet traffic volume in multiple countries rose rapidly while COVID cases were increasing and lockdown policies were coming into effect. Download speeds also decreased, but much less dramatically than overall bandwidth usage increased. School closures led to a dramatic increase in traffic volume in many regions, and other public policy announcements triggered large traffic shifts. This suggests that governments should coordinate with operators to allow time for preemptive operational changes in some cases.
Measurements from the US showed that download rates correlate with income levels. However, download rates in the lowest income zip codes increased as the pandemic progressed, narrowing the divide with higher income areas. One possible explanation is that some ISPs, such as Comcast and Cox, increased speeds for users on certain lower-cost plans and in certain areas. This suggests that network capacity was available and that the correlation between income and download rates was not necessarily due to differences in the deployed infrastructure in different regions, although it was noted that certain access link technologies provide more flexibility than others in this regard.
Web conferencing systems (e.g., Microsoft Teams, Zoom, Webex) saw incredible growth, with overnight traffic increases of 15-20% in response to public policy changes, such as lockdowns. This required significant and rapid changes in infrastructure provisioning.
Major video providers (YouTube, etc.) reduced bandwidth by 25% in some regions. It was suggested that this had a huge impact on the quality of videoconferencing systems until networks could scale to handle the full bit rate, but operators of some other services saw limited impact.
Updates to popular games have a significant impact on network load. Some discussions were reported between ISPs, CDNs, and the gaming industry on possibly coordinating various high-bandwidth update events, similar to what was done for entertainment/video download speeds. There was an apparently difficult interplay between bulk download and interactive real-time applications, potentially due to buffer bloat and queuing delays.
It was noted that operators have experience with rapid growth of Internet traffic. New applications with exponential growth are not that unusual in the network, and the traffic spike due to the lockdown was not that unprecedented for many. Many operators have tools and mechanisms to deal with this. Ensuring that knowledge is shared is a challenge.
Following these observations, traffic prioritization was discussed, starting from Differentiated Services Code Point (DSCP) marking. The question arose as to whether a minimal priority-marking scheme would have helped during the pandemic, e.g., by allowing marking of less-than-best-effort traffic. That discussion quickly devolved into a more general QoS and observability discussion and, as such, also touched on the effects of increased encryption. The group was not, unsurprisingly, able to resolve the different perspectives and interests involved, but the discussion demonstrated that progress was made.
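As a concrete illustration of the kind of minimal marking discussed here, the following sketch shows how an application could mark its own traffic as less-than-best-effort. It assumes the Lower-Effort (LE) codepoint from RFC 8622 (DSCP 000001) and a platform whose sockets expose `IP_TOS` (for example, Linux); the helper names `dscp_to_tos` and `open_le_socket` are hypothetical. Whether any network along the path honors, ignores, or rewrites the marking is a matter of operator policy, which is part of why the discussion was hard to resolve.

```python
import socket

# DSCP for the Lower-Effort (LE) per-hop behavior, as defined in RFC 8622.
LE_DSCP = 0b000001


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2  # the two low-order bits are reserved for ECN


def open_le_socket() -> socket.socket:
    """Create a UDP socket whose outgoing IPv4 packets carry the LE mark.

    Assumes the platform exposes socket.IP_TOS; this only sets the mark
    and does not guarantee any particular treatment in the network.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(LE_DSCP))
    return sock
```

A bulk transfer such as a game update could, under these assumptions, send through a socket from `open_le_socket()` so that it yields to interactive traffic wherever LE is supported.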
It is clear that there is a contrast in experience. Many operators reported few problems in terms of metrics, such as measured download bandwidth, while videoconferencing applications experienced significant usability problems running on those networks. The interaction between application providers and network providers worked very smoothly to resolve these issues, supported by strong personal contacts and relationships. But it seems clear that the metrics used by many operators to understand their network performance don't fully capture the impact on certain applications, and there is an observability gap. Do we need more tools to figure out the various impacts on user experience?
These types of applications use surprising amounts of Forward Error Correction (FEC). Applications hide lots of loss to ensure a good user experience. This makes it harder to observe problems. The network can be behaving poorly, but the experience can be good enough. Resiliency measures can improve the user experience but hide severe problems. There may be a missing feedback loop between application developers and operators.
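To make the masking effect concrete, here is a minimal sketch of the simplest form of FEC: one XOR parity packet per group of equal-length media packets. It is an illustration only, not the scheme used by any particular conferencing product (deployed systems typically use more capable codes), and the function names are hypothetical. The receiver rebuilds a single lost packet, so the application sees no gap even though the network dropped data.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Sender side: XOR all packets of a group into one parity packet."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Receiver side: rebuild at most one lost packet (marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet cannot repair two losses")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired
```

From an operator's point of view, a loss that is repaired this way never shows up as an application-level complaint, which is one way the feedback loop described above can go missing.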
It's clear that it's difficult for application providers and operators to isolate problems. Is a problem due to the local Wi-Fi, the access network, the cloud network, etc.? Metrics from access points would help, but in general, lack of observability into the network as a whole is a real concern when it comes to debugging performance issues.
Further, it's clear that it can be difficult to route problem reports to the person who can fix them, especially if the reported information needs to be shared across multiple networks in the Internet. COVID-enhanced cooperation made it easier to debug problems; lines of communication are important.
The increased threats and network security impacts arising from COVID-19 fall into two areas: (1) the agility of malicious actors to spin up new campaigns using COVID-19 as a lure, and (2) the increased threat surface from a rapid shift towards working from home.
During 2020, there was a shift to home working generally, and in the way in which people used the network. IT departments rolled out new equipment quickly and used technologies like VPNs for the first time, while others put existing solutions under much greater load. As VPN technology became more widespread and more widely used, it arguably became a more valuable target; one Advanced Persistent Threat group (APT29) was successful in using recently published exploits in a range of VPN software to gain initial footholds [Kirsty2020].
Of all scams detected by the United Kingdom National Cyber Security Centre (UK NCSC) that purported to originate from the UK Government, more related to COVID-19 than any other subject. There are other reports of a strong rise in phishing, fraud, and scams related to COVID [Kirsty2020]. Although the overall levels of cybercrime have not increased from the data seen to date, there was certainly a shift in activity as both the NCSC and the Department of Homeland Security Cybersecurity and Infrastructure Security Agency (DHS CISA) saw growing use of COVID-19-related themes by malicious cyber actors as a lure. Attackers used COVID-19-related scams and phishing emails to target individuals, small and medium businesses, large organizations, and organizations involved in both national and international COVID-19 responses (healthcare bodies, pharmaceutical companies, academia, and medical research organizations). New targets (for example, organizations involved in COVID-19 vaccine development) were attacked using VPN exploits, highlighting the potential consequences of vulnerable infrastructure.
It's unclear how to effectively detect and counter these attacks at scale. Approaches such as using Indicators of Compromise and crowdsourced flagging of suspicious emails were found to be effective in response to COVID-19-related scams [Kirsty2020], and observing the DNS to detect malicious use is widespread and effective. The use of DNS over HTTPS offers privacy benefits, but current deployment models can bypass these existing protective DNS measures.
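The protective DNS idea can be sketched as a simple lookup: a resolver compares each queried name, and each of its parent domains, against a set of shared Indicators of Compromise. The sketch below is a simplified assumption about how such a filter might work, not a description of any deployed service; the function names are hypothetical and the domains use the reserved `.example` suffix.

```python
from typing import Iterable, Set


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop any trailing root dot."""
    return name.strip().rstrip(".").lower()


def build_blocklist(indicators: Iterable[str]) -> Set[str]:
    """Turn a shared list of malicious domains into a lookup set."""
    return {normalize(domain) for domain in indicators}


def is_blocked(query_name: str, blocklist: Set[str]) -> bool:
    """Return True if the name or any of its parent domains is listed."""
    labels = normalize(query_name).split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


# Illustrative use with a made-up indicator:
blocklist = build_blocklist(["Covid-Relief.example."])
print(is_blocked("login.covid-relief.example", blocklist))
print(is_blocked("health-agency.example", blocklist))
```

A filter like this only works if the resolver applying it sees the query. When a client sends its queries over HTTPS to a different resolver, the check is never invoked, which is the bypass concern raised above.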
It was also noted that when everyone moves to performing their job online, lack of understanding of security becomes a bigger issue. Is it reasonable to expect every user of the Internet to have password training? Or is there a fundamental problem with a technical solution? Modern advice advocates a layered approach to security defenses, with user education forming just one of those layers.
Communication platforms such as Zoom are not new: many people have used them for years, but as COVID-19 saw an increasing number of organizations and individuals turning to these technologies, they became an attractive target due to increased usage. In turn, there was an increase in malicious cyber actor activity, either through hijacking online meetings that were not secured with passwords or leveraging unpatched software as an attack vector. How can new or existing measures protect users from the attacks levied against the next vulnerable service?
Overall, it may be that there were fewer security challenges than expected arising from many people suddenly working from home. However, the agility of attackers, the importance of robust and scalable defense mechanisms, and some existing security problems and challenges may have become even more obvious and acute with an increased use of Internet-based services, particularly in a pandemic situation and in times of uncertainty, where users can be more vulnerable to social engineering techniques and attacks.
There is a concern that we're missing observability for the network as a whole. Each application provider and operator has their own little lens. No one has the big-picture view of the network.
How much of a safety margin do we need? Some of the resiliency comes from us not running the network too close to its limit. This allows traffic to shift and gives headroom for the network to cope. The best-effort nature of the network may help here. Using techniques to run the network closer to its limits usually improves performance, but highly optimized networks may be less robust.
Finally, it was observed that we get what we measure. There may be an argument for operators to perhaps shift their measurement focus away from pure capacity to instead measure Quality of Experience (QoE) or resilience. The Internet is a critical infrastructure, and people are realizing that now. We should use this as a wake-up call to improve resilience, both in protocol design and operational practice, not necessarily to optimize for absolute performance or quality of experience.
There is a wealth of data about the performance of the Internet during the COVID-19 crisis. The main conclusion from the various measurements is that fairly large shifts occurred. And those shifts were not merely about exchanging one application for another; they actually impacted traffic flows and directions and caused, in many cases, a significant traffic increase. Early reports also seem to indicate that the shifts have gone relatively smoothly from the point of view of overall consumer experience.
An important but less visible factor in this smooth operation was that many people and organizations were highly motivated to ensure good user experience. A lot of collaboration happened in the background, problems were corrected, many providers significantly increased their capacity, and so on.
On the security front, the COVID-19 crisis showcased the agility with which malicious actors can move in response to a shift in user Internet usage and the vast potential of the disruption and damage that they can inflict. Equally, it showed the agility of defenders when they have access to the tools and information they need to protect users and networks, and it showcased the power of Indicators of Compromise when defenders around the world are working together against the same problem.
In general, the Internet also seems well suited for adapting to new situations, at least within some bounds. The Internet is designed for flexibility and extensibility, rather than being optimized for today's particular traffic types. This makes it possible to use it for many applications and in many deployment situations and to make changes as needed. The generality is present in many parts of the overall system, from basic Internet technology to browsers and from name servers to content delivery networks and cloud platforms. When usage changes, what is needed is often merely different services, perhaps some reallocation of resources as well as consequent application and continuation of existing security defenses, but not fundamental technology or hardware changes.
On the other hand, this is not to say that no improvements are needed:
We need a better understanding of the health of the Internet. Going forward, the critical role that the Internet plays in our lives means that the health of the Internet needs to receive significant attention. Understanding how well networks work is not just a technical matter; it is also of crucial importance to the people and economies of the societies using it. Projects and research that monitor Internet and services performance on a broad scale and across different networks are therefore important.
We need to maintain defensive mechanisms to be used in times of crisis. Malicious cyber actors are continually adjusting their tactics to take advantage of new situations, and the COVID-19 pandemic is no exception. Malicious actors used the strong appetite for COVID-19-related information as an opportunity to deliver malware and ransomware and to steal user credentials. Against the landscape of a shift to working from home and an increase in users vulnerable to attack, and as IT departments were often overwhelmed by rolling out new infrastructure and devices, sharing Indicators of Compromise (IoC) was a vital part of the response to COVID-19-related scams and attacks.
We need to ensure that broadband is available to all and that Internet services equally serve different groups. The pandemic has shown how the effects of the digital divide can be amplified during a crisis and has further highlighted the importance of equitable Internet access.
We need to continue to work on all the other improvements that are seen as necessary anyway, such as further improvements in security, the ability for networks and applications to collaborate better, etc.
We need to ensure that informal collaboration between different parties involved in the operation of the network continues and is strengthened to ensure continued operational resilience.