Internet Engineering Task Force (IETF)                        M. Linsner
Request for Comments: 7536                                 Cisco Systems
Category: Informational                                       P. Eardley
ISSN: 2070-1721                                             T. Burbridge
                                                                      BT
                                                             F. Sorensen
                                                                    Nkom
                                                                May 2015

              Large-Scale Broadband Measurement Use Cases
Abstract

Measuring broadband performance on a large scale is important for network diagnostics by providers and users, as well as for public policy. Understanding the various scenarios and users of measuring broadband performance is essential to development of the Large-scale Measurement of Broadband Performance (LMAP) framework, information model, and protocol. This document details two use cases that can assist in developing that framework. The details of the measurement metrics themselves are beyond the scope of this document.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7536.
Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Use Cases
      2.1. Internet Service Provider (ISP) Use Case
      2.2. Regulator Use Case
   3. Details of ISP Use Case
      3.1. Understanding the Quality Experienced by Customers
      3.2. Understanding the Impact and Operation of New Devices and Technology
      3.3. Design and Planning
      3.4. Monitoring Service Level Agreements
      3.5. Identifying, Isolating, and Fixing Network Problems
   4. Details of Regulator Use Case
      4.1. Providing Transparent Performance Information
      4.2. Measuring Broadband Deployment
      4.3. Monitoring Traffic Management Practices
   5. Implementation Options
   6. Conclusions
   7. Security Considerations
   8. Informative References
   Contributors
   Authors' Addresses
users understand whether the problem exists in their home network or with a third-party application service instead of with their broadband (BB) product.

o  Design and planning. Through monitoring the end-user experience, the ISP can design and plan their network to ensure specified levels of user experience. Services may be moved closer to end users, services upgraded, the impact of QoS assessed, or more capacity deployed at certain locations. Service Level Agreements (SLAs) may be defined at network or product boundaries.

o  Understanding the quality experienced by customers. The network operator would like to gain better insight into the end-to-end performance experienced by its customers. "End-to-end" could, for instance, incorporate home and enterprise networks, and the impact of peering, caching, and Content Delivery Networks (CDNs).

o  Understanding the impact and operation of new devices and technology. As a new product is deployed, or a new technology introduced into the network, it is essential that its operation and its impact are measured. This also helps to quantify the advantage that the new technology is bringing and support the business case for larger roll-out.
shared needs for scalable, cost-effective, scientifically robust solutions to the measurement and collection of broadband Internet access service performance information.

Section 5. The panel needs to include a representative sample of the operator's technologies and broadband speeds. For instance, it might encompass speeds ranging from below 8 Mbps to over 100 Mbps. The operator would like the end-to-end view of the service, rather than just the access portion. This involves relating the pure network parameters to something like a 'mean opinion score' [MOS], which will be service dependent (for instance, web-browsing QoE is largely determined by latency above a few Mbps).

An operator will also want compound metrics such as "reliability", which might involve packet loss, DNS failures, retraining of the line, video streaming under-runs, etc.

The operator really wants to understand the end-to-end service experience. However, the home network (Ethernet, Wi-Fi, powerline) is highly variable and outside its control. To date, operators (and regulators) have instead measured performance from the home gateway. However, mobile operators clearly must include the wireless link in the measurement.

Active measurements are the most obvious approach, i.e., special measurement traffic is sent by -- and to -- the probe. In order not to degrade the service of the customer, the measurement data should only be sent when the user is silent, and it shouldn't reduce the customer's data allowance. The other approach is passive measurements on the customer's ordinary traffic; the advantage is that it measures what the customer actually does, but it creates extra variability (different traffic mixes give different results) and, in particular, it raises privacy concerns. [RFC6973] discusses privacy considerations for Internet protocols in general, while [Framework] discusses them specifically for large-scale measurement systems.
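The constraint that active measurement traffic should only be sent when the user is silent can be illustrated with a small gating check. This is a minimal sketch, not part of LMAP: the LineSample record, the function names, and the 50 kbit/s idle threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LineSample:
    """Hypothetical record of customer traffic seen on one line."""
    user_bytes: int        # customer bytes observed during the sampling window
    window_seconds: float  # length of the sampling window in seconds


def user_rate_bps(sample: LineSample) -> float:
    """Average customer traffic rate over the window, in bits per second."""
    return 8 * sample.user_bytes / sample.window_seconds


def may_start_active_test(sample: LineSample,
                          idle_threshold_bps: float = 50_000.0) -> bool:
    """Allow a test only if customer traffic is below the (assumed) idle threshold."""
    return user_rate_bps(sample) < idle_threshold_bps


busy = LineSample(user_bytes=2_500_000, window_seconds=10.0)  # 2 Mbit/s of user traffic
quiet = LineSample(user_bytes=12_500, window_seconds=10.0)    # 10 kbit/s of user traffic
print(may_start_active_test(busy), may_start_active_test(quiet))
```

A real probe would also need the operator to exempt test traffic from the customer's data allowance; that accounting happens in the operator's systems and is not modeled here.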
From an operator's viewpoint, understanding customer experience enables it to offer better services. Also, simple metrics can be more easily understood by senior managers who make investment decisions and by sales and marketing.

Extend-TCP]).

o  Investigate a QoS mechanism (e.g., checking whether Diffserv markings are respected on some path).
from a limited panel of probes, can help quantify the advantage that a new technology brings and support the business case for larger roll-out. It may also be possible to use probes to run stress tests for risk analysis. For example, an operator could run a carefully controlled and limited experiment in which probing is used to assess the potential impact if some new application becomes popular.
An operator can obtain useful information without measuring the performance on every broadband line. By measuring a subset, the operator can identify problems that affect a group of customers. For example, the issue could be at a shared point in the network topology (such as an exchange), or common to a vendor, or equipment type; for instance, [IETF85-Plenary] describes a case where a particular home gateway upgrade had caused a (mistaken!) drop in line rate.

A more extensive deployment of the measurement capability to every broadband line would enable an operator to identify issues unique to a single customer.

Overall, large-scale measurements can help an operator fix the fault more rapidly and/or allow the affected customers to be informed of what's happening. More accurate information enables the operator to reassure customers and take more rapid and effective action to cure the problem.

Often, customers experience poor broadband due to problems in the home network -- the ISP's network is fine. For example, they may have moved too far away from their wireless access point. Anecdotally, a large fraction of customer calls about fixed BB problems are due to in-home wireless issues. These issues are expensive and frustrating for an operator, as they are extremely hard to diagnose and solve. The operator would like to narrow down whether the problem is in the home (a problem with the home network, edge device, or home gateway), in the operator's network, or with an application service. The operator would like two capabilities: firstly, self-help tools that customers use to improve their own service or understand its performance better -- for example, to reposition their devices for better Wi-Fi coverage; and secondly, on-demand tests that the operator can run instantly, so that the call center person answering the phone (or e-chat) could trigger a test and get the result while the customer is still in an online session.
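The idea of using a subset of lines to spot a problem shared by a group of customers (for example, at one exchange or with one gateway type) can be sketched as a simple grouping step. This is an illustrative sketch only: the record fields ("exchange", "gateway", "degraded"), the function name, and the thresholds are assumptions, and a real system would need a proper definition of "degraded" and a sounder statistical test.

```python
from collections import defaultdict


def suspect_groups(results, key, min_lines=3, min_fraction=0.75):
    """Return group names (sorted) where most measured lines look degraded.

    results: list of dicts, each with the grouping key and a 'degraded' bool.
    key: attribute to group by, e.g. "exchange" or "gateway" (assumed names).
    """
    totals = defaultdict(int)
    degraded = defaultdict(int)
    for record in results:
        group = record[key]
        totals[group] += 1
        if record["degraded"]:
            degraded[group] += 1
    return sorted(
        group for group in totals
        if totals[group] >= min_lines
        and degraded[group] / totals[group] >= min_fraction
    )


# Made-up probe results for illustration.
results = [
    {"exchange": "EX1", "gateway": "gw-a", "degraded": True},
    {"exchange": "EX1", "gateway": "gw-b", "degraded": True},
    {"exchange": "EX1", "gateway": "gw-a", "degraded": True},
    {"exchange": "EX2", "gateway": "gw-a", "degraded": False},
    {"exchange": "EX2", "gateway": "gw-b", "degraded": False},
    {"exchange": "EX2", "gateway": "gw-b", "degraded": True},
]
print(suspect_groups(results, "exchange"))  # points at a shared exchange
print(suspect_groups(results, "gateway"))   # no gateway type stands out
```

In this made-up data set, grouping by exchange singles out one location while grouping by gateway model does not, which is the kind of distinction that helps narrow down where a fault lies.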
The published information needs to be:

o  Accurate - the measurement results must be correct and not influenced by errors or side effects. The results should be reproducible and consistent over time.

o  Comparable - common metrics should be used across different ISPs and service offerings, and over time, so that measurement results can be compared.

o  Meaningful - the metrics used for measurements need to reflect what end users value about their broadband Internet access service.

o  Reliable - the number and distribution of measurement agents, and the statistical processing of the raw measurement data, need to be appropriate.

In practical terms, the regulators may measure network performance from users towards multiple content and application providers, including dedicated test measurement servers. Measurement probes are distributed to a 'panel' of selected end users. The panel covers all the operators and packages in the market, spread over urban, suburban, and rural areas, and often includes both fixed and mobile Internet access. Periodic tests running on the probes can, for example, measure actual speed at peak and off-peak hours, but can also measure other detailed quality metrics like delay and jitter. The collected data then goes through statistical analysis to derive estimates for the whole population. Summary information, such as a service quality index, is published regularly, perhaps alongside more detailed information.

The regulator can also help end users monitor the performance of their own broadband Internet access service. They might use this information to check that the performance meets that specified in their contract or to understand whether their current subscription is the most appropriate.

DAE]. The actual measurements can be made in the same way as described in Section 4.1.
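The statistical step described for the regulator's panel in Section 4.1, deriving estimates for the whole population from panel results, can be illustrated with a weighted average over strata. This is a minimal sketch under stated assumptions: the strata names, the subscriber counts, and the choice of a simple stratified mean are illustrative and are not prescribed by this document.

```python
from statistics import mean


def population_estimate(panel_speeds, subscriber_counts):
    """Stratified mean: weight each stratum's panel average by its subscriber count.

    panel_speeds: dict mapping stratum name to a list of measured speeds (Mbps).
    subscriber_counts: dict mapping stratum name to subscribers in that stratum.
    """
    total_subscribers = sum(subscriber_counts[s] for s in panel_speeds)
    weighted_sum = sum(
        mean(speeds) * subscriber_counts[stratum]
        for stratum, speeds in panel_speeds.items()
    )
    return weighted_sum / total_subscribers


# Made-up panel: two probes per stratum, with unequal stratum sizes.
panel_speeds = {"urban": [40, 60], "rural": [10, 20]}
subscriber_counts = {"urban": 800, "rural": 200}
estimate = population_estimate(panel_speeds, subscriber_counts)
print(f"estimated mean speed: {estimate} Mbps")
```

Weighting by stratum size keeps an over- or under-sampled group from skewing the published figure; a plain average of the four probes here would give 32.5 Mbps rather than the weighted estimate.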
M-Labs_NSDI-2010] and follow-on tool "Glasnost" [Glasnost] provide an example of work in this area.

A regulator could also monitor the performance of the broadband service over time, to try to detect whether the specialized service is provided at the expense of the Internet access service. Comparison between ISPs or between different countries may also be relevant for this kind of evaluation.

The motivation for a regulator monitoring such traffic management practices is that regulatory approaches related to net neutrality and the open Internet have been introduced in some jurisdictions. Examples of such efforts are the Internet policy as outlined by the Body of European Regulators for Electronic Communications guidelines for quality of service [BEREC-Guidelines] and the US FCC's "Preserving the Open Internet" Report and Order [FCC-R&O]. Although legal challenges can change the status of policy, the take-away for LMAP purposes is that policy-makers are looking for measurement solutions to assist them in discovering biased treatment of traffic flows. The exact definitions and requirements vary from one jurisdiction to another.
ensures that measurements do not detrimentally impact the home user experience or corrupt the results by testing when the user is also using the broadband line. The system is therefore tightly controlled by the operator of the measurement system. One advantage of this approach is that it is possible to get reliable benchmarks for the performance of a network with only a few devices. One disadvantage is that it would be expensive to deploy hardware devices on a mass scale sufficient to understand the performance of the network at the granularity of a single broadband user.

Another type of probe involves implementing the measurement capability as a webpage or an "app" that end users are encouraged to download onto their mobile phone or computing device. Measurements are triggered by the end user; for example, the user interface may have a button to "test my broadband now." One advantage of this approach is that the performance is measured to the end user, rather than to the home gateway, and so includes the home network. Another difference is that the system is much more loosely controlled, as the panel of end users and the schedule of tests are determined by the end users themselves rather than the measurement system. While this approach makes it easier to make measurements on a large scale, it is harder to get comparable benchmarks, as the measurements are affected by the home network; also, the population is self-selecting and so potentially biased towards those who think they have a problem. This could be alleviated by encouraging widespread downloading of the app and careful post-processing of the results to reduce biases.

There are several other possibilities. For example, as a variant on the first approach, the measurement capability could be implemented as software embedded in the home gateway, which would make it more viable to have the capability on every user line. As a variant on the second approach, the end user could initiate measurements in response to a request from the measurement system.

The operator of the measurement system should be careful to ensure that measurements do not detrimentally impact users. Potential issues include the following:

*  Measurement traffic generated on a particular user's line may impact that end user's quality of experience. The danger is greater for measurements that generate a lot of traffic over a lengthy period.

*  The measurement traffic may impact that particular user's bill or traffic cap.

*  The measurement traffic from several end users may, in combination, congest a shared link.

*  The traffic associated with the control and reporting of measurements may overload the network. The danger is greater where the traffic associated with many end users is synchronized.
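One way to reduce the risk from synchronized traffic noted in the list above is to spread probe start times across a window rather than starting every probe at the same instant. The sketch below is an assumption-laden illustration, not a mechanism defined by this document: the function name, the probe identifiers, and the use of a SHA-256 hash to derive a stable per-probe offset are all choices made for the example.

```python
import hashlib
from collections import Counter


def start_offset_seconds(probe_id: str, window_seconds: int) -> int:
    """Map a probe identifier to a stable offset in [0, window_seconds).

    Hashing the identifier (an illustrative choice) gives each probe the same
    offset every cycle while spreading different probes across the window.
    """
    digest = hashlib.sha256(probe_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_seconds


WINDOW = 3600  # spread starts over one hour (assumed value)
offsets = [start_offset_seconds(f"probe-{i}", WINDOW) for i in range(1000)]
busiest_second = max(Counter(offsets).values())
print(f"probes: {len(offsets)}, most starts in any one second: {busiest_second}")
```

A random offset drawn afresh for each test cycle would serve the same purpose; the hash-based variant is used here only because it is repeatable and needs no coordination between probes.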
common way to collect the results. Standardization of this control and reporting functionality allows the operator of a measurement system to buy the various components from different vendors.

After the measurement results are collected, they need to be understood and analyzed. Often, it is sufficient to measure only a small subset of end users, but per-line fault diagnosis requires the ability to test every individual line. Analysis requires accurate definition and understanding of where the test points are, as well as contextual information about the topology, line, product, and the subscriber's contract. The actual analysis of results is beyond the scope of LMAP, as is the key challenge of how to integrate the measurement system into a network operator's existing tools for diagnostics and network planning.

Finally, the test data, along with any associated network, product, or subscriber contract data, is commercial or private information and needs to be protected.

RFC6973], and business sensitivity issues:

1.  A malicious party may try to gain control of probes to launch DoS (Denial of Service) attacks at a target. A DoS attack could be targeted at a particular end user or set of end users, a certain network, or a specific service provider.

2.  A malicious party may try to gain control of probes to create a platform for pervasive monitoring [RFC7258] or for more targeted monitoring. [RFC7258] summarizes the threats as follows: "An attack may change the content of the communication, record the content or external characteristics of the communication, or through correlation with other communication events, reveal information the parties did not intend to be revealed." For example, a malicious party could distribute to the probes a new measurement test that recorded (and later reported) information of maleficent interest. Similar concerns also arise if the measurement results are intercepted or corrupted.

    *  From the end user's perspective, the concerns include a malicious party monitoring the traffic they send and receive, who they communicate with, the websites they visit, and such information about their behavior as when they are at home and the location of their devices. Some of the concerns may be greater when the probe is on the end user's device rather than on their home gateway.

    *  From the network operator's perspective, the concerns include the leakage of commercially sensitive information about the design and operation of their network, their customers, and suppliers. Some threats are indirect; for example, the attacker could reconnoiter potential weaknesses, such as open ports and paths through the network, which enabled it to launch an attack later.

    *  From the regulator's perspective, the concerns include distortion of the measurement tests or alteration of the measurement results. Also, a malicious network operator could try to identify the broadband lines that the regulator was measuring and prioritize that traffic ("game the system").

3.  Another potential issue is a measurement system that does not obtain the end user's informed consent, fails to specify a specific purpose in the consent, or uses the collected information for secondary uses beyond those specified.

4.  Another potential issue is a measurement system that does not indicate who is responsible for the collection and processing of personal data and who is responsible for fulfilling the rights of users. The responsible party (often termed the "data controller") should, as good practice, consider such issues as defining:

    o  the purpose for which the data is collected and used,

    o  how the data is stored, accessed, and processed,

    o  how long the data is retained, and

    o  how the end user can view, update, and even delete their personal data.

    If anonymized personal data is shared with a third party, the data controller should consider the possibility that the third party can de-anonymize it by combining it with other information.

These security and privacy issues will need to be considered carefully by any measurement system. In the context of LMAP, [Framework] considers them further, along with some potential mitigations.
Other LMAP documents will specify one or more protocols that enable the measurement system to instruct a probe about what measurements to make and that enable the probe to report the measurement results. Those documents will need to discuss solutions to the security and privacy issues. However, the protocol documents
will not consider the actual usage of the measurement information. Many use cases can be envisaged, and earlier in this document we described some likely ones for the network operator and regulator.

[IETF85-Plenary]  Crawford, S., "Large-Scale Active Measurement of Broadband Networks", 'example' from slide 18, November 2012, <http://www.ietf.org/proceedings/85/slides/slides-85-iesg-opsandtech-7.pdf>.

[Extend-TCP]  Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A., Handley, M., and H. Tokuda, "Is it Still Possible to Extend TCP?", Proceedings of IETF 82, November 2011, <http://www.ietf.org/proceedings/82/slides/IRTF-1.pdf>.

[Framework]  Eardley, P., Morton, A., Bagnulo, M., Burbridge, T., Aitken, P., and A. Akhter, "A framework for Large-Scale Measurement of Broadband Performance (LMAP)", Work in Progress, draft-ietf-lmap-framework-14, April 2015.

[RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., Morris, J., Hansen, M., and R. Smith, "Privacy Considerations for Internet Protocols", RFC 6973, July 2013, <http://www.rfc-editor.org/info/rfc6973>.

[RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an Attack", BCP 188, RFC 7258, May 2014, <http://www.rfc-editor.org/info/rfc7258>.

[FCC-R&O]  United States Federal Communications Commission, "Preserving the Open Internet; Broadband Industries Practices: Report and Order", FCC 10-201, December 2010, <http://hraunfoss.fcc.gov/edocs_public/attachmatch/FCC-10-201A1.pdf>.

[BEREC-Guidelines]  Body of European Regulators for Electronic Communications, "BEREC Guidelines for quality of service in the scope of net neutrality", <http://berec.europa.eu/eng/document_register/subject_matter/berec/download/0/1101-berec-guidelines-for-quality-of-service-_0.pdf>.

[M-Labs_NSDI-2010]  M-Lab, "Glasnost: Enabling End Users to Detect Traffic Differentiation", <http://www.measurementlab.net/download/AMIfv945ljiJXzG-fgUrZSTu2hs1xRl5Oh-rpGQMWL305BNQh-BSq5oBoYU4a7zqXOvrztpJhK9gwk5unOe-fOzj4X-vOQz_HRrnYU-aFd0rv332RDReRfOYkJuagysstN3GZ__lQHTS8_UHJTWkrwyqIUjffVeDxQ/>.

[Glasnost]  M-Lab tool "Glasnost", <http://mlab-live.appspot.com/tools/glasnost>.

[MOS]  Wikipedia, "Mean Opinion Score", January 2015, <http://en.wikipedia.org/w/index.php?title=Mean_opinion_score&oldid=644494161>.

[DAE]  Digital Agenda for Europe, COM(2010)245 final, "Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions", <http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52010DC0245&from=EN>.