Network Working Group                                          A. Barbir
Request for Comments: 3568                               Nortel Networks
Category: Informational                                          B. Cain
                                                        Storigen Systems
                                                                 R. Nair
                                                              Consultant
                                                           O. Spatscheck
                                                                    AT&T
                                                               July 2003

         Known Content Network (CN) Request-Routing Mechanisms

Status of this Memo

   This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.
Abstract

   This document presents a summary of Request-Routing techniques that are used to direct client requests to surrogates based on various policies and a possible set of metrics. The document covers techniques that were commonly used in the industry on or before December 2000. In this memo, the term Request-Routing represents techniques that are commonly called content routing or content redirection. In principle, Request-Routing techniques can be classified under: DNS Request-Routing, Transport-layer Request-Routing, and Application-layer Request-Routing.

Table of Contents

   1.  Introduction
   2.  DNS based Request-Routing Mechanisms
       2.1.  Single Reply
       2.2.  Multiple Replies
       2.3.  Multi-Level Resolution
             2.3.1.  NS Redirection
             2.3.2.  CNAME Redirection
       2.4.  Anycast
       2.5.  Object Encoding
       2.6.  DNS Request-Routing Limitations
   3.  Transport-Layer Request-Routing
   4.  Application-Layer Request-Routing
       4.1.  Header Inspection
             4.1.1.  URL-Based Request-Routing
             4.1.2.  Header-Based Request-Routing
             4.1.3.  Site-Specific Identifiers
       4.2.  Content Modification
             4.2.1.  A-priori URL Rewriting
             4.2.2.  On-Demand URL Rewriting
             4.2.3.  Content Modification Limitations
   5.  Combination of Multiple Mechanisms
   6.  Security Considerations
   7.  Additional Authors and Acknowledgements
   A.  Measurements
       A.1.  Proximity Measurements
             A.1.1.  Active Probing
             A.1.2.  Metric Types
             A.1.3.  Surrogate Feedback
   8.  Normative References
   9.  Informative References
   10. Intellectual Property and Copyright Statements
   11. Authors' Addresses
   12. Full Copyright Statement

1. Introduction

   [8]. Content Networks include network infrastructure that exists in layers 4 through 7. Content Networks deal with the routing and forwarding of requests and responses for content. Content Networks rely on layer 7 protocols such as HTTP [4] for transport.

   Request-Routing techniques are generally used to direct client requests for objects to a surrogate or a set of surrogates that could best serve that content. Request-Routing mechanisms could be used to direct client requests to surrogates that are within a Content Network (CN).
   Request-Routing techniques are used as a vehicle to extend the reach and scale of Content Delivery Networks. Multiple Request-Routing mechanisms exist. At a high level, these may be classified under: DNS Request-Routing, transport-layer Request-Routing, and application-layer Request-Routing.

   A Request-Routing system uses a set of metrics in an attempt to direct users to the surrogate that can best serve the request. For example, the choice of the surrogate could be based on network proximity, bandwidth availability, surrogate load, and availability of content. Appendix A provides a summary of metrics and measurement techniques that could be used in the selection of the best surrogate.

   The memo is organized as follows: Section 2 provides a summary of known DNS based Request-Routing techniques. Section 3 discusses transport-layer Request-Routing methods. Section 4 explores application-layer Request-Routing mechanisms. Section 5 provides insight on combining the various methods discussed in the earlier sections in order to optimize the performance of the Request-Routing system. Appendix A provides a summary of possible metrics and measurement techniques that could be used by the Request-Routing system to choose a given surrogate.

2. DNS based Request-Routing Mechanisms

   [10]. In DNS based Request-Routing techniques, a specialized DNS server is inserted in the DNS resolution process. The server is capable of returning a different set of A, NS, or CNAME records based on user-defined policies, metrics, or a combination of both. RFC 2782 (DNS SRV) [11] provides guidance on the use of DNS for load balancing. The RFC describes some of the limitations and suggests appropriate usage of DNS based techniques. The following sections summarize some of the techniques in use.
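
   As an illustration of how a Request-Routing DNS server might turn policies and metrics into a set of A records, the following is a minimal sketch and not a description of any deployed system. The Surrogate fields, the load threshold, and the addresses (taken from documentation address space) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Surrogate:
    address: str       # value that would be returned in an A record
    rtt_ms: float      # assumed proximity estimate to the client site DNS server
    load: float        # assumed fraction of capacity in use, 0.0 to 1.0
    has_content: bool  # whether the requested content is available here


def select_a_records(surrogates: List[Surrogate],
                     max_load: float = 0.8,
                     count: int = 2) -> List[str]:
    """Return up to `count` addresses, preferring close, lightly loaded surrogates."""
    # Policy: only surrogates that hold the content and are below the load limit.
    eligible = [s for s in surrogates if s.has_content and s.load <= max_load]
    # Fall back to any surrogate holding the content if all exceed the limit.
    if not eligible:
        eligible = [s for s in surrogates if s.has_content]
    # Metric: rank by proximity first, then by load.
    ranked = sorted(eligible, key=lambda s: (s.rtt_ms, s.load))
    return [s.address for s in ranked[:count]]


pool = [
    Surrogate("192.0.2.10", rtt_ms=20, load=0.95, has_content=True),
    Surrogate("192.0.2.20", rtt_ms=35, load=0.40, has_content=True),
    Surrogate("192.0.2.30", rtt_ms=10, load=0.20, has_content=False),
    Surrogate("192.0.2.40", rtt_ms=50, load=0.10, has_content=True),
]
print(select_a_records(pool))
```

   In this sketch the closest surrogate is skipped because it lacks the content, and the next closest is skipped because it is above the load limit. A real system would weigh these metrics according to its own policies.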
2.3.1. NS Redirection

   [10] would eventually request a resolution of a.b.example.com from the name server authoritative for example.com. The name server authoritative for this domain might be a Request-Routing NS server. In this case the Request-Routing DNS server can either return a set of A records or can redirect the resolution of the request a.b.example.com to the DNS server that is authoritative for b.example.com using NS records.

   One drawback of using NS records is that the number of Request-Routing DNS servers is limited by the number of parts in the DNS name. This problem results from DNS policy that causes a client site DNS server to abandon a request if no additional parts of the DNS name are resolved in an exchange with an authoritative DNS server.

   A second drawback is that the last DNS server can determine the TTL of the entire resolution process. Basically, the last DNS server can return its own NS record in the authority section of its response. The client will use this cached NS record for further request resolutions until it expires.
   Another drawback is that some implementations of BIND deliberately time out, to simplify their implementation, in cases where an NS-level redirect points to a name server for which no valid A record is returned or cached. This is especially a problem if the domain of the name server does not match the domain currently being resolved, since in this case the A records, which might be passed in the DNS response, are discarded for security reasons. Another drawback is the added delay in resolving the request due to the use of multiple DNS servers.

2.4. Anycast

   Anycast [5] is an inter-network service that is applicable to networking situations where a host, application, or user wishes to locate a host which supports a particular service but, if several servers support the service, does not particularly care which server is used. In an anycast service, a host transmits a datagram to an anycast address and the inter-network is responsible for providing best effort delivery of the datagram to at least one, and preferably only one, of the servers that accept datagrams for the anycast address.

   The motivation for anycast is that it considerably simplifies the task of finding an appropriate server. For example, users, instead of consulting a list of servers and choosing the closest one, could simply type the name of the server and be connected to the nearest one. By using anycast, DNS resolvers would no longer have to be configured with the IP addresses of their servers, but rather could send a query to a well-known DNS anycast address.

   Furthermore, to combine measurement and redirection, the Request-Routing DNS server can advertise an anycast address as its IP address. The same address is used by multiple physical DNS servers. In this scenario, the Request-Routing DNS server that is the closest to the client site DNS server in terms of OSPF and BGP routing will receive the packet containing the DNS resolution request. The server can use this information to make a Request-Routing decision.
   Drawbacks of this approach are listed below:

   o  The DNS server may not be the closest server in terms of routing to the client.

   o  Typically, routing protocols are not load sensitive. Hence, the closest server may not be the one with the least network latency.

   o  The server load is not considered during the Request-Routing process.
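
   The anycast behavior described above, including the drawback that load is not considered, can be modeled in a few lines. This is a toy sketch under stated assumptions: the instance names, routing costs, and surrogate addresses are hypothetical, and a single integer cost stands in for whatever OSPF and BGP would actually compute.

```python
from typing import Dict, List, Tuple

# Hypothetical anycast DNS instances and the surrogates each would answer with.
LOCAL_SURROGATES: Dict[str, List[str]] = {
    "dns-east": ["192.0.2.10"],
    "dns-west": ["198.51.100.10"],
}


def anycast_receiver(route_cost: Dict[str, int]) -> str:
    """Return the instance with the lowest routing cost from the sender.

    Ties are broken by name only to keep the sketch deterministic.
    """
    return min(route_cost, key=lambda name: (route_cost[name], name))


def resolve_via_anycast(route_cost: Dict[str, int]) -> Tuple[str, List[str]]:
    """Model a query sent to the shared anycast address.

    The instance that receives the query infers that it is close to the
    client site DNS server and answers with its own local surrogates.
    """
    instance = anycast_receiver(route_cost)
    return instance, LOCAL_SURROGATES[instance]


# Routing cost from one client site DNS server to each instance (assumed).
costs = {"dns-east": 3, "dns-west": 7}
# Load is known to the operator but invisible to routing, so it plays no part.
load = {"dns-east": 0.97, "dns-west": 0.10}
print(resolve_via_anycast(costs))
```

   Note that the heavily loaded instance is still selected, because the routing system sees only the cost and never the load.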
2.6. DNS Request-Routing Limitations

   o  DNS servers can request and allow recursive resolution of DNS names. For recursive resolution of requests, the Request-Routing DNS server will not be exposed to the IP address of the client's site DNS server. In this case, the Request-Routing DNS server will be exposed to the address of the DNS server that is recursively requesting the information on behalf of the client's site DNS server. For example, imgs.example.com might be resolved by a CN, but the request for the resolution might come from dns1.example.com as a result of the recursion.

   o  Users that share a single client site DNS server will be redirected to the same set of IP addresses during the TTL interval. This might lead to overloading of the surrogate during a flash crowd.

   o  Some implementations of BIND can cause DNS timeouts to occur while handling exceptional situations. For example, timeouts can occur for NS redirections to unknown domains.

   DNS based Request-Routing techniques can suffer from serious limitations. For example, the use of such techniques can overburden third party DNS servers, which should not be allowed. RFC 2782 [11] provides warnings on the use of DNS for load balancing. Readers are encouraged to read the RFC for a better understanding of the limitations.

3. Transport-Layer Request-Routing

   [20] are used to hand off the session to a more appropriate surrogate are beyond the scope of this document. In general, the forward-flow traffic (client to newly selected surrogate) will flow through the surrogate originally chosen by DNS. The reverse-flow (surrogate to client) traffic, which normally transfers much more data than the forward flow, would typically take the direct path.
   Because of its overhead, transport-layer Request-Routing is better suited for long-lived sessions such as FTP [1] and RTSP [3]. However, it also could be used to direct clients away from overloaded surrogates.

   In general, transport-layer Request-Routing can be combined with DNS based techniques. As stated earlier, DNS based methods resolve client requests based on domains or subdomains with exposure to the client's DNS server IP address. Hence, the DNS based methods could be used as a first step in deciding on an appropriate surrogate, with more accurate refinement made by the transport-layer Request-Routing system.

4. Application-Layer Request-Routing

4.1. Header Inspection

   Protocols such as HTTP [4], RTSP [3], and SSL [2] provide hints in the initial portion of the session about how the client request must be directed. These hints may come from the URL of the content or other parts of the MIME request header such as Cookies.

4.1.1. URL-Based Request-Routing

   [6]. In many cases, this information is sufficient to disambiguate the content and suitably direct the request. In most cases, it may be sufficient to make the Request-Routing decision just by examining the prefix or suffix of the URL.

   [4] or RTSP [3]) to redirect the client to the actual delivery node.
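
   A decision based on the prefix or suffix of the URL, followed by a redirect to the delivery node, might look like the following minimal sketch. The rule tables and host names are hypothetical, and the use of the HTTP 302 status code with a Location header is an assumption about how the redirect would be expressed.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Hypothetical routing rules; none of these hosts come from the memo.
PREFIX_RULES: Dict[str, str] = {"/downloads/": "files.cdn.example.com"}
SUFFIX_RULES: Dict[str, str] = {
    ".mp4": "video.cdn.example.com",
    ".jpg": "img.cdn.example.com",
}


def route_by_url(url: str) -> Optional[str]:
    """Pick a delivery node by examining only the URL path prefix or suffix."""
    path = urlsplit(url).path
    for prefix, host in PREFIX_RULES.items():
        if path.startswith(prefix):
            return host
    for suffix, host in SUFFIX_RULES.items():
        if path.lower().endswith(suffix):
            return host
    return None


def redirect_response(url: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers): a 302 to the delivery node, or 200 to serve locally."""
    host = route_by_url(url)
    if host is None:
        return 200, {}
    parts = urlsplit(url)
    # Keep scheme, path, and query; replace only the host.
    location = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return 302, {"Location": location}


print(redirect_response("http://www.example.com/media/clip.mp4?x=1"))
```

   Prefix rules are checked before suffix rules here; that ordering is an arbitrary choice for the sketch.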
   This technique is relatively simple to implement. However, the main drawback of this method is the additional latency involved in sending the redirect message back to the client.

4.1.2. Header-Based Request-Routing

   [4] such as Cookie, Language, and User-Agent, in order to select a surrogate. Examples of this technique are provided in the literature.

   Cookies can be used to identify a customer or session by a web site. Cookie based Request-Routing provides content service differentiation based on the client. This approach works provided that the cookies belong to the client. In addition, it is possible to direct a connection from a multi-session transaction to the same server to achieve session-level persistence.

   The language header can be used to direct traffic to a language-specific delivery node. The user-agent header helps identify the type of client device. For example, identifying the client as a voice browser, PDA, or cell phone can indicate which type of delivery node has content specialized for that request.
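
   The header-based selection described above can be sketched as follows. This is an illustration only: the cookie name "node", the node table, the host names, and the matching rules are hypothetical, and the language header is assumed to be the HTTP Accept-Language request header.

```python
from http.cookies import SimpleCookie
from typing import Dict

# Hypothetical mapping from a session cookie value to a delivery node.
KNOWN_NODES: Dict[str, str] = {"n7": "node7.cdn.example.com"}


def select_by_headers(headers: Dict[str, str]) -> str:
    """Select a delivery node from Cookie, User-Agent, and Accept-Language."""
    # Session-level persistence: a known cookie value pins the client to a node.
    cookie = SimpleCookie(headers.get("Cookie", ""))
    if "node" in cookie and cookie["node"].value in KNOWN_NODES:
        return KNOWN_NODES[cookie["node"].value]

    # Device type: a crude substring check stands in for real device detection.
    if "mobile" in headers.get("User-Agent", "").lower():
        return "mobile.cdn.example.com"

    # Language: use the first language tag and ignore quality values.
    language = headers.get("Accept-Language", "")
    primary = language.split(",")[0].split(";")[0].strip().lower()
    if primary.startswith("fr"):
        return "fr.cdn.example.com"

    return "www.cdn.example.com"


print(select_by_headers({"Accept-Language": "fr-CA,fr;q=0.8,en;q=0.5"}))
```

   Cookie values are looked up in a table of known nodes rather than being used as host names directly, since a cookie is client-supplied input.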
4.2. Content Modification

   [9]. Special considerations must be made to ensure that the task of modifying the content is performed in a manner that is consistent with RFC 3238 [9], which specifies the architectural considerations for intermediaries that perform operations or modifications on content.

   The basic types of URL rewriting are discussed in the following subsections.
4.2.3. Content Modification Limitations

   [23]. For example:

   o  The first request from a client to a specific site must be served from the origin server.

   o  Content that has been modified to include references to nearby surrogates rather than to the origin server should be marked as non-cacheable. Alternatively, such pages can be marked to be cacheable only for a relatively short period of time. Rewritten URLs on cached pages can cause problems, because they can get outdated and point to surrogates that are no longer available or no longer good choices.
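
   The second limitation above suggests pairing URL rewriting with cache control. The following minimal sketch rewrites embedded references from an origin host to a surrogate host and attaches a short cache lifetime. The host names are hypothetical, the regular expression handles only double-quoted src and href attributes with http URLs, and the choice of "no-store" to mark a page as non-cacheable is an assumption.

```python
import re
from typing import Dict, Tuple


def rewrite_references(html: str,
                       origin_host: str,
                       surrogate_host: str,
                       max_age_s: int = 60) -> Tuple[str, Dict[str, str]]:
    """Point embedded references at a surrogate and limit how long the page is cached."""
    # Match src="http://origin/ or href="http://origin/ and keep the attribute name.
    pattern = re.compile(r'(src|href)="http://' + re.escape(origin_host) + r'/')
    rewritten = pattern.sub(
        lambda m: f'{m.group(1)}="http://{surrogate_host}/', html)
    # A short lifetime keeps stale surrogate references from lingering in caches.
    if max_age_s > 0:
        headers = {"Cache-Control": f"max-age={max_age_s}"}
    else:
        headers = {"Cache-Control": "no-store"}
    return rewritten, headers


page = ('<img src="http://www.example.com/logo.png">'
        '<a href="http://other.example.net/">x</a>')
body, headers = rewrite_references(page, "www.example.com", "s1.cdn.example.com")
print(body)
print(headers)
```

   References to hosts other than the origin are left untouched, so only content the surrogate is expected to serve is redirected.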
5. Combination of Multiple Mechanisms

   A basic problem of DNS Request-Routing is its resolution granularity, which allows resolution at a per-domain level only. A per-object redirection cannot easily be achieved. However, content modification can be used together with DNS Request-Routing to overcome this problem. With content modification, references to different objects on the same origin server can be rewritten to point into different domain name spaces. Using DNS Request-Routing, requests for those objects can now dynamically be directed to different surrogates.

6. Security Considerations

   RFC 3238 [9] addresses the main requirements for entities that intend to modify requests for content in the Internet.

   Some active probing techniques will set off intrusion detection systems and firewalls. Therefore, it is recommended that implementers be aware of routing protocol security [25].

   It is important to note the impact of TLS [2] on request routing in CNs. Specifically, when TLS is used, the full URL is not visible to the content network unless it terminates the TLS session. The current document focuses on HTTP techniques. TLS based techniques that require the termination of TLS sessions on Content Peering Gateways are beyond the scope of this document.

   The details of security techniques are also beyond the scope of this document.
A. Measurements

A.1. Proximity Measurements

   [24]. Proximity measurements can be exchanged between surrogates and the requesting entity. In many cases, proximity measurements are "one-way" in that they measure either the forward or reverse path of packets from the surrogate to the requesting entity. This is important as many paths in the Internet are asymmetric.

   In order to obtain a set of proximity measurements, a network may employ active probing techniques.

A.1.1. Active Probing
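
   As a rough sketch of how active probing results might be collected into proximity measurements, the code below probes each target several times and keeps the median delay. The probe callable is a hypothetical stand-in for whatever mechanism a network actually uses; it is assumed to return a delay in milliseconds, or None when no answer arrives (for example, because a firewall dropped the probe). The use of the median and the number of attempts are choices made for the example.

```python
from statistics import median
from typing import Callable, Dict, List, Optional


def probe_proximity(probe: Callable[[str], Optional[float]],
                    targets: List[str],
                    attempts: int = 3) -> Dict[str, float]:
    """Probe each target `attempts` times; return the median delay (ms) per responder."""
    results: Dict[str, float] = {}
    for target in targets:
        samples = []
        for _ in range(attempts):
            delay = probe(target)
            if delay is not None:      # unanswered probes are simply skipped
                samples.append(delay)
        if samples:                    # targets that never answer are omitted
            results[target] = median(samples)
    return results


# Canned delays standing in for real probes (assumed values, in milliseconds).
canned = {
    "192.0.2.1": [12.0, 30.0, 14.0],
    "192.0.2.2": [None, None, None],
    "192.0.2.3": [8.0, None, 9.0],
}
replies = {target: iter(delays) for target, delays in canned.items()}
measurements = probe_proximity(lambda t: next(replies[t]), list(canned))
print(measurements)
```

   Using the median rather than the mean keeps a single slow probe from dominating the estimate. Targets that never respond are left out, so the caller must decide how to treat them.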
   o  Probes often cause security alarms to be triggered on intrusion detection systems.

   [24].
References

   [1]  Postel, J. and J. Reynolds, "File Transfer Protocol", STD 9, RFC 959, October 1985.

   [2]  Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC 2246, January 1999.

   [3]  Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming Protocol", RFC 2326, April 1998.

   [4]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [5]  Partridge, C., Mendez, T. and W. Milliken, "Host Anycasting Service", RFC 1546, November 1993.

   [6]  Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform Resource Locators (URL)", RFC 1738, December 1994.

   [7]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.

   [8]  Day, M., Cain, B., Tomlinson, G. and P. Rzewski, "A Model for Content Internetworking (CDI)", RFC 3466, February 2003.

   [9]  Floyd, S. and L. Daigle, "IAB Architectural and Policy Considerations for Open Pluggable Edge Services", RFC 3238, January 2002.

   [10] Eastlake, D. and A. Panitz, "Reserved Top Level DNS Names", BCP 32, RFC 2606, June 1999.

   [11] Gulbrandsen, A., Vixie, P. and L. Esibov, "A DNS RR for specifying the location of services (DNS SRV)", RFC 2782, February 2000.

   [12] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.

   [13] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.

   [14] Elz, R. and R. Bush, "Clarifications to the DNS Specification", RFC 2181, July 1997.

   [15] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I. and X. Xiao, "Overview and Principles of Internet Traffic Engineering", RFC 3272, May 2002.

   [16] Crawley, E., Nair, R., Rajagopalan, B. and H. Sandick, "A Framework for QoS-based Routing in the Internet", RFC 2386, August 1998.

   [17] Huston, G., "Commentary on Inter-Domain Routing in the Internet", RFC 3221, December 2001.

   [18] M. Welsh et al., "SEDA: An Architecture for Well-Conditioned, Scalable Internet Services", Proceedings of the Eighteenth Symposium on Operating Systems Principles (SOSP-18) 2001, October 2001.

   [19] A. Shaikh, "On the effectiveness of DNS-based Server Selection", INFOCOM 2001, August 2001.

   [20] C. Yang et al., "An effective mechanism for supporting content-based routing in scalable Web server clusters", Proc. International Workshops on Parallel Processing 1999, September 1999.

   [21] R. Liston et al., "Using a Proxy to Measure Client-Side Web Performance", Proceedings of the Sixth International Web Content Caching and Distribution Workshop (WCW'01) 2001, August 2001.

   [22] W. Jiang et al., "Modeling of packet loss and delay and their effect on real-time multimedia service quality", Proceedings of NOSSDAV 2000, June 2000.

   [23] K. Johnson et al., "The measured performance of content distribution networks", Proceedings of the Fifth International Web Caching Workshop and Content Delivery Workshop 2000, May 2000.

   [24] V. Paxson, "End-to-end Internet packet dynamics", IEEE/ACM Transactions 1999, June 1999.

   [25] F. Wang et al., "Secure routing protocols: Theory and Practice", Technical report, North Carolina State University 1997, May 1997.
Acknowledgement

   Funding for the RFC Editor function is currently provided by the Internet Society.