Internet Engineering Task Force (IETF) S. Bortzmeyer Request for Comments: 7626 AFNIC Category: Informational August 2015 ISSN: 2070-1721 DNS Privacy Considerations
AbstractThis document describes the privacy issues associated with the use of the DNS by Internet users. It is intended to be an analysis of the present situation and does not prescribe solutions. Status of This Memo This document is not an Internet Standards Track specification; it is published for informational purposes. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7626. Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Risks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. The Alleged Public Nature of DNS Data . . . . . . . . . . 4 2.2. Data in the DNS Request . . . . . . . . . . . . . . . . . 5 2.3. Cache Snooping . . . . . . . . . . . . . . . . . . . . . 6 2.4. On the Wire . . . . . . . . . . . . . . . . . . . . . . . 7 2.5. In the Servers . . . . . . . . . . . . . . . . . . . . . 8 2.5.1. In the Recursive Resolvers . . . . . . . . . . . . . 8 2.5.2. In the Authoritative Name Servers . . . . . . . . . . 9 2.5.3. Rogue Servers . . . . . . . . . . . . . . . . . . . . 10 2.6. Re-identification and Other Inferences . . . . . . . . . 11 2.7. More Information . . . . . . . . . . . . . . . . . . . . 11 3. Actual "Attacks" . . . . . . . . . . . . . . . . . . . . . . 11 4. Legalities . . . . . . . . . . . . . . . . . . . . . . . . . 12 5. Security Considerations . . . . . . . . . . . . . . . . . . . 12 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 6.1. Normative References . . . . . . . . . . . . . . . . . . 12 6.2. Informative References . . . . . . . . . . . . . . . . . 13 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 17 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 17 Section 8 of [RFC6973]. The Domain Name System is specified in [RFC1034], [RFC1035], and many later RFCs, which have never been consolidated. It is one of the most important infrastructure components of the Internet and often ignored or misunderstood by Internet users (and even by many professionals). Almost every activity on the Internet starts with a DNS query (and often several). Its use has many privacy implications and this is an attempt at a comprehensive and accurate list. Let us begin with a simplified reminder of how the DNS works. (See also [DNS-TERMS].) A client, the stub resolver, issues a DNS query to a server, called the recursive resolver (also called caching resolver or full resolver or recursive name server). Let's use the query "What are the AAAA records for www.example.com?" as an example. AAAA is the QTYPE (Query Type), and www.example.com is the QNAME (Query Name). (The description that follows assumes a cold cache, for instance, because the server just started.) The recursive resolver will first query the root name servers. In most cases, the root name servers will send a referral. In this example, the referral will be to the .com name servers. The resolver repeats the query to one of the .com name servers. The .com name servers, in
turn, will refer to the example.com name servers. The example.com name server will then return the answer. The root name servers, the name servers of .com, and the name servers of example.com are called authoritative name servers. It is important, when analyzing the privacy issues, to remember that the question asked to all these name servers is always the original question, not a derived question. The question sent to the root name servers is "What are the AAAA records for www.example.com?", not "What are the name servers of .com?". By repeating the full question, instead of just the relevant part of the question to the next in line, the DNS provides more information than necessary to the name server. Because DNS relies on caching heavily, the algorithm described just above is actually a bit more complicated, and not all questions are sent to the authoritative name servers. If a few seconds later the stub resolver asks the recursive resolver, "What are the SRV records of _xmpp-server._tcp.example.com?", the recursive resolver will remember that it knows the name servers of example.com and will just query them, bypassing the root and .com. Because there is typically no caching in the stub resolver, the recursive resolver, unlike the authoritative servers, sees all the DNS traffic. (Applications, like web browsers, may have some form of caching that does not follow DNS rules, for instance, because it may ignore the TTL. So, the recursive resolver does not see all the name resolution activity.) It should be noted that DNS recursive resolvers sometimes forward requests to other recursive resolvers, typically bigger machines, with a larger and more shared cache (and the query hierarchy can be even deeper, with more than two levels of recursive resolvers). From the point of view of privacy, these forwarders are like resolvers, except that they do not see all of the requests being made (due to caching in the first resolver). Almost all this DNS traffic is currently sent in clear (unencrypted). There are a few cases where there is some channel encryption, for instance, in an IPsec VPN, at least between the stub resolver and the resolver. Today, almost all DNS queries are sent over UDP [thomas-ditl-tcp]. This has practical consequences when considering encryption of the traffic as a possible privacy technique. Some encryption solutions are only designed for TCP, not UDP. Another important point to keep in mind when analyzing the privacy issues of DNS is the fact that DNS requests received by a server are triggered by different reasons. Let's assume an eavesdropper wants to know which web page is viewed by a user. For a typical web page, there are three sorts of DNS requests being issued:
namespaces notwithstanding, the DNS operates under the assumption that public-facing authoritative name servers will respond to "usual" DNS queries for any zone they are authoritative for without further authentication or authorization of the client (resolver). Due to the lack of search capabilities, only a given QNAME will reveal the resource records associated with that name (or that name's non- existence). In other words: one needs to know what to ask for, in order to receive a response. The zone transfer QTYPE [RFC5936] is often blocked or restricted to authenticated/authorized access to enforce this difference (and maybe for other reasons). Another differentiation to be considered is between the DNS data itself and a particular transaction (i.e., a DNS name lookup). DNS data and the results of a DNS query are public, within the boundaries described above, and may not have any confidentiality requirements. However, the same is not true of a single transaction or a sequence of transactions; that transaction is not / should not be public. A typical example from outside the DNS world is: the web site of Alcoholics Anonymous is public; the fact that you visit it should not be. RFC6269]). The QNAME is the full name sent by the user. It gives information about what the user does ("What are the MX records of example.net?" means he probably wants to send email to someone at example.net, which may be a domain used by only a few persons and is therefore very revealing about communication relationships). Some QNAMEs are more sensitive than others. For instance, querying the A record of a well-known web statistics domain reveals very little (everybody visits web sites that use this analytics service), but querying the A record of www.verybad.example where verybad.example is the domain of an organization that some people find offensive or objectionable may create more problems for the user. Also, sometimes, the QNAME embeds the software one uses, which could be a privacy issue. For instance, _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org. There are also some BitTorrent clients that query an SRV record for _bittorrent-tracker._tcp.domain.example.
Another important thing about the privacy of the QNAME is the future usages. Today, the lack of privacy is an obstacle to putting potentially sensitive or personally identifiable data in the DNS. At the moment, your DNS traffic might reveal that you are doing email but not with whom. If your Mail User Agent (MUA) starts looking up Pretty Good Privacy (PGP) keys in the DNS [DANE-OPENPGPKEY], then privacy becomes a lot more important. And email is just an example; there would be other really interesting uses for a more privacy- friendly DNS. For the communication between the stub resolver and the recursive resolver, the source IP address is the address of the user's machine. Therefore, all the issues and warnings about collection of IP addresses apply here. For the communication between the recursive resolver and the authoritative name servers, the source IP address has a different meaning; it does not have the same status as the source address in an HTTP connection. It is now the IP address of the recursive resolver that, in a way, "hides" the real user. However, hiding does not always work. Sometimes [CLIENT-SUBNET] is used (see its privacy analysis in [denis-edns-client-subnet]). Sometimes the end user has a personal recursive resolver on her machine. In both cases, the IP address is as sensitive as it is for HTTP [sidn-entrada]. A note about IP addresses: there is currently no IETF document that describes in detail all the privacy issues around IP addressing. In the meantime, the discussion here is intended to include both IPv4 and IPv6 source addresses. For a number of reasons, their assignment and utilization characteristics are different, which may have implications for details of information leakage associated with the collection of source addresses. (For example, a specific IPv6 source address seen on the public Internet is less likely than an IPv4 address to originate behind a CGN or other NAT.) However, for both IPv4 and IPv6 addresses, it's important to note that source addresses are propagated with queries and comprise metadata about the host, user, or application that originated them. grangeia.snooping]. Since this also is a reconnaissance technique for subsequent cache poisoning attacks, some counter measures have already been developed and deployed.
RFC4033], explicitly excludes confidentiality from its goals.) So, if an initiator starts an HTTPS communication with a recipient, while the HTTP traffic will be encrypted, the DNS exchange prior to it will not be. When other protocols will become more and more privacy-aware and secured against surveillance, the DNS may become "the weakest link" in privacy. An important specificity of the DNS traffic is that it may take a different path than the communication between the initiator and the recipient. For instance, an eavesdropper may be unable to tap the wire between the initiator and the recipient but may have access to the wire going to the recursive resolver, or to the authoritative name servers. The best place to tap, from an eavesdropper's point of view, is clearly between the stub resolvers and the recursive resolvers, because traffic is not limited by DNS caching. The attack surface between the stub resolver and the rest of the world can vary widely depending upon how the end user's computer is configured. By order of increasing attack surface: The recursive resolver can be on the end user's computer. In (currently) a small number of cases, individuals may choose to operate their own DNS resolver on their local machine. In this case, the attack surface for the connection between the stub resolver and the caching resolver is limited to that single machine. The recursive resolver may be at the local network edge. For many/most enterprise networks and for some residential users, the caching resolver may exist on a server at the edge of the local network. In this case, the attack surface is the local network. Note that in large enterprise networks, the DNS resolver may not be located at the edge of the local network but rather at the edge of the overall enterprise network. In this case, the enterprise network could be thought of as similar to the Internet Access Provider (IAP) network referenced below. The recursive resolver can be in the IAP premises. For most residential users and potentially other networks, the typical case is for the end user's computer to be configured (typically automatically through DHCP) with the addresses of the DNS recursive resolvers at the IAP. The attack surface for on-the-
wire attacks is therefore from the end-user system across the local network and across the IAP network to the IAP's recursive resolvers. The recursive resolver can be a public DNS service. Some machines may be configured to use public DNS resolvers such as those operated today by Google Public DNS or OpenDNS. The end user may have configured their machine to use these DNS recursive resolvers themselves -- or their IAP may have chosen to use the public DNS resolvers rather than operating their own resolvers. In this case, the attack surface is the entire public Internet between the end user's connection and the public DNS service. RFC6973], the DNS servers (recursive resolvers and authoritative servers) are enablers: they facilitate communication between an initiator and a recipient without being directly in the communications path. As a result, they are often forgotten in risk analysis. But, to quote again [RFC6973], "Although [...] enablers may not generally be considered as attackers, they may all pose privacy threats (depending on the context) because they are able to observe, collect, process, and transfer privacy-relevant data." In [RFC6973] parlance, enablers become observers when they start collecting data. Many programs exist to collect and analyze DNS data at the servers -- from the "query log" of some programs like BIND to tcpdump and more sophisticated programs like PacketQ [packetq] [packetq-list] and DNSmezzo [dnsmezzo]. The organization managing the DNS server can use this data itself, or it can be part of a surveillance program like PRISM [prism] and pass data to an outside observer. Sometimes, this data is kept for a long time and/or distributed to third parties for research purposes [ditl] [day-at-root], security analysis, or surveillance tasks. These uses are sometimes under some sort of contract, with various limitations, for instance, on redistribution, given the sensitive nature of the data. Also, there are observation points in the network that gather DNS data and then make it accessible to third parties for research or security purposes ("passive DNS" [passive-dns]).
outside of the ccTLD's country. End users may not anticipate that, when doing a security analysis. Also, it seems (see the survey described in [aeris-dns]) that there is a strong concentration of authoritative name servers among "popular" domains (such as the Alexa Top N list). For instance, among the Alexa Top 100K, one DNS provider hosts today 10% of the domains. The ten most important DNS providers host together one third of the domains. With the control (or the ability to sniff the traffic) of a few name servers, you can gather a lot of information. dnschanger] can change the recursive resolver in the machine's configuration, or the routing itself can be subverted (for instance, [ripe-atlas-turkey]). A practical consequence of this section is that solutions for DNS privacy may have to address authentication of the server, not just passive sniffing.
herrmann-reidentification] found that such re- identification is possible so that "73.1% of all day-to-day links were correctly established, i.e. user u was either re-identified unambiguously (1) or the classifier correctly reported that u was not present on day t+1 any more (2)." While that study related to web browsing behavior, equally characteristic patterns may be produced even in machine-to-machine communications or without a user taking specific actions, e.g., at reboot time if a characteristic set of services are accessed by the device. For instance, one could imagine that an intelligence agency identifies people going to a site by putting in a very long DNS name and looking for queries of a specific length. Such traffic analysis could weaken some privacy solutions. The IAB privacy and security program also have a work in progress [RFC7624] that considers such inference-based attacks in a more general framework. tor-leak] (about the risk of privacy leak through DNS) and in a few academic papers: [yanbin-tsudik], [castillo-garcia], [fangming-hori-sakurai], and [federrath-fuchs-herrmann-piosecny]. Section 1). But, in this time of "big data" processing, powerful techniques now exist to get from the raw data to what the eavesdropper is actually interested in. Many research papers about malware detection use DNS traffic to detect "abnormal" behavior that can be traced back to the activity of
malware on infected machines. Yes, this research was done for the good, but technically it is a privacy attack and it demonstrates the power of the observation of DNS traffic. See [dns-footprint], [dagon-malware], and [darkreading-dns]. Passive DNS systems [passive-dns] allow reconstruction of the data of sometimes an entire zone. They are used for many reasons -- some good, some bad. Well-known passive DNS systems keep only the DNS responses, and not the source IP address of the client, precisely for privacy reasons. Other passive DNS systems may not be so careful. And there is still the potential problems with revealing QNAMEs. The revelations (from the Edward Snowden documents, which were leaked from the National Security Agency (NSA)) of the MORECOWBELL surveillance program [morecowbell], which uses the DNS, both passively and actively, to surreptitiously gather information about the users, is another good example showing that the lack of privacy protections in the DNS is actively exploited. data-protection-directive] (European Union) in the context of DNS traffic data is not an easy task, and we do not know a court precedent here. See an interesting analysis in [sidn-entrada]. QNAME-MINIMIZATION] for the minimization of data or [TLS-FOR-DNS] about encryption. [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, <http://www.rfc-editor.org/info/rfc1034>. [RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, November 1987, <http://www.rfc-editor.org/info/rfc1035>.
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., Morris, J., Hansen, M., and R. Smith, "Privacy Considerations for Internet Protocols", RFC 6973, DOI 10.17487/RFC6973, July 2013, <http://www.rfc-editor.org/info/rfc6973>. [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May 2014, <http://www.rfc-editor.org/info/rfc7258>. [aeris-dns] Vinot, N., "Vie privee: et le DNS alors?", (In French), 2015, <https://blog.imirhil.fr/vie-privee-et-le-dns-alors.html>. [castillo-garcia] Castillo-Perez, S. and J. Garcia-Alfaro, "Anonymous Resolution of DNS Queries", 2008, <http://deic.uab.es/~joaquin/papers/is08.pdf>. [CLIENT-SUBNET] Contavalli, C., Gaast, W., Lawrence, D., and W. Kumari, "Client Subnet in DNS Queries", Work in Progress, draft-ietf-dnsop-edns-client-subnet-02, July 2015. [dagon-malware] Dagon, D., "Corrupted DNS Resolution Paths: The Rise of a Malicious Resolution Authority", ISC/OARC Workshop, 2007, <https://www.dns-oarc.net/files/workshop-2007/ Dagon-Resolution-corruption.pdf>. [DANE-OPENPGPKEY] Wouters, P., "Using DANE to Associate OpenPGP public keys with email addresses", Work in Progress, draft-ietf-dane-openpgpkey-03, April 2015. [darkreading-dns] Lemos, R., "Got Malware? Three Signs Revealed In DNS Traffic", InformationWeek Dark Reading, May 2013, <http://www.darkreading.com/analytics/security-monitoring/ got-malware-three-signs-revealed-in-dns-traffic/d/ d-id/1139680>.
[data-protection-directive] European Parliament, "Directive 95/46/EC of the European Pariament and of the council on the protection of individuals with regard to the processing of personal data and on the free movement of such data", Official Journal L 281, pp. 0031 - 0050, November 1995, <http://eur-lex.europa.eu/LexUriServ/ LexUriServ.do?uri=CELEX:31995L0046:EN:HTML>. [day-at-root] Castro, S., Wessels, D., Fomenkov, M., and K. Claffy, "A Day at the Root of the Internet", ACM SIGCOMM Computer Communication Review, Vol. 38, Number 5, DOI 10.1145/1452335.1452341, October 2008, <http://www.sigcomm.org/sites/default/files/ccr/ papers/2008/October/1452335-1452341.pdf>. [denis-edns-client-subnet] Denis, F., "Security and privacy issues of edns-client- subnet", August 2013, <https://00f.net/2013/08/07/ edns-client-subnet/>. [ditl] CAIDA, "A Day in the Life of the Internet (DITL)", 2002, <http://www.caida.org/projects/ditl/>. [dns-footprint] Stoner, E., "DNS Footprint of Malware", OARC Workshop, October 2010, <https://www.dns-oarc.net/files/ workshop-201010/OARC-ers-20101012.pdf>. [DNS-TERMS] Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS Terminology", Work in Progress, draft-ietf-dnsop-dns-terminology-03, June 2015. [dnschanger] Wikipedia, "DNSChanger", October 2013, <https://en.wikipedia.org/w/ index.php?title=DNSChanger&oldid=578749672>. [dnsmezzo] Bortzmeyer, S., "DNSmezzo", 2009, <http://www.dnsmezzo.net/>.
[fangming-hori-sakurai] Fangming, Z., Hori, Y., and K. Sakurai, "Analysis of Privacy Disclosure in DNS Query", 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE 2007), Seoul, Korea, ISBN: 0-7695-2777-9, pp. 952-957, DOI 10.1109/MUE.2007.84, April 2007, <http://dl.acm.org/citation.cfm?id=1262690.1262986>. [federrath-fuchs-herrmann-piosecny] Federrath, H., Fuchs, K., Herrmann, D., and C. Piosecny, "Privacy-Preserving DNS: Analysis of Broadcast, Range Queries and Mix-based Protection Methods", Computer Security ESORICS 2011, Springer, page(s) 665-683, ISBN 978-3-642-23821-5, 2011, <https://svs.informatik.uni-hamburg.de/publications/2011/ 2011-09-14_FFHP_PrivacyPreservingDNS_ESORICS2011.pdf>. [grangeia.snooping] Grangeia, L., "DNS Cache Snooping or Snooping the Cache for Fun and Profit", February 2004, <http://www.msit2005.mut.ac.th/msit_media/1_2551/nete4630/ materials/20080718130017Hc.pdf>. [herrmann-reidentification] Herrmann, D., Gerber, C., Banse, C., and H. Federrath, "Analyzing Characteristic Host Access Patterns for Re-Identification of Web User Sessions", DOI 10.1007/978-3-642-27937-9_10, 2012, <http://epub.uni-regensburg.de/21103/1/ Paper_PUL_nordsec_published.pdf>. [morecowbell] Grothoff, C., Wachs, M., Ermert, M., and J. Appelbaum, "NSA's MORECOWBELL: Knell for DNS", GNUnet e.V., January 2015, <https://gnunet.org/morecowbell>. [packetq] Dot SE, "PacketQ, a simple tool to make SQL-queries against PCAP-files", 2011, <https://github.com/dotse/packetq/wiki>. [packetq-list] PacketQ, "PacketQ Mailing List", <http://lists.iis.se/mailman/listinfo/packetq>. [passive-dns] Weimer, F., "Passive DNS Replication", April 2005, <http://www.enyo.de/fw/software/dnslogger/#2>.
[prism] Wikipedia, "PRISM (surveillance program)", July 2015, <https://en.wikipedia.org/w/index.php?title=PRISM_ (surveillance_program)&oldid=673789455>. [QNAME-MINIMIZATION] Bortzmeyer, S., "DNS query name minimisation to improve privacy", Work in Progress, draft-ietf-dnsop-qname-minimisation-04, June 2015. [RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "DNS Security Introduction and Requirements", RFC 4033, DOI 10.17487/RFC4033, March 2005, <http://www.rfc-editor.org/info/rfc4033>. [RFC5155] Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS Security (DNSSEC) Hashed Authenticated Denial of Existence", RFC 5155, DOI 10.17487/RFC5155, March 2008, <http://www.rfc-editor.org/info/rfc5155>. [RFC5936] Lewis, E. and A. Hoenes, Ed., "DNS Zone Transfer Protocol (AXFR)", RFC 5936, DOI 10.17487/RFC5936, June 2010, <http://www.rfc-editor.org/info/rfc5936>. [RFC6269] Ford, M., Ed., Boucadair, M., Durand, A., Levis, P., and P. Roberts, "Issues with IP Address Sharing", RFC 6269, DOI 10.17487/RFC6269, June 2011, <http://www.rfc-editor.org/info/rfc6269>. [RFC7624] Barnes, R., Schneier, B., Jennings, C., Hardie, T., Trammell, B., Huitema, C., and D. Borkmann, "Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement", RFC 7624, DOI 10.17487/RFC7624, August 2015, <http://www.rfc-editor.org/info/rfc7624>. [ripe-atlas-turkey] Aben, E., "A RIPE Atlas View of Internet Meddling in Turkey", March 2014, <https://labs.ripe.net/Members/emileaben/ a-ripe-atlas-view-of-internet-meddling-in-turkey>. [sidn-entrada] Hesselman, C., Jansen, J., Wullink, M., Vink, K., and M. Simon, "A privacy framework for 'DNS big data' applications", November 2014, <https://www.sidnlabs.nl/uploads/tx_sidnpublications/ SIDN_Labs_Privacyraamwerk_Position_Paper_V1.4_ENG.pdf>.
[thomas-ditl-tcp] Thomas, M. and D. Wessels, "An Analysis of TCP Traffic in Root Server DITL Data", DNS-OARC 2014 Fall Workshop, October 2014, <https://indico.dns-oarc.net/event/20/ session/2/contribution/15/material/slides/1.pdf>. [TLS-FOR-DNS] Zi, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., and P. Hoffman, "TLS for DNS: Initiation and Performance Considerations", Work in Progress, draft-ietf-dprive- start-tls-for-dns-01, July 2015. [tor-leak] Tor, "DNS leaks in Tor", 2013, <https://www.torproject.org/docs/ faq.html.en#WarningsAboutSOCKSandDNSInformationLeaks>. [yanbin-tsudik] Yanbin, L. and G. Tsudik, "Towards Plugging Privacy Leaks in the Domain Name System", October 2009, <http://arxiv.org/abs/0910.2472>. http://www.afnic.fr/