5. Methodology
Mapping the relationship between human rights, protocols, and
architectures is a novel research challenge that requires
considerable interdisciplinary and cross-organizational cooperation
to develop a consistent methodology.
The methodological choices made in this document are based on the
political-science-based method of discourse analysis and ethnographic
research methods [Cath].  This work proceeds from the assumption
that language reflects the understanding of concepts.  Or, as [Jabri]
holds, policy documents are "social relations represented in texts
where the language contained within these texts is used to construct
meaning and representation." This process happens in society
[Denzin] and manifests itself in institutions and organizations
[King], and can be exposed using the ethnographic methods of
semi-structured interviews and participant observation.  Or, in
non-academic
language, the way the language in IETF/IRTF documents describes and
approaches the issues they are trying to address is an indication of
the underlying social assumptions and relationships of the engineers
to their engineering. By reading and analyzing these documents, as
well as interviewing engineers and participating in the IETF/IRTF
working groups, it is possible to distill the relationship between
human rights, protocols, and the Internet's infrastructure as it
pertains to the work of the IETF.
The discourse analysis was operationalized using qualitative and
quantitative means. The first step taken by the authors and
contributors was reading RFCs and other official IETF documents. The
second step was the use of a Python-based analyzer, using the
"Bigbang" tool, adapted by Nick Doty [Doty], to scan for the concepts
that were identified as important architectural principles (distilled
on the initial reading and supplemented by the interviews and
participant observation). Such a quantitative method is very precise
and speeds up the research process [Ritchie]. But this tool is
unable to understand "latent meaning" [Denzin]. In order to mitigate
these issues of automated word-frequency-based approaches and to get
a sense of the "thick meaning" [Geertz] of the data, a second
qualitative analysis of the data set was performed.  These various
rounds of discourse analysis were used to inform the interviews and
further data analysis: the initial rounds of quantitative analysis
informed the subsequent rounds of qualitative analysis, and the
results from the qualitative interviews were in turn used to feed
new concepts into the quantitative discourse analysis.  In this way,
the two methods continued to support and enrich each other.
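As a rough illustration of the quantitative step, the word-frequency
pass can be sketched as follows.  This is a minimal sketch only: the
concept list below is hypothetical, and the actual study used the
Bigbang tool over full RFC and mailing-list corpora.

```python
import re
from collections import Counter

# Hypothetical concept list distilled from the initial qualitative
# reading; the study itself used the Bigbang tool over full RFC and
# mailing-list corpora.
CONCEPTS = ["privacy", "security", "anonymity", "interoperability"]

def concept_frequencies(text, concepts):
    """Count case-insensitive whole-word occurrences of each concept."""
    words = Counter(re.findall(r"[a-z-]+", text.lower()))
    return {c: words[c] for c in concepts}

sample = ("Privacy and security considerations: privacy is discussed "
          "in Section 7.")
print(concept_frequencies(sample, CONCEPTS))
# {'privacy': 2, 'security': 1, 'anonymity': 0, 'interoperability': 0}
```

A frequency count like this is fast and precise but, as noted above,
blind to latent meaning, which is why the qualitative pass remains
necessary.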
The ethnographic methods of data collection and processing
allowed the research group to acquire the data necessary to "provide
a holistic understanding of research participants' views and actions"
[Denzin] that highlighted ongoing issues and case studies where
protocols impact human rights. The interview participants were
selected through purposive sampling [Babbie], as the research group
was interested in getting a wide variety of opinions on the role of
human rights in guiding protocol development. This sampling method
also ensured that individuals with extensive experience working at
the IETF in various roles were targeted. The interviewees included
individuals in leadership positions (Working Group (WG) chairs, Area
Directors (ADs)), "regular participants", and individuals working for
specific entities (corporate, civil society, political, academic) and
represented various backgrounds, nationalities, and genders.
5.1. Data Sources
In order to map the potential relationship between human rights and
protocols, the HRPC Research Group gathered data from three specific
sources:
5.1.1. Discourse Analysis of RFCs
To start addressing the issue, a mapping exercise was undertaken,
analyzing Internet infrastructure and protocol features vis-a-vis
their possible impact on human rights.  This involved research on
(1) the language used in current and historic RFCs and
(2) information gathered from mailing-list discussions, in order to
expose core architectural principles, language, and deliberations on
the human rights of those affected by the network.
5.1.2. Interviews with Members of the IETF Community
Over 30 interviews with the current and past members of the Internet
Architecture Board (IAB), current and past members of the Internet
Engineering Steering Group (IESG), chairs of selected working groups,
and RFC authors were conducted at the IETF 92 meeting in Dallas in
March 2015 to get an insider's understanding of how they view the
relationship (if any) between human rights and protocols, and how
this relationship plays out in their work. Several of the
participants opted to remain anonymous. If you are interested in
this data set, please contact the authors of this document.
5.1.3. Participant Observation in Working Groups
By participating in various working groups, in person at IETF
meetings, and on mailing lists, information about the IETF's
day-to-day workings was gathered, from which general themes,
technical concepts, and use cases about human rights and protocols
were extracted. This process started at the IETF 91 meeting in
Honolulu and continues today.
5.2. Data Analysis Strategies
The data above was processed using three consecutive strategies:
mapping protocols related to human rights, extracting concepts from
these protocols, and creating a common glossary (detailed under
Section 2). Before going over these strategies, some elaboration on
the process of identifying technical concepts as they relate to human
rights is needed:
5.2.1. Identifying Qualities of Technical Concepts That Relate to Human
Rights
5.2.1.1. Mapping Protocols and Standards to Human Rights
By combining data from the three data sources named above, an
extensive list of protocols and standards that potentially enable the
Internet as a tool for freedom of expression and association was
created. In order to determine the enabling (or inhibiting)
features, we relied on direct references in the RFCs as related to
such impacts, as well as input from the community. Based on this
analysis, a list of RFCs that describe standards and protocols that
are potentially closely related to human rights was compiled.
5.2.1.2. Extracting Concepts from Selected RFCs
The first step was to identify the protocols and standards that
relate to human rights and that help create an environment enabling
human rights.  For that, we needed to focus on the specific
technical concepts that underlie these protocols and standards.
Based on this
list, a number of technical concepts that appeared frequently were
extracted and used to create a second list of technical terms that,
when combined and applied in different circumstances, create an
enabling environment for exercising human rights on the Internet.
5.2.1.3. Building a Common Vocabulary of Technical Concepts That Impact
Human Rights
While interviewing experts, investigating RFCs, and compiling
technical definitions, several concepts of convergence and divergence
were identified. To ensure that the discussion was based on a common
understanding of terms and vocabulary, a list of definitions was
created. The definitions are based on the wording found in various
IETF documents; if the definitions were not available therein,
definitions were taken from other SDOs or academic literature, as
indicated in Section 2.
5.2.1.4. Translating Human Rights Concepts into Technical Definitions
The previous steps allowed for the clarification of relationships
between human rights and technical concepts. The steps taken show
how the research process "zoomed in", from compiling a broad list of
protocols and standards that relate to human rights to extracting the
precise technical concepts that make up these protocols and
standards, in order to understand the relationship between the two.
This subsection presents the next step: translating human rights to
technical concepts by matching the individual components of the
rights to the accompanying technical concepts, allowing for the
creation of a list of technical concepts that, when partially
combined, can create an enabling environment for human rights.
5.2.1.5. List of Technical Terms That, When Partially Combined, Can
Create an Enabling Environment for Human Rights
Based on the prior steps, the following list of technical terms was
drafted. When partially combined, this list can create an enabling
environment for human rights, such as freedom of expression and
freedom of association.
Architectural principles Enabling features
and system properties for user rights
/------------------------------------------------\
| |
+=================|=============================+ |
= | = |
= | End-to-end = |
= | Reliability = |
= | Resilience = Access as |
= | Interoperability = human right |
= Good enough | Transparency = |
= principle | Data minimization = |
= | Permissionless innovation = |
= Simplicity | Graceful degradation = |
= | Connectivity = |
= | Heterogeneity support = |
= | = |
= | = |
= \------------------------------------------------/
= =
+===============================================+
Figure 1: Relationship between Architectural Principles and Enabling
Features for User Rights
5.2.2. Relating Human Rights to Technical Concepts
The technical concepts listed in the steps above have been grouped
according to their impact on specific rights, as mentioned in the
interviews done at IETF 92 as well as the study of literature (see
Section 4 ("Literature and Discussion Review") above).
This analysis aims to assist protocol developers in better
understanding the roles that specific technical concepts have with
regard to their contribution to an enabling environment for people to
exercise their human rights.
This analysis does not claim to be a complete or exhaustive mapping
of all possible ways in which protocols could potentially impact
human rights, but it presents a mapping of initial concepts based on
interviews and on discussion and review of the literature.
+-----------------------+-----------------------------------------+
| Technical Concepts | Rights Potentially Impacted |
+-----------------------+-----------------------------------------+
| Connectivity | |
| Privacy | |
| Security | |
| Content agnosticism | Right to freedom of expression |
| Internationalization | |
| Censorship resistance | |
| Open standards | |
| Heterogeneity support | |
+-----------------------+-----------------------------------------+
| Anonymity | |
| Privacy | |
| Pseudonymity | Right to non-discrimination |
| Accessibility | |
+-----------------------+-----------------------------------------+
| Content agnosticism | |
| Security | Right to equal protection |
+-----------------------+-----------------------------------------+
| Accessibility | |
| Internationalization | Right to political participation |
| Censorship resistance | |
| Connectivity | |
+-----------------------+-----------------------------------------+
| Open standards | |
| Localization | Right to participate in cultural life, |
| Internationalization | arts, and science, and |
| Censorship resistance | Right to education |
| Accessibility | |
+-----------------------+-----------------------------------------+
| Connectivity | |
| Decentralization | |
| Censorship resistance | Right to freedom of assembly |
| Pseudonymity | and association |
| Anonymity | |
| Security | |
+-----------------------+-----------------------------------------+
| Reliability | |
| Confidentiality | |
| Integrity | Right to security |
| Authenticity | |
| Anonymity | |
| | |
+-----------------------+-----------------------------------------+
Figure 2: Relationship between Specific Technical Concepts
with Regard to Their Contribution to an Enabling Environment
for People to Exercise Their Human Rights

5.2.3. Mapping Cases of Protocols, Implementations, and Networking
Paradigms That Adversely Impact Human Rights or Are Enablers
Thereof
Given the information above, the following list of cases of
protocols, implementations, and networking paradigms that either
adversely impact or enable human rights was formed.
It is important to note that the assessment here is not a general
judgment on these protocols, nor is it an exhaustive listing of all
the potential negative or positive impacts on human rights that these
protocols might have. When these protocols were conceived, there
were many criteria to take into account. For instance, relying on a
centralized service can be bad for freedom of speech (it creates one
more control point, where censorship could be applied), but it may be
a necessity if the endpoints are not connected and reachable
permanently. So, when we say "protocol X has feature Y, which may
endanger freedom of speech," it does not mean that protocol X is bad,
much less that its authors were evil. The goal here is to show, with
actual examples, that the design of protocols has practical
consequences for some human rights and that these consequences have
to be considered in the design phase.
5.2.3.1. IPv4
The Internet Protocol version 4 (IPv4), also known as "Layer 3" of
the Internet and specified with a common encapsulation and protocol
header, is defined in [RFC791]. The evolution of Internet
communications led to continued development in this area,
"encapsulated" in the development of version 6 (IPv6) of the protocol
[RFC8200]. In spite of this updated protocol, we find that 23 years
after the specification of IPv6 the older IPv4 standard continues to
account for a sizable majority of Internet traffic. Most of the
issues discussed here (Network Address Translators (NATs) are a major
exception; see Section 5.2.3.1.2 ("Address Translation and
Mobility")) are valid for IPv4 as well as IPv6.
The Internet was designed as a platform for free and open
communication, most notably encoded in the end-to-end principle, and
that philosophy is also present in the technical implementation of IP
[RFC3724]. While the protocol was designed to exist in an
environment where intelligence is at the end hosts, it has proven to
provide sufficient information that a more intelligent network core
can make policy decisions and enforce policy-based traffic shaping,
thereby restricting the communications of end hosts. These
capabilities for network control and for limitations on freedom of
expression by end hosts can be traced back to the design of IPv4,
helping us to understand which technical protocol decisions have led
to harm to this human right. A feature that can harm freedom of
expression as well as the right to privacy through misuse of IP is
the exploitation of the public visibility of the host pairs for all
communications and the corresponding ability to differentiate and
block traffic as a result of that metadata.
5.2.3.1.1. Network Visibility of Source and Destination
The IPv4 protocol header contains fixed location fields for both the
source IP address and destination IP address [RFC791]. These
addresses identify both the host sending and the host receiving each
message; they also allow the core network to understand who is
talking to whom and to practically limit communication selectively
between pairs of hosts. Blocking of communication based on the pair
of source and destination is one of the most common limitations on
the ability for people to communicate today [CAIDA] and can be seen
as a restriction of the ability for people to assemble or to
consensually express themselves.
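The fixed offsets involved can be illustrated with a short sketch.
The addresses are from the IPv4 documentation ranges, and the
blocking policy is hypothetical; the point is only that the
(source, destination) pair is readable by any on-path device.

```python
def addresses(ipv4_header: bytes):
    """Extract the source and destination addresses from an IPv4
    header: per RFC 791 they sit at fixed offsets (bytes 12-15 and
    16-19), visible to every on-path device."""
    src = ".".join(str(b) for b in ipv4_header[12:16])
    dst = ".".join(str(b) for b in ipv4_header[16:20])
    return src, dst

# A minimal 20-byte header carrying documentation-range addresses.
hdr = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 6, 0, 0,
             192, 0, 2, 1,       # source: 192.0.2.1
             198, 51, 100, 7])   # destination: 198.51.100.7

# A middlebox can drop traffic purely on the (source, destination)
# pair, without inspecting any payload.
BLOCKED_PAIRS = {("192.0.2.1", "198.51.100.7")}  # hypothetical policy
print(addresses(hdr))                   # ('192.0.2.1', '198.51.100.7')
print(addresses(hdr) in BLOCKED_PAIRS)  # True
```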
Inclusion of an Internet-wide identifiable source address in the
IP header is not the only possible design, especially since the
protocol is
most commonly implemented over Ethernet networks exposing only
link-local identifiers [RFC894].
A variety of alternative designs do exist, such as the Accountable
and Private Internet Protocol [APIP] and High-speed Onion Routing at
the Network Layer (HORNET) [HORNET].  Source routing, which would
allow the sender to choose a predefined (safe) route, and spoofing
of the source IP address are both technically supported by IPv4, but
neither is considered good practice on the Internet [Farrow].  While
projects like [TorProject] provide an alternative implementation of
anonymity in connections, they have been developed in spite of the
IPv4 protocol design.
5.2.3.1.2. Address Translation and Mobility
A major structural shift in the Internet that undermined the protocol
design of IPv4, and significantly reduced the freedom of end users to
communicate and assemble, was the introduction of network address
translation [RFC3022]. Network address translation is a process
whereby organizations and autonomous systems connect two networks by
translating the IPv4 source and destination addresses between them.
This process puts the router performing the translation in a
privileged position, where it can predetermine which subset of
communications will be translated.
This process of translation has seen widespread adoption despite
going against the stated end-to-end principle of the underlying
protocol [NATusage].  In contrast, the proposed mechanism
to provide support for mobility and forwarding to clients that may
move -- encoded instead as an option in IP [RFC5944] -- has failed to
gain traction. In this situation, the compromise made in the design
of the protocol resulted in a technology that is not coherent with
the end-to-end principles and thus creates an extra possible hurdle
for freedom of expression in its design, even though a viable
alternative exists. There is a particular problem surrounding NATs
and Virtual Private Networks (VPNs) (as well as other connections
used for privacy purposes), as NATs sometimes cause VPNs not to work.
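The privileged position of the translator can be sketched with a toy
source-NAT table.  The addresses and ports are illustrative only,
and real NAT behavior is considerably more involved; the sketch
shows only why unsolicited inbound traffic cannot reach a host
behind the translator.

```python
# A toy source NAT: the translator rewrites the private source
# address/port to its own public address and a chosen port.  It must
# see the first outbound packet before any inbound one can be
# delivered, which breaks end-to-end reachability.
PUBLIC_ADDR = "203.0.113.5"  # documentation-range public address

class Nat:
    def __init__(self):
        self.next_port = 40000
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def translate_out(self, src_ip, src_port):
        """Rewrite an outbound packet's source, creating a mapping."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return PUBLIC_ADDR, self.out[key]

    def translate_in(self, dst_port):
        """Unsolicited inbound traffic has no mapping: it is dropped."""
        return self.back.get(dst_port)

nat = Nat()
print(nat.translate_out("10.0.0.2", 5000))  # ('203.0.113.5', 40000)
print(nat.translate_in(40000))              # ('10.0.0.2', 5000)
print(nat.translate_in(40001))              # None: nobody initiated this
```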
5.2.3.2. DNS
The Domain Name System (DNS) [RFC1035] provides service discovery
capabilities and provides a mechanism to associate human-readable
names with services. The DNS is organized around a set of
independently operated "root servers" run by organizations that
function in line with ICANN's policy by answering queries for which
organizations have been delegated to manage registration under each
Top-Level Domain (TLD). The DNS is organized as a rooted tree, and
this brings up political and social concerns over control. TLDs are
maintained and determined by ICANN. These namespaces encompass
several classes of services. The initial namespaces, including
".com" and ".net", provide common spaces for expression of ideas,
though their policies are enacted through US-based companies. Other
namespaces are delegated to specific nationalities and may impose
limits designed to focus speech in those forums, to both (1) promote
speech from that nationality and (2) comply with local limits on
expression and social norms. Finally, the system has recently been
expanded with additional generic and sponsored namespaces -- for
instance, ".travel" and ".ninja" -- that are operated by a range of
organizations that may independently determine their registration
policies. This new development has both positive and negative
implications in terms of enabling human rights. Some individuals
argue that it undermines the right to freedom of expression because
some of these new generic TLDs have restricted policies on
registration and particular rules on hate speech content. Others
argue that precisely these properties are positive because they
enable certain (mostly minority) communities to build safer spaces
for association, thereby enabling their right to freedom of
association. An often-mentioned example is an application like
.gay [CoE].
As discussed in [RFC7626], DNS has significant privacy issues.  Most
notable are the lack of encryption, which leaves requests for domain
resolution visible to intermediary parties, and the limited
deployment of DNSSEC, which provides authentication, allowing the
client to know that it received a correct, "authoritative" answer to
a query.  In response to the privacy issues, the IETF DNS Private
Exchange (DPRIVE) Working Group is developing mechanisms to provide
confidentiality to DNS transactions, to address concerns surrounding
pervasive monitoring [RFC7258].
Authentication through DNSSEC creates a validation path for records.
This authentication protects against forged or manipulated DNS data.
As such, DNSSEC protects directory lookups and makes it harder to
hijack a session. This is important because interference with the
operation of the DNS is currently becoming one of the central
mechanisms used to block access to websites. This interference
limits both the freedom of expression of the publisher to offer their
content and the freedom of assembly for clients to congregate in a
shared virtual space. Even though DNSSEC doesn't prevent censorship,
it makes it clear that the returned information is not the
information that was requested; this contributes to the right to
security and increases trust in the network. It is, however,
important to note that DNSSEC is currently not widely supported or
deployed by domain name registrars, making it difficult to
authenticate and use correctly.
5.2.3.2.1. Removal of Records
There have been a number of cases where the records for a domain are
removed from the name system due to political events. Examples of
this removal include the "seizure" of wikileaks [BBC-wikileaks] and
the names of illegally operating gambling operations by the United
States Immigration and Customs Enforcement (ICE) unit. In the first
case, a US court ordered the registrar to take down the domain. In
the second, ICE compelled the US-based registry in charge of the .com
TLD to hand ownership of those domains over to the US government.
The same technique has been used in Libya to remove sites in
violation of "our Country's Law and Morality (which) do not allow any
kind of pornography or its promotion." [techyum]
At a protocol level, there is no technical auditing for name
ownership, as in alternate systems like Namecoin [Namecoin]. As a
result, there is no ability for users to differentiate seizure from
the legitimate transfer of name ownership, which is purely a policy
decision made by registrars. While DNSSEC addresses the network
distortion events described below, it does not tackle this problem.
(Although we mention alternative techniques, this is not a comparison
of DNS with Namecoin: the latter has its own problems and
limitations. The idea here is to show that there are several
possible choices, and they have consequences for human rights.)
5.2.3.2.2. Distortion of Records
The most common mechanism by which the DNS is abused to limit freedom
of expression is through manipulation of protocol messages by the
network. One form occurs at an organizational level, where client
computers are instructed to use a local DNS resolver controlled by
the organization. The DNS resolver will then selectively distort
responses rather than request the authoritative lookup from the
upstream system. The second form occurs through the use of Deep
Packet Inspection (DPI), where all DNS protocol messages are
inspected by the network and objectionable content is distorted, as
can be observed in Chinese networks.
A notable instance of distortion occurred in Greece [Ververis], where
a study found evidence of both (1) DPI to distort DNS replies and
(2) more excessive blocking of content than was legally required or
requested (also known as "overblocking").  Internet Service
Providers (ISPs), obeying a governmental order, prevented clients
from resolving the names of the domains in question, thereby
blocking access to those systems.
At a protocol level, the effectiveness of these attacks is made
possible by a lack of authentication in the DNS protocol. DNSSEC
provides the ability to determine the authenticity of responses when
used, but it is not regularly checked by resolvers. DNSSEC is not
effective when the local resolver for a network is complicit in the
distortion -- for instance, when the resolver assigned for use by an
ISP is the source of injection. Selective distortion of records is
also made possible by the predictable structure of DNS messages,
which makes it computationally easy for a network device to watch all
passing messages even at high speeds, and the lack of encryption,
which allows the network to distort only an objectionable subset of
protocol messages. Specific distortion mechanisms are discussed
further in [Hall].
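A toy model of the first, resolver-level form of distortion follows.
The zone data, blocklist, and block-page address are all
hypothetical; DNSSEC validation, where deployed and checked, would
reveal that the distorted answer is not what the zone actually
holds.

```python
# A toy distorting resolver: it answers honestly except for names on
# a hypothetical blocklist, for which it substitutes a block-page
# address instead of performing the authoritative lookup.
ZONE = {"example.com": "192.0.2.10", "example.net": "192.0.2.20"}
BLOCKLIST = {"example.net"}   # hypothetical censorship policy
BLOCKPAGE = "203.0.113.1"     # hypothetical block-page address

def distorting_resolve(name):
    """Selectively distort: blocked names get the block page."""
    if name in BLOCKLIST:
        return BLOCKPAGE      # distorted: not what the zone holds
    return ZONE.get(name)

print(distorting_resolve("example.com"))  # '192.0.2.10'
print(distorting_resolve("example.net"))  # '203.0.113.1'
```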
Users can switch to another resolver -- for instance, a public
resolver. The distorter can then try to block or hijack the
connection to this resolver. This may start an arms race, with the
user switching to secured connections to this alternative resolver
[RFC7858] and the distorter then trying to find more sophisticated
ways to block or hijack the connection. In some cases, this search
for an alternative, non-disrupting resolver may lead to more
centralization because many people are switching to a few big
commercial public resolvers.
5.2.3.2.3. Injection of Records
Responding incorrectly to requests for name lookups is the most
common mechanism that in-network devices use to limit the ability of
end users to discover services. A deviation that accomplishes a
similar objective and may be seen as different from a "freedom of
expression" perspective is the injection of incorrect responses to
queries. The most prominent example of this behavior occurs in
China, where requests for lookups of sites deemed inappropriate will
trigger the network to return a false response, causing the client to
ignore the real response when it subsequently arrives
[greatfirewall]. Unlike the other network paradigms discussed above,
injection does not stifle the ability of a server to announce its
name; it instead provides another voice that answers sooner. This is
effective because without DNSSEC, the protocol will respond to
whichever answer is received first, without listening for subsequent
answers.
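The first-answer-wins behavior can be modeled in a few lines.  The
query IDs and addresses are illustrative; a DNSSEC-validating
resolver would reject the forged record rather than accept whichever
response arrives first.

```python
# A toy model of the race described above: a stub resolver accepts
# the first response matching its query ID, so an on-path injector
# that answers faster than the authoritative server wins.
def resolve(query_id, responses):
    """Return the answer of the first response whose ID matches."""
    for resp_id, answer in responses:   # responses in arrival order
        if resp_id == query_id:
            return answer               # later genuine answers ignored
    return None

# The injected answer arrives first; the genuine one is discarded.
wire = [(0x1234, "198.51.100.99"),   # forged, injected by the network
        (0x1234, "192.0.2.10")]      # authoritative, arrives later
print(resolve(0x1234, wire))  # '198.51.100.99'
```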
5.2.3.3. HTTP
The Hypertext Transfer Protocol (HTTP) version 1.1 [RFC7230]
[RFC7231] [RFC7232] [RFC7233] [RFC7234] [RFC7235] [RFC7236] [RFC7237]
is a request-response application protocol developed throughout the
1990s.  HTTP contributed substantially to the exponential growth of
the Internet and the interconnection of populations around the
world.
Its simple design strongly contributed to the fact that HTTP has
become the foundation of most modern Internet platforms and
communication systems, from websites to chat systems and computer-to-
computer applications. In its manifestation in the World Wide Web,
HTTP radically revolutionized the course of technological development
and the ways people interact with online content and with each other.
However, HTTP is also a fundamentally insecure protocol that doesn't
natively provide encryption properties. While the definition of the
Secure Sockets Layer (SSL) [RFC6101], and later of Transport Layer
Security (TLS) [RFC5246], also happened during the 1990s, the fact
that HTTP doesn't mandate the use of such encryption layers by
developers and service providers was one of the reasons for a very
late adoption of encryption.  Only in the middle of the 2000s did we
observe large online service providers, such as Google, starting to
provide encrypted access to their web services.
The lack of sensitivity and understanding of the critical importance
of securing web traffic incentivized certain (offensive) actors to
develop, deploy, and utilize interception systems at large and to
later launch active injection attacks, in order to swipe large
amounts of data and compromise Internet-enabled devices. The
commercial availability of systems and tools to perform these types
of attacks also led to a number of human rights abuses that have been
discovered and reported over the years.
Generally, we can identify traffic interception (Section 5.2.3.3.1)
and traffic manipulation (Section 5.2.3.3.2) as the two most
problematic attacks that can be performed against applications
employing a cleartext HTTP transport layer. That being said, the
IETF is taking steady steps to move to the encrypted version of HTTP,
HTTP Secure (HTTPS).
While this is commendable, we must not lose track of the fact that
different protocols, implementations, configurations, and networking
paradigms can intersect such that they (can be used to) adversely
impact human rights. For instance, to facilitate surveillance,
certain countries will throttle HTTPS connections, forcing users to
switch to (unthrottled) HTTP [Aryan-etal].
5.2.3.3.1. Traffic Interception
While we are seeing an increasing trend in the last couple of years
to employ SSL/TLS as a secure traffic layer for HTTP-based
applications, we are still far from seeing a ubiquitous use of
encryption on the World Wide Web. It is important to consider that
the adoption of SSL/TLS is also a relatively recent phenomenon.
Email providers such as riseup.net were the first to enable SSL by
default. Google did not introduce an option for its Gmail users to
navigate with SSL until 2008 [Rideout] and turned TLS on by default
later, in 2010 [Schillace]. It took an increasing amount of security
breaches and revelations on global surveillance from Edward Snowden
before other mail service providers followed suit. For example,
Yahoo did not enable SSL/TLS by default on its webmail services until
early 2014 [Peterson].
TLS itself has been subject to many attacks and bugs; this situation
can be attributed to some fundamental design weaknesses, such as lack
of a state machine (which opens a vulnerability for triple handshake
attacks) and flaws caused by early US government restrictions on
cryptography, leading to cipher-suite downgrade attacks (Logjam
attacks). These vulnerabilities are being corrected in TLS 1.3
[Bhargavan] [Adrian].
HTTP upgrading to HTTPS is also vulnerable to having an attacker
remove the "s" in any links to HTTPS URIs from a web page transferred
in cleartext over HTTP -- an attack called "SSL Stripping"
[sslstrip]. Thus, for high-security use of HTTPS, IETF standards
such as HTTP Strict Transport Security (HSTS) [RFC6797], certificate
pinning [RFC7469], and/or DNS-Based Authentication of Named Entities
(DANE) [RFC6698] should be used.
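The stripping attack amounts to a trivial rewrite of a cleartext
page, which can be sketched as follows.  The page content is
hypothetical; the Strict-Transport-Security header shown is the
HSTS mechanism of [RFC6797] that instructs a browser to refuse
plain-HTTP connections to the site thereafter.

```python
import re

def sslstrip(html: str) -> str:
    """Rewrite https:// links to http:// in a cleartext page, as an
    active attacker on the path of an unencrypted HTTP session can."""
    return re.sub(r"https://", "http://", html)

page = '<a href="https://example.com/login">log in</a>'
print(sslstrip(page))
# '<a href="http://example.com/login">log in</a>'

# HSTS counters this: once a browser has seen this header over a
# valid HTTPS connection, it will not load the site over plain HTTP.
hsts = "Strict-Transport-Security: max-age=31536000; includeSubDomains"
```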
As we learned through Snowden's revelations, intelligence agencies
have been intercepting and collecting unencrypted traffic at large
for many years. There are documented examples of such
mass-surveillance programs with the Government Communications
Headquarters's (GCHQ's) Tempora [WP-Tempora] and the National
Security Agency's (NSA's) XKeyscore [Greenwald]. Through these
programs, the NSA and the GCHQ have been able to swipe large amounts
of data, including email and instant messaging communications that
have been transported in the clear for years by providers
unsuspecting of the pervasiveness and scale of governments' efforts
and investment in global mass-surveillance capabilities.
However, similar mass interception of unencrypted HTTP communications
is also often employed at the national level by some non-democratic
countries, by exercising control over state-owned ISPs and through
the use of commercially available monitoring, collection, and
censorship equipment. Over the last few years, a lot of information
has come to public attention on the role and scale of a surveillance
industry dedicated to developing different types of interception
gear, making use of known and unknown weaknesses in existing
protocols [RFC7258]. We have several records of such equipment being
sold and utilized by some regimes in order to monitor entire segments
of a population, especially at times of social and political
distress, uncovering massive human rights abuses. For example, in
2013, the group Telecomix revealed that the Syrian regime was making
use of Blue Coat products in order to intercept cleartext traffic as
well as to enforce censorship of unwanted content [RSF]. Similarly,
in 2011, it was found that the French technology firm Amesys provided
the Gadhafi government with equipment able to intercept emails,
Facebook traffic, and chat messages at a country-wide level [WSJ].
The use of such systems, especially in the context of the Arab Spring
and of civil uprisings against the dictatorships, has caused serious
concerns regarding significant human rights abuses in Libya.
5.2.3.3.2. Traffic Manipulation
The lack of a secure transport layer under HTTP connections not only
exposes users to interception of the content of their communications
but is more and more commonly abused as a vehicle for actively
compromising computers and mobile devices. If an HTTP session
travels in the clear over the network, any node positioned at any
point in the network is able to perform man-in-the-middle attacks;
the node can observe, manipulate, and hijack the session and can
modify the content of the communication in order to trigger
unexpected behavior by the application generating the traffic. For
example, in the case of a browser, the attacker would be able to
inject malicious code in order to exploit vulnerabilities in the
browser or any of its plugins. Similarly, the attacker would be able
to intercept, add malware to, and repackage binary software updates
that are very commonly downloaded in the clear by applications such
as word processors and media players. If the HTTP session were
encrypted, the tampering of the content would not be possible, and
these network injection attacks would not be successful.
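To make concrete how little effort such tampering requires, the following sketch (the authors' own illustration, not code from any system named in this document) rewrites a cleartext HTTP response in transit, inserting attacker-controlled bytes and adjusting Content-Length so the client accepts the modified message. With TLS, an on-path node cannot perform this rewrite without breaking the record integrity checks:

```python
# Illustration only: an on-path node altering an unencrypted HTTP
# response. The function name and message are invented for this sketch.
def tamper_http_response(raw_response, payload):
    """Insert payload before </body> and fix Content-Length so the
    tampered response still parses correctly at the client."""
    head, _, body = raw_response.partition(b"\r\n\r\n")
    body = body.replace(b"</body>", payload + b"</body>", 1)
    lines = []
    for line in head.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            line = b"Content-Length: %d" % len(body)
        lines.append(line)
    return b"\r\n".join(lines) + b"\r\n\r\n" + body

original = (b"HTTP/1.1 200 OK\r\nContent-Length: 28\r\n\r\n"
            b"<html><body>hi</body></html>")
tampered = tamper_http_response(original, b"<script>x</script>")
```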
While traffic manipulation attacks have long been known, documented,
and prototyped, especially in the context of Wi-Fi and LAN networks,
in the last few years we have observed an increasing investment in
the production and sale of network injection equipment that is both
commercially available and deployed at scale by intelligence
agencies.
For example, we learned from some of the documents provided by Edward
Snowden to the press that the NSA has constructed a global network
injection infrastructure, called "QUANTUM", able to leverage mass
surveillance in order to identify targets of interest and
subsequently task man-on-the-side attacks to ultimately compromise a
selected device. Among other attacks, the NSA makes use of an attack
called "QUANTUMINSERT" [Haagsma], which intercepts and hijacks an
unencrypted HTTP communication and forces the requesting browser to
redirect to a host controlled by the NSA instead of the intended
website. Normally, the new destination would be an exploitation
service, referred to in Snowden documents as "FOXACID", which would
attempt to execute malicious code in the context of the target's
browser. The Guardian reported in 2013 that the NSA has, for
example, been using these techniques to target users of the popular
anonymity service Tor [Schneier]. The German Norddeutscher Rundfunk
(NDR) reported in 2014 that the NSA has also been using its
mass-surveillance capabilities to identify Tor users at large
[Appelbaum].
Recently, similar capabilities used by Chinese authorities have been
reported as well in what has been informally called the "Great
Cannon" [Marcak], which raised numerous concerns on the potential
curb on human rights and freedom of speech due to the increasingly
tighter control of Chinese Internet communications and access to
information.
Network injection attacks are also made widely available to state
actors around the world through the commercialization of similar,
smaller-scale equipment that can be easily acquired and deployed at a
country-wide level. Certain companies are known to have network
injection gear within their products portfolio [Marquis-Boire]. The
technology devised and produced by some of them to perform network
traffic manipulation attacks on HTTP communications is even the
subject of a patent application in the United States [Googlepatent].
Access to offensive technologies available on the commercial lawful
interception market has led to human rights abuses and illegitimate
surveillance of journalists, human rights defenders, and political
activists in many countries around the world [Collins]. While
network injection attacks haven't been the subject of much attention,
they do enable even unskilled attackers to perform silent and very
resilient compromises, and unencrypted HTTP remains one of the main
vehicles.
There is a new version of HTTP, called "HTTP/2" [RFC7540], which aims
to be largely backwards compatible while also offering new options
such as data compression of HTTP headers, pipelining of requests, and
multiplexing multiple requests over a single TCP connection. In
addition to decreasing latency to improve page-loading speeds, it
also facilitates more efficient use of connectivity in low-bandwidth
environments, which in turn enables freedom of expression; the right
to assembly; the right to political participation; and the right to
participate in cultural life, arts, and science. [RFC7540] does not
mandate TLS or any other form of encryption, nor does it itself
support opportunistic encryption, though opportunistic encryption for
HTTP is now addressed in [RFC8164].
5.2.3.4. XMPP
The Extensible Messaging and Presence Protocol (XMPP), specified in
[RFC6120], provides a standard for interactive chat messaging and has
evolved to encompass interoperable text, voice, and video chat. The
protocol is structured as a federated network of servers, similar to
email, where users register with a local server that acts on their
behalf to cache and relay messages. This protocol design has many
advantages, allowing servers to shield clients from denial of service
and other forms of retribution for their expression; it is also
designed to avoid central entities that could control the ability to
communicate or assemble using the protocol.
Nonetheless, there are plenty of aspects of the protocol design of
XMPP that shape the ability for users to communicate freely and to
assemble via the protocol.
5.2.3.4.1. User Identification
The XMPP specification [RFC6120] dictates that clients are identified
with a resource (<node@domain/home> / <node@domain/work>) so that
conversations can be directed to specific devices. While the protocol
does not require that the client's server expose the resource to
remote users, in practice this has become the default behavior. As a
result, remote contacts and their servers can track the presence of
not just the user but of each individual device the user logs in
with. This has proven misleading to many users [Pidgin], since many
clients only expose user-level rather than device-level presence.
Likewise, user invisibility (the ability to communicate without
announcing one's availability to all contacts and other servers) is
not part of the formal protocol; it has only been added as an
extension within the XML stream rather than being enforced by the
protocol.
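The addressing structure involved can be seen in a short sketch (the authors' own illustration of the JID syntax in [RFC6120]): the resource part ("home", "work") is precisely the component whose exposure enables per-device presence tracking.

```python
# Sketch of splitting an XMPP address (JID) per [RFC6120] into
# localpart, domain, and resource. The resource identifies a specific
# device and is the part that can leak per-device presence.
def split_jid(jid):
    """Return (localpart, domain, resource); absent parts are None."""
    bare, _, resource = jid.partition("/")
    local, _, domain = bare.rpartition("@")
    return (local or None, domain, resource or None)
```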
5.2.3.4.2. Surveillance of Communication
XMPP specifies the standard by which communications channels may be
encrypted, but it does not provide visibility to clients regarding
whether their communications are encrypted on each link. In
particular, even when both clients ensure that they have an encrypted
connection to their XMPP server to ensure that their local network is
unable to read or disrupt the messages they send, the protocol does
not provide visibility into the encryption status between the two
servers. As such, clients may be subject to selective disruption of
communications by an intermediate network that disrupts
communications based on keywords found through DPI. While many
operators have committed to only establishing encrypted links from
their servers in recognition of this vulnerability, it remains
impossible for users to audit this behavior, and encrypted
connections are not required by the protocol itself [XMPP-Manifesto].
In particular, Section 13.14 of the XMPP specification [RFC6120]
explicitly acknowledges the existence of a downgrade attack where an
adversary controlling an intermediate network can force the
inter-domain federation between servers to revert to a non-encrypted
protocol where selective messages can then be disrupted.
5.2.3.4.3. Group Chat Limitations
Group chat in XMPP is defined as an extension within the XML
specification of XMPP (https://xmpp.org/extensions/xep-0045.html).
However, it is not encoded or required at a protocol level and is not
uniformly implemented by clients.
The design of multi-user chat in XMPP suffers from extending a
protocol that was not designed with assembly of many users in mind.
In particular, in the federated protocol provided by XMPP, multi-user
communities are implemented with a distinguished "owner" who is
granted control over the participants and structure of the
conversation.
Multi-user chat rooms are identified by a name specified on a
specific server, so that while the overall protocol may be federated,
the ability for users to assemble in a given community is moderated
by a single server. That server may block the room and prevent
assembly unilaterally, even between two users, neither of whom trusts
or uses that server directly.
5.2.3.5. Peer-to-Peer
Peer-to-Peer (P2P) is a distributed network architecture [RFC5694] in
which all the participant nodes can be responsible for the storage
and dissemination of information from any other node (see [RFC7574],
an IETF standard that discusses a P2P architecture called the
"Peer-to-Peer Streaming Peer Protocol" (PPSPP)). A P2P network is a
logical overlay that lives on top of the physical network and allows
nodes (or "peers") participating in it to establish contact and
exchange information directly with each other. The implementation of
a P2P network may vary widely: it may be structured or unstructured,
and it may implement stronger or weaker cryptographic and anonymity
properties. While its most common application has traditionally been
file-sharing (and other types of content delivery systems), P2P is a
popular architecture for networks and applications that require (or
encourage) decentralization. Prime examples include Bitcoin, as well
as certain proprietary multimedia applications.
In a time of heavily centralized online services, P2P is regularly
described as an alternative, more democratic, and resistant option
that displaces structures of control over data and communications and
makes all peers equally responsible for the functioning, integrity,
and security of the data. While in principle P2P remains
important to the design and development of future content
distribution, messaging, and publishing systems, it poses numerous
security and privacy challenges that are mostly delegated to
individual developers to recognize, analyze, and solve in each
implementation of a given P2P network.
5.2.3.5.1. Network Poisoning
Since content, and sometimes peer lists, are safeguarded and
distributed by the network's members, P2P networks are prone to what
are generally called "poisoning attacks". Poisoning attacks may be
aimed (1) directly at the data being distributed, for example by
intentionally corrupting it, (2) at the index tables used to instruct
peers where to fetch the data, or (3) at routing tables, in an
attempt to provide connecting peers with lists of rogue or
nonexistent peers, with the intention of effectively causing a denial
of service on the network.
5.2.3.5.2. Throttling
P2P traffic (and BitTorrent in particular) represents a significant
percentage of global Internet traffic [Sandvine], and it has become
increasingly popular for ISPs to perform throttling of customers'
lines in order to limit bandwidth usage [torrentfreak1] and,
sometimes, possibly as a result of the ongoing conflict between
copyright holders and file-sharing communities [wikileaks]. Such
throttling undermines the end-to-end principle.
Throttling the P2P traffic makes some uses of P2P networks
ineffective; this throttling might be coupled with stricter
inspection of users' Internet traffic through DPI techniques,
possibly posing additional security and privacy risks.
5.2.3.5.3. Tracking and Identification
One of the fundamental and most problematic issues with traditional
P2P networks is a complete lack of anonymization of their users. For
example, in the case of BitTorrent, all peers' IP addresses are
openly available to the other peers. This has led to ever-increasing
tracking of P2P and file-sharing users [ars]. As the user's
geographical location, and possibly identity, is directly exposed,
the user might become a target of additional harassment and attacks
of a physical or legal nature. For example, it is known that
in Germany law firms have made extensive use of P2P and file-sharing
tracking systems in order to identify downloaders and initiate legal
actions looking for compensations [torrentfreak2].
It is worth noting that there are some varieties of P2P networks that
implement cryptographic practices and that introduce anonymization of
their users. Such implementations may prove successful in resisting
censorship of content and tracking of network peers. A
prime example is Freenet [freenet1], a free software application that
is (1) designed to make it significantly more difficult to identify
users and content and (2) dedicated to fostering freedom of speech
online [freenet2].
5.2.3.5.4. Sybil Attacks
In open-membership P2P networks, a single attacker can pretend to be
many participants, typically by creating multiple fake identities of
whatever kind the P2P network uses [Douceur]. Attackers can use
Sybil attacks to bias choices that the P2P network makes collectively
to the attacker's advantage, e.g., by making it more likely that a
particular data item (or some threshold of the replicas or shares of
a data item) is assigned to attacker-controlled participants. If the
P2P network implements any voting, moderation, or peer-review-like
functionality, Sybil attacks may be used to "stuff the ballots" to
benefit the attacker. Companies and governments can use Sybil
attacks on discussion-oriented P2P systems for "astroturfing" or
creating the appearance of mass grassroots support for some position
where in reality there is none. It is important to know that there
are no known complete, environmentally sustainable, and fully
distributed solutions to Sybil attacks, and routing via "friends"
allows users to be de-anonymized via their social graph. It is
important to note that Sybil attacks in this context (e.g.,
astroturfing) are relevant to more than P2P protocols; they are also
common on web-based systems, and they are exploited by governments
and commercial entities.
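The leverage a Sybil attacker gains can be shown with a toy simulation (the authors' own illustration; it assumes uniform random replica placement and no Sybil defenses, which real networks may not satisfy): the more fake identities the attacker injects, the more likely it is that every replica of a data item lands on attacker-controlled nodes.

```python
import random

# Toy Sybil-attack model: probability that an attacker controlling the
# first `sybil` of `total` identities captures all `replicas` copies of
# a data item, estimated by Monte Carlo simulation. Parameters and the
# uniform-placement assumption are illustrative, not from any protocol.
def sybil_capture_rate(total, sybil, replicas, trials=20000, seed=7):
    rng = random.Random(seed)
    captured = 0
    for _ in range(trials):
        placement = rng.sample(range(total), replicas)
        if all(node < sybil for node in placement):
            captured += 1
    return captured / trials
```

Under these assumptions, the exact probability is the product (sybil/total) * ((sybil-1)/(total-1)) * ... over the number of replicas, which grows rapidly as the attacker's share of identities increases.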
Encrypted P2P and anonymous P2P networks have already emerged. They
provide viable platforms for sharing material [Tribler], publishing
content anonymously, and communicating securely [Bitmessage]. These
platforms are not perfect, and more research needs to be done. If
adopted at large, well-designed and resistant P2P networks might
represent a critical component of a future secure and distributed
Internet, enabling freedom of speech and freedom of information
at scale.
5.2.3.6. Virtual Private Networks
The VPNs discussed here are point-to-point connections that enable
two computers to communicate over an encrypted tunnel. There are
multiple implementations and protocols used in the deployment of
VPNs, and they generally differ by encryption protocol or by
particular requirements, most commonly in proprietary and enterprise
solutions. VPNs are commonly used to (1) enable some devices to
communicate through particular network configurations, (2) obtain
certain privacy and security properties that protect the traffic
generated by the end user, or both. VPNs have also become a very
popular technology among human rights defenders, dissidents, and
journalists worldwide to avoid local monitoring and, in some cases,
also to circumvent censorship. VPNs are often debated among human rights
defenders as a potential alternative to Tor or other anonymous
networks. Such comparisons are misleading, as some of the privacy
and security properties of VPNs are often misunderstood by less
tech-savvy users and could ultimately lead to unintended problems.
As VPNs have increased in popularity, commercial VPN providers have
started growing as businesses and are very commonly picked by human
rights defenders and people at risk, as they are normally provided
with an easy-to-use service and, sometimes, even custom applications
to establish the VPN tunnel. Because users cannot control the
configuration of the network, let alone the security of the
application, assessing the general privacy and security state of
common VPNs is very hard. Such services have often been discovered
to be leaking information, and their custom applications have been
found to be flawed. While Tor and similar networks receive a lot of
scrutiny from the public and the academic community, commercial or
non-commercial VPNs are far less analyzed and understood [Insinuator]
[Alshalan-etal], and it might be valuable to establish some standards
to guarantee a minimal level of privacy and security to those who
need them the most.
5.2.3.6.1. No Anonymity against VPN Providers
One of the common misconceptions among users of VPNs is the level of
anonymity that VPNs can provide. This sense of anonymity can be
betrayed by a number of attacks or misconfigurations of the VPN
provider. It is important to remember that, in contrast to Tor and
similar systems, VPNs were not designed to provide anonymity
properties. From a technical point of view, a VPN might leak
identifiable information or might be the subject of correlation
attacks that could expose the originating address of a connecting
user. Most importantly, it is vital to understand that commercial
and non-commercial VPN providers are bound by the law of the
jurisdiction in which they reside or in which their infrastructure is
located, and they might be legally forced to turn over data of
specific users if legal investigations or intelligence requirements
dictate so. In such cases, if the VPN providers retain logs, it is
possible that a user's information could be provided to the user's
adversary and lead to his or her identification.
5.2.3.6.2. Logging
Because VPNs are point-to-point connections, the service providers
are in fact able to observe the original location of connecting
users, and they can track when users started their sessions and,
possibly, also which destinations they are trying to connect to. If
the VPN providers retain logs for a long enough time,
they might be forced to turn over the relevant data or they might be
otherwise compromised, leading to the same data being exposed. A
clear log-retention policy could be adopted, but considering that
countries impose different data-retention requirements, VPN providers
should at least be transparent regarding what information they store
and for how long it is kept.
5.2.3.6.3. Third-Party Hosting
VPN providers very commonly rely on third parties to provision the
infrastructure that is later going to be used to run VPN endpoints.
For example, they might rely on external dedicated server providers
or on uplink providers. In those cases, even if the VPN provider
itself isn't retaining any significant logs, the information on
connecting users might be retained by those third parties instead,
introducing an additional collection point for the adversary.
5.2.3.6.4. IPv6 Leakage
Some studies have shown that several commercial VPN providers and
applications suffer from critical leakage of information through IPv6
due to improper support and configuration [PETS2015VPN]. This is
generally caused by improper configuration of the client's IPv6
routing tables. Considering that most popular browsers and similar
applications now support IPv6 by default, if the host has a
functional IPv6 configuration, the traffic that is generated might be
leaked if the VPN application isn't designed to handle such traffic
properly.
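The failure mode reduces to a simple set difference: any address family the host's applications will use that the tunnel does not carry is a potential leak. The following schematic sketch is the authors' own simplification; the data structures are hypothetical:

```python
# Hypothetical sketch of the [PETS2015VPN] failure mode: the VPN claims
# the IPv4 default route but leaves the IPv6 routing table untouched,
# so dual-stack applications send IPv6 traffic outside the tunnel.
def leaked_families(host_families, tunnelled_families):
    """Return the address families the host will use but the VPN
    tunnel does not carry (i.e., the potential leak surface)."""
    return sorted(set(host_families) - set(tunnelled_families))
```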
5.2.3.6.5. DNS Leakage
Similarly, VPN services that do not handle DNS requests and do not
run DNS servers of their own may be prone to DNS leaks, which might
not only expose sensitive information about a user's activity but
could also potentially lead to DNS hijacking attacks and subsequent
compromises.
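Conceptually, a DNS-leak check compares where queries actually went against the resolver the tunnel is supposed to provide. The sketch below is the authors' own illustration; the input format and the assumption of a single trusted resolver are hypothetical:

```python
# Hypothetical DNS-leak check: given observed DNS queries as
# (hostname, resolver_ip) pairs, flag any query that bypassed the
# VPN-provided resolver and thus leaked outside the tunnel.
def find_dns_leaks(observations, vpn_resolver):
    return [(name, ip) for (name, ip) in observations
            if ip != vpn_resolver]
```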
5.2.3.6.6. Traffic Correlation
Some VPN implementations appear to be particularly vulnerable to
identification of their key exchanges, which, as some Snowden
documents revealed, are systematically collected and stored for
future reference.
connections at many different points over the Internet can allow them
to perform traffic correlation attacks and identify the origin of
certain VPN traffic by cross-referencing the connection time of the
user to the endpoint and the connection time of the endpoint to the
final destination. These types of attacks, although very expensive
and normally performed only by very resourceful adversaries, have
been documented [SPIEGEL] to already be in use, and they could
completely nullify the use of a VPN and ultimately expose the
activity and identity of a user at risk.
5.2.3.7. HTTP Status Code 451
"Every Internet user has run into the '404 Not Found' Hypertext
Transfer Protocol (HTTP) status code when trying, and failing, to
access a particular website" [Cath]. It is a response status that
the server sends to the browser when the server cannot locate the
URL. "403 Forbidden" is another example of this class of codes,
which give users information about what is going on. In the "403"
case, the server can be reached but is blocking the request because
the user is trying to access content forbidden to them, typically
because some content is only for identified users, based on a payment
or on special status in the organization. Most of the time, 403 is
sent by the origin server, not by an intermediary. If a firewall
prevents a government employee from accessing pornography on a work
computer, it does not use 403.
As surveillance and censorship of the Internet became more
commonplace, voices were raised at the IETF to introduce a new status
code that indicates when something is not available for "legal
reasons" (like censorship):
The 451 status code would allow server operators to operate with
greater transparency in circumstances where issues of law or public
policy affect their operation. This transparency may be beneficial
to both (1) these operators and (2) end users [RFC7725].
The status code is named "451" in reference to both Bradbury's famous
novel "Fahrenheit 451" and to 451 degrees Fahrenheit (the temperature
at which some claim book paper autoignites).
During the IETF 92 meeting in Dallas, there was discussion about the
usefulness of 451. The main tension revolved around the lack of an
apparent machine-readable technical use of the information. The
extent to which 451 is just "political theatre" or whether it has a
concrete technical use was heatedly debated. Some argued that "the
451 status code is just a status code with a response body"; others
said it was problematic because "it brings law into the picture."
Still others argued that it would be useful for individuals or for
organizations like the "Chilling Effects" project that are crawling
the Web to get an indication of censorship (IETF discussion on 451 --
author's field notes, March 2015). There was no outright objection
during the Dallas meeting against moving forward on status code 451,
and on December 18, 2015, the IESG approved "An HTTP Status Code to
Report Legal Obstacles" (now [RFC7725]) for publication. HTTP status
code 451 is now an IETF-approved HTTP status code that signals when
resource access is denied as a consequence of legal demands.
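To show the shape of such a response, the sketch below constructs an HTTP/1.1 451 message of the kind [RFC7725] describes, including the "blocked-by" Link relation pointing at the blocking authority. The URI and explanation text are invented for illustration:

```python
# Sketch of an HTTP/1.1 451 response per [RFC7725]. The blocking
# authority URI and the explanation body are hypothetical examples.
def build_451_response(blocked_by_uri, explanation):
    body = "<html><body>" + explanation + "</body></html>"
    return (
        "HTTP/1.1 451 Unavailable For Legal Reasons\r\n"
        + "Link: <" + blocked_by_uri + '>; rel="blocked-by"\r\n'
        + "Content-Type: text/html\r\n"
        + "Content-Length: " + str(len(body)) + "\r\n"
        + "\r\n"
        + body
    )
```

A crawler measuring censorship can thus recognize legal blocking mechanically from the status line, and the Link header identifies which entity is implementing the block.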
What is interesting about this particular case is that not only
technical arguments but also the status code's outright potential
political use for civil society played a substantial role in shaping
the discussion and the decision to move forward with this technology.
It is nonetheless important to note that HTTP status code 451 is not
a solution to detect all occasions of censorship. A large swath of
Internet filtering occurs in the network, at a lower level than HTTP,
rather than at the server itself. For these forms of censorship, 451
plays a limited role, as typical censoring intermediaries won't
generate it. Technical reasons aside, such filtering regimes are
unlikely to voluntarily inject a 451 status code. The use of 451 is
most likely to apply in the case of cooperative, legal versions of
content removal resulting from requests to providers. One can think
of content that is removed or blocked for legal reasons, like
copyright infringement, gambling laws, child abuse, etc. Large
Internet companies and search engines are constantly asked to censor
content in various jurisdictions. 451 allows this to be easily
discovered -- for instance, by initiatives like the Lumen Database.
Overall, the strength of 451 lies in its ability to provide
transparency by giving the reason for blocking and giving the
end user the ability to file a complaint. It allows organizations to
easily measure censorship in an automated way and prompts the user to
access the content via another path (e.g., Tor, VPNs) when (s)he
encounters the 451 status code.
Status code 451 impacts human rights by making censorship more
transparent and measurable. It increases transparency by signaling
the existence of censorship (instead of a much broader HTTP error
message such as HTTP status code 404) as well as providing details of
the legal restriction, which legal authority is imposing it, and to
what class of resources it applies. This empowers the user to seek
redress.
5.2.3.8. DDoS Attacks
Many individuals, including IETF engineers, have argued that DDoS
attacks are fundamentally against freedom of expression.
Technically, DDoS attacks are attacks where one host or multiple
hosts overload the bandwidth or resources of another host by flooding
it with traffic or making resource-intensive requests, causing it to
temporarily stop being available to users. One can roughly
differentiate three types of DDoS attacks:
1. volume-based attacks (which aim to make the host unreachable by
using up all its bandwidth; often-used techniques are UDP floods
and ICMP floods)
2. protocol attacks (which aim to use up actual server resources;
often-used techniques are SYN floods, fragmented packet attacks,
and "ping of death" [RFC4949])
3. application-layer attacks (which aim to bring down a server, such
as a web server)
DDoS attacks can thus stifle freedom of expression and complicate the
ability of independent media and human rights organizations to
exercise their right to (online) freedom of association, while
facilitating the ability of governments to censor dissent. When it
comes to comparing DDoS attacks to protests in offline life, it is
important to remember that only a limited number of DDoS attacks
solely involved willing participants. In the overwhelming majority
of cases, the clients are hacked hosts of unrelated parties that
have not consented to being part of a DDoS (for exceptions, see
Operation Ababil [Ababil] or the Iranian Green Movement's DDoS
campaign at election time [GreenMovement]). In addition,
DDoS attacks are increasingly used as an extortion tactic.
All of these issues seem to suggest that the IETF should try to
ensure that their protocols cannot be used for DDoS attacks; this is
consistent with the long-standing IETF consensus that DDoS is an
attack that protocols should mitigate to the extent they can [BCP72].
Decreasing the number of vulnerabilities in protocols and (outside of
the IETF) the number of bugs in the network stacks of routers or
computers could address this issue. The IETF can clearly play a role
in bringing about some of these changes, but the IETF cannot be
expected to take a positive stance on (specific) DDoS attacks or to
create protocols that enable some attacks and inhibit others. What
the IETF can do is critically reflect on its role in the development
of the Internet and how this impacts the ability of people to
exercise their human rights, such as freedom of expression.