Terms in this document are defined in-line when used and are also
defined below for reference. The definitions in this section use
terminology and concepts that are not explained until later in the
Admitting Peer (AP): A peer in the overlay which helps the Joining
Node join the Overlay.
Bootstrap Node: A network node used by Joining Nodes to help locate
the Admitting Peer.
Client: A host that is able to store data in and retrieve data from
the overlay, but does not participate in routing or data storage
for the overlay.
Configuration Document: An XML document containing all the Overlay
Parameters for one overlay instance.
Connection Table: Contains connection information for the set of
nodes to which a node is directly connected, which include nodes
that are not yet available for routing.
Destination List: A list of Node-IDs, Resource-IDs, and Opaque IDs
through which a message is to be routed, in strict order. A
single Node-ID, Resource-ID, or Opaque ID is a trivial form of
Destination List. When multiple Node-IDs are specified, a
Destination List is a loose source route. The list is reduced hop
by hop, and does not include the source but does include the
DHT: A distributed hash table. A DHT is an abstract storage service
realized by storing the contents of the hash table across a set of
ID: A generic term for any kind of identifiers in an Overlay. This
document specifies an ID as being an Application-ID, a Kind-ID, a
Node-ID, a transaction ID, a component ID, a response ID, a
Resource-ID, or an Opaque ID.
Joining Node (JN): A node that is attempting to become a peer in a
Kind: A Kind defines a particular type of data that can be stored in
the overlay. Applications define new Kinds to store the data they
use. Each Kind is identified with a unique integer called a
Kind-ID: A unique 32-bit value identifying a Kind. Kind-IDs are
either private or allocated by IANA (see Section 14.6).
Maximum Request Lifetime: The maximum time a request will wait for a
response. This value is equal to the value of the overlay
reliability value (defined in Section 11.1) multiplied by the
number of transmissions (defined in Section 6.2.1), and so
defaults to 15 seconds.
Node: The term "node" refers to a host that may be either a peer or
a client. Because RELOAD uses the same protocol for both clients
and peers, much of the text applies equally to both. Therefore,
we use "node" when the text applies to both clients and peers, and
we use the more specific term (i.e., "client" or "peer") when the
text applies only to clients or only to peers.
Node-ID: A value of fixed but configurable length that uniquely
identifies a node. Node-IDs of all 0s and all 1s are reserved. A
value of 0 is not used in the wire protocol, but can be used to
indicate an invalid node in implementations and APIs. The Node-ID
of all 1s is used on the wire protocol as a wildcard.
Overlay Algorithm: An overlay algorithm defines the rules for
determining which peers in an overlay store a particular piece of
data and for determining a topology of interconnections amongst
peers in order to find a piece of data.
Overlay Instance: A specific overlay algorithm and the collection of
peers that are collaborating to provide read and write access to
it. Any number of overlay instances can be running in an IP
network at a time, and each operates in isolation of the others.
Overlay Parameters: A set of values that are shared among all nodes
in an overlay. The overlay parameters are distributed in an XML
document called the Configuration Document.
Peer: A host that is participating in the overlay. Peers are
responsible for holding some portion of the data that has been
stored in the overlay, and they are responsible for routing
messages on behalf of other hosts as needed by the Overlay
Peer Admission: The act of admitting a node (the Joining Node) into
an Overlay. After the admission process is over, the Joining Node
is a fully functional peer of the overlay. During the admission
process, the Joining Node may need to present credentials to prove
that it has sufficient authority to join the overlay.
Resource: An object or group of objects stored in a P2P network.
Resource-ID: A value that identifies some resources and which is
used as a key for storing and retrieving the resource. Often this
is not human friendly/readable. One way to generate a Resource-ID
is by applying a mapping function to some other unique name (e.g.,
user name or service name) for the resource. The Resource-ID is
used by the distributed database algorithm to determine the peer
or peers that are responsible for storing the data for the
overlay. In structured P2P networks, Resource-IDs are generally
fixed length and are formed by hashing the Resource Name. In
unstructured networks, Resource Names may be used directly as
Resource-IDs and may be of variable length.
Resource Name: The name by which a resource is identified. In
unstructured P2P networks, the Resource Name is sometimes used
directly as a Resource-ID. In structured P2P networks, the
Resource Name is typically mapped into a Resource-ID by using the
string as the input to hash function. Structured and unstructured
P2P networks are described in [RFC5694]. A SIP resource, for
example, is often identified by its AOR (address-of-record), which
is an example of a Resource Name.
Responsible Peer: The peer that is responsible for a specific
resource, as defined by the Topology Plug-in algorithm.
Routing Table: The set of directly connected peers which a node can
use to forward overlay messages. In normal operation, these peers
will all be in the Connection Table, but not vice versa, because
some peers may not yet be available for routing. Peers may send
messages directly to peers that are in their Connection Tables,
but may forward messages to peers that are not in their Connection
Table only through peers that are in the Routing Table.
Successor Replacement Hold-Down Time: The amount of time to wait
before starting replication when a new successor is found; it
defaults to 30 seconds.
Transaction ID: A randomly chosen identifier selected by the
originator of a request that is used to correlate requests and
Usage: The definition of a set of data structures (data Kinds) that
an application wants to store in the overlay. A usage may also
define a set of network protocols (Application IDs) that can be
tunneled over TLS or DTLS direct connections between nodes. For
example, the SIP Usage defines a SIP registration data Kind, which
contains information on how to reach a SIP endpoint, and two
Application IDs corresponding to the SIP and SIPS protocols.
User: A physical person identified by the certificates assigned to
User Name: A name identifying a user of the overlay, typically used
as a Resource Name or as a label on a resource that identifies the
user owning the resource.
4. Overlay Management Overview
The most basic function of RELOAD is as a generic overlay network.
Nodes need to be able to join the overlay, form connections to other
nodes, and route messages through the overlay to nodes to which they
are not directly connected. This section provides an overview of the
mechanisms that perform these functions.
4.1. Security and Identification
The overlay parameters are specified in a Configuration Document.
Because the parameters include security-critical information, such as
the certificate signing trust anchors, the Configuration Document
needs to be retrieved securely. The initial Configuration Document
is either initially fetched over HTTPS or manually provisioned.
Subsequent Configuration Document updates are received either as a
result of being refreshed periodically by the configuration server,
or, more commonly, by being flood-filled through the overlay, which
allows for fast propagation once an update is pushed. In the latter
case, updates are via digital signatures that trace back to the
initial Configuration Document.
Every node in the RELOAD overlay is identified by a Node-ID. The
Node-ID is used for three major purposes:
o To address the node itself.
o To determine the node's position in the overlay topology (if the
overlay is structured; overlays do not need to be structured).
o To determine the set of resources for which the node is
Each node has a certificate [RFC5280] containing its Node-ID in a
subjectAltName extension, which is unique within an overlay instance.
The certificate serves multiple purposes:
o It entitles the user to store data at specific locations in the
Overlay Instance. Each data Kind defines the specific rules for
determining which certificates can access each Resource-ID/Kind-ID
pair. For instance, some Kinds might allow anyone to write at a
given location, whereas others might restrict writes to the owner
of a single certificate.
o It entitles the user to operate a node that has a Node-ID found in
the certificate. When the node forms a connection to another
peer, it uses this certificate so that a node connecting to it
knows it is connected to the correct node. (Technically, a TLS or
DTLS association with client authentication is formed.) In
addition, the node can sign messages, thus providing integrity and
authentication for messages which are sent from the node.
o It entitles the user to use the user name found in the
If a user has more than one device, typically they would get one
certificate for each device. This allows each device to act as a
RELOAD supports multiple certificate issuance models. The first is
based on a central enrollment process, which allocates a unique name
and Node-ID and puts them in a certificate for the user. All peers
in a particular Overlay Instance have the enrollment server as a
trust anchor and so can verify any other peer's certificate.
The second model is useful in settings, when a group of users want to
set up an overlay network but are not concerned about attack by other
users in the network. For instance, users on a LAN might want to set
up a short-term ad hoc network without going to the trouble of
setting up an enrollment server. RELOAD supports the use of self-
generated, self-signed certificates. When self-signed certificates
are used, the node also generates its own Node-ID and user name. The
Node-ID is computed as a digest of the public key, to prevent Node-ID
theft. Note that the relevant cryptographic property for the digest
is partial preimage resistance. Collision resistance is not needed,
because an attacker who can create two nodes with the same Node-ID
but a different public key obtains no advantage. This model is still
subject to a number of known attacks (most notably, Sybil attacks
[Sybil]) and can be safely used only in closed networks where users
are mutually trusting. Another drawback of this approach is that the
user's data is then tied to their key, so if a key is changed, any
data stored under their Node-ID needs to be re-stored. This is not
an issue for centrally issued Node-IDs provided that the
Certification Authority (CA) reissues the same Node-ID when a new
certificate is generated.
The general principle here is that the security mechanisms (TLS or
DTLS at the data link layer and message signatures at the message
transport layer) are always used, even if the certificates are self-
signed. This allows for a single set of code paths in the systems,
with the only difference being whether certificate verification is
used to chain to a single root of trust.
4.1.1. Shared-Key Security
RELOAD also provides an admission control system based on shared
keys. In this model, the peers all share a single key which is used
to authenticate the peer-to-peer connections via TLS-PSK [RFC4279] or
RELOAD defines a single protocol that is used both as the peer
protocol and as the client protocol for the overlay. Having a single
protocol simplifies implementation, particularly for devices that may
act in either role, and allows clients to inject messages directly
into the overlay.
We use the term "peer" to identify a node in the overlay that routes
messages for nodes other than those to which it is directly
connected. Peers also have storage responsibilities. We use the
term "client" to refer to nodes that do not have routing or storage
responsibilities. When text applies to both peers and clients, we
will simply refer to such devices as "nodes".
RELOAD's client support allows nodes that are not participating in
the overlay as peers to utilize the same implementation and to
benefit from the same security mechanisms as the peers. Clients
possess and use certificates that authorize the user to store data at
certain locations in the overlay. The Node-ID in the certificate is
used to identify the particular client as a member of the overlay and
to authenticate its messages.
In RELOAD, unlike some other designs, clients are not first-class
entities. From the perspective of a peer, a client is a node that
has connected to the overlay, but that has not yet taken steps to
insert itself into the overlay topology. It might never do so (if
it's a client), or it might eventually do so (if it's just a node
that is taking a long time to join). The routing and storage rules
for RELOAD provide for correct behavior by peers regardless of
whether other nodes attached to them are clients or peers. Of
course, a client implementation needs to know that it intends to be a
client, but this localizes complexity only to that node.
For more discussion about the motivation for RELOAD's client support,
see Appendix B.
4.2.1. Client Routing
Clients may insert themselves in the overlay in two ways:
o Establish a connection to the peer responsible for the client's
Node-ID in the overlay. Then, requests may be sent from/to the
client using its Node-ID in the same manner as if it were a peer,
because the responsible peer in the overlay will handle the final
step of routing to the client. This may require a TURN [RFC5766]
relay in cases where NATs or firewalls prevent a client from
forming a direct connection with its responsible peer. Note that
clients that choose this option need to process Update messages
from the peer (Section 126.96.36.199). These updates can indicate that
the peer is no longer responsible for the client's Node-ID. The
client would then need to form a connection to the appropriate
peer. Failure to do so will result in the client no longer
o Establish a connection with an arbitrary peer in the overlay
(perhaps based on network proximity or an inability to establish a
direct connection with the responsible peer). In this case, the
client will rely on RELOAD's Destination List feature
(Section 188.8.131.52) to ensure reachability. The client can initiate
requests, and any node in the overlay that knows the Destination
List to its current location can reach it, but the client is not
directly reachable using only its Node-ID. If the client is to
receive incoming requests from other members of the overlay, the
Destination List needed to reach the client needs to be learnable
via other mechanisms, such as being stored in the overlay by a
usage. A client connected this way using a certificate with only
a single Node-ID can proceed to use the connection without
performing an Attach (Section 6.5.1). A client wishing to connect
using this mechanism with a certificate with multiple Node-IDs can
use a Ping (Section 6.5.3) to probe the Node-ID of the node to
which it is connected before performing the Attach.
4.2.2. Minimum Functionality Requirements for Clients
A node may act as a client simply because it does not have the
capacity or need to act as a peer in the overlay, or because it does
not even have an implementation of the Topology Plug-in defined in
Section 6.4.1, needed to act as a peer in the overlay. In order to
exchange RELOAD messages with a peer, a client needs to meet a
minimum level of functionality. Such a client will:
o Implement RELOAD's connection-management operations that are used
to establish the connection with the peer.
o Implement RELOAD's data retrieval methods (with client
o Be able to calculate Resource-IDs used by the overlay.
o Possess security credentials needed by the overlay that it is
A client speaks the same protocol as the peers, knows how to
calculate Resource-IDs, and signs its requests in the same manner as
peers. While a client does not necessarily require a full
implementation of the overlay algorithm, calculating the Resource-ID
requires an implementation of an appropriate algorithm for the
This section discusses the capabilities of RELOAD's routing layer and
the protocol features used to implement the capabilities, and
provides a brief overview of how they are used. Appendix A discusses
some alternative designs and the trade-offs that would be necessary
to support them.
RELOAD's routing provides the following capabilities:
Resource-based Routing: RELOAD supports routing messages based
solely on the name of the resource. Such messages are delivered
to a node that is responsible for that resource. Both structured
and unstructured overlays are supported, so the route may not be
deterministic for all Topology Plug-ins.
Node-based Routing: RELOAD supports routing messages to a specific
node in the overlay.
Clients: RELOAD supports requests from and to clients that do not
participate in overlay routing. The clients are located via
either of the mechanisms described above.
NAT Traversal: RELOAD supports establishing and using connections
between nodes separated by one or more NATs, including locating
peers behind NATs for those overlays allowing/requiring it.
Low State: RELOAD's routing algorithms do not require significant
state (i.e., state linear or greater in the number of outstanding
messages that have passed through it) to be stored on intermediate
Routability in Unstable Topologies: Overlay topology changes
constantly in an overlay of moderate size due to the failure of
individual nodes and links in the system. RELOAD's routing allows
peers to reroute messages when a failure is detected, and replies
can be returned to the requesting node as long as the peers that
originally forwarded the successful request do not fail before the
response is returned.
RELOAD's routing utilizes three basic mechanisms:
Destination Lists: While, in principle, it is possible to just
inject a message into the overlay with a single Node-ID as the
destination, RELOAD provides a source-routing capability in the
form of "Destination Lists". A Destination List provides a list
of the nodes through which a message flows in order (i.e., it is
loose source routed). The minimal Destination List contains just
a single value.
Via Lists: In order to allow responses to follow the same path as
requests, each message also contains a "Via List", which is
appended to by each node a message traverses. This Via List can
then be inverted and used as a Destination List for the response.
RouteQuery: The RouteQuery method allows a node to query a peer for
the next hop it will use to route a message. This method is
useful for diagnostics and for iterative routing (see
The basic routing mechanism that RELOAD uses is symmetric recursive.
We will first describe symmetric recursive routing and then discuss
its advantages in terms of the requirements discussed above.
Symmetric recursive routing requires that a request message follow a
path through the overlay to the destination: each peer forwards the
message closer to its destination. The return path of the response
goes through the same nodes as the request (though it may also go
through some new intermediate nodes due to topology changes). Note
that a failure on the reverse path caused by a topology change after
the request was sent will be handled by the end-to-end retransmission
of the response as described in Section 6.2.1. For example, the
following figure shows a message following a route from A to Z
through B and X:
A B X Z
Note that this figure does not indicate whether A is a client or
peer. A forwards its request to B, and the response is returned to A
in the same manner regardless of A's role in the overlay.
This figure shows use of full Via Lists by intermediate peers B and
X. However, if B and/or X are willing to store state, then they may
elect to truncate the lists and save the truncated information
internally using the transaction ID as a key to allow it to be
retrieved later. Later, when the response message arrives, the
transaction ID would be used to recover the truncated information and
return the response message along the path from which the request
arrived. This option requires a greater amount of state to be stored
on intermediate peers, but saves a small amount of bandwidth and
reduces the need for modifying the message en route. Selection of
this mode of operation is a choice for the individual peer; the
techniques are interoperable even on a single message. The figure
below shows B using full Via Lists, but X truncating them to X1 and
saving the state internally.
A B X Z
As before, when B receives the message, B creates a Via List
consisting of [A]. However, instead of sending [A, B], X creates an
opaque ID X1 which maps internally to [A, B] (perhaps by being an
encryption of [A, B]) and then forwards to Z with only X1 as the Via
List. When the response arrives at X, it maps X1 back to [A, B],
then inverts it to produce the new Destination List [B, A], and
finally routes it to B.
RELOAD also supports a basic iterative "routing" mode, in which the
intermediate peers merely return a response indicating the next hop,
but do not actually forward the message to that next hop themselves.
Iterative routing is implemented using the RouteQuery method (see
Section 184.108.40.206), which requests this behavior. Note that iterative
routing is selected only by the initiating node.
4.4. Connectivity Management
In order to provide efficient routing, a peer needs to maintain a set
of direct connections to other peers in the Overlay Instance. Due to
the presence of NATs, these connections often cannot be formed
directly. Instead, we use the Attach request to establish a
connection. Attach uses Interactive Connectivity Establishment (ICE)
[RFC5245] to establish the connection. It is assumed that the reader
is familiar with ICE.
Say that peer A wishes to form a direct connection to peer B, either
to join the overlay or to add more connections in its Routing Table.
It gathers ICE candidates and packages them up in an Attach request,
which it sends to B through usual overlay routing procedures. B does
its own candidate gathering and sends back a response with its
candidates. A and B then do ICE connectivity checks on the candidate
pairs. The result is a connection between A and B. At this point, A
and B MAY send messages directly between themselves without going
through other overlay peers. In other words, A and B are in each
other's Connection Tables. They MAY then execute an Update process,
resulting in additions to each other's Routing Tables, and may then
become able to route messages through each other to other overlay
There are two cases where Attach is not used. The first is when a
peer is joining the overlay and is not connected to any peers. In
order to support this case, a small number of bootstrap nodes
typically need to be publicly accessible so that new peers can
directly connect to them. Section 11 contains more detail on this.
The second case is when a client connects to a peer at an arbitrary
IP address, rather than to its responsible peer, as described in the
second bullet point of Section 4.2.1.
In general, a peer needs to maintain connections to all of the peers
near it in the Overlay Instance and to enough other peers to have
efficient routing (the details on what "enough" and "near" mean
depend on the specific overlay). If a peer cannot form a connection
to some other peer, this is not necessarily a disaster; overlays can
route correctly even without fully connected links. However, a peer
needs to try to maintain the specified Routing Table defined by the
Topology Plug-in algorithm and needs to form new connections if it
detects that it has fewer direct connections than specified by the
algorithm. This also implies that peers, in accordance with the
Topology Plug-in algorithm, need to periodically verify that the
connected peers are still alive and, if not, need to try to re-form
the connections or form alternate ones. See Section 10.7.4.3 for an
example on how a specific overlay algorithm implements these
4.5. Overlay Algorithm Support
The Topology Plug-in allows RELOAD to support a variety of overlay
algorithms. This specification defines a DHT based on Chord, which
is mandatory to implement, but the base RELOAD protocol is designed
to support a variety of overlay algorithms. The information needed
to implement this DHT is fully contained in this specification, but
it is easier to understand if you are familiar with Chord-based
[Chord] DHTs. A nice tutorial can be found at [wikiChord].
4.5.1. Support for Pluggable Overlay Algorithms
RELOAD defines three methods for overlay maintenance: Join, Update,
and Leave. However, the contents of these messages, when they are
sent, and their precise semantics are specified by the actual overlay
algorithm, which is specified by configuration for all nodes in the
overlay and thus is known to nodes before they attempt to join the
overlay. RELOAD merely provides a framework of commonly needed
methods that provide uniformity of notation (and ease of debugging)
for a variety of overlay algorithms.
4.5.2. Joining, Leaving, and Maintenance Overview
When a new peer wishes to join the Overlay Instance, it will need a
Node-ID that it is allowed to use and a set of credentials which
match that Node-ID. When an enrollment server is used, the Node-ID
used is the one found in the certificate received from the enrollment
server. The details of the joining procedure are defined by the
overlay algorithm, but the general steps for joining an Overlay
o Form connections to some other peers.
o Acquire the data values this peer is responsible for storing.
o Inform the other peers which were previously responsible for that
data that this peer has taken over responsibility.
The first thing the peer needs to do is to form a connection to some
bootstrap node. Because this is the first connection the peer makes,
these nodes will need public IP addresses so that they can be
connected to directly. Once a peer has connected to one or more
bootstrap nodes, it can form connections in the usual way, by routing
Attach messages through the overlay to other nodes. After a peer has
connected to the overlay for the first time, it can cache the set of
past adjacencies which have public IP addresses and can attempt to
use them as future bootstrap nodes. Note that this requires some
notion of which addresses are likely to be public as discussed in
After a peer has connected to a bootstrap node, it then needs to take
up its appropriate place in the overlay. This requires two major
o Form connections to other peers in the overlay to populate its
o Get a copy of the data it is now responsible for storing, and
assume responsibility for that data.
The second operation is performed by contacting the Admitting Peer
(AP), the node which is currently responsible for the relevant
section of the overlay.
The details of this operation depend mostly on the overlay algorithm
involved, but a typical case would be:
1. JN sends a Join request to AP announcing its intention to join.
2. AP sends a Join response.
3. AP does a sequence of Stores to JN to give it the data it will
4. AP does Updates to JN and to other peers to tell them about its
own Routing Table. At this point, both JN and AP consider JN
responsible for some section of the Overlay Instance.
5. JN makes its own connections to the appropriate peers in the
After this process completes, JN is a full member of the Overlay
Instance and can process Store/Fetch requests.
Note that the first node is a special case. When ordinary nodes
cannot form connections to the bootstrap nodes, then they are not
part of the overlay. However, the first node in the overlay can
obviously not connect to other nodes. In order to support this case,
potential first nodes (which can also initially serve as bootstrap
nodes) need to somehow be instructed that they are the entire
overlay, rather than part of an existing overlay (e.g., by comparing
their IP address to the bootstrap IP addresses in the configuration
Note that clients do not perform either of these operations.
4.6. First-Time Setup
Previous sections addressed how RELOAD works after a node has
connected. This section provides an overview of how users get
connected to the overlay for the first time. RELOAD is designed so
that users can start with the name of the overlay they wish to join
and perhaps an account name and password, and can leverage these into
having a working peer with minimal user intervention. This helps
avoid the problems that have been experienced with conventional SIP
clients in which users need to manually configure a large number of
4.6.1. Initial Configuration
In the first phase of the setup process, the user starts with the
name of the overlay and uses it to download an initial set of overlay
configuration parameters. The node does a DNS SRV [RFC2782] lookup
on the overlay name to get the address of a configuration server. It
can then connect to this server with HTTPS [RFC2818] to download a
Configuration Document which contains the basic overlay configuration
parameters as well as a set of bootstrap nodes which can be used to
join the overlay. The details of the relationships between names in
the HTTPS certificates and the overlay names are described in
If a node already has the valid Configuration Document that it
received by an out-of-band method, this step can be skipped. Note
that this out-of-band method needs to provide authentication and
integrity, because the Configuration Document contains the trust
anchors used by the overlay.
If the overlay is using centralized enrollment, then a user needs to
acquire a certificate before joining the overlay. The certificate
attests both to the user's name within the overlay and to the
Node-IDs which they are permitted to operate. In this case, the
Configuration Document will contain the address of an enrollment
server which can be used to obtain such a certificate and will also
contain the trust anchor, so this document must be retrieved securely
(see Section 11.2). The enrollment server may (and probably will)
require some sort of account name for the user and a password before
issuing the certificate. The enrollment server's ability to ensure
attackers cannot get a large number of certificates for the overlay
is one of the cornerstones of RELOAD's security.
Significant advice around managing a RELOAD overlay and extensions
for diagnostics are described in [P2P-DIAGNOSTICS].
5. Application Support Overview
RELOAD is not intended to be used alone, but rather as a substrate
for other applications. These applications can use RELOAD for a
variety of purposes:
o To store data in the overlay and to retrieve data stored by other
o As a discovery mechanism for services such as TURN.
o To form direct connections which can be used to transmit
application-level messages without using the overlay.
This section provides an overview of these services.
5.1. Data Storage
RELOAD provides operations to Store and Fetch data. Each location in
the Overlay Instance is referenced by a Resource-ID. However, each
location may contain data elements corresponding to multiple Kinds
(e.g., certificate and SIP registration). Similarly, there may be
multiple elements of a given Kind, as shown below:
| Resource-ID |
| +------------+ +------------+ |
| | Kind 1 | | Kind 2 | |
| | | | | |
| | +--------+ | | +--------+ | |
| | | Value | | | | Value | | |
| | +--------+ | | +--------+ | |
| | | | | |
| | +--------+ | | +--------+ | |
| | | Value | | | | Value | | |
| | +--------+ | | +--------+ | |
| | | +------------+ |
| | +--------+ | |
| | | Value | | |
| | +--------+ | |
| +------------+ |
Each Kind is identified by a Kind-ID, which is a code point either
assigned by IANA or allocated out of a private range. As part of the
Kind definition, protocol designers may define constraints (such as
limits on size) on the values which may be stored. For many Kinds,
the set may be restricted to a single value, while some sets may be
allowed to contain multiple identical items, and others may have only
unique items. Note that a Kind may be employed by multiple usages,
and new usages are encouraged to use previously defined Kinds where
possible. We define the following data models in this document,
although other usages can define their own structures:
single value: There can be at most one item in the set, and any
value overwrites the previous item.
array: Many values can be stored and addressed by a numeric index.
dictionary: The values stored are indexed by a key. Often, this key
is one of the values from the certificate of the peer sending the
In order to protect stored data from tampering by other nodes, each
stored value is individually digitally signed by the node which
created it. When a value is retrieved, the digital signature can be
verified to detect tampering. If the certificate used to verify the
stored value signature expires, the value can no longer be retrieved
(although it may not be immediately garbage collected by the storing
node), and the creating node will need to store the value again if it
desires that the stored value continue to be available.
5.1.1. Storage Permissions
A major issue in peer-to-peer storage networks is minimizing the
burden of becoming a peer and, in particular, minimizing the amount
of data which any peer needs to store for other nodes. RELOAD
addresses this issue by allowing any given node to store data only at
a small number of locations in the overlay, with those locations
being determined by the node's certificate. When a peer uses a Store
request to place data at a location authorized by its certificate, it
signs that data with the private key that corresponds to its
certificate. Then the peer responsible for storing the data is able
to verify that the peer issuing the request is authorized to make
that request. Each data Kind defines the exact rules for determining
what certificate is appropriate.
The most natural rule is that a certificate authorizes a user to
store data keyed with their user name X. Thus, only a user with a
certificate for "firstname.lastname@example.org" could write to that location in
the overlay (see Section 11.3). However, other usages can define any
rules they choose, including publicly writable values.
The digital signature over the data serves two purposes. First, it
allows the peer responsible for storing the data to verify that this
Store is authorized. Second, it provides integrity for the data.
The signature is saved along with the data value (or values) so that
any reader can verify the integrity of the data. Of course, the
responsible peer can "lose" the value, but it cannot undetectably
The size requirements of the data being stored in the overlay are
variable. For instance, a SIP AOR and voicemail differ widely in the
storage size. RELOAD leaves it to the usage and overlay
configuration to limit size imbalances of various Kinds.
Replication in P2P overlays can be used to provide:
persistence: if the responsible peer crashes and/or if the storing
peer leaves the overlay
security: to guard against DoS attacks by the responsible peer or
routing attacks to that responsible peer
load balancing: to balance the load of queries for popular resources
A variety of schemes are used in P2P overlays to achieve some of
these goals. Common techniques include replicating on neighbors of
the responsible peer, randomly locating replicas around the overlay,
and replicating along the path to the responsible peer.
The core RELOAD specification does not specify a particular
replication strategy. Instead, the first level of replication
strategies is determined by the overlay algorithm, which can base the
replication strategy on its particular topology. For example, Chord
places replicas on successor peers, which will take over
responsibility if the responsible peer fails [Chord].
If additional replication is needed, for example, if data persistence
is particularly important for a particular usage, then that usage may
specify additional replication, such as implementing random
replications by inserting a different well-known constant into the
Resource Name used to store each replicated copy of the resource.
Such replication strategies can be added independently of the
underlying algorithm, and their usage can be determined based on the
needs of the particular usage.
By itself, the distributed storage layer provides only the
infrastructure on which applications are built. In order to do
anything useful, a usage needs to be defined. Each usage needs to
specify several things:
o Register Kind-ID code points for any Kinds that the usage defines
o Define the data structure for each of the Kinds (the value member
in Section 7.2). If the data structure contains character
strings, conversion rules between characters and the binary
storage need to be specified.
o Define access control rules for each of the Kinds (Section 7.3).
o Define how the Resource Name is used to form the Resource-ID where
each Kind is stored.
o Describe how values will be merged when a network partition is
The Kinds defined by a usage may also be applied to other usages.
However, a need for different parameters, such as a different access
control model, would imply the need to create a new Kind.
5.3. Service Discovery
RELOAD does not currently define a generic service discovery
algorithm as part of the base protocol, although a simplistic TURN-
specific discovery mechanism is provided. A variety of service
discovery algorithms can be implemented as extensions to the base
protocol, such as the service discovery algorithm ReDIR
[opendht-sigcomm05] and [REDIR-RELOAD].
5.4. Application Connectivity
There is no requirement that a RELOAD Usage needs to use RELOAD's
primitives for establishing its own communication if it already
possesses its own means of establishing connections. For example,
one could design a RELOAD-based resource discovery protocol which
used HTTP to retrieve the actual data.
For more common situations, however, it is the overlay itself --
rather than an external authority such as DNS -- which is used to
establish a connection. RELOAD provides connectivity to applications
using the AppAttach method. For example, if a P2PSIP node wishes to
establish a SIP dialog with another P2PSIP node, it will use
AppAttach to establish a direct connection with the other node. This
new connection is separate from the peer protocol connection. It is
a dedicated DTLS or TLS flow used only for the SIP dialog.
6. Overlay Management Protocol
This section defines the basic protocols used to create, maintain,
and use the RELOAD overlay network. We start by defining the basic
concept of how message destinations are interpreted when routing
messages. We then describe the symmetric recursive routing model,
which is RELOAD's default routing algorithm. Finally, we define the
message structure and the messages used to join and maintain the
6.1. Message Receipt and Forwarding
When a node receives a message, it first examines the overlay,
version, and other header fields to determine whether the message is
one it can process. If any of these are incorrect, as defined in
Section 6.3.2, it is an error and the message MUST be discarded. The
peer SHOULD generate an appropriate error, but local policy can
override this and cause the message to be silently dropped.
Once the peer has determined that the message is correctly formatted
(note that this does not include signature-checking on intermediate
nodes as the message may be fragmented), it examines the first entry
on the Destination List. There are three possible cases here:
o The first entry on the Destination List is an ID for which the
peer is responsible. A peer is always responsible for the
wildcard Node-ID. Handling of this case is described in
o The first entry on the Destination List is an ID for which another
peer is responsible. Handling of this case is described in
o The first entry on the Destination List is an opaque ID that is
being used for Destination List compression. Handling of this
case is described in Section 6.1.3. Note that opaque IDs can be
distinguished from Node-IDs and Resource-IDs on the wire as
described in Section 220.127.116.11.
These cases are handled as discussed below.
6.1.1. Responsible ID
If the first entry on the Destination List is an ID for which the
peer is responsible, there are several (mutually exclusive) subcases
o If the entry is a Resource-ID, then it MUST be the only entry on
the Destination List. If there are other entries, the message
MUST be silently dropped. Otherwise, the message is destined for
this node, so the node MUST verify the signature as described in
Section 7.1 and MUST pass it to the upper layers. "Upper layers"
is used here to mean the components above the "Overlay Link
Service Boundary" line in the figure in Section 1.2.
o If the entry is a Node-ID which equals this node's Node-ID, then
the message is destined for this node. If it is the only entry on
the Destination List, the message is destined for this node and so
the node passes it to the upper layers. Otherwise, the node
removes the entry from the Destination List and repeats the
routing process with the next entry on the Destination List. If
the message is a response and list compression was used, then the
node first modifies the Destination List to reinsert the saved
state, e.g., by unpacking any opaque IDs.
o If the entry is the wildcard Node-ID (all "1"s), the message is
destined for this node, and the node passes the message to the
upper layers. A message with a wildcard Node-ID as its first
entry is never forwarded; it is consumed locally.
o If the entry is a Node-ID which is not equal to this node, then
the node MUST drop the message silently unless the Node-ID
corresponds to a node which is directly connected to this node
(i.e., a client). In the latter case, the node MUST attempt to
forward the message to the destination node as described in the
next section (though this may fail for connectivity reasons,
because the TTL has expired, or because of some other error.)
Note that this process implies that in order to address a message to
"the peer that controls region X", a sender sends to Resource-ID X,
not Node-ID X.
6.1.2. Other ID
If the first entry on the Destination List is neither an opaque ID
nor an ID the peer is responsible for, then the peer MUST forward the
message towards that entry. This means that it MUST select one of
the peers to which it is connected and which is most likely to be
responsible (according to the Topology Plug-in) for the first entry
on the Destination List. For the CHORD-RELOAD topology, the routing
to the most likely responsible node is explained in Section 10.3. If
the first entry on the Destination List is in the peer's Connection
Table, the peer MUST forward the message to that peer directly.
Otherwise, the peer consults the Routing Table to forward the
Any intermediate peer which forwards a RELOAD request MUST ensure
that if it receives a response to that message, the response can be
routed back through the set of nodes through which the request
passed. The peer selects one of these approaches:
o The peer can add an entry to the Via List in the forwarding header
that will enable it to determine the correct node. This is done
by appending to the Via List the Node-ID of the node from which
the request was received.
o The peer can keep per-transaction state which will allow it to
determine the correct node.
As an example of the first strategy, consider an example with nodes
A, B, C, D, and E. If node D receives a message from node C with Via
List [A, B], then D would forward to the next node E with Via List
[A, B, C]. Now, if E wants to respond to the message, it reverses
the Via List to produce the Destination List, resulting in
[D, C, B, A]. When D forwards the response to C, the Destination
List will contain [C, B, A].
As an example of the second strategy, if node D receives a message
from node C with transaction ID X (as assigned by A) and Via List
[A, B], it could store [X, C] in its state database and forward the
message with the Via List unchanged. When D receives the response,
it consults its state database for transaction ID X, determines that
the request came from C, and forwards the response to C.
Intermediate peers which modify the Via List are not required to
simply add entries. The only requirement is that the peer MUST be
able to reconstruct the correct Destination List on the return route.
RELOAD provides explicit support for this functionality in the form
of opaque IDs, which can replace any number of Via List entries.
For instance, in the above example, Node D might send E a Via List
containing only the opaque ID I. E would then use the Destination
List [D, I] to send its return message. When D processes this
Destination List, it would detect that I is an opaque ID, recover the
Via List [A, B, C], and reverse that to produce the correct
Destination List [C, B, A] before sending it to C. This feature is
called "list compression". Possibilities for an opaque ID include a
compressed and/or encrypted version of the original Via List and an
index into a state database containing the original Via List, but the
details are a local matter.
No matter what mechanism for storing Via List state is used, if an
intermediate peer exits the overlay, then on the return trip the
message cannot be forwarded and will be dropped. The ordinary
timeout and retransmission mechanisms provide stability over this
type of failure.
Note that if an intermediate peer retains per-transaction state
instead of modifying the Via List, it needs some mechanism for timing
out that state; otherwise, its state database will grow without
bound. Whatever algorithm is used, unless a FORWARD_CRITICAL
forwarding option (Section 18.104.22.168) or an overlay configuration
option explicitly indicates this state is not needed, the state MUST
be maintained for at least the value of the overlay-reliability-timer
configuration parameter and MAY be kept longer. Future extensions,
such as [P2PSIP-RELAY], may define mechanisms for determining when
this state does not need to be retained.
There is no requirement to ensure that a request issued after the
receipt of a response follows the same path as the response. As a
consequence, there is no requirement to use either of the mechanisms
described above (Via List or state retention) when processing a
A node receiving a request from another node MUST ensure that any
response to that request exits that node with a Destination List
equal to the concatenation of the Node-ID of the node from which the
request was received with the Via List in the original request. The
intermediate node normally learns the Node-ID that the other node is
using via an Attach, but a node using a certificate with a single
Node-ID MAY elect not to send an Attach (see Section 4.2.1, bullet
2). If a node with a certificate with multiple Node-IDs attempts to
route a message other than a Ping or Attach through a node without
performing an Attach, the receiving node MUST reject the request with
an Error_Forbidden error. The node MUST implement support for
returning responses to a Ping or Attach request made by a Joining
Node Attaching to its responsible peer.
6.1.3. Opaque ID
If the first entry on the Destination List is an opaque ID (e.g., a
compressed Via List), the peer MUST replace the entry with the
original Via List that it replaced and then re-examine the
Destination List to determine which of the three cases in Section 6.1
6.2. Symmetric Recursive Routing
This section defines RELOAD's Symmetric Recursive Routing algorithm,
which is the default algorithm used by nodes to route messages
through the overlay. All implementations MUST implement this routing
algorithm. An overlay MAY be configured to use alternative routing
algorithms, and alternative routing algorithms MAY be selected on a
per-message basis. That is, a node in an overlay which supports
Symmetric Recursive Routing and some other routing algorithm called
XXX might use Symmetric Recursive Routing some of the time and XXX at
6.2.1. Request Origination
In order to originate a message to a given Node-ID or Resource-ID, a
node MUST construct an appropriate Destination List. The simplest
such Destination List is a single entry containing the Node-ID or
Resource-ID. The resulting message MUST be forwarded to its
destination via the normal overlay routing mechanisms. The node MAY
also construct a more complicated Destination List for source
Once the message is constructed, the node sends the message to an
adjacent peer. If the first entry on the Destination List is
directly connected, then the message MUST be routed down that
connection. Otherwise, the Topology Plug-in MUST be consulted to
determine the appropriate next hop.
Parallel requests for a resource are a common solution to improve
reliability in the face of churn or subversive peers. Parallel
searches for usage-specified replicas are managed by the usage layer,
for instance, by having the usage store data at multiple
Resource-IDs, with the requesting node sending requests to each of
those Resource-IDs. However, a single request MAY also be routed
through multiple adjacent peers, even when they are known to be
suboptimal, to improve reliability [vulnerabilities-acsac04]. Such
parallel searches MAY be specified by the Topology Plug-in, in which
case it would return multiple next hops and the request would be
routed to all of them.
Because messages can be lost in transit through the overlay, RELOAD
incorporates an end-to-end reliability mechanism. When an
originating node transmits a request, it MUST set a timer to the
current overlay-reliability-timer. If a response has not been
received when the timer fires, the request MUST be retransmitted with
the same transaction identifier. The request MAY be retransmitted up
to 4 times, for a total of 5 messages. After the timer for the fifth
transmission fires, the message MUST be considered to have failed.
Although the originating node will be doing both end-to-end and hop-
by-hop retransmissions, the end-by-end retransmission procedure is
not followed by intermediate nodes. They follow the hop-by-hop
reliability procedure described in Section 6.6.3.
The above algorithm can result in multiple requests being delivered
to a node. Receiving nodes MUST generate semantically equivalent
responses to retransmissions of the same request (this can be
determined by the transaction ID) if the request is received within
the maximum request lifetime (15 seconds). For some requests (e.g.,
Fetch), this can be accomplished merely by processing the request
again. For other requests (e.g., Store), it may be necessary to
maintain state for the duration of the request lifetime.
6.2.2. Response Origination
When a peer sends a response to a request using this routing
algorithm, it MUST construct the Destination List by reversing the
order of the entries on the Via List. This has the result that the
response traverses the same peers as the request traversed, except in
reverse order (symmetric routing) and possibly with extra nodes