Tech-invite3GPPspaceIETF RFCsSIP
929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 3711

The Secure Real-time Transport Protocol (SRTP)

Pages: 56
Proposed Standard
Errata
Updated by:  55066904
Part 2 of 3 – Pages 19 to 36
First   Prev   Next

Top   ToC   RFC3711 - Page 19   prevText

4. Pre-Defined Cryptographic Transforms

While there are numerous encryption and message authentication algorithms that can be used in SRTP, below we define default algorithms in order to avoid the complexity of specifying the encodings for the signaling of algorithm and parameter identifiers. The defined algorithms have been chosen as they fulfill the goals listed in Section 2. Recommendations on how to extend SRTP with new transforms are given in Section 6.

4.1. Encryption

The following parameters are common to both pre-defined, non-NULL, encryption transforms specified in this section. * BLOCK_CIPHER-MODE indicates the block cipher used and its mode of operation * n_b is the bit-size of the block for the block cipher * k_e is the session encryption key * n_e is the bit-length of k_e * k_s is the session salting key * n_s is the bit-length of k_s * SRTP_PREFIX_LENGTH is the octet length of the keystream prefix, a non-negative integer, specified by the message authentication code in use. The distinct session keys and salts for SRTP/SRTCP are by default derived as specified in Section 4.3. The encryption transforms defined in SRTP map the SRTP packet index and secret key into a pseudo-random keystream segment. Each keystream segment encrypts a single RTP packet. The process of encrypting a packet consists of generating the keystream segment corresponding to the packet, and then bitwise exclusive-oring that keystream segment onto the payload of the RTP packet to produce the Encrypted Portion of the SRTP packet. In case the payload size is not an integer multiple of n_b bits, the excess (least significant) bits of the keystream are simply discarded. Decryption is done the same way, but swapping the roles of the plaintext and ciphertext.
Top   ToC   RFC3711 - Page 20
   +----+   +------------------+---------------------------------+
   | KG |-->| Keystream Prefix |          Keystream Suffix       |---+
   +----+   +------------------+---------------------------------+   |
                                                                     |
                               +---------------------------------+   v
                               |     Payload of RTP Packet       |->(*)
                               +---------------------------------+   |
                                                                     |
                               +---------------------------------+   |
                               | Encrypted Portion of SRTP Packet|<--+
                               +---------------------------------+

   Figure 3: Default SRTP Encryption Processing.  Here KG denotes the
   keystream generator, and (*) denotes bitwise exclusive-or.

   The definition of how the keystream is generated, given the index,
   depends on the cipher and its mode of operation.  Below, two such
   keystream generators are defined.  The NULL cipher is also defined,
   to be used when encryption of RTP is not required.

   The SRTP definition of the keystream is illustrated in Figure 3.  The
   initial octets of each keystream segment MAY be reserved for use in a
   message authentication code, in which case the keystream used for
   encryption starts immediately after the last reserved octet.  The
   initial reserved octets are called the "keystream prefix" (not to be
   confused with the "encryption prefix" of [RFC3550, Section 6.1]), and
   the remaining octets are called the "keystream suffix".  The
   keystream prefix MUST NOT be used for encryption.  The process is
   illustrated in Figure 3.

   The number of octets in the keystream prefix is denoted as
   SRTP_PREFIX_LENGTH.  The keystream prefix is indicated by a positive,
   non-zero value of SRTP_PREFIX_LENGTH.  This means that, even if
   confidentiality is not to be provided, the keystream generator output
   may still need to be computed for packet authentication, in which
   case the default keystream generator (mode) SHALL be used.

   The default cipher is the Advanced Encryption Standard (AES) [AES],
   and we define two modes of running AES, (1) Segmented Integer Counter
   Mode AES and (2) AES in f8-mode.  In the remainder of this section,
   let E(k,x) be AES applied to key k and input block x.
Top   ToC   RFC3711 - Page 21

4.1.1. AES in Counter Mode

Conceptually, counter mode [AES-CTR] consists of encrypting successive integers. The actual definition is somewhat more complicated, in order to randomize the starting point of the integer sequence. Each packet is encrypted with a distinct keystream segment, which SHALL be computed as follows. A keystream segment SHALL be the concatenation of the 128-bit output blocks of the AES cipher in the encrypt direction, using key k = k_e, in which the block indices are in increasing order. Symbolically, each keystream segment looks like E(k, IV) || E(k, IV + 1 mod 2^128) || E(k, IV + 2 mod 2^128) ... where the 128-bit integer value IV SHALL be defined by the SSRC, the SRTP packet index i, and the SRTP session salting key k_s, as below. IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16) Each of the three terms in the XOR-sum above is padded with as many leading zeros as needed to make the operation well-defined, considered as a 128-bit value. The inclusion of the SSRC allows the use of the same key to protect distinct SRTP streams within the same RTP session, see the security caveats in Section 9.1. In the case of SRTCP, the SSRC of the first header of the compound packet MUST be used, i SHALL be the 31-bit SRTCP index and k_e, k_s SHALL be replaced by the SRTCP encryption session key and salt. Note that the initial value, IV, is fixed for each packet and is formed by "reserving" 16 zeros in the least significant bits for the purpose of the counter. The number of blocks of keystream generated for any fixed value of IV MUST NOT exceed 2^16 to avoid keystream re-use, see below. The AES has a block size of 128 bits, so 2^16 output blocks are sufficient to generate the 2^23 bits of keystream needed to encrypt the largest possible RTP packet (except for IPv6 "jumbograms" [RFC2675], which are not likely to be used for RTP-based multimedia traffic). This restriction on the maximum bit-size of the packet that can be encrypted ensures the security of the encryption method by limiting the effectiveness of probabilistic attacks [BDJR]. For a particular Counter Mode key, each IV value used as an input MUST be distinct, in order to avoid the security exposure of a two- time pad situation (Section 9.1). To satisfy this constraint, an implementation MUST ensure that the combination of the SRTP packet
Top   ToC   RFC3711 - Page 22
   index of ROC || SEQ, and the SSRC used in the construction of the IV
   are distinct for any particular key.  The failure to ensure this
   uniqueness could be catastrophic for Secure RTP.  This is in contrast
   to the situation for RTP itself, which may be able to tolerate such
   failures.  It is RECOMMENDED that, if a dedicated security module is
   present, the RTP sequence numbers and SSRC either be generated or
   checked by that module (i.e., sequence-number and SSRC processing in
   an SRTP system needs to be protected as well as the key).

4.1.2. AES in f8-mode

To encrypt UMTS (Universal Mobile Telecommunications System, as 3G networks) data, a solution (see [f8-a] [f8-b]) known as the f8- algorithm has been developed. On a high level, the proposed scheme is a variant of Output Feedback Mode (OFB) [HAC], with a more elaborate initialization and feedback function. As in normal OFB, the core consists of a block cipher. We also define here the use of AES as a block cipher to be used in what we shall call "f8-mode of operation" RTP encryption. The AES f8-mode SHALL use the same default sizes for session key and salt as AES counter mode. Figure 4 shows the structure of block cipher, E, running in f8-mode.
Top   ToC   RFC3711 - Page 23
                    IV
                    |
                    v
                +------+
                |      |
           +--->|  E   |
           |    +------+
           |        |
     m -> (*)       +-----------+-------------+--  ...     ------+
           |    IV' |           |             |                  |
           |        |   j=1 -> (*)    j=2 -> (*)   ...  j=L-1 ->(*)
           |        |           |             |                  |
           |        |      +-> (*)       +-> (*)   ...      +-> (*)
           |        |      |    |        |    |             |    |
           |        v      |    v        |    v             |    v
           |    +------+   | +------+    | +------+         | +------+
    k_e ---+--->|  E   |   | |  E   |    | |  E   |         | |  E   |
                |      |   | |      |    | |      |         | |      |
                +------+   | +------+    | +------+         | +------+
                    |      |    |        |    |             |    |
                    +------+    +--------+    +--  ...  ----+    |
                    |           |             |                  |
                    v           v             v                  v
                   S(0)        S(1)          S(2)  . . .       S(L-1)

   Figure 4.  f8-mode of operation (asterisk, (*), denotes bitwise XOR).
   The figure represents the KG in Figure 3, when AES-f8 is used.

4.1.2.1. f8 Keystream Generation
The Initialization Vector (IV) SHALL be determined as described in Section 4.1.2.2 (and in Section 4.1.2.3 for SRTCP). Let IV', S(j), and m denote n_b-bit blocks. The keystream, S(0) ||... || S(L-1), for an N-bit message SHALL be defined by setting IV' = E(k_e XOR m, IV), and S(-1) = 00..0. For j = 0,1,..,L-1 where L = N/n_b (rounded up to nearest integer if it is not already an integer) compute S(j) = E(k_e, IV' XOR j XOR S(j-1)) Notice that the IV is not used directly. Instead it is fed through E under another key to produce an internal, "masked" value (denoted IV') to prevent an attacker from gaining known input/output pairs.
Top   ToC   RFC3711 - Page 24
   The role of the internal counter, j, is to prevent short keystream
   cycles.  The value of the key mask m SHALL be

           m = k_s || 0x555..5,

   i.e., the session salting key, appended by the binary pattern 0101..
   to fill out the entire desired key size, n_e.

   The sender SHOULD NOT generate more than 2^32 blocks, which is
   sufficient to generate 2^39 bits of keystream.  Unlike counter mode,
   there is no absolute threshold above (below) which f8 is guaranteed
   to be insecure (secure).  The above bound has been chosen to limit,
   with sufficient security margin, the probability of degenerative
   behavior in the f8 keystream generation.

4.1.2.2. f8 SRTP IV Formation
The purpose of the following IV formation is to provide a feature which we call implicit header authentication (IHA), see Section 9.5. The SRTP IV for 128-bit block AES-f8 SHALL be formed in the following way: IV = 0x00 || M || PT || SEQ || TS || SSRC || ROC M, PT, SEQ, TS, SSRC SHALL be taken from the RTP header; ROC is from the cryptographic context. The presence of the SSRC as part of the IV allows AES-f8 to be used when a master key is shared between multiple streams within the same RTP session, see Section 9.1.
4.1.2.3. f8 SRTCP IV Formation
The SRTCP IV for 128-bit block AES-f8 SHALL be formed in the following way: IV= 0..0 || E || SRTCP index || V || P || RC || PT || length || SSRC where V, P, RC, PT, length, SSRC SHALL be taken from the first header in the RTCP compound packet. E and SRTCP index are the 1-bit and 31-bit fields added to the packet.
Top   ToC   RFC3711 - Page 25

4.1.3. NULL Cipher

The NULL cipher is used when no confidentiality for RTP/RTCP is requested. The keystream can be thought of as "000..0", i.e., the encryption SHALL simply copy the plaintext input into the ciphertext output.

4.2. Message Authentication and Integrity

Throughout this section, M will denote data to be integrity protected. In the case of SRTP, M SHALL consist of the Authenticated Portion of the packet (as specified in Figure 1) concatenated with the ROC, M = Authenticated Portion || ROC; in the case of SRTCP, M SHALL consist of the Authenticated Portion (as specified in Figure 2) only. Common parameters: * AUTH_ALG is the authentication algorithm * k_a is the session message authentication key * n_a is the bit-length of the authentication key * n_tag is the bit-length of the output authentication tag * SRTP_PREFIX_LENGTH is the octet length of the keystream prefix as defined above, a parameter of AUTH_ALG The distinct session authentication keys for SRTP/SRTCP are by default derived as specified in Section 4.3. The values of n_a, n_tag, and SRTP_PREFIX_LENGTH MUST be fixed for any particular fixed value of the key. We describe the process of computing authentication tags as follows. The sender computes the tag of M and appends it to the packet. The SRTP receiver verifies a message/authentication tag pair by computing a new authentication tag over M using the selected algorithm and key, and then compares it to the tag associated with the received message. If the two tags are equal, then the message/tag pair is valid; otherwise, it is invalid and the error audit message "AUTHENTICATION FAILURE" MUST be returned.

4.2.1. HMAC-SHA1

The pre-defined authentication transform for SRTP is HMAC-SHA1 [RFC2104]. With HMAC-SHA1, the SRTP_PREFIX_LENGTH (Figure 3) SHALL be 0. For SRTP (respectively SRTCP), the HMAC SHALL be applied to the session authentication key and M as specified above, i.e., HMAC(k_a, M). The HMAC output SHALL then be truncated to the n_tag left-most bits.
Top   ToC   RFC3711 - Page 26

4.3. Key Derivation

4.3.1. Key Derivation Algorithm

Regardless of the encryption or message authentication transform that is employed (it may be an SRTP pre-defined transform or newly introduced according to Section 6), interoperable SRTP implementations MUST use the SRTP key derivation to generate session keys. Once the key derivation rate is properly signaled at the start of the session, there is no need for extra communication between the parties that use SRTP key derivation. packet index ---+ | v +-----------+ master +--------+ session encr_key | ext | key | |----------> | key mgmt |-------->| key | session auth_key | (optional | | deriv |----------> | rekey) |-------->| | session salt_key | | master | |----------> +-----------+ salt +--------+ Figure 5: SRTP key derivation. At least one initial key derivation SHALL be performed by SRTP, i.e., the first key derivation is REQUIRED. Further applications of the key derivation MAY be performed, according to the "key_derivation_rate" value in the cryptographic context. The key derivation function SHALL initially be invoked before the first packet and then, when r > 0, a key derivation is performed whenever index mod r equals zero. This can be thought of as "refreshing" the session keys. The value of "key_derivation_rate" MUST be kept fixed for the lifetime of the associated master key. Interoperable SRTP implementations MAY also derive session salting keys for encryption transforms, as is done in both of the pre- defined transforms. Let m and n be positive integers. A pseudo-random function family is a set of keyed functions {PRF_n(k,x)} such that for the (secret) random key k, given m-bit x, PRF_n(k,x) is an n-bit string, computationally indistinguishable from random n-bit strings, see [HAC]. For the purpose of key derivation in SRTP, a secure PRF with m = 128 (or more) MUST be used, and a default PRF transform is defined in Section 4.3.3.
Top   ToC   RFC3711 - Page 27
   Let "a DIV t" denote integer division of a by t, rounded down, and
   with the convention that "a DIV 0 = 0" for all a.  We also make the
   convention of treating "a DIV t" as a bit string of the same length
   as a, and thus "a DIV t" will in general have leading zeros.

   Key derivation SHALL be defined as follows in terms of <label>, an
   8-bit constant (see below), master_salt and key_derivation_rate, as
   determined in the cryptographic context, and index, the packet index
   (i.e., the 48-bit ROC || SEQ for SRTP):

   *  Let r = index DIV key_derivation_rate (with DIV as defined above).

   *  Let key_id = <label> || r.

   *  Let x = key_id XOR master_salt, where key_id and master_salt are
      aligned so that their least significant bits agree (right-
      alignment).

   <label> MUST be unique for each type of key to be derived.  We
   currently define <label> 0x00 to 0x05 (see below), and future
   extensions MAY specify new values in the range 0x06 to 0xff for other
   purposes.  The n-bit SRTP key (or salt) for this packet SHALL then be
   derived from the master key, k_master as follows:

      PRF_n(k_master, x).

   (The PRF may internally specify additional formatting and padding of
   x, see e.g., Section 4.3.3 for the default PRF.)

   The session keys and salt SHALL now be derived using:

   - k_e (SRTP encryption): <label> = 0x00, n = n_e.

   - k_a (SRTP message authentication): <label> = 0x01, n = n_a.

   - k_s (SRTP salting key): <label> = 0x02, n = n_s.

   where n_e, n_s, and n_a are from the cryptographic context.

   The master key and master salt MUST be random, but the master salt
   MAY be public.

   Note that for a key_derivation_rate of 0, the application of the key
   derivation SHALL take place exactly once.

   The definition of DIV above is purely for notational convenience.
   For a non-zero t among the set of allowed key derivation rates, "a
   DIV t" can be implemented as a right-shift by the base-2 logarithm of
Top   ToC   RFC3711 - Page 28
   t.  The derivation operation is further facilitated if the rates are
   chosen to be powers of 256, but that granularity was considered too
   coarse to be a requirement of this specification.

   The upper limit on the number of packets that can be secured using
   the same master key (see Section 9.2) is independent of the key
   derivation.

4.3.2. SRTCP Key Derivation

SRTCP SHALL by default use the same master key (and master salt) as SRTP. To do this securely, the following changes SHALL be done to the definitions in Section 4.3.1 when applying session key derivation for SRTCP. Replace the SRTP index by the 32-bit quantity: 0 || SRTCP index (i.e., excluding the E-bit, replacing it with a fixed 0-bit), and use <label> = 0x03 for the SRTCP encryption key, <label> = 0x04 for the SRTCP authentication key, and, <label> = 0x05 for the SRTCP salting key.

4.3.3. AES-CM PRF

The currently defined PRF, keyed by 128, 192, or 256 bit master key, has input block size m = 128 and can produce n-bit outputs for n up to 2^23. PRF_n(k_master,x) SHALL be AES in Counter Mode as described in Section 4.1.1, applied to key k_master, and IV equal to (x*2^16), and with the output keystream truncated to the n first (left-most) bits. (Requiring n/128, rounded up, applications of AES.)

5. Default and mandatory-to-implement Transforms

The default transforms also are mandatory-to-implement transforms in SRTP. Of course, "mandatory-to-implement" does not imply "mandatory-to-use". Table 1 summarizes the pre-defined transforms. The default values below are valid for the pre-defined transforms. mandatory-to-impl. optional default encryption AES-CM, NULL AES-f8 AES-CM message integrity HMAC-SHA1 - HMAC-SHA1 key derivation (PRF) AES-CM - AES-CM Table 1: Mandatory-to-implement, optional and default transforms in SRTP and SRTCP.
Top   ToC   RFC3711 - Page 29

5.1. Encryption: AES-CM and NULL

AES running in Segmented Integer Counter Mode, as defined in Section 4.1.1, SHALL be the default encryption algorithm. The default key lengths SHALL be 128-bit for the session encryption key (n_e). The default session salt key-length (n_s) SHALL be 112 bits. The NULL cipher SHALL also be mandatory-to-implement.

5.2. Message Authentication/Integrity: HMAC-SHA1

HMAC-SHA1, as defined in Section 4.2.1, SHALL be the default message authentication code. The default session authentication key-length (n_a) SHALL be 160 bits, the default authentication tag length (n_tag) SHALL be 80 bits, and the SRTP_PREFIX_LENGTH SHALL be zero for HMAC-SHA1. In addition, for SRTCP, the pre-defined HMAC-SHA1 MUST NOT be applied with a value of n_tag, nor n_a, that are smaller than these defaults. For SRTP, smaller values are NOT RECOMMENDED, but MAY be used after careful consideration of the issues in Section 7.5 and 9.5.

5.3. Key Derivation: AES-CM PRF

The AES Counter Mode based key derivation and PRF defined in Sections 4.3.1 to 4.3.3, using a 128-bit master key, SHALL be the default method for generating session keys. The default master salt length SHALL be 112 bits and the default key-derivation rate SHALL be zero.

6. Adding SRTP Transforms

Section 4 provides examples of the level of detail needed for defining transforms. Whenever a new transform is to be added to SRTP, a companion standard track RFC MUST be written to exactly define how the new transform can be used with SRTP (and SRTCP). Such a companion RFC SHOULD avoid overlap with the SRTP protocol document. Note however, that it MAY be necessary to extend the SRTP or SRTCP cryptographic context definition with new parameters (including fixed or default values), add steps to the packet processing, or even add fields to the SRTP/SRTCP packets. The companion RFC SHALL explain any known issues regarding interactions between the transform and other aspects of SRTP. Each new transform document SHOULD specify its key attributes, e.g., size of keys (minimum, maximum, recommended), format of keys, recommended/required processing of input keying material, requirements/recommendations on key lifetime, re-keying and key derivation, whether sharing of keys between SRTP and SRTCP is allowed or not, etc.
Top   ToC   RFC3711 - Page 30
   An added message integrity transform SHOULD define a minimum
   acceptable key/tag size for SRTCP, equivalent in strength to the
   minimum values as defined in Section 5.2.

7. Rationale

This section explains the rationale behind several important features of SRTP.

7.1. Key derivation

Key derivation reduces the burden on the key establishment. As many as six different keys are needed per crypto context (SRTP and SRTCP encryption keys and salts, SRTP and SRTCP authentication keys), but these are derived from a single master key in a cryptographically secure way. Thus, the key management protocol needs to exchange only one master key (plus master salt when required), and then SRTP itself derives all the necessary session keys (via the first, mandatory application of the key derivation function). Multiple applications of the key derivation function are optional, but will give security benefits when enabled. They prevent an attacker from obtaining large amounts of ciphertext produced by a single fixed session key. If the attacker was able to collect a large amount of ciphertext for a certain session key, he might be helped in mounting certain attacks. Multiple applications of the key derivation function provide backwards and forward security in the sense that a compromised session key does not compromise other session keys derived from the same master key. This means that the attacker who is able to recover a certain session key, is anyway not able to have access to messages secured under previous and later session keys (derived from the same master key). (Note that, of course, a leaked master key reveals all the session keys derived from it.) Considerations arise with high-rate key refresh, especially in large multicast settings, see Section 11.

7.2. Salting key

The master salt guarantees security against off-line key-collision attacks on the key derivation that might otherwise reduce the effective key size [MF00].
Top   ToC   RFC3711 - Page 31
   The derived session salting key used in the encryption, has been
   introduced to protect against some attacks on additive stream
   ciphers, see Section 9.2.  The explicit inclusion method of the salt
   in the IV has been selected for ease of hardware implementation.

7.3. Message Integrity from Universal Hashing

The particular definition of the keystream given in Section 4.1 (the keystream prefix) is to give provision for particular universal hash functions, suitable for message authentication in the Wegman-Carter paradigm [WC81]. Such functions are provably secure, simple, quick, and especially appropriate for Digital Signal Processors and other processors with a fast multiply operation. No authentication transforms are currently provided in SRTP other than HMAC-SHA1. Future transforms, like the above mentioned universal hash functions, MAY be added following the guidelines in Section 6.

7.4. Data Origin Authentication Considerations

Note that in pair-wise communications, integrity and data origin authentication are provided together. However, in group scenarios where the keys are shared between members, the MAC tag only proves that a member of the group sent the packet, but does not prevent against a member impersonating another. Data origin authentication (DOA) for multicast and group RTP sessions is a hard problem that needs a solution; while some promising proposals are being investigated [PCST1] [PCST2], more work is needed to rigorously specify these technologies. Thus SRTP data origin authentication in groups is for further study. DOA can be done otherwise using signatures. However, this has high impact in terms of bandwidth and processing time, therefore we do not offer this form of authentication in the pre-defined packet-integrity transform. The presence of mixers and translators does not allow data origin authentication in case the RTP payload and/or the RTP header are manipulated. Note that these types of middle entities also disrupt end-to-end confidentiality (as the IV formation depends e.g., on the RTP header preservation). A certain trust model may choose to trust the mixers/translators to decrypt/re-encrypt the media (this would imply breaking the end-to-end security, with related security implications).
Top   ToC   RFC3711 - Page 32

7.5. Short and Zero-length Message Authentication

As shown in Figure 1, the authentication tag is RECOMMENDED in SRTP. A full 80-bit authentication-tag SHOULD be used, but a shorter tag or even a zero-length tag (i.e., no message authentication) MAY be used under certain conditions to support either of the following two application environments. 1. Strong authentication can be impractical in environments where bandwidth preservation is imperative. An important special case is wireless communication systems, in which bandwidth is a scarce and expensive resource. Studies have shown that for certain applications and link technologies, additional bytes may result in a significant decrease in spectrum efficiency [SWO]. Considerable effort has been made to design IP header compression techniques to improve spectrum efficiency [RFC3095]. A typical voice application produces 20 byte samples, and the RTP, UDP and IP headers need to be jointly compressed to one or two bytes on average in order to obtain acceptable wireless bandwidth economy [RFC3095]. In this case, strong authentication would impose nearly fifty percent overhead. 2. Authentication is impractical for applications that use data links with fixed-width fields that cannot accommodate the expansion due to the authentication tag. This is the case for some important existing wireless channels. For example, zero- byte header compression is used to adapt EVRC/SMV voice with the legacy IS-95 bearer channel in CDMA2000 VoIP services. It was found that not a single additional octet could be added to the data, which motivated the creation of a zero-byte profile for ROHC [RFC3242]. A short tag is secure for a restricted set of applications. Consider a voice telephony application, for example, such as a G.729 audio codec with a 20-millisecond packetization interval, protected by a 32-bit message authentication tag. The likelihood of any given packet being successfully forged is only one in 2^32. Thus an adversary can control no more than 20 milliseconds of audio output during a 994-day period, on average. In contrast, the effect of a single forged packet can be much larger if the application is stateful. A codec that uses relative or predictive compression across packets will propagate the maliciously generated state, affecting a longer duration of output.
Top   ToC   RFC3711 - Page 33
   Certainly not all SRTP or telephony applications meet the criteria
   for short or zero-length authentication tags.  Section 9.5.1
   discusses the risks of weak or no message authentication, and section
   9.5 describes the circumstances when it is acceptable and when it is
   unacceptable.

8. Key Management Considerations

There are emerging key management standards [MIKEY] [KEYMGT] [SDMS] for establishing an SRTP cryptographic context (e.g., an SRTP master key). Both proprietary and open-standard key management methods are likely to be used for telephony applications [MIKEY] [KINK] and multicast applications [GDOI]. This section provides guidance for key management systems that service SRTP session. For initialization, an interoperable SRTP implementation SHOULD be given the SSRC and MAY be given the initial RTP sequence number for the RTP stream by key management (thus, key management has a dependency on RTP operational parameters). Sending the RTP sequence number in the key management may be useful e.g., when the initial sequence number is close to wrapping (to avoid synchronization problems), and to communicate the current sequence number to a joining endpoint (to properly initialize its replay list). If the pre-defined transforms are used, SRTP allows sharing of the same master key between SRTP/SRTCP streams belonging to the same RTP session. First, sharing between SRTP streams belonging to the same RTP session is secure if the design of the synchronization mechanism, i.e., the IV, avoids keystream re-use (the two-time pad, Section 9.1). This is taken care of by the fact that RTP provides for unique SSRCs for streams belonging to the same RTP session. See Section 9.1 for further discussion. Second, sharing between SRTP and the corresponding SRTCP is secure. The fact that an SRTP stream and its associated SRTCP stream both carry the same SSRC does not constitute a problem for the two-time pad due to the key derivation. Thus, SRTP and SRTCP corresponding to one RTP session MAY share master keys (as they do by default). Note that message authentication also has a dependency on SSRC uniqueness that is unrelated to the problem of keystream reuse: SRTP streams authenticated under the same key MUST have a distinct SSRC in order to identify the sender of the message. This requirement is needed because the SSRC is the cryptographically authenticated field
Top   ToC   RFC3711 - Page 34
   used to distinguish between different SRTP streams.  Were two streams
   to use identical SSRC values, then an adversary could substitute
   messages from one stream into the other without detection.

   SRTP/SRTCP MUST NOT share master keys under any other circumstances
   than the ones given above, i.e., between SRTP and its corresponding
   SRTCP, and, between streams belonging to the same RTP session.

8.1. Re-keying

The recommended way for a particular key management system to provide re-key within SRTP is by associating a master key in a crypto context with an MKI. This provides for easy master key retrieval (see Scenarios in Section 11), but has the disadvantage of adding extra bits to each packet. As noted in Section 7.5, some wireless links do not cater for added bits, therefore SRTP also defines a more economic way of triggering re-keying, via use of <From, To>, which works in some specific, simple scenarios (see Section 8.1.1). SRTP senders SHALL count the amount of SRTP and SRTCP traffic being used for a master key and invoke key management to re-key if needed (Section 9.2). These interactions are defined by the key management interface to SRTP and are not defined by this protocol specification.

8.1.1. Use of the <From, To> for re-keying

In addition to the use of the MKI, SRTP defines another optional mechanism for master key retrieval, the <From, To>. The <From, To> specifies the range of SRTP indices (a pair of sequence number and ROC) within which a certain master key is valid, and is (when used) part of the crypto context. By looking at the 48-bit SRTP index of the current SRTP packet, the corresponding master key can be found by determining which From-To interval it belongs to. For SRTCP, the most recently observed/used SRTP index (which can be obtained from the cryptographic context) is used for this purpose, even though SRTCP has its own (31-bit) index (see caveat below). This method, compared to the MKI, has the advantage of identifying the master key and defining its lifetime without adding extra bits to each packet. This could be useful, as already noted, for some wireless links that do not cater for added bits. However, its use SHOULD be limited to specific, very simple scenarios. We recommend to limit its use when the RTP session is a simple unidirectional or bi-directional stream. This is because in case of multiple streams, it is difficult to trigger the re-key based on the <From, To> of a single RTP stream. For example, if several streams share a master
Top   ToC   RFC3711 - Page 35
   key, there is no simple one-to-one correspondence between the index
   sequence space of a certain stream, and the index sequence space on
   which the <From, To> values are based.  Consequently, when a master
   key is shared between streams, one of these streams MUST be
   designated by key management as the one whose index space defines the
   re-keying points.  Also, the re-key triggering on SRTCP is based on
   the correspondent SRTP stream, i.e., when the SRTP stream changes the
   master key, so does the correspondent SRTCP.  This becomes obviously
   more and more complex with multiple streams.

   The default values for the <From, To> are "from the first observed
   packet" and "until further notice".  However, the maximum limit of
   SRTP/SRTCP packets that are sent under each given master/session key
   (Section 9.2) MUST NOT be exceeded.

   In case the <From, To> is used as key retrieval, then the MKI is not
   inserted in the packet (and its indicator in the crypto context is
   zero).  However, using the MKI does not exclude using <From, To> key
   lifetime simultaneously.  This can for instance be useful to signal
   at the sender side at which point in time an MKI is to be made
   active.

8.2. Key Management parameters

The table below lists all SRTP parameters that key management can supply. For reference, it also provides a summary of the default and mandatory-to-support values for an SRTP implementation as described in Section 5.
Top   ToC   RFC3711 - Page 36
   Parameter                     Mandatory-to-support    Default
   ---------                     --------------------    -------

   SRTP and SRTCP encr transf.       AES_CM, NULL         AES_CM
   (Other possible values: AES_f8)

   SRTP and SRTCP auth transf.       HMAC-SHA1           HMAC-SHA1

   SRTP and SRTCP auth params:
     n_tag (tag length)                 80                 80
     SRTP prefix_length                  0                  0

   Key derivation PRF                 AES_CM              AES_CM

   Key material params
   (for each master key):
     master key length                 128                128
     n_e (encr session key length)     128                128
     n_a (auth session key length)     160                160
     master salt key
     length of the master salt         112                112
     n_s (session salt key length)     112                112
     key derivation rate                 0                  0

     key lifetime
        SRTP-packets-max-lifetime      2^48               2^48
        SRTCP-packets-max-lifetime     2^31               2^31
        from-to-lifetime <From, To>
     MKI indicator                       0                 0
     length of the MKI                   0                 0
     value of the MKI

   Crypto context index params:
     SSRC value
     ROC
     SEQ
     SRTCP Index
     Transport address
     Port number

   Relation to other RTP profiles:
     sender's order between FEC and SRTP FEC-SRTP      FEC-SRTP
     (see Section 10)