
RFC 8019

Protecting Internet Key Exchange Protocol Version 2 (IKEv2) Implementations from Distributed Denial-of-Service Attacks

Pages: 32
Proposed Standard

Top   ToC   RFC8019 - Page 1
Internet Engineering Task Force (IETF)                            Y. Nir
Request for Comments: 8019                                   Check Point
Category: Standards Track                                     V. Smyslov
ISSN: 2070-1721                                               ELVIS-PLUS
                                                           November 2016


      Protecting Internet Key Exchange Protocol Version 2 (IKEv2)
       Implementations from Distributed Denial-of-Service Attacks

Abstract

   This document recommends implementation and configuration best
   practices for Internet Key Exchange Protocol version 2 (IKEv2)
   Responders, to allow them to resist Denial-of-Service and Distributed
   Denial-of-Service attacks.  Additionally, the document introduces a
   new mechanism called "Client Puzzles" that helps accomplish this
   task.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc8019.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  The Vulnerability
   4.  Defense Measures While the IKE SA Is Being Created
     4.1.  Retention Periods for Half-Open SAs
     4.2.  Rate Limiting
     4.3.  The Stateless Cookie
     4.4.  Puzzles
     4.5.  Session Resumption
     4.6.  Keeping Computed Shared Keys
     4.7.  Preventing "Hash and URL" Certificate Encoding Attacks
     4.8.  IKE Fragmentation
   5.  Defense Measures after an IKE SA Is Created
   6.  Plan for Defending a Responder
   7.  Using Puzzles in the Protocol
     7.1.  Puzzles in IKE_SA_INIT Exchange
       7.1.1.  Presenting a Puzzle
       7.1.2.  Solving a Puzzle and Returning the Solution
       7.1.3.  Computing a Puzzle
       7.1.4.  Analyzing Repeated Request
       7.1.5.  Deciding Whether to Serve the Request
     7.2.  Puzzles in an IKE_AUTH Exchange
       7.2.1.  Presenting the Puzzle
       7.2.2.  Solving the Puzzle and Returning the Solution
       7.2.3.  Computing the Puzzle
       7.2.4.  Receiving the Puzzle Solution
   8.  Payload Formats
     8.1.  PUZZLE Notification
     8.2.  Puzzle Solution Payload
   9.  Operational Considerations
   10. Security Considerations
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Acknowledgements
   Authors' Addresses

1. Introduction

   Denial-of-Service (DoS) attacks have always been considered a serious
   threat.  These attacks are usually difficult to defend against since
   the amount of resources the victim has is always bounded (regardless
   of how high it is) and because some resources are required for
   distinguishing a legitimate session from an attack.

   The Internet Key Exchange Protocol version 2 (IKEv2) described in
   [RFC7296] includes defense against DoS attacks.  In particular, there
   is a cookie mechanism that allows the IKE Responder to defend itself
   against DoS attacks from spoofed IP addresses.  However, botnets have
   become widespread, allowing attackers to perform Distributed
   Denial-of-Service (DDoS) attacks, which are more difficult to defend
   against.

   This document presents recommendations to help the Responder counter
   DoS and DDoS attacks.  It also introduces a new mechanism --
   "puzzles" -- that can help accomplish this task.

2. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. The Vulnerability

   The IKE_SA_INIT exchange described in Section 1.2 of [RFC7296]
   involves the Initiator sending a single message.  The Responder
   replies with a single message and also allocates memory for a
   structure called a half-open IKE Security Association (SA).  This
   half-open SA is later authenticated in the IKE_AUTH exchange.  If
   that IKE_AUTH request never comes, the half-open SA is kept for an
   unspecified amount of time.  Depending on the algorithms used and
   the implementation, such a half-open SA will use from around one
   hundred to several thousand bytes of memory.

   This creates an easy attack vector against an IKE Responder.
   Generating the IKE_SA_INIT request is cheap.  Sending a large number
   of IKE_SA_INIT requests can cause a Responder to use up all its
   resources.  If the Responder tries to defend against this by
   throttling new requests, this will also prevent legitimate Initiators
   from setting up IKE SAs.

   An obvious defense, which is described in Section 4.2, is limiting
   the number of half-open SAs opened by a single peer.  However, since
   all that is required is a single packet, an attacker can use multiple
   spoofed source IP addresses.

   If we break down what a Responder has to do during an initial
   exchange, there are three stages:

   1.  When the IKE_SA_INIT request arrives, the Responder:

       *  Generates or reuses a Diffie-Hellman (DH) private part.

       *  Generates a Responder Security Parameter Index (SPI).

       *  Stores the private part and peer public part in a half-open SA
          database.

   2.  When the IKE_AUTH request arrives, the Responder:

       *  Derives the keys from the half-open SA.

       *  Decrypts the request.

   3.  If the IKE_AUTH request decrypts properly, the Responder:

       *  Validates the certificate chain (if present) in the IKE_AUTH
          request.

   The fourth stage where the Responder creates the Child SA is not
   reached by attackers who cannot pass the authentication step.

   Stage #1 is light on CPU usage but requires some storage, and it is
   very cheap for the Initiator as well.  Stage #2 includes private-key
   operations, so it is much heavier CPU-wise.  Stage #3 may include
   public key operations if certificates are involved.  These operations
   are often more computationally expensive than those performed at
   stage #2.

   To attack such a Responder, an attacker can attempt to exhaust either
   memory or CPU.  Without any protection, the most efficient attack is
   to send multiple IKE_SA_INIT requests and exhaust memory.  This is
   easy because IKE_SA_INIT requests are cheap.

   There are obvious ways for the Responder to protect itself without
   changes to the protocol.  It can reduce the time that an entry
   remains in the half-open SA database, and it can limit the number of
   concurrent half-open SAs from a particular address or prefix.  The
   attacker can overcome this by using spoofed source addresses.

   The stateless cookie mechanism from Section 2.6 of [RFC7296] prevents
   an attack with spoofed source addresses.  This doesn't completely
   solve the issue, but it makes the limiting of half-open SAs by
   address or prefix work.  Puzzles, introduced in Section 4.4, serve
   the same purpose to a greater degree.  They make it harder for an
   attacker to reach the goal of getting a half-open SA.  Puzzles do not
   have to be so hard that an attacker cannot afford to solve a single
   puzzle; it is enough that puzzles increase the cost of creating
   half-open SAs, so the attacker is limited in the number they can
   create.

   Reducing the lifetime of an abandoned half-open SA also reduces the
   impact of such attacks.  For example, if a half-open SA is kept for 1
   minute and the capacity is 60,000 half-open SAs, an attacker would
   need to create 1,000 half-open SAs per second to fill the database.
   If the retention time is reduced to 3 seconds, the attacker would
   need to create 20,000 half-open SAs per second to get the same
   result.  Introducing a puzzle makes each half-open SA more expensive
   for an attacker, so an exhaustion attack against Responder memory is
   less likely to succeed.
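
   The arithmetic in the example above is simply capacity divided by
   retention time.  A minimal sketch follows; the function name is
   hypothetical and is used only for illustration.

```python
def required_attack_rate(capacity: int, retention_seconds: float) -> float:
    """Half-open SAs per second an attacker must create to keep a
    database of the given capacity full, if each entry expires after
    retention_seconds."""
    return capacity / retention_seconds


print(required_attack_rate(60_000, 60))  # 1000.0
print(required_attack_rate(60_000, 3))   # 20000.0
```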

   At this point, filling up the half-open SA database is no longer the
   most efficient DoS attack.  The attacker has two more effective
   alternatives:

   1.  Go back to spoofed addresses and try to overwhelm the CPU that
       deals with generating cookies, or

   2.  Take the attack to the next level by also sending an IKE_AUTH
       request.

   If an attacker is so powerful that it is able to overwhelm the
   Responder's CPU that deals with generating cookies, then the attack
   cannot be dealt with at the IKE level and must be handled by means of
   Intrusion Prevention System (IPS) technology.

   On the other hand, the second alternative of sending an IKE_AUTH
   request is very cheap.  It requires generating a proper IKE header
   with the correct IKE SPIs and a single Encrypted payload.  The
   content of the payload is irrelevant and might be junk.  The
   Responder has to perform the relatively expensive key derivation,
   only to find that the Message Authentication Code (MAC) on the
   Encrypted payload in the IKE_AUTH request fails the integrity check.
   If a Responder does not hold on to the calculated SKEYSEED and SK_*
   keys (which it should in case a valid IKE_AUTH comes in later), this
   attack might be repeated on the same half-open SA.  Puzzles make
   attacks of this sort more costly for an attacker.  See Section 7.2
   for details.

   Here too, the number of half-open SAs that the attacker can achieve
   is crucial, because each one allows the attacker to waste some CPU
   time.  Making it hard to create many half-open SAs is therefore
   important.

   A strategy against DDoS has to rely on at least 4 components:

   1.  Hardening the half-open SA database by reducing retention time.

   2.  Hardening the half-open SA database by rate-limiting single
       IPs/prefixes.

   3.  Guidance on what to do when an IKE_AUTH request fails to decrypt.

   4.  Increasing the cost of half-open SAs up to what is tolerable for
       legitimate clients.

   Puzzles are used as a solution for component #4.

4. Defense Measures While the IKE SA Is Being Created

4.1. Retention Periods for Half-Open SAs

   As a UDP-based protocol, IKEv2 has to deal with packet loss through
   retransmissions.  Section 2.4 of [RFC7296] recommends "that messages
   be retransmitted at least a dozen times over a period of at least
   several minutes before giving up."  Many retransmission policies in
   practice wait one or two seconds before retransmitting for the first
   time.

   Because of this, setting the timeout on a half-open SA too low will
   cause it to expire whenever even one IKE_AUTH request packet is lost.
   When not under attack, the half-open SA timeout SHOULD be set high
   enough that the Initiator will have enough time to send multiple
   retransmissions, minimizing the chance of transient network
   congestion causing an IKE failure.

   When the system is under attack, as measured by the number of
   half-open SAs, it makes sense to reduce this lifetime.  The Responder
   should still allow enough time for the round-trip, for the Initiator
   to derive the DH shared value, and to derive the IKE SA keys and
   create the IKE_AUTH request.  Two seconds is probably as low a value
   as can realistically be used.

   It could make sense to assign a shorter value to half-open SAs
   originating from IP addresses or prefixes that are considered suspect
   because of multiple concurrent half-open SAs.
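
   One possible way to express this policy is sketched below.  Only the
   two-second floor comes from the text; the other two timeout values
   and all names are assumptions chosen for illustration, and a real
   deployment would make them configurable.

```python
NORMAL_TIMEOUT = 30.0  # assumption: long enough for several retransmissions
ATTACK_TIMEOUT = 5.0   # assumption: reduced lifetime while under attack
MIN_TIMEOUT = 2.0      # from the text: about the lowest realistic value


def half_open_timeout(under_attack: bool, suspect_peer: bool) -> float:
    """Return the retention period (seconds) for a new half-open SA."""
    if suspect_peer:
        # Suspect addresses/prefixes get the shortest retention period.
        return MIN_TIMEOUT
    return ATTACK_TIMEOUT if under_attack else NORMAL_TIMEOUT
```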

4.2. Rate Limiting

   Even with DDoS, the attacker has only a limited number of nodes
   participating in the attack.  By limiting the number of half-open SAs
   that are allowed to exist concurrently with each such node, the total
   number of half-open SAs is capped, as is the total number of key
   derivations that the Responder is forced to complete.

   In IPv4, it makes sense to limit the number of half-open SAs based on
   IP address.  Most IPv4 nodes are either directly attached to the
   Internet using a routable address or hidden behind a NAT device with
   a single IPv4 external address.  For IPv6, ISPs assign between a /48
   and a /64, so it does not make sense for rate limiting to work on
   single IPv6 addresses.  Instead, rate limits should be based on
   either the /48 or /64 of the misbehaving IPv6 address observed.

   The number of half-open SAs is easy to measure, but it is also
   worthwhile to measure the number of failed IKE_AUTH exchanges.  If
   possible, both factors should be taken into account when deciding
   which IP address or prefix is considered suspicious.

   There are two ways to rate limit a peer address or prefix:

   1.  Hard Limit -- where the number of half-open SAs is capped, and
       any further IKE_SA_INIT requests are rejected.

   2.  Soft Limit -- where if a set number of half-open SAs exist for a
       particular address or prefix, any IKE_SA_INIT request will be
       required to solve a puzzle.

   The advantage of the hard limit method is that it provides a hard cap
   on the number of half-open SAs that the attacker is able to create.
   The disadvantage is that it allows the attacker to block IKE
   initiation from small parts of the Internet.  For example, if a
   network service provider or some establishment offers Internet
   connectivity to its customers or employees through an IPv4 NAT
   device, a single malicious customer can create enough half-open SAs
   to fill the quota for the NAT device external IP address.  Legitimate
   Initiators on the same network will not be able to initiate IKE.

   The advantage of a soft limit is that legitimate clients can always
   connect.  The disadvantage is that an adversary with sufficient CPU
   resources can still effectively DoS the Responder.

   Regardless of the type of rate limiting used, legitimate Initiators
   that are not on the same network segments as the attackers will not
   be affected.  This is very important as it reduces the adverse impact
   caused by the measures used to counteract the attack and allows most
   Initiators to keep working even if they do not support puzzles.
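
   The per-address and per-prefix accounting described in this section
   could be sketched as follows.  This is an illustration only: the
   function names, the Action type, and the default limits are
   assumptions, and combining a soft and a hard limit in one decision is
   one possible design rather than a requirement.

```python
import ipaddress
from collections import Counter
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    PUZZLE = "require puzzle"  # soft limit reached
    REJECT = "reject"          # hard limit reached


def rate_limit_key(addr: str, v6_prefix_len: int = 64) -> str:
    """IPv4 peers are tracked by address, IPv6 peers by /48 or /64."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    net = ipaddress.ip_network(f"{addr}/{v6_prefix_len}", strict=False)
    return str(net)


def decide(half_open: Counter, addr: str,
           soft_limit: int = 5, hard_limit: int = 20) -> Action:
    """Decide how to treat a new IKE_SA_INIT request from addr.

    half_open maps rate-limit keys to current half-open SA counts.
    The limit values are hypothetical and should be configurable.
    """
    n = half_open[rate_limit_key(addr)]
    if n >= hard_limit:
        return Action.REJECT
    if n >= soft_limit:
        return Action.PUZZLE
    return Action.ACCEPT
```

   Keying IPv6 peers by prefix means that all hosts in the same /64 (or
   /48) share one quota, which mirrors the NAT trade-off described above
   for IPv4.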

4.3. The Stateless Cookie

   Section 2.6 of [RFC7296] offers a mechanism to mitigate DoS attacks:
   the stateless cookie.  When the server is under load, the Responder
   responds to the IKE_SA_INIT request with a calculated "stateless
   cookie" -- a value that can be recalculated based on values in the
   IKE_SA_INIT request without storing Responder-side state.  The
   Initiator is expected to repeat the IKE_SA_INIT request, this time
   including the stateless cookie.  This mechanism prevents DoS attacks
   from spoofed IP addresses, since an attacker needs to have a routable
   IP address to return the cookie.

   Attackers that have multiple source IP addresses with return
   routability, such as in the case of botnets, can fill up a half-open
   SA table anyway.  The cookie mechanism limits the amount of allocated
   state to the number of attackers, multiplied by the number of
   half-open SAs allowed per peer address, multiplied by the amount of
   state allocated for each half-open SA.  With typical values, this can
   easily reach hundreds of megabytes.
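
   As an illustration, the sketch below follows the cookie construction
   suggested in Section 2.6 of [RFC7296], namely
   <VersionIDofSecret> | Hash(Ni | IPi | SPIi | <secret>), and also
   evaluates the state bound described above.  The choice of SHA-256,
   the one-octet version field, all function names, and the numbers in
   the example are assumptions made for illustration.

```python
import hashlib
import hmac
import os

# Assumption: secrets are indexed by a one-octet version identifier.
SECRETS = {1: os.urandom(32)}


def make_cookie(version: int, ni: bytes, ipi: bytes, spii: bytes) -> bytes:
    """Cookie = version | SHA-256(Ni | IPi | SPIi | secret)."""
    digest = hashlib.sha256(ni + ipi + spii + SECRETS[version]).digest()
    return bytes([version]) + digest


def check_cookie(cookie: bytes, ni: bytes, ipi: bytes, spii: bytes) -> bool:
    """Recompute the cookie from the request; no per-peer state is kept."""
    if not cookie or cookie[0] not in SECRETS:
        return False
    expected = make_cookie(cookie[0], ni, ipi, spii)
    return hmac.compare_digest(cookie, expected)


def state_bound(attackers: int, per_peer_limit: int, bytes_per_sa: int) -> int:
    """Upper bound on half-open SA memory once cookies are enforced."""
    return attackers * per_peer_limit * bytes_per_sa


# Hypothetical example: 10,000 bots, 5 half-open SAs each, 4,000 bytes per SA.
print(state_bound(10_000, 5, 4_000))  # 200000000 bytes, i.e., 200 MB
```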

4.4. Puzzles

   The puzzle introduced here extends the cookie mechanism of [RFC7296].
   It is loosely based on the proof-of-work technique used in Bitcoin
   [BITCOINS].  Puzzles set an upper bound, determined by the attacker's
   CPU, to the number of negotiations the attacker can initiate in a
   unit of time.

   A puzzle is sent to the Initiator in two cases:

   o  The Responder is so overloaded that no half-open SAs may be
      created without solving a puzzle, or

   o  The Responder is not too loaded, but the rate-limiting method
      described in Section 4.2 prevents half-open SAs from being created
      with this particular peer address or prefix without first solving
      a puzzle.

   When the Responder decides to send the challenge to solve a puzzle in
   response to an IKE_SA_INIT request, the message includes at least
   three components:

   1.  Cookie -- this is calculated the same as in [RFC7296], i.e., the
       process of generating the cookie is not specified.

   2.  Algorithm -- this is the identifier of a Pseudorandom Function
       (PRF) algorithm, one of those proposed by the Initiator in the SA
       payload.

   3.  Zero-Bit Count (ZBC) -- this is a number between 8 and 255 (or a
       special value - 0; see Section 7.1.1.1) that represents the
       length of the zero-bit run at the end of the output of the PRF
       calculated over the cookie that the Initiator is to send.  The
       values 1-7 are explicitly excluded, because they create a puzzle
       that is too easy to solve.  Since the mechanism is supposed to be
       stateless for the Responder, either the same ZBC is used for all
       Initiators or the ZBC is somehow encoded in the cookie.  If it is
       global, then this value is the same for all the Initiators who
       are receiving puzzles at any given point of time.  The Responder,
       however, may change this value over time depending on its load.

   Upon receiving this challenge, the Initiator attempts to calculate
   the PRF output using different keys.  When enough keys are found such
   that the PRF output calculated using each of them has a sufficient
   number of trailing zero bits, those keys are sent to the Responder.

   The reason for using several keys in the results, rather than just
   one key, is to reduce the variance in the time it takes the Initiator
   to solve the puzzle.  We have chosen the number of keys to be four
   (4) as a compromise between the conflicting goals of reducing
   variance and reducing the work the Responder needs to perform to
   verify the puzzle solution.

   When receiving a request with a solved puzzle, the Responder verifies
   two things:

   o  That the cookie is indeed valid.

   o  That each result of the PRF of the transmitted cookie calculated
      with one of the transmitted keys has a sufficient number of
      trailing zero bits.

   Example 1: Suppose the calculated cookie is
   739ae7492d8a810cf5e8dc0f9626c9dda773c5a3 (20 octets), the algorithm
   is PRF-HMAC-SHA256, and the required number of zero bits is 18.
   After successively trying a number of keys, the Initiator finds the
   following four 3-octet keys that work:

      +--------+----------------------------------+----------------+
      |  Key   | Last 32 Hex PRF Digits           | # of Zero Bits |
      +--------+----------------------------------+----------------+
      | 061840 | e4f957b859d7fb1343b7b94a816c0000 |       18       |
      | 073324 | 0d4233d6278c96e3369227a075800000 |       23       |
      | 0c8a2a | 952a35d39d5ba06709da43af40700000 |       20       |
      | 0d94c8 | 5a0452b21571e401a3d00803679c0000 |       18       |
      +--------+----------------------------------+----------------+

               Table 1: Four Solutions for the 18-Bit Puzzle

   Example 2: Same cookie, but modify the required number of zero bits
   to 22.  The first four 4-octet keys that satisfy that requirement are
   005d9e57, 010d8959, 0110778d, and 01187e37.  Finding these requires
   18,382,392 invocations of the PRF.
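
   The solve and verify steps described in this section can be sketched
   as follows.  This is an illustration under stated assumptions, not
   the normative procedure from Section 7.1.3: the PRF is taken to be
   HMAC-SHA256 keyed with the candidate key and applied to the cookie,
   candidate keys are enumerated as 4-octet big-endian counters, and all
   function names are hypothetical.  The sketch has not been checked
   against the values in Table 1, and it omits the cookie validity check
   that the Responder must also perform.

```python
import hashlib
import hmac


def trailing_zero_bits(data: bytes) -> int:
    """Number of zero bits at the end of data."""
    n = int.from_bytes(data, "big")
    if n == 0:
        return 8 * len(data)
    return (n & -n).bit_length() - 1


def prf(key: bytes, cookie: bytes) -> bytes:
    # Assumption: PRF-HMAC-SHA256 with the candidate key as the HMAC key.
    return hmac.new(key, cookie, hashlib.sha256).digest()


def solve(cookie: bytes, zbc: int, needed: int = 4) -> list:
    """Initiator side: brute-force search for `needed` suitable keys."""
    keys = []
    for i in range(2 ** 32):
        key = i.to_bytes(4, "big")
        if trailing_zero_bits(prf(key, cookie)) >= zbc:
            keys.append(key)
            if len(keys) == needed:
                return keys
    raise ValueError("no solution found in the 4-octet key space")


def verify(cookie: bytes, zbc: int, keys: list, needed: int = 4) -> bool:
    """Responder side: only `needed` PRF invocations are required."""
    return (len(set(keys)) == needed and
            all(trailing_zero_bits(prf(k, cookie)) >= zbc for k in keys))


cookie = bytes.fromhex("739ae7492d8a810cf5e8dc0f9626c9dda773c5a3")
solution = solve(cookie, 8)          # low difficulty so the demo is fast
print(verify(cookie, 8, solution))   # True
```

   The asymmetry is the point of the design: the Initiator performs a
   search whose cost grows with the ZBC, while the Responder verifies a
   solution with a fixed, small number of PRF invocations.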

            +----------------+-------------------------------+
            | # of Zero Bits | Time to Find 4 Keys (Seconds) |
            +----------------+-------------------------------+
            |       8        |                        0.0025 |
            |       10       |                        0.0078 |
            |       12       |                        0.0530 |
            |       14       |                        0.2521 |
            |       16       |                        0.8504 |
            |       17       |                        1.5938 |
            |       18       |                        3.3842 |
            |       19       |                        3.8592 |
            |       20       |                       10.8876 |
            +----------------+-------------------------------+

   Table 2: The Time Needed to Solve a Puzzle of Various Difficulty for
            the Cookie 739ae7492d8a810cf5e8dc0f9626c9dda773c5a3

   The figures above were obtained on a 2.4 GHz single-core Intel i5
   processor in a 2013 Apple MacBook Pro.  Run times can be halved or
   quartered with multi-core code, but they would be longer on mobile
   phone processors, even if those are multi-core as well.  With these
   figures, 18 bits is believed to be a reasonable choice of puzzle
   difficulty level for all Initiators, and 20 bits is acceptable for
   specific hosts/prefixes.

   The use of the puzzle mechanism in the IKE_SA_INIT exchange is
   described in Section 7.1.
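
   As a rough cross-check of the figures in this section: if the PRF
   output is assumed to behave like a uniformly random bit string, a
   given key yields at least z trailing zero bits with probability
   2^-z, so finding four keys takes about 4 * 2^z PRF invocations on
   average, and each additional bit of difficulty doubles the expected
   work.  The sketch below (with a hypothetical function name) compares
   this estimate with the count reported in Example 2.

```python
def expected_prf_calls(zbc: int, keys: int = 4) -> int:
    """Expected PRF invocations to find `keys` solutions, assuming
    uniformly random PRF output."""
    return keys * 2 ** zbc


estimate = expected_prf_calls(22)
print(estimate)               # 16777216
ratio = 18_382_392 / estimate  # observed count from Example 2
print(1.09 < ratio < 1.10)    # True: observed is within 10% of the estimate
```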

4.5. Session Resumption

   When the Responder is under attack, it SHOULD prefer previously
   authenticated peers who present a Session Resumption ticket
   [RFC5723].  However, the Responder SHOULD NOT serve resumed
   Initiators exclusively because dropping all IKE_SA_INIT requests
   would lock out legitimate Initiators that have no resumption ticket.
   When under attack, the Responder SHOULD require Initiators presenting
   Session Resumption tickets to pass a return routability check by
   including the COOKIE notification in the IKE_SESSION_RESUME response
   message, as described in Section 4.3.2 of [RFC5723].  Note that the
   Responder SHOULD cache tickets for a short time to reject reused
   tickets (Section 4.3.1 of [RFC5723]); therefore, there should be no
   issue of half-open SAs resulting from replayed IKE_SESSION_RESUME
   messages.

   Several kinds of DoS attacks are possible on servers that support
   IKE Session Resumption.  See Section 9.3 of [RFC5723] for details.

4.6. Keeping Computed Shared Keys

   Once the IKE_SA_INIT exchange is finished, the Responder is waiting
   for the first message of the IKE_AUTH exchange from the Initiator.
   At this point, the Initiator is not yet authenticated, and this fact
   allows an attacker to perform the attack described in Section 3.
   Instead of sending a properly formed and encrypted IKE_AUTH message,
   the attacker can just send arbitrary data, forcing the Responder to
   perform costly CPU operations to compute SK_* keys.

   If the received IKE_AUTH message fails to decrypt correctly (or fails
   to pass the Integrity Check Value (ICV) check), then the Responder
   SHOULD still keep the computed SK_* keys, so that if this turns out
   to be an attack, the attacker gains no advantage from repeating it
   multiple times on a single IKE SA.  The Responder can also use
   puzzles in the IKE_AUTH exchange as described in Section 7.2.

4.7. Preventing "Hash and URL" Certificate Encoding Attacks

   In IKEv2, each side may use the "Hash and URL" Certificate Encoding
   to instruct the peer to retrieve certificates from the specified
   location (see Section 3.6 of [RFC7296] for details).  Malicious
   Initiators can use this feature to mount a DoS attack on the
   Responder by providing a URL pointing to a large file possibly
   containing meaningless bits.  While downloading the file, the
   Responder consumes CPU, memory, and network bandwidth.

   To prevent this kind of attack, the Responder should not blindly
   download the whole file.  Instead, it SHOULD first read the initial
   few bytes, decode the length of the ASN.1 structure from these bytes,
   and then download no more than the decoded number of bytes.  Note
   that it is always possible to determine the length of ASN.1
   structures used in IKEv2, if they are DER-encoded, by analyzing the
   first few bytes.  However, since the content of the file being
   downloaded can be under the attacker's control, implementations
   should not blindly trust the decoded length and SHOULD check whether
   it makes sense before continuing to download the file.
   Implementations SHOULD also apply a configurable hard limit to the
   number of pulled bytes and SHOULD provide an ability for an
   administrator to either completely disable this feature or limit its
   use to a configurable list of trusted URLs.
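
   The length check described above can be sketched as follows.  The
   sketch assumes that the downloaded object starts with a DER SEQUENCE
   (tag 0x30), as X.509 certificates do; the function names and the
   64 KB default limit are hypothetical.  It only computes how many
   bytes may be fetched and leaves the actual download to the caller.

```python
def der_total_length(prefix: bytes) -> int:
    """Total size (header plus content) of the DER element whose first
    bytes are in prefix.  Raises ValueError on malformed input."""
    if len(prefix) < 2 or prefix[0] != 0x30:
        raise ValueError("expected a DER SEQUENCE")
    first = prefix[1]
    if first < 0x80:
        return 2 + first                  # short form: length in one octet
    n = first & 0x7F                      # long form: n length octets follow
    if n == 0:
        raise ValueError("indefinite length is not allowed in DER")
    if len(prefix) < 2 + n:
        raise ValueError("need more bytes to decode the length")
    return 2 + n + int.from_bytes(prefix[2:2 + n], "big")


def bytes_to_fetch(prefix: bytes, hard_limit: int = 64 * 1024) -> int:
    """Apply the configurable hard limit to the decoded length."""
    total = der_total_length(prefix)
    if total > hard_limit:
        raise ValueError("declared length exceeds the configured limit")
    return total


print(bytes_to_fetch(bytes.fromhex("30820400")))  # 1028
```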

4.8. IKE Fragmentation

IKE fragmentation described in [RFC7383] allows IKE peers to avoid IP fragmentation of large IKE messages. Attackers can mount several kinds of DoS attacks using IKE fragmentation. See Section 5 of [RFC7383] for details on how to mitigate these attacks.

5. Defense Measures after an IKE SA Is Created

   Once an IKE SA is created, there is usually only a limited number of
   IKE messages exchanged.  This IKE traffic consists of exchanges aimed
   to create additional Child SAs, IKE rekeys, IKE deletions, and IKE
   liveness tests.  Some of these exchanges require relatively few
   resources (like a liveness check), while others may be resource
   consuming (like creating or rekeying a Child SA with DH exchange).

   Since any endpoint can initiate a new exchange, there is a
   possibility that a peer would initiate too many exchanges that could
   exhaust host resources.  For example, the peer can perform endless
   continuous Child SA rekeying or create an overwhelming number of
   Child SAs with the same Traffic Selectors, etc.  Such behavior can be
   caused by broken implementations, misconfiguration, or an intentional
   attack.  The latter becomes more of a real threat if the peer uses
   NULL Authentication, as described in [RFC7619].  In this case, the
   peer remains anonymous, allowing it to escape any responsibility for
   its behavior.  See Section 3 of [RFC7619] for details on how to
   mitigate attacks when using NULL Authentication.

   The following recommendations apply especially for NULL-authenticated
   IKE sessions, but also apply to authenticated IKE sessions, with the
   difference that in the latter case, the identified peer can be locked
   out.

   o  If the IKEv2 window size is greater than one, peers are able to
      initiate multiple simultaneous exchanges that increase host
      resource consumption.  Since there is no way in IKEv2 to decrease
      window size once it has been increased (see Section 2.3 of
      [RFC7296]), the window size cannot be dynamically adjusted
      depending on the load.  It is NOT RECOMMENDED to allow an IKEv2
      window size greater than one when NULL Authentication has been
      used.

   o  If a peer initiates an abusive number of CREATE_CHILD_SA exchanges
      to rekey IKE SAs or Child SAs, the Responder SHOULD reply with
      TEMPORARY_FAILURE notifications indicating that the peer must slow
      down its requests.

   o  If a peer creates many Child SAs with the same or overlapping
      Traffic Selectors, implementations MAY respond with the
      NO_ADDITIONAL_SAS notification.

   o  If a peer initiates many exchanges of any kind, the Responder MAY
      introduce an artificial delay before responding to each request
      message.  This delay decreases the rate at which the Responder
      needs to process requests from any particular peer and frees up
      resources on the Responder that can be used for answering
      legitimate clients.  If the Responder receives retransmissions of
      the request message during the delay period, the retransmitted
      messages MUST be silently discarded.  The delay must be short
      enough to avoid legitimate peers deleting the IKE SA due to a
      timeout.  It is believed that a few seconds is enough.  Note,
      however, that even a few seconds may be too long when settings
      rely on an immediate response to the request message, e.g., for
      the purposes of quick detection of a dead peer.

   o  If these countermeasures are ineffective, implementations MAY
      delete the IKE SA with an offending peer by sending a Delete
      payload.
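
   As one illustration of the recommendations above, a Responder could
   count CREATE_CHILD_SA requests per peer in a sliding window and
   answer with TEMPORARY_FAILURE once a threshold is reached.  The class
   name, the threshold of 10 requests, and the 60-second window below
   are assumptions; this document does not define what constitutes an
   abusive number of exchanges.

```python
from collections import deque


class ExchangeLimiter:
    """Sliding-window counter of exchanges initiated by each peer."""

    def __init__(self, max_per_window: int = 10, window: float = 60.0):
        self.max_per_window = max_per_window
        self.window = window
        self._events = {}  # peer identifier -> deque of timestamps

    def allow(self, peer: str, now: float) -> bool:
        """Return True to process the request, or False if the caller
        should reply with TEMPORARY_FAILURE instead."""
        q = self._events.setdefault(peer, deque())
        while q and now - q[0] >= self.window:
            q.popleft()  # forget requests that left the window
        if len(q) >= self.max_per_window:
            return False
        q.append(now)
        return True
```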

   In IKE, a client can request various configuration attributes from
   the server.  Most often, these attributes include internal IP
   addresses.  Malicious clients can try to exhaust a server's IP
   address pool by continuously requesting a large number of internal
   addresses.  Server implementations SHOULD limit the number of IP
   addresses allocated to any particular client.  Note, this is not
   possible with clients using NULL Authentication, since their identity
   cannot be verified.

6. Plan for Defending a Responder

   This section outlines a plan for defending a Responder from a DDoS
   attack based on the techniques described earlier.  The numbers given
   here are not normative, and their purpose is to illustrate the
   configurable parameters needed for surviving DDoS attacks.

   Implementations are deployed in different environments, so it is
   RECOMMENDED that the parameters be settable.  For example, most
   commercial products are required to undergo benchmarking where the
   IKE SA establishment rate is measured.  Benchmarking is
   indistinguishable from a DoS attack, and the defenses described in
   this document may defeat the benchmark by causing exchanges to fail
   or to take a long time to complete.  Parameters SHOULD be tunable to
   allow for benchmarking (if only by turning DDoS protection off).

   Since all countermeasures may cause delays and additional work for
   the Initiators, they SHOULD NOT be deployed unless an attack is
   likely to be in progress.  To minimize the burden imposed on
   Initiators, the Responder should monitor incoming IKE requests for
   two scenarios:

   1.  A general DDoS attack.  Such an attack is indicated by a high
       number of concurrent half-open SAs, a high rate of failed
       IKE_AUTH exchanges, or a combination of both.  For example,
       consider a Responder that has 10,000 distinct peers of which at
       peak, 7,500 concurrently have VPN tunnels.  At the start of peak
       time, 600 peers might establish tunnels within any given minute,
       and tunnel establishment (both IKE_SA_INIT and IKE_AUTH) takes
       anywhere from 0.5 to 2 seconds.  For this Responder, we expect
       there to be fewer than 20 concurrent half-open SAs, so having 100
       concurrent half-open SAs can be interpreted as an indication of
       an attack.  Similarly, IKE_AUTH request decryption failures
       should never happen.  Supposing that the tunnels are established
       using Extensible Authentication Protocol (EAP) (see Section 2.16
       of [RFC7296]), users may be expected to enter a wrong password
       about 20% of the time.  So we'd expect 125 wrong password
       failures a minute.  If we get IKE_AUTH decryption failures from
       multiple sources more than once per second, or EAP failures more
       than 300 times per minute, this can also be an indication of a
       DDoS attack.

   2.  An attack from a particular IP address or prefix.  Such an attack
       is indicated by an inordinate number of half-open SAs from a
       specific IP address or prefix, or an inordinate number of
       IKE_AUTH failures.  A DDoS attack may be viewed as multiple such
       attacks.  If these are mitigated successfully, there will not be
       a need to enact countermeasures on all Initiators.  For example,
       thresholds might be 5 concurrent half-open SAs, 1 decrypt
       failure, or 10 EAP failures within a minute.
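
   The example thresholds in the two scenarios above can be written down
   as a small sketch.  The threshold values are the illustrative,
   non-normative numbers from the text; the function names and the way
   the counters are passed in are assumptions.  The first function shows
   where the figure of 20 concurrent half-open SAs comes from: 600
   arrivals per minute, each taking at most 2 seconds.

```python
def expected_half_open(arrivals_per_minute: float, max_setup_seconds: float) -> float:
    """Arrival rate times maximum setup time bounds the number of
    half-open SAs expected at any instant under legitimate load."""
    return arrivals_per_minute / 60.0 * max_setup_seconds


def general_attack(half_open: int, decrypt_failures_per_min: int,
                   eap_failures_per_min: int) -> bool:
    """Scenario 1: indicators of a general DDoS attack."""
    return (half_open >= 100
            or decrypt_failures_per_min > 60   # more than once per second
            or eap_failures_per_min > 300)


def suspicious_peer(half_open: int, decrypt_failures_per_min: int,
                    eap_failures_per_min: int) -> bool:
    """Scenario 2: indicators for a single IP address or prefix."""
    return (half_open >= 5
            or decrypt_failures_per_min >= 1
            or eap_failures_per_min >= 10)


print(expected_half_open(600, 2.0))  # 20.0
```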

   Note that using countermeasures against an attack from a particular
   IP address may be enough to avoid the overload on the half-open SA
   database.  In this case, the number of failed IKE_AUTH exchanges will
   never exceed the threshold of attack detection.

   When there is no general DDoS attack, it is suggested that no cookie
   or puzzles be used.  At this point, the only defensive measure is to
   monitor the number of half-open SAs, and set a soft limit per peer IP
   or prefix.  The soft limit can be set to 3-5.  If the puzzles are
   used, the puzzle difficulty SHOULD be set to such a level (number of
   zero bits) that all legitimate clients can handle it without degraded
   user experience.

   As soon as any kind of attack is detected, whether many initiations
   from multiple sources or many initiations from a few sources, it is
   best to begin by requiring stateless cookies from all Initiators.
   This will mitigate attacks based on IP address spoofing and help
   avoid the need to impose a greater burden in the form of puzzles on
   the general population of Initiators.  This makes the per-node or
   per-prefix soft limit more effective.

   When cookies are activated for all requests and the attacker is still
   managing to consume too many resources, the Responder MAY start to
   use puzzles for these requests or increase the difficulty of puzzles
   imposed on IKE_SA_INIT requests coming from suspicious nodes/
   prefixes.  This should still be doable by all legitimate peers, but
   the use of puzzles at a higher difficulty may degrade the user
   experience, for example, by taking up to 10 seconds to solve the
   puzzle.

   If the load on the Responder is still too great, and there are many
   nodes causing multiple half-open SAs or IKE_AUTH failures, the
   Responder MAY impose hard limits on those nodes.

   If it turns out that the attack is very widespread and the hard caps
   are not solving the issue, a puzzle MAY be imposed on all Initiators.
   Note that this is the last step, and the Responder should avoid this
   if possible.
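
   The escalation order described in this section can be summarized as a
   small state machine.  This is a sketch only: the level names, the two
   boolean inputs, stepping up one level at a time, and dropping
   straight back to the lowest level once no attack is detected are all
   assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    MONITOR = 0          # no cookies or puzzles; per-peer soft limit only
    COOKIES = 1          # stateless cookies required from all Initiators
    PUZZLES_SUSPECT = 2  # puzzles (or harder puzzles) for suspicious peers
    HARD_LIMITS = 3      # hard limits on misbehaving nodes/prefixes
    PUZZLES_ALL = 4      # last resort: puzzles for all Initiators


def next_level(current: Level, attack_detected: bool, overloaded: bool) -> Level:
    """Pick the defense level for the next monitoring interval."""
    if not attack_detected:
        return Level.MONITOR
    if current == Level.MONITOR:
        return Level.COOKIES
    if overloaded:
        # Escalate one step at a time, never past the last resort.
        return Level(min(current + 1, Level.PUZZLES_ALL))
    return current
```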

