RFC 2909


The Multicast Address-Set Claim (MASC) Protocol

12.  Operational Considerations

12.1.  Bootup Operations

   To learn about its parent domains' IDs and prefixes, a MASC node
   SHOULD try to establish connections to its PARENT nodes before
   initiating a connection to a SIBLING node.  To avoid learning about
   its own PREFIX_MANAGED from its children or siblings, a MASC node
   SHOULD try to establish connections to its PARENT nodes and
   INTERNAL_PEER nodes before initiating a connection to a CHILD or
   SIBLING node.
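
   The ordering above can be sketched as a stable sort over the
   configured peers.  This is only an illustration: the ROLE_PRIORITY
   table, the connection_order() helper, and the example addresses are
   hypothetical and are not part of the protocol.

```python
# Hypothetical priorities: PARENT and INTERNAL_PEER peers are contacted
# before CHILD and SIBLING peers, as recommended above.
ROLE_PRIORITY = {"PARENT": 0, "INTERNAL_PEER": 0, "SIBLING": 1, "CHILD": 1}


def connection_order(peers):
    """Return (address, role) tuples in the order connections are tried.

    sorted() is stable, so peers with equal priority keep their
    configured order.
    """
    return sorted(peers, key=lambda peer: ROLE_PRIORITY[peer[1]])


peers = [
    ("192.0.2.30", "SIBLING"),
    ("192.0.2.10", "PARENT"),
    ("192.0.2.40", "CHILD"),
    ("192.0.2.20", "INTERNAL_PEER"),
]
ordered = connection_order(peers)
```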

12.2.  Leaf and Non-leaf MASC Domain Operation

   A non-leaf MASC domain (i.e. a domain that has children domains)
   should advertise its PREFIX_MANAGED addresses to its children, and
   should claim from that space the sub-ranges that would be advertised
   to the internal MAASs (the claim wait time SHOULD be equal to
   [WAITING_PERIOD]).  A MASC node that belongs to a non-leaf MASC
   domain should perform dual functions by being a child of itself with
   regard to the claiming and management of the sub-ranges for local
   usage.  A leaf MASC domain should advertise all PREFIX_MANAGED
   addresses to its MAASs without explicitly claiming them for internal
   usage.  A MASC node can assume that it belongs to a leaf domain if it
   holds no UPDATEs originated by children domains.  If an UPDATE
   by a child is received, the domain MUST switch from "leaf" to "non-
   leaf" mode, and if it needs more addresses for internal usage, it
   MUST claim them from that domain's PREFIX_MANAGED.  After the last
   UPDATE originated by a child expires, the domain can switch back to
   "leaf" mode.

12.3.  Clock Skew Workaround

   Each UPDATE has a "Claim Timestamp" field that is set to the absolute
   time of the MASC node that originated that UPDATE.  The timestamp is
   used for two purposes: to resolve collisions, and to define how long
   an UPDATE should be kept in the local cache of other MASC nodes.  A
   skew in the clock could result in unfair collision decisions, such
   that the claims originated by nodes whose clocks are behind the real
   time will always win; however, because collisions are presumably
   rare, this will not be an issue.  Clock skew might, however, cause
   an UPDATE to expire earlier than it should, and a node might assume
   too early that the expired UPDATE/prefix is free for allocation.  To
   compensate for the clock skew, an UPDATE message should be kept
   longer than the amount of time specified in the Claim Holdtime.  For
   example, keeping UPDATEs for an additional 24 hours will compensate
   for clock skew of up to 24 hours.
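
   A minimal sketch of this compensation follows.  It assumes that the
   Claim Timestamp is expressed in seconds and that the Claim Holdtime
   is a duration in seconds counted from that timestamp; the names
   CLOCK_SKEW_GRACE, cache_expiry(), and is_expired() are hypothetical.

```python
# Extra time to keep a cached UPDATE; 24 hours as in the example above.
CLOCK_SKEW_GRACE = 24 * 3600  # seconds


def cache_expiry(claim_timestamp, claim_holdtime, grace=CLOCK_SKEW_GRACE):
    """Local time at which a cached UPDATE may be discarded.

    Assumption: claim_holdtime is a duration (seconds) measured from
    claim_timestamp.
    """
    return claim_timestamp + claim_holdtime + grace


def is_expired(now, claim_timestamp, claim_holdtime, grace=CLOCK_SKEW_GRACE):
    """True once the holdtime plus the grace period has elapsed."""
    return now >= cache_expiry(claim_timestamp, claim_holdtime, grace)
```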

12.4.  Clash Resolving Mechanism

   If a MASC node receives a PREFIX_IN_USE claim originated by a sibling
   and the claim overlaps with some of the local prefixes, the clash
   must be resolved.  Two MASC domains should not manage overlapping
   address ranges, unless the domains have an ancestor-descendant (e.g.
   parent-child) relationship in the MASC hierarchy.  Also, two MASC
   domains should not have locally-allocated overlapping address ranges.
   The clashed address ranges should not be advertised to the MAASs or
   allocated to multicast applications/sessions.  If a clashed address
   has been allocated to an application, the application should be
   informed to stop using that address and switch to a new one.

   The G-RIB database must be consistent, such that it does not have
   ambiguous entries.  "Ambiguous G-RIB entries" are those entries that
   might cause the multicast routing protocol to loop or lose
   connectivity.  In MASC the WITHDRAW message is used to solve this
   problem.  When a clashing PREFIX_IN_USE is received, it is compared
   (using the function described in Section 5.1.1) against all prefixes
   allocated to the local domain.  If the local PREFIX_IN_USE is the
   winner, no further actions are taken.  If the local PREFIX_IN_USE is
   the loser, the clashing address range must be withdrawn by initiating
   a WITHDRAW message.  The message must have Role = INTERNAL; its
   Origin Node ID and Origin Domain ID must be the same as in the
   corresponding local PREFIX_IN_USE message, while its Claim Timestamp,
   Claim Lifetime, Claim Holdtime, Address and Mask must be the same as
   in the received winning PREFIX_IN_USE.  The initiated WITHDRAW
   message must be processed as described in Section 11.7.
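
   The field selection for the WITHDRAW message can be sketched as
   follows.  The Claim record and build_withdraw() are hypothetical
   illustrations, not a wire format; the comparison function of Section
   5.1.1 is not reproduced here, so the sketch assumes the caller has
   already determined that the local PREFIX_IN_USE lost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical in-memory view of the fields named in the text."""
    msg_type: str
    role: str
    origin_node_id: int
    origin_domain_id: int
    claim_timestamp: int
    claim_lifetime: int
    claim_holdtime: int
    address: str
    mask: str


def build_withdraw(local, winner):
    """Build the WITHDRAW for a losing local PREFIX_IN_USE.

    Origin IDs come from the local claim; timestamp, lifetime, holdtime,
    address and mask come from the received winning claim.
    """
    return Claim(
        msg_type="WITHDRAW",
        role="INTERNAL",
        origin_node_id=local.origin_node_id,
        origin_domain_id=local.origin_domain_id,
        claim_timestamp=winner.claim_timestamp,
        claim_lifetime=winner.claim_lifetime,
        claim_holdtime=winner.claim_holdtime,
        address=winner.address,
        mask=winner.mask,
    )
```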

   If a cached WITHDRAW times out and the local MASC domain owns an
   overlapping PREFIX_MANAGED or PREFIX_IN_USE, the overlapping prefix
   ranges can be injected back into the G-RIB database.  Similarly, the
   address ranges that were not advertised to the local domain's MAASs
   due to the WITHDRAW, can now be advertised again.

   In addition to the automatic resolving of clashes, a MASC
   implementation should support manual resolving of clashes.  For
   example, after a clash is detected, the network administrator should
   be informed that a clash has occurred.  The specific manual
   mechanisms are outside the scope of this protocol.

   A MASC node must be configured to operate using either manual or
   automatic clash resolution mechanisms.

12.5.  Changing Network Providers

   If a MASC domain changes its network provider, such that the old
   provider cannot be used to provide connectivity, any traffic for
   sessions that are in progress and use that MASC domain as the root of
   multicast distribution trees will not be able to reach that domain.

   If the new network provider is willing to carry the traffic for the
   old sessions rooted at the customer domain, then it must propagate
   the customer's old prefixes through the G-RIB.  However, at least one
   MASC node in the customer domain must maintain a TCP connection to
   one of the old network provider's MASC nodes.  Thus, it can continue
   to "defend" the customer's prefixes, and should continue until the
   old prefixes' lifetimes expire.

   If the new network provider is not willing to propagate the old
   prefixes, then the customer should remove its prefixes from the G-
   RIB.  If BGMP is in use, the old network provider's domain will
   automatically become the Root Domain for the customer's old groups
   due to the lack of a more specific group route.  MASC nodes in the
   customer domain MAY still connect with the old provider's MASC nodes
   to defend their allocation.

12.6.  Debugging

12.6.1.  Prefix-to-Domain Lookup

   Use mtrace [MTRACE] to find the BGMP/MASC root domain for a group
   address chosen from that prefix.

12.6.2.  Domain-to-Prefix Lookup

   We can find the address space allocated to a particular MASC domain
   by directly querying one of the MASC servers within that domain, by
   observing the state in parents, siblings, or children MASC domains,
   or by observing the G-RIB information originated by that domain.
   Of those three methods, the first method can provide the most
   detailed information. Finding the address of one of the MASC nodes
   within a particular domain is outside the scope of MASC.

13.  MASC Storage

   In general, MASC will be run by border routers, which typically do
   not have stable storage.  In this case, MASC must use the Layer 2
   protocol/mechanism (e.g., [AAP]) as described in [MALLOC] to store
   the important information (the prefixes allocated by the local
   domain) in the domain's MAASs, which should have stable storage.  If
   the MASC speaker has local storage, it should use it instead of the
   Layer 2 protocol/mechanism.  Claims that are in progress do not have
   to be saved by using the Layer 2 protocol/mechanism.

14.  Security Considerations

   IPsec [IPSEC] can be used to address security concerns between two
   MASC peering nodes.  However, because of the store-and-forward nature
   of the UPDATE messages, it is possible that if a non-trustworthy MASC
   node can connect to some point of the MASC topology, then this node
   can undetectably inject malicious UPDATEs that may disturb the normal
   operation of other MASC nodes.  To address this problem, each MASC
   node should allow peering only with trustworthy nodes.

   After a reboot, a MASC node/domain can restore its state from its
   neighbors (internal peers, parents, siblings, children). Typically,
   the state received from a parent or internal peer will be
   trustworthy, but a node may choose to drop its own UPDATEs that were
   received through a sibling or a child.

   A misbehaving node may attempt a Denial of Service attack by sending
   a large number of colliding messages that would prevent any of its
   siblings from allocating more addresses.  A single misbehaving node
   can easily be identified by all of its siblings, and all of its
   UPDATEs can be ignored.  A Denial of Service attack that uses
   multiple origin addresses can be prevented if a third-party UPDATE
   (e.g. by a non-directly connected sibling) is accepted only if it is
   sent via the common parent domain, and the MASC nodes in the parent
   domain accept children UPDATEs only if they come via an internal
   peer, or come directly from the child node identified by the Origin
   Node ID.
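
   The acceptance rule applied by a node in the parent domain can be
   sketched as a small predicate.  The function name and its parameters
   are hypothetical; rejecting a child-originated UPDATE that arrives
   over any other kind of peering is an assumption of this sketch.

```python
def parent_accepts_child_update(received_from_role, sender_node_id,
                                origin_node_id):
    """Decide whether a parent-domain node accepts a child-originated UPDATE.

    received_from_role: role of the peer that delivered the UPDATE.
    sender_node_id:     node ID of that peer.
    origin_node_id:     Origin Node ID carried in the UPDATE.
    """
    if received_from_role == "INTERNAL_PEER":
        # Relayed by another node of the same (parent) domain.
        return True
    if received_from_role == "CHILD":
        # Direct delivery: the sender must be the originator itself.
        return sender_node_id == origin_node_id
    # Assumption: any other path is rejected.
    return False
```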

15.  IANA Considerations

   This document defines several number spaces (MASC message types, MASC
   OPEN message optional parameters types, MASC UPDATE message attribute
   types, MASC UPDATE message optional parameters types, and MASC
   NOTIFICATION message error codes and subcodes).  For all of these
   number spaces, certain values are defined in this specification.  New
   values may only be defined by IETF Consensus, as described in [IANA-
   CONSIDERATIONS].  Basically, this means that they are defined by RFCs
   approved by the IESG.

16.  Acknowledgments

   The authors would like to thank the participants of the IETF for
   their assistance with this protocol.

17.  APPENDIX A: Sample Algorithms

   DISCLAIMER: This section describes some preliminary suggestions by
   various people for algorithms which could be used with MASC.

17.1.  Claim Size and Prefix Selection Algorithm

   This section covers the algorithms used by a MASC node (on behalf of
   a MASC domain) to satisfy the demand for multicast addresses.  The
   allocated addresses should be aggregatable, the address utilization
   should be reasonably high, and the allocation latency to the MAASs
   should be shorter than [WAITING_PERIOD] whenever possible.

17.1.1.  Prefix Expansion

   For ease of implementation and troubleshooting, MASC should use
   contiguous masks to specify the address ranges, i.e. prefixes.
   (Research indicates that sufficiently good results can be achieved
   using contiguous masks only.)  The chosen prefixes should be as
   expandable as possible.  The method used to choose the children sub-
   prefixes from the parent's prefix is the so called Reverse Bit
   Ordering (idea by Dave Thaler; inspired by Kampai [KAMPAI]).  For
   example, if the parent's prefix width is four bits, the addresses of
   the sub-prefixes are chosen in the following order:

   Parent:       xxxx

   Child A:      0000
   Child B:      1000
   Child C:      0100
   Child D:      1100

   If some of the children need to expand their sub-prefix, they try to
   double the corresponding sub-prefix starting from the right:

   Child A:      000x
   Child A:      00xx
   Child D:      110x
   Child D:      11xx

   and so on.
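
   The selection order shown above is what results from reversing the
   bits of a simple counter.  The following sketch illustrates this;
   reverse_bits() and child_prefix() are hypothetical helper names.

```python
def reverse_bits(index, width):
    """Reverse the lowest `width` bits of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def child_prefix(index, width):
    """Bit string of the index-th sub-prefix in Reverse Bit Ordering."""
    return format(reverse_bits(index, width), "0{}b".format(width))


# Children A, B, C, D of a parent whose prefix width is four bits.
order = [child_prefix(i, 4) for i in range(4)]
```

   Because consecutive children differ in the leftmost bits first, the
   rightmost bits of each sub-prefix stay free for later doubling.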

   However, because the address ordering is very strict, to reduce the
   probability for collision, when a new sub-prefix has to be chosen,
   the choice should be random among all candidates with the same
   potential for expandability.  For example, if the free sub-prefixes
   are 01xx, 10xx, and 110x, then the new prefix to claim should be
   01xx with probability 50% or 10xx with probability 50%; 110x is not
   a candidate because it is less expandable.
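
   A sketch of the randomized choice follows.  It approximates
   "potential for expandability" by the number of wildcard (x) bits in
   a free sub-prefix, which is an assumption of this example;
   most_expandable() and pick_prefix() are hypothetical names.

```python
import random


def most_expandable(free_prefixes):
    """Keep the free sub-prefixes that have the most wildcard ('x') bits."""
    best = max(p.count("x") for p in free_prefixes)
    return [p for p in free_prefixes if p.count("x") == best]


def pick_prefix(free_prefixes, rng=random):
    """Choose uniformly at random among the most expandable candidates."""
    return rng.choice(most_expandable(free_prefixes))


candidates = most_expandable(["01xx", "10xx", "110x"])
```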

17.1.2.  Reducing Allocation Latency

   To reduce the allocation latency, a MASC node uses pre-allocation.
   It constantly monitors the demand for addresses from its children (or
   MAASs), and predicts what would be the address usage after
   [WAITING_PERIOD].  A MASC node claims more addresses in advance only
   if the available addresses will be used up within [WAITING_PERIOD].
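
   One possible way to make this prediction is a linear extrapolation
   of recent usage samples.  The text does not prescribe a predictor,
   so the method, the function names, and the sample format below are
   all assumptions of this sketch.

```python
def predicted_usage(samples, horizon):
    """Extrapolate address usage `horizon` seconds past the last sample.

    samples: list of (time_in_seconds, addresses_in_use), oldest first.
    A falling trend is treated as flat, so the prediction never drops
    below the latest observed usage.
    """
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    rate = (u1 - u0) / (t1 - t0) if t1 > t0 else 0.0
    return u1 + max(rate, 0.0) * horizon


def should_claim_more(samples, available, waiting_period):
    """True if the available addresses would be used up within waiting_period."""
    return predicted_usage(samples, waiting_period) >= available
```

   For example, with usage growing from 100 to 110 addresses over one
   hour, a 48-hour horizon predicts about 590 addresses in use, so a
   node holding 512 addresses would claim more, while a node holding
   1024 would not.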

17.1.3.  Address Space Utilization

   Because every prefix size is a power of two, if a node tries to
   allocate just a single prefix, the utilization at that node (i.e. at
   that node's domain) can be as low as 50%.  To improve the
   utilization, a MASC node can have more than one prefix allocated at a
   time (typically, each of them with different size).  By using a pre-
   allocation and allocating several prefixes of different size (see
   below), a MASC node should try to keep its address utilization in the
   range 70-90%.
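
   The effect of holding several prefixes can be illustrated with a
   small calculation.  The greedy split below is only one way to cover
   a given demand with power-of-two prefixes; it is an assumption of
   this sketch, as are the function names, and it ignores the headroom
   that pre-allocation adds.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def prefix_sizes(demand, max_prefixes):
    """Cover `demand` addresses with at most `max_prefixes` prefixes."""
    sizes = []
    remaining = demand
    while remaining > 0:
        if len(sizes) == max_prefixes - 1:
            # Last allowed prefix: round the remainder up.
            sizes.append(next_power_of_two(remaining))
            break
        # Largest power of two that still fits in the remainder.
        size = 1 << (remaining.bit_length() - 1)
        sizes.append(size)
        remaining -= size
    return sizes


def utilization(demand, sizes):
    """Fraction of the allocated addresses that is actually needed."""
    return demand / sum(sizes)


single = prefix_sizes(300, 1)   # one prefix of 512 addresses
several = prefix_sizes(300, 3)  # prefixes of 256, 32 and 16 addresses
```

   For a demand of 300 addresses, a single prefix gives a utilization
   of about 59%, while three prefixes cover the same demand with 304
   addresses.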

17.1.4.  Prefix Selection After Increase of Demand

   To additionally reduce the allocation latency by reducing the
   probability for collision, and to improve the aggregability of the
   allocated addresses, a MASC node carefully chooses the prefixes to
   claim. The first prefix is chosen at random among all reasonably
   expandable candidates.  If a node chooses to allocate another,
   smaller prefix, then, instead of doubling the size of the first one,
   which might significantly reduce the address utilization, a second
   "neighbor" prefix is chosen.  For example, if prefix 224.0/16 was
   already allocated, and the MASC domain needs 256 more addresses, the
   second prefix to claim will be 224.1.0/24.  If the domain needs more
   addresses, the second prefix will eventually grow to 224.1/16, and
   then both prefixes can be automatically aggregated into 224.0/15.
   Only if 224.1.0/24 could not be allocated will a MASC node choose
   another prefix (eventually at random among the unused prefixes).
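
   The "neighbor" selection and the later aggregation can be reproduced
   with Python's standard ipaddress module.  The neighbor() helper is a
   hypothetical name; the sketch only illustrates the address
   arithmetic of the example and performs no claiming.

```python
import ipaddress


def neighbor(prefix):
    """Return the other half of the next-shorter prefix containing `prefix`."""
    parent = prefix.supernet()
    return next(n for n in parent.subnets() if n != prefix)


first = ipaddress.ip_network("224.0.0.0/16")
sibling = neighbor(first)                      # 224.1.0.0/16
second = next(sibling.subnets(new_prefix=24))  # first /24 inside the neighbor

# Once the second prefix has grown to the full /16, both aggregate.
grown = second.supernet(new_prefix=16)
merged = list(ipaddress.collapse_addresses([first, grown]))
```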

   If the number of allocated prefixes increases above some threshold,
   and none of them can be extended when more addresses are needed,
   then, to reduce the amount of state, a MASC node should claim a new
   larger prefix and should stop re-claiming the older non-expandable
   prefixes.  Research results show that up to three prefixes per MASC
   domain is a reasonable threshold, such that the address utilization
   can be in the range 70-90%, and at the same time the prefix flux will
   be reasonably low.

17.1.5.  Prefix Selection After Decrease of Demand

   If the demand for addresses decreases, such that its address space is
   under-utilized, a MASC node implicitly returns the unused prefixes
   after their lifetimes expire, or re-claims some smaller sub-prefixes.
   For example, if prefix 224.0/15 is 50% used by the MAASs and/or
   children MASC domains, and the overall utilization is such that
   approximately 2^16 (64K) addresses should be returned, a MASC node
   should stop reclaiming 224.0/15 and should start reclaiming either
   224.0/16 or 224.1/16 (whichever sub-prefix utilization is higher).
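
   Choosing which half to keep re-claiming can be sketched as follows.
   The half_to_keep() helper and its input (a list of addresses
   currently in use) are hypothetical, and breaking a tie in favor of
   the lower half is an assumption.

```python
import ipaddress


def half_to_keep(prefix, in_use):
    """Return the half of `prefix` that contains more in-use addresses.

    prefix: an ipaddress network object.
    in_use: iterable of address strings currently allocated.
    """
    low, high = prefix.subnets()  # the two halves, one bit longer
    addresses = [ipaddress.ip_address(a) for a in in_use]
    low_count = sum(a in low for a in addresses)
    high_count = sum(a in high for a in addresses)
    # Assumption: on a tie, keep the lower half.
    return high if high_count > low_count else low


kept = half_to_keep(
    ipaddress.ip_network("224.0.0.0/15"),
    ["224.0.0.1", "224.1.0.1", "224.1.0.2"],
)
```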

17.1.6.  Lifetime Extension Algorithm

   If the demand for addresses did not decrease, then a MASC node re-
   claims the prefixes it has allocated before their lifetime expires.
   Each prefix (or sub-prefix if the demand has decreased) should be
   re-claimed every 48 hours.

18.  APPENDIX B: Strawman Deployment

   At the moment of writing, is temporarily
   allocated to MALLOC.  Presumably this block of addresses will be used
   for experimental deployment and testing.

   If MASC were widely deployed on the Internet, we might expect numbers
   similar to the following:

   o  Initially, there will be approximately 128 Top-Level Domains
      (TLDs).

   o  Assume initially approximately 8192 level-2 MASC domains; on
      average, a TLD will have approximately 64 children domains.

   o  MASC managed global addresses:

      The following (large) ranges are not allocated yet (2^N represents
      the size of the contiguous mask prefixes):

       - = 2^26 + 2^25 + 2^24
       - = 2^25 + 2^25 + 2^24
       Total:   12*2^24 addresses

      Initially, the range - (4*2^24 = 2^26 =
      64M) could be used by MASC as the global addresses pool. The rest
      (8*2^24) should be reserved.  Part of it could be added later to
      MASC, or can be used to enlarge the pool of administratively
      scoped addresses (currently 239.X.X.X), or the pool for static
      allocation (233.X.X.X).

   o  If the multicast addresses are evenly distributed, each TLD would
      have a maximum of 2^19 (512K) addresses, while each level-2 MASC
      domain would have 8192 addresses.

   o  Initial claim size: 256 addresses/MASC domain

   o  Could use soft and hard thresholds to specify the maximum amount
      of claimed+allocated addresses per domain.  For example, trigger a
      warning message if claimed+allocated addresses by a domain is >=
      1.0*average_assumed_per_domain (a strawman default soft
      threshold):

         * if a TLD claim+allocation >= 512K
         * if a second level MASC domain claim+allocation >= 8K

      The hard threshold (for example, 2.0*average_assumed_per_domain)
      can be enforced by sending an explicit DENIED message.

      The TLD's thresholds (with regard to the claims by the second
      level MASC domains) are a private matter and are a part of the
      particular TLD policy: the thresholds could be per customer, and
      the warnings to the administrators could be a signal that it is
      time to change the policy.

   o  Initial claim lifetime is of the order of 30 days.  Prefix
      lifetime is periodically (every 48 hours) reclaimed/extended,
      unless the prefix is under-utilized (see APPENDIX A).  Because the
      allocation is demand-driven, the allocated prefix lifetime will be
      automatically extended if the MAASs need longer prefix lifetime
      (e.g. 3-6 months).

   o  A level-2 MASC domain could have children (i.e. level-3) MASC
      domains.

   o  If a level-2 or level-3 MASC domain uses less than 128 addresses,
      a Layer 2 protocol/mechanism (e.g. AAP) should be run among that
      domain and its parent MASC domain.
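
   The per-domain figures in this strawman follow from simple division,
   and the soft and hard thresholds can be expressed as a small check.
   The sketch below uses the numbers given above; threshold_action()
   and its return values are hypothetical, as is the use of ">=" for
   the hard threshold.

```python
POOL = 4 * 2**24        # initial global pool: 2^26 (64M) addresses
TLDS = 128              # assumed number of Top-Level Domains
LEVEL2_DOMAINS = 8192   # assumed number of level-2 MASC domains

per_tld = POOL // TLDS               # 2^19 = 512K addresses per TLD
per_level2 = POOL // LEVEL2_DOMAINS  # 8192 addresses per level-2 domain
children_per_tld = LEVEL2_DOMAINS // TLDS  # 64 children per TLD


def threshold_action(claimed_plus_allocated, average, soft=1.0, hard=2.0):
    """Return "deny", "warn" or "ok" for a domain's total address usage.

    soft and hard are multipliers of the assumed per-domain average.
    """
    if claimed_plus_allocated >= hard * average:
        return "deny"   # hard threshold: answer with an explicit DENIED
    if claimed_plus_allocated >= soft * average:
        return "warn"   # soft threshold: trigger a warning message
    return "ok"
```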

19.  Authors' Addresses

   Pavlin Radoslavov
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089


   Deborah Estrin
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089


   Ramesh Govindan
   University of Southern California/ISI
   4676 Admiralty Way
   Marina Del Rey, CA 90292


   Mark Handley
   AT&T Center for Internet Research at ICSI (ACIRI)
   1947 Center St., Suite 600
   Berkeley, CA 94704


   Satish Kumar
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089


   David Thaler
   One Microsoft Way
   Redmond, WA 98052


20.  References

   [AAP]                 Handley, M. and S. Hanna, "Multicast Address
                         Allocation Protocol (AAP)", Work in Progress.

   [API]                 Finlayson, R., "An Abstract API for Multicast
                         Address Allocation", RFC 2771, February 2000.

   [BGMP]                Thaler, D., Estrin, D. and D. Meyer, "Border
                         Gateway Multicast Protocol (BGMP): Protocol
                         Specification", Work in Progress.

   [BGP]                 Rekhter, Y. and T. Li, "A Border Gateway
                         Protocol 4 (BGP-4)", RFC 1771, March 1995.

   [CIDR]                Rekhter, Y. and C. Topolcic, "Exchanging
                         Routing Information Across Provider Boundaries
                         in the CIDR Environment", RFC 1520, September
                         1993.

   [IANA]                Reynolds, J. and J. Postel, "Assigned Numbers",
                         STD 2, RFC 1700, October 1994.

   [IANA-CONSIDERATIONS] Alvestrand, H. and T. Narten, "Guidelines for
                         Writing an IANA Considerations Section in
                         RFCs", BCP 26, RFC 2434, October 1998.

   [IPSEC]               Kent, S. and R. Atkinson, "Security
                         Architecture for the Internet Protocol", RFC
                         2401, November 1998.

   [KAMPAI]              Tsuchiya, P., "Efficient and Flexible
                         Hierarchical Address Assignment", INET92, June
                         1992, pp. 441--450.

   [MADCAP]              Hanna, S., Patel, B. and M. Shah, "Multicast
                         Address Dynamic Client Allocation Protocol
                         (MADCAP)", RFC 2730, December 1999.

   [MALLOC]              Thaler, D., Handley, M. and D. Estrin, "The
                         Internet Multicast Address Allocation
                         Architecture", RFC 2908, September 2000.

   [MBGP]                Bates, T., Chandra, R., Katz, D. and Y.
                         Rekhter, "Multiprotocol Extensions for BGP-4",
                         RFC 2283, September 1997.

   [MTRACE]              Fenner, W. and S. Casner, "A `traceroute'
                         facility for IP Multicast", Work in Progress.

   [MZAP]                Handley, M., Thaler, D. and R. Kermode,
                         "Multicast-Scope Zone Announcement Protocol
                         (MZAP)", RFC 2776, February 2000.

   [RFC1112]             Deering, S., "Host Extensions for IP
                         Multicasting", STD 5, RFC 1112, August 1989.

   [RFC2119]             Bradner, S., "Key words for use in RFCs to
                         Indicate Requirement Levels", BCP 14, RFC 2119,
                         March 1997.

   [RFC2373]             Hinden, R. and S. Deering, "IP Version 6
                         Addressing Architecture", RFC 2373, July 1998.

   [RFC2460]             Deering, S. and R. Hinden, "Internet Protocol,
                         Version 6 (IPv6) Specification", RFC 2460,
                         December 1998.

   [SCOPE]               Meyer, D., "Administratively Scoped IP
                         Multicast", RFC 2365, July 1998.

21.  Full Copyright Statement

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

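   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.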


   Funding for the RFC Editor function is currently provided by the
   Internet Society.