RFC 7798

RTP Payload Format for High Efficiency Video Coding (HEVC)

Pages: 86
Proposed Standard

Part 3 of 4 – Pages 42 to 64

7.  Payload Format Parameters

   This section specifies the parameters that MAY be used to select
   optional features of the payload format and certain features or
   properties of the bitstream or the RTP stream.  The parameters are
   specified here as part of the media type registration for the HEVC
   codec.  A mapping of the parameters into the Session Description
   Protocol (SDP) [RFC4566] is also provided for applications that use
   SDP.  Equivalent parameters could be defined elsewhere for use with
   control protocols that do not use SDP.

7.1.  Media Type Registration

   The media subtype for the HEVC codec is allocated from the IETF tree.

   The receiver MUST ignore any unrecognized parameter.

   Type name:     video

   Subtype name:  H265

   Required parameters: none

   OPTIONAL parameters:

      profile-space, tier-flag, profile-id, profile-compatibility-
      indicator, interop-constraints, and level-id:

         These parameters indicate the profile, tier, default level, and
         some constraints of the bitstream carried by the RTP stream and
         all RTP streams the RTP stream depends on, or a specific set of
         the profile, tier, default level, and some constraints the
         receiver supports.

         The profile and some constraints are indicated collectively by
         profile-space, profile-id, profile-compatibility-indicator, and
         interop-constraints.  The profile specifies the subset of
         coding tools that may have been used to generate the bitstream
         or that the receiver supports.

            Informative note: There are 32 values of profile-id, and
            there are 32 flags in profile-compatibility-indicator, each
            flag corresponding to one value of profile-id.  According to
            HEVC version 1 in [HEVC], when more than one of the 32 flags
            is set for a bitstream, the bitstream would comply with all
            the profiles corresponding to the set flags.  However, in a
            draft of HEVC version 2 in [HEVCv2], Subclause A.3.5, 19
            Format Range Extensions profiles have been specified, all
            using the same value of profile-id (4), differentiated by
            some of the 48 bits in interop-constraints; this rather
            unexpected way of profile signaling means that one of the
            32 flags may correspond to multiple profiles.  To be able to
            support whatever HEVC extension profile that might be
            specified and indicated using profile-space, profile-id,
            profile-compatibility-indicator, and interop-constraints in
            the future, it would be safe to require symmetric use of
            these parameters in SDP offer/answer unless recv-sub-layer-
            id is included in the SDP answer for choosing one of the
            sub-layers offered.

         The tier is indicated by tier-flag.  The default level is
         indicated by level-id.  The tier and the default level specify
         the limits on values of syntax elements or arithmetic
         combinations of values of syntax elements that are followed
         when generating the bitstream or that the receiver supports.

         A set of profile-space, tier-flag, profile-id, profile-
         compatibility-indicator, interop-constraints, and level-id
         parameters ptlA is said to be consistent with another set of
         these parameters ptlB if any decoder that conforms to the
         profile, tier, level, and constraints indicated by ptlB can
         decode any bitstream that conforms to the profile, tier, level,
         and constraints indicated by ptlA.

         In SDP offer/answer, when the SDP answer does not include the
         recv-sub-layer-id parameter that is less than the sprop-sub-
         layer-id parameter in the SDP offer, the following applies:

            o  The profile-space, tier-flag, profile-id, profile-
               compatibility-indicator, and interop-constraints
               parameters MUST be used symmetrically, i.e., the value of
               each of these parameters in the offer MUST be the same as
               that in the answer, either explicitly signaled or
               implicitly inferred.

            o  The level-id parameter is changeable as long as the
               highest level indicated by the answer is either equal to
               or lower than that in the offer.  Note that the highest
               level is indicated by level-id and max-recv-level-id
               together.
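
         Informative sketch (not part of the specification): the two
         rules above can be checked mechanically.  The Python below is
         a minimal illustration for the case where the answer does not
         select a lower sub-layer.  The function names and the
         dictionary representation of the parameters are hypothetical.
         The defaults are those inferred later in this section, and the
         bit ordering used for the default profile-compatibility-
         indicator and interop-constraints values is an assumption.

```python
def highest_level(params):
    """Return the highest level a parameter set indicates.

    max-recv-level-id takes precedence when present; otherwise level-id
    is used, with 93 inferred when level-id is absent.
    """
    level_id = int(params.get("level-id", 93))
    return int(params.get("max-recv-level-id", level_id))


def symmetric_params(params):
    """Return the parameters that must match, with defaults applied."""
    profile_id = int(params.get("profile-id", 1))
    # Assumption: bit j of the indicator is counted from the most
    # significant bit, so the default has only bit profile-id set.
    default_pci = format(1 << (31 - profile_id), "08X")
    return (
        int(params.get("profile-space", 0)),
        int(params.get("tier-flag", 0)),
        profile_id,
        int(params.get("profile-compatibility-indicator", default_pci), 16),
        # Assumption: the default flags 1, 0, 1, 1 pack to 0xB followed
        # by 44 zero bits.
        int(params.get("interop-constraints", "B00000000000"), 16),
    )


def answer_is_acceptable(offer, answer):
    """Check an answer that does not select a lower sub-layer."""
    return (symmetric_params(offer) == symmetric_params(answer)
            and highest_level(answer) <= highest_level(offer))
```

         For instance, an offer with level-id=120 would accept an
         answer with level-id=93 under this sketch, but not an answer
         that changes profile-id or raises the highest level.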

         In SDP offer/answer, when the SDP answer does include the recv-
         sub-layer-id parameter that is less than the sprop-sub-layer-id
         parameter in the SDP offer, the set of profile-space, tier-
         flag, profile-id, profile-compatibility-indicator, interop-
         constraints, and level-id parameters included in the answer
         MUST be consistent with that for the chosen sub-layer
         representation as indicated in the SDP offer, with the
         exception that the level-id parameter in the SDP answer is
         changeable as long as the highest level indicated by the answer
         is either lower than or equal to that in the offer.

         More specifications of these parameters, including how they
         relate to the values of the profile, tier, and level syntax
         elements specified in [HEVC], are provided below.

      profile-space, profile-id:

         The value of profile-space MUST be in the range of 0 to 3,
         inclusive.  The value of profile-id MUST be in the range of 0
         to 31, inclusive.

         When profile-space is not present, a value of 0 MUST be
         inferred.  When profile-id is not present, a value of 1 (i.e.,
         the Main profile) MUST be inferred.

         When used to indicate properties of a bitstream, profile-space
         and profile-id are derived from the profile, tier, and level
         syntax elements in SPS or VPS NAL units as follows, where
         general_profile_space, general_profile_idc,
         sub_layer_profile_space[j], and sub_layer_profile_idc[j] are
         specified in [HEVC]:

            If the RTP stream is the highest RTP stream, the following
            applies:

            o profile-space = general_profile_space
            o profile-id = general_profile_idc

            Otherwise (the RTP stream is a dependee RTP stream), the
            following applies, with j being the value of the sprop-sub-
            layer-id parameter:

            o profile-space = sub_layer_profile_space[j]
            o profile-id = sub_layer_profile_idc[j]

      tier-flag, level-id:

         The value of tier-flag MUST be in the range of 0 to 1,
         inclusive.  The value of level-id MUST be in the range of 0 to
         255, inclusive.

         If the tier-flag and level-id parameters are used to indicate
         properties of a bitstream, they indicate the tier and the
         highest level the bitstream complies with.

         If the tier-flag and level-id parameters are used for
         capability exchange, the following applies.  If max-recv-level-
         id is not present, the default level defined by level-id
         indicates the highest level the codec wishes to support.
         Otherwise, max-recv-level-id indicates the highest level the
         codec supports for receiving.  For either receiving or sending,
         all levels that are lower than the highest level supported MUST
         also be supported.

         If no tier-flag is present, a value of 0 MUST be inferred; if
         no level-id is present, a value of 93 (i.e., level 3.1) MUST be
         inferred.

         When used to indicate properties of a bitstream, the tier-flag
         and level-id parameters are derived from the profile, tier, and
         level syntax elements in SPS or VPS NAL units as follows, where
         general_tier_flag, general_level_idc, sub_layer_tier_flag[j],
         and sub_layer_level_idc[j] are specified in [HEVC]:

            If the RTP stream is the highest RTP stream, the following
            applies:

            o tier-flag = general_tier_flag
            o level-id = general_level_idc

            Otherwise (the RTP stream is a dependee RTP stream), the
            following applies, with j being the value of the sprop-sub-
            layer-id parameter:

            o tier-flag = sub_layer_tier_flag[j]
            o level-id = sub_layer_level_idc[j]

      interop-constraints:

         A base16 [RFC4648] (hexadecimal) representation of six bytes of
         data, consisting of progressive_source_flag,
         interlaced_source_flag, non_packed_constraint_flag,
         frame_only_constraint_flag, and reserved_zero_44bits.

         If the interop-constraints parameter is not present, the
         following MUST be inferred:

            o progressive_source_flag = 1
            o interlaced_source_flag = 0
            o non_packed_constraint_flag = 1
            o frame_only_constraint_flag = 1
            o reserved_zero_44bits = 0
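
         Informative sketch (not part of the specification): the
         following Python shows one way to convert between the five
         fields and the six-byte base16 string.  It assumes the fields
         are packed most significant bit first in the order listed
         above; that ordering and the function names are assumptions of
         this example.

```python
def pack_interop_constraints(progressive_source_flag=1,
                             interlaced_source_flag=0,
                             non_packed_constraint_flag=1,
                             frame_only_constraint_flag=1,
                             reserved_zero_44bits=0):
    """Pack the fields into 12 hex digits (six bytes).

    The keyword defaults are the values inferred when the
    interop-constraints parameter is absent.
    """
    if reserved_zero_44bits >> 44:
        raise ValueError("reserved_zero_44bits does not fit in 44 bits")
    value = (progressive_source_flag << 47
             | interlaced_source_flag << 46
             | non_packed_constraint_flag << 45
             | frame_only_constraint_flag << 44
             | reserved_zero_44bits)
    return format(value, "012X")


def unpack_interop_constraints(hex_value):
    """Split a 12-hex-digit interop-constraints value into its fields."""
    if len(hex_value) != 12:
        raise ValueError("expected six bytes (12 hex digits)")
    value = int(hex_value, 16)
    return {
        "progressive_source_flag": (value >> 47) & 1,
        "interlaced_source_flag": (value >> 46) & 1,
        "non_packed_constraint_flag": (value >> 45) & 1,
        "frame_only_constraint_flag": (value >> 44) & 1,
        "reserved_zero_44bits": value & ((1 << 44) - 1),
    }
```

         Under the stated packing assumption, the inferred default
         corresponds to the string "B00000000000".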

         When the interop-constraints parameter is used to indicate
         properties of a bitstream, the following applies, where
         general_progressive_source_flag,
         general_interlaced_source_flag,
         general_non_packed_constraint_flag,
         general_frame_only_constraint_flag,
         general_reserved_zero_44bits,
         sub_layer_progressive_source_flag[j],
         sub_layer_interlaced_source_flag[j],
         sub_layer_non_packed_constraint_flag[j],
         sub_layer_frame_only_constraint_flag[j], and
         sub_layer_reserved_zero_44bits[j] are specified in [HEVC]:

            If the RTP stream is the highest RTP stream, the following
            applies:

            o progressive_source_flag = general_progressive_source_flag

            o interlaced_source_flag = general_interlaced_source_flag

            o non_packed_constraint_flag =
                 general_non_packed_constraint_flag

            o frame_only_constraint_flag =
                 general_frame_only_constraint_flag

            o reserved_zero_44bits = general_reserved_zero_44bits

            Otherwise (the RTP stream is a dependee RTP stream), the
            following applies, with j being the value of the sprop-sub-
            layer-id parameter:

            o progressive_source_flag =
                 sub_layer_progressive_source_flag[j]

            o interlaced_source_flag =
                 sub_layer_interlaced_source_flag[j]

            o non_packed_constraint_flag =
                 sub_layer_non_packed_constraint_flag[j]

            o frame_only_constraint_flag =
                 sub_layer_frame_only_constraint_flag[j]

            o reserved_zero_44bits = sub_layer_reserved_zero_44bits[j]

            Using interop-constraints for capability exchange results in
            a requirement on any bitstream to be compliant with the
            interop-constraints.

      profile-compatibility-indicator:

         A base16 [RFC4648] representation of four bytes of data.

         When profile-compatibility-indicator is used to indicate
         properties of a bitstream, the following applies, where
         general_profile_compatibility_flag[j] and
         sub_layer_profile_compatibility_flag[i][j] are specified in
         [HEVC]:

            The profile-compatibility-indicator in this case indicates
            the profiles the bitstream conforms to in addition to the
            profile defined by profile-space, profile-id, and
            interop-constraints.  A decoder that conforms to any of the
            profiles the bitstream conforms to would be capable of
            decoding the bitstream.  These additional profiles are
            defined by profile-space, each set bit of profile-
            compatibility-indicator, and interop-constraints.

            If the RTP stream is the highest RTP stream, the following
            applies for each value of j in the range of 0 to 31,
            inclusive:

            o bit j of profile-compatibility-indicator =
                 general_profile_compatibility_flag[j]

            Otherwise (the RTP stream is a dependee RTP stream), the
            following applies for i equal to sprop-sub-layer-id and for
            each value of j in the range of 0 to 31, inclusive:

            o bit j of profile-compatibility-indicator =
                 sub_layer_profile_compatibility_flag[i][j]

         Using profile-compatibility-indicator for capability exchange
         results in a requirement on any bitstream to be compliant with
         the profile-compatibility-indicator.  This is intended to
         handle cases where any future HEVC profile is defined as an
         intersection of two or more profiles.

         If this parameter is not present, this parameter defaults to
         the following: bit j, with j equal to profile-id, of profile-
         compatibility-indicator is inferred to be equal to 1, and all
         other bits are inferred to be equal to 0.
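
         Informative sketch (not part of the specification): the
         default described above can be computed as follows.  The text
         does not state which end of the four-byte value holds bit 0;
         this Python example assumes bit j is counted from the most
         significant bit, mirroring the order in which the flags appear
         in the bitstream.  The function names are hypothetical.

```python
def default_profile_compatibility_indicator(profile_id=1):
    """Return the inferred indicator: only bit profile-id is set."""
    if not 0 <= profile_id <= 31:
        raise ValueError("profile-id must be in the range 0 to 31")
    # Assumption: bit 0 is the most significant bit of the 32-bit value.
    return format(1 << (31 - profile_id), "08X")


def compatible_profile_ids(indicator_hex):
    """List the values of j whose bit is set in the indicator."""
    value = int(indicator_hex, 16)
    return [j for j in range(32) if (value >> (31 - j)) & 1]
```

         With the inferred profile-id of 1, this sketch yields
         "40000000" under the stated bit-ordering assumption.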

      sprop-sub-layer-id:

         This parameter MAY be used to indicate the highest allowed
         value of TID in the bitstream.  When not present, the value of
         sprop-sub-layer-id is inferred to be equal to 6.

         The value of sprop-sub-layer-id MUST be in the range of 0 to 6,
         inclusive.

      recv-sub-layer-id:

         This parameter MAY be used to signal a receiver's choice of the
         offered or declared sub-layer representations in the sprop-vps.
         The value of recv-sub-layer-id indicates the TID of the highest
         sub-layer of the bitstream that a receiver supports.  When not
         present, the value of recv-sub-layer-id is inferred to be equal
         to the value of the sprop-sub-layer-id parameter in the SDP
         offer.

         The value of recv-sub-layer-id MUST be in the range of 0 to 6,
         inclusive.
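
         Informative sketch (not part of the specification): the
         inference rules for sprop-sub-layer-id and recv-sub-layer-id
         can be illustrated with the Python below.  The function name
         and the dictionary representation of the offer and answer
         parameters are hypothetical.

```python
def infer_sub_layer_ids(offer, answer):
    """Return (sprop-sub-layer-id, recv-sub-layer-id) with defaults applied.

    sprop-sub-layer-id is inferred as 6 when absent from the offer;
    recv-sub-layer-id is inferred as the offer's sprop-sub-layer-id
    when absent from the answer.
    """
    sprop = int(offer.get("sprop-sub-layer-id", 6))
    recv = int(answer.get("recv-sub-layer-id", sprop))
    for name, value in (("sprop-sub-layer-id", sprop),
                        ("recv-sub-layer-id", recv)):
        if not 0 <= value <= 6:
            raise ValueError(f"{name} must be in the range 0 to 6, got {value}")
    return sprop, recv
```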

      max-recv-level-id:

         This parameter MAY be used to indicate the highest level a
         receiver supports.  The highest level the receiver supports is
         equal to the value of max-recv-level-id divided by 30.

         The value of max-recv-level-id MUST be in the range of 0 to
         255, inclusive.

         When max-recv-level-id is not present, the value is inferred to
         be equal to level-id.

         max-recv-level-id MUST NOT be present when the highest level
         the receiver supports is not higher than the default level.
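
         Informative sketch (not part of the specification): the
         relationship between level-id, max-recv-level-id, and the
         highest supported level can be illustrated in Python as
         follows.  The function name is hypothetical; the default of 93
         for level-id is the value inferred when level-id is absent.

```python
def highest_receive_level(level_id=93, max_recv_level_id=None):
    """Return the highest level a receiver supports, as a level number."""
    if not 0 <= level_id <= 255:
        raise ValueError("level-id must be in the range 0 to 255")
    if max_recv_level_id is None:
        # Absent: inferred to be equal to level-id.
        return level_id / 30
    if not 0 <= max_recv_level_id <= 255:
        raise ValueError("max-recv-level-id must be in the range 0 to 255")
    if max_recv_level_id <= level_id:
        # The parameter must not be present unless it raises the level.
        raise ValueError("max-recv-level-id must be higher than level-id")
    return max_recv_level_id / 30
```

         For example, a receiver that signals only the inferred
         level-id of 93 supports up to level 3.1, and one that adds
         max-recv-level-id=120 supports up to level 4.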

      tx-mode:

         This parameter indicates whether the transmission mode is SRST,
         MRST, or MRMT.

         The value of tx-mode MUST be equal to "SRST", "MRST" or "MRMT".
         When not present, the value of tx-mode is inferred to be equal
         to "SRST".

         If the value is equal to "MRST", MRST MUST be in use.
         Otherwise, if the value is equal to "MRMT", MRMT MUST be in
         use.  Otherwise (the value is equal to "SRST"), SRST MUST be in
         use.

         The value of tx-mode MUST be equal to "MRST" for all RTP
         streams in an MRST.

         The value of tx-mode MUST be equal to "MRMT" for all RTP
         streams in an MRMT.

      sprop-vps:

         This parameter MAY be used to convey any video parameter set
         NAL unit of the bitstream for out-of-band transmission of video
         parameter sets.  The parameter MAY also be used for capability
         exchange and to indicate sub-stream characteristics (i.e.,
         properties of sub-layer representations as defined in [HEVC]).
         The value of the parameter is a comma-separated (',') list of
         base64 [RFC4648] representations of the video parameter set NAL
         units as specified in Section 7.3.2.1 of [HEVC].

         The sprop-vps parameter MAY contain one or more video parameter
         set NAL units.  However, all other video parameter sets
         contained in the sprop-vps parameter MUST be consistent with
         the first video parameter set in the sprop-vps parameter.  A
         video parameter set vpsB is said to be consistent with another
         video parameter set vpsA if any decoder that conforms to the
         profile, tier, level, and constraints indicated by the 12 bytes
         of data starting from the syntax element general_profile_space
         to the syntax element general_level_idc, inclusive, in the
         first profile_tier_level( ) syntax structure in vpsA can decode
         any bitstream that conforms to the profile, tier, level, and
         constraints indicated by the 12 bytes of data starting from the
         syntax element general_profile_space to the syntax element
         general_level_idc, inclusive, in the first profile_tier_level(
         ) syntax structure in vpsB.

      sprop-sps:

         This parameter MAY be used to convey sequence parameter set NAL
         units of the bitstream for out-of-band transmission of sequence
         parameter sets.  The value of the parameter is a comma-
         separated (',') list of base64 [RFC4648] representations of the
         sequence parameter set NAL units as specified in Section
         7.3.2.2 of [HEVC].

      sprop-pps:

         This parameter MAY be used to convey picture parameter set NAL
         units of the bitstream for out-of-band transmission of picture
         parameter sets.  The value of the parameter is a comma-
         separated (',') list of base64 [RFC4648] representations of the
         picture parameter set NAL units as specified in Section 7.3.2.3
         of [HEVC].
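
         Informative sketch (not part of the specification): the
         sprop-vps, sprop-sps, and sprop-pps values share the same
         comma-separated base64 encoding, so one helper can decode all
         three.  The Python below is a minimal illustration; the
         function names are hypothetical, and the type extraction
         relies on the two-byte HEVC NAL unit header, in which
         nal_unit_type occupies the six bits after the first bit.

```python
import base64


def decode_sprop(value):
    """Decode a comma-separated list of base64 NAL units into bytes."""
    return [base64.b64decode(item, validate=True)
            for item in value.split(",") if item]


def nal_unit_type(nal):
    """Return nal_unit_type from the first byte of the NAL unit header."""
    return (nal[0] >> 1) & 0x3F
```

         A receiver could use nal_unit_type to confirm that each
         decoded unit has the expected type (32 for a video parameter
         set, 33 for a sequence parameter set, 34 for a picture
         parameter set) before handing it to the decoder.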

      sprop-sei:

         This parameter MAY be used to convey one or more SEI messages
         that describe bitstream characteristics.  When present, a
         decoder can rely on the bitstream characteristics that are
         described in the SEI messages for the entire duration of the
         session, independently from the persistence scopes of the SEI
         messages as specified in [HEVC].

         The value of the parameter is a comma-separated (',') list of
         base64 [RFC4648] representations of SEI NAL units as specified
         in Section 7.3.2.4 of [HEVC].

            Informative note: Intentionally, no list of applicable or
            inapplicable SEI messages is specified here.  Conveying
            certain SEI messages in sprop-sei may be sensible in some
            application scenarios and meaningless in others.  However, a
            few examples are described below:

               1) In an environment where the bitstream was created from
                  film-based source material, and no splicing is going
                  to occur during the lifetime of the session, the film
                  grain characteristics SEI message or the tone mapping
                  information SEI message are likely meaningful, and
                  sending them in sprop-sei rather than in the bitstream
                  at each entry point may help with saving bits and
                  allows one to configure the renderer only once,
                  avoiding unwanted artifacts.

               2) The structure of pictures information SEI message in
                  sprop-sei can be used to inform a decoder of
                  information on the NAL unit types, picture-order count
                  values, and prediction dependencies of a sequence of
                  pictures.  Having such knowledge can be helpful for
                  error recovery.

               3) Examples of SEI messages that would be meaningless to
                  convey in sprop-sei include the decoded picture hash
                  SEI message (it is close to impossible that all
                  decoded pictures have the same hash), the display
                  orientation SEI message when the device is a handheld
                  device (as the display orientation may change when the
                  handheld device is turned around), or the filler
                  payload SEI message (as there is no point in just
                  having more bits in SDP).

      max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc:

         These parameters MAY be used to signal the capabilities of a
         receiver implementation.  These parameters MUST NOT be used for
         any other purpose.  The highest level (specified by max-recv-
         level-id) MUST be the highest that the receiver is fully
         capable of supporting.  max-lsr, max-lps, max-cpb, max-dpb,
         max-br, max-tr, and max-tc MAY be used to indicate capabilities
         of the receiver that extend the required capabilities of the
         highest level, as specified below.

         When more than one parameter from the set (max-lsr, max-lps,
         max-cpb, max-dpb, max-br, max-tr, max-tc) is present, the
         receiver MUST support all signaled capabilities simultaneously.
         For example, if both max-lsr and max-br are present, the
         highest level with the extension of both the picture rate and
         bitrate is supported.  That is, the receiver is able to decode
         bitstreams in which the luma sample rate is up to max-lsr
         (inclusive), the bitrate is up to max-br (inclusive), the coded
         picture buffer size is derived as specified in the semantics of
         the max-br parameter below, and the other properties comply
         with the highest level specified by max-recv-level-id.

            Informative note: When the OPTIONAL media type parameters
            are used to signal the properties of a bitstream, and max-
            lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc
            are not present, the values of profile-space, tier-flag,
            profile-id, profile-compatibility-indicator, interop-
            constraints, and level-id must always be such that the
            bitstream complies fully with the specified profile, tier,
            and level.

      max-lsr:

         The value of max-lsr is an integer indicating the maximum
         processing rate in units of luma samples per second.  The max-
         lsr parameter signals that the receiver is capable of decoding
         video at a higher rate than is required by the highest level.

         When max-lsr is signaled, the receiver MUST be able to decode
         bitstreams that conform to the highest level, with the
         exception that the MaxLumaSR value in Table A-2 of [HEVC] for
         the highest level is replaced with the value of max-lsr.
         Senders MAY use this knowledge to send pictures of a given size
         at a higher picture rate than is indicated in the highest
         level.

         When not present, the value of max-lsr is inferred to be equal
         to the value of MaxLumaSR given in Table A-2 of [HEVC] for the
         highest level.

         The value of max-lsr MUST be in the range of MaxLumaSR to 16 *
         MaxLumaSR, inclusive, where MaxLumaSR is given in Table A-2 of
         [HEVC] for the highest level.
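
         Informative sketch (not part of the specification): the
         inference and range rules for max-lsr follow a pattern that
         this section repeats for max-lps, max-cpb, max-br, max-tr, and
         max-tc.  The Python below illustrates that pattern.  The
         function name is hypothetical, and the level limit is passed
         in by the caller, who is expected to take it from the relevant
         table of [HEVC].

```python
def effective_limit(name, signaled, level_limit):
    """Return the limit in force for one receiver-capability parameter.

    signaled is the parameter value, or None when the parameter is
    absent; level_limit is the corresponding limit of the highest level.
    """
    if signaled is None:
        # Absent: inferred to be equal to the limit of the highest level.
        return level_limit
    if not level_limit <= signaled <= 16 * level_limit:
        raise ValueError(
            f"{name} must be in the range {level_limit} to "
            f"{16 * level_limit}, got {signaled}")
    return signaled
```

         For instance, assuming a MaxLumaSR of 33177600 for the highest
         level, a signaled max-lsr of 66355200 would be accepted and
         would replace the level limit, while a value below 33177600
         would be rejected.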

      max-lps:

         The value of max-lps is an integer indicating the maximum
         picture size in units of luma samples.  The max-lps parameter
         signals that the receiver is capable of decoding larger picture
         sizes than are required by the highest level.  When max-lps is
         signaled, the receiver MUST be able to decode bitstreams that
         conform to the highest level, with the exception that the
         MaxLumaPS value in Table A-1 of [HEVC] for the highest level is
         replaced with the value of max-lps.  Senders MAY use this
         knowledge to send larger pictures at a proportionally lower
         picture rate than is indicated in the highest level.

         When not present, the value of max-lps is inferred to be equal
         to the value of MaxLumaPS given in Table A-1 of [HEVC] for the
         highest level.

         The value of max-lps MUST be in the range of MaxLumaPS to 16 *
         MaxLumaPS, inclusive, where MaxLumaPS is given in Table A-1 of
         [HEVC] for the highest level.

      max-cpb:

         The value of max-cpb is an integer indicating the maximum coded
         picture buffer size in units of CpbBrVclFactor bits for the VCL
         HRD parameters and in units of CpbBrNalFactor bits for the NAL
         HRD parameters, where CpbBrVclFactor and CpbBrNalFactor are
         defined in Section A.4 of [HEVC].  The max-cpb parameter
         signals that the receiver has more memory than the minimum
         amount of coded picture buffer memory required by the highest
         level.  When max-cpb is signaled, the receiver MUST be able to
         decode bitstreams that conform to the highest level, with the
         exception that the MaxCPB value in Table A-1 of [HEVC] for the
         highest level is replaced with the value of max-cpb.  Senders
         MAY use this knowledge to construct coded bitstreams with
         greater variation of bitrate than can be achieved with the
         MaxCPB value in Table A-1 of [HEVC].

         When not present, the value of max-cpb is inferred to be equal
         to the value of MaxCPB given in Table A-1 of [HEVC] for the
         highest level.

         The value of max-cpb MUST be in the range of MaxCPB to 16 *
         MaxCPB, inclusive, where MaxCPB is given in Table A-1 of
         [HEVC] for the highest level.

            Informative note: The coded picture buffer is used in the
            hypothetical reference decoder (Annex C of [HEVC]).  The use
            of the hypothetical reference decoder is recommended in HEVC
            encoders to verify that the produced bitstream conforms to
            the standard and to control the output bitrate.  Thus, the
            coded picture buffer is conceptually independent of any
            other potential buffers in the receiver, including de-
            packetization and de-jitter buffers.  The coded picture
            buffer need not be implemented in decoders as specified in
            Annex C of [HEVC], but rather standard-compliant decoders
            can have any buffering arrangements provided that they can
            decode standard-compliant bitstreams.  Thus, in practice,
            the input buffer for a video decoder can be integrated with
            de-packetization and de-jitter buffers of the receiver.

      max-dpb:

         The value of max-dpb is an integer indicating the maximum
         decoded picture buffer size in units of decoded pictures at the
         MaxLumaPS for the highest level, i.e., the number of decoded
         pictures at the maximum picture size defined by the highest
         level.  The value of max-dpb MUST be in the range of 1 to 16,
         inclusive.  The max-dpb parameter signals that the receiver
         has more memory than the minimum amount of decoded picture
         buffer memory required by default, which is MaxDpbPicBuf as
         defined in [HEVC] (equal to 6).  When max-dpb is signaled, the
         receiver MUST be able to decode bitstreams that conform to the
         highest level, with the exception that the MaxDpbPicBuf value
         defined in [HEVC] as 6 is replaced with the value of max-dpb.
         Consequently, a receiver that signals max-dpb MUST be capable
         of storing the following number of decoded pictures
         (MaxDpbSize) in its decoded picture buffer:

           if( PicSizeInSamplesY <= ( MaxLumaPS >> 2 ) )
              MaxDpbSize = Min( 4 * max-dpb, 16 )
           else if ( PicSizeInSamplesY <= ( MaxLumaPS >> 1 ) )
              MaxDpbSize = Min( 2 * max-dpb, 16 )
           else if ( PicSizeInSamplesY <= ( ( 3 * MaxLumaPS ) >> 2 ) )
              MaxDpbSize = Min( (4 * max-dpb) / 3, 16 )
           else
              MaxDpbSize = max-dpb

         Wherein MaxLumaPS is given in Table A-1 of [HEVC] for the highest
         level and PicSizeInSamplesY is the current size of each decoded
         picture in units of luma samples as defined in [HEVC].
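
         Informative sketch (not part of the specification): the
         MaxDpbSize derivation above translates directly into Python.
         The function name is hypothetical, and the "/" in the
         derivation is taken to be integer division with truncation,
         which is an assumption of this example.

```python
def max_dpb_size(pic_size_in_samples_y, max_luma_ps, max_dpb=6):
    """Return the number of decoded pictures the receiver must store.

    max_dpb defaults to 6, the value inferred when max-dpb is absent.
    """
    if pic_size_in_samples_y <= (max_luma_ps >> 2):
        return min(4 * max_dpb, 16)
    if pic_size_in_samples_y <= (max_luma_ps >> 1):
        return min(2 * max_dpb, 16)
    if pic_size_in_samples_y <= ((3 * max_luma_ps) >> 2):
        # Assumption: "/" truncates toward zero (integer division).
        return min((4 * max_dpb) // 3, 16)
    return max_dpb
```

         Smaller pictures allow more of them to fit in the same memory,
         which is why the result grows as PicSizeInSamplesY shrinks
         relative to MaxLumaPS, up to the cap of 16.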

         The value of max-dpb MUST be greater than or equal to the value
         of MaxDpbPicBuf (i.e., 6) as defined in [HEVC].  Senders MAY
         use this knowledge to construct coded bitstreams with improved
         compression.

         When not present, the value of max-dpb is inferred to be equal
         to the value of MaxDpbPicBuf (i.e., 6) as defined in [HEVC].

            Informative note: This parameter was added primarily to
            complement a similar codepoint in the ITU-T Recommendation
            H.245, so as to facilitate signaling gateway designs.  The
            decoded picture buffer stores reconstructed samples.  There
            is no relationship between the size of the decoded picture
            buffer and the buffers used in RTP, especially de-
            packetization and de-jitter buffers.

      max-br:

         The value of max-br is an integer indicating the maximum video
         bitrate in units of CpbBrVclFactor bits per second for the VCL
         HRD parameters and in units of CpbBrNalFactor bits per second
         for the NAL HRD parameters, where CpbBrVclFactor and
         CpbBrNalFactor are defined in Section A.4 of [HEVC].

         The max-br parameter signals that the video decoder of the
         receiver is capable of decoding video at a higher bitrate than
         is required by the highest level.

         When max-br is signaled, the video codec of the receiver MUST
         be able to decode bitstreams that conform to the highest level,
         with the following exceptions in the limits specified by the
         highest level:

            o  The value of max-br replaces the MaxBR value in Table A-2
               of [HEVC] for the highest level.

            o  When the max-cpb parameter is not present, the result of
               the following formula replaces the value of MaxCPB in
               Table A-1 of [HEVC]:

               (MaxCPB of the highest level) * max-br / (MaxBR of the
               highest level)

         For example, if a receiver signals capability for Main profile
         Level 2 with max-br equal to 2000, this indicates a maximum
         video bitrate of 2000 kbits/sec for VCL HRD parameters, a
         maximum video bitrate of 2200 kbits/sec for NAL HRD parameters,
         and a CPB size of 2000000 bits (2000000 / 1500000 * 1500000).
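
         Informative sketch (not part of the specification): the
         example above can be reproduced with the Python below.  The
         function name is hypothetical.  The factors 1000 and 1100 are
         the CpbBrVclFactor and CpbBrNalFactor values implied by the
         example for the Main profile, the level limits are supplied by
         the caller, and rounding the scaled CPB size down to an
         integer is an assumption of this example.

```python
CPB_BR_VCL_FACTOR = 1000  # implied by the example for the Main profile
CPB_BR_NAL_FACTOR = 1100  # implied by the example for the Main profile


def limits_with_max_br(max_br, level_max_br, level_max_cpb, max_cpb=None):
    """Return the bitrate and CPB limits, in bits, after applying max-br."""
    if max_cpb is None:
        # max-cpb absent: scale the level's MaxCPB by max-br / MaxBR.
        # Assumption: any fractional part is rounded down.
        max_cpb = level_max_cpb * max_br // level_max_br
    return {
        "vcl_bitrate_bps": max_br * CPB_BR_VCL_FACTOR,
        "nal_bitrate_bps": max_br * CPB_BR_NAL_FACTOR,
        "vcl_cpb_bits": max_cpb * CPB_BR_VCL_FACTOR,
        "nal_cpb_bits": max_cpb * CPB_BR_NAL_FACTOR,
    }
```

         Calling limits_with_max_br(2000, 1500, 1500) gives a VCL
         bitrate of 2000000 bits/sec, a NAL bitrate of 2200000
         bits/sec, and a VCL CPB size of 2000000 bits, matching the
         figures in the example.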

         Senders MAY use this knowledge to send higher bitrate video as
         allowed in the level definition of Annex A of [HEVC] to achieve
         improved video quality.

         When not present, the value of max-br is inferred to be equal
         to the value of MaxBR given in Table A-2 of [HEVC] for the
         highest level.

         The value of max-br MUST be in the range of MaxBR to 16 *
         MaxBR, inclusive, where MaxBR is given in Table A-2 of [HEVC]
         for the highest level.

            Informative note: This parameter was added primarily to
            complement a similar codepoint in the ITU-T Recommendation
            H.245, so as to facilitate signaling gateway designs.  The
            assumption that the network is capable of handling such
            bitrates at any given time cannot be made from the value of
            this parameter.  In particular, no conclusion can be drawn
            that the signaled bitrate is possible under congestion
            control constraints.

      max-tr:

         The value of max-tr is an integer indicating the maximum number
         of tile rows.  The max-tr parameter signals that the receiver
         is capable of decoding video with a larger number of tile rows
         than the value allowed by the highest level.

         When max-tr is signaled, the receiver MUST be able to decode
         bitstreams that conform to the highest level, with the
         exception that the MaxTileRows value in Table A-1 of [HEVC] for
         the highest level is replaced with the value of max-tr.

         Senders MAY use this knowledge to send pictures utilizing a
         larger number of tile rows than the value allowed by the
         highest level.

         When not present, the value of max-tr is inferred to be equal
         to the value of MaxTileRows given in Table A-1 of [HEVC] for
         the highest level.

         The value of max-tr MUST be in the range of MaxTileRows to 16 *
         MaxTileRows, inclusive, where MaxTileRows is given in Table A-1
         of [HEVC] for the highest level.

      max-tc:

         The value of max-tc is an integer indicating the maximum number
         of tile columns.  The max-tc parameter signals that the
         receiver is capable of decoding video with a larger number of
         tile columns than the value allowed by the highest level.

         When max-tc is signaled, the receiver MUST be able to decode
         bitstreams that conform to the highest level, with the
         exception that the MaxTileCols value in Table A-1 of [HEVC] for
         the highest level is replaced with the value of max-tc.

RFC7798 - Page 57

         Senders MAY use this knowledge to send pictures utilizing a
         larger number of tile columns than the value allowed by the
         highest level.

         When not present, the value of max-tc is inferred to be equal
         to the value of MaxTileCols given in Table A-1 of [HEVC] for
         the highest level.

         The value of max-tc MUST be in the range of MaxTileCols to 16 *
         MaxTileCols, inclusive, where MaxTileCols is given in Table A-1
         of [HEVC] for the highest level.
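
         The range and inference rules shared by max-br, max-tr, and
         max-tc can be sketched as a single check.  This is a minimal,
         non-normative sketch: the function name is hypothetical, and
         the level limit of 5 used in the usage lines is a placeholder,
         not a value quoted from Table A-1 of [HEVC].

```python
def check_scaled_limit(name, value, level_limit, factor=16):
    """Return the effective value of max-br, max-tr or max-tc.

    value is None when the parameter is absent, in which case it is
    inferred to equal the limit of the highest level.
    """
    if value is None:
        return level_limit
    if not level_limit <= value <= factor * level_limit:
        raise ValueError(
            f"{name}={value} is outside "
            f"[{level_limit}, {factor * level_limit}]"
        )
    return value


# Placeholder level limit of 5 tile rows (hypothetical value).
inferred = check_scaled_limit("max-tr", None, 5)
signaled = check_scaled_limit("max-tr", 10, 5)
```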

      max-fps:

         The value of max-fps is an integer indicating the maximum
         picture rate in units of pictures per 100 seconds that can be
         effectively processed by the receiver.  The max-fps parameter
         MAY be used to signal that the receiver has a constraint in
         that it is not capable of processing video effectively at the
         full picture rate that is implied by the highest level and,
         when present, one or more of the parameters max-lsr, max-lps,
         and max-br.

         The value of max-fps is not necessarily the picture rate at
         which the maximum picture size can be sent; it constitutes a
         constraint on the maximum picture rate for all resolutions.

            Informative note: The max-fps parameter is semantically
            different from max-lsr, max-lps, max-cpb, max-dpb, max-br,
            max-tr, and max-tc in that max-fps is used to signal a
            constraint, lowering the maximum picture rate from what is
            implied by other parameters.

         The encoder MUST use a picture rate equal to or less than this
         value.  In cases where the max-fps parameter is absent, the
         encoder is free to choose any picture rate according to the
         highest level and any signaled optional parameters.

         The value of max-fps MUST be smaller than or equal to the full
         picture rate that is implied by the highest level and, when
         present, one or more of the parameters max-lsr, max-lps, and
         max-br.
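
         Because max-fps is expressed in pictures per 100 seconds, a
         comparison against a rational picture rate is easiest with
         exact arithmetic.  The following non-normative sketch uses
         hypothetical function names and assumes the encoder knows its
         picture rate as a numerator/denominator pair.

```python
from fractions import Fraction


def max_fps_to_rate(max_fps):
    """Convert max-fps (pictures per 100 seconds) to pictures per second."""
    return Fraction(max_fps, 100)


def picture_rate_allowed(rate_num, rate_den, max_fps):
    """True if the encoder picture rate does not exceed max-fps."""
    return Fraction(rate_num, rate_den) <= max_fps_to_rate(max_fps)


ok_30 = picture_rate_allowed(30, 1, 3000)            # 30 fps vs. 30.00
ok_ntsc = picture_rate_allowed(30000, 1001, 2997)    # 29.97003 vs. 29.97
```

         Note that 30000/1001 pictures per second slightly exceeds a
         max-fps value of 2997, so exact comparison matters at such
         boundaries.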

RFC7798 - Page 58

      sprop-max-don-diff:

         If tx-mode is equal to "SRST" and there is no NAL unit naluA
         that is followed in transmission order by any NAL unit
         preceding naluA in decoding order (i.e., the transmission order
         of the NAL units is the same as the decoding order), the value
         of this parameter MUST be equal to 0.

         Otherwise, if tx-mode is equal to "MRST" or "MRMT" and the
         decoding order of the NAL units of all the RTP streams is the
         same as both the NAL unit transmission order and the NAL unit
         output order, the value of this parameter MUST be equal to
         either 0 or 1.

         Otherwise, if tx-mode is equal to "MRST" or "MRMT" and the
         decoding order of the NAL units of all the RTP streams is the
         same as the NAL unit transmission order but not the same as the
         NAL unit output order, the value of this parameter MUST be
         equal to 1.

         Otherwise, this parameter specifies the maximum absolute
         difference between the decoding order number (i.e., AbsDon)
         values of any two NAL units naluA and naluB, where naluA
         follows naluB in decoding order and precedes naluB in
         transmission order.

         The value of sprop-max-don-diff MUST be an integer in the range
         of 0 to 32767, inclusive.

         When not present, the value of sprop-max-don-diff is inferred
         to be equal to 0.
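
         The general ("Otherwise") case above can be sketched as
         follows.  This non-normative sketch covers only that last
         case; the function name is hypothetical, and it assumes the
         sender has the AbsDon value of every NAL unit, listed in
         transmission order, with a larger AbsDon meaning later in
         decoding order.

```python
def max_don_diff(abs_dons):
    """Largest AbsDon(naluA) - AbsDon(naluB) over all pairs where naluA
    precedes naluB in transmission order but follows it in decoding order.

    abs_dons: AbsDon values in transmission order.
    """
    largest_seen = None
    result = 0
    for don in abs_dons:
        # The worst partner for this NAL unit is the largest AbsDon
        # already transmitted before it.
        if largest_seen is not None and largest_seen > don:
            result = max(result, largest_seen - don)
        if largest_seen is None or don > largest_seen:
            largest_seen = don
    if result > 32767:
        raise ValueError("value does not fit the 0..32767 range")
    return result


in_order = max_don_diff([0, 1, 2, 3])
interleaved = max_don_diff([0, 2, 1, 5, 3, 4])
```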

      sprop-depack-buf-nalus:

         This parameter specifies the maximum number of NAL units that
         precede a NAL unit in transmission order and follow the NAL
         unit in decoding order.

         The value of sprop-depack-buf-nalus MUST be an integer in the
         range of 0 to 32767, inclusive.

         When not present, the value of sprop-depack-buf-nalus is
         inferred to be equal to 0.

         When sprop-max-don-diff is present and greater than 0, this
         parameter MUST be present and the value MUST be greater than 0.
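
         A sender could derive this value from the same data, as in the
         following non-normative sketch.  The function name is
         hypothetical, and the input is assumed to be the AbsDon values
         of the NAL units in transmission order, with a larger AbsDon
         meaning later in decoding order.

```python
def depack_buf_nalus(abs_dons):
    """Maximum number of NAL units that precede some NAL unit in
    transmission order while following it in decoding order.

    abs_dons: AbsDon values in transmission order.
    """
    worst = 0
    for j, don in enumerate(abs_dons):
        # Count earlier-transmitted NAL units that decode after this one.
        ahead = sum(1 for earlier in abs_dons[:j] if earlier > don)
        worst = max(worst, ahead)
    if worst > 32767:
        raise ValueError("value does not fit the 0..32767 range")
    return worst


light = depack_buf_nalus([0, 2, 1, 5, 3, 4])
heavy = depack_buf_nalus([3, 4, 5, 0, 1, 2])
```

         The quadratic loop is kept for clarity; it is not meant as an
         efficient implementation.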

RFC7798 - Page 59

      sprop-depack-buf-bytes:

         This parameter signals the required size of the de-
         packetization buffer in units of bytes.  The value of the
         parameter MUST be greater than or equal to the maximum buffer
         occupancy (in units of bytes) of the de-packetization buffer as
         specified in Section 6.

         The value of sprop-depack-buf-bytes MUST be an integer in the
         range of 0 to 4294967295, inclusive.

         When sprop-max-don-diff is present and greater than 0, this
         parameter MUST be present and the value MUST be greater than 0.
         When not present, the value of sprop-depack-buf-bytes is
         inferred to be equal to 0.

            Informative note: The value of sprop-depack-buf-bytes
            indicates the required size of the de-packetization buffer
            only.  When network jitter can occur, an appropriately sized
            jitter buffer has to be available as well.

      depack-buf-cap:

         This parameter signals the capabilities of a receiver
         implementation and indicates the amount of de-packetization
         buffer space in units of bytes that the receiver has available
         for reconstructing the NAL unit decoding order from NAL units
         carried in one or more RTP streams.  A receiver is able to
         handle any RTP stream, and all RTP streams the RTP stream
         depends on, when present, for which the value of the sprop-
         depack-buf-bytes parameter is smaller than or equal to this
         parameter.

         When not present, the value of depack-buf-cap is inferred to be
         equal to 4294967295.  The value of depack-buf-cap MUST be an
         integer in the range of 1 to 4294967295, inclusive.

            Informative note: depack-buf-cap indicates the maximum
            possible size of the de-packetization buffer of the receiver
            only, without allowing for network jitter.
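
         The relationship between sprop-depack-buf-bytes and depack-
         buf-cap, including the inferred values, can be sketched as
         follows.  This is a non-normative sketch with hypothetical
         names; None stands for a parameter that is not present.

```python
DEPACK_BUF_CAP_DEFAULT = 4294967295  # inferred value when absent


def receiver_can_handle(sprop_depack_buf_bytes=None, depack_buf_cap=None):
    """True if the stream's buffer need fits the receiver's capability."""
    needed = 0 if sprop_depack_buf_bytes is None else sprop_depack_buf_bytes
    cap = DEPACK_BUF_CAP_DEFAULT if depack_buf_cap is None else depack_buf_cap
    if not 0 <= needed <= 4294967295:
        raise ValueError("sprop-depack-buf-bytes out of range")
    if not 1 <= cap <= 4294967295:
        raise ValueError("depack-buf-cap out of range")
    return needed <= cap


fits = receiver_can_handle(sprop_depack_buf_bytes=65536, depack_buf_cap=131072)
too_big = receiver_can_handle(sprop_depack_buf_bytes=262144, depack_buf_cap=131072)
```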

RFC7798 - Page 60

      sprop-segmentation-id:

         This parameter MAY be used to signal the segmentation tools
         that are present in the bitstream and can be used for
         parallelization.  The value of sprop-segmentation-id MUST be an
         integer in the range of 0 to 3, inclusive.  When not present,
         the value of sprop-segmentation-id is inferred to be equal to
         0.

         When sprop-segmentation-id is equal to 0, no information about
         the segmentation tools is provided.  When sprop-segmentation-id
         is equal to 1, it indicates that slices are present in the
         bitstream.  When sprop-segmentation-id is equal to 2, it
         indicates that tiles are present in the bitstream.  When sprop-
         segmentation-id is equal to 3, it indicates that WPP is used in
         the bitstream.

      sprop-spatial-segmentation-idc:

         A base16 [RFC4648] representation of the syntax element
         min_spatial_segmentation_idc as specified in [HEVC].  This
         parameter MAY be used to describe parallelization capabilities
         of the bitstream.

      dec-parallel-cap:

         This parameter MAY be used to indicate the decoder's additional
         decoding capabilities given the presence of tools enabling
         parallel decoding, such as slices, tiles, and WPP, in the
         bitstream.  The decoding capability of the decoder may vary
         with the setting of the parallel decoding tools present in the
         bitstream, e.g., the size of the tiles that are present in a
         bitstream.  Therefore, multiple capability points may be
         provided, each indicating the minimum required decoding
         capability that is associated with a parallelism requirement,
         which is a requirement on the bitstream that enables parallel
         decoding.

         Each capability point is defined as a combination of 1) a
         parallelism requirement, 2) a profile (determined by profile-
         space and profile-id), 3) a highest level, and 4) a maximum
         processing rate, a maximum picture size, and a maximum video
         bitrate that may be equal to or greater than that determined by
         the highest level.  The parameter's syntax in ABNF [RFC5234] is
         as follows:

RFC7798 - Page 61

         dec-parallel-cap = "dec-parallel-cap={" cap-point *(","
                            cap-point) "}"

         cap-point = ("w" / "t") ":" spatial-seg-idc 1*(";"
                      cap-parameter)

         spatial-seg-idc = 1*4DIGIT ; (1-4095)

         cap-parameter = tier-flag / level-id / max-lsr
                         / max-lps / max-br

         tier-flag = "tier-flag" EQ ("0" / "1")

         level-id  = "level-id" EQ 1*3DIGIT ; (0-255)

         max-lsr   = "max-lsr" EQ 1*20DIGIT
                     ; (0-18,446,744,073,709,551,615)

         max-lps   = "max-lps" EQ 1*10DIGIT ; (0-4,294,967,295)

         max-br    = "max-br"  EQ 1*20DIGIT
                     ; (0-18,446,744,073,709,551,615)

         EQ = "="

         The set of capability points expressed by the dec-parallel-cap
         parameter is enclosed in a pair of curly braces ("{}").  Each
         set of two consecutive capability points is separated by a
         comma (',').  Within each capability point, each set of two
         consecutive parameters, and, when present, their values, is
         separated by a semicolon (';').

         The profile of all capability points is determined by profile-
         space and profile-id, which are outside the dec-parallel-cap
         parameter.

         Each capability point starts with an indication of the
         parallelism requirement, which consists of a parallel tool
         type, which may be equal to 'w' or 't', and a decimal value of
         the spatial-seg-idc parameter.  When the type is 'w', the
         capability point is valid only for H.265 bitstreams with WPP in
         use, i.e., entropy_coding_sync_enabled_flag equal to 1.  When
         the type is 't', the capability point is valid only for H.265
         bitstreams with WPP not in use (i.e.,
         entropy_coding_sync_enabled_flag equal to 0).  In both cases,
         the capability point is valid only for H.265 bitstreams with
         min_spatial_segmentation_idc equal to or greater than spatial-
         seg-idc.

RFC7798 - Page 62

         After the parallelism requirement indication, each capability
         point continues with one or more pairs of parameter and value
         in any order for any of the following parameters:

            o tier-flag
            o level-id
            o max-lsr
            o max-lps
            o max-br

         At most one occurrence of each of the above five parameters is
         allowed within each capability point.
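
         A parser for the value of dec-parallel-cap, following the ABNF
         and the rules above, might look like the sketch below.  It is
         non-normative: the function name and the dictionary layout are
         hypothetical, the input is assumed to be the text that follows
         "dec-parallel-cap=", and it is assumed to contain no
         whitespace or line folding.

```python
import re

# tool type, spatial-seg-idc, then one or more ";name=digits" pairs
_CAP_POINT = re.compile(r"([wt]):(\d{1,4})((?:;[a-z-]+=\d+)+)")

# Upper bounds taken from the ABNF comments.
_MAX_VALUE = {
    "tier-flag": 1,
    "level-id": 255,
    "max-lsr": 2**64 - 1,
    "max-lps": 2**32 - 1,
    "max-br": 2**64 - 1,
}


def parse_dec_parallel_cap(value):
    """Parse '{cap-point,cap-point,...}' into a list of dictionaries."""
    if not (value.startswith("{") and value.endswith("}")):
        raise ValueError("capability points must be enclosed in braces")
    points = []
    for text in value[1:-1].split(","):
        match = _CAP_POINT.fullmatch(text)
        if match is None:
            raise ValueError(f"malformed capability point: {text!r}")
        idc = int(match.group(2))
        if not 1 <= idc <= 4095:
            raise ValueError(f"spatial-seg-idc out of range: {idc}")
        point = {"tool": match.group(1), "spatial-seg-idc": idc}
        for pair in match.group(3)[1:].split(";"):
            name, _, digits = pair.partition("=")
            if name not in _MAX_VALUE:
                raise ValueError(f"unknown parameter: {name}")
            if name in point:
                raise ValueError(f"duplicate parameter: {name}")
            if int(digits) > _MAX_VALUE[name]:
                raise ValueError(f"{name} out of range: {digits}")
            point[name] = int(digits)
        points.append(point)
    return points


tile_point = parse_dec_parallel_cap("{t:8;level-id=120}")
wpp_point = parse_dec_parallel_cap("{w:8;max-lsr=62668800;max-lps=2088960}")
```

         Parameters that are absent from a capability point are simply
         left out of the dictionary, so that the inference rules can be
         applied in a separate step.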

         The values of dec-parallel-cap.tier-flag and dec-parallel-
         cap.level-id for a capability point indicate the highest level
         of the capability point.  The values of dec-parallel-cap.max-
         lsr, dec-parallel-cap.max-lps, and dec-parallel-cap.max-br for
         a capability point indicate, respectively, the maximum
         processing rate in units of luma samples per second, the
         maximum picture size in units of luma samples, and the maximum
         video bitrate (in units of CpbBrVclFactor bits per second for
         the VCL HRD parameters and in units of CpbBrNalFactor bits per
         second for the NAL HRD parameters, where CpbBrVclFactor and
         CpbBrNalFactor are defined in Section A.4 of [HEVC]).

         When not present, the value of dec-parallel-cap.tier-flag is
         inferred to be equal to the value of tier-flag outside the dec-
         parallel-cap parameter.  When not present, the value of dec-
         parallel-cap.level-id is inferred to be equal to the value of
         max-recv-level-id outside the dec-parallel-cap parameter.  When
         not present, the value of dec-parallel-cap.max-lsr, dec-
         parallel-cap.max-lps, or dec-parallel-cap.max-br is inferred to
         be equal to the value of max-lsr, max-lps, or max-br,
         respectively, outside the dec-parallel-cap parameter.
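
         These inference rules amount to filling in missing entries of
         a capability point from the parameters outside dec-parallel-
         cap.  The sketch below is non-normative: the names are
         hypothetical, the outer parameters are assumed to have been
         resolved to concrete values already, and the outer values in
         the usage lines are illustrative only.

```python
# Parameter inside dec-parallel-cap -> outer parameter it is inferred from.
_INFERRED_FROM = {
    "tier-flag": "tier-flag",
    "level-id": "max-recv-level-id",
    "max-lsr": "max-lsr",
    "max-lps": "max-lps",
    "max-br": "max-br",
}


def resolve_cap_point(point, outer):
    """Return a copy of a capability point with absent values inferred."""
    resolved = dict(point)
    for name, outer_name in _INFERRED_FROM.items():
        if name not in resolved:
            resolved[name] = outer[outer_name]
    return resolved


# Illustrative outer values (assumptions, not quoted from the text).
outer = {
    "tier-flag": 0,
    "max-recv-level-id": 93,
    "max-lsr": 33177600,
    "max-lps": 983040,
    "max-br": 10000,
}
resolved = resolve_cap_point(
    {"tool": "t", "spatial-seg-idc": 8, "level-id": 120}, outer
)
```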

         The general decoding capability, expressed by the set of
         parameters outside of dec-parallel-cap, is defined as the
         capability point that is determined by the following
         combination of parameters: 1) the parallelism requirement
         corresponding to the value of sprop-segmentation-id equal to 0
         for a bitstream, 2) the profile determined by profile-space,
         profile-id, profile-compatibility-indicator, and interop-
         constraints, 3) the tier and the highest level determined by
         tier-flag and max-recv-level-id, and 4) the maximum processing
         rate, the maximum picture size, and the maximum video bitrate
         determined by the highest level.  The general decoding
         capability MUST NOT be included as one of the set of capability
         points in the dec-parallel-cap parameter.

RFC7798 - Page 63

         For example, the following parameters express the general
         decoding capability of 720p30 (Level 3.1) plus an additional
         decoding capability of 1080p30 (Level 4) given that the
         spatially largest tile or slice used in the bitstream is equal
         to or less than 1/3 of the picture size:

            a=fmtp:98 level-id=93;dec-parallel-cap={t:8;level-id=120}

         For another example, the following parameters express an
         additional decoding capability of 1080p30, using dec-parallel-
         cap.max-lsr and dec-parallel-cap.max-lps, given that WPP is
         used in the bitstream:

            a=fmtp:98 level-id=93;dec-parallel-cap={w:8;
                        max-lsr=62668800;max-lps=2088960}

            Informative note: When min_spatial_segmentation_idc is
            present in a bitstream and WPP is not used, [HEVC] specifies
            that there is no slice or no tile in the bitstream
            containing more than 4 * PicSizeInSamplesY / (
            min_spatial_segmentation_idc + 4 ) luma samples.
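
            The bound in this note can be evaluated as in the following
            non-normative sketch.  The function name is hypothetical,
            and the picture size of 2088960 luma samples is the max-lps
            value used in the WPP example above.

```python
from fractions import Fraction


def max_segment_luma_samples(pic_size_in_samples_y,
                             min_spatial_segmentation_idc):
    """Upper bound on the luma samples in any one slice or tile."""
    return Fraction(4 * pic_size_in_samples_y,
                    min_spatial_segmentation_idc + 4)


bound = max_segment_luma_samples(2088960, 8)
share = bound / 2088960   # fraction of the picture, here 1/3
```

            With a spatial-seg-idc of 8, the bound is 4 / 12 of the
            picture, which matches the "1/3 of the picture size" in the
            first example.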

      include-dph:

         This parameter is used to indicate the capability and
         preference to utilize or include Decoded Picture Hash (DPH) SEI
         messages (see Section D.3.19 of [HEVC]) in the bitstream.  DPH
         SEI messages can be used to detect picture corruption so that
         the receiver can request picture repair (see Section 8).  The
         value is a comma-separated list of the hash types that are
         supported or requested to be used, each provided as an unsigned
         integer value (0-255) and listed from most preferred to least
         preferred.  Example: "include-dph=0,2",
         which indicates the capability for MD5 (most preferred) and
         Checksum (less preferred).  If the parameter is not included or
         the value contains no hash types, then no capability to utilize
         DPH SEI messages is assumed.  Note that DPH SEI messages MAY
         still be included in the bitstream even when there is no
         declaration of capability to use them, as in general SEI
         messages do not affect the normative decoding process and
         decoders are allowed to ignore SEI messages.
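
         Reading the include-dph value can be sketched as follows.  The
         sketch is non-normative and its names are hypothetical.  The
         names for hash types 0 and 2 come from the example above; the
         name for hash type 1 (CRC) is taken from [HEVC] and is not
         stated in this section.

```python
HASH_TYPE_NAMES = {0: "MD5", 1: "CRC", 2: "Checksum"}


def parse_include_dph(value):
    """Return hash types in preference order; [] means no capability.

    value is the text after 'include-dph=', or None if absent.
    """
    if value is None or value == "":
        return []
    preferences = []
    for item in value.split(","):
        if not (item.isascii() and item.isdigit()):
            raise ValueError(f"not an unsigned integer: {item!r}")
        hash_type = int(item)
        if hash_type > 255:
            raise ValueError(f"hash type out of range: {hash_type}")
        preferences.append(hash_type)
    return preferences


prefs = parse_include_dph("0,2")
names = [HASH_TYPE_NAMES.get(p, "unknown") for p in prefs]
```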

   Encoding considerations:

      This type is only defined for transfer via RTP (RFC 3550).

RFC7798 - Page 64

   Security considerations:

      See Section 9 of RFC 7798.

   Published specification:

      Please refer to RFC 7798 and its Section 12.

   Additional information: None

   File extensions: none

   Macintosh file type code: none

   Object identifier or OID: none

   Person & email address to contact for further information:

      Ye-Kui Wang (yekui.wang@gmail.com)

   Intended usage: COMMON

   Author: See Authors' Addresses section of RFC 7798.

   Change controller:

      IETF Audio/Video Transport Payloads working group delegated from
      the IESG.
