
TR 26.911
Codec(s) for Circuit-Switched (CS) Multimedia Telephony Service –
Terminal Implementor's Guide

V17.0.0  2022/03  15 p.
V16.0.0  2020/06  15 p.
V15.0.0  2018/06  15 p.
V14.0.0  2017/03  15 p.
V13.0.0  2015/12  15 p.
V12.0.0  2014/09  15 p.
V11.0.0  2012/09  15 p.
V10.0.0  2011/04  15 p.
V9.0.0  2009/12  15 p.
V8.0.0  2008/12  15 p.
V7.1.0  2006/10  15 p.
V6.0.0  2004/09  15 p.
V5.1.0  2003/04  14 p.
V4.2.0  2003/04  14 p.
V3.4.0  2003/04  14 p.
Dr. Jung, Kyunghun
Samsung Electronics Co., Ltd

Content for  TR 26.911  Word version:  16.0.0


1  Scope

The present document provides non-mandatory recommendations for the use of the different codec implementation options for the circuit switched multimedia telephony service which is based on ITU-T Recommendation H.324 [7], and Annex C of ITU-T Recommendation H.324 [7] in particular. These recommendations address issues specific to the 3G operating environment, including guaranteeing sufficient error resilience and interworking between terminals.
The contents of the present document are provided for information to assist in high quality implementation of multimedia telephony terminals. All references to "terminals" in this report are to terminals supporting the Circuit Switched Multimedia Telephony Service as described in [7-9].

2  References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
  • References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
  • For a specific reference, subsequent revisions do not apply.
  • For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1] ITU-T Recommendation H.223 (1996): "Multiplexing protocol for low bit rate multimedia communication".
[2] ITU-T Recommendation H.223 - Annex A (1998): "Multiplexing protocol for low bit rate multimedia mobile communication over low error-prone channels".
[3] ITU-T Recommendation H.223 - Annex B (1998): "Multiplexing protocol for low bit rate multimedia mobile communication over moderate error-prone channels".
[4] ITU-T Recommendation H.223 - Annex C (1998): "Multiplexing protocol for low bit rate multimedia mobile communication over highly error-prone channels".
[5] ITU-T Recommendation H.245 (2000): "Control protocol for multimedia communication".
[6] ITU-T Recommendation H.261 (1993): "Video codec for audiovisual services at px64 kbit/s".
[7] ITU-T Recommendation H.324 (2006): "Terminal for low bitrate multimedia communication".
[8] ITU-T Recommendation G.723.1 (1996): "Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s".
[9] ITU-T Recommendation H.263 (1998): "Video coding for low bit rate communication".
[10] TS 26.110: "Codec for Circuit Switched Multimedia Telephony Service: General Description".
[11] TS 26.111: "Codec for Circuit Switched Multimedia Telephony Service, Modifications to H.324".
[12] TS 26.112: "Codec for Circuit Switched Multimedia Telephony Service; General description" (to date, withdrawn by 3GPP).
[13] TR 26.912: "Codec for Circuit Switched Multimedia Telephony Service; Quantitative performance evaluation of H.324 Annex C over 3G".
[14] International Standard ISO/IEC 14496-2: "Information technology - Coding of audio-visual objects - Part 2: Visual".
[15] ISO/IEC JTC1/SC29/WG11 MPEG 99/N2724: "MPEG-4 Applications", March 1999.
[16] ITU-T Recommendation V.80: "In-band DCE control and synchronous data modes for asynchronous DTE".
[17] TS 25.301: "Radio Interface Protocol Architecture".
[18] ITU-T Recommendation H.264 (2003): "Advanced video coding for generic audiovisual services" | ISO/IEC 14496-10:2003: "Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding".
[19] ITU-T Recommendation H.241 (2003): "Extended video procedures and control signals for H.300 series terminals".

3  Definitions, symbols and abbreviations

3.1  Definitions

For the purposes of the present document, the following terms and definitions apply:
3G-324M terminal:
multimedia telephony terminal conforming to TS 26.110 and targeted for use in 3G mobile networks
3G-324M codec:
implementation of ITU-T Recommendation H.324 [7] and all its elements, adapted to the 3G environment (known as 3G-324M), viewed as a "codec" consisting of an encoder and a decoder
3G-324M encoder:
encoder part of the 3G-324M codec
3G-324M decoder:
decoder part of the 3G-324M codec

3.2  Symbols


3.3  Abbreviations

For the purposes of the present document, the following abbreviations apply:
AL1, AL2, AL3  ITU-T Recommendation H.223 [2] Adaptation layers 1, 2 and 3 (see [1])
AL-SDU         Adaptation Layer Service Data Unit (see [1])
AMR            Adaptive Multi-Rate (Audio Codec)
CIF            Common Intermediate Format (a picture format for Video Codec)
CRC            Cyclic Redundancy Check
DCE            Data Circuit-terminating Equipment
DTE            Data Terminal Equipment
GOB            Group of blocks (a sub-part of a video picture)
GSM            Global System for Mobile communications
GSTN           General Switched Telephone Network
ISDN           Integrated Services Digital Network
ITU-T          International Telecommunication Union - Telecommunication Standardization Sector
MONA           Media Oriented Negotiation Acceleration
MUX-PDU        Multiplex Packet Data Unit (see [1])
NSRP           Numbered Simple Retransmission Protocol
PSC            Picture start code (synchronization field for Video Codec)
QCIF           Quarter CIF (a picture format for Video Codec)
RVLC           Reversible Variable Length Code (see [11])
SQCIF          Sub QCIF (a picture format for Video Codec)
SRP            Simple Retransmission Protocol
T401           Acknowledgement timer used by ITU-T Recommendation H.245 [5] implementations
VOP            Video Object Plane (see [11])
WNSRP          Windowed Numbered Simple Retransmission Protocol

4  General

The following clauses give implementation recommendations for different parts of the 3G-324M codec. The clause division loosely follows the structure of ITU-T Recommendation H.324 [7].
Most of the recommendations in the present document assume that both transmitting and receiving terminals operate within the 3G system and conform to 3G-324M specifications in [7]-[9]. Clause 11 additionally includes recommendations relevant for interoperability between 3G-324M terminals and other terminals.
The recommendations primarily target those aspects of the codec implementation that have a significant effect on the quality perceived by the user at the other end of the connection. This usually implies emphasizing encoder recommendations over decoder recommendations, although this division cannot be made in all cases. It should be recognized that ITU-T Recommendation H.324 [7] leaves a substantial amount of freedom for terminal implementations, and no definite quality guarantee can be given even if all recommendations in the present document are followed.

5  Multiplex Protocol

Multiplexing of video, audio, data, and control information is based on ITU-T Recommendation H.223 [1]. The following general guidelines are recommended for implementations of ITU-T Recommendation H.223 [1].
MUX-PDU size should be limited to be smaller than in typical GSTN use. Specific values depend on the bit-rate and channel characteristics, but suitable upper limits for MUX-PDU size are often in the range of 100-200 octets.
Encoders are recommended to support the boolean ITU-T Recommendation H.245 [5] maxMUXPDUSizeCapability (clause of [2] Version 3) to indicate that they are able to restrict the size of the MUX-PDUs that they transmit. Decoders are recommended to utilize the maxH223MUXPDUsize ITU-T Recommendation H.245 [5] command (clause 7.11.5 of [2] Version 3) to restrict the size of the MUX-PDUs, sent by the encoder, to a maximum of the specified number of octets.
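As an illustration of the size restriction, the following minimal sketch splits a payload into chunks that each fit within a maximum MUX-PDU size. The function name, the 150-octet default, and the `overhead` parameter (a stand-in for any framing octets an implementation counts against the limit) are assumptions made for this example and are not taken from ITU-T Recommendation H.223 or H.245.

```python
def split_for_mux_pdu(payload: bytes, max_pdu_octets: int = 150,
                      overhead: int = 0) -> list[bytes]:
    """Split payload into chunks whose size plus overhead fits max_pdu_octets.

    max_pdu_octets and overhead are illustrative values, not normative ones.
    """
    room = max_pdu_octets - overhead
    if room <= 0:
        raise ValueError("max_pdu_octets must exceed overhead")
    # Consecutive slices preserve the original octet order.
    return [payload[i:i + room] for i in range(0, len(payload), room)]
```

A transmitter that has received a maxH223MUXPDUsize command could pass the commanded value as `max_pdu_octets`.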
ITU-T Recommendation H.324 [7] mandates that ITU-T Recommendation H.263 [9] encoders shall align picture start codes (PSC) with the start of an AL-SDU (see [4], clause 6.6.1). It is here further recommended that AL-SDUs that do not start with a PSC should start with a GOB header to improve error resilience.
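The alignment rule above can be checked with a short sketch such as the following. It assumes a byte-aligned AL-SDU and relies on the H.263 start code layout (16 zero bits, a one bit, then a 5-bit group number, where 0 denotes a picture start code and 30 and 31 are used by end codes). The function name and the returned labels are illustrative only.

```python
def h263_al_sdu_start(al_sdu: bytes) -> str:
    """Classify the first bits of a byte-aligned AL-SDU as "PSC", "GOB" or "other"."""
    if len(al_sdu) < 3:
        return "other"
    # 16 zero bits followed by a single one bit form the 17-bit start code.
    if al_sdu[0] != 0x00 or al_sdu[1] != 0x00 or not (al_sdu[2] & 0x80):
        return "other"
    group_number = (al_sdu[2] >> 2) & 0x1F  # the 5 bits after the start code
    if group_number == 0:
        return "PSC"
    if group_number >= 30:
        return "other"  # end-of-sequence style codes, not a GOB header
    return "GOB"
```

An encoder-side test harness could assert that every AL-SDU it emits is classified as either "PSC" or "GOB".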
ITU-T Recommendation H.241 [19] mandates that ITU-T Recommendation H.264 [18] encoders shall align Annex B/H.264 start code prefix for the first NAL unit of each access unit with the start of an AL-SDU. Use of the NAL Alignment Mode defined in TS 26.111 is here further recommended.
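A corresponding check for H.264 can be sketched as follows. It tests only whether an AL-SDU begins with the Annex B start code prefix (0x000001, optionally preceded by additional zero octets). It does not verify that the NAL unit is the first one of an access unit, and the function name is an assumption of this example.

```python
def starts_with_annex_b_prefix(al_sdu: bytes) -> bool:
    """True if the AL-SDU begins with at least two zero octets followed by 0x01."""
    stripped = al_sdu.lstrip(b"\x00")
    leading_zeros = len(al_sdu) - len(stripped)
    return leading_zeros >= 2 and stripped[:1] == b"\x01"
```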
No more than 1-3 audio frames should be included in one MUX-PDU to avoid excessive delay.
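The delay impact of grouping audio frames can be estimated with simple arithmetic, sketched below. The 20 ms default corresponds to the AMR frame duration. The function name is illustrative, and the result covers only the time needed to collect the frames, not transmission or processing delay.

```python
def collection_time_ms(frames_per_pdu: int, frame_ms: float = 20.0) -> float:
    """Time needed to collect all audio frames carried in one MUX-PDU."""
    if frames_per_pdu < 1:
        raise ValueError("at least one frame is required")
    return frames_per_pdu * frame_ms
```

With three AMR frames per MUX-PDU the first frame is already 60 ms old when the last one has been captured, which is why larger groupings are discouraged.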
Use of the optional retransmission procedure for video when using Adaptation Layer Type 3 (AL3) is not recommended due to delay considerations. This recommendation implies that receiving terminals should not send retransmission requests. It is recommended that terminals support video also using Adaptation Layer Type 2 (AL2) where retransmission is not possible and overhead is slightly smaller.
The ITU-T Recommendation H.223 [1] abort procedures should not be used (see ITU-T Recommendation H.223 [1] clauses 6.4.3, 7.2.3, 7.3.4, and 7.4.4).

5.1  H.223 Multiplex Transmission Bit Order

H.223 multiplex transmission bit order is defined in H.223 [1] Sec 3.2.2 as LSB first. This first bit is transparently mapped to the first bit of "higher layer PDU" depicted in clause 5.3.5 of TS 25.301, and vice versa.
An example is given by the following Figure:
Figure 1: UMTS Network Model (3G Terminal ↔ 3G Terminal) (image not reproduced)
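The LSB-first ordering described in this clause can be illustrated by the following minimal sketch, which lists the bits of each octet in transmission order. The function name is an assumption of this example. The mapping onto the higher layer PDU of TS 25.301 is not modelled.

```python
def octets_to_bits_lsb_first(data: bytes) -> list[int]:
    """Return the bits of each octet in transmission order, least significant bit first."""
    bits = []
    for octet in data:
        for position in range(8):
            bits.append((octet >> position) & 1)
    return bits
```

For the octet 0x01 the first transmitted bit is 1, whereas for 0x80 the single set bit is transmitted last.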

6  Control Protocol

It is recommended that terminals support the latest possible version of ITU-T Recommendation H.245 [5]. The capability to support the latest improvements in ITU-T Recommendation H.324 [7] is usually dependent on supporting the corresponding signalling in ITU-T Recommendation H.245 [5]. Most of the recommendations in the present document require support for at least ITU-T Recommendation H.245 [5] Version 3 and some require even newer versions.
Recommendations for the control protocol are not limited to this clause of the present document. Other clauses of the present document give recommendations for the different parts of the terminal often implying corresponding support from ITU-T Recommendation H.245 [5]. These recommendations are not replicated in this clause, but they should still be interpreted as recommendations for the ITU-T Recommendation H.245 [5] control protocol implementation.
Note that it is allowed for terminals to declare only H.245 [5] "transmit" capabilities, indicating that the terminal is only capable of transmitting media, and that logical channels should be established accordingly. Also note that it is allowed for terminals to use H.245 [5] to declare only audio or only video capabilities, and that logical channels should be established accordingly.
The end-to-end transmission delay in the 3G system is expected to be somewhat higher than in GSTN. This will need to be considered for timer settings in connection with the ITU-T Recommendation H.245 [5] implementation. For that reason, ITU-T Recommendation H.324 [7] Annex C (and hence also 3G-324M) mandates the use of ITU-T Recommendation H.324 [7] Annex E for initializing the timer T401. The following additional guidelines for initializing and updating the timer T401 should be considered: ffs.
ITU-T Recommendation H.324 [7] Annex A defines WNSRP (Windowed Numbered Simple Retransmission Protocol). WNSRP should be supported.
If WNSRP is not supported, NSRP or SRP shall be used. H.324 [7] Annex A defines the NSRP retransmission protocol that H.324 [7] Annex C mandates for use on mobile channels. To reduce the application setup time, H.245 [5] messages should be concatenated into as few NSRP packets as possible. Note that NSRP is not a windowed protocol and thus requires that the transmitter receive an NSRP response frame before the next NSRP command frame can be sent.
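The benefit of concatenation under a stop-and-wait protocol can be sketched as follows. The example packs messages greedily, in order, into payloads of a hypothetical maximum size and estimates setup time as one round trip per packet. The function names, the payload limit, and the round-trip figure are assumptions of this example, and retransmissions are ignored.

```python
def pack_h245_messages(messages: list[bytes], max_payload: int) -> list[bytes]:
    """Greedily concatenate messages, preserving order, into as few payloads as this pass allows."""
    packets, current = [], b""
    for message in messages:
        if len(message) > max_payload:
            raise ValueError("message larger than max_payload")
        if current and len(current) + len(message) > max_payload:
            packets.append(current)  # flush before the limit would be exceeded
            current = b""
        current += message
    if current:
        packets.append(current)
    return packets


def setup_time_ms(packet_count: int, round_trip_ms: float) -> float:
    """Stop-and-wait: each command frame waits for its response frame."""
    return packet_count * round_trip_ms
```

Four messages sent individually over a 200 ms round trip would take about 800 ms. If they fit into two packets, the estimate halves to 400 ms.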
ITU-T Recommendation H.324 [7] Annex K defines MONA (Media Oriented Negotiation Acceleration), a call setup time reduction technique. H.324 [7] Annex K should be supported.
MONA can be used in conjunction with WNSRP.
Note that the H.245 [5] OpenLogicalChannel replacementFor procedure may be used to obtain seamless H.264 [18] change of sequence (parameter set update).

6.1  Usage of DRAWING_ORDER-information for MPEG-4 video objects

3G-324M decoders should ignore any drawing order information as signalled by ITU-T Recommendation H.245 [5] drawingOrder Capability, see Table E.5 in ITU-T Recommendation H.245 [5], if the MPEG-4 simple profile level 1 is used.

7  Video Codec

This clause gives recommendations for the video codec implementations within 3G-324M terminals. Clause 7.1 is applicable to the use of any mandatory or optional video codec. Clause 7.2 includes specific recommendations for using the ITU-T Recommendation H.263 [9] codec. Clause 7.3 gives specific recommendations for the use of MPEG-4 and other possible optional video codecs.

7.1  General Recommendations

Regardless of which specific video codec standard is used, all video decoder implementations should include basic error concealment techniques. These techniques may include replacing erroneous parts of the decoded video frame with interpolated picture material from previous decoded frames or from spatially different locations of the erroneous frame. The decoder should aim to prevent the display of substantially corrupted parts of the picture. In any case, it is recommended that the terminal should tolerate every possible bitstream without catastrophic behaviour (such as the need for a user-initiated reset of the terminal).
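The simplest form of the concealment described above, copying co-located picture material from the previous decoded frame, can be sketched as follows. Frames are modelled as lists of rows of sample values. The function name, the block size parameter, and the way corrupted blocks are identified are assumptions of this example. Practical decoders typically use more elaborate interpolation.

```python
def conceal_from_previous(current, previous, corrupted_blocks, block=16):
    """Return a copy of `current` with corrupted blocks replaced from `previous`.

    corrupted_blocks holds (block_row, block_col) pairs. Both frames must
    have the same dimensions. The input frames are not modified.
    """
    out = [row[:] for row in current]
    for block_row, block_col in corrupted_blocks:
        y_end = min((block_row + 1) * block, len(out))
        for y in range(block_row * block, y_end):
            x0 = block_col * block
            x1 = min(x0 + block, len(out[y]))
            out[y][x0:x1] = previous[y][x0:x1]  # co-located samples
    return out
```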
3G-324M encoders and decoders are recommended to support the 1:1 pixel format (square format). Encoders should signal this capability using ITU-T Recommendation H.245 [5] capability exchange and the appropriate header fields in video codecs so that unnecessary pixel shape conversions can be avoided.

7.2  H.263

Several of the optional annexes of ITU-T Recommendation H.263 [9] are useful for improving the compression efficiency and error resilience of the codec. The annexes below form a balanced set of tools with respect to error robustness, compression efficiency, quality, and complexity. It is recommended that an ITU-T Recommendation H.263 [9] video decoder should support the following annexes. The main feature of each annex is also mentioned:
  • Annex I (Advanced Intra Coding), improves error resilience and compression efficiency.
  • Annex J (Deblocking Filter), improves compression efficiency.
  • Annex K (Slice Structure Mode, without RS submode), improves error resilience.
  • Annex T (Modified Quantizer), improves compression efficiency.
Non-empty GOB headers should be used frequently to improve error resilience (see [6], Clause 5.2).
ITU-T Recommendation H.263 [9] encoders in 3G-324M terminals should respond to all videoFastUpdate commands received via the ITU-T Recommendation H.245 [5] control channel (i.e., videoFastUpdatePicture, videoFastUpdateGOB, and videoFastUpdateMB presented in clause 7.11.5 of [2] Version 3). Using this feedback information to make a focused picture update can significantly improve the error performance of the codec. 3G-324M decoders are correspondingly recommended to transmit videoFastUpdate commands when the received picture is detected to be significantly corrupted due to transmission errors.
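How a decoder might choose between the update commands is implementation specific. The following sketch shows one hypothetical policy: request a full picture update when the damaged region spans a large share of the picture, and a GOB update otherwise. The function name, the 0.5 threshold, and the dictionary representation are assumptions of this example. Only the command names and the firstGOB and numberOfGOBs fields come from ITU-T Recommendation H.245.

```python
from typing import Optional


def choose_fast_update(corrupted_gobs: set[int], total_gobs: int,
                       picture_threshold: float = 0.5) -> Optional[dict]:
    """Pick a videoFastUpdate command for the corrupted GOBs (hypothetical policy)."""
    if not corrupted_gobs:
        return None  # nothing to request
    first, last = min(corrupted_gobs), max(corrupted_gobs)
    span = last - first + 1  # contiguous range covering all damaged GOBs
    if span / total_gobs >= picture_threshold:
        return {"command": "videoFastUpdatePicture"}
    return {"command": "videoFastUpdateGOB",
            "firstGOB": first, "numberOfGOBs": span}
```

For a QCIF picture with 9 GOBs, damage confined to GOBs 2 and 3 yields a GOB update, while damage at both the top and bottom of the picture yields a picture update.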
It is recommended that ITU-T Recommendation H.263 [9] decoders take advantage of the GOB Frame ID (GFID) field in GOB and slice headers to recover corrupted picture header data (see Clauses 5.2.5 and K.2 of ITU-T Recommendation H.263 [9] version 2). For this purpose it is recommended that ITU-T Recommendation H.263 [9] encoders should not use the Rounding Type (RTYPE) bit of the extended picture header as described in Clause of [1]. The RTYPE bit should always be set to 0 since it otherwise effectively prevents the use of the GFID field for picture header recovery.

7.3  Other Video Codecs

It is recommended that all 3G-324M terminals additionally support the ISO/IEC 14496-2 [14] (MPEG-4 Visual) video codec [11]. The explanatory text below gives justification and further detail for this recommendation.
One of the main target environments for MPEG-4 Visual is mobile use. For this purpose the following error resilience techniques have been adopted in MPEG-4 Visual: Resync Marker, Header Extension Code, Data Partitioning, and Reversible Variable Length Code. With these techniques the MPEG-4 Visual codec can be used over error-prone channels, enabling highly efficient low delay multimedia communication services for 3G networks. Support for MPEG-4 Visual potentially provides capabilities for communicating with heterogeneous networks without transcoding, or for reusing pictures/video from the 3G multimedia telephony service in different applications and vice versa.
MPEG-4 Visual and ITU-T Recommendation H.263 [9] have substantial technical similarities. MPEG-4 Visual also includes support for the ITU-T Recommendation H.263 [9] baseline codec.
Because of the multi-functionality of MPEG-4 Visual, subsets of its tools have been defined in order to allow effective implementations of the standard. These subsets, called "Profiles", limit the tool set which shall be implemented. For each of these Profiles one or more Levels have been set to restrict the computational complexity of implementations. It is here recommended that the Simple Visual Profile @ Level 0 is supported, to achieve adequate resilience to transmission errors and low complexity simultaneously. No other Profiles are recommended to be supported. Higher Levels for the Simple Visual Profile may be supported depending on the terminal capabilities.
MPEG-4 Visual accepts various input picture sizes within the capability specified by the Profile and Level. A picture size of QCIF for Level 1 should be used for the sake of interoperability.
All of the error resilience tools in Simple Visual Profile are recommended to be activated.
Resync Marker is a tool which increases the opportunities for the decoder to resynchronize with the bitstream after loss of synchronization due to errors in the bitstream, thus enabling normal decoder operation to continue. The encoder should insert Resync Markers in the bitstream, in order to enable the decoder to search for the Resync Marker in addition to the Start Code.
Header Extension Code (HEC) enables independent decoding of each video packet. One or more video packets in a VOP should have HEC so that the decoder can utilize the information derived from HEC and avoid discarding a whole VOP when the VOP header could not be received.
Data Partitioning is a tool that separates the information within a video packet to improve the degree of error localization and concealment. When the decoder detects errors in a video packet, it need not discard the whole packet if the motion information or the I-VOP DC coefficients are decoded correctly. The decoder may reconstruct the corresponding part of the picture utilizing this motion information or these DC coefficients. The encoder should use Data Partitioning syntax in order to enable the decoder to perform the above operation.
Reversible Variable Length Code (RVLC) is a tool which reduces the number of discarded bits. The RVLC decoding operation described in clause E.1.4 of Annex E in [11] may be performed. The encoder should utilize RVLC to enable the decoder to perform such an operation.
In addition to these tools, Intra Refresh should be inserted in order to prevent inter-frame propagation of errors. Adaptive Intra Refresh (AIR) described in clause E.1.5 in Annex E of [11] should be used in conjunction with cyclic Intra Refresh.
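The cyclic part of this recommendation can be sketched as a round-robin schedule over macroblock indices. The defaults below assume a QCIF picture (11 x 9 = 99 macroblocks) and 11 refreshed macroblocks per frame, so that every macroblock is intra coded once every 9 frames. The function name and the refresh rate are assumptions of this example, and the motion-based selection of AIR is not shown.

```python
def cyclic_intra_refresh(frame_index: int, total_macroblocks: int = 99,
                         per_frame: int = 11) -> list[int]:
    """Macroblock indices to intra code in the given frame (round-robin)."""
    start = (frame_index * per_frame) % total_macroblocks
    # Wrap around at the end of the picture.
    return [(start + i) % total_macroblocks for i in range(per_frame)]
```

An encoder combining both schemes could take the union of this list with the macroblocks selected by AIR for the same frame.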
One Video Packet of MPEG-4 Visual should be mapped to one AL-SDU of the ITU-T Recommendation H.223 [1] Adaptation Layer.
When an incoming bi-directional openLogicalChannel request has unsuitable reverse parameters for the local encoder, e.g., unsuitable MPEG-4 decoderConfigurationInformation, the terminal should reject the request. The cause field of openLogicalChannelReject should be set to value unsuitableReverseChannelParameters. A new openLogicalChannel request should be sent to the other end, now using the forward channel parameters of the rejected request as reverse channel parameters, and specifying new preferred forward channel parameters.
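The reject-and-reoffer procedure above can be sketched as follows. Messages are modelled as plain dictionaries. The function name, the dictionary keys, and the `reverse_ok` flag (the result of a local suitability check that is not shown) are assumptions of this example. Only the message names and the cause value are taken from the text.

```python
def handle_bidirectional_olc(request: dict, reverse_ok: bool,
                             preferred_forward: dict) -> list[dict]:
    """Return the messages to send in response to an incoming bi-directional request."""
    if reverse_ok:
        return [{"message": "openLogicalChannelAck",
                 "channel": request["channel"]}]
    reject = {"message": "openLogicalChannelReject",
              "channel": request["channel"],
              "cause": "unsuitableReverseChannelParameters"}
    # Re-offer: the peer's forward parameters become our reverse parameters.
    new_request = {"message": "openLogicalChannel",
                   "forward": preferred_forward,
                   "reverse": request["forward"]}
    return [reject, new_request]
```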
All MPEG-4 encoders should accept and respond to ITU-T Recommendation H.245 [5] videoTemporalSpatialTradeOff commands. Support for temporal-spatial trade-off cannot be signalled for MPEG-4 encoders, but the encoders should provide that support by default. MPEG-4 decoders are encouraged to utilize the videoTemporalSpatialTradeOff command. The specific response of MPEG-4 encoders to the videoTemporalSpatialTradeOff command is not defined, and it is up to the implementation to decide how to respond to the command.

8  Audio Codec

8.1  AMR Codec

FFS. This clause will include guidance on how to utilize the different modes of the AMR codec.

8.2  Other Audio Codecs


9  Data Protocols


10  Terminal Procedures


11  Interoperation with Other Terminals

11.1  Audio Codecs

It is recommended that terminals additionally support the ITU-T Recommendation G.723.1 [8] audio codec [5] when it is expected that interoperability with GSTN is needed, because it cannot be guaranteed that ITU-T Recommendation H.324 [7] terminals developed for GSTN use will support the AMR codec.

12  DTE-DCE Interface

It is recommended to use procedures defined in ITU-T Recommendation V.80 [16] in the DTE-DCE interface in case of non-integrated videophone terminal implementations with separate DTE and DCE devices. Due to the requirements of 3G-324M, the transparent synchronous access mode is the only relevant submode.

13  Optional Enhancements


14  Multipoint Considerations


15  Other Recommendations


Change history
