This Annex outlines possible example adaptation implementations that make use of the adaptation signalling for speech described in clause 10.2. Several different adaptation implementations are possible, and the examples shown in this clause are not to be seen as a set of adaptive schemes that excludes other designs. Implementers are free to use these examples or to use any other adaptation algorithms. The examples are based on measuring the packet loss rate (PLR), but Annex C.1.3.1 describes how the measured frame loss rate (FLR) can be used instead of the PLR. A real implementation is free to use other adaptation triggers. The purpose of the clause is to show a few different examples of how receiver state machines can be used both to control the signalling and to control the signalling requests. Notice that the MTSI clients can have different implementations of the adaptation state machines.
The Annex is divided into three sections:
Signalling considerations: Implementation considerations on the signalling mechanism; the signalling state machine.
Adaptation state machines: Three different examples of adaptation state machines either using the full set of adaptation dimensions or a subset thereof.
Other issues and solutions: Default actions and lower layer triggers.
In this Annex, a media receiver is the receiving end of the media flow, hence the request sender of any adaptation request. A media sender is the sending entity of the media, hence the request receiver of the adaptation request. The three adaptation mechanisms available (bit-rate, packet-rate and error resilience) represent different ways to adapt to the current transport characteristics:
Bit-rate adaptation: In all examples shown in this section, reducing the bit-rate is the first action taken whenever a measurement indicates that action is needed to further optimize the session quality. A bit-rate reduction will reduce the utilization of the network resources needed to transmit the data. In the radio case, this would reduce the required transmission power and free resources either for more data or for added channel coding. It is reasonable to assume, also consistent with proper behaviour on IP networks, that a reduction of bit-rate is a valid first measure to take whenever the transport characteristics indicate that the current settings of the session do not provide an optimized session quality.
Packet-rate adaptation: In some of the examples, packet-rate adaptation is a second measure available to further adapt to the transport characteristics. A reduction of packet rate will in some cases improve the session quality, e.g. in transmission channels including WLAN. Further, a reduction of packet rate will also reduce the protocol overhead since more data is encapsulated into each RTP packet. Although robust header compression (RoHC) can reduce the protocol overhead over the wireless link, the core network will still see the full header, which for speech data constitutes a considerable part of the data transmitted. Hence, packet-rate adaptation serves as a second step in reducing the total bit-rate needed for the session.
Error resilience: The last adaptive measure in these examples is the use of error resilience measures or, explicitly, application level redundancy. Application level redundancy does not reduce the number of bits that need to be transmitted but instead transmits the data in a more robust way. Application level redundancy should only be seen as a last measure when no other adaptation action has succeeded in optimizing the session quality sufficiently well. For most normal use cases, application level redundancy is not foreseen to be used; rather, it serves as the last resort when the session quality is severely jeopardized.
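The ordering of the three measures described above can be sketched as follows. This is only an illustration of the escalation order (bit-rate first, packet-rate second, redundancy last); the names `Action`, `ESCALATION`, `next_action` and the single PLR threshold are hypothetical and do not appear in this Annex, and the example state machines use more elaborate conditions.

```python
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    """Adaptation measures, named here for illustration only."""
    REDUCE_BIT_RATE = 1
    REDUCE_PACKET_RATE = 2
    ADD_REDUNDANCY = 3


# Escalation order taken from the text: bit-rate, then packet-rate,
# then application level redundancy as the last resort.
ESCALATION = [
    Action.REDUCE_BIT_RATE,
    Action.REDUCE_PACKET_RATE,
    Action.ADD_REDUNDANCY,
]


def next_action(plr: float, threshold: float,
                already_tried: Set[Action]) -> Optional[Action]:
    """Return the next measure to try, or None if no action is needed.

    `threshold` is a hypothetical PLR limit; a real implementation is
    free to use other triggers.
    """
    if plr <= threshold:
        return None  # current settings are considered adequate
    for action in ESCALATION:
        if action not in already_tried:
            return action
    return None  # every measure has already been tried
```

For example, with a measured PLR above the assumed threshold and nothing tried yet, `next_action` selects the bit-rate reduction; once that has been tried, it moves on to packet-rate reduction.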
The control of the adaptation signalling can by itself be characterized as a state machine. The state machine is implemented in the decoder, and each MTSI client has its own implementation. The decoder sends requests as described in clause 10.2 to the encoder at the other end.
The requests that are transmitted can be queued up in a send buffer to be transmitted the next time an RTCP-APP packet is to be sent. Hence, a sender might receive one, two or all three receiver requests at the same time. It should not expect any specific order of the requests. A receiver shall not send multiple requests of the same type in the same RTCP-APP packet. Transmission of the requests should preferably be done immediately using the AVPF early mode, but in some cases it may be justified to delay the transmission for a limited time or until the next DTX period in order to minimize disturbance on the RTP stream. In the latter case, the monitoring of the RTP stream described below must take the additional delay into account.
A request can be sent immediately (alone in one RTCP-APP packet) but the subsequent RTCP-APP packet must follow the transmission rules for RTCP.
RTCP-APP packets may be delayed until the next DTX period.
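A minimal sketch of the send buffer described above is given here. It only illustrates the rule that one RTCP-APP packet carries at most one request of each type. The class `RequestQueue`, the request type strings, and the choice to let a newer request of a type replace an older pending one are assumptions made for this illustration; the actual request formats are those of clause 10.2.

```python
class RequestQueue:
    """Hypothetical send buffer for adaptation requests."""

    def __init__(self):
        # Keyed by request type, so at most one request per type is pending.
        self._pending = {}

    def enqueue(self, req_type: str, value) -> None:
        # Assumption: a newer request of the same type supersedes the
        # older one instead of being sent alongside it.
        self._pending[req_type] = value

    def flush(self) -> dict:
        """Return the requests for the next RTCP-APP packet and clear the buffer.

        The result is a mapping because the media sender should not
        expect any specific order of the requests.
        """
        batch = dict(self._pending)
        self._pending.clear()
        return batch


queue = RequestQueue()
queue.enqueue("codec_mode", 5.9)
queue.enqueue("codec_mode", 6.7)   # replaces the pending 5.9 request
queue.enqueue("redundancy", 100)
batch = queue.flush()              # one entry per request type
```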
Reception of the transmitted RTCP-APP packets is not guaranteed. Similar to the RTP packets, the RTCP packets might be lost due to link losses. Monitoring that the adaptation requests are followed can be done by inspecting the received RTP stream.
For various reasons the requests might not be followed even though they were received successfully by the other end. This behaviour can be seen in the following ways:
Request completely ignored: An example is a request for 1 frame/packet, which might be rejected if the MTSI client decides that the default mode of operation is 2 frames/packet or more and that a frame aggregation reduction compared to the default state is not allowed.
Request partially followed: An example here is when no redundancy is received and a request for 100% redundancy with 1 extra frame offset is made, which may be realized by the media sender as 100% redundancy with no extra offset. Another example is when a request for 5.9 kbps codec rate is sent and it is realized as e.g. 6.7 kbps codec rate. Table C.1 displays how the requests and realizations are grouped. E.g. it can be seen (if Ninit = 1) that a request for 3 frames per packet realized as 2 frames per packet is considered to be fulfilled.
In Table C.1 above, Ninit is 1 in most cases, which corresponds to 1 frame per packet. In certain cases Ninit might have another value; one such example is E-GPRS access, where Ninit may be 2. Ninit is given by the ptime SDP attribute.
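As an illustration, Ninit can be derived from ptime and the frame aggregation example above can be checked as sketched below. The 20 ms frame duration holds for AMR and AMR-WB speech frames. The grouping rule in `aggregation_fulfilled` is an assumption that merely reproduces the single example quoted in the text; Table C.1 remains the authoritative grouping. Both function names are hypothetical.

```python
FRAME_MS = 20  # duration of one AMR / AMR-WB speech frame in milliseconds


def n_init_from_ptime(ptime_ms: int) -> int:
    """Number of frames per packet implied by the ptime SDP attribute."""
    if ptime_ms <= 0 or ptime_ms % FRAME_MS != 0:
        raise ValueError("ptime must be a positive multiple of 20 ms")
    return ptime_ms // FRAME_MS


def aggregation_fulfilled(requested: int, realized: int, n_init: int) -> bool:
    """Assumed grouping rule, NOT a reproduction of Table C.1.

    A request for more than n_init frames per packet is treated as
    fulfilled when the sender aggregates more than n_init frames, even
    if fewer than requested. Any other request must be matched exactly.
    """
    if requested > n_init:
        return realized > n_init
    return realized == requested
```

With ptime equal to 20 this gives Ninit = 1, and a request for 3 frames per packet that is realized as 2 frames per packet is reported as fulfilled, in line with the example in the text.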
If the requests are not followed as requested, the request should not be repeated indefinitely, as this would increase the total bit-rate without clear benefit. In order to avoid such behaviour, the following recommendations apply:
Partially fulfilled requests should be considered as obeyed.
If a new request is not fulfilled within T_RESPONSE ms, the request is repeated, with a delay between trials of 2*T_RESPONSE ms. If three attempts have been made without sender action, it should be assumed that the request cannot be fulfilled. In this case, the adaptation state machine will stay in the previous state or in a state that matches the current properties (codec mode, redundancy, frame aggregation). Any potential mismatch between the defined states in the adaptation state machine and the current properties of the media stream should be resolved by the request sender.
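One possible reading of the retry recommendation is sketched below. It assumes that the delay of 2*T_RESPONSE ms is measured between the transmission times of consecutive attempts and that the last attempt is given T_RESPONSE ms before the request is abandoned; the text does not fix these details. The function names and the value used for T_RESPONSE are hypothetical.

```python
MAX_ATTEMPTS = 3  # the text allows three attempts in total


def attempt_times(t_response_ms: int, max_attempts: int = MAX_ATTEMPTS) -> list:
    """Transmission times (ms, relative to the first request).

    Assumption: consecutive attempts are spaced 2 * T_RESPONSE apart.
    """
    return [i * 2 * t_response_ms for i in range(max_attempts)]


def give_up_time(t_response_ms: int, max_attempts: int = MAX_ATTEMPTS) -> int:
    """Time at which the request is assumed to be unfulfillable.

    Assumption: the final attempt is also given T_RESPONSE ms to take effect.
    """
    return attempt_times(t_response_ms, max_attempts)[-1] + t_response_ms


schedule = attempt_times(500)   # hypothetical T_RESPONSE of 500 ms
deadline = give_up_time(500)
```

Under these assumptions, a T_RESPONSE of 500 ms gives attempts at 0, 1000 and 2000 ms, and the request is abandoned at 2500 ms if the received RTP stream still does not reflect it.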
The default mode of operation for an MTSI client, if the RTCP bandwidth for the session is greater than zero, is that the requests received should be followed. Ignoring requests should be avoided as much as possible. However, it is required that any signalling requests are aligned with the agreed session parameters in the SDP.
In some cases the adaptation state machine may go out-of-synch with the received RTP stream. Such cases may occur if e.g. the other MTSI client makes a reset. These special cases can be sensed, e.g. through a detection of a large gap in timestamp and/or sequence number. The state machine should then reset to the default state and start over again.
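The gap detection mentioned above could be sketched as follows. RTP sequence numbers are 16 bits wide and RTP timestamps are 32 bits wide, so both differences are computed modulo their range. The function names and the two thresholds are hypothetical; suitable values depend on the codec clock rate and on how DTX is handled.

```python
SEQ_MOD = 1 << 16   # RTP sequence numbers wrap at 2^16
TS_MOD = 1 << 32    # RTP timestamps wrap at 2^32


def signed_delta(new: int, old: int, modulus: int) -> int:
    """Smallest signed difference new - old under wrap-around."""
    d = (new - old) % modulus
    return d - modulus if d >= modulus // 2 else d


def looks_like_reset(prev_seq: int, seq: int, prev_ts: int, ts: int,
                     max_seq_gap: int = 50, max_ts_gap: int = 80000) -> bool:
    """Flag a large jump in sequence number and/or timestamp.

    Both thresholds are illustrative assumptions: 50 packets, and
    80000 timestamp ticks (5 s at a 16 kHz RTP clock).
    """
    seq_jump = abs(signed_delta(seq, prev_seq, SEQ_MOD)) > max_seq_gap
    ts_jump = abs(signed_delta(ts, prev_ts, TS_MOD)) > max_ts_gap
    return seq_jump or ts_jump
```

When `looks_like_reset` returns true, the adaptation state machine would be returned to its default state, as recommended above. An ordinary sequence number wrap from 65535 to 0 is not flagged, because the signed difference is 1.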
The signalling state machine has three states according to Table C.2.
T1: Idle state. This is the default state of the signalling state machine. The signalling state should always return here after a state transition and when it has been detected that the media sender has followed the request, either completely or partially. The signalling state machine remains in this state as long as the selected adaptation is "stable", i.e. as long as the adaptation measures are appropriate for the current operating conditions. When it has been detected that the operating conditions have changed so much that the current adaptation measures are no longer appropriate, the adaptation function triggers request signalling and the signalling state machine goes to state T2.
T2: In this state, the received RTP stream is monitored to verify that the properties of a given adaptation state (redundancy, frame aggregation and codec mode) are detected in the received RTP stream. If necessary, some of the requests are repeated, at most 3 times. If any of the properties is considered not to be fulfilled, the signalling state machine enters state T3.
T3: In this state, the properties of the RTP stream (redundancy, frame aggregation and codec rate) are reverted to the properties of the last successful state and a new state transition is tested in T2, or alternatively the adaptation state is set to the state that matches the current properties (codec mode, redundancy, frame aggregation).
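The transitions between the three signalling states can be sketched as a small transition function. This is only an illustration: the names `SigState` and `step` and the boolean inputs are hypothetical, `IDLE` stands for the idle state while `MONITOR` and `REVERT` stand for states T2 and T3, and the return from T3 to the idle state in the alternative branch is an assumption based on the statement that the signalling state should always return to idle after a state transition.

```python
from enum import Enum


class SigState(Enum):
    IDLE = "idle"      # default state
    MONITOR = "T2"     # verify the received RTP stream, repeat requests
    REVERT = "T3"      # fall back to the last successful properties


def step(state: SigState, *, trigger: bool = False, fulfilled: bool = False,
         attempts: int = 0, retest: bool = False,
         max_attempts: int = 3) -> SigState:
    """Return the next signalling state.

    trigger   : the adaptation function requests new signalling
    fulfilled : the RTP stream shows the requested properties
                (a partially followed request counts as fulfilled)
    attempts  : number of times the request has been sent so far
    retest    : in REVERT, test a new state transition in T2
    """
    if state is SigState.IDLE:
        return SigState.MONITOR if trigger else SigState.IDLE
    if state is SigState.MONITOR:
        if fulfilled:
            return SigState.IDLE
        if attempts >= max_attempts:
            return SigState.REVERT
        return SigState.MONITOR  # keep monitoring, possibly repeat the request
    # REVERT: either test a new transition in T2 or, by assumption,
    # adopt the state matching the current stream and return to idle.
    return SigState.MONITOR if retest else SigState.IDLE
```

A typical sequence under these assumptions is idle, then T2 after a trigger, then either back to idle once the stream matches the request or on to T3 after three unanswered attempts.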