
TS 26.451
Codec for Enhanced Voice Services (EVS) –
Voice Activity Detection (VAD)

V17.0.0 (PDF)  2022/03  9 p.
V16.0.0  2020/06  9 p.
V15.0.0  2018/06  9 p.
V14.0.0  2017/03  9 p.
V13.0.0  2015/12  9 p.
V12.0.0  2014/09  9 p.
Mr. Wang, Bin

Content for  TS 26.451  Word version:  17.0.0


1  Scope

The present document specifies the Voice Activity Detector (VAD) used in the Discontinuous Transmission (DTX) of the EVS Codec. Although the main application of the VAD algorithm is the detection of speech or voice signals, the algorithm is more accurately described as a Signal Activity Detection (SAD) algorithm.
The present document is a high level overview of the functionality with reference to the Codec Detailed Algorithmic Description where the functionality is specified in detail.

2  References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
  • References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
  • For a specific reference, subsequent revisions do not apply.
  • For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
TR 21.905: "Vocabulary for 3GPP Specifications".
TS 26.441: "Codec for Enhanced Voice Services (EVS); General Overview".
TS 26.445: "Codec for Enhanced Voice Services (EVS); Detailed Algorithmic Description".
TS 26.442: "Codec for Enhanced Voice Services (EVS); ANSI C code (fixed-point)".
TS 26.443: "Codec for Enhanced Voice Services (EVS); ANSI C code (floating-point)".
TS 26.444: "Codec for Enhanced Voice Services (EVS); Test Sequences".
TS 26.446: "Codec for Enhanced Voice Services (EVS); AMR-WB Backward Compatible Functions".
TS 26.449: "Codec for Enhanced Voice Services (EVS); Comfort Noise Generation (CNG) Aspects".
TS 26.450: "Codec for Enhanced Voice Services (EVS); Discontinuous Transmission (DTX)".
TR 26.952: "Codec for Enhanced Voice Services (EVS); Performance Characterization".

3  Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
ACELP    Algebraic Code-Excited Linear Prediction
AMR-WB   Adaptive Multi Rate Wideband (codec)
CNG      Comfort Noise Generator
DTX      Discontinuous Transmission
EVS      Enhanced Voice Services
FEC      Frame Erasure Concealment
IP       Internet Protocol
JBM      Jitter Buffer Management
MSB      Most Significant Bit
MTSI     Multimedia Telephony Service for IMS
PS       Packet Switched
PSTN     Public Switched Telephone Network
SAD      Signal Activity Detection
SC-VBR   Source Controlled - Variable Bit Rate
SID      Silence Insertion Descriptor
SWB      Super Wideband
VAD      Voice Activity Detection
WMOPS    Weighted Millions of Operations Per Second

4  General

The function of the Enhanced Voice Services coder VAD algorithm, or more accurately the SAD algorithm, is to indicate whether each 20 ms frame contains a signal that should be transmitted, e.g. speech, music or other audio. The output of the SAD algorithm is a Boolean flag (ƒSAD). The flag is set to one for an active signal, which is any useful signal bearing meaningful information. Otherwise, the flag is set to zero, indicating an inactive signal, which carries no meaningful information. An inactive signal is mostly a pause or background noise.
The procedure of the present document is mandatory for implementation in all network entities and User Equipments (UEs) supporting the EVS coder.
The present document does not describe the ANSI-C code of this procedure. In the case of a discrepancy between the procedure described in the present document and its ANSI-C code specification contained in TS 26.442, the procedure defined by TS 26.442 prevails.
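
As an informal illustration of the per-frame flag described above, the following minimal sketch splits a signal into 20 ms frames and emits one flag per frame. Everything in it except the 20 ms frame length and the one/zero meaning of the flag is an assumption made for illustration: the names `frame_length`, `sad_flags` and `toy_energy_detector` are hypothetical, and the energy threshold is only a stand-in detector, not the EVS SAD algorithm specified in TS 26.445.

```python
from typing import Callable, List, Sequence

FRAME_MS = 20  # frame length stated in the text


def frame_length(sample_rate_hz: int) -> int:
    """Number of samples in one 20 ms frame at the given sampling rate."""
    return sample_rate_hz * FRAME_MS // 1000


def sad_flags(
    samples: Sequence[float],
    sample_rate_hz: int,
    is_active: Callable[[Sequence[float]], bool],
) -> List[int]:
    """Return one flag per complete 20 ms frame: 1 = active, 0 = inactive.

    A trailing partial frame is ignored (an assumption of this sketch).
    """
    n = frame_length(sample_rate_hz)
    flags = []
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        flags.append(1 if is_active(frame) else 0)
    return flags


def toy_energy_detector(frame: Sequence[float], threshold: float = 1e-4) -> bool:
    """Stand-in detector (NOT the EVS SAD): mean energy above a fixed threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold
```

For example, at a 16 kHz sampling rate one frame holds 320 samples, so a 640-sample input yields two flags. The detector is passed in as a callable so that the framing logic stays separate from whatever activity decision is actually used.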

5  The SAD Algorithm

The Enhanced Voice Services codec signal activity detection (SAD) module described in the present document consists of three sub-SAD modules: SAD1, SAD2 and SAD3.
SAD1 and SAD2 are first combined to provide an efficient preliminary activity decision. This preliminary decision is then modified by the third sub-SAD module, SAD3, depending upon the codec mode of operation.
The preliminary activity decision is used as the final SAD decision for the AMR-WB IO modes, while the decision as modified by SAD3 is used as the final SAD decision for all other bit-rates.
Sub-clause 5.1.12 of TS 26.445 describes the operation of the SAD and the algorithms involved in the three sub-SAD modules in detail.
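
The decision flow described in this clause can be sketched as follows. This is only an outline under stated assumptions: the present document does not say how SAD1 and SAD2 are combined or how SAD3 modifies the result, so the logical OR used in both default helpers is a placeholder, and all function and parameter names are hypothetical. The actual rules are those of sub-clause 5.1.12 of TS 26.445.

```python
from typing import Callable


def combine_placeholder(sad1: bool, sad2: bool) -> bool:
    """ASSUMPTION: logical OR stands in for the real SAD1/SAD2 combination."""
    return sad1 or sad2


def modify_placeholder(preliminary: bool, sad3: bool) -> bool:
    """ASSUMPTION: logical OR stands in for the real SAD3 modification."""
    return preliminary or sad3


def final_sad_flag(
    sad1: bool,
    sad2: bool,
    sad3: bool,
    amr_wb_io: bool,
    combine: Callable[[bool, bool], bool] = combine_placeholder,
    modify: Callable[[bool, bool], bool] = modify_placeholder,
) -> int:
    """Select the final SAD flag (1 = active, 0 = inactive) for one frame."""
    # Step 1: SAD1 and SAD2 give the preliminary activity decision.
    preliminary = combine(sad1, sad2)
    # Step 2: AMR-WB IO modes use the preliminary decision as final.
    if amr_wb_io:
        return int(preliminary)
    # Step 3: all other bit-rates use the decision as modified by SAD3.
    return int(modify(preliminary, sad3))
```

With these placeholders, a frame flagged only by SAD3 is reported as inactive in an AMR-WB IO mode and as active otherwise. The point of the sketch is the mode-dependent selection between the two decisions, not the combination logic, which can be swapped through the `combine` and `modify` parameters.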

Change history
