Content for TS 26.445 Word version: 18.0.0

5 Functional description of the encoder p. 25

5.1 Common processing p. 25

5.1.1 High-pass filtering p. 25

5.1.2 Complex low-delay filter bank analysis p. 25

5.1.2.1 Sub-band analysis p. 25

5.1.2.2 Sub-band energy estimation p. 26

5.1.3 Sample rate conversion to 12.8 kHz p. 27

5.1.3.1 Conversion of 16, 32 and 48 kHz signals to 12.8 kHz p. 27

5.1.3.2 Conversion of 8 kHz signals to 12.8 kHz p. 27

5.1.3.3 Conversion of input signals to 16, 25.6 and 32 kHz p. 29

5.1.4 Pre-emphasis p. 29

5.1.5 Spectral analysis p. 30

5.1.5.1 Windowing and DFT p. 30

5.1.5.2 Energy calculations p. 31

5.1.6 Bandwidth detection p. 32

5.1.6.1 Mean and maximum energy values per band p. 32

5.1.7 Bandwidth decision p. 34

5.1.8 Time-domain transient detection p. 37

5.1.9 Linear prediction analysis p. 38

5.1.9.1 LP analysis window p. 38

5.1.9.2 Autocorrelation computation p. 38

5.1.9.3 Adaptive lag windowing p. 39

5.1.9.4 Levinson-Durbin algorithm p. 39

5.1.9.5 Conversion of LP coefficients to LSP parameters p. 40

5.1.9.6 LSP interpolation p. 41

5.1.9.7 Conversion of LSP parameters to LP coefficients p. 41

5.1.9.8 LP analysis at 16 kHz p. 42

5.1.10 Open-loop pitch analysis p. 43

5.1.10.1 Perceptual weighting p. 43

5.1.10.2 Correlation function computation p. 44

5.1.10.3 Correlation reinforcement with past pitch values p. 45

5.1.10.4 Normalized correlation computation p. 46

5.1.10.5 Correlation reinforcement with pitch lag multiples p. 46

5.1.10.6 Initial pitch lag determination and reinforcement based on pitch coherence with other half-frames p. 47

5.1.10.7 Pitch lag determination and parameter update p. 48

5.1.10.8 Correction of very short and stable open-loop pitch estimates p. 49

5.1.10.9 Fractional open-loop pitch estimate for each subframe p. 51

5.1.11 Background noise energy estimation p. 52

5.1.11.1 First stage of noise energy update p. 52

5.1.11.2 Second stage of noise energy update p. 54

5.1.11.2.1 Basic parameters for noise energy update p. 54

5.1.11.2.2 Spectral diversity p. 55

5.1.11.2.3 Complementary non-stationarity p. 55

5.1.11.2.4 HF energy content p. 56

5.1.11.2.5 Tonal stability p. 56

5.1.11.2.6 High frequency dynamic range p. 60

5.1.11.2.7 Combined decision for background noise energy update p. 60

5.1.11.3 Energy-based parameters for noise energy update p. 62

5.1.11.3.1 Closeness to current background estimate p. 62

5.1.11.3.2 Features related to last correlation or harmonic event p. 62

5.1.11.3.3 Energy-based pause detection p. 63

5.1.11.3.4 Long-term linear prediction efficiency p. 63

5.1.11.3.5 Additional long-term parameters used for noise estimation p. 64

5.1.11.4 Decision logic for noise energy update p. 65

5.1.12 Signal activity detection p. 68

5.1.12.1 SAD1 module p. 69

5.1.12.1.1 SNR outlier filtering p. 71

5.1.12.2 SAD2 module p. 72

5.1.12.3 Combined decision of SAD1 and SAD2 modules for WB and SWB signals p. 75

5.1.12.4 Final decision of the SAD1 module for NB signals p. 75

5.1.12.5 Post-decision parameter update p. 76

5.1.12.6 SAD3 module p. 77

5.1.12.6.1 Sub-band FFT p. 77

5.1.12.6.2 Computation of signal features p. 78

5.1.12.6.3 Computation of SNR parameters p. 81

5.1.12.6.4 Decision of background music p. 83

5.1.12.6.5 Decision of background update flag p. 83

5.1.12.6.6 SAD3 Pre-decision p. 84

5.1.12.6.7 SAD3 Hangover p. 86

5.1.12.7 Final SAD decision p. 86

5.1.12.8 DTX hangover addition p. 88

5.1.13 Coding mode determination p. 90

5.1.13.1 Unvoiced signal classification p. 91

5.1.13.1.1 Voicing measure p. 92

5.1.13.1.2 Spectral tilt p. 92

5.1.13.1.3 Sudden energy increase from a low energy level p. 93

5.1.13.1.4 Total frame energy difference p. 94

5.1.13.1.5 Energy decrease after spike p. 94

5.1.13.1.6 Decision about UC mode p. 95

5.1.13.2 Stable voiced signal classification p. 96

5.1.13.3 Signal classification for FEC p. 96

5.1.13.3.1 Signal classes for FEC p. 97

5.1.13.3.2 Signal classification parameters p. 97

5.1.13.3.3 Classification procedure p. 98

5.1.13.4 Transient signal classification p. 99

5.1.13.5 Modification of coding mode in special cases p. 100

5.1.13.6 Speech/music classification p. 101

5.1.13.6.1 First stage of the speech/music classifier p. 101

5.1.13.6.2 Scaling of features in the first stage of the speech/music classifier p. 103

5.1.13.6.3 Log-probability and decision smoothing p. 104

5.1.13.6.4 State machine and final speech/music decision p. 105

5.1.13.6.5 Improvement of the classification for mixed and music content p. 108

5.1.13.6.6 Second stage of the speech/music classifier p. 112

5.1.13.6.7 Context-based improvement of the classification for stable tonal signals p. 114

5.1.13.6.8 Detection of sparse spectral content p. 118

5.1.13.6.9 Decision about AC mode p. 120

5.1.13.6.10 Decision about IC mode p. 120

5.1.14 Coder technology selection p. 120

5.1.14.1 ACELP/MDCT-based technology selection at 9.6, 16.4 and 24.4 kbps p. 121

5.1.14.1.1 Segmental SNR estimation of the MDCT-based technology p. 121

5.1.14.1.2 Segmental SNR estimation of the ACELP technology p. 127

5.1.14.1.3 Hysteresis and final decision p. 128

5.1.14.2 TCX/HQ MDCT technology selection at 13.2 and 16.4 kbps p. 129

5.1.14.3 TCX/HQ MDCT technology selection at 24.4 and 32 kbps p. 131

5.1.14.4 TD/Multi-mode FD BWE technology selection at 13.2 kbps and 32 kbps p. 134