Digital cellular telecommunications system (Phase 2+) (GSM); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)

Upgrade from Phase 2+ to Release 1997

Digitalni celični telekomunikacijski sistem (faza 2+) – Govor s polovično hitrostjo – Prekodiranje pri polovični hitrosti govora (GSM 06.20, različica 6.0.1, izdaja 1997)

General Information

Status: Published
Publication Date: 30-Nov-2003
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 01-Dec-2003
Due Date: 01-Dec-2003
Completion Date: 01-Dec-2003
Mandate
Standard
SIST EN 300 969 V6.0.1:2003
English language
47 pages

Standards Content (Sample)


SLOVENIAN STANDARD
SIST EN 300 969 V6.0.1:2003
01 December 2003

Digital cellular telecommunications system (Phase 2+) (GSM); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)

This Slovenian standard is identical to: EN 300 969 Version 6.0.1
ICS: 33.070.50 Global System for Mobile Communication (GSM)
Language: en

2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard in whole or in part is not permitted.

EN 300 969 V6.0.1 (1999-06)
European Standard (Telecommunications series)

Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)

GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS

Reference: DEN/SMG-110620Q6 (8z00300o.PDF)
Keywords: Digital cellular telecommunications system, Global System for Mobile communications (GSM), CODEC, GSM, speech

ETSI
Postal address: F-06921 Sophia Antipolis Cedex - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis, Valbonne - FRANCE
Tel.: +33 4 92 94 42 00
Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88
Internet: secretariat@etsi.fr
Individual copies of this ETSI deliverable can be downloaded from http://www.etsi.org

Copyright Notification
No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1999. All rights reserved.

Contents

Intellectual Property Rights
Foreword
1 Scope
2 References
3 Definitions, symbols and abbreviations
3.1 Definitions
3.2 Symbols
3.3 Abbreviations
4 Functional description of the GSM half rate speech codec
4.1 GSM half rate speech encoder
4.1.1 High-pass filter
4.1.2 Segmentation
4.1.3 Fixed Point Lattice Technique (FLAT)
4.1.4 Spectral quantization
4.1.4.1 Autocorrelation Fixed Point Lattice Technique (AFLAT)
4.1.5 Frame energy calculation and quantization
4.1.6 Soft interpolation of the spectral parameters
4.1.7 Spectral noise weighting filter coefficients
4.1.8 Long Term Predictor lag determination
4.1.8.1 Open loop long term search initialization
4.1.8.2 Open loop lag search
4.1.8.3 Frame lag trajectory search (Mode ≠ 0)
4.1.8.4 Voicing mode selection
4.1.8.5 Closed loop lag search
4.1.9 Harmonic noise weighting
4.1.10 Code search algorithm
4.1.10.1 Decorrelation of filtered basis vectors
4.1.10.2 Fast search technique
4.1.11 Multimode gain vector quantization
4.1.11.1 Coding GS and P0
4.2 GSM half rate speech decoder
4.2.1 Excitation generation
4.2.2 Adaptive pitch prefilter
4.2.3 Synthesis Filter
4.2.4 Adaptive spectral postfilter
4.2.5 Updating decoder states
5 Homing sequences
5.1 Functional description
5.2 Definitions
5.3 Encoder homing
5.4 Decoder homing
5.5 Encoder home state
5.6 Decoder home state
Annex A (normative): Codec parameter description
A.1 Codec parameter description
A.1.1 MODE
A.1.2 R0
A.1.3 LPC1 - LPC3
A.1.4 LAG_1 - LAG_4
A.1.5 CODEx_1 - CODEx_4
A.1.6 GSP0_1 - GSP0_4
A.2 Basic coder parameters
Annex B (normative): Order of occurrence of the codec parameters over Abis
Annex C (informative): Bibliography
Annex D (informative): Change Request History
History

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available free of charge from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://www.etsi.org/ipr).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This European Standard (Telecommunications series) has been produced by ETSI Technical Committee Special Mobile Group (SMG).

The present document specifies the speech codec to be used for the GSM half rate channel for the digital cellular telecommunications system. The present document is part of a series covering the half rate speech traffic channels as described below:

GSM 06.02 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech processing functions".
GSM 06.06 "Digital cellular telecommunications system (Phase 2+); Half rate speech; ANSI-C code for the GSM half rate speech codec".
GSM 06.07 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Test sequences for the GSM half rate speech codec".
GSM 06.20 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech transcoding".
GSM 06.21 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Substitution and muting of lost frames for half rate speech traffic channels".
GSM 06.22 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Comfort noise aspects for half rate speech traffic channels".
GSM 06.41 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Discontinuous Transmission (DTX) for half rate speech traffic channels".
GSM 06.42 "Digital cellular telecommunications system (Phase 2+); Half rate speech; Voice Activity Detector (VAD) for half rate speech traffic channels".

The contents of the present document are subject to continuing work within SMG and may change following formal SMG approval. Should SMG modify the contents of the present document it will be re-released with an identifying change of release date and an increase in version number as follows:

Version 6.x.y

where:

6    indicates Release 1997 of GSM Phase 2+;
x    the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.;
y    the third digit is incremented when editorial only changes have been incorporated in the specification.

Proposed national transposition dates

Date of adoption of this EN: 05 June 1999
Date of latest announcement of this EN (doa): 30 September 1999
Date of latest publication of new National Standard or endorsement of this EN (dop/e): 31 March 2000
Date of withdrawal of any conflicting National Standard (dow): 31 March 2000

1 Scope

The present document specifies the speech codec to be used for the GSM half rate channel. It also specifies the test methods to be used to verify that the codec implementation complies with the present document.

The requirements are mandatory for the codec to be used either in GSM Mobile Stations (MSs) or Base Station Systems (BSSs) that utilize the half rate GSM speech traffic channel.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies.
- A non-specific reference to an ETS shall also be taken to refer to later versions published as an EN with the same number.

[1] GSM 06.02: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech processing functions".
[2] GSM 06.06: "Digital cellular telecommunications system (Phase 2+); Half rate speech; ANSI-C code for the GSM half rate speech codec".
[3] GSM 06.07: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Test sequences for the GSM half rate speech codec".

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the following definitions apply:

adaptive codebook: The adaptive codebook is derived from the long term filter state. The lag value can be viewed as an index into the adaptive codebook.

adaptive pitch prefilter: In the GSM half rate speech decoder, this filter is applied to the excitation signal to enhance the periodicity of the reconstructed speech. Note that this is done prior to the application of the short term filter.

adaptive spectral postfilter: In the GSM half rate speech decoder, this filter is applied to the output of the short term filter to enhance the perceptual quality of the reconstructed speech.

allowable lags: The set of lag values which may be coded by the GSM half rate speech encoder and transmitted to the GSM half rate speech decoder. This set contains both integer and fractional values (see table 3).

analysis window: For each frame, the short term filter coefficients are computed using the high pass filtered speech samples within the analysis window. The analysis window is 170 samples in length, and is centered about the last 100 samples in the frame.

basis vectors: A set of M, M_1, or M_2 vectors of length N_s used to generate the VSELP codebook vectors. These vectors are not necessarily orthogonal.

closed loop lag search: A process of determining the near optimal lag value from the weighted input speech and the long term filter state.

closed loop lag trajectory: For a given frame, the sequence of near optimal lag values whose elements correspond to each of the four subframes as determined by the closed loop lag search.

codebook: A set of vectors used in a vector quantizer.

codeword (or code): An M, M_1, or M_2 bit symbol indicating the vector to be selected from a VSELP codebook.

delta (LAG) code: A four bit code indicating the change in lag value for a subframe relative to the previous subframe's coded lag. For frames in which the long term predictor is enabled (MODE 1, 2, or 3), the lag for subframe 1 is independently coded using eight bits, and delta codes are used for subframes 2, 3, and 4.

direct form coefficients: One of the formats for storing the short term filter parameters. All filters which are used to modify speech samples use direct form coefficients.

fractional lags: A set of lag values having sub-sample resolution. Note that not every fractional lag value considered in the GSM half rate speech encoder is an allowable lag value.

frame: A time interval equal to 20 ms, or 160 samples at an 8 kHz sampling rate.

harmonic noise weighting filter: This filter exploits the noise masking properties of the spectral peaks which occur at harmonics of the pitch frequency by weighting the residual error less in regions near the pitch harmonics and more in regions away from them. Note that this filter is only used when the long term filter is enabled (MODE = 1, 2 or 3).

high pass filter: This filter is used to de-emphasize the low frequency components of the input speech signal.

integer lags: A set of lag values having whole sample resolution.

interpolating filter: An FIR filter used to estimate sub-sample resolution samples, given an input sampled with integer sample resolution.

lag: The long term filter delay. This is typically the pitch period, or a multiple or sub-multiple of it.

long term filter: This filter is used to generate the periodic component in the excitation for the current subframe. This filter is only enabled for MODE = 1, 2 or 3.

LPC coefficients: Linear Predictive Coding (LPC) coefficients is a generic term for the short term filter coefficients.

open loop lag search: A process of estimating the near optimal lag directly from the weighted speech input. This is done to narrow the range of lag values over which the closed loop lag search shall be performed.

open loop lag trajectory: For a given frame, the sequence of near optimal lag values whose elements correspond to the four subframes as determined by the open loop lag search.

reflection coefficients: An alternative representation of the information contained in the short term filter parameters.

residual: The output signal resulting from an inverse filtering operation.

short term filter: This filter introduces, into the excitation signal, short term correlation which models the impulse response of the vocal tract.

soft interpolation: A process wherein a decision is made for each frame to use either interpolated or uninterpolated short term filter parameters for the four subframes in that frame.

soft interpolation bit: A one bit code indicating whether or not interpolation of the short term parameters is to be used in the current frame.

spectral noise weighting filter: This filter exploits the noise masking properties of the formants (vocal tract resonances) by weighting the residual error less in regions near the formant frequencies and more in regions away from them.

subframe: A time interval equal to 5 ms, or 40 samples at an 8 kHz sampling rate.

vector quantization: A method of grouping several parameters into a vector and quantizing them simultaneously.

GSP0 vector quantizer: The vector quantization process, with its intermediate parameters (GS and P0), used to code the excitation gains β and γ.

VSELP codebook: Vector-Sum Excited Linear Predictive (VSELP) codebook, used in the GSM half rate speech coder, wherein each codebook vector is constructed as a linear combination of the fixed basis vectors.

zero input response: The output of a filter due to all past inputs, i.e. due to the present state of the filter, given that an input of zeros is applied.

zero state response: The output of a filter due to the present input, given that no past inputs have been applied, i.e. given the state information in the filter is all zeroes.

3.2 Symbols

For the purposes of the present document, the following symbols apply:

A(z)    Short term spectral filter.
α_i    The LPC coefficients.
b_L(n)    The output of the long term filter state (adaptive codebook) for lag L.
β    The long term filter coefficient.
C(z)    Second weighting filter.
e(n)    Weighted error signal.
f_j(i)    The coefficients of the j-th phase of the 10th order interpolating filter used to evaluate candidate fractional lag values; i ranges from 0 to P_f - 1.
g_j(i)    The coefficients of the j-th phase of the 6th order interpolating filter used to interpolate C's and G's as well as fractional lags in the harmonic noise weighting; i ranges from 0 to P_g - 1.
γ    The gain applied to the vector(s) selected from the VSELP codebook(s).
H    An M_2 bit code indicating the vector to be selected from the second VSELP codebook (when operating in mode 0).
I    An M or M_1 bit code indicating the vector to be selected from one of the two first VSELP codebooks.
L    The long term filter lag value.
L_max    142 (samples), the maximum possible value for the long term filter lag.
L_min    21 (samples), the minimum possible value for the long term filter lag.
M    9, the number of basis vectors, and the number of bits in a codeword, for the VSELP codebook used in modes 1, 2, and 3.
M_1    7, the number of basis vectors, and the number of bits in a codeword, for the first VSELP codebook used in mode 0.
M_2    7, the number of basis vectors, and the number of bits in a codeword, for the second VSELP codebook used in mode 0.
MODE    A two bit code indicating the mode for the current frame (see annex A).
N_A    170, the length of the analysis window. This is the number of high pass filtered speech samples used to compute the short term filter parameters for each frame.
N_F    160, the number of samples per frame (at a sampling rate of 8 kHz).
N_p    10, the short term filter order.
N_s    40, the number of samples per subframe (at a sampling rate of 8 kHz).
P_1    6, the number of bits in the prequantizer for the r_1 - r_3 vector quantizer.
P_2    5, the number of bits in the prequantizer for the r_4 - r_6 vector quantizer.
P_3    4, the number of bits in the prequantizer for the r_7 - r_10 vector quantizer.
P_f    The order of one phase of an interpolating filter used to evaluate candidate fractional lag values. P_f equals 10 for j ≠ 0 and equals 1 for j = 0.
P_g    The order of one phase of an interpolating filter, f_j(n), used to interpolate C's and G's as well as fractional lags in the harmonic noise weighting; P_g equals 6.
pitch    The time duration between the glottal pulses which result when the vocal cords vibrate during speech production.
Q_1    11, the number of bits in the r_1 - r_3 reflection coefficient vector quantizer.
Q_2    9, the number of bits in the r_4 - r_6 reflection coefficient vector quantizer.
Q_3    8, the number of bits in the r_7 - r_10 reflection coefficient vector quantizer.

R0    A five bit code used to indicate the energy level in the current frame.
r(n)    The long term filter state (the history of the excitation signal); n < 0.
r_L(n)    The long term filter state with the adaptive codebook output for lag L appended.
s'(n)    Synthesized speech.
W(z)    Spectral weighting filter.
λ_hnw    The harmonic noise weighting filter coefficient.
ξ    The adaptive pitch prefilter coefficient.
⌈x⌉    Ceiling function: the largest integer y where y < x + 1,0.
⌊x⌋    Floor function: the largest integer y where y ≤ x.
$\sum_{i=j}^{K} x(i)$    Summation: x(j) + x(j+1) + ... + x(K).
$\prod_{i=j}^{K} x(i)$    Product: x(j)(x(j+1))...(x(K)).
max(x,y)    Find the larger of two numbers x and y.
min(x,y)    Find the smaller of two numbers x and y.
round(x)    Round the non-integer x to the closest integer y: y = ⌊x + 0,5⌋.

3.3 Abbreviations

For the purposes of the present document, the following abbreviations apply:

AFLAT    Autocorrelation Fixed point LAttice Technique
CELP    Code Excited Linear Prediction
FLAT    Fixed Point Lattice Technique
LTP    Long Term Predictor
SST    Spectral Smoothing Technique
VSELP    Vector-Sum Excited Linear Prediction

4 Functional description of the GSM half rate speech codec

The GSM half rate codec uses the VSELP (Vector-Sum Excited Linear Prediction) algorithm. The VSELP algorithm is an analysis-by-synthesis coding technique and belongs to the class of speech coding algorithms known as CELP (Code Excited Linear Prediction).

The GSM half rate codec's encoding process is performed on one 20 ms speech frame at a time. A speech frame of the sampled speech waveform is read and, based on the current waveform and the past history of the waveform, the encoder derives 18 parameters that describe it. The parameters extracted are grouped into the following three general classes:

- energy parameters (R0 and GSP0);
- spectral parameters (LPC and INT_LPC);
- excitation parameters (LAG and CODE).

These parameters are quantized into 112 bits for transmission as described in annex A and their order of occurrence over Abis is given in annex B.

The GSM half rate codec is an analysis-by-synthesis codec, therefore the speech decoder is primarily a subset of the speech encoder. The quantized parameters are decoded and a synthetic excitation is generated using the energy and excitation parameters. The synthetic excitation is then filtered to provide the spectral information, resulting in the generation of the synthesized speech (see figure 1).

[Figure 1: Block diagram of the GSM half rate speech codec. Speech enters the speech encoder, which outputs the transmitted speech parameters; the speech decoder takes the received speech parameters and outputs synthesised speech.]

The ANSI-C code that describes the GSM half rate speech codec is given in GSM 06.06 [2] and the test sequences in GSM 06.07 [3] (see clause 5 for the codec homing test sequences).

4.1 GSM half rate speech encoder

The GSM half rate speech encoder uses an analysis by synthesis approach to determine the code to use to represent the excitation for each subframe. The codebook search procedure consists of trying each codevector as a possible excitation for the Code Excited Linear Predictive (CELP) synthesizer. The synthesized speech s'(n) is compared against the input speech and a difference signal is generated. This difference signal is then filtered by a spectral weighting filter, W(z), (and possibly a second weighting filter, C(z)) to generate a weighted error signal, e(n). The power in e(n) is computed. The codevector which generates the minimum weighted error power is chosen as the codevector for that subframe. The spectral weighting filter serves to weight the error spectrum based on perceptual considerations. This weighting filter is a function of the speech spectrum and can be expressed in terms of the α parameters of the short term (spectral) filter.

$$W(z) = \frac{1 - \sum_{i=1}^{N_p} \alpha_i z^{-i}}{1 - \sum_{i=1}^{N_p} \tilde{\alpha}_i z^{-i}} \tag{1}$$

The computation of the α̃_i coefficients is described in subclause 4.1.7.

The second weighting filter C(z), if used, is a harmonic weighting filter and is used to control the amount of error in the harmonics of the speech signal. If the weighting filter(s) are moved to both input paths to the subtracter, an equivalent configuration is obtained as shown in figure 2.

[Figure 2: Block diagram of the GSM half rate speech encoder (MODE = 1, 2 and 3). Recoverable block labels: input speech s(n); determine LPC coefficients; W(z); C(z); B(z); H(z); VSELP codebook; gains β and γ; total weighted error; find minimum over L and all I; find optimal gains β and γ. Output parameters: LPC1, LPC2, LPC3, R0, INT_LPC, MODE, LAG_1 to LAG_4, CODE_1 to CODE_4, GSP0_1 to GSP0_4.]

Here H(z) is the combination of A(z), the short term (spectral) filter, and W(z), the spectral weighting filter. These filters are combined since the denominator of A(z) is cancelled by the numerator of W(z).

$$H(z) = \frac{1}{1 - \sum_{i=1}^{N_p} \tilde{\alpha}_i z^{-i}} \tag{2}$$

There are two approaches that can be used for calculating the gain, γ. The gain can be determined prior to codebook search based on residual energy. This gain would then be fixed for the codebook search. Another approach is to optimize the gain for each codevector during the codebook search. The codevector which yields the minimum weighted error would be chosen and its corresponding optimal gain would be used for γ. The latter approach generally yields better results since the gain is optimized for each codevector. This approach also implies that the gain term needs to be updated at the subframe rate. The optimal code and gain for this technique can be computed as follows:

The input speech is first filtered by a high pass filter as described in subclause 4.1.1. The short term filter parameters are computed from the filtered input speech once per frame. A fast fixed point covariance lattice technique is used. Subclauses 4.1.3 and 4.1.4 describe in detail how the short term parameters are determined and quantized. An overall frame energy is also computed and coded once per frame. Once per frame, one of the four voicing modes is selected. If MODE ≠ 0, the long term predictor is used and the long term predictor lag, L, is updated at the subframe rate. L and a VSELP codeword are selected sequentially. Each is chosen to minimize the weighted mean square error. The long-term filter coefficient, β, and the codebook gain, γ, are optimized jointly. Subclause 4.1.8 describes the technique for selecting from among the voicing modes and, if one of the voiced modes is chosen, determining the long-term filter lag. Subclause 4.1.10 describes an efficient technique for jointly optimizing β, γ and the codeword selection. Subclause 4.1.10 also includes the description of the fast VSELP codebook search technique. The β and γ parameters are transformed to equivalent parameters using the frame energy term, and are vector quantized every subframe. The coding of the frame energy and the β and γ parameters is described in subclause 4.1.11.

4.1.1 High-pass filter

The 13 bit linear Pulse Code Modulated (PCM) input speech, x(n), is filtered by a fourth order pole-zero high pass filter. This filter suppresses the frequency components of the input speech which are below 120 Hz. The filter is implemented as a cascade of two second-order Infinite Impulse Response (IIR) filters. Incorporated into the filter coefficients is a gain of 0,5. The difference equation for the first filter is:

$$\tilde{y}(n) = \sum_{i=0}^{2} b_{1,i}\, x(n-i) + \sum_{j=1}^{2} a_{1,j}\, \tilde{y}(n-j) \tag{3}$$

where:

b_{1,0} = 0,335052
b_{1,1} = -0,669983    a_{1,1} = 0,926117
b_{1,2} = 0,335052    a_{1,2} = -0,429413

The difference equation for the second filter is:

$$y(n) = \sum_{i=0}^{2} b_{2,i}\, \tilde{y}(n-i) + \sum_{j=1}^{2} a_{2,j}\, y(n-j) \tag{4}$$

where:

b_{2,0} = 0,335052
b_{2,1} = -0,669434    a_{2,1} = 0,965332
b_{2,2} = 0,335052    a_{2,2} = -0,469513
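The cascade of equations (3) and (4) can be sketched in a few lines of Python. This is a floating-point illustration only, not the bit-exact fixed-point arithmetic of the ANSI-C code in GSM 06.06 [2]; the function names `biquad` and `high_pass` are hypothetical, and the filter state is assumed to start at zero.

```python
# Coefficients of the two second-order sections, taken from subclause 4.1.1.
B1 = (0.335052, -0.669983, 0.335052)
A1 = (0.926117, -0.429413)
B2 = (0.335052, -0.669434, 0.335052)
A2 = (0.965332, -0.469513)


def biquad(x, b, a):
    """One section: y(n) = sum_i b[i]*x(n-i) + sum_j a[j-1]*y(n-j).

    The state is assumed to be zero before the first sample (an assumption
    of this sketch, not a statement about the codec's start-up behaviour).
    """
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 + a[0] * y1 + a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def high_pass(x):
    """Equation (3) followed by equation (4)."""
    return biquad(biquad(x, B1, A1), B2, A2)
```

Feeding a constant (DC) input through `high_pass` should decay towards zero, while a signal alternating at half the sampling rate passes with an amplitude of the same order as the input.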

4.1.2 Segmentation

A sample buffer containing the previous 195 input high pass filtered speech samples, y(n), is shifted so that the oldest 160 samples are shifted out while the next 160 input samples are shifted in. The oldest 160 samples in the buffer correspond to the next frame of samples to be encoded. The analysis interval comprises the most recent 170 samples in the buffer. The samples in the buffer are labelled as s(n) where 0 ≤ n ≤ 194 and s(0) is the first (oldest) sample.

4.1.3 Fixed Point Lattice Technique (FLAT)

Let r_j represent the j-th reflection coefficient. The FLAT algorithm for the determination of the reflection coefficients is stated as follows:

STEP 1    Compute the covariance (autocorrelation) matrix from the input speech:

$$\phi(i,k) = \sum_{n=N_p}^{N_A} s(n+24-i)\, s(n+24-k), \qquad 0 \le i,k \le N_p \tag{5}$$

STEP 2    The φ(i,k) array is modified by windowing:

$$\phi'(i,k) = \phi(i,k)\, w(|i-k|), \qquad 0 \le i,k \le N_p \tag{6}$$

STEP 3

$$F_0(i,k) = \phi'(i,k), \qquad 0 \le i,k \le N_p-1 \tag{7}$$

$$B_0(i,k) = \phi'(i+1,k+1), \qquad 0 \le i,k \le N_p-1 \tag{8}$$

$$C_0(i,k) = \phi'(i,k+1), \qquad 0 \le i,k \le N_p-1 \tag{9}$$

STEP 4    Set j = 1.

STEP 5    Compute r_j:

$$r_j = -2\, \frac{C_{j-1}(0,0) + C_{j-1}(N_p-j,N_p-j)}{F_{j-1}(0,0) + B_{j-1}(0,0) + F_{j-1}(N_p-j,N_p-j) + B_{j-1}(N_p-j,N_p-j)} \tag{10}$$

STEP 6    If j = N_p then done.

STEP 7    Update F_j(i,k), B_j(i,k), C_j(i,k) for 0 ≤ i, k ≤ N_p - j - 1:

$$F_j(i,k) = F_{j-1}(i,k) + r_j \left[ C_{j-1}(i,k) + C_{j-1}(k,i) \right] + r_j^2\, B_{j-1}(i,k) \tag{11}$$

$$B_j(i,k) = B_{j-1}(i+1,k+1) + r_j \left[ C_{j-1}(i+1,k+1) + C_{j-1}(k+1,i+1) \right] + r_j^2\, F_{j-1}(i+1,k+1) \tag{12}$$

$$C_j(i,k) = C_{j-1}(i,k+1) + r_j \left[ B_{j-1}(i,k+1) + F_{j-1}(i,k+1) \right] + r_j^2\, C_{j-1}(k+1,i) \tag{13}$$

STEP 8    j = j + 1.

STEP 9    Go to STEP 5.

The windowing coefficients, w(|i-k|), are found in table 1.
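As an illustration of STEP 1 to STEP 9, the recursion can be written out in floating point as follows. This is a sketch under stated assumptions rather than the normative fixed-point procedure of GSM 06.06 [2]: the function name is hypothetical, the summation bounds and the offset of 24 follow equation (5) as read from this text and are exposed as parameters, and the window w is passed in by the caller with N_p + 1 entries because |i-k| can reach N_p in equation (6).

```python
def flat_reflection_coefficients(s, window, n_p=10, n_a=170, offset=24):
    """Floating-point sketch of FLAT (equations (5) to (13)).

    s      -- buffered high pass filtered samples s(0)..s(194)
    window -- w(0)..w(n_p), supplied by the caller (assumption of this sketch)
    Returns the list [r_1, ..., r_Np].
    """
    size = n_p + 1
    # STEP 1, equation (5): covariance matrix phi(i, k).
    phi = [[sum(s[n + offset - i] * s[n + offset - k]
                for n in range(n_p, n_a + 1))
            for k in range(size)] for i in range(size)]
    # STEP 2, equation (6): windowing.
    phi_w = [[phi[i][k] * window[abs(i - k)] for k in range(size)]
             for i in range(size)]
    # STEP 3, equations (7) to (9).
    f = [[phi_w[i][k] for k in range(n_p)] for i in range(n_p)]
    b = [[phi_w[i + 1][k + 1] for k in range(n_p)] for i in range(n_p)]
    c = [[phi_w[i][k + 1] for k in range(n_p)] for i in range(n_p)]

    r = []
    for j in range(1, n_p + 1):  # STEP 4, STEP 8 and STEP 9
        m = n_p - j
        # STEP 5, equation (10).
        r_j = (-2.0 * (c[0][0] + c[m][m])
               / (f[0][0] + b[0][0] + f[m][m] + b[m][m]))
        r.append(r_j)
        if j == n_p:  # STEP 6
            break
        # STEP 7, equations (11) to (13), for 0 <= i, k <= n_p - j - 1.
        idx = range(m)
        f_new = [[f[i][k] + r_j * (c[i][k] + c[k][i]) + r_j * r_j * b[i][k]
                  for k in idx] for i in idx]
        b_new = [[b[i + 1][k + 1]
                  + r_j * (c[i + 1][k + 1] + c[k + 1][i + 1])
                  + r_j * r_j * f[i + 1][k + 1]
                  for k in idx] for i in idx]
        c_new = [[c[i][k + 1] + r_j * (b[i][k + 1] + f[i][k + 1])
                  + r_j * r_j * c[k + 1][i]
                  for k in idx] for i in idx]
        f, b, c = f_new, b_new, c_new
    return r
```

The sketch builds new matrices at every stage for clarity; it does not use the symmetry or in-place update that a fixed-point implementation would rely on.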

Table 1: Windowing coefficients

w(0) = 0,998966    w(5) = 0,974915
w(1) = 0,996037    w(6) = 0,969054
w(2) = 0,991663    w(7) = 0,963060
w(3) = 0,986399    w(8) = 0,956796
w(4) = 0,980722    w(9) = 0,950127

This algorithm can be simplified by noting that the φ', F and B matrices are symmetric, such that only the upper triangular part of the matrices needs to be computed or updated. Also, step 7 is done so that F_j(i,k), B_j(i-1,k-1), C_j(i,k-1), and C_j(k,i-1) are updated together, common terms are computed once, and the recursion is done in place.

4.1.4 Spectral quantization

A three segment vector quantizer of the reflection coefficients is employed. A reduced complexity search technique is used to select the vector of reflection coefficients for each segment. The reflection coefficient vector quantizer codebooks are stored in compressed form to minimize their memory requirements.

The three segments of the vector quantizer span reflection coefficients r_1 - r_3, r_4 - r_6, and r_7 - r_10 respectively. The bit allocations for the vector quantizer segments are:

Q_1    11 bits
Q_2    9 bits
Q_3    8 bits

A reflection coefficient vector prequantizer is used at each segment. The prequantizer size at each segment is:

P_1    6 bits
P_2    5 bits
P_3    4 bits

At a given segment, the residual error due to each vector from the prequantizer is computed and stored in temporary memory. This list is searched to identify the four prequantizer vectors which have the lowest distortion. The index of each selected prequantizer vector is used to calculate an offset into the vector quantizer table at which the contiguous subset of quantizer vectors associated with that prequantizer vector begins. The size of each vector quantizer subset at the k-th segment is given by:

$$S_k = \frac{2^{Q_k}}{2^{P_k}} \tag{14}$$

The four subsets of quantizer vectors, associated with the selected prequantizer vectors, are searched for the quantizer vector which yields the lowest residual error. Thus at the first segment, 64 prequantizer vectors and 128 quantizer vectors are evaluated; 32 prequantizer vectors and 64 quantizer vectors are evaluated at the second segment; and 16 prequantizer vectors and 64 quantizer vectors are evaluated at the third segment.

4.1.4.1 Autocorrelation Fixed Point Lattice Technique (AFLAT)

An autocorrelation version of the FLAT algorithm, AFLAT, is used to compute the residual error energy for a reflection coefficient vector being evaluated. Compute the autocorrelation sequence R(i), from the optimal reflection coefficients, r_j, over the range 0 ≤ i ≤ N_p.

STEP 1    Define the initial conditions for the AFLAT recursion:

$$P_0(i) = R(i), \qquad 0 \le i \le N_p-1 \tag{15}$$

$$V_0(i) = R(|i+1|), \qquad 1-N_p \le i \le N_p-1 \tag{16}$$

STEP 2    Initialize k, the vector quantizer segment index:

$$k = 1 \tag{17}$$

STEP 3    Let I_l(k) be the index of the first lattice stage in the k-th segment, and I_h(k) be the index of the last lattice stage in the k-th segment.

STEP 4    Initialize j, the index of the lattice stage, to point to the beginning of the k-th segment:

$$j = I_l(k) \tag{18}$$

STEP 5    Set the initial conditions P_{j-1} and V_{j-1} to:

$$P_{j-1}(i) = \bar{P}_{j-1}(i), \qquad 0 \le i \le I_h(k) - I_l(k) \tag{19}$$

$$V_{j-1}(i) = \bar{V}_{j-1}(i), \qquad -I_h(k) + I_l(k) \le i \le I_h(k) - I_l(k) \tag{20}$$

STEP 6    Compute the values of the V_j and P_j arrays using:

$$P_j(i) = \left(1 + r_j^2\right) P_{j-1}(i) + r_j \left[ V_{j-1}(i) + V_{j-1}(-i) \right], \qquad 0 \le i \le I_h(k) - j - 1 \tag{21}$$

$$V_j(i) = V_{j-1}(i+1) + r_j^2\, V_{j-1}(-i-1) + 2 r_j\, P_{j-1}(|i+1|), \qquad j - N_p + 1 \le i \le N_p - j - 1 \tag{22}$$

STEP 7    Increment j: j = j + 1.

STEP 8    If j < I_h(k) go to STEP 6.

STEP 9    The residual error out of lattice stage I_h(k), given the reflection coefficient vector r, is computed using equation (21):

$$E_r = P_{I_h(k)}(0) \tag{23}$$

STEP 10    Using the AFLAT recursion outlined, the residual error due to each vector from the prequantizer at the k-th segment is evaluated, the four subsets of quantizer vectors to be searched are identified, and the residual error due to each quantizer vector from the selected four subsets is computed. The index of r, the quantizer vector which minimized E_r over all the quantizer vectors in the four subsets, is encoded with Q_k bits.

STEP 11    If k < 3 then the initial conditions for doing the recursion at segment k+1 need to be computed. Set j, the lattice stage index, equal to:

$$j = I_l(k) \tag{24}$$

STEP 12    Compute:

$$\bar{P}_j(i) = \left(1 + \tilde{r}_j^2\right) \bar{P}_{j-1}(i) + \tilde{r}_j \left[ \bar{V}_{j-1}(i) + \bar{V}_{j-1}(-i) \right], \qquad 0 \le i \le N_p - j - 1 \tag{25}$$

$$\bar{V}_j(i) = \bar{V}_{j-1}(i+1) + \tilde{r}_j^2\, \bar{V}_{j-1}(-i-1) + 2 \tilde{r}_j\, \bar{P}_{j-1}(|i+1|), \qquad j - N_p + 1 \le i \le N_p - j - 1 \tag{26}$$

STEP 13    Increment j: j = j + 1.

STEP 14    If j ≤ I_h(k) go to STEP 12.

STEP 15    Increment k, the vector quantizer segment index: k = k + 1.

STEP 16    If k ≤ 3 go to STEP 4. Otherwise, the indices of the reflection coefficient vectors for the three segments have been chosen, and the search of the reflection coefficient vector quantizer is terminated.

To minimize the storage requirements for the reflection coefficient vector quantizer, eight bit codes for the individual reflection coefficients are stored in the vector quantizer table, instead of the actual reflection coefficient values. The codes are used to look up the values of the reflection coefficients from a scalar quantization table with 256 entries.

4.1.5 Frame energy calculation and quantization

The unquantized value of R0, R(0), is computed during the computation of the short term predictor parameters.

$$R(0) = \frac{\phi(0,0) + \phi(10,10)}{320} \tag{27}$$

where φ(i,k) is defined by equation (5). R(0) is then converted into dB relative to full scale (full scale, R_max, is defined as the square of the maximum sample amplitude).

$$R_{dB} = 10 \log_{10}\left(\frac{R(0)}{R_{max}}\right) \tag{28}$$

R_dB is then quantized to 32 levels. The 32 quantized values for R_dB range from a minimum of -66 (corresponding to a code of 0 for R0) to a maximum of -4 (corresponding to a code of 31 for R0). The step size of the quantizer is 2 (2 dB steps). R0 is chosen as:

R0 which minimizes abs(R0 - (R_dB + 66)/2)    (29)

where R0 can take on the integer values from 0 to 31 corresponding to the 32 codes for R0.

Decoding of the R0 code is given by:

$$R(0) = R_{max} \cdot 10^{(2 \cdot R0 - 66)/10} \tag{30}$$

4.1.6 Soft interpolation of the spectral parameters

Interpolation of the short term filter parameters improves the performance of the GSM half rate encoder. The direct form filter coefficients (α_i's), which correspond to quantized reflection coefficients, are the spectral parameters used for interpolation.
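For the frame energy coding of subclause 4.1.5, equations (28) to (30) can be illustrated with the following sketch. The function names are hypothetical, R_max is passed in as a parameter rather than fixed, and mapping a zero or negative R(0) to code 0 is an assumption of this sketch, since the logarithm in equation (28) is undefined there.

```python
import math

R0_LEVELS = 32      # number of R0 codes
R0_MIN_DB = -66.0   # level of code 0, in dB relative to full scale
R0_STEP_DB = 2.0    # quantizer step size in dB


def encode_r0(r_zero, r_max):
    """Return the R0 code (0..31) for an unquantized energy R(0)."""
    if r_zero <= 0.0:
        return 0  # assumption: silence maps to the lowest code
    r_db = 10.0 * math.log10(r_zero / r_max)        # equation (28)
    target = (r_db - R0_MIN_DB) / R0_STEP_DB        # (R_dB + 66) / 2
    # Equation (29): the integer code closest to the target.
    return min(range(R0_LEVELS), key=lambda code: abs(code - target))


def decode_r0(code, r_max):
    """Equation (30): energy represented by an R0 code."""
    return r_max * 10.0 ** ((R0_STEP_DB * code + R0_MIN_DB) / 10.0)
```

Searching all 32 codes mirrors the wording of equation (29) and clamps out-of-range energies to codes 0 and 31 without a separate branch.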
The GSM half rate speech encoder uses either an interpolated set of $\alpha_i$'s or an uninterpolated set of $\alpha_i$'s, choosing the set which gives better prediction gain for the frame.

Two sets of LPC coefficient vectors are generated: the first corresponds to the interpolated coefficients, the second to the uninterpolated coefficients. The frame's speech samples are inverse filtered using each of the two coefficient sets, and the residual frame energy corresponding to each set is computed. The coefficient set yielding the lower frame residual energy is then selected to be used. If the residual energies are equal, the uninterpolated coefficient set is used. INT_LPC, a soft interpolation bit, is set to 1 when interpolation is selected or to 0 otherwise.

To generate the interpolated coefficient set, the coder interpolates the $\alpha_i$'s for the first, second, and third subframes of each frame. The fourth subframe uses the uninterpolated $\alpha_i$'s for that frame.

The interpolation is done as follows. Let $\alpha_{i,L}$ be the direct-form LPC coefficients corresponding to the last frame, $\alpha_{i,C}$ be the direct-form LPC coefficients corresponding to the current frame, and Del be the interpolation curve used. The interpolated direct-form LPC coefficient vector at the j-th subframe of the current frame, $\alpha_{i,j}$, is given by:

$$\alpha_{i,j} = \alpha_{i,L} + Del(j, INT\_SOFT) \left( \alpha_{i,C} - \alpha_{i,L} \right), \quad 1 \le i \le N_p, \; 1 \le j \le 4 \tag{31}$$

The values of the interpolation curve Del are given in table 2. The second argument of Del is the value of the soft interpolation bit and selects the column of table 2.

Table 2: Values of the interpolation curve Del

j    Del(j,0)    Del(j,1)
1    0,0         0,30
2    1,0         0,62
3    1,0         0,92
4    1,0         1,00

From this point on, the subframe index j is omitted for simplicity when referring to $\alpha_{i,j}$ coefficients, although it is implied. For interpolated subframes, the $\alpha_i$'s are converted to reflection coefficients to check for filter stability. If the resulting filter is unstable, then uninterpolated coefficients are used for that subframe. The uninterpolated coefficients used for subframe 1 are the previous frame's coefficients. The uninterpolated coefficients used for subframes 2, 3, and 4 are the current frame's coefficients.

4.1.7 Spectral noise weighting filter coefficients

To exploit the noise masking potential of the formants, spectral noise weighting is applied. The computation of the $\alpha_i$ coefficients, used by spectral noise weighting filters W(z) and H(z), is now described. Define an impulse sequence $\delta(n)$ over $N_s$ samples:

$$\delta(0) = 1{,}0 \qquad \delta(n) = 0{,}0 \tag{32}$$

where $1 \le n \le N_s - 1$ and $h_3(n)$ is the zero-state response of the cascade of three filters to $\delta(n)$. The three filters are an LPC synthesis filter, an inverse filter using a weighting factor of 0,93 and a synthesis filter with a weighting factor of 0,7. In equation form:

$$h_1(n) = \delta(n) + \sum_{i=1}^{N_p} \alpha_i\, h_1(n-i), \quad 0 \le n \le N_s - 1 \tag{33}$$

$$h_2(n) = h_1(n) - \sum_{i=1}^{N_p} (0{,}93)^i\, \alpha_i\, h_1(n-i), \quad 0 \le n \le N_s - 1 \tag{34}$$

$$h_3(n) = h_2(n) + \sum_{i=1}^{N_p} (0{,}7)^i\, \alpha_i\, h_3(n-i), \quad 0 \le n \le N_s - 1 \tag{35}$$

where $\alpha_i$'s are the direct form LP coefficients.
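Equations (32) to (35) amount to passing a unit impulse through three cascaded recursive filters that start from zero state. A minimal Python sketch is given below for illustration only. The function name weighting_impulse_response, the 0-based list alpha (with alpha[i-1] holding the i-th direct form coefficient) and the use of floating point in place of the codec's bit-exact arithmetic are assumptions of the sketch.

```python
def weighting_impulse_response(alpha, ns, g_inverse=0.93, g_synthesis=0.7):
    """Return h3(n) for 0 <= n < ns, following equations (32) to (35).

    alpha[i - 1] holds the direct form coefficient with index i, 1 <= i <= Np.
    """
    order = len(alpha)
    delta = [1.0] + [0.0] * (ns - 1)           # equation (32)
    h1 = [0.0] * ns
    h2 = [0.0] * ns
    h3 = [0.0] * ns
    for n in range(ns):
        # Zero state: terms with n - i < 0 are dropped.
        taps = range(1, min(order, n) + 1)
        # Equation (33): LPC synthesis filter.
        h1[n] = delta[n] + sum(alpha[i - 1] * h1[n - i] for i in taps)
        # Equation (34): inverse filter with weighting factor 0,93.
        h2[n] = h1[n] - sum(g_inverse ** i * alpha[i - 1] * h1[n - i]
                            for i in taps)
        # Equation (35): synthesis filter with weighting factor 0,7.
        h3[n] = h2[n] + sum(g_synthesis ** i * alpha[i - 1] * h3[n - i]
                            for i in taps)
    return h3
```

With all coefficients equal to zero the three filters reduce to identities, so the sketch returns the impulse itself, which is a quick sanity check on the indexing.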
The autocorrelation sequence of $h_3(n)$ is calculated using:

$$R_{h_3}(i) = \sum_{n=i}^{N_s-1} h_3(n)\, h_3(n-i), \quad 0 \le i \le N_p \tag{36}$$

From $R_{h_3}(i)$ the reflection coefficients which define the combined spectrally noise weighted synthesis filter are computed using the AFLAT recursion once per frame.

STEP 1 Define the initial conditions for the AFLAT recursion:

$$P_0(i) = R_{h_3}(i), \quad 0 \le i \le N_p - 1 \tag{37}$$

$$V_0(i) = R_{h_3}(|i+1|), \quad 1 - N_p \le i \le N_p - 1 \tag{38}$$

STEP 2 Initialize j, the index of the lattice stage, to point to the first lattice stage: j = 1

STEP 3 Compute $r_j$, the j-th reflection coefficient, using:

$$r_j = -\frac{V_{j-1}(0)}{P_{j-1}(0)} \tag{39}$$

STEP 4 Given $r_j$, update the values of the $V_j$ and $P_j$ arrays using:

$$P_j(i) = \left(1 + r_j^2\right) P_{j-1}(i) + r_j \left[ V_{j-1}(i) + V_{j-1}(-i) \right], \quad 0 \le i \le N_p - j - 1 \tag{40}$$

$$V_j(i) = V_{j-1}(i+1) + r_j^2\, V_{j-1}(-i-1) + 2 r_j\, P_{j-1}(|i+1|), \quad 1 + j - N_p \le i \le N_p - j - 1 \tag{41}$$

STEP 5 Increment j: j = j + 1

STEP 6 If $j \le N_p$ go to STEP 3, otherwise all $N_p$ reflection coefficients have been obtained.

STEP 7 The reflection coefficients, $r_j$, are then converted to direct-form LPC filter coefficients, $\alpha_i$, for implementing the combined spectrally noise weighted synthesis filter H(z) and the filter W(z).

The method for the spectral noise weighting filter coefficient update mimics how the direct form LPC filter coefficients are updated at subframes of a frame (subclause 4.1.6). No stability check of interpolated spectral noise weighting filter coefficients is done at subframes 1, 2, or 3 if the interpolation flag, INT_LPC = "1", but if uninterpolated coefficients are used at subframes 1, 2, and/or 3 due to instability of the unweighted coefficients (INT_LPC = "0"), uninterpolated weighting filter coefficients are also used at those subframes.

4.1.8 Long Term Predictor lag determination

Figure 3 illustrates that the long term lag optimization looks just like a codebook search where the codebook is defined by the long term filter state and the specific vector in the codebook is pointed to by the long term predictor lag, L. The input p(n) is the weighted input speech for the subframe minus the zero input response of just the H(z) filter.

[Figure 3: Long term predictor lag search. Block diagram showing the long term filter state addressed by the lag L, the filter H(z), the signals $b_L(n)$, p'(n), p(n) and e(n), and the total weighted error.]

The GSM half rate speech encoder uses a combination of open loop and closed loop techniques in choosing the long term predictor lag.
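Returning briefly to subclause 4.1.7, the autocorrelation of equation (36) and the recursion of STEP 1 to STEP 6 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the normative fixed-point routine: the function names autocorrelation and reflection_coefficients, the use of dictionaries to hold the negative indices of V, and the floating-point arithmetic are choices of the sketch, and no guard is included for a zero value of $P_{j-1}(0)$. The conversion of STEP 7 is not shown.

```python
def autocorrelation(h, order):
    """Equation (36): R(i) = sum over n = i..Ns-1 of h(n) h(n-i), 0 <= i <= Np."""
    ns = len(h)
    return [sum(h[n] * h[n - i] for n in range(i, ns))
            for i in range(order + 1)]


def reflection_coefficients(r):
    """STEP 1 to STEP 6: return [r_1, ..., r_Np] from R(0), ..., R(Np)."""
    order = len(r) - 1
    # STEP 1, equations (37) and (38); dict keys are the index i.
    p = {i: r[i] for i in range(order)}
    v = {i: r[abs(i + 1)] for i in range(1 - order, order)}
    refl = []
    for j in range(1, order + 1):              # STEP 2, STEP 5, STEP 6
        rj = -v[0] / p[0]                      # STEP 3, equation (39)
        refl.append(rj)
        # STEP 4, equation (40): 0 <= i <= Np - j - 1
        p_next = {i: (1.0 + rj * rj) * p[i] + rj * (v[i] + v[-i])
                  for i in range(order - j)}
        # STEP 4, equation (41): 1 + j - Np <= i <= Np - j - 1
        v_next = {i: v[i + 1] + rj * rj * v[-i - 1]
                     + 2.0 * rj * p[abs(i + 1)]
                  for i in range(1 + j - order, order - j)}
        p, v = p_next, v_next
    return refl
```

As a sanity check, an autocorrelation sequence of the form 1, 0.5, 0.25 (a first-order process) yields a first reflection coefficient of -0.5 and a second one of zero, as expected for a sequence that a single lattice stage predicts completely.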
First an open loop search is conducted to determine "candidate" lags at each subframe. Then, at most two best candidate lags at each subframe are selected, with each s
...
