SIST EN 300 969 V6.0.1:2003
Digital cellular telecommunications system (Phase 2+) (GSM); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)
Upgrade from Phase 2+ to Release 1997
Digital cellular telecommunications system (Phase 2+) – Half rate speech – Half rate speech transcoding (GSM 06.20, version 6.0.1, Release 1997)
General Information
Standards Content (Sample)
SLOVENIAN STANDARD
SIST EN 300 969 V6.0.1:2003
1 December 2003
Digital cellular telecommunications system (Phase 2+) (GSM); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)
This Slovenian standard is identical to: EN 300 969 Version 6.0.1
ICS: 33.070.50 Global System for Mobile Communication (GSM)
Language: en
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
EN 300 969 V6.0.1 (1999-06)
European Standard (Telecommunications series)
Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech transcoding (GSM 06.20 version 6.0.1 Release 1997)
GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS
Reference: DEN/SMG-110620Q6 (8z00300o.PDF)
Keywords: Digital cellular telecommunications system, Global System for Mobile communications (GSM), CODEC, GSM, speech

ETSI
Postal address: F-06921 Sophia Antipolis Cedex - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis, Valbonne - FRANCE
Tel.: +33 4 92 94 42 00
Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88
Internet: secretariat@etsi.fr
Individual copies of this ETSI deliverable can be downloaded from http://www.etsi.org

Copyright Notification
No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1999. All rights reserved.
Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
3 Definitions, symbols and abbreviations
3.1 Definitions
3.2 Symbols
3.3 Abbreviations
4 Functional description of the GSM half rate speech codec
4.1 GSM half rate speech encoder
4.1.1 High-pass filter
4.1.2 Segmentation
4.1.3 Fixed Point Lattice Technique (FLAT)
4.1.4 Spectral quantization
4.1.4.1 Autocorrelation Fixed Point Lattice Technique (AFLAT)
4.1.5 Frame energy calculation and quantization
4.1.6 Soft interpolation of the spectral parameters
4.1.7 Spectral noise weighting filter coefficients
4.1.8 Long Term Predictor lag determination
4.1.8.1 Open loop long term search initialization
4.1.8.2 Open loop lag search
4.1.8.3 Frame lag trajectory search (Mode ≠ 0)
4.1.8.4 Voicing mode selection
4.1.8.5 Closed loop lag search
4.1.9 Harmonic noise weighting
4.1.10 Code search algorithm
4.1.10.1 Decorrelation of filtered basis vectors
4.1.10.2 Fast search technique
4.1.11 Multimode gain vector quantization
4.1.11.1 Coding GS and P0
4.2 GSM half rate speech decoder
4.2.1 Excitation generation
4.2.2 Adaptive pitch prefilter
4.2.3 Synthesis Filter
4.2.4 Adaptive spectral postfilter
4.2.5 Updating decoder states
5 Homing sequences
5.1 Functional description
5.2 Definitions
5.3 Encoder homing
5.4 Decoder homing
5.5 Encoder home state
5.6 Decoder home state
Annex A (normative): Codec parameter description
A.1 Codec parameter description
A.1.1 MODE
A.1.2 R0
A.1.3 LPC1 - LPC3
A.1.4 LAG_1 - LAG_4
A.1.5 CODEx_1 - CODEx_4
A.1.6 GSP0_1 - GSP0_4
A.2 Basic coder parameters
Annex B (normative): Order of occurrence of the codec parameters over Abis
Annex C (informative): Bibliography
Annex D (informative): Change Request History
History
Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available free of charge from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://www.etsi.org/ipr).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This European Standard (Telecommunications series) has been produced by ETSI Technical Committee Special Mobile Group (SMG).

The present document specifies the speech codec to be used for the GSM half rate channel for the digital cellular telecommunications system. The present document is part of a series covering the half rate speech traffic channels as described below:

GSM 06.02: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech processing functions".
GSM 06.06: "Digital cellular telecommunications system (Phase 2+); Half rate speech; ANSI-C code for the GSM half rate speech codec".
GSM 06.07: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Test sequences for the GSM half rate speech codec".
GSM 06.20: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech transcoding".
GSM 06.21: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Substitution and muting of lost frames for half rate speech traffic channels".
GSM 06.22: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Comfort noise aspects for half rate speech traffic channels".
GSM 06.41: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Discontinuous Transmission (DTX) for half rate speech traffic channels".
GSM 06.42: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Voice Activity Detector (VAD) for half rate speech traffic channels".

The contents of the present document are subject to continuing work within SMG and may change following formal SMG approval. Should SMG modify the contents of the present document it will be re-released with an identifying change of release date and an increase in version number as follows:

Version 6.x.y

where:
6: indicates Release 1997 of GSM Phase 2+;
x: the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.;
y: the third digit is incremented when editorial only changes have been incorporated in the specification.
Proposed national transposition dates
Date of adoption of this EN: 05 June 1999
Date of latest announcement of this EN (doa): 30 September 1999
Date of latest publication of new National Standard or endorsement of this EN (dop/e): 31 March 2000
Date of withdrawal of any conflicting National Standard (dow): 31 March 2000
1 Scope

The present document specifies the speech codec to be used for the GSM half rate channel. It also specifies the test methods to be used to verify that the codec implementation complies with the present document.

The requirements are mandatory for the codec to be used either in GSM Mobile Stations (MSs) or Base Station Systems (BSSs) that utilize the half rate GSM speech traffic channel.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies.
- A non-specific reference to an ETS shall also be taken to refer to later versions published as an EN with the same number.

[1] GSM 06.02: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Half rate speech processing functions".
[2] GSM 06.06: "Digital cellular telecommunications system (Phase 2+); Half rate speech; ANSI-C code for the GSM half rate speech codec".
[3] GSM 06.07: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Test sequences for the GSM half rate speech codec".

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the following definitions apply:

adaptive codebook: The adaptive codebook is derived from the long term filter state. The lag value can be viewed as an index into the adaptive codebook.
adaptive pitch prefilter: In the GSM half rate speech decoder, this filter is applied to the excitation signal to enhance the periodicity of the reconstructed speech. Note that this is done prior to the application of the short term filter.
adaptive spectral postfilter: In the GSM half rate speech decoder, this filter is applied to the output of the short term filter to enhance the perceptual quality of the reconstructed speech.
allowable lags: The set of lag values which may be coded by the GSM half rate speech encoder and transmitted to the GSM half rate speech decoder. This set contains both integer and fractional values (see table 3).
analysis window: For each frame, the short term filter coefficients are computed using the high pass filtered speech samples within the analysis window. The analysis window is 170 samples in length, and is centered about the last 100 samples in the frame.
basis vectors: A set of M, M1, or M2 vectors of length Ns used to generate the VSELP codebook vectors. These vectors are not necessarily orthogonal.
closed loop lag search: A process of determining the near optimal lag value from the weighted input speech and the long term filter state.
closed loop lag trajectory: For a given frame, the sequence of near optimal lag values whose elements correspond to each of the four subframes as determined by the closed loop lag search.
codebook: A set of vectors used in a vector quantizer.
codeword (or code): An M, M1, or M2 bit symbol indicating the vector to be selected from a VSELP codebook.
delta (LAG) code: A four bit code indicating the change in lag value for a subframe relative to the previous subframe's coded lag. For frames in which the long term predictor is enabled (MODE 1, 2, or 3), the lag for subframe 1 is independently coded using eight bits, and delta codes are used for subframes 2, 3, and 4.
direct form coefficients: One of the formats for storing the short term filter parameters. All filters which are used to modify speech samples use direct form coefficients.
fractional lags: A set of lag values having sub-sample resolution. Note that not every fractional lag value considered in the GSM half rate speech encoder is an allowable lag value.
frame: A time interval equal to 20 ms, or 160 samples at an 8 kHz sampling rate.
harmonic noise weighting filter: This filter exploits the noise masking properties of the spectral peaks which occur at harmonics of the pitch frequency by weighting the residual error less in regions near the pitch harmonics and more in regions away from them. Note that this filter is only used when the long term filter is enabled (MODE = 1, 2 or 3).
high pass filter: This filter is used to de-emphasize the low frequency components of the input speech signal.
integer lags: A set of lag values having whole sample resolution.
interpolating filter: An FIR filter used to estimate sub-sample resolution samples, given an input sampled with integer sample resolution.
lag: The long term filter delay. This is typically the pitch period, or a multiple or sub-multiple of it.
long term filter: This filter is used to generate the periodic component in the excitation for the current subframe. This filter is only enabled for MODE = 1, 2 or 3.
LPC coefficients: Linear Predictive Coding (LPC) coefficients is a generic descriptive term for the short term filter coefficients.
open loop lag search: A process of estimating the near optimal lag directly from the weighted speech input. This is done to narrow the range of lag values over which the closed loop lag search shall be performed.
open loop lag trajectory: For a given frame, the sequence of near optimal lag values whose elements correspond to the four subframes as determined by the open loop lag search.
reflection coefficients: An alternative representation of the information contained in the short term filter parameters.
residual: The output signal resulting from an inverse filtering operation.
short term filter: This filter introduces, into the excitation signal, short term correlation which models the impulse response of the vocal tract.
soft interpolation: A process wherein a decision is made for each frame to use either interpolated or uninterpolated short term filter parameters for the four subframes in that frame.
soft interpolation bit: A one bit code indicating whether or not interpolation of the short term parameters is to be used in the current frame.
spectral noise weighting filter: This filter exploits the noise masking properties of the formants (vocal tract resonances) by weighting the residual error less in regions near the formant frequencies and more in regions away from them.
subframe: A time interval equal to 5 ms, or 40 samples at an 8 kHz sampling rate.
vector quantization: A method of grouping several parameters into a vector and quantizing them simultaneously.
GSP0 vector quantizer: The process of vector quantization, its intermediate parameters (GS and P0) for the coding of the excitation gains β and γ.
VSELP codebook: Vector-Sum Excited Linear Predictive (VSELP) codebook, used in the GSM half rate speech coder, wherein each codebook vector is constructed as a linear combination of the fixed basis vectors.
zero input response: The output of a filter due to all past inputs, i.e. due to the present state of the filter, given that an input of zeros is applied.
zero state response: The output of a filter due to the present input, given that no past inputs have been applied, i.e. given the state information in the filter is all zeroes.

3.2 Symbols

For the purposes of the present document, the following symbols apply:

A(z): Short term spectral filter.
α_i: The LPC coefficients.
b_L(n): The output of the long term filter state (adaptive codebook) for lag L.
β: The long term filter coefficient.
C(z): Second weighting filter.
e(n): Weighted error signal.
f_j(i): The coefficients of the jth phase of the 10th order interpolating filter used to evaluate candidate fractional lag values; i ranges from 0 to P_f-1.
g_j(i): The coefficients of the jth phase of the 6th order interpolating filter used to interpolate C's and G's as well as fractional lags in the harmonic noise weighting; i ranges from 0 to P_g-1.
γ: The gain applied to the vector(s) selected from the VSELP codebook(s).
H: An M2 bit code indicating the vector to be selected from the second VSELP codebook (when operating in mode 0).
I: An M or M1 bit code indicating the vector to be selected from one of the two first VSELP codebooks.
L: The long term filter lag value.
L_max: 142 (samples), the maximum possible value for the long term filter lag.
L_min: 21 (samples), the minimum possible value for the long term filter lag.
M: 9, the number of basis vectors, and the number of bits in a codeword, for the VSELP codebook used in modes 1, 2, and 3.
M1: 7, the number of basis vectors, and the number of bits in a codeword, for the first VSELP codebook used in mode 0.
M2: 7, the number of basis vectors, and the number of bits in a codeword, for the second VSELP codebook used in mode 0.
MODE: A two bit code indicating the mode for the current frame (see annex A).
N_A: 170, the length of the analysis window. This is the number of high pass filtered speech samples used to compute the short term filter parameters for each frame.
N_F: 160, the number of samples per frame (at a sampling rate of 8 kHz).
N_p: 10, the short term filter order.
N_s: 40, the number of samples per subframe (at a sampling rate of 8 kHz).
P1: 6, the number of bits in the prequantizer for the r1 - r3 vector quantizer.
P2: 5, the number of bits in the prequantizer for the r4 - r6 vector quantizer.
P3: 4, the number of bits in the prequantizer for the r7 - r10 vector quantizer.
P_f: The order of one phase of an interpolating filter used to evaluate candidate fractional lag values. P_f equals 10 for j ≠ 0 and equals 1 for j = 0.
P_g: The order of one phase of an interpolating filter, f_j(n), used to interpolate C's and G's as well as fractional lags in the harmonic noise weighting; P_g equals 6.
pitch: The time duration between the glottal pulses which result when the vocal cords vibrate during speech production.
Q1: 11, the number of bits in the r1 - r3 reflection coefficient vector quantizer.
Q2: 9, the number of bits in the r4 - r6 reflection coefficient vector quantizer.
Q3: 8, the number of bits in the r7 - r10 reflection coefficient vector quantizer.
R0: A five bit code used to indicate the energy level in the current frame.
r(n): The long term filter state (the history of the excitation signal); n < 0.
r_L(n): The long term filter state with the adaptive codebook output for lag L appended.
s'(n): Synthesized speech.
W(z): Spectral weighting filter.
λ_hnw: The harmonic noise weighting filter coefficient.
ξ: The adaptive pitch prefilter coefficient.
⌈x⌉: Ceiling function: the largest integer y where y < x + 1,0.
⌊x⌋: Floor function: the largest integer y where y ≤ x.
Σ_{i=j}^{K} x(i): Summation: x(j) + x(j+1) + ... + x(K).
Π_{i=j}^{K} x(i): Product: x(j)(x(j+1))...(x(K)).
max(x,y): Find the larger of two numbers x and y.
min(x,y): Find the smaller of two numbers x and y.
round(x): Round the non-integer x to the closest integer y: y = ⌊x + 0,5⌋.

3.3 Abbreviations

For the purposes of the present document, the following abbreviations apply:

AFLAT: Autocorrelation Fixed point LAttice Technique
CELP: Code Excited Linear Prediction
FLAT: Fixed Point Lattice Technique
LTP: Long Term Predictor
SST: Spectral Smoothing Technique
VSELP: Vector-Sum Excited Linear Prediction
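The constants and rounding conventions of clause 3 can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the standard: all identifier names are hypothetical, and the per-frame bit tally uses only the widths quoted in the definitions and symbol list (MODE, R0, Q1 to Q3, the soft interpolation bit, the lag and delta codes, and the codeword sizes M, M1, M2). The width of the GSP0 gain codes is not stated in this clause, so it is left out of the tally.

```python
import math

# Values quoted from the definitions and symbol list (clauses 3.1 and 3.2).
SAMPLE_RATE_HZ = 8000
N_F = 160            # samples per frame
N_S = 40             # samples per subframe
N_A = 170            # analysis window length
N_P = 10             # short term filter order
L_MIN, L_MAX = 21, 142
M, M1, M2 = 9, 7, 7  # codeword sizes in bits
Q1, Q2, Q3 = 11, 9, 8


def spec_floor(x):
    """Largest integer y with y <= x."""
    return math.floor(x)


def spec_ceil(x):
    """Largest integer y with y < x + 1, which is the usual ceiling."""
    return math.ceil(x)


def spec_round(x):
    """round(x) as defined in the symbol list: floor(x + 0.5)."""
    return math.floor(x + 0.5)


SUBFRAMES = N_F // N_S
FRAME_MS = 1000 * N_F / SAMPLE_RATE_HZ

# Bits shared by every mode: MODE, R0, three LPC codes, soft interpolation bit.
common_bits = 2 + 5 + (Q1 + Q2 + Q3) + 1

# Long term predictor enabled (MODE 1, 2 or 3): one 8 bit lag, three 4 bit
# delta codes, and one M bit codeword per subframe.
ltp_on_bits = common_bits + 8 + 3 * 4 + SUBFRAMES * M

# MODE 0: two codewords (M1 and M2 bits) per subframe, no lag codes.
ltp_off_bits = common_bits + SUBFRAMES * (M1 + M2)

print(SUBFRAMES, FRAME_MS, ltp_on_bits, ltp_off_bits)
```

Both tallies come to 92 bits. Clause 4 states that a frame is quantized into 112 bits, so the remaining 20 bits would correspond to the four GSP0 codes; that attribution is an inference from the arithmetic here, and annex A remains the authoritative description.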
4 Functional description of the GSM half rate speech codec

The GSM half rate codec uses the VSELP (Vector-Sum Excited Linear Prediction) algorithm. The VSELP algorithm is an analysis-by-synthesis coding technique and belongs to the class of speech coding algorithms known as CELP (Code Excited Linear Prediction).

The GSM half rate encoder processes one 20 ms speech frame at a time. A frame of the sampled speech waveform is read and, based on the current waveform and its past history, the encoder derives 18 parameters that describe it. The parameters extracted are grouped into the following three general classes:
- energy parameters (R0 and GSP0);
- spectral parameters (LPC and INT_LPC);
- excitation parameters (LAG and CODE).

These parameters are quantized into 112 bits for transmission as described in annex A, and their order of occurrence over Abis is given in annex B.

The GSM half rate codec is an analysis-by-synthesis codec, therefore the speech decoder is primarily a subset of the speech encoder. The quantized parameters are decoded and a synthetic excitation is generated using the energy and excitation parameters. The synthetic excitation is then filtered to provide the spectral information, resulting in the generation of the synthesized speech (see figure 1).

[Figure 1: Block diagram of the GSM half rate speech codec. Signal path: speech, speech encoder, transmitted speech parameters, received speech parameters, speech decoder, synthesised speech.]

The ANSI-C code that describes the GSM half rate speech codec is given in GSM 06.06 [2] and the test sequences in GSM 06.07 [3] (see clause 5 for the codec homing test sequences).

4.1 GSM half rate speech encoder

The GSM half rate speech encoder uses an analysis-by-synthesis approach to determine the code to use to represent the excitation for each subframe. The codebook search procedure consists of trying each codevector as a possible excitation for the Code Excited Linear Predictive (CELP) synthesizer. The synthesized speech s'(n) is compared against the input speech and a difference signal is generated. This difference signal is then filtered by a spectral weighting filter, W(z), (and possibly a second weighting filter, C(z)) to generate a weighted error signal, e(n). The power in e(n) is computed. The codevector which generates the minimum weighted error power is chosen as the codevector for that subframe. The spectral weighting filter serves to weight the error spectrum based on perceptual considerations. This weighting filter is a function of the speech spectrum and can be expressed in terms of the α parameters of the short term (spectral) filter.
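The search loop described in the paragraph above can be sketched as a brute-force minimization. This is a minimal illustration in Python under stated assumptions, not the procedure of the standard: `synthesize` is a hypothetical stand-in for the whole synthesis and weighting chain, the function names are invented for this example, and the actual codec uses a fast VSELP search (subclause 4.1.10) instead of trying every codevector explicitly.

```python
def error_power(target, candidate):
    """Power of the difference signal between target and candidate."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def search_codebook(target, codebook, synthesize):
    """Return (index, power) of the codevector with minimum error power.

    `synthesize` maps a codevector to the (weighted) synthesized signal;
    it is a placeholder for the synthesizer and weighting filters.
    """
    best_index, best_power = None, float("inf")
    for index, codevector in enumerate(codebook):
        power = error_power(target, synthesize(codevector))
        if power < best_power:
            best_index, best_power = index, power
    return best_index, best_power


# Toy data: three candidate excitations and a fixed gain of 0.9 standing in
# for the synthesizer. None of these values come from the standard.
toy_target = [1.0, 0.5, -0.5, -1.0]
toy_codebook = [
    [1.0, 1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
]
toy_index, toy_power = search_codebook(
    toy_target, toy_codebook, lambda v: [0.9 * x for x in v]
)
print(toy_index, toy_power)
```

In this toy case the first codevector is selected, since its scaled copy is closest to the target in the squared-error sense.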
The weighting filter W(z) is given by:

$$W(z) = \frac{1 - \sum_{i=1}^{N_p} \alpha_i z^{-i}}{1 - \sum_{i=1}^{N_p} \tilde{\alpha}_i z^{-i}} \tag{1}$$

The computation of the $\tilde{\alpha}_i$ coefficients is described in subclause 4.1.7.

The second weighting filter C(z), if used, is a harmonic weighting filter and is used to control the amount of error in the harmonics of the speech signal. If the weighting filter(s) are moved to both input paths to the subtracter, an equivalent configuration is obtained as shown in figure 2.

[Figure 2: Block diagram of the GSM half rate speech encoder (MODE = 1, 2 and 3). Blocks and signals: input speech s(n); determine LPC coefficients; W(z) and C(z) on the input path giving y(n); adaptive codebook output b_L scaled by β and VSELP codebook scaled by γ, filtered by B(z), H(z) and C(z) giving y'(n); error e(n) and total weighted error; find minimum over L and all I; find optimal gains β and γ. Output parameters: LPC 1, LPC 2, LPC 3, R0, INT_LPC, MODE, LAG_1 to LAG_4, CODE_1 to CODE_4, GSP0_1 to GSP0_4.]

Here H(z) is the combination of A(z), the short term (spectral) filter, and W(z), the spectral weighting filter. These filters are combined since the denominator of A(z) is cancelled by the numerator of W(z).
$$H(z) = \frac{1}{1 - \sum_{i=1}^{N_p} \tilde{\alpha}_i z^{-i}} \tag{2}$$

There are two approaches that can be used for calculating the gain, γ. The gain can be determined prior to codebook search based on residual energy. This gain would then be fixed for the codebook search. Another approach is to optimize the gain for each codevector during the codebook search. The codevector which yields the minimum weighted error would be chosen and its corresponding optimal gain would be used for γ. The latter approach generally yields better results since the gain is optimized for each codevector. This approach also implies that the gain term needs to be updated at the subframe rate. The optimal code and gain for this technique can be computed as follows:

The input speech is first filtered by a high pass filter as described in subclause 4.1.1. The short term filter parameters are computed from the filtered input speech once per frame. A fast fixed point covariance lattice technique is used. Subclauses 4.1.3 and 4.1
...
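The second gain approach described in subclause 4.1, optimizing the gain for each codevector during the search, can be illustrated with a generic least-squares sketch in Python. It is an assumption-laden example, not the fixed point procedure of the standard: the function names are hypothetical, the all-pole filter implements the form of equation (2) in floating point with zero initial state (the zero state response of clause 3.1), and the second weighting filter C(z) and the long term predictor contribution are omitted.

```python
def all_pole_zero_state(excitation, a_tilde):
    """Zero state response of H(z) = 1 / (1 - sum_i a_tilde[i-1] * z^-i).

    Implements y(n) = x(n) + sum_i a_tilde[i-1] * y(n - i) with all past
    outputs taken as zero.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, coeff in enumerate(a_tilde, start=1):
            if n - i >= 0:
                acc += coeff * out[n - i]
        out.append(acc)
    return out


def best_code_and_gain(target, codebook, a_tilde):
    """Return (index, gain, error) minimizing |target - gain * y|^2.

    For each filtered codevector y, the least-squares gain is
    <target, y> / <y, y>, and the resulting error power is
    <target, target> - <target, y>^2 / <y, y>.
    """
    target_energy = sum(t * t for t in target)
    best = None
    for index, codevector in enumerate(codebook):
        y = all_pole_zero_state(codevector, a_tilde)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero response cannot reduce the error
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Toy example: a first order filter and two unit-impulse codevectors.
toy_a_tilde = [0.5]
toy_codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
toy_target = [2.0 * v for v in all_pole_zero_state(toy_codebook[0], toy_a_tilde)]
toy_result = best_code_and_gain(toy_target, toy_codebook, toy_a_tilde)
print(toy_result)
```

Because the toy target is exactly twice the filtered first codevector, the search selects index 0 with a gain of 2 and zero error. Computing the gain in closed form per candidate is what allows the gain to follow the selected codevector at the subframe rate, at the cost of one correlation and one energy term per codevector.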