Information technology — Generic coding of moving pictures and associated audio information — Part 7: Advanced Audio Coding (AAC) — Technical Corrigendum 1

Technologies de l'information — Codage générique des images animées et du son associé — Partie 7: Codage du son avancé (AAC) — Rectificatif technique 1

General Information

Status
Withdrawn
Publication Date
25-Nov-1998
Withdrawal Date
25-Nov-1998
Current Stage
9599 - Withdrawal of International Standard
Completion Date
28-Jul-2003
Standard
ISO/IEC 13818-7:1997/Cor 1:1998
English language
22 pages

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 13818-7:1997
TECHNICAL CORRIGENDUM 1
Published 1998-12-01
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION   ORGANISATION INTERNATIONALE DE NORMALISATION
• МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ •
INTERNATIONAL ELECTROTECHNICAL COMMISSION     COMMISSION ÉLECTROTECHNIQUE INTERNATIONALE
• МЕЖДУНАРОДНАЯ ЭЛЕКТРОТЕХНИЧЕСКАЯ КОМИССИЯ •
Information technology — Generic coding of moving
pictures and associated audio information —
Part 7:
Advanced Audio Coding (AAC)
TECHNICAL CORRIGENDUM 1
Technologies de l’information — Codage générique des images animées et du son associé —
Partie 7: Codage du son avancé (AAC)
RECTIFICATIF TECHNIQUE 1
Technical Corrigendum 1 to International Standard ISO/IEC 13818-7:1997 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.

1) Add the following paragraph at the end of clause 5:

The number of bits for each data element is written in the second column. "X..Y" indicates that the number of bits is one of
the values between X and Y including X and Y. "{X;Y}" means the number of bits is X or Y, depending on the value of other
data elements in the bitstream.

ICS 35.040 Ref. No. ISO/IEC 13818-7:1997/Cor.1:1998(E)
©  ISO/IEC 1998
Printed in Switzerland

2) Replace Table 6.13 in subclause 6.3 with the following:

Syntax                                                  No. of bits   Mnemonic
section_data()
{
    if( window_sequence == EIGHT_SHORT_SEQUENCE )
        sect_esc_val = (1<<3) - 1
    else
        sect_esc_val = (1<<5) - 1
    for( g=0; g < num_window_groups; g++ ) {
        k=0
        i=0
        while (k<max_sfb) {
            sect_cb[g][i]                               4             uimsbf
            sect_len=0
            while (sect_len_incr == sect_esc_val)       {3;5}         uimsbf
                sect_len += sect_esc_val
            sect_len += sect_len_incr
            sect_start[g][i] = k
            sect_end[g][i] = k+sect_len
            for (sfb=k; sfb<k+sect_len; sfb++)
                sfb_cb[g][sfb] = sect_cb[g][i];
            k += sect_len
            i++
        }
        num_sec[g] = i
    }
}
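
The section_data() syntax above can be sketched as a small parser. This is an illustrative sketch only, not part of the corrigendum: the function name, the representation of the bitstream as an iterable of 0/1 integers, and the returned tuple layout are assumptions made for the example. It also assumes the loop bounds max_sfb and k+sect_len, and rejects a section that runs past max_sfb.

```python
def parse_section_data(bits, num_window_groups, max_sfb, eight_short):
    """Parse section_data() from an iterable of 0/1 bits (assumed input format)."""
    it = iter(bits)

    def read(n):
        # Read n bits as an unsigned integer, most significant bit first (uimsbf).
        value = 0
        for _ in range(n):
            value = (value << 1) | next(it)
        return value

    len_bits = 3 if eight_short else 5      # the {3;5} field width
    sect_esc_val = (1 << len_bits) - 1
    sections = []                            # per group: list of (sect_cb, start, end)
    sfb_cb = []                              # per group: codebook for each sfb
    for _g in range(num_window_groups):
        k = 0
        group_sections = []
        group_cb = [0] * max_sfb
        while k < max_sfb:
            sect_cb = read(4)
            sect_len = 0
            sect_len_incr = read(len_bits)
            while sect_len_incr == sect_esc_val:
                sect_len += sect_esc_val
                sect_len_incr = read(len_bits)
            sect_len += sect_len_incr
            if k + sect_len > max_sfb:
                raise ValueError("section extends past max_sfb")
            group_sections.append((sect_cb, k, k + sect_len))
            for sfb in range(k, k + sect_len):
                group_cb[sfb] = sect_cb
            k += sect_len
        sections.append(group_sections)
        sfb_cb.append(group_cb)
    return sections, sfb_cb
```

For a long window with max_sfb = 40, a single section of length 40 is coded as one escape value (31) followed by an increment of 9.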

3) In Table 6.15 in subclause 6.3, replace:
No. of bits “4/6” with “{4;6}”
and
No. of bits “3/5” with “{3;5}”.
4) Replace Table 6.16 in subclause 6.3 with the following:

Syntax                                                          No. of bits   Mnemonic
spectral_data()
{
    for( g=0; g<num_window_groups; g++ ) {
        for (i=0; i<num_sec[g]; i++) {
            if (sect_cb[g][i] != ZERO_HCB &&
                sect_cb[g][i] <= ESC_HCB) {
                for (k=sect_sfb_offset[g][sect_start[g][i]];
                     k< sect_sfb_offset[g][sect_end[g][i]]; ) {
                    if (sect_cb[g][i]<FIRST_PAIR_HCB) {
                        hcod[sect_cb[g][i]][w][x][y][z]         1..16         bslbf
                        if( unsigned_cb[sect_cb[g][i]] )
                            quad_sign_bits                      0..4          bslbf
                        k += QUAD_LEN
                    }
                    else {
                        hcod[sect_cb[g][i]][y][z]               1..15         bslbf
                        if( unsigned_cb[sect_cb[g][i]] )
                            pair_sign_bits                      0..2          bslbf
                        k += PAIR_LEN
                        if (sect_cb[g][i]==ESC_HCB) {
                            if (y==ESC_FLAG)
                                hcod_esc_y                      5..21         bslbf
                            if (z==ESC_FLAG)
                                hcod_esc_z                      5..21         bslbf
                        }
                    }
                }
            }
        }
    }
}

5) Replace Table 6.18 in subclause 6.3 with the following:

Syntax                                                          No. of bits   Mnemonic
coupling_channel_element()
{
    element_instance_tag                                        4             uimsbf
    ind_sw_cce_flag                                             1             uimsbf
    num_coupled_elements                                        3             uimsbf
    num_gain_element_lists = 0
    for (c=0; c<num_coupled_elements+1; c++) {
        num_gain_element_lists++
        cc_target_is_cpe[c]                                     1             uimsbf
        cc_target_tag_select[c]                                 4             uimsbf
        if ( cc_target_is_cpe[c] ) {
            cc_l[c]                                             1             uimsbf
            cc_r[c]                                             1             uimsbf
            if (cc_l[c] && cc_r[c] )
                num_gain_element_lists++
        }
    }
    cc_domain                                                   1             uimsbf
    gain_element_sign                                           1             uimsbf
    gain_element_scale                                          2             uimsbf
    individual_channel_stream(0)
    for ( c=1; c<num_gain_element_lists; c++ ) {
        if ( ind_sw_cce_flag ) {
            cge = 1
        } else {
            common_gain_element_present[c]                      1             uimsbf
            cge = common_gain_element_present[c]
        }
        if ( cge )
            hcod_sf[common_gain_element[c]]                     1..19         bslbf
        else {
            for (g=0; g<num_window_groups; g++) {
                for (sfb=0; sfb<max_sfb; sfb++) {
                    if ( sfb_cb[g][sfb] != ZERO_HCB )
                        hcod_sf[dpcm_gain_element[c][g][sfb]]   1..19         bslbf
                }
            }
        }
    }
}

6) Replace Table 6.22 in subclause 6.3 with the following:

Syntax                                      No. of bits   Mnemonic
fill_element()
{
    cnt = count                             4             uimsbf
    if (cnt == 15)
        cnt += esc_count - 1;               8             uimsbf
    while (cnt > 0) {
        cnt -= extension_payload(cnt)
    }
}

and add the following tables, Table 6.24, Table 6.25 and Table 6.26, at the end of subclause 6.3:

Table 6.24 Syntax of extension_payload()

extension_payload(cnt)
{
    extension_type                                      4             uimsbf
    switch( extension_type ) {
    case EXT_DYNAMIC_RANGE:
        n = dynamic_range_info();
        return n;
    case EXT_FILL_DATA:
        fill_nibble      /* must be ‘0000’ */           4             uimsbf
        for (i=0; i<cnt-1; i++)
            fill_byte[i]  /* must be ‘10100101’ */      8             uimsbf
        return cnt
    case default:
        for (i=0; i<8*(cnt-1)+4; i++)
            other_bits[i]                               1             uimsbf
        return cnt
    }
}
Table 6.25 Syntax of dynamic_range_info()

Syntax                                                  No. of bits   Mnemonic
dynamic_range_info()
{
    n = 1
    drc_num_bands = 1
    pce_tag_present                                     1             uimsbf
    if (pce_tag_present == 1) {
        pce_instance_tag                                4             uimsbf
        drc_tag_reserved_bits                           4
        n++
    }
    excluded_chns_present                               1             uimsbf
    if (excluded_chns_present == 1) {
        n += excluded_channels()
    }
    drc_bands_present                                   1             uimsbf
    if (drc_bands_present == 1) {
        drc_band_incr                                   4             uimsbf
        drc_bands_reserved_bits                         4             uimsbf
        n++
        drc_num_bands = drc_num_bands + drc_band_incr
        for (i=0; i<drc_num_bands; i++) {
            drc_band_top[i]                             8             uimsbf
            n++
        }
    }
    prog_ref_level_present                              1             uimsbf
    if (prog_ref_level_present == 1) {
        prog_ref_level                                  7             uimsbf
        prog_ref_level_reserved_bits                    1             uimsbf
        n++
    }
    for (i=0; i<drc_num_bands; i++) {
        dyn_rng_sgn[i]                                  1             uimsbf
        dyn_rng_ctl[i]                                  7             uimsbf
        n++
    }
    return n
}
Table 6.26 Syntax of excluded_channels()

Syntax                                                  No. of bits   Mnemonic
excluded_channels( )
{
    n = 0
    num_excl_chan = 7
    for (i=0; i<7; i++)
        exclude_mask[ i ]                               1             uimsbf
    n++
    while (additional_excluded_chns[n-1] == 1) {        1             uimsbf
        for (i= num_excl_chan; i< num_excl_chan+7; i++)
            exclude_mask[ i ]                           1             uimsbf
        n++
        num_excl_chan += 7
    }
    return n
}
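
The dynamic_range_info() and excluded_channels() syntax can be sketched together as one parser that also tracks the byte count n. This is an illustrative sketch only: the function name, the bit-iterable input and the returned dictionary are assumptions made for the example, and the two band loops are assumed to run over drc_num_bands entries.

```python
def parse_dynamic_range_info(bits):
    """Parse dynamic_range_info() from an iterable of 0/1 bits (assumed input format).

    The 4-bit extension_type is assumed to have been consumed already; it is
    part of the first byte counted by the initial n = 1.
    """
    it = iter(bits)

    def read(n_bits):
        # Read n_bits as an unsigned integer, most significant bit first.
        value = 0
        for _ in range(n_bits):
            value = (value << 1) | next(it)
        return value

    info = {"n": 1, "drc_num_bands": 1}
    if read(1):                                   # pce_tag_present
        info["pce_instance_tag"] = read(4)
        read(4)                                   # drc_tag_reserved_bits
        info["n"] += 1
    if read(1):                                   # excluded_chns_present
        mask = [read(1) for _ in range(7)]        # first 7 exclude_mask bits
        info["n"] += 1
        while read(1):                            # additional_excluded_chns
            mask += [read(1) for _ in range(7)]
            info["n"] += 1
        info["exclude_mask"] = mask
    if read(1):                                   # drc_bands_present
        drc_band_incr = read(4)
        read(4)                                   # drc_bands_reserved_bits
        info["n"] += 1
        info["drc_num_bands"] += drc_band_incr
        info["drc_band_top"] = [read(8) for _ in range(info["drc_num_bands"])]
        info["n"] += info["drc_num_bands"]
    if read(1):                                   # prog_ref_level_present
        info["prog_ref_level"] = read(7)
        read(1)                                   # prog_ref_level_reserved_bits
        info["n"] += 1
    info["dyn_rng_sgn"], info["dyn_rng_ctl"] = [], []
    for _ in range(info["drc_num_bands"]):
        info["dyn_rng_sgn"].append(read(1))
        info["dyn_rng_ctl"].append(read(7))
        info["n"] += 1
    return info
```

The returned n is the number of bytes consumed including the extension_type nibble, which is the value that extension_payload() hands back to fill_element().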

7) In subclause 7.1, replace “ATDS0” with “ADTS”.
8) Replace definition of num_coupled_channels in subclause 7.3.2 with the following:

num_coupled_elements number of coupled target elements

9) Replace definition of home in subclause 8.1.1 with the following:

see ISO/IEC 11172-3, subclause 2.4.2.3 (Table 6.2) definition for original_copy

10) Replace definition of original_copy in subclause 8.1.1 with the following:

see ISO/IEC 11172-3, subclause 2.4.2.3 (table 6.2) definition for copyright

11) Replace the last sentence in the second paragraph of subclause 8.1.2 with the following:

However, one non-normative transport stream, called Audio_Data_Transport_Stream (ADTS), is described. It may be used
for applications in which the decoder can parse this stream.

12) Replace the first two paragraphs of subclause 8.2.3 with the following:

Assuming that the start of a raw_data_block is known, it can be decoded without any additional «transport-level» information
and produces 1024 audio samples per output channel. The sampling rate of the audio signal, as specified by the
sampling_frequency_index, may be specified in a program_config_element or it may be implied in the specific application
domain. In the latter case, the sampling_frequency_index must be deduced in order for the bitstream to be parsed. Since a
given sampling_frequency_index is associated with only one sampling frequency, and since maximum flexibility is desired
in the range of possible sampling frequencies, the following table shall be used to associate an implied sampling frequency
with the desired sampling_frequency_index. It is used as follows: identify the frequency in the table that is the highest
frequency that is less than or equal to the implied frequency, and use the index that is in that same row.
Frequency sampling_frequency_index
92017 0x0
75132 0x1
55426 0x2
46009 0x3
37566 0x4
27713 0x5
23004 0x6
18783 0x7
13856 0x8
11502 0x9
9391 0xa
0 0xb
Assuming that the start of the first raw_data_block in a raw_data_stream is known, the sequence can be decoded without any
additional “transport-level” information and produces 1024 audio samples per raw_data_block per output channel.
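
The table lookup described above (highest table frequency that is less than or equal to the implied frequency) can be sketched as follows. This is an illustrative sketch only; the function and constant names are assumptions made for the example, and the values are copied from the table.

```python
# (frequency threshold in Hz, sampling_frequency_index), in descending order
SAMPLING_FREQUENCY_TABLE = [
    (92017, 0x0), (75132, 0x1), (55426, 0x2), (46009, 0x3),
    (37566, 0x4), (27713, 0x5), (23004, 0x6), (18783, 0x7),
    (13856, 0x8), (11502, 0x9), (9391, 0xA), (0, 0xB),
]


def implied_sampling_frequency_index(implied_frequency_hz):
    """Return the index of the highest table frequency <= the implied frequency."""
    if implied_frequency_hz < 0:
        raise ValueError("sampling frequency must be non-negative")
    # The table is sorted in descending order, so the first match is the highest.
    for threshold, index in SAMPLING_FREQUENCY_TABLE:
        if threshold <= implied_frequency_hz:
            return index
    raise AssertionError("unreachable: the last threshold is 0")
```

With this rule an implied frequency of 44100 Hz selects index 0x4 and 48000 Hz selects index 0x3.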

13) Replace the second item of the second level bulleted list in subclause 8.3.5 with the following:

If there is only one group with length eight (num_window_group = 1, window_group_length[0]=8), the result is that spectral
data of all eight SHORT_WINDOWs is interleaved by scalefactor window bands.

14) Replace definition of num_valid_cce_elements in subclause 8.5 with the following:

num_valid_cc_elements The number of CCE's that can add to the audio data for this program (Table 6.21)

15) Replace definition of valid_cce_element_tag_select in subclause 8.5 with the following:

valid_cc_element_tag_select instance_tag of the CCE addressed (Table 6.21)

16) Replace subclause 8.7 with the following:

8.7 Fill element (FIL) including Dynamic Range Control (DRC)
Bitstream elements:
count Initial value for length of fill data (Table 6.22)
esc_count Incremental value of length of fill data (Table 6.22)
extension_type Four bit field indicating the type of fill element content (Table 6.22)
fill_nibble Four bit field for fill (Table 6.24)
fill_byte Byte to be discarded by the decoder (Table 6.24)
other_bits Bits to be discarded by the decoder (Table 6.24)
pce_tag_present One bit indicating that program element tag is present (table 6.25).
pce_instance_tag Tag field that indicates with which program the dynamic range information is associated
(table 6.25)
drc_tag_reserved_bits Reserved (table 6.25)
excluded_chns_present One bit indicating that excluded channels are present (table 6.25)
drc_bands_present One bit indicating that DRC multi-band information is present (table 6.25)
drc_band_incr Number of DRC bands greater than 1 having DRC information (table 6.25)
drc_bands_reserved_bits Reserved (table 6.25)
drc_band_top[i] Indicates top of i-th DRC band in units of 4 spectral lines (table 6.25).
If drc_band_top[i]=k, then the index (w.r.t zero) of the highest spectral coefficient that is in
the i-th DRC band is = k*4+3. In case of an EIGHT_SHORT_SEQUENCE
window_sequence the index is interpreted as pointing into the concatenated array of 8*128
(de-interleaved) frequency points corresponding to the 8 short transforms.
prog_ref_level_present One bit indicating that reference level is present (table 6.25).
prog_ref_level Reference level. A measure of long-term program audio level for all channels combined
(table 6.25).
prog_ref_level_reserved_bits Reserved (table 6.25)
dyn_rng_sgn[i] Dynamic range control sign information. One bit indicating the sign of dyn_rng_ctl (0 if
positive, 1 if negative, table 6.25)
dyn_rng_ctl[i] Dynamic range control magnitude information (table 6.25)
exclude_mask[ i ] Boolean array indicating the audio channels of a program that are excluded from DRC
processing using this DRC information.
additional_excluded_chns[ i ] One bit indicating that additional excluded channels are present (table 6.26)
Fill elements have to be added to the bitstream if the total bits for all audio data together with all additional data is lower than
the minimum allowed number of bits in this frame necessary to reach the target bitrate. Dynamic Range Control (DRC) bits
must be added to the fill element whenever the encoder wishes to include DRC information. Under normal conditions fill bits
are avoided and free bits are used to fill up the bit reservoir. Fill bits are written only if the bit reservoir is full. Any number of
fill elements is allowed.
Decoding process:
The syntactic element count gives the initial value of the length of the fill data. In the same way as for the data element this
value is incremented with the value of esc_count if count equals 15. The resulting number gives the number of bytes to be
read.
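
The length computation can be sketched as below. This is an illustrative sketch only; the function name is an assumption made for the example, and it follows the arithmetic of Table 6.22 (cnt += esc_count - 1), which differs by one from a plain addition of esc_count.

```python
def fill_element_length(count, esc_count=None):
    """Return the number of fill bytes signalled by count and esc_count.

    count is the 4-bit field; esc_count is the 8-bit field, which is only
    present in the bitstream when count equals 15.
    """
    if not 0 <= count <= 15:
        raise ValueError("count is a 4-bit field")
    cnt = count
    if cnt == 15:
        if esc_count is None or not 0 <= esc_count <= 255:
            raise ValueError("esc_count (8 bits) is required when count == 15")
        cnt += esc_count - 1      # as written in Table 6.22
    return cnt
```

Under this reading the largest signalled length is 15 + 255 - 1 = 269 bytes.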
DRC Decoding process:
Fill elements containing an extension_payload with an extension_type of EXT_DYNAMIC_RANGE (see below) are reserved
for dynamic range information. In this case the fill_element count field must be set equal to the total length, in bytes, of all
dynamic range information plus the extension_type field.
prog_ref_level_present indicates that prog_ref_level is being transmitted. This permits prog_ref_level to be sent as
infrequently as desired (e.g. once), although periodic transmission would permit break-in.
prog_ref_level is quantized in 0.25 dB steps using 7 bits, and therefore has a range of approximately 32 dB. It indicates
program level relative to full scale (i.e. dB below full scale), and is reconstructed as:
$$ level = 32767 \cdot 2^{-prog\_ref\_level/24} $$
where “full scale level” is 32767 (prog_ref_level equal to 0).
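
The reconstruction of the program reference level can be checked numerically with a short sketch. The function names are assumptions made for the example. Note that one step of 1/24 in the exponent corresponds to about 0.2509 dB, which the text rounds to 0.25 dB.

```python
import math

FULL_SCALE_LEVEL = 32767.0


def prog_ref_level_to_linear(prog_ref_level):
    """Reconstruct the linear level from the 7-bit prog_ref_level field."""
    if not 0 <= prog_ref_level <= 127:
        raise ValueError("prog_ref_level is a 7-bit field")
    return FULL_SCALE_LEVEL * 2.0 ** (-prog_ref_level / 24.0)


def prog_ref_level_to_db(prog_ref_level):
    """Return the level in dB relative to full scale (zero or negative)."""
    level = prog_ref_level_to_linear(prog_ref_level)
    return 20.0 * math.log10(level / FULL_SCALE_LEVEL)
```

A value of 24 halves the level, that is, roughly 6.02 dB below full scale.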
pce_tag_present indicates that pce_instance_tag is being transmitted. This permits pce_instance_tag to be sent as
infrequently as desired (e.g. once), although periodic transmission would permit break-in.
pce_instance_tag indicates with which program the dynamic range information is associated. If this is not present then the
default program is indicated. Since each AAC bitstream typically has just one program, this would be the most common
mode. Each program in a multi-program bitstream would send its dynamic range information in a distinct
extension_payload() of the fill_element(). In the multiple program case, the pce_instance_tag would always have to be
signaled.
The drc_tag_reserved_bits fill out the optional fields to an integral number of bytes in length.
The excluded_chns_present bit indicates that channels that are to be excluded from dynamic range processing will be
signaled immediately following this bit. The excluded channel mask information must be transmitted in each frame where
channels are excluded. The following ordering principles are used to assign the exclude_mask to channel outputs:
• If a PCE is present (explicit speaker mapping), the exclude_mask bits correspond to the audio channels in the SCE, CPE,
CCE and LFE syntax elements in the order of their appearance in the PCE. In the case of a CPE, the first transmitted mask
bit corresponds to the first channel in the CPE, the second transmitted mask bit to the second channel. In the case of a CCE,
a mask bit is transmitted only if the coupling channel is specified to be an independently switched coupling channel.
• For the case of an implicit speaker mapping (no PCE present), the exclude_mask bits correspond to the audio channels in
the SCE, CPE and LFE syntax elements in the order of their appearance in the bitstream, followed by the audio channels in
the CCE syntax elements in the order of their appearance in the bitstream. In the case of a CPE, the first transmitted mask
bit corresponds to the first channel in the CPE, the second transmitted mask bit to the second channel. In the case of a CCE, a
mask bit is transmitted only if the coupling channel is specified to be an independently switched coupling channel.

drc_band_incr is the number of bands greater than one if there is multi-band DRC information.

dyn_rng_ctl is quantized in 0.25 dB steps using a 7-bit unsigned integer, and therefore, in association with dyn_rng_sgn, has
a range of +/-31.75 dB. It is interpreted as a gain value that shall be applied to the decoded audio output samples of the current
frame.

The range supported by the dynamic range information is summarized in the following table:

Field                         bits      steps     stepsize, dB   range, dB
prog_ref_level                7         128       0.25           31.75
dyn_rng_sgn and dyn_rng_ctl   1 and 7   +/- 127   0.25           +/- 31.75

The following symbolic abbreviations for values of the extension_type field are defined currently:

Symbol Value of extension_type Purpose
EXT_FILL ‘0000’ Bitstream filler
EXT_FILL_DATA ‘0001’ Bitstream data as filler
EXT_DYNAMIC_RANGE ‘1011’ Dynamic range control
- all other values reserved

The ‘reserved’ values can be used for further extension of the syntax in a compatible way.

Note that fill_nibble is normatively defined to be ‘0000’ and fill_byte is normatively defined to be ‘10100101’ (to ensure that
self-clocked data streams, such as radio modems, can perform reliable clock recovery).

The dynamic range control process is applied to the spectral data spec[i] of one frame immediately before the synthesis
filterbank. In case of an EIGHT_SHORT_SEQUENCE window_sequence the index i is interpreted as pointing into the
concatenated array of 8*128 (de-interleaved) frequency points corresponding to the 8 short transforms.


The following pseudo code is for illustrative purposes only, showing one method for applying one set of dynamic control
information to a frame of a target audio channel. The constants ctrl1 and ctrl2 are compression constants (typically
between 0 and 1, zero meaning no compression) that may optionally be used to scale the dynamic range compression
characteristics for levels greater than or less than the program reference level, respectively. The constant target_level
describes the output level desired by the user, expressed in the same scaling as prog_ref_level.

bottom = 0;
drc_num_bands = 1;
if (drc_bands_present)
    drc_num_bands += drc_band_incr;
if (drc_num_bands == 1)
    drc_band_top[0] = 1024/4 - 1;
for (bd=0; bd < drc_num_bands; bd++) {
    top = 4 * (drc_band_top[bd] + 1);

    /* Decode DRC gain factor */
    if (dyn_rng_sgn[bd])
        factor = 2^(-ctrl1*dyn_rng_ctl[bd]/24); /* compress */
    else
        factor = 2^(ctrl2*dyn_rng_ctl[bd]/24); /* boost */

    /* If program reference normalization is done in the digital domain, modify
     * factor to perform normalization.
     * prog_ref_level can alternatively be passed to the system for modification
     * of the level in the analog domain. Analog level modification avoids problems
     * with reduced DAC SNR (if signal is attenuated) or clipping (if signal is boosted)
     */
    factor *= 0.5^((target_level-prog_ref_level)/24);

    /* Apply gain factor */
    for (i=bottom; i<top; i++)
        spec[i] *= factor;
    bottom = top;
}
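
The pseudo code above can be turned into a runnable sketch. It is illustrative only: the function name, the argument list and the default values ctrl1 = ctrl2 = 1.0 (gains applied in full) are assumptions made for the example, and the pseudo code's `^` is read as exponentiation.

```python
def apply_drc(spec, dyn_rng_sgn, dyn_rng_ctl, drc_band_top=None,
              prog_ref_level=0, target_level=0, ctrl1=1.0, ctrl2=1.0):
    """Return a copy of spec with the DRC gain of each band applied.

    spec is the list of spectral coefficients of one frame (1024 values, or
    the concatenated 8*128 values of an EIGHT_SHORT_SEQUENCE).
    """
    drc_num_bands = len(dyn_rng_ctl)
    if drc_num_bands == 1:
        drc_band_top = [1024 // 4 - 1]      # a single band covers all lines
    out = list(spec)
    bottom = 0
    for bd in range(drc_num_bands):
        top = 4 * (drc_band_top[bd] + 1)
        # Decode the DRC gain factor
        if dyn_rng_sgn[bd]:
            factor = 2.0 ** (-ctrl1 * dyn_rng_ctl[bd] / 24.0)   # compress
        else:
            factor = 2.0 ** (ctrl2 * dyn_rng_ctl[bd] / 24.0)    # boost
        # Program reference normalization in the digital domain
        factor *= 0.5 ** ((target_level - prog_ref_level) / 24.0)
        # Apply the gain factor to the lines of this band
        for i in range(bottom, min(top, len(out))):
            out[i] *= factor
        bottom = top
    return out
```

With a single band, dyn_rng_sgn = 0 and dyn_rng_ctl = 24, every coefficient is doubled when target_level equals prog_ref_level.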

Note the relation between dynamic range control and coupling channels:
• Dependently switched coupling channels are always coupled onto their target channels as spectral coefficients
prior to the DRC processing and synthesis filtering of these channels. Therefore a dependently switched coupling
channel’s signal that couples onto a specific target channel will undergo the DRC processing of that target
channel.
• Since independently switched coupling channels couple to their target channels in the time domain, each
independently switched coupling channel will undergo DRC processing and subsequent synthesis filtering
separate from its target channels. This permits the independently switched coupling channel to have distinct DRC
processing if desired.

Persistence of DRC information:

At the beginning of a stream, all DRC information for all channels is assumed to be set to its default value: program reference
level equal to the decoder’s target reference level, one DRC band, with no DRC gain modification for that band. Unless this
data is specifically overwritten, this remains in effect.

There are two cases for the persistence of DRC information that has been transmitted:
• The program reference level is per audio program, and persists until a new value is transmitted, at which point the new
data overwrites the old and takes effect that frame. (It may be appropriate to send this value periodically to allow
bitstream break-in.)
• Other DRC information persists on a per-channel basis. Note that if a channel is excluded via the appropriate
exclude_mask[] bit, then effectively no information is transmitted for that channel in that call to dynamic_range_info().
The excluded channel mask information must be transmitted in each frame where channels are excluded.

The rules for retaining per-channel DRC information are as follows:
• If there is no DRC information in a given frame for a given channel, use the information that was used in the previous
frame. (This means that one adjustment can hold for a long time, although it may be appropriate to transmit the DRC
information periodically to permit break-in.)
• If any DRC information for this channel appears in the current frame, the following sequence occurs: first, overwrite all
per-channel DRC information for that channel with the default values (one DRC band, with no DRC gain modification for
that band), then overwrite any per-channel DRC information with the transmitted values.
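
The per-channel retention rules can be sketched as a small state update. This is an illustrative sketch only; the dictionary layout, the names and the use of None to mean that no DRC information arrived for the channel in this frame are assumptions made for the example.

```python
import copy

# Default per-channel state: one DRC band with no gain modification.
DEFAULT_CHANNEL_DRC = {
    "drc_num_bands": 1,
    "drc_band_top": [1024 // 4 - 1],
    "dyn_rng_sgn": [0],
    "dyn_rng_ctl": [0],
}


def update_channel_drc(previous, transmitted):
    """Return the DRC state a channel uses for the current frame.

    previous:    state used in the previous frame
    transmitted: dict of fields received for this channel in this frame,
                 or None if nothing was received
    """
    if transmitted is None:
        # No DRC information for this channel: keep the previous state.
        return previous
    # Reset to defaults first, then overwrite with the transmitted values.
    state = copy.deepcopy(DEFAULT_CHANNEL_DRC)
    state.update(copy.deepcopy(transmitted))
    return state
```

Because of the reset step, a field that was sent in an earlier frame but is missing from the current transmission falls back to its default rather than to its earlier value.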

17) In the fifth paragraph of subclause 9.3, replace “less than 24 bits” with “less than 22 bits”.
18) In subclause 10.3, replace the third line in the inverse quantization with the following:

width = (swb_offset [sfb+1] - swb_offset [sfb]);

19) In subclause 11.3.2, replace “/* see clause 4 */” with “/* see clause 9 */”.
20) In the first paragraph of subclause 11.3.2, replace “(but is initialized to zero to have an valid in the array)” with “(but is
initialized to zero to have a valid entry in the array)”.
21) Add the following sentence at the end of subclause 11.3.2:

Note that scalefactors, sf[g][sfb], must be within the range of zero to 256, both inclusive.

22) In subclause 12.1.3, replace the pseudo code used for computing the inverse M/S matrix

tmp = l_spec[g][b][sfb][i] +
r_spec[g][b][sfb][i];
l_spec[g][b][sfb][i] = l_spec[g][b][sfb][i] -
r_spec[g][b][sfb][i];
r_spec[g][b][sfb][i] = tmp;

with

tmp = l_spec[g][b][sfb][i] -
r_spec[g][b][sfb][i];
l_spec[g][b][sfb][i] = l_spec[g][b][sfb][i] +
r_spec[g][b][sfb][i];
r_spec[g][b][sfb][i] = tmp;
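
The corrected inverse M/S matrix can be exercised with a short sketch. It is illustrative only: the function name and the flat per-band lists are assumptions made for the example, with l_spec holding the mid signal and r_spec the side signal on entry.

```python
def inverse_ms(l_spec, r_spec):
    """Apply the corrected inverse M/S matrix in place to one scalefactor band.

    On entry l_spec holds M and r_spec holds S; on return l_spec holds
    L = M + S and r_spec holds R = M - S.
    """
    if len(l_spec) != len(r_spec):
        raise ValueError("both channels must have the same number of coefficients")
    for i in range(len(l_spec)):
        tmp = l_spec[i] - r_spec[i]
        l_spec[i] = l_spec[i] + r_spec[i]
        r_spec[i] = tmp
```

The temporary variable is needed because the new left value would otherwise overwrite the mid value before the difference is formed.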

23) In the second paragraph of subclause 13.3.2.1, replace:

$$ x_{est,m}(n) = b \cdot k_m(n) \cdot a \cdot r_{q,m-1}(n-1), $$
where
$$ r_{q,m}(n) = r_{q,m-1}(n-1) - b \cdot k_m(n) \cdot e_{q,m-1}(n) $$

with

xn()=⋅bk ()
...
