Part 2 of an excerpt from "Multimedia Over IP and Wireless Networks" examines loss concealment techniques for overlapped transform based codecs.
[Part 1 looks at error concealment strategies and error resilient coding for waveform and CELP speech codecs.]
3.4 LOSS CONCEALMENT FOR LAPPED TRANSFORM CODECS
Linear transforms are widely used in signal compression. Their primary objective is to concentrate the signal energy in a few coefficients, thus preparing the data for the subsequent quantization and entropy coding. Block transforms (e.g., the Discrete Cosine Transform, DCT) are convenient in that they make each block of data independent, confining the effect of any error (whether from quantization or from loss) to that single block of data. Nevertheless, because they do not exploit the correlation between adjacent samples in different blocks, they often produce structured noise (blocking artifacts), which is readily identifiable in the decoded signal as a buzzing sound.
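The energy-concentration property can be illustrated with a small numerical sketch. It assumes NumPy, an orthonormal DCT-II built directly from its textbook definition, and a synthetic test block made of two cosines; the helper names `dct_matrix` and `energy_in_largest` and the test signal are illustrative choices, not taken from the text.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix; row k is the k-th basis function."""
    n = np.arange(N)[None, :]          # sample index
    k = np.arange(N)[:, None]          # coefficient index
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (n + 0.5) * k / N)
    C[0, :] /= np.sqrt(2.0)            # DC row needs the extra 1/sqrt(2)
    return C

def energy_in_largest(values, count):
    """Fraction of the total energy held by the `count` largest-magnitude entries."""
    energy = np.sort(values ** 2)[::-1]
    return energy[:count].sum() / energy.sum()

N = 64
t = np.arange(N)
# Hypothetical smooth, highly correlated test block (not from the text).
block = np.cos(2 * np.pi * 3.3 * t / N) + 0.5 * np.cos(2 * np.pi * 7.1 * t / N)

C = dct_matrix(N)
coeffs = C @ block                     # one block transform

compaction_time = energy_in_largest(block, 8)    # 8 largest samples
compaction_dct = energy_in_largest(coeffs, 8)    # 8 largest coefficients
```

Because the transform is orthonormal, the total energy is unchanged; what changes is how it is distributed. For a correlated block such as this one, the eight largest coefficients hold a much larger share of the energy than the eight largest samples, which is what makes coarse quantization of the remaining coefficients affordable.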
Overlapped transform coders occupy an important niche between block coders and fully predictive coders. They still limit the data to a certain block of samples, but their basis functions do not have discontinuities at block boundaries. Instead, the basis functions extend into (i.e., overlap) neighboring data blocks. This significantly reduces blocking artifacts, while preserving or even improving the compression performance of the transform. For these reasons, overlapped transforms are used in numerous audio and speech codecs (e.g., MP3, Windows Media Audio [WMA], and ITU-T G.722.1).
A loss concealment technique based on exploiting the partial information available about certain samples has recently been introduced [16]. The technique can be used with essentially any linear transform where some of the coefficients are missing. Important cases include missing "frames" of overlapped transform (e.g., Modulated Lapped Transform, MLT) coefficients, missing wavelet coefficients, or even single or multiple missing transform coefficients within a block of a block transform (e.g., DCT). However, since we are mostly interested in concealment of missing blocks in real-time speech and audio communication over packet networks, we will focus our discussion on the case of overlapped transforms.
When using an overlapped transform based codec, if a frame or block of coefficients is lost, partial information is still available about the missing segment. While this information is not of sufficient quality to be used directly, it provides important clues about the missing segment. In this section we discuss ways to exploit this partial information to maximize the quality of the recovered signal. In particular, we apply some of the techniques to single-frame loss concealment on the ITU-T G.722.1 codec [5].
To better understand the scenario, let us take a look at how an overlapped transform is used for coding purposes. Figure 3.3a shows a one-dimensional signal. In this example, the signal is split into overlapping blocks of 2N samples, as shown in Figure 3.3b. Then, for each block, N transform coefficients are obtained by multiply/accumulate operations with the N basis functions constituting the transform.
FIGURE 3.3: A sample speech signal. (a) Original signal. (b) Signal split into overlapping segments and windowed. (c) Corresponding segments after decoding. (d) Overlapped/added signal with one missing block. (e) Error concealment using simple block repetition.
Figure 3.4 shows the first few basis functions of a typical transform. On the decoder side, the basis functions are scaled by the transform coefficients and added. Subsequent frames of the signal are then overlapped and added. Figure 3.3c shows the contribution of each overlapping block, before addition. Note that the recovered segments have the same length as the original segments but are not identical to them: the original signal is recovered only after adding the overlapping parts.
FIGURE 3.4: A few basis functions of the MLT transform. From top to bottom: 1st, 2nd, 3rd, 10th, and 50th basis functions.
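The encoding and decoding steps described above can be sketched numerically. The text does not give the MLT formula, so this sketch assumes the commonly used sine-window definition, p_k[n] = sin((n + 1/2)π/2N) · sqrt(2/N) · cos((n + (N+1)/2)(k + 1/2)π/N). The function names, the zero padding at the signal edges, and the random test signal are illustrative assumptions.

```python
import numpy as np

def mlt_basis(N):
    """(2N, N) matrix whose columns are the N basis functions.
    Assumed sine-window MLT definition; not quoted from the text."""
    n = np.arange(2 * N)[:, None]      # sample index inside a block
    k = np.arange(N)[None, :]          # coefficient index
    window = np.sin((n + 0.5) * np.pi / (2 * N))
    return window * np.sqrt(2.0 / N) * np.cos((n + (N + 1) / 2) * (k + 0.5) * np.pi / N)

def mlt_encode(signal, N):
    """Split into blocks of 2N samples with hop N; return N coefficients per block."""
    assert len(signal) % N == 0, "sketch assumes the length is a multiple of N"
    P = mlt_basis(N)
    # Pad N zeros on each side so every real sample is covered by two blocks.
    padded = np.concatenate([np.zeros(N), signal, np.zeros(N)])
    num_blocks = len(padded) // N - 1
    blocks = np.stack([padded[m * N:(m + 2) * N] for m in range(num_blocks)])
    return blocks @ P                  # multiply/accumulate with each basis function

def mlt_decode(coeffs, N):
    """Scale the basis functions by the coefficients, then overlap and add."""
    P = mlt_basis(N)
    num_blocks = coeffs.shape[0]
    out = np.zeros((num_blocks + 1) * N)
    for m in range(num_blocks):
        out[m * N:(m + 2) * N] += P @ coeffs[m]
    return out[N:-N]                   # drop the zero padding

N = 16
rng = np.random.default_rng(0)
x = rng.standard_normal(8 * N)         # stand-in for a speech signal
X = mlt_encode(x, N)
x_hat = mlt_decode(X, N)

# One decoded segment before overlap-add (block 3 covers x[2N:4N]).
single_segment = mlt_basis(N) @ X[3]
```

Under these assumptions `x_hat` matches `x` to numerical precision, while `single_segment` differs from the block of samples it was computed from, which mirrors the observation that the original signal is recovered only after the overlapping parts are added.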
Now, suppose the information about one of the blocks was lost. A total of 2N samples (those spanning the lost block) cannot be reconstructed correctly. If we replace the lost coefficients with zeros, we obtain the reconstructed signal indicated in Figure 3.3d. Note that in this example, although only N coefficients are missing, a total of 2N samples are not reconstructed correctly, due to the overlapping nature of the transform. Nevertheless, overlapped transforms like the MLT are critically sampled. This means that some partial information is available about the 2N incomplete samples. More specifically, a total of N linear equations are available regarding these 2N samples. We will now examine how this can be used to improve the loss concealment.
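The following sketch illustrates both observations numerically: zeroing one block corrupts 2N samples, yet N independent linear constraints on those samples survive. It assumes the common sine-window MLT definition (not given in the text) and a random test signal. Writing the constraints with the projector A = I - P Pᵀ is one possible formulation chosen here for illustration; it is not necessarily the one used in [16].

```python
import numpy as np

def mlt_basis(N):
    """(2N, N) matrix of basis functions (assumed sine-window MLT definition)."""
    n = np.arange(2 * N)[:, None]
    k = np.arange(N)[None, :]
    window = np.sin((n + 0.5) * np.pi / (2 * N))
    return window * np.sqrt(2.0 / N) * np.cos((n + (N + 1) / 2) * (k + 0.5) * np.pi / N)

N, num_blocks, lost = 16, 9, 4         # hypothetical sizes; block 4 is lost
P = mlt_basis(N)
rng = np.random.default_rng(1)
# Zero-padded test signal so that every real sample is covered by two blocks.
padded = np.concatenate([np.zeros(N), rng.standard_normal((num_blocks - 1) * N), np.zeros(N)])
X = np.stack([padded[m * N:(m + 2) * N] @ P for m in range(num_blocks)])

def overlap_add(coeffs):
    """Decode every block and overlap-add the 2N-sample segments."""
    out = np.zeros((coeffs.shape[0] + 1) * N)
    for m in range(coeffs.shape[0]):
        out[m * N:(m + 2) * N] += P @ coeffs[m]
    return out

X_lossy = X.copy()
X_lossy[lost] = 0.0                    # replace the lost coefficients with zeros

full = overlap_add(X)
zero_filled = overlap_add(X_lossy)
error = full - zero_filled             # nonzero only where the lost block lived

span = slice(lost * N, (lost + 2) * N) # the 2N samples touched by the lost block
s = padded[span]                       # true samples (unknown to the decoder)
z = zero_filled[span]                  # what the neighboring blocks still provide

# s - z is the missing segment P @ X[lost], so it lies in the column space of P.
# Projecting onto the orthogonal complement gives constraints A @ s = A @ z.
A = np.eye(2 * N) - P @ P.T
num_equations = np.linalg.matrix_rank(A)
```

Here `num_equations` equals N, and `A @ s` equals `A @ z` to numerical precision: the decoder knows the right-hand side from the received neighboring blocks, so it has N independent equations in the 2N unknown samples. The remaining N degrees of freedom are what a concealment method has to estimate.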