Robust coding
Index
The problem
The most common applicative context for this research field is the streaming over an unreliable channel (e.g., Internet) of multimedia data (e.g., video). The typical setup is as follows: a transmitter takes a multimedia stream, partitions it in packets and sends each packet over the channel; at the other end, a receiver collects the packets, decodes them and shows the result to the user. The video stream must be decoded in real time, i.e., downloading for later viewing is not an acceptable option.Since the channel is unreliable, some packet could not arrive at the decoder. A simple solution to the unreliability problem is the use of a protocol which retransmits the lost packets (e.g., TCP over IP). However, in case of video transmission each packet has a deadline and the retransmitted data could arrive too late for being useful. Because of this, retransmission is usually not considered a viable option and the receiver must be ready to cope with missing packets.
The most sofisticated players try to do some error concealment by estimating the portion of the multimedia content corresponding to the lost packets. However, this solution should be considered an extrema ratio since, depending on the characteristics of the multimedia content, the quality of the estimate can leave much to be desired.
A different approach to the loss problem is given by the source/channel coding techniques whose goal is to produce a compressed bit-stream which is more robust with respect to data losses.
The solution
Every robust coding technique can be summarized in two words: add redundancy. The idea is that by exploiting such redundancy one can still recover the original data even in case of losses. The majority of the robust coding techniques currently known can be partitioned in two big classes: FEC-based techniques and Multiple Description (MD) techniques.FEC-based solutions
This solution is best explained by means of a toy example: suppose one wants to transmit two packets A and B over a lossy connection. In order to have some protection against losses, one can send together A and B a third packet C=A ⊕ B obtained by taking the XOR of A and B. It is clear that in order to be able to recover A and B it suffices to receive at least two out of the three sent packets. (If, for example, only A and C are received one can recover B as B=A ⊕ C).This approach can be generalized by creating the redundancy packets by means of some suitable FEC code instead of XOR operation. An interesting candidate in this context is given by the Reed-Solomon (R-S) codes since they are maximally robust in the sense that if a (N,K) R-S code is employed (that is, to K information packets N-K redundancy packets are added), one can recover the original data stream as soon as any combination of K packets is received.
The advantage of the FEC approach is that it can applied to any bit-stream, independently on the nature of the transported data (it could be applied to "pure binary" data such as portions of an executable files) and it requires no change on the encoder/decoder pair if the FEC is added at the interface between the application and the transport layer. Another interesting characteristic of the FEC approach is that it can be applied between the transport and the application levels. In this case, the decoder would "see" a reliable channel with no packet losses (although with a smaller bandwidth) and this would allow one to use a simple decoder with very simple concealment procedures.
The main disadvantage of the FEC approach is that the the
performance of the scheme has a sudden drop if the actual loss rate
gets larger than the loss rate the FEC was designed for.
The reason for
this is clear from the toy example above: if one looses both A and B,
the data in packet C are useless and no reconstruction is possible.
In the more general case, if an (N,K)-code is employed and M < K
packets are received the redundancy packets in the received set are
(typically) useless. This gives rise to the performance
"knee" qualitatively shown in the figure.
MD-based solutions
The Multiple Description (MD) approach could be informally described as a way to add the redundancy at the signal level, rather then at the bit-stream level.As with the FEC approach, the MD idea is best described by using a toy example. Suppose that now one wants to transmit two "analog values" (i.e., real numbers) A and B (let us neglect for a moment the fact that a real number requires an infinite number of bits). In order to gain some protection against losses, one could send together with A and B also their average C=(A+B)/2. It is clear that in order to be able to recover A and B it suffices to receive at least two out of three values. In this context A, B and C are called descriptions.
The advantage of this solution is that if one receives only one value, one can exploit any correlation between A and B in order to estimate (with some error, but usually an acceptable one) the two original values. It is worth observing that in most applicative contexts A and B are correlated, for example, if the signal to be transmitted is an image, A (respectively, B) could be obtained by taking the even (respectively, odd) columns of the original image. If only one description is received, the missing columns could be recovered by interpolation. Because of this possibility, the performance degradation experienced by the MD approach can be smoother than the performance degradation experienced by FEC.
A disadvantage of the MD approach is that it requires a change in the encoder and/or decoder structure. It should be said that often the change reduces itself in the addition of a pre-processing block at the encoder side and/or a post-processing block at the decoder side. In this case, one can implement MD solutions by means of off-the-shelf software.
Finally, it should be noted a subtle point with the MD approach: the samples of a signal are typically lossy encoded before transmission. This implies that the receiver will not have the true values of A, B and C, but an approximated version, corrupted by the coding noise. How does the coding noise interact with the MD approach? On one hand, the redundancy in the MD can be exploited to reduce the noise effect when all the descriptions are received, on the other hand, a poorly designed MD scheme could actually amplify the coding noise when losses happen.