Distributed coding

Index


The problem

The ideas of distributed coding are best introduced by means of a toy example. Consider the following setup: in the opposite corners of a room one has two sensors which measure the temperature with the resolution of (say) 0.1 °C. The goal is to (lossy) compress the measurements obtained from the sensors. Let A and B denote the values read from the sensors. It is intuitive that the two temperatures will be quite close each other and that such a similiraity can be exploited in order to reduce the required rate. More precisely, it is intuitive (and it can be proved) that instead of compressing the two values A and B as they are, it is more convenient to compress their average (A+B)/2 and their semidifference (A-B)/2.

However, computing the average and the semidifference require to know both values A and B. This implies that there must be a central node to which both sensors send their measures or that one sensor must transmit its measure to the other one. Depending on the applicative context it could be difficult to arrange such a setup (e.g., in a sensor network where each node could even be unaware of the existence of other nodes). Of course, one could compress A and B independently, but this would not be efficient in terms of the required bit-rate.

The main question of distributed coding should be now clear: can we find a compression technique which (i) processes A and B in an independent way and (ii) has the same efficiency of the joint coding solution (i.e., compressing the average and the semidifference in the toy example above)? Surprising as it may seem, the answer is yes.

The solution

In the distributed coding context there are two main results: Slepian-Wolf coding for lossless coding and Wyner-Ziv coding for lossy coding. It is more convenient to introduce first the Slepian-Wolf approach.

Slepian-Wolf coding

In the Slepian-Wolf setting we have two symbol sources X and Y whose symbol sequences are correlated and we want to compress them in a lossless way.

The basic idea in the Slepian-Wolf approach is that since the symbols produced by Y are correlated to the symbols produced by X, we can consider X as a "noisy" version of Y, that is, we pretend that X was obtained by transmitting Y through a noisy channel. This suggests that we could recover Y from X by applying the techniques used protect data transmission over noisy channels.

More precisely, the Slepian-Wolf can be approximately described as follows

  1. Encode X by using any lossless technique and send the result to the receiver.
  2. Pretend Y is a sequence to be transmitted over a noisy channel and apply to it some systematic Forward Error Correction (FEC) code.
  3. A systematic FEC will produce a new symbol sequence composed of the original sequence Y together with a redundancy sequence R. Discard the systematic part Y and send only R to the receiver.
  4. The receiver joins X with R and "pretends" that the resulting sequence it is the sequence obtained by the transmission of the protected version of Y (i.e., Y+R) over a noisy channel. The receiver processes X+R with an algorithm for decoding the FEC and recovers Y.
In this approach sequence X plays two roles: it represents both itself and a noisy version of the other sequence Y.

Wyner-Ziv coding

Distributed coding via Slepian-Wolf

The simplest way to do distributed lossy compression is to use the usual approach "quantization + lossless compression", but doing the lossless compression in a distributed way, that is, with Slepian-Wolf. In the literature the Slepian-Wolf approach often is not applied directly to the quantized values, but the quantizer output is split into bitplanes and the Slepian-Wolf technique is applied to each bitplane.

Distributed coding via reduction functions

Another approach to lossy distributed coding tries to exploit the dependence between the sources before the quantization step.

A rough description of the basic ideas behind this approach can be given by considering again the toy example introduced above. Suppose that we know that the difference between the two temperatures will always be in the -0.5..0.5 °C range. We can exploit such a priori knowledge by sending X as it is and only the decimal part of Y, since this, with the knowledge of X, suffices to reconstruct Y. To be convinced, consider the following two examples

  • Suppose X=12.4 °C and Y=12.8 °C . In this case the decoder receives the two values "12.4" and "8". Clearly, the only possible value for Y is 12.8 °C since both 13.8 °C and 11.8 °C are already outside the range 12.4±0.5.
  • Suppose X=12.2 °C and Y=11.8 °C. In this case the decoder receives the two values "12.2" and "8". Note that it cannot be Y=12.8 °C since 12.8-12.2=0.6 > 0.5 is too large. It is easy to see that 11.8 is the only possible value for Y.
The above is clearly a toy example, but the same idea can be easily extended to more general settings.

Applications

Distributed coding techniques are clearly interesting in those contexts where one has several correlated, but physically separated, sources. Distributed coding techniques have been proposed for multi-camera compression, sensor networks and also video compression. In the latter application, sources X and Y typically represent two consecutive frames of the same video sequence. Who proposes distributed techniques for video coding claims that this approach allows one to exploit the inter-frame dependence without the need of costly motion compensation.

A main drawback of distributed coding techniques is that in order to have an efficient compression it is necessary to know with some precision the statistical relationship between X and Y. Alternatively, it is possible to lower the requirement about such a knowledge if a feedback channel (used by the receiver to request more redundancy symbols) is available.