Summary: Explains in simple terms the functions of a convolutional encoder and Viterbi decoder
Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.
A convolutional code is not decoded in short blocks as in a block code. However, to simplify decoding, messages are artificially broken down into very long blocks by periodically flushing the encoder with a string of zeros, as in the example discussed here.
For illustration only this example here uses an unrealistically short block length of 5 data bits with the last two fixed at 0 to flush the encoder (remember that this is very inefficient and, in practice, practical block lengths are very much longer, typically 1,000 to 10,000 bits in length).
Convolutional codes are always decoded using the Viterbi algorithm as this simplifies the decoding operation. The algorithm is based on the nearest neighbour decoding scheme and, like the other algorithms we have looked at, it relies on the assumption that the probability of t errors is much greater than the probability of t+1 errors and it thus selects or chooses and retains only the paths which have fewer errors.
The decoding process is based on the previous decoding trellis. We will use the previous ½ rate encoder example and assume that the received message is: 10 10 00 10 10, representing a total of five (unknown) transmitted data bits each encoded into five bit pairs, i.e. total of ten encoded data bits. We further assume in this simplified example that the last 2 bits of the 5 data inputs were flushing zeros to reset the encoder and decoder.
Starting (after flushing) with the first received bit in position A in the encoder, we know that if a 1 had been input, (lower path) from the encoder figure the output should have been 11 as we moved to state C. If a 0 was input (upper path) we should have received 00 and moved to state B, see upper part of Figure 1.
What was actually received was 10, a Hamming distance of 1 from both these possibilities, so we draw that in the lower part of Figure 1 onto the first stage of our decoding trellis.
![]() |
Instead of reporting the expected outputs we next annotate the lower part of Figure 1 with the separate distances between the received data and the trellis encoder on each path. We then add the cumulative Hamming distance to the states (B, C) in square brackets above the states B and C
Now consider the second pair of received data bits. Consider first state B. As before, we should have received 00 for a 0 input and 11 for a 1 input, see left hand side of Figure 2. What we actually received was 10, which is a Hamming distance of 1 from both possibilities so the right hand part of Figure 2 is annotated with the individual and cumulative distances to states D and E.
Then consider state C. For a 0 input, (upper part) we should have received 01, but what was actually received was 10, a Hamming distance of 2. For a 1 input (lower path) we should have received 10 and this is exactly what was received, corresponding to a Hamming distance of 0! Again the right part of Figure 2 is annotated with the individual distances on the paths and the new cumulative or summed distances to states F and G.
![]() |
We continue to build our decoding trellis until it is complete after receipt of all ten data bits, as shown in Figure 3.
If we have two paths to a state, as in the later states: H, I , J, K, L, M, N, P, we write the smaller (more likely) Hamming distance in square brackets above the state and discard the larger distance (as this is much less likely to represent the correct path). In our example, we assumed the last two bits were 0, so we must expect to finish back in state P, which is the same as the starting state A.
We finally need to find the path from state A to P which gives the lowest overall Hamming distance. We then retrace the path and remember that the upper path from a state represented a 0 transmitted and the lower path represented a 1 transmitted.
![]() |
The reverse decoded data for this example is indicated by the dashed line in Figure 3.
Leaving states A, C, G and K always in the lower of the two possible paths implies that a data bit 1 has been received at these states and therefore this translates to 1, 1, 1 as the first three encoded data bits.
The last two bits don’t matter in this case as we have assumed they are 0, 0 and we can remove from the docoding trellis all the states that don’t support or contribute to this solution.
Note that finishing a block with n-1 zero input data bits is not compulsory. If you make a decision after a delay of approximately five times the constraint length n, this makes little difference in code performance but does limit the memory consumed by the process to a more sensible amount.
Figure 4 shows the performance of various BLOCK codes, all of rate ½, whose performance improves as the block length increases, even for the same coding rate of ½.
The power of these forward error correcting codes (FECC) is quantified as the coding gain, i.e. the reduction in the required
For example in Figure 4 the (31, 16) code has a coding gain over the uncoded case of around 1.8 dB at a
![]() |
Figure 5 shows for comparison with the block codes of Figure 4 the performance of convolutional coders. The convolutional code initially provides very good performance at modest constraint length. A short constraint length of n = v = 3 is already superior to the 511 block length code of Figure 4. The additional attraction of the convolutional coder is its further improvement with the increase in constraint length up to n = 7 or 9, as shown in Figure 5.
Unfortunately the coding and decoding process gets more complicated with larger block/constraint length. As shown here convolutional codes with Viterbi decoding are generally more powerful than block codes, especially for very low error rates, hence their wider use. Single chip constraint length 9 (512 state) encoder and decoders are now widely available as commercial products from many semiconductor vendors.
![]() |