Channels with memory

Let us consider channels with memory; in other words, channels whose current output depends statistically on past inputs as well as the current one, so that the output is not independent of the input history, p(x, y) ≠ p(x)p(y). For example, assume a DMS (discrete memoryless source, X = {−1, +1}) sending symbols to a delay (D) such that the sum of the current and delayed symbols is the output,

                 x(n) = −1    x(n) = +1
x(n−1) = −1         −2            0
x(n−1) = +1          0           +2
Thus,
Figure 1 DMS X sends symbols x(n−1) and x(n) to form y(n)
Then the channel,
Figure 2 Symbol x(n) combining with its delayed copy x(n−1) makes the channel a duo-binary channel
is called a duo-binary channel.
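As a quick sketch in Python (the function name `duobinary` is illustrative, not from the original), the channel's input–output rule y(n) = x(n) + x(n−1) can be written as:

```python
import random

def duobinary(symbols):
    """Duo-binary channel: y(n) = x(n) + x(n-1),
    defined from the second input symbol onward."""
    return [symbols[n] + symbols[n - 1] for n in range(1, len(symbols))]

# DMS X = {-1, +1} driving the channel
x = [random.choice((-1, +1)) for _ in range(10)]
y = duobinary(x)
print(list(zip(x[1:], y)))
```

Every output falls in {−2, 0, +2}, matching the table above.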

Hidden Markov Process

If the DMS and the duo–binary channel together are considered an information source

Figure 3 Consider DMS plus duo-binary channel as information source Y

then the Markov process that exists within this information source is called a hidden Markov process.

Steady–state analysis of the system shows that

p(y = -2) = 0.25, p(y = 0) = 0.5 and p(y = +2) = 0.25

(See the figure above for the parameter values used in these computations.)
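Because successive source symbols are independent, these probabilities follow by enumerating the four input pairs; a minimal check in Python, assuming p(x = −1) = p(x = +1) = 0.5:

```python
from itertools import product

p_x = {-1: 0.5, +1: 0.5}  # equiprobable DMS, as in the example

p_y = {}
for x_prev, x_now in product(p_x, repeat=2):
    y = x_prev + x_now  # duo-binary output
    p_y[y] = p_y.get(y, 0.0) + p_x[x_prev] * p_x[x_now]

print(p_y)  # {-2: 0.25, 0: 0.5, 2: 0.25}
```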

Drawing a state diagram

Let us say that at time index n the source sends symbol xi, where i ∈ {0, 1}. Let us also say that this source has entropy H(X) = 1 bit. Thus p(x0) = p(x1) = 0.5.

At any time index n, if the symbol is xi = −1 (i.e., xi = x0), let us define the 'state' as S = S0. On the other hand, if the symbol is xi = +1 (i.e., xi = x1), let us say the state is S = S1. Then the Markov–process state diagram is

Figure 4 State diagram of Fig 3 system

The diagram shows that, starting from state S0, output y = −2 means the symbol at the current time index n is xi = x0. Since the source symbols are statistically independent, p(x0(n) | x0(n−1)) = p(x0(n)). Thus the arrow is directed back to the same state. On the other hand, for output y = 0 (assuming the initial state is S0), the next symbol must be xi = x1, and p(x1(n) | x0(n−1)) = p(x1(n)). The arrow is therefore directed from state S0 to S1. From the state diagram one can carry the analysis further.

Let

μ0(n) = p(S0) = p(x = x0 at time n)

and similarly

μ1(n) = p(S1) = p(x = x1 at time n)
Note that the total probability μ0 + μ1 = 1. We can therefore write the difference equation for the state probabilities as
[μ0(n+1)]   [p(x0|x0)  p(x0|x1)] [μ0(n)]
[μ1(n+1)] = [p(x1|x0)  p(x1|x1)] [μ1(n)]
Thus
μ(n+1) = P μ(n)
where the matrix
    [p00  p10]
P = [p01  p11]
such that pij is the transition probability from state Si to Sj. For example, p01 is the transition probability from S0 to S1; for the above system, p01 = p(x1 | x0). Since, given a particular state, the probabilities of all transitions out of that state must sum to 1, p00 + p01 = p10 + p11 = 1.

Hence for the above example

[μ0(n+1)]   [0.5  0.5] [μ0(n)]
[μ1(n+1)] = [0.5  0.5] [μ1(n)]
Note that the elements in each column of P add up to 1, and here the elements in each row also add up to 1, p00 + p10 = p01 + p11 = 1; this particular P is doubly stochastic. It is, however, not a full-rank matrix: its two rows are identical, so its rank is 1. In general, P need not be full rank, but the elements in each column must always add up to 1.
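A small NumPy sketch (variable names illustrative) checking the column and row sums of this P:

```python
import numpy as np

# transition matrix of the equiprobable duo-binary example
P = np.array([[0.5, 0.5],
              [0.5, 0.5]])

col_sums = P.sum(axis=0)  # must be all ones for any valid P
row_sums = P.sum(axis=1)  # also all ones here: P is doubly stochastic
print(col_sums, row_sums)
```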

Continuing the analysis of the above example, the state probabilities at steady state (ss) follow from the limit

lim (n→∞) μ(n) = μss = [0.5; 0.5]
Note that even if p(x0) ≠ p(x1) (in our example p(x0) = p(x1) = 0.5), the system will still settle to a steady state.
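Iterating the recursion μ(n+1) = P μ(n) numerically shows this convergence (a sketch; the asymmetric starting point μ = (0.9, 0.1) is an arbitrary illustration, not from the original):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.5, 0.5]])   # transition matrix of the example

mu = np.array([0.9, 0.1])    # deliberately unequal initial state probabilities
for _ in range(20):
    mu = P @ mu              # mu(n+1) = P mu(n)

print(mu)  # [0.5 0.5] -- the steady-state distribution
```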

Next:

An alternate view of the state of the channel