Most functions are information lossy
For the system with minimal specifications
This can be shown in a directed graph as
But we know that p(Y) = transition probability × p(X); for an output symbol reachable from a single input, p(yk) = p(yk|xi) p(xi).
Thus,
p(y0) = p(y0|x0) p(x0) ⇒ p(y0|x0) = p(y0) / p(x0) = 0.05 / 0.5 = 0.1
Similarly,
p(y1) = p(y1|x0) p(x0) ⇒ p(y1|x0) = p(y1) / p(x0) = 0.4 / 0.5 = 0.8
p(y3) = p(y3|x1) p(x1) ⇒ p(y3|x1) = p(y3) / p(x1) = 0.4 / 0.5 = 0.8
p(y4) = p(y4|x1) p(x1) ⇒ p(y4|x1) = p(y4) / p(x1) = 0.05 / 0.5 = 0.1
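These four divisions can be checked mechanically. A minimal Python sketch, using only the marginals stated above (the dictionary names and the parent mapping are illustrative labels for the structure of the graph, not notation from the original):

```python
# Marginals given in the example: the two inputs and the four
# outputs that are reachable from exactly one input each.
p_x = {"x0": 0.5, "x1": 0.5}
p_y = {"y0": 0.05, "y1": 0.4, "y3": 0.4, "y4": 0.05}

# For such an output, p(y) = p(y|x) · p(x)  ⇒  p(y|x) = p(y) / p(x).
parent = {"y0": "x0", "y1": "x0", "y3": "x1", "y4": "x1"}

for y, x in parent.items():
    print(f"p({y}|{x}) = {p_y[y] / p_x[x]:.2f}")
# → p(y0|x0) = 0.10, p(y1|x0) = 0.80, p(y3|x1) = 0.80, p(y4|x1) = 0.10
```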
Therefore the directed graph is
We know that symbol xi combines with symbol ηj to give symbol yk, i.e., xi + ηj = yk.
Therefore, given symbol xi we know something about yk, and given symbol yk we know something about ηj. Thus
p(yk|xi) = p(ηj | xi + ηj = yk).
Hence we compute the mutual information between X and Y following the addition of N as follows
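One way to reproduce this computation numerically is the short sketch below. It assumes only the distributions stated in the example: p(X) = {0.5, 0.5}, p(N) = {0.1, 0.8, 0.1}, and hence p(Y) = {0.05, 0.4, 0.1, 0.4, 0.05} (the middle 0.1 is y2, fed 0.05 by each input); the helper name H is mine:

```python
from math import log2

def H(dist):
    """Shannon entropy (in bits) of a list of probabilities."""
    return -sum(p * log2(p) for p in dist if p > 0)

p_X = [0.5, 0.5]                    # p(x0), p(x1)
p_N = [0.1, 0.8, 0.1]               # p(η) for η in {−1, 0, +1}
p_Y = [0.05, 0.4, 0.1, 0.4, 0.05]   # output marginals; 0.1 is the confounded y2

H_Y = H(p_Y)            # entropy of the output
H_N = H(p_N)            # H(Y|X) = H(N), since the noise is independent of X
I_XY = H_Y - H_N        # I(X;Y) = H(Y) − H(Y|X)

print(f"H(Y)   = {H_Y:.4f} bits")            # ≈ 1.8219
print(f"H(N)   = {H_N:.4f} bits")            # ≈ 0.9219
print(f"I(X;Y) = {I_XY:.4f} bits")           # ≈ 0.9000
print(f"H(X|Y) = {H(p_X) - I_XY:.4f} bits")  # ≈ 0.1000
```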
Therefore, given X there is uncertainty in Y, i.e., information is lost in transmission. This can be explained further as follows. For this case,
H(X) = 1 and I(X; Y) = H(X) − H(X|Y)
⇒ H(X|Y) = H(X) − I(X; Y) = 1 − 0.90 = 0.1
This implies that, given knowledge of Y, there is some uncertainty remaining in the knowledge of X. Hence,
"Entropy of the outcome of a function is not necessarily the same as the entropy of the compound symbol".
This explains why, in the above example system, H(Y) < H(C), or equivalently H(Y) < H(X, N) (numerically, H(Y) ≈ 1.8219 bits while H(X, N) ≈ 1.9219 bits). Note that H(Y) ≤ H(C) always holds, but never H(Y) > H(C).
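The inequality H(Y) ≤ H(C) is the standard fact that a deterministic function cannot increase entropy. A short derivation using standard identities (the notation C = (X, N) follows the example; this derivation is not from the original):

```latex
% Y = f(C) is a deterministic function of the compound symbol C = (X, N),
% so C leaves no residual uncertainty about Y:  H(Y|C) = 0.
\begin{aligned}
H(C) &= H(C) + H(Y \mid C) = H(C, Y) \\
     &= H(Y) + H(C \mid Y) \;\ge\; H(Y),
\end{aligned}
% with equality iff H(C|Y) = 0, i.e. iff the output determines the
% compound symbol -- exactly the "no confounding" case discussed below.
```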
Therefore (in our example), since xi + ηj = yk = f(xi, ηj), we state that
"Most functions are information lossy".
Identifying the confounding causes of information loss and a solution to improve the SNR (signal-to-noise ratio)
In our example the statement "Most functions are information lossy" applies because the output symbol y2 receives inputs from both x0 and x1, and thus confounds the input.
How can this confounding of the input be minimized? One solution is normalization to improve the SNR (signal-to-noise ratio).
In our example, N = {−1, 0, +1} and p(N) = {0.1, 0.8, 0.1}. Keeping p(N) = {0.1, 0.8, 0.1} unchanged, N = {−1, 0, +1} is normalized to
Thus the system with normalized noise results in
whose directed graph is
Therefore we have
Notice that H(X, N) = H(Y). This is because
H(X, N) = H(C) = H(X) + H(N) = 1 + 0.921928 = 1.921928
(X and N are independent, so their joint entropy is the sum of the individual entropies).
From the above, for this case we get I(X; Y) = 1.
Therefore the information loss from the original symbol X, given the output Y, is
H(X|Y) = H(X) − I(X; Y) = 1 − 1 = 0
Hence there are no longer any confounding symbols.
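As an end-to-end check of the normalized system, the sketch below redoes the entropy bookkeeping. It assumes only what the example states: p(X) = {0.5, 0.5}, p(N) = {0.1, 0.8, 0.1}, and that after normalization every (x, η) pair lands on a distinct output symbol; the particular normalized noise values (in the figure above) are not needed for this check:

```python
from itertools import product
from math import log2

def H(dist):
    """Shannon entropy (in bits) of a list of probabilities."""
    return -sum(p * log2(p) for p in dist if p > 0)

p_X = [0.5, 0.5]        # p(x0), p(x1), as before
p_N = [0.1, 0.8, 0.1]   # noise probabilities, kept unchanged

# After normalization each (x, η) pair maps to a distinct output symbol,
# so the output marginals are just the six joint probabilities p(x)·p(η).
p_Y = [px * pn for px, pn in product(p_X, p_N)]

H_XN = H(p_X) + H(p_N)   # H(X, N) = H(X) + H(N): X and N are independent
H_Y = H(p_Y)
I_XY = H_Y - H(p_N)      # I(X;Y) = H(Y) − H(Y|X), with H(Y|X) = H(N)

print(f"H(X,N) = {H_XN:.6f} bits")           # 1.921928
print(f"H(Y)   = {H_Y:.6f} bits")            # 1.921928 → no entropy is lost by f
print(f"I(X;Y) = {I_XY:.6f} bits")           # 1.000000
print(f"H(X|Y) = {H(p_X) - I_XY:.6f} bits")  # 0 (up to rounding)
```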