Distributed Arithmetic Coding for Sources with Hidden Markov Correlation

Yong Fang and Jechang Jeong, Member, IEEE

This research was supported by the Seoul Future Contents Convergence (SFCC) Cluster established by the Seoul R&BD Program. The authors are with the Lab. of ICSP, Department of Electronics and Communications Engineering, Hanyang University, Haengdang-dong, Seongdong-gu, Seoul 133-791, Korea (e-mail: yfang79@gmail.com).

Abstract—Distributed arithmetic coding (DAC) has been shown to be effective for Slepian-Wolf coding, especially for short data blocks. In this letter, we propose to use the DAC to compress memory-correlated sources. More specifically, the correlation between the sources is modeled as a hidden Markov process. Experimental results show that the performance is close to the theoretical Slepian-Wolf limit.

Index Terms—Distributed Source Coding (DSC), Slepian-Wolf Coding (SWC), Distributed Arithmetic Coding (DAC), Hidden Markov Correlation, Forward Algorithm.

I. INTRODUCTION

We consider the problem of Slepian-Wolf Coding (SWC) with decoder Side Information (SI). The encoder compresses a discrete source X = {x_t}_{t=1}^N in the absence of Y = {y_t}_{t=1}^N, the discretely correlated SI. The Slepian-Wolf theorem states that lossless compression is achievable at rates R ≥ H(X|Y), the conditional entropy of X given Y, where both X and Y are discrete random processes [1]. Conventionally, channel codes, such as turbo codes [2] or Low-Density Parity-Check (LDPC) codes [3], are used to deal with the SWC problem.

Recently, several SWC techniques based on entropy coding have been proposed, such as Distributed Arithmetic Coding (DAC) [4, 5] and Overlapped Quasi-Arithmetic Coding (OQAC) [6]. These schemes can be seen as an extension of classic Arithmetic Coding (AC) whose principle is to encode the source X at rates H(X|Y) ≤ R < H(X) by allowing overlapped intervals. The overlapping leads to a larger final interval and hence a shorter codeword; however, ambiguous codewords are unavoidable at the same time. A soft joint decoder exploits the SI Y to decode X. Subsequently, the time-shared DAC (TS-DAC) [7] was proposed to deal with the symmetric SWC problem, and the rate-compatible DAC was proposed in [8] to realize rate-incremental SWC.

In this paper, we study how to use the DAC to compress sources with hidden Markov correlation.

II. BINARY DISTRIBUTED ARITHMETIC CODING

Let p be the bias probability of the binary source X, i.e., p = P(x_t = 1). In classic AC, each source symbol x_t is iteratively mapped onto sub-intervals of [0, 1), whose lengths are proportional to (1 − p) and p. The resulting rate is R ≥ H(X). In the DAC [4], the interval lengths are proportional to the modified probabilities (1 − p)^γ and p^γ, where

\frac{H(X|Y)}{H(X)} \le \gamma \le 1. \tag{1}

The resulting rate is R ≥ γH(X) ≥ H(X|Y). To fit the [0, 1) interval, the sub-intervals have to be partially overlapped. More specifically, the symbols x_t = 0 and x_t = 1 correspond to the intervals [0, (1 − p)^γ) and [1 − p^γ, 1), respectively. It is exactly this overlapping that leads to a larger final interval, and hence a shorter codeword. However, as a cost, the decoder cannot decode X unambiguously without Y.

To describe the decoding process, we define a ternary symbol set {0, χ, 1}, where χ represents a decoding ambiguity. Let C_X be the codeword and x̃_t be the t-th decoded symbol; then

\tilde{x}_t = \begin{cases} 0, & 0 \le C_X < 1 - p^{\gamma} \\ \chi, & 1 - p^{\gamma} \le C_X < (1-p)^{\gamma} \\ 1, & (1-p)^{\gamma} \le C_X < 1 \end{cases} \tag{2}

When the t-th symbol is decoded, if x̃_t = χ, the decoder performs a branching: two candidate branches are generated, corresponding to the two alternative symbols x_t = 0 and x_t = 1. For each new branch, its metric is updated and the corresponding interval is selected for the next iteration. To reduce complexity, every time after decoding a symbol, the decoder uses the M-algorithm to keep at most M branches with the best partial metrics, and prunes the others [4].

Note that the metric is not reliable for the very last symbols of a finite-length sequence X [5]. This problem is solved by encoding the last T symbols without interval overlapping [5]. That is, for 1 ≤ t ≤ (N − T), x_t = 0 is mapped onto [0, (1 − p)^γ) and x_t = 1 onto [1 − p^γ, 1); while for (N − T + 1) ≤ t ≤ N, x_t = 0 is mapped onto [0, 1 − p) and x_t = 1 onto [1 − p, 1).

Therefore, a binary DAC system can be described by four parameters: {p, γ, M, T}.
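As an illustration, the following Python sketch implements the decision rule (2) and the branching it triggers. The function names, and the simplified view of C_X as a single value in [0, 1) at each step (ignoring interval renormalization), are our own assumptions, not part of [4].

```python
# Illustrative sketch of the ternary decision rule (2); assumes the
# codeword value c has already been renormalized into [0, 1) for the
# current step. Names and structure are ours, not from [4].

def dac_decision(c: float, p: float, gamma: float) -> str:
    """Classify codeword value c as '0', '1', or ambiguous 'x' (chi)."""
    lo = 1 - p ** gamma      # upper end of the unambiguous '0' region
    hi = (1 - p) ** gamma    # lower end of the unambiguous '1' region
    if c < lo:
        return '0'
    elif c < hi:
        return 'x'           # overlap region: both symbols are possible
    else:
        return '1'

def branch(symbols: list) -> list:
    """On ambiguity, fork the current decoding path into two candidates."""
    return [symbols + [0], symbols + [1]]
```

Note that for γ < 1 the overlap region [1 − p^γ, (1 − p)^γ) is non-empty, while for γ = 1 it vanishes and the rule reduces to classic AC decoding.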
III. HIDDEN MARKOV MODEL AND FORWARD ALGORITHM

Let S = {s_t}_{t=1}^N be a sequence of states and Z = {z_t}_{t=1}^N be a sequence of observations. A hidden Markov process is defined by λ = (A, B, π):

A = {a_ji}: the state transition probability matrix, where a_ji = P(s_t = i | s_{t−1} = j);
B = {b_i(k)}: the observation probability distribution, where b_i(k) = P(z_t = k | s_t = i);
π = {π_i}: the initial state distribution, where π_i = P(s_1 = i).
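For concreteness, the two-state binary-output models used in Section V (Table I) fit this definition directly. The sketch below instantiates λ = (A, B, π) for model 1; the uniform initial distribution is our own assumption, since the letter does not specify π.

```python
import numpy as np

# lambda = (A, B, pi) for the 2-state, binary-output model 1 of Table I:
# {a00, a11, b0(0), b1(1)} = {0.01, 0.03, 0.99, 0.98}.
# A[j, i] = P(s_t = i | s_{t-1} = j); B[i, k] = P(z_t = k | s_t = i).
a00, a11 = 0.01, 0.03
b00, b11 = 0.99, 0.98

A = np.array([[a00, 1 - a00],
              [1 - a11, a11]])
B = np.array([[b00, 1 - b00],
              [1 - b11, b11]])
pi = np.array([0.5, 0.5])   # assumption: uniform initial distribution
```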
The aim of the forward algorithm is to compute P(z_1, ..., z_t | λ), given the observations {z_1, ..., z_t} and the model λ. Let α_t(i) be the probability of observing the partial sequence {z_1, ..., z_t} such that state s_t is i, i.e.,

\alpha_t(i) = P(z_1, \ldots, z_t, s_t = i \mid \lambda). \tag{3}

Initially, we have

\alpha_1(i) = \pi_i\, b_i(z_1). \tag{4}

For t > 1, α_t(i) can be computed through the iteration

\alpha_t(i) = \Big\{ \sum_j \big[\alpha_{t-1}(j)\, a_{ji}\big] \Big\}\, b_i(z_t). \tag{5}

Therefore,

P(z_1, \ldots, z_t \mid \lambda) = \sum_i \alpha_t(i). \tag{6}

In practice, α_t(i) is usually normalized in place at each step:

\alpha_t(i) \leftarrow \frac{\alpha_t(i)}{\delta_t}, \tag{7}

where

\delta_t = \sum_i \alpha_t(i). \tag{8}

In this case, we have

P(z_1, \ldots, z_t \mid \lambda) = \prod_{t'=1}^{t} \delta_{t'}. \tag{9}
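A compact NumPy sketch of the normalized forward recursion (4)-(9) is given below. It is our own illustrative implementation, and it accumulates log δ_t rather than the raw product (9) to avoid numerical underflow for large t.

```python
import numpy as np

def forward_log_prob(z, A, B, pi):
    """Normalized forward algorithm, eqs. (4)-(9).

    z  : observation sequence (ints indexing columns of B)
    A  : A[j, i] = P(s_t = i | s_{t-1} = j)
    B  : B[i, k] = P(z_t = k | s_t = i)
    pi : initial state distribution
    Returns log P(z_1, ..., z_t | lambda) = sum_t log(delta_t).
    """
    alpha = pi * B[:, z[0]]             # eq. (4)
    delta = alpha.sum()                 # eq. (8)
    log_prob = np.log(delta)
    alpha /= delta                      # eq. (7)
    for zt in z[1:]:
        alpha = (alpha @ A) * B[:, zt]  # eq. (5): sum_j alpha_{t-1}(j) a_ji
        delta = alpha.sum()
        log_prob += np.log(delta)
        alpha /= delta
    return log_prob                     # log of eq. (9)
```

For example, with the model-1 arrays A, B, and pi defined in Section III, forward_log_prob(z, A, B, pi) returns log P(z_1, ..., z_N | λ) for any observed binary sequence z.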
IV. DAC FOR HIDDEN MARKOV CORRELATION

Assume that the binary source X and the SI Y are correlated by Y = X ⊕ Z, where Z is generated by a hidden Markov model with parameter λ. X is encoded using a {p, γ, M, T} DAC encoder. The decoding process is very similar to that described in [4]. The only difference is that the forward algorithm is embedded into the DAC decoder, and the metric of each branch is replaced by P(z_1, ..., z_t | λ), where z_t = x_t ⊕ y_t.
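In implementation terms, each surviving branch only needs to carry its normalized forward vector and its accumulated log-metric. The sketch below shows the metric update for one hypothesized symbol; the Branch container and the function name are our own assumptions, and the arithmetic-decoder interval state is omitted for brevity.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class Branch:
    """One candidate decoding path (arithmetic-coder state omitted)."""
    symbols: list = field(default_factory=list)  # decoded x_1..x_t so far
    alpha: np.ndarray = None                     # normalized forward vector
    log_metric: float = 0.0                      # log P(z_1..z_t | lambda)

def extend(br: Branch, x_t: int, y_t: int, A, B, pi) -> Branch:
    """Extend a branch with hypothesis x_t; metric update via eqs. (4)-(9)."""
    z_t = x_t ^ y_t                              # z_t = x_t XOR y_t
    if br.alpha is None:
        alpha = pi * B[:, z_t]                   # eq. (4)
    else:
        alpha = (br.alpha @ A) * B[:, z_t]       # eq. (5)
    delta = alpha.sum()                          # eq. (8)
    return Branch(br.symbols + [x_t], alpha / delta,
                  br.log_metric + np.log(delta))
```

After each decoded symbol, the decoder keeps the M branches with the largest log_metric and prunes the rest, exactly as the M-algorithm does in the plain DAC decoder [4].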
V. EXPERIMENTAL RESULTS

We have implemented a 16-bit DAC codec system. The bias probability of X is p = 0.5. Following the recommendation of [5], we set M = 2048 and T = 15. The same 2-state (0 and 1) and 2-output (0 and 1) sources as in [9] are used in the simulations (see Table I). The length of the data block used in each test is N = 1024.

TABLE I
MODELS FOR SIMULATION

model   {a00, a11, b0(0), b1(1)}
1       {0.01, 0.03, 0.99, 0.98}
2       {0.01, 0.065, 0.95, 0.925}
3       {0.97, 0.967, 0.93, 0.973}
4       {0.99, 0.989, 0.945, 0.9895}

To achieve lossless compression, each test starts from γ = H(X|Y); since p = 0.5 implies H(X) = 1, this is the lower bound allowed by (1). If the decoding fails, we increase γ by 0.01. This process is iterated until the decoding succeeds. For each model, the results are averaged over 100 trials. The experimental results are listed in Table II.

TABLE II
EXPERIMENTAL RESULTS (RATES IN BITS/SYMBOL)

model   H(X|Y)   rate of [9]   rate of DAC
1       0.24     0.36          0.345908
2       0.52     0.67          0.648594
3       0.45     0.58          0.585645
4       0.28     0.42          0.427236
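The rate search described above is a simple loop; a sketch follows, in which try_decode is a hypothetical stand-in for one complete DAC encode/decode trial at the given γ, not an API from the letter.

```python
def minimal_rate(h_x_given_y: float, try_decode, step: float = 0.01) -> float:
    """Increase gamma from H(X|Y) in 0.01 steps until decoding succeeds.

    try_decode(gamma) -> bool stands in for one full encode/decode
    trial at rate gamma * H(X); it is hypothetical, not the letter's API.
    """
    gamma = h_x_given_y          # start at the Slepian-Wolf limit
    while not try_decode(gamma):
        gamma += step
    return gamma
```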
For comparison, Table II also includes experimental results for the same settings from [9]. In each test of [9], N = 16384 source symbols are encoded using an LDPC code. In addition, to synchronize the hidden Markov model, Nα original source symbols are sent to the decoder directly without compression.

The results show that the DAC performs similarly to, or slightly better than (for models 1 and 2), the LDPC-based approach [9]. Moreover, for hidden Markov correlation, the DAC outperforms the LDPC-based approach in two aspects:

1) The LDPC-based approach requires longer codes to achieve better performance, while the DAC is insensitive to code length [4].

2) For the LDPC-based approach, to synchronize the hidden Markov model, a certain proportion of the original source symbols must be sent to the decoder as "seeds". However, it is hard to determine α, the optimal proportion of "seeds". The results reported in [9] were obtained through an exhaustive search, which limits the approach's application in practice.

VI. CONCLUSION

This paper studies the compression of sources with hidden Markov correlation using the DAC. The forward algorithm is incorporated into the DAC decoder. The results are similar to those of the LDPC-based approach. Compared to the LDPC-based approach, the DAC is more suitable for practical applications.
REFERENCES

[1] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471-480, Jul. 1973.
[2] J. Garcia-Frias, "Compression of correlated binary sources using turbo codes," IEEE Commun. Lett., vol. 5, no. 10, pp. 417-419, Oct. 2001.
[3] A. Liveris, Z. Xiong, and C. Georghiades, "Compression of binary sources with side information at the decoder using LDPC codes," IEEE Commun. Lett., vol. 6, no. 10, pp. 440-442, Oct. 2002.
[4] M. Grangetto, E. Magli, and G. Olmo, "Distributed arithmetic coding," IEEE Commun. Lett., vol. 11, no. 11, pp. 883-885, Nov. 2007.
[5] M. Grangetto, E. Magli, and G. Olmo, "Distributed arithmetic coding for the asymmetric Slepian-Wolf problem," IEEE Trans. Signal Process. (submitted), preprint available at www1.tlc.polito.it/sas-ipl/Magli/index.php.
[6] X. Artigas, S. Malinowski, C. Guillemot, and L. Torres, "Overlapped quasi-arithmetic codes for distributed video coding," in Proc. IEEE International Conference on Image Processing (ICIP), San Antonio, TX, USA, Sep. 2007.
[7] M. Grangetto, E. Magli, and G. Olmo, "Symmetric distributed arithmetic coding of correlated sources," in Proc. IEEE International Workshop on Multimedia Signal Processing (MMSP), Crete, Greece, Oct. 2007.
[8] M. Grangetto, E. Magli, R. Tron, and G. Olmo, "Rate-compatible distributed arithmetic coding," IEEE Commun. Lett., vol. 12, no. 8, pp. 575-577, Aug. 2008.
[9] J. Garcia-Frias and W. Zhong, "LDPC codes for compression of multi-terminal sources with hidden Markov correlation," IEEE Commun. Lett., vol. 7, no. 3, pp. 115-117, Mar. 2003.