
Lossless Compression of JPEG and GIF Files through Lexical Permutation Sorting with Greedy Sequential Grammar Transform based Compression

Sajib Kumar Saha, Mrinal Kanti Baowaly, Md. Rafiqul Islam, Md. Masudur Rahaman

Khulna University/CSE, Khulna, Bangladesh
e-mail: to_sajib_cse@yahoo.com / tosajib@gmail.com, mrinalbaowaly@yahoo.com, dmri1978@yahoo.com, masud_cse02@yahoo.com

Abstract— This paper provides a way for lossless compression of color images through lexical permutation sorting (LPS) and greedy sequential grammar transform based compression. The proposed model exploits the advantages of lexical permutation sorting on color images to produce permuted data; greedy sequential grammar transform based compression, which is basically a text compression technique, can then be applied easily to that permuted data. For comparison, we have taken Inversion Coding of Burrows Wheeler Compression (BWIC), Burrows Wheeler Compression (BWC), and the model proposed at ICCIT 2006 in the paper 'A New Approach for Lossless Compression of JPEG and GIF Files Using Bit Reduction and Greedy Sequential Grammar Transform' [5].

Keywords— LPS, BWC, BWIC.

I. INTRODUCTION

High-quality color images with high resolution are popular, but they occupy huge memory space. Color image compression is therefore an important technique to reduce the image space while retaining high image quality. In this paper a compression technique for color images is proposed based on LPS and the greedy sequential grammar transform. When the underlying data to be transmitted is a permutation Π, the LPS algorithm generates a cyclic group of order n with Ф(n) generators [3], where Ф(n) is Euler's Ф function of n. Among these Ф(n) possibilities, one or more choices may be cheaper to communicate than the original permutation (whereas the Burrows Wheeler Transform (BWT) produces only one permutation). When the greedy sequential grammar transform algorithm then works on that permuted data, the produced grammar becomes short.

II. LITERATURE SURVEY

A. Lexical Permutation Sorting

Before discussing the theoretical basis of LPS, we begin this section with an example.

Let p = [3,1,5,4,2] be a given permutation. Construct the matrix

N =
    3 1 5 4 2
    1 5 4 2 3
    5 4 2 3 1
    4 2 3 1 5
    2 3 1 5 4
Its successive rows are consecutive cyclic left-shifts of the sequence p. Let F be the first, S the second, and L the last column of N. By sorting the rows of N lexically, we transform it into
N′ =
    1 5 4 2 3
    2 3 1 5 4
    3 1 5 4 2
    4 2 3 1 5
    5 4 2 3 1
This amounts to sorting N with respect to its first column, i.e., applying a row permutation to N so that its first column becomes (1, 2, 3, 4, 5)^T. The original sequence p appears in the 3rd row of N′. If the transmitter transmits the pair (i, S)

or (i, L), where i is the row of N′ in which p appears, then the receiver can reconstruct the original sequence p uniquely. For example, if (i, S) is transmitted, the receiver constructs the original sequence p by using the following procedure, as described in [3]:

Procedure
    p[1] = i;
    for (j = 2; j <= n; j++)
        p[j] = S[p[j-1]];

Or, if (i, L) is transmitted, the receiver constructs the original sequence p by using the following procedure [3]:

Procedure
    p[n] = L[i];
    for (j = 1; j <= n-1; j++)
        p[n-j] = L[p[n-j+1]];

Of course, once we realize that L is the inverse of S as a permutation, the second procedure is seen to be equivalent to the one using S.
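To make the example concrete, the following Python sketch (our illustration, not code from [3]) builds N and N′ for p = [3,1,5,4,2] and checks both reconstruction procedures; lists are 0-based in the code, while the text above uses 1-based indices.

    # Build N (cyclic left-shifts of p) and its lexically sorted form N'.
    p = [3, 1, 5, 4, 2]
    n = len(p)
    N = [p[k:] + p[:k] for k in range(n)]
    Nprime = sorted(N)

    i = Nprime.index(p) + 1          # 1-based row of N' containing p (here 3)
    S = [row[1] for row in Nprime]   # second column of N'
    L = [row[-1] for row in Nprime]  # last column of N'

    # From (i, S): p[1] = i, then p[j] = S[p[j-1]].
    q = [i]
    for _ in range(n - 1):
        q.append(S[q[-1] - 1])       # "- 1" converts to 0-based indexing
    assert q == p

    # From (i, L): p[n] = L[i], then p[n-j] = L[p[n-j+1]], built back to front.
    r = [L[i - 1]]
    for _ in range(n - 1):
        r.append(L[r[-1] - 1])
    r.reverse()
    assert r == p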

More generally, suppose that A is an alphabet of n symbols with a linear ordering. If Y is a data string with elements in A, we denote by N(Y) the n × n matrix whose i-th row is Y^(i), and by N′(Y) the matrix obtained by lexically ordering the rows of N(Y).

Now, according to lemma 3.1 in [3], let p be a permutation of degree n given in Cartesian form. Construct an n × n matrix N whose first row is p and each of whose subsequent rows is a left cyclic shift of the previous row. If Π_j is the j-th column of N, so that N = [Π_1, Π_2, …, Π_n], then the result of lexically ordering the rows of N is the matrix N′ = [Π_1^{-1}Π_1, Π_1^{-1}Π_2, …, Π_1^{-1}Π_n], where Π_1^{-1}Π_j is the j-th column of N′ in Cartesian form.

When we need to emphasize the dependency of N and N′ on the input data permutation p, we write N(p) and N′(p) for N and N′, respectively. We continue to assume that p is a given permutation as in lemma 3.1 in [3].

Now, according to theorem 3.1, if l = Π_1^{-1}Π_n and Π = l^{-1} = Π_n^{-1}Π_1, then p(i+1) = Π(p(i)). Knowledge of l = Π_1^{-1}Π_n and p(1) therefore allows us to recover p completely.
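As a quick check of this recovery rule on the running example, here is a small sketch of ours (composition is read left to right, i.e., (στ)(i) = τ(σ(i)), which matches the column identities above):

    # Recover p from l = Π_1^{-1}Π_n and p(1).
    p = [3, 1, 5, 4, 2]
    n = len(p)
    N = [p[k:] + p[:k] for k in range(n)]

    pi1 = [row[0] for row in N]    # Π_1, the first column of N
    pin = [row[-1] for row in N]   # Π_n, the last column of N

    def inverse(s):                # inverse of a 1-based permutation list
        return [s.index(v) + 1 for v in range(1, len(s) + 1)]

    l = [pin[j - 1] for j in inverse(pi1)]   # l = Π_1^{-1}Π_n
    Pi = inverse(l)                          # Π = l^{-1}

    q = [p[0]]                               # p(1) is transmitted
    for _ in range(n - 1):
        q.append(Pi[q[-1] - 1])              # p(i+1) = Π(p(i))
    assert q == p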

Let the matrix N′ = N′(p) = (t_{i,j}). Note that if we interpret the second column of N′ as a permutation, we get

    Θ = ( 1         2         …    n
          t_{1,2}   t_{2,2}   …    t_{n,2} ),

i.e., Θ maps each index i to t_{i,2}.

The image under Θ of any index j can be found by taking any row and looking at the element that follows j in that row. Since the rows are cyclic shifts of each other, it does not matter at which row we look. In particular, we could just use the first row, i.e., we have that Θ = (1, t_{1,2}, t_{1,3}, …, t_{1,n}) in cycle notation. Hence Θ is a cycle of length n. Moreover, it is clear that the third column, interpreted as a permutation, is simply Θ^2, and in general the k-th column is Θ^{k-1}. According to proposition 3.1 in [3], the columns of N′(p) form a cyclic group G of order n generated by Θ.

Let ∆ be the set of symbols of N′ which are generators of the cyclic group <Θ>; then Θ, as well as l = Θ^{-1}, can be completely specified as an integer power of any one element of ∆. There are |∆| = Ф(n) generators of the cyclic group <Θ>, where Ф(n) is Euler's Ф function of n. If n is prime, there are Ф(n) = n-1 generators. It is straightforward to obtain all elements of ∆, since Θ^k ∈ ∆ if and only if the integer k, 0 < k < n, is relatively prime to n. Hence

    ∆ = { Θ^k | gcd(k, n) = 1, 0 < k < n }.

In particular, the following procedure, as defined in [3], allows us to determine a generator ∂ of least entropy for G, as well as the integer t such that ∂^t = Θ. By a cost function in this procedure we mean a function that measures the entropy of the resulting data after decomposition.

Procedure
    C_0 = Cost(Θ); k_0 = 1;
    for (k = 2; k < n; k++)
    {
        ∂ = Θ^k;
        if (Cost(∂) < C_0)
        {
            C_0 = Cost(∂);
            k_0 = k;
        }
    }
    ∂ = Θ^{k_0};
    t = Inverse(k_0, n);
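A minimal Python sketch of this search is given below. It is our own illustration: the names power, cost, and best_generator are invented, and the cost function is a stand-in (zero-order entropy of the successive-difference sequence), since [3] leaves the cost function as a parameter of the procedure.

    from math import gcd, log2
    from collections import Counter

    def power(theta, k):
        # Θ^k as a 1-based permutation list.
        n = len(theta)
        result = list(range(1, n + 1))
        for _ in range(k):
            result = [theta[i - 1] for i in result]
        return result

    def cost(perm):
        # Stand-in cost (our assumption): entropy of successive differences.
        n = len(perm)
        diffs = [(perm[i] - perm[i - 1]) % n for i in range(1, n)]
        freq = Counter(diffs)
        return -sum(c / len(diffs) * log2(c / len(diffs))
                    for c in freq.values())

    def best_generator(theta):
        n = len(theta)
        best_cost, k0 = cost(theta), 1
        for k in range(2, n):
            if gcd(k, n) != 1:   # Θ^k is a generator only when gcd(k, n) = 1
                continue
            c = cost(power(theta, k))
            if c < best_cost:
                best_cost, k0 = c, k
        return power(theta, k0), k0

    theta = [5, 3, 1, 2, 4]      # Θ from the running example (n = 5)
    gen, k0 = best_generator(theta)
    t = pow(k0, -1, len(theta))  # ∂^t = Θ, since k0 * t ≡ 1 (mod n)

Since n = 5 is prime here, all of Θ^2, Θ^3, and Θ^4 are generators, and the search simply picks the cheapest one under the chosen cost.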

Again, according to theorem 3.2 of [3], let Y be a data string of length n from a linearly ordered alphabet A of g distinct symbols, with lexical index permutation λ = λ_Y. If N′ = N′(λ), then M′_{i,j} = Y[N′_{i,j}], where M = N(Y) and M′ = N′(Y).

Theorem 3.2 of [3] establishes the connection between LPS and BWT. When the data to be transmitted is a permutation, the LPS algorithm will in general give better results than BWT, because we are able to select the least expensive generator ∂ ∈ ∆, with the additional overhead of transmitting a single integer x, 1 ≤ x ≤ n, such that ∂^x = Θ. This amounts to an overhead bounded by (log n)/n.

B. Grammar Based Compression

Let x be a sequence from A which is to be compressed. A grammar transform converts it into an admissible grammar [1, 2] that represents x; an encoder then encodes that grammar, as shown in Fig. 1. In this paper, we are particularly interested in a grammar transform that starts from the grammar G consisting of only one production rule s_0 → x and repeatedly applies reduction rules 1–5 proposed in [1], in some order, to reduce it into an irreducible grammar G′. Such a grammar transform is called an irreducible grammar transform. To compress x, the corresponding grammar-based code then uses a zero-order arithmetic code to compress the irreducible grammar G′. After receiving the codeword of G′, one can fully recover G′, from which x can be obtained via parallel replacement. Different orders via which the reduction rules are applied give rise to different irreducible grammar transforms, resulting in different grammar-based codes.

Fig. 1: Structure of grammar-based compression (input data x → grammar transform → context-free grammar G → arithmetic coder → binary codeword).
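As a concrete illustration of an admissible grammar and parallel replacement, consider our own toy grammar for x = abababab (not an example taken from [1]): s_0 → s_1 s_1, s_1 → s_2 s_2, s_2 → a b. The following sketch expands it back into x:

    # Expand an admissible grammar by parallel replacement: substitute
    # every variable in one pass until only terminals remain.
    grammar = {
        "s0": ["s1", "s1"],
        "s1": ["s2", "s2"],
        "s2": ["a", "b"],
    }

    def expand(grammar, start="s0"):
        symbols = list(grammar[start])
        while any(s in grammar for s in symbols):
            symbols = [t for s in symbols
                       for t in (grammar[s] if s in grammar else [s])]
        return "".join(symbols)

    assert expand(grammar) == "abababab"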

III. NEW MODEL FOR LOSSLESS IMAGE COMPRESSION

The basic idea of the proposed method is to sort the input data by first applying LPS, and then to apply greedy sequential grammar transform based compression to compress the image file.

Fig. 2: Proposed model. (a) Compression: source image → lexical permutation sorting → greedy sequential grammar transform → arithmetic coder → compressed image. (b) Decompression: compressed image → arithmetic decoder → reverse greedy sequential grammar transform → reverse lexical permutation sorting → source image.

A. Proposed Compression Technique

The proposed compression algorithm consists of two phases:
1. Lexical Permutation Sorting.
2. Greedy sequential grammar transform based compression.

Greedy Grammar Transform [1]

Let x = x_1 x_2 … x_n be a sequence from A which is to be compressed. The transform parses x sequentially into non-overlapping substrings {x_1, x_2 … x_{n_2}, …, x_{n_{t-1}+1} … x_{n_t}} and sequentially builds an irreducible grammar for each prefix x_1 … x_{n_i}, where 1 ≤ i ≤ t, n_1 = 1, and n_t = n. The first substring is x_1, and the corresponding irreducible grammar G_1 consists of only one production rule s_0 → x_1. Suppose that x_1, x_2 … x_{n_2}, …, x_{n_{i-1}+1} … x_{n_i} have been parsed off and the corresponding irreducible grammar G_i for x_1 … x_{n_i} has been built. Suppose that the variable set of G_i is equal to S(j_i) = {s_0, s_1, …, s_{j_i-1}}, where j_1 = 1. The next substring x_{n_i+1} … x_{n_{i+1}} is the longest prefix of x_{n_i+1} … x_n that can be represented by s_j for some 0 < j < j_i, if such a prefix exists; otherwise, x_{n_i+1} … x_{n_{i+1}} = x_{n_i+1} with n_{i+1} = n_i + 1. If n_{i+1} - n_i > 1 and x_{n_i+1} … x_{n_{i+1}} is represented by s_j, then append s_j to the right end of G_i(s_0); otherwise, append the symbol x_{n_i+1} to the right end of G_i(s_0). The resulting grammar is admissible, but not necessarily irreducible. Apply reduction rules 1–5 proposed in [1] to reduce the grammar to an irreducible grammar G_{i+1}; then G_{i+1} represents x_1 … x_{n_{i+1}}. Repeat this procedure until the whole sequence is processed; the final irreducible grammar G_t then represents x. Since only one symbol from S(j_i) ∪ A is appended to the end of G_i(s_0), not all reduction rules can be applied to get G_{i+1}. Furthermore, the order via which the reduction rules are applied is unique.
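The sketch below (ours) illustrates this parsing loop in Python. It is deliberately simplified: instead of the full reduction rules 1–5 of [1], it applies only a single digram-style reduction (replace a repeated adjacent pair of symbols by a variable), which is enough to show how variables arise and how the parser reuses them; it is not the complete irreducible grammar transform.

    def greedy_transform(x):
        s0 = []              # right-hand side of the rule s0 -> ...
        rules = {}           # variable (int) -> its two-symbol right side
        expansion = {}       # variable (int) -> substring of x it derives
        nxt = 1              # index of the next variable to create
        i = 0
        while i < len(x):
            # Next phrase: longest prefix of x[i:] derivable from a
            # variable, or the single terminal x[i] if none exists.
            sym, step = x[i], 1
            for v, s in expansion.items():
                if len(s) > step and x.startswith(s, i):
                    sym, step = v, len(s)
            s0.append(sym)
            i += step
            if len(s0) < 2:
                continue
            pair = s0[-2:]
            for v, rhs in rules.items():     # reuse a matching rule
                if rhs == pair:
                    s0[-2:] = [v]
                    break
            else:                            # or create one if the pair
                for j in range(len(s0) - 3): # repeats earlier in s0
                    if s0[j:j + 2] == pair:
                        v, nxt = nxt, nxt + 1
                        rules[v] = pair[:]
                        expansion[v] = "".join(
                            expansion[t] if isinstance(t, int) else t
                            for t in pair)
                        s0[j:j + 2] = [v]
                        s0[-2:] = [v]
                        break
        return s0, rules

    s0, rules = greedy_transform("abababab")
    # s0 == [2, 2]; rules == {1: ['a', 'b'], 2: [1, 1]}

On "abababab" this yields the same grammar as the toy example after Fig. 1, up to renaming of the variables.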

Encoding Algorithm

In the sequential algorithm [1], we encode the data sequence x sequentially by using a zero-order arithmetic code with a dynamic alphabet to encode the sequence of parsed phrases x_1, x_2, …. Specifically, we associate each symbol β ∈ S ∪ A with a counter c(β). Initially, c(β) is set to 1 if β ∈ A and to 0 otherwise. At the beginning, the alphabet used by the arithmetic code is A. The first parsed phrase x_1 is encoded by using the probability c(x_1) / Σ_{β ∈ A} c(β); then the counter c(x_1) increases by 1. Suppose that x_1, x_2, …, x_{n_i} have been parsed off and encoded and that all corresponding counters have been updated. Let G_i be the corresponding irreducible grammar for x_1 … x_{n_i}, and assume that the variable set of G_i is equal to S(j_i) = {s_0, s_1, …, s_{j_i-1}}, where j_1 = 1. Let x_{n_i+1}, …, x_{n_{i+1}} be parsed off as in our irreducible grammar transform and represented by β ∈ {s_1, …, s_{j_i-1}} ∪ A. Encode β and update the relevant counters according to the following steps:

Step 1: The alphabet used at this point by the arithmetic code is {s_1, …, s_{j_i-1}} ∪ A. Encode x_{n_i+1}, …, x_{n_{i+1}} (i.e., β) by using the probability

    c(β) / Σ_{α ∈ S(j_i) ∪ A} c(α).    (1)

Step 2: Increase c(β) by 1.

Step 3: Get G_{i+1} from the appended G_i as in our irreducible grammar transform.

Step 4: If j_{i+1} > j_i, i.e., G_{i+1} includes the new variable s_{j_i}, increase the counter c(s_{j_i}) by 1.

Repeat this procedure until the whole sequence is processed and encoded. Note that c(s_0) is always 0; thus the summation over S(j_i) ∪ A in (1) is equivalent to the summation over {s_1, …, s_{j_i-1}} ∪ A. From Step 4, it follows that each time a new variable s_{j_i} is introduced, its counter increases from 0 to 1. Therefore, in the entire encoding process, there is no zero-frequency problem. Also, in the sequential algorithm [1], the parsing of phrases, encoding of phrases, and updating of irreducible grammars are all done in one pass. Clearly, after receiving enough codebits to recover the symbol β, the decoder can perform the update operation in exactly the same way as the encoder.
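The counter bookkeeping of Steps 1–4 can be sketched as follows (our illustration; the arithmetic coder itself is abstracted away, and we only compute the probability from (1) that would be fed to it):

    from collections import defaultdict

    A = ["a", "b"]              # terminal alphabet of this toy example
    c = defaultdict(int)        # counters c(β); unseen symbols count 0
    for a in A:
        c[a] = 1                # initially c(β) = 1 for β in A
    variables = []              # s1, ..., s_{j_i - 1}; s0 is never counted

    def probability(beta):
        # Probability used by the zero-order arithmetic code, as in (1).
        total = sum(c[s] for s in variables) + sum(c[a] for a in A)
        return c[beta] / total

    def after_phrase(beta, new_variable=None):
        c[beta] += 1            # Step 2
        if new_variable is not None:
            variables.append(new_variable)
            c[new_variable] += 1   # Step 4: counter goes from 0 to 1

    p1 = probability("a")       # 1/2: c(a) = c(b) = 1
    after_phrase("a")
    p2 = probability("b")       # 1/3: now c(a) = 2, c(b) = 1
    after_phrase("b", new_variable="s1")
    p3 = probability("s1")      # 1/5: c(a) = 2, c(b) = 2, c(s1) = 1

Because a new variable enters with count 1 immediately (Step 4), it never has to be encoded with probability zero, which is the absence of the zero-frequency problem noted above.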

B. Proposed Decompression Technique

The proposed decompression algorithm consists of two phases:
1. Greedy sequential grammar transform based decompression.
2. Reverse Lexical Permutation Sorting.

IV. EXPERIMENTAL RESULTS

We have taken some JPEG and GIF images as sample input to our proposed model. The images used in the experiment are shown in Fig. 3 and Fig. 5. It has been found that the quality of the final decompressed image is exactly the same as that of the original image, as shown in Fig. 4 and Fig. 6.

Fig. 3: Some sample JPEG input files: (a) House, (b) Man 1, (c) Man 2, (d) Tree.

Table I: Comparison with BWIC, BWC, the model proposed in [5], and the proposed model for JPEG files

File Name   Original Size   Using BWIC   Using BWC   Using the Model    Using the Proposed
            (bytes)         (bytes)      (bytes)     Proposed in [5]    Model
                                                     (bytes)            (bytes)
House        51,202          51,214       51,195      51,038             49,031
Man 1        39,821          39,339       39,012      38,723             36,172
Man 2         2,149           2,201        2,157       2,159              2,141
Tree          7,494           7,588        7,498       7,452              7,141
Total       100,666         100,342       99,862      99,372             94,485

Fig. 4: Decompressed JPEG images: (a) House, (b) Man 1, (c) Man 2, (d) Tree.

Fig. 5: Some sample GIF input files: (a) Texture, (b) Advertisement, (c) Coin, (d) Woman.

Table II: Comparison with BWIC, BWC, the model proposed in [5], and the proposed model for GIF files

File Name       Original Size   Using BWIC   Using BWC   Using the Model    Using the Proposed
                (bytes)         (bytes)      (bytes)     Proposed in [5]    Model
                                                         (bytes)            (bytes)
Texture           8,714           8,364        8,566       8,579              8,157
Advertisement    65,178          64,911       64,860      64,545             58,154
Coin            137,566         132,699      135,129     132,186            119,116
Woman           164,597         160,482      162,979     159,063            133,061
Total           376,055         366,456      371,534     364,373            318,488

Fig. 6: Decompressed GIF files: (a) Texture, (b) Advertisement, (c) Coin, (d) Woman.

Table III: Comparison of time taken by the proposed model and the model proposed in [5]
(Processor: Intel Celeron, 1.6 GHz; RAM: 256 MB; Operating system: Windows XP)

                               Compress time               Decompress time
File Name       Original Size  Model in [5]  Proposed      Model in [5]  Proposed
                (bytes)        (ms)          Model (ms)    (ms)          Model (ms)
House            51,202         874           585           354           299
Man 1            39,821         547           421           241           223
Texture           8,714         102           101            61            61
Advertisement    65,178         769           566           463           399
Coin            137,566        1912          1511           701           688
Woman           164,597        2550          1952          1550          1463

V. CONCLUSION

This paper has aimed at providing a novel approach for applying the LPS algorithm to JPEG and GIF files. Block sorting can also be used in place of LPS, as was done in the model proposed in [6], but LPS is suitable when the underlying data to be transmitted is a permutation [3], and here, for JPEG and GIF files, LPS works better than block sorting. The results have shown that the proposed method achieves a better compression ratio and takes reduced compression time.

REFERENCES

[1] En-hui Yang and Da-Ke He, "Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform—Part two: With context models," IEEE Trans. Inform. Theory, vol. 49, no. 11, November 2003.

[2] E.-h. Yang and J. C. Kieffer, "Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform—Part one: Without context models," IEEE Trans. Inform. Theory, vol. 46, pp. 755–788, May 2000.

[3] Ziya Arnavut and Spyros S. Magliveras, "Lexical Permutation Sorting Algorithm," The Computer Journal, vol. 40, no. 5, October 1997.

[4] M. Burrows and D. J. Wheeler, "A Block-sorting Lossless Data Compression Algorithm," SRC Research Report 124, Digital Systems Research Center, Palo Alto, CA, 1994. Available: gatekeeper.dec.com, /pub/DEC/SRC/research-reports/SRC-124.ps.Z.

[5] Md. Rafiqul Islam, Sajib Kumar Saha, and Mrinal Kanti Baowaly, "A New Approach for Lossless Compression of JPEG and GIF Files Using Bit Reduction and Greedy Sequential Grammar Transform," ICCIT 2006.

[6] Md. Rafiqul Islam, Sajib Kumar Saha, and Mrinal Kanti Baowaly, "A Modification of Greedy Sequential Grammar Transform based Universal Lossless Data Compression," ICCIT 2006.