Digital Shorthand Based Text Compression

Published by: ijcsis on Sep 01, 2014
Copyright:Traditional Copyright: All rights reserved

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 12, No. 7, July 2014
 
Digital Shorthand Based Text Compression
Yogesh Rathore
CSE, UIT, RGPV Bhopal, M.P., India
Dr. Rajeev Pandey
CSE, UIT, RGPV Bhopal, M.P., India
Manish K. Ahirwar
CSE, UIT, RGPV Bhopal, M.P., India
Abstract: With the growing demand for text transmission and storage brought about by Internet technology, text compression has gained momentum. Text is usually encoded in the American Standard Code for Information Interchange (ASCII) format, and Huffman coding or run-length coding techniques are used to compress the plain text [6][11]. We propose a new technique for plain-text compression that is inspired by the ideas of Pitman Shorthand. The technique uses a coding strategy intended to provide higher compression ratios and better security against attacks during transmission. The goal of the method is a transformation that yields greater compression and additional security [11].
The basic idea of compression is to transform text into an intermediate form that can be compressed more efficiently and encoded more securely, exploiting the natural redundancy of the language in creating this transformation.
Keywords: Compression; Encoding; REL; RLL; Huffman; LZ; LZW; Pitman Shorthand
I. INTRODUCTION
Data compression is a method of reducing the size of information to be stored or transmitted through a network. Nearly 70-80% of Internet users send and receive text-based documents. There is a growing demand for fast data transmission, which compression helps to meet. Many techniques are already available for reducing text to a compressed format [4]. Most of these techniques operate on text in ASCII (American Standard Code for Information Interchange) format. ASCII is a 7-bit code, but each character is typically stored in an 8-bit byte. ASCII is a well-defined set of codes that is universally accepted. Text compression techniques have to be context dependent. In the Huffman coding method [4], the input text is scanned once from beginning to end and the frequency of occurrence of each character is found (a histogram). A new coding scheme is then built: frequently appearing characters receive codes with fewer bits, and rarely appearing characters are mapped to codes with more bits. Other text compression methods include arithmetic coding, the Burrows-Wheeler transform, and LZW coding [9].
Pitman Shorthand [1] is a method of writing normally practiced by stenographers to take dictation at speaking speed [2][3]. The character set of English, or of any other language, cannot be used to take notes at such speeds. Pitman Shorthand is a proven solution for this requirement. The method uses special graphical symbols to represent the phonetic composition of the dictated text over a certain interval (perhaps 500 milliseconds). This shorthand representation is itself a compressed and encrypted form of the English text. This inspired us to extend the concept of Pitman Shorthand [1] to compress plain English text. In this research, a new set of codes is defined, and these codes are used instead of graphic symbols. Compression also serves the purpose of encryption.
II. COMPRESSION & DECOMPRESSION
Compression is a technology by which the size of one or more files or directories is reduced so that they are easier to handle. The goal of compression is to reduce the number of bits needed to represent data and to decrease the TRM. Compression is achieved by encoding the data, and the data is decompressed to its original form by decoding. Compression increases the capacity of a communication line by transmitting the compressed file. Commonly used compressed files have extensions such as .sit, .tar, and .zip. There are two main types of data compression: lossy and lossless.
A. Lossless Compression Techniques
93http://sites.google.com/site/ijcsis/ ISSN 1947-5500
 
Lossless compression techniques reconstruct the original data from the compressed file without any loss of information [17]. The data therefore does not change during the compression and decompression processes. Lossless compression techniques are used to compress text, images, medical images preserved for legal reasons, computer executable files, and so on [9][15].
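The lossless property can be illustrated with a short round trip. The sketch below uses Python's standard zlib module (DEFLATE) purely as an example of an existing lossless method; the function name round_trip and the sample text are assumptions made for illustration and are not part of the proposed technique.

```python
import zlib


def round_trip(data: bytes) -> bytes:
    """Compress data with zlib (DEFLATE) and decompress it again."""
    compressed = zlib.compress(data)
    return zlib.decompress(compressed)


# Illustrative sample text (an assumption, not data from the paper).
original = b"text compression reduces the bits needed to store text. " * 10

# A lossless method must give back exactly the bytes that went in.
print(round_trip(original) == original)

# Repetitive text should also come out smaller after compression.
print(len(zlib.compress(original)) < len(original))
```

A lossy method would fail the first check, since its decoder returns only an approximation of the input.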
B. Lossy Compression Techniques
Lossy compression techniques reconstruct the original message with some loss of information. The decoding process cannot recover the original message exactly; decompression yields only a close approximation. This may be acceptable when the discarded data lies in ranges that humans cannot perceive. Such techniques are used for multimedia audio, video, and images to achieve more compact compression [7][8].
III. INTRODUCTION TO SHORTHAND METHOD
Shorthand is an abbreviated symbolic writing method that increases the speed and brevity of writing compared with the normal method of writing a language. The process of writing in shorthand is called stenography, from the Greek stenos (narrow) and graphē or graphie (writing). It has also been called brachygraphy, from the Greek brachys (short), and tachygraphy, from the Greek tachys (swift, speedy), depending on whether compression or speed of writing is the goal. Fig. 1 shows some English statements written in shorthand [1][2][3][16].
Fig. 1. Pitman shorthand sentences for some English statements
IV. PROPOSED APPROACH
The basic philosophy of compression is to transform text into an intermediate form that can be compressed more efficiently and encoded more securely, exploiting the natural redundancy of the language in making this transformation [19]. The frequency of occurrence of each word is found. A new coding scheme is then applied: frequently appearing words receive codes with fewer special characters, and rarely appearing words are mapped to codes with more special characters. The following example shows some transformations.
The = ! said = * And = light = : God = >)
The algorithm we are developing is a three-step process:
Step1: Make a table.
Step2: Encode the input text data.
Step3: Extra compression using an existing method.
Step1: Make a Table
1. Read all words one by one from the input files and put them in a table.
2. If a word is already in the table, increment its number of occurrences by one; otherwise add it to the table and set its number of occurrences to one.
3. Sort the table by frequency of occurrence in descending order.
4. Start assigning codes using the following method:
i). Give each of the first 153 words a code of one character, taken from the ASCII characters.
ii). Give each of the remaining words a code that is a permutation of two of the ASCII characters (in the range 33 to 64 and 128 to 248), taken in order. If any words remain, give each a permutation of three of the ASCII characters and finally, if required, a permutation of four characters.
5. Create a new table containing only the words and their codes. Store this table as the dictionary in a file.
6. Stop.
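The table-building step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Java implementation: words are taken to be whitespace-separated tokens, "permutation" is read as every ordered string over the code alphabet (repeated characters allowed, via itertools.product), and the names ALPHABET, code_stream and build_table are hypothetical. Character codes 33 to 64 and 128 to 248 together give 32 + 121 = 153 symbols, which matches the 153 one-character codes in the text.

```python
from collections import Counter
from itertools import product

# Code alphabet from the text: character codes 33-64 and 128-248 (153 symbols).
ALPHABET = [chr(c) for c in range(33, 65)] + [chr(c) for c in range(128, 249)]


def code_stream():
    """Yield codes of length 1, then 2, 3 and 4 over ALPHABET, in order."""
    for length in range(1, 5):
        for combo in product(ALPHABET, repeat=length):
            yield "".join(combo)


def build_table(text: str) -> dict:
    """Count words, rank them by descending frequency, and assign codes."""
    counts = Counter(text.split())  # assumption: whitespace tokenization
    # most_common() sorts by descending count; ties keep first-seen order.
    ranked = [word for word, _ in counts.most_common()]
    return dict(zip(ranked, code_stream()))


table = build_table("the cat and the dog and the bird")
print(table["the"])  # the most frequent word gets the first code, "!"
```

If strict permutations without repeated characters are intended, itertools.permutations could be swapped in for itertools.product; the source does not settle this point.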
Step2: Encode the input text data
1. While the input file is not empty:
i. Read the characters of the next token from the input file.
ii. If the token is longer than one character, then
a. Search for the token in the table. If it is not found, write the token as such to the output file. Else, write the corresponding code word to the output file.
iii. Else, write the token as such to the output file.
2. Rename the output file.
3. Exit.
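The encoding loop can be sketched in a few lines of Python. This is an illustrative reading of the steps, assuming tokens are separated by single spaces and that the table is an in-memory dictionary; the function name encode and the sample table (built from three of the sample codes listed in Section IV) are hypothetical.

```python
def encode(text: str, table: dict) -> str:
    """Replace each multi-character token found in the table with its code."""
    out = []
    for token in text.split(" "):
        if len(token) > 1 and token in table:
            out.append(table[token])  # known word: write its code word
        else:
            out.append(token)  # single character or unknown word: write as is
    return " ".join(out)


# Hypothetical table using three of the sample codes from Section IV.
sample_table = {"The": "!", "said": "*", "light": ":"}
encoded = encode("The teacher said a light word", sample_table)
print(encoded)  # ! teacher * a : word
```

The sketch does not address how a decoder distinguishes a code from a literal token that happens to look like one; the steps in the text do not describe that, so it is left open here.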
 Step3: Extra Compression by using existing method.
We use Gzip for the extra compression.
V. PERFORMANCE ANALYSIS
The method is implemented in Java, and it is tested mainly with three different types of text: running text (which is normally used in e-mails), addresses, and bullet text. The performance of the algorithm for the three types of text is shown in the table. Text rich in grammalogues gives the highest compression.
Fig. 2. First window of the software. This is the first step of our program: it shows a menu for compression and decompression, from which we first choose the compression option.
Fig. 3. Selecting the file to compress. In this step the program selects a file for compression. The window shows the size of the selected file in black.