0 views

Huffman coding

- Final Year Matlab Project List With Abstract 2012
- Improving LZW Image Compression
- Real-Time Voice Streaming Over IEEE 802.15.4
- Android Studio New Media Fundamentals - Content Production of Digital Audio-Video, Illustration and 3D Animation.pdf
- Integer Wavelet Transform and Predictive Coding Technique for Lossless Medical Image Compression
- Separación De Audio De Mezclas Comprimidas Lineales
- Power Control and Radio Link Supervision
- Design of a Python-subset Compiler in Rust targeting ZPAQL
- A Pixel Domain Video Coding based on Turbo code and Arithmetic code
- SPEECH COMPRESSION TECHNIQUES: A REVIEW
- Atpg Low Power Ppt
- Compression Tech
- TreeApps
- CS502QuizSolvedForFinalTermPreparation (1)
- Etreppid Life and Times
- Industrial Robotics
- 2-D ECG Compression Method Based on Wavelet Transform and Modified SPIHT
- DATA COMPRESSION
- Digital Video Watermarking based on candidates I-Frames.pdf
- 05Compression for IR

You are on page 1of 11

BS(SE)-V (Evening)

Contents

1. Introduction .......................................................................................................................................... 3

1.1 background ......................................................................................................................................... 3

2. LITERATURE REVIEW ................................................................................................................................ 3

2.1 GENERAL IDEA OF DATA COMPRESSION ........................................................................................... 3

2.1.1 SCHEMA OF DATA COMPRESSION .............................................................................................. 3

1. Spatial redundancy........................................................................................................................... 4

2. Temporal redundancy. ..................................................................................................................... 4

2.2.2 CODING AND ALGORITHM .............................................................................................................. 4

3. METHODOLOGY........................................................................................................................................ 5

4. IMPLEMENTATION RESULTS AND DISCUSSION....................................................................................... 5

4.1 IMPLEMENTATION DETAILS OF HUFFMAN CODING ......................................................................... 5

4.1.1 IMPLEMENTATION PREPARATION.............................................................................................. 5

5. CONCLUSION .......................................................................................................................................... 11

5. BIBLIOGRAPHY........................................................................................................................................ 11

1. Introduction

1.1 background

This project was conducted for Massey university 159.333 Individual

Programming Project paper. The main purpose of this proposed project is about

researching and implementing data compression algorithms. The objectives of the

project are:

1. Research and apply data compression algorithms involves C++ and data

structures.

2. Implement compression algorithms and compare effectiveness of

implementations.

3. Make extensions on current methodological approaches and test designs of

developed algorithms. In this project, two lossless data compression algorithms

were implemented and discussed. Lossless data compression uses statistical

redundancy to represent data with lossless information

2. LITERATURE REVIEW

In this section some compression-related research in the areas of information

theory and data compression algorithm will be discussed. For this project, entropy

and lossless compression algorithm are the main topics to be discussed.

Therefore, a brief introduction of data compression and lossless compression will

be discussed accordingly.

2.1 GENERAL IDEA OF DATA COMPRESSION

2.1.1 SCHEMA OF DATA COMPRESSION

In computer science and communication theory, data compression or source

coding is the art of encoding original data with specific mechanism to fewer bits

(or other information related units). For example, if the word “compression” in

previous sentence is encoding to “comp”, then this sentence can be stored in a

text file with less data. A popular living example is ZIP file format, which is widely

used in PC. It not only provides compress functions, but also offers archive tools

(Archiver) that can store many files in a same root file.

1. Spatial redundancy.

Data compression is possible because most real-world data contain large amount

of statistical redundancy. Redundancy can exist in various forms. For example, in a

text file, the letter “e” in English appears more frequently than the letter “q”. The

possibility is dramatically reduced if the letter “q” followed by “z”. There also

exists data redundancy in multimedia information. Basically, data redundancy has

following types: 1. Spatial redundancy. (The static architectural background, blue

sky and lawn of an image contain many same pixels. If such image is stored pixel

by pixel, then it will waste a lot of space. )

2. Temporal redundancy.

(In television, animation images between adjacent frames likely contain the same

background, only the position of moving objects have some slightly changes. So it

is only worthy to store discrepant portion of the adjacent frame. ) 3. Structure

redundancy 4. Knowledge redundancy 5. Information entropy redundancy. (It is

also known as encoding redundancy, which is the entropy that data carry.) Hence,

one aspect of compression is redundancy removal. Characterization of

redundancy involves some form of modeling. So for the previous English letter

redundancy example, the model of this redundancy is English text. This process is

also known as de-correlation. After the modelling process the information needs

to be encoded into a binary bits representation. This encoding part involves

coding algorithms which are applied

2.2.2 CODING AND ALGORITHM

entropy principle). Information entropy is the average amount of information

(uncertainty Metric) of the source.

The common entropy coding are:

header Shannon–Fano coding At about 1960 Claude E. Shannon (MIT) and Robert

M. Fano (Bell Laboratories) had developed a coding procedure for constructing a

binary code treeof prefix code according to a set of symbols and their

probabilities.

Huffman coding In 1952, HUffman had proposed a coding method completely in

accordance with the probability of character frequency to construct optimal

prefix code.

Arithmetic coding A coding method that directly encodes the whole input

message to a decimal number n(0.0 ≤ n < 1.0

3. METHODOLOGY

This section will mainly present various methods which were used in this project.

Implementations and experimental results will be discussed in next sections. All

the methods were conducted for testing and making extensions on current data

compression algorithms, such as Huffman Algorithm. A series of programs were

written in C++ to make adaptation of already available device, algorithm and

design, in order to research and improve the productivity of current compression

algorithms. These programs are tested using Visual Studio 2017 IDE; on an Intel

Core i7-4790k 2.7GHz equipped PC machine under Windows 10 operation system.

4.1 IMPLEMENTATION DETAILS OF HUFFMAN CODING

4.1.1 IMPLEMENTATION PREPARATION

Huffman coding is basically an algorithm of representing data in another representation to reduce the

information redundancy. The data is encoded in binary.

HuffManCoding.h File

#include <iostream>

#include <cstdlib>

using namespace std;

#define MAX_TREE_HT 100

struct MinHeapNode{

char data;

unsigned freq;

MinHeapNode *left, *right;

};

struct MinHeap{

unsigned size;

unsigned capacity;

MinHeapNode** array;

};

class HuffManCoding

{

public:

HuffManCoding();

~HuffManCoding();

MinHeapNode* newNode(char data, unsigned freq);

MinHeap* createMinHeap(unsigned capacity);

void swapMinHeapNode(MinHeapNode** a,

MinHeapNode** b);

void minHeapify(MinHeap* minHeap, int idx);

int isSizeOne(MinHeap* minHeap);

MinHeapNode* extractMin(MinHeap* minHeap);

void insertMinHeap(MinHeap* minHeap,

MinHeapNode* minHeapNode);

void buildMinHeap(MinHeap* minHeap);

void printArr(int arr[], int n);

int isLeaf(MinHeapNode* root);

MinHeap* createAndBuildMinHeap(char data[], int freq[], int size);

MinHeapNode* buildHuffmanTree(char data[], int freq[], int size);

void printCodes(MinHeapNode* root, int arr[], int top);

void HuffmanCodes(char data[], int freq[], int size);

};

HuffManCoding.cpp

#include "HuffManCoding.h"

HuffManCoding::HuffManCoding()

{

}

HuffManCoding::~HuffManCoding()

{

}

MinHeapNode* HuffManCoding::newNode(char data, unsigned freq)

{

MinHeapNode* temp

= new MinHeapNode();

temp->left = temp->right = NULL;

temp->data = data;

temp->freq = freq;

return temp;

}

MinHeap* minHeap

= new MinHeap();

minHeap->size = 0;

minHeap->capacity = capacity;

minHeap->array

= new MinHeapNode*[capacity];

return minHeap;

}

MinHeapNode** b)

{

MinHeapNode* t = *a;

*a = *b;

*b = t;

}

{

int smallest = idx;

int left = 2 * idx + 1;

int right = 2 * idx + 2;

freq < minHeap->array[smallest]->freq)

smallest = left;

freq < minHeap->array[smallest]->freq)

smallest = right;

if (smallest != idx) {

swapMinHeapNode(&minHeap->array[smallest],

&minHeap->array[idx]);

minHeapify(minHeap, smallest);

}

}

{

return (minHeap->size == 1);

}

{

MinHeapNode* temp = minHeap->array[0];

minHeap->array[0]

= minHeap->array[minHeap->size - 1];

--minHeap->size;

minHeapify(minHeap, 0);

return temp;

}

MinHeapNode* minHeapNode)

++minHeap->size;

int i = minHeap->size - 1;

i = (i - 1) / 2;

}

minHeap->array[i] = minHeapNode;

}

int n = minHeap->size - 1;

int i;

minHeapify(minHeap, i);

}

{

int i;

for (i = 0; i < n; ++i)

cout << arr[i];

}

int HuffManCoding::isLeaf( MinHeapNode* root)

}

minHeap->array[i] = newNode(data[i], freq[i]);

minHeap->size = size;

buildMinHeap(minHeap);

return minHeap;

}

{

MinHeapNode *left, *right, *top;

while (!isSizeOne(minHeap)) {

left = extractMin(minHeap);

right = extractMin(minHeap);

top->left = left;

top->right = right;

insertMinHeap(minHeap, top);

}

return extractMin(minHeap);

}

if (root->left) {

arr[top] = 0;

printCodes(root->left, arr, top + 1);

}

if (root->right) {

arr[top] = 1;

printCodes(root->right, arr, top + 1);

}

if (isLeaf(root)) {

printArr(arr, top);

}

}

{

MinHeapNode* root

= buildHuffmanTree(data, freq, size);

}

Main.cpp

#include<iostream>

using namespace std;

#include"HuffManCoding.h";

int main()

{

HuffManCoding HMC;

char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };

int freq[] = { 5, 9, 12, 13, 16, 45 };

cin.get();

return 0;

}

5. CONCLUSION

To conclude, this project has introduced the background of data compression in

terms of classification, principle and theory. The lossless data compression and

information entropy are mainly discussed in this report for further

implementations. Huffman coding and LZW coding algorithms were implemented

for investigating how to improve compression ratio. The implementation is based

on several methods with respect to information search, concepts generation and

research analysis. The objectives of this project are mostly completed. Two

important lossless algorithms were implemented and discussed in details. . For

Huffman part, a user-defined bit length Huffman coding program was designed

for investigation the relationship between coding bit length and compression

ratio. The experimental results showed that the program can produce better

compression ratio if the input file string is split by even number bits. Also, in the

range of 16 to 28 split bits, the program can reach the highest compression ratio.

Therefore, it is recommended that user can focus on this range to get a

productive compression. Additionally, the test files with high information

redundancy (e.g. image file) can be compressed in higher compression ratio.

5. BIBLIOGRAPHY

A.Lesne. (2011). Shannon entropy: a rigorous mathematical notion at the

crossroads between probability, information theory, dynamical systems and

statistical physics.

Retrieved from: http://www.lptmc.jussieu.fr/user/lesne/MSCS-entropy.pdf

A.RRNYI. (1961).

ON MEASURES OF ENTROPY AND INFORMATION. Retrieved from:

http://l.academicdirect.org/Horticulture/GAs/Refs/Renyi_1961.pdf

A.Shahbahrami, R.Bahrampour, M.S.Rostami, M.A.Mobarhan. (2011). Evaluation

of Huffman and Arithmetic Algorithms for Multimedia Compression Standards.

Retrieved from: http://arxiv.org/ftp/arxiv/papers/1109/1109.0216.pdf

- Final Year Matlab Project List With Abstract 2012Uploaded byMohit Sngg
- Improving LZW Image CompressionUploaded byRizki Anugrah
- Real-Time Voice Streaming Over IEEE 802.15.4Uploaded byCésar Alonso Abad
- Android Studio New Media Fundamentals - Content Production of Digital Audio-Video, Illustration and 3D Animation.pdfUploaded byPranav Swaroop
- Integer Wavelet Transform and Predictive Coding Technique for Lossless Medical Image CompressionUploaded byeditor_ijtel
- Separación De Audio De Mezclas Comprimidas LinealesUploaded byricardoag_new
- Power Control and Radio Link SupervisionUploaded byBao Tan Dang
- Design of a Python-subset Compiler in Rust targeting ZPAQLUploaded byyo
- A Pixel Domain Video Coding based on Turbo code and Arithmetic codeUploaded byJohn Berg
- SPEECH COMPRESSION TECHNIQUES: A REVIEWUploaded byIJIERT-International Journal of Innovations in Engineering Research and Technology
- Atpg Low Power PptUploaded byprawinv
- Compression TechUploaded byNavin Mis
- TreeAppsUploaded byshk8
- CS502QuizSolvedForFinalTermPreparation (1)Uploaded byMaaz Bin Rizwan
- Etreppid Life and TimesUploaded bySherriLCruz
- Industrial RoboticsUploaded byamiestudent
- 2-D ECG Compression Method Based on Wavelet Transform and Modified SPIHTUploaded byskwijaya
- DATA COMPRESSIONUploaded byJeevan Jyoti Rana
- Digital Video Watermarking based on candidates I-Frames.pdfUploaded byRakesh Ahuja
- 05Compression for IRUploaded bytarillo2
- Huffman CodingUploaded byYugal Jindle
- VLSIPROJECTTOPICSUploaded byursbaji2775
- its10camiloUploaded byangel1341989
- ScarmanaGabrielCompressing ImagesUploaded byVũ Thanh Hà
- ABCUploaded byShivappa Siddannavar
- Syllabus of 7th SemesterUploaded byRahul.S.Nair
- Following Fatal Function Form of Thought to Assign Something Clear Exactly TrueUploaded byAnonymous 8erOuK4i
- IJETR031346Uploaded byerpublication
- Amazing Audacity Session 1Uploaded by18pollas
- IntroUploaded byBen Wilson

- Ardrox 966 P300AUploaded byMark Evan Salutin
- welsh gatsbyfinalUploaded byapi-334617841
- updated resume pdfUploaded byapi-277406572
- 5th grade rela module 2Uploaded byapi-326844298
- The Historical Development of Islamic Law.docxUploaded byPalak Rawat
- cognitive-behavior therapyUploaded bytexastechlove
- 9072018 Fake News RTIUploaded byShri Gopal Soni
- Change LogUploaded byPetre Vijiac
- Influence of Osmotic Suction on the Soil-WaterUploaded byLuisRodrigoPerezPinto
- Williams WidebodyUploaded byjose
- Castro, Charles P. - Enchanted Trees, Sacred Groves, And Forest Fairies - A Sampling of Folk Beliefs Associated With Trees and ForestsUploaded byIrjay Rolloda
- EMPLOYEE EMPOWERMENT_ A CRUCIAL INGREDIENT IN A TOTAL QUALITY..Uploaded byPatrick Andrew Jansen
- Campaign to Abolish Psychiatric DiagnosisUploaded byCarolyn Anderson
- Iveco NEF Engine (N60 ENT M40) Service Repair Manual.pdfUploaded byjksmemms
- Course File Machine DesignUploaded byMatam Prasad
- On Microbe HuntersUploaded byDrBabu PS
- The Modern Approach to BiasUploaded byCarl Gardner
- chapter 17 section 2Uploaded byapi-206809924
- History, Memory and Forgetting in Nietszche and DerridaUploaded byVirginia Magna
- Bioplastics_web28409Uploaded byIman Haerudin
- Ma. Luisa A. Pineda Vs. Virginia Zuñiga Vda. De Vega G.R. No. 233774Uploaded bybogoldoy
- FloJoe CPE CollocationsUploaded byRichard Holguín
- simplified settling velocity formulaUploaded byapi-3702623
- prof ed 1ogroup 1 8-9 mwf-- angelie pepitoUploaded byapi-245194389
- Home VideoUploaded byArman Ul Nasar
- 58MM Thermal Reciept Printer Operating Manual-20160223Uploaded byNicole Zijiang
- Getting from AEDGs to Zero Energy Buildings.pdfUploaded bythermosol5416
- [1960] Brzezinski, Communist Ideology and International AffairsUploaded byMilan Igrutinovic
- Lays vs BingoUploaded byAnadiGoel
- Castromayor vs ComelecUploaded byDanilo Dela Rea