Computer And System Engineering Department Presents

FS Assignment 2

ENCODE

Blagwa

Intro
By:

TA's Notes:

Index

• Intro
• Index
• Problem Statement
• Overall System
  • System Requirements
  • UML
  • How to Use
  • Take Care
• System under the microscope
  • File Reader
  • File Writer
  • InBuf
  • OutBuf
  • File Statistics
  • Huffman Encoding Tree Builder
  • Huffman Encode
  • Huffman Decode
• Some ideas along the way
  • Bad
  • Good .. BUT

Problem Statement

You are required to write a program that can compress and decompress a file using Huffman encoding.

Input to this program is:
1. A file name.
2. Whether to compress or decompress the file.

Output of the program is:
1. The compressed/decompressed file.
2. The code used to code the bytes of the source file.
3. The compression ratio.
4. The execution time.

Part (A) - Compression

You must collect statistics from more than one text file and more than one binary file, and find the Huffman code for each byte. This is a fixed code that is to be used in your program. There will be two fixed codes: one for text files and one for binary files.

To compress a file, you have to collect statistics for the file and decide whether it is more economical to code it using the fixed code OR to use another Huffman code built from the new statistics and store that code in the compressed file.

Your output is the compressed file, the compression ratio, and the code used to compress the file. To display the code, it must be in the following format:

Byte   Original Code   New Code
65     01000001        10001
66     01000010        101101
67     01000011        01000
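The code construction and the display format described above can be sketched as follows. This is only an illustration under assumptions: the names `Node`, `buildHuffmanCodes` and `printCodeTable` are hypothetical, codes are kept as strings of '0'/'1' for readability, and ties between equal weights are broken arbitrarily, so the exact bit patterns may differ from those of another valid Huffman code.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>
#include <functional>
#include <iostream>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for this sketch; it is not taken from the slides.
struct Node {
    std::uint64_t weight;
    int byte;         // 0..255 for leaves, -1 for internal nodes
    int left, right;  // indices into the node pool, -1 for leaves
};

// Builds one Huffman code per byte that occurs (weight > 0).
// Bytes that never occur get an empty string.
std::array<std::string, 256> buildHuffmanCodes(
        const std::array<std::uint64_t, 256>& freq) {
    std::vector<Node> pool;
    using Item = std::pair<std::uint64_t, int>;  // (weight, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    for (int b = 0; b < 256; ++b) {
        if (freq[b] == 0) continue;
        pool.push_back({freq[b], b, -1, -1});
        heap.push({freq[b], static_cast<int>(pool.size()) - 1});
    }
    std::array<std::string, 256> codes;
    if (pool.empty()) return codes;
    if (pool.size() == 1) {  // a single distinct byte still needs one bit
        codes[pool[0].byte] = "0";
        return codes;
    }
    // Repeatedly merge the two lightest subtrees.
    while (heap.size() > 1) {
        Item a = heap.top(); heap.pop();
        Item b = heap.top(); heap.pop();
        pool.push_back({a.first + b.first, -1, a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(pool.size()) - 1});
    }
    assert(!heap.empty());
    // Walk the tree, appending '0' for a left edge and '1' for a right edge.
    std::vector<std::pair<int, std::string>> stack;
    stack.push_back({heap.top().second, ""});
    while (!stack.empty()) {
        std::pair<int, std::string> top = stack.back();
        stack.pop_back();
        const Node& n = pool[top.first];
        if (n.byte >= 0) { codes[n.byte] = top.second; continue; }
        stack.push_back({n.left, top.second + "0"});
        stack.push_back({n.right, top.second + "1"});
    }
    return codes;
}

// Prints the table in the "Byte / Original Code / New Code" format.
void printCodeTable(const std::array<std::string, 256>& codes,
                    std::ostream& out) {
    out << "Byte   Original Code   New Code\n";
    for (int b = 0; b < 256; ++b) {
        if (codes[b].empty()) continue;
        out << b << "     " << std::bitset<8>(b) << "        "
            << codes[b] << '\n';
    }
}
```

A caller would fill a 256-entry frequency array from the file statistics, call `buildHuffmanCodes`, and pass the result to `printCodeTable(codes, std::cout)`.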

Part (B) - Decompression

You must read the file header and determine whether the file is encoded using the fixed code or using another code. The file must then be decompressed (returned to its original format).


Overall System

Requirements
• Get a file name.
• Determine whether you are going to encode or decode.
• If encoding:
  • Collect the statistics.
  • Check for the most economical solution.
  • Apply it.
• If decoding:
  • Determine which code was used during encoding.
  • Reverse the encoding to recover the original file data.
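The "most economical solution" check in the encoding branch can be sketched as a comparison of total bit counts. The sketch assumes that the cost of storing a custom code in the compressed file is known as `headerBits`; that parameter, the `CodeChoice` enum, and all function names are hypothetical. The slides do not define the compression ratio, so `compressionRatio` uses one common convention (original size divided by compressed size).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Total payload bits needed to encode a file whose byte counts are `freq`
// with a code whose per-byte lengths (in bits) are `codeLen`.
std::uint64_t encodedBits(const std::array<std::uint64_t, 256>& freq,
                          const std::array<std::uint8_t, 256>& codeLen) {
    std::uint64_t bits = 0;
    for (int b = 0; b < 256; ++b) bits += freq[b] * codeLen[b];
    return bits;
}

enum class CodeChoice { Fixed, Custom };

// Picks the cheaper option. The custom code pays an extra `headerBits`
// (an assumed, caller-supplied cost) because it must be stored in the file.
CodeChoice chooseCode(const std::array<std::uint64_t, 256>& freq,
                      const std::array<std::uint8_t, 256>& fixedLen,
                      const std::array<std::uint8_t, 256>& customLen,
                      std::uint64_t headerBits) {
    const std::uint64_t fixedBits = encodedBits(freq, fixedLen);
    const std::uint64_t customBits = encodedBits(freq, customLen) + headerBits;
    return customBits < fixedBits ? CodeChoice::Custom : CodeChoice::Fixed;
}

// Assumed convention: original size / compressed size (larger is better).
double compressionRatio(std::uint64_t originalBytes,
                        std::uint64_t compressedBytes) {
    assert(compressedBytes > 0);
    return static_cast<double>(originalBytes) /
           static_cast<double>(compressedBytes);
}
```

With a small header the custom code tends to win; as the header grows, the fixed code becomes the cheaper choice for short files.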

UML Class Diagram



How To Use

Take Care
• This is a beta version with many bugs.
• The program has not been tested sufficiently.
• Big files take a long time.


System Under Microscope

File Reader
Class FileReader is a character-level input buffer.

Parameters:
• inputFileStream => input file stream -> to get data from the file
• characterBuffer => unsigned character buffer
• index => integer (pointer) -> points to the next data to be retrieved
• limit => integer (pointer) -> points to the end of the data

Behaviors:
• FileReader constructor => parameter -> character pointer file name
  Actions -> open the input file stream, read some data, and set index to the start
• ~FileReader destructor => Actions -> close the input file stream
• readByte => Actions -> if index points to the end, get new data; if no data can be read, throw an exception; return the next valid data
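The File Reader description above could look roughly like this in C++. It is a minimal sketch: the buffer size `kReadBufferSize`, the private `refill` helper, and the choice of `std::runtime_error` as the exception type are assumptions, since the slides do not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>

constexpr std::size_t kReadBufferSize = 4096;  // assumed buffer size

class FileReader {
public:
    // Opens the file, reads the first chunk, and sets index to the start.
    explicit FileReader(const char* fileName)
        : inputFileStream(fileName, std::ios::binary), index(0), limit(0) {
        if (!inputFileStream) throw std::runtime_error("cannot open input file");
        refill();
    }
    // No explicit destructor: the std::ifstream member closes the file
    // when the FileReader is destroyed.

    // Returns the next byte; refills the buffer when it is exhausted and
    // throws if no more data can be read.
    unsigned char readByte() {
        if (index == limit && !refill())
            throw std::runtime_error("no more data");
        assert(index < limit);
        return characterBuffer[index++];
    }

private:
    // Reads the next chunk from the file; returns false if nothing was read.
    bool refill() {
        inputFileStream.read(reinterpret_cast<char*>(characterBuffer),
                             static_cast<std::streamsize>(kReadBufferSize));
        limit = static_cast<std::size_t>(inputFileStream.gcount());
        index = 0;
        return limit > 0;
    }

    std::ifstream inputFileStream;
    unsigned char characterBuffer[kReadBufferSize];
    std::size_t index;  // next byte to hand out
    std::size_t limit;  // number of valid bytes in characterBuffer
};
```

Reading in chunks keeps the number of stream calls low, which matters because the encoder asks for data one byte at a time.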



File Writer
Class FileWriter is a character-level output buffer.

Parameters:
• outputFileStream => file stream -> to write to the file
• characterBuffer => character array -> to store characters
• index => integer (pointer) -> points to the next empty slot to write in

Behaviors:
• FileWriter constructor => parameter -> character pointer output file name
  Actions -> open the file stream to the file and set index to the start
• ~FileWriter destructor => Actions -> flush and close the FileWriter
• writeChar => parameter -> unsigned character a
  Actions -> if index points to the end, write out and flush the data stored in characterBuffer; then write a in the next empty slot
• close => Actions -> write and flush the data stored in characterBuffer, then close the outputFileStream file stream
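A possible C++ shape for the File Writer described above is sketched below. The buffer size `kWriteBufferSize`, the private `flushBuffer` helper, and the exception type are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>

constexpr std::size_t kWriteBufferSize = 4096;  // assumed buffer size

class FileWriter {
public:
    // Opens the output file and sets index to the start of the buffer.
    explicit FileWriter(const char* fileName)
        : outputFileStream(fileName, std::ios::binary), index(0) {
        if (!outputFileStream)
            throw std::runtime_error("cannot open output file");
    }

    // Flushes any buffered data and closes the file.
    ~FileWriter() { close(); }

    // Stores one byte; writes the buffer out first if it is full.
    void writeChar(unsigned char a) {
        if (index == kWriteBufferSize) flushBuffer();
        assert(index < kWriteBufferSize);
        characterBuffer[index++] = a;
    }

    // Writes the remaining buffered data, then closes the stream.
    // Calling close() more than once is harmless.
    void close() {
        if (!outputFileStream.is_open()) return;
        flushBuffer();
        outputFileStream.close();
    }

private:
    void flushBuffer() {
        outputFileStream.write(reinterpret_cast<const char*>(characterBuffer),
                               static_cast<std::streamsize>(index));
        outputFileStream.flush();
        index = 0;
    }

    std::ofstream outputFileStream;
    unsigned char characterBuffer[kWriteBufferSize];
    std::size_t index;  // next empty slot in characterBuffer
};
```

Because the destructor calls `close()`, data written through `writeChar` reaches the file even if the caller never closes the writer explicitly.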



InBuf
Class InBuf is a bit-level input buffer: it accumulates data and hands it back bit by bit.

Parameters:
• inputCharacterBuffer -> File Reader (character-level buffer) to read from the input file
• bitsDataStorage -> unsigned character (data storage for the bits) to accumulate the bits and retrieve them bit by bit
• index -> integer (pointer) that points to the next bit to be retrieved

Behaviors:
• InBuf constructor 1 => parameters -> character pointer input file name
  Actions -> create a new File Reader and set the pointer index to the start
• InBuf constructor 2 => parameters -> File Reader pointer in
  Actions -> set inputCharacterBuffer to in and set the pointer index to the start
• ~InBuf destructor => Actions -> delete inputCharacterBuffer
• readBit => Actions -> if index points to the end, get new data from the File Reader and reset the index pointer; then retrieve the next bit
  Return -> boolean that indicates whether a 1 or a 0 was read
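The bit-by-bit retrieval in readBit can be sketched as follows. To keep the example self-contained it reads from any object with a `readByte()` member rather than from the File Reader itself; `MemoryByteSource` is a hypothetical in-memory stand-in, the buffer is held by reference instead of by owned pointer, and the most-significant-bit-first order is an assumption, since the slides do not state the bit order.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the File Reader.
class MemoryByteSource {
public:
    explicit MemoryByteSource(std::vector<unsigned char> bytes)
        : data(std::move(bytes)), pos(0) {}
    unsigned char readByte() {
        if (pos == data.size()) throw std::runtime_error("no more data");
        return data[pos++];
    }
private:
    std::vector<unsigned char> data;
    std::size_t pos;
};

template <typename ByteSource>
class InBuf {
public:
    // index starts at 8 ("the end") so the first readBit fetches a byte.
    explicit InBuf(ByteSource& source)
        : inputCharacterBuffer(source), bitsDataStorage(0), index(8) {}

    // Returns true for a 1 bit and false for a 0 bit.
    bool readBit() {
        if (index == 8) {  // current byte is used up: fetch the next one
            bitsDataStorage = inputCharacterBuffer.readByte();
            index = 0;
        }
        assert(index >= 0 && index < 8);
        const bool bit = ((bitsDataStorage >> (7 - index)) & 1u) != 0;
        ++index;
        return bit;
    }

private:
    ByteSource& inputCharacterBuffer;
    unsigned char bitsDataStorage;  // the byte currently being consumed
    int index;                      // next bit position, 0 = most significant
};
```

Whatever bit order is chosen here has to match the order used by the bit writer, otherwise decoding reads each byte reversed.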


OutBuf
Class OutBuf is a bit-level output buffer (a buffer for bits).

Parameters:
• outputCharacterBuffer -> File Writer (character-level buffer) to write to the output file
• bitsDataStorage -> unsigned character (data storage for the bits) to accumulate the bits until a character is completed
• index -> integer (pointer) that points to the next empty slot

Behaviors:
• OutBuf constructor 1 => parameters -> character pointer output file name
  Actions -> create a new File Writer and set the pointer index to the start
• OutBuf constructor 2 => parameters -> File Writer pointer out
  Actions -> set outputCharacterBuffer to out and set the pointer index to the start
• ~OutBuf destructor => Actions -> close and delete outputCharacterBuffer
• writeBit => parameters -> boolean bit that indicates whether to write 1 or 0
  Actions -> if index points to the end, flush the result and reset the index pointer; then write the bit in the next empty slot
• writeSome => parameters -> [Character / 64Long] z (data storage for the bits to be written), Short some (number of bits to be written)
  Actions -> loop from the start of the valid data to the end, check whether each bit is 1 or 0, and write it down
• flush => Actions -> write down the data inside bitsDataStorage
• getFileWriter => Actions -> return a pointer to outputCharacterBuffer
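A minimal sketch of writeBit, writeSome and flush is given below. As assumptions for the example, the output goes to any object with a `writeChar` member (`MemoryByteSink` is a hypothetical in-memory stand-in for the File Writer, held by reference), bits are packed most significant first, `writeSome` treats the lowest `some` bits of `z` as the valid data, and `flush` pads a partial byte with zero bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for the File Writer.
class MemoryByteSink {
public:
    void writeChar(unsigned char a) { bytes.push_back(a); }
    std::vector<unsigned char> bytes;
};

template <typename ByteSink>
class OutBuf {
public:
    explicit OutBuf(ByteSink& sink)
        : outputCharacterBuffer(sink), bitsDataStorage(0), index(0) {}

    // If the current byte is full, emit it first; then store the bit in
    // the next empty slot.
    void writeBit(bool bit) {
        if (index == 8) emitByte();
        if (bit) bitsDataStorage |= static_cast<unsigned char>(1u << (7 - index));
        ++index;
    }

    // Writes the lowest `some` bits of z, the most significant of them first.
    void writeSome(std::uint64_t z, short some) {
        assert(some >= 0 && some <= 64);
        for (int i = some - 1; i >= 0; --i) writeBit(((z >> i) & 1u) != 0);
    }

    // Writes out whatever is stored; unused low bits stay zero.
    void flush() {
        if (index > 0) emitByte();
    }

private:
    void emitByte() {
        outputCharacterBuffer.writeChar(bitsDataStorage);
        bitsDataStorage = 0;
        index = 0;
    }

    ByteSink& outputCharacterBuffer;
    unsigned char bitsDataStorage;  // bits accumulated so far
    int index;                      // next empty slot, 0 = most significant
};
```

Because the last byte may be padded, a decoder also needs some way to know where the real data ends, for example a stored bit count or original file length; the slides do not say which approach the program uses.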


File Statistics
FileStatistics is a class that counts the occurrences (probabilities) of each character in the input file.

• Constructor: allocates the memory for the 2 maps
• Destructor: deallocates the memory of the 2 maps
• Some getters
• print: prints the statistics of each byte in the input file
  Printing format: number => weight
• Do the statistics by reading the file and incrementing the occurrences of the data
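The counting and printing steps above can be sketched like this. The sketch departs from the slides in two labeled ways: it uses a fixed 256-entry array instead of the two maps, and it reads from any `std::istream` so it can be tried on an in-memory string; the member names `collect`, `count` and `probability` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>  // only needed to try the class on an in-memory string

class FileStatistics {
public:
    // Reads the stream to its end and increments the count of every byte.
    void collect(std::istream& in) {
        char c;
        while (in.get(c)) {
            ++occurrences[static_cast<unsigned char>(c)];
            ++total;
        }
    }

    // Getters: raw count and relative frequency of one byte value.
    std::uint64_t count(unsigned char byte) const { return occurrences[byte]; }
    double probability(unsigned char byte) const {
        return total == 0 ? 0.0
                          : static_cast<double>(occurrences[byte]) /
                                static_cast<double>(total);
    }

    // Prints "number => weight" for every byte that occurs at least once.
    void print(std::ostream& out) const {
        for (int b = 0; b < 256; ++b) {
            if (occurrences[b] != 0) out << b << " => " << occurrences[b] << '\n';
        }
    }

private:
    std::array<std::uint64_t, 256> occurrences{};  // zero-initialized counts
    std::uint64_t total = 0;                       // bytes seen so far
};
```

For a real file the stream would be a `std::ifstream` opened with `std::ios::binary`, so that every byte value is counted unchanged.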


Ideas along the Way

BAD
• Root-error storage
  • It depends on replacing the value of an integer with its root and the difference between the integer and the square of that root.

GOOD
• Existence-dependent code
  • It depends on considering only the bytes that actually exist in the file when building the new code.
• Minimum-variance Huffman tree
  • It depends on forcing the tree to be closer to balanced.
• Canonical Huffman
  • It depends on defining a relation between elements and codes, and sorting them according to that relation.
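The canonical Huffman idea can be illustrated with the usual construction, in which symbols are sorted by code length and then by byte value, and codes are assigned as consecutive binary numbers. This is a sketch of that standard scheme, not necessarily the relation the authors used; the function name `canonicalCodes` is hypothetical, and code lengths are assumed to come from an already built Huffman tree and to be shorter than 64 bits.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assigns canonical codes from per-byte code lengths (0 = byte not present).
// Returns the codes as strings of '0'/'1'.
std::array<std::string, 256> canonicalCodes(
        const std::array<std::uint8_t, 256>& codeLen) {
    std::vector<int> symbols;
    for (int b = 0; b < 256; ++b) {
        if (codeLen[b] > 0) symbols.push_back(b);
    }
    // Sort by code length first, then by byte value.
    std::sort(symbols.begin(), symbols.end(), [&](int a, int b) {
        if (codeLen[a] != codeLen[b]) return codeLen[a] < codeLen[b];
        return a < b;
    });

    std::array<std::string, 256> codes;
    std::uint64_t code = 0;
    int prevLen = 0;
    for (int s : symbols) {
        const int len = codeLen[s];
        assert(len < 64);
        code <<= (len - prevLen);  // append zeros when the length grows
        std::string bits(static_cast<std::size_t>(len), '0');
        for (int i = 0; i < len; ++i) {
            if ((code >> (len - 1 - i)) & 1u) bits[i] = '1';
        }
        codes[s] = bits;
        ++code;  // the next symbol gets the next consecutive value
        prevLen = len;
    }
    return codes;
}
```

The benefit of this scheme is that only the code lengths need to be stored in the compressed file; the decoder can rebuild the exact codes by running the same assignment.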

