SYSTEM DESIGN
The System Design Document describes the system requirements, operating environment,
system and subsystem architecture, files and database design, input formats, output layouts,
human-machine interfaces, detailed design, processing logic, and external interfaces.
Systems design is the process of defining the architecture, modules, interfaces, and data for a
system to satisfy specified requirements. Systems design could be seen as the application of
systems theory to product development.
The key is declared as a character array that can hold a maximum of 5 characters. During input,
checks ensure that the key is exactly 3 characters long and contains only digits. The other
fields are also character arrays, each able to hold a string value of some maximum size.
Each array is larger than the longest string it is expected to hold, so the stored values vary
in length. We therefore preserve the identity of the fields by placing a delimiter at the end of
each field to separate it from the next field. We have chosen the vertical bar character,
commonly known as the pipe symbol (|), as the delimiter.
The records are used as containers to hold a mix of fixed and variable-length fields
within a fixed-length record. Here a fixed-length record is one whose total length is the same
for every record, even though the fields inside it may vary in length. Each record is 100 bytes.
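The packing described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name packRecord, the constants, and the use of '#' to pad a record up to 100 bytes are assumptions made for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Values taken from the text: 100-byte records, '|' after every field.
constexpr std::size_t kRecordSize = 100;
constexpr char kFieldDelim = '|';
// Assumption: unused space at the end of a record is padded with '#'.
constexpr char kPadChar = '#';

// Hypothetical helper: terminates each field with '|' and pads the result
// to the fixed record size.
std::string packRecord(const std::vector<std::string>& fields) {
    std::string record;
    for (const std::string& field : fields) {
        record += field;
        record += kFieldDelim;
    }
    if (record.size() > kRecordSize) {
        throw std::length_error("packed fields exceed the fixed record size");
    }
    record.resize(kRecordSize, kPadChar);  // pad up to exactly 100 bytes
    return record;
}
```

With these assumptions, packRecord({"123", "Quiz"}) yields a 100-byte string that begins with 123|Quiz| and is followed by 91 '#' characters.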
All the records are stored in buckets. Bucket − A hash file stores data in bucket
format. A bucket is considered a unit of storage. A bucket typically stores one complete disk
block, which in turn can store one or more records. Each bucket can store a maximum of 5
fixed-length records. There are 5 buckets used in the application. Therefore, each bucket has a
maximum size of 500 bytes (5 records of 100 bytes each) and a total of 25 records can be stored.
Hence we use fixed-length buckets and fixed-length records with variable-length fields. Two
pipe symbols (||) separate the records within each bucket, and a newline character (\n)
separates the buckets in the file. The class declarations of a typical event file record and
student file record are shown in Fig 3.1 and Fig 3.2.
The Event_id is read repeatedly until the user enters a value of exactly 3 characters, each
of which is a digit. Otherwise an invalid number message is displayed and the user is prompted
to enter the Event_id again.
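The input check can be sketched as below. The function names isValidEventId and readEventId and the wording of the prompt are assumptions for illustration; only the rule itself (exactly 3 characters, all digits, re-prompt on failure) comes from the text.

```cpp
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

// True only when the id has exactly 3 characters and every one is a digit.
bool isValidEventId(const std::string& id) {
    return id.size() == 3 &&
           std::all_of(id.begin(), id.end(), [](unsigned char c) {
               return std::isdigit(c) != 0;
           });
}

// Hypothetical read loop: keeps asking until a valid Event_id is entered.
// Returns an empty string if the input ends before a valid id is read.
std::string readEventId(std::istream& in, std::ostream& out) {
    std::string id;
    while (in >> id) {
        if (isValidEventId(id)) {
            return id;
        }
        out << "Invalid number. Enter Event_id again: ";
    }
    return "";
}
```

Passing the streams as parameters is a design choice for the sketch: it lets the loop be exercised with std::cin and std::cout in the program, or with string streams when testing.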
After all the values are accepted, hash() is called to generate the address of the bucket
where the record has to be stored, and store() then writes the record to the data file. The
user can press any key to return to the menu screen.
There are 2 types of hashing: static hashing and dynamic hashing. In this project static
hashing has been implemented. In static hashing, when a search-key value is provided, the
hash function always computes the same address. For example, if a mod-11 hash function is
used, it generates only 11 distinct values (0 to 10). The output address is always the same for
a given key. The number of buckets for the program remains unchanged at all times.
Bucket − A hash file stores data in bucket format. A bucket is considered a unit of
storage. A bucket typically stores one complete disk block, which in turn can store one or
more records. In this project each bucket can store a maximum of 5 fixed-length records of
100 bytes each, and 5 buckets are used. Therefore each bucket has a maximum size of 500 bytes
and a total of 25 records can be stored. The delimiter “||” is used to separate the records in
the file. An array keeps a count of the records within each bucket. Initially the file is
created and filled with ‘#’, and the count of each bucket is initialized to 0. Whenever a
record is added to or deleted from a bucket, the count is changed accordingly. This count
makes it easy to retrieve records and to perform other operations.
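A rough sketch of the initial file layout follows. It assumes each bucket is written as 500 '#' characters followed by a newline, and it ignores the space taken by the “||” record separators, since the text does not say whether they count toward the 500 bytes. The names makeEmptyFile and makeEmptyCounts are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Sizes taken from the text.
constexpr std::size_t kBucketCount = 5;
constexpr std::size_t kRecordsPerBucket = 5;
constexpr std::size_t kRecordSize = 100;
constexpr std::size_t kBucketSize = kRecordsPerBucket * kRecordSize;  // 500

// Builds the contents of a freshly created data file: every bucket is
// filled with '#' and buckets are separated by a newline (assumed layout).
std::string makeEmptyFile() {
    std::string image;
    for (std::size_t b = 0; b < kBucketCount; ++b) {
        image.append(kBucketSize, '#');
        image += '\n';
    }
    return image;
}

// Per-bucket record counts, all starting at 0.
std::array<int, kBucketCount> makeEmptyCounts() {
    return {};  // value-initialised: every count is 0
}
```

Under this layout, bucket b would start at byte offset b * (kBucketSize + 1), which lets a record be located from its bucket number and its position within the bucket.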
The hashing algorithm is called the hash function. The term probably derives from
the idea that the resulting hash value can be thought of as a mixed-up version of the
represented value. The hash function is used to index the original value or key, and is used
again each time the data associated with that value or key is to be retrieved.
Hashing is always a one-way operation: the original key is not recovered from the hash
value. An ideal hash function cannot be derived by analysis alone. A good hash function should
also avoid producing the same hash value for two different inputs. When it does, this is known
as a collision. A hash function that offers an extremely low risk of collision may be
considered acceptable.
The hash function used in this project is the division-remainder method. The
division method is generally a reasonable strategy, unless the key happens to have some
undesirable properties. The number of items in the table is estimated. That number
is then used as a divisor into each original value or key to extract a quotient and a remainder.
The remainder is the hashed value.
Since this method is liable to produce a number of collisions, any search mechanism
must be able to recognize a collision and offer an alternate search mechanism. The
number of buckets in this project is 5, hence the divisor is 5. The sum of the
individual elements (characters) of the key (Event_id), which is a string, is taken as the
dividend and divided by the number of buckets (5). The remainder is the hashed value.
Once a valid value is read, we pack the data using ‘|’ as the delimiter.
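The division-remainder computation described above can be sketched as follows. The text does not say whether the "elements" of the key are summed as character codes or as digit values; this sketch assumes character codes, and the name hashKey is hypothetical (the project's own function is called hash()).

```cpp
#include <string>

// Number of buckets, and therefore the divisor, as stated in the text.
constexpr int kBucketCount = 5;

// Sums the character codes of the key and returns the remainder after
// dividing by the number of buckets. Assumption: character codes, not
// digit values, are summed.
int hashKey(const std::string& key) {
    int sum = 0;
    for (unsigned char c : key) {
        sum += c;
    }
    return sum % kBucketCount;
}
```

For example, with ASCII codes the key "123" sums to 49 + 50 + 51 = 150, and 150 mod 5 = 0, so the record would belong to bucket 0. The key "124" sums to 151 and would belong to bucket 1.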
A collision occurs when the bucket corresponding to the computed hash value already holds
all 5 records (because the bucket size considered is 5). The technique used to
resolve collisions is linear probing. It was invented in 1954 by Gene Amdahl, Elaine M.
McGraw, and Arthur Samuel and first analyzed in 1963 by Donald Knuth. Along with quadratic
probing and double hashing, linear probing is a form of open addressing. In these schemes,
each cell of a hash table stores a single key–value pair; in this project the unit that is
probed is a bucket holding up to 5 records.
When a collision occurs, the linear probing technique looks at the subsequent
buckets until the first free space is found. This traversal is known as
probing the table; because it advances one element at a time, it is called linear probing.
Once the end of the hash table is reached, which is the end of bucket 4 when the 5 buckets are
numbered 0 to 4, the search wraps around and continues checking for free space from the
beginning of the table up to the bucket given by the first computed hash value.
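The probing sequence with wrap-around can be sketched as below. It assumes the per-bucket record counts are held in an array, as described earlier, and that buckets are numbered 0 to 4. The function name findBucket and the return value of -1 for a full table are assumptions for the example.

```cpp
#include <array>

// Values taken from the text: 5 buckets, each holding at most 5 records.
constexpr int kBucketCount = 5;
constexpr int kBucketCapacity = 5;

// Starts at the home bucket (the computed hash value) and moves forward one
// bucket at a time, wrapping from bucket 4 back to bucket 0. Returns the
// first bucket with free space, or -1 if every bucket is full.
int findBucket(const std::array<int, kBucketCount>& counts, int home) {
    for (int step = 0; step < kBucketCount; ++step) {
        const int bucket = (home + step) % kBucketCount;
        if (counts[bucket] < kBucketCapacity) {
            return bucket;
        }
    }
    return -1;
}
```

For instance, with counts {5, 5, 2, 5, 5} and a home bucket of 3, the probe visits buckets 3, 4, 0 and 1, finds them all full, and settles on bucket 2. The modulo operation is what implements the wrap-around.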