
Second Year (CSE)

Year 2021-2022

S.R.Tandle
CSE Department
M.S.Bidve Engineering College, Latur.
Unit-I
Introduction

Concept of Sequential access:
Sequential access means reading data elements one after another
in a fixed order. It may be the only option, for example when the
data is on a tape. It may also be the access method of choice, for
example when we want to process a sequence of data elements in
order.

A data structure is said to have sequential access if one can
only visit the values it contains in one particular order. The
standard example is the linked list. An array, by contrast,
supports random access: any element can be reached directly
from its index in O(1) time.
Indexing into a list that has sequential access requires O(n)
time, where n is the index.
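As a minimal sketch of why indexing a sequentially accessed structure costs O(n), the following C function walks a singly linked list from the head to reach a given position. The names `Node` and `list_get` are illustrative assumptions and do not come from the slides.

```c
#include <stddef.h>

/* One node of a singly linked list (illustrative type). */
struct Node {
    int value;
    struct Node *next;
};

/* Return a pointer to the node at position `index` (0-based), or NULL
   if the list is shorter than that. Each step follows one link, so
   reaching position n takes n steps: O(n) time. */
struct Node *list_get(struct Node *head, size_t index)
{
    struct Node *cur = head;
    while (cur != NULL && index > 0) {
        cur = cur->next;
        index--;
    }
    return cur;
}
```

There is no way to jump straight to position n: every earlier node has to be visited first.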


An array is a variable which can store multiple values of the
same data type at a time.

An array can also be defined as follows:

"A collection of similar data items stored in contiguous memory
locations under a single name".


To understand the concept of arrays, consider the following
example declaration:

int a, b, c;

Here, the compiler allocates memory to each of these variables
separately (2 bytes each on the 16-bit compiler assumed in these
examples; the size of int depends on the compiler).
These three memory locations may or may not be in sequence.

Variable   a      b      c
Address    1001   4802   2466


Now see what happens when we declare it as an array:

int a[3];

Element    a[0]   a[1]   a[2]
Address    1001   1003   1005

The three elements occupy consecutive memory locations.



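#include <stdio.h>

int main(void)
{
    int a[5] = {1, 2, 3, 4, 5};
    int i;

    /* a is the base address of the array, so *(a + i) is the same as a[i] */
    for (i = 0; i <= 4; i++)
    {
        printf("%d\n", *(a + i));
    }
    return 0;
}

Memory layout (a holds the base address 1001; 2-byte int assumed):

Address   1001   1003   1005   1007   1009
Value        1      2      3      4      5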


Primitive        Non-primitive
                 Linear          Non-linear

i) int           i) Array        i) Tree
ii) float        ii) List        ii) Graph
iii) char        iii) Stack
iv) pointer      iv) Queue

Storage representation
[Figures: Array, Stack, Queue, List, Tree, Graph (using array – matrix), Graph (using linked list)]


Matrix

A matrix is a two-dimensional array, with rows and columns as
its dimensions, that represents a collection of data in tabular
form.


Bit Matrix (Boolean matrix)


Sparse Matrix

A sparse matrix is a matrix in which most of the elements are
zero.

In contrast, a matrix in which most of the elements are non-zero
is called a dense matrix.


Sparse Matrix and its representation

Why use a sparse matrix representation instead of a simple
matrix?

Storage: There are fewer non-zero elements than zeros, so less
memory is needed when only the non-zero elements are stored.

Computing time: Computing time can be saved by designing a data
structure that traverses only the non-zero elements.

Representing a sparse matrix by a plain 2D array wastes a lot of
memory, as the zeroes in the matrix are of no use in most cases.
So, instead of storing zeroes along with the non-zero elements,
we store only the non-zero elements. Each non-zero element is
stored as a triple (row, column, value).


A sparse matrix can be represented in many ways.

The following are two common representations:

1. Array representation
2. Linked list representation


Method 1: Using Arrays

A 2D array is used to represent a sparse matrix. It has three
rows, named as follows:

Row: Index of the row where the non-zero element is located
Column: Index of the column where the non-zero element is located
Value: Value of the non-zero element located at index (row, column)
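The array representation above can be sketched in C as follows. This is only an illustration under stated assumptions: the matrix size (`ROWS`, `COLS`) and the function name `to_triples` are hypothetical and not taken from the slides.

```c
#include <stddef.h>

#define ROWS 4   /* assumed matrix size for this sketch */
#define COLS 5

/* Fill `triples` with the non-zero elements of `m`, scanning row by
   row. triples[0] holds row indexes, triples[1] holds column indexes
   and triples[2] holds the values. Returns the number of non-zero
   elements stored. */
size_t to_triples(int m[ROWS][COLS], int triples[3][ROWS * COLS])
{
    size_t count = 0;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            if (m[r][c] != 0) {
                triples[0][count] = r;
                triples[1][count] = c;
                triples[2][count] = m[r][c];
                count++;
            }
        }
    }
    return count;
}
```

For a 4 x 5 matrix with 6 non-zero elements, only 3 x 6 = 18 integers carry information, compared with 20 in the plain 2D array. The saving grows as the matrix becomes larger and sparser.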


Hash Table

A hash table is a data structure which stores data in an
associative manner.

In a hash table, data is stored in an array format, where each
data value has its own unique index value. Access to data becomes
very fast if we know the index of the desired data.


It is a data structure in which insertion and search operations
are, on average, very fast irrespective of the size of the data.
A hash table uses an array as a storage medium and uses a hash
technique to generate the index where an element is to be
inserted or located.


Hashing

Hashing is a technique to convert a range of key values into a
range of indexes of an array. Different hash functions can be
used to perform this mapping.

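Consider one example with the following items:
12, 23, 55, 68, 74, 91, 65

Hash function: H(K) = K % 10 + 1
For example, H(55) = 55 % 10 + 1 = 5 + 1 = 6

Index   Key
0
1
2       91
3       12
4       23
5       74
6       55
7
8
9       68

The item 65 also hashes to index 6, which is already occupied by
55, so it is not shown in the table.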
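The example can be reproduced with a short C sketch. The names `hash_example` and `place_keys` are hypothetical. Because H(K) = K % 10 + 1 ranges from 1 to 10, the sketch assumes an 11-slot array, which goes slightly beyond the 0 to 9 table shown. A key whose slot is already taken is simply counted as a collision and not stored.

```c
#define TABLE_SLOTS 11    /* indexes 0..10, since H(K) ranges over 1..10 */
#define EMPTY_SLOT  (-1)  /* marks an unused slot; assumes keys are non-negative */

/* Hash function from the example: H(K) = K % 10 + 1. */
int hash_example(int key)
{
    return key % 10 + 1;
}

/* Place each key at index H(key) if that slot is free.
   Returns the number of keys that collided and were not placed. */
int place_keys(const int *keys, int n, int table[TABLE_SLOTS])
{
    int collisions = 0;
    for (int i = 0; i < TABLE_SLOTS; i++)
        table[i] = EMPTY_SLOT;
    for (int i = 0; i < n; i++) {
        int idx = hash_example(keys[i]);
        if (table[idx] == EMPTY_SLOT)
            table[idx] = keys[i];
        else
            collisions++;
    }
    return collisions;
}
```

With the items 12, 23, 55, 68, 74, 91, 65 in that order, six keys are placed and one collision is reported, for 65 at index 6.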


Searching
Consider the same example. To search for a key K, compute
H(K) = K % 10 + 1 and look in the table at that index.

Index   Key
0
1
2       91
3       12
4       23
5       74
6       55
7
8
9       68
Hashing functions
Characteristics of a good hashing function
• The hash function should generate different hash values for
similar strings.
• The hash function should be easy to understand and simple to
compute.
• The hash function should produce hash values that are
distributed uniformly over the array.
• The number of collisions while placing data in the hash table
should be low.
• The hash function should use all of the input data when
computing the hash value.

Trivial hash function :


If the keys are uniformly or sufficiently
uniformly distributed over the key space, so
that the key values are essentially random,
they may be considered to be already
'hashed'.

Midsquare hashing :
A mid-square hash code is produced by squaring the input and
extracting an appropriate number of middle digits or bits.

For example:
40 -> 1600 -> 60
41 -> 1681 -> 68
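A minimal C sketch of the two worked values above follows. It assumes keys between 0 and 99, so that the square has at most four digits and its two middle digits are the hundreds and tens digits. The function name `mid_square_hash` is hypothetical.

```c
/* Mid-square hash for small keys: square the key and keep the two
   middle digits of the (up to) four-digit square.
   Assumes 0 <= key <= 99. */
unsigned mid_square_hash(unsigned key)
{
    unsigned square = key * key;   /* e.g. 41 * 41 = 1681 */
    return (square / 10) % 100;    /* drop the last digit, keep the next two: 68 */
}
```

For larger keys the number of digits in the square changes, so the position of the "middle" would have to be chosen differently.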

Division hashing :
A standard technique is to use a modulo function on the key, by
selecting a divisor M which is a prime number close to the table
size, so h(K) = K mod M.
The table size is usually a power of 2. This gives hash values in
the range {0, ..., M-1}.
This gives good results over a large number of key sets.
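As a small illustration, the division method is a single modulo operation in C. The function name `division_hash` and the divisor 13 in the usage note (a prime close to an assumed table size of 16) are illustrative choices, not values from the slides.

```c
/* Division-method hash: h(K) = K mod M.
   The result always lies in the range 0 .. m-1. m must be non-zero. */
unsigned division_hash(unsigned key, unsigned m)
{
    return key % m;
}
```

For example, with M = 13 the key 100 hashes to 100 mod 13 = 9.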

Digit folding method
In this method the key is divided into separate parts, and these
parts are combined using some simple operation to produce the
hash value.
Ex: consider the record key 12465512. It is divided into the
parts 124, 655 and 12, and the parts are then combined by adding
them:
H(key) = 124 + 655 + 12 = 791
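The folding step can be sketched in C as below. The sketch assumes the key is split into groups of a fixed number of decimal digits starting from the left, with a shorter final group if the digits do not divide evenly, which matches the example. The function name `fold_hash` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Digit-folding hash: split the decimal digits of `key` into groups
   of `group` digits starting from the left, then add the groups.
   `group` is expected to be at least 1. */
unsigned long fold_hash(unsigned long key, int group)
{
    char digits[32];
    snprintf(digits, sizeof digits, "%lu", key);
    size_t len = strlen(digits);

    unsigned long sum = 0;
    unsigned long part = 0;   /* value of the group being built */
    int in_part = 0;          /* digits collected in the current group */
    for (size_t i = 0; i < len; i++) {
        part = part * 10 + (unsigned long)(digits[i] - '0');
        in_part++;
        /* Close the group when it is full or the digits run out. */
        if (in_part == group || i + 1 == len) {
            sum += part;
            part = 0;
            in_part = 0;
        }
    }
    return sum;
}
```

In practice the sum is usually reduced further (for example with a modulo) so that it fits the table size.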


Direct Address Table

A direct address table is a data structure that maps keys to
their corresponding records using an array. In a direct address
table, records are placed using their key values directly as
indexes.
This allows fast searching, insertion and deletion.



For example, we create an array of size equal to the maximum key
value plus one (assuming 0-based indexing) and then use the key
values as indexes. For instance, key 21 is stored directly at
index 21.



Advantages:
1. Searching in O(1) time: Direct address tables use arrays,
which are a random-access data structure, so the key values
(which are also the array indexes) can be used to find a record
in O(1) time.
2. Insertion in O(1) time: Because the key gives the index
directly, a record can be written to its slot in O(1) time.
3. Deletion in O(1) time: Similarly, a record is deleted by
clearing the slot at its key's index, which takes O(1) time.
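The three O(1) operations listed above can be sketched in C as follows. This is an illustration under stated assumptions: the maximum key `MAX_KEY`, the `DirectTable` type and the `dat_` function names are all hypothetical, and callers are expected to pass keys in the range 0 to `MAX_KEY`.

```c
#include <stdbool.h>

#define MAX_KEY 99   /* assumed largest key value */

/* present[k] is true when a record with key k is stored;
   value[k] holds that record's data. */
struct DirectTable {
    bool present[MAX_KEY + 1];
    int  value[MAX_KEY + 1];
};

/* Insert: the key is the index, so this is a single array write. */
void dat_insert(struct DirectTable *t, int key, int data)
{
    t->present[key] = true;
    t->value[key] = data;
}

/* Search: returns true and copies the data out if the key is stored. */
bool dat_search(const struct DirectTable *t, int key, int *data_out)
{
    if (!t->present[key])
        return false;
    *data_out = t->value[key];
    return true;
}

/* Delete: mark the slot as unused. */
void dat_delete(struct DirectTable *t, int key)
{
    t->present[key] = false;
}
```

None of the functions contains a loop, which is what makes every operation constant time. The cost is that the array always has `MAX_KEY + 1` slots, however few records are stored.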


Limitations:

1. Prior knowledge of the maximum key value is required.
2. It is practically useful only if the maximum key value is
small.
3. It wastes memory if there is a significant difference between
the total number of records and the maximum key value.


Perfect hashing

A hash function that maps every distinct key to a unique location
(index in the hash table) is known as a perfect hash function,
and hashing that uses such a function is known as perfect
hashing.


Collision
Since the size of the hash table is small compared to the range
of keys, a perfect hash function is rarely achievable in
practice.
When more than one key maps to the same location, this is known
as a collision. A good hash function should produce few
collisions.


Collision resolution techniques

Collision resolution means finding another location for a key
whose hashed location is already occupied. The most popular
resolution techniques are:
• Separate chaining
• Open addressing
Open addressing can be further divided into:
1. Linear Probing
2. Quadratic Probing
3. Double hashing

Separate chaining :
In separate chaining, we maintain a linked chain for every index
in the hash table. Whenever there is a collision, the linked list
for that particular location of the hash table is extended.
For example, if two keys hash to the same index, both are kept in
that index's linked list.
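A minimal C sketch of separate chaining is given below. It assumes non-negative integer keys, a table of 10 slots and the hash `key % 10`. These choices, along with the `ChainTable` type and the `chain_` function names, are illustrative and not taken from the slides.

```c
#include <stdlib.h>
#include <stdbool.h>

#define CHAIN_SLOTS 10

struct ChainNode {
    int key;
    struct ChainNode *next;
};

/* Each slot holds the head of a linked list (NULL when empty). */
struct ChainTable {
    struct ChainNode *slot[CHAIN_SLOTS];
};

static int chain_hash(int key)
{
    return key % CHAIN_SLOTS;
}

/* Insert `key` at the front of its slot's chain.
   Returns false if memory could not be allocated. */
bool chain_insert(struct ChainTable *t, int key)
{
    struct ChainNode *node = malloc(sizeof *node);
    if (node == NULL)
        return false;
    int idx = chain_hash(key);
    node->key = key;
    node->next = t->slot[idx];
    t->slot[idx] = node;
    return true;
}

/* Search only the chain of the slot that `key` hashes to. */
bool chain_contains(const struct ChainTable *t, int key)
{
    const struct ChainNode *n = t->slot[chain_hash(key)];
    while (n != NULL) {
        if (n->key == key)
            return true;
        n = n->next;
    }
    return false;
}
```

Keys 55 and 65 both hash to slot 5 here. The second insert does not displace the first; it only makes that slot's chain one node longer.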


Advantages and disadvantages of separate chaining
Advantages are:
1. We can add as many keys as we want.
2. It is simpler to implement than open addressing.
Disadvantages are:
1. It uses extra space for links.
2. If the collision rate is high, the chains grow longer and the
search time increases as the load factor increases.


Open addressing :
In open addressing, all the keys are stored in the hash table
itself, without using any additional memory or extending an
index with a linked list. This is also known as closed hashing,
and it is based mainly on probing.
Probing can be linear or quadratic, or it can use double hashing.
In open addressing, we keep rehashing until the collision is
resolved.

Linear Probing :
In linear probing, the rehashing process is linear. If the
location found at any step is n and n is occupied, then the next
attempt is at position (n+1), wrapping around to the first
location after the last one.
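Linear probing can be sketched in C as follows, assuming non-negative integer keys, a table of 10 slots, an initial hash of `key % 10`, and -1 as the marker for an empty slot. All of these, and the function name `lp_insert`, are illustrative assumptions.

```c
#define LP_SLOTS 10
#define LP_EMPTY (-1)   /* marks an unused slot; assumes keys are non-negative */

/* Insert `key` using linear probing. Returns the index used,
   or -1 if every slot is occupied. */
int lp_insert(int table[LP_SLOTS], int key)
{
    int n = key % LP_SLOTS;              /* initial hash location */
    for (int tries = 0; tries < LP_SLOTS; tries++) {
        if (table[n] == LP_EMPTY) {
            table[n] = key;
            return n;
        }
        n = (n + 1) % LP_SLOTS;          /* occupied: try position n + 1 */
    }
    return -1;
}
```

Inserting 23, 33 and 43 into an empty table places them at indexes 3, 4 and 5. Such a run of adjacent occupied slots is the clustering that linear probing tends to produce.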


Quadratic Probing :
The clustering seen in linear probing can be avoided by using
quadratic probing. Here the rehashing is done as follows:
rehashing(key) = (n + k^2) % table size
where n is the current location and k is 1, 2, 3, ... We wrap
around from the last table location to the first location if
necessary.
For example, if the initial hash location is 3 and 3 is occupied,
then we try 3 + 1^2 = 4; if that is still occupied, we try
4 + 2^2 = 8, and so on.
The main advantage is, we can overcome the problem
of clustering which appeared in the case of linear probing.
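The quadratic probing rule can be sketched in C as follows. The sketch follows the variant used in the example, where k^2 is added to the current location at each step. It assumes non-negative integer keys, a table of 10 slots and an initial hash of `key % 10`; the function name `qp_insert` is hypothetical.

```c
#define QP_SLOTS 10
#define QP_EMPTY (-1)   /* marks an unused slot; assumes keys are non-negative */

/* Insert `key` using quadratic probing. Returns the index used, or -1
   if no free slot was found within QP_SLOTS probes. Unlike linear
   probing, the probe sequence may miss some free slots. */
int qp_insert(int table[QP_SLOTS], int key)
{
    int n = key % QP_SLOTS;               /* initial hash location */
    for (int k = 1; k <= QP_SLOTS; k++) {
        if (table[n] == QP_EMPTY) {
            table[n] = key;
            return n;
        }
        n = (n + k * k) % QP_SLOTS;       /* rehash from the current location */
    }
    return -1;
}
```

Inserting 3, 13 and 23 into an empty table places them at indexes 3, 4 and 8, the same sequence as in the worked example.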



Double hashing :
Double hashing is one of the most effective open addressing
techniques for reducing clustering. Here the probing length (the
distance between successive probes) is determined by a second
hash function.

Say the primary hash function is h1 and the secondary hash
function, which gives the probing length, is h2.
Then f(key) = (h1(key) + k*h2(key)) % table size, where h2 ≠ h1
and k = 0, 1, 2, ...

First we try h1(key). If it is occupied, we try
h1(key) + h2(key), where h2(key) is the probing length. If that
is still occupied, we try h1(key) + 2*h2(key), and so on.
Example: inserting 983 in the hash map. The initial location is
3, but it is occupied.
Next location is 3 + digit(983) = 6, which is occupied too.
Next location is 3 + 2*digit(983) = 9, which is occupied too.
Next location is (3 + 3*digit(983)) % 10 = 12 % 10 = 2, which is
empty, so 983 is stored at index 2.

Index Keys
0 Empty
1 Empty
2 983
3 123
4 124
5 Empty
6 333
7 Empty
8 Empty
9 4679
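The insertion of 983 can be traced with the C sketch below. The slides do not define digit(), so the sketch assumes it counts the decimal digits of the key, which gives digit(983) = 3 as in the example; a different definition would change the probe sequence. The table size of 10, the primary hash `key % 10` and the `dh_` names are likewise illustrative assumptions.

```c
#define DH_SLOTS 10
#define DH_EMPTY (-1)   /* marks an unused slot; assumes keys are non-negative */

/* Primary hash: h1(key) = key % table size. */
static int dh_h1(int key)
{
    return key % DH_SLOTS;
}

/* Assumed secondary hash: the number of decimal digits in the key. */
static int dh_h2(int key)
{
    int digits = 1;
    while (key >= 10) {
        key /= 10;
        digits++;
    }
    return digits;
}

/* Insert `key` using double hashing: probe (h1 + k*h2) % size for
   k = 0, 1, 2, ... Returns the index used, or -1 if no free slot
   was found within DH_SLOTS probes. */
int dh_insert(int table[DH_SLOTS], int key)
{
    for (int k = 0; k < DH_SLOTS; k++) {
        int idx = (dh_h1(key) + k * dh_h2(key)) % DH_SLOTS;
        if (table[idx] == DH_EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return -1;
}
```

With 123, 124, 333 and 4679 already stored at indexes 3, 4, 6 and 9, inserting 983 probes 3, 6 and 9 and then settles at index 2. Note that a step size sharing a factor with the table size (for example 2 with 10 slots) would visit only some of the slots, which is one reason prime table sizes are often preferred.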
End of Unit-I
