Data Structures - Unit - I

Second Year (CSE)
Year 2021-2022
S.R.Tandle
CSE Department
M.S.Bidve Engineering College, Latur.
Unit-I
Introduction
Concept of Sequential Access
Sequential access is sometimes the only way of accessing
data, for example when the data is stored on a tape. It may
also be the access method of choice, for example when we
want to process a sequence of data elements in order.
In data structures, a data structure is said to have
sequential access if one can only visit the values it
contains in one particular order. The canonical example is
the linked list. An array, by contrast, supports random
access: any element can be reached directly through its index.
Indexing into a list that has sequential access requires
O(n) time, where n is the index.
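To illustrate why indexing into a sequentially accessed list costs O(n), here is a minimal sketch in C. The Node type and the list_get function are names invented for this illustration; they do not come from the notes.

```c
#include <stddef.h>

/* Hypothetical singly linked list node, used only for this illustration. */
struct Node {
    int value;
    struct Node *next;
};

/* Return a pointer to the value at position index (0-based), or NULL if
   the list is shorter than that. Each loop step follows one link, so
   reaching position index takes index steps: O(n) time. */
int *list_get(struct Node *head, size_t index)
{
    struct Node *cur = head;
    while (cur != NULL && index > 0) {
        cur = cur->next;
        index--;
    }
    return cur != NULL ? &cur->value : NULL;
}
```

With an array the same lookup is simply a[index], which does not depend on how large the index is.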
An array is a variable which can store multiple values of
the same data type at a time.
An array can also be defined as follows:
“A collection of similar data items stored in contiguous
memory locations under a single name”.
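To understand the concept of arrays, consider the following
example declaration:
int a, b, c;
Here, the compiler allocates memory to each of these
variables separately: 2 bytes each on a compiler with a
2-byte int, as assumed in these notes (on most current
compilers an int takes 4 bytes).
These three memory locations may or may not be in sequence.
For example:
Variable   a      b      c
Address    1001   4802   2466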
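Now see what happens when we declare it as an array:
int a[3];
The three elements are placed in consecutive memory
locations, 2 bytes apart (again assuming a 2-byte int):
Element    a[0]   a[1]   a[2]
Address    1001   1003   1005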
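The contiguous layout can be checked directly. The sketch below measures the distance in bytes between neighbouring elements; the function name element_stride is an assumption made for this example.

```c
#include <stddef.h>

/* Return the distance in bytes between a[i] and a[i + 1].
   For an int array this is always sizeof(int), because array
   elements are stored one after another with no gaps. */
size_t element_stride(const int *a, size_t i)
{
    const char *lo = (const char *)&a[i];
    const char *hi = (const char *)&a[i + 1];
    return (size_t)(hi - lo);
}
```

On a compiler with a 2-byte int the result is 2, which matches the addresses 1001, 1003, 1005. On most current compilers it is 4.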

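#include <stdio.h>

int main(void)
{
    int a[5] = {1, 2, 3, 4, 5};
    int i;

    /* a gives the address of a[0], so *(a + i) is the same as a[i] */
    for (i = 0; i <= 4; i++)
    {
        printf("%d\n", *(a + i));
    }
    return 0;
}

Memory layout, with base address 1001 and a 2-byte int:
Address    1001   1003   1005   1007   1009
Value      1      2      3      4      5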
Primitive      Non-primitive
               Linear         Non-linear
i) int         i) Array       i) Tree
ii) float      ii) List       ii) Graph
iii) char      iii) Stack
iv) pointer    iv) Queue
Storage representation
(Each of the following is illustrated by a figure in the slides; the figures are not reproduced.)
Array
Stack
Queue
List
Tree
Graph (Using array – matrix)
Graph (Using linked list)
Matrix
A matrix is a two-dimensional array, with rows and columns
as its dimensions, that represents a collection of data in
tabular form.
Bit Matrix (Boolean matrix)
Sparse Matrix
A sparse matrix is a matrix in which most of the elements
are zero.
In contrast, a matrix in which most of the elements are
non-zero is called a dense matrix.
Sparse Matrix and its representation
Why use a sparse matrix representation instead of a simple
matrix?
Storage: There are fewer non-zero elements than zeros, so
less memory is needed if only the non-zero elements are stored.
Computing time: Computing time can be saved by designing a
data structure that traverses only the non-zero elements.
Representing a sparse matrix by a 2D array wastes a lot of
memory, because the zeros in the matrix are of no use in
most cases. So, instead of storing the zeros along with the
non-zero elements, we store only the non-zero elements.
Each non-zero element is stored as a triple
(Row, Column, Value).
A sparse matrix can be represented in many ways.
The following are two common representations:
1. Array representation
2. Linked list representation
Method 1: Using Arrays
A 2D array with three rows is used to represent the sparse
matrix. The rows are named:
Row: Index of the row where the non-zero element is located
Column: Index of the column where the non-zero element is located
Value: Value of the non-zero element located at index (row, column)
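The conversion from an ordinary matrix to (row, column, value) triples can be sketched as below. The 4 x 5 dimensions, the Triple struct and the function name to_triples are assumptions chosen for this illustration. Each Triple corresponds to one column of the three-row array described above.

```c
#include <stddef.h>

#define ROWS 4   /* assumed matrix dimensions for this sketch */
#define COLS 5

/* One non-zero entry of the sparse matrix: (row, column, value). */
struct Triple {
    int row;
    int col;
    int value;
};

/* Scan a ROWS x COLS matrix and copy every non-zero element into out[]
   as a (row, column, value) triple. At most max_out triples are written.
   Returns the number of triples written. */
size_t to_triples(int m[ROWS][COLS], struct Triple *out, size_t max_out)
{
    size_t count = 0;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            if (m[r][c] != 0 && count < max_out) {
                out[count].row = r;
                out[count].col = c;
                out[count].value = m[r][c];
                count++;
            }
        }
    }
    return count;
}
```

For a matrix with 6 non-zero elements out of 20, this stores 18 integers instead of 20, and the saving grows quickly as the matrix gets larger and sparser.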
Hash Table
A hash table is a data structure which stores data in an
associative manner.
In a hash table, data is stored in an array format, where
each data value has its own unique index value. Access to
data becomes very fast if we know the index of the desired
data.
It is a data structure in which insertion and search
operations are, on average, very fast irrespective of the
size of the data. A hash table uses an array as the storage
medium and uses a hashing technique to generate the index at
which an element is to be inserted or located.
Hashing
Hashing is a technique to convert a range of key values
into a range of indexes of an array. Different hash
functions can be used to map the keys to index values.
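Consider one example with the following keys:
12, 23, 55, 68, 74, 91, 65
Hash function: H(K) = K % 10 + 1
For example, H(55) = 55 % 10 + 1 = 5 + 1 = 6
Index   Key
0
1
2       91
3       12
4       23
5       74
6       55
7
8
9       68
Key 65 also hashes to index 6 (65 % 10 + 1 = 6), which
already holds 55, so it cannot be placed there directly.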
Searching
To search for a key, apply the same hash function,
H(K) = K % 10 + 1, and look at the resulting index of the
table from the previous example:
Index   Key
0
1
2       91
3       12
4       23
5       74
6       55
7
8
9       68
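Insertion and searching with H(K) = K % 10 + 1 can be sketched as follows. The function names, the EMPTY marker and the table size of 11 are assumptions made for this example. Eleven slots are used because the function produces values from 1 to 10, so a key ending in 9 would need index 10.

```c
#define TABLE_SIZE 11   /* H(K) ranges over 1..10, so index 10 must exist */
#define EMPTY (-1)      /* assumption: keys are non-negative */

/* The hash function from the example: H(K) = K % 10 + 1. */
int hash(int key)
{
    return key % 10 + 1;
}

/* Mark every slot as unused. */
void table_init(int table[TABLE_SIZE])
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Store key at index H(key). Returns 1 on success, 0 if the slot is
   already taken (a collision; no resolution is attempted here). */
int table_insert(int table[TABLE_SIZE], int key)
{
    int i = hash(key);
    if (table[i] != EMPTY)
        return 0;
    table[i] = key;
    return 1;
}

/* Return the index where key is stored, or -1 if it is not in the table. */
int table_search(const int table[TABLE_SIZE], int key)
{
    int i = hash(key);
    return table[i] == key ? i : -1;
}
```

Inserting 12, 23, 55, 68, 74 and 91 fills indexes 3, 4, 6, 9, 5 and 2. Inserting 65 fails because index 6 is already taken by 55.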
Hashing functions
Characteristics of a good hashing function:
• The hash function should generate different hash
values for similar strings.
• The hash function should be easy to understand and
simple to compute.
• The hash function should produce hash values that are
distributed uniformly over the array.
• The number of collisions while placing data in the hash
table should be low.
• The hash function should use all of the input data.
Trivial hash function :

If the keys are uniformly or sufficiently
uniformly distributed over the key space, so
that the key values are essentially random,
they may be considered to be already
'hashed'.
Mid-square hashing:
A mid-square hash code is produced by squaring the input
and extracting an appropriate number of middle digits or bits.
For example, taking the middle two digits of the square:
40 -> 1600 -> 60
41 -> 1681 -> 68
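A minimal sketch of the mid-square method on decimal digits is shown below. The function name mid_square and the parameter r (how many middle digits to keep) are assumptions for this example. When the middle cannot be centred exactly, this sketch leans towards the low-order side.

```c
/* Mid-square hash: square the key and return r digits taken from the
   middle of the decimal representation of the square. */
unsigned long mid_square(unsigned long key, unsigned r)
{
    unsigned long sq = key * key;   /* assumption: the square does not overflow */
    unsigned digits = 0;

    /* Count the decimal digits of the square. */
    for (unsigned long t = sq; t != 0; t /= 10)
        digits++;

    /* Drop the low-order digits that lie to the right of the middle. */
    unsigned drop = digits > r ? (digits - r) / 2 : 0;
    for (unsigned i = 0; i < drop; i++)
        sq /= 10;

    /* Keep the next r digits. */
    unsigned long mod = 1;
    for (unsigned i = 0; i < r; i++)
        mod *= 10;
    return sq % mod;
}
```

Calling mid_square(40, 2) squares 40 to 1600, drops the last digit to get 160, and keeps the last two digits, 60.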
Division hashing:
A standard technique is to use a modulo function on the
key, selecting a divisor M which is a prime number close
to the table size, so h(K) = K mod M.
The table size is usually a power of 2. This gives hash
values in the range 0 to M-1.
This gives good results over a large number of key sets.
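As a small illustration of the division method, the sketch below uses M = 13. That value is an assumption for this example: a prime close to an assumed table size of 16.

```c
/* Division-method hash: h(K) = K mod M.
   M = 13 is an assumed example value (a prime close to a table size of 16). */
#define M 13u

unsigned division_hash(unsigned key)
{
    return key % M;
}
```

Every result lies between 0 and M-1, so it can be used directly as an array index.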
Digit folding method
In this method the key is divided into separate parts, and
these parts are combined using some simple operation to
produce a hash key.
Ex: consider the key 12465512. It is divided into the parts
124, 655 and 12, and the parts are then combined by adding
them:
H(key) = 124 + 655 + 12 = 791
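The folding steps can be sketched in C as follows. The function name fold_hash and the part_len parameter are assumptions for this illustration. The key is split from the left into groups of part_len decimal digits, as in the example, and part_len is assumed to be at least 1.

```c
#include <stdio.h>
#include <string.h>

/* Digit folding: split the decimal digits of key, from the left, into
   groups of part_len digits (the last group may be shorter) and add the
   groups together. Assumption: part_len is at least 1. */
unsigned long fold_hash(unsigned long key, size_t part_len)
{
    char digits[32];
    snprintf(digits, sizeof digits, "%lu", key);
    size_t n = strlen(digits);

    unsigned long sum = 0;
    for (size_t start = 0; start < n; start += part_len) {
        unsigned long part = 0;
        for (size_t i = start; i < start + part_len && i < n; i++)
            part = part * 10 + (unsigned long)(digits[i] - '0');
        sum += part;
    }
    return sum;
}
```

For the key 12465512 with part_len 3 the groups are 124, 655 and 12. In practice the sum is usually reduced further, for example with a modulo by the table size, so that it fits the table.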
Direct Address Table

Direct Address Table is a data structure that has
the capability of mapping records to their
corresponding keys using arrays. In direct address
tables, records are placed using their key values
directly as indexes.
They facilitate fast searching, insertion and
deletion operations.
For example, we create an array of size equal to the maximum
key value plus one (assuming 0-based indexing) and then use
the key values as indexes. Key 21, for instance, is used
directly as index 21.
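A direct address table can be sketched as a plain array of record pointers. The Record type, the MAX_KEY limit of 99 and the function names are assumptions made for this illustration.

```c
#include <stddef.h>

#define MAX_KEY 99   /* assumed largest key value for this illustration */

/* Hypothetical record type: a key plus one data field. */
struct Record {
    int key;
    int data;
};

/* One slot per possible key, 0..MAX_KEY. NULL means "no record". */
static struct Record *dat[MAX_KEY + 1];

/* All three operations index the array directly with the key: O(1).
   Assumption: callers only pass keys in the range 0..MAX_KEY. */
void dat_insert(struct Record *r) { dat[r->key] = r; }
struct Record *dat_search(int key) { return dat[key]; }
void dat_delete(int key)           { dat[key] = NULL; }
```

A record with key 21 is stored at dat[21]. The array always has MAX_KEY + 1 slots, no matter how few records are actually stored.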
Advantages:
1. Searching in O(1) Time: Direct address tables use
arrays which are random access data structure, so,
the key values (which are also the index of the array)
can be easily used to search the records in O(1) time.
2. Insertion in O(1) Time: We can easily insert an
element in an array in O(1) time. The same thing
follows in a direct address table also.
3. Deletion in O(1) Time: Deletion of an element takes
O(1) time in an array. Similarly, to delete an element
in a direct address table we need O(1) time.
Limitations:
1. Prior knowledge of the maximum key value is required.
2. Practically useful only if the maximum value is small.
3. It wastes memory space if there is a significant
difference between the total number of records and the
maximum value.
Perfect hashing
A hash function that maps every distinct key to its own
unique location (index in the hash table) is known as a
perfect hash function, and hashing done with such a
function is known as perfect hashing.
Collision
Since the hash table is small compared to the range of
keys, a perfect hash function is practically impossible.
When two or more keys map to the same location, this is
known as a collision. A good hash function should produce
few collisions.
Collision resolution techniques
Collision resolution means finding another location for a
key when a collision occurs. The most popular resolution
techniques are:
• Separate chaining
• Open addressing
Open addressing can be further divided into:
1. Linear Probing
2. Quadratic Probing
3. Double hashing
Separate chaining:
In separate chaining, we maintain a linked chain (a linked
list) for every index in the hash table. Whenever there is
a collision, the linked list for that particular location
of the hash table is extended with the new key.
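Separate chaining can be sketched in C as an array of linked list heads. The table size of 10, the hash function key % 10 and all names below are assumptions chosen for this illustration.

```c
#include <stdlib.h>

#define TABLE_SIZE 10   /* assumed table size */

struct ChainNode {
    int key;
    struct ChainNode *next;
};

/* One chain (linked list head) per table index; all start out empty. */
static struct ChainNode *buckets[TABLE_SIZE];

/* Assumed hash function for this sketch; keys are non-negative. */
static int hash(int key) { return key % TABLE_SIZE; }

/* Insert key at the front of its chain. Returns 1 on success, 0 if
   memory could not be allocated. */
int chain_insert(int key)
{
    struct ChainNode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    int i = hash(key);
    n->key = key;
    n->next = buckets[i];
    buckets[i] = n;
    return 1;
}

/* Return 1 if key is somewhere in its chain, 0 otherwise. */
int chain_search(int key)
{
    for (struct ChainNode *n = buckets[hash(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

Keys 12, 22 and 32 all hash to index 2 and end up in the same chain. A search for any of them walks only that one chain. The sketch never frees its nodes; a full implementation would also provide deletion.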
Advantages and disadvantages of separate chaining
Advantages:
1. We can add as many keys as we want.
2. It is simpler to implement than open addressing.
Disadvantages:
1. It uses extra space for the links.
2. If the collision rate is high, the chains grow longer and
the search complexity increases with the load factor.
Open addressing:
In open addressing, all the keys are stored in the hash
table itself, without using any additional memory or
extending an index with a linked list. This is also known
as closed hashing, and it is based mainly on probing.
Probing can be linear probing, quadratic probing or double
hashing. In open addressing, we keep rehashing until we
find an empty location.
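Linear Probing:
In linear probing, the rehashing process is linear. If the
location found at any step is n and n is occupied, the next
attempt is made at position (n+1), wrapping around to the
first location after the last one. Because occupied
positions tend to build up in consecutive runs, linear
probing suffers from clustering.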
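Linear probing insertion can be sketched as below. The table size of 10, the initial hash key % 10, the EMPTY marker and the function name are assumptions made for this example.

```c
#define TABLE_SIZE 10   /* assumed table size */
#define EMPTY (-1)      /* assumption: keys are non-negative */

/* Insert key using linear probing, starting at key % TABLE_SIZE.
   Returns the index used, or -1 if the table is full.
   Every slot of table must be set to EMPTY before the first insert. */
int linear_insert(int table[TABLE_SIZE], int key)
{
    int n = key % TABLE_SIZE;
    for (int tries = 0; tries < TABLE_SIZE; tries++) {
        if (table[n] == EMPTY) {
            table[n] = key;
            return n;
        }
        n = (n + 1) % TABLE_SIZE;   /* occupied: try the next position */
    }
    return -1;
}
```

Inserting 23, 33 and 43 places them at indexes 3, 4 and 5, a run of neighbouring slots. Inserting 19 after 9 wraps around from index 9 to index 0.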
Quadratic Probing:
The problem of clustering seen in linear probing can be
largely avoided by using quadratic probing. Here the
rehashing is done as follows:
rehashing(key) = (n + k^2) % table size
where n is the location just found to be occupied and k is
1, 2, 3, ... We wrap around from the last table location to
the first location if necessary.
For example, if the initial hash location is 3 and 3 is
occupied, we try 3 + 1^2 = 4. If that is still occupied, we
try 4 + 2^2 = 8, and so on.
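The sketch below follows the rule as it is written in these notes: each new position is the previous position plus k squared, modulo the table size. The table size of 10, the initial hash key % 10 and the names are assumptions for this example. Many textbooks instead compute the k-th probe from the original hash location, which gives a different probe sequence.

```c
#define TABLE_SIZE 10   /* assumed table size */
#define EMPTY (-1)      /* assumption: keys are non-negative */

/* Insert key using quadratic probing as described in the notes:
   the next position is (previous position + k*k) % TABLE_SIZE for
   k = 1, 2, 3, ... Returns the index used, or -1 if no empty slot
   was found within TABLE_SIZE attempts.
   Every slot of table must be set to EMPTY before the first insert. */
int quadratic_insert(int table[TABLE_SIZE], int key)
{
    int n = key % TABLE_SIZE;
    for (int k = 1; k <= TABLE_SIZE; k++) {
        if (table[n] == EMPTY) {
            table[n] = key;
            return n;
        }
        n = (n + k * k) % TABLE_SIZE;   /* occupied: jump ahead by k^2 */
    }
    return -1;
}
```

Inserting 3, 13 and 23 in that order uses indexes 3, 4 and 8, the same sequence as the worked example. Quadratic probing can miss empty slots, so a failed insert does not always mean the table is full.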
The main advantage is that we can overcome the clustering
problem which appears in linear probing.
Double hashing:
Double hashing is an open addressing technique that is very
effective at reducing the chances of clustering. Here the
probing length (the step between attempts) is determined by
a second hash function.
Say the primary hash function is h1 and the secondary hash
function, which sets the probing length, is h2. Then
f(key) = (h1(key) + k*h2(key)) % table size
where h2 ≠ h1 and k = 0, 1, 2, ...
First we try h1(key). If it is occupied, we go to
h1(key) + h2(key), where h2(key) is the probing length. If
that is still occupied, we go to h1(key) + 2*h2(key), and
so on.
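Example: inserting 983 into the hash table, with
h2(983) = digit(983) = 3.
The first location is 3, but it is occupied.
The next location is 3 + digit(983) = 6, which is occupied too.
The next location is 3 + 2*digit(983) = 9, which is occupied too.
The next location is (3 + 3*digit(983)) % 10 = 12 % 10 = 2,
which is empty, so 983 is stored at index 2.
The table after the insertion: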
Index Keys
0 Empty
1 Empty
2 983
3 123
4 124
5 Empty
6 333
7 Empty
8 Empty
9 4679
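The example can be reproduced with the sketch below. The notes do not define digit(key); here it is read as the number of decimal digits in the key, which is one interpretation consistent with digit(983) = 3. That reading, the primary hash key % 10, the EMPTY marker and all function names are assumptions made for this illustration.

```c
#define TABLE_SIZE 10   /* table size used in the example */
#define EMPTY (-1)      /* assumption: keys are non-negative */

/* Primary hash (assumed): h1(key) = key % TABLE_SIZE. */
static int h1(int key) { return key % TABLE_SIZE; }

/* Secondary hash. ASSUMPTION: digit(key) is read here as the number of
   decimal digits in key, which gives digit(983) = 3 as in the example. */
static int h2(int key)
{
    int digits = 1;
    while (key >= 10) {
        key /= 10;
        digits++;
    }
    return digits;
}

/* Insert key with double hashing: the k-th attempt is made at
   (h1(key) + k * h2(key)) % TABLE_SIZE for k = 0, 1, 2, ...
   Returns the index used, or -1 if no empty slot was found.
   Every slot of table must be set to EMPTY before the first insert. */
int double_hash_insert(int table[TABLE_SIZE], int key)
{
    for (int k = 0; k < TABLE_SIZE; k++) {
        int n = (h1(key) + k * h2(key)) % TABLE_SIZE;
        if (table[n] == EMPTY) {
            table[n] = key;
            return n;
        }
    }
    return -1;
}
```

Inserting 123, 124, 333 and 4679 into an empty table and then inserting 983 reproduces the table above: 983 tries indexes 3, 6 and 9 before landing at index 2.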
End of Unit-I