
Data Structure and Algorithm:

A data structure is a named location that can be used to store and organize
data, and an algorithm is a collection of steps to solve a particular problem.
Learning data structures and algorithms allows us to write efficient and optimized
computer programs.
A data structure is a way of collecting and organizing data so that we can
perform operations on it effectively. Data structures arrange
data elements in terms of some relationship, for better organization and
storage.
If you are familiar with object-oriented programming concepts, a class does
much the same thing: it collects different types of data under a single entity. The
difference is that data structures also provide techniques to access and manipulate
data efficiently.

In simple language, data structures are structures programmed to store ordered
data, so that various operations can be performed on it easily. A data structure
represents how data is to be organized in memory. It should be designed and
implemented in such a way that it reduces complexity and increases
efficiency.

Basic types of Data Structures

As discussed above, anything that can store data can be called a data
structure; hence Integer, Float, Boolean, Char, etc. are all data structures. They are
known as primitive data structures.

We also have more complex data structures, which are used to store large
and connected data. Some examples of abstract data structures are:

• Linked List
• Tree
• Graph
• Stack, Queue, etc.

All these data structures allow us to perform different operations on data. We
select a data structure based on which type of operation is required. We will
look into these data structures in more detail in later lessons.

Data structures can also be classified on the basis of the following
characteristics:

Characteristic: Description

Linear: In linear data structures, the data items are arranged in a linear sequence. Example: Array

Non-Linear: In non-linear data structures, the data items are not in sequence. Example: Tree, Graph

Homogeneous: In homogeneous data structures, all the elements are of the same type. Example: Array

Non-Homogeneous: In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures

Static: Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array

Dynamic: Dynamic data structures are those which expand or shrink depending upon the program's needs during execution; their associated memory locations also change. Example: Linked List created using pointers

What is an Algorithm?

An algorithm is a finite set of instructions or logic, written in order, to accomplish
a certain predefined task. An algorithm is not the complete code or program; it is just
the core logic (solution) of a problem, which can be expressed either as an informal
high-level description (pseudocode) or as a flowchart.

Every algorithm must satisfy the following properties:

1. Input - There should be 0 or more inputs supplied externally to the algorithm.
2. Output - There should be at least 1 output obtained.
3. Definiteness - Every step of the algorithm should be clear and well defined.
4. Finiteness - The algorithm should have a finite number of steps.
5. Correctness - Every step of the algorithm must generate a correct output.

An algorithm is said to be efficient and fast if it takes less time to execute and
consumes less memory space. The performance of an algorithm is measured on the
basis of the following properties:

1. Time Complexity
2. Space Complexity

Space Complexity

It is the amount of memory space required by the algorithm during the course of its
execution. Space complexity must be taken seriously for multi-user systems and in
situations where limited memory is available.

An algorithm generally requires space for the following components:

• Instruction Space: The space required to store the executable version of
the program. This space is fixed for a given program, but varies from program to
program with the number of lines of code.
• Data Space: The space required to store the values of all constants and
variables (including temporary variables).
• Environment Space: The space required to store the environment
information needed to resume a suspended function.

Time Complexity

Time complexity is a way to represent the amount of time required by the program
to run to completion. It is generally good practice to keep the time
required to a minimum, so that the algorithm completes its execution as quickly as
possible.
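
As a rough illustration of time complexity, the sketch below counts the steps taken by a linear search and by a binary search over the same sorted data. The helper names linearSearchSteps and binarySearchSteps are hypothetical and exist only for this example; the step counts are a stand-in for running time, not a measurement of it.

```cpp
#include <cstddef>
#include <vector>

// Counts the comparisons a linear search makes while looking for `target`
// (hypothetical helper for illustration).
std::size_t linearSearchSteps(const std::vector<int>& data, int target) {
    std::size_t steps = 0;
    for (int value : data) {
        ++steps;                      // one comparison per element visited
        if (value == target) break;
    }
    return steps;
}

// Counts the loop iterations a binary search makes over sorted data
// (hypothetical helper for illustration).
std::size_t binarySearchSteps(const std::vector<int>& sorted, int target) {
    std::size_t steps = 0;
    std::size_t low = 0, high = sorted.size();   // search range is [low, high)
    while (low < high) {
        ++steps;
        std::size_t mid = low + (high - low) / 2;
        if (sorted[mid] == target) break;
        if (sorted[mid] < target) low = mid + 1; // discard the lower half
        else high = mid;                         // discard the upper half
    }
    return steps;
}
```

For a sorted vector of 1000 elements, finding the last element needs 1000 steps with the linear search but no more than 10 with the binary search, which is why the choice of algorithm matters as the input grows.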

Why Learn Data Structure?

Data structures and algorithms are two of the most important aspects of computer
science. Data structures allow us to organize and store data, while algorithms allow
us to process that data in a meaningful way. Learning data structures and algorithms
will help you become a better programmer. You will be able to write code that is
more efficient and more reliable, and to solve problems more
quickly and more effectively.

Types of Data Structure

There are 2 types of data structure:

• Primitive Data Structure
• Non-Primitive Data Structure

Primitive Data Structure –

Primitive data structures are operated on directly by machine instructions.
These are the primitive data types. Data types like int, char, float, double, and
pointer are primitive data structures that can hold a single value.

Non-Primitive Data Structure –

Non-primitive data structures are complex data structures that are derived from
primitive data structures. Non-primitive data structures are further divided into two
categories:

• Linear Data Structure
• Non-Linear Data Structure

Linear Data Structure –

A linear data structure consists of data elements arranged in a sequential manner,
where every element is connected to its previous and next elements. This
connection makes it possible to traverse a linear arrangement in a single level and in a
single run. Such data structures are easy to implement because computer memory is also
sequential. Some examples of linear data structures are List, Queue, Stack, Array,
etc.

Types of Linear Data Structure –

1] Arrays –

An array is a collection of similar data elements stored at contiguous memory
locations. It is the simplest data structure, in which each data element can be accessed
directly using only its index number.
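
A minimal sketch of index-based access, assuming a small fixed array of integers. The names kMarks and markAt are made up for this example.

```cpp
#include <array>
#include <cstddef>

// Five integers stored contiguously; the size is fixed at compile time.
// (kMarks is a hypothetical sample array.)
constexpr std::array<int, 5> kMarks = {70, 85, 90, 65, 80};

// Returns the element at `index`; direct access needs no traversal.
int markAt(std::size_t index) {
    return kMarks.at(index);   // at() throws std::out_of_range on a bad index
}
```

Because the elements sit next to each other in memory, the address of element i can be computed from the address of element 0, which is what makes direct access by index possible.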

2] Linked List –

A linked list is a linear data structure that is used to maintain a list-like structure in
the computer memory. It is a group of nodes that are not stored at contiguous
locations. Each node of the list is linked to its adjacent node with the help of
pointers.
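
The node-and-pointer idea can be sketched as follows. This is one possible layout for a singly linked list of integers; ListNode, pushFront and toVector are illustrative names, and std::unique_ptr is used here only so that the nodes are freed automatically.

```cpp
#include <memory>
#include <utility>
#include <vector>

// One node of a singly linked list: a value plus a pointer to the next node.
struct ListNode {
    int value = 0;
    std::unique_ptr<ListNode> next;
};

// Inserts a new node at the front of the list and returns the new head.
std::unique_ptr<ListNode> pushFront(std::unique_ptr<ListNode> head, int value) {
    auto node = std::make_unique<ListNode>();
    node->value = value;
    node->next = std::move(head);   // the old head becomes the second node
    return node;
}

// Walks the list from head to tail, collecting the values in order.
std::vector<int> toVector(const ListNode* head) {
    std::vector<int> out;
    for (const ListNode* cur = head; cur != nullptr; cur = cur->next.get()) {
        out.push_back(cur->value);
    }
    return out;
}
```

Unlike an array, reaching the n-th element here means following n links from the head, since the nodes are not stored contiguously.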

3] Stack –

A stack is a linear data structure in which operations are performed in a specific
order. The order is LIFO (Last In First Out), which is equivalently described as FILO
(First In Last Out).

The basic operations performed on a stack are as follows:

• Push – Adds an item on top of the stack.
• Pop – Deletes or removes the topmost item from the stack.
• Top – Returns the topmost element of the stack.
• IsEmpty – Returns true if the stack is empty.
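
The four stack operations above can be sketched as a small class. This is a minimal illustration backed by std::vector; the class name IntStack is an assumption of this example, not a standard type.

```cpp
#include <cassert>
#include <vector>

// A minimal integer stack backed by std::vector (illustrative only).
class IntStack {
public:
    void push(int item) { items_.push_back(item); }   // add on top
    void pop() {                                      // remove the top item
        assert(!items_.empty());
        items_.pop_back();
    }
    int top() const {                                 // read the top item
        assert(!items_.empty());
        return items_.back();
    }
    bool isEmpty() const { return items_.empty(); }
private:
    std::vector<int> items_;   // the back of the vector is the top of the stack
};
```

Pushing 1, 2, 3 and then popping returns the items in the order 3, 2, 1, which is the LIFO behaviour.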
4] Queue –

A queue is a linear data structure in which elements can be inserted only at one
end, known as the rear, and deleted only from the other end, known as the front. It
follows the FIFO (First In First Out) order.

The basic operations performed on a queue are as follows:

• Enqueue – Adds an element to the rear of the queue.
• Dequeue – Deletes or removes an element from the front of the queue.
• IsFull – Returns true if the queue is full.
• IsEmpty – Returns true if the queue is empty.
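
A minimal sketch of these queue operations, assuming a fixed-capacity circular array so that IsFull is meaningful. The class name IntQueue and the capacity of 4 are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A fixed-capacity circular queue of ints (illustrative sketch).
class IntQueue {
public:
    static constexpr std::size_t kCapacity = 4;   // assumed capacity

    bool isEmpty() const { return count_ == 0; }
    bool isFull() const { return count_ == kCapacity; }

    void enqueue(int item) {                      // insert at the rear
        assert(!isFull());
        items_[(front_ + count_) % kCapacity] = item;
        ++count_;
    }
    int dequeue() {                               // remove from the front
        assert(!isEmpty());
        int item = items_[front_];
        front_ = (front_ + 1) % kCapacity;        // wrap around the array
        --count_;
        return item;
    }
private:
    std::array<int, kCapacity> items_{};
    std::size_t front_ = 0;   // index of the oldest element
    std::size_t count_ = 0;   // number of stored elements
};
```

Elements leave in the same order in which they arrived, which is the FIFO behaviour.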

Non-Linear Data Structure –

Non-linear data structures do not have a set sequence connecting all their
elements, and every element can have multiple paths to other elements.
Such data structures support multi-level storage and often can't be traversed
in a single run. They aren't as easy to implement but are more efficient
in utilizing memory. Some examples of non-linear data structures are Tree, BST,
Graph, etc.

Types of Non-Linear Data Structure

1] Tree –
A tree is a multilevel data structure defined as a set of nodes. The topmost node is
called the root node, while the bottommost nodes are called leaf nodes. Every node
except the root has exactly one parent, and each node can have multiple children.
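
As a sketch, a general tree node can hold a value and a list of child nodes. The names TreeNode, isLeaf and countNodes are illustrative and not taken from any library.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// A node of a general tree: one value and any number of children.
struct TreeNode {
    int value = 0;
    std::vector<std::unique_ptr<TreeNode>> children;
};

// A node with no children is a leaf.
bool isLeaf(const TreeNode& node) { return node.children.empty(); }

// Counts all nodes in the subtree rooted at `node`, including `node` itself.
std::size_t countNodes(const TreeNode& node) {
    std::size_t total = 1;
    for (const auto& child : node.children) total += countNodes(*child);
    return total;
}
```

The recursion in countNodes mirrors the multilevel shape of the tree: each node handles itself and delegates the rest to its children.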

Types of Trees in Data structure

• General Tree
• Binary Tree
• Binary Search Tree
• AVL Tree
• Red Black Tree
• N-ary Tree
2] Graph

A graph is a pictorial representation of a set of objects connected by links. The
interconnected objects are represented by points called vertices, and
the links that connect the vertices are called edges.

Types of Graph

• Finite Graph
• Infinite Graph
• Trivial Graph
• Simple Graph
• Multi Graph
• Null Graph
• Complete Graph
• Pseudo Graph
• Regular Graph
• Bipartite Graph
• Labeled Graph
• Digraph (Directed Graph)
• Subgraph
• Connected or Disconnected Graph
• Cyclic Graph
• Vertex Labelled Graph
• Directed Acyclic Graph

A graph is a pair of sets (V, E), where V is the set of vertices and E is the set of
edges.
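
One common way to store the pair (V, E) in a program is an adjacency list. The sketch below assumes an undirected graph whose vertices are numbered from 0; AdjacencyList and buildGraph are hypothetical names used only here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Adjacency-list form of an undirected graph G = (V, E).
// Vertices are numbered 0 .. vertexCount-1 (an assumption of this sketch).
using AdjacencyList = std::vector<std::vector<int>>;

AdjacencyList buildGraph(std::size_t vertexCount,
                         const std::vector<std::pair<int, int>>& edges) {
    AdjacencyList adj(vertexCount);
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);   // record the edge in both
        adj[edge.second].push_back(edge.first);   // directions: undirected
    }
    return adj;
}
```

For example, buildGraph(4, {{0, 1}, {0, 2}, {2, 3}}) gives vertex 0 the neighbours 1 and 2, and vertex 3 the single neighbour 2.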
Classification of Data Structure

Data structures can be further classified as:

• Static Data Structure
• Dynamic Data Structure

Static Data Structure

Static data structures are data structures whose size is allocated at compile
time. Hence, the maximum size is fixed and cannot be changed.

Dynamic Data Structure

Dynamic data structures are data structures whose size is allocated at run
time. Hence, the maximum size is flexible and can be changed as per requirement.
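
The contrast can be sketched in C++ with std::array (size fixed at compile time) and std::vector (size changes at run time). The function names staticCapacity and firstSquares are made up for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Static: the capacity (3) is part of the type and fixed at compile time.
constexpr std::size_t staticCapacity() {
    return std::array<int, 3>{}.size();
}

// Dynamic: the container grows at run time as elements are appended.
std::vector<int> firstSquares(std::size_t count) {
    std::vector<int> squares;
    for (std::size_t i = 1; i <= count; ++i) {
        squares.push_back(static_cast<int>(i * i));
    }
    return squares;
}
```

The capacity of the array can be checked by the compiler itself, while the size of the vector depends on the value of count supplied when the program runs.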

Data Structure Operations –

The common operations that can be performed on data structures are as follows:

• Searching – We can search for any data element in a data structure.
• Sorting – We can sort the elements in either ascending or descending order.
• Insertion – We can insert new data elements into the data structure.
• Deletion – We can delete data elements from the data structure.
• Updating – We can update or replace the existing elements in the data
structure.
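
The five operations above can be sketched on a std::vector of integers. The wrapper functions below (contains, sortAscending, insertAt, deleteAt, updateAt) are hypothetical helpers written for this example; they simply call standard library facilities.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Searching: returns true if `target` occurs in `data`.
bool contains(const std::vector<int>& data, int target) {
    return std::find(data.begin(), data.end(), target) != data.end();
}

// Sorting: orders the elements ascending, in place.
void sortAscending(std::vector<int>& data) {
    std::sort(data.begin(), data.end());
}

// Insertion: places `value` before position `index` (index <= data.size()).
void insertAt(std::vector<int>& data, std::size_t index, int value) {
    data.insert(data.begin() + static_cast<std::ptrdiff_t>(index), value);
}

// Deletion: removes the element at position `index` (index < data.size()).
void deleteAt(std::vector<int>& data, std::size_t index) {
    data.erase(data.begin() + static_cast<std::ptrdiff_t>(index));
}

// Updating: overwrites the element at position `index`.
void updateAt(std::vector<int>& data, std::size_t index, int value) {
    data.at(index) = value;   // at() throws std::out_of_range on a bad index
}
```

How expensive each operation is depends on the underlying structure; for a vector, insertion and deletion in the middle shift the later elements, while updating by index does not.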
Advantages of Data Structure –

1. Data structures allow information to be stored on hard disks.
2. An appropriate choice of ADT (Abstract Data Type) makes the program
more efficient.
3. Data structures are necessary for designing efficient algorithms.
4. They provide reusability and abstraction.
5. Using appropriate data structures can help programmers save a good amount
of time while performing operations such as storage, retrieval, or processing
of data.
6. Manipulation of large amounts of data is easier.
Data Structure Applications
1. Organization of data in a computer’s memory
2. Representation of information in databases
3. Algorithms that search through data (such as a search engine)
4. Algorithms that manipulate data (such as a word processor)
5. Algorithms that analyze data (such as a data miner)
6. Algorithms that generate data (such as a random number generator)
7. Algorithms that compress and decompress data (such as a zip utility)
8. Algorithms that encrypt and decrypt data (such as a security system)
9. Software that manages files and directories (such as a file manager)
10. Software that renders graphics (such as a web browser or 3D rendering
software)

Problem analysis is the process of defining a problem and decomposing the overall
system into smaller parts to identify the possible inputs, processes and outputs
associated with the problem. This task is further subdivided into six subtasks,
namely:

1. Specifying the Objective :

First, we need to know what problem is actually being solved. How easily a
clear statement of the problem can be made depends upon the size and complexity of the
problem. Smaller problems not involving multiple subsystems can easily be
stated, and then we can move on to the next step of “Program Design”.
However, a problem interacting with various subsystems and series of
programs requires complex analysis, in-depth research and careful
coordination of people, procedures and programs.

2. Specifying the Output :

Before identifying the inputs required for the system, we need to identify what
comes out of the system. The best way to specify output is to prepare some
output forms and the required format for displaying results. The best person to
judge an output form is the end user of the system, i.e. the one who uses the
software to their benefit. The programmer can design various forms,
which must be examined to see whether they are useful or not.

3. Specifying Input Requirements :

After the outputs have been specified, the input and data required for the system
need to be specified as well. One needs to identify the list of inputs required
and the source of the data. For example, in a simple program to keep students'
records, the inputs could be the student's name, address, roll number, etc.
The sources could be the students themselves or the person supervising
them.

4. Specifying Processing Requirements :

When the outputs and inputs are specified, we need to specify the process that
converts the specified inputs into the desired output. If the proposed program is to
replace or supplement an existing one, a careful evaluation of the present
processing procedures needs to be made, noting any improvements that
could be made. If the proposed system is not designed to replace an existing
system, then it is well advised to carefully evaluate another system that
addresses a similar problem.

5. Evaluating the Feasibility :

After the successful completion of the above four steps, one needs to see
whether the things accomplished so far in the process of problem solving are
practical and feasible. To replace an existing system, one needs to determine
how the potential improvements outperform the existing system or other
similar systems.

6. Problem Analysis Documentation

Before concluding the program analysis stage, it is best to record whatever
has been done so far in the first phase of program development. The record
should contain the statement of program objectives, output and input
specifications, processing requirements and feasibility.

What is abstract data type?

An abstract data type is an abstraction of a data structure that provides only the
interface to which the data structure must adhere. The interface does not give any
specific details about how something should be implemented or in what programming
language.

In other words, abstract data types are entities that are
definitions of data and operations but do not have implementation details. In this
case, we know the data that we are storing and the operations that can be
performed on the data, but we don't know about the implementation details. The
reason for not having implementation details is that every programming language
has a different implementation strategy; for example, a C data structure is
implemented using structures, while a C++ data structure is implemented using
objects and classes.

For example, a List is an abstract data type that can be implemented using a dynamic
array or a linked list. A queue can be implemented as a linked list-based queue, an
array-based queue, or a stack-based queue. A Map can be implemented using a tree map,
hash map, or hash table.

Abstract data type model

Before knowing about the abstract data type model, we should know about
abstraction and encapsulation.

Abstraction: It is a technique of hiding the internal details from the user and
showing only the necessary details.

Encapsulation: It is a technique of combining the data and the member functions in
a single unit.

The above figure shows the ADT model. There are two types of functions in the
ADT model, i.e., the public functions and the private functions. The ADT model also
contains the data structures that we are using in a program. In this model,
encapsulation is performed first, i.e., all the data is wrapped in a single unit, the ADT.
Then abstraction is performed, which means showing the operations that can be
performed on the data structure and which data structures we are using
in a program.

Let's understand the abstract data type with a real-world example.

Consider a smartphone. We look at the high-level specifications of the
smartphone, such as:

o 4 GB RAM
o Snapdragon 2.2 GHz processor
o 5 inch LCD screen
o Dual camera
o Android 8.0

The above specifications of the smartphone are the data, and we can also perform
the following operations on the smartphone:

o call(): We can call through the smartphone.
o text(): We can text a message.
o photo(): We can click a photo.
o video(): We can also make a video.

The smartphone is an entity whose data (specifications) and operations are given
above. Together, the data and the operations form the abstract or logical view of
a smartphone.

The implementation view of the above abstract/logical view is given below:


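#include <string>

// Implementation view of the smartphone described above.
class Smartphone
{
private:
    int ramSize;                    // RAM in GB
    std::string processorName;
    float screenSize;               // screen size in inches
    int cameraCount;
    std::string androidVersion;
public:
    void call();                    // make a call
    void text();                    // text a message
    void photo();                   // click a photo
    void video();                   // make a video
};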

The above code is the implementation of the specifications and operations that can
be performed on the smartphone. The implementation view can differ because the
syntax of programming languages is different, but the abstract/logical view of the
data structure would remain the same. Therefore, we can say that the
abstract/logical view is independent of the implementation view.
To understand ADTs, let us consider the different in-built data types that are provided
to us. Data types such as int, float, double, long, etc. are considered to be in-built
data types, and we can perform basic operations with them such as addition, subtraction,
division, multiplication, etc. Now there might be a situation when we need
operations for our user-defined data type, which have to be defined. These
operations can be defined only as and when we require them. So, in order to
simplify the process of solving problems, we can create data structures along with
their operations, and such data structures that are not in-built are known as
Abstract Data Types (ADTs).

An abstract data type (ADT) is a type (or class) for objects whose behavior is
defined by a set of values and a set of operations. The definition of an ADT only
mentions what operations are to be performed but not how these operations will
be implemented. It does not specify how data will be organized in memory or
what algorithms will be used for implementing the operations. It is called
“abstract” because it gives an implementation-independent view.
The process of providing only the essentials and hiding the details is known as
abstraction.

The user of a data type does not need to know how that data type is implemented.
For example, we have been using primitive data types like int, float and char
knowing only what operations can be performed on them,
without any idea of how they are implemented.
So a user only needs to know what a data type can do, but not how it is
implemented. Think of an ADT as a black box which hides the inner structure and
design of the data type. Now we’ll define three ADTs,
namely the List ADT, Stack ADT and Queue ADT.

1. List ADT

• The data is generally stored in key sequence in a list which has a head
structure consisting of a count, pointers, and the address of the compare function needed
to compare the data in the list.
• The data node contains a pointer to a data structure and a self-referential
pointer which points to the next node in the list.
• The List ADT functions are given below:
• get() – Return an element from the list at any given position.
• insert() – Insert an element at any position of the list.
• remove() – Remove the first occurrence of any element from a non-empty
list.
• removeAt() – Remove the element at a specified location from a non-empty
list.
• replace() – Replace an element at any position by another element.
• size() – Return the number of elements in the list.
• isEmpty() – Return true if the list is empty, otherwise return false.
• isFull() – Return true if the list is full, otherwise return false.
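
The List ADT functions above can be written down as an interface, with the storage chosen separately. The sketch below is one possible rendering for a list of integers; IntListADT and VectorIntList are hypothetical names, and the vector-backed implementation is only one of several that would satisfy the same interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// The interface states what a list can do, not how it is stored.
class IntListADT {
public:
    virtual ~IntListADT() = default;
    virtual int get(std::size_t position) const = 0;
    virtual void insert(std::size_t position, int element) = 0;
    virtual bool remove(int element) = 0;              // first occurrence only
    virtual void removeAt(std::size_t position) = 0;
    virtual void replace(std::size_t position, int element) = 0;
    virtual std::size_t size() const = 0;
    virtual bool isEmpty() const = 0;
    virtual bool isFull() const = 0;
};

// One possible implementation, backed by a dynamic array.
class VectorIntList : public IntListADT {
public:
    int get(std::size_t position) const override { return items_.at(position); }
    void insert(std::size_t position, int element) override {
        items_.insert(items_.begin() + static_cast<std::ptrdiff_t>(position), element);
    }
    bool remove(int element) override {
        auto it = std::find(items_.begin(), items_.end(), element);
        if (it == items_.end()) return false;          // element not present
        items_.erase(it);
        return true;
    }
    void removeAt(std::size_t position) override {
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(position));
    }
    void replace(std::size_t position, int element) override {
        items_.at(position) = element;
    }
    std::size_t size() const override { return items_.size(); }
    bool isEmpty() const override { return items_.empty(); }
    bool isFull() const override { return false; }     // treated as never full in this sketch
private:
    std::vector<int> items_;
};
```

Code that only uses IntListADT would keep working if VectorIntList were swapped for a linked-list implementation with the same functions.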
2. Stack ADT

View of stack

• In the Stack ADT implementation, instead of data being stored in each node, a
pointer to the data is stored.
• The program allocates memory for the data, and the address is passed to the stack
ADT.
• The head node and the data nodes are encapsulated in the ADT. The calling
function can only see the pointer to the stack.
• The stack head structure also contains a pointer to the top and a count of the number of
entries currently in the stack.
• push() – Insert an element at one end of the stack called top.
• pop() – Remove and return the element at the top of the stack, if it is not
empty.
• peek() – Return the element at the top of the stack without removing it, if the
stack is not empty.
• size() – Return the number of elements in the stack.
• isEmpty() – Return true if the stack is empty, otherwise return false.
• isFull() – Return true if the stack is full, otherwise return false.
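
The head-and-node layout described above can be sketched roughly as follows. The struct and function names (StackNode, StackHead, push, pop, peek) are assumptions of this example; the data itself stays owned by the caller, and the stack stores only pointers to it.

```cpp
#include <cstddef>

// A data node holds a pointer to caller-owned data and a link to the node below it.
struct StackNode {
    void* dataPtr;
    StackNode* link;
};

// The head structure tracks the top node and the number of entries.
struct StackHead {
    StackNode* top = nullptr;
    std::size_t count = 0;
};

// Stores the address of the caller's data in a new node on top of the stack.
void push(StackHead& stack, void* dataPtr) {
    stack.top = new StackNode{dataPtr, stack.top};
    ++stack.count;
}

// Removes the top node and returns its data pointer, or nullptr if the stack is empty.
void* pop(StackHead& stack) {
    if (stack.top == nullptr) return nullptr;
    StackNode* node = stack.top;
    void* dataPtr = node->dataPtr;
    stack.top = node->link;
    delete node;            // frees the node only; the data belongs to the caller
    --stack.count;
    return dataPtr;
}

// Returns the top data pointer without removing it, or nullptr if the stack is empty.
void* peek(const StackHead& stack) {
    return stack.top ? stack.top->dataPtr : nullptr;
}
```

Because the nodes hold void pointers, the same stack code can carry any kind of data, at the cost of the caller casting the pointer back to the right type.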
3. Queue ADT

View of Queue

• The queue abstract data type (ADT) follows the basic design of the stack
abstract data type.
• Each node contains a void pointer to the data and a link pointer to the next
element in the queue. The program’s responsibility is to allocate memory for
storing the data.
• enqueue() – Insert an element at the end of the queue.
• dequeue() – Remove and return the first element of the queue, if the queue is
not empty.
• peek() – Return the first element of the queue without removing it, if the queue is
not empty.
• size() – Return the number of elements in the queue.
• isEmpty() – Return true if the queue is empty, otherwise return false.
• isFull() – Return true if the queue is full, otherwise return false.
Features of ADT:
Abstract data types (ADTs) are a way of encapsulating data and operations
on that data into a single unit. Some of the key features of ADTs include:
• Abstraction: The user does not need to know the implementation of the data
structure; only the essentials are provided.
• Better Conceptualization: An ADT gives us a better conceptualization of the
real world.
• Robust: The program is robust and has the ability to catch errors.
• Encapsulation: ADTs hide the internal details of the data and provide a
public interface for users to interact with the data. This allows for easier
maintenance and modification of the data structure.
• Data Abstraction: ADTs provide a level of abstraction from the
implementation details of the data. Users only need to know the operations
that can be performed on the data, not how those operations are implemented.
• Data Structure Independence: ADTs can be implemented using different
data structures, such as arrays or linked lists, without affecting the
functionality of the ADT.
• Information Hiding: ADTs can protect the integrity of the data by allowing
access only to authorized users and operations. This helps prevent errors and
misuse of the data.
• Modularity: ADTs can be combined with other ADTs to form larger, more
complex data structures. This allows for greater flexibility and modularity in
programming.
Overall, ADTs provide a powerful tool for organizing and manipulating data in a
structured and efficient manner.
Abstract data types (ADTs) have several advantages and disadvantages that
should be considered when deciding to use them in software development. Here
are some of the main advantages and disadvantages of using ADTs:

Advantages:

• Encapsulation: ADTs provide a way to encapsulate data and operations into a
single unit, making it easier to manage and modify the data structure.
• Abstraction: ADTs allow users to work with data structures without having to
know the implementation details, which can simplify programming and reduce
errors.
• Data Structure Independence: ADTs can be implemented using different
data structures, which can make it easier to adapt to changing needs and
requirements.
• Information Hiding: ADTs can protect the integrity of data by controlling
access and preventing unauthorized modifications.
• Modularity: ADTs can be combined with other ADTs to form more complex
data structures, which can increase flexibility and modularity in programming.

Disadvantages:

• Overhead: Implementing ADTs can add overhead in terms of memory and
processing, which can affect performance.
• Complexity: ADTs can be complex to implement, especially for large and
complex data structures.
• Learning Curve: Using ADTs requires knowledge of their implementation
and usage, which can take time and effort to learn.
• Limited Flexibility: Some ADTs may be limited in their functionality or may
not be suitable for all types of data structures.
• Cost: Implementing ADTs may require additional resources and investment,
which can increase the cost of development.

Overall, the advantages of ADTs often outweigh the disadvantages, and they are
widely used in software development to manage and manipulate data in a
structured and efficient way. However, it is important to consider the specific
needs and requirements of a project when deciding whether to use ADTs.
From these definitions, we can clearly see that the definitions do not specify how
these ADTs will be represented and how the operations will be carried out. There
can be different ways to implement an ADT, for example, the List ADT can be
implemented using arrays, or singly linked list or doubly linked list. Similarly,
stack ADT and Queue ADT can be implemented using arrays or linked lists.
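
To make this point concrete, the sketch below gives one stack interface two different implementations, one array-like and one linked. All class and function names here (IntStackADT, ArrayStack, LinkedStack, sumAndDrain) are hypothetical, and std::vector and std::forward_list merely stand in for a hand-written array and linked list.

```cpp
#include <forward_list>
#include <vector>

// The ADT: what a stack of ints can do, with no storage details.
class IntStackADT {
public:
    virtual ~IntStackADT() = default;
    virtual void push(int element) = 0;
    virtual int pop() = 0;                 // precondition: stack is not empty
    virtual bool isEmpty() const = 0;
};

// Implementation 1: contiguous storage (array-like).
class ArrayStack : public IntStackADT {
public:
    void push(int element) override { items_.push_back(element); }
    int pop() override { int top = items_.back(); items_.pop_back(); return top; }
    bool isEmpty() const override { return items_.empty(); }
private:
    std::vector<int> items_;
};

// Implementation 2: a singly linked list.
class LinkedStack : public IntStackADT {
public:
    void push(int element) override { items_.push_front(element); }
    int pop() override { int top = items_.front(); items_.pop_front(); return top; }
    bool isEmpty() const override { return items_.empty(); }
private:
    std::forward_list<int> items_;
};

// Client code written against the ADT works with either implementation.
int sumAndDrain(IntStackADT& stack) {
    int total = 0;
    while (!stack.isEmpty()) total += stack.pop();
    return total;
}
```

sumAndDrain never learns which representation it was given, which is the implementation independence that the ADT definitions describe.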
