

Data Structure

Introduction to ADTs
Abstract data types are purely theoretical entities, used (among other things) to simplify the
description of abstract algorithms, to classify and evaluate data structures, and to formally describe
the type systems of programming languages. However, an ADT may be implemented by specific
data types or data structures, in many ways and in many programming languages; or described in
a formal specification language. ADTs are often implemented as modules: the module's interface
declares procedures that correspond to the ADT operations, sometimes with comments that
describe the constraints. This information hiding strategy allows the implementation of the module
to be changed without disturbing the client programs.
The term abstract data type can also be regarded as a generalisation of a number
of algebraic structures, such as lattices, groups, and rings. The notion of abstract data types is
related to the concept of data abstraction, important in object-oriented programming and design by
contract methodologies for software development.

An abstract data type is defined as a mathematical model of the data objects that make up a data
type as well as the functions that operate on these objects. There are no standard conventions for
defining them. A broad division may be drawn between "imperative" and "functional" definitions.
WHO: Abstract Data Type
WHAT: An abstract data type is a data type that combines the common functionality of different
but related objects into one package, so that the interfaces of those different but related objects can
inherit from the ADT, making the design more flexible and reducing the amount of code the programmer must write.
WHERE: Anywhere, when you code, in your life wherever applicable.
WHEN: Use it when you want to generalise a design. When you have many objects that have
similar functionality, you can abstract the common functionality into one class and have the
other objects inherit from it.
WHY: You would create an ADT for numerous reasons. First, it generally leads to a flexible design if
designed well. Second, it means less coding for the programmer. Third, it offers more reusability. Fourth, it
leads to easier maintenance. And fifth, it creates the possibility of polymorphism.
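The generalisation idea above can be sketched with a C++ abstract base class. This is an illustrative example only; the names (Shape, Rectangle, area) are assumptions chosen for the sketch, not taken from the text.

```cpp
#include <cassert>

// Hypothetical ADT sketch: Shape declares the abstract interface;
// concrete classes supply the implementation, so client code depends
// only on the operations promised by the ADT.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;   // operation promised by the ADT
};

class Rectangle : public Shape {
    double w, h;
public:
    Rectangle(double w, double h) : w(w), h(h) {}
    double area() const { return w * h; }
};

class Square : public Rectangle {
public:
    explicit Square(double side) : Rectangle(side, side) {}
};

// Client code written against the ADT works for any implementation:
// this is the polymorphism and reuse the WHY list refers to.
double totalArea(const Shape& a, const Shape& b) {
    return a.area() + b.area();
}
```

Because totalArea sees only the Shape interface, new shape classes can be added later without changing it.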

Sparse Matrix
A sparse matrix is a matrix populated primarily with zeros. Sparse matrices are stored using special
data structures that hold only the nonzero elements, saving a great deal of memory and CPU
time when working with such matrices.
These specialized data structures make it possible to solve computational problems that would be
intractable if the matrices were stored using "conventional", dense storage.

Sparse Storage Formats

A structure used for sparse matrix storage should provide the following functionality:
- maintain the list of nonzero elements and their values
- retrieve an element by its indexes i/j
- enumerate all nonzero elements in the matrix
- insert a new element into the matrix
- multiply the sparse matrix by a vector (or dense matrix)
- other service operations
There are several popular data structures used to store sparse matrices, each with its
own benefits and drawbacks. The ALGLIB package implements two such structures: hash table storage
(also known as dictionary of keys) and CRS (compressed row storage).

Hash table storage

A sparse matrix can be stored as a hash table which maps keys (row/column indexes) to the values
of the corresponding elements. This storage format has the following features:
- matrix creation - requires knowing only the matrix size. It is not necessary to know the number of
  nonzero elements, although you can benefit from such knowledge: it allows you to allocate the
  required amount of space during matrix creation and to avoid subsequent reallocations of the
  hash table during initialization of the matrix elements.
- time to insert an element - O(1). Elements can be inserted in arbitrary order. However, when the hash
  table becomes nearly full we have to spend O(TableSize) time creating a new, larger table and
  copying elements from the old one.
- memory requirements - O(k*(2*sizeof(int)+sizeof(double))) per nonzero element in a nearly
  full table. Here k is a small constant from [1.2,1.5], a consequence of the fact that hash
  tables require some amount of "spare memory".
- time to read an element - O(1). Hash tables are quite fast and allow arbitrary access to elements.
- time to enumerate all nonzero elements - O(TableSize), where TableSize is the total size of the hash
  table (including both busy and free slots). A hash table allows enumeration of its entries, but it
  does not provide any ordering guarantee; elements of the matrix will be retrieved in arbitrary order.
- linear operations with the matrix (matrix-vector products, matrix-matrix products) - not supported.
  A hash table stores matrix elements in random order, which prevents efficient implementation of
  linear algebra operations. Thus, in order to use the matrix in linear operations you have to
  convert it to CRS format.
You may see that hash table storage allows easy creation and modification of sparse matrices. We
may access (read and write) elements in arbitrary order, we do not have to know the table size in
advance, and we have fast random access to the table. However, this storage format is not good for
numerical work and does not allow the sparse matrix to be used in linear algebra operations like the
matrix-vector product. You have to convert the table to CRS format in order to do so.
Thus, hash table storage can be classified as an intermediate representation which is used when you
want to initialize a sparse matrix.
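A minimal sketch of the dictionary-of-keys idea in C++. Note this is not ALGLIB's implementation: a real hash table storage would use an actual hash table, whereas std::map is used here only to keep the example short; the class and method names are illustrative.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Dictionary-of-keys sketch: (row, col) -> value for nonzero elements.
class DokMatrix {
    std::map<std::pair<int, int>, double> elems;
public:
    // Insert/update an element; any order is allowed.
    void set(int i, int j, double v) { elems[std::make_pair(i, j)] = v; }
    // Absent entries read as zero, so only nonzeros consume memory.
    double get(int i, int j) const {
        std::map<std::pair<int, int>, double>::const_iterator it =
            elems.find(std::make_pair(i, j));
        return it == elems.end() ? 0.0 : it->second;
    }
    std::size_t nonzeroCount() const { return elems.size(); }
};
```

The interface mirrors the features listed above: arbitrary-order insertion and reads, but no ordered enumeration suitable for linear algebra.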

CRS representation
CRS (compressed row storage) is another representation which can be used to store a sparse
matrix. Matrix elements are ordered by rows (from top to bottom) and by columns (within a row, from
left to right). This storage format has the following features:
- matrix creation - requires prior knowledge of both the matrix size and the nonzero element count. It is
  required to tell in advance how many nonzero elements are stored in each row of the matrix.
- time to insert an element - O(1). Elements cannot be inserted in arbitrary locations. You can insert
  them only in strict order - within a row from left to right, with rows processed from top to bottom.
- memory requirements - O(sizeof(int)+sizeof(double)) per nonzero element.
- time to read an element - O(log(NonzeroElementsInRow)). Retrieval of an arbitrary element
  requires a binary search in the corresponding row.
- time to enumerate all nonzero elements - O(NonzeroCount), where NonzeroCount is the number
  of nonzero elements. The CRS representation guarantees that elements will be retrieved in a specific
  order - rows are enumerated from top to bottom, and within a row elements are enumerated from left to right.
- linear algebra operations (matrix-vector products, matrix-matrix products) - supported.
You may see that the CRS representation allows matrices to be used in linear algebra operations due
to its efficient storage format. However, hard initialization is the weakness of this representation: a) you
have to know the number of nonzero elements in the matrix in advance, and b) you have to fill the matrix
in strict order (from left to right, from top to bottom).
Thus, the CRS representation is good for numerical work, but is not good for easy initialization. Luckily,
there exists the hash table storage format, which is good for initialization and can be converted to
CRS format.
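A sketch of CRS layout and the matrix-vector product it enables. The field names (values, colIndex, rowPtr) are conventional but are my own choice, not ALGLIB's.

```cpp
#include <cstddef>
#include <vector>

// CRS sketch: nonzero values and their column indices stored row by
// row; rowPtr[i]..rowPtr[i+1] delimit the entries of row i.
struct CrsMatrix {
    std::vector<double> values;   // nonzero values, row-major order
    std::vector<int> colIndex;    // column of each stored value
    std::vector<int> rowPtr;      // size = number of rows + 1
};

// y = A * x: touches each nonzero exactly once, in order, which is
// why CRS suits linear algebra while hash-table storage does not.
std::vector<double> multiply(const CrsMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rowPtr.size() - 1, 0.0);
    for (std::size_t i = 0; i + 1 < a.rowPtr.size(); ++i)
        for (int k = a.rowPtr[i]; k < a.rowPtr[i + 1]; ++k)
            y[i] += a.values[k] * x[a.colIndex[k]];
    return y;
}
```

For the matrix [[1,0,2],[0,3,0]] the arrays are values {1,2,3}, colIndex {0,2,1}, rowPtr {0,2,3}.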
#include <stdio.h>

/* Read a dense matrix and print its sparse (row, column, value) triplet form. */
int main() {
    int A[10][10], B[100][3], m, n, s = 0, i, j;
    printf("\nEnter the order m x n of the sparse matrix\n");
    scanf("%d%d", &m, &n);
    printf("\nEnter the elements in the sparse matrix (mostly zeroes)\n");
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++) {
            printf("\n%d row and %d column: ", i, j);
            scanf("%d", &A[i][j]);
            /* record each nonzero as a (row, column, value) triplet */
            if (A[i][j] != 0) { B[s][0] = i; B[s][1] = j; B[s][2] = A[i][j]; s++; }
        }
    printf("The given matrix is:\n");
    for (i = 0; i < m; i++, printf("\n"))
        for (j = 0; j < n; j++) printf("%d ", A[i][j]);
    printf("\nThe sparse matrix is given by\n");
    for (i = 0; i < s; i++) printf("%d %d %d\n", B[i][0], B[i][1], B[i][2]);
    return 0;
}

Recursion is a wonderful programming tool. It provides a simple, powerful way of approaching a
variety of problems. It is often hard, however, to see how a problem can be approached
recursively; it can be hard to think recursively. It is also easy to write a recursive program that
either takes too long to run or doesn't properly terminate at all. In this article we'll go over the
basics of recursion and hopefully help you develop, or refine, a very important programming skill.

What is Recursion?
In order to say exactly what recursion is, we first have to answer "What is recursion?" Basically, a
function is said to be recursive if it calls itself. Below is pseudocode for a recursive function that
prints the phrase "Hello World" a total of count times:
function HelloWorld(count)
    if (count < 1) return
    print("Hello World!")
    HelloWorld(count - 1)
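The same pattern compiles directly in C++. In this sketch the function counts its calls instead of printing, so the behaviour is easy to check; the base case is the crucial part, since without it the recursion never terminates.

```cpp
// Recursive countdown: returns how many times it would have printed,
// so the call count can be verified without capturing output.
int helloWorld(int count) {
    if (count < 1) return 0;           // base case: stop recursing
    // print("Hello World!") would go here
    return 1 + helloWorld(count - 1);  // recursive case
}
```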

Why use Recursion?

The problem we illustrated above is simple, and the solution we wrote works, but we probably
would have been better off just using a loop instead of bothering with recursion. Where recursion
tends to shine is in situations where the problem is a little more complex. Recursion can be applied
to pretty much any problem, but there are certain scenarios for which you'll find it's particularly
helpful. In the remainder of this article we'll discuss a few of these scenarios and, along the way,
we'll discuss a few more core ideas to keep in mind when using recursion.

Tail Recursion
Recursive calls can occur at any point of an algorithm. For example, if we consider a for loop that
runs from 0 to n-1, we know that the loop body is executed repeatedly with different values of
the loop variable. Similarly, if the recursive call occurs at the end of the function (called tail recursion), the result is
similar to a loop. That is, the function executes all of its statements before jumping
into the next recursive call.
public void foo(int n) {
    if (n == 1) return;
    else { System.out.println(n); foo(n-1); }
}
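The loop-like behaviour of tail recursion can be made concrete. This C++ sketch mirrors the Java foo example but collects the values instead of printing them, so the tail-recursive version can be compared directly against an ordinary loop; the function names are my own.

```cpp
#include <vector>

// Tail-recursive version: all work happens *before* the recursive
// call, so the sequence of operations mirrors a simple loop.
void collectTail(int n, std::vector<int>& out) {
    if (n == 1) return;        // base case, as in foo
    out.push_back(n);          // work done before the recursive call
    collectTail(n - 1, out);   // tail call: nothing left to do after
}

// Equivalent loop producing the same sequence n, n-1, ..., 2.
std::vector<int> collectLoop(int n) {
    std::vector<int> out;
    for (int i = n; i > 1; --i) out.push_back(i);
    return out;
}
```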

Head Recursion
If the recursive call occurs at the beginning of the function (called head recursion), the function
saves the state of the program before jumping into the next function call. That is, the function waits to
evaluate its statements until the exit condition is reached. The state of the program is saved on a stack
(we will learn more about stacks later in the course).
void foo(int n) {
    if (n == 0) return;
    else { foo(n-1); System.out.println(n); }
}

Self Referencing Classes

Self-reference occurs in natural or formal languages when a sentence, idea or formula
refers to itself. The reference may be expressed either directly, through some intermediate
sentence or formula, or by means of some encoding. In philosophy, it also refers to the ability of a
subject to speak of or refer to itself: to have the kind of thought expressed by the first person
nominative singular pronoun, the word "I" in English.
The definition of a class may refer to itself. One situation in which this occurs is when a class
operation has an object as a parameter where that object is of the same class as the one
containing the operation. Examples of where this occurs are the following:
- a Location object is to decide if it has the same screen coordinates as another Location object,
- a Shape object is to decide if it has the same height and width as another Shape object, or
- a File object is to copy itself to or from another File.

In each case the operation needs as its parameter an object in the same class as the one
containing the operation.
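The first of the examples above, a Location comparing its screen coordinates with another Location, can be sketched as follows; the member names are assumptions made for the sketch.

```cpp
// Self-referencing class: the comparison operation takes another
// object of the *same* class as its parameter.
class Location {
    int x, y;   // screen coordinates
public:
    Location(int x, int y) : x(x), y(y) {}
    // Parameter is another Location - the class refers to itself.
    bool sameCoordinates(const Location& other) const {
        return x == other.x && y == other.y;
    }
};
```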

#include <iostream>
#include <string>
using namespace std;

class Node
{
public:
    Node(int AGE, string NAME)
    {
        age = AGE;
        name = NAME;
        next = 0;
    }
    void setAge(int a) { age = a; }
    int getAge() { return age; }
    void setName(string n) { name = n; }
    string getName() { return name; }
    void setNextPtr(Node *node)
    {
        Node *last = this;
        while (last->next)
            last = last->next;
        last->next = node;   // append at the end of the chain
    }
    Node *getNextPtr() { return next; }
private:
    int age;
    string name;
    Node *next;   // self-referencing member: a pointer to the same class
};

int main()
{
    Node *temp = 0;   // head of the list
    Node *temp2;
    int age, choice;
    string name;

    cout << "1. Insert Record\n2. Search Record\n3. Delete Record\n4. Print Record\n5. Quit\nEnter choice: ";
    cin >> choice;
    cout << endl << endl;
    switch (choice)
    {
    case 1:
        cout << "\n\nEnter first name: ";
        cin >> name;
        cout << "Enter age: ";
        cin >> age;
        temp2 = new Node(age, name);
        cout << "Successfully created record.\n";
        if (temp == 0)
            temp = temp2;
        else
            temp->setNextPtr(temp2);
        break;
    }
    return 0;
}

Memory Allocation
Memory allocation is a process by which computer programs and services are assigned
physical or virtual memory space.
Memory allocation is the process of reserving a partial or complete portion of computer memory for
the execution of programs and processes. Memory allocation is achieved through a process known
as memory management.
Memory allocation has two core types:
- Static Memory Allocation: the program is allocated memory at compile time.
- Dynamic Memory Allocation: the program is allocated memory at run time.

Static Memory Allocation
- Done at compile time.
- Global variables: variables declared ahead of time, such as fixed arrays.
- Lifetime = entire runtime of the program.
- Advantage: efficient execution time.
- If we declare more static data space than we need, we waste space.
- If we declare less static space than we need, we are out of luck.

Dynamic Memory Allocation

- Done at run time.
- Data structures can grow and shrink to fit changing data requirements.
- We can allocate (create) additional storage whenever we need it.
- We can de-allocate (free/delete) dynamic space whenever we are done with it.
- Advantage: we can always have exactly the amount of space required - no more, no less.
For example, using references to connect them, we can use dynamic
data structures to create a chain of data structures called a linked list.
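The points above can be sketched with C++'s new and delete, building exactly such a chain; the struct and function names are illustrative.

```cpp
// Dynamic allocation sketch: nodes are created with new as data
// arrives and released with delete, so the chain grows and shrinks
// at run time rather than being fixed at compile time.
struct ChainNode {
    int value;
    ChainNode* next;
};

// Allocate a new node and link it in front of the existing chain.
ChainNode* push(ChainNode* head, int value) {
    ChainNode* node = new ChainNode;   // storage reserved at run time
    node->value = value;
    node->next = head;
    return node;
}

// Count the nodes, de-allocating each one as we go.
int countAndFree(ChainNode* head) {
    int n = 0;
    while (head) {
        ChainNode* next = head->next;
        delete head;                   // space returned when done
        head = next;
        ++n;
    }
    return n;
}
```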

Balance and AVL Tree

In computer science, an AVL tree is a self-balancing binary search tree. It was the first such
data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node
differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this
property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst
cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions
may require the tree to be rebalanced by one or more tree rotations.
The AVL tree is named after its two Soviet inventors, Georgy Adelson-Velsky and Evgenii
Landis, who published it in their 1962 paper "An algorithm for the organisation of information".

Balance factors can be kept up-to-date by knowing the previous balance factors and the
change in height - it is not necessary to know the absolute height. For holding the AVL balance
information, two bits per node are sufficient.

How to keep the tree balanced

- We still need to insert while preserving the order of keys (otherwise search does not work).
- We must insert as the new items arrive (we cannot wait until we have all items).
- So we need to alter the tree as more items arrive.
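The rebalancing mentioned above is done with tree rotations. A minimal sketch of a single right rotation, the fix applied when a node's left subtree becomes too tall; balance-factor bookkeeping is omitted to keep the example short, and the names are my own.

```cpp
// Minimal BST node for the rotation sketch (no balance bookkeeping).
struct TreeNode {
    int key;
    TreeNode* left;
    TreeNode* right;
};

// Rotate right around *root*: the left child becomes the new root of
// the subtree, preserving the key order. Returns the new subtree root.
TreeNode* rotateRight(TreeNode* root) {
    TreeNode* pivot = root->left;
    root->left = pivot->right;   // pivot's right subtree crosses over
    pivot->right = root;
    return pivot;
}
```

Inserting 3, 2, 1 in order produces a left-leaning chain of height 2; one right rotation at the root restores a balanced tree with 2 at the root.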

Hashing is the process of mapping a large amount of data to a smaller table with the
help of a hashing function. The essence of hashing is to facilitate the next level of searching
when compared with linear or binary search. The advantage of this searching method is its
efficiency in handling vast numbers of data items in a given collection (i.e. collection size).

Types of Hashing
There are two types of hashing which are as follows:
Static Hashing

Dynamic hashing

Static Hashing
- Hashing provides a means for accessing data without the use of an index structure.
- Data is addressed on disk by computing a function on a search key instead.
- A bucket in a hash file is a unit of storage (typically a disk block) that can hold one or
  more records.
- The hash function, h, is a function from the set of all search keys, K, to the set of all
  bucket addresses, B.
- Insertion, deletion, and lookup are done in constant time.
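A minimal sketch of such a function h from keys K to a fixed set of bucket addresses B = {0, ..., buckets-1}. The polynomial hash used here is a common textbook choice, not one prescribed by the text; real systems use stronger functions.

```cpp
#include <cstddef>
#include <string>

// h : K -> B, mapping a search key to one of a fixed number of
// buckets. In static hashing the bucket count never changes.
int bucketFor(const std::string& key, int buckets) {
    unsigned long h = 0;
    for (std::size_t i = 0; i < key.size(); ++i)
        h = h * 31 + static_cast<unsigned char>(key[i]);  // polynomial hash
    return static_cast<int>(h % buckets);   // reduce to a bucket address
}
```

The key property is that the same key always lands in the same bucket, so a record can be found by recomputing h rather than consulting an index.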

Dynamic Hashing
- More effective than static hashing when the database grows or shrinks.
- Extendable hashing splits and coalesces buckets appropriately with the database size,
  i.e. buckets are added and deleted on demand.

Overflow Handling
The API for user-written fixed-point functions provides functions for some mathematical
operations, such as conversions. When these operations are performed, a loss of precision or
overflow may occur. The tokens in the following tables allow you to control the way an API function
handles precision loss and overflow. The overflow handling methods are:

- Saturate overflow
- Wrap overflow

Overflow Logging Structure

Math functions of the API, such as FXPConvert, can encounter overflow when carrying out an
operation. These functions provide a mechanism to log the occurrence of overflow and to report
that log back to the caller.
You can use a fixed-point overflow logging structure in your S-function by defining a variable of type
FXPOverflowlogs. Some API functions, such as SSFXPConvert, accept a pointer to this structure
as an argument.


Standard Template Library

The Standard Library is a fundamental part of the C++ Standard. It provides C++
programmers with a comprehensive set of efficiently implemented tools and facilities that can be
used for most types of applications.
In this article, I present an introduction/tutorial on the Standard Template Library, which is the most
important section of the Standard Library. I briefly present the fundamental concepts in the STL,
showing code examples to help you understand these concepts. The article requires and assumes
previous knowledge of the basic language features of C++, in particular, templates (both
function templates and class templates).

In the example of reversing a C array, the arguments to reverse are clearly of type double*. What
are the arguments to reverse if you are reversing a vector, though, or a list? That is, what exactly
does reverse declare its arguments to be, and what exactly do v.begin() and v.end() return?
The answer is that the arguments to reverse are iterators, which are a generalization of pointers.
Pointers themselves are iterators, which is why it is possible to reverse the elements of a C array.
Similarly, vector declares the nested types iterator and const_iterator. In the example above, the
type returned by v.begin() and v.end() is vector<int>::iterator. There are also some iterators, such
as istream_iterator and ostream_iterator, that aren't associated with containers at all.
Iterators are the mechanism that makes it possible to decouple algorithms from containers:
algorithms are templates, and are parameterized by the type of iterator, so they are not restricted
to a single type of container. Consider, for example, how to write an algorithm that performs linear
search through a range. This is the STL's find algorithm.
template <class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value) {
    while (first != last && *first != value) ++first;
    return first;
}
Find takes three arguments: two iterators that define a range, and a value to search for in that
range. It examines each iterator in the range [first, last), proceeding from the beginning to the end,
and stops either when it finds an iterator that points to value or when it reaches the end of the range.
First and last are declared to be of type InputIterator, and InputIterator is a template parameter.
That is, there isn't actually any type called InputIterator: when you call find, the compiler substitutes
the actual type of the arguments for the formal type parameters InputIterator and T. If the first two
arguments to find are of type int* and the third is of type int, then it is as if you had called the
following function.
int* find(int* first, int* last, const int& value) {
    while (first != last && *first != value) ++first;
    return first;
}
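The decoupling described above means one algorithm serves every container. A small sketch: a helper (my own name) delegating to the STL's std::find works unchanged on a C array, a vector, and a list, because each supplies iterators of a suitable category.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// True if value occurs in [first, last). Delegates to std::find to
// show that one generic algorithm accepts any input-iterator range.
template <class InputIterator, class T>
bool contains(InputIterator first, InputIterator last, const T& value) {
    return std::find(first, last, value) != last;
}
```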


ArrayList is a class which implements the List interface. It is widely used because of the
functionality and flexibility it offers. Most developers choose ArrayList over Array as it is a very
good alternative to traditional Java arrays.
The issue with arrays is that they are of fixed length: if an array is full we cannot add any more elements
to it; likewise, if a number of elements are removed from it, the memory consumption remains
the same, as it does not shrink. An ArrayList, on the other hand, can dynamically grow and shrink as
needed. Apart from these benefits, the ArrayList class provides predefined methods which
make our task easy.

Once an array object has been created, what each individual component refers to can be changed,
but the size of the array cannot be changed. ArrayList objects, however, can have their size
changed. If a refers to an object of type ArrayList<Thing>, then the statement a.add(expr) where
expr evaluates to a value of type Thing causes an extra component to be added to the end of the
arrayList object referred to by a which is set to refer to the object returned by the evaluation of
expr. The value returned by a.size() is increased by 1, and the value returned by a.get(a.size()-1)
with the new value of a.size() is the value returned by expr. This is a destructive change, so any
variable which aliases a will see the same change in the object it refers to as it is the same object.
So in our code for makeArrayListInt above, the statement
ArrayList<Integer> a = new ArrayList<Integer>();
declares a variable of type ArrayList<Integer> called a, and sets it to a new arrayList of integers
whose size is initially 0. Then every time a.add(0) is called in the loop, an extra component is
added to this arrayList object, every component added referring to an Integer of value 0.
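The makeArrayListInt behaviour described above has a direct analogue in C++'s std::vector, used here to keep all examples in this document's example language; push_back plays the role of add, and size() behaves as described.

```cpp
#include <cstddef>
#include <vector>

// C++ analogue of makeArrayListInt: start with size 0 and append n
// zero-valued components, each append growing the size by one.
std::vector<int> makeVectorInt(int n) {
    std::vector<int> a;          // size is initially 0
    for (int i = 0; i < n; ++i)
        a.push_back(0);          // like a.add(0): size increases by 1
    return a;
}
```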

Linked List
In computer science, a linked list is a linear collection of data elements, called nodes, each
pointing to the next node by means of a pointer. It is a data structure consisting of a group of nodes
which together represent a sequence. In its simplest form, each node is composed of data
and a reference (in other words, a link) to the next node in the sequence. This structure allows for
efficient insertion or removal of elements from any position in the sequence during iteration. More
complex variants add additional links, allowing efficient insertion or removal at arbitrary positions.

Singly Linked List

Singly linked lists contain nodes which have a data field as well as a 'next' field, which
points to the next node in the line of nodes. Operations that can be performed on singly linked lists
include insertion, deletion and traversal.


Doubly Linked List

In computer science, a doubly linked list is a linked data structure that consists of a set of
sequentially linked records called nodes. Each node contains two fields, called links, that are
references to the previous and to the next node in the sequence of nodes. The beginning and
ending nodes' previous and next links, respectively, point to some kind of terminator, typically a
sentinel node or null, to facilitate traversal of the list. If there is only one sentinel node, then the list
is circularly linked via the sentinel node. It can be conceptualized as two singly linked lists formed
from the same data items, but in opposite sequential orders.

The two node links allow traversal of the list in either direction. While adding or removing a node in
a doubly linked list requires changing more links than the same operations on a singly linked list,
the operations are simpler and potentially more efficient (for nodes other than first nodes) because
there is no need to keep track of the previous node during traversal, nor any need to traverse the list
to find the previous node so that its link can be modified.
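The "changing more links" point above can be made concrete. This sketch (names are my own) inserts a node after a given position, updating all four affected links; null plays the role of the terminator.

```cpp
// Doubly linked node: two links per node allow traversal in both
// directions, at the cost of more link updates per insertion.
struct DNode {
    int value;
    DNode* prev;
    DNode* next;
};

// Insert newNode immediately after pos. Four links are touched,
// versus one or two for the singly linked equivalent.
void insertAfter(DNode* pos, DNode* newNode) {
    newNode->prev = pos;
    newNode->next = pos->next;
    if (pos->next) pos->next->prev = newNode;  // guard the terminator
    pos->next = newNode;
}
```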

Use of Data Structure

In computer science, a data structure is a particular way of organizing data in a computer
so that it can be used efficiently. Data structures can implement one or more particular abstract
data types (ADT), which specify the operations that can be performed on a data structure and the
computational complexity of those operations. In comparison, a data structure is a concrete
implementation of the specification provided by an ADT.
Different kinds of data structures are suited to different kinds of applications, and some are highly
specialized to specific tasks. For example, relational databases commonly use B-tree indexes for
data retrieval, while compiler implementations usually use hash tables to look up identifiers.

Data structures provide a means to manage large amounts of data efficiently for uses such
as large databases and internet indexing services. Usually, efficient data structures are key to
designing efficient algorithms. Some formal design methods and programming languages
emphasize data structures, rather than algorithms, as the key organizing factor in software design.
Data structures can be used to organize the storage and retrieval of information stored in both
main memory and in secondary memory.


Object Oriented Concepts

Object Oriented Programming: Object-oriented programming (OOP) is a programming paradigm based on the concept of
"objects", which may contain data, in the form of fields, often known as attributes; and code, in the
form of procedures, often known as methods.


In object-oriented programming, a class is an extensible program-code-template for
creating objects, providing initial values for state (member variables) and implementations of
behaviour (member functions or methods). In many languages, the class name is used as the
name for the class (the template itself), the name for the default constructor of the class (a
subroutine that creates objects), and as the type of objects generated by instantiating the class;
these distinct concepts are easily conflated.
When an object is created by a constructor of the class, the resulting object is called an instance of
the class, and the member variables specific to the object are called instance variables, to contrast
with the class variables shared across the class.

In computer science, an object can be a variable, a data structure, or a function, and as
such, is a location in memory having a value and possibly referenced by an identifier.
In the class-based object-oriented programming paradigm, "object" refers to a particular instance
of a class where the object can be a combination of variables, functions, and data structures.

In object-oriented programming, inheritance is when an object or class is based on another
object (prototypal inheritance) or class (class-based inheritance), using the same implementation
(inheriting from an object or class) or specifying a new implementation to maintain the same behavior
(realizing an interface; inheriting behavior). It is a mechanism for code reuse and to allow
independent extensions of the original software via public classes and interfaces. The relationships
of objects or classes through inheritance give rise to a hierarchy. Inheritance was invented in 1967
for Simula. The term "inheritance" is loosely used for both class-based and prototype-based
programming, but in narrow use is reserved for class-based programming (one class inherits from
another), with the corresponding technique in prototype-based programming being instead called
delegation.

Types of Inheritance
There are various types of inheritance, based on paradigm and specific language.


Single inheritance: where subclasses inherit the features of one superclass. A class acquires the
properties of another class.

Multiple inheritance: where one class can have more than one superclass and inherit features from all parent classes.
"Multiple inheritance (in object-oriented programming) was widely supposed to be very difficult to
implement efficiently. For example, in a summary of C++ in his book on Objective-C, Brad Cox
actually claimed that adding multiple inheritance to C++ was impossible. Thus, multiple inheritance
seemed more of a challenge. Since I had considered multiple inheritance as early as 1982 and
found a simple and efficient implementation technique in 1984, I couldn't resist the challenge. I
suspect this to be the only case in which fashion affected the sequence of events."

Multilevel inheritance: where a subclass is inherited from another subclass. It is not uncommon that a class is derived
from another derived class, as shown in the figure "Multilevel inheritance".

The class A serves as a base class for the derived class B, which in turn serves as a base class for
the derived class C. The class B is known as intermediate base class because it provides a link for
the inheritance between A and C. The chain ABC is known as inheritance path.

In programming languages and type theory, polymorphism (from Greek πολύς, polys,
"many, much" and μορφή, morphē, "form, shape") is the provision of a single interface to entities
of different types. A polymorphic type is one whose operations can also be applied to values of
some other type, or types. There are several fundamentally different kinds of polymorphism:
- Ad hoc polymorphism: when a function denotes different and potentially heterogeneous
  implementations depending on a limited range of individually specified types and combinations. Ad
  hoc polymorphism is supported in many languages using function overloading.
- Parametric polymorphism: when code is written without mention of any specific type and thus
  can be used transparently with any number of new types. In the object-oriented programming
  community, this is often known as generics or generic programming. In the functional programming
  community, this is often shortened to polymorphism.
- Subtyping (also called subtype polymorphism or inclusion polymorphism): when a name
  denotes instances of many different classes related by some common superclass. In the
  object-oriented programming community, this is often simply referred to as polymorphism.
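The three kinds listed above can each be shown in a few lines of C++; the functions and classes here (describe, identity, Animal, Dog) are invented for the sketch.

```cpp
#include <string>

// Ad hoc polymorphism: overloaded functions, one implementation per
// individually specified argument type.
std::string describe(int)    { return "int"; }
std::string describe(double) { return "double"; }

// Parametric polymorphism: one template works for any type T.
template <class T>
T identity(T x) { return x; }

// Subtype polymorphism: a name of the superclass type denotes an
// instance of a derived class, dispatched at run time.
struct Animal {
    virtual ~Animal() {}
    virtual std::string sound() const { return "..."; }
};
struct Dog : Animal {
    std::string sound() const { return "woof"; }
};

std::string heard(const Animal& a) { return a.sound(); }
```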

Ad hoc polymorphism and parametric polymorphism were originally described in
Fundamental Concepts in Programming Languages, a set of lecture notes written in 1967 by
British computer scientist Christopher Strachey. In a 1985 paper, Peter Wegner and Luca Cardelli
introduced the term inclusion polymorphism to model subtypes and inheritance. However,
implementations of subtyping and inheritance predate the term "inclusion polymorphism", having
appeared with Simula in 1967.


In object-oriented programming theory, abstraction involves the facility to define objects that
represent abstract "actors" that can perform work, report on and change their state, and
"communicate" with other objects in the system. The term encapsulation refers to the hiding of
state details, but extending the concept of data type from earlier programming languages to
associate behavior most strongly with the data, and standardizing the way that different data types
interact, is the beginning of abstraction. When abstraction proceeds into the operations defined,
enabling objects of different types to be substituted, it is called polymorphism. When it proceeds in
the opposite direction, inside the types or classes, structuring them to simplify a complex set of
relationships, it is called delegation or inheritance.
Various object-oriented programming languages offer similar facilities for abstraction, all to support
a general strategy of polymorphism in object-oriented programming, which includes the substitution
of one type for another in the same or similar role. Although not as generally supported, a
configuration or image or package may predetermine a great many of these bindings at compile
time, link time, or load time. This would leave only a minimum of such bindings to change at run time.