DS Outsidin Chapter 2

Chapter 2
The Big Picture

Overview
The big picture answers several questions:
What are data structures?
What data structures do we study?
What are Abstract Data Types?
Why Object-Oriented Programming (OOP) and Java for data structures?
How do I choose the right data structures?
2.1 What Are Data Structures?
A data structure is an aggregation of data components that, together, constitute a meaningful whole.
The components themselves may be data structures.
This decomposition stops at some atomic unit.
Atomic or primitive type: A data type whose elements are single, non-decomposable data items (they cannot be broken into parts).
Composite type: A data type whose elements are composed of multiple data items.
(Example: take two integers (simple elements) x and y to form a point (x, y).)
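The point example can be sketched in Java. This is only an illustration: the class names Point and Segment and their accessor methods are assumptions made here, not names from the text. It shows a composite type built from two atomic int values, and a second composite whose components are themselves data structures.

```java
// Hypothetical composite type: two atomic int values form one point.
final class Point {
    private final int x; // atomic component
    private final int y; // atomic component

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

// Hypothetical composite whose components are themselves composites.
final class Segment {
    private final Point start;
    private final Point end;

    Segment(Point start, Point end) {
        this.start = start;
        this.end = end;
    }

    Point getStart() { return start; }
    Point getEnd() { return end; }
}
```

Decomposing a Segment yields Points, and decomposing a Point yields ints, where the decomposition stops.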
2.2 What Data Structures Do We Study?
An array is an aggregation of entries arranged contiguously, with single-step random access to any entry.
There are numerous situations where more sophisticated data structures are required.
Data structure categories:
Linear
Non-linear
The category is based on how the data is conceptually organized or aggregated.

Linear structures
List, Queue, and Stack are linear collections; each serves as a repository in which entries may be added or removed at will.
They differ in how these entries may be accessed once they are added.

List
The List is a linear collection of entries in which entries may be added, removed, and searched for without restrictions.
Two kinds of list:
Ordered List
Unordered List

Queue
Entries may only be removed in the order in which they are added.
A First In, First Out (FIFO) data structure.
An entry in the Queue cannot be searched for.

Stack
Entries may only be removed in the reverse order in which they are added.
A Last In, First Out (LIFO) data structure.
An entry in the Stack cannot be searched for.
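The difference between FIFO and LIFO removal can be seen in a short Java sketch. It uses java.util.ArrayDeque from the standard library as both the queue and the stack; the class name LinearOrderDemo and its two helper methods are made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

class LinearOrderDemo {
    // Adds every entry to a queue, then removes them all: FIFO order.
    static List<Integer> throughQueue(List<Integer> entries) {
        Queue<Integer> queue = new ArrayDeque<>();
        for (Integer e : entries) {
            queue.add(e);            // add at the rear
        }
        List<Integer> removed = new ArrayList<>();
        while (!queue.isEmpty()) {
            removed.add(queue.remove()); // remove from the front
        }
        return removed;
    }

    // Pushes every entry on a stack, then pops them all: LIFO order.
    static List<Integer> throughStack(List<Integer> entries) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Integer e : entries) {
            stack.push(e);           // add at the top
        }
        List<Integer> removed = new ArrayList<>();
        while (!stack.isEmpty()) {
            removed.add(stack.pop()); // remove from the top
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Integer> entries = Arrays.asList(1, 2, 3);
        System.out.println(throughQueue(entries)); // prints [1, 2, 3]
        System.out.println(throughStack(entries)); // prints [3, 2, 1]
    }
}
```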

Trees
A tree is a nonlinear structure with a unique starting node (the root), in which each node is capable of having many child nodes, and in which a unique path exists from the root to every other node. Trees are useful for representing hierarchical relationships among data items.
Root: The top node of a tree structure; a node with no parent.
[Figure: Tree]
[Figure: Not Tree]

Binary Tree
Binary tree: A tree in which each node is capable of having up to two child nodes, a left child node and a right child node.
Leaf: A tree node that has no children.
[Figure: A Binary Tree]

General Tree
Models a hierarchy such as the organizational structure of a company, or a family tree.
A non-linear arrangement of entries, it is a generalization of the binary tree structure, hence the name.

Binary Search Tree
A binary tree in which the key value in any node is greater than the key value in its left child and any of its descendants (the nodes in the left subtree) and less than the key value in its right child and any of its descendants (the nodes in the right subtree).
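The ordering property above is what makes searching efficient: at each node, one comparison decides whether to go left or right. The following is a minimal sketch, assuming int keys and ignoring duplicate keys; the class name IntBst and its methods are hypothetical, and no balancing is attempted.

```java
// A minimal binary search tree of int keys (duplicate keys are ignored).
class IntBst {
    private static final class Node {
        final int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    // Inserts the key so that smaller keys stay in the left subtree
    // and larger keys stay in the right subtree.
    void insert(int key) {
        root = insert(root, key);
    }

    private Node insert(Node node, int key) {
        if (node == null) {
            return new Node(key);
        }
        if (key < node.key) {
            node.left = insert(node.left, key);
        } else if (key > node.key) {
            node.right = insert(node.right, key);
        }
        return node; // equal key: nothing to do
    }

    // Follows a single path from the root, going left or right at each node.
    boolean contains(int key) {
        Node node = root;
        while (node != null) {
            if (key == node.key) {
                return true;
            }
            node = (key < node.key) ? node.left : node.right;
        }
        return false;
    }
}
```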
[Figure: Binary Search Tree]

AVL Tree
A height-balanced binary search tree.
The AVL Tree derives its importance from the fact that it speeds up the search process in a binary search tree to a remarkable degree.

Heap as a Priority Queue
A priority queue is a specialization of the FIFO Queue.
Entries are assigned priorities.
The entry with the highest priority is the one to leave first.
The heap is a special type of binary tree.
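As an illustration of "highest priority leaves first", the sketch below uses java.util.PriorityQueue, which the Java library documents as being based on a priority heap. By default that class removes the smallest element first, so the sketch reverses the ordering. The class PriorityDemo and the method drain are names invented here, and larger integers are assumed to mean higher priority.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class PriorityDemo {
    // Returns the priorities in the order in which the entries leave the queue.
    static int[] drain(int... priorities) {
        // Reverse the natural order so the largest value is removed first.
        PriorityQueue<Integer> queue = new PriorityQueue<>(Comparator.reverseOrder());
        for (int p : priorities) {
            queue.add(p);
        }
        int[] order = new int[priorities.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = queue.remove(); // highest remaining priority
        }
        return order;
    }

    public static void main(String[] args) {
        // Entries arrive as 2, 9, 5 but leave as 9, 5, 2.
        System.out.println(java.util.Arrays.toString(drain(2, 9, 5)));
    }
}
```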
Complete binary tree
A complete binary tree is one in which every level but the last must have the maximum number of nodes possible at that level.
The last level may have fewer than the maximum possible nodes, but they should be arranged from left to right without any empty spots.
[Figure: Heap]

Hash Table
Hash function: A function used to manipulate the key of an element in a list to identify its location in the list.
Hashing: The technique for ordering and accessing elements in a list in a relatively constant amount of time by manipulating the key to identify its location in the list.
Hash table: Term used to describe the data structure used to store and retrieve elements using hashing.
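A toy sketch of these three terms in Java follows. It assumes String keys stored in a fixed-size array and deliberately ignores collisions (a later key that maps to the same slot simply overwrites the earlier one), so it is not a usable hash table. The class name ToyHashTable and its methods are hypothetical.

```java
// A toy hash table: fixed capacity, String keys, no collision handling.
class ToyHashTable {
    private final String[] slots;

    ToyHashTable(int capacity) {
        slots = new String[capacity];
    }

    // Hash function: manipulates the key to identify its location in the array.
    // Math.floorMod keeps the index non-negative even if hashCode() is negative.
    int indexOf(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Stores the key at its computed location (overwrites on collision).
    void put(String key) {
        slots[indexOf(key)] = key;
    }

    // Looks only at the one computed location; no scan of the array.
    boolean contains(String key) {
        return key.equals(slots[indexOf(key)]);
    }
}
```

Both put and contains touch a single array slot, which is where the "relatively constant amount of time" comes from.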
[Figure: Using a hash function to determine the location of the element in an array]
Graphs
Graph: A data structure that consists of a set of nodes and a set of edges that relate the nodes to each other.
Vertex: A node in a graph.
Edge (arc): A pair of vertices representing a connection between two nodes in a graph.
Two kinds of graphs:
Undirected graph: A graph in which the edges have no direction.
Directed graph (digraph): A graph in which each edge is directed from one vertex to another (or the same) vertex.
A general tree is a special kind of graph, since a hierarchy is a special system of relationships among entities.
Graphs may be used to model systems of physical connections such as computer networks and airline routes, as well as abstract relationships such as course prerequisite structures.
Standard graph algorithms answer certain questions we may ask of the system.
[Figure: Graphs]
Adjacent vertices: Two vertices in a graph that are connected by an edge.
Path: A sequence of vertices that connects two nodes in a graph.
Complete graph: A graph in which every vertex is directly connected to every other vertex.
Weighted graph: A graph in which each edge carries a value.
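One common way to represent the vertex and edge sets is an adjacency list: each vertex maps to the set of vertices it shares an edge with. The sketch below assumes an undirected, unweighted graph with String vertex labels; the class name UndirectedGraph and its methods are illustrative choices, not taken from the text.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// An undirected, unweighted graph stored as an adjacency list.
class UndirectedGraph {
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    void addVertex(String vertex) {
        adjacency.computeIfAbsent(vertex, k -> new HashSet<>());
    }

    // An undirected edge is recorded in both directions.
    void addEdge(String a, String b) {
        addVertex(a);
        addVertex(b);
        adjacency.get(a).add(b);
        adjacency.get(b).add(a);
    }

    // Two vertices are adjacent when an edge connects them directly.
    boolean areAdjacent(String a, String b) {
        Set<String> neighbors = adjacency.get(a);
        return neighbors != null && neighbors.contains(b);
    }

    int vertexCount() {
        return adjacency.size();
    }
}
```

A directed graph would record each edge in one direction only, and a weighted graph would store a value alongside each neighbor.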
Data Type
In a modern computer, data consists fundamentally of binary bits, but meaningful data is organized into primitive data types such as integer, real, and boolean and into more complex data structures such as arrays and binary trees.
These data types and data structures always come along with associated operations that can be done on the data. For example, the 32-bit int data type is defined both by the fact that a value of type int consists of 32 binary bits and by the fact that two int values can be added, subtracted, multiplied, compared, and so on.
An array is defined both by the fact that it is a sequence of data items of the same basic type, and by the fact that it is possible to directly access each of the positions in the sequence based on its numerical index.
So the idea of a data type includes a specification of the possible values of that type together with the operations that can be performed on those values.
2.3 What Are Abstract Data Types?
An algorithm is an abstract idea, and a program is an implementation of an algorithm. Similarly, it is useful to be able to work with the abstract idea behind a data type or data structure, without getting bogged down in the implementation details.
The abstraction in this case is called an "abstract data type."
An abstract data type specifies the values of the type, but not how those values are represented as collections of bits, and it specifies operations on those values in terms of their inputs, outputs, and effects rather than as particular algorithms or program code.
We are all used to dealing with the primitive data types as abstract data types.
It is quite likely that you don't know the details of how values of type double are represented as sequences of bits. The details are, in fact, rather complicated.
However, you know how to work with double values by adding them, multiplying them, truncating them to values of type int, inputting and outputting them, and so on.
To work with values of type double, you only need to know how to use these operations and what they do. You don't need to know how they are implemented.

Abstract data type (ADT): A data type whose properties (domain and operations) are specified independently of any particular implementation.
A data structure is an abstract data type.
Consider, for example, the primitive data types built into a language: integer, real, character, and boolean.
We do not worry at all about their internal representation.
We know we can perform certain operations on integers (+, -, *, /) that are guaranteed to work the way they are intended to.
Language designers and compiler writers were responsible for designing the interface for these data types, as well as implementing them.
Exactly how these operations are implemented by the compiler in the machine code is of no concern.
On a different machine, the behavior of an integer does not change, even though its internal representation may change.
As far as programmers are concerned, the primitive data types are abstract entities.
When you create your own new data types (stacks, queues, trees, and even more complicated structures) you are necessarily deeply involved in the data representation and in the implementation of the operations.
However, when you use these data types, you don't want to have to worry about the implementation. You want to be able to use them the way you use the built-in primitive types, as abstract data types.
This is in conformance with one of the central principles of software engineering: encapsulation, also known as information hiding, modularity, and separation of interface from implementation.
A data structure may consist of several layers of abstraction.
A Stack (of integers) may be built using a List (of integers), which in turn may be built using an array (of integers).
Integer and array are language-defined types.
Stack and List are user-defined.
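The three layers can be sketched in Java as follows. The class names IntList and IntStack, their method names, and the initial capacity of 4 are assumptions made for this illustration. The point is only that each layer uses the operations of the layer below it and never touches its representation.

```java
import java.util.Arrays;

// User-defined List of integers, built on the language-defined int array.
class IntList {
    private int[] items = new int[4]; // language-defined layer
    private int count = 0;

    void addLast(int value) {
        if (count == items.length) {
            items = Arrays.copyOf(items, count * 2); // grow when full
        }
        items[count++] = value;
    }

    int removeLast() {
        if (count == 0) {
            throw new IllegalStateException("list is empty");
        }
        return items[--count];
    }

    int size() {
        return count;
    }
}

// User-defined Stack of integers, built only on IntList operations.
class IntStack {
    private final IntList list = new IntList();

    void push(int value) {
        list.addLast(value);
    }

    int pop() {
        return list.removeLast();
    }

    boolean isEmpty() {
        return list.size() == 0;
    }
}
```

IntStack never sees the array, so IntList could switch to a different representation without any change to IntStack.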
Since an ADT makes a clean separation between interface and implementation, the user sees only the interface and therefore does not need to tamper with the implementation.
The responsibility of maintaining the implementation is separated from the responsibility of maintaining the code that uses an ADT.
This makes the code easier to maintain.
ADTs may be used many times in various contexts.
A List ADT may be used directly in application code, or may be used to build another ADT, such as the Stack.
Object-oriented programming is a paradigm that addresses exactly these issues.
2.4 Why OOP and Java for Data Structures?
The OOP paradigm views the program as a system of interacting objects.
Objects in a program may model physical objects or abstract entities.
An object encapsulates state and behavior.
For instance, the state of an integer is its current value; its behavior is the set of arithmetic operations that may be applied to integers.
A class is analogous to an ADT; an object is analogous to a variable of an ADT.
The user of an object is called its client.

A stack may be built using a List ADT.
A stack interface defines four operations:
push
pop
getSize
isEmpty
The Stack object contains a List object which implements its state, and the behavior of the Stack object is implemented in terms of the List object's behavior.
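In Java, the four operations can be declared in an interface, and a class can implement them by containing a List object. The sketch below is one possible arrangement, not the text's own code: the names SimpleStack and ListStack are invented here, and java.util.ArrayList stands in for the List ADT.

```java
import java.util.ArrayList;
import java.util.List;

// The interface: what a client of the stack is allowed to see.
interface SimpleStack<E> {
    void push(E entry);
    E pop();
    int getSize();
    boolean isEmpty();
}

// The implementation: state is held by a contained List object,
// and each stack operation is expressed through List operations.
class ListStack<E> implements SimpleStack<E> {
    private final List<E> entries = new ArrayList<>();

    @Override
    public void push(E entry) {
        entries.add(entry); // the end of the list is the top of the stack
    }

    @Override
    public E pop() {
        if (entries.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return entries.remove(entries.size() - 1);
    }

    @Override
    public int getSize() {
        return entries.size();
    }

    @Override
    public boolean isEmpty() {
        return entries.isEmpty();
    }
}
```

A client that declares its variable as SimpleStack<String> depends only on the interface, so the contained list could be replaced without changing client code.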
Reuse by inheritance.
Data structures may be built by inheriting from other data structures; if data structure A inherits from data structure B, then every A object is-A B object.
An AVL Tree can inherit from a Binary Search Tree, since every AVL Tree is-A special, height-balanced, binary search tree.
Not every binary search tree is an AVL tree.
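The is-A relationship maps directly onto Java's extends keyword. The skeleton below contains no tree logic at all; the class names and the describe method are placeholders chosen only to show the direction of the relationship.

```java
// Skeleton only: no insertion, search, or balancing logic is included.
class BinarySearchTree {
    String describe() {
        return "binary search tree";
    }
}

// Every AvlTree object is-A BinarySearchTree object.
class AvlTree extends BinarySearchTree {
    @Override
    String describe() {
        // Reuses the inherited implementation and specializes it.
        return "height-balanced " + super.describe();
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        BinarySearchTree plain = new BinarySearchTree();
        BinarySearchTree avl = new AvlTree(); // allowed: an AVL tree is-A BST

        System.out.println(avl instanceof BinarySearchTree); // prints true
        System.out.println(plain instanceof AvlTree);        // prints false
        System.out.println(avl.describe()); // prints height-balanced binary search tree
    }
}
```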
Inheriting means sharing the implementation code.
A language that implements the OOP paradigm automatically ensures:
Separation of interface from implementation by encapsulation.
Code reuse by inheritance, when permitted.
2.5 How Do I Choose the Right Data Structures?
The interface of operations supported by a data structure is one factor to consider when choosing between several available data structures.
The efficiency of the data structure is another:
How much space does the data structure occupy?
What are the running times of the operations in its interface?

Example
Implementing a printer queue requires a queue data structure.
An application that maintains a collection of entries in no particular order: an unordered list would be the appropriate data structure in this case.
It is not too difficult to fit the requirements of the application to the operations supported by a data structure.
It is more difficult to choose from a set of candidate data structures that all meet the operational requirements.

An important, and sometimes contradictory, factor to consider:
The running time of each operation in the interface.
A data structure whose interface is the best fit may not necessarily be the best overall fit if the running times of its operations are not up to the mark.
When we have more than one data structure implementation whose interfaces satisfy our requirements, we may have to select one by comparing the running times of the interface operations.
Time is traded off for space, i.e. more space is consumed to increase speed, or a reduction in speed is traded for a reduction in space consumption.

Time-space tradeoff
We are looking to buy the best implementation of a stack.
StackA. Does not provide a getSize operation, i.e. there is no single operation that a client can use to get the number of entries in StackA.
StackB. Provides a getSize operation, implemented in the manner we discussed earlier, transferring entries back and forth between two stacks.
StackC. Provides a getSize operation, implemented as follows: a variable called size is maintained that is incremented every time an entry is pushed, and decremented every time an entry is popped.
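The three candidates can be sketched as below. This is one possible reading of the descriptions above, not the text's own code: the classes are written here as subclasses of StackA for brevity, and java.util.ArrayDeque is assumed as the underlying storage.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// StackA: push, pop, and isEmpty only; no getSize operation.
class StackA<E> {
    private final Deque<E> entries = new ArrayDeque<>();

    void push(E entry) { entries.push(entry); }
    E pop() { return entries.pop(); }
    boolean isEmpty() { return entries.isEmpty(); }
}

// StackB: no extra variable per stack; getSize moves every entry to a
// temporary stack and back, so its time grows with the number of entries.
class StackB<E> extends StackA<E> {
    int getSize() {
        StackA<E> temp = new StackA<>();
        int count = 0;
        while (!isEmpty()) {       // transfer out, counting as we go
            temp.push(pop());
            count++;
        }
        while (!temp.isEmpty()) {  // transfer back, restoring the order
            push(temp.pop());
        }
        return count;
    }
}

// StackC: one extra int per stack; getSize just returns that variable.
class StackC<E> extends StackA<E> {
    private int size = 0;

    @Override
    void push(E entry) {
        super.push(entry);
        size++;
    }

    @Override
    E pop() {
        E entry = super.pop(); // throws on an empty stack, before size changes
        size--;
        return entry;
    }

    int getSize() { return size; }
}
```

In this sketch StackB pays in time on every getSize call, while StackC pays in space (one int) for every stack that exists.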
Three situations:
1. Need to maintain a large number of stacks, with no need to find the number of entries.
2. Need to maintain only one stack, with frequent need to find the number of entries.
3. Need to maintain a large number of stacks, with infrequent need to find the number of entries.

Situation 1: StackA fits the bill.
It is tempting to pick StackC simply because we may want to play it safe (what if we need getSize in the future?), but with a large number of stacks the extra size variable per stack adds up.

Situation 2: StackB or StackC.
We need to use getSize.
getSize in StackB is more time-consuming than that in StackC.
Since we need only one stack, the additional size variable used by StackC is not an issue.
Since we need to use getSize frequently, it is better to go with StackC.

Situation 3 presents a choice between StackB and StackC.
If getSize calls are infrequent, we may choose to go with StackB and suffer a loss in speed.
The faster getSize delivered by StackC comes at the expense of an extra variable per stack, which may add up to considerable space consumption since we plan to maintain a large number of stacks.

getSize in StackB is more time-consuming than that in StackC.
How can we quantify the time taken in either case?
For each data structure we study, we present the running time of each operation in its interface.
