
Computational Thinking and Problem Solving Techniques


Unit III – Data Organization

What is data organization?

Data organization is the practice of categorizing and classifying data to make it more usable.
Just as a file folder keeps important documents together, data should be arranged in the most
logical and orderly fashion, so that you and anyone else who accesses it can easily find what
you are looking for.

Naming:

• Names should be unique. A name should refer to only one thing and never more than
one.

• One item should not have more than one name. If one item has two different names, it
can be confusing to communicate with other people (or other computing systems) about
that item.

• A name should be descriptive. When computing with data, the name of an item should
describe its role or function within the system.

• The name of an item should be related to the location of the item.

Lists:

• Are you the type of person that makes a list of everything?

• A list is a sequence of items that are arranged in a particular order.

• Consider the following list of the top five most expensive paintings of all
time.

1. The Card Players by Paul Cézanne

2. No. 5, 1948 by Jackson Pollock

3. Woman III by Willem de Kooning

4. Portrait of Adele Bloch-Bauer I by Gustav Klimt

5. Portrait of Dr. Gachet by Vincent van Gogh

Indexing:

• Indexing associates a unique number with every item in a set of data and therefore allows
the items to be identified by their index.

• Since indices are used to identify data within a list, an index can be understood as a
unique name for one of the items in the list.


• We will denote an item in a list by using brackets.

• If Paintings is the name of a list, we denote the ith painting as Paintings[i].

• For any computing system that uses zero as the starting index, we denote the first item
in the list as Paintings[0] and the second item in the list as Paintings[1].
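
The bracket notation can be tried directly in a language that uses zero as the starting index. The following is a minimal sketch in Python; the variable names `paintings`, `first`, and `second` are chosen here for illustration only.

```python
# The five paintings from the list above, stored in list order.
paintings = [
    "The Card Players",
    "No. 5, 1948",
    "Woman III",
    "Portrait of Adele Bloch-Bauer I",
    "Portrait of Dr. Gachet",
]

first = paintings[0]   # zero indexing: the first item is at index 0
second = paintings[1]  # the second item is at index 1
print(first)
print(second)
```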

Memory Location:

All of the data in a computer is stored at some location in memory and each memory location is
numbered as a list starting from zero.

Each memory location can store one word of data where a word is the smallest unit of data that
is naturally stored by some computing system.

Example – List of 5 Paintings


Discs Memory Arrangement

• Although compact discs (CDs) and digital video discs (DVDs) are physically shaped as
discs, the data stored on these devices is also arranged in a linear manner.

• Each disc is divided into concentric rings known as tracks. Each track is divided into
arcs known as sectors.

• Each sector is able to store one word of data.

• As a disc spins, a single track passes through a sequence of words whose indices
increase linearly.

Arrays:


An array is a collection of items stored at contiguous memory locations. The idea is to store
multiple items of the same type together. This makes it easy to calculate the position of each
element by adding an offset to a base value, i.e., the memory location of the first element
of the array (generally denoted by the name of the array). The first element is at index 0, so
its offset is zero; the offset of any other element is the difference between its index and the
index of the first element.

Each element can be uniquely identified by its index in the array, much as you could identify
friends standing on a staircase by the step each one is on.

Array's size

In the C language, an array has a fixed size: once the size is given, it cannot change, so the
array can neither shrink nor expand. Expanding is not possible because there is no guarantee
that the memory locations immediately after the array are free. Shrinking is not possible
because a declared array receives its memory statically, so the programmer cannot release
part of it.

Arrays – Storage

• An array is perhaps the simplest way of storing a list in memory.

• An array stores each item at the memory address corresponding to the item's position in
the list.


Possibility of Storing 2 Arrays:


• One very important property of an array is that once an array is stored in memory, the
array cannot change its location nor can it change its length.

• The array cannot expand to fill more memory nor can it shrink to take up less memory.

Accessing Array Elements

• One of the main advantages of arrays is that we can easily find any item in the array if
we know the index, or position, of the item in the list.

• If we want to find the second item in the list of paintings, we would name that item
Paintings[2].

• This name indicates to the computer where the item is located in memory. The
computer associates the name Paintings with the anchor (memory address 3) and then
uses the number 2 to compute an offset from the anchor.

• The ith item in an array A is named A[i], and the memory location of that item is given
by the following simple formula:

(anchor of array A) + (i - 1)

For example, Paintings[2] is stored at memory address 3 + (2 - 1) = 4.

Zero Indexing

• When using zero indexing, the first item in a list is considered to be at position 0 and
the second item in a list is considered to be at position 1 and so forth.

• If we adopt this convention, we can compute the memory address of the ith item in an
array by using the formula:

(anchor of array) + i
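
Both addressing formulas can be checked against a small simulated memory. This is only a sketch: the list `memory`, the helper names `address_one_based` and `address_zero_based`, and the memory size of 10 cells are assumptions made for illustration, with the anchor placed at address 3 as in the Paintings example.

```python
ANCHOR = 3  # memory address of the first item, as in the Paintings example

def address_one_based(anchor, i):
    """Address of the ith item when the first item is item 1."""
    return anchor + (i - 1)

def address_zero_based(anchor, i):
    """Address of the ith item when the first item is item 0."""
    return anchor + i

# A small simulated memory: one word per cell, addresses start at zero.
memory = [None] * 10
titles = ["The Card Players", "No. 5, 1948", "Woman III",
          "Portrait of Adele Bloch-Bauer I", "Portrait of Dr. Gachet"]
memory[ANCHOR:ANCHOR + len(titles)] = titles

print(memory[address_one_based(ANCHOR, 2)])   # second item, one-based name
print(memory[address_zero_based(ANCHOR, 1)])  # second item, zero-based name
```

Both calls resolve to the same memory cell; only the name given to the item differs between the two conventions.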

Deleting Array elements

Deleting an element from an array is very similar to deleting an item from a handwritten list.
You would likely erase the items in the list and then rewrite them without the deleted item.
If you also wanted to insert something into the list, you would simply rewrite the entire list
in the desired order. This process is not very efficient, since we essentially rewrite the
entire list any time we delete an item from it.
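
Deletion can be sketched as follows. Note that this sketch rewrites only the items that follow the deleted one, rather than the whole list, and it models a fixed-size array with a Python list whose length never changes; the function name `delete_at` and the use of `None` for a blank location are assumptions made for illustration.

```python
def delete_at(array, count, index):
    """Delete array[index] from a fixed-size array holding `count` items.

    Items after `index` are shifted one slot toward the front, and the
    vacated last slot is blanked. Returns the new item count.
    """
    for i in range(index, count - 1):
        array[i] = array[i + 1]
    array[count - 1] = None
    return count - 1

paintings = ["The Card Players", "No. 5, 1948", "Woman III",
             "Portrait of Adele Bloch-Bauer I", "Portrait of Dr. Gachet"]
count = delete_at(paintings, len(paintings), 1)  # remove "No. 5, 1948"
print(paintings[:count])
```

In the worst case (deleting the first item) every remaining item is rewritten, which is why deletion is considered costly for arrays.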


Inserting Array Elements:

We can also insert an element into the array, but only if there are blank locations at the end of
the array. Consider, for example, what would happen if someone purchased Bal du moulin de
la Galette, a painting by Pierre-Auguste Renoir, for $350 million. We would then insert this
item at the beginning of the paintings list, since the purchase price would make the painting
the most expensive in history. Insertion at the beginning of the list requires us to first shift
every item in the list one position toward the end and then to place the new item into the
anchor location. The shifts must take place from the end of the list working toward the
beginning of the list to prevent data loss.
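
The shifting order described above can be sketched as follows. The function name `insert_at_front` and the use of `None` for a blank location are assumptions made for illustration; the fixed-size array is modelled by a Python list whose length is never changed.

```python
def insert_at_front(array, count, item):
    """Insert `item` at index 0 of a fixed-size array holding `count` items."""
    if count >= len(array):
        raise ValueError("no blank location at the end of the array")
    # Shift from the end toward the beginning so no item is overwritten.
    for i in range(count, 0, -1):
        array[i] = array[i - 1]
    array[0] = item
    return count + 1

paintings = ["The Card Players", "No. 5, 1948", "Woman III",
             "Portrait of Adele Bloch-Bauer I", "Portrait of Dr. Gachet",
             None]  # one blank location at the end
count = insert_at_front(paintings, 5, "Bal du moulin de la Galette")
print(paintings)
```

If the loop ran from the beginning instead, the first copy would overwrite the second item before it had been moved, which is the data loss the text warns about.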


Advantages of using arrays:

• Arrays allow random access of elements. This makes accessing elements by position
faster.
• Arrays have better cache locality, which can make a significant difference in performance.

Disadvantages of using arrays:

• The size cannot be changed: once the array has been declared, its size is fixed because
the memory allocated to it is static.

Array Summary:

• Any list can be stored in memory as an array.
• The main advantage of using an array is that any element in an array can be easily
accessed by just knowing the position of the item in the list.
• The main disadvantages are that the size of the array is fixed and that inserting and
deleting items in the list will often require a significant number of steps.


Linked Lists:

Geocache Game

• A person will place a logbook and some special items, the treasure, into a waterproof
container.

• The person will then hide the container, also known as the cache, somewhere
interesting and record the GPS coordinates of the cache.

• When a player finds the cache, he or she will sign the logbook and perhaps even replace
the treasure they find with a treasure of their own.

A linked list resembles a multi-stage version of this game, in which each cache holds the
treasure together with the location of the next cache.

• Define a node (a container) to be an adjacent pair of memory cells.
• The first node value holds an item in the list and the second node value holds the
memory location of the next item in the list.
• In part (a) of this figure, we see 16 memory cells and the data that each cell contains.
• A proper interpretation of this data is shown in part (b), where the cell-pairs that
compose a single node are grouped together and the linkages are shown with arcs that
connect the nodes.
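
The cell-pair idea can be sketched with a simulated memory of 16 cells. The cell contents and addresses below are hypothetical and are not taken from the figure; the marker `END` for "no next node" is also an assumption made for illustration.

```python
END = -1  # hypothetical marker meaning "there is no next node"

# 16 memory cells. Each node is an adjacent pair: (item, address of next node).
memory = [None] * 16
memory[4], memory[5] = "The Card Players", 10
memory[10], memory[11] = "No. 5, 1948", 0
memory[0], memory[1] = "Woman III", END

def traverse(memory, head):
    """Follow the links from `head` and return the items in list order."""
    items = []
    address = head
    while address != END:
        items.append(memory[address])   # first cell of the pair: the item
        address = memory[address + 1]   # second cell: address of next node
    return items

print(traverse(memory, 4))
```

The nodes sit at addresses 4, 10, and 0, so the order of the list is determined by the links and not by the position of the nodes in memory.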

What is a Linked List?

• A linked list is a very commonly used linear data structure which consists of a group of
nodes in a sequence.
• Each node holds its own data and the address of the next node, hence forming a
chain-like structure.

Linked lists are used to create trees and graphs.


Types of Linked Lists:

There are 3 different implementations of a linked list:

1. Singly Linked List

2. Doubly Linked List

3. Circular Linked List

Let's look at each of them and how they differ from each other.

Singly Linked List

Singly linked lists contain nodes which have a data part as well as an address part i.e.
next, which points to the next node in the sequence of nodes.

The operations we can perform on singly linked lists are insertion, deletion and traversal.
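
The three operations can be sketched as follows. The class and method names (`Node`, `SinglyLinkedList`, `insert_front`, `delete`, `traverse`) are chosen here for illustration, and insertion is shown only at the front of the list.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data   # data part
        self.next = next   # address part: reference to the next node

class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def insert_front(self, data):
        """Insertion: the new node becomes the head of the list."""
        self.head = Node(data, self.head)

    def delete(self, data):
        """Deletion: unlink the first node holding `data`; True if found."""
        previous, current = None, self.head
        while current is not None and current.data != data:
            previous, current = current, current.next
        if current is None:
            return False
        if previous is None:
            self.head = current.next
        else:
            previous.next = current.next
        return True

    def traverse(self):
        """Traversal: visit every node from the head, following next links."""
        items = []
        current = self.head
        while current is not None:
            items.append(current.data)
            current = current.next
        return items

lst = SinglyLinkedList()
for title in ["Woman III", "No. 5, 1948", "The Card Players"]:
    lst.insert_front(title)
lst.delete("No. 5, 1948")
print(lst.traverse())
```

Unlike the array case, neither insertion nor deletion moves any other item; only one or two links are changed.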

Doubly Linked List

In a doubly linked list, each node contains a data part and two addresses, one for the
previous node and one for the next node.
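
A doubly linked node can be sketched as below. The names `DNode`, `link`, and `backward` are hypothetical; the point of the sketch is that the extra address makes it possible to walk the list from the last node back to the first.

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.prev = None  # address of the previous node
        self.next = None  # address of the next node

def link(values):
    """Build a doubly linked chain from `values`; return (head, tail)."""
    head = tail = None
    for value in values:
        node = DNode(value)
        if tail is None:
            head = node
        else:
            tail.next = node
            node.prev = tail
        tail = node
    return head, tail

def backward(tail):
    """Collect the data from the tail back to the head using prev links."""
    items = []
    node = tail
    while node is not None:
        items.append(node.data)
        node = node.prev
    return items

head, tail = link(["A", "B", "C"])
print(backward(tail))
```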


Circular Linked List:

In a circular linked list, the last node of the list holds the address of the first node, hence
forming a circular chain.
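
Because a circular list has no final "empty" link, a traversal has to stop when it arrives back at the node where it started. A minimal sketch, with the hypothetical names `CNode`, `make_circular`, and `traverse_once`:

```python
class CNode:
    def __init__(self, data):
        self.data = data
        self.next = self  # a single node points to itself

def make_circular(values):
    """Build a circular list from a non-empty sequence; return the first node."""
    head = CNode(values[0])
    tail = head
    for value in values[1:]:
        node = CNode(value)
        tail.next = node
        tail = node
    tail.next = head  # the last node holds the address of the first node
    return head

def traverse_once(head):
    """Visit each node exactly once, stopping when we return to `head`."""
    items = [head.data]
    node = head.next
    while node is not head:
        items.append(node.data)
        node = node.next
    return items

head = make_circular(["A", "B", "C"])
print(traverse_once(head))
```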


Advantages of Linked Lists

• They are dynamic in nature: memory is allocated only when it is required.

• Insertion and deletion operations can be easily implemented.

• Stacks and queues can be easily implemented.

• Inserting or deleting a node requires no shifting of other items, which reduces the time
needed for these operations.

Disadvantages of Linked Lists

• Extra memory is needed, since every node must also store a pointer.

• No element can be accessed randomly; the nodes have to be visited sequentially.

• Reverse traversal is difficult in a singly linked list.


Operations on Linked List:



Graphs:

• Many real-world objects and concepts can be modeled as a graph.
• A node typically represents some real-world item, and an edge is a connection between
two nodes.
• A graph is a set of nodes and a set of arcs.
• Graph G = (V, E) means that graph G is composed of a set of nodes, V, and a set of arcs,
E.

• V = {A,B,C,D,E}
• E = {(A,E), (A,B), (B,A), (B,D), (C,E), (D,C), (E,B), (E,C), (E,D)}
• G = ({A,B,C,D,E}, {(A,E), (A,B), (B,A), (B,D), (C,E), (D,C), (E,B), (E,C), (E,D)})
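
The definition G = (V, E) translates almost literally into code. In this sketch a vertex is a one-letter string and an arc is an ordered pair; using Python sets and tuples for this is a choice made here for illustration.

```python
V = {"A", "B", "C", "D", "E"}
E = {("A", "E"), ("A", "B"), ("B", "A"), ("B", "D"), ("C", "E"),
     ("D", "C"), ("E", "B"), ("E", "C"), ("E", "D")}
G = (V, E)

# Every arc must connect two vertices that belong to V.
well_formed = all(u in V and v in V for (u, v) in E)
print(len(V), len(E), well_formed)
```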


Terminologies:

• Adjacency: Assume that U and V are vertices in some graph.
   o Vertex U is adjacent to vertex V if there is an arc (U, V) in the graph.
   o For example, vertex D is adjacent to vertex C in G since the arc (D, C) appears in G.
   o Vertex C is not adjacent to D since the arc (C, D) is not in G.

• Loop: A loop is any arc such that the first and second nodes of the arc are the same.
   o Graph G does not have a loop.

• In-degree: The in-degree of a vertex V is the number of arcs in the graph having V as
the second vertex.
   o The in-degree of vertex E is 2 since G has arcs (A, E) and (C, E) as the only arcs
   where vertex E is the second vertex.

• Out-degree: The out-degree of a vertex V is the number of arcs in the graph having V as
the first vertex.
   o The out-degree of vertex E is 3 since G has arcs (E, B), (E, C), and (E, D) as the
   only arcs where vertex E is the first vertex.

• Order: The order of a graph is the number of vertices.
   o The order of G is 5 since there are 5 vertices.

• Size: The size of a graph is the number of arcs.
   o The size of G is 9 since there are 9 arcs.


• Path: A path is a sequence of vertices such that for every pair of adjacent vertices in the
sequence there is a corresponding arc in the graph. Also, a sequence containing a single
vertex is a path.
   o For example, the sequence [A, E, C] is a path since (A, E) and (E, C) are arcs in
   the graph.

• Path length: The length of a path is the number of arcs in the path. The length of [A, E,
C] is 2.

• Cycle: A cycle is a path where the length is greater than zero, and the first and last
vertices are the same.
   o A graph without any cycles is known as an acyclic graph. For example, [A, B, A]
   is a cycle and hence graph G is not acyclic.
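
The terminology above can be checked against graph G with a few short functions. This is a sketch only; the function names are hypothetical, and adjacency follows the definition used in these notes (U is adjacent to V if the arc (U, V) is in the graph).

```python
V = {"A", "B", "C", "D", "E"}
E = {("A", "E"), ("A", "B"), ("B", "A"), ("B", "D"), ("C", "E"),
     ("D", "C"), ("E", "B"), ("E", "C"), ("E", "D")}

def is_adjacent(u, v):
    """U is adjacent to V if the arc (U, V) is in the graph."""
    return (u, v) in E

def has_loop():
    return any(u == v for (u, v) in E)

def in_degree(v):
    return sum(1 for (_, second) in E if second == v)

def out_degree(v):
    return sum(1 for (first, _) in E if first == v)

def is_path(sequence):
    """A non-empty vertex sequence is a path if each consecutive pair is an arc."""
    if not sequence or any(v not in V for v in sequence):
        return False
    return all((a, b) in E for a, b in zip(sequence, sequence[1:]))

def path_length(sequence):
    return len(sequence) - 1  # number of arcs in the path

def is_cycle(sequence):
    return (is_path(sequence) and path_length(sequence) > 0
            and sequence[0] == sequence[-1])

print(is_adjacent("D", "C"), is_adjacent("C", "D"))
print(in_degree("E"), out_degree("E"))
print(len(V), len(E))                      # order and size of G
print(is_path(["A", "E", "C"]), path_length(["A", "E", "C"]))
print(is_cycle(["A", "B", "A"]))
```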

Graphs – Storage

• We can store a graph by first storing the value of the vertex (the name) and then storing
the out-degree of the vertex in the very next memory location.

• After this, we store N links where N is the out-degree of the vertex.

• Consider vertex A in the figure. Since the number 2 is stored in the memory address
immediately following node A, we now know that the out-degree of A is 2.

• We then understand that the next two values in memory are links to the two vertices
that are adjacent to A. In this example, we note that vertices B (stored at location 1) and
E (stored at location 14) are adjacent to A.
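
This storage scheme can be sketched as a flat list of memory cells. The order in which the vertices are laid out below is an assumption, so the addresses it produces will not necessarily match those in the figure; the names `store` and `linked_vertices` are hypothetical.

```python
# Out-going arcs of graph G, written as vertex -> list of target vertices.
arcs_from = {
    "A": ["B", "E"],
    "B": ["A", "D"],
    "C": ["E"],
    "D": ["C"],
    "E": ["B", "C", "D"],
}

def store(arcs_from):
    """Lay the graph out in memory.

    Each vertex record is: name, out-degree N, then N links, where a link
    is the address of the record of the vertex at the other end of the arc.
    """
    address = {}
    next_free = 0
    for name, targets in arcs_from.items():   # pass 1: assign addresses
        address[name] = next_free
        next_free += 2 + len(targets)
    memory = []
    for name, targets in arcs_from.items():   # pass 2: write the records
        memory.append(name)
        memory.append(len(targets))
        memory.extend(address[t] for t in targets)
    return memory, address

def linked_vertices(memory, at):
    """Read the record at address `at` and return the names it links to."""
    n = memory[at + 1]                         # out-degree
    return [memory[link] for link in memory[at + 2:at + 2 + n]]

memory, address = store(arcs_from)
print(memory)
print(linked_vertices(memory, address["A"]))
```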


Graphs – Representations

• Adjacency Matrix

• Incidence matrix

• Adjacency List

Adjacency Matrix:
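
As a sketch, an adjacency matrix for graph G can be built as a square table with one row and one column per vertex, where entry [u][v] is 1 if the arc (u, v) is in the graph and 0 otherwise. The row and column order A, B, C, D, E is an assumption made here.

```python
vertices = ["A", "B", "C", "D", "E"]
arcs = [("A", "E"), ("A", "B"), ("B", "A"), ("B", "D"), ("C", "E"),
        ("D", "C"), ("E", "B"), ("E", "C"), ("E", "D")]

index = {name: i for i, name in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
for u, v in arcs:
    matrix[index[u]][index[v]] = 1  # row = first vertex, column = second

for name, row in zip(vertices, matrix):
    print(name, row)
```

The sum of a row gives the out-degree of that vertex, and the sum of a column gives its in-degree.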


Incidence Matrix:

Adjacency List – By Linked List and Arrays
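
An adjacency list keeps, for each vertex, only the vertices reached by its out-going arcs. The sketch below uses a Python dictionary of lists in place of the array of linked lists named in the heading; that substitution is made here for brevity.

```python
arcs = [("A", "E"), ("A", "B"), ("B", "A"), ("B", "D"), ("C", "E"),
        ("D", "C"), ("E", "B"), ("E", "C"), ("E", "D")]

adjacency = {}
for u, v in arcs:
    adjacency.setdefault(u, []).append(v)  # record the arc under its first vertex
    adjacency.setdefault(v, [])            # make sure every vertex has an entry

for vertex in sorted(adjacency):
    print(vertex, "->", adjacency[vertex])
```

Compared with the adjacency matrix, this representation stores one entry per arc instead of one entry per pair of vertices.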


Hierarchies

• A hierarchy is an arrangement of elements such that the elements are arranged in
levels.

• Each element in the hierarchy may have many elements that are directly below but only
one element that is directly above.

• Organizational Chart

• Family Tree

Organizational Chart

• An organizational chart (often known as an org chart) is a diagram showing the
authority structure of an organization and the relationships and reporting lines that
exist between the people that are part of the organization.

Example - The figure below is a simplified organizational chart for the Internet
sales company Amazon. In this chart, there is one president, Jeffrey Bezos, who sits atop
the tree as the “root.” Four individuals hold the role of vice president and report directly to
Jeffrey Bezos. There are four directors and one senior manager. Since Thomas Szkutak is
above Tim Stone in this chart, we understand that Tim Stone reports directly to Thomas
Szkutak and that Thomas Szkutak has supervisory authority over Tim Stone. In addition,
since Jeffrey Bezos is above Thomas Szkutak, we understand that Thomas Szkutak reports
directly to Jeffrey Bezos and that Jeffrey Bezos has supervisory authority over Thomas
Szkutak.


Family Tree:

A family tree shows the genealogical relationship between ancestors and their descendants.
The lower levels of a family tree contain family members from recent generations, and the
upper levels of the hierarchy contain ancestors from many generations past. Family trees are
not purely hierarchical if they include information about both the maternal and paternal
ancestry, since a person will have two individuals, both a father and a mother, directly above
them.

This feature violates the constraint that each element in the hierarchy must have only one
element that is directly above. If, however, we restrict a family tree to showing only the
paternal or maternal ancestry of a person, the result is a proper hierarchy of family members.

Example - J.R.R. Tolkien, author of the popular Lord of the Rings trilogy, described
the family ancestry of the Bagginses covering a span of about five generations.


Trees:

• A tree is a type of graph that is intended to model hierarchical data.

• A tree is a graph that has the following characteristics:
• Exactly one vertex with in-degree zero; this vertex is known as the root.
• Every vertex other than the root has an in-degree of one.
• There is a path from the root to every other vertex.
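
The three characteristics can be tested mechanically. The following is a sketch, with the hypothetical function name `is_tree`; it counts in-degrees and then searches outward from the root to confirm that every vertex can be reached.

```python
def is_tree(vertices, arcs):
    """Return True if the graph (vertices, arcs) satisfies the tree conditions."""
    in_degree = {v: 0 for v in vertices}
    children = {v: [] for v in vertices}
    for u, v in arcs:
        in_degree[v] += 1
        children[u].append(v)

    # Condition 1: exactly one vertex with in-degree zero (the root).
    roots = [v for v in vertices if in_degree[v] == 0]
    if len(roots) != 1:
        return False
    root = roots[0]

    # Condition 2: every other vertex has an in-degree of one.
    if any(in_degree[v] != 1 for v in vertices if v != root):
        return False

    # Condition 3: there is a path from the root to every other vertex.
    seen = set()
    stack = [root]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(children[v])
    return seen == set(vertices)

print(is_tree(["R", "X", "Y", "Z"], [("R", "X"), ("R", "Y"), ("X", "Z")]))
# A root plus a separate two-vertex cycle meets conditions 1 and 2 but not 3.
print(is_tree(["R", "X", "P", "Q"], [("R", "X"), ("P", "Q"), ("Q", "P")]))
```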


Trees – List Representation

Trees – Left Child Right Sibling Representation:
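
In the left child right sibling representation, each node stores only two links: one to its first (leftmost) child and one to its next sibling. The sketch below assumes that standard layout; the names `TreeNode`, `add_child`, and `children`, and the small example tree, are hypothetical.

```python
class TreeNode:
    def __init__(self, name):
        self.name = name
        self.left_child = None     # link to the first (leftmost) child
        self.right_sibling = None  # link to the next sibling to the right

def add_child(parent, child):
    """Append `child` as the last child of `parent`."""
    if parent.left_child is None:
        parent.left_child = child
        return
    node = parent.left_child
    while node.right_sibling is not None:
        node = node.right_sibling
    node.right_sibling = child

def children(parent):
    """List the names of all children by walking the sibling chain."""
    names = []
    node = parent.left_child
    while node is not None:
        names.append(node.name)
        node = node.right_sibling
    return names

root = TreeNode("root")
a, b, c = TreeNode("a"), TreeNode("b"), TreeNode("c")
add_child(root, a)
add_child(root, b)
add_child(a, c)
print(children(root), children(a), children(b))
```

A node may have any number of children, yet every node has the same fixed size, because the children are chained together through their sibling links.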
