

SUMMER INSTITUTIONAL TRAINING REPORT

ON

Algorithms for Searching, Sorting, and Indexing

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE AWARD

OF THE DEGREE OF

BACHELOR OF ENGINEERING

(Computer Science & Engineering)

JUNE-JULY, 2022

SUBMITTED BY:

NAME(S) : AAINA BHARDWAJ

UNIVERSITY UID : 21BCS4620


CHANDIGARH UNIVERSITY, GHARUAN, MOHALI

CANDIDATE'S DECLARATION

I, Aaina Bhardwaj, hereby declare that I have undertaken Summer Training and developed a project
titled "Algorithms for Searching, Sorting, and Indexing" during the period from 17 June 2022
to 29 June 2022, in partial fulfillment of the requirements for the award of the degree of B.E.
(COMPUTER SCIENCE & ENGINEERING) at CHANDIGARH UNIVERSITY, GHARUAN,
MOHALI. The work presented in this training report, submitted to the Department of
Computer Science & Engineering at CHANDIGARH UNIVERSITY, GHARUAN, MOHALI, is an
authentic record of training work.

Signature of the Student

The training Viva-Voce Examination of __________ has been held on __________ and
accepted.

Signature of Internal Examiner Signature of External Examiner


Abstract

A data structure is a specialized format for organizing and storing data. Common data structure types
include the array, the file, the record, the table, and the tree. Every data structure is designed to
organize data for a specific purpose so that it can be accessed and manipulated appropriately.
Searching is the process of finding a particular item in a group of items. Hardly a day passes when
we do not have to search for something, such as car keys, a mobile charger, or books. The same is
true of a computer: it stores so much data that whenever a user asks for some of it, the computer has
to search its memory to find the data and make it available to the user. Sorting refers to rearranging
data items in a particular order, such as arranging the items of a list in alphabetical or numerical
order. Well-known sorting techniques arrange elements in either ascending or descending order.
Searching and sorting are among the most basic problems in computer science, as they are used in
most software applications. This report looks at the techniques a computer uses to search and sort
the elements in its memory.
Acknowledgement

I want to express my sincere thanks to the institution for its excellent guidance and support in
seeing my thesis through to completion. I would also like to thank my family, teachers and friends
for their support during the course. I gained a lot of knowledge from the training as well as from
creating the projects.

I also want to express my gratitude to our Coursera instructor for giving me the chance to work on a
project centered on the topic of algorithms. Without their assistance and suggestions, this project
could not have been completed.


About the Course

This course covers basics of algorithm design and analysis, as well as algorithms for sorting arrays, data structures
such as priority queues, hash functions, and applications such as Bloom filters.

Algorithms for Searching, Sorting, and Indexing can be taken for academic credit as part of CU Boulder’s Master
of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary
degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science,
Information Science, and others. With performance-based admissions and no application process, the MS-DS is
ideal for individuals with a broad range of undergraduate education and/or professional experience in computer
science, information science, mathematics, and statistics.
CONTENTS

Topic

Certificates by Coursera
Candidate’s Declaration
Abstract
Acknowledgement
About the Course

CHAPTER 1 INTRODUCTION
1.1 Background of the topic
1.2 Theoretical explanation

CHAPTER 2 TRAINING WORK UNDERTAKEN
2.1 Week 1
2.2 Week 2
2.3 Week 3
2.4 Week 4

CHAPTER 3 RESULTS AND DISCUSSION
3.1 Guided project
3.2 Tax data sorting application
3.3 Rhyme project

CHAPTER 4 CONCLUSION AND FUTURE SCOPE
4.1 Conclusion and Future Scope

REFERENCES
CHAPTER 1 INTRODUCTION

Searching is the process of finding a value in a list of values; in other words, it is the process of
locating the position of a given value in a list. Sorting refers to arranging data in a given
sequence, i.e., in increasing or decreasing order. Sorting is categorized as internal or external.
Internal sorting arranges the elements of an array that resides entirely in the computer's primary
memory, whereas external sorting arranges the elements of an external file by reading it from
secondary memory.
1.1 Background of the topic

An algorithm is a process, set of rules, or well-defined list of steps to be followed in calculations
or other problem-solving operations, especially by a computer. Analyzing an algorithm gives us a
general idea of how long it will take to solve a problem. Searching for an element and arranging
elements in a particular order are everyday jobs, and they are just as common in computer
programming. Because these tasks are so common, many different algorithms have been developed
for them. If the data is kept in sorted order, searching becomes much easier. After studying each
algorithm carefully and working through examples, we can judge its performance by how quickly it
completes the assigned task. Sorting is defined as arranging data in ascending or descending order
according to a linear ordering among the data items. Organizing data in this way makes information
easier to find.
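
To illustrate why sorted data makes searching easier, here is a minimal Python sketch that counts how many elements each approach inspects. The function names and the sample data are illustrative assumptions and are not taken from the course material.

```python
def linear_search(items, target):
    """Scan left to right; return (index, elements inspected), index -1 if absent."""
    for i, value in enumerate(items):
        if value == target:
            return i, i + 1
    return -1, len(items)


def binary_search(sorted_items, target):
    """Halve the search range each step; requires ascending order."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1, probes


data = list(range(0, 2000, 2))  # 1000 sorted even numbers (made-up sample data)
```

On this sample, searching for the last element makes the linear scan inspect all 1000 items, while binary search needs only about log2(1000), roughly 10, probes.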
1.2 Theoretical explanation

Scheduling planes to take off, or to land and move onto the taxiway, is an example of a scheduling
problem for which there are efficient algorithms. Scheduling and assignment problems come up in
many contexts. For example, once planes are ready to take off, the runway needs to be scheduled for
all the planes that are waiting to use it. Every time you go to the airport, a lot of logistics is going
on in the background, and it is handled by algorithms. Getting checked-in bags onto the plane on
which you are flying is a whole logistical problem of its own, with many algorithms behind it.

When you go to a gate to board, an airplane comes to the gate and takes you to your destination.
Where does this airplane come from? The airline usually does not reserve an airplane for your route.
Instead, it schedules the airplanes it has available onto the routes it needs to fly. If things get
delayed, or if there is a technical fault in your airplane, a different plane is brought into service
to take you from A to B. Problems of this kind are called vehicle routing problems or vehicle
assignment problems. So many logistics problems are being solved every time you go to an airport to
travel from A to B.

Another example is package delivery. Suppose you send a package through FedEx. FedEx has
algorithms that schedule your package onto the right plane or other transport to get from A to B.
At B, a truck carrying a bunch of packages must deliver them to a bunch of addresses and then come
back to the depot where it started. Finding the shortest route for this truck is called the
travelling salesperson problem.
CHAPTER 2 TRAINING WORK UNDERTAKEN

2.1 Week 1

In this module, the student learned the very basics of algorithms through three examples:
insertion sort (sorting an array in ascending or descending order); binary search (checking whether
an element is present in a sorted array and, if so, finding its index); and merge sort (a faster
method for sorting an array). Through these algorithms the student was introduced to the analysis of
algorithms, i.e., proving that an algorithm is correct for the task it has been designed for and
establishing a bound on how the time taken to execute it grows as a function of input size. The
student was also exposed to the notion of a faster algorithm and to asymptotic complexity through
the big-O, big-Omega and big-Theta notations.
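
As an illustration of proving correctness with an invariant, the following Python sketch of insertion sort checks its inductive invariant after every pass. It is a sketch written for this report, not the course's reference implementation; the `assert` line exists only to make the invariant visible and would normally be removed.

```python
def insertion_sort(items):
    """Return a new list with the elements of items in ascending order."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift every element larger than key one slot to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
        # Inductive invariant: the prefix result[0..i] is sorted after pass i.
        assert all(result[k] <= result[k + 1] for k in range(i))
    return result
```

In the worst case (a reversed input) the inner loop shifts i elements on pass i, which gives the O(n^2) bound. On an already sorted input the inner loop never runs, so the time is O(n).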

Learning Objectives

• Analyze computational complexity of algorithms.
• Prove correctness of algorithms using inductive invariants.
• Analyze the running time of binary search, insertion sort, bubble sort and merge sort algorithms.
• Prove correctness of binary search, insertion sort, bubble sort and merge sort algorithms.
• Compare running time of algorithms through big-O, big-Omega and big-Theta notations.
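
Merge sort, the faster sorting method named in this module, can be sketched in Python as follows. The helper name `merge` is an illustrative choice. Each call splits the input in half and then merges in linear time, so the running time satisfies T(n) = 2T(n/2) + O(n), which solves to Theta(n log n).

```python
def merge(left, right):
    """Merge two ascending lists into one ascending list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new ascending list; recursion depth is about log2(n)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```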
2.2 Week 2

In this module, the student learned the basics of data structures that organize data to make
certain types of operations faster. The module started with a broad introduction to data structures
and covered some simple ones, such as first-in first-out queues and last-in first-out stacks.
Next, it introduced the heap data structure and the basic properties of heaps. This was followed by
algorithms for insertion, deletion and finding the minimum element of a heap, along with their time
complexities. Finally, we studied the priority queue data structure and some of its
applications.
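
The heap operations described above can be sketched as an array-based binary min-heap. The class and method names below are assumptions made for illustration. Finding the minimum is O(1), while insertion and deletion of the minimum are O(log n) because an element moves along at most one root-to-leaf path.

```python
class MinHeap:
    """Array-based binary min-heap: every parent is <= its children."""

    def __init__(self):
        self._data = []

    def __len__(self):
        return len(self._data)

    def find_min(self):
        if not self._data:
            raise IndexError("heap is empty")
        return self._data[0]

    def insert(self, value):
        # Append at the end, then bubble up while smaller than the parent.
        self._data.append(value)
        i = len(self._data) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self._data[parent] <= self._data[i]:
                break
            self._data[parent], self._data[i] = self._data[i], self._data[parent]
            i = parent

    def delete_min(self):
        if not self._data:
            raise IndexError("heap is empty")
        smallest = self._data[0]
        last = self._data.pop()
        if self._data:
            # Move the last element to the root, then sift it down.
            self._data[0] = last
            self._sift_down(0)
        return smallest

    def _sift_down(self, i):
        n = len(self._data)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < n and self._data[left] < self._data[smallest]:
                smallest = left
            if right < n and self._data[right] < self._data[smallest]:
                smallest = right
            if smallest == i:
                return
            self._data[i], self._data[smallest] = self._data[smallest], self._data[i]
            i = smallest
```

Repeatedly calling `delete_min` returns the elements in ascending order, which is how a heap serves as a priority queue.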

Learning Objectives

• Understand the need for data structures and how these are used in programming.
• Define the properties of a min/max heap and reason about the implication of these properties.
• Analyze the correctness and running time of insertion, deletion and find minimum/maximum
operations on a heap.
• Identify problems that can be solved efficiently using heaps and problems where heaps may be a
sub-optimal choice.
• Understand hashtable data structure, its organization and properties.
• Learn the basic differences between a heap and a hashtable data-structure. Understand when to
use a hashtable as opposed to a heap data-structure.
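
The difference between the two structures can be illustrated with Python's built-in `dict` (a hash table) and the standard `heapq` module. The sample scores and the function names are made up for this sketch. A hash table answers "what is stored under this key?" quickly but keeps no order, whereas a heap gives quick access to the smallest element but no fast lookup by key.

```python
import heapq

scores = {"ana": 72, "ben": 58, "cy": 91}  # hypothetical sample data


def lookup(table, key):
    """Hash table: expected O(1) lookup by key; returns None if absent."""
    return table.get(key)


def smallest_k(values, k):
    """Heap: build in O(n), then pop the k smallest in O(k log n)."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(min(k, len(heap)))]
```

A hash table is the better fit when the operation is membership or lookup by key, and a heap is the better fit when the smallest (or largest) item is needed repeatedly.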
2.3 Week 3

In this module, we went through the quicksort and quickselect algorithms for sorting an array and
for selecting its kth smallest element efficiently. This also served as an introduction to the role
of randomization in algorithm design. Next, we studied hashtables: a highly useful data structure
that allows for efficient search and retrieval from large amounts of data. We learned about the
basic principles of hashtables and the operations on them.
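
A minimal in-place quicksort can be sketched as below. It uses the Lomuto partition scheme with the last element as pivot, which is an assumption made for illustration; the course's own partition scheme may differ.

```python
def partition(a, lo, hi):
    """Lomuto partition of a[lo..hi] around pivot a[hi]; returns the pivot's final index."""
    pivot = a[hi]
    i = lo  # a[lo..i-1] holds the elements <= pivot seen so far
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place in ascending order."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quicksort(a, lo, p - 1)   # elements <= pivot
        quicksort(a, p + 1, hi)   # elements > pivot
```

With a fixed last-element pivot, an already sorted input produces maximally unbalanced partitions and O(n^2) time, which is one motivation for choosing the pivot at random.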

Learning Objectives

• Understand and implement quicksort algorithm with a carefully designed partition scheme.
• Prove correctness of the partition scheme and the algorithm.
• Understand and implement the quickselect algorithm.
• Perform an empirical evaluation of running time for quicksort and quickselect algorithms.
• Understand the pitfalls behind hash function design for hashtables and the use of randomization
to make hashtables work against the worst case.
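
Quickselect with a randomly chosen pivot can be sketched as follows. For readability this version builds new lists instead of partitioning in place, which is a simplification and not necessarily how the course implements it. After each partition only one side can contain the answer, so the expected running time is O(n).

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element of items (k = 1 gives the minimum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    candidates = list(items)
    while True:
        pivot = random.choice(candidates)  # random pivot guards against bad inputs
        smaller = [x for x in candidates if x < pivot]
        equal = [x for x in candidates if x == pivot]
        larger = [x for x in candidates if x > pivot]
        if k <= len(smaller):
            candidates = smaller
        elif k <= len(smaller) + len(equal):
            return pivot
        else:
            # Skip everything <= pivot and keep searching among the larger ones.
            k -= len(smaller) + len(equal)
            candidates = larger
```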
2.4 Week 4

In this module, we learned randomized pivot selection for quicksort and quickselect, and how to
analyze the complexity of the randomized quicksort and quickselect algorithms. We learned open
address hashing: a technique that simplifies hashtable design. Next we studied the design of hash
functions and their analysis. Finally, we covered and analyzed Bloom filters, which are
used in various applications such as querying streaming data and counting.
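
Open address hashing stores every key directly in one array and resolves collisions by probing other slots. The sketch below uses linear probing with a fixed capacity and "tombstone" markers for deletion. The class name, the capacity, and the lack of resizing are all simplifying assumptions for illustration.

```python
_EMPTY = object()    # slot has never held a key
_DELETED = object()  # tombstone: slot held a key that was deleted


class LinearProbingSet:
    """Fixed-capacity open-address hash set using linear probing."""

    def __init__(self, capacity=8):
        self._slots = [_EMPTY] * capacity

    def _probe(self, key):
        # Visit every slot once, starting at the key's home position.
        start = hash(key) % len(self._slots)
        for step in range(len(self._slots)):
            yield (start + step) % len(self._slots)

    def contains(self, key):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is _EMPTY:
                return False  # an empty slot ends the probe sequence
            if slot is not _DELETED and slot == key:
                return True
        return False

    def insert(self, key):
        if self.contains(key):
            return
        for i in self._probe(key):
            if self._slots[i] is _EMPTY or self._slots[i] is _DELETED:
                self._slots[i] = key
                return
        raise RuntimeError("table is full")

    def delete(self, key):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot == key:
                # A tombstone keeps later keys in this probe chain reachable.
                self._slots[i] = _DELETED
                return True
        return False
```

Deletion cannot simply empty the slot, because a later lookup would stop at the gap and miss keys that were placed further along the same probe sequence.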

Learning Objectives

• Design open-address hash-tables and algorithms for insertion and deletion.
• Cuckoo hashing, double hashing and variants.
• Bloom filters and their applications.
• Understand how to use hash-tables for sketching over streaming data.
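
A Bloom filter answers "have I possibly seen this item?" using a bit array and several hash functions. It can report false positives but never false negatives. The sketch below derives its hash functions from SHA-256 with different seeds and stores one byte per bit for clarity. These choices, like the default sizes, are assumptions for illustration only.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))
```

Because only bits are stored and not the items themselves, the memory used stays fixed no matter how many items stream past, which is why the structure suits streaming data.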
CHAPTER 3 RESULTS AND DISCUSSION

3.1 Guided Project

In this guided project, we learned how to create an application that sorts a dataset of tax
information. We learned how to analyze a sorting algorithm's efficiency and then how to implement
each algorithm in Java using easy-to-understand concepts.

Learning Objectives

• Discuss the overall structure of the program and dataset.
• Implement and understand the theoretical details of bubble sort in Java.
• Examine an algorithm to determine its time/memory complexity.
• Implement and understand the theoretical details of mergesort in Java.
• Implement and understand the theoretical details of insertion sort in Java.
• Implement and understand the theoretical details of selection sort in Java.
• Implement and understand the theoretical details of quicksort in Java.
• Give examples of when each sorting algorithm is ideal.
3.2 Tax data sorting application

We built a tax data sorting application in Java. We accomplished this by completing each task in the
project:

Task 1: Overview of the Program Design and of our Tools

Task 2: Bubble Sort and Big-O Notation

Task 3: Insertion Sort

Task 4: Selection Sort

Task 5: Mergesort Overview

Task 6: Mergesort Implementation

Task 7: Quicksort Overview

Task 8: Quicksort implementation


3.3 Rhyme project
CHAPTER 4 CONCLUSION AND FUTURE SCOPE

In this report we have studied different searching and sorting algorithms. Every searching and
sorting algorithm has advantages and disadvantages. Some sorting algorithms have been compared on
the basis of factors such as complexity, number of passes, and number of comparisons. Many
algorithms are also tailored to a specific problem, so future work can try to make them more
general-purpose. Hence there is scope for future work, as follows.

1) Remove the disadvantages of various fundamental and advanced sorting algorithms.

2) Make problem-oriented sorting algorithms more general-purpose.

In the end, we would like to say that there is huge scope for sorting and searching algorithms in
the near future, and the search for an optimal sorting algorithm will go on. Each sorting algorithm
has its own advantages and disadvantages. For example, selection sort is an O(N^2) sort, while
quicksort is an O(N log N) sort on average.

• The advantage of selection sort is that the number of swaps is O(N). The disadvantage is that it
does not stop early if the list is already sorted; it still looks at every element in the list in
turn. It is suitable in cases where you need to select the maximum element in the list. Its time
complexity is the same as that of bubble sort, O(N^2), but it usually performs better in practice
because it makes far fewer swaps.

• The advantage of quicksort is that it is among the fastest sorts in practice and is O(N log N) on
average in both the number of comparisons and the number of swaps. The disadvantages are that its
worst case is O(N^2) and that the algorithm is a bit tricky to understand.
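
The swap counts compared above can be checked with a small Python sketch. The function names are illustrative, and the bubble sort shown includes the common early-exit optimization. Both functions return the sorted list together with the number of swaps performed.

```python
def selection_sort_swaps(items):
    """Selection sort; returns (sorted list, swaps). At most n - 1 swaps."""
    a = list(items)
    swaps = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):  # always scans the whole unsorted suffix
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
            swaps += 1
    return a, swaps


def bubble_sort_swaps(items):
    """Bubble sort with early exit; returns (sorted list, swaps)."""
    a = list(items)
    swaps = 0
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for j in range(end):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
                swapped = True
        if not swapped:
            break  # no swaps in a full pass means the list is sorted
    return a, swaps
```

On a reversed list of 10 elements, bubble sort performs one swap per inversion, 45 in total, while selection sort needs at most 9.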
