21BCS4620 AainaBhardwaj
ON
OF THE DEGREE OF
BACHELOR OF ENGINEERING
JUNE-JULY,2022
SUBMITTED BY:
CANDIDATE'S DECLARATION
I, Aaina Bhardwaj, hereby declare that I have undertaken Summer Training and developed a project
titled “Algorithms for Searching, Sorting, and Indexing” during the period from 17 June 2022
to 29 June 2022, in partial fulfillment of the requirements for the award of the degree of B.E.
(COMPUTER SCIENCE & ENGINEERING) at CHANDIGARH UNIVERSITY GHARUAN,
MOHALI. The work presented in this training report, submitted to the Department of
Computer Science & Engineering at CHANDIGARH UNIVERSITY GHARUAN, MOHALI, is an
authentic record of the training work.
A data structure is a specialized format for organizing and storing data. General data structure types
include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to
organize data to suit a specific purpose so that it can be accessed and worked with in appropriate
ways. Searching is the process of finding a particular item in a group of items. Not a single day
passes when we do not have to search for something in our day-to-day life, such as car keys, a mobile
charger, or books. The same is true of a computer: so much data is stored in it that whenever a
user asks for some data, the computer has to search its memory to find the data and make it available
to the user. Sorting refers to the rearrangement of data items in a particular order, such as putting
the items of a list into alphabetical or numerical order. There are well-known sorting techniques for
arranging elements in either ascending or descending order. Searching and sorting are among the most
basic problems in computer science, as they are used in most software applications. This report looks
at the techniques a computer uses to search and sort the elements in its memory.
Acknowledgement
friends for their support in the duration of the course. I was able
This course covers basics of algorithm design and analysis, as well as algorithms for sorting arrays, data structures
such as priority queues, hash functions, and applications such as Bloom filters.
Algorithms for Searching, Sorting, and Indexing can be taken for academic credit as part of CU Boulder’s Master
of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary
degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science,
Information Science, and others. With performance-based admissions and no application process, the MS-DS is
ideal for individuals with a broad range of undergraduate education and/or professional experience in computer
science, information science, mathematics, and statistics.
CONTENTS
Certificates by Coursera
Candidate’s Declaration
Abstract
Acknowledgement
About the Course
CHAPTER 1 INTRODUCTION
1.1 Background of the topic
1.2 Theoretical explanation
REFERENCES
CHAPTER 1 INTRODUCTION
Searching is the process of finding a value in a list of values. In other words, searching is the process
of locating the position of a given value in a list of values. Sorting refers to the operation of arranging
data in some given sequence, i.e., increasing or decreasing order. Sorting is categorized as internal
sorting and external sorting. Internal sorting arranges the elements of an array that resides entirely in
the computer's primary memory, whereas external sorting sorts the elements of an external file by
reading it from secondary memory.
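As a minimal sketch of what "locating the position of a given value" means, the simplest approach is a linear search that checks each element in turn. The function name `linear_search` and the convention of returning -1 when the value is absent are assumptions made for this illustration, not something fixed by the course.

```python
def linear_search(values, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    for index, value in enumerate(values):
        if value == target:
            return index
    return -1


if __name__ == "__main__":
    keys = [42, 7, 19, 7, 3]
    print(linear_search(keys, 19))   # position of 19 in the list
    print(linear_search(keys, 100))  # 100 is not in the list
```

A linear search makes no assumption about the order of the data, but in the worst case it has to look at every element.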
1.1 Background of the topic
An algorithm is a process, set of rules, or well-defined list of steps to be followed in calculations or
other problem-solving operations, especially by a computer. The analysis of an algorithm gives us a
general idea of how long the algorithm will take to solve a problem. Searching for an element and
arranging elements in a particular order are everyday jobs, and they are equally common in computer
programming. Because these tasks are so common, many different algorithms have been developed for
them. If the data is kept in sorted order, searching becomes much easier. After studying each
algorithm carefully and working through the illustrations, we can judge the performance of each
algorithm by how quickly it completes the assigned task. Given a linear ordering on the data members,
sorting is defined as arranging the data in ascending or descending order. Data organized in this way
allows information to be found more easily.
1.2 Theoretical explanation
Scheduling planes as they land or move onto the taxiway is an example of a scheduling problem for
which there are efficient algorithms. Scheduling and assignment problems of this kind come up in many
contexts. For example, once a plane is ready to take off, the runway needs to be scheduled among all
the planes that are waiting to take off. Every time you go to the airport, a lot of logistics is going on
in the background, and it is handled by algorithms. Handling checked-in bags so that they reach the
plane on which you are flying is a whole logistical problem of its own, with many algorithms behind it.
Similarly, every time you go to a gate to board, an airplane comes to that gate and takes you to your
destination. Where does this airplane come from? The airline usually does not reserve an airplane for
your route. Instead, the airline tries to assign the airplanes it has available to the routes it needs to
fly. If things get delayed, a different airplane may be brought into service, and if an airplane develops
a technical fault, a different plane is brought in to take you from A to B. Problems of this kind are
called vehicle routing problems or vehicle assignment problems. So a great many logistics problems are
solved every time you go to an airport to travel from destination A to destination B.
Another example where algorithms come in is package delivery. Suppose you send a package through
FedEx. FedEx has algorithms that schedule your package on the right plane or other transport to get
from A to B. Once the package arrives, a truck carrying a bunch of packages has to deliver them to a
bunch of addresses and then come back to the depot where it started. This is called the travelling
salesperson problem.
CHAPTER 2 TRAINING WORK UNDERTAKEN
2.1 Week 1
In this module the student learned the very basics of algorithms through three examples:
insertion sort (sorting an array in ascending/descending order); binary search (checking whether an
element is present in a sorted array and, if yes, finding its index); and merge sort (a faster method for
sorting an array). Through these algorithms the student is introduced to the analysis of algorithms, i.e.,
proving that the algorithm is correct for the task it has been designed for and establishing a bound on
how the time taken to execute the algorithm grows as a function of input size. The student is also
exposed to the notion of a faster algorithm and asymptotic complexity through the big-O, big-Omega
and big-Theta notations.
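The three algorithms named above can be sketched as follows. This is an illustrative version written for this report, assuming ascending order and Python lists; the function names and the choice to return -1 from `binary_search` for a missing element are assumptions, not the course's reference code.

```python
def insertion_sort(items):
    """Sort a list in place in ascending order and return it."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift larger elements one slot to the right to open a gap for key.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items


def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


def merge_sort(items):
    """Return a new list with the elements of items in ascending order."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    left = merge_sort(items[:middle])
    right = merge_sort(items[middle:])
    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

In the worst case insertion sort takes time proportional to n^2, merge sort takes time proportional to n log n, and binary search takes time proportional to log n, which is why merge sort is described as the faster sorting method.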
Learning Objectives
2.2 Week 2
In this module, the student learned the basics of data structures that organize data to make
certain types of operations faster. The module starts with a broad introduction to data structures and
covers some simple data structures such as first-in first-out queues and last-in first-out stacks.
Next, it introduces the heap data structure and the basic properties of heaps. This is followed by
algorithms for insertion, deletion and finding the minimum element of a heap, along with their time
complexities. Finally, the module covers the priority queue data structure and showcases some
applications.
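The heap operations mentioned above (insertion, deletion of the minimum, and finding the minimum) can be sketched with an array-backed binary min-heap. The class name `MinHeap` and its method names are hypothetical and chosen for this illustration; in everyday Python code the standard library module `heapq` is the usual choice.

```python
class MinHeap:
    """Array-backed binary min-heap: the parent of index i is (i - 1) // 2."""

    def __init__(self):
        self._data = []

    def __len__(self):
        return len(self._data)

    def find_min(self):
        """Return the smallest element without removing it (constant time)."""
        if not self._data:
            raise IndexError("find_min on an empty heap")
        return self._data[0]

    def insert(self, value):
        """Append value, then bubble it up while it is smaller than its parent."""
        data = self._data
        data.append(value)
        child = len(data) - 1
        while child > 0:
            parent = (child - 1) // 2
            if data[parent] <= data[child]:
                break
            data[parent], data[child] = data[child], data[parent]
            child = parent

    def delete_min(self):
        """Remove and return the smallest element."""
        data = self._data
        if not data:
            raise IndexError("delete_min on an empty heap")
        smallest = data[0]
        last = data.pop()
        if data:
            # Move the last element to the root and sink it to its place.
            data[0] = last
            self._sift_down(0)
        return smallest

    def _sift_down(self, parent):
        data = self._data
        size = len(data)
        while True:
            left, right = 2 * parent + 1, 2 * parent + 2
            smallest = parent
            if left < size and data[left] < data[smallest]:
                smallest = left
            if right < size and data[right] < data[smallest]:
                smallest = right
            if smallest == parent:
                return
            data[parent], data[smallest] = data[smallest], data[parent]
            parent = smallest
```

Both `insert` and `delete_min` follow a single root-to-leaf path, so each takes time proportional to log n for a heap of n elements. A priority queue can be built directly on these three operations.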
Learning Objectives
Understand the need for data structures and how these are used in programming.
Define the properties of a min/max heap and reason about the implication of these properties.
Analyze the correctness and running time of insertion, deletion and find minimum/maximum
operations on a heap.
Identify problems that can be solved efficiently using heaps and problems where heaps may be a
sub-optimal choice.
Understand the hashtable data structure, its organization and properties.
Learn the basic differences between a heap and a hashtable data structure. Understand when to
use a hashtable as opposed to a heap data structure.
2.3 Week 3
We will go through the quicksort and quickselect algorithms for sorting and selecting the kth smallest
element in an array efficiently. This will also be an introduction to the role of randomization in
algorithm design. Next, we will study hashtables: a highly useful data structure that allows for
efficient search and retrieval from large amounts of data. We will learn about the basic principles of
hashtables and the operations on them.
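A possible sketch of quicksort and quickselect is shown below. It assumes the Lomuto partition scheme with a randomly chosen pivot, which is one common choice and not necessarily the partition scheme taught in the course; the function names are likewise chosen for this illustration.

```python
import random


def partition(items, low, high):
    """Partition items[low..high] in place around a randomly chosen pivot.

    Returns the pivot's final index p, with every element left of p
    less than or equal to the pivot and every element right of p greater.
    """
    pivot_index = random.randint(low, high)
    items[pivot_index], items[high] = items[high], items[pivot_index]
    pivot = items[high]
    boundary = low  # items[low..boundary-1] are the elements <= pivot so far
    for i in range(low, high):
        if items[i] <= pivot:
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    items[boundary], items[high] = items[high], items[boundary]
    return boundary


def quicksort(items, low=0, high=None):
    """Sort items in place in ascending order and return the list."""
    if high is None:
        high = len(items) - 1
    if low < high:
        p = partition(items, low, high)
        quicksort(items, low, p - 1)
        quicksort(items, p + 1, high)
    return items


def quickselect(items, k):
    """Return the k-th smallest element of items (k = 1 is the minimum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    work = list(items)  # work on a copy so the caller's list is unchanged
    low, high = 0, len(work) - 1
    target = k - 1
    while True:
        p = partition(work, low, high)
        if p == target:
            return work[p]
        if p < target:
            low = p + 1   # the answer lies to the right of the pivot
        else:
            high = p - 1  # the answer lies to the left of the pivot
```

Quickselect reuses the same partition step but only continues into the side that contains the k-th position, which is why it can find the k-th smallest element without fully sorting the array.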
Learning Objectives
Understand and implement quicksort algorithm with a carefully designed partition scheme.
Prove correctness of the partition scheme and the algorithm.
Understand and implement the quickselect algorithm.
Perform an empirical evaluation of running time for quicksort and quickselect algorithms
Understand the pitfalls behind hash function design for hashtables and the use of randomization
to make hashtables work against the worst case.
2.4 Week 4
In this module, we learned randomized pivot selection for quicksort and quickselect. We
learned how to analyze the complexity of the randomized quicksort/quickselect algorithms. We
learned open address hashing: a technique that simplifies hashtable design. Next we studied the
design of hash functions and their analysis. Finally, we studied and analyzed Bloom filters, which are
used in various applications such as querying streaming data and counting.
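The two hashing ideas in this module can be sketched as follows. Both classes are simplified illustrations with hypothetical names: `LinearProbingSet` uses linear probing as its open addressing strategy, has a fixed capacity, does not support deletion, and cannot store `None`; `BloomFilter` derives its hash positions from SHA-256 with a seed prefix, which is an assumption made for convenience rather than the hash family analyzed in the course.

```python
import hashlib


class LinearProbingSet:
    """Fixed-capacity open-addressing hash set (insert and lookup only)."""

    def __init__(self, capacity=16):
        self._slots = [None] * capacity
        self._count = 0

    def __len__(self):
        return self._count

    def _probe(self, key):
        """Yield slot indices from the key's home slot onward, wrapping around."""
        start = hash(key) % len(self._slots)
        for step in range(len(self._slots)):
            yield (start + step) % len(self._slots)

    def add(self, key):
        for index in self._probe(key):
            if self._slots[index] is None:
                self._slots[index] = key
                self._count += 1
                return
            if self._slots[index] == key:
                return  # already present
        raise RuntimeError("table is full")  # a real table would resize instead

    def __contains__(self, key):
        for index in self._probe(key):
            if self._slots[index] is None:
                return False  # an empty slot ends the probe sequence
            if self._slots[index] == key:
                return True
        return False


class BloomFilter:
    """Approximate set: no false negatives, but false positives are possible."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._num_bits

    def add(self, item):
        for position in self._positions(item):
            self._bits[position] = 1

    def might_contain(self, item):
        return all(self._bits[position] for position in self._positions(item))
```

A Bloom filter never stores the items themselves, only the bits they hash to. An answer of `False` from `might_contain` is always correct, while an answer of `True` may occasionally be wrong, which is the trade-off that makes it useful for large or streaming data.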
Learning Objectives
In this course, we learned how to create an application that sorts a dataset of tax information. We
learned how to analyze a sorting algorithm's efficiency, and then how to implement each
algorithm in Java using easy-to-understand concepts.
Learning Objectives
We built a tax data sorting application in Java. We accomplished this by completing each task in the
project:
In this report we have studied different searching and sorting algorithms. Every searching and
sorting algorithm has advantages and disadvantages. Some sorting algorithms have been compared on
the basis of different factors such as complexity, number of passes, and number of comparisons. It is
also seen that many algorithms are problem oriented, so future work will try to make them global
oriented. Hence we can say that there is scope for future work, as follows.
2) Make problem-oriented sorting global oriented. In the end we would like to say that there is
huge scope for sorting and searching algorithms in the near future, and the search for an optimal
sorting algorithm means that work on sorting algorithms will go on. Each sorting algorithm has its own
advantages and disadvantages. Selection sort is an O(N^2) sort, and quick sort is an O(N log N) sort
on average.
• The advantage of Selection sort is that the number of swaps is O(N), and the disadvantage is that it
does not stop early if the list is already sorted: it still looks at every element in the list in turn. It is
suitable in cases where you need to select the maximum element in the list. Its time complexity is
O(N^2), the same as Bubble Sort, but it usually performs better than Bubble Sort because it makes
far fewer swaps.
• The advantage of Quick sort is that it is among the fastest sorts in practice and is O(N log N) on
average in both the number of comparisons and the number of swaps. The disadvantages are that its
worst case is O(N^2) and that the algorithm is a bit tricky to understand.
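The claim about selection sort's swap count can be checked with a small sketch. The function below is an illustrative implementation written for this report, and returning the number of swaps is an addition made only so the count can be observed.

```python
def selection_sort(items):
    """Sort items in place in ascending order; return the number of swaps made."""
    swaps = 0
    n = len(items)
    for start in range(n - 1):
        # Scan the unsorted suffix for its smallest element.
        smallest = start
        for i in range(start + 1, n):
            if items[i] < items[smallest]:
                smallest = i
        if smallest != start:
            items[start], items[smallest] = items[smallest], items[start]
            swaps += 1
    return swaps
```

Each pass of the outer loop performs at most one swap, so a list of N elements needs at most N - 1 swaps. The inner scan runs in full on every pass, even when the list is already sorted, which is where the O(N^2) comparisons come from.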