Introduction

Data Structures - Introduction

Class Overview

Introduction to many of the basic data structures used in computer software

- Understand the data structures
- Analyze the algorithms that use them
- Know when to apply them

Practice design and analysis of data structures.

Practice using these data structures by writing programs.

Make the transformation from programmer to computer scientist.

Goals

You will understand

- what the tools are for storing and processing common data types
- which tools are appropriate for which need

So that you can

- make good design choices as a developer, project manager, or system customer

You will be able to

- justify your design decisions via formal reasoning
- communicate ideas about programs clearly and precisely

Goals

"I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

Linus Torvalds, 2006

Goals

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."

Fred Brooks, 1975

Data Structures

"Clever" ways to organize information in order to enable efficient computation

- What do we mean by clever?
- What do we mean by efficient?

Picking the best Data Structure for the job

The data structure you pick needs to support the operations you need.

Ideally it supports the operations you will use most often in an efficient manner.

Examples of operations:

- A List with operations insert and delete
- A Stack with operations push and pop

Terminology

Abstract Data Type (ADT)
- Mathematical description of an object with a set of operations on the object. Useful building block.

Algorithm
- A high level, language independent description of a step-by-step process

Data structure
- A specific family of algorithms for implementing an abstract data type.

Implementation of data structure
- A specific implementation in a specific language

Terminology examples

A stack is an abstract data type supporting push, pop and isEmpty operations

A stack data structure could use an array, a linked list, or anything that can hold data

One stack implementation is java.util.Stack; another is java.util.LinkedList
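
The distinction between an ADT and the data structures behind it can be sketched in a few lines. The following is a minimal illustration, not the course's code: the class names `Stack`, `ArrayStack` and `LinkedStack` are hypothetical, and Python is used only for brevity.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The abstract data type: only the operations, no storage decisions."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...


class ArrayStack(Stack):
    """Data structure choice 1: a growable array (a Python list)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack(Stack):
    """Data structure choice 2: a singly linked list of (item, next) pairs."""

    def __init__(self):
        self._head = None  # None marks the empty list

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item

    def is_empty(self):
        return self._head is None
```

Code written against `Stack` behaves the same with either class; only the cost profile and memory layout differ.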

Concepts vs. Mechanisms

Concepts:
- Abstract
- Pseudocode
- Algorithm: a sequence of high-level, language independent operations, which may act upon an abstracted view of data.
- Abstract Data Type (ADT): a mathematical description of an object and the set of operations on the object.

Mechanisms:
- Concrete
- Specific programming language
- Program: a sequence of operations in a specific programming language, which may act upon real data in the form of numbers, images, sound, etc.
- Data structure: a specific way in which a program's data is represented, which reflects the programmer's design choices/goals.

Why So Many Data Structures?

Ideal data structure: "fast", "elegant", memory efficient

Generates tensions:
- time vs. space
- performance vs. elegance
- generality vs. simplicity
- one operation's performance vs. another's

The study of data structures is the study of tradeoffs. That's why we have so many of them!

Today's Outline

- Introductions
- Administrative Info
- What is this course about?
- Review: Queues and stacks

First Example: Queue ADT

FIFO: First In First Out

Queue operations
- create
- destroy
- enqueue
- dequeue
- is_empty

[Figure: G is enqueued behind F E D C B; A is dequeued from the front.]

Circular Array Queue Data Structure

[Figure: array Q with indices 0 to size - 1, holding b c d e f, with front and back markers.]

    enqueue(Object x) {
      Q[back] = x;
      back = (back + 1) % size;
    }

    dequeue() {
      x = Q[front];
      front = (front + 1) % size;
      return x;
    }
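
The slide's pseudocode leaves out the empty and full cases. A minimal runnable sketch that adds them is shown here; the explicit element counter and the class name `CircularArrayQueue` are assumptions of this example, not part of the slide.

```python
class CircularArrayQueue:
    """Fixed-capacity FIFO queue stored in a circular array."""

    def __init__(self, capacity):
        self._q = [None] * capacity
        self._front = 0  # index of the oldest element
        self._back = 0   # index where the next element will be written
        self._count = 0  # assumption: size tracked explicitly to tell full from empty

    def is_empty(self):
        return self._count == 0

    def is_full(self):
        return self._count == len(self._q)

    def enqueue(self, x):
        if self.is_full():
            raise OverflowError("queue is full")
        self._q[self._back] = x
        self._back = (self._back + 1) % len(self._q)  # wrap around to index 0
        self._count += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        x = self._q[self._front]
        self._q[self._front] = None  # drop the reference held by the array
        self._front = (self._front + 1) % len(self._q)
        self._count -= 1
        return x
```

Without the counter, `front == back` would be ambiguous: it holds both when the queue is empty and when it is completely full.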

Linked List Queue Data Structure

[Figure: linked list b -> c -> d -> e -> f, with front at b and back at f.]

    void enqueue(Object x) {
      if (is_empty()) {
        front = back = new Node(x);
      } else {
        // both statements belong to the else branch
        back->next = new Node(x);
        back = back->next;
      }
    }

    bool is_empty() {
      return front == null;
    }

    Object dequeue() {
      assert(!is_empty());
      return_data = front->data;
      temp = front;
      front = front->next;
      delete temp;
      return return_data;
    }
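
For comparison, the same queue can be sketched as runnable code. This is an illustrative version under stated assumptions: the names `_Node` and `LinkedQueue` are hypothetical, and resetting `back` when the queue empties is an addition the slide does not show.

```python
class _Node:
    """One cell of the singly linked list."""

    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedQueue:
    """FIFO queue: enqueue at the back, dequeue at the front."""

    def __init__(self):
        self._front = None
        self._back = None

    def is_empty(self):
        return self._front is None

    def enqueue(self, x):
        node = _Node(x)
        if self.is_empty():
            self._front = self._back = node
        else:
            self._back.next = node
            self._back = node

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        data = self._front.data
        self._front = self._front.next
        if self._front is None:
            self._back = None  # queue became empty; do not keep a stale back node
        return data
```

Python's garbage collector takes the place of the explicit `delete temp` in the pseudocode.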

Circular Array vs. Linked List

Circular Array:
- Too much space
- Kth element accessed "easily"
- Not as complex
- Could make array more robust

Linked List:
- Can grow as needed
- Can keep growing
- No back looping around to front
- Linked list code more complex

Second Example: Stack ADT

LIFO: Last In First Out

Stack operations
- create
- destroy
- push
- pop
- top
- is_empty

[Figure: items A, B, C, D, E, F pushed onto a stack; F, the last item in, is the first out.]

Stacks in Practice

- Function call stack
- Removing recursion
- Balancing symbols (parentheses)
- Evaluating Reverse Polish Notation
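
Two of these uses are short enough to sketch. The functions below are illustrative only (the names `is_balanced` and `eval_rpn` are hypothetical), with a Python list standing in for the stack and RPN tokens assumed to be separated by whitespace.

```python
def is_balanced(text):
    """Check that (), [] and {} are properly nested in text."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)  # remember every opener
        elif ch in pairs:
            # a closer must match the most recently seen opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers mean unbalanced input


def eval_rpn(tokens):
    """Evaluate a Reverse Polish Notation expression given as a token list."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            b = stack.pop()  # right operand was pushed last
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

For instance, `eval_rpn("3 4 + 2 *".split())` evaluates (3 + 4) * 2.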

Data Structures

Asymptotic Analysis

Algorithm Analysis: Why?

Correctness:
- Does the algorithm do what is intended?

Performance:
- What is the running time of the algorithm?
- How much storage does it consume?

Different algorithms may be correct
- Which should I use?

Recursive algorithm for sum

Write a recursive function to find the sum of the first n integers stored in array v.
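
One possible answer, as a sketch rather than the course's reference solution (the name `sum_first` is hypothetical):

```python
def sum_first(v, n):
    """Return v[0] + v[1] + ... + v[n-1], computed recursively."""
    if n == 0:
        return 0  # base case: the sum of zero elements
    # recursive case: last of the first n elements plus the sum of the first n-1
    return v[n - 1] + sum_first(v, n - 1)
```

The base case and the single recursive call are the two pieces an induction proof of correctness reasons about.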

Proof by Induction

Basis Step: The algorithm is correct for a base case or two by inspection.

Inductive Hypothesis (n=k): Assume that the algorithm works correctly for the first k cases.

Inductive Step (n=k+1): Given the hypothesis above, show that the k+1 case will be calculated correctly.

Program Correctness by Induction

Basis Step: sum(v,0) = 0.

Inductive Hypothesis (n=k): Assume sum(v,k) correctly returns the sum of the first k elements of v, i.e. v[0]+v[1]+…+v[k-1]

Inductive Step (n=k+1): sum(v,n) returns v[k]+sum(v,k) = (by inductive hyp.) v[k]+(v[0]+v[1]+…+v[k-1]) = v[0]+v[1]+…+v[k-1]+v[k]

Algorithms vs Programs

Proving correctness of an algorithm is very important
- a well designed algorithm is guaranteed to work correctly and its performance can be estimated

Proving correctness of a program (an implementation) is fraught with weird bugs
- Abstract Data Types are a way to bridge the gap between mathematical algorithms and programs

Comparing Two Algorithms

GOAL: Sort a list of names

- "I'll buy a faster CPU"
- "I'll use C++ instead of Java - wicked fast!"
- "Ooh look, the -O4 flag!"
- "Who cares how I do it, I'll add more memory!"
- "Can't I just get the data pre-sorted??"

Comparing Two Algorithms

What we want:
- Rough Estimate
- Ignores Details

Really, independent of details
- Coding tricks, CPU speed, compiler optimizations, …
- These would help any algorithms equally
- Don't just care about running time: not a good enough measure

Big-O Analysis

Ignores "details"

What details?
- CPU speed
- Programming language used
- Amount of memory
- Compiler
- Order of input
- Size of input … sorta.

Analysis of Algorithms

Efficiency measure
- how long the program runs (time complexity)
- how much memory it uses (space complexity)

Why analyze at all?
- Decide what algorithm to implement before actually doing it
- Given code, get a sense for where bottlenecks must be, without actually measuring it

Asymptotic Analysis

Complexity as a function of input size n

- T(n) = 4n + 5
- T(n) = 0.5 n log n - 2n + 7
- T(n) = 2^n + n^3 + 3n

What happens as n grows?
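
One way to answer that question is to tabulate the three functions for growing n. The sketch below assumes base-2 logarithms (the slide does not state a base), and the function names are hypothetical.

```python
import math


def t_linear(n):
    return 4 * n + 5


def t_nlogn(n):
    # assumption: "log" means log base 2
    return 0.5 * n * math.log2(n) - 2 * n + 7


def t_exp(n):
    return 2 ** n + n ** 3 + 3 * n


def growth_table(sizes):
    """Return one (n, T1(n), T2(n), T3(n)) row per input size."""
    return [(n, t_linear(n), t_nlogn(n), t_exp(n)) for n in sizes]


if __name__ == "__main__":
    for n, a, b, c in growth_table([1, 2, 4, 8, 16, 32, 64]):
        print(f"n={n:3d}  4n+5={a:5d}  0.5nlogn-2n+7={b:8.1f}  2^n+n^3+3n={c}")
```

For small n the three columns are of similar size; as n grows, the term with the fastest growth dominates each function and the columns separate sharply.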

Why Asymptotic Analysis?

Most algorithms are fast for small n
- Time difference too small to be noticeable
- External things dominate (OS, disk I/O, …)

BUT n is often large in practice
- Databases, internet, graphics, …

Difference really shows up as n grows!

Exercise - Searching

    2 3 5 16 37 50 73 75 126

    bool ArrayFind(int array[], int n, int key) {
      // Insert your algorithm here
    }

What algorithm would you choose to implement this code snippet?

Analyzing Code

    Basic Java operations     Constant time
    Consecutive statements    Sum of times
    Conditionals              Larger branch plus test
    Loops                     Sum of iterations
    Function calls            Cost of function body
    Recursive functions       Solve recurrence relation
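
These rules can be tried on a small function by counting steps by hand and then checking the count in code. The sketch below is illustrative only: what counts as one "step" is a modeling assumption, and `count_steps` is a hypothetical name.

```python
def count_steps(values):
    """Count even numbers in values; also return the number of steps charged."""
    steps = 0

    evens = 0
    steps += 1              # basic operation: constant time

    for x in values:        # loop: sum of the cost of its iterations
        steps += 1          # conditional: the test is always paid for
        if x % 2 == 0:
            evens += 1
            steps += 1      # the larger branch, paid only when taken

    return evens, steps
```

With n input values the worst case (all even) is charged 2n + 1 steps and the best case (none even) n + 1 steps; both are linear in n.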

Linear Search Analysis

    bool LinearArrayFind(int array[], int n, int key) {
      for (int i = 0; i < n; i++) {
        if (array[i] == key)
          // Found it!
          return true;
      }
      return false;
    }

Best Case:

Worst Case:
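
To see the best and worst cases concretely, the search can be instrumented to report how many elements it examined. This is a sketch: `linear_find` is a hypothetical name, and it counts only element comparisons, not every primitive operation.

```python
def linear_find(array, key):
    """Return (found, comparisons) for a left-to-right scan of array."""
    comparisons = 0
    for item in array:
        comparisons += 1
        if item == key:
            return True, comparisons  # found it: stop early
    return False, comparisons         # scanned everything without a match


DATA = [2, 3, 5, 16, 37, 50, 73, 75, 126]  # the array from the searching exercise
```

A key at index 0 costs one comparison; a key in the last position, or a missing key, costs one comparison per element.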

Binary Search Analysis

    bool BinArrayFind(int array[], int low, int high, int key) {
      // The subarray is empty
      if (low > high) return false;

      // Search this subarray recursively
      int mid = (high + low) / 2;
      if (key == array[mid]) {
        return true;
      } else if (key < array[mid]) {
        return BinArrayFind(array, low, mid-1, key);
      } else {
        return BinArrayFind(array, mid+1, high, key);
      }
    }

Best case:

Worst case:
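
A runnable counterpart of the recursive search, as a sketch. The name `bin_array_find` is hypothetical, and the midpoint is written as `low + (high - low) // 2`, a common form that avoids overflow in fixed-width integer languages (it makes no difference in Python).

```python
def bin_array_find(array, low, high, key):
    """Return True if key occurs in the sorted slice array[low..high]."""
    if low > high:
        return False  # the subarray is empty
    mid = low + (high - low) // 2
    if key == array[mid]:
        return True
    if key < array[mid]:
        return bin_array_find(array, low, mid - 1, key)   # search left half
    return bin_array_find(array, mid + 1, high, key)      # search right half


DATA = [2, 3, 5, 16, 37, 50, 73, 75, 126]  # the array from the searching exercise
```

A full search is started with `bin_array_find(DATA, 0, len(DATA) - 1, key)`; each call discards about half of the remaining range.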

Solving Recurrence Relations

1. Determine the recurrence relation. What is/are the base case(s)?
2. "Expand" the original relation to find an equivalent general expression in terms of the number of expansions.
3. Find a closed-form expression by setting the number of expansions to a value which reduces the problem to a base case.
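
As a worked illustration of the three steps, take binary search and assume each call does a constant 4 units of work before recursing on half the range, with T(1) = 4. These constants are assumptions chosen to match the 4 log n + 4 worst-case figure quoted in the linear vs. binary search comparison; n is taken to be a power of 2.

```latex
\begin{align*}
\text{Step 1:}\quad & T(1) = 4, \qquad T(n) = T(n/2) + 4 \\
\text{Step 2:}\quad & T(n) = T(n/4) + 4 + 4 \\
                    & \phantom{T(n)} = T(n/8) + 4 + 4 + 4 \\
                    & \phantom{T(n)} = T(n/2^k) + 4k \\
\text{Step 3:}\quad & \text{set } n/2^k = 1 \text{, i.e. } k = \log_2 n: \\
                    & T(n) = T(1) + 4\log_2 n = 4\log_2 n + 4
\end{align*}
```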

Linear Search vs Binary Search

                  Linear Search    Binary Search
    Best Case     4 at [0]         4 at [middle]
    Worst Case    3n+2             4 log n + 4

So … which algorithm is better? What tradeoffs can you make?

Fast Computer vs. Slow Computer

Fast Computer vs. Smart Programmer (round 1)

Fast Computer vs. Smart Programmer (round 2)
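
One way to read the "fast computer vs. smart programmer" comparison is to plug in the worst-case step counts 3n + 2 (linear search) and 4 log n + 4 (binary search) and give the linear search a machine that is some factor faster. The sketch below is hypothetical: the speedup factor, the base-2 logarithm, and the function names are assumptions of this example, not data from the slides.

```python
import math


def linear_steps(n):
    return 3 * n + 2


def binary_steps(n):
    # assumption: "log" means log base 2
    return 4 * math.log2(n) + 4


def crossover(speedup, limit=10**7):
    """Smallest n at which binary search on the slow machine finishes before
    linear search on a machine `speedup` times faster, or None if not found."""
    for n in range(1, limit + 1):
        if binary_steps(n) < linear_steps(n) / speedup:
            return n
    return None
```

Whatever constant speedup is chosen, a crossover exists, because n eventually outgrows any constant multiple of log n; a faster machine only moves the crossover to a larger n.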

Asymptotic Analysis

Asymptotic analysis looks at the order of the running time of the algorithm
- A valuable tool when the input gets "large"
- Ignores the effects of different machines or different implementations of an algorithm

Intuitively, to find the asymptotic runtime, throw away the constants and low-order terms
- Linear search is T(n) = 3n + 2 ∈ O(n)
- Binary search is T(n) = 4 log_2 n + 4 ∈ O(log n)

Remember: the fastest algorithm has the slowest growing function for its runtime

Asymptotic Analysis

Eliminate low order terms
- 4n + 5 => 4n
- 0.5 n log n + 2n + 7 => 0.5 n log n
- n^3 + 2^n + 3n => 2^n

Eliminate coefficients
- 4n => n
- 0.5 n log n => n log n
- n log n^2 => 2 n log n => n log n

Properties of Logs

log AB = log A + log B

Proof:
- A = 2^(log_2 A) and B = 2^(log_2 B)
- AB = 2^(log_2 A) · 2^(log_2 B) = 2^(log_2 A + log_2 B)
- Therefore log AB = log A + log B

Similarly:
- log(A/B) = log A - log B
- log(A^B) = B log A

Any log is equivalent to log-base-2
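
The last claim follows from the power rule above. A short derivation, with a denoting any base greater than 1 (a symbol introduced here for illustration):

```latex
\begin{align*}
x = \log_a n \;&\Longrightarrow\; a^x = n \\
              &\Longrightarrow\; \log_2\!\left(a^x\right) = \log_2 n \\
              &\Longrightarrow\; x \log_2 a = \log_2 n \\
              &\Longrightarrow\; \log_a n = \frac{\log_2 n}{\log_2 a}
\end{align*}
```

Since 1 / log_2 a is a constant, log_a n and log_2 n differ only by a constant factor, which asymptotic notation ignores.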

Order Notation: Intuition

f(n) = n^3 + 2n^2

g(n) = 100n^2 + 1000

Although not yet apparent, as n gets "sufficiently large", f(n) will be "greater than or equal to" g(n)

Definition of Order Notation

Upper bound (Big-O): T(n) = O(f(n))
- Exist positive constants c and n' such that T(n) ≤ c f(n) for all n ≥ n'

Lower bound (Omega): T(n) = Ω(g(n))
- Exist positive constants c and n' such that T(n) ≥ c g(n) for all n ≥ n'

Tight bound (Theta): T(n) = Θ(f(n))
- When both hold: T(n) = O(f(n)) and T(n) = Ω(f(n))

Definition of Order Notation

O( f(n) ) : a set or class of functions

g(n) ∈ O( f(n) ) iff there exist positive consts c and n0 such that:

g(n) ≤ c f(n) for all n ≥ n0

Example: 100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19

So g(n) ∈ O( f(n) )
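
The witness constants c = 5 and n0 = 19 in the example can be spot-checked numerically. The sketch below tests the inequality over a finite range only, so it is a sanity check and not a proof; `holds_from` is a hypothetical helper name.

```python
def f(n):
    return n ** 3 + 2 * n ** 2


def g(n):
    return 100 * n ** 2 + 1000


def holds_from(c, n0, upto=10_000):
    """Check g(n) <= c * f(n) for every integer n in [n0, upto]."""
    return all(g(n) <= c * f(n) for n in range(n0, upto + 1))
```

`holds_from(5, 19)` is true, while the inequality fails at n = 18, so with c = 5 the threshold 19 is the smallest that works.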

Order Notation: Example

100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19

So g(n) ∈ O( f(n) ), where g(n) = 100n^2 + 1000 and f(n) = n^3 + 2n^2

Some Notes on Notation

Sometimes you'll see g(n) = O( f(n) )

This is equivalent to g(n) ∈ O( f(n) )

What about the reverse? O( f(n) ) = g(n)

Big-O: Common Names

- constant: O(1)
- logarithmic: O(log n) (log_k n, log n^2 ∈ O(log n))
- linear: O(n)
- log-linear: O(n log n)
- quadratic: O(n^2)
- cubic: O(n^3)
- polynomial: O(n^k) (k is a constant)
- exponential: O(c^n) (c is a constant > 1)

Meet the Family

- O( f(n) ) is the set of all functions asymptotically less than or equal to f(n)
  - o( f(n) ) is the set of all functions asymptotically strictly less than f(n)
- Ω( f(n) ) is the set of all functions asymptotically greater than or equal to f(n)
  - ω( f(n) ) is the set of all functions asymptotically strictly greater than f(n)
- Θ( f(n) ) is the set of all functions asymptotically equal to f(n)

Meet the Family, Formally

- g(n) ∈ O( f(n) ) iff there exist positive c and n0 such that g(n) ≤ c f(n) for all n ≥ n0
  - g(n) ∈ o( f(n) ) iff for every c > 0 there exists an n0 such that g(n) < c f(n) for all n ≥ n0. Equivalent to: lim_{n→∞} g(n)/f(n) = 0
- g(n) ∈ Ω( f(n) ) iff there exist positive c and n0 such that g(n) ≥ c f(n) for all n ≥ n0
  - g(n) ∈ ω( f(n) ) iff for every c > 0 there exists an n0 such that g(n) > c f(n) for all n ≥ n0. Equivalent to: lim_{n→∞} g(n)/f(n) = ∞
- g(n) ∈ Θ( f(n) ) iff g(n) ∈ O( f(n) ) and g(n) ∈ Ω( f(n) )
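
The limit characterization can be explored numerically. The sketch below reuses f(n) = n^3 + 2n^2 and g(n) = 100n^2 + 1000 from the order notation slides; watching the ratio shrink suggests, but does not prove, that g(n) ∈ o( f(n) ). The name `ratio` is hypothetical.

```python
def ratio(n):
    """Return g(n) / f(n) for g(n) = 100n^2 + 1000 and f(n) = n^3 + 2n^2."""
    g = 100 * n ** 2 + 1000
    f = n ** 3 + 2 * n ** 2
    return g / f


if __name__ == "__main__":
    for n in (10, 100, 1000, 10**4, 10**5, 10**6):
        print(n, ratio(n))
```

The ratio starts above 1 for small n and keeps falling toward 0, which is the behavior the little-o definition describes.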

Big-Omega et al. Intuitively

    Asymptotic Notation    Mathematics Relation
    O                      ≤
    Ω                      ≥
    Θ                      =
    o                      <
    ω                      >

Pros and Cons of Asymptotic Analysis

Perspective: Kinds of Analysis

Running time may depend on actual data input, not just length of input

Distinguish
- Worst Case: your worst enemy is choosing input
- Best Case
- Average Case: assumes some probabilistic distribution of inputs
- Amortized: average time over many operations

Types of Analysis

Two orthogonal axes:

- Bound Flavor
  - Upper bound (O, o)
  - Lower bound (Ω, ω)
  - Asymptotically tight (Θ)
- Analysis Case
  - Worst Case (Adversary)
  - Average Case
  - Best Case
  - Amortized

16n^3 log_8(10n^2) + 100n^2 = O(n^3 log n)

Eliminate low-order terms and constant coefficients, step by step:

    16n^3 log_8(10n^2) + 100n^2
    16n^3 log_8(10n^2)
    n^3 log_8(10n^2)
    n^3 (log_8(10) + log_8(n^2))
    n^3 log_8(10) + n^3 log_8(n^2)
    n^3 log_8(n^2)
    2n^3 log_8(n)
    n^3 log_8(n)
    n^3 log_8(2) log(n)
    n^3 log(n) / 3
    n^3 log(n)
