
Data Structures

ADVANCED TREE  H.Y.

Supplementary Handout
Outline
• Double-ended priority queue implementations
  - Min-Max Heap
  - Deap
  - SMMH
• Extended Binary Tree
  - E = I + 2n (external path length = internal path length + 2n)
• Huffman algorithm for encoding/decoding messages
• AVL Tree
• M-way search tree
• B tree of order m, B+ tree of order m
• Red-Black Tree
• Optimal binary search tree (OBST)
• Splay tree
• Leftist tree and Leftist Heap
• Binomial Tree and Binomial Heap
• Fibonacci Heap
DOUBLE-ENDED PRIORITY QUEUE
• The following three data structures are the most suitable for implementing a double-ended priority queue:
  - Min-max heap
  - Deap
  - SMMH
• Operations and time complexity

  Operations          Time complexity
  Insert x            O(log n)
  Delete-Min          O(log n)
  Delete-Max          O(log n)
  Find-Min (or Max)   O(1)
WEPL: WEIGHTED EXTERNAL PATH LENGTH
• Given weight values p_i, 1 ≤ i ≤ n, for the n external nodes,
• WEPL = Σ_{i=1}^{n} (path length from the root to external node i) × p_i
• Note: a smaller tree height does not necessarily give a smaller WEPL.

Example: WEPL = 4×2 + 4×4 + 4×5 + 4×7 + 3×9 + 3×10 + 1×15 = 144.
HUFFMAN ALGORITHM – FIND THE SMALLEST WEPL
HUFFMAN CODING IS AN OPTIMAL, VARIABLE-LENGTH PREFIX CODE

• Prefix code: an encoding in which no message code is a prefix of another message code.
• This guarantees that decoding is unambiguous.
GREEDY ALGORITHM (STRATEGY) FOR PROBLEM SOLVING
• A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage.
• We make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far, but not on future choices or on all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices.
• This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's algorithmic path to the solution.
• Optimal substructure: a problem exhibits optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems.
• Greedy algorithms can be characterized as "short-sighted" and "non-recoverable". They are ideal only for problems which have optimal substructure.
GREEDY ALGORITHM (STRATEGY)
• A greedy algorithm makes, at every step, the choice that looks best (most advantageous) in the current state, hoping that this leads to a globally optimal result.
• Greedy is not guaranteed to find the optimal solution for every problem.
• Greedy algorithms are especially effective on problems with optimal substructure, where locally optimal solutions determine the globally optimal one. In short, the problem decomposes into subproblems, and the optimal solutions of the subproblems build up to the optimal solution of the whole problem.
• Greedy differs from dynamic programming in that it commits to a choice for each subproblem and cannot backtrack. Dynamic programming stores previous results and uses them to make the current choice, so it can revise earlier decisions.
• Greedy does solve some optimization problems exactly (that is, it gets the optimal solution):
  - Minimum spanning tree
    · Kruskal, Prim, Sollin algorithms
  - Huffman algorithm for the optimal prefix code
  - Dijkstra's algorithm for shortest path lengths from a single source to the other destinations
  - Fractional knapsack problem
  - SJF for minimum average waiting time
AVL TREE ROTATION FOR UNBALANCED CASES (LL, RR ARE THE SINGLE-ROTATION TYPES)
AVL TREE ROTATION FOR UNBALANCED CASES (LR, RL ARE THE DOUBLE-ROTATION TYPES)
Theorem
• For an AVL tree of height h (root level = 1):
  - Minimum number of nodes = F_{h+2} − 1
  - Maximum number of nodes = 2^h − 1
• Example 1: What is the minimum number of nodes of an AVL tree of height 5? Draw one example. What is the maximum number of nodes?
• Example 2: For an AVL tree with 100 nodes, find the minimum and maximum height (root level = 1).


Proof
COMPARISON TABLE
MAX HEIGHT OF AN AVL TREE WITH N NODES IS ABOUT 1.44 log n
• Let F_h be an AVL tree of height h having the minimum number of nodes, and let F_l and F_r be the AVL trees that are its left and right subtrees. Then F_l or F_r must have height h−2. Suppose F_l has height h−1, so that F_r has height h−2.
• Thus we have |F_h| = |F_{h−1}| + |F_{h−2}| + 1.
• Note that |F_0| = 1 and |F_1| = 2.
M-WAY SEARCH TREE (m > 2)
• Mainly used for external search.
• The degree of each node is between 0 and m.
• If the degree of a node is m, the number of keys in this node is m − 1.
• The keys in a node are listed in ascending order.
• Search/Insert/Delete X: time is O(h), where h is the height of the tree.
  - The tree may be skewed, so the time can degenerate to the worst case.
• Theorem: if the height of an m-way search tree is h,
  - the maximum number of nodes = Σ_{i=1}^{h} m^{i−1} = (m^h − 1)/(m − 1)
  - the maximum number of keys = m^h − 1

B TREE OF ORDER M
• Definition: a balanced m-way search tree, used for external search and external sort. If it is not empty, it satisfies the following:
  - 2 ≤ root's degree ≤ m
  - Except for the root, the degree of every node is between ⌈m/2⌉ and m
  - All leaves (external nodes) are on the same level
• Theorem: for a B tree of order m and height h,
  - the maximum number of nodes = Σ_{i=1}^{h} m^{i−1} = (m^h − 1)/(m − 1)
  - the maximum number of keys = m^h − 1
  - the minimum number of nodes = 1 + 2 × (⌈m/2⌉^{h−1} − 1)/(⌈m/2⌉ − 1)
  - the minimum number of keys = 2 × ⌈m/2⌉^{h−1} − 1
B TREE OF ORDER 3 (2-3 TREE); A B TREE OF ORDER 4 IS ALSO NAMED A 2-3-4 TREE
INSERT X IN A B TREE OF ORDER M
• Step 1. Search for X. Since X does not exist in the tree, the search reaches an external node (null); put X into the parent of that external node.
• Step 2. Check the node:
  - If it does not overflow, stop.
  - If it overflows, do the split action. After the split, go back to Step 2 for the parent node.
• Split action:
  - Pick the ⌈m/2⌉-th smallest key k in the overflowing node, move k up into the parent node, and split the remaining keys into the left and right children of k.
Insert 55
EXAMPLE
• Insert 37, 5, 18, and 12
DELETE X FROM A B TREE OF ORDER M
1) Search for X and find the node containing X.
2) Delete X from this node.
3) Check the node:
  - If the number of keys in the node ≥ ⌈m/2⌉ − 1, there is no underflow; stop.
  - Otherwise (underflow):
    · Try rotation first. If a rotation is possible, do it and stop.
    · If no rotation is possible, do the combine (merge) processing. After combining, go back to step 3 to check whether the parent node underflows.
• Rotation processing
  - Check the left and right sibling nodes to see whether their key count is > ⌈m/2⌉ − 1. If the left (right) sibling qualifies, move its largest (smallest) key up into the parent node, and move the corresponding key of the parent down into the underflowing node. This completes the rotation.
• Combination processing
  - Pull one key down from the parent node and merge it with all the keys of this node and its left (right) sibling into one new node. The parent then has one fewer key, so check the parent for underflow.
B TREE OF ORDER 3, DELETE 58, 55, AND 40
B+ TREE OF ORDER M
• Supports the ISAM (indexed sequential access method) implementation.
• Two levels:
  - Index level: a B tree of order m structure, used purely as an index; it stores no data.
  - Data-block level: stores the data, with the data blocks chained together in a linked list. The number of records per data block may follow the B tree definition or be specified separately; it need not match the B tree rules. Follow whatever the problem statement specifies.
EXAMPLE Root

B+ tree of order 5 13 17 24 30

2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

Root
After inserting “8”
17

5 13 24 30

2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
EXAMPLE Root
17

5 13 24 30

2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

After deleting “19” and “20”


Root

17

5 13 27 30

2* 3* 5* 7* 8* 14* 16* 22* 24* 27* 29* 33* 34* 38* 39*
EXAMPLE Root

17

5 13 27 30

2* 3* 5* 7* 8* 14* 16* 22* 24* 27* 29* 33* 34* 38* 39*

After deleting “24”

Root
5 13 17 30

2* 3* 5* 7* 8* 14* 16* 22* 27* 29* 33* 34* 38* 39*


RED-BLACK TREE

1. Every node is either red or black.
2. The root is black.
3. Every nil (external) node is black.
4. If a node is red, then both its children are black (two consecutive red nodes are not allowed).
5. For each node, all paths from the node to its descendant leaves contain the same number of black nodes.
   • All paths from the node have the same black height.

Because an RB tree is a balanced BST, the search/insert/delete/find-min(max)/find-kth-item operations take O(log n) time.
INSERT X IN AN RB TREE (TOP-DOWN APPROACH)
• STEPS
  1. Search for the proper insertion position of X.
  2. During the search, whenever we pass a node (say Y) whose two children are both red, perform a COLOR CHANGE (make Y red and its two children black), then check for two consecutive red nodes (i.e., whether Y and Y's parent are both red); if so, perform a rotation to fix it.
  3. Only then place the new node X, colored red.
  4. Check for consecutive red nodes (i.e., whether X and X's parent are both red); if so, perform a rotation to fix it.
  5. Finally, check whether the ROOT is black. If it is red, always recolor it black.
• In this process, the rotations of steps 2 and 4 occur at most once in total, never both. That is, an insertion performs at most one kind of rotation (in algorithm-course terms of single/double rotations, at most one double rotation).
• That is, O(log n) time and O(1) rotations.
ROTATION
• Like AVL tree rotations, there are four cases: LL and RR (single rotations) and LR and RL (double rotations).
• When a node and its parent are both red, classify the four cases by looking downward from the grandparent.
• Difference from AVL:
  - The middle key is pulled up and colored black; its two children (smaller on the left, larger on the right) are colored red.
EXAMPLE: INSERT 15
Supplement: [Algorithm course] BOTTOM-UP APPROACH FOR INSERTING X

1. Insert Z at the proper position, colored red.
2. Execute the bottom-up fix-up (recoloring and rotations moving upward).
RB TREE HEIGHT IS AT MOST 2 log(n+1)
• The number of black nodes on any simple path from, but not including, a node x down to a leaf is the black-height of the node, denoted bh(x).
CONT.
Supplement
Another version states the same bound as follows.
EXAMPLE OF A RED-BLACK TREE

(Figure: a red-black tree with keys 7, 3, 18, 10, 22, 8, 11, 26; height h = 4; bh of the root and of node 18 is 2, bh of nodes 10 and 22 is 1, bh at the nil leaves is 0.)

5. All simple paths from any node x, excluding x, to a descendant leaf have the same number of black nodes = black-height(x).
HEIGHT OF A RED-BLACK TREE

Theorem. A red-black tree with n keys has height h ≤ 2 log(n + 1).

Proof.
INTUITION:
• Merge red nodes into their black parents, turning T into a tree T′.
• This process produces a tree in which each node has 2, 3, or 4 children.
• The 2-3-4 tree T′ has uniform depth h′ of leaves.
PROOF (CONTINUED)
• We have h′ ≥ h/2, since at most half the vertices on any path are red.
• # leaves in T = # leaves in T′.
• # leaves in T = n + 1 (a fact about binary trees with exactly 2 children per internal node).
• # leaves in T′ ≥ 2^{h′} (a fact about binary trees; T′ can only have more).
⇒ n + 1 ≥ 2^{h′}
⇒ log(n + 1) ≥ h′ ≥ h/2
⇒ h ≤ 2 log(n + 1).
OBST (OPTIMAL BINARY SEARCH TREE)
• Suppose there are n internal nodes a1, a2, …, an with a1 < a2 < … < an, and let p_i, 1 ≤ i ≤ n, be the weight of internal node a_i.
• Let q_j, 0 ≤ j ≤ n, be the weight of external (failure) node j.
• Then the total search cost of a BST is
  cost = Σ_{i=1}^{n} p_i × level(a_i) + Σ_{j=0}^{n} q_j × (level(external node j) − 1)
• Among the (1/(n+1)) × C(2n, n) distinct BSTs, the one with the smallest total search cost is called the OBST.
  - There may be more than one OBST.
EXAMPLE
Formula derivation
TABLE RESULT
Q: Draw the OBST?
OBST TIME = O(n^3)
Supplement: OBST
• [CLRS algorithm-course version]
  - The cost of a failed search is defined differently from the data-structures version: in the algorithms version, the failure node's level is not decreased by one, i.e. the term is q_i × (level of the failure node).
• Some versions give weights only for the internal nodes, with no weights for the failure nodes.
• Also, in some versions T_ij denotes the OBST over the internal nodes A_i, …, A_j rather than A_{i+1}, …, A_j, so the recurrence must be re-derived accordingly.
Data Structures
H.Y.
SEARCH AND SORT
Outline
• Search
  - Linear Search
  - Binary Search
• Sort
  - Terminology
    · Internal sort and external sort
    · Stable and unstable sorting methods
    · Sorting in place
  - Elementary sorts
    · Insertion, Selection, Bubble, Shell sort
  - Advanced sorts
    · Quick Sort, Merge Sort, Heap Sort
  - How fast can sorting be? (restricted to comparison-based techniques): Ω(n log n)
  - Linear-time sorting methods
    · Radix Sort, Bucket Sort, Counting Sort
• Algorithm-course supplements
  - Find min & max
  - Selection problem (select the i-th smallest item among an unsorted array of n items)
LINEAR (SEQUENTIAL) SEARCH
• Features:
  - The data need not be sorted beforehand.
  - The data may be stored in a random-access structure (array) or a sequential-access structure (linked list).
  - Time complexity: O(n)
LINEAR SEARCH ALGORITHM

int search(int arr[], int n, int x)
// n is the input data size, x is the search key
{
    int i;
    for (i = 0; i < n; i++)
        if (arr[i] == x)
            return i;
    return -1;  // x not found
}
LINEAR SEARCH ALGORITHM WITH SENTINEL

int search(int arr[], int n, int x)
// arr[1..n] holds the data; arr[0] is the sentinel slot
// if x is not found, return 0
{
    int i = n;
    arr[0] = x;  // sentinel removes the i >= 1 bound check
    while (arr[i] != x) { i--; }
    return i;
}
BINARY SEARCH
• Preconditions:
  - The data must be sorted beforehand (e.g., in ascending order).
  - The data must be stored in a random-access structure (array).
• Time complexity: O(log n)
• Idea:
  - Each step compares the key with the record in the middle position.
  - If equal, the key is found.
  - If the key is smaller than the middle value, binary-search the left half.
  - If the key is larger than the middle value, binary-search the right half.
• A divide-and-conquer idea.
• Also a prune-and-search idea.
BINARY SEARCH ALGORITHM (ITERATIVE VERSION)
#include <stdio.h>
// An iterative binary search function. It returns the
// location of x in the given array arr[l..r] if present,
// otherwise -1.
int binarySearch(int arr[], int l, int r, int x)
{
    while (l <= r) {
        int m = l + (r - l) / 2;  // avoids overflow of (l + r)
        if (arr[m] == x)
            return m;
        if (arr[m] < x)
            l = m + 1;
        else
            r = m - 1;
    }
    return -1;  // x not found
}
BINARY SEARCH (RECURSIVE VERSION)

int binarySearch(int arr[], int l, int r, int x)
{
    if (l <= r) {
        int mid = l + (r - l) / 2;
        if (arr[mid] == x)
            return mid;
        else if (arr[mid] > x)
            return binarySearch(arr, l, mid - 1, x);
        else
            return binarySearch(arr, mid + 1, r, x);
    }
    return -1;
}
TIME COMPLEXITY ANALYSIS

• The recurrence is defined as follows:
• T(n) = T(n/2) + 1, T(1) = 1
• So, by case 2 of the master theorem, T(n) = O(log n)
SORTING – TERMINOLOGY
• Internal sorting and external sorting
  · Internal sorting: the data set is small enough to fit entirely in memory, where the sort is completed.
  · External sorting: the data set is too large to fit in memory at once, so external storage (e.g., disk) must hold the data during sorting.
    - Common external sorting methods: merge sort (optionally with a selection tree), m-way search tree, B tree, B+ tree, etc.
• Stable and unstable sorting methods
  - Input data often contain records with equal keys, e.g., …, K, …, K+, …. If the method guarantees that after sorting the order is still …, K, K+, … (that is, K still appears before K+), the method is stable; otherwise it is unstable.
  - Stable: Insertion, Bubble, Merge sort, Radix sort, Bucket sort, Counting sort
  - Unstable: Selection, Shell, Quick, Heap sort
  - An unstable sort performs unnecessary swaps of equal keys, but its sorting time is not necessarily worse than a stable one's.
  - Moreover, stability depends on implementation (coding) details; in theory we assume the conventional implementations.
SORTING – TERMINOLOGY

• Sorting in place
• A sorting algorithm sorts in place if only a constant number of elements of the input array are ever stored outside the array.
• An in-place algorithm does not need extra space: it produces its output in the same memory that holds the input, transforming the input "in place". A small constant amount of extra space for variables is allowed.
• Among the elementary and advanced sorting methods, only merge sort is not in place; all the others are.
• The linear-time sorting methods are also not in place.
ELEMENTARY SORTING METHODS

• Insertion sort
• Selection sort
• Bubble sort
• Shell sort
INSERTION SORT

void insert(int arr[], int r, int i)
// insert key r into the sorted region arr[0..i]
{
    int j = i;
    while (j >= 0 && r < arr[j]) {  // bound check replaces the arr[-1] sentinel
        arr[j + 1] = arr[j];
        j--;
    }
    arr[j + 1] = r;
}

void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++)
        insert(arr, arr[i], i - 1);
}
INSERTION SORT ANALYSIS

Time complexity:
  - Best case: O(n), when the input is already in ascending order; T(n) = T(n-1) + 1
  - Worst case: O(n^2), when the input is in descending order; T(n) = T(n-1) + (n-1)
  - Average case: O(n^2); T(n) = T(n-1) + cn
Space complexity: O(1)
Stable.
Suitable when there are few records; in general, insertion sort suffices for small inputs, with no need for quick sort.
SELECTION SORT

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

void selectionSort(int arr[], int n)
{
    int i, j, min;
    for (i = 0; i < n - 1; i++)
    {
        min = i;
        for (j = i + 1; j < n; j++)
            if (arr[j] < arr[min])
                min = j;
        if (i != min)
            swap(&arr[min], &arr[i]);
    }
}
SELECTION SORT ANALYSIS

Time complexity: best, worst, and average case are all O(n^2)
Space complexity: O(1)
Unstable.
Suitable for sorting large records, since each pass performs at most one swap.
BUBBLE SORT

void bubbleSort(int arr[], int n)
{
    int i, j, flag;
    for (i = 0; i < n - 1; i++)
    {
        flag = 0;
        for (j = 0; j < n - i - 1; j++)
            if (arr[j] > arr[j + 1])
            {
                swap(&arr[j], &arr[j + 1]);
                flag = 1;
            }
        if (flag == 0) break;  // no swap happened in this pass
    }
}
BUBBLE SORT ANALYSIS

Time complexity:
  - Best case: O(n), when the input is already in ascending order; T(n) = n - 1
  - Worst case: O(n^2), when the input is in descending order; T(n) = T(n-1) + (n-1)
  - Average case: O(n^2); T(n) = T(n-1) + cn
Space complexity: O(1)
Stable.
SHELL SORT

#include <iostream>
using namespace std;

void swap(int *i, int *j)
{
    int temp = *i;
    *i = *j;
    *j = temp;
}

void shellSort(int arr[], int n)
{
    int gap = n / 2;
    int f;
    while (gap >= 1)
    {
        f = 0;
        for (int i = 0; i < n - gap; i++)
            if (arr[i] > arr[i + gap])
            {
                swap(&arr[i], &arr[i + gap]);
                f = 1;
            }
        if (f == 0)       // no swap in this pass: shrink the gap
            gap = gap / 2;
    }
}

void printArray(int arr[], int n)
{
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
}

int main()
{
    int arr[] = {12, 34, 54, 2, 3, 8, 10, 1};
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Array before sorting: \n";
    printArray(arr, n);
    shellSort(arr, n);
    cout << "\nArray after sorting: \n";
    printArray(arr, n);
    return 0;
}
SHELL SORT ANALYSIS

Time complexity:
  - Best case: depends on the gap sequence; the best known is about O(n^{7/6}). For exams, O(n^{3/2}) is generally acceptable.
  - Worst case: O(n^2)
  - Average case: O(n^2)
Gap sequence: n/2^k, 2^k - 1, or a self-defined sequence, as long as the final pass uses gap 1.
Space complexity: O(1)
Unstable.
ADVANCED SORTING METHODS

• Quick sort
• Merge sort
• Heap sort
QUICK SORT
• On average, the fastest sorting method in practice.
• Uses the divide-and-conquer strategy.
• Idea:
  - Choose a pivot key pk (the leftmost or the rightmost key).
  - Partition: determine the correct position of pk so that every key to its left is ≤ pk and every key to its right is ≥ pk.
  - Quick-sort the left and right sublists separately (conquer).
  - Once both sides are sorted, the whole list is sorted.
• Partition methods:
  - Data-structures version: [Hoare] partition, using the leftmost key as pk.
  - Algorithms version: [Cormen] partition, using the rightmost key as pk.
DATA-STRUCTURES VERSION – QUICK SORT [HOARE] PARTITION
Example
QUICK SORT ANALYSIS

Time complexity:
  - Best case: O(n log n), when each partition splits the data into two equal halves; T(n) = 2T(n/2) + cn
  - Worst case: O(n^2), when the input is already in ascending or descending order (pk is the min or max); T(n) = T(n-1) + cn
  - Average case: O(n log n); T(n) = cn + (1/n) Σ_{i=0}^{n-1} (T(i) + T(n-1-i))
Space complexity: O(log n) to O(n)
Unstable.

Note: if all keys are equal, this partition method yields the best case, O(n log n) time.
IMPROVING THE WORST CASE
• Principle: choose pk carefully.
  - Method 1: randomized quick sort
    · Use randomization to choose a key as pk.
    · However, the worst case is still O(n^2); it is not improved.
  - Method 2: median of three
    · Suppose the array is indexed from left to right.
    · middle = (left + right) / 2;
    · Compare the three records A[left], A[middle], A[right], pick the one with the middle value, and interchange it with A[left].
    · Then run quick sort with A[left] as the pivot key.
  - Method 3: median of medians
    · Covered later, under selection of the i-th smallest item.
ALGORITHMS VERSION – QUICK SORT
Example
ALGORITHMS VERSION – QUICK SORT
RANDOMIZED QUICK SORT
This is the same PARTITION method as the data-structures version.
HOW FAST CAN SORTING BE?
• Restricted to comparison-based techniques, sorting requires Ω(n log n) time.
• If a non-comparison technique is used, this bound does not apply, so linear-time O(n) sorting becomes possible.
• Decision tree for the comparison behavior of sorting (see the following diagram).
DECISION TREE FOR SORTING 3 KEYS (K1, K2, K3)
Proof
MERGE SORT
• One of the common external sorting methods.
• Basic terms:
  - Run: a sorted segment of the data.
  - Run length: the number of records in a run.
• Two versions:
  - Iterative version
  - Recursive version: uses the divide-and-conquer approach
• There are also k-way merge sort variants:
  - k = 2, 4, 8, 16, etc.
MERGE TWO RUNS INTO ONE RUN

Merging two runs that together occupy positions l..n takes O(n - l + 1) time.
EXAMPLE: TWO-WAY MERGE SORT (ITERATIVE VERSION)
EXAMPLE: RECURSIVE MERGE SORT
RECURSIVE MERGE SORT ALGORITHM
MERGE SORT ANALYSIS

Time complexity: best, worst, and average case are all O(n log n); T(n) = 2T(n/2) + cn
Space complexity: O(n)
Stable.
NOT sorting in place.
SELECTION TREE
• Purpose: speed up k-way merging (merging k runs into one run).
• Traditional method:
  - Keep k pointer variables; each step takes up to (k-1) comparisons to find the minimum among the k runs and output it to the new run. With at most (n-1) steps, merging k runs into one run costs O(n*k) time.
• Using a selection tree:
  - Building the tree costs O(k) time.
  - Each extraction of the minimum of the k runs costs O(log k) time, repeated at most (n-1) times, so O(n log k).
  - Total: O(k) + O(n log k).
• Two kinds of selection trees:
  - Winner tree
  - Loser tree
  - The loser tree is the more commonly used.
EXAMPLE: WINNER TREE
EXAMPLE: LOSER TREE
HEAP SORT
HEAP SORT (CONT.)
EXAMPLE
HEAP SORT ANALYSIS

Time complexity: best, worst, and average case are all O(n log n)
Space complexity: O(1)
Unstable.

1. Building the heap costs O(n) time.
2. The sorting phase runs (n-1) rounds; each round performs a delete-max-like action costing O(log n), so the phase costs O(n log n) time.
LINEAR-TIME SORTING METHODS

  Data-structures course   Algorithms course
  LSD Radix Sort           Radix Sort
  MSD Radix Sort           Bucket Sort
  Counting Sort            Counting Sort

Note: here Radix sort = Bucket sort.
LSD RADIX SORT
• Uses the distribute-and-merge technique, not a comparison-based technique.
• Given n data items, let d be the number of digits of the maximum value (i.e., the key range is bounded), and let r be the base (radix).
• Prepare r buckets, numbered 0 to (r-1).
• From the least significant digit to the most significant digit, run d passes; each pass does the following:
  - Distribute: place each item into the bucket matching its current digit value.
  - Merge: concatenate the buckets' contents in order 0 to (r-1) to form the input of the next pass (each bucket outputs its items in FIFO order).
LSD RADIX SORT – EXAMPLE

Pass 1: ones digit
Pass 2: tens digit
Pass 3: hundreds digit
LSD RADIX SORT ANALYSIS

Time complexity: best, worst, and average case are all O(d*(n+r))
Space complexity: O(r*n)
Stable.
BUCKET SORT
EXAMPLE
COUNTING SORT
EXAMPLE
COUNTING SORT ANALYSIS

Time complexity: best, worst, and average case are all O(n+k)
Space complexity: O(n+k)
Stable.

An important property of counting sort is that it is stable: numbers with the same value appear in the output array in the same order as they do in the input array.

Counting sort's stability is important for another reason: counting sort is often used as a subroutine in radix sort.
SELECTION PROBLEM
• Find min & max
• Select the i-th smallest item among n unsorted data
