
[Figure: example treap, nodes labeled key,priority: N,33 H,20 E,9 M,14 K,7 T,17 S,12 P,8 U,2 W,6 Z,4]

Maverick Woo <pooh+@cmu.edu>

Disclaimer

Articles of interest

Raimund Seidel and Cecilia R. Aragon.

Randomized search trees. Algorithmica 16 (1996), 464-497.

Guy E. Blelloch and Margaret Reid-Miller.

Fast Set Operations Using Treaps. In Proc. 10th Annual ACM SPAA, 1998.

**Of course this is joint work with Guy.**

Hopefully Daniel will also show up.

May 2, 2001 2

Background

Very high level talk

No analysis

To make this a technical talk, I will insert a math symbol: Σ

Some background

Splay Trees (zig, zig zig, zig zig zig…)

Treaps, if you still remember…


Agenda

Data structure research overview
Treaps refresher
Some current issues on Treaps

Data Structure Research
I am not qualified to say yet, but I do have some “feelings” about it.
Not that many high-level problems.
  Representing a set/ordering
  Support some operations
Some say it’s all about applications.
  Applications don’t have to be very specific,
  but they need to be specific enough that we can make assumptions.

What Operations?
Basic
  Insert, Membership
Intermediate
  Delete (e.g. Binomial vs. Fibonacci Heaps)
  Disjoint-Union (e.g. Union-Find)
Higher Level
  Union, Intersection, Difference
  Finger Search

Behavior Restrictions
Persistence
  “Functional”
  More later…
Architecture Independence
  Relatively new, a.k.a. “Cache-oblivious”
  Runs efficiently on hierarchical memory
  Avoid memory-specific parameterization
  Forget data block size, cache line width etc.
  Not my theme today

Why Persistence?
Many reasons for persistence
  Functional programming makes everyone’s life easier.
For the theoretician
  You don’t need to worry about side effects.
  Better analysis possible: NESL
For the programmer
  You don’t need to worry about side effects.
  Fewer memory leaks, fewer dangling pointers
  It’s practical with good garbage collectors.

Real-life example 1
You have operations working on multiple instances.
  You index the web.
  You build your indices with your cool data structures.
  Conjunction query (AND) is intersection.
  You do the intersection on two indices.
  Now one of the indices can get corrupted.

Real-life example 2
Once upon a time, in a dot-com far away…
You are rich.
  You run a multi-processor machine.
You learned that Splay Trees are cool.
You even learned how to write multi-threaded programs.
  Thread1 searches for x on SplayInstance42.
  Thread2 searches for y on SplayInstance42.
Real-world situation: search engines

Data Structure vs. Hacking
Examples
To learn more about Splay Trees
  Dial (412)-HACKERS. Ask for Danny Sleator…
OK, real example
  (Persistent) FIFO Queues
  Operations: IsEmpty(Q), Enqueue(Q, x), Dequeue(Q)
  Need to grow, let’s use Linked List…

FIFO Queues
Linked List is “bad” though
  Traversing to the tail takes linear time.
  Either Enqueue or Dequeue is going to be linear time.
If one is not good enough, use two.
  Suppose queue is x1 x2 … xi yi+1 yi+2 … yn.
  Represent as [x1, x2, …, xi], [yn, yn-1, …, yi+1].
  You can figure out the details yourself.
How about doubly-ended queues (deques)?
With that much extra space, may be faster with a tree.
In the end, isn’t this just a hack?
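The two-list representation above can be sketched in Python. This is only an illustration under assumptions: the names `Queue`, `enqueue` and `dequeue` are hypothetical, and a list is modelled as nested `(head, tail)` pairs with `None` as the empty list. The occasional reversal makes `dequeue` amortized rather than worst-case constant time, and that bound only holds if old versions are not dequeued over and over.

```python
from typing import Any, NamedTuple, Optional, Tuple


class Queue(NamedTuple):
    """Persistent FIFO queue built from two immutable cons lists."""
    front: Optional[tuple]  # x1, x2, ..., xi (oldest first)
    back: Optional[tuple]   # yn, yn-1, ..., yi+1 (newest first)


EMPTY = Queue(None, None)


def is_empty(q: Queue) -> bool:
    return q.front is None and q.back is None


def enqueue(q: Queue, x: Any) -> Queue:
    # O(1): cons x onto the reversed back list; q itself is untouched.
    return Queue(q.front, (x, q.back))


def dequeue(q: Queue) -> Tuple[Any, Queue]:
    front, back = q
    if front is None:
        # Front is exhausted: reverse the back list into a new front list.
        while back is not None:
            x, back = back
            front = (x, front)
    if front is None:
        raise IndexError("dequeue from empty queue")
    head, rest = front
    return head, Queue(rest, back)
```

Because nothing is mutated, an old queue value stays valid after `enqueue` or `dequeue` returns a new one.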

Agenda
Data structure research overview
Treaps refresher
Some current issues on Treaps

Treaps Refresher
A Treap is a recursive data structure.
  datatype 'a Treap = E | T of priority * 'a Treap * 'a * 'a Treap
Each node has a key and a priority.
  Assume all unique
  Arrange key in in-order, priority in heap-order
Priority is chosen uniformly at random.
  8-way independence suffices for the analysis
  Can be computed with hash functions
    Don’t need to store the priority
    A key’s priority can be made consistent across runs
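As a rough Python rendering of the datatype and the two orderings, the sketch below mirrors the field order of the SML constructor (priority, left, key, right) and uses `None` for `E`. The `priority` function is an assumption for illustration: it derives a priority from SHA-256, which makes a key's priority consistent across runs but is not claimed to be the hash family the analysis needs. `is_treap` and `EXAMPLE` are hypothetical names. `EXAMPLE` is the treap from the slides' figures.

```python
import hashlib
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int                    # priority (heap-ordered)
    left: Optional["Node"]
    key: str                  # key (in-ordered)
    right: Optional["Node"]


def priority(key: str) -> int:
    """Assumed stand-in: a 64-bit priority computed from the key itself."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_treap(t, lo=None, hi=None, max_p=None) -> bool:
    """Check in-order on keys and heap-order on priorities."""
    if t is None:
        return True
    if lo is not None and not lo < t.key:
        return False
    if hi is not None and not t.key < hi:
        return False
    if max_p is not None and not t.p < max_p:
        return False
    return (is_treap(t.left, lo, t.key, t.p)
            and is_treap(t.right, t.key, hi, t.p))


def leaf(p: int, key: str) -> Node:
    return Node(p, None, key, None)


# The example treap from the slides, written out by hand.
EXAMPLE = Node(
    33,
    Node(20, leaf(9, "E"), "H", Node(14, leaf(7, "K"), "M", None)),
    "N",
    Node(17, Node(12, leaf(8, "P"), "S", None), "T",
         Node(6, leaf(2, "U"), "W", leaf(4, "Z"))),
)
```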

Treap Operations
Membership
  As in binary search trees
Insert
  Add as leaf by key (in-order)
  Rotate up by priority (heap-order)
Delete
  Reverse what insert does
Find-min, etc.
  Walk on the left spine, etc.
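Membership and insert might look like this in Python. It is a sketch, not the talk's code: `member`, `insert` and `from_pairs` are hypothetical names, priorities are passed in explicitly, and the rotations are done by rebuilding nodes so that the old treap is never modified.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


def member(t: Optional[Node], key: str) -> bool:
    # Plain binary-search-tree lookup; priorities play no role here.
    while t is not None:
        if key == t.key:
            return True
        t = t.left if key < t.key else t.right
    return False


def insert(t: Optional[Node], key: str, p: int) -> Node:
    if t is None:
        return Node(p, None, key, None)      # add as a leaf by key
    if key == t.key:
        return t                             # keys are assumed unique
    if key < t.key:
        l = insert(t.left, key, p)
        if l.p > t.p:                        # rotate right to restore heap order
            return Node(l.p, l.left, l.key, Node(t.p, l.right, t.key, t.right))
        return Node(t.p, l, t.key, t.right)
    r = insert(t.right, key, p)
    if r.p > t.p:                            # rotate left to restore heap order
        return Node(r.p, Node(t.p, t.left, t.key, r.left), r.key, r.right)
    return Node(t.p, t.left, t.key, r)


def from_pairs(pairs) -> Optional[Node]:
    t = None
    for key, p in pairs:
        t = insert(t, key, p)
    return t


# (key, priority) pairs as they appear in the slides' figures.
PAIRS = [("N", 33), ("H", 20), ("E", 9), ("M", 14), ("K", 7), ("T", 17),
         ("S", 12), ("P", 8), ("U", 2), ("W", 6), ("Z", 4)]
```

With distinct priorities the shape of a treap is determined by its (key, priority) pairs, so any insertion order of `PAIRS` gives the same tree.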

Treap Split
Want top-down split (it’s faster)
(less, x, gtr) = Split(root, k)
  If (root.k > k) // want to split left subtree
    Let (l1, m, r1) = Split(root.left, k)
    (l1, m, T(root.p, r1, root.k, root.right))
  If (root.k < k) // want to split right subtree
    Let (l1, m, r1) = Split(root.right, k)
    (T(root.p, root.left, root.k, l1), m, r1)
  Else
    (root.left, root.k, root.right)
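A direct Python transcription of the Split pseudocode follows, as a sketch under assumptions. The empty-tree case, which the slide leaves implicit, returns `(None, None, None)`, and `None` is also used for "key not found". `keys` and `EXAMPLE` are helper names introduced only for the illustration.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


def split(t: Optional[Node], k: str):
    """Return (less, x, gtr): keys < k, k itself if present (else None), keys > k."""
    if t is None:
        return None, None, None
    if t.key > k:                            # split the left subtree
        l1, m, r1 = split(t.left, k)
        return l1, m, Node(t.p, r1, t.key, t.right)
    if t.key < k:                            # split the right subtree
        l1, m, r1 = split(t.right, k)
        return Node(t.p, t.left, t.key, l1), m, r1
    return t.left, t.key, t.right


def keys(t: Optional[Node]) -> list:
    """In-order key list, for inspection only."""
    return [] if t is None else keys(t.left) + [t.key] + keys(t.right)


def leaf(p: int, key: str) -> Node:
    return Node(p, None, key, None)


# The example treap from the slides' figures.
EXAMPLE = Node(
    33,
    Node(20, leaf(9, "E"), "H", Node(14, leaf(7, "K"), "M", None)),
    "N",
    Node(17, Node(12, leaf(8, "P"), "S", None), "T",
         Node(6, leaf(2, "U"), "W", leaf(4, "Z"))),
)
```

Calling `split(EXAMPLE, "V")` walks the path N, T, W, U and leaves every other subtree untouched.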

Treap Split Example
[Figure: the example treap before and after Split(Tr, “V”). Before: N,33 H,20 E,9 M,14 K,7 T,17 S,12 P,8 U,2 W,6 Z,4. After: less holds N,33 H,20 E,9 M,14 K,7 T,17 S,12 P,8 U,2; gtr holds W,6 Z,4.]

Treap Split Persistence
These figures are deceptive.
Only 4 new nodes created
  All on the search path to “V”
[Figure: the original treap and the results less and gtr of the split, sharing every node that is not on the search path.]
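The node-sharing claim can be checked mechanically. The sketch below repeats a Split in Python (an assumed transcription, with `None` as the empty treap) and counts which node objects in the results did not exist in the input. `nodes` and `new_keys` are hypothetical helpers.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


def split(t, k):
    if t is None:
        return None, None, None
    if t.key > k:
        l1, m, r1 = split(t.left, k)
        return l1, m, Node(t.p, r1, t.key, t.right)
    if t.key < k:
        l1, m, r1 = split(t.right, k)
        return Node(t.p, t.left, t.key, l1), m, r1
    return t.left, t.key, t.right


def nodes(t):
    """Yield every node object reachable from t."""
    if t is not None:
        yield t
        yield from nodes(t.left)
        yield from nodes(t.right)


def new_keys(original, *results):
    """Keys of nodes in the results that are not objects of the original."""
    old = {id(n) for n in nodes(original)}
    return sorted(n.key for r in results for n in nodes(r) if id(n) not in old)


def leaf(p, key):
    return Node(p, None, key, None)


EXAMPLE = Node(
    33,
    Node(20, leaf(9, "E"), "H", Node(14, leaf(7, "K"), "M", None)),
    "N",
    Node(17, Node(12, leaf(8, "P"), "S", None), "T",
         Node(6, leaf(2, "U"), "W", leaf(4, "Z"))),
)
```

For the split at “V”, the only fresh nodes are copies of N, T, W and U. Everything else, for example the whole subtree under H, is the same object in the input and the output.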

Treap Join
Join(less, gtr) // less < x < gtr
  Handle empty less or gtr
  If (less.p > gtr.p)
    T(less.p, less.left, less.k, Join(less.right, gtr))
  Else
    T(gtr.p, Join(less, gtr.left), gtr.k, gtr.right)
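Join in Python, as a sketch under the same assumptions as the pseudocode: every key in `less` is smaller than every key in `gtr`. The "handle empty" step is spelled out by returning the other treap. `LESS`, `GTR` and `leaf` are names made up for the usage example, which rebuilds the treap from the slides' Join figure.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


def join(less: Optional[Node], gtr: Optional[Node]) -> Optional[Node]:
    """Merge two treaps, assuming all keys in less < all keys in gtr."""
    if less is None:
        return gtr
    if gtr is None:
        return less
    if less.p > gtr.p:
        # less's root wins; gtr can only go into its right subtree.
        return Node(less.p, less.left, less.key, join(less.right, gtr))
    # gtr's root wins; less can only go into its left subtree.
    return Node(gtr.p, join(less, gtr.left), gtr.key, gtr.right)


def leaf(p: int, key: str) -> Node:
    return Node(p, None, key, None)


# The two halves shown in the slides' figures (split at "V").
LESS = Node(
    33,
    Node(20, leaf(9, "E"), "H", Node(14, leaf(7, "K"), "M", None)),
    "N",
    Node(17, Node(12, leaf(8, "P"), "S", None), "T", leaf(2, "U")),
)
GTR = Node(6, None, "W", leaf(4, "Z"))
```

`join(LESS, GTR)` walks down the right spine of `LESS` until it meets a priority below 6, which is where W slots in with U as its left child.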

Treap Join Example
[Figure: before, less holds N,33 H,20 E,9 M,14 K,7 T,17 S,12 P,8 U,2 and gtr holds W,6 Z,4. After Join(less, gtr): the single treap N,33 H,20 E,9 M,14 K,7 T,17 S,12 P,8 U,2 W,6 Z,4.]

Treap Running Time
All expected O(lg n)
Also of note is Finger Search
  Given a finger in a treap
  Find the key that is d away in sorted order
  Expected O(lg d) time
  Require parent pointers
    Evil… Waste so much space
  See Seidel and Aragon for details.

Treap Union
Treaps really shine in set operations.
Union(a, b)
  Suppose roots are (k1, p1), (k2, p2)
  WLOG assume p1 > p2.
  Let (less, x, gtr) = Split(b, k1)
  T(p1, Union(a.left, less), k1, Union(a.right, gtr))
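A Python sketch of Union follows. It assumes, as the slides suggest, that a key always gets the same priority. Here that is a fixed table `PRIO` taken from the figures, which is an assumption for the example, as are the names `build` and `union`. The empty cases and the "WLOG" swap are written out explicitly.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


# Assumed fixed key -> priority table (values from the slides' figures).
PRIO = dict(N=33, H=20, E=9, M=14, K=7, T=17, S=12, P=8, U=2, W=6, Z=4)


def build(keys):
    """Build the treap on a sorted list of distinct keys using PRIO."""
    if not keys:
        return None
    i = max(range(len(keys)), key=lambda j: PRIO[keys[j]])
    return Node(PRIO[keys[i]], build(keys[:i]), keys[i], build(keys[i + 1:]))


def split(t, k):
    if t is None:
        return None, None, None
    if t.key > k:
        l1, m, r1 = split(t.left, k)
        return l1, m, Node(t.p, r1, t.key, t.right)
    if t.key < k:
        l1, m, r1 = split(t.right, k)
        return Node(t.p, t.left, t.key, l1), m, r1
    return t.left, t.key, t.right


def union(a, b):
    if a is None:
        return b
    if b is None:
        return a
    if a.p < b.p:
        a, b = b, a                      # WLOG a has the higher-priority root
    less, _, gtr = split(b, a.key)       # a duplicate of a.key in b is dropped
    return Node(a.p, union(a.left, less), a.key, union(a.right, gtr))
```

Since the two recursive calls work on disjoint pieces and nothing is mutated, they are independent of each other, which is the property that makes the operation natural to parallelize.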

Treap Intersection
Inter(a, b)
  Suppose roots are (k1, p1), (k2, p2), p1 > p2
  Let (less, x, gtr) = Split(b, k1)
  If x is null // k1 is not in b, sorry dude
    Join(Inter(a.left, less), Inter(a.right, gtr))
  Else
    T(p1, Inter(a.left, less), k1, Inter(a.right, gtr))
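Intersection can be sketched the same way in Python. As before, this assumes each key has one fixed priority, taken from a table `PRIO` that exists only for the example. `None` plays the role of both the empty treap and the slide's "null" result of Split, and `build` and `inter` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


PRIO = dict(N=33, H=20, E=9, M=14, K=7, T=17, S=12, P=8, U=2, W=6, Z=4)


def build(keys):
    """Build the treap on a sorted list of distinct keys using PRIO."""
    if not keys:
        return None
    i = max(range(len(keys)), key=lambda j: PRIO[keys[j]])
    return Node(PRIO[keys[i]], build(keys[:i]), keys[i], build(keys[i + 1:]))


def split(t, k):
    if t is None:
        return None, None, None
    if t.key > k:
        l1, m, r1 = split(t.left, k)
        return l1, m, Node(t.p, r1, t.key, t.right)
    if t.key < k:
        l1, m, r1 = split(t.right, k)
        return Node(t.p, t.left, t.key, l1), m, r1
    return t.left, t.key, t.right


def join(less, gtr):
    if less is None:
        return gtr
    if gtr is None:
        return less
    if less.p > gtr.p:
        return Node(less.p, less.left, less.key, join(less.right, gtr))
    return Node(gtr.p, join(less, gtr.left), gtr.key, gtr.right)


def inter(a, b):
    if a is None or b is None:
        return None
    if a.p < b.p:
        a, b = b, a                      # make a the higher-priority root
    less, x, gtr = split(b, a.key)
    l, r = inter(a.left, less), inter(a.right, gtr)
    if x is None:                        # a.key is not in b: drop the root
        return join(l, r)
    return Node(a.p, l, a.key, r)
```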

Treap Difference
Similar to intersection
  Change the logic a bit
  Messier because it is not symmetric
Leave as an exercise to the reader.
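One possible answer to the exercise, offered as a sketch rather than as the version from the paper: because difference is not symmetric, this variant never swaps the arguments and always splits `b` by the root of `a`. It is meant to show the logic only and makes no claim about matching the optimal running time. `PRIO`, `build` and `diff` are names assumed for the illustration.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    p: int
    left: Optional["Node"]
    key: str
    right: Optional["Node"]


PRIO = dict(N=33, H=20, E=9, M=14, K=7, T=17, S=12, P=8, U=2, W=6, Z=4)


def build(keys):
    """Build the treap on a sorted list of distinct keys using PRIO."""
    if not keys:
        return None
    i = max(range(len(keys)), key=lambda j: PRIO[keys[j]])
    return Node(PRIO[keys[i]], build(keys[:i]), keys[i], build(keys[i + 1:]))


def split(t, k):
    if t is None:
        return None, None, None
    if t.key > k:
        l1, m, r1 = split(t.left, k)
        return l1, m, Node(t.p, r1, t.key, t.right)
    if t.key < k:
        l1, m, r1 = split(t.right, k)
        return Node(t.p, t.left, t.key, l1), m, r1
    return t.left, t.key, t.right


def join(less, gtr):
    if less is None:
        return gtr
    if gtr is None:
        return less
    if less.p > gtr.p:
        return Node(less.p, less.left, less.key, join(less.right, gtr))
    return Node(gtr.p, join(less, gtr.left), gtr.key, gtr.right)


def diff(a, b):
    """Keys of a that are not in b."""
    if a is None:
        return None
    if b is None:
        return a
    less, x, gtr = split(b, a.key)       # always split b by a's root
    l, r = diff(a.left, less), diff(a.right, gtr)
    if x is None:                        # a.key survives
        return Node(a.p, l, a.key, r)
    return join(l, r)                    # a.key is in b: remove it
```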

Points of Note
Persistence
  Did you see a side effect? (assignments?)
Parallelization
  Parallelizing without persistence is a pain.
  Very natural divide-and-conquer
  Run the two recursive calls on different CPUs
Running times…

Set Operation Running Time
For two sets of size m and n (m ≤ n)
Optimal is Θ(m lg (n/m))
What’s known before this work
  With AVL Trees, O(m lg(n/m))
    Rather complicated algorithms
    For the sake of your smooth digestion…
  Compare this to O(m+n) or O(m lg n)
  With Treaps
    Can use Finger Search if we have parent pointers
    Does not parallelize: multiple fingers???
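To get a feel for the gap between the three bounds, here is a worked instance with sizes chosen only for illustration (they are not from the talk), ignoring constant factors:

```latex
\begin{align*}
n &= 2^{20}, \quad m = 2^{10} \\
m \lg (n/m) &= 2^{10} \cdot \lg 2^{10} = 1024 \cdot 10 = 10\,240 \\
m \lg n &= 2^{10} \cdot \lg 2^{20} = 1024 \cdot 20 = 20\,480 \\
m + n &= 1\,024 + 1\,048\,576 = 1\,049\,600
\end{align*}
```

The advantage over m lg n is a factor of two here and shrinks as m gets smaller relative to n, while the advantage over m + n grows.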

Set Operation Running Time
What’s known after this work
  No parent pointers
  Parallelize naturally
  Optimal expected running time O(m lg (n/m))
    Analysis available in Blelloch and Reid-Miller
  Relatively simple algorithm
Experimental results
  6.3-6.8 speedup on 8-processor SGI machine
  4.1-4.4 speedup on 5-processor Sun machine

Agenda
Data structure research overview
Treaps refresher
Some current issues on Treaps

A Word on Splay Trees
Splay Trees are slow in practice!
  Even a single simple search would require O(lg n) pointer updates!
Skip Lists are way simpler and faster.
Let’s switch all Splay Trees to Skip Lists.
Danny???

Bruce said…
First find Danny.
Ditch Splay Trees: say they are slow.
Then praise Skip Lists.
Danny will refute by quoting experimental studies.
  Splay Trees are not much slower than Skip Lists in practice.
Ask who’s my advisor.
I wonder if that works.
So I tried.

Current Issues on Treaps
Treaps are simpler than Splay Trees
  No famous conjecture for my back pocket
  Neat idea from Adam Kalai
  Not self-adjusting
  Access introduces more explicit changes
Adding data compression to Treaps
Finger search on Treaps
  Work by Guy + Daniel Blandford

Adding Compression to Treaps
Search engines
  Infrequent offline update (once a month)
  Frequent online query and set operations
Keys are unique.
  Keys can be huge and occur sparsely.
  Assume they are 64-bit integers.
Let’s compress the keys!

We’ve got a problem!
I don’t know how to deploy data compression to general data structures.
Begin with the simplest: Array
The naïve approach
  Compress the whole array
  When you need to access an element
    decompress the whole array
    do the access
    compress the whole array again

Isn’t that dumb?
Any suggestions?
Use chunking
  Divide the array into blocks of size C.
  Compress each block individually.
  Now we are back to “constant” time!
Shh!!! That could be a trade secret.
  Of course they use something better than vanilla array.
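The chunking idea can be sketched in Python. Everything specific here is an assumption made for the illustration: the function names, the use of `zlib` as the compressor (the talk does not say which one to use), and the packing of keys as unsigned 64-bit little-endian integers. The point is only that an access decompresses one block of at most C keys instead of the whole array.

```python
import struct
import zlib
from typing import List


def compress_blocks(values: List[int], c: int) -> List[bytes]:
    """Cut the array into blocks of c keys and compress each block on its own."""
    blocks = []
    for start in range(0, len(values), c):
        chunk = values[start:start + c]
        raw = struct.pack(f"<{len(chunk)}Q", *chunk)   # 8 bytes per key
        blocks.append(zlib.compress(raw))
    return blocks


def get(blocks: List[bytes], c: int, index: int) -> int:
    """Read element `index` by decompressing only the block that holds it."""
    raw = zlib.decompress(blocks[index // c])
    (value,) = struct.unpack_from("<Q", raw, 8 * (index % c))
    return value
```

The cost of `get` depends on C, not on the array length, which is the sense in which access is back to “constant” time.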

Chunking a Treap
A sub-tree is a chunk.
Desire consistent chunk size
  But Treaps are usually not full.
  Need better chunking rules
Chunks
  Can’t be too big: hurts running time
  Can’t be too small: hurts compression (space)

Vocab
Internal node and Leaf block
More precisely
  datatype tblock = Packed of int * key * key * key vector
                  | UnPacked of int * int * key vector
  datatype trearray = TE
                    | TB of tblock
                    | TN of trearray * key * trearray
All running times are expected-case.

Idea 1 – Thresholds
Priority is in the range 1 to maxP
Invent a threshold Pth, e.g. maxP - log(maxP)
For n = (p, k)
  If p > Pth, then n is an internal node.
  Otherwise, n is in some leaf block.
Trick done when a key is inserted.
Also maintained by various operations.
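The threshold rule is small enough to write down directly. In this Python sketch the base-2 logarithm is an assumption (the slide only says log), and `threshold`, `is_internal` and `expected_internal` are hypothetical names. The last function just counts how many of the priorities 1..maxP lie above the threshold, assuming priorities are uniform over that range.

```python
import math


def threshold(max_p: int) -> float:
    """Pth = maxP - log(maxP), with log taken base 2 as an assumption."""
    return max_p - math.log2(max_p)


def is_internal(p: int, max_p: int) -> bool:
    """A node (p, k) is internal if its priority exceeds the threshold."""
    return p > threshold(max_p)


def expected_internal(n: int, max_p: int) -> float:
    """Expected number of internal nodes among n keys with uniform priorities."""
    above = max_p - math.floor(threshold(max_p))   # integer priorities > Pth
    return n * above / max_p
```

With maxP = 1024 the threshold is 1014, so exactly the ten priorities 1015 to 1024 mark internal nodes. For about maxP keys that gives on the order of log N internal nodes.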

Idea 1 – Features
With Pth = maxP - log(maxP), N keys
  log N internal nodes
  Height is log log N.
  O(log N) “bottom” nodes, each w/ a block
  Expect (N - log N) / O(log N) keys / block
  Binary search in block takes O(log N)
On average, constant ratio between internal keys to “keys in block”.

Idea 1 – Running Time
Query is still O(log n).
Insert is also O(log n).
Join, Split both take O(log n).
Set operations rely on Join and Split’s O(log n) running time.
Looking good…

Idea 1 – Problems
Asymptotic bound
  Need to work out the constants
  Exact analysis in progress
  I now think of Knuth even higher…
SML implementation
  Make the idea as concrete as code
  Can now do more experiments

Idea 1 – Questions
Do we really need to maintain consistent priority across runs?
  Make things simpler
  But Union looks suspicious
What compression algorithm to use?
  No general data compression
  Take advantage of index distribution

Idea 2 – Small Blocks
Want a more-or-less constant block size
Small blocks are more realistic
  Say 20
  Processor specific: fit cache line size
How well can we compress 20 integers?
  Leave for second stage investigation

Perhaps I can share
Actual SML code
Writing down an algorithm as code helps
  Pseudocode is good for short algorithms.
  Real code is more concrete.
  You can figure out you missed some cases.
  Good for sloppy people like me.
Now if SML had a debugger…
Space-time tradeoff is very real

Treap Finger Search
Daniel is working on it.
No parent pointers needed
  Can mimic parent pointers by reversing root-to-(last accessed leaf) path
Should probably leave this to him

Q&A / Suggestions
Work in progress, welcome suggestions
Danny, don’t kick me too hard…
