
**Spatial Data Structures for GIS**

Jörg-Rüdiger Sack

School of Computer Science, Carleton University Ottawa, Canada K1S 5B6, sack@scs.carleton.ca

© Jörg-Rüdiger Sack School of Computer Science Carleton University

Course Notes Computational Aspects of GIS

Geometric Objects

A geometric object is an object which characterizes a geometric component, i.e., the

• location and
• shape

of the object in space. In addition, there is the attribute component, which we will ignore for the discussion in this chapter.


Example

Planar subdivisions, for example, are collections of polygons which represent towns or municipal regions. The geometric information about the location of a place is stored through its polygon. (Non-geometric information such as name, size, … is also stored.)


Operations

There are many operations that need to be carried out on geometric objects. These include:

• point in polygon (point location)
• traversal of a subregion (window queries)
• intersection tests
• …
• other operations, including:
  – distance, containment, intersection


Operations cont’d

1. Objects are stored on disc; examining, i.e., retrieving, all objects is extremely inefficient!
2. Checking each object is time-consuming (even after retrieval) as the geometry may be complex.

Idea: support spatial queries on geometric objects by realizing a filter, i.e., first provide a superset of the solution set and subsequently refine that set to the correct solution.


Filter

Sometimes this approach is referred to as coarse filter / fine filter, where the coarse filter refers to the retrieval of a subset of adjacent objects, followed by the fine filter, which analyzes the geometric properties of the objects.


**The Idea of a Filter**

Create a bounding box for 2-d geometric objects.

Bounding box := smallest axis-parallel rectangle containing the geometric object.

The database search key for the geometric object is now that of the bounding box.

There are many data structures for multi-dimensional …

For d-dimensional objects, let $U_i$ be the universe in the $i$-th dimension. Then $U = U_1 \times U_2 \times U_3 \times \dots \times U_d$ is the d-dimensional universe containing all geometric objects.


Filter cont’d

Let G be a particular set of geometric objects. Each g ∈ G is described as g = (b, rest), where:

– g.b is the d-dimensional bounding box
– g.rest holds the other attributes, which are not relevant for the search

The bounding box $b = (l_1, r_1, l_2, r_2, \dots, l_d, r_d)$ is the d-dimensional interval $[l_1, r_1] \times \dots \times [l_d, r_d]$, where $b.l_i$ is the left and $b.r_i$ the right boundary of the $i$-th interval. We write $g.l_i$ for $g.b.l_i$ and $g.r_i$ for $g.b.r_i$.
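As a rough illustration of these definitions, the following Python sketch computes the bounding box of an object given as a list of points and returns it in the layout (l1, r1, …, ld, rd). The function name `bounding_box` and the choice to represent an object by its vertices are assumptions made for this example, not part of the course notes.

```python
from typing import Iterable, Sequence, Tuple

Point = Sequence[float]


def bounding_box(points: Iterable[Point]) -> Tuple[float, ...]:
    """Return (l1, r1, ..., ld, rd) for a non-empty set of d-dimensional points."""
    pts = list(points)
    if not pts:
        raise ValueError("bounding box of an empty point set is undefined")
    d = len(pts[0])
    box = []
    for i in range(d):
        # The i-th interval spans the smallest and largest i-th coordinate.
        coords = [p[i] for p in pts]
        box.extend((min(coords), max(coords)))
    return tuple(box)


# Hypothetical polygon (a triangle) given by its vertices.
triangle = [(1.0, 4.0), (3.0, 1.0), (5.0, 2.0)]
print(bounding_box(triangle))  # (1.0, 5.0, 1.0, 4.0)
```

For a polygon it is enough to look at the vertices, since every other point of the polygon lies between them in each dimension.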


Example

[Figure: a 2-dimensional bounding box, with the interval [l1, r1] along dim 1 and the interval [l2, r2] along dim 2.]

The Task

Task: find a secondary storage structure S supporting the following operations:

(1) Range query
(2) Search
(3) Insert
(4) Remove (delete)

Each is defined more formally next.


Rangequery

Rangequery (w, S(G)): for a range w and a set G stored in S, report all objects g in G with g.b ∩ w ≠ Ø.

Assumption: two rectangles that only intersect at a boundary do not intersect, i.e., intersection (A, B) := closure (interior of A ∩ interior of B).
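The intersection rule can be sketched in a few lines of Python. The function name `interiors_intersect` and the flat tuple layout (l1, r1, …, ld, rd) for a box are assumptions of this example. Two boxes count as intersecting only if, in every dimension, the overlap of their intervals has positive length.

```python
def interiors_intersect(a, b):
    """True if boxes a and b, each given as (l1, r1, ..., ld, rd), share interior points."""
    if len(a) != len(b) or len(a) % 2:
        raise ValueError("boxes must have the same even number of coordinates")
    # In each dimension the overlap is [max of lefts, min of rights];
    # a strict comparison excludes boxes that only touch at a boundary.
    return all(
        max(a[i], b[i]) < min(a[i + 1], b[i + 1])
        for i in range(0, len(a), 2)
    )


print(interiors_intersect((0, 2, 0, 2), (1, 3, 1, 3)))  # overlapping boxes
print(interiors_intersect((0, 1, 0, 1), (1, 2, 0, 1)))  # boxes sharing only an edge
```

Under this rule a degenerate box (for example a single point) never intersects anything, because its interior is empty.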


Rangequery cont’d

[Figure: seven bounding boxes labelled 1 to 7; the range query reports 1, 6, 3, 5.]

Search

Search (b, S(G)): for a bounding box b and a set G stored in S, report all objects g in G with g.b = b.


search - example

[Figure: two objects g and g′; the object g (blue) has a bounding box matching the query box.]

Insert

Insert (g, S(G)): S(G) := S(G ∪ {g}), i.e., add g to G and store the result in S.


Remove (Delete)

Remove (Delete) (b, S(G)): remove object g if g.b = b, i.e., S(G) := S(G \ {g}); remove g from G and store the result.
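To make the four operations concrete, here is a deliberately naive in-memory sketch that answers every request by a linear scan. The class name `NaiveStore`, the pair layout `(b, rest)` for an object, and the helper `interiors_intersect` are assumptions of this example. It is a baseline for the semantics only and ignores secondary storage entirely.

```python
def interiors_intersect(a, b):
    """Boxes are (l1, r1, ..., ld, rd); touching at a boundary does not count."""
    return all(
        max(a[i], b[i]) < min(a[i + 1], b[i + 1])
        for i in range(0, len(a), 2)
    )


class NaiveStore:
    """Linear-scan stand-in for S(G); every object g is a pair (b, rest)."""

    def __init__(self):
        self._objects = []

    def insert(self, g):
        # S(G) := S(G U {g})
        self._objects.append(g)

    def search(self, b):
        # Report all objects whose bounding box equals b.
        return [g for g in self._objects if g[0] == b]

    def remove(self, b):
        # Drop every object whose bounding box equals b.
        self._objects = [g for g in self._objects if g[0] != b]

    def rangequery(self, w):
        # Report all objects whose bounding box intersects the range w.
        return [g for g in self._objects if interiors_intersect(g[0], w)]


store = NaiveStore()
store.insert(((0, 2, 0, 2), "town A"))
store.insert(((5, 7, 5, 7), "town B"))
store.insert(((1, 6, 1, 6), "lake"))
print([rest for _, rest in store.rangequery((0, 3, 0, 3))])  # ['town A', 'lake']
```

Every operation touches all stored objects, which is exactly the cost a disc-based structure has to avoid.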


Comments

1. While uniqueness of bounding boxes is somewhat the underlying assumption, non-uniqueness does not pose any serious implementation difficulties.
2. For insert, search and delete the key is spatial, but the spatial location is not referenced. These operations can therefore be handled by traditional secondary storage data structures such as B-trees, dynamic hashing, …, e.g., by mapping the 2-d key components into one 1-dimensional key (lexicographic order).
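The lexicographic mapping can be sketched with Python tuples, which already compare lexicographically. The sorted list below is only an in-memory stand-in for a B-tree, and the class name `SortedBoxIndex` is an assumption of this example.

```python
import bisect


class SortedBoxIndex:
    """Keeps box keys (l1, r1, l2, r2) in lexicographic order."""

    def __init__(self):
        self._keys = []

    def insert(self, b):
        # insort keeps the list sorted, like inserting into an ordered index.
        bisect.insort(self._keys, tuple(b))

    def search(self, b):
        # Exact-match lookup by binary search on the 1-dimensional order.
        b = tuple(b)
        i = bisect.bisect_left(self._keys, b)
        return i < len(self._keys) and self._keys[i] == b

    def remove(self, b):
        b = tuple(b)
        i = bisect.bisect_left(self._keys, b)
        if i < len(self._keys) and self._keys[i] == b:
            del self._keys[i]
            return True
        return False


index = SortedBoxIndex()
for box in [(5, 7, 5, 7), (0, 2, 0, 2), (0, 2, 1, 3)]:
    index.insert(box)
print(index.search((0, 2, 1, 3)))  # True
```

Exact-match operations work well in this order, but boxes that are neighbours in the list can be far apart in space once the first key component differs.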


Comments

Thus searches can be handled!

**Problem: Queries of type Rangequery**

They are space relevant, and the above storage schemes show serious deficiencies for them.


Objective

Find data structures for geometric objects such as points, polygons, etc. that allow efficient retrieval.

Primary concern: when accessing data, long chains of pointers that cross disk block boundaries must be avoided!

Game: design data structures with
– a small internal-memory access structure
– efficient dynamic updates


Basic Concepts

Basic concepts for spatial structures

Access time: DRAM (dynamic random access memory) chips for personal computers have access times of 50 to 150 nanoseconds (billionths of a second). Fast hard disk drives for personal computers boast access times of about 9 to 15 milliseconds. Note that this is on the order of 100,000 times slower than average DRAM.


Basic Concepts

The exact ratio varies from machine to machine.

Typical numbers are:

Memory access time (seconds): $10^{-7}$ … $10^{-6}$
Disc access time (seconds): $10^{-2}$ … $10^{-1}$
Ratio disc/memory access time: $10^{4}$ … $10^{5}$


Basic Concepts

Typical size of transfer unit (bits):

Memory: $10$ … $10^{2}$
Disc: $10^{4}$ … $10^{5}$
Ratio disc/memory transfer size: $10^{2}$ … $10^{3}$


Basic Concepts

The time for an operation is thus determined by the time to retrieve the data plus the time required to carry out the local computation. For many operations, the number of disc accesses is the dominating factor. However, there are geometric problems where the internal computations are also costly.
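A back-of-the-envelope version of this cost model, as a hedged Python sketch: the two constants are taken from the lower ends of the typical access times quoted in these notes, while the function name `operation_time` and the example workload are assumptions.

```python
T_MEMORY = 1e-7  # seconds per memory access (assumed: lower end of the typical range)
T_DISC = 1e-2    # seconds per disc access (assumed: lower end of the typical range)


def operation_time(disc_accesses, memory_accesses,
                   t_disc=T_DISC, t_memory=T_MEMORY):
    """Retrieval time plus local computation time, in seconds."""
    return disc_accesses * t_disc + memory_accesses * t_memory


# Hypothetical operation: 10 block reads and 100,000 in-memory steps.
total = operation_time(10, 100_000)
disc_share = (10 * T_DISC) / total
print(f"total = {total:.3f} s, disc share = {disc_share:.0%}")
```

In this made-up workload the ten disc accesses account for about nine tenths of the total time, even though the in-memory steps outnumber them by a factor of 10,000.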



Proximity

Data on discs are seen to be organized in BLOCKS. A block is a unit of data that is retrieved in one shot from a disc. A block contains many data items; these should be useful for the algorithm and its execution.

1. Local maintenance of proximity: objects stored in the same block are physically close in space.
2. Global maintenance of proximity: objects stored in adjacent blocks are physically close.


Proximity

Especially the last point is very difficult to obtain. There is no perfect data organization! Even small improvements in it yield noticeable accelerations.


Central issue

Organizing the embedding space versus organizing its content. We will discuss data organizations that depend on the data and, mostly, those that depend on the space. This is the key distinction between spatial and non-spatial data structures.


**Non-spatial data structures**

Data structures for non-spatial data: any search structure that you may have encountered, for example a binary search tree.

• searches are comparative
• structures exist and are readily available, also balanced
  – AVL, 2-3 trees, red-black trees

These are excellent search structures, also for statistical queries including median, percentiles, …


**Non-spatial data structures**

Such data structures are not designed for, nor can they efficiently handle:

• general location queries
  – nearest neighbour
  – identify clusters in data


**Review of address computation schemes**

1. Hashing
2. Radix trees
3. Tries

These assign an address of a storage cell to any key value x (course notes).


k-d trees

k-d trees were invented by Bentley ’75 as generalizations of search trees, i.e., they are comparative.

Other relevant structures: Lueker ’78, Lee & Wong ’77, Willard ’78, Bentley ’79, Bentley and Maurer ’80.


k-d trees

An example:

[Figure: a k-d tree whose levels discriminate on dim 1, dim 2, dim 3, …, dim d in turn and then start again with dim 1, dim 2; sample node keys x : 50, y : 4, y : 15.]

k-d trees

Problems:

• it is hard to balance these structures, i.e., to get logarithmic height (1-d is easy)
• the space partitioning created lacks regularity
• neighbour queries are difficult
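A minimal in-memory k-d tree makes the comparative nature and the balancing problem visible. This is a sketch under simplifying assumptions (points only, no deletion, no rebalancing, no disc blocks); the names `Node`, `insert`, `range_search` and `height` are chosen for this example.

```python
class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None


def insert(root, point, depth=0):
    """Insert a point; the comparison dimension cycles with the depth."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def range_search(root, lo, hi, depth=0, out=None):
    """Collect all points p with lo[i] <= p[i] <= hi[i] in every dimension i."""
    if out is None:
        out = []
    if root is None:
        return out
    axis = depth % len(lo)
    if all(l <= c <= h for l, c, h in zip(lo, root.point, hi)):
        out.append(root.point)
    if lo[axis] < root.point[axis]:    # left subtree holds smaller coordinates
        range_search(root.left, lo, hi, depth + 1, out)
    if hi[axis] >= root.point[axis]:   # right subtree holds the rest
        range_search(root.right, lo, hi, depth + 1, out)
    return out


def height(root):
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


sorted_points = [(i, i) for i in range(8)]
mixed_points = [(4, 4), (2, 2), (6, 6), (1, 1), (3, 3), (5, 5), (7, 7), (0, 0)]

chain = None
for p in sorted_points:
    chain = insert(chain, p)

bushy = None
for p in mixed_points:
    bushy = insert(bushy, p)

print(height(chain), height(bushy))  # 8 4
```

The same eight points give a chain of height 8 when inserted in sorted order and a tree of height 4 in a luckier order, which is the balancing difficulty named in the first bullet.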


First approaches

First approaches to spatial data structures

• were based on the existing search structures
• organized the data stored!
• not the space in which the data was embedded


**Filter illustration for a rectangular space partitioning**

[Figure: a rectangular space partitioning with a query q; the task is to report all objects that intersect q. Labels in the figure: hit, query cells, ignored, drop, not retrieved. The oval is examined and then dropped.]

Comment

Spatial data structures cover the space with cells. Each cell is stored on disc and therefore is associated with a disc block or blocks.


**Three-phase model**

Three steps:

1. Cell addressing: for a given query, find all “cells” of the partitioning that could contain elements relevant to the query.
2. Coarse filter: retrieve the elements found in Step 1 from disc.
3. Fine filter: examine the elements (from Step 2) to see if they fit the query.
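The three steps can be sketched on a fixed grid. Everything concrete here is an assumption made for illustration: the cell size `CELL`, circles as the stored geometry, a dictionary standing in for disc blocks, and all function names.

```python
from collections import defaultdict

CELL = 10.0  # hypothetical side length of a grid cell


def cells_of(box):
    """Step 1 helper: all grid cells touched by the box (l1, r1, l2, r2)."""
    l1, r1, l2, r2 = box
    return {(i, j)
            for i in range(int(l1 // CELL), int(r1 // CELL) + 1)
            for j in range(int(l2 // CELL), int(r2 // CELL) + 1)}


def bbox(circle):
    cx, cy, r = circle
    return (cx - r, cx + r, cy - r, cy + r)


def circle_meets_rect(circle, rect):
    """Exact geometric test: does the open disc reach into the rectangle?"""
    cx, cy, r = circle
    l1, r1, l2, r2 = rect
    nx = min(max(cx, l1), r1)  # point of the rectangle nearest to the centre
    ny = min(max(cy, l2), r2)
    return (cx - nx) ** 2 + (cy - ny) ** 2 < r * r


def build(objects):
    """Assign each object to every cell its bounding box touches."""
    blocks = defaultdict(list)  # cell -> object names, standing in for disc blocks
    for name, circle in objects.items():
        for cell in cells_of(bbox(circle)):
            blocks[cell].append(name)
    return blocks


def query(objects, blocks, q):
    cells = cells_of(q)                                          # 1. cell addressing
    candidates = {n for c in cells for n in blocks.get(c, [])}   # 2. coarse filter
    hits = sorted(n for n in candidates
                  if circle_meets_rect(objects[n], q))           # 3. fine filter
    return candidates, hits


objects = {"A": (5.0, 5.0, 3.0), "B": (17.0, 17.0, 2.5), "C": (35.0, 35.0, 2.0)}
blocks = build(objects)
candidates, hits = query(objects, blocks, (8.0, 16.0, 8.0, 16.0))
print(sorted(candidates), hits)  # ['A', 'B'] ['B']
```

In this run A is retrieved and then dropped by the fine filter, B is a hit, and C is never retrieved because none of its cells is addressed by the query.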


**Tree-based schemes**

Work has been done on the internal memory data structures segment trees and range trees, and on how they can be extended to external storage. This is not covered here. It could be a good topic for a class presentation.


Three philosophies

1. Space driven
   1. multi-dimensional linear hashing
   2. space-filling curves
   3. ...
2. Data driven
   1. k-d-B-trees
   2. …
3. Combinations
   1. grid file and its variants
   2. BANG file, …


Linear hashing

Viewed as a spatial data structure: partition the 1-d data space into intervals.

[Figure: successive partitions of the 1-d data space into intervals, with interval addresses 0 to 7.]

**Interval sizes are half of the previous ones; simple addressing scheme.**


doubling

Doubling typically means adding a bit to the front (or back) of the string created thus far. E.g., in some of the schemes you would see:

[Figure: the addresses 0 and 1 become 00, 10, 01, 11 after one added bit.]

This means that when you run out of space, a piece of the same size is appended, resulting in a doubling of the space used. However, address calculations are simple!


MOLHPE

Multidimensional Order Preserving Linear Hashing

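[Figure: a 2-d space split into eight blocks; the block addresses read 2 5 3 7 in the top row and 0 4 1 6 in the bottom row.]

Note the alternation of splits across the dimensions: the 1st split is by x, the 2nd by y, and the 3rd again by x. Note also that each block is split.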


z-hashing

Dynamic z-hashing

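[Figure: a 2-d space split into eight blocks; the block addresses read 2 3 6 7 in the top row and 0 1 4 5 in the bottom row.]

Note that the addressing function is different from the one given above. The reason is that proximity is better maintained between adjacent blocks.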


**space-filling curves**

The above schemes define a traversal of the space. Here we list other space filling curves that are typically used. They have different properties and studies have been carried out on them. E.g., Peano, z-ordering and Hilbert


**space-filling curves**

[Figure: the Hilbert curve and the Z-order curve (G. M. Morton).]

Z-order

The z-order of a point with coordinates (x, y) is obtained by bit-wise interleaving of the x and y bits.

Ex.: y = 2 = 010, x = 5 = 101; interleaving gives 0 1 1 0 0 1 = 25.
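The interleaving can be written as a short Python function. The name `z_order` and the explicit `bits` parameter are assumptions of this sketch; the bit order (each y bit placed just above the corresponding x bit) is chosen so that the result agrees with the worked example.

```python
def z_order(x, y, bits):
    """Interleave the lowest `bits` bits of x and y; y takes the higher bit of each pair."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bit goes to the even position
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bit goes to the odd position
    return z


z = z_order(5, 2, 3)
print(z, format(z, "06b"))  # 25 011001
```

With this bit order, the four cells with y = 0 and x = 0, 1, 2, 3 receive the addresses 0, 1, 4, 5.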


Z-order

Range queries are possible; slight care needs to be taken to find successors of a point in z-order.


**Hilbert curve: mapping**

Range queries are more natural, but the successor function is more difficult than with z-ordering.
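For comparison with the bit interleaving used by z-ordering, here is a sketch of one common iterative way to map a grid cell to its position along a Hilbert curve. It is a widely used formulation and not necessarily the construction shown in these notes; the name `hilbert_index` and the orientation of the curve (starting at cell (0, 0)) are assumptions.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n-by-n grid, n a power of two."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)   # which quadrant, in curve order
        if ry == 0:                    # rotate/reflect so the sub-curve lines up
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


n = 8
cells = sorted(((x, y) for x in range(n) for y in range(n)),
               key=lambda c: hilbert_index(n, *c))
steps = [abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(cells, cells[1:])]
print(cells[:4], max(steps))  # [(0, 0), (1, 0), (1, 1), (0, 1)] 1
```

Consecutive positions along the curve are always neighbouring cells, which is why range queries behave more naturally; the price is the rotation logic, which has no counterpart in plain bit interleaving.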


Hilbert curve cont’d

direction in which to draw the elements of the Hilbert curve


Peano
