You are on page 1of 60

Database models and system

architecture
Database Management System
• A database management system (DBMS) defines,
creates, and maintains a database.

DBMS components
DBMS components
• Hardware
• The physical computer system that allows physical access to data
• Software
• The actual program that allows users to access, maintain, and
update physical data
• Data: stored physically on the storage devices
• User: include end users and application programs
• Procedure
• A set of procedures (rules) that should be clearly defined and followed by the users
of the database

05/25/2020 4
ARCHITECTURE
Figure 14-2

Database
architecture
Database architecture
ANSI/SPARC
• Internal level
• The internal level determines where data are actually
stored on the storage device
• Conceptual level
• The conceptual level defines the logical view of the data
• External level
• The external level interacts directly with the user
DATABASE
MODELS
Database models
• Hierarchical model: obsolete
• Network model: obsolete Relational model
• In a relational model, data are organized in two-dimensional tables called
relations.
Hierarchical model
Network model: graph
Relational model
RELATIONAL
MODEL
Relational database management system
• Relation: a relation is a 2D table has the following
features:
• Name
• Attributes
• Tuples
OPERATIONS
ON
RELATIONS
Operations on relations
• Insert : unary operation
• Delete : unary operation
• Update : unary operation
• Select : unary operation
• Project : unary operation
• Join : binary operation
• Union : binary operation
• Intersection : binary operation
• Difference : binary operation

05/25/2020 17
Insert operation

Delete operation
Update operation

Select operation
Figure 14-11

Project operation

• The project operation creates a relation in which each


tuple has fewer attributes.
Figure 14-12

Join operation
Figure 14-13

Union operation
Figure 14-14

Intersection operation
Figure 14-15

Difference operation
STRUCTURED
QUERY
LANGUAGE
OTHER
DATABASE
MODELS
• Distributed databases
• The data are stored on several computers that communicate through the
Internet.
• Fragmented distributed databases
• Replicated distributed databases
• Object-oriented databases
• An object-oriented database tries to keep the advantages of the relational
model and at the same time allows applications to access structured data.
Key terms
• Join operation
Application program
• Network model
Cardinality
•• Object-oriented
Conceptual level database
•• Procedure
Database
•• Project
DBMS operation
• Relation
• Difference operation
• RDBMS
• Distributed database
• Relational model
• End user
• Select operation
• External level
• Software
•• Hardware
SQL
•• Hierarchical model
Tuple
•• Internal
Union level
operation
• Intersection
Update operation
operation
Systems Architecture
Monolithic Systems

29
Monolithic Systems
• “no architecture”
reports

static data imported data

dynamic data

CSC407 30
Examples
• Most programs you deal with day-to-day
• word processing
• spreadsheets
• powerpoint
• e-mail
• inexpensive accounting packages
• development environments
• compilers
• most games

13 - Monolithic CSC407 31
• Large, corporate batch systems
• payroll
• reports
• Descartes route planning

05/25/2020 32
Characteristics
• Usually written in a single programming language.
• Everything compiled and linked into a single (monolithic) application
• as large as 100M loaded executable
• as large as 2G virtual memory
• May operate in both
• batch mode
• GUI mode

13 - Monolithic CSC407 33
• Data
• load into memory
• write all back on explicit save
No simultaneous data sharing
• May have concurrency
• multi-threading
• multi-processing (but only one executable)

05/25/2020 34
Concurrency
• multi-threading

source code
shared memory
executes within OS Process shared system resources
single or multi-cpu
multi-threading
source code

35
• symmetric multi-processing

master
program
fork
fork

slaves program program

precise copies except for fork() return value

source code

36
• distributed processing

program 2
program 1

many source codes

37
• Why multi-threading?
• performance when there is access to multiple CPUs
• A design philosophy for dealing with asynchronous events
• interrupts
• GUI events
• communications events

38
• Maintain liveness
• can continue to interact with user despite time-consuming operations
• e.g., SMIT “running man”
• performance
• pre-load, network initializations
• multi-tasking (lets the user do many tasks at once)
• e.g., downloads from the net

05/25/2020 39
• Why symmetric multi-processing?
• you need parallelism
• performance
• liveness
• …
• a program is not written to be multi-threaded
• temporarily modifying shared data

13 - Monolithic CSC407 40
• Tricks:
• special allocators to group modifiable shared data to contiguous memory
• Using memory management hardware to switch volatile data based
on “thread”

05/25/2020 41
Monolithic Architecture
• A monolithic system is therefore characterized by
• 1 source code
• 1 program generated
• but… may contain concurrency

42
Data
• In a monolithic architecture
• data is read into application memory
• data is manipulated
• reports may be output
• data may be saved back to the same source or different
• Multi-user access is not possible

43
Multi-User Access
• Can changes by one user be seen by another user?
• not if each copy of the application reads the data into
memory
• only sequential access is possible

44
• Allowing multiple users to access and update volatile data
simultaneously is difficult.
• Big extra cost
• require relational database expertise

45
Advantages
• performance
• accessing all data
• disk is disk!
• either

46
- read data more directly from the disk via file system
• highly optimized
• caching and pre-fetching built-in
• read data less directly from the disk via layers of intervening software (e.g.,
RDBMS, OODBMS, distributed data server).

05/25/2020 47
• Modifying data
• in-memory is massively quicker
• caching is not an option for shared data systems
• delays while committing changes to a record
• No IPC overhead

05/25/2020 48
• simplicity
• less code to write
• fewer issues to deal with
• locking, transactions, integrity, performance, geographic
distribution

05/25/2020 49
Disadvantages
• Lack of support for shared access
• forces one-at-a-time access
• mitigate:
• allowing datasets that merge multiple files
• hybrid approach
• complex monolithic analysis software
• simple data client/server update software

50
• Quantity of data
• when quantity of data is too large to load into memory
• too much time to load
• too much virtual memory used
• Depending on which is possible
• sequential access (lock db or shadow db)
• selective access

05/25/2020 51
Red Herring
• Monolithic systems are “less modular”

52
• The code for distributed systems will need to share
common objects.
• This “module” organization could be terrible.

53
Red Herring (sort of)
• Distributed systems require architects to define and maintain
interfaces between components
• cannot do anything without this
• even for RDBMS systems
• relational schema + stored procedures define an important interface
• by default: nothing is visible
• must work to expose interface

54
• For monolithic systems, this is “optional”
• because there are no process boundaries, any tiny component can depend on
(use, invoke, instantiate) any other in the entire monolithic system. e.g.,
extern void a_routine_I_should_not_call(int a, int b);
• default: everything is visible
• must work to hide non-interface

05/25/2020 55
Module Structure
• To preserve the architectural integrity of a monolithic system, we must
work to define and maintain (typically) extra-linguistic sub-system
boundaries.
• recall façade pattern

56
Library Structure
foo.c foo.o bar.o
01010010010101 01010010010101
100 100
10100100101000 10100100101000
1001000210010001 1001000210010001
gcc 1001010100100101
011010100100101
1001010100100101
011010100100101
011010101010101 011010101010101
01101010101010 01101010101010
010 010
1010100100101 1010100100101

main.o
01010010010101
100
10100100101000 lib2.a
1001000210010001 01010010010101
100
ar/ln
1001010100100101 10100100101000
011010100100101 1001000210010001

main 011010101010101 1001010100100101


011010100100101
01101010101010
lib.a/lib.so
011010101010101
01010010010101 010 01101010101010
100 1010100100101 010
1010100100101 01010010010101
10100100101000
100
1001000210010001
10100100101000
1001010100100101
1001000210010001
011010100100101
011010101010101
ln 1001010100100101
011010100100101
01101010101010
011010101010101
010
01101010101010
1010100100101
010
1010100100101

57
Library Structure in C/C++
• Decide
• how many libraries to have
• their names
• which subsystems go into which libraries
• wise to align library structure with a subsystem
• not necessary to do so
• e.g., could be a base level of utilities that rarely change whose TU’s
belong to unrelated subsystems (stretching it).
• rationale

13 - Monolithic CSC407 58
• Why?
• reduce compilation dependencies
• can be changing a bunch of .c’s and .h’s and others can keep using the library
• but… don’t change any.h’s exported beyond the library

05/25/2020 59
• “poor man’s” configuration management system
• often most practical
• Reduces link time (libraries often pre-linked)
• Shipping libraries
• Common library supports many apps

05/25/2020 60

You might also like