You are on page 1of 29

CS 5614: Basic Data Definition and Modification in SQL

67

Introduction to SQL

Structured Query Language (Sequel)

Serves as DDL as well as DML

Declarative
Say what you want without specifying how to do it
One of the main reasons for commercial success of DBMSs
Many standards and implementations
ANSI SQL
SQL-92/SQL-2 (Null operations, Outerjoins etc.)
SQL3 (Recursion, Triggers, Objects)
Vendor specific implementations
Bag Semantics instead of Set Semantics
Used in commercial RDBMSs

CS 5614: Basic Data Definition and Modification in SQL

Example:

Create a Relation/Table in SQL

CREATE TABLE Students


(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL);

Support for Basic Data Types

CHAR(n)
VARCHAR(n)
BIT(n)
BIT VARYING(n)
INT/INTEGER
FLOAT
REAL, DOUBLE PRECISION
DECIMAL(p,d)
DATE, TIME etc.

68

CS 5614: Basic Data Definition and Modification in SQL

69

More Examples

And one for Courses

CREATE TABLE Courses


(courseid CHAR(6),
department CHAR(20));

And one for their relationship!

CREATE TABLE takes

Why?

(sid CHAR(9),
courseid CHAR(6));

Can also provide default values

CREATE TABLE Students


(sid CHAR(9),
....
age INTEGER DEFAULT 21,
gpa REAL);

CS 5614: Basic Data Definition and Modification in SQL

Examples Contd.

DATE and TIME


Implementations vary widely
Typically treated as strings of a special form
Allows comparisons of an ordinal nature (<, > etc.)
DATE Example
1999-03-03 (No Y2K problems)
TIME Examples
15:30:29
15:30:29.3875
Deleting a Relation/Table in SQL

DROP TABLE Students;

70

CS 5614: Basic Data Definition and Modification in SQL

71

Modifying Relation Schemas

Drop an attribute (column)

ALTER TABLE Students DROP login;

Add an attribute (column)

ALTER TABLE Students ADD phone CHAR(7);

What happens to the new entry for the old records?


Default is NULL or say
ALTER TABLE Students ADD phone CHAR(7)
DEFAULT unknown;

Always begin with ALTER TABLE <TABLE_Name>

Can use DEFAULT even with regular definition (as in Slide 69)

CS 5614: Basic Data Definition and Modification in SQL

How do you enter/modify data?

INSERT command
INSERT
INTO Students
VALUES (53688,Mark,mark2345,23,3.9)

Cumbersome (use bulk loading; described later)

DELETE command
DELETE
FROM Students S
WHERE S.name = Smith

UPDATE command
UPDATE Students S
SET S.age=S.age+1, S.gpa=S.gpa-1
WHERE S.sid = 53688

72

CS 5614: Basic Data Definition and Modification in SQL

73

Domains

Domains: Similar to Structs and other user-defined types

CREATE DOMAIN Email AS CHAR(8) DEFAULT unknown;


....
login Email // instead of login CHAR(8) DEFAULT unknown

Advantages: can be reused

junkaddress Email,
fromaddress Email,
toaddress Email,
....

Can DROP DOMAINS too!

DROP DOMAIN Email;

Affects only future declarations

CS 5614: Basic Data Definition and Modification in SQL

Keys

To Specify Keys

Use PRIMARY KEY or UNIQUE


Declare alongside attribute
For multiattribute keys, declare as a separate line
CREATE TABLE takes
( sid CHAR(9),
courseid CHAR(6),
PRIMARY KEY (sid,courseid)
);

Whats the difference between PRIMARY KEY and UNIQUE?

Typically only one PRIMARY KEY but any number of UNIQUE keys
Implementor allowed to attach special significance

74

CS 5614: Basic Data Definition and Modification in SQL

75

Creating Indices/Indexes

Why?

Speeds up query processing time


For Students

CREATE INDEX indexone ON Students(sid);


CREATE INDEX indextwo ON Students(login);

How to decide attributes to place indices on?


One is (typically) created by default on PRIMARY KEY
Creation of indices on UNIQUE attributes is implementation-dependent
In general, physical database design/tuning is very difficult!
Use Tools: Microsoft SQLServer has an index selection Wizard
Why not place indices on all attributes?
Too cumbersome for insertions/deletions/updates
Like all things in computer science, there is a tradeoff! :-)

CS 5614: Basic Data Definition and Modification in SQL

Other Properties

NOT NULL instead of DEFAULT


CREATE TABLE Students
(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL);

Can insert a tuple without a value for gpa

NULL will be inserted


If we had specified

gpa REAL NOT NULL);

insert cannot be made!

76


 


"!$#&%('*),+.-&/10(2$35462748%(0(09':!;/&-<=!$#&>?2@2A-&%CBD2@>?2$;!FEG/&2$>:<H0(IJ&KL/1ILKL2@'?ML>N2$O&>N2$'?2$;!;IP!;%C+L&'Q/&'N2$-%(SR$+LJT
U /&&RV!;%(+JW48%X!;#W!$#&2
>?2$0CIY!;%C+L&IJ0L)"+.-&2$0CZ\[]#&2
^&>N'!_%('9>N2$0(IP!;%C+L&IJ0.IL0(KJ2$`&>NIL3GILQIL0CKL2$`&>NIL%CR_IJ&-AO1>?+.R$2$-1/&>?IJ0
46I<Hab+J>]R$>?2@IY!;%C&Kc&2N4d>?2@0(IY!$%(+J&']ab>?+J)eKL%gfD2$+L12$'?Zh[]#&2Q'?2$R@+L&-%(']i*IY!;IJ0(+JKL3548#&%(RY#%(']0(+JKL%CR$IL0jIL&-&2$R@0(IL>NIY!$%XfD2h%(&IY!$/&>?2,k:'!;/1-&2$;!;']abIL)"%(0C%(IJ>l48%X!$#F!;#&2Qm
n]oQp9o8qrO&>?+JKL>NIL)"),%C&K,0CIL&KJ/&ILKJ2h48%C0(0^1&!;#1%('\IJ0(0!;+.+c&IP!;/&>NIL0(sNZtabIJRN!;3!;#&2@>?2uIJ>?2uR$0C+L'N2uR$+L>N>?2$'NOv+L&-&2@&R$2$'w`D2V!V4u2$2$-1IY!;IJ`&IL'N2Q'<'!;2@),'wIL&m
n]oQp9o8qQZwm_n]oQp9oQqxR$ILy`v2z!;#1+L/&KJ# !=+Ja]IL'AIS-&IY!$IL`&IJ'?2'<'!$2$){48#&2$>N2,IJ0(0|!;#12-&IY!$I}^J!$'A%C !$+
)"IL%(~)"2$)"+L><.ZR@+L;!;>?IJ'!$3uI-1%('!$%(&KJ/&%('N#&%(1Kab2$IY!$/&>?2S+Launhiu]&'c%C'Q!$#&IY!"!;#&2V<+LOv2$>NIY!;2+J
'?2@R$+L&-1IL><'!$+L>NILKL2@ZoW!$#&2$>W!;#&IJ!;#&IP!ykIJ&-~'?+J),2-1%(Bv2$>?2@&R$2$'Q%(yEG/&2$><O&>N+.R$2$'?'N%(&KJs?3!$#&2$>?2"IL>N2
2NR$2$0C0(2$;!7IL1IL0(+JKL'w!$+c`v+Y!;#SR$/&0g!;/&>N2$'?Z]]+P!;#m
n]o8p9oQqIJ&-ntiu]1']IL>?2Q-&2@R$0(IJ>?IY!$%XfD2$Z#&IY!]4u2
&+4!;+"`v2A>?2@0(IY!$%(+J&']IL>N2A>?2@ab2$>?>N2$-z!;+"IL'*(O&>N2$-&%(R@IY!;2@'?%Cm_n]oQp9oQqQZ!$/&O&0(2Q%C']R$IL0C0(2$-I"KL>?+J/&&abILRV!F%C}m
n]oQp9o8qQZ!;IJ`&0(2z%('hR$IL0C0(2$-ILC2NG!;2$&'N%(+J&IL0l-&2$^&1%X!;%C+L&9%C}m
n]oQp9oQqIJ&-'?++J&Zk:iu+
&+P!8KL2N!u`D+JKLKJ2$-}-&+48`;<"!;#&2$'N2u'?Ov2$R@%(^&R$'NG4u2t%C&R$0(/1-&2
!$#&2$)#&2$>N2 U /&'!u'?+*!$#&IY!w<D+L/R$IL),I 2
!$#&2
R$+J&&2$RN!$%(+J&3%(a|<D+J/IJ>?2cIL0C>?2$IJ-J<abIJ),%C0(%CIL>w48%g!;#m
n]oQp9o8qQZ

0('N2$3&+P!;#&%C&KF!;+F46+J>?><.Z(s[]#&2*!$#&%(>NEG/&2$><=>N2$O&>N2$'?2$;!;IP!;%C+L"%('?3j+La9R$+J/&>?'N2$318p!$#&IY!w46IJ']%(;!;>?+.-1/&R$2$-"2$IL>N0(%C2$>?ZwV6!;#12u>?2$)"IL%C&-&2$>
+Ja1!;#&%C'
-&+.R$/1),2$;!;34u2F48%(0(0
%C !$>?+.-&/&R@2=`1IL'?%CR+LOv2$>?IP!;%C+L&'"IL&-~)"IL&%CO&/&0CIY!;%C+L&'*!$#&IY!c4u2R$IL~Ov2$>?ab+J>?)+J
>?2@0(IY!$%(+J&'?Z"v+L>Q2$ILRY#'?/1R#`&IJ'?%CRc+LOv2$>?IP!;%C+L&3462*48%(0C0'N#&+4#&+4%X!c%('Q>?2$O1>?2$'N2$ !$2$-}%C2$ILRY#+La|!$#&2
!;#1>?2$2Q-&%CBD2@>?2$;!6&+P!;IP!;%(+J&'?Z
Zulvv]G&@vl \[]#&2z/&&%C+L+Laj!4u+c>?2$0CIY!;%C+L&'7IL&-=%('!;#&2Q'?2V!8+Lal2$0(2$)"2$;!;'!;#&IP!
IJ>?2u%C+J>\%C+L>]`v+P!;#&Z2uIL'N'?/&)"2w!$#&IY!w!;#&2h'?RY#&2$)"IL'w+LaIJ&-IL>N2uIL0C% 28k+Ja9R$+L/1>?'?2@s
IJ&-z!;#1IY!
!;#&2@%(>R$+J0(/&)"&'
IL>N2tIJ0('N+A+L>N-&2$>N2$-,IJ0(% 26k+JaR$+J/&>?'N2$3vILKJIL%(1s?Z2]&+4KJ%XfD2!;#12
!$#&>?2$2
>N2$O&>?2@'?2$;!;IP!;%(+J&']+La5!;#128/1&%(+J}>?2@0(IY!$%(+J&

k'?%C),O10(2$3j>?%(KJ# !$ s

?
?
?~
??|
?
?
?~
??|
*+Y!;%CR$2S!$#&IY!!$#&2SfIL>N%(IL`10(2$'

?
? IJ>?2)"2$>?2@0X<(O10(ILR@2$#&+L0C-&2$>N'?Q/&'?2@-ab+J>O&IY!N!;2$>N

)"IY!$R#1%(&KJ1462uR$+J/&0(-#&IfD2W48>?%g!?!$2$F!;#&2QIL`v+fD2W!46+cIL'N
?
?
;~
? ?|
?
?
?~
??|
?GGG
l\ _ LYN?9Y9Y_V]_ ? VW?]] LVV_@V ]N\b.jQ$ 

G 
G;
?GGG
G 
Zu_G l."DwG&@vl \[]#&2*-&%(Bv2$>N2$&R$2t
2$0C2$)"2$ !$'l!$#&IY!AIL>N2A%(=`&/J!A&+Y!

~+Ja1!46+A>N2$0(IP!;%C+L&'\IJ&-%('!;#&2*'?2N!u+La

%C=
Zhu'h/&'?/1IL0(35462QIJ'?'?/1),27!;#&IP!t!$#&2A'NR#&2@),IJ']+LaIL&-

IL>N2uIL0C% 2hIL&-F!$#&IY!w!;#&2$%C>
R$+L0C/&),1'\IL>N2uIL0C'?+c+J>?-&2$>N2$-IJ0(% 2@Z\+L>N2$+fD2$>?31&+Y!$%(R$2w!$#&IY!6
%C']&+Y!kKJ2$&2$>NIL0(0g<|s!;#&2Q'NIL)"28IJ'*

*Z

?
?
?~
??|D|;
??|
?GGG
G 
G

?GGG
G 
Z Y$G $G.$vv
G&@vl
[]#&2]%(;!;2$>N'?2$RV!;%(+JF
+La!46+Q>?2$0CIY!$%(+L1' IJ&-,%('j!$#&2]'?2V!
 
+Ja2$0C2$)"2$ !$'l!$#&IY!8IJ>?2A%C

IL&-}
Z]*KLIL%C&31462*IL'?'N/&)"2h!;#1IY!]!;#128'NR#12$),IJ']+LaIJ&-

IJ>?2\IJ0(% 2_IJ&-*!;#1IY!!$#&2$%(>9R@+L0(/1),&'lIL>N2]IL0C'?+h+L>?-12$>?2$-cIJ0(% 2$Zu+P!;%(R@29!$#&IY!7




k
sNZ

?
?
?~
??|D
?
?D
?GGG
G 
;GGGG
?GGG
G 
Z Q 
 

.$v*o8Ov2$>NIY!$2$'u+LyI'N%(&KJ0(2A>N2$0(IP!;%C+L}IJ&->?2@),+fD2$'h'?+J),2z+La|!;#&2cR@+L0(/1),&'NZu'N2$ab/&0

ab+J>,>N2$'!$>?%(RV!;%(1K}%(1ab+L>?)"IY!$%(+J&Zu'?'N/&)"2c!$#&IY!"462F4uIL;!=+L&0g<!;#&2&IJ),2IJ&-IJ-&-&>?2@'?'"ab>?+L)
>N2$0(IP!;%(+J=*Z

! #"$"&%')(*(

?
?|G
?
?D
[]#G/&' 

`D2@R$+L)"2]%(>N>?2$0C2NfIL;!WIP!?!;>N%(`&/!;2$'NZl2]R$+L/&0C-,%C&'!$2$IL-c>N2$%(1ab+L>?R@2!;#&%C'` <z48>?%g!;%(1K

?
?|G
?
?,+,+.
GGG.-G/G0G211
G
Z 4lG.$v\oQOv2$>?IP!;2$'h+LSIc'?%C&KL0C28>N2$0(IP!;%C+LIL&-S>?2$)"+fD2$']'?+J),2Q+Jaj!;#&2Q>N+
3 

48'NZt[]#12A>?2@),+fIL0

%C'8`1IL'?2@-+L'N+L)"2,R@+L&-&%g!;%C+Ly'?Ov2$R$%C^&2$-y` <!$#&2,/1'?2$>NZD+J>Q2N|IJ),O10(2$3
'?/1O&Ov+L'?28462646IL;!IJ0(0
!$#&2W!;/&O&0C2$']ab>?+J)48#12$>?2W!;#1281IL)"28%C'](%CR#&IJ2$0(CZ
65

7 8:9<;>=<?A@
#)BDC C

?
?
?~
??|DFEHGJIG GKJL
GGG
G

MNGG.-/G.EOG?2IG;G2KJL
Z Qv R4lG.$vlTSH&2$0C2$RN!;%C+L&'c`v2$R@+L)"2}R$+J),O&0C%(R$IP!;2$-48#&2@y!;#12}R$+J&-&%X!$%(+J&',KJ2N!}0(+L1KL2$>N3
P 
O&IJ>!$%(R$/&0CIL>N0X<48%X!$#}iuIP!;IL0C+LKk!$#&2A+P!;#&2@>w!46+ab+J>?)"'tIJ>?2cO&>N2N!?!<'!$>?IL%CKL#;!;ab+J>46IJ>?-&sNZV]
U +L&'N%(-&2@>
ab+J>_2V|IJ),O&0C2$3;48#&2$*4624uIL;!7IL0C0J!;#12!;/&O&0C2$'ab>N+L)dn48#12$>?2!$#&2]&IL)"2]%('(%(RY#&IJ2$0(.oQny48#12$
!$#&2QKL2$&-12$>]%(']C(ZG2W48>?%g!;2w!$#&%(']%C=i*IY!;IJ0(+JKIL'N
?
?
?G
?
?D,FEHG?2IG GG2KJLD
?
?
?G
?
?D,FEHG?WL

u+P!;%CR$2h!$#&IY!Q462cC'?O&0C%X!$j!;#&2cR@+L&-&%g!;%C+LILR@>?+L'N'h!46+>?/&0C2$'?3 U /1'!,IL'W462"-&+%(!$#&2c/&&%C+LyR$IL'N2
kXU]+L)"2z!;+

!;#&%C

+Lah%X!$3l!;#12,oQn%('A%C&-&2$2@-!$#&2,/1&%(+J+Ja!46+}R@+L&-&%g!;%C+L&'Ns?Zy&%()"%(0CIL>?0g<3j!$#&2

R$+J),)"I%C}2$IJR#S+La|!;#&2cIJ`v+ fD2c-1IY!;IJ0(+JK=>N/&0(2@'t)"+.-&2$0C'w!;#&2c*uidR$+J&-&%g!;%(+J&Zcp92N!$('hR$+L&'N%(-&2@>
I)"+L>N2AR$+J),O10(%(R@IY!;2@-=R@+L&-&%g!;%C+L&Q&2$0C2$RN!8IJ0(05!;#&27!;/&O&0C2$'hab>?+L)xn!;#&IP!

IL>N2A&2$%g!;#&2@>t)"IL0C2A&+J>

#&IfD2w!$#&2Q&IL)"2Q(v+|CZ
?
?
?G
?
?D,
#&%C0(2Q'?2@0(2$RN!$%(&K*!$#&2h!$/&O&0C2$']ab>?+J)

YZG?WLG=dFY[GJ\;WL
n!$#&IY!

IJ>?2z&+Y!

`v+P!;#)"IL0(2QIJ&-#&IfD2W!;#&2z&IL)"2Q(v+|9%C'

IJR#&%C2NfD2$-"` <k48#;<| s?
?
?
?G
?
?D,
?
?
?G
?
?D,

YZG?WL

YZG?2\;HL|

Z]c& $G$&^Q:_`l. c[]#&%('z%('w!$#&2"'?2N!,+Law(O&IJ%(>N'?


ab+L>N),2$-y`;<RY#&+.+L'N%(&K

!;#&2"^&>?':!,2$0(2$)"2$;!

ab>N+L)IL1-"!;#12u'?2$R@+L&-2$0C2$),2@ !tab>?+J)Z]R$IL'N2Q+LalR$+L&ab/1'?%(+J&'hIL)"+L&K"IY!N!;>?%C`&/J!$2u&IL)"2$'N3
-&%C'?IJ)c`&%(KJ/&IY!$2h!;#12$)x` <O&>N2$^J%(&K6!;#&2@)48%g!;#
+Jav!46+">?2@0(IY!$%(+J&'7IL&-}

!;#&2c>N2$0(IP!;%(+J}&IJ),2@Z8[]#12cR$IL>:!;2$'N%(ILSO&>?+.-1/&RN!

%('hKL%gfD2$`;<|

?2aJaJa|VJaD?Fb?Fb?b b~
?Ja|?2a?2aV2a.,
?b
?Fb?FbVFb
GGGh*-/hGF0GJ1F1Dt-G0
,hJIc0FdGGd.

h*-/hGF0GJ1F1Dt-G0
,hJIc0FdGGd.

G 

u+P!;%CR$2Q#&+4462z-&%('NIL)c`&%CKL/&IP!;2AIP!?!$>?%(`1/J!;2$'h%(
!;/&O&0C2$']IL1-

#1IL'Wf!;/&O&0C2$'?31!$#&2$=

!;#&2Q&QpfD2$>N'?%(+J&Zuu0C'?+"&+Y!$%(R$27!;#&IP!F%Car#1IL'

H48%C0(0#1I fD2

6g

e f!;/1O&0(2$'NZ

Z hjiG$lk#mv[]#&%('c%C' U /&'!=0C% 2*!$#&2R$IL>:!;2$'N%(IL~O&>N+.-&/&RN!=`1/J!=KL+.2$'"I'!$2$O~ab/&>!$#&2$>NZ*a!;2$>


5 
ab+J>?)"%(&K*!$#&2 e
f !;/&O10(2$'N3%X!u'?2$0C2$RN!$'\+L10X<HIc'N/&`&'?2V!8+La5!;#&2$)!;+c%C&R$0C/&-&2u%C%X!$']IL&':462$>N39`&IL'N2$-

+J'N+L)"2R$+L&-1%X!;%C+L&Z[]#G/&'N3h!;#12!;#&2N!$IYT U +L%C+Lah!4u+>?2$0CIY!$%(+L1'#&IL'8!;#&2'NIL)"2/&)c`v2$>+La
R$+J0(/&)"&'hIL'!;#&2zR$IL>:!;2$'N%(IJ}O&>N+.-&/&RN!A`&/J!A&+Y!c&2$R$2$'N'?IJ>?%(0g<!;#128'NIL)"2AG/&)c`v2$>]+La
>N+ 48'Ak'N+L)"2

+Ja]!$#&2}>N+48'Q48%C0(0h`D2S>?2$)"+fD2$-~`v2$R$IL/1'?2!;#12N<-&%C'?'?IP!;%C'?a<'N+L)"2}R$+J&-&%g!;%(+J&s?Z^U]+J&'?%C-&2$>ab+J>
2NIL)"O&0C2$3 !;#&IP!]462W46IJ !w!;+c^&&-CO&IL%C>?'?9+Ja':!;/&-12$ !$'\'N/&R#F!$#&IY!]!$#&2Q^&>?':!8OD2@>?'?+J%(F!;#&2*O&IL%C>
%C']IL0X4uI<|']C%(RY#&IL2$0C(Z|28KJ2N!;
on)p6qsr t#u#8lv ;>=<?A@
#)BDC
?2aJaJa|VJaD?Fb?Fb?b b~
?Ja|?2a?2aV2a.,
?b
?Fb?FbVFbD,2acEHGJI;GJKL|
GGGh*-/hGF0GJ1F1Dt-G0
,hJIc0FdGGd.

h*-/hGF0GJ1F1Dt-G0
,hJIc0FdGGd.

G 

MNGGt*-GF/GwExG?JI;GJK2L
u+P!;%CR$2|!;#&IP!4u2_%C !$>?+.-&/1R$2l!$#&2
(`v+4]!;%(2@.'<)c`D+J0yn*pz'?/lz6T?2$-z` <u!$#&2\R@+L&-&%g!;%C+LAab+J>%C&-&%CR$IY!$%(&K

!$#&2W!;#&2N!$IYT U +L%C&Z]u0('N+L39>N2$IL0C%|{$2w!$#&IY!

}n*p~
Z S&`l1mvS[]#&%('z%('
g 

7 ~]k

s

U /&':!IR$0C2NfD2$>N2$>w46I<!$+R$+L)c`1%(&26!46+}>N2$0(IP!;%C+L&'c%C !$+}+L12$Z~[]#&2

`&IJ'?%CR,%C-&2$I%('W!;#1IY!%Ca|!;#&26!46+>?2$0CIY!$%(+L1'8#1I fD2"'?+J),2"R$+J0(/&)"k'NsQ%(yR$+L)"),+J&3|!;#&2@4u2,R$IJ
CR$+L0C0(ILO1'?2$j!$#&2$)

%C !$+"!;#12,'NIL)"2R$+L0C/&)"%C!$#&2,^1&IL0w+L/J!$O&/J!;Zy+L>N2$+fD2$>?3|4u2,R$IJ-&+

!;#&%C'

+J&0X<}%(a!;#&2w!4u+h!;/1O&0(2$'
ab>N+L)!;#&2w!4u+8>N2$0(IP!;%C+L&'wILKL>N2$2u%Cz!;#&+J'?2uR@+L)"),+JR$+L0C/&)"&'?Zw[]#/&'N3%g!
%C'u'?%C),%C0(IL>!$+z!;#12cR$IL>:!;2$'N%(ILO1>?+.-&/&RV!;3`&/!u462z U +L%C&l+L&0g<}!;#1+L'?2cO1IL%(>N'w!;#1IY!

)"IY!$R#S%(!;#12$%(>

R$+J),)"+L"IY!N!;>N%(`&/J!$2$'?ZU]+L&'N%(-&2@>l!;#1IY!
46246IJ !
!;+Q^&1-z!;#&2h&IJ),2$3vIJ-&-&>?2@'?'?3vKJ2$&-&2$>N3vKLO&IzIL&`&%C>!$#&-&IY!$2A+Ja'!$/&-&2$;!;'h%(I'N%(&KJ0(2Q>?2$0CIY!$%(+L1Zuu+Y!$%(R$2_!;#&IP!
+P!;#&2$>
ab+J/&>IY!N!;>N%(`&/J!$2$'
IL>N2]O&>?2$'N2$;!7%(

KLO&I"%('hIfIL%(0CIL`&0C2Qab>?+L)`&/!t!$#&2

*Z]1+L3 462]&2@2$-cIW46I<A!$+8%C !$2$0(0C%(KJ2$ !$0X<AR$+L)c`1%(&2!;#&2$'N2

!46+c>N2$0(IP!;%C+L&'N
on)p
?
?
?!D~G~?
?
?|,?
|
GGGh*-/h"GF0GJ1F1,t-G0
thJIc0FdGGd.
G 

MNGGt*-GF/GFEGh)-./

hGGFG0 211
E hGF0GJ1F1
 Z8Glluj\[]#1%('h%(' U /&':!FIcR$+.+J0v!$#&%(1KL39%(R$IJ'?2W462Q#&IfD2W!;+.+")"IL;<H&IL)"%(1KcR$+Ll1%(RN!$']IL&R$+J&ab/&'N%(+L1'uIL>N%('?%C&KLZ72cR@IL}/1'?2*!;#1%('h+LOv2$>?IP!;+J>]!$+>?2@&IL)"2AI>N2$0(IP!;%C+L&C't&IJ),2zIL&-1ML+L>Q+J&2
+J>t)"+L>N28+Ja
%X!;']IP!?!$>?%(`1/J!;2$'NZuv+L>]2NIL)"O&0C2$39IL'N'?/&)"2h4u2]4uIL;!t!$+,>N2$&IJ),28!$+
%g!;'hR$+L0C/&),1'!;+`v2AR@IL0(0C2$0C% 2h'?+L

;& # st

IL1-

e 

k*s

IJ&-})"I 2

ZQ[]#&%C't%C't)"+L':!F/&'N2$ab/&0548%X!$#=>N2$0(IP!;%C+L&IJ0IJ0(KL2@`&>?IJ3

CS 5614: Misc. SQL Stuff, Safety in Queries

81

What is still to be covered


(and will be)

Declaring constraints

Domain Constraints
Referential Integrity (Foreign Keys)
More SQL Stuff
Subqueries
Aggregation
SQL Peculiarities
Strange Phenomena
More on Bag Semantics
Ifs and Buts
Embedding SQL in a Programming Environment
Accessing DBs from within a PL
(will be covered in Module 3)

CS 5614: Misc. SQL Stuff, Safety in Queries

What will be mentioned


(but not covered in detail)

Triggers

Read Cow Book or Boat Book


More SQL Gory Details
Recursive Queries (SQL3)
Why do we need these?

Security

Authorization and Privacy

Trends towards Object Oriented DBMSs

82

CS 5614: Misc. SQL Stuff, Safety in Queries

83

Tuple-Based Domain Constraints

Already Seen

NOT NULL
UNIQUE, PRIMARY KEY etc.
In General

CREATE TABLE Students


(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL,
CHECK (gpa >= 0.0)
);

Note: Implementations vary, but this is the general idea

Other Complicated Forms


Constraints on whole relations, Assertions

CS 5614: Misc. SQL Stuff, Safety in Queries

Referential Integrity Constraints

Foreign Keys
An attribute a of R1 is a foreign key if it references
the primary key (say b) of another relation R2
In addition, there is a ref. integrity constraint from R1 to R2.
Example
login is a FOREIGN KEY for Students

CREATE TABLE Students


(sid CHAR(9) PRIMARY KEY,
name VARCHAR(20),
login CHAR(8)
REFERENCES Accounts(acct),
age INTEGER,
gpa REAL
);
CREATE TABLE Accounts
(
acct CHAR(8) PRIMARY KEY
);

84

CS 5614: Misc. SQL Stuff, Safety in Queries

85

Alternatively

Can use FOREIGN KEY construct

CREATE TABLE Students


(sid CHAR(9) PRIMARY KEY,
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL,
FOREIGN KEY login
REFERENCES Accounts(acct)
);
CREATE TABLE Accounts
(
acct CHAR(8) PRIMARY KEY
);

Note: acct should be declared as PRIMARY KEY for Accounts!

in both cases

CS 5614: Misc. SQL Stuff, Safety in Queries

SQL Subqueries

Given

Students(sid,name,login,age,gpa)
HasCar(sid,carname)
Find

the car of the student with login=mark


Traditional Way

SELECT carname
FROM Students, HasCar
WHERE Students.login=mark
AND Students.sid=HasCar.sid;

The Subway

SELECT carname
FROM HasCar
WHERE sid=
(SELECT sid FROM Students
WHERE login=mark);

86

CS 5614: Misc. SQL Stuff, Safety in Queries

87

Aggregation

Given

Students(sid,name,login,age,gpa)

Find

the average of the ages of all the students


Solution

SELECT AVG(age)
FROM Students;

Other Operations

SUM (summation of all the values in a column)


MIN (least value)
MAX (highest value)
COUNT (the number of values), e.g.
SELECT COUNT(*)
FROM Students;

COUNTs the number of Students!

CS 5614: Misc. SQL Stuff, Safety in Queries

Ordering

Given

Students(sid,name,login,age,gpa)

List

the students in (ascending) alphabetical order of name


Solution

SELECT *
FROM Students
ORDER BY name;

and that for DESCending ORDER is

SELECT *
FROM Students
ORDER BY name DESC;

Default is ASC

88

CS 5614: Misc. SQL Stuff, Safety in Queries

89

Grouping

Given

Students(sid,name,login,age,gpa)

Find

the names of students with gpa=4.0 and


group people with like ages together
Solution

SELECT name
FROM Students
WHERE gpa=4.0
GROUP BY name;

CS 5614: Misc. SQL Stuff, Safety in Queries

More on Grouping

Given

Students(sid,name,login,age,gpa)

Find

the names of students with gpa=4.0 and


group people with like ages together and
show only those groups that have more than 2 students in it
Solution

SELECT name
FROM Students
WHERE gpa=4.0
GROUP BY name
HAVING COUNT(*) > 2;

90

CS 5614: Misc. SQL Stuff, Safety in Queries

91

Summary of SQL Syntax

General Form

SELECT <attribute(s)>
FROM <relation(s)>
WHERE <condition(s)>
GROUP BY <attribute(s)>
HAVING <grouping condition(s)>

Order of Execution

FROM
WHERE
GROUP BY
HAVING
SELECT

CS 5614: Misc. SQL Stuff, Safety in Queries

Views

Can be viewed as temporary relations


do not exist physically BUT
can be queried and modified (sometimes) just like normal relations
Example:

CREATE VIEW GoodStudents(id,name) AS


SELECT sid,name
FROM Students
WHERE gpa=4.0;
SELECT *
FROM GoodStudents
WHERE name=Mark;

Can we update the original relation using the GoodStudents VIEW?

92

CS 5614: Misc. SQL Stuff, Safety in Queries

93

Beginning of Wierd Stuff

SQL uses Bag Semantics


meaning: does not normally eliminate duplicates
e.g. the SELECT clause
BUT (a big BUT)
this doesnt apply to
UNION, INTERSECT and DIFFERENCE
Either way, it provides facilities to do whatever we want
If you want duplicates eliminated in SELECT clause

use SELECT DISTINCT .....

If you want to prevent elimination of duplicates in UNION etc.

use (SELECT ...) UNION ALL (SELECT ...)


Likewise for INTERSECT and DIFFERENCE

CS 5614: Misc. SQL Stuff, Safety in Queries

... and thats just the tip of the iceberg

What happens with the following code?

SELECT R.A
FROM R,S,T
WHERE R.A = S.A or R.A = T.A

when R(A) has {2,3}, S(A) has {3,4} and T(A) is {}

Confusion Reigns!

94

CS 5614: Misc. SQL Stuff, Safety in Queries

95

Safety in Queries

Some queries are inherently unsafe

should not be permitted in DB access


Example
Given only the following relation

Students(id)

Find all those who are not students

Easy to distinguish unsafe queries via common-sense


Final result is not closed
Is there an automatic way to determine safety?

CS 5614: Misc. SQL Stuff, Safety in Queries

Answer: Yes!

Easiest to spot when written in Datalog

Answer(id) <- NOTStudents(id).

Golden Rule

Any variable that appears anywhere must also


appear in a non-negated body part

In this case, id causes the query to be unsafe

Example of a Safe Query

Answer(id) <- People(id), NOT Students(id)

This produces all those people who are NOT students


safe because the People relation provides a reference point
id which appears in a negated body part also appears non-negated

96

CS 5614: Misc. SQL Stuff, Safety in Queries

97

More Dangers

Problem not restricted to negated body parts

occurs even with arithmetic body parts (why?)


Given
only the following relation

Students(id,age)

Find all those numbers that are greater than the age of some student
Answer(x) <- Student(id,age), x>age.

Extension to previous rule

Any variable that appears anywhere must also


appear in a non-negated, non-arithmetic body part
In this case, x causes the query to be unsafe
bcoz it doesnt appear in a non-negated, non-arithmetic part

CS 5614: Misc. SQL Stuff, Safety in Queries

98

One More Example

Given
a relation Composite(x)
which lists all the composite numbers
Write a query to find
the prime numbers
Wrong Way

Prime(x) <- NOT Composite(x).


Right Way

Prime(x) <- Number(x), NOT Composite(x).


Safety in Other Notations

Relational Algebra: via the subtraction operator


SQL: via the EXCEPT construct
Notice how SQL and Relational Algebra do not allow unsafe queries
because there is no way to write such queries with the given constructs
how clever, eh? :-)
It is always amazing how languages force you to think in a certain manner
a problem long studied by philosophers

CS 5614: Misc. SQL Stuff, Safety in Queries

99

Recursion in Queries

Used to specify an indefinite number of applications of a relation


Example

Given only the following relation


Person(name,parent)

Find all the ancestors of Mark

Easy to find an ancestor at a predefined level


parent: Use Person
grandparent: Join Person with Person
great-grandparent: Join Person with Person with Person
and so on.
To find an ancestor at no predefined level
Need to join Person with Person an indefinite number of times
SQL3 provides support for recursive definitions

CS 5614: Misc. SQL Stuff, Safety in Queries

Solution in Datalog

First, the base case

Ancestor(x,y) <- Person(x,y).

Then, the inductive step

Ancestor(x,y) <- Person(x,z), Ancestor(z,y).

Can also write the previous rule as

Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

why?

100

CS 5614: Misc. SQL Stuff, Safety in Queries

101

Recursion in SQL3

Use the WITH RECURSIVE ... SELECT construct


Example

WITH RECURSIVE Ancestor(name,ans) AS


(SELECT *
FROM Person)
UNION
(SELECT Person.name,Ancestor.ans
FROM Person, Ancestor
WHERE Person.parent=Ancestor.name)
SELECT * FROM Ancestor;

Use with caution: Some kinds of recursive queries will not be allowed!

example: the following Datalog query might not be allowed in SQL3

Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

because the rule involves 2 applications of the recursively defined predicate


Linear recursion allows only one (as in the SQL code above)

CS 5614: Misc. SQL Stuff, Safety in Queries

Final Example

Be careful when combining negation, aggregation and recursion


perfect recipe for disaster!
Mutual Recursion

Odd(x) <- Number(x), NOT Even(x).


Even(x) <- Number(x), NOT Odd(x).
What are the problems?

Notice that the query appears safe (per Slide 96)


cycles indefinitely!; no proper base cases
Illegal in SQL3
not because of mutual recursion
but due to the fact that there is no unique interpretation to the query
Eg: 6 could be either in Odd or in Even; both are acceptable!
Sometimes mutual recursion is good and fruitful, if written properly

with proper limiting constraints and base cases

102

CS 5614: Misc. SQL Stuff, Safety in Queries

103

Introduction to Deductive DBMSs

Intersection of traditional RDBMSs and Logic Programming


Example Systems

CORAL (Univ. Wisc.)


LDL++ (MCC)
XSB Systems (SUNY, Stony Brook)
Can be viewed as
extending PROLOG-type systems with secondary storage
extending RDBMSs with deductive functionality
Mappings: Commonalities between PROLOG and DBMSs
Predicate: Relation
Argument: Attribute
Ground Fact: Tuple
Extensional Definition: Table (defined by data)
Intensional Definition: Table (defined by a view)

CS 5614: Misc. SQL Stuff, Safety in Queries

PROLOG vs. RDBMSs

Characteristics of PROLOG

Tuple-at-a-time
Backward Chaining
Top-Down
Goal-Oriented
Fixed-Evaluation Strategy (Depth-First)
Characteristics of RDBMSs
Set-at-a-time (recall relational algebra)
Forward Chaining
Bottom-Up
Query Optimizer figures a good evaluation strategy
Example

ancestor(X,X). parent(amy,bob).
ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
Query

Find the ancestors of bob: ancestor(X,bob)?

104

CS 5614: Misc. SQL Stuff, Safety in Queries

105

PROLOG Pitfalls

Previous Example

Linear Recursion
Tail Recursion

What if we reverse the order of clauses in

What if we make it

ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).


PROLOG goes into an infinite loop (why?)
ancestor(X,Y) <- ancestor(X,Z), ancestor(Z,Y).
Not Linear Recursion

Inference = Resolution + Unification


Entailment in First Order Logic is Semi-decidable

CS 5614: Misc. SQL Stuff, Safety in Queries

106

Example of Deductive Query Optimization

Same-Generation: Hello World of DDBMSs

sg(X,Y) <- flat(X,Y).


sg(X,Y) <- up(X,U),sg(U,V),down(V,Y).
Magic: A Rewriting Technique

Rewrite query such that advantages of bottom-up evaluation


goal-oriented behavior are combined

Example: For the query

Magic produces

sg(john,Z)?

sg(X,Y) <- magic_sg(X),flat(X,Y).


sg(X,Y) <- magic_sg(X),up(X,U),sg(U,V),down(V,Y).
magic_sg(john).
How do you know when to stop?

Iterative Fixpoint Evaluation (when the answer stops changing)

CS 5614: Misc. SQL Stuff, Safety in Queries

107

SQL in a Programming Environment

Incorporating SQL in a complete application


Why?

There are some things we cannot do with SQL alone


e.g. preserving complex states, looping, branching etc.
Typically embed SQL in a host-language interface
Problems: Impedance Mismatch
SQL operates on sets of tuples
Languages such as C, C++ operate on an individual basis
Solution
easy when SELECT returns only one row
When more than one row is returned
design an iterator to run over the results
called a cursor

CS 5614: Misc. SQL Stuff, Safety in Queries

How are these implemented?

Vendor-Specific Implementations
ORACLE: PL/SQL (procedural extensions to SQL)
Open Database Connectivity Standard
Provides a standard API for transparent database access
used when database independence is important
used when required to connect to diverse data sources

108

CS 5614: Misc. SQL Stuff, Safety in Queries

109

Tradeoffs

ODBC

originated by Microsoft in 1991


adds one more abstraction layer
not as fast as a native API (does not exploit special features)
least-common denominator approach
constantly evolving

PL/SQL etc.
tailored to the details of the underlying DBMS
might not extend to heterogeneous domains
modeled after a specific programming language (e.g. Ada for Pl/SQL)

CS 5614: Misc. SQL Stuff, Safety in Queries

110

In Between: Stored Procedures

Used for developing tightly-coupled applications


push computations selectively into the database system
avoid performance degradation
work in database address space instead of application address space
Advantages
No sending SQL statements to and fro
eliminate pre-processing
speedup by an order of magnitude
Example Applications
Database Adminstration
Integrity Maintenance and Checks
Database Mining
Disadvantages
Non-standard implementation
Difficult to enforce transactional synchronization
Without traditional SQL optimization, can lead to performance degradation

CS 5614: Misc. SQL Stuff, Safety in Queries

111

Introduction to Query Optimization

Helps attain declarativeness of RDBMSs

One of the main reasons for commercial success of DBMSs


A motivating example
Find all students with 4.0 gpa enrolled in CS5614

SELECT name
FROM Students, Classroll
WHERE Students.name = Classroll.studentname
AND Students.gpa = 4.0
AND Classroll.coursename = CS5614

Two Strategies

Do join and then filter out the ones with gpa <> 4.0 and course <> CS5614
Filter first the ones with gpa <> 4.0 and course <> CS5614 and then Join
Which is Better?
Always good to push selections as far down into the query parse tree

CS 5614: Misc. SQL Stuff, Safety in Queries

How does a Query Optimizer work?

Three Requirements
A Search Space of Plans
A Cost Model (for Plan evaluation)
An Enumeration Algorithm
Ideally
Search Space: contains both good and efficient plans
Cost Models: cheap to compute and accurate
Enumeration Algorithm: efficient (not a monkey-typewriter algorithm)
Example of a Search Space
See Previous Slide
Examples of Cost Models
#(tuples) evaluation
#(main memory locations) etc.
Example of an enumeration algorithm
Sequential enumeration of a lattice of plans
Dynamic Programming vs. Greedy Approaches

112

CS 5614: Misc. SQL Stuff, Safety in Queries

113

A Simple Measure of Cost

#(Tuples) in a query
Easiest to compute for

Cartesian Product: #(R X S) = #(R)#(S)


Projection:#(Pi(R)) = #(R)
A Notation for Other Operations
V(R,A) = Number of distinct values of attribute A in R
formulas assume that all values of A are equally likely in R
Holds in average case for most distributions (e.g. Zipf)
Selectivity Factors for Selection Operations
Equality Tests: Use 1/V(R,A)
< or > Tests: Use 1/3
<> Test: Use (V(R,A)-1)/V(R,A)
AND Conditions: Multiply Selectivity Factors
OR Conditions: Three Choices
Sum of results from individual selectivity factors
Max(sum,total size of relation): why?
n(1-(1-m1/n)(1-m2/n)) formula : most accurate

CS 5614: Misc. SQL Stuff, Safety in Queries

Estimating the Size of a Join

Assume: R(X,Y) Join S(Y,Z)


Range of Values

Minimum: 0
In-between: #(R) (if Y is a foreign key for R and a key for S)
Maximum: #(R)#(S) (if Ys in R and S are all the same)
Assumptions for Join Size Estimation
Containment of Value Sets
Preservation of Value Sets
Containment of Value Sets
If V(R,Y) <= V(S,Y) then the Ys in R are a subset of the Ys in S
Satisfied when Y is a foreign key in R and a key in S
Preservation of Value Sets
#(R Join S,X) = #(R,X)
#(R Join S,Z) = #(S,Z)
why is this reasonable?

114

CS 5614: Misc. SQL Stuff, Safety in Queries

115

The Actual Estimate

Assume that V(R,Y) <= V(S,Y)

Every tuple in R has a chance of 1/V(S,Y) of joining with a tuple of S


Every tuple in R has a chance of #(S)/V(S,Y) of joining with S
All tuples in R have a chance of #(R)#(S)/V(S,Y) of joining with S
What if V(S,Y) <= V(R,Y)
Answer: #(R)#(S)/V(R,Y)
In general: #(R)#(S)/(max (V(S,Y),V(R,Y)))
What if there are multiple join attributes
Have a max factor in the denominator for each such attribute!
How to Estimate #(R Join S Join T)?
Does it matter which we do first?
Surprise!
Estimation formula preserves associativity of Joins!
In other words, it takes care of itself!
Thus, for a Join attribute appearing > 2 times
3 times: Use two highest values
4 times: Use three highest values etc.

CS 5614: Misc. SQL Stuff, Safety in Queries

More on Join Associativity and Commutativity

Which is better: (R Join S) or (S Join R)


Good to put the smaller relation on the left
Why? Most Join algorithms are assymmetric
Example:
Construct a good query tree for the following

SELECT movietitle
FROM Actors,ActedIn
WHERE Actors.name = ActedIn.actorname
AND Actors.age = 23

Number of Possible Trees of n Attributes

Arises from the shape of the trees: T(n)


Arises from permuting the leaves: n!
Total choices: n! T(n)

116

CS 5614: Misc. SQL Stuff, Safety in Queries

117

What is T(n)?

Sample Values

1: 1
2: 1
3: 2
4: 5
5: 14

A formula
T(1) = 1 (Basis)
T(n) = T(1)T(n-1) + T(2)T(n-2) + ..... + T(n-1)T(1)
Classifications
Left-Deep Trees: All right children are leaves
Right-Deep Trees: All left children are leaves
Bushy Trees: Neither Left-Deep nor Right-Deep
Choosing a Join Order: Restricted to Left-Deep Trees
By Dynamic Programming: O(n!)
Greedy Approach: Make local selections

CS 5614: Misc. SQL Stuff, Safety in Queries

Example

Consider
R(a,b): #(R) = 1000, V(R,a) = 100, V(R,b) = 200
S(b,c): #(S) = 1000, V(S,b) = 100, V(S,c) = 500
T(c,d): #(T) = 1000, V(T,c) = 20, V(T,d) = 50
Possible Join Orders
(R Join S) Join T
(S Join R) Join T (same as above; why?)
(R Join T) Join S
(T Join R) Join S (same as above)
(S Join T) Join R
(T Join S) Join R (same as above)
Cost Estimation = Sizes of Intermediate Relations
(R Join S) Join T: 5000
(R Join T) Join S: 1000000
(S Join T) Join R: 2000
Best Plan = (S Join T) Join R

118

CS 5614: Misc. SQL Stuff, Safety in Queries

119

Database Tuning: Why?

Two Families of Queries

OLTP (Access small number of records)


OLAP (Summarize from a large number of records)
Sources of Poor Performance
Imprecise Data Searches
Random vs. Sequential Disk Accesses
Short Bursts of Database Interaction
Delays due to Multiple Transactions
What can be done?
Tune Hardware Architecture
Tune OS
Tune Data Structures and Indices

CS 5614: Misc. SQL Stuff, Safety in Queries

Examples of Database Tuning

To normalize or not to
Sacrificing Redundancy Elimination
Sacrificing Dependency Elimination
Several Choices of Normalized Schemas
Vertical Partitioning Applications
Recomputing Indices
Histograms etc. might be outdated
Restricting Uses of Subqueries
Unnesting query blocks by Joins
Declining the Use of Indices
Table Scans for Small Tables
Rule-based optimization: Rewrite A=6 as A+0=6
Provide Redundant Tables
Decision-Support/ Data Mining Queries

120

You might also like