You are on page 1of 7

Fllthough most of us think that any databasethat supportsSQLis automaticallyconsiderred a

relationaldatabase,this isnt alwaysthe case-at least not completely.In Chapter 1,I discussed
the basicsand foundations of relationaltheory but no discussionon this subjectwould biecom-
plete without looking at the rules that wereformulated in 1985in a two-prut article publishedby
Conxputerworldmagazine("IsYourDBMSReallyRelational?"and "DoesYburDBMSRun lBythe
Rules?"by E. E Codd,Computerworld,October14 and October21, 1985).JManywebsiteseilsoout-
line theserules.Theserules go beyond relationaltheory and definesmore speciflccriteria.that need
to be met in an RDBMS,if it's to be trulyrelational.
It might seemlike old news,but the samecriteria can still be usedtoday to measurehow rela-
tional a databaseis. Theserules arefrequentlybrought up in conversatior':rs when discussinghow
well a particular databasesewer is implemented.I presentthe rulesin this appendix,alorrgwith
brief commentsasto whether SQLServer2008meetseachof them, or ottrerwise.Relationaltheory
has come a long way sincetheseruleswereflrst published,and "serious"theoristshaveenhanced
and refinedrelationaltheory tremendouslysincethen, asyou'd expect.A good placefor more seri-
ousIearningis the websitehttp: //www.dbdebunk. com,run by C.I. Dateand FabianPascal,or any of
their books.If you want to seethe debateson theory the newsgroupcomp,, databases.theory is a
truly interestingplace.Of course,asthe coverof this book states,my goal is practicality,with a
foundation on theory so I won't delvetoo deeplyinto theory hereat all. I presentthese 12rules
simply to set a basisfor what a relationaldatabasestartedout to be and largelywhat it is amdwhat
it should be eventoday.
All theserules are basedupon what'ssometimesreferredto as thefoundation princi1tle,which
statesthat for any systemto be calleda relational databasemanagementsystem,the relational
capabilitiesmust be able to manageit completely.
For the rest of this appendix,I'll treat SQLServer2008specificallyasil relationaldatabase
engine,not in any of the other conligurationsin which it might be used,srLrch as a plain data store,a
document storagedevice,or whateverother way you might useSQLServeras a storageengine.

Rule1:TheInformation
Rule
AII information in the relationaldatabaseis represented
in exactlyoneand only onewary-by
ualuesin tables.

This rule is an informal definition of a relationaldatabaseand indicatesthat everypieceof datathat


we permanentlystore in a databaseis locatedin a table.
In general,SQLServerfutfllls this rule, becausewe cannot storeany information in anything
other than a table.We cant usethe variablesin this codeto oersistanv data,and thereforethev're
scopedto a singlebatch.
5S5
APPENDA
I X* C O D D ' S
12 RULES
F O RA N R D B M S

Rufe2: Guaranteed
AccessRule
Eachand,euerydatum (atomic ualue) is guaranteed to be logically accessibleby resorting to a
combinationof table name,primary keyualue,and column name.

This rule stressesthe importance of primary keysfor locating datain the database.The table name
locatesthe correcttable,the column name finds the correctcolumn, and the primary key'ualue
finds the row containing an individual dataitem of interest.In other words,each(atomic)pieceof
data is accessiblebythe combination of table name,primarykeyvalue, and column name.This
rule exactlyspecifieshowwe accessdatausing an accesslanguagesuch asTransact-SQl('I-SQL)
in SQLServer.
Using SQL,we can searchfor the primary key value (which is guaranteedto be unique,based
on relationaltheory aslong asit hasbeen defined),and oncewe havethe row, the datais irccessed
via the column name,We can alsoaccesqdataby any of the columns in the table,though vrearent
alwaysguaranteedto receivea singlerow back.

Rule3: Systematic
Treatment
of NULL
Values
NULL values(distinctfrom emptycharacterstring or a string of blank charactersand distinct
ftom rero or any other number)aresupportedin thefully relational RDBMSfor representing
missinginformation in a systematicway,independentof data type.

This nrle requiresthat the RDBMSsupport a distinct NULL placeholder,regardlessof datatype.NULLs


are distinct from an empty characterstring or any other number, and they are alwaysto be consid-
eredasunknown values.
NULLs must propagatethrough mathematicoperationsaswell asstring operations.NULL+
<anything> = NULL, the Iogicbeing that NULLmeans"unknown." If you add somethingknown to
somethingunknown, you still dont know what you have,so it's still unknown.
Thereare a few settingsin SQLServerthat can customizehow NULLs aretreated.Most of these
settingsexistbecauseof somepoor practicesthat were allowedin earlyversionsof SQLServer:
. ANSI_NULLS:DetermineshowNULLcomparisonsarehandled.When0FF,thenNULL,= NULLis
Truefor the comparison,and when 0N(the default),NULL = NULL returns UNKNOWN.
. C0NCAT_NULL_YIELDS_NULL:\tVhentheCONCAT_NULL_YIELDS_NULLsettingisset0N,NULLsare
treatedproperly,suchthatNULL + 'Stri.ngValue' = NULL.IftheC0NCAT_NULL_YIELDS_NULL
settingis OFF,which is allowedfor backwardcompatibility with SQLServer,NULLs aretreated
inanonstandardwaysuchthatNU + 'LSLt r i n g V a l u e ' = ' S t r i n g V a l u e ' .

Rule4: Dynamic
Online Based
Gatalog onthe
Relational
Model
at the logical levelin thesameway as ordinary data,
Thedatabasedescriptionis represented
so authorizeduserscan apply thesamerelational languageto its interrogationas theyapply
to regular data.
, & C O D D '1S2 R U L EF
A P P E N DAI X SO R, A NR D B M S

This rule requiresthat a relational databasebe self-describing.In other words,the databasemust


contain certain systemtableswhosecolumns describethe structureof the databaseitself, or alter-
natively,the databasedescriptionis containedin user-accessible tables.
This rule is becomingmore of a reality in eachnew version of SQLServer,aswith the imple-
mentation of the INFORMATI0N_SCHE|,IA systemviews.The INFORMATION_SCHEM is a schematlhathasa
set of views to look at much of the metadatafor the tables,the relationships,the constraints,and
eventhe codein the database.
Anything elseyou need to larow can most like$ be viewed in the systemviews (in the SYIischema).
Theyle the systemviews that replacedthe systemtablesin 2005that we had usedsincethe breginning
of SQLServertime. Theseviews arefar easierto read and use,and most all the datais self-explanatory
rather than requiring bitwise operationson somecolumns to find the value.

Rule5: Gomprehensive
DataSublanguage
Rule
A relationalsystenxnxaysupportseuerallanguagesand variousmodesof terminal use.How-
ever, there must be at least one language whose statenxentsare expressible,per sonxe
well-definedsyntax,,as characterstringsand whoseability to support all of thefollowing is
comprehensible: a. data definition b. uiewdefinition c. data manipulation (interactiu'eand
by program)d. integrity constraintse.authorizntionf, ffansactionboundaries(begin,com-
mit, androllback).

This rule rnandatesthe existenceof a relationaldatabaselanguage,such asSQL,to manipulate


data.SQLassuchisn't specificallyrequired.The languagemust be ableto support all the central
functions of a DBMS:creatinga database,retrievingand enteringdata,implementing database
security,and so on. T-SQLfnlfils this function for SQLServerand carriesout all the data definition
and manipulation tasksrequired to accessdata.
SQLis a nonprocedurallanguage,in that you dont specify"how" things happen, or even
where.Yousimply ask a question of the relationalserve! and it doesthe work.

Rule6:ViewUpdating
Rule
All uiewsthat are theoreticallyupdateablearealso updateableby thesystem.

This rule dealswith views,which arevirtual tablesusedto givevarioususersof a databaseffierent


views of its structure.It's one of the most challengingrules to implement in practice,and no com-
mercialproduct firlly satisfiesit today.
A view is theoreticallyupdateableaslong asit's made up of columnsthat directly correspond
to real table columns.In SQLServer,views areupdateableaslong asyou dont update more than a
singletable in the statement;neither can you update a derivedor constantfield. SQLServer2000
alsoimplemented INSTEAD 0Ftriggersthat you can apply to a view (seeChapter6). Hence,,this rule
can be technicallyfi-rlfilledusing INSTEAD 0Ftriggers,but in what can be a less-than-straightforward
manner.Youneedto take carewhen consideringhow to apply updates,especiallyif the virewcon-
tains a CROUP BYclauseand possiblyaggregates.
A P P E N DA
I XE C O D D ' S
12 RULES
F O RA N R D B M S

Rule7: High-Level
Insert,
Update,
andlDelete
The capability of handling a baserelation or a deriued relation as a single operand aioplies
not only to the retrieualofdatabut also to the insertion,update,and deletionofdata.

This rule stressesthe set-orientednature of a relational database.It requiresthat rows be lreated as


setsin insert, delete,and update operations.The rule is designedto prohibit implementationsthat
support only row-at-a-time,navigationalmodification of the database.The SQLlanguagecovers
this via the INSERT,UPDATE,and DELETE statements.
Eventhe CLRdoesn'tallowyou to accessthe physicalfiles wherethe data is stored,but BCP
doeskind of go around this. As always,you haveto be exfta carefti when you usethe low-.leveltools
that can modify the data without going through the typical SQLsyntax,becauseit can ignrcrethe
rulesyou have set up, introducing inconsistenciesinto your data.

Rule8: Physical
DataIndependence
Application prograrnsand,terminal activities remain logically unimpaired wheneverany
changesare made in eitherstoragerepresentationor accessmethods.

Applicationsmust still work using the samesyntax,evenwhen changes.ue madeto the way in
which the databaseinternally implementsdata storageand accessmethods.This nrle implies that
the way the data is storedphysicallymust be independentof the logical manner in which it's
accessed. This is sayingthat usersshouldn'tbe concernedabout how the datais storedor how
it's accessed.In fact, usersof the dataneed only be ableto get the basicdefinition of the datathey
need.
Other things that shouldnitaffectthe user'sview of the data areasfollows:
. Adding indexes:Indexesdeterminehow the data is stored,yet the user,through SQL,will
neverknow that indexesarebeing used.
. Changingthefilegroup of an objectlust moving a table to a new filegroupwill not affectthe
application.You accessthe data in the sameway no matter whereit is physicallylocated.
. Usingpartitioning: Beyondmoving entire tablesaround to different filegroups,yon can
moveparts ofa table around by using partitioning technologiesto spreadaccessaround to
different independent subsystemsto enhanceperformance,
. Modilying the storageengine:From time to time, Microsofthas to modiff how SQLServer
operates(especiallyin major versionupgrades).However,SQLstatementsmust appearto
accessthe data in the samemanner asthey did in any previousversion,only (wehope)
faster.

Microsofthas put a lot of work into this area,becauseSQLServerhas a separaterelational


engineand storageengine,and OLEDB is usedto passdatabetweenthe two. Further readingon
this topic is availablein SQLServer2008BooksOnline in the "DatabaseEngineComponents"topic
or in InsideMicrosofrSQLSeruer2005:TheStorageEngineby KalenDelaney(MicrosoftPress,2006).
A P P E N D IAX S C O D D ' S
1 2 R U L E SF O R. A NR D B M S 59!I

Rule9:Logical
DataIndependence
Application programsand terminal activitiesremain logically unimpaired when informa-
tion preservingchanges
of any kind that theoreticallypermit unimpairment are ma.deto the
basetables.

Along with rule 8, this rule insulatesthe user or application programfrom the low-levelimLplemen-
tation ofthe database.Together,they specifiithat specificaccessor storagetechniquesusedby the
RDBM$-and evenchangesto the structureof the tablesin the database-shouldnt affectthe
user'sability to work with the data.
In this way,if you add a column to a table and if tablesaresplit in a manner that doesnt add or
subtractcolumns,then the applicationprogramsthat call the databaseshouldbe unimpaired.
For exarnple,sayyou havethe table inFigureA-1.

baseTable
I baseTabletd
I
fcolriniil |
lcolumn2 I

FigureA-1.Small sampletable

Then, sayyou verticallybreak it up into two tables (seeFigureA-2).

baseTableA baseTableB
lbadTabtetd_-l
-l tbaseTabtetd_l
-l
lcolumnl lcolumn,
FigureA-2.Smallsampletablesplit into two tables

Youthen could createthis view:


VIEIdbaseTable
CREATE
AS
SELECT
baseTableld,column1,column2
FROM baseTableA
l0IN baseTableB
= baseTableB.baseTableld
0N baseTableA.baseTableld
The usershould be unaffected.If you wereto implement INSTEAD 0Ftdggerson the view that
had the samenumber of columnswith the samenames,you could seamlesslymeet the needto
managethe view in the exactmanner the tablewas managed.Note that the handling of identity
columns can be tricky in views,becausethey require data to be entered,evenwhen the daLta
wont
be used.SeeChapter6 for more detailson creatingINSTEAD 0Ftriggers.
Of course,you cannot alwaysmake this rule work if columns or tablesareremovedfrom the
system,but you can make the rule work if columns and data aresimply added.

Fifip Rtways datafromtheRDBMS


access byname,andnotbyposition * wildcard.
orbyusingtheSELECT The
order shouldn't
ofcolumns make a ditference
totheapplication.
600 APPENDIX
A f* CODD'S12 RULESFORAN RDBMS

Rule10:Integrity
Independence
Integrityconstraintsspecificto a particular relationaldatabasemustbedefinablein the rela-
tional data sublanguageand storablein the catalog,not in theapplicationprograms.

The databasemust support a minimum of the following two integrity constraints:


. Entity integrl4r No component of a primary key is allowedto havea NULL value.
. ReferentialintegrityiFor eachdistinct non-NULLforeignkey valuein a relationaldatabase,
there must exista matching primary key value from the samedomain.

This rule saysthat the databaselanguageshould support integrity constraintsthat rer;trictthe


data that can be enteredinto the databaseand the databasemodificationsthat can be made.In
other words,the RDBMSmust internally support the definition and enforcementof entity integrity
(primary keys)and referentialintegrity (foreignkeys).
It requiresthat the databasebe ableto implement constraintsto protect the datafrom invalid
valuesand that careftildatabasedesignis neededto ensurethat referentialintegrity is achieved.SQL
Server2008doesa greatjob of providingthe tools to make this rule a reality.Wecan protect our data
ftom invalid valuesfor most any possiblecaseusing constraintsand triggers.Most of Chapter6 was
spent coveringthe methodsthat we can usein SQLServerto implement integrityindepenclence.

Rule11: Distribution
lndependence
The data manipulation sublanguageof a relational DBMS must enableapplication pro-
gramsand terminal actiuitiesto rernain logically unimpaired whetherand wheneuerdata
arephysicallycentralizedor distributed.

This nrle saysthat the databaselanguagemust be able to manipulatedatalocatedon othelrcom-


puter systems.In essence,we shouldbe ableto split the data on the RDBMSout onto multiple
physicalsystemswithout the userrealizingit. SQLServer2008supportsdistributed transaLctions
among SQLServersources,aswell asother types of sourcesusing the Microsoft DistributerdTrans-
action Coordinatorservice,
Another distribution-independencepossibilityis a $oup of databaseserversworkinpJtogether
more or lessas one.Databaseserversworking togetherlike this are consideredtobe fed.ernted.With
everynew SQLServerversion,the notion of federateddatabaseserversseamlesslysharingthe load
is becoming a definite reality.More readingon this subjectcan be found in the SQLServer2008
BooksOnline in the "FederatedDatabaseServers'itopic.

Rule12:Non-Subversion
Rule
Ifa relational systemhas or supportsa low-level (single-record-at-a-tinxe) language',that
low-Ieuellanguagecannot be usedto subuertor bypassthe integrity rules or conshraints
expressed in the higher-leuel(multiple-records-at-a-time) relational languctge.

This rule requiresthat alternatemethods of accessingthe data arenot ableto bypassintegrity con-
straints,which meansthat userscant violate the rules of the databasein any way.For mosrtSQL
/-
,/
! I *C O D D '1S2 R U L EFSO RA NR D B M S
A P P E N DAI X

Server2008applications,this nrle is followed,becausethere are no methods of gettingto the raw


data and changingvaluesother than by the methods prescribedby the database.However;SQL
Server2008violatesthis rule in two places:
. Bulk cow: By default,you can usethe bulk copy routines to insert data into the table directly
and around the databaseservervalidations.
. Disablingconstraintsand triggers:There'ssyntaxto disableconstraintsand triggerci,thereby
subverting this rule.

It's alwaysgood practiceto make sureyouuse thesetwo featurescareftrlly.Theyleavegaping


holesin the integrity of your data,becausethey allow any valuesto be insertedin any coluLmn.
Becauseyou'reexpectingthe data to be protectedby the constraintyou'veapplied,datavirlue
errorsmight occur in the programsthat usethe data,without revalidatingit fust.

Summary
Codd's12rulesfor relational databasescan be usedto explain much about how SQLServerr oper-
atestoday.Theserules were a major stepforward in determiningwhether a databasevenclorcould
call his system"relational" and presentedstiff implementation challengesfor databasede'velopers.
Fifteenyearson, eventhe implementation of the most complexof theserulesis becomingJachiev-
able,though SQLServer(and other RDBMSs)still fall short of achievingall their objectives.