You are on page 1of 55

WG3

Database Languages
Stephen Cannan Convenor
2002-05-06

Agenda

Working Group changes ISO/IEC 9075 ISO/IEC 20606 Recap of recent changes Sketch of planned changes

S J Cannan

Working Group changes


3 ex-WG5 subprojects absorbed
20606-1 Authorization and Audit 20606-2 Distribution Schema 20606-3 Encompassing Transaction

Effect on WG3 internal structure

S J Cannan

ISO/IEC 9075
Multi-part standard
Part 1: SQL/Framework Part 2: SQL/Foundation Part 3: SQL/CLI Part 4: SQL/PSM Part 9: SQL/MED Part 10: SQL/OLB Part 11: SQL/Schemata Part 13: SQL/JRT Part 14: SQL/XML

S J Cannan

Part 1: SQL/Framework

Common definitions & concepts
Structure of multi-part standard Basic conformance structure & statement

About 85 pages

S J Cannan

Part 2: SQL/Foundation
The core of the standard Includes:
Traditional SQL Object-oriented SQL Module Language Host language bindings (except Java) Dynamic SQL Direct SQL

Excludes:
Information Schema & Definition Schema (in Part 11)

About 1300 pages


6 S J Cannan

Part 3: SQL/CLI
Call-Level Interface Best-known implementation: ODBC About 400 pages

S J Cannan

Part 4: SQL/PSM
Persistent Stored Modules PSM-96 specified:
functions & procedures SQL-server modules computational completeness

In PSM:1999, functions & procedures moved to

Foundation (same in 2003) Analogous to PL/SQL, Transact-SQL, etc. About 170 pages

S J Cannan

Part 9: SQL/MED
Management of External Data Foreign Servers, Foreign-Data Wrappers,
Foreign Tables

SQL-aware vs non-SQL-aware
Datalinks Merge with 20606-2 Distribution Schema

About 500 pages

S J Cannan

Part 10: SQL/OLB



Object Language Bindings
Embedded SQL in Java SQLJ Part 0

About 360 pages

10

S J Cannan

Part 11: SQL/Schemata


Information Schema Definition Schema About 300 pages

11

S J Cannan

Part 13: SQL/JRT


Java Routines and Types Java routines stored in an SQL database, and
invoked from SQL statements

Java classes used as data types of SQL


columns

About 200 pages

12

S J Cannan

Part 14: SQL/XML


Under construction Using SQL and XML together About 150 pages, but still growing

13

S J Cannan

ISO/IEC 20206
20606-1 Authorization and Audit 20606-3 Encompassing Transaction

14

S J Cannan

Project 20206 1 Authorization and Audit


Addresses issues of:
Authentication Access Control Transfer Integrity Transfer Confidentiality Non-repudiation

No Working Draft available CD expected 2002-06-01 John Hadjioannou listed as Editor but not present
15 S J Cannan

Project 20206 3 Encompassing Transaction


Interacts with external Transaction Managers Support for xa_ interface
9075-6 XA Specialization (SQL/Transaction)

Propagates Global Context information


No Working Draft available CD expected 2002-06-01 Bob Sunday listed as Editor but has withdrawn

16

S J Cannan

Recap of recent changes: ISO/IEC 9075 SQL:2003


Since 1999 we have already added:
Part 9 SQL/MED Part 10 SQL/OLB Part 13 SQL/JRT Addendum 1 SQL/OLAP

SQL:2003 will be the 4th generation


1987, 1992, 1999

Consists of:
Many error corrections (TC stands at 376 pages and growing) A few new features Part 14 SQL/XML
17 S J Cannan

New Features

Data types
Predicates Semantics Security Active Database

18

S J Cannan

New Data Types


BIGINT MULTISET

19

S J Cannan

New Data Types BIGINT


Exact numeric, scale 0 Precision of BIGINT precision of INTEGER Must have same radix as SMALLINT and
INTEGER

20

S J Cannan

New Data Types MULTISET


Varying-length, unordered collections of element

having specified type COL1 INTEGER MULTISET COL2 ROW( F1 BIGINT, F2 VARHCAR(4000) ) MULTISET

No (specified) maximum cardinality SQL:2003 collection types include:


MULTISET ARRAY

21

S J Cannan

New Data Types MULTISET


INTEGER MULTISET()
Empty multiset with integer element type (not null!)

INTEGER MULTISET(2, 3, 5, 7)
Integer multiset with first few primes

INTEGER MULTISET(SELECT COL1


FROM TBL1 WHERE COL2 > 10)
Integer multiset populated from values in column of table

22

S J Cannan

New Data Types MULTISET


CARDINALITY (value1)
Type of value1 must be multiset Returns number of elements in value

SET (value1)
Type of value1 must be multiset Returns value1 with duplicate elements removed

ELEMENT (value1)
Type of value1 must be multiset
Cardinality of value1 must be 1 Returns the single element in value1
23 S J Cannan

New Data Types MULTISET


UNNEST(value1) AS corr-name
Type of value1 must be multiset
Un-nests value1 and turns the elements into rows of a virtual table UNNEST MULTISET (2, 3, 5, 7) AS P P 2

3 5 7
24 S J Cannan

New Data Types MULTISET


value1 MULTISET setop
quantifier value2
setop UNION or EXCEPT or INTERSECT quantifier ALL or DISTINCT

SELECT col1

MULTISET INTERSECT DISTINCT col2 FROM tbl1 WHERE CARDINALITY(col2) > 50 Close analogs to ordinary set operators
UNION, EXCEPT, and INTERSECT

However, ALL is the default quantifier


25 S J Cannan

New Data Types MULTISET


New aggregates:
COLLECT Transform the values in a group into a multiset
FUSION Form a union of the multisets in a group number of duplicates of a given value in the result is the sum of the number of duplicates in the multisets in the rows of the group INTERSECTION Form an intersection of the multisets in a group number of duplicates of a given value in the result is the minimum of the number of duplicates in the multisets in the rows of the group

26

S J Cannan

New Predicates
NORMALIZED
If character string argument is in Unicode Normalization Form C (NFC), returns true

MEMBER
If first argument is a member of multiset in second argument, returns true

SUBMULTISET
If first argument is a submultiset of second argument, returns true

SET
If argument is a set (no duplicate values), returns true

27

S J Cannan

NORMALIZED Predicate
string_value IS [ NOT ] NORMALIZED Character repertoire of string_value must
be UCS

Encourages, but does not force, normalization


to NFC

28

S J Cannan

MEMBER Predicate
value [ NOT ] MEMBER [ OF ]
multiset

multiset must be a multiset Type of value comparable to element type of


multiset

If multiset is empty, returns false If value is equal to some element of multiset,

returns true Else, if some element of multiset is null, returns unknown

29

S J Cannan

SUBMULTISET Predicate
multiset1 [ NOT ] SUBMULTISET
[ OF ] multiset2

Both multiset1 and multiset2 must be


multisets

and their element types must be comparable If the cardinalities of multiset1 and
multiset2 are equal and if every value in multiset1 has a corresponding value in multiset2, then returns true
30 S J Cannan

SET Predicate
multiset IS [ NOT ] A SET multiset must be a multiset If there are no duplicate values in multiset,
returns true

Maximum of 1 null value in a set

31

S J Cannan

New Semantics

MERGE statement
OLAP: TABLESAMPLE Generated columns Identity columns and sequence generators

32

S J Cannan

New Semantics MERGE


If some row in the target table matches
some row in the source table according to the specified predicate, then the row in the subject table is updated

If no row in the target table matches a given


row in the source table according to the specified predicate, then the row from the source table is inserted into the subject table
33 S J Cannan

New Semantics MERGE


MERGE INTO table-name [ AS correlation ]
USING table-reference ON search-condition WHEN MATCHED THEN SET col = value MERGE INTO table-name [ AS correlation ] USING table-reference ON search-condition WHEN NOT MATCHED THEN INSERT [ ( col-list ) ] VALUES ( val-list ) Both MATCHED and NOT MATCHED clauses permitted once each, in either order

34

S J Cannan

TABLESAMPLE
New feature in the OLAP capability Permits evaluation of aggregates on samples
derived from database data

Permits faster debugging when database is huge Two forms of sampling: BERNOULLI and
SYSTEM

35

S J Cannan

TABLESAMPLE
TABLESAMPLE method ( percentage )

[ repeatable ] method is either BERNOULLI or SYSTEM


Bernoulli: Sample table contains approximately percentage % of the rows of the original table; the probability of a given row of the original table appearing in the sample table is percentage %, independently of every other row System: Sample table contains approximately percentage % of the rows of the original table; the probability of a given row of the original table appearing in the sample table is percentage %

repeatable acts as a random number seed for


sampling
36 S J Cannan

Generated Columns
Ordinary columns of base tables: base columns Generated column: value computed from values of zero

or more base columns in same row CREATE TABLE EMPLOYEES ( EMP_ID INTEGER, SALARY DECIMAL(7,2), BONUS DECIMAL(7,2), TOTAL_COMP GENERATED ALWAYS AS ( SALARY + BONUS ), HR_CLERK GENERATED ALWAYS AS ( CURRENT_USER ) )
S J Cannan

37

Identity columns & Sequence generators


Identity column: Invented as mechanism to
allow automatic population of table keys

Sequence generator: Invented to allow


generation of next value in a sequence

Together, they allow generation of keys for


unique columns (or combinations of columns) when required

Application chooses: select value for identity


column or let sequence generator make choice
38 S J Cannan

Sequence generators
Sequence generators
Data type (exact numeric with scale zero) Starting value

Increment (positive: ascending, negative: descending)


Minimum and maximum values Cycle option External (explicit schema object) or internal (part of another schema object, such a column)
39 S J Cannan

External sequence generators


CREATE SEQUENCE seqname AS type
START WITH value INCREMENT BY value MAXVALUE value CYCLE

Common sequence generator options

Variations:
NO CYCLE NO MAXVALUE, MINVALUE, NO MINVALUE Order of clauses can vary
40 S J Cannan

Sequence generators
Every sequence generator has a current base

value initially set to the start value Generate next value of a sequence generator: NEXT VALUE FOR seqname some N 0

Returns current base value + N * increment for


If computed value > MAXVALUE (or < MINVALUE) and NO CYCLE, then raise an exception Otherwise, reset to MINVALUE (or MAXVALUE) and compute new value for some N.
41 S J Cannan

Sequence generators
INSERT INTO TBL ( COL1, COL2 )
VALUES ( 10, NEXT VALUE FOR seqgen )

CALL myproc ( NEXT VALUE FOR seqgen ) SET J = J + NEXT VALUE FOR seqgen

42

S J Cannan

Sequence generators
ALTER SEQUENCE seqname
RESTART WITH newbase resets start of computation Common sequence generator options change start value, maximum or minimum, increment, cycle option

DROP SEQUENCE seqname dropoption

43

S J Cannan

Identity columns & Sequence generators


Base tables may optionally have a single identity
column

Data type: exact numeric, scale 0 e.g.,


INTEGER

Associated with internal sequence generator


Start value

Minimum and/or maximum value


Increment value Cycle option
44 S J Cannan

Identity columns & Sequence generators


GENERATED ALWAYS or GENERATED BY
DEFAULT
ALWAYS Not allowed to UPDATE column; INSERT requires OVERRIDING SYSTEM VALUE BY DEFAULT Allowed to INSERT or UPDATE column; if column not specified on INSERT, then value automatically generated

45

S J Cannan

Identity columns & Sequence generators


CREATE TABLE employees (
EMP_ID INTEGER GENERATED ALWAYS AS IDENTITY START WITH 100 INCREMENT 1 MINVALUE 10 NO MAXVALUE NO CYCLE, SALARY DECIMAL(7,2), ..., )
46 S J Cannan

New part SQL/XML


Out for FCD ballot Known not to be complete Current contents comprise infrastructure
mappings of
SQL identifiers to & from XML QNames SQL types & values onto XML schema types & values SQL tables onto XML schema types Publishing functions (SQL data expressed as XML) New built-in type for SQL: XML
47 S J Cannan

To be added:
Inserting XML into an SQL database
Transform character string containing XML text into an

XML value INSERT INTO employees ( empxml ) VALUES XMLPARSE ( '<emp id="15339"> <name>Gurney Halleck</name> <hiredate>3998-04-25</hiredate> <dept>Security</dept></emp>' STRIP WHITESPACE )

No concrete proposal yet (still working out issues)

48

S J Cannan

To be added:
Inserting XML into an SQL database
XMLPARSE does not validate But it does:
Syntax check all in-line DTDs

Process entity references defined in such DTDs


Apply default values defined in such DTDs

Whitespace handling not finalized:


STRIP and
PRESERVE are possible choices
49 S J Cannan

To be added:
Validating XML in an SQL context
Validate an XML value against an XML schema

(possibly a DTD) INSERT INTO employees ( empxml ) VALUES XMLVALIDATE ( XMLPARSE ( '<emp id="15339"><name>Gurney Halleck</name><hiredate>3998-04-25 </hiredate><dept>Security</dept> </emp>' STRIP WHITESPACE ) )

No concrete proposal yet (still working out issues)

50

S J Cannan

To be added:
Validating XML in an SQL context
Many remaining problems:
How to reference DTDs and schemas Security problems of referencing arbitrary DTDs and schemas Namespace considerations

xsi:schemaLocation and

xsi:noNameSpaceSchemaLocation might identify schemas outside the database a solution might involve caching schemas inside the database
S J Cannan

51

To be added:
Retrieving XML from an SQL database
Transform an XML value into a character string
containing XML text SELECT XMLSERIALIZE ( e.empxml ) INTO :hostvar FROM emp AS e WHERE ...

No concrete proposal yet (still working out issues)

52

S J Cannan

To be added:
Retrieving XML from an SQL database
Numerous problems not yet resolved:
Precise SQL type of the function result (CHARACTER, CHARACTER VARYING, CHARACTER LARGE OBJECT)

Actual character encoding (UTF8, UTF16, other) and SQLs specified character set
Specified character encoding in prologue

DTD or schema inference? Creation?

53

S J Cannan

Summary
ISO/IEC 9075
Consolidation of changes since 1999 Correction of errors New Part (SQL/XML)

ISO/IEC 20206
Area of concern Looking for:
National Body participation Editor(s)

54

S J Cannan

Questions?

55

S J Cannan