RDBMS

Compiled by: Watsh Rajneesh, Software Engineer @ Quark (R&D Labs)
4/4/2002
wrajneesh@bigfoot.com

Disclaimer

The following contents have been picked up from the material mentioned in the References section below. They have not been written by me, only compiled by me into one place to make these notes. There were no copyright issues at the time this document was created. If in future the original authors have any grievances of any kind pertaining to this document, I assure them that I will take corrective measures (which in the worst case could mean removing this document from public availability on my site).

References

1. http://www.db.cs.ucdavis.edu/teaching/sqltutorial/ -- A very nice tutorial on Oracle/SQL. Good coverage of SQL; it also discusses the architecture of Oracle and PL/SQL (procedures, functions, embedded SQL, triggers).
2. http://www.arsdigita.com/books/sql/index.html -- Another good reference on Oracle, from a Prof @mit.edu. Its examples are taken from the real world, and it discusses aspects like Database Modelling, Data Warehousing, Transactions, and Database Tuning, with case studies from the author's real-world experience in building a threaded discussion forum. It's a very practical reference.
3. Oracle 9i Database Concepts -- Reference for the architecture of Oracle 9i, creating Clusters/Indexes, etc.

Contents

0. Definitions
1. Data Types in Oracle 8i
2. Basics of Data Modelling
   2.1 Tables
   2.2 Constraints
   2.3 Creating more elaborate constraints with triggers
   2.4 Examples on data modelling
       2.4.1 Case Study I: A threaded discussion forum data modelling
       2.4.2 Case Study II: Modelling the data for Web Site core content
3. Simple Queries from one table
   3.1 Subqueries
       3.1.1 Correlated Subqueries
   3.2 Joins
       3.2.1 Self Equi Join
       3.2.2 Outer Joins
   3.3 Extending a simple query into a join
4. Complex Queries
5. Transactions
   5.1 Atomicity
   5.2 Consistency
   5.3 Mutual Exclusion
   5.4 Unique number generation techniques for use with primary keys
6. Triggers
7. Views
   7.1 Protecting privacy with views
   7.2 Materialized Views
8. Procedural programming in Oracle
   8.1 PL/SQL
   8.2 Functions
   8.3 Stored Procedures
   8.4 Triggers revisited
9. Embedded SQL (Pro*C)
10. Oracle 9i System Architecture

These notes comply with Oracle8i and 7.3 SQL syntax.

0. Definitions
0.1) Tables: In relational database systems, data are represented using tables (relations). The data are stored as records (also called rows or tuples) in the table. The data attributes constitute the columns of the table. The structure of a table, or relation schema, is defined by these column attributes.

0.2) Database Schema: A database schema is a set of relation schemas. The extension of a database schema at database runtime is called a database instance or database, for short.

0.3) SQL: SQL stands for "Structured Query Language". This language allows us to pose complex questions of a database. It also provides a means of creating databases. SQL works with relational databases.

0.4) RDBMS Vs. ODBMS: The critical difference between RDBMS and ODBMS is the extent to which the programmer is constrained in interacting with the data. With an RDBMS the application program (written in a procedural language such as C, COBOL, Fortran, Perl, or Tcl) can have all kinds of catastrophic bugs. However, these bugs generally won't affect the information in the database because all communication with the RDBMS is constrained through SQL statements. With an ODBMS, the application program is directly writing slots in objects stored in the database. A bug in the application program may translate directly into corruption of the database, one of an organization's most valuable assets.

With an object-relational database, you get to define your own data types. For example, you could define a data type called url... If you really want to be on the cutting edge, you can use a bona fide object database, like Object Design's ObjectStore (http://clickthrough.photo.net/ct/philg/wtr/thebook/databaseschoosing.html?send_to=http://www.odi.com). These persistently store the sorts of object and pointer structures that you create in a Smalltalk, Common Lisp, C++, or Java program. Chasing pointers and certain kinds of transactions can be 10 to 100 times faster than in a relational database.

0.5) ACID Properties of RDBMS: Data processing folks like to talk about the "ACID test" when deciding whether or not a database management system is adequate for handling transactions. An adequate system has the following properties:

Atomicity: Results of a transaction's execution are either all committed or all rolled back. All changes take effect, or none do. That means, for Joe User's money transfer, that both his savings and checking balances are adjusted or neither is.

Consistency: The database is transformed from one valid state to another valid state. This defines a transaction as legal only if it obeys user-defined integrity constraints. Illegal transactions aren't allowed and, if an integrity constraint can't be satisfied, the transaction is rolled back. For example, suppose that you define a rule that, after a transfer of more than $10,000 out of the country, a row is added to an audit table so that you can prepare a legally required report for the IRS. Perhaps for performance reasons that audit table is stored on a separate disk from the rest of the database. If the audit table's disk is off-line and can't be written, the transaction is aborted.

Isolation: The results of a transaction are invisible to other transactions until the transaction is complete. For example, if you are running an accounting report at the same time that Joe is transferring money, the accounting report program will either see the balances before Joe transferred the money or after, but never the intermediate state where checking has been credited but savings not yet debited.

Durability: Once committed (completed), the results of a transaction are permanent and survive future system and media failures. If the airline reservation system computer gives you seat 22A and crashes a millisecond later, it won't have forgotten that you are sitting in 22A and also give it to someone else. Furthermore, if a programmer spills coffee into a disk drive, it will be possible to install a new disk and recover the transactions up to the coffee spill, showing that you had seat 22A.
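As a minimal sketch of what atomicity looks like in practice, Joe's money transfer could be written as two updates inside a single transaction. The accounts table, its columns, and the sample balances are hypothetical names introduced here purely for illustration; they do not come from the referenced material.

```sql
-- Hypothetical table, for illustration only
create table accounts (
    account_id   integer primary key,
    account_type varchar2(10) not null,        -- 'savings' or 'checking'
    balance      number(12,2) not null
                 check (balance >= 0)          -- consistency rule: no overdrafts
);

insert into accounts values (1, 'savings', 5000);
insert into accounts values (2, 'checking', 200);
commit;

-- The transfer: both updates belong to the same transaction
update accounts set balance = balance - 1000 where account_id = 1;
update accounts set balance = balance + 1000 where account_id = 2;

-- Until this point other sessions still see the old balances (isolation).
commit;   -- both changes become permanent together (atomicity, durability)

-- If either update had failed, the whole transfer could be undone with:
-- rollback;
```

If the first update had violated the check constraint, Oracle would have rejected that statement and the application could issue a rollback, leaving both balances untouched.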

1. Data Types in Oracle 8i
For each column that you define for a table, you must specify the data type of that column. Here are your options:
o Character Data

char(n): A fixed-length character string, e.g., char(200) will take up 200 bytes regardless of how long the string actually is. This works well when the data truly are of fixed size, e.g., when you are recording a user's sex as "m" or "f". This works badly when the data are of variable length. Not only does it waste space on the disk and in the memory cache, but it makes comparisons fail. For example, suppose you insert "rating" into a comment_type column of type char(30) and then your Tcl program queries the database. Oracle sends this column value back to procedural language clients padded with enough spaces to make up 30 total characters. Thus if you have a comparison within Tcl of whether $comment_type == "rating", the comparison will fail because $comment_type is actually "rating" followed by 24 spaces. The maximum length char in Oracle8 is 2000 bytes.

varchar(n): A variable-length character string, up to 4000 bytes long in Oracle8. These are stored in such a way as to minimize disk space usage, i.e., if you only put one character into a column of type varchar(4000), Oracle only consumes two bytes on disk. The reason that you don't just make all the columns varchar(4000) is that the Oracle indexing system is limited to indexing keys of about 700 bytes.

clob: A variable-length character string, up to 4 gigabytes long in Oracle8. The CLOB data type is useful for accepting user input from such applications as discussion forums. Sadly, Oracle8 has tremendous limitations on how CLOB data may be inserted, modified, and queried. Use varchar(4000) if you can and prepare to suffer if you can't. In a spectacular demonstration of what happens when companies don't follow the lessons of The Mythical Man Month, the regular string functions don't work on CLOBs. You need to call identically named functions in the DBMS_LOB package. These functions take the same arguments but in different orders. You'll never be able to write a working line of code without first reading the DBMS_LOB section of the Oracle8 Server Application Developer's Guide.

nchar, nvarchar, nclob: The n prefix stands for "national character set". These work like char, varchar, and clob but for multi-byte characters (e.g., Unicode; see http://www.unicode.org/).
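The padding behaviour of char can be checked directly from SQL. The following is a small sketch; the table comment_demo and its column names are made up for this illustration.

```sql
-- Hypothetical demo table: the same string stored as char and as varchar2
create table comment_demo (
    fixed_type char(30),
    var_type   varchar2(30)
);

insert into comment_demo values ('rating', 'rating');

-- The char column is blank-padded to its declared length;
-- the varchar2 column keeps only the six characters supplied.
select length(fixed_type) as fixed_len,
       length(var_type)   as var_len
from comment_demo;

-- One way to protect a client program from the padding is to trim it
-- in the query itself, so that the client receives plain 'rating'.
select rtrim(fixed_type) as comment_type
from comment_demo;
```

Trimming in the query is only a workaround; declaring the column as a variable-length type avoids the problem at its source.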
o Numeric Data

number: Oracle actually only has one internal data type that is used for storing numbers. It can handle 38 digits of precision and exponents from -130 to +126. If you want to get fancy, you can specify precision and scale limits. For example, number(3,0) says "round everything to an integer [scale 0] and accept numbers that range from -999 to +999". If you're American and commercially minded, number(9,2) will probably work well for storing prices in dollars and cents (unless you're selling stuff to Bill Gates, in which case the limit of just under ten million dollars imposed by the precision of 9 might prove constraining). If you are Bill Gates, you might not want to get distracted by insignificant numbers: tell Oracle to round everything to the nearest million with number(38,-6).

integer: In terms of storage consumed and behavior, this is not any different from number(38), but I think it reads better and it is more in line with ANSI SQL (which would be a standard if anyone actually implemented it).
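To make precision and scale concrete, here is a short sketch using a made-up table, number_demo, with one column for each of the declarations discussed above. The table and column names are assumptions for illustration only.

```sql
-- Hypothetical demo table
create table number_demo (
    small_int number(3,0),     -- integers from -999 to +999
    price     number(9,2),     -- at most 7 digits before the decimal point, 2 after
    millions  number(38,-6)    -- rounded to the nearest million
);

-- Values are rounded to the declared scale on the way in:
-- 12.7 is stored as 13, 19.999 as 20, and 1234567890 as 1235000000.
insert into number_demo values (12.7, 19.999, 1234567890);

select small_int, price, millions from number_demo;

-- A value that needs more digits than the precision allows is rejected
-- with ORA-01438 (value larger than specified precision allowed for this column).
insert into number_demo (small_int) values (1000);
```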
o Dates and Date/Time Intervals

date: A point in time, recorded with one-second precision, between January 1, 4712 BC and December 31, 4712 AD. You can put in values with the to_date function and query them out using the to_char function. If you don't use these functions, you're limited to specifying the date with the default system format mask, usually 'DD-MON-YY'. This is a good recipe for a Year 2000 bug since January 23, 2000 would be '23-JAN-00'. On ArsDigita-maintained systems, we reset Oracle's default to the ANSI default: 'YYYY-MM-DD', e.g., '2000-01-23' for January 23, 2000.

number: Hey, isn't this a typo? What's number doing in the date section? It is here because this is how Oracle represents date-time intervals, though their docs never say this explicitly. If you add numbers to dates, you get new dates. For example, tomorrow at exactly this time is sysdate+1. To query for stuff submitted in the last hour, you limit to submitted_date > sysdate - 1/24.
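The date functions and date arithmetic described above can be tried out against Oracle's one-row dual table. This is only a sketch; the format masks shown are one reasonable choice among many.

```sql
-- Build a date from text with an explicit format mask (no reliance on the default)
select to_date('2000-01-23', 'YYYY-MM-DD') as a_date from dual;

-- Render the current date and time as text
select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') as right_now from dual;

-- Date arithmetic: numbers are interpreted as days
select to_char(sysdate + 1,    'YYYY-MM-DD HH24:MI:SS') as tomorrow_same_time,
       to_char(sysdate - 1/24, 'YYYY-MM-DD HH24:MI:SS') as one_hour_ago
from dual;

-- Change the default format mask for the current session only
alter session set nls_date_format = 'YYYY-MM-DD';
select sysdate from dual;
```

Using an explicit mask in to_date and to_char keeps a program independent of whatever default the database administrator has configured.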
o Binary Data

blob: BLOB stands for "Binary Large OBject". It doesn't really have to be all that large, though Oracle will let you store up to 4 GB. The BLOB data type was set up to permit the storage of images, sound recordings, and other inherently binary data. In practice, it also gets used by fraudulent application software vendors. They spend a few years kludging together some nasty format of their own. Their MBA executive customers demand that the whole thing be RDBMS-based. The software vendor learns enough about Oracle to "stuff everything into a BLOB". Then all the marketing and sales folks are happy because the application is now running from Oracle instead of from the file system. Sadly, the programmers and users don't get much because you can't use SQL very effectively to query or update what's inside a BLOB.

bfile: A binary file, stored by the operating system (typically Unix) and kept track of by Oracle. These would be useful when you need to get to information both from SQL (which is kept purposefully ignorant about what goes on in the wider world) and from an application that can only read from standard files (e.g., a typical Web server). The bfile data type is pretty new but to my mind it is already obsolete: Oracle 8.1 (8i) lets external applications view content in the database as though it were a file on a Windows NT server. So why not keep everything as a BLOB and enable Oracle's Internet File System?

Note: In Oracle-SQL there is no data type boolean. It can, however, be simulated by using either char(1) or number(1).

2. Basics of Data Modelling
In this section and the following ones, I have collected many differing examples of SQL queries, starting with the basic ones and graduating to the most complex ones. A relational database stores data in tables (relations). A database is a collection of tables. A table consists of a list of records; each record in a table has the same structure, with a fixed number of "fields" of a given type. So in an RDBMS you model your data in terms of tables.

Tables

    create table <table> (
        <column 1> <data type> [not null] [unique] [<column constraint>],
        ...
        <column n> <data type> [not null] [unique] [<column constraint>],
        [<table constraint(s)>]
    );

The keyword unique specifies that no two tuples can have the same attribute value for this column. Unless the condition not null is also specified for this column, the attribute value null is allowed and two tuples having the attribute value null for this column do not violate the constraint.

Example: The create table statement for our EMP table has the form

    create table EMP (
        EMPNO    number(4) not null,
        ENAME    varchar2(30) not null,
        JOB      varchar2(10),
        MGR      number(4),
        HIREDATE date,
        SAL      number(7,2),
        DEPTNO   number(2)
    );

Remark: Except for the columns EMPNO and ENAME, null values are allowed.

    CREATE TABLE your_table_name (
        the_key_column      key_data_type PRIMARY KEY,
        a_regular_column    a_data_type,
        an_important_column a_data_type NOT NULL,
        ... up to 996 intervening columns in Oracle8 ...
        the_last_column     a_data_type
    );

Even in a simple example such as the one above, there are a few items worth noting. First, I like to define the key column(s) at the very top. Second, the primary key constraint has some powerful effects. It forces the_key_column to be non-null. It causes the creation of an index on the_key_column, which will slow down updates to your_table_name but improve the speed of access when someone queries for a row with a particular value of the_key_column. Oracle checks this index when inserting any new row and aborts the transaction if there is already a row with the same value for the_key_column. Third, note that there is no comma following the definition of the last column.

If you didn't get it right the first time, you'll probably want to

    alter table your_table_name add (new_column_name a_data_type any_constraints);

or

    alter table your_table_name modify (existing_column_name new_data_type new_constraints);

In Oracle 8i you can drop a column:

    alter table your_table_name drop column existing_column_name;

If you're still in the prototype stage, you'll probably find it easier to simply

    drop table your_table_name;

and recreate it. At any time, you can see what you've got defined in the database by querying Oracle's Data Dictionary:

    SQL> select table_name from user_tables order by table_name;

after which you will typically type describe table_name_of_interest in SQL*Plus:

    SQL> describe users;

Note that Oracle displays its internal data types rather than the ones you've given, e.g., number(38) rather than integer and varchar2 instead of the specified varchar.
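The data dictionary can also report column-level detail, which is handy after an alter table. The sketch below assumes the EMP table shown earlier exists in your schema; the added column COMM is a hypothetical name used only to demonstrate the statement.

```sql
-- Add a hypothetical column to EMP
alter table EMP add (COMM number(7,2));

-- List the columns of EMP as Oracle has recorded them.
-- Table names are stored in upper case in the dictionary.
select column_name, data_type, data_length, nullable
from user_tab_columns
where table_name = 'EMP'
order by column_id;

-- Remove the column again (Oracle 8i and later)
alter table EMP drop column COMM;
```

The user_tab_columns view covers only tables owned by the current user, in the same way that user_tables does.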

Constraints
The specification of a (simple) constraint has the following form:

    [constraint <name>] primary key | unique | not null

A constraint can be named. It is advisable to name a constraint in order to get more meaningful information when this constraint is violated due to, e.g., an insertion of a tuple that violates the constraint. If no name is specified for the constraint, Oracle automatically generates a name of the pattern SYS_C<number>.

The two simplest types of constraints have already been discussed: not null and unique. Probably the most important type of integrity constraint in a database is the primary key constraint. A primary key constraint enables a unique identification of each tuple in a table. Based on a primary key, the database system ensures that no duplicates appear in a table. Note that, in contrast to a unique constraint, null values are not allowed. For example, for our EMP table, the specification

    create table EMP (
        EMPNO number(4) constraint pk_emp primary key,
        ...
    );

defines the attribute EMPNO as the primary key of the table and gives the constraint the name pk_emp.
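Constraint names, whether chosen by you or generated by Oracle, can be looked up in the data dictionary. A brief sketch, assuming a table named EMP exists in your own schema:

```sql
-- List the constraints defined on EMP.
-- constraint_type is a one-letter code: P = primary key, U = unique,
-- C = check (not null constraints are recorded as checks), R = foreign key.
select constraint_name, constraint_type, search_condition
from user_constraints
where table_name = 'EMP'
order by constraint_name;
```

Constraints that were left unnamed show up here under their generated SYS_C names, which is a quick way to find out which rule an otherwise cryptic error message refers to.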

Example: We want to create a table called PROJECT to store information about projects. For each project, we want to store the number and the name of the project, the employee number of the project's manager, the budget and the number of persons working on the project, and the start date and end date of the project. Furthermore, we have the following conditions:
- a project is identified by its project number,
- the name of a project must be unique,
- the manager and the budget must be defined.

Table definition:

    create table PROJECT (
        PNO     number(3) constraint prj_pk primary key,
        PNAME   varchar2(60) unique,
        PMGR    number(4) not null,
        PERSONS number(5),
        BUDGET  number(8,2) not null,
        PSTART  date,
        PEND    date
    );

A unique constraint can include more than one attribute. In this case the pattern unique(<column i>, ..., <column j>) is used. If it is required, for example, that no two projects have the same start and end date, we have to add the table constraint

    constraint no_same_dates unique(PEND, PSTART)

This constraint has to be defined in the create table command after both columns PEND and PSTART have been defined. A primary key constraint that includes more than one column can be specified in an analogous way.

Instead of a not null constraint it is sometimes useful to specify a default value for an attribute if no value is given, e.g., when a tuple is inserted. For this, we use the default clause.

Example: If no start date is given when inserting a tuple into the table PROJECT, the project start date should be set to January 1st, 1995:

    PSTART date default('01-JAN-95')

Note: Unlike integrity constraints, it is not possible to specify a name for a default.

When you're defining a table, you can constrain single rows by adding some magic words after the data type:
• not null; requires a value for this column
• unique; two rows can't have the same value in this column (side effect in Oracle: creates an index)
• primary key; same as unique except that no row can have a null value for this column and other tables can refer to this column
• check; limit the range of values for column, e.g., rating integer check(rating > 0 and rating <= 10)
• references; this column can only contain values present in another table's primary key column, e.g., user_id not null references users in the bboard table forces the user_id column to only point to valid users. An interesting twist is that you don't have to give a data type for user_id; Oracle assigns this column to whatever data type the foreign key has (in this case integer).

Constraints can apply to multiple columns:

    create table static_page_authors (
        page_id  integer not null references static_pages,
        user_id  integer not null references users,
        notify_p char(1) default 't' check (notify_p in ('t','f')),
        unique(page_id,user_id)
    );

Oracle will let us keep rows that have the same page_id and rows that have the same user_id but not rows that have the same value in both columns (which would not make sense; a person can't be the author of a document more than once).

Suppose that you run a university distinguished lecture series. You want speakers who are professors at other universities or at least PhDs. On the other hand, if someone controls enough money, be it his own or his company's, he's in. Oracle stands ready:

    create table distinguished_lecturers (
        lecturer_id      integer primary key,
        name_and_title   varchar(100),
        personal_wealth  number,
        corporate_wealth number,
        check (instr(upper(name_and_title),'PHD') <> 0
               or instr(upper(name_and_title),'PROFESSOR') <> 0
               or (personal_wealth + corporate_wealth) > 1000000000)
    );

The simplest way to insert a tuple into a table is to use the insert statement

    insert into <table> [(<column i>, ..., <column j>)]
    values (<value i>, ..., <value j>);

For each of the listed columns, a corresponding (matching) value must be specified. Thus an insertion does not necessarily have to follow the order of the attributes as specified in the create table statement. If a column is omitted, the value null (or the column's default value, if one is defined) is inserted instead. If no column list is given, however, a value must be given for each column as defined in the create table statement.

Examples:

    insert into PROJECT(PNO, PNAME, PMGR, PERSONS, BUDGET, PSTART)
    values(313, 'DBS', 7411, 4, 150000.42, '10-OCT-94');

or

    insert into PROJECT
    values(313, 'DBS', 7411, null, 150000.42, '10-OCT-94', null);

Now continuing with our example above,

    insert into distinguished_lecturers
    values (1,'Professor Ellen Egghead',-10000,200000);
    1 row created.

    insert into distinguished_lecturers
    values (2,'Bill Gates, innovator',75000000000,18000000000);
    1 row created.

    insert into distinguished_lecturers
    values (3,'Joe Average',20000,0);
    ORA-02290: check constraint (PHOTONET.SYS_C001819) violated

As desired, Oracle prevented us from inserting some random average loser into the distinguished_lecturers table, but the error message was confusing in that it refers to a constraint given the name of "SYS_C001819" and owned by the PHOTONET user. We can give our constraint a name at definition time:

    create table distinguished_lecturers (
        lecturer_id      integer primary key,
        name_and_title   varchar(100),
        personal_wealth  number,
        corporate_wealth number,
        constraint ensure_truly_distinguished
            check (instr(upper(name_and_title),'PHD') <> 0
                   or instr(upper(name_and_title),'PROFESSOR') <> 0
                   or (personal_wealth + corporate_wealth) > 1000000000)
    );

Note: instr() is a built-in function that returns the position at which the pattern (arg2) first occurs in the string (arg1), or 0 if the pattern does not occur.

    insert into distinguished_lecturers
    values (3,'Joe Average',20000,0);
    ORA-02290: check constraint (PHOTONET.ENSURE_TRULY_DISTINGUISHED) violated

Now the error message is easier for application programmers to understand.

Creating More Elaborate Constraints with Triggers
The default Oracle mechanisms for constraining data are not always adequate. For example, the ArsDigita Community System auction module has a table called au_categories. The category_keyword column is a unique shorthand way of referring to a category in a URL. However, this column may be NULL because it is not the primary key to the table. The shorthand method of referring to the category is optional.

    create table au_categories (
        category_id      integer primary key,
        -- shorthand for referring to this category,
        -- e.g. "bridges", for use in URLs
        category_keyword varchar(30),
        -- human-readable name of this category,
        -- e.g. "All types of bridges"
        category_name    varchar(128) not null
    );

The original design did not put a UNIQUE constraint on the category_keyword column, on the assumption that this would allow the table to have only one row where category_keyword was NULL. (In Oracle a single-column UNIQUE constraint in fact permits any number of NULLs, as noted under Tables above, so a constraint would also work here; the trigger technique remains useful for rules that constraints cannot express.) So we add a trigger that can execute an arbitrary PL/SQL expression and raise an error to prevent an INSERT if necessary:

    create or replace trigger au_category_unique_tr
    before insert on au_categories
    for each row
    declare
        existing_count integer;
    begin
        select count(*) into existing_count
        from au_categories
        where category_keyword = :new.category_keyword;
        if existing_count > 0 then
            raise_application_error(-20010, 'Category keywords must be unique if used');
        end if;
    end;

This trigger queries the table to find out if there are any matching keywords already inserted. If there are, it calls the built-in Oracle procedure raise_application_error to abort the transaction.
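To see the trigger at work, one could run a few single-row inserts like the following. This is a sketch that assumes the au_categories table and the au_category_unique_tr trigger above have been created (in SQL*Plus, a PL/SQL block such as the trigger definition is submitted by typing a slash on a line of its own). The category names are invented sample data.

```sql
-- First use of the keyword: accepted
insert into au_categories values (1, 'bridges', 'All types of bridges');

-- Second use of the same keyword: the trigger raises
-- ORA-20010: Category keywords must be unique if used
insert into au_categories values (2, 'bridges', 'Suspension bridges');

-- Rows without a keyword are always accepted, because the comparison
-- category_keyword = :new.category_keyword is never true for NULL,
-- so the count inside the trigger stays at zero.
insert into au_categories values (3, null, 'Tunnels');
insert into au_categories values (4, null, 'Viaducts');

select category_id, category_keyword, category_name
from au_categories
order by category_id;
```

Note that the trigger fires only on insert, so an update that changes category_keyword to an existing value would not be caught by it.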

DML (Insert/Update/Delete)

If there are already some data in other tables, these data can be used for insertions into a new table. For this, we write a query whose result is a set of tuples to be inserted. Such an insert statement has the form

    insert into <table> [(<column i>, ..., <column j>)] <query>

Example: Suppose we have defined the following table:

    create table OLDEMP (
        ENO   number(4) not null,
        HDATE date
    );

We now can use the table EMP to insert tuples into this new relation:

    insert into OLDEMP (ENO, HDATE)
    select EMPNO, HIREDATE from EMP
    where HIREDATE < '31-DEC-60';

Update

For modifying attribute values of (some) tuples in a table, we use the update statement:

    update <table> set
        <column i> = <expression i>, ..., <column j> = <expression j>
    [where <condition>];

An expression consists of either a constant (new value), an arithmetic or string operation, or an SQL query. Note that the new value to assign to <column i> must have a matching data type. An update statement without a where clause results in changing the respective attributes of all tuples in the specified table. Typically, however, only a (small) portion of the table requires an update.

Examples:

The employee JONES is transferred to department 20 as a manager and his salary is increased by 1000:

    update EMP set
        JOB = 'MANAGER', DEPTNO = 20, SAL = SAL + 1000
    where ENAME = 'JONES';

All employees working in departments 10 and 30 get a 15% salary increase:

    update EMP set
        SAL = SAL * 1.15
    where DEPTNO in (10,30);

Analogous to the insert statement, other tables can be used to retrieve data that are used as new values. In such a case we have a <query> instead of an <expression>.

Example: All salesmen working in department 20 get the same salary as the manager who has the lowest salary among all managers.

    update EMP set
        SAL = (select min(SAL) from EMP where JOB = 'MANAGER')
    where JOB = 'SALESMAN' and DEPTNO = 20;

Explanation: The query retrieves the minimum salary of all managers. This value then is assigned to all salesmen working in department 20.

It is also possible to specify a query that retrieves more than one value (but still only one tuple!). In this case the set clause has the form set (<column i>, ..., <column j>) = <query>. It is important that the order of data types and values of the selected row exactly corresponds to the list of columns in the set clause.

Delete

All or selected tuples can be deleted from a table using the delete command:

    delete from <table> [where <condition>];

If the where clause is omitted, all tuples are deleted from the table. An alternative command for deleting all tuples from a table is the truncate table <table> command. However, in this case, the deletions cannot be undone.

Example: Delete all projects (tuples) that have been finished before the current date (system date):

    delete from PROJECT where PEND < sysdate;

sysdate is a function in SQL that returns the system date. Another important SQL function is user, which returns the name of the user logged into the current Oracle session.

Commit and Rollback

A sequence of database modifications, i.e., a sequence of insert, update, and delete statements, is called a transaction. Modifications of tuples are temporarily stored in the database system. They become permanent only after the statement commit; has been issued. As long as the user has not issued the commit statement, it is possible to undo all modifications since the last commit. To undo modifications, one has to issue the statement rollback;. It is advisable to complete each modification of the database with a commit (as long as the modification has the expected effect).

Note that any data definition command such as create table results in an internal commit. A commit is also implicitly executed when the user terminates an Oracle session.
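The multi-column form of the set clause and the effect of rollback can be tried together. The following sketch assumes the EMP table contains one employee named JONES and one named SMITH; SMITH is an assumed sample row, not something stated in the text above.

```sql
-- Give SMITH the same job and salary as JONES.
-- The subquery must return exactly one row, and its columns must line up
-- with the column list on the left-hand side of the set clause.
update EMP
set (JOB, SAL) = (select JOB, SAL from EMP where ENAME = 'JONES')
where ENAME = 'SMITH';

-- Within this session the change is visible immediately
select ENAME, JOB, SAL from EMP where ENAME in ('JONES', 'SMITH');

-- Nothing has been committed yet, so the update can still be undone
rollback;

-- SMITH's original job and salary are back
select ENAME, JOB, SAL from EMP where ENAME in ('JONES', 'SMITH');
```

Had a create table or another data definition command been issued between the update and the rollback, the update would already have been committed and the rollback would have had nothing left to undo.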

Examples on data modelling

Composite primary key: the primary key is made up of more than one field.
Foreign key: one (or more) fields from this table relate to the primary key of another table. Eg:

    create table t_holiday (
        yr      integer,
        country varchar(50),
        commnt  varchar(80),
        foreign key (country) references cia(name),
        primary key (yr, country)
    );

A foreign key should refer to a candidate key in some table. This is usually the primary key but may be a field (or list of fields) specified as UNIQUE. Eg:

    create table t_a (i integer primary key);
    create table t_b (j integer, foreign key (j) references t_a(i));

You may not drop a table if it is referenced by another table.

    drop table t_a;   -- Error!

The CASCADE CONSTRAINTS clause can be used to remove the references.

    drop table t_a cascade constraints;

Oracle accepts CREATE OR REPLACE for objects such as views, triggers, and procedures, but not for tables. To recreate a table, drop it first and then create it again. Eg:

    drop table t_holiday;
    create table t_holiday (a integer);

Users must be granted RESOURCE to create tables.

    grant resource to scott;

CAUTION! You may not use a reserved word as the name of a field. Many popular words are used by the system; some words to avoid: date, day, index, number, order, size, year. Eg:

    create table t_wrong (date date);   -- fails: date is a reserved word

Case study

Case Study I: A threaded discussion forum:

    create table bboard (
        msg_id       char(6) not null primary key,
        refers_to    char(6),
        email        varchar(200),
        name         varchar(200),
        one_line     varchar(700),
        message      clob,
        notify       char(1) default 'f' check (notify in ('t','f')),
        posting_time date,
        sort_key     varchar(600)
    );

Messages are uniquely keyed with msg_id, refer to each other (i.e., say "I'm a response to msg X") with refers_to, and a thread can be displayed conveniently by using the sort_key column. A column for the poster's IP address was added later:

    alter table bboard add (originating_ip varchar(16));

It became apparent that we needed ways to:
• display site history for users who had changed their email addresses
• discourage problem users from burdening the moderators and the community
• carefully tie together user-contributed content in the various subsystems

The solution was obvious to any experienced database nerd: a canonical users table and then content tables that reference it. Here's a simplified version of the data model, taken from the ArsDigita Community System:

    create table users (
        user_id     integer not null primary key,
        first_names varchar(100) not null,
        last_name   varchar(100) not null,
        email       varchar(100) not null unique,
        ...
    );

    create table bboard (
        msg_id         char(6) not null primary key,
        refers_to      char(6),
        topic          varchar(100) not null references bboard_topics,
        category       varchar(200),  -- only used for categorized Q&A forums
        originating_ip varchar(16),   -- stored as string, separated by periods
        user_id        integer not null references users,
        one_line       varchar(700),
        message        clob,
        -- html_p - is the message in html or not
        html_p         char(1) default 'f' check (html_p in ('t','f')),
        ...
    );

    create table classified_ads (
        classified_ad_id integer not null primary key,
        user_id          integer not null references users,
        ...
    );

Note that a contributor's name and email address no longer appear in the bboard table. That doesn't mean we don't know who posted a message. In fact, this data model can't even represent an anonymous posting: user_id integer not null references users requires that each posting be associated with a user ID and that there actually be a row in the users table with that ID.

Case Study II: Representing a Web Site's core content. This continues the case study above.

Requirements:
• We will need a table that holds the static pages themselves.
• Since there are potentially many comments per page, we need a separate table to hold the user-submitted comments.
• Since there are potentially many related links per page, we need a separate table to hold the user-submitted links.
• Since there are potentially many authors for one page, we need a separate table to register the author-page many-to-one relation.
• Considering the "help point readers to stuff that will interest them" objective, it seems that we need to store the category or categories under which a page falls. Since there are potentially many categories for one page, we need a separate table to hold the mapping between pages and categories.
    create table static_pages (
        page_id           integer not null primary key,
        url_stub          varchar(400) not null unique,
        original_author   integer references users(user_id),
        page_title        varchar(4000),
        page_body         clob,
        obsolete_p        char(1) default 'f' check (obsolete_p in ('t','f')),
        members_only_p    char(1) default 'f' check (members_only_p in ('t','f')),
        price             number,
        copyright_info    varchar(4000),
        accept_comments_p char(1) default 't' check (accept_comments_p in ('t','f')),
        accept_links_p    char(1) default 't' check (accept_links_p in ('t','f')),
        last_updated      date,
        -- used to prevent minor changes from looking like new content
        publish_date      date
    );

    create table static_page_authors (
        page_id  integer not null references static_pages,
        user_id  integer not null references users,
        notify_p char(1) default 't' check (notify_p in ('t','f')),
        unique(page_id,user_id)
    );

Note that we use a generated integer page_id key for the static_pages table. A good way to generate such keys is Oracle's built-in sequence generation facility:

    create sequence page_id_sequence start with 1;

Then we can get new page IDs by using page_id_sequence.nextval in INSERT statements.

3. Simple Queries from One Table
    select [distinct] <column(s)>
    from <table>
    [ where <condition> ]
    [ order by <column(s)> [asc|desc] ];

A simple query from one table has the following structure:
• the select list (which columns in our report)
• the name of the table
• the where clauses (which rows we want to see)
• the order by clauses (how we want the rows arranged)

Instead of an attribute name, the select clause may also contain arithmetic expressions involving arithmetic operators:

    select ENAME, DEPTNO, SAL * 1.55 from EMP;

For the different data types supported in Oracle, several operators and functions are provided:
• for numbers: abs, cos, sin, exp, log, power, mod, sqrt, +, -, *, /, ...
• for strings: chr, concat(string1, string2), lower, upper, replace(string, search string, replacement string), translate, substr(string, m, n), length, to_date, ...
• for the date data type: add_months, months_between, next_day, to_char, ...

By default, duplicate rows are not eliminated from a query result. Inserting the keyword distinct after the keyword select forces the elimination of duplicates.

It is also possible to specify a sorting order in which the result tuples of a query are displayed. For this the order by clause is used, which takes one or more attributes listed in the select clause as parameters. desc specifies a descending order and asc specifies an ascending order (this is also the default order). For example, the query

    select ENAME, DEPTNO, HIREDATE
    from EMP
    order by DEPTNO [asc], HIREDATE desc;

displays the result in ascending order of DEPTNO. Tuples with the same DEPTNO are sorted in descending order of HIREDATE.

Example: List the job title and the salary of those employees whose manager has the number 7698 or 7566 and who earn more than 1500:

    select JOB, SAL
    from EMP
    where (MGR = 7698 or MGR = 7566) and SAL > 1500;

For all data types, the comparison operators =, != or <>, <, >, <=, >= are allowed in the conditions of a where clause. Further comparison operators are:

• Set conditions: <column> [not] in (<list of values>)
  Example: select * from DEPT where DEPTNO in (20,30);

• Null value: <column> is [not] null, i.e., for a tuple to be selected there must (not) exist a defined value for this column.
  Example: select * from EMP where MGR is not null;
  Note: the operations = null and != null are not defined!

• Domain conditions: <column> [not] between <lower bound> and <upper bound>
  Example: select EMPNO, ENAME, SAL from EMP where SAL between 1500 and 2500;
  Example: select ENAME from EMP where HIREDATE between '02-APR-81' and '08-SEP-81';
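The note about = null can be checked with a small experiment. The sketch below is not Oracle: it assumes Python's built-in sqlite3 module as a stand-in, and the three EMP rows are made-up sample data. The behaviour shown (a comparison with null never evaluates to true) is standard SQL and should carry over to Oracle.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("create table EMP (EMPNO integer primary key, ENAME text, MGR integer)")
conn.executemany(
    "insert into EMP values (?, ?, ?)",
    [(7839, "KING", None), (7698, "BLAKE", 7839), (7499, "ALLEN", 7698)],
)

# '= null' is never true, so this query selects nothing, not even KING.
with_equals = conn.execute("select ENAME from EMP where MGR = null").fetchall()

# 'is null' is the correct test for a missing value.
with_is_null = conn.execute("select ENAME from EMP where MGR is null").fetchall()

# 'is not null' selects the employees that do have a manager.
with_is_not_null = conn.execute(
    "select ENAME from EMP where MGR is not null order by ENAME"
).fetchall()

print(with_equals, with_is_null, with_is_not_null)
```

The practical consequence is that a where clause written with = null silently returns no rows instead of raising an error.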

String Operations: A powerful operator for pattern matching is the like operator. Together with this operator, two special characters are used: the percent sign % (also called wild card), and the underline _ (also called position marker).

    SQL> select email from users where email like '%mit.edu';

The email like '%mit.edu' says "every row where the email column ends in 'mit.edu'". The percent sign is Oracle's wildcard character for "zero or more characters". Underscore is the wildcard for "exactly one character":

    SQL> select email from users where email like '___@mit.edu';

Suppose that you were featured on Yahoo in September 1998 and want to see how many users signed up during that month:

    SQL> select count(*) from users
         where registration_date >= '1998-09-01'
         and registration_date < '1998-10-01';

      COUNT(*)
    ----------
           920

We've combined two restrictions in the WHERE clause with an AND. OR and NOT are also available within the WHERE clause. For example, the following query will tell us how many classified ads we have that either have no expiration date or whose expiration date is later than the current date/time.

    SQL> select count(*) from classified_ads
         where expires >= sysdate or expires is null;

Further string operations are:

• upper(<string>) takes a string and converts any letters in it to uppercase, e.g., DNAME = upper(DNAME) (the name of a department must consist only of upper case letters).
• lower(<string>) converts any letter to lowercase.
• initcap(<string>) converts the initial letter of every word in <string> to uppercase.
• length(<string>) returns the length of the string.
• substr(<string>, n [, m]) clips out an m character piece of <string>, starting at position n. If m is not specified, the end of the string is assumed. substr('DATABASE SYSTEMS', 10, 7) returns the string 'SYSTEMS'.
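The pattern matching and string functions above can be tried out as follows. This is a sketch under assumptions: it runs against SQLite through Python's sqlite3 module rather than Oracle, and the three email addresses are invented sample data. One known difference is that SQLite's like ignores case for ASCII letters while Oracle's is case sensitive, and SQLite has no initcap function, so that one is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (email text)")
conn.executemany(
    "insert into users values (?)",
    [("philg@mit.edu",), ("abc@mit.edu",), ("jane@example.com",)],
)

# % matches zero or more characters: every address ending in 'mit.edu'.
ends_in_mit = conn.execute(
    "select email from users where email like '%mit.edu' order by email"
).fetchall()

# Each _ matches exactly one character: three characters before '@mit.edu'.
three_chars = conn.execute(
    "select email from users where email like '___@mit.edu'"
).fetchall()

# substr, upper and length behave as described in the list above.
piece, shouted, size = conn.execute(
    "select substr('DATABASE SYSTEMS', 10, 7), upper('research'), length('DATABASE')"
).fetchone()

print(ends_in_mit, three_chars, piece, shouted, size)
```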

Aggregate Functions: Aggregate functions are statistical functions such as count, min, max etc. They are used to compute a single value from a set of attribute values of a column:

• count: counting rows
  Example: How many tuples are stored in the relation EMP?
      select count(*) from EMP;
  Example: How many different job titles are stored in the relation EMP?
      select count(distinct JOB) from EMP;

• max: maximum value for a column
• min: minimum value for a column
  Example: List the minimum and maximum salary.
      select min(SAL), max(SAL) from EMP;
  Example: Compute the difference between the minimum and maximum salary.
      select max(SAL) - min(SAL) from EMP;

• sum: computes the sum of values (only applicable to the data type number)
  Example: Sum of all salaries of employees working in the department 30.
      select sum(SAL) from EMP where DEPTNO = 30;

• avg: computes the average value for a column (only applicable to the data type number)

Note: avg, min, max and sum ignore tuples that have a null value for the specified attribute. count(*) counts every row, including rows that contain null values, whereas count(<column>) counts only the rows in which <column> is not null.
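How aggregate functions treat null values is easy to get wrong, so here is a small check. It is a sketch that assumes SQLite (via Python's sqlite3 module) in place of Oracle; the COMM (commission) column and the four rows are illustrative sample data, not taken from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table EMP (ENAME text, SAL integer, COMM integer)")
conn.executemany(
    "insert into EMP values (?, ?, ?)",
    [
        ("ALLEN", 1600, 300),
        ("WARD", 1250, 500),
        ("SMITH", 800, None),   # no commission recorded
        ("KING", 5000, None),   # no commission recorded
    ],
)

# count(*) counts rows; count(COMM), avg, min and max skip the two nulls.
result = conn.execute(
    "select count(*), count(COMM), avg(COMM), min(COMM), max(COMM), sum(SAL) from EMP"
).fetchone()

print(result)
```

Note that avg(COMM) is computed over the two non-null commissions only, not over all four rows.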

Subqueries
A query result can also be used in a condition of a where clause. In such a case the query is called a subquery and the complete select statement is called a nested query. You can query one table, restricting the rows returned based on information from another table. For example, to find users who have posted at least one classified ad:

    select user_id, email
    from users
    where 0 < (select count(*)
               from classified_ads
               where classified_ads.user_id = users.user_id);

       USER_ID EMAIL
    ---------- -----------------------------------
         42485 twm@meteor.com
         42489 trunghau@ecst.csuchico.edu
         42389 ricardo.carvajal@kbs.msu.edu
         42393 gon2foto@gte.net
         42399 rob@hawaii.rr.com
         42453 stefan9@ix.netcom.com
         42346 silverman@pon.net
         42153 gallen@wesleyan.edu
    ...

Conceptually, for each row in the users table Oracle is running the subquery against classified_ads to see how many ads are associated with that particular user ID. Keep in mind that this is only conceptually; the Oracle SQL parser may elect to execute this query in a more efficient manner.

Another way to describe the same result set is using EXISTS:

    select user_id, email
    from users
    where exists (select 1
                  from classified_ads
                  where classified_ads.user_id = users.user_id);

This may be more efficient for Oracle to execute since it hasn't been instructed to actually count the number of classified ads for each user, but only to check and see if any are present. Think of EXISTS as a Boolean function that takes a SQL query as its only parameter and returns TRUE if the query returns any rows at all, regardless of the contents of those rows (this is why we can use the constant 1 as the select list for the subquery).

A respective condition in the where clause then can have one of the following forms:

1. Set-valued subqueries
       <expression> [not] in (<subquery>)
       <expression> <comparison operator> [any|all] (<subquery>)
   An <expression> can either be a column or a computed value.
2. Test for (non)existence
       [not] exists (<subquery>)

In a where clause, conditions using subqueries can be combined arbitrarily by using the logical connectives and and or.

Example: List the name and salary of employees of the department 20 who are leading a project that started before December 31, 1990:

    select ENAME, SAL
    from EMP
    where EMPNO in (select PMGR from PROJECT where PSTART < '31-DEC-90')
    and DEPTNO = 20;

Explanation: The subquery retrieves the set of those employees who manage a project that started before December 31, 1990. If the employee working in department 20 is contained in this set (in operator), this tuple belongs to the query result set.

Example: List all employees who are working in a department located in BOSTON:

    select *
    from EMP
    where DEPTNO in (select DEPTNO from DEPT where LOC = 'BOSTON');

The subquery retrieves only one value (the number of the department located in Boston). Thus it is possible to use "=" instead of in. As long as the result of a subquery is not known in advance, i.e., whether it is a single value or a set, it is advisable to use the in operator.

A subquery may use again a subquery in its where clause. Thus conditions can be nested arbitrarily. An important class of subqueries are those that refer to their surrounding (sub)query and the tables listed in the from clause, respectively. Such queries are called correlated subqueries.

(IMP) Example: List all those employees who are working in the same department as their manager (note that components in [ ] are optional):

    select *
    from EMP E1
    where DEPTNO in (select DEPTNO
                     from EMP [E]
                     where [E.]EMPNO = E1.MGR);

Explanation: The subquery in this example is related to its surrounding query since it refers to the column E1.MGR. A tuple is selected from the table EMP (E1) for the query result if the value for the column DEPTNO occurs in the set of values selected in the subquery. One can think of the evaluation of this query as follows: For each tuple in the table E1, the subquery is evaluated individually. If the condition where DEPTNO in ... evaluates to true, this tuple is selected. Note that an alias for the table EMP in the subquery is not necessary since columns without a preceding alias listed there always refer to the innermost query and tables.

Conditions of the form <expression> <comparison operator> [any|all] <subquery> are used to compare a given <expression> with each value selected by <subquery>.

• For the clause any, the condition evaluates to true if there exists at least one row selected by the subquery for which the comparison holds. If the subquery yields an empty result set, the condition is not satisfied.
• For the clause all, in contrast, the condition evaluates to true if for all rows selected by the subquery the comparison holds. The condition also evaluates to true if the subquery does not yield any row or value.

(IMP) Example: Retrieve all employees who are working in department 10 and who earn at least as much as any (i.e., at least one) employee working in department 30:

    select *
    from EMP
    where SAL >= any (select SAL from EMP where DEPTNO = 30)
    and DEPTNO = 10;

Note: Also in this subquery no aliases are necessary since the columns refer to the innermost from clause.

Example: List all employees who are not working in department 30 and who earn more than all employees working in department 30:

    select *
    from EMP
    where SAL > all (select SAL from EMP where DEPTNO = 30)
    and DEPTNO <> 30;

For all and any, the following equivalences hold:

    in      is equivalent to   = any
    not in  is equivalent to   <> all  or  != all
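The different ways of writing the "users who have posted an ad" subquery in this section can be compared side by side. The following is only a sketch: it assumes SQLite through Python's sqlite3 module instead of Oracle, and the users and classified_ads rows are made up. SQLite does not implement the any and all keywords, so the sketch uses in and not in, which by the equivalences above correspond to = any and <> all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table users (user_id integer primary key, email text);
create table classified_ads (ad_id integer primary key,
                             user_id integer references users);
insert into users values (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
insert into classified_ads values (10, 1), (11, 1), (12, 3);
""")

# Correlated subquery that counts the ads of each user.
by_count = conn.execute("""
    select user_id from users
    where 0 < (select count(*) from classified_ads
               where classified_ads.user_id = users.user_id)
    order by user_id""").fetchall()

# Correlated subquery that only checks for existence.
by_exists = conn.execute("""
    select user_id from users
    where exists (select 1 from classified_ads
                  where classified_ads.user_id = users.user_id)
    order by user_id""").fetchall()

# Uncorrelated set-valued subquery.
by_in = conn.execute("""
    select user_id from users
    where user_id in (select user_id from classified_ads)
    order by user_id""").fetchall()

# 'not in' selects the complement: users without any ad.
without_ads = conn.execute("""
    select user_id from users
    where user_id not in (select user_id from classified_ads)""").fetchall()

print(by_count, by_exists, by_in, without_ads)
```

All three positive formulations return the same users; which one the database executes fastest depends on the optimizer and is not something this sketch measures.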

JOIN
A major feature of relational databases is the ability to combine (join) tuples stored in different tables in order to display more meaningful and complete information. In SQL the select statement is used for this kind of query joining relations:

    select [distinct] [<alias ak>.]<column i>, ..., [<alias al>.]<column j>
    from <table 1> [<alias a1>], ..., <table n> [<alias an>]
    [where <condition>]

The specification of table aliases in the from clause is necessary to refer to columns that have the same name in different tables. For example, the column DEPTNO occurs in both EMP and DEPT. If we want to refer to either of these columns in the where or select clause, a table alias has to be specified and put in front of the column name. Instead of a table alias the complete relation name can also be put in front of the column, such as DEPT.DEPTNO, but this sometimes can lead to rather lengthy query formulations.

A professional SQL programmer would be unlikely to query for users who'd posted classified ads in the preceding manner. The SQL programmer knows that, inevitably, the publisher will want information from the classified ad table along with the information from the users table. For example, we might want to see the users and, for each user, the sequence of ad postings:

    select users.user_id, users.email, classified_ads.posted
    from users, classified_ads
    where users.user_id = classified_ads.user_id
    order by users.email, posted;

       USER_ID EMAIL                               POSTED
    ---------- ----------------------------------- ----------
         39406 102140.1200@compuserve.com          1998-09-30
         39406 102140.1200@compuserve.com          1998-10-08
         39406 102140.1200@compuserve.com          1998-10-08
         39842 102144.2651@compuserve.com          1998-07-02
         39842 102144.2651@compuserve.com          1998-07-06
         39842 102144.2651@compuserve.com          1998-12-13
    ...

Because of the JOIN restriction, where users.user_id = classified_ads.user_id, we only see those users who have posted at least one classified ad, i.e., for whom a matching row may be found in the classified_ads table. This has the same effect as the subquery above. The order by users.email, posted is key to making sure that the rows are lumped together by user and then printed in order of ascending posting time.

Comparisons in the where clause are used to combine rows from the tables listed in the from clause.

Example: In the table EMP only the numbers of the departments are stored, not their name. For each salesman, we now want to retrieve the name as well as the number and the name of the department where he is working:

    select ENAME, E.DEPTNO, DNAME
    from EMP E, DEPT D
    where E.DEPTNO = D.DEPTNO
    and JOB = 'SALESMAN';

Explanation: E and D are table aliases for EMP and DEPT, respectively. The computation of the query result occurs in the following manner (without optimization):

1. Each row from the table EMP is combined with each row from the table DEPT (this operation is called Cartesian product). If EMP contains m rows and DEPT contains n rows, we thus get n * m rows.
2. From these rows those that have the same department number are selected (where E.DEPTNO = D.DEPTNO).
3. From this result finally all rows are selected for which the condition JOB = 'SALESMAN' holds.

In this example the joining condition for the two tables is based on the equality operator "=". The columns compared by this operator are called join columns and the join operation is called an equijoin. Any number of tables can be combined in a select statement.

Example: For each project, retrieve its name, the name of its manager, and the name of the department where the manager is working:

    select ENAME, DNAME, PNAME
    from EMP E, DEPT D, PROJECT P
    where E.EMPNO = P.PMGR
    and D.DEPTNO = E.DEPTNO;

It is even possible to join a table with itself:

(IMP) Example: List the names of all employees together with the name of their manager:

    select E1.ENAME, E2.ENAME
    from EMP E1, EMP E2
    where E1.MGR = E2.EMPNO;

Explanation: The join columns are MGR for the table E1 and EMPNO for the table E2. The equijoin comparison is E1.MGR = E2.EMPNO.
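The three conceptual steps of an equijoin and the self join can be reproduced on a tiny data set. This sketch assumes SQLite (Python's sqlite3 module) as a stand-in for Oracle; the three EMP rows and two DEPT rows are illustrative sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table DEPT (DEPTNO integer primary key, DNAME text);
create table EMP (EMPNO integer primary key, ENAME text, JOB text,
                  MGR integer, DEPTNO integer references DEPT);
insert into DEPT values (10, 'ACCOUNTING'), (30, 'SALES');
insert into EMP values (7839, 'KING',  'PRESIDENT', null, 10),
                       (7698, 'BLAKE', 'MANAGER',   7839, 30),
                       (7499, 'ALLEN', 'SALESMAN',  7698, 30);
""")

# Step 1: without a where clause every EMP row meets every DEPT row.
cartesian_size = conn.execute("select count(*) from EMP, DEPT").fetchone()[0]

# Steps 2 and 3: keep matching department numbers, then only salesmen.
salesmen = conn.execute("""
    select ENAME, E.DEPTNO, DNAME
    from EMP E, DEPT D
    where E.DEPTNO = D.DEPTNO and JOB = 'SALESMAN'""").fetchall()

# Self join: EMP appears twice under the aliases E1 and E2.
with_manager = conn.execute("""
    select E1.ENAME, E2.ENAME
    from EMP E1, EMP E2
    where E1.MGR = E2.EMPNO
    order by E1.ENAME""").fetchall()

print(cartesian_size, salesmen, with_manager)
```

KING does not appear in the self join result because his MGR value is null and therefore matches no EMPNO.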

OUTER JOIN

Suppose that we want an alphabetical list of all of our users, with classified ad posting dates for those users who have posted classifieds. We can't do a simple JOIN because that will exclude users who haven't posted any ads. What we need is an OUTER JOIN, where Oracle will "stick in NULLs" if it can't find a corresponding row in the classified_ads table.

    select users.user_id, users.email, classified_ads.posted
    from users, classified_ads
    where users.user_id = classified_ads.user_id(+)
    order by users.email, posted;

    ...
       USER_ID EMAIL                               POSTED
    ---------- ----------------------------------- ----------
         52790 dbrager@mindspring.com
         37461 dbraun@scdt.intel.com
         52791 dbrenner@flash.net
         47177 dbronz@free.polbox.pl
         37296 dbrouse@enter.net
         47178 dbrown@cyberhighway.net
         36985 dbrown@uniden.com                   1998-03-05
         36985 dbrown@uniden.com                   1998-03-10
         34283 dbs117@amaze.net
         52792 dbsikorski@yahoo.com
    ...

The plus sign after classified_ads.user_id is our instruction to Oracle to "add NULL rows if you can't meet this JOIN constraint".
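The difference between the plain join and the outer join can be seen on two users, only one of whom has posted. The sketch below is an approximation: the (+) marker is Oracle-specific and is not understood by SQLite, which is used here through Python's sqlite3 module, so the query is written with the ANSI left outer join syntax instead (which, as far as I know, Oracle itself accepts from 9i onward). The rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table users (user_id integer primary key, email text);
create table classified_ads (user_id integer references users, posted text);
insert into users values (1, 'a@example.com'), (2, 'b@example.com');
insert into classified_ads values (1, '1998-03-05');
""")

# Plain join: user 2 has no ad and therefore disappears from the result.
inner_rows = conn.execute("""
    select users.user_id, users.email, classified_ads.posted
    from users, classified_ads
    where users.user_id = classified_ads.user_id
    order by users.email, posted""").fetchall()

# Outer join: user 2 is kept, with NULL (None in Python) for the posting date.
outer_rows = conn.execute("""
    select users.user_id, users.email, classified_ads.posted
    from users left outer join classified_ads
         on users.user_id = classified_ads.user_id
    order by users.email, posted""").fetchall()

print(inner_rows)
print(outer_rows)
```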

Extending a simple query into a JOIN
Suppose that you have a query from one table returning almost everything that you need, except for one column that's in another table. Here's a way to develop the JOIN without risking breaking your application:
• add the new table to your FROM clause
• add a WHERE constraint to prevent Oracle from building a Cartesian product
• hunt for ambiguous column names in the SELECT list and other portions of the query; prefix these with table names if necessary
• test that you've not broken anything in your zeal to add additional info
• add a new column to the SELECT list

Students build a conference room reservation system. They generally define two tables: rooms and reservations. The top level page is supposed to show a user what reservations he or she is currently holding:

    select room_id, start_time, end_time
    from reservations
    where user_id = 37;

This produces an unacceptable page because the rooms are referred to by an ID number rather than by name. The name information is in the rooms table, so we'll have to turn this into a JOIN.

    select reservations.room_id, start_time, end_time, rooms.room_name
    from reservations, rooms
    where user_id = 37
    and reservations.room_id = rooms.room_id;

4. Complex Queries
Often applications require grouping rows that have certain properties and then applying an aggregate function on one column for each group separately. For this, SQL provides the clause group by <group column(s)>. This clause appears after the where clause and must refer to columns of tables listed in the from clause:

    select <column(s)>
    from <table(s)>
    where <condition>
    group by <group column(s)>
    [having <group condition(s)>];

Those rows retrieved by the select clause that have the same value(s) for <group column(s)> are grouped. Aggregations specified in the select clause are then applied to each group separately. It is important that only those columns that appear in the <group column(s)> clause can be listed without an aggregate function in the select clause!

Example: For each department, we want to retrieve the minimum and maximum salary.

    select DEPTNO, min(SAL), max(SAL)
    from EMP
    group by DEPTNO;

Rows from the table EMP are grouped such that all rows in a group have the same department number. The aggregate functions are then applied to each such group. We thus get the following query result:

    DEPTNO   MIN(SAL)   MAX(SAL)
        10       1300       5000
        20        800       3000
        30        950       2850

Rows to form a group can be restricted in the where clause. For example, if we add the condition where JOB = 'CLERK', only the rows satisfying it form the groups. The query then would retrieve the minimum and maximum salary of all clerks for each department. Note that it is not allowed to specify any other column than DEPTNO without an aggregate function in the select clause since this is the only column listed in the group by clause (it is also easy to see that other columns would not make any sense).

Once groups have been formed, certain groups can be eliminated based on their properties, e.g., if a group contains less than three rows. This type of condition is specified using the having clause. As for the select clause, also in a having clause only <group column(s)> and aggregations can be used.

Example: Retrieve the minimum and maximum salary of clerks for each department having more than three clerks.

    select DEPTNO, min(SAL), max(SAL)
    from EMP
    where JOB = 'CLERK'
    group by DEPTNO
    having count(*) > 3;

Note that it is even possible to specify a subquery in a having clause. In the above query, for example, instead of the constant 3, a subquery can be specified.

(IMP) A query containing a group by clause is processed in the following way:

1. Select all rows that satisfy the condition specified in the where clause.
2. From these rows form groups according to the group by clause.
3. Discard all groups that do not satisfy the condition in the having clause.
4. Apply aggregate functions to each group.
5. Retrieve values for the columns and aggregations listed in the select clause.

Suppose that you want to start lumping together information from multiple rows. For example, you're interested in JOINing users with their classified ads. That will give you one row per ad posted. But you want to mush all the rows together for a particular user and just look at the most recent posting time.
What you need is the GROUP BY construct:

    select users.user_id, users.email, max(classified_ads.posted)
    from users, classified_ads
    where users.user_id = classified_ads.user_id
    group by users.user_id, users.email
    order by upper(users.email);

       USER_ID EMAIL                               MAX(CLASSI
    ---------- ----------------------------------- ----------
         39406 102140.1200@compuserve.com          1998-10-08
         39842 102144.2651@compuserve.com          1998-12-13
         41426 50@seattle.va.gov                   1997-01-13

The group by users.user_id, users.email tells SQL to "lump together all the rows that have the same values in these two columns." In addition to the grouped by columns, we can run aggregate functions on the columns that aren't being grouped. For example, the MAX above applies to the posting dates for the rows in a particular group. We can also use COUNT to see at a glance how active and how recently active a user has been:

    select users.user_id, users.email, count(*), max(classified_ads.posted)
    from users, classified_ads
    where users.user_id = classified_ads.user_id
    group by users.user_id, users.email
    order by upper(users.email);

Let's find our most recently active users. At the same time, let's get rid of the unsightly "MAX(CLASSI" at the top of the report:

    select users.user_id, users.email, count(*) as how_many,
           max(classified_ads.posted) as how_recent
    from users, classified_ads
    where users.user_id = classified_ads.user_id
    group by users.user_id, users.email
    order by how_recent desc, how_many desc;

       USER_ID EMAIL                                 HOW_MANY HOW_RECENT
    ---------- ----------------------------------- ---------- ----------
         39842 102144.2651@compuserve.com                   3 1998-12-13
         39968 mkravit@mindspring.com                       1 1998-12-13
         36758 mccallister@mindspring.com                   1 1998-12-13
         38513 franjeff@alltel.net                          1 1998-12-13
         34530 nverdesoto@earthlink.net                     3 1998-12-13
         34765 jrl@blast.princeton.edu                      1 1998-12-13
         38497 jeetsukumaran@pd.jaring.my                   1 1998-12-12
         38879 john.macpherson@btinternet.com               5 1998-12-12
         37808 eck@coastalnet.com                           1 1998-12-12
         37482 dougcan@arn.net                              1 1998-12-12

Note that we were able to use our correlation names of "how_recent" and "how_many" in the ORDER BY clause. The desc ("descending") directives in the ORDER BY clause instruct Oracle to put the largest values at the top. The default sort order is from smallest to largest ("ascending").

Finding co-moderators: The HAVING Clause
The WHERE clause restricts which rows are returned. The HAVING clause operates analogously but on groups of rows. Suppose, for example, that we're interested in finding those users who've contributed heavily to our discussion forum, and that a posting contributed three years ago is not necessarily evidence of interest in the community right now. So the query reads as: "show me users who've posted at least 30 messages in the past 60 days, ranked in descending order of volubility":

    select user_id, count(*) as how_many
    from bboard
    where posting_time + 60 > sysdate
    group by user_id
    having count(*) >= 30
    order by how_many desc;

       USER_ID   HOW_MANY
    ---------- ----------
         34375         80
         34004         79
         37903         49
         41074         46
         42485         46
         35387         30
         42453         30

    7 rows selected.

We had to do this in a HAVING clause because the number of rows in a group is a concept that doesn't make sense at the per-row level on which WHERE clauses operate. Oracle 8's SQL parser is too feeble to allow you to use the how_many correlation variable in the HAVING clause. You therefore have to repeat the count(*) incantation.
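The division of labour between WHERE (filters rows), GROUP BY (forms groups) and HAVING (filters groups) can be checked on a handful of rows. This is a sketch under assumptions: SQLite via Python's sqlite3 module replaces Oracle, the six EMP rows are sample data, and the threshold of two clerks is chosen only so that the small data set produces a result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table EMP (ENAME text, JOB text, SAL integer, DEPTNO integer)")
conn.executemany(
    "insert into EMP values (?, ?, ?, ?)",
    [
        ("MILLER", "CLERK", 1300, 10),
        ("KING", "PRESIDENT", 5000, 10),
        ("SMITH", "CLERK", 800, 20),
        ("ADAMS", "CLERK", 1100, 20),
        ("FORD", "ANALYST", 3000, 20),
        ("JAMES", "CLERK", 950, 30),
    ],
)

# One group per department, aggregates computed per group.
per_department = conn.execute("""
    select DEPTNO, min(SAL), max(SAL)
    from EMP
    group by DEPTNO
    order by DEPTNO""").fetchall()

# where removes non-clerks first; having then drops groups with fewer than 2 rows.
clerk_departments = conn.execute("""
    select DEPTNO, min(SAL) as lowest, max(SAL) as highest
    from EMP
    where JOB = 'CLERK'
    group by DEPTNO
    having count(*) >= 2
    order by lowest""").fetchall()

print(per_department)
print(clerk_departments)
```

Departments 10 and 30 each contain a single clerk, so only department 20 survives the having condition.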

Set Operations: UNION, INTERSECT, and MINUS
Sometimes it is useful to combine the rows produced by two or more separate SELECT statements into a single result. Oracle supports three set operators, which follow the pattern:

    <query 1> <set operator> <query 2>

The set operators are:
• union [all] returns a table consisting of all rows appearing either in the result of <query 1> or in the result of <query 2>. Duplicates are automatically eliminated unless the clause all is used.
• intersect returns all rows that appear in both results <query 1> and <query 2>.
• minus returns those rows that appear in the result of <query 1> but not in the result of <query 2>.

Of the three, UNION is the most useful in practice.

Example: Assume that we have a table EMP2 that has the same structure and columns as the table EMP:

• All employee numbers and names from both tables:
      select EMPNO, ENAME from EMP
      union
      select EMPNO, ENAME from EMP2;

• Employees who are listed in both EMP and EMP2:
      select * from EMP
      intersect
      select * from EMP2;

• Employees who are only listed in EMP:
      select * from EMP
      minus
      select * from EMP2;

Each operator requires that both tables have the same data types for the columns to which the operator is applied.

Another example:

    select 'today - ' || to_char(trunc(sysdate),'Mon FMDDFM'),
           trunc(sysdate) as deadline
    from dual
    UNION
    select 'tomorrow - ' || to_char(trunc(sysdate+1),'Mon FMDDFM'),
           trunc(sysdate+1) as deadline
    from dual
    UNION
    select 'next week - ' || to_char(trunc(sysdate+7),'Mon FMDDFM'),
           trunc(sysdate+7) as deadline
    from dual
    UNION
    select 'next month - ' || to_char(trunc(ADD_MONTHS(sysdate,1)),'Mon FMDDFM'),
           trunc(ADD_MONTHS(sysdate,1)) as deadline
    from dual
    UNION
    select name || ' - ' || to_char(deadline, 'Mon FMDDFM'), deadline
    from ticket_deadlines
    where project_id = :project_id
    and deadline >= trunc(sysdate)
    order by deadline

The INTERSECT and MINUS operators are seldom used.
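The three set operators can be exercised on two small tables. The sketch assumes SQLite (via Python's sqlite3 module) rather than Oracle, with two invented rows per table. SQLite spells Oracle's minus as except, which is the name the SQL standard uses; the semantics shown are the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table EMP  (EMPNO integer, ENAME text);
create table EMP2 (EMPNO integer, ENAME text);
insert into EMP  values (1, 'SMITH'), (2, 'ALLEN');
insert into EMP2 values (2, 'ALLEN'), (3, 'WARD');
""")

# union removes the duplicate ALLEN row.
union_rows = conn.execute("""
    select EMPNO, ENAME from EMP
    union
    select EMPNO, ENAME from EMP2
    order by EMPNO""").fetchall()

# union all keeps every row from both queries.
union_all_count = len(conn.execute("""
    select EMPNO, ENAME from EMP
    union all
    select EMPNO, ENAME from EMP2""").fetchall())

# Rows present in both tables.
both = conn.execute(
    "select * from EMP intersect select * from EMP2").fetchall()

# Rows only in EMP ('minus' in Oracle, 'except' in SQLite).
only_emp = conn.execute(
    "select * from EMP except select * from EMP2").fetchall()

print(union_rows, union_all_count, both, only_emp)
```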

5. Transactions
The simplest and most direct interface to a relational database involves a procedural program in C, Java, Lisp, Perl, or Tcl putting together a string of SQL that is then sent to the RDBMS. Here's how the ArsDigita Community System constructs a new entry in the clickthrough log:

    insert into clickthrough_log
     (local_url, foreign_url, entry_date, click_count)
    values
     ('$local_url', '$foreign_url', trunc(sysdate), 1)

The INSERT statement adds one row, filling in the four listed columns. Two of the values come from local variables set within the Web server, $local_url and $foreign_url. Because these are strings, they must be surrounded by single quotes. One of the values is dynamic and comes straight from Oracle: trunc(sysdate). Recall that the date data type in Oracle is precise to the second. We only want one of these rows per day of the year and hence truncate the date to midnight. Finally, as this is the first clickthrough of the day, we insert a constant value of 1 for click_count.

Atomicity
Each SQL statement executes as an atomic transaction. For example, suppose that you were to attempt to purge some old data with

    delete from clickthrough_log where entry_date + 120 < sysdate;

(delete clickthrough records more than 120 days old) and that 3500 rows in clickthrough_log are older than 120 days. If your computer failed halfway through the execution of this DELETE, i.e., before the transaction committed, you would find that none of the rows had been deleted. Either all 3500 rows will disappear or none will.

More interestingly, you can wrap a transaction around multiple SQL statements.

Note: if what you're actually doing is moving data from one place within the RDBMS to another, it is extremely bad taste to drag it all the way out to an application program and then stuff it back in. Much better to use the "INSERT ... SELECT" form.

It is legal in SQL to put function calls or constants in your select list. You can compute multiple values in a single query:

    select posting_time, 2+2
    from bboard
    where msg_id = '000KWj';

    POSTING_TI        2+2
    ---------- ----------
    1998-12-13          4
Consider a comment editing transaction and look at the basic structure:
• open a transaction
• insert into an audit table whatever comes back from a SELECT statement on the comment table
• update the comment table
• close the transaction

Suppose that something goes wrong during the INSERT. The tablespace in which the audit table resides is full and it isn't possible to add a row. Putting the INSERT and UPDATE in the same RDBMS transaction ensures that if there is a problem with one, the other won't be applied to the database.
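The comment editing transaction can be sketched as follows. Several things here are assumptions made for illustration: SQLite (Python's sqlite3 module) replaces Oracle, the table names comments and comments_audit and the helper edit_comment are hypothetical, and the failure is provoked in the UPDATE (by violating a not null constraint) instead of in the INSERT, so that the rollback of the already executed audit INSERT becomes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table comments (comment_id integer primary key, content text not null);
create table comments_audit (comment_id integer, old_content text);
insert into comments values (1, 'original text');
""")

def edit_comment(comment_id, new_content):
    """Audit the old text and store the new one as a single transaction."""
    # 'with conn' commits if the block finishes and rolls back if it raises.
    with conn:
        # INSERT ... SELECT keeps the old row inside the database.
        conn.execute(
            "insert into comments_audit "
            "select comment_id, content from comments where comment_id = ?",
            (comment_id,),
        )
        conn.execute(
            "update comments set content = ? where comment_id = ?",
            (new_content, comment_id),
        )

def snapshot():
    audit_count = conn.execute("select count(*) from comments_audit").fetchone()[0]
    content = conn.execute("select content from comments").fetchone()[0]
    return audit_count, content

try:
    edit_comment(1, None)          # the UPDATE violates NOT NULL and fails
except sqlite3.IntegrityError:
    pass
after_failure = snapshot()         # the audit INSERT was rolled back as well

edit_comment(1, "edited text")     # both statements succeed and are committed
after_success = snapshot()

print(after_failure, after_success)
```

After the failed attempt neither table has changed; after the successful one both changes are present together.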

Consistency
Suppose that we've looked at a message on the bulletin board and decide that its content is so offensive we wish to delete the user from our system:

    select user_id from bboard where msg_id = '000KWj';

       USER_ID
    ----------
         39685

    delete from users where user_id = 39685;
    *
    ERROR at line 1:
    ORA-02292: integrity constraint (PHOTONET.SYS_C001526) violated - child record found

Oracle has stopped us from deleting user 39685 because to do so would leave the database in an inconsistent state. Here's the definition of the bboard table:

    create table bboard (
        msg_id char(6) not null primary key,
        refers_to char(6),
        ...
        user_id integer not null references users,
        one_line varchar(700),
        message clob,
        ...
    );

The user_id column is constrained to be not null. Furthermore, the value in this column must correspond to some row in the users table (references users). By asking Oracle to delete the author of msg_id 000KWj from the users table before we deleted all of his or her postings from the bboard table, we were asking Oracle to leave the RDBMS in an inconsistent state.
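The referential integrity check can be reproduced in miniature. The sketch assumes SQLite through Python's sqlite3 module in place of Oracle, a cut-down version of the users and bboard tables, and an invented email address. Unlike Oracle, SQLite only enforces foreign keys after pragma foreign_keys = on, and it reports the violation as an IntegrityError rather than ORA-02292.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")   # SQLite enforces references only on request
conn.executescript("""
create table users (user_id integer primary key, email text);
create table bboard (
    msg_id   text primary key,
    user_id  integer not null references users,
    one_line text
);
insert into users values (39685, 'poster@example.com');
insert into bboard values ('000KWj', 39685, 'an offensive message');
""")

# Deleting the user while a posting still references him or her is refused.
try:
    conn.execute("delete from users where user_id = 39685")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True
    conn.rollback()

# Deleting the postings first, then the user, keeps the database consistent.
with conn:
    conn.execute("delete from bboard where user_id = 39685")
    conn.execute("delete from users where user_id = 39685")

remaining_users = conn.execute("select count(*) from users").fetchone()[0]
print(blocked, remaining_users)
```

Wrapping both deletes in one transaction means that a failure in the second delete would also restore the postings.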

Mutual Exclusion
When you have multiple simultaneously executing copies of the same program, you have to think about mutual exclusion. If a program has to
• read a value from the database
• perform a computation based on that value
• update the value in the database based on the computation
then you want to make sure only one copy of the program is executing at a time through this segment.

First, anything having to do with locks only makes sense when the three operations are grouped together in a transaction. Second, to avoid deadlocks a transaction must acquire all the resources (including locks) that it needs at the start of the transaction. A SELECT in Oracle does not acquire any locks but a SELECT ... FOR UPDATE does. Here's the beginning of the transaction that inserts a message into the bboard table (from /bboard/insert-msg.tcl):

    select last_msg_id
    from msg_id_generator
    for update of last_msg_id

Much more efficient is simply to start the transaction with

    lock table an_alert_log in exclusive mode;

This is a big hammer and you don't want to hold a table lock for more than an instant.

What if I just want some unique numbers?
Does it really have to be this hard? What if you just want some unique integers, each of which will be used as a primary key? Consider a table to hold news items for a Web site:

create table news (
    title           varchar(100) not null,
    body            varchar(4000) not null,
    release_date    date not null,
    ...
);

The traditional database design that gets around all of the problems is the use of a generated key. Here's how the news module of the ArsDigita Community System works:

create sequence news_id_sequence start with 1;

create table news (
    news_id         integer primary key,
    title           varchar(100) not null,
    body            varchar(4000) not null,
    release_date    date not null,
    ...
);

We're taking advantage of the nonstandard but very useful Oracle sequence facility. In almost any Oracle SQL statement, you can ask for a sequence's current value or next value.

SQL> create sequence foo_sequence;
SQL> select foo_sequence.nextval from dual;
SQL> select foo_sequence.currval from dual;

You can use the sequence generator directly in an insert, e.g.,

insert into news (news_id, title, body, release_date)
values (news_id_sequence.nextval, 'Tuition Refund at MIT',
        'Administrators were shocked and horrified ...',
        to_date('1998-03-12', 'YYYY-MM-DD'));

Caveats for Sequence: Oracle sequences are optimized for speed. Hence they offer the minimum guarantees that Oracle thinks are required for primary key generation and no more. If you ask for a few nextvals and roll back your transaction, the sequence will not be rolled back. You can't rely on sequence values to be, uh, sequential. They will be unique. They will be monotonically increasing. But there might be gaps.

The gaps arise because Oracle pulls, by default, 20 sequence values into memory and records those values as used on disk. This makes nextval very fast since the new value need only be marked as used in RAM and not on disk. But suppose that someone pulls the plug on your database server after only two sequence values have been handed out. If your database administrator and system administrator are working well together, the computer will come back to life running Oracle. But there will be a gap of 18 values in the sequence (e.g., from 2023 to 2041). That's because Oracle recorded 20 values as used on disk and only handed out 2.

So as long as your application (using Oracle, of course!) requires only that ids be unique, Oracle-generated sequences will do. Otherwise you will have to write your own sequence generator, based on whatever logic suits your kind of application.
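The rollback caveat can be observed directly in SQL*Plus. The following is only a minimal sketch: the sequence names demo_gap_sequence and demo_nocache_sequence are made up for this illustration and are not part of the ArsDigita data model.

```sql
-- demo_gap_sequence is a hypothetical name used only for this illustration.
create sequence demo_gap_sequence start with 1;

-- Take one value inside a transaction, then abandon the transaction.
select demo_gap_sequence.nextval from dual;
rollback;

-- The sequence was not rolled back: this hands out a new, larger value,
-- so the value taken above is now a gap.
select demo_gap_sequence.nextval from dual;

-- NOCACHE trades speed for fewer gaps after a crash.
-- Values consumed by rolled-back transactions are still lost.
create sequence demo_nocache_sequence start with 1 nocache;
```

If the application needs gap-free numbering (invoice numbers, for instance), neither variant is enough, which is the point of the caveat above.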

6. Triggers
A trigger definition consists of the following (optional) components:

trigger name:           create [or replace] trigger <trigger name>
trigger time point:     before | after
triggering event(s):    insert or update [of <column(s)>] or delete on <table>
trigger type (optional): for each row
trigger restriction (only for "for each row" triggers!): when (<condition>)
trigger body:           <PL/SQL block>

A trigger is a fragment of code that you tell Oracle to run before or after a table is modified. A trigger has the power to:
• make sure that a column is filled in with default information
• make sure that an audit row is inserted into another table
• after finding that the new information is inconsistent with other stuff in the database, raise an error that will cause the entire transaction to be rolled back

Consider the general_comments table:

create table general_comments (
    comment_id      integer primary key,
    on_what_id      integer not null,
    on_which_table  varchar(50),
    user_id         not null references users,
    comment_date    date not null,
    ip_address      varchar(50) not null,
    modified_date   date not null,
    content         clob,
    -- is the content in HTML or plain text (the default)
    html_p          char(1) default 'f' check(html_p in ('t','f')),
    approved_p      char(1) default 't' check(approved_p in ('t','f'))
);

Users and administrators are both able to edit comments. We want to make sure that we know when a comment was last modified so that we can offer the administrator a "recently modified comments page". Rather than painstakingly go through all of our Web scripts that insert or update comments, we can specify an invariant in Oracle that "after every time someone touches the general_comments table, make sure that the modified_date column is set equal to the current date-time." Here's the trigger definition:

create trigger general_comments_modified
before insert or update on general_comments
for each row
begin
    :new.modified_date := sysdate;
end;
/
show errors

We're using the PL/SQL programming language. In this case, it is a simple begin-end block that sets the :new value of modified_date to the result of calling the sysdate function. When using SQL*Plus, you have to provide a / character to get the program to evaluate a trigger or PL/SQL function definition. You then have to say "show errors" if you want SQL*Plus to print out what went wrong. Unless you expect to write perfect code all the time, it can be convenient to leave these SQL*Plus incantations in your .sql files.

The canonical trigger example is the stuffing of an audit table.

create table queries (
    query_id        integer primary key,
    query_name      varchar(100) not null,
    query_owner     not null references users,
    definition_time date not null,
    -- if this is non-null, we just forget about all the query_columns
    -- stuff; the user has hand edited the SQL
    query_sql       varchar(4000)
);

create table queries_audit (
    query_id        integer not null,
    audit_time      date not null,
    query_sql       varchar(4000)
);

Note first that queries_audit has no primary key. If we were to make query_id the primary key, we'd only be able to store one history item per query, which is not our intent. How to keep this table filled? We could do it by making sure that every Web script that might update the query_sql column inserts a row in queries_audit when appropriate. But how to enforce this after we've handed off our code to other programmers? Much better to let the RDBMS enforce the auditing:

create or replace trigger queries_audit_sql
before update on queries
for each row
when (old.query_sql is not null
      and (new.query_sql is null or old.query_sql <> new.query_sql))
begin
    insert into queries_audit (query_id, audit_time, query_sql)
    values (:old.query_id, sysdate, :old.query_sql);
end;
/

The structure of a row-level trigger is the following:

CREATE OR REPLACE TRIGGER ***trigger name***
***when*** ON ***which table***
FOR EACH ROW
***conditions for firing***
begin
    ***stuff to do***
end;

Let's go back and look at our trigger:
• It is named queries_audit_sql; this is really of no consequence so long as it doesn't conflict with the names of other triggers.
• It will be run before update, i.e., only when someone is executing an SQL UPDATE statement.
• It will be run only when someone is updating the table queries.
• It will be run only when the old value of query_sql is not null; we don't want to fill our audit table with NULLs.
• It will be run only when the new value of query_sql is different from the old value; we don't want to fill our audit table with rows because someone happens to be updating another column in queries. Note that SQL's three-valued logic forces us to put in an extra test for new.query_sql is null, because old.query_sql <> new.query_sql will not evaluate to true when new.query_sql is NULL (a user wiping out the custom SQL altogether; a very important case to audit).

Other trigger examples are given below.

Trigger1.sql -- Suppose we have to maintain the following integrity constraint: "The salary of an employee other than the president cannot be decreased and must also not be increased by more than 10%. Furthermore, depending on the job title, each salary must lie within a certain salary range." We assume a table SALS that stores the minimum (MINSAL) and maximum (MAXSAL) salary for each job title (JOB). Since the above condition can be checked for each employee individually, we define the following row trigger:
set echo off
prompt "Example trigger trig1.sql, page 46 Oracle/SQL Tutorial"
prompt
prompt "Creating additional table SALS containing salary ranges..."
set echo on

DROP TABLE SALS;
CREATE TABLE SALS (
    JOB     VARCHAR2(9) primary key,
    MINSAL  NUMBER(7,2),
    MAXSAL  NUMBER(7,2)
);

INSERT INTO SALS VALUES ('CLERK', 800, 1300);
INSERT INTO SALS VALUES ('ANALYST', 3000, 3500);
INSERT INTO SALS VALUES ('SALESMAN', 1250, 1600);
INSERT INTO SALS VALUES ('MANAGER', 2450, 2975);
INSERT INTO SALS VALUES ('PRESIDENT', 5000, 5500);

create or replace trigger check_salary_EMP
after insert or update of SAL, JOB on EMP
for each row
when (new.JOB != 'PRESIDENT')
declare
    minsal number;
    maxsal number;
begin
    -- retrieve minimum and maximum salary for JOB
    select MINSAL, MAXSAL into minsal, maxsal
    from SALS
    where JOB = :new.JOB;
    -- If the new salary has been decreased or does not lie
    -- within the salary range raise an exception
    if :new.SAL < minsal or :new.SAL > maxsal then
        raise_application_error(-20225, 'Salary range exceeded');
    elsif :new.SAL < :old.SAL then
        raise_application_error(-20230, 'Salary has been decreased');
    elsif :new.SAL > 1.1*:old.SAL then
        raise_application_error(-20235, 'More than 10% salary increase');
    end if;
end;
/
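To see the row trigger at work, one might run a few updates by hand. This is only a sketch: it assumes the standard EMP demo data, in which SMITH is a CLERK earning 800, and the SALS rows inserted above. If your EMP table holds different data, the outcomes will differ.

```sql
-- Assumption: SMITH is a CLERK with SAL = 800 (standard demo data).

-- 850 lies within the CLERK range (800 to 1300) and is a 6.25% raise: accepted.
update EMP set SAL = 850 where ENAME = 'SMITH';

-- 1200 is still within the range, but exceeds 1.1 * 850:
-- the trigger raises error -20235 and the statement is undone.
update EMP set SAL = 1200 where ENAME = 'SMITH';

-- 700 is below the CLERK minimum: the range check comes first,
-- so the trigger raises error -20225.
update EMP set SAL = 700 where ENAME = 'SMITH';

-- Leave the demo data as it was.
rollback;
```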

We use an after trigger because the inserted or updated row is not changed within the PL/SQL block (a before trigger would be needed if, for example, we wanted to restore the old attribute values in case of a constraint violation). Note that modifications on the table SALS can also cause a constraint violation. In order to maintain the complete condition we define the following trigger on the table SALS. In case of a violation by an update, however, we do not raise an exception, but restore the old attribute values.
set echo off
prompt "Example trigger trig2.sql, page 47 Oracle/SQL Tutorial"
prompt
set echo on

create or replace trigger check_salary_SALS
before update or delete on SALS
for each row
when (new.MINSAL > old.MINSAL or new.MAXSAL < old.MAXSAL or new.MAXSAL is null)
-- only restricting a salary range can cause a constraint violation
declare
    job_emps number;
begin
    if deleting then
        -- Does there still exist an employee having the deleted job
        select count(*) into job_emps
        from EMP
        where JOB = :old.JOB;
        if job_emps != 0 then
            raise_application_error(-20240,
                ' There still exist employees with the job ' || :old.JOB);
        end if;
    end if;
    if updating then
        -- Are there employees whose salary does not lie within the
        -- modified salary range ?
        select count(*) into job_emps
        from EMP
        where JOB = :new.JOB
        and SAL not between :new.MINSAL and :new.MAXSAL;
        if job_emps != 0 then
            -- restore old salary ranges
            :new.MINSAL := :old.MINSAL;
            :new.MAXSAL := :old.MAXSAL;
        end if;
    end if;
end;
/
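The difference between raising an error and silently restoring the old values is easy to miss, so here is a small sketch of how the trigger behaves. It assumes that SALS still holds the row ('CLERK', 800, 1300) and that EMP contains at least one clerk earning less than 1000, as in the standard demo data.

```sql
-- Narrowing the range would strand the low-paid clerks. No error is raised
-- and SQL*Plus still reports one row updated, but the trigger has put the
-- old values back into :new before the row was written.
update SALS set MINSAL = 1000 where JOB = 'CLERK';

-- The range is unchanged.
select MINSAL, MAXSAL from SALS where JOB = 'CLERK';

-- Deleting a job that employees still hold is rejected with error -20240.
delete from SALS where JOB = 'CLERK';

rollback;
```

Whether a silent restore is desirable is a design choice; an application that needs to know its update was ignored would be better served by an exception.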

In this case a before trigger must be used to restore the old attribute values of an updated row. Suppose we furthermore have a column BUDGET in our table DEPT that is used to store the budget available for each department. Assume the integrity constraint requires that the total of all salaries in a department must not exceed the department's budget. Critical operations on the relation EMP are insertions into EMP and updates on the attributes SAL or DEPTNO.

set echo on

ALTER TABLE DEPT ADD BUDGET NUMBER(8,2);

UPDATE DEPT set BUDGET = 10000 where DEPTNO = 10;
UPDATE DEPT set BUDGET = 15000 where DEPTNO = 20;
UPDATE DEPT set BUDGET = 10000 where DEPTNO = 30;
UPDATE DEPT set BUDGET = 5000 where DEPTNO = 40;

create or replace trigger check_budget_EMP
after insert or update of SAL, DEPTNO on EMP
declare
    cursor DEPT_CUR is
        select DEPTNO, BUDGET from DEPT;
    DNO      DEPT.DEPTNO%TYPE;
    ALLSAL   DEPT.BUDGET%TYPE;
    DEPT_SAL number;
begin
    open DEPT_CUR;
    loop
        fetch DEPT_CUR into DNO, ALLSAL;
        exit when DEPT_CUR%NOTFOUND;
        select sum(SAL) into DEPT_SAL
        from EMP
        where DEPTNO = DNO;
        if DEPT_SAL > ALLSAL then
            raise_application_error(-20325,
                'Total of salaries in the department ' || to_char(DNO) || ' exceeds budget');
        end if;
    end loop;
    close DEPT_CUR;
end;
/

In this case we use a statement trigger on the relation EMP because we have to apply an aggregate function on the salary of all employees that work in a particular department. For the relation DEPT, we also have to define a trigger which, however, can be formulated as a row trigger.
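The notes do not show the row trigger on DEPT. One possible shape is sketched below. The trigger name check_budget_DEPT and the error number -20326 are chosen here for illustration and should not be read as the original tutorial's code. Only a budget cut can break the constraint, so the when clause restricts the trigger to that case.

```sql
-- Sketch only: the trigger name and error number are assumptions.
create or replace trigger check_budget_DEPT
before update of BUDGET on DEPT
for each row
when (new.BUDGET < old.BUDGET)   -- only lowering a budget can cause a violation
declare
    DEPT_SAL number;
begin
    -- Total salaries of the department whose budget is being lowered.
    -- sum() yields NULL for a department without employees, in which case
    -- the comparison below is not true and no error is raised.
    select sum(SAL) into DEPT_SAL
    from EMP
    where DEPTNO = :new.DEPTNO;

    if DEPT_SAL > :new.BUDGET then
        raise_application_error(-20326,
            'Total of salaries in the department ' || to_char(:new.DEPTNO) || ' exceeds budget');
    end if;
end;
/
show errors
```

A row trigger is sufficient here because each DEPT row can be checked on its own, and the trigger reads EMP, not the table being modified.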

7. Views
The relational database provides programmers with a high degree of abstraction from the physical world of the computer. A view is a way of building even greater abstraction. Suppose that Jane in marketing says that she wants to see a table containing the following information:
• user_id
• email address
• number of static pages viewed
• number of bboard postings made
• number of comments made

This information is spread out among four tables.

select u.user_id,
       u.email,
       count(distinct ucm.page_id) as n_pages,
       count(distinct bb.msg_id) as n_msgs,
       count(distinct c.comment_id) as n_comments
from users u, user_content_map ucm, bboard bb, comments c
where u.user_id = ucm.user_id(+)
and u.user_id = bb.user_id(+)
and u.user_id = c.user_id(+)
group by u.user_id, u.email
order by upper(u.email)

Note: The outer joins (+) keep every registered user in the report, including users with no matching rows in user_content_map, bboard, or comments. For such a user the columns from the missing table are NULL, and since count ignores NULLs the corresponding count is 0. Because joining three detail tables multiplies the rows per user, each count uses distinct so that every page, posting, and comment is counted once.

Then Jane adds "I want to see this every day, updated with the latest information. I want to have a programmer write me some desktop software that connects directly to the database and looks at this information; I don't want my desktop software breaking if you reorganize the data model."

create or replace view janes_marketing_view as
select u.user_id,
       u.email,
       count(distinct ucm.page_id) as n_pages,
       count(distinct bb.msg_id) as n_msgs,
       count(distinct c.comment_id) as n_comments
from users u, user_content_map ucm, bboard bb, comments c
where u.user_id = ucm.user_id(+)
and u.user_id = bb.user_id(+)
and u.user_id = c.user_id(+)
group by u.user_id, u.email
order by upper(u.email)

To Jane, this will look and act just like a table when she queries it:

select * from janes_marketing_view;

Why should she need to be aware that information is coming from four tables? Or that you've reorganized the RDBMS so that the information subsequently comes from six tables?

Protecting Privacy with Views

A common use of views is protecting confidential data. For example, suppose that all the people who work in a hospital collaborate by using a relational database. Here is the data model:

create table patients (
    patient_id      integer primary key,
    patient_name    varchar(100),
    hiv_positive_p  char(1),
    insurance_p     char(1),
    ...
);

If a bunch of hippie idealists are running the hospital, they'll think that the medical doctors shouldn't be aware of a patient's insurance status. So when a doc is looking up a patient's medical record, the looking is done through

create view patients_clinical as
select patient_id, patient_name, hiv_positive_p
from patients;

The folks over in accounting shouldn't get access to the patients' medical records just because they're trying to squeeze money out of them:

create view patients_accounting as
select patient_id, patient_name, insurance_p
from patients;

Relational databases have elaborate permission systems similar to those on time-shared computer systems. Each person in a hospital has a unique database user ID. Permission will be granted to view or modify certain tables on a per-user or per-group-of-users basis. Generally the RDBMS permissions facilities aren't very useful for Web applications. It is the Web server that is talking to the database, not a user's desktop computer. So the Web server is responsible for figuring out who is requesting a page and how much to show in response.
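For the desktop-client case, the per-group permissions described above could be set up roughly as follows. This is a sketch: the role names clinical_staff and accounting_staff and the user names dr_jones and acct_smith are hypothetical, and the statements must be run by a user who owns the views and is allowed to create roles.

```sql
-- Hypothetical roles, one per group of hospital staff.
create role clinical_staff;
create role accounting_staff;

-- Each group may read only its own view. Nobody is granted access
-- to the underlying patients table.
grant select on patients_clinical to clinical_staff;
grant select on patients_accounting to accounting_staff;

-- Hypothetical database users are placed into their groups.
grant clinical_staff to dr_jones;
grant accounting_staff to acct_smith;
```

The views then act as the only window onto the patients table for these users.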

Protecting Your Own Source Code
The ArsDigita Shoppe system, described in http://www.arsdigita.com/books/panda/ecommerce, represents all orders in one table, whether they were denied by the credit card processor, returned by the user, or voided by the merchant. This is fine for transaction processing but you don't want your accounting or tax reports corrupted by the inclusion of failed orders. You can make a decision in one place as to what constitutes a reportable order and then have all of your report programs query the view:

create or replace view sh_orders_reportable as
select *
from sh_orders
where order_state not in ('confirmed','failed_authorization','void');

Note that in the privacy example (above) we were using the view to leave unwanted columns behind whereas here we are using the view to leave behind unwanted rows. If we add some order states or otherwise change the data model, the reporting programs need not be touched; we only have to keep this view definition up to date.

Note that you can define every view with "create or replace view" rather than "create view"; this saves a bit of typing when you have to edit the definition later.

If you've used select * to define a view and subsequently alter any of the underlying tables, you have to redefine the view. Otherwise, your view won't contain any of the new columns. You might consider this a bug but Oracle has documented it, thus turning the behavior into a feature.
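The select * behavior is easiest to understand by trying it on a throwaway table. In this sketch, demo_items and demo_items_view are names invented for the illustration.

```sql
-- demo_items and demo_items_view are hypothetical names.
create table demo_items (
    item_id integer primary key
);

-- Oracle expands the * into the current column list when the view is created.
create or replace view demo_items_view as
select * from demo_items;

-- Add a column to the underlying table.
alter table demo_items add (item_name varchar(100));

-- The view still returns only item_id.
select * from demo_items_view;

-- Redefining the view picks up the new column.
create or replace view demo_items_view as
select * from demo_items;

select * from demo_items_view;
```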

How Views Work
Programmers aren't supposed to have to think about how views work. However, it is worth noting that the RDBMS merely stores the view definition and not any of the data in a view. Querying against a view versus the underlying tables does not change the way that data are retrieved or cached. Standard RDBMS views exist to make programming more convenient or to address security concerns, not to make data access more efficient.

Materialized Views
A materialized view stores the result of its defining query instead of just the definition. The view might be created with a complicated JOIN, or an expensive GROUP BY with sums and averages. With a regular view, this expensive operation would be done every time you issued a query. With a materialized view, the expensive operation is done when the view is created and thus an individual query need not involve substantial computation.

Materialized views consume space because Oracle is keeping a copy of the data or at least a copy of information derivable from the data. More importantly, a materialized view does not contain up-to-the-minute information. When you query a regular view, your results include changes made up to the last committed transaction before your SELECT. When you query a materialized view, you're getting results as of the time that the view was created or refreshed. Note that Oracle lets you specify a refresh interval at which the materialized view will automatically be refreshed. Such views are also called summaries.

At this point, you'd expect an experienced Oracle user to say "Hey, these aren't new. This is the old CREATE SNAPSHOT facility that we used to keep semi-up-to-date copies of tables on machines across the network!" What is new with materialized views is that you can create them with the ENABLE QUERY REWRITE option. This authorizes the SQL parser to look at a query involving aggregates or JOINs and go to the materialized view instead.

Consider a report that shows, for each month, a count of how many users registered at photo.net. To execute such a query, Oracle must sequentially scan the users table. If the users table grew large and you wanted the query to be instant, you'd sacrifice some timeliness in the stats with

create materialized view users_by_month
refresh complete
start with to_date('1999-03-28', 'YYYY-MM-DD')
next sysdate + 1
enable query rewrite
as
select to_char(registration_date,'YYYYMM') as sort_key,
       rtrim(to_char(registration_date,'Month')) as pretty_month,
       to_char(registration_date,'YYYY') as pretty_year,
       count(*) as n_new
from users
group by to_char(registration_date,'YYYYMM'),
         to_char(registration_date,'Month'),
         to_char(registration_date,'YYYY')
order by 1;

Oracle will build this view just after midnight on March 28, 1999. The view will be refreshed every 24 hours after that. Because of the enable query rewrite clause, Oracle will feel free to grab data from the view even when a user's query does not mention the view. For example, given the query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'January'
and to_char(registration_date,'YYYY') = '1999'

Oracle would ignore the users table altogether and pull information from users_by_month. This would give the same result with much less work. Suppose that the current month is March 1999, though. The query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'March'
and to_char(registration_date,'YYYY') = '1999'

will also hit the materialized view rather than the users table and hence will miss anyone who has registered since midnight (i.e., the query rewriting will cause a different result to be returned).
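If the stale count matters for a particular report, two ways around it come to mind. Both are sketches and assume the users_by_month view from above: refresh the view on demand through the DBMS_MVIEW package, or switch off query rewrite for the current session so that queries against users read the live table.

```sql
-- Rebuild the materialized view now instead of waiting for the
-- scheduled refresh ('C' requests a complete refresh).
begin
    dbms_mview.refresh('USERS_BY_MONTH', 'C');
end;
/

-- Alternatively, turn query rewrite off for this session only.
-- Queries that mention users are then answered from the users table.
alter session set query_rewrite_enabled = false;
```

The first option costs a full scan of users at refresh time; the second gives up the speed benefit for that session.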

8. Procedural Programming in Oracle
Declarative languages can be very powerful and reliable, but sometimes it is easier to think about things procedurally. One way to do this is by using a procedural language in the database client, for example the JDBC APIs to talk to Oracle from Java, or the Pro*C package that comes with Oracle for writing client applications in C/C++. Another way is to write stored procedures, functions, and triggers in PL/SQL, the procedural extension to SQL provided by Oracle.

There are no clean ways in standard SQL to say "do this just for the first N rows" or "do something special for a particular row if its data match a certain pattern". Suppose that you have a million rows in your news table, you want five, but you can only figure out which five with a bit of procedural logic. Does it really make sense to drag those million rows of data all the way across the network from the database server to your client application and then throw out 999,995 rows? Or suppose that you're querying a million-row table and want the results back in a strange order. Does it make sense to build a million-row data structure in your client application, sort the rows in the client program, then return the sorted rows to the user?

E.g.: Visit http://www.scorecard.org/chemical-profiles/ and search for "benzene". Note that there are 328 chemicals whose names contain the string "benzene":

select count(*)
from chemical
where upper(edf_chem_name) like upper('%benzene%');

  COUNT(*)
----------
       328

The way we want to display them is
• exact matches on top
• line break
• chemicals that start with the query string
• line break
• chemicals that contain the query string

Within each category of chemicals, we want to sort alphabetically. However, if there are numbers or special characters in front of a chemical name, we want to ignore those for the purposes of sorting. Can you do all of that with one query? And have them come back from the database in the desired order?

You could if you could make a procedure that would run inside the database. For each row, the procedure would compute a score reflecting goodness of match. To get the order correct, you need only ORDER BY this score. To get the line breaks right, you need only have your application program watch for changes in score. For the fine tuning of sorting equally scored matches alphabetically, just write another procedure that will return a chemical name stripped of leading special characters, then sort by the result. Here's how it looks:

-- Your SQL query in your client application
select edf_chem_name,
       edf_substance_id,
       score_chem_name_match_score(upper(edf_chem_name), upper('benzene')) as match_score
from chemical
where upper(edf_chem_name) like upper('%benzene%')
order by score_chem_name_match_score(upper(edf_chem_name), upper('benzene')),
         score_chem_name_for_sorting(edf_chem_name);

We specify the procedure score_chem_name_match_score to take two arguments: one the chemical name from the current row, and one the query string from the user. It returns 0 for an exact match, 1 for a chemical whose name begins with the query string, and 2 in all other cases (remember that this is only used in queries where a LIKE clause ensures that every chemical name at least contains the query string). Once we have defined this procedure, we are able to call it from a SQL query, the same way that we can call built-in SQL functions such as upper.

-- Server side stored procedure 1.
create or replace function score_chem_name_match_score
    (chem_name IN varchar, query_string IN varchar)
return integer
AS
BEGIN
    IF chem_name = query_string THEN
        return 0;
    ELSIF instr(chem_name, query_string) = 1 THEN
        return 1;
    ELSE
        return 2;
    END IF;
END score_chem_name_match_score;
/

Notice that PL/SQL is a strongly typed language. We say what arguments we expect, whether they are IN or OUT, and what types they must be. We say that score_chem_name_match_score will return an integer.
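A quick way to check the scoring function is to call it from SQL with literal arguments. The chemical names below are test strings made up for this check, not rows from the chemical table.

```sql
-- Made-up test strings, already in upper case as the client query supplies them.
select score_chem_name_match_score('BENZENE', 'BENZENE')        as exact_match,
       score_chem_name_match_score('BENZENE OXIDE', 'BENZENE')  as prefix_match,
       score_chem_name_match_score('CHLOROBENZENE', 'BENZENE')  as substring_match
from dual;
-- exact_match is 0, prefix_match is 1, substring_match is 2
```

Since the scores ascend from best to worst match, a plain ORDER BY on the score puts exact matches first.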
We can say that a PL/SQL variable should be of the same type as a column in a table:

-- Server side stored procedure 2.
create or replace function score_chem_name_for_sorting (chem_name IN varchar)
return varchar
AS
    stripped_chem_name chem_hazid_ref.edf_chem_name%TYPE;
BEGIN
    stripped_chem_name := ltrim(chem_name, '1234567890-+()[],'' #');
    return stripped_chem_name;
END score_chem_name_for_sorting;
/

The local variable stripped_chem_name is going to be the same type as the edf_chem_name column in the chem_hazid_ref table.

IMO it's a good idea to do transaction control (BEGIN TRANSACTION/ROLLBACK/COMMIT) solely in stored procedures, and not in a client program. The situation you want to avoid at all costs is a client opening a transaction, and then crashing -- with your database holding a transaction open (and pages locked) until (some time later) the operating system figures out that the TCP/IP connection to the client has closed, and tells the database about it (which then rolls back the transaction). This is more of an issue with PC clients (and users hitting Ctrl-Alt-Del) as opposed to a webserver program. Also, it should be mentioned that putting SQL into stored procedures makes it run faster, as it can be "precompiled" (some of the interpretation steps done ahead of time, such as parsing and generating a query plan). This is true of Sybase and (I would think) of Oracle etc. as well.

PL/SQL is a block-structured language. Each block builds a (named) program unit, and blocks can be nested. Blocks that build a procedure, a function, or a package must be named. A PL/SQL block has an optional declare section, a part containing PL/SQL statements, and an optional exception-handling part. Thus the structure of a PL/SQL block looks as follows (brackets [ ] enclose optional parts):

[<Block header>]
[declare
    <Constants>
    <Variables>
    <Cursors>
    <User defined exceptions>]
begin
    <PL/SQL statements>
[exception
    <Exception handling>]
end;

The block header specifies whether the PL/SQL block is a procedure, a function, or a package.
If no header is specified, the block is said to be an anonymous PL/SQL block. Each PL/SQL block again builds a PL/SQL statement. Thus blocks can be nested like blocks in conventional programming languages. The scope of declared variables (i.e., the part of the program in which one can refer to the variable) is analogous to the scope of variables in programming languages such as C or Pascal.

Constants, variables, cursors, and exceptions used in a PL/SQL block must be declared in the declare section of that block. Variables and constants can be declared as follows:

<variable name> [constant] <data type> [not null] [:= <expression>];

Valid data types are SQL data types (see Section 1) and the data type boolean. Boolean data may only be true, false, or null. The not null clause requires that the declared variable must always have a value different from null. <expression> is used to initialize a variable. If no expression is specified, the value null is assigned to the variable. The clause constant states that once a value has been assigned to the variable, the value cannot be changed (thus the variable becomes a constant). Example:

declare
    hire_date    date;                        /* implicit initialization with null */
    job_title    varchar2(80) := 'Salesman';
    emp_found    boolean;                     /* implicit initialization with null */
    salary_incr  constant number(3,2) := 1.5; /* constant */
    ...
begin
    ...
end;

Instead of specifying a data type, one can also refer to the data type of a table column (so-called anchored declaration). For example, EMP.Empno%TYPE refers to the data type of the column Empno in the relation EMP. Instead of a single variable, a record can be declared that can store a complete tuple from a given table (or query result). For example, the data type DEPT%ROWTYPE specifies a record suitable to store all attribute values of a complete row from the table DEPT. Such records are typically used in combination with a cursor. A field in a record can be accessed using <record name>.<column name>, for example, DEPT.Deptno.
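The declaration forms described above can be tried in a small anonymous block. This is a sketch under the assumption that the demo table DEPT exists with the columns DEPTNO and DNAME; the variable names and the values assigned are invented for the illustration, and nothing is written to the table.

```sql
set serveroutput on

declare
    -- All names and values below are hypothetical.
    raise_factor  constant number(3,2) := 1.05;   -- cannot be reassigned
    new_dept_no   DEPT.DEPTNO%TYPE := 50;         -- anchored to a column type
    dept_rec      DEPT%ROWTYPE;                   -- one field per DEPT column
begin
    -- Fields of a record are addressed as <record name>.<column name>.
    dept_rec.DEPTNO := new_dept_no;
    dept_rec.DNAME  := 'TRAINING';

    dbms_output.put_line(dept_rec.DEPTNO || ' ' || dept_rec.DNAME);
    dbms_output.put_line('raise factor: ' || raise_factor);
end;
/
```

Because new_dept_no and dept_rec are anchored to DEPT, the block keeps compiling if the column types are later widened.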
A cursor declaration specifies a set of tuples (as a query result) such that the tuples can be processed in a tuple-oriented way (i.e., one tuple at a time) using the fetch statement. A cursor declaration has the form

cursor <cursor name> [(<list of parameters>)] is <select statement>;

The cursor name is an undeclared identifier, not the name of any PL/SQL variable. A parameter has the form <parameter name> <parameter type>. Possible parameter types are char, varchar2, number, date and boolean as well as corresponding subtypes such as integer. Parameters are used to assign values to the variables that are given in the select statement.

Example: We want to retrieve the following attribute values from the table EMP in a tuple-oriented way: the job title and name of those employees who have been hired after a given date, and who have a manager working in a given department.

cursor employee_cur (start_date date, dno number) is
    select JOB, ENAME
    from EMP E
    where HIREDATE > start_date
    and exists (select * from EMP
                where E.MGR = EMPNO and DEPTNO = dno);

If (some) tuples selected by the cursor will be modified in the PL/SQL block, the clause for update [of <column(s)>] has to be added at the end of the cursor declaration. In this case selected tuples are locked and cannot be accessed by other users until a commit has been issued. Before a declared cursor can be used in PL/SQL statements, the cursor must be opened, and after processing the selected tuples the cursor must be closed.

Exceptions are used to process errors and warnings that occur during the execution of PL/SQL statements in a controlled manner. Some exceptions are internally defined, such as ZERO_DIVIDE. Other exceptions can be specified by the user at the end of a PL/SQL block. User-defined exceptions need to be declared using <name of exception> exception.

PL/SQL uses a modified select statement that requires each selected tuple to be assigned to a record (or a list of variables). There are several alternatives in PL/SQL to assign a value to a variable. The simplest way is

declare
    counter integer := 0;
    ...
begin
    counter := counter + 1;

Values to assign to a variable can also be retrieved from the database using a select statement

select <column(s)> into <matching list of variables>
from <table(s)> where <condition>;

It is important to ensure that the select statement retrieves at most one tuple! Otherwise it is not possible to assign the attribute values to the specified list of variables and a run-time error occurs. If the select statement retrieves more than one tuple, a cursor must be used instead. Furthermore, the data types of the specified variables must match those of the retrieved attribute values. For most data types, PL/SQL performs an automatic type conversion (e.g., from integer to real).
Instead of a list of single variables, a record can be given after the keyword into. Also in this case, the select statement must retrieve at most one tuple! The select list must supply the columns in the order in which they are declared in the table (the order below assumes the standard demo EMP table).

declare
    employee_rec EMP%ROWTYPE;
    max_sal      EMP.SAL%TYPE;
begin
    select EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
    into employee_rec
    from EMP
    where EMPNO = 5698;
    select max(SAL) into max_sal from EMP;
    ...
end;
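The run-time errors mentioned above surface as the predefined exceptions NO_DATA_FOUND (no tuple retrieved) and TOO_MANY_ROWS (more than one tuple retrieved). A minimal sketch of catching both, assuming the demo table EMP and using the employee number 5698 from the example above:

```sql
set serveroutput on

declare
    emp_name EMP.ENAME%TYPE;
begin
    -- select ... into must retrieve exactly one tuple to succeed.
    select ENAME into emp_name
    from EMP
    where EMPNO = 5698;

    dbms_output.put_line('Found ' || emp_name);
exception
    when NO_DATA_FOUND then
        dbms_output.put_line('No employee with number 5698');
    when TOO_MANY_ROWS then
        dbms_output.put_line('More than one employee with number 5698');
end;
/
```

Without the exception part, either condition would abort the block and be reported to the caller as an error.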

PL/SQL provides while-loops, two types of for-loops, and continuous loops. The latter are used in combination with cursors. All types of loops are used to execute a sequence of statements multiple times. Loops are specified in the same way as in imperative programming languages such as C or Pascal. A while-loop has the pattern

[<< <label name> >>]
while <condition> loop
    <sequence of statements>;
end loop [<label name>];

A loop can be named. Naming a loop is useful whenever loops are nested and inner loops are completed unconditionally using the exit <label name>; statement.

Whereas the number of iterations through a while loop is unknown until the loop completes, the number of iterations through the for loop can be specified using two integers.

[<< <label name> >>]
for <index> in [reverse] <lower bound>..<upper bound> loop
    <sequence of statements>
end loop [<label name>];

The loop counter <index> is declared implicitly. The scope of the loop counter is only the for loop. It overrides the scope of any variable having the same name outside the loop. Inside the for loop, <index> can be referenced like a constant. <index> may appear in expressions, but one cannot assign a value to <index>. Using the keyword reverse causes the iteration to proceed downwards from the higher bound to the lower bound.

Processing Cursors: Before a cursor can be used, it must be opened using the open statement

open <cursor name> [(<list of parameters>)];

The associated select statement then is processed and the cursor references the first selected tuple. Selected tuples then can be processed one tuple at a time using the fetch command

fetch <cursor name> into <list of variables>;

The fetch command assigns the selected attribute values of the current tuple to the list of variables. After the fetch command, the cursor advances to the next tuple in the result set. Note that the variables in the list must have the same data types as the selected values.
After all tuples have been processed, the close command is used to disable the cursor.

close <cursor name>;

The example below illustrates how a cursor is used together with a continuous loop:

declare
    cursor emp_cur is select * from EMP;
    emp_rec EMP%ROWTYPE;
    emp_sal EMP.SAL%TYPE;
begin
    open emp_cur;
    loop
        fetch emp_cur into emp_rec;
        exit when emp_cur%NOTFOUND;
        emp_sal := emp_rec.sal;
        <sequence of statements>
    end loop;
    close emp_cur;
    ...
end;

Each loop can be completed unconditionally using the exit clause:

exit [<block label>] [when <condition>]

Using exit without a block label causes the completion of the loop that contains the exit statement. A condition can be a simple comparison of values. In most cases, however, the condition refers to a cursor. In the example above, %NOTFOUND is a predicate that evaluates to true if the most recent fetch failed to return a tuple, and to false if it read one. The value of <cursor name>%NOTFOUND is null before the first tuple is fetched. %FOUND is the logical opposite of %NOTFOUND.

Cursor for loops can be used to simplify the usage of a cursor:

[<< <label name> >>]
for <record name> in <cursor name>[(<list of parameters>)] loop
    <sequence of statements>
end loop [<label name>];

A record suitable to store a tuple fetched by the cursor is implicitly declared. Furthermore, this loop implicitly performs a fetch at each iteration as well as an open before the loop is entered and a close after the loop is left. If at an iteration no tuple has been fetched, the loop is automatically terminated without an exit.

It is even possible to specify a query instead of <cursor name> in a for loop:

for <record name> in (<select statement>) loop
    <sequence of statements>
end loop;

That is, a cursor need not be declared before the loop is entered; it is defined by the select statement. Example:

for sal_rec in (select SAL + COMM total from EMP) loop
    ...;
end loop;

total is an alias for the expression computed in the select statement. Thus, at each iteration only one tuple is fetched. The record sal_rec, which is implicitly defined, then contains only one entry which can be accessed using sal_rec.total.
Aliases, of course, are not necessary if only attributes are selected, that is, if the select statement contains no arithmetic operators or aggregate functions.

For conditional control, PL/SQL offers if-then-else constructs of the pattern

    if <condition> then <sequence of statements>
    [elsif <condition> then <sequence of statements>]
    ...
    [else <sequence of statements>]
    end if;

Starting with the first condition, if a condition yields true, its corresponding sequence of statements is executed, otherwise control is passed to the next condition. Thus the behavior of this type of PL/SQL statement is analogous to if-then-else statements in imperative programming languages.

Except for data definition language commands such as create table, all types of SQL statements can be used in PL/SQL blocks, in particular delete, insert, update, and commit. Note that in PL/SQL only select statements of the type select <column(s)> into are allowed, i.e., selected attribute values can only be assigned to variables (unless the select statement is used in a subquery). Using a plain select statement as in SQL leads to a syntax error. If update or delete statements are used in combination with a cursor, these commands can be restricted to the currently fetched tuple. In these cases the clause where current of <cursor name> is added, as shown in the following example.

Example: The following PL/SQL block performs the following modification: all employees having 'KING' as their manager get a 5% salary increase.

    declare
        manager EMP.MGR%TYPE;
        cursor emp_cur (mgr_no number) is
            select SAL from EMP
            where MGR = mgr_no
            for update of SAL;
    begin
        select EMPNO into manager from EMP
        where ENAME = 'KING';
        for emp_rec in emp_cur(manager) loop   -- implicit fetch
            update EMP set SAL = emp_rec.sal * 1.05
            where current of emp_cur;
        end loop;
        commit;
    end;

Remark: Note that the record emp_rec is implicitly defined.

A PL/SQL block may contain statements that specify exception handling routines. Each error or warning during the execution of a PL/SQL block raises an exception.
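Before exceptions are looked at in detail, the constructs introduced so far (select ... into, if-then-elsif-else, and an exception part) can be combined in one small block. This is a minimal sketch, assuming the EMP table from the earlier examples and that output is enabled with set serveroutput on; the variables v_sal and v_band and the salary thresholds are made up for illustration. NO_DATA_FOUND and TOO_MANY_ROWS are exceptions predefined by Oracle.

```sql
declare
    v_sal  EMP.SAL%TYPE;      -- hypothetical variable names
    v_band varchar2(10);
begin
    -- select ... into must return exactly one tuple
    select SAL into v_sal from EMP where ENAME = 'KING';

    -- conditions are tested in order; the first one that is true wins
    if v_sal >= 4000 then
        v_band := 'high';
    elsif v_sal >= 2000 then
        v_band := 'medium';
    else
        v_band := 'low';      -- also reached when v_sal is null
    end if;
    dbms_output.put_line('KING: ' || v_band);
exception
    when NO_DATA_FOUND then   -- the select returned no tuple
        dbms_output.put_line('no employee named KING');
    when TOO_MANY_ROWS then   -- the select returned more than one tuple
        dbms_output.put_line('more than one employee named KING');
end;
/
```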
One can distinguish between two types of exceptions:

• system defined exceptions
• user defined exceptions (which must be declared by the user in the declaration part of a block where the exception is used/implemented)

System defined exceptions are always automatically raised whenever corresponding errors or warnings occur. User defined exceptions, in contrast, must be raised explicitly in a sequence of statements using raise <exception name>. After the keyword exception at the end of a block, user defined exception handling routines are implemented. An implementation has the pattern

    when <exception name> then <sequence of statements>;

The most common errors that can occur during the execution of PL/SQL programs are handled by system defined exceptions. The table below lists some of these exceptions with their names and a short description.

    Exception name        Number      Remark
    CURSOR_ALREADY_OPEN   ORA-06511   You have tried to open a cursor which is already open
    INVALID_CURSOR        ORA-01001   Invalid cursor operation such as fetching from a closed cursor
    NO_DATA_FOUND         ORA-01403   A select ... into or fetch statement returned no tuple
    TOO_MANY_ROWS         ORA-01422   A select ... into statement returned more than one tuple
    ZERO_DIVIDE           ORA-01476   You have tried to divide a number by 0

Example:

    declare
        emp_sal EMP.SAL%TYPE;
        emp_no EMP.EMPNO%TYPE;
        too_high_sal exception;
    begin
        select EMPNO, SAL into emp_no, emp_sal
        from EMP where ENAME = 'KING';
        if emp_sal * 1.05 > 4000 then raise too_high_sal;
        else update EMP set SAL ...
        end if;
    exception
        when NO_DATA_FOUND then   -- no tuple selected
            rollback;
        when too_high_sal then
            insert into high_sal_emps values(emp_no);
            commit;
    end;

After the keyword when, a list of exception names connected with or can be specified. The last when clause in the exception part may contain the exception name others. This introduces the default exception handling routine, for example, a rollback.

If a PL/SQL program is executed from the SQL*Plus shell, exception handling routines may contain statements that display error or warning messages on the screen. For this, the procedure raise_application_error can be used. This procedure has two parameters <error number> and <message text>. <error number> is a negative integer defined by the user and must range between -20000 and -20999. <message text> is a string with a length up to 2048 characters. The concatenation operator "||" can be used to concatenate single strings to one string. In order to display numeric variables, these variables must be converted to strings using the function to_char. If the procedure raise_application_error is called from a PL/SQL block, processing the PL/SQL block terminates and all database modifications are undone, that is, an implicit rollback is performed in addition to displaying the error message.

Example:

    if emp_sal * 1.05 > 4000 then
        raise_application_error(-20010, 'Salary increase for employee with Id '
            || to_char(emp_no) || ' is too high');
    end if;

Procedures and Functions

PL/SQL provides sophisticated language constructs to program procedures and functions as stand-alone PL/SQL blocks. They can be called from other PL/SQL blocks, other procedures and functions. The syntax for a procedure definition is

    create [or replace] procedure <procedure name> [(<list of parameters>)] is
        <declarations>
    begin
        <sequence of statements>
    [exception
        <exception handling routines>]
    end [<procedure name>];

A function can be specified in an analogous way:

    create [or replace] function <function name> [(<list of parameters>)]
        return <data type> is
    ...

The optional clause or replace re-creates the procedure/function. A procedure can be deleted using the command drop procedure <procedure name> (drop function <function name>). In contrast to anonymous PL/SQL blocks, the clause declare may not be used in procedure/function definitions. Valid parameters include all data types. However, for char, varchar2, and number no length and scale, respectively, can be specified. For example, the parameter number(6) results in a compile error and must be replaced by number.
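As a sketch of how these pieces fit together, the following stored procedure uses unconstrained parameter types, omits the declare keyword, and reports problems through raise_application_error. The procedure check_raise, its parameters, the 4000 limit and the error numbers are hypothetical and only serve as an illustration; the EMP table is the one used throughout these notes.

```sql
-- Hypothetical procedure for illustration only.
create or replace procedure check_raise (p_empno   number,    -- number, not number(4)
                                         p_percent number) is
    v_sal number;             -- declarations follow "is" directly, without "declare"
begin
    select SAL into v_sal from EMP where EMPNO = p_empno;

    if v_sal * (1 + p_percent / 100) > 4000 then
        -- not handled below, so the error propagates to the caller
        raise_application_error(-20010, 'Salary increase for employee with Id '
            || to_char(p_empno) || ' is too high');
    end if;

    update EMP set SAL = v_sal * (1 + p_percent / 100)
    where EMPNO = p_empno;
exception
    when NO_DATA_FOUND then   -- the select ... into found no such employee
        raise_application_error(-20013, 'Employee with Id '
            || to_char(p_empno) || ' does not exist');
end check_raise;
/
```

From SQL*Plus the procedure could then be tried with, for example, execute check_raise(7839, 5); where 7839 is assumed to be an existing employee number.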
Instead of explicit data types, implicit types of the form %TYPE and %ROWTYPE can be used even if constrained declarations are referenced. A parameter is specified as follows:

    <parameter name> [IN | OUT | IN OUT] <data type> [{ := | DEFAULT} <expression>]

The optional clauses IN, OUT, and IN OUT specify the way in which the parameter is used. The default mode for a parameter is IN. IN means that the parameter can be referenced inside the procedure body, but it cannot be changed. OUT means that a value can be assigned to the parameter in the body, but the parameter's value cannot be referenced. IN OUT allows both assigning values to the parameter and referencing the parameter. Typically, it is sufficient to use the default mode for parameters.

Example: The subsequent procedure is used to increase the salary of all employees who work in the department given by the procedure's parameter. The percentage of the salary increase is given by a parameter, too.
    set echo on
    create or replace procedure raise_salary (DNUM in number,
                                              PERCENT in number default 0.5) as
        cursor EMP_CUR is
            select EMPNO, SAL from EMP
            where DEPTNO = DNUM
            for update of SAL;
        ENUM number(4);
        ESAL number;
    begin
        open EMP_CUR;
        loop
            fetch EMP_CUR into ENUM, ESAL;
            exit when EMP_CUR%NOTFOUND;
            update EMP set sal = (ESAL * (1+(PERCENT / 100)))
            where current of EMP_CUR;
        end loop;
        close EMP_CUR;
        commit;
    end raise_salary;
    /

This procedure can be called from the SQL*Plus shell using the command

    execute raise_salary(10, 3);

If the procedure is called only with the parameter 10, the default value 0.5 is assumed as specified in the list of parameters in the procedure definition. If a procedure is called from a PL/SQL block, the keyword execute is omitted.

Functions have the same structure as procedures. The only difference is that a function returns a value whose data type (unconstrained) must be specified.

Example:

    create function get_dept_salary(dno number) return number is
        all_sal number;
    begin
        all_sal := 0;
        for emp_sal in (select SAL from EMP
                        where DEPTNO = dno and SAL is not null) loop
            all_sal := all_sal + emp_sal.sal;
        end loop;
        return all_sal;
    end get_dept_salary;
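Inside a PL/SQL block a function call is simply an expression, so no execute keyword is involved. The following sketch assumes that get_dept_salary has been created as shown above and that output is enabled with set serveroutput on; the variable total_20 and the department numbers are chosen only for illustration.

```sql
declare
    total_20 number;          -- hypothetical variable receiving the return value
begin
    -- the return value is assigned like any other expression
    total_20 := get_dept_salary(20);
    dbms_output.put_line('Department 20: ' || to_char(total_20));

    -- a call can also appear directly inside a condition
    if get_dept_salary(10) > total_20 then
        dbms_output.put_line('Department 10 has the larger salary total');
    end if;
end;
/
```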

In order to call a function from the SQL*Plus shell, it is necessary to first define a variable to which the return value can be assigned. In SQL*Plus a variable can be defined using the command variable <variable name> <data type>;, for example, variable salary number. The above function then can be called using the command

    execute :salary := get_dept_salary(20);

Note that the colon ":" must be put in front of the variable. Further information about procedures and functions can be obtained using the help command in the SQL*Plus shell, for example, help [create] function, help subprograms, help stored subprograms.

Packages: PL/SQL supports the concept of modularization by which modules and other constructs can be organized into packages. A package consists of a package specification and a package body. The package specification defines the interface that is visible for application programmers, and the package body implements the package specification (similar to header and source files in the programming language C).
    create package manage_employee as   -- package specification
        function hire_emp (name varchar2, job varchar2, mgr number, hiredate date,
                           sal number, comm number default 0, deptno number)
            return number;
        procedure fire_emp (emp_id number);
        procedure raise_sal (emp_id number, sal_incr number);
    end manage_employee;
    /

    create package body manage_employee as
        function hire_emp (name varchar2, job varchar2, mgr number, hiredate date,
                           sal number, comm number default 0, deptno number)
            return number is
            -- Insert a new employee with a new employee Id
            new_empno number(10);
        begin
            select emp_sequence.nextval into new_empno from dual;
            insert into emp values(new_empno, name, job, mgr, hiredate,
                                   sal, comm, deptno);
            return new_empno;
        end hire_emp;

        procedure fire_emp(emp_id number) is
            -- deletes an employee from the table EMP
        begin
            delete from emp where empno = emp_id;
            if SQL%NOTFOUND then   -- delete statement referred to invalid emp_id
                raise_application_error(-20011, 'Employee with Id ' ||
                    to_char(emp_id) || ' does not exist.');
            end if;
        end fire_emp;

        procedure raise_sal(emp_id number, sal_incr number) is
            -- modify the salary of a given employee
        begin
            update emp set sal = sal + sal_incr
            where empno = emp_id;
            if SQL%NOTFOUND then
                raise_application_error(-20012, 'Employee with Id ' ||
                    to_char(emp_id) || ' does not exist');
            end if;
        end raise_sal;
    end manage_employee;
    /
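Package members are referenced from outside as <package name>.<member name>. The block below is a rough usage sketch for the package above, not part of the original material. It assumes that the package compiled successfully, that the sequence emp_sequence exists, that department 10 and manager number 7839 are valid in the local EMP/DEPT data, and that output is enabled in SQL*Plus. The variable new_id and the sample values are made up.

```sql
-- SQL*Plus setting so that dbms_output text is displayed
set serveroutput on

declare
    new_id number;            -- hypothetical variable for the generated Id
begin
    -- positional notation for the first five parameters, named notation
    -- for deptno, so that comm takes its default value 0
    new_id := manage_employee.hire_emp('NEWHIRE', 'CLERK', 7839, sysdate, 1000,
                                       deptno => 10);
    dbms_output.put_line('hired employee ' || to_char(new_id));

    manage_employee.raise_sal(new_id, 100);
    manage_employee.fire_emp(new_id);

    -- firing the same employee again makes fire_emp call
    -- raise_application_error; the nested block catches that error
    begin
        manage_employee.fire_emp(new_id);
    exception
        when others then
            dbms_output.put_line(to_char(SQLCODE) || ': ' || SQLERRM);
    end;

    rollback;                 -- leave the sample data unchanged
end;
/
```

The nested block keeps the error local, so the outer block still reaches its rollback. Without it the application error would terminate the whole block.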

9. Embedded SQL (Pro*C)
Programs that are written in Pro*C and include SQL and/or PL/SQL statements are precompiled into regular C programs using a precompiler that typically comes with the database management software (precompiler package). In order to make SQL and PL/SQL statements in a Pro*C program (having the suffix .pc) recognizable by the precompiler, they are always preceded by the keywords EXEC SQL and end with a semicolon ";". The Pro*C precompiler replaces such statements with appropriate calls to functions implemented in the SQL runtime library. The resulting C program then can be compiled and linked using a normal C compiler like any other C program. The linker includes the appropriate Oracle specific libraries.

As is the case for PL/SQL blocks, the first part of a Pro*C program has a declare section. In a Pro*C program, so-called host variables are specified in a declare section. Host variables are the key to the communication between the host program and the database. Declarations of host variables can be placed wherever normal C variable declarations can be placed. Host variables are declared according to the C syntax. Host variables can be of the following data types:

    char <Name>          single character
    char <Name>[n]       array of n characters
    int                  integer
    float                floating point
    VARCHAR <Name>[n]    variable length strings

VARCHAR is converted by the Pro*C precompiler into a structure with an n-byte character array and a 2-byte length field (the fields arr and len used in the program further below). The declaration of host variables occurs in a declare section having the following pattern:

    EXEC SQL BEGIN DECLARE SECTION;
        <Declaration of host variables>
        /* e.g., VARCHAR userid[20]; */
        /* e.g., char test_ok; */
    EXEC SQL END DECLARE SECTION;

In a Pro*C program at most one such declare section is allowed. The declaration of cursors and exceptions occurs outside of such a declare section for host variables. In a Pro*C program host variables referenced in SQL and PL/SQL statements must be prefixed with a colon ":". Note that it is not possible to use C function calls and most of the pointer expressions as host variable references.

In addition to host language variables that are needed to pass data between the database and C program (and vice versa), one needs to provide some status variables containing program runtime information. The variables are used to pass status information concerning the database access to the application program so that certain events can be handled in the program properly. The structure containing the status variables is called SQL Communication Area or SQLCA, for short, and has to be included after the declare section using the statement

    EXEC SQL INCLUDE SQLCA.H;

In the variables defined in this structure, information about error messages as well as program status information is maintained:

    struct sqlca {
        /* ub1 */ char sqlcaid[8];
        /* b4 */  long sqlabc;
        /* b4 */  long sqlcode;
        struct {
            /* ub2 */ unsigned short sqlerrml;
            /* ub1 */ char sqlerrmc[70];
        } sqlerrm;
        /* ub1 */ char sqlerrp[8];
        /* b4 */  long sqlerrd[6];
        /* ub1 */ char sqlwarn[8];
        /* ub1 */ char sqlext[8];
    };

Components of this structure can be accessed and verified during runtime, and appropriate handling routines (e.g., exception handling) can be executed to ensure a correct behavior of the application program. If at the end of the program the variable sqlcode contains a 0, then the execution of the program has been successful, otherwise an error occurred.
There are two ways to check the status of your program after executable SQL statements which may result in an error or warning: (1) either by explicitly checking respective components of the SQLCA structure, or (2) by doing automatic error checking and handling using the WHENEVER statement. The complete syntax of this statement is

    EXEC SQL WHENEVER <condition> <action>;

By using this command, the program then automatically checks the SQLCA for <condition> and executes the given <action>. <condition> can be one of the following:

• SQLERROR: sqlcode has a negative value, that is, an error occurred
• SQLWARNING: In this case sqlwarn[0] is set due to a warning
• NOT FOUND: sqlcode has a positive value, meaning that no row was found that satisfies the where condition, or a select into or fetch statement returned no rows

<action> can be

• STOP: the program exits with an exit() call, and all SQL statements that have not been committed so far are rolled back
• CONTINUE: if possible, the program tries to continue with the statement following the statement that caused the error
• DO <function>: the program transfers processing to an error handling function named <function>
• GOTO <label>: program execution branches to a labeled statement (see example)

At the beginning of a Pro*C program, more precisely, before the execution of embedded SQL or PL/SQL statements, one has to connect to the database using a valid Oracle account and password. Connecting to the database occurs through the embedded SQL statement

    EXEC SQL CONNECT :<Account> IDENTIFIED BY :<Password>;

Both <Account> and <Password> are host variables of the type VARCHAR and must be specified and handled respectively. <Account> and <Password> can be specified in the Pro*C program, but can also be entered at program runtime using, e.g., the C function scanf.

Before a program is terminated by the C exit function and if no error occurred, database modifications through embedded insert, update, and delete statements must be committed. This is done by using the embedded SQL statement

    EXEC SQL COMMIT WORK RELEASE;

If a program error occurred and previous non-committed database modifications need to be undone, the embedded SQL statement

    EXEC SQL ROLLBACK WORK RELEASE;

has to be specified in the respective error handling routine of the Pro*C program.

The following Pro*C program connects to the database using the database account scott/tiger. The database contains information about employees and departments (see the previous examples used in this tutorial). The user has to enter a salary which then is used to retrieve all employees (from the relation EMP) who earn at least the given minimum salary. Retrieving and processing individual result tuples occurs through using a cursor in a C while-loop.
    /* Declarations */
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    /* Declare section for host variables */
    EXEC SQL BEGIN DECLARE SECTION;
        VARCHAR userid[20];
        VARCHAR passwd[20];
        int empno;
        VARCHAR ename[15];
        float sal;
        float min_sal;
    EXEC SQL END DECLARE SECTION;

    /* Load SQL Communication Area */
    EXEC SQL INCLUDE SQLCA.H;

    int main(void) /* Main program */
    {
        int retval;

        /* Catch errors */
        EXEC SQL WHENEVER SQLERROR GOTO error;

        /* Connect to Oracle as SCOTT/TIGER; both are host variables */
        /* of type VARCHAR; Account and Password are specified explicitly */
        strcpy(userid.arr,"SCOTT");        /* userid.arr := "SCOTT" */
        userid.len=strlen(userid.arr);     /* userid.len := 5 */
        strcpy(passwd.arr,"TIGER");        /* passwd.arr := "TIGER" */
        passwd.len=strlen(passwd.arr);     /* passwd.len := 5 */

        EXEC SQL CONNECT :userid IDENTIFIED BY :passwd;
        printf("Connected to ORACLE as: %s\n\n", userid.arr);

        /* Enter minimum salary by user */
        printf("Please enter minimum salary > ");
        retval = scanf("%f", &min_sal);
        if(retval != 1) {
            printf("Input error!!\n");
            EXEC SQL ROLLBACK WORK RELEASE;   /* Disconnect from ORACLE */
            exit(2);                          /* Exit program */
        }

        /* Declare cursor; cannot occur in declare section! */
        EXEC SQL DECLARE EMP_CUR CURSOR FOR
            SELECT EMPNO,ENAME,SAL FROM EMP
            WHERE SAL>=:min_sal;

        /* Print Table header, run cursor through result set */
        printf("Employee-ID Employee-Name Salary \n");
        printf("-------------------------------------\n");
        EXEC SQL OPEN EMP_CUR;
        EXEC SQL FETCH EMP_CUR INTO :empno, :ename, :sal;   /* Fetch 1.tuple */
        while(sqlca.sqlcode==0) {             /* are there more tuples ? */
            ename.arr[ename.len] = '\0';      /* "End of String" */
            printf("%15d %-17s %7.2f\n",empno,ename.arr,sal);
            EXEC SQL FETCH EMP_CUR INTO :empno, :ename, :sal;   /* get next tuple */
        }
        EXEC SQL CLOSE EMP_CUR;

        /* Disconnect from database and terminate program */
        EXEC SQL COMMIT WORK RELEASE;
        printf("\nDisconnected from ORACLE\n");
        exit(0);

        /* Error Handling: Print error message */
    error:
        /* Switch off the error branch so that a failing rollback */
        /* cannot jump back to this label */
        EXEC SQL WHENEVER SQLERROR CONTINUE;
        printf("\nError: %.70s \n",sqlca.sqlerrm.sqlerrmc);
        EXEC SQL ROLLBACK WORK RELEASE;
        exit(1);
    }
