
SQL Basics

A primary key is a column or set of columns that uniquely identifies the rest of the data in any given row.

A foreign key is a column in one table that is the primary key of another table, which means that any data in a foreign key column must have corresponding data in the other table, where that column is the primary key.

Data inconsistency: A data field with the same name holds information that does not match. (Occurs when data redundancy exists.)
Data dependency: When an application depends on the data structure and has no flexibility.
Data redundancy: When a data item exists in several files (duplication). (Eliminated by using a normalized data structure.)
Data independence: Data structures are defined separately from application programs.
Relations: Two-dimensional tables of data values (relation = table).
Atomic: Values cannot be broken down any further.
Domain: Values for attributes are drawn from a domain, a set of atomic values. Ex: Date, City, etc.
Candidate key: Any of several keys that could serve as the primary key.
Concatenated key: Combination of attributes (from candidate keys) that forms the primary key.
Alternate keys: Candidate keys not chosen as the primary key.
Entity integrity: No part of the primary key can be missing ("NOT NULL").
Referential integrity: A foreign key must have a matching primary key in the other table.

Data Warehousing

The Fundamentals

Def: A single, complete and consistent store of data obtained from various sources. It is usually made of relational databases. It consists of:
- A set of programs that extract data from an operational environment.
- A database that maintains data warehouse data.
- Systems that provide data to users.

Functions: The main function of a data warehouse is to give end-users faster, easier, and more direct access to corporate data.

Characteristics:
- Data warehouses are offline systems. Their information is not live and it is not continuously updated.
- One of the big advantages of a warehouse implementation is its ability to store historical data.

Codd Rules

In 1985 Codd proposed an informal set of twelve rules by which a database could be evaluated to see how "relational" it is. Very few commercial databases satisfy all twelve rules. The 12 rules are based on the following foundation rule.

Rule 0: For any system that is advertised as, or claims to be, a relational database management system, that system must be able to manage databases entirely through its relational capabilities.
In other words, the DBMS should not have to rely on non-relational methods in order to manage its data. The other twelve rules are all implied in Rule 0, but it is easier to check for the other twelve individually than for this general rule.

Rule 1: The information rule
All information in a relational database is represented explicitly at the logical level and in exactly one way: by values in tables.
This includes data about the database itself, which is kept in a data dictionary.

Rule 2: The guaranteed access rule
Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name.
If a database conforms to Rule 2, every atomic value should be easily retrievable. An atomic value is the smallest unit of value in a relational database. It can always be retrieved if you know the column or attribute name, the table it is stored in, and the primary key's value.
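
The guaranteed access rule can be tried out with a small sketch. It assumes SQLite through Python's standard sqlite3 module, and the products table and its rows are made up for illustration. One atomic value is addressed by table name, column name and primary key value alone.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate Rule 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (prod_id INTEGER PRIMARY KEY, description TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(34, "pants", 19.99), (35, "shirt", 12.50)],
)

# Table name (products) + column name (price) + primary key value (35)
# together identify exactly one atomic value.
row = conn.execute("SELECT price FROM products WHERE prod_id = ?", (35,)).fetchone()
price = row[0]
print(price)
```

No row position, file offset or pointer is needed to reach the value, which is the point of the rule.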

Rule 3: Systematic treatment of null values
Null values (distinct from the empty character string or a string of blank characters and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information in a systematic way, independent of data type.
A null value can mean that data is not there, is not known, or is irrelevant. The null value represents an empty database field: there is no value for that field. It is different from zero or blank. A primary key field should never be empty. This protects the integrity of the database.

Rule 4: Dynamic online catalog based on the relational model
The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.
In a relational database the same query language is used on the data dictionary as is used on the application database.

Rule 5: The comprehensive data sublanguage rule
A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be one language whose statements are expressible, per some well defined syntax, as character strings and that is comprehensive in supporting all the following items: data definition, view definition, data manipulation, integrity constraints, authorization, transaction boundaries.
There are often many different ways of interacting with the database, for example QBE (Query By Example) or SQL (for more sophisticated queries).

Rule 6: View updating rule
All views that are theoretically updateable are also updateable by the system.
A view is a "virtual table" in a database. With a relational DBMS, any change that a user makes to a view should ideally also be made in the base table from which the view is derived.
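
The catalog rule (Rule 4) can be sketched as follows, assuming SQLite through Python's sqlite3 module. In SQLite the catalog is a table named sqlite_master; other systems use different catalog names. The authors table and author_names view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical objects, created only so the catalog has something to describe.
conn.execute("CREATE TABLE authors (au_id INTEGER PRIMARY KEY, au_lname TEXT)")
conn.execute("CREATE VIEW author_names AS SELECT au_lname FROM authors")

# The catalog is itself a table, so an ordinary SELECT interrogates it,
# exactly as it would interrogate application data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
```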
Rule 7: High-level insert, update and delete
The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update and deletion of data.
This means that one command in a relational database should be able to carry out an operation on one or more rows in either a base relation or a view.

Rule 8: Physical data independence
Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
Ex: moving tables to different disk drives, changing the order of rows in the table, reorganizing database files. In a relational environment the DBMS decides how to access a piece of data.

Rule 9: Logical data independence
Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
Relational tables may have to be expanded or restructured. New tables may also have to be added to the database. Expansion of a table may involve adding columns to existing tables. The addition of a new column to a table in a relational database should not affect programs that use that table.

Rule 10: Integrity independence
Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and stored in the catalog, not in the application programs.
No data should be stored in a relational database that has not been defined beforehand. Integrity controls must exist to protect the consistency of the database from unauthorized users. Two integrity constraints exist: entity integrity and referential integrity.
Entity integrity states that no part of the primary key can be missing. The key is said to be "not null".
Referential integrity relates to the use of foreign keys. A foreign key is an attribute or group of attributes that matches the primary key of another table. If a table has a foreign key to represent a relationship, then the related table must have a matching primary key.
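
Entity and referential integrity can be seen in action in a minimal sketch, assuming SQLite through Python's sqlite3 module. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, other systems enforce them by default, and the sample rows are hypothetical. Both constraints live in the table definitions, so the DBMS rejects the bad rows without any checking code in the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute(
    "CREATE TABLE publishers (pub_id INTEGER NOT NULL PRIMARY KEY, pub_name TEXT)"
)
conn.execute(
    "CREATE TABLE titles ("
    " title_id TEXT NOT NULL PRIMARY KEY,"
    " pub_id INTEGER REFERENCES publishers(pub_id))"
)
conn.execute("INSERT INTO publishers VALUES (1, 'Example Press')")
conn.execute("INSERT INTO titles VALUES ('MC2222', 1)")  # matching primary key exists


def rejected(sql):
    """Return True when the DBMS itself refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: the primary key may not be missing.
entity_violation = rejected("INSERT INTO titles VALUES (NULL, 1)")
# Referential integrity: publisher 99 does not exist.
referential_violation = rejected("INSERT INTO titles VALUES ('XX0001', 99)")
print(entity_violation, referential_violation)
```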

Rule 11: Distribution independence
This rule is an extension of Rule 8. A relational DBMS has distribution independence. Distribution independence means that application programs and terminal activities remain unaffected when data distribution is first introduced or when data is redistributed.
Rule 8 requires that applications remain unaffected by the way in which data is stored. Rule 11 requires that this independence should still hold when data is distributed across different locations.

Rule 12: The nonsubversion rule (integrity constraints in the high-level language of the RDBMS)
If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple-records-at-a-time).
This rule guarantees the integrity constraints contained in the high-level language of the DBMS. In some cases, you want to use a one-record-at-a-time procedure. A procedural language such as C, Cobol, or Fortran is used for this. These procedural languages cannot bypass the DBMS.

Data Handling Techniques, DML, DDL, DCL

DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE
DDL (Data Definition Language): CREATE, DROP
DCL (Data Control Language): GRANT, REVOKE (DBA)

SELECT Statement examples:

Eg: SELECT discount, stor_id AS bookstore, discount
    FROM discounts

Eg: Other alias examples:
    SELECT discount, stor_id bookstore, discount
    FROM discounts

Eg: SELECT discount, bookstore = stor_id, discount
    FROM discounts

Eg: With text:
    SELECT 'The answer is:' discount, stor_id, discount
    FROM discounts

Eg: With math:
    SELECT discount, stor_id, discount, discount*1.75 AS 'UK VAT'
    FROM discounts

Eg: Without repeats:
    SELECT DISTINCT state
    FROM stores
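
The alias, literal-text, arithmetic and DISTINCT forms above can be checked in a short sketch, assuming SQLite through Python's sqlite3 module. SQLite accepts the AS form of aliasing but not the T-SQL "alias = column" form. The rows and the alias names label and doubled are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discounts (stor_id TEXT, discount REAL)")
conn.executemany(
    "INSERT INTO discounts VALUES (?, ?)",
    [("8042", 5.0), ("8042", 5.0), ("7066", 6.7)],  # hypothetical rows
)

# A literal column, a renamed column and a computed column in one SELECT.
cur = conn.execute(
    "SELECT 'The answer is:' AS label, stor_id AS bookstore, discount * 2 AS doubled "
    "FROM discounts"
)
column_names = [d[0] for d in cur.description]  # names as the client sees them
rows = cur.fetchall()

# DISTINCT collapses the repeated store id.
distinct_stores = conn.execute(
    "SELECT DISTINCT stor_id FROM discounts ORDER BY stor_id"
).fetchall()
print(column_names, len(rows), distinct_stores)
```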

Eg: The WHERE clause:
    SELECT title
    FROM titles
    WHERE title_id = 'MC2222'

Eg: Using the BETWEEN statement (BETWEEN is inclusive!):
    SELECT title_id, qty
    FROM sales
    WHERE qty BETWEEN 10 AND 30
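
That BETWEEN includes both end points can be confirmed with a small sketch, assuming SQLite through Python's sqlite3 module and made-up sales rows. The boundary quantities 10 and 30 are returned, and the result matches the explicit >= / <= form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (title_id TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 9), ("B", 10), ("C", 20), ("D", 30), ("E", 31)],  # hypothetical rows
)

between = conn.execute(
    "SELECT title_id FROM sales WHERE qty BETWEEN 10 AND 30 ORDER BY title_id"
).fetchall()
# The same range written with comparison operators.
explicit = conn.execute(
    "SELECT title_id FROM sales WHERE qty >= 10 AND qty <= 30 ORDER BY title_id"
).fetchall()
print(between)
```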

Eg: Using the IN statement:
    SELECT pub_name FROM publishers WHERE state IN ('NH', 'MA')

Eg: The NOT statement:
    SELECT pub_name FROM publishers WHERE NOT state = 'CA'

Eg: The ORDER BY clause (ASC - ascending [default], DESC - descending):
    SELECT au_lname, au_fname FROM authors ORDER BY au_lname DESC

Eg: Text matching (% is a wildcard for a string of zero or more characters):
    SELECT title_id, title FROM titles WHERE title LIKE '%ook%'

Eg: Text matching (_ is a wildcard for exactly one character):
    SELECT title_id, title FROM titles WHERE title LIKE '_ook%'

Eg: Text matching ([] is a range wildcard), any titles that start with a, c, d or f:
    SELECT title_id, title FROM titles WHERE title LIKE '[acdf]%'

Eg: Text matching ([^] is the NOT range wildcard), any titles that do not start with a, b, c, d or e:
    SELECT title_id, title FROM titles WHERE title LIKE '[^a-e]%'

UNION, UNION ALL, INTERSECT, MINUS

UNION (to see rows from two different tables with the same data definition, with NO duplication): if the tables are seen as sets, the rows in the intersection of the two tables are listed only once.
Eg: SELECT stor_id, title_id, qty FROM sales_america
    UNION
    SELECT stor_id, title_id, qty FROM sales_europe

UNION ALL (to see rows from two different tables with the same data definition, WITH duplication): if the tables are seen as sets, the rows in the intersection of the two tables are listed twice.
Eg: SELECT stor_id, title_id, qty FROM sales_america
    UNION ALL
    SELECT stor_id, title_id, qty FROM sales_europe

INTERSECT (to see rows from two different tables with the same data definition, duplicated rows only): if the tables are seen as sets, only the intersection of the two tables is listed.
Eg: SELECT stor_id, title_id, qty FROM sales_america
    INTERSECT
    SELECT stor_id, title_id, qty FROM sales_europe

MINUS (to see rows from two different tables with the same data definition, with the duplicated rows taken out): if the tables are seen as sets, only the rows of the first table that do not also appear in the second table are listed.
Eg: SELECT stor_id, title_id, qty FROM sales_america
    MINUS
    SELECT stor_id, title_id, qty FROM sales_europe
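
The four set operators can be compared side by side in a sketch, assuming SQLite through Python's sqlite3 module. SQLite, like standard SQL, spells MINUS as EXCEPT; MINUS itself is Oracle syntax. The rows are hypothetical, with one row shared by both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("sales_america", "sales_europe"):
    conn.execute(f"CREATE TABLE {table} (stor_id INTEGER, title_id TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales_america VALUES (?, ?, ?)",
    [(1, "T1", 5), (1, "T2", 3), (2, "T3", 7)],
)
conn.executemany(
    "INSERT INTO sales_europe VALUES (?, ?, ?)",
    [(1, "T2", 3), (3, "T4", 1)],  # (1, 'T2', 3) is the row common to both tables
)


def combine(operator):
    """Run both SELECTs joined by the given set operator and return sorted rows."""
    sql = (
        "SELECT stor_id, title_id, qty FROM sales_america "
        f"{operator} "
        "SELECT stor_id, title_id, qty FROM sales_europe"
    )
    return sorted(conn.execute(sql).fetchall())


union_rows = combine("UNION")          # common row listed once
union_all_rows = combine("UNION ALL")  # common row listed twice
intersect_rows = combine("INTERSECT")  # only the common row
except_rows = combine("EXCEPT")        # first table less the common row
print(len(union_rows), len(union_all_rows), intersect_rows, except_rows)
```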

INSERT
Eg: INSERT INTO products(prod_id, description)
    VALUES(34, 'pants')

Eg: INSERT INTO highprice(prod_id, prod_code, price)
    SELECT prod_id, prod_code, price    --(have to have same columns)
    FROM products
    WHERE price > 20.00
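
The INSERT ... SELECT form can be exercised with a minimal sketch, assuming SQLite through Python's sqlite3 module and invented product rows. Only rows priced above 20.00 are copied, and the column lists of the INSERT and the SELECT line up one to one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id INTEGER, prod_code TEXT, price REAL)")
conn.execute("CREATE TABLE highprice (prod_id INTEGER, prod_code TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "H", 15.0), (2, "E", 25.0), (3, "H", 40.0)],  # hypothetical rows
)

# Copy the qualifying rows from one table into the other in a single statement.
conn.execute(
    "INSERT INTO highprice (prod_id, prod_code, price) "
    "SELECT prod_id, prod_code, price FROM products WHERE price > 20.00"
)
copied = conn.execute("SELECT * FROM highprice ORDER BY prod_id").fetchall()
print(copied)
```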

UPDATE
Eg: UPDATE stocks
    SET qty = 0
    WHERE warehouse_id = 10
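
A quick way to see that the WHERE clause limits an UPDATE to the matching rows is to check the affected-row count. This is a sketch, assuming SQLite through Python's sqlite3 module and hypothetical stock rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (prod_id INTEGER, warehouse_id INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 10, 8), (3, 20, 4)],  # hypothetical rows
)

cur = conn.execute("UPDATE stocks SET qty = 0 WHERE warehouse_id = 10")
rows_changed = cur.rowcount  # number of rows the UPDATE touched

quantities = [q for (q,) in conn.execute("SELECT qty FROM stocks ORDER BY prod_id")]
print(rows_changed, quantities)
```

Without the WHERE clause the same statement would reset every row in the table.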

DELETE
Eg: DELETE FROM warehouse
    WHERE location = 'Chicago'

ORDER BY
Eg: SELECT * FROM products
    WHERE prod_code = 'H'
    ORDER BY price DESC    --ASC is the default

Comparison, range, patterns, IN, BETWEEN operators

Comparison operators:
=     equal to
<>    not equal to
>     greater than
<     less than
>=    greater than or equal
<=    less than or equal

Range operator: BETWEEN
Eg: SELECT product_id
    FROM products
    WHERE price BETWEEN 3 AND 20

Set membership operator: IN
Eg: SELECT product_id
    FROM products
    WHERE product_code IN ('H','E')

Pattern matching operator: LIKE
Eg: SELECT product_id, description
    FROM products
    WHERE description LIKE 'Pipe%'
    (or) WHERE description LIKE '%Pipe%'
    (or) WHERE description LIKE 'Pipe 2_mm%'

%     wildcard for zero or more characters (percent sign)
_     wildcard for one character (underscore sign)
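
The three LIKE patterns above behave as follows in a small sketch, assuming SQLite through Python's sqlite3 module and invented descriptions. SQLite's LIKE ignores ASCII case by default, which other systems may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        (1, "Pipe 20mm steel"),
        (2, "Copper Pipe"),
        (3, "Pipe 25mm brass"),
        (4, "Valve"),
        (5, "Pipe 200mm"),
    ],  # hypothetical rows
)


def matching_ids(pattern):
    """Return the product ids whose description matches the LIKE pattern."""
    cur = conn.execute(
        "SELECT product_id FROM products WHERE description LIKE ? ORDER BY product_id",
        (pattern,),
    )
    return [pid for (pid,) in cur]


starts_with = matching_ids("Pipe%")      # 'Pipe' at the start
contains = matching_ids("%Pipe%")        # 'Pipe' anywhere
one_char = matching_ids("Pipe 2_mm%")    # exactly one character between '2' and 'mm'
print(starts_with, contains, one_char)
```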

Logical Operators: IS NULL
Eg: SELECT product_id, price
    FROM products
    WHERE price IS NULL    --null is not equal to zero

BETWEEN, IN, LIKE, IS NULL can all be negated by the NOT operator:
NOT BETWEEN, NOT IN, NOT LIKE, IS NOT NULL

Arithmetic Expressions: + - * /
Used to generate a virtual/temporary column in query results.
Eg: SELECT description, price, (price*1.05) new_price
    FROM products
    --note: name of new column must follow the parentheses

Logical Connectives: AND OR
Eg: SELECT product_id, price
    FROM products
    WHERE price > 10 OR product_id = 10
    --note: in a query, AND is evaluated before OR
    --note: using IN ( , , ) has the same result as chaining conditions with OR
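
The notes on AND/OR precedence, IN versus OR, and IS NULL can be verified in a sketch, assuming SQLite through Python's sqlite3 module. The rows are hypothetical and include one product with a null price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER, product_code TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "H", 5.0), (2, "E", 15.0), (3, "X", 15.0), (10, "H", None)],
)


def ids(where):
    """Return product ids matching the given WHERE condition."""
    sql = f"SELECT product_id FROM products WHERE {where} ORDER BY product_id"
    return [pid for (pid,) in conn.execute(sql)]


# AND binds tighter than OR: read as  code = 'H' OR (code = 'E' AND price > 10).
no_parens = ids("product_code = 'H' OR product_code = 'E' AND price > 10")
with_parens = ids("(product_code = 'H' OR product_code = 'E') AND price > 10")

# IN is shorthand for a chain of OR comparisons.
using_in = ids("product_code IN ('H', 'E')")
using_or = ids("product_code = 'H' OR product_code = 'E'")

# NULL is found with IS NULL, never with '='.
is_null = ids("price IS NULL")
equals_null = ids("price = NULL")
print(no_parens, with_parens, using_in, is_null, equals_null)
```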

Aggregate Functions:

SUM() gives the total of the given column over all the rows satisfying any conditions, where the given column is numeric.
AVG() gives the average of the given column.
MAX() gives the largest figure in the given column.
MIN() gives the smallest figure in the given column.
COUNT(*) gives the number of rows satisfying the conditions.

Note: An aggregate function takes an entire column of data and produces a single data item that summarizes the column. They are: AVG( ) SUM( ) COUNT( ) MAX( ) MIN( )

Eg: SELECT COUNT(*) FROM stocks    --COUNT(*) counts every row, including rows holding nulls

Eg: SELECT COUNT(DISTINCT region) FROM warehouses    --the word DISTINCT eliminates duplicates from the count

Eg: SELECT COUNT(Qty) FROM stocks    --does not count nulls

Eg: SELECT COUNT(*), MAX(price), MIN(price), AVG(price) FROM products
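
The difference between COUNT(*), COUNT(column) and COUNT(DISTINCT column) shows up as soon as a column holds a null. The sketch assumes SQLite through Python's sqlite3 module and four hypothetical stock rows, one of them with a null quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (prod_id INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?)",
    [(1, 5), (2, None), (3, 5), (4, 12)],
)

count_all, count_qty, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(qty), COUNT(DISTINCT qty) FROM stocks"
).fetchone()

# SUM, MAX, MIN and AVG also skip the null quantity.
total, largest, smallest, average = conn.execute(
    "SELECT SUM(qty), MAX(qty), MIN(qty), AVG(qty) FROM stocks"
).fetchone()
print(count_all, count_qty, count_distinct, total, largest, smallest, average)
```

AVG divides by the three non-null quantities, not by the four rows.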

GROUP BY
Eg: SELECT prod_code, AVG(price)
    FROM products
    GROUP BY prod_code
    -- note: every non-aggregated column in the SELECT list has to appear in the GROUP BY clause

HAVING (the WHERE clause of the GROUP BY)
-- The GROUP BY clause (with aggregate function):
Eg: SELECT state, COUNT(*) AS 'Total'
    FROM authors
    GROUP BY state
-- The HAVING clause (with aggregate function):
Eg: SELECT state, COUNT(*) AS 'Total'
    FROM authors
    GROUP BY state
    HAVING COUNT(*) > 1

JOIN
Example of an equi-join (when there is an exact match between the columns):
Eg: SELECT products.prod_id, description       --(full path name avoids ambiguity if
    FROM products, stocks                      -- column name is similar in two tables)
    WHERE products.prod_id = stocks.prod_id    --(join statement)
    AND qty = 5

An example of a Cartesian product (a join with no join condition) follows. It produces every combination of the rows of the two tables! Not used frequently since it mostly returns useless information.
Eg: SELECT * FROM products, stocks

An example of a join on more than two tables follows:
Eg: SELECT description, qty, location
    FROM products, stocks, warehouses
    WHERE products.prod_id = stocks.prod_id
    AND stocks.warehouse_id = warehouses.warehouse_id
    AND qty < 5

ALIAS
Eg: SELECT description, qty, location
    FROM products p, stocks s, warehouses w    -- <- alias declaration
    WHERE p.prod_id = s.prod_id
    AND s.warehouse_id = w.warehouse_id
    AND qty < 5
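
The grouping and join forms above can be run together in one sketch, assuming SQLite through Python's sqlite3 module and hypothetical rows. It contrasts the join with no condition against the equi-join, then applies GROUP BY and HAVING to the joined rows using table aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id INTEGER PRIMARY KEY, description TEXT)")
conn.execute("CREATE TABLE stocks (prod_id INTEGER, warehouse_id INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?)", [(1, "hammer"), (2, "pipe")])
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 20, 2), (2, 10, 5)],
)

# No join condition: every product row is paired with every stock row (2 x 3).
cartesian_count = conn.execute("SELECT COUNT(*) FROM products, stocks").fetchone()[0]

# Equi-join with aliases p and s: only rows whose prod_id values match.
equi_rows = conn.execute(
    "SELECT p.description, s.qty FROM products p, stocks s "
    "WHERE p.prod_id = s.prod_id AND s.qty = 5 ORDER BY p.description"
).fetchall()

# GROUP BY builds one group per product; HAVING filters the groups.
stocked_twice = conn.execute(
    "SELECT p.description, COUNT(*) AS total FROM products p, stocks s "
    "WHERE p.prod_id = s.prod_id "
    "GROUP BY p.description HAVING COUNT(*) > 1"
).fetchall()
print(cartesian_count, equi_rows, stocked_twice)
```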

Self-joins are used in a multi-table query involving a relationship a table has with itself.
Eg: SELECT w2.location
    FROM warehouses w1, warehouses w2    --(LEFT and RIGHT are reserved words, so they are avoided as aliases)
    WHERE w1.region = w2.region
    AND w1.warehouse_id <> w2.warehouse_id

An outer join is needed when a value in a joining column in one table has no matching value in the joined table.
Eg: SELECT location, region_name
    FROM region r FULL OUTER JOIN warehouses w
    ON r.region_id = w.region_id

Subquery
Eg: SELECT prod_id, description
    FROM products
    WHERE prod_id IN (SELECT prod_id FROM stocks WHERE qty > 50)

Inner join statements (rows that match on the join condition must be present in each table):
Eg: SELECT p.pub_id, pub_name, title
    FROM publishers AS p INNER JOIN titles AS t
    ON p.pub_id = t.pub_id
    WHERE type = 'business'

Eg: SELECT stor_name, title, ord_date, qty
    FROM stores
    INNER JOIN sales ON stores.stor_id = sales.stor_id
    INNER JOIN titles ON sales.title_id = titles.title_id
    WHERE type = 'popular_comp'

Left and right outer joins (a left join returns every row of the left table; where no row of the right table matches the join condition, the columns from the right table are null):
Eg: SELECT titles.title_id, title, ord_date, qty
    FROM titles LEFT OUTER JOIN sales
    ON titles.title_id = sales.title_id
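
The difference between an inner join, a left outer join and an IN subquery can be seen on two tiny tables. This is a sketch, assuming SQLite through Python's sqlite3 module; the titles and sales rows are hypothetical, and one title has no sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title_id TEXT PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE sales (title_id TEXT, qty INTEGER)")
conn.executemany("INSERT INTO titles VALUES (?, ?)", [("T1", "Cooking"), ("T2", "Databases")])
conn.execute("INSERT INTO sales VALUES ('T1', 60)")  # 'T2' has no sale

# Inner join: only titles that have a matching sales row.
inner_rows = conn.execute(
    "SELECT t.title, s.qty FROM titles t INNER JOIN sales s "
    "ON t.title_id = s.title_id ORDER BY t.title"
).fetchall()

# Left outer join: every title, with NULL (None in Python) where no sale matches.
left_rows = conn.execute(
    "SELECT t.title, s.qty FROM titles t LEFT OUTER JOIN sales s "
    "ON t.title_id = s.title_id ORDER BY t.title"
).fetchall()

# Subquery: the inner SELECT supplies the list that IN tests against.
big_sellers = conn.execute(
    "SELECT title FROM titles "
    "WHERE title_id IN (SELECT title_id FROM sales WHERE qty > 50)"
).fetchall()
print(inner_rows, left_rows, big_sellers)
```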

VIEW
Eg: CREATE VIEW James_view AS
    SELECT prod_id, description
    FROM products
    WHERE price < 17.50;

DDL (Data Definition Language)

CREATE: also using UNIQUE, NOT NULL, ASSERTION, DOMAIN, CHECK...
ALTER: used to add a column, add/delete primary/foreign keys, add/drop a uniqueness or CHECK constraint
DROP: when using RESTRICT, DROP will fail if the table has dependent objects
CREATE VIEW
CREATE INDEX

The catalog holds information about roles and privileges. It has information on VIEWS.

Eg: DROP TABLE customer

Eg: To create a table:
    CREATE TABLE title (
        Au_id ID,
        Title_id TID,
        au_ord TINYINT NULL,
        typer INT NULL
    )

Eg: To create a temporary table (visible to the present connection, deleted at logout):
    CREATE TABLE #temp1 (
        Au_id ID,
        Title_id TID
    )

Eg: To create a global temporary table (visible to all connections on the server, deleted when all have logged out):
    CREATE TABLE ##temp1 (
        Au_id ID,
        Title_id TID
    )

Eg: To create a view:
    CREATE VIEW myfirstview AS
    SELECT fname, lname, address
    FROM customers

Eg: To remove a view from the database:
    DROP VIEW myfirstview

Eg: To change a view:
    ALTER VIEW myfirstview (store_name, qty_sold, date_sold, title) AS
    SELECT store, qty, date, title_name
    FROM stores INNER JOIN sales
    ON sales.stor_id = stores.stor_id

DCL (Data Control Language)

GRANT privilege TO username IDENTIFIED BY password;

CREATE SYNONYM
Used to shorten the user/owner name of a table.
CREATE PUBLIC SYNONYM ... FOR ... ;
Public synonyms can be created by the DBA.

COMMIT and ROLLBACK

An explicit transaction is a group of statements that must all succeed or must all fail. To turn ON the ANSI SQL-92 behaviour use:
SET IMPLICIT_TRANSACTIONS ON    --ANSI SQL-92
COMMIT WORK
ROLLBACK WORK

COMMIT;      --makes permanent the changes made since the last COMMIT (that span of work is known as a transaction)
ROLLBACK;    --takes back any changes to the database that you have made, back to the last time you gave a COMMIT command

Beware! Some software uses automatic committing on systems that use the transaction features, so the ROLLBACK command may not work.

Mathematical Functions:

ABS(X): Absolute value. Converts negative numbers to positive, or leaves positive numbers alone.
CEIL(X): Rounds the decimal value X up to the next integer.
FLOOR(X): Rounds the decimal value X down to the previous integer.
GREATEST(X,Y): Returns the larger of the two values.
LEAST(X,Y): Returns the smaller of the two values.
MOD(X,Y): Returns the remainder of X / Y.
POWER(X,Y): Returns X to the power of Y.
ROUND(X,Y): Rounds X to Y decimal places. If Y is omitted, X is rounded to the nearest integer.
SIGN(X): Returns -1 if X < 0, 0 if X = 0, and 1 if X > 0.
SQRT(X): Returns the square root of X.

Character Functions

LEFT(<string>,X): Returns the leftmost X characters of the string.
RIGHT(<string>,X): Returns the rightmost X characters of the string.
UPPER(<string>): Converts the string to all uppercase letters.
LOWER(<string>): Converts the string to all lowercase letters.
INITCAP(<string>): Converts the string to initial caps.
LENGTH(<string>): Returns the number of characters in the string.
<string>||<string>: Combines the two strings of text into one concatenated string, where the first string is immediately followed by the second.
LPAD(<string>,X,'*'): Pads the string on the left with the * (or whatever character is inside the quotes), to make the string X characters long.
RPAD(<string>,X,'*'): Pads the string on the right with the * (or whatever character is inside the quotes), to make the string X characters long.
SUBSTR(<string>,X,Y): Extracts Y characters from the string, beginning at position X.
NVL(<column>,<value>): The null value function substitutes <value> for any NULLs in the <column>. If the current value of <column> is not NULL, NVL has no effect.

Command: Meaning

/: Executes the SQL buffer
? [Keyword]: Provides SQL help on the keyword
@[@] [Filename] [Parameter list]: Runs the specified command file, passing the specified parameters
ACC[EPT] Variable [DEF[AULT] Value] [PROMPT Text | NOPR[OMPT]]: Allows the user to enter the value of a substitution variable
CL[EAR] [SCR[EEN]]: Clears the screen
CL[EAR] SQL: Clears the SQL buffer
COL[UMN] [Column] [Format]: Defines the format of a column, displays the format of a column, or displays all column formats
CON[NECT] [username/password@database]: Connects to the database with the specified user
DEF[INE] [Variable] [ = Text]: Defines a substitution variable, displays a variable, or displays all substitution variables
DESC[RIBE] Object: Gives a description of the specified object
DISC[ONNECT]: Disconnects from the database
EDIT: Displays a text editor to edit the SQL buffer
EXEC[UTE] Procedure: Executes the specified procedure
EXIT: Quits a running script or closes the Command Window
GET [Filename]: Loads a command file into the editor
HOST [Command]: Executes the host command
HELP [Keyword]: Provides SQL help on the keyword
INFO: Displays information about the connection
PAUSE [Message]: Displays the message and pauses until the user presses return
PRI[NT] [Variable]: Displays the value of the bind variable, or all bind variables
PROMPT [Text]: Displays the specified text
QUIT: Quits a running script or closes the Command Window
R[UN]: Executes the SQL buffer
REM[ARK] [Text]: A comment line
SET AUTOP[RINT] [ON | OFF]: Determines if bind variables are automatically displayed after executing a SQL statement or PL/SQL block
SET CON[CAT] [Character | ON | OFF]: Determines the character that terminates a substitution variable reference (default = .)
SET DEF[INE] [Character | ON | OFF]: Determines the character that starts a substitution variable reference (default = &)
SET ECHO [ON | OFF]: Determines if executed commands in a script are displayed
SET ESC[APE] [Character | ON | OFF]: Determines the character that escapes the character that starts a substitution variable reference (default = \)
SET FEED[BACK] [ON | OFF]: Determines if the number of affected rows of a SQL statement is displayed
SET HEA[DING] [ON | OFF]: Determines if headings are displayed above the columns of a result set
SET LONG [Width]: Determines the maximum display width of a long column
SET PAGES[IZE] [Size]: Determines the number of lines that are displayed for a result set before the headings are repeated
SET SERVEROUT[PUT] [ON | OFF] [SIZE n]: Determines if output of calls to dbms_output.put_line is displayed, and what the size of the output buffer is
SET TERM[OUT] [ON | OFF]: Determines if output of executed SQL statements is displayed
SET TIMI[NG] [ON | OFF]: Determines if timing information about executed SQL statements is displayed
SET VER[IFY] [ON | OFF]: Determines if substitution variables are displayed when used in a SQL statement or PL/SQL block
SHO[W] REL[EASE]: Displays Oracle release information for the current connection
SHO[W] SQLCODE: Displays the result code of the executed SQL statement
SHO[W] USER: Displays the username of the current connection
SPO[OL] [Filename | OFF]: Starts or stops spooling
STA[RT] [Filename] [Parameter list]: Runs the specified command file, passing the specified parameters
STORE SET [Filename]: Stores the values of all options in the filename. You can execute this file later to restore these options.
UNDEF[INE] Variable: Undefines the given substitution variable
VAR[IABLE] [Variable] [Datatype]: Defines a bind variable, displays a bind variable, or displays all bind variables
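
The COMMIT and ROLLBACK behaviour described earlier in this sheet can be tried out in a minimal sketch. It assumes SQLite through Python's sqlite3 module with its default transaction handling, where the driver opens a transaction before the first data-changing statement. The stocks row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (prod_id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO stocks VALUES (1, 5)")
conn.commit()  # COMMIT: the inserted row is now permanent


def current_qty():
    """Read the quantity of product 1 as this connection sees it."""
    return conn.execute("SELECT qty FROM stocks WHERE prod_id = 1").fetchone()[0]


conn.execute("UPDATE stocks SET qty = 0 WHERE prod_id = 1")
inside_transaction = conn.in_transaction  # the UPDATE is pending, not yet committed
qty_before_rollback = current_qty()

conn.rollback()  # ROLLBACK: undo everything since the last COMMIT
qty_after_rollback = current_qty()
print(inside_transaction, qty_before_rollback, qty_after_rollback)
```

A client that commits automatically after each statement would leave nothing for the rollback to undo, which is the caveat noted for ROLLBACK.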
