
Section A Question 1:

 

ERD based question. You will have to draw one with entities, relationships and attributes and keys.

 

Section A Question 2:

 

Basic SQL. Does not require functions or triggers.

 

Section A Question 3:

 

T-SQL (3 sub-questions):

  • Recognise whether the given objects are functions, triggers or stored procedures

  • Write an execution statement

  • Read two objects, explain what tasks they perform and point out errors

 
 

Section B Questions 4, 5, 6:

 

Three BI questions, based on the Snow Abode case study. All BI solutions are very important.

 

Section C Questions 7, 8, 9:

Information Modelling

Ignoring time-variant data (current data alongside historic data):

  • Most data changes over time.

  • Updating data does not have to mean overwriting it.

  • More accurate information comes from time stamping: add an extra row for every change over time.

Fan trap problem: set hierarchies for the data.

Redundant relationships: relationships that add no information should be identified and discussed.

Use of surrogate keys: appropriate when the natural key is unsuitable as a primary key, for example because it is too long, combines multiple data types, or may change over time.

Use of enterprise keys: primary keys that are unique across the whole database, so the same key can be used in more than one relation. This corresponds to the concept of the key in object-oriented systems.
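Time stamping, as described for time-variant data above, can be sketched in T-SQL as follows. The ProductPrice table, its columns and the values are hypothetical and only illustrate the idea: a price change closes the current row and adds a new one instead of overwriting.

```sql
-- Hypothetical history table: one row per product per validity period
CREATE TABLE ProductPrice (
    ProductID INT      NOT NULL,
    Price     MONEY    NOT NULL,
    ValidFrom DATETIME NOT NULL,
    ValidTo   DATETIME NULL,           -- NULL marks the current row
    PRIMARY KEY (ProductID, ValidFrom)
);

INSERT INTO ProductPrice (ProductID, Price, ValidFrom, ValidTo)
VALUES (1, 10.00, '20200101', NULL);

-- Price change: close the current row, then add a new row
DECLARE @now DATETIME;
SET @now = GETDATE();

UPDATE ProductPrice
SET ValidTo = @now
WHERE ProductID = 1 AND ValidTo IS NULL;

INSERT INTO ProductPrice (ProductID, Price, ValidFrom, ValidTo)
VALUES (1, 12.50, @now, NULL);

-- The current price and the full history are both still available
SELECT Price FROM ProductPrice WHERE ProductID = 1 AND ValidTo IS NULL;
SELECT * FROM ProductPrice WHERE ProductID = 1 ORDER BY ValidFrom;
```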


Denormalization

Benefits: can improve performance by reducing the number of table lookups (fewer join queries are needed).

Costs: data duplication, which wastes storage space and threatens data integrity and consistency.

Common denormalization opportunities: one-to-one relationships; many-to-many relationships with non-key attributes (associative entities); reference data (1:N relationships where the 1-side has data not used in any other relationship).

 

Efficiency: records used together are grouped

Inconsistent access speed: slow retrievals across

 
 
Relational model


The data stored in a database are worthless unless one can interact with them to get useful information.

In relational algebra (the theoretical approach), queries are specified using a collection of operators. Such queries are said to be expressed in an operational manner; this style is procedural. In relational calculus, a query describes the desired answer without specifying how the answer is to be computed. This style is used heavily in QBE (Query By Example). Non-procedural querying of this kind is said to be declarative.

SQL Basic concepts

SQL can be used for five main purposes: to retrieve, define, manipulate and control data, and to control transactions. A statement is sent to the RDBMS server and the results are returned as a set of tuples.
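One statement for each of the five purposes, as a rough T-SQL sketch. The Customer table and the ReportUser database user are hypothetical; the GRANT only succeeds if such a user exists.

```sql
-- Define data
CREATE TABLE Customer (
    CustNo   INT         NOT NULL PRIMARY KEY,
    CustName VARCHAR(50) NOT NULL
);

-- Manipulate data
INSERT INTO Customer (CustNo, CustName) VALUES (1, 'Ann');

-- Retrieve data
SELECT CustNo, CustName FROM Customer;

-- Control access to data (ReportUser is a hypothetical database user)
GRANT SELECT ON Customer TO ReportUser;

-- Control transactions: both the update and the commit succeed or neither does
BEGIN TRANSACTION;
UPDATE Customer SET CustName = 'Anne' WHERE CustNo = 1;
COMMIT TRANSACTION;
```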

NULL values (unknown): tuples in SQL relations can have NULL as the value of one or more components. NULL is used for a missing value (a value exists but we don't know what it is), for an inapplicable value (no value applies in this case), and to initialise a variable at the beginning of a program (SET @variable = NULL).
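A short sketch of how NULL behaves in T-SQL comparisons, assuming the default ANSI_NULLS ON setting. The variable name follows the notes; the messages are illustrative.

```sql
DECLARE @variable INT;      -- NULL until a value is assigned
SET @variable = NULL;       -- explicit initialisation, as in the notes

-- A comparison with NULL is unknown, not true, so this branch is skipped
IF @variable = NULL
    PRINT 'not reached when ANSI_NULLS is ON';

-- IS NULL is the reliable test
IF @variable IS NULL
    PRINT 'variable is NULL';

-- ISNULL substitutes a replacement value when the argument is NULL
SELECT ISNULL(@variable, 0) AS ValueOrZero;
```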

Aggregate functions such as SUM, AVG, COUNT, MAX and MIN perform calculations over the groups formed by GROUP BY. After GROUP BY, a HAVING clause filters groups in the way WHERE filters rows; WHERE itself cannot contain aggregate functions.
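A minimal sketch of WHERE, GROUP BY and HAVING together, using a hypothetical table variable so the batch is self-contained.

```sql
-- Hypothetical order lines held in a table variable
DECLARE @OrderLine TABLE (CustOrdNo INT, Qty INT, UnitPrice MONEY);

INSERT INTO @OrderLine (CustOrdNo, Qty, UnitPrice)
VALUES (1, 2, 5.00), (1, 1, 20.00), (2, 3, 4.00), (3, 1, 1.00);

SELECT CustOrdNo,
       COUNT(*)             AS LineCount,
       SUM(Qty * UnitPrice) AS OrderTotal
FROM @OrderLine
WHERE Qty > 0                        -- filters rows; no aggregates allowed here
GROUP BY CustOrdNo
HAVING SUM(Qty * UnitPrice) > 10     -- filters groups; aggregates allowed here
ORDER BY CustOrdNo;
```

With these rows, orders 1 (total 30.00) and 2 (total 12.00) pass the HAVING filter and order 3 (total 1.00) does not.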

View

Simplifying complex joins

Reformatting retrieved data for efficiency of the query

Filtering unwanted data

To include calculated fields

• To change a view, drop it and create a new view (T-SQL also provides ALTER VIEW).
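The uses of a view listed above can be sketched as follows. The Employee table and ActiveEmployee view are hypothetical. GO is the batch separator used by SQL Server tools such as Management Studio and sqlcmd; it is needed because CREATE VIEW must be the first statement in its batch.

```sql
CREATE TABLE Employee (
    EmpNo     INT PRIMARY KEY,
    FirstName VARCHAR(30),
    LastName  VARCHAR(30),
    Salary    MONEY,
    Active    BIT
);
GO

-- Filters unwanted rows, hides Salary and adds a calculated field
CREATE VIEW ActiveEmployee AS
SELECT EmpNo,
       FirstName + ' ' + LastName AS FullName
FROM Employee
WHERE Active = 1;
GO

SELECT EmpNo, FullName FROM ActiveEmployee;
GO

-- Changing the definition by dropping and re-creating the view
DROP VIEW ActiveEmployee;
GO

CREATE VIEW ActiveEmployee AS
SELECT EmpNo, LastName
FROM Employee
WHERE Active = 1;
GO
```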

ERD generalisation concepts

All subtypes inherit the common attributes of the supertype, and each subtype also has its own attributes.

Generalization: the process of defining a more general entity type from a set of more specialized entity types. Entities are aggregated into a superclass entity type by identifying their common characteristics.

Specialization: the process of defining one or more subtypes of the supertype and forming supertype/subtype relationships. It identifies subclasses and their distinguishing characteristics.

EER (extended entity-relationship) diagrams are difficult to read when there are too many entities and relationships, so entities and relationships are grouped into entity clusters: a set of one or more entity types and associated relationships grouped into a single abstract entity type.

Programming with T-SQL

Variable

A variable is a named object that stores a value within a program.

By default it does not persist outside the program (unlike values stored in tables and views in the database). Such variables are known as local variables and are prefixed with @.

Names prefixed with @@ (for example @@FETCH_STATUS) are often called global variables. In SQL Server they are system-defined functions that any program can read, but users cannot declare or assign them.

Once a variable is declared with a data type, it cannot be undeclared within the scope of the program.

A variable has no value (it is NULL) when first declared, so use SET to assign a single variable or SELECT to assign several at once.

Date functions:

  • GETDATE(): returns the current date and time.

  • DATEPART(): returns part of a date (such as the day, the day of the week or the month).

  • DATEPART(dw, GETDATE()): returns the current day of the week (1 for Sunday and 2 for Monday under the default US English setting).

BEGIN and END define a block of code. Use them in IF and WHILE (looping) statements: without a block, IF or ELSE controls only the single statement that follows it, and when the condition is not true that statement is skipped.
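The pieces above (local variables, SET and SELECT assignment, DATEPART, and BEGIN...END blocks) fit together roughly like this. The variable names and messages are illustrative, and the weekend test assumes the default DATEFIRST setting in which Sunday is day 1.

```sql
DECLARE @today DATETIME, @dayNo INT, @label VARCHAR(10);

SET @today = GETDATE();                     -- SET assigns one variable

SELECT @dayNo = DATEPART(dw, @today),       -- SELECT can assign several
       @label = 'weekday';

-- Assumes default DATEFIRST 7: 1 = Sunday, 7 = Saturday
IF @dayNo = 1 OR @dayNo = 7
BEGIN
    -- Both statements belong to the IF because of BEGIN...END
    SET @label = 'weekend';
    PRINT 'No processing today';
END
ELSE
    PRINT 'Normal processing';              -- single statement, no block needed

-- WHILE loop with a block
DECLARE @i INT;
SET @i = 1;
WHILE @i <= 3
BEGIN
    PRINT 'Pass ' + CAST(@i AS VARCHAR(10));
    SET @i = @i + 1;
END
```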


Stored procedures

Stored procedures are batched collections of SQL statements saved for future use. The statements are grouped together and run as one unit, sequentially in the order written within the procedure.

Benefits:

  • Data integrity: the same report always gives the same results if conditions are kept static.

  • Fewer errors and better consistency and security, with improved performance and reduced workload.

  • Independence of business logic and data.

  • Simplicity, operational efficiency, performance and security.
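A minimal sketch of creating and executing a stored procedure with an OUTPUT parameter. The OrderLine table and the GetOrderTotal procedure are hypothetical and are not the Order_TOTAL procedure used elsewhere in these notes. GO separates batches because CREATE PROCEDURE must be the first statement in its batch.

```sql
CREATE TABLE OrderLine (CustOrdNo INT, Qty INT, UnitPrice MONEY);

INSERT INTO OrderLine (CustOrdNo, Qty, UnitPrice)
VALUES (1, 2, 5.00), (1, 1, 20.00);
GO

CREATE PROCEDURE GetOrderTotal
    @orderNo INT,
    @total   MONEY OUTPUT          -- value handed back to the caller
AS
BEGIN
    SET NOCOUNT ON;

    -- ISNULL turns "no matching rows" into a total of 0
    SELECT @total = ISNULL(SUM(Qty * UnitPrice), 0)
    FROM OrderLine
    WHERE CustOrdNo = @orderNo;
END;
GO

-- Execution statement: OUTPUT is required on the call as well
DECLARE @t MONEY;
EXECUTE GetOrderTotal @orderNo = 1, @total = @t OUTPUT;
SELECT @t AS OrderTotal;
```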


Cursors

A cursor points to one record at a time, allowing operations to be performed on the tuple it is pointing to. It is a database query stored in SQL Server: not the SELECT statement itself but the relational set resulting from that statement. We can think of it as a pointer.

Cursors are created with the same DECLARE statement that is used to declare a variable. DECLARE names the cursor and takes a SELECT statement that defines it. To remove an unused cursor, use the DEALLOCATE statement.

Once the cursor is opened, the data can be accessed by stepping through the resulting relation with the FETCH statement. Each FETCH accesses the next row, then the next, and so on.

FETCH options

FETCH PRIOR: retrieves the previous row of a relation

FETCH FIRST: retrieves the first row of a relation

FETCH LAST: retrieves the last row of a relation

FETCH ABSOLUTE: retrieves a specific row of a relation, counting from the top

FETCH RELATIVE: retrieves a specific row of a relation, counting from the current row
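The FETCH options above need a scrollable cursor. The following is a small sketch over a hypothetical four-row table variable; the cursor is declared STATIC and SCROLL so that every option is available.

```sql
DECLARE @Demo TABLE (n INT PRIMARY KEY);
INSERT INTO @Demo (n) VALUES (10), (20), (30), (40);

DECLARE @n INT;

DECLARE Demo_Cursor CURSOR LOCAL SCROLL STATIC FOR
    SELECT n FROM @Demo ORDER BY n;

OPEN Demo_Cursor;

FETCH FIRST      FROM Demo_Cursor INTO @n;   -- row 1
FETCH LAST       FROM Demo_Cursor INTO @n;   -- row 4
FETCH PRIOR      FROM Demo_Cursor INTO @n;   -- row 3, one before the current row
FETCH ABSOLUTE 2 FROM Demo_Cursor INTO @n;   -- row 2, counted from the top
FETCH RELATIVE 2 FROM Demo_Cursor INTO @n;   -- row 4, two rows after the current row

SELECT @n AS LastValueFetched;

CLOSE Demo_Cursor;
DEALLOCATE Demo_Cursor;
```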

DECLARE @orderNo INT;
DECLARE @orderTotal MONEY;
DECLARE @grandTotal MONEY;

SET @grandTotal = 0;

-- Cursor over every order number, in ascending order
DECLARE CustOrders_Cursor CURSOR FOR
    SELECT CustOrdNo FROM CustOrder ORDER BY CustOrdNo;

OPEN CustOrders_Cursor;

FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;

-- @@FETCH_STATUS is 0 while the last FETCH returned a row
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Order_TOTAL returns one order's total through its OUTPUT parameter
    EXECUTE Order_TOTAL @orderNo, 1, @orderTotal OUTPUT;

    SET @grandTotal = @grandTotal + @orderTotal;

    FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;
END

CLOSE CustOrders_Cursor;
DEALLOCATE CustOrders_Cursor;

SELECT @grandTotal AS GrandTotal;

Triggers

Triggers are used to create an audit trail of what happens in a database and also ensure data integrity and consistency.

  • 1. Specify a unique name for the trigger.

  • 2. Specify the table on which the trigger will act.

  • 3. Specify which database event the trigger should respond to: INSERT, UPDATE or DELETE. Example: AFTER INSERT

  • 4. If the trigger should run after more than one such event, list them all. Example: AFTER INSERT, UPDATE

  • 5. To delete a trigger use: DROP TRIGGER <trigger name>;

  • 6. To change a trigger use ALTER TRIGGER <trigger name> followed by the new definition, or DROP and CREATE the trigger again.
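The steps above can be sketched as an audit-trail trigger. The Product and ProductAudit tables and the trigger name are hypothetical. The `inserted` pseudo-table is provided by SQL Server and holds the new versions of the affected rows.

```sql
CREATE TABLE Product (
    ProductID INT   PRIMARY KEY,
    Price     MONEY NOT NULL
);

CREATE TABLE ProductAudit (
    AuditID   INT IDENTITY(1, 1) PRIMARY KEY,
    ProductID INT,
    NewPrice  MONEY,
    ChangedAt DATETIME DEFAULT GETDATE()
);
GO

CREATE TRIGGER trg_Product_Audit        -- step 1: unique name
ON Product                              -- step 2: table the trigger acts on
AFTER INSERT, UPDATE                    -- steps 3 and 4: events
AS
BEGIN
    SET NOCOUNT ON;

    -- Record one audit row per inserted or updated product row
    INSERT INTO ProductAudit (ProductID, NewPrice)
    SELECT ProductID, Price
    FROM inserted;
END;
GO

INSERT INTO Product (ProductID, Price) VALUES (1, 10.00);
UPDATE Product SET Price = 12.50 WHERE ProductID = 1;

-- Two audit rows are expected: one for the insert, one for the update
SELECT ProductID, NewPrice, ChangedAt FROM ProductAudit ORDER BY AuditID;
GO

DROP TRIGGER trg_Product_Audit;         -- step 5: delete the trigger
GO
```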

Identity

IDENTITY is a column property in SQL Server that can be used when creating a table, similar to creating a sequence in Oracle. Only one identity column is allowed per table, generally for the PK. Example: CREATE TABLE Test1 (Test1_ID INT NOT NULL IDENTITY(1, 1)), where the first argument is the value to start with and the second is the increment.
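A brief sketch of an identity column in use, extending the Test1 example with a hypothetical Descr column. SCOPE_IDENTITY() is a built-in function that returns the last identity value generated in the current scope.

```sql
CREATE TABLE Test1 (
    Test1_ID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,  -- start at 1, step by 1
    Descr    VARCHAR(20)
);

-- The identity column is left out of the column list; SQL Server fills it in
INSERT INTO Test1 (Descr) VALUES ('first');
INSERT INTO Test1 (Descr) VALUES ('second');

SELECT Test1_ID, Descr FROM Test1 ORDER BY Test1_ID;

SELECT SCOPE_IDENTITY() AS LastIdentity;
```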


Functions

A function is a programmable object used to perform one or more calculations and return a value to a calling application, or to integrate values into a result set. Functions can manipulate data and return a value, but they cannot modify data stored in the database.

Built-in functions are global functions available to all developers of SQL Server applications. Even though they are not portable, it is better to use them than to write new, more portable functions, because they perform better within SQL Server.

User-defined functions (can be created only to manipulate data) may contain:

  • DECLARE statements that define data variables and cursors local to the function

  • SET statements that assign values to scalar and table local variables

  • Cursor operations, including FETCH statements that assign values to local variables using the INTO clause

  • SELECT statements whose select lists contain expressions that assign values to variables local to the function

  • UPDATE, INSERT and DELETE statements that modify table variables local to the function

  • EXECUTE statements that call an extended stored procedure

| User-defined function | Stored procedure |
| --- | --- |
| Must return a value: a scalar or a single result set | Can return values or even multiple result sets |
| Can return table variables | Cannot return a table variable, although it can create a table |
| Can be used directly in SELECT, ORDER BY, WHERE and FROM clauses | Cannot be used in a SELECT statement |
| Cannot change server environment variables | Can change server environment variables |
| Always stops execution when an error occurs | Can continue after an error if proper error-handling code is used |
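To illustrate the comparison above, here is a minimal scalar user-defined function used directly inside a query. The function name dbo.LineTotal and the table variable are hypothetical. Scalar functions are called with their schema name, and CREATE FUNCTION must be the first statement in its batch.

```sql
CREATE FUNCTION dbo.LineTotal (@qty INT, @unitPrice MONEY)
RETURNS MONEY
AS
BEGIN
    -- Pure calculation: no changes to stored data
    RETURN @qty * @unitPrice;
END;
GO

DECLARE @OrderLine TABLE (CustOrdNo INT, Qty INT, UnitPrice MONEY);

INSERT INTO @OrderLine (CustOrdNo, Qty, UnitPrice)
VALUES (1, 2, 5.00), (2, 3, 4.00), (3, 1, 1.00);

-- The function appears in the SELECT list, WHERE clause and ORDER BY clause
SELECT CustOrdNo,
       dbo.LineTotal(Qty, UnitPrice) AS LineTotal
FROM @OrderLine
WHERE dbo.LineTotal(Qty, UnitPrice) > 5
ORDER BY dbo.LineTotal(Qty, UnitPrice) DESC;
GO
```

A stored procedure could compute the same totals, but it would have to be run with EXECUTE and could not be embedded in the SELECT like this.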

Data Warehouse

A data warehouse is an integrated, subject-oriented, time-variant and non-volatile database that supports management decision making in an organization. It is needed to provide an integrated, company-wide view of high-quality information drawn from disparate databases, and to separate operational from informational systems and data for improved performance.

Operational or transaction processing system:

  • Substitutes computer-based processing for manual procedures

  • Deals with well-structured routine processes

Data warehouse motivation:

  • Massive amounts of data from business transactions

  • Improvements in IT

  • Intense competition for customers' attention

 

| | Operational data | DS data |
| --- | --- | --- |
| Timespan | Represent current transactions | Tend to cover a long time frame |
| Granularity | Represent specific transactions that occur at a given time | Presented at different levels of aggregation |
| Dimensionality | Focus on representing atomic transactions | Can be analyzed from multiple dimensions |

When we create a data warehouse, the sources are external data and internal (operational) data. We extract, filter, transform, integrate, classify, aggregate and summarise these data and then load them into the data warehouse, which makes it integrated, subject-oriented, time-variant and non-volatile.

Star schema

Data mining

Facts, dimensions, attributes, attribute hierarchies
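A star schema built from these components might look like the following sketch. The tables and columns are hypothetical and are not taken from the Snow Abode case study. The fact table holds the numeric facts, each dimension table holds descriptive attributes, and the date columns form an attribute hierarchy (year, quarter, month).

```sql
-- Dimension tables: descriptive attributes
CREATE TABLE DimDate (
    DateKey      INT PRIMARY KEY,
    CalendarDate DATE,
    MonthNo      INT,
    QuarterNo    INT,
    YearNo       INT
);

CREATE TABLE DimProduct (
    ProductKey  INT PRIMARY KEY,
    ProductName VARCHAR(50),
    Category    VARCHAR(30)
);

-- Fact table: numeric measures plus a foreign key to each dimension
CREATE TABLE FactSales (
    DateKey     INT   NOT NULL REFERENCES DimDate (DateKey),
    ProductKey  INT   NOT NULL REFERENCES DimProduct (ProductKey),
    Quantity    INT   NOT NULL,
    SalesAmount MONEY NOT NULL
);

-- Typical query: join the fact table to its dimensions and aggregate
SELECT d.YearNo,
       p.Category,
       SUM(f.SalesAmount) AS TotalSales
FROM FactSales AS f
JOIN DimDate    AS d ON d.DateKey    = f.DateKey
JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
GROUP BY d.YearNo, p.Category
ORDER BY d.YearNo, p.Category;
```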


Steps in data reconciliation

capture - scrub - transform - load and index

Data warehouse pitfalls

getting reconciled metadata for the enterprise data model

getting clean data values

communication problems

lack of technical expertise

poor planning

Database architecture

Database engine

  • Is the core service for storing, processing and securing data.

  • Provides controlled access and rapid transaction processing to meet requirements, with rich support for sustaining high availability.

  • Is used to create relational databases for OLTP or OLAP data, including creating tables and other database objects for viewing, managing and securing data.

  • SQL Server Management Studio can be used to manage the database objects.

  • SQL Server Profiler can be used for capturing server events.

Analysis Services: multidimensional data

It provides fast, intuitive, top-down analysis of large quantities of data built on this unified data model, which can be delivered to users in multiple languages and currencies.

It also works with data warehouses, data marts, production databases and operational data stores, supporting analysis of both historical and real-time data.

OLAP data can be disaggregated and aggregated along a dimension according to their natural hierarchy
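Aggregation along a hierarchy can be approximated in plain T-SQL with GROUP BY ROLLUP. This is only a relational sketch over a hypothetical table variable, not Analysis Services itself.

```sql
DECLARE @Sales TABLE (YearNo INT, QuarterNo INT, Amount MONEY);

INSERT INTO @Sales (YearNo, QuarterNo, Amount)
VALUES (2023, 1, 100), (2023, 2, 150), (2024, 1, 200);

-- ROLLUP follows the hierarchy year > quarter:
-- one row per quarter, a subtotal per year (QuarterNo is NULL),
-- and a grand total (both columns are NULL)
SELECT YearNo,
       QuarterNo,
       SUM(Amount) AS Total
FROM @Sales
GROUP BY ROLLUP (YearNo, QuarterNo);
```

Reading the quarter rows is the disaggregated view; the NULL rows are the same data aggregated up one and two levels of the hierarchy.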

Issues are query performance & reliability, integration & flexibility, capacity & scalability, exponential database growth, total cost of ownership and rapid technological changes.

Intelligence density

BI is what is achieved for the decision makers in a business through techniques and tools such as MSS, data warehousing, OLAP and data mining, used either individually or in a combination of two or more.