
UNIVERSITI TUNKU ABDUL RAHMAN

JANUARY 2020 TRIMESTER

MAIN FINAL ASSESSMENT

UCCD3233 DATA WAREHOUSE MODELLING AND IMPLEMENTATION

2:00PM, 4th MAY 2020 DURATION: (3 HOURS)

BACHELOR OF INFORMATION SYSTEMS (HONS)
INFORMATION SYSTEMS ENGINEERING
BACHELOR OF INFORMATION SYSTEMS (HONS)
BUSINESS INFORMATION SYSTEMS

Instructions to Students:

General

1. This Final Assessment (FA) is an Individual, Open-Book assessment which consists
of FOUR (4) questions. Each question carries 25 marks.

2. You are required to answer ALL questions, and submit the ANSWER SCRIPT by
5:00pm, 4th MAY 2020.

3. During the period of 3 hours of this FA, the examiner(s) can be reached at
(a). Microsoft Teams with Code/Password: 8uwsdst or
(b). Email: uccd3233chat@gmail.com
You may use the above e-platform(s) to check with the examiner(s) if you need any
clarification on this FA question paper.

4. You may refer to any books, lecture notes, published materials, online resources, etc.
when answering the questions. However, COPY-AND-PASTE, DISCUSSION, and
SHARING OF ANSWERS are STRICTLY PROHIBITED during the FA.

___________________________________________________________________________
This assessment paper consists of 4 questions on 10 printed pages.

Answer Script File

5. The answer script MUST be either a Microsoft Word or PDF file, in A4 size format.
Note: The file size is limited to 10MB.

6. Please check your index number generated by the Division of Examinations, Awards,
and Scholarships (DEAS). You MUST name your answer script using the following
file name for submission:
[Course Code]_FA_[Programme Abbreviation]_[Your Index Number]

For example, if you are from the degree programme IB, and your Index Number is
A01234CBIBF, then your answer script should be named as
UCCD3233_FA_IB_A01234CBIBF.doc
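
The naming rule in item 6 can be sketched as a small helper. This is only an illustration, not an official tool: the function name `answer_script_name` and its parameters are hypothetical, and the extension argument is an assumption based on the `.doc` example above.

```python
def answer_script_name(course_code: str, programme: str,
                       index_number: str, extension: str = "") -> str:
    """Build [Course Code]_FA_[Programme Abbreviation]_[Your Index Number].

    The optional extension (for example "doc") is an assumption taken
    from the example in the instructions; leave it empty to get the
    bare name without a file extension.
    """
    base = f"{course_code}_FA_{programme}_{index_number}"
    return f"{base}.{extension}" if extension else base


# Example values taken from the instructions above.
print(answer_script_name("UCCD3233", "IB", "A01234CBIBF", "doc"))
```

Calling the helper without an extension gives the bare name in the pattern shown above.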

Answer Script Submission

7. Your answer script file has to be submitted to BOTH of the following platforms
before the due time/date.
(a). Attach your answer script at Google Form: [Please click here for answer
script submission through Google Form], or copy and paste the following
link:
https://docs.google.com/forms/d/e/1FAIpQLSfwDfFyqeJ9r1IDhjwTWKe2XPlf5iM_HED4HpVNsF5Hej7uzA/viewform?usp=sf_link
Note: Use your “1UTAR” email account to access the Google Form.
You can submit the answer script to the Google Form ONLY ONCE.

(b). Send your answer script to the following Email Account according to your
programme:
i. IA students please send your answer script to:
[ UCCD3233FAIA@gmail.com ]
ii. IB students please send your answer script to:
[ UCCD3233FAIB@gmail.com ]
Note: For the title of your email, please use the file name of your answer
script. That is,
UCCD3233_FA_[Programme Abbreviation]_[Your Index Number]

8. Please make sure that you submit the same copy of your answer script to both of the
above platforms. If different answer scripts are received, the examiner will randomly
choose one of them to mark and the other will be ignored entirely.


9. The answer script submitted after the due time/date may incur a late penalty as shown
below:
i. 0 hour < lateness ≤ 0.5 hour: 10% mark deduction
ii. 0.5 hour < lateness ≤ 1 hour: 20% mark deduction
iii. 1 hour < lateness ≤ 1.5 hours: 30% mark deduction
iv. 1.5 hours < lateness ≤ 2 hours: 40% mark deduction
v. 2 hours < lateness ≤ 3 hours: 50% mark deduction
vi. Lateness > 3 hours: 100% mark deduction
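
The penalty bands in item 9 can be sketched as a simple lookup. This is a minimal illustration, not an official calculator: the function names are hypothetical, and it assumes that the deduction is taken as a percentage of the marks awarded, which the paper does not state explicitly.

```python
def late_penalty_percent(lateness_hours: float) -> int:
    """Return the mark deduction (in percent) for a lateness given in hours."""
    if lateness_hours <= 0:
        return 0  # submitted on or before the due time
    # Each band reads "lower < lateness <= upper", as in the list above.
    bands = [(0.5, 10), (1.0, 20), (1.5, 30), (2.0, 40), (3.0, 50)]
    for upper_bound, deduction in bands:
        if lateness_hours <= upper_bound:
            return deduction
    return 100  # lateness of more than 3 hours


def marks_after_deduction(raw_marks: float, lateness_hours: float) -> float:
    """Apply the deduction, ASSUMING it is a percentage of the raw marks."""
    return raw_marks * (100 - late_penalty_percent(lateness_hours)) / 100


# A script submitted 45 minutes late falls in band ii.
print(late_penalty_percent(0.75))
```

Under the stated assumption, a script awarded 80 marks and submitted 15 minutes late would keep 90% of those marks.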

Contents of Answer Script

10. The first page of your answer script is the cover page. You MUST use the template
given in Appendix 1 and fill in the following information:
• Your Degree Programme (Abbreviation)
• Your Index Number
• Your Name
• Your Student ID

11. The second page of your answer script is the Declaration Form. You MUST use the
template given in Appendix 2, and sign this form to declare that your submitted work
is authentic and free of plagiarism.

12. Each question should be answered starting on a new page. It is recommended that the
answer to each question is limited to [5] pages or [500] words, whichever is lower.

13. In your answer script, all texts MUST be typed using Times New Roman characters
with font size no less than 12, except for the drawings and equations/calculations.

14. For the drawings and equations/calculations, you MUST draw/write them on blank
paper, and then take pictures and include the pictures in the Word document as part of
your answers. It is recommended that the size of each picture file be kept in the
range of 10 kB to 500 kB.

15. Please include a page number on each and every page of your answer script.

WARNING OF PLAGIARISM

16. All answer scripts will be uploaded by the examiners to Turnitin for similarity
check. In the case of plagiarism being suspected, the evidence will be submitted
to the University Examination Disciplinary Committee for further investigation
and trial. If found guilty, serious disciplinary action will be taken against the
students.


Questions: [Total: 100 Marks]

Q1. The Hub-and-Spoke architecture has become the more common and successful
architecture design for enterprise data warehouses. The Independent Data Marts and
Federated Data Mart architectures are reported to be the poorest architectures for
data warehouse development.

(a) In contrast with the above statement, if you were given a chance to prove the
abilities of the Independent Data Marts and Federated Data Mart
architectures, state your choice and discuss the concept of your chosen
architecture. (8 marks)

(b) Even though the Hub-and-Spoke architecture, known as Inmon’s approach, is the
optimal solution for a data warehouse implementation, some independent
researchers claim that the architecture is costly and requires more time to
implement. Justify this statement with suitable factors. (7 marks)

(c) After the architecture has been selected, the next stage is designing the
dimensional data model. As an expert data warehouse designer, you should
consider the design decisions at this stage. What would be your guidelines for
making design decisions when designing the dimensional data model? (10 marks)
[Total : 25 marks]

Q2. (a) The core objective of denormalization in the data warehouse is faster data
retrieval by reducing complexity and table joins. Figure 1 shows an entity-
relationship (ER) model for the sales order industry. Using the denormalization
concept, draw a Star Schema to represent the ER model.

*Note: Measurements are defined based on the Time dimension (attributes: Date,
Week, Month, Year). Use a minimum of four dimensions, including the Time
dimension, and the measures Order Quantity and Total Sales.


Figure 1: The entity-relationship (ER) model

Draw a star schema diagram based on Figure 1 with the requirements stated
below:

(i) Denormalize the tables and form dimension tables with attributes.
(12 marks)

(ii) A fact table with attributes. (4 marks)

(iii) Relationships between the fact table and the dimension tables.
(4 marks)

(b) Sales data are generally stored in the fact table by order and by date. Recently,
the client requested that a new report be presented at the brand level.
Based on your answer in Q2 (a), re-draw the star schema with an aggregated fact
table and dimensions if this occurs in one-way aggregates. (5 marks)
[Total : 25 marks]


Q3. Slowly Changing Dimension (SCD) is used to manage data properly and to support
analysis. In general, Kimball proposed three different types of SCD for applying
changes in the data warehouse.

Figure 2: An example of an information package for the export and import industry

(a) Based on your understanding of Figure 2, when the ‘export type’ of the export
with exp_ID 1 changes (create the new records based on your Student ID, for
example: exp_ID A01234CBIBF):

(i) Scenario 1: Illustrate the changes if the industry decides to apply Type
1 techniques to change the dimension “Export”. (2 marks)

(ii) Scenario 2: Illustrate the changes if the industry decides to apply Type
2 techniques to change the dimension “Export”. (2 marks)

(iii) Scenario 3: Illustrate the changes if the industry decides to apply Type
3 techniques to change the dimension “Export”. (2 marks)

(b) If you have decided to apply SCD Type 3 techniques based on your example
illustrated in Q3 (a), discuss how you would implement this technique.
(9 marks)

(c) Even though Kimball proposed three different types of SCD, explain why
most of the industry practitioners do not recommend Type 3 as one of the SCD
techniques in the data warehouse. (5 marks)


(d) As someone involved in data warehouse maintenance, which SCD type would you
recommend to your fellow industry practitioners if you were given the chance
to choose between SCD Type 1 and SCD Type 2, and why?
(5 marks)
[Total : 25 marks]

Q4. Assume a medical laboratory star schema with four dimensions. There are 8 branches,
two specimen rejections in a branch on a given day from the 30,000 specimens, and
400 specimens per specimen type, based on your assumptions. The fact table
should load data based on the Gregorian calendar for 3 years.

*Note: assume that at least one clinical pathology test is carried out per
specimen per branch per week.

(a) Estimate the number of fact table rows to be retrieved and summarized for the
following queries, showing the steps:

(i) Query 1: if it involves per specimen, per branch, per week. (2 marks)

(ii) Query 2: if it involves all specimens, per branch, per week. (2 marks)

(iii) Query 3: if it involves per specimen type, all branches, per week.
(2 marks)

(iv) Query 4: if it involves per specimen type, all branches, per year.
(2 marks)

(b) Show the steps if you had precalculated and created an aggregated fact table
based on the following:

(i) Aggregated query 1: if it involves per specimen type, all branches, per
week. (2 marks)

(ii) Aggregated query 2: if it involves per specimen type, all branches, per
year. (2 marks)

(c) Show the steps if you have precalculated and created another aggregate fact
table, based on your answer in Q4(a)(iv), per specimen type, per branch, per
year. Justify why this method is effective for a data warehouse.
(5 marks)


(d) During the requirement definition process of a data warehouse, you are required
to roughly calculate the storage sizes for the data warehouse development.
Discuss how you would apply the calculations if you decide to
implement the top-down approach. (8 marks)
[Total : 25 marks]


Appendix 1: Final Assessment Cover Page

(Remark: This must be placed as the FIRST PAGE of your Answer Script)

Answer Script

Main Final Assessment - Jan 2020 Trimester

UCCD3233 Data Warehouse Modelling and Implementation

Degree Programme IA / IB

Exam Index Number:

Student Name:

Student ID:

Marks Awarded

Q1.

Q2.

Q3.

Q4.

Total:

Remark: Late Submission? _____

If Yes, Lateness: ______________


Marks after Deduction: ________


Appendix 2: Final Assessment Declaration Statement

(Remark: This must be placed as the SECOND PAGE of your Answer Script)

Final Assessment Declaration Statement

DECLARATION

I, ___________________________________(Name), Student ID. ____________________
hereby solemnly and fully declare and confirm that during my programme of study at
Universiti Tunku Abdul Rahman, I shall abide and comply with all the rules, regulations and
lawful instructions of Universiti Tunku Abdul Rahman and endeavour at all times to uphold
the good name of the University.
I hereby declare that my submission for this Final Assessment is based on my original work,
not plagiarised from any source(s) except for citations and quotations which have been duly
acknowledged. I am fully aware that students who are suspected of violating this pledge are
liable to be referred to the Student Disciplinary Committee of the University.

Programme: ________________________________________

(Digital) Signature: ___________________________________

Student’s I.C. / Passport No: ___________________________

Exam Index No: _____________________________________

Date of Submission: __________________________________
