
Exercise-I

Suppose that a data warehouse contains three dimensions: date, doctor and patient. There
is only one measure, charge, where charge is the fee that a doctor charges a patient for a
visit. Design a star schema for the data warehouse, assuming some concept hierarchy for
each dimension. Starting with the base cuboid [date, doctor, patient], which sequence of
OLAP operations do you need to list the total fee collected by each doctor in the year
2002?
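Hint (an illustrative sketch, not a model answer): one possible sequence is to roll up date from day to year, roll up patient to "all", and then slice on year = 2002. In SQL terms this amounts to a GROUP BY on doctor with a filter on the year. The sketch below uses Python's built-in sqlite3 module; the table names (fact_visit, dim_date, dim_doctor, dim_patient), the hierarchy attributes and the sample rows are all assumptions made for illustration only.

```python
import sqlite3

# Hypothetical star schema: one fact table and three dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE dim_doctor  (doctor_key INTEGER PRIMARY KEY, name TEXT, specialty TEXT);
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE fact_visit (
    date_key    INTEGER REFERENCES dim_date(date_key),
    doctor_key  INTEGER REFERENCES dim_doctor(doctor_key),
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    charge      REAL                      -- the only measure
);
-- Made-up sample rows.
INSERT INTO dim_date    VALUES (1, 5, 1, 2002), (2, 17, 6, 2002), (3, 9, 3, 2003);
INSERT INTO dim_doctor  VALUES (1, 'Dr. A', 'Cardiology'), (2, 'Dr. B', 'Dermatology');
INSERT INTO dim_patient VALUES (1, 'P1', 'Pune'), (2, 'P2', 'Delhi');
INSERT INTO fact_visit  VALUES (1, 1, 1, 100.0), (2, 1, 2, 150.0),
                               (2, 2, 1, 80.0),  (3, 1, 1, 500.0);
""")

# Roll-up on date (day -> year), roll-up on patient (-> all), slice year = 2002.
rows = conn.execute("""
    SELECT d.name, SUM(f.charge) AS total_fee
    FROM fact_visit AS f
    JOIN dim_date   AS t ON f.date_key = t.date_key
    JOIN dim_doctor AS d ON f.doctor_key = d.doctor_key
    WHERE t.year = 2002
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

print(rows)  # [('Dr. A', 250.0), ('Dr. B', 80.0)]
```

The patient dimension does not appear in the query at all, which is what rolling it up to "all" means in practice.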

Exercise-II

A consortium of banks wants to develop a data warehouse for effective decision-making
about their loan schemes. The banks provide loans to customers for various purposes,
such as House Building Loan, Car Loan, Educational Loan, Personal Loan, etc. The whole
country is categorized into a number of regions, namely, North, South, East and West.
Each region consists of a set of states. Loans are disbursed to customers at interest rates that
change from time to time. Also, at any given point of time, the different types of loans
have different rates. The data warehouse should record an entry for each disbursement of
a loan to a customer.

With respect to the above business scenario, answer the following questions. Clearly state
any reasonable assumptions you make.
a. Design a star schema for the data warehouse clearly identifying the fact table(s),
dimensional table(s), their attributes and measures along with the primary key and
foreign key relationships.
b. Write an SQL query by which you can display region-wise, bank-wise, year-wise total
amount of loans disbursed from your schema.
c. Starting with the base cuboid, if we want to see the amount of loan disbursed during
the year 2000 for the state of Maharashtra, which sequence of OLAP operations would
you need to perform?
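Hint (an illustrative sketch under stated assumptions, not the expected answer): the code below shows one possible star schema for this scenario and the shape of the queries for parts (b) and (c), using Python's built-in sqlite3 module. All table and column names, the choice to keep interest_rate in the fact table, the choice to attach the state to each disbursement, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension tables.
CREATE TABLE dim_date      (date_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE dim_bank      (bank_key INTEGER PRIMARY KEY, bank_name TEXT);
CREATE TABLE dim_location  (location_key INTEGER PRIMARY KEY, state TEXT, region TEXT);
CREATE TABLE dim_loan_type (loan_type_key INTEGER PRIMARY KEY, loan_type TEXT);
-- Hypothetical fact table: one row per loan disbursement.
CREATE TABLE fact_disbursement (
    date_key      INTEGER REFERENCES dim_date(date_key),
    bank_key      INTEGER REFERENCES dim_bank(bank_key),
    location_key  INTEGER REFERENCES dim_location(location_key),
    loan_type_key INTEGER REFERENCES dim_loan_type(loan_type_key),
    amount        REAL,   -- additive measure
    interest_rate REAL    -- rate in force at disbursement time (non-additive)
);
-- Made-up sample rows.
INSERT INTO dim_date      VALUES (1, 10, 4, 2000), (2, 3, 9, 2000), (3, 21, 1, 2001);
INSERT INTO dim_bank      VALUES (1, 'Bank X'), (2, 'Bank Y');
INSERT INTO dim_location  VALUES (1, 'Maharashtra', 'West'), (2, 'Karnataka', 'South');
INSERT INTO dim_loan_type VALUES (1, 'Car Loan'), (2, 'Educational Loan');
INSERT INTO fact_disbursement VALUES
    (1, 1, 1, 1, 500000.0, 9.5),
    (2, 1, 1, 2, 200000.0, 8.0),
    (2, 2, 2, 1, 300000.0, 9.0),
    (3, 1, 1, 1, 400000.0, 10.0);
""")

# Part (b): region-wise, bank-wise, year-wise total amount disbursed.
totals = conn.execute("""
    SELECT l.region, b.bank_name, d.year, SUM(f.amount) AS total_amount
    FROM fact_disbursement AS f
    JOIN dim_location AS l ON f.location_key = l.location_key
    JOIN dim_bank     AS b ON f.bank_key = b.bank_key
    JOIN dim_date     AS d ON f.date_key = d.date_key
    GROUP BY l.region, b.bank_name, d.year
    ORDER BY l.region, b.bank_name, d.year
""").fetchall()

# Part (c): roll up date to year, roll up bank and loan type to "all",
# then dice on year = 2000 and state = 'Maharashtra'.
maharashtra_2000 = conn.execute("""
    SELECT SUM(f.amount)
    FROM fact_disbursement AS f
    JOIN dim_location AS l ON f.location_key = l.location_key
    JOIN dim_date     AS d ON f.date_key = d.date_key
    WHERE d.year = 2000 AND l.state = 'Maharashtra'
""").fetchone()[0]

print(totals)
print(maharashtra_2000)  # 700000.0
```

A customer dimension is omitted here only to keep the sketch short; a full answer would normally include one.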

Exercise-III

Consider the following business scenario. A telecom company plans to maintain a CRM
data warehouse. There are 10 million customers of the company. Besides the usual
attributes, the company wants to maintain additional demographic information like
literacy percentage, male/female ratio, average life expectancy and average income of the
people belonging to the state to which each customer belongs. The company also wants
to maintain information about the age group, income level and marital status of its
customers. They also need to run queries like the number of married and unmarried
customers they have at any point in time.
a. Design an efficient data warehouse schema that satisfies the above business scenario.
Clearly identify the fact table(s), dimension table(s), primary key(s) and foreign key(s).
b. Write an SQL statement that generates the number of married and unmarried
customers that the company has today.

Exercise-IV

An insurance company, with branches all over the country, wants to develop a data
warehouse for effective decision-making about their insurance policies. There are a
number of different types of insurance like Auto insurance, Home insurance, Industrial
insurance, etc. The entire country is categorized into four regions, namely, North, South,
East and West. Each region consists of a set of states. There may be different types of
customers like individuals, institutions, industries, etc. The data warehouse should record an
entry for each policy issued to each customer along with the premium paid.
With respect to the above business scenario, answer the following questions. Clearly state
any reasonable assumptions you make.

Exercise-V

A chain of departmental stores called “India-Mart”, having operations only in India, plans
to develop a data warehouse for effective decision-making about their sales and different
promotion schemes. India-Mart puts some of their products on promotional sales from
time to time. There may be a large number of different types of promotions like coupon
sales, end-of-the-aisle display, buy-two-get-one-free, etc.
India-Mart would like to analyze how item sales are affected by the promotions at
each store, in each state and across the entire country.
With respect to the above business scenario, answer the following questions.
[15+5+5+5+5+5]

a. Design a star schema for the data warehouse clearly identifying the fact table(s),
dimension table(s), their attributes and measures along with the primary key and foreign
key relationships.
b. Write an SQL query by which you can display year-wise, promotion-wise, product-
wise total sales in the entire country from your schema.
c. Draw a cuboid that would display the result of the query specified in Q. b above.
d. From the cuboid of Q. c above, if we want to find the total amount of promotional
sales made during the years 2002 and 2003 for the states of Karnataka and Maharashtra,
which sequence of OLAP operations would you need to perform?
e. Draw possible schema hierarchies for each dimension that you have designed.
f. Based on the schema hierarchies drawn in Q. e above, determine the total number of
cuboids, considering all the aggregation levels.
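Hint for part (f) (an illustrative sketch): when every dimension has a single linear hierarchy, the total number of cuboids is the product of (L_i + 1) over all dimensions, where L_i is the number of levels of dimension i and the extra 1 accounts for the virtual top level "all". The hierarchies used below are assumed purely for illustration; your own design may have different dimensions and levels.

```python
from math import prod


def total_cuboids(levels_per_dimension):
    """Return the product of (L_i + 1) over all dimensions.

    L_i is the number of hierarchy levels of dimension i, not counting
    the virtual top level "all".
    """
    return prod(levels + 1 for levels in levels_per_dimension.values())


# Hypothetical schema hierarchies, lowest level first.
hierarchies = {
    "time": ["day", "month", "quarter", "year"],
    "store": ["store", "city", "state"],
    "product": ["item", "brand", "category"],
    "promotion": ["promotion", "promotion_type"],
}

levels = {dimension: len(path) for dimension, path in hierarchies.items()}
count = total_cuboids(levels)
print(count)  # (4+1) * (3+1) * (3+1) * (2+1) = 240
```

The formula does not apply directly to dimensions with several alternative hierarchies (for example, day to week and day to month); those need to be counted separately.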

Exercise-VI

A university plans to build a data warehouse that would help them in analyzing the
performance of the students in various courses in different academic sessions. They want
to analyze if there is any relation between the average grade of a course and the number
of students attending it. They would also like to know if there were some courses that were
offered but did not have any students registered for them. The relative performance of boys
and girls, and the average grades of students from various states and cities of the country,
must be analyzed for each course and also in terms of overall CGPA.

(a) Design a star schema for such a data warehouse clearly identifying the fact table(s)
and dimension table(s), their primary key(s) and foreign key(s). Your schema should at
least be able to satisfy the above mentioned analysis requirements. You may consider
other suitable attributes for the dimension table(s).
(b) Write an SQL query that runs on your schema and returns the average SGPA of boys
from the state of Karnataka for each spring semester during the years 2002-2005.
[15+5=20]

Exercise-VII

A hospital cum medical research institute is carrying out a study on the nature of different
types of fevers. In order to track every patient as he/she keeps coming back to the hospital, a
unique id is maintained. For each patient, they keep track of the body temperature at every
hour of the day as long as the patient is admitted to the hospital. They also maintain data
about the different types of medicine being given to the patient. Patients may be given more
than one medicine in a day. Every medicine is administered as many times in a day as the
doctor has prescribed. Since there is a history of different types of fevers occurring in various
districts, states and regions in the country, the hospital research team wants to maintain such
residence details of each patient. One of the goals of the research is to determine if there is
any relation between the age and gender of the patients and their body temperature when
various medicines are administered. Another goal is to determine if there is a relation
between the % of population who are farmers, office goers or teachers in the patient’s state
and the body temperature of the patients when various medicines are administered.
a. Design a suitable schema for the hospital cum medical research institute, clearly
identifying the Fact table(s), Dimension Table(s), the Facts, the Dimensions, Primary
Keys and Foreign Keys of all the tables. Your schema should at least be able to
satisfy the above mentioned research requirements. You may consider other suitable
attributes for the dimension table(s).
b. Classify the fact(s) in your fact table(s) as additive, non-additive and semi-additive.
c. Write an SQL query that runs on your schema and returns today’s average, maximum
and minimum body temperature for each married male patient.
d. Draw a cuboid to represent the result of your query.

Exercise-VIII

A very large telecommunications company called “Cell9”, providing cellular phone
services to a number of states in various regions of the country, plans to build a data
warehouse for decision support. They have millions of subscribers in the country. They
want to track the duration (in minutes) as well as the prevailing rate (per minute) of each
phone call made by its subscribers. They also want to analyze if there is any link between
the total amount of time spent in talking on cellphones by a subscriber and the number of
graduates in the state or the number of married persons in the state or the male-female
ratio of the state to which the subscriber belongs. Further, they want to analyze the
relation between the age, salary and marital status of the customers and their total bill
amount per day/month/year. One other important requirement is to make queries like
determining the current total number of customers in the various age groups for each state
having certain ranges of male-female ratio.
(a) Design a suitable relational database schema for such a data warehouse, clearly
identifying the fact table(s), the facts in the fact table(s), the dimension table(s),
their primary key(s) and foreign key(s). Your schema should at least be able to
satisfy the above mentioned analysis requirements. You may consider other
suitable attributes for the dimension table(s).
(b) Classify the facts in your fact table(s) as additive, non-additive and semi-additive.
(c) Draw possible concept hierarchies for each dimension that you have designed,
identifying whether these are schema hierarchies or set grouping hierarchies.
(d) Write an SQL query that runs on your schema and returns the region-wise yearly
average bill amounts of married and unmarried customers.
(e) Draw a cuboid to represent the result of your query.
(f) From this cuboid, which sequence of OLAP operations would you perform to get
the average monthly bill amounts of all the customers for the states of Bihar and
West Bengal?
(g) Write an SQL query to return the current total number of customers in the various
age groups for each state with male-female ratio between 0.9 and 1.1.
(h) For any one fact table (you may have only one, depending on your design), and
any one attribute of any one dimension table, draw the bitmap index table(s) and
join index table(s). Before drawing the index tables, first mention the
representative rows in the tables.
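
Hint for part (h) (an illustrative sketch, not a model answer): a bitmap index keeps one bit vector per distinct value of the indexed attribute, with one bit per row of the indexed table, while a join index relates a dimension attribute value to the identifiers of the matching fact rows. The representative rows, the row identifiers R0 to R4 and the choice of marital_status as the indexed attribute are assumptions made for illustration.

```python
def bitmap_index(column):
    """Map each distinct value to a bit vector with one bit per row."""
    index = {}
    for position, value in enumerate(column):
        index.setdefault(value, [0] * len(column))[position] = 1
    return index


# Hypothetical representative rows.
# Dimension table: (subscriber_key, marital_status)
dim_subscriber = [(1, "married"), (2, "single"), (3, "married"), (4, "single")]
# Fact table: (row_id, subscriber_key, duration_in_minutes)
fact_call = [("R0", 1, 12), ("R1", 2, 5), ("R2", 1, 3), ("R3", 3, 20), ("R4", 4, 7)]

# Bitmap index on dim_subscriber.marital_status.
bitmaps = bitmap_index([status for _, status in dim_subscriber])

# Join index: marital_status value -> ids of the fact rows that join to it.
status_of = dict(dim_subscriber)
join_index = {}
for row_id, subscriber_key, _duration in fact_call:
    join_index.setdefault(status_of[subscriber_key], []).append(row_id)

print(bitmaps)     # {'married': [1, 0, 1, 0], 'single': [0, 1, 0, 1]}
print(join_index)  # {'married': ['R0', 'R2', 'R3'], 'single': ['R1', 'R4']}
```

Each printed dictionary corresponds to one index table: the keys are the attribute values and the entries are the bit vector or the list of fact row ids.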