
Exercise 1: Information package

Consider a retail company with 4 stores in Vietnam. These stores sell 2500 different
products in 50 categories from 50 different providers. Note that each product belongs to a
single category and is provided by a single provider.

This company has an operational database, which records, in real time, the sales of each
product and the stock of each product in each store.

Sales of products are grouped into invoices, which include the date of the sale, the shop where
the invoice is issued, the list of products sold and, for each product, the quantity sold. The
invoice also records the customer's information: name, phone number, district, and city.

Information about the invoices is kept in the relational database for 1 year.

The operational database also contains some information about the products: name,
description, selling price, supplier (given name, last name, address, phone number), product
category (shampoo, rice, clothes…) and the shops (address, district, city, manager’s name,
address, telephone). We consider that every shop has only one manager.

The stock information includes the remaining quantity of each product in each store.

Changes may occur in the database, for example a change in the supplier or the selling price
of a product. If this happens, the new value directly replaces the old value. If a new product
is added to the reference catalogue, it is simply added to the database. We consider that there
are roughly 100 more references each year. If a product is removed from the reference
catalogue, it is simply removed from the database.

On average, 10000 products are sold in every store each week, across 2000 different invoices.
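
To get a feel for the data volumes these figures imply, the arithmetic can be sketched as follows. This is only a rough estimate under stated assumptions: the weekly figures are read as per-store averages, a year is taken as 52 weeks, the horizon is the three years of history requested in the business requirements, and every product sold is counted as its own invoice line (so the line count is an upper bound). The function name `estimate_volumes` is hypothetical.

```python
# Figures taken from the exercise statement.
STORES = 4
PRODUCTS_SOLD_PER_STORE_PER_WEEK = 10_000
INVOICES_PER_STORE_PER_WEEK = 2_000

# Assumptions (not stated in the exercise): 52 weeks per year.
WEEKS_PER_YEAR = 52


def estimate_volumes(years_of_history: int = 3) -> dict:
    """Rough volume estimate for the requested analysis horizon."""
    weeks = WEEKS_PER_YEAR * years_of_history
    return {
        # Upper bound: assumes each product sold is a separate invoice line.
        "invoice_lines_upper_bound": STORES * PRODUCTS_SOLD_PER_STORE_PER_WEEK * weeks,
        "invoices": STORES * INVOICES_PER_STORE_PER_WEEK * weeks,
        "avg_products_per_invoice": PRODUCTS_SOLD_PER_STORE_PER_WEEK
        / INVOICES_PER_STORE_PER_WEEK,
    }


if __name__ == "__main__":
    for name, value in estimate_volumes().items():
        print(f"{name}: {value}")
```

Under these assumptions, three years of history amount to at most about 6.24 million invoice lines spread over about 1.25 million invoices, with 5 products per invoice on average.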

Business requirements:

The decision maker now wishes to be able to analyze sales for the last three years, with full
autonomy (without any intervention of programmers pre-computing queries). The important
criteria for the analysis are the volume of the sales (sales turnover, quantity), and the
dates and stores in which products are sold.

Classification by product category is, of course, also of interest.

It might also be interesting to discover which products are most frequently purchased
together on the same invoice, in order to reorganize the store shelves (with bread closer to
jam, for example) and trigger more purchases (a customer buying jam might otherwise forget
the bread).
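
The "purchased within the same invoice" analysis boils down to counting how often each pair of products appears together. A minimal sketch, assuming invoices are available as a mapping from an invoice number to the list of product names it contains; the sample data and the function name `count_product_pairs` are hypothetical and only serve the illustration:

```python
from collections import Counter
from itertools import combinations


def count_product_pairs(invoices: dict) -> Counter:
    """Count how many invoices contain each unordered pair of products."""
    pair_counts = Counter()
    for products in invoices.values():
        # set() so that a product listed twice on an invoice counts once;
        # sorted() so that each pair always appears in the same order.
        for pair in combinations(sorted(set(products)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample data: invoice number -> products on that invoice.
sample_invoices = {
    1: ["bread", "jam", "rice"],
    2: ["bread", "jam"],
    3: ["rice", "shampoo"],
    4: ["jam", "bread", "shampoo"],
}

pair_counts = count_product_pairs(sample_invoices)

if __name__ == "__main__":
    for pair, count in pair_counts.most_common(3):
        print(pair, count)
```

This kind of query needs the link between an invoice and its individual products to be preserved, which is worth keeping in mind when choosing the level of detail of the stored data.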

Regarding the dates, the decision maker wishes to analyze the sales according to the day of
the week (Monday, Tuesday...), the day of the month (to quantify the 'end of month' effect
observed with clients and possibly propose promotions accordingly), whether the day is a
public holiday or not, etc.

You must now design the relational data model suitable for this data warehouse, in the form
of a new class diagram. This model must be easy for the decision maker to use, even if it
means violating the fundamental principle of no redundancy that applies in the operational
database.
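
The date attributes requested above (day of the week, day of the month, public holiday flag) can all be derived once per calendar day and stored redundantly, which is the kind of redundancy the model is allowed to accept. A minimal sketch, assuming a small hand-maintained set of holidays; the set `PUBLIC_HOLIDAYS`, the function `build_date_row` and the attribute names are hypothetical, not part of the exercise:

```python
from datetime import date

# Explicit names avoid any dependence on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

# Illustrative holiday list only; a real one would be maintained separately.
PUBLIC_HOLIDAYS = {date(2024, 1, 1), date(2024, 9, 2)}


def build_date_row(d: date, holidays=PUBLIC_HOLIDAYS) -> dict:
    """Derive the descriptive attributes of one calendar day."""
    return {
        "date": d.isoformat(),
        "day_of_week": DAY_NAMES[d.weekday()],  # weekday(): Monday is 0
        "day_of_month": d.day,
        "month": d.month,
        "year": d.year,
        "is_public_holiday": d in holidays,
    }


if __name__ == "__main__":
    print(build_date_row(date(2024, 9, 2)))
    print(build_date_row(date(2024, 9, 3)))
```

With such precomputed attributes, the decision maker can group sales by weekday or by holiday flag without writing any date arithmetic.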

Questions:
1. Create the information package corresponding to your data warehouse.
2. Distinguish the hierarchies and categories of each dimension.
3. What level of detail must your data warehouse hold? Justify your answer.
