
A Conceptual Model for Multidimensional Data

Anand S. Kamble
Department of Information Technology
Government of India
New Delhi, India
Email: ask@mit.gov.in
Abstract
This paper introduces a conceptual data model for data warehouses that includes multidimensional aggregation. It is based on the Entity-Relationship (ER) data model, which it extends with multidimensional aggregated entities. The model has a clear mathematical, model-theoretic semantics grounded in standard ER semantics and in the GMD logic-based multidimensional data model. The aim of this work is not to propose yet another conceptual data model, but to find the most general and precise formalism covering the proposals for a conceptual data model in the data warehouse field, thereby making possible a formal comparison of the models in the literature and a study of the formal properties or extensions of such data models.
1 Introduction
The goal of this work is to extend the standard Entity-Relationship (ER) data model, as defined in the database textbooks, with constructs which allow the modeling of multidimensional aggregated entities together with their interrelationships with the other parts of the conceptual schema. An important aspect is that a formal model-theoretic semantics is given to the conceptual data model by combining the well-known first-order semantics of standard ER, as described for example in (Borgida et al. 2003, Calvanese et al. 1998), with the model-theoretic semantics of the GMD logical multidimensional data model (Franconi & Kamble 2003, 2004b). This work also builds on preliminary work on the use of Description Logics as a means to give precise semantics to a data warehouse conceptual data model and to study its computational properties (Franconi & Sattler 1999). This paper presents the formal aspects, with well-defined syntax and model-theoretic semantics, of the conceptual data model introduced in (Franconi & Kamble 2004a).
The proposed framework is a novel data warehouse conceptual data model, CGMD, generalising conceptual multidimensional data models in the data warehouse field. The aim of this work is not to propose yet another data model, but to find the most general, elegant and precise formalism encompassing all the proposals for a conceptual data model in the data warehouse field (for example, those listed in (Phipps & Davis 2002)), thereby making possible a formal comparison of the different expressivities of the models in the literature.

Copyright © 2008, Australian Computer Society, Inc. This paper appeared at the Fifth Asia-Pacific Conference on Conceptual Modelling (APCCM 2008), Wollongong, NSW, Australia, January 2008. Conferences in Research and Practice in Information Technology, Vol. 79. Annika Hinze and Markus Kirchberg, Eds. Reproduction for academic, not-for profit purposes permitted provided this text is included.
The paper is organised as follows. Section 2 describes the CGMD model along with the required extensions to the standard Entity-Relationship data model. Sections 3 and 4 present, respectively, the syntax and the semantics of the CGMD model, both of which are defined purely mathematically. Section 5 reviews the related literature on multidimensional conceptual models. Section 6 evaluates the CGMD model against the criteria for a good conceptual multidimensional model. Section 7 compares the CGMD model with other multidimensional models. Finally, Section 8 briefly concludes the paper and outlines future work.
2 The CGMD Data Model
The CGMD model extends the ideas of the data warehouse conceptual data model first proposed in (Franconi & Sattler 1999), in which aggregations and dimensions are first-class citizens. It abstracts the principles of a data warehouse and describes the multidimensional structure of the data of a business domain of an enterprise.
A CGMD model is based on an ER model. It captures database schemata expressed as an entity-relationship diagram and describes their multidimensional structure, including dimensions with their hierarchically organised levels and the structure of aggregations. It extends the standard ER schema with constructs for aggregated entities together with their interrelationships with the other parts of the schema. As stated in (Agrawal et al. 1997), a “good” data warehouse system should support user-definable multiple hierarchies along arbitrary dimensions. A CGMD model supports user-definable multiple hierarchies and can express aggregations along arbitrary dimensions and levels.
2.1 CGMD: an Extended Entity-Relationship Model
This section describes the CGMD data model in terms of an ER model and presents the ER extensions. It also presents a methodology for designing a data warehouse from a standard (operational) ER schema, together with the structure of aggregations.
We describe the model with the example of telephone calls presented in Figure 1 (taken from (Franconi & Sattler 1999)). The entities Calls, Day and Point, and the relationships Date, Dest and Source, represent the base data. The cardinality constraints (1,1) on the Date relationship between the Calls and Day entities, and on the Source and Dest (destination) relationships between the Calls and Point entities, express that each call is issued on exactly one date, from exactly one source point, and is received at exactly one destination point.

[Figure 1: The Conceptual Schema for the base data of telephone calls. The diagram shows the entity Calls (attributes duration, no of calls) connected by the relationships Date, Source and Dest, each with cardinality (1,1), to the entities Day (attributes day/month/year, holiday?, weekday) and Point (attributes address, code, type); Point is specialised through disjoint ("d") constraints into Consumer and Business, and into Cell, LandLine, DirectLine and PABX.]

[Figure 2: The Conceptual Schema for the Basic Multidimensional information for the base data considered in Figure 1. The diagram shows the entity Calls (attributes duration, no of calls) connected to the entities Day and Point by the relationships Date, Source and Dest, each with cardinality (1,1).]

The conceptual multidimensional data model obtained for this base data is exemplified in Figure 2. The basic multidimensional entity Calls in Figure 2 uses a standard star schema, i.e., it is represented by means of a weak entity with respect to its dimensions. In this example, the basic multidimensional entity may be useful for analysing the nature of telephone calls by considering, among others, the dimensions related to the origin and the destination of the calls with respect to the type of phone point (associated with consumer or business customers). So, the entity Calls represents a basic cube whose dimensions are Date, Dest and Source (identifying relationships), which are restricted to the basic levels Day, Point, and again Point (associated entities) respectively. This part of the diagram still uses only standard constructs.
Level building: For building the aggregation level
hierarchies for each dimension, we consider the fol-
lowing:
• discriminator of an entity (Elmasri & Navathe
2000)
• generalisation/specialisation hierarchy: creating
a single entity of all subclasses (possibly disjoint)
of a superclass.
• one-to-many relationship
• partial relationship
• many-to-many relationship: converting into one-
to-many relationships which are then converted
into levels as suggested in (Moody & Kortink
2000).
Taking these constraints into account, Figure 3 presents the multidimensional conceptual schema including the level hierarchies for each dimension. Outer boxes indicate levels, and inner boxes are their elements. The bold arrows (from lower level to higher level) denote the hierarchy. The levels “Pointtype” and “Customertype” are created from the partitions of the “Point” entity. The entity Point is partitioned (according to the attribute “type”, a discriminator (Elmasri & Navathe 2000)) into four basic points and two higher-level points. Pointtype aggregates the four basic point types (partitions), namely Cell, LandLine, DirectLine and PABX, and Customertype aggregates the two higher-level point types (partitions), namely Consumer and Business. Thus, the first three constraints, namely discriminator, generalisation/specialisation and one-to-many relationship, hold. Similarly, multiple hierarchies are created for the Date dimension (for instance, the hierarchy including the levels DAY, Month, Qtr and Year). The Holiday-NonHoliday level aggregates all holidays and non-holidays. In this case, however, Holiday-NonHoliday is an optional level, since the relationship between Day and Holiday or NonHoliday is partial; this covers partial relationships. The running example has no many-to-many relationships, but handling them is straightforward, as suggested in (Moody & Kortink 2000).
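The level-building step can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the construction procedure of the model: the point codes, the helper name build_level and the grouping of point types into customer types are invented for the example.

```python
from collections import defaultdict

# Hypothetical Point instances as (code, type) pairs; the data is invented.
points = [("p1", "Cell"), ("p2", "LandLine"), ("p3", "DirectLine"),
          ("p4", "PABX"), ("p5", "Cell")]

# Assumed grouping of point types into customer types (not stated in the text).
CUSTOMER_OF = {"Cell": "Consumer", "LandLine": "Consumer",
               "DirectLine": "Business", "PABX": "Business"}


def build_level(elements, discriminator):
    """Partition `elements` by `discriminator`.

    Returns the roll-up map (element -> higher-level element) and the
    partitions (higher-level element -> set of elements). Elements for which
    the discriminator yields None are skipped, which mimics a partial
    (optional) level.
    """
    rollup = {}
    partitions = defaultdict(set)
    for elem in elements:
        upper = discriminator(elem)
        if upper is None:
            continue
        rollup[elem] = upper
        partitions[upper].add(elem)
    return rollup, dict(partitions)


# Level Pointtype is built from the discriminator attribute "type" of Point.
point_to_pointtype, pointtype_parts = build_level(
    [code for code, _ in points], dict(points).get)

# Level Customertype is built on top of the elements of Pointtype.
pointtype_to_customertype, customertype_parts = build_level(
    sorted(pointtype_parts), CUSTOMER_OF.get)
```

Each call returns both the partition that forms the new level and the roll-up map from the lower level, so hierarchies can be built by chaining calls.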
Aggregation:
We now analyse telephone calls along arbitrary dimensions. For example, the query “Analyse telephone calls by day and point type” is a bi-dimensional cube along the Date and Source dimensions, involving the level Day and the level Pointtype respectively. A conceptual schema for this query includes the definition of the basic cube (Figure 2) and the definition of the aggregation together with the definitions of the associated levels, i.e., a new aggregated entity Calls-by-Day-and-Pointtype denoting aggregations according to the basic level Day and the level Pointtype along the dimensions Date and Source respectively. Figure 4 presents the conceptual schema for this aggregated cube (query) in a variant of the Entity-Relationship model. This particular way of presenting an aggregation (entity) is adapted from UML (Unified Modeling Language) syntax.

[Figure 3: The Multidimensional Conceptual Schema for the data considered in Figure 1. The diagram shows the entity Calls (attributes duration, no of calls) with the relationships Date, Source and Dest, each with cardinality (1,1). For the Day entity (day/month/year, holiday?, weekday) it shows the levels DAY (01/01 ... 31/12), Month (Jan ... Dec), Qtr (Qtr1/Yr1 ... Qtr4/Yrn), Year (Yr1 ... Yrn), Weekday (Mon ... Sun) and Holiday-NonHoliday (Holiday, NonHoliday); for the Point entity (address, code, type) it shows the levels Pointtype (Cell, LandLine, DirectLine, PABX) and Customertype (Consumer, Business).]
Now consider a multidimensional aggregated view, for example “analysis of telephone calls by weekday and customer type”, which aggregates telephone calls along the Date and Source dimensions, involving the levels Weekday and Customertype respectively. The conceptual schema for this aggregated view includes the definition of the basic cube and the definition of the aggregation, i.e., a new aggregated entity, say Calls-by-Weekday-and-Customertype (along with the definitions of the levels Weekday and Customertype), denoting aggregations according to the level Weekday and the level Customertype along the Date and Source dimensions respectively. Figure 5 presents the conceptual schema in a variant of the ER model. This bi-dimensional aggregated view is actually computed from the aggregated cube of Figure 4, which shows that aggregations can be computed from pre-computed aggregations.
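The following Python sketch illustrates computing one aggregation from another. It rests on several assumptions made only for the example: the cell values are invented, the Dest dimension is left out for brevity, weekdays are derived from ISO dates, and each cell carries a total number of calls and a total duration rather than an average, because an average cannot in general be re-aggregated from averages alone.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def aggregate(cells, roll_date, roll_source):
    """Group cells by rolled-up (date, source) coordinates, summing measures.

    `cells` maps (date coordinate, source coordinate) to
    (total number of calls, total duration).
    """
    out = defaultdict(lambda: [0, 0.0])
    for (d, s), (calls, duration) in cells.items():
        key = (roll_date(d), roll_source(s))
        out[key][0] += calls
        out[key][1] += duration
    return {key: tuple(val) for key, val in out.items()}


# Invented base cells at the levels Day and Point.
base = {
    ("2008-01-07", "p1"): (2, 10.0),
    ("2008-01-07", "p2"): (1, 4.0),
    ("2008-01-08", "p1"): (1, 6.0),
    ("2008-01-14", "p3"): (3, 9.0),
}
point_to_type = {"p1": "Cell", "p2": "LandLine", "p3": "DirectLine"}
# Assumed grouping of point types into customer types.
type_to_customer = {"Cell": "Consumer", "LandLine": "Consumer",
                    "DirectLine": "Business", "PABX": "Business"}


def day_to_weekday(day):
    """Roll an ISO date string up to its weekday name."""
    return WEEKDAYS[date.fromisoformat(day).weekday()]


# Calls-by-Day-and-Pointtype, computed from the base cells.
by_day_pointtype = aggregate(base, lambda d: d, point_to_type.get)

# Calls-by-Weekday-and-Customertype, computed from the aggregation above.
by_weekday_customertype = aggregate(
    by_day_pointtype, day_to_weekday, type_to_customer.get)

# The same view computed directly from the base cells, for comparison.
direct = aggregate(
    base, day_to_weekday, lambda p: type_to_customer[point_to_type[p]])
```

In this sketch both routes give the same cells; an average duration per cell can then be derived from the two carried sums.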
2.2 Extensions to the ER Model
As described above, a first extension to the standard ER model is the simple aggregated entity (i.e., a non-dimensional aggregation), such as Weekday and Customertype, which represent dimensional levels built from the basic dimensional entities Day and Point respectively. A simple aggregation aggregates the collection of objects that are in the extension of the aggregated entities. So, in our example, since the entities Mon, ..., Sun form a partition of the entity Day, the Weekday entity denotes exactly seven objects, one for all the Mondays, one for all the Tuesdays, etc. Likewise, the aggregated entity Customertype denotes exactly two objects, one aggregating all consumer phone points and the other aggregating all business phone points. In this way, by interleaving partitioning and simple aggregations, we are able to construct level hierarchies starting from some basic dimensional level. Functional dependencies exist among the levels of a hierarchy, as analysed by (Golfarelli et al. 1998).
A second extension to the standard ER model is the multidimensional aggregated entity, exemplified in Figure 5 by the entity Calls-by-Weekday-and-Customertype and in Figure 4 by the entity Calls-by-Day-and-Pointtype. The entity Calls-by-Weekday-and-Customertype denotes all the cells of a cube whose coordinates are the weekdays of the dates of the calls and the customer types of the originators of the calls. The extension of such an entity satisfies the constraints enforced for a cube by the GMD-based semantics (Franconi & Kamble 2003, 2004b).
A multidimensional aggregated entity is itself an entity in the ER diagram. It can have attributes (for instance, total no of calls and average duration in Figure 5 or Figure 4, computed with the associated aggregation functions sum(no of calls) and average(duration) respectively) and can take part in further relationships or constraints.
3 Syntax of the CGMD Data Model
The basic constructs of an ER schema are entities, relationships and attributes. An entity is drawn as a rectangle around the entity symbol (entity name), whereas a relationship between entities is drawn as a diamond around the relationship symbol (relationship name). An attribute is drawn as a circle or oval outside or around the attribute symbol (attribute name). ER-roles are the edges (links) between entities and relationships and are labeled with number restrictions called cardinality constraints. An is-a link constraint is drawn as an arrow from the more specific entity (subclass) to the more general entity (superclass), respectively from the more specific to the more general relationship. The disjoint-total constraint is drawn with a circle having “d” inside, connected to the subclasses by edges and to the superclass by a double-lined arrow from the circle. Weak entities are represented as double-lined rectangles, whereas identifying relationships are denoted by double-lined diamonds. An aggregated entity (simple aggregation) is drawn as a rectangle with an attached diamond, whereas a multidimensional aggregated entity is drawn as a shadowed rectangle with an attached diamond.

[Figure 4: A cube composing calls by level Day and level Pointtype along Date and Source dimensions respectively. The diagram adds to the schema of Figure 1 the level Pointtype (Cell, LandLine, DirectLine, PABX) and the aggregated entity Calls-by-Day-and-Pointtype with attributes total(no of calls) and average(duration).]

[Figure 5: The Conceptual Data Warehouse Schema for multidimensional aggregated view presenting Calls by Weekday and Customertype, an aggregated cube (view). The diagram shows the levels Customertype (Consumer, Business) and Weekday (Mon, ..., Sun) and the aggregated entity Calls-by-Weekday-and-Customertype with attributes average duration and total no of calls.]
Formally, the syntax of an Extended Entity-Relationship (EER) model is as follows.
Definition 1 (EER schema) An EER schema is constructed over the signature $\mathcal{S} = \langle \mathcal{E}, \mathcal{R}, \mathcal{A}, \mathcal{U}, \mathcal{V}, \preceq, card, \{|\,|\} \rangle$ where
• $\mathcal{E}$ is a finite set of entity names,
• $\mathcal{R}$ is a finite set of relationship names, each associated with an arity $k$,
• $\mathcal{A}$ is a finite set of attribute names,
• $\mathcal{U}$ is a finite set of ER-role names,
• $\mathcal{V}$ is a finite set of domain names,
• $\preceq\ \subseteq\ \mathcal{E} \times \mathcal{E} \cup \mathcal{R} \times \mathcal{R}$ is a binary relation over $\mathcal{E}$ and $\mathcal{R}$,
• $card$ is a function such that $card(E, R, U) = \langle cmin(E, R, U), cmax(E, R, U) \rangle \in \mathbb{N} \times \mathbb{N}$, where $cmin(E, R, U) \leq cmax(E, R, U)$ for each $E \in \mathcal{E}$, $R \in \mathcal{R}$, $U \in \mathcal{U}$,
• $\{|\,|\}$ models aggregation over $\mathcal{E}$.
An EER schema over the signature $\mathcal{S}$ is
• a finite set of entities $E \in \mathcal{E}$,
• a finite set of relationship constraints $R$ of arity $k$ such that $R \doteq [U_1 : E_1, \ldots, U_k : E_k]$, where $R \in \mathcal{R}$, $E_i \in \mathcal{E}$ and $U_i \in \mathcal{U}$ for each $i$, $1 \leq i \leq k$,
• a finite set of attribute constraints such that $E \doteq \{A_1 : V_1, \ldots, A_n : V_n\}$, where $A_i \in \mathcal{A}$, $V_i \in \mathcal{V}$ for each $i$, $1 \leq i \leq n$, and $E \in \mathcal{E}$ is an entity of the schema,
• a finite set of is-a link constraints between two entities $E_1$ and $E_2$ such that $E_1 \preceq E_2$ (respectively between two relationships $R$, $S$ such that $R \preceq S$),
• a finite set of disjoint-total constraints between more specific entities $E_1, \ldots, E_n$ and a more general entity $E$ such that $E_1 \preceq E, \ldots, E_n \preceq E$, where $E_i \neq E_j$ for $i \neq j$, $i \leq n$, $j \leq n$, and $E, E_k \in \mathcal{E}$ for each $k = 1, \ldots, n$,
• a finite set of aggregation links between entities, each aggregation link in the schema being drawn as an edge with an attached diamond at one end,
• a finite set of simple aggregations $G \in \mathcal{E}$, each involving $n$ entities $F_1, \ldots, F_n$ (each being connected with an aggregation link to $G$),
• a finite set of aggregations $G$, each involving (connecting) $n$ relations $D_1, \ldots, D_n$, $n$ entities $L_1, \ldots, L_n$ and a weak entity $F$.
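As an illustration only, the components of Definition 1 can be held in a small Python structure with a well-formedness check. The class name EERSchema, its field layout, the method name problems and the ER-role names used in the example are assumptions of this sketch and are not part of the model.

```python
from dataclasses import dataclass, field


@dataclass
class EERSchema:
    """A minimal, assumed container for the parts of an EER schema."""
    entities: set = field(default_factory=set)
    relationships: dict = field(default_factory=dict)  # R -> ((role, entity), ...)
    card: dict = field(default_factory=dict)           # (E, R, U) -> (cmin, cmax)
    isa: set = field(default_factory=set)              # {(more specific, more general)}
    aggregations: dict = field(default_factory=dict)   # G -> (fact, ((dimension, level), ...))

    def problems(self):
        """Return a list of violated well-formedness conditions."""
        found = []
        for rel, roles in self.relationships.items():
            for role, entity in roles:
                if entity not in self.entities:
                    found.append(f"{rel}.{role}: unknown entity {entity}")
        for (entity, rel, role), (cmin, cmax) in self.card.items():
            if cmin > cmax:
                found.append(f"card({entity}, {rel}, {role}): cmin > cmax")
        for agg, (fact, pairs) in self.aggregations.items():
            if fact not in self.entities:
                found.append(f"{agg}: unknown fact {fact}")
            for dim, level in pairs:
                if dim not in self.relationships:
                    found.append(f"{agg}: unknown dimension {dim}")
                if level not in self.entities:
                    found.append(f"{agg}: unknown level {level}")
        return found


# The telephone-call example; role names such as "call" and "day" are invented.
schema = EERSchema(
    entities={"Calls", "Day", "Point", "Pointtype"},
    relationships={
        "Date": (("call", "Calls"), ("day", "Day")),
        "Source": (("call", "Calls"), ("from", "Point")),
        "Dest": (("call", "Calls"), ("to", "Point")),
    },
    card={("Calls", "Date", "call"): (1, 1),
          ("Calls", "Source", "call"): (1, 1),
          ("Calls", "Dest", "call"): (1, 1)},
    aggregations={"Calls-by-Day-and-Pointtype":
                  ("Calls", (("Date", "Day"), ("Source", "Pointtype")))},
)
```

Calling schema.problems() on the example returns an empty list; setting a cardinality pair to (2, 1) would be reported as a violation of cmin ≤ cmax.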
Before giving the formal semantics of the EER model, we describe the components of an EER schema intuitively. An entity denotes a set of objects, called its instances, that have common properties. Elementary properties are modelled with attributes, whose values belong to one of several predefined domains such as Integer, Real, String, or Boolean. Properties that are due to relations to other entities are modelled through the participation of the entity in relationships. A relationship denotes a set of tuples, called its instances, each of which represents an association among a combination of instances of the entities that participate in the relationship. An entity can participate in a relationship more than once. Each such participation is represented by an ER-role, and each ER-role is assigned a unique name. The number of ER-roles associated with a relationship is called the arity of that relationship. Cardinality constraints (number restrictions) are associated with an ER-role in order to restrict the number of times each instance of an entity may participate, via that ER-role, in instances of the relationship. The minimum cardinality is either 0 (zero) or 1 (one), and the maximum cardinality is either 1 or ∞. An is-a link is modelled by the binary relation $\preceq$ of the signature and denotes the inclusion between two entities (respectively between two relationships); the more specific entity (respectively relationship) therefore inherits the properties of the more general entity (respectively relationship).
A weak entity is a dependent entity which is identified by the primary keys of the other entities participating in the relationships (called weak relationships) to which it is connected via ER-roles, each having minimum and maximum cardinalities equal to 1. Each instance of a weak entity is a composition of instances of the participating entities (one instance per entity).
A fact is represented as a weak entity (an aggregated fact as an aggregated weak entity). Dimensions are represented as relationships (weak relationships) and levels are represented as entities (including aggregated entities). An entity (level) directly connected to a (weak) relationship (dimension) is called a basic level for that dimension. A roll-up link between two levels (entities) is modeled by a roll-up function ρ which maps lower-level elements (instances) to higher-level elements (instances). A simple aggregation is represented by an aggregated entity which is a composition of the entities to which it is connected with lines, i.e., a simple aggregation involves the finite set of entities on which it is based. An n-dimensional aggregation is represented by an aggregated weak entity connected to n relationships (dimensions) and n entities (levels) via n circles (one per relationship and entity), and connected with a line to the weak entity (fact) on which the aggregation is based. That is, an n-dimensional aggregation involves n dimensions (each being a relationship), n levels (one per dimension, each being an entity or aggregated entity) and the fact (weak entity) it is based on. Each instance of an n-dimensional aggregation is called a cell, which is a composition of n elements (instances) of the n levels (one element per level) of the n dimensions involved in the aggregation. In the rest of the paper, we often use just “aggregation” to refer to a multidimensional or n-dimensional aggregation.
4 Semantics of the CGMD Data Model
The semantics of an EER schema is given in terms of legal data warehouse states, i.e., data warehouses which conform to the constraints imposed by the schema. We take as a starting point the ER semantics introduced in (Calvanese et al. 1998), recast to cope with multidimensional information. For this we consider GMD, the logical multidimensional data model introduced in (Franconi & Kamble 2003).
GMD abstracts notions such as levels, multiple level hierarchies, dimensions, facts, cells, aggregation, cube, coordinates and measures. A central element in GMD is the cube. A cube defined on all specified dimensions with their basic levels is called a basic cube; otherwise, it is called an aggregated cube. A cube is computed from a cube; in particular, an aggregated cube is computed from the cube on which the aggregation is based. GMD introduces the notion of a data warehouse state. A data warehouse state is a collection of cells (with their dimensions and measures). A data warehouse state is legal if it satisfies the cube conditions.
Definition 2 (EER Semantics) A data warehouse state $I = \langle \Delta, \Gamma, \cdot^{I} \rangle$ over the signature $\langle \mathcal{E}, \mathcal{R}, \mathcal{A}, \mathcal{U}, \mathcal{V}, \preceq, card, \{|\,|\} \rangle$ with respect to the EER schema E is constituted by
• $\Delta$, a nonempty finite set assumed to be different from all domains,
• $\Gamma$, a finite set of (concrete) domains,
• $\cdot^{I}$, an interpretation function such that
– $V^{I} \subseteq \Gamma$ for each $V \in \mathcal{V}$, where $V^{I}$ is disjoint from any other $W^{I}$ such that $W \in \mathcal{V}$,
– $E^{I} \subseteq \Delta$ for each $E \in \mathcal{E}$, where $E^{I}$ is disjoint from any other $E'^{I}$ such that $E' \in \mathcal{E}$,
– $A^{I} \subseteq \Delta \times V^{I}$ for each $A \in \mathcal{A}$ and for some $V \in \mathcal{V}$,
– $R^{I} \subseteq \Delta \times \ldots \times \Delta = \Delta^{k}$ for each $k$-ary relationship $R \in \mathcal{R}$, such that a tuple $r \in R^{I}$ is of the form $[U_1 : e_1, \ldots, U_k : e_k]$, where $e_i \in E_i^{I}$ for each $i \in \{1, \ldots, k\}$.
A tuple $r \in R^{I}$ over $\Delta$ can be viewed as a function that maps each ER-role $U_i$ to $e_i \in E_i^{I}$ and is denoted by $[U_1 : e_1, \ldots, U_k : e_k]$, i.e., $r[U_i] = e_i \in E_i^{I}$ for each $i = 1, \ldots, k$.
The elements of $E^{I}$, $A^{I}$, and $R^{I}$ are called instances of $E$, $A$, and $R$, respectively.
A data warehouse state $I = \langle \Delta, \Gamma, \cdot^{I} \rangle$ is said to be legal for an EER schema E if it satisfies the following:
• $E_1^{I} \subseteq E_2^{I}$ for each is-a link in E between two entities $E_1$, $E_2$ in E such that $E_1 \preceq E_2$. Similarly, $R_1^{I} \subseteq R_2^{I}$ for each is-a link between relationships $R_1$, $R_2$ in E such that $R_1 \preceq R_2$.
• $A^{I}(e) \in V^{I}$ for each $e \in E^{I}$, where $A \in \mathcal{A}$ is an attribute of $E$ with domain $V \in \mathcal{V}$. Similarly, $A^{I}(r) \in V^{I}$ for each $r \in R^{I}$, where $A \in \mathcal{A}$ is an attribute of $R$ with domain $V \in \mathcal{V}$.
• $R^{I} \subseteq E_1^{I} \times \ldots \times E_k^{I}$ for each relationship $R$ in E connected to entities $E_1, \ldots, E_k$ in E.
• $cmin(E, R, U) \leq \#\{r \in R^{I} \mid r[U] = e\} \leq cmax(E, R, U)$ for each $U \in \mathcal{U}$ associated to $R \in \mathcal{R}$ and $E \in \mathcal{E}$ in E, for each $e \in E^{I}$, and cardinality constraint $card(U) = (min, max)$ associated with ER-role $U$, where $cmin(E, R, U) = min$ and $cmax(E, R, U) = max$.
• for each disjoint-total construct in E where $E$ is a superclass and $E_1, \ldots, E_n$ are subclasses (partitions), the following must hold:
$E_i^{I} \subseteq E^{I}$ for each $i = 1, \ldots, n$, and
$E_i^{I} \cap E_j^{I} = \emptyset$ for each $i \neq j$, and
$E^{I} \subseteq E_1^{I} \cup \ldots \cup E_n^{I}$
• for two connected levels $L_i$, $L_j$ (each one being an entity or simple aggregation) in E, with $L_i, L_j \in \mathcal{E}$, there must be a (possibly partial) roll-up function $\rho_{L_i,L_j}$ such that $\rho_{L_i,L_j}(x) = y$ with $x \in L_i^{I}$ and $y \in L_j^{I}$.
We define the reflexive transitive closure $\rho^{*}_{L_i,L_j}$ of the roll-up function (from $L_i$ to any higher level $L_j$ if there is a level $L_k$ along the path between $L_i$ and $L_j$) inductively as follows:
$\rho^{*}_{L_i,L_i} = id$
$\rho^{*}_{L_i,L_j} = \bigcup_{k} \rho_{L_i,L_k} \circ \rho^{*}_{L_k,L_j}$ for each $k$ such that $L_i \nearrow L_k$ (i.e., $L_i$ rolls up directly to $L_k$)
where
$(\rho_{L_p,L_q} \cup \rho_{L_r,L_s})(x) = y$ iff
$\rho_{L_p,L_q}(x) = \rho_{L_r,L_s}(x) = y$, or
$\rho_{L_p,L_q}(x) = y$ and $\rho_{L_r,L_s}(x) = \bot$, or
$\rho_{L_p,L_q}(x) = \bot$ and $\rho_{L_r,L_s}(x) = y$
• for each fact $F \in \mathcal{F}$ (being a weak entity) in E with $p$ dimensions $D_1, \ldots, D_p$ (each one being an identifying relationship) and corresponding $p$ levels $L_1, \ldots, L_p$ (each one being an entity or simple aggregation) in E for $i = 1, \ldots, p$, and $M_j \in \mathcal{A}$, $V_j \in \mathcal{V}$ for $j = 1, \ldots, m$, the following holds (GMD cube conditions):
1. $\forall f.\ F(f) \rightarrow \exists l_1, \ldots, l_p.\ D_1(f) = l_1 \wedge L_1(l_1) \wedge \ldots \wedge D_p(f) = l_p \wedge L_p(l_p)$
2. $\forall f, f', l_1, \ldots, l_p.\ F(f) \wedge F(f') \wedge D_1(f) = l_1 \wedge D_1(f') = l_1 \wedge \ldots \wedge D_p(f) = l_p \wedge D_p(f') = l_p \rightarrow f = f'$
• for each aggregation $G$ in E involving $n$ dimensions $D_1, \ldots, D_n$ and $n$ levels $R_1, \ldots, R_n$ (one per dimension) and a fact $F$:
$G \doteq F\{D_1|_{R_1}, \ldots, D_n|_{R_n}\}$
where $F \doteq E\{D_1|_{L_1}, \ldots, D_p|_{L_p}\}$ such that $n \leq p$,
the following must hold:
$\forall g.\ G^{I}(g) \leftrightarrow g = \{|\, f \mid F^{I}(f) \wedge \bigwedge_{h=1,\ldots,p} \rho^{*}_{L_h,R_h}(D_h^{I}(f)) = D_h^{I}(g) \,|\}$
for each $n \leq p$, where $\{|\,|\}$ denotes aggregation.
Each aggregated cell is an aggregation of the cells whose coordinates roll up to the coordinates of that aggregated cell.
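The roll-up closure and the aggregation condition above can be sketched in Python as follows. This is a reading of the definitions under stated assumptions, not the author's implementation: roll-up functions are plain dictionaries, None stands for ⊥, the level hierarchy is assumed to be acyclic, disagreeing roll-up paths raise an error, and all data values are invented.

```python
from collections import defaultdict


def closure(direct, src, dst):
    """Return the reflexive transitive closure of the roll-up from `src` to `dst`.

    `direct` maps (lower level, higher level) to a dict holding the direct
    roll-up function between the two levels. The result is a function that
    returns None where the roll-up is undefined.
    """
    def rho_star(x):
        if src == dst:
            return x  # the closure from a level to itself is the identity
        results = set()
        for (lower, higher), rho in direct.items():
            if lower == src and x in rho:
                y = closure(direct, higher, dst)(rho[x])
                if y is not None:
                    results.add(y)
        if len(results) > 1:
            raise ValueError("roll-up paths disagree for %r" % (x,))
        return results.pop() if results else None
    return rho_star


def aggregate_cells(facts, base_levels, target_levels, direct):
    """Group fact identifiers into aggregated cells.

    `facts` maps a fact identifier to its coordinates {dimension: element}.
    A fact contributes to the aggregated cell whose coordinates are the
    roll-ups of the fact's own coordinates in every target dimension.
    """
    cells = defaultdict(list)
    for f, coords in facts.items():
        key = []
        for dim, target in target_levels.items():
            y = closure(direct, base_levels[dim], target)(coords[dim])
            if y is None:
                break  # the fact does not roll up in this dimension
            key.append(y)
        else:
            cells[tuple(key)].append(f)
    return dict(cells)


# Invented direct roll-up functions and facts.
direct = {
    ("Day", "Month"): {"2008-01-07": "2008-01", "2008-02-03": "2008-02"},
    ("Month", "Year"): {"2008-01": "2008", "2008-02": "2008"},
    ("Point", "Pointtype"): {"p1": "Cell", "p2": "PABX"},
}
facts = {
    "c1": {"Date": "2008-01-07", "Source": "p1"},
    "c2": {"Date": "2008-02-03", "Source": "p1"},
    "c3": {"Date": "2008-01-07", "Source": "p2"},
}
cells = aggregate_cells(
    facts,
    base_levels={"Date": "Day", "Source": "Point"},
    target_levels={"Date": "Year", "Source": "Pointtype"},
    direct=direct,
)
```

The example aggregates from Day straight to Year; the intermediate level Month is traversed by the closure but does not appear in the result.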
Thus, a particular EER diagram denotes a set of data warehouse states. According to GMD, the data warehouse states denoted by an EER schema are legal if they satisfy the cube conditions (together with the aggregated cube conditions) imposed by the GMD schema; that is, the data warehouse states which conform to the constraints imposed by the GMD schema conform to the diagram itself and are its legal data warehouse states. If a diagram is inconsistent, then no data warehouse state can conform to it.
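To make the notion of a legal state concrete, the sketch below checks three of the conditions of Definition 2 (cardinality, disjoint-total and the cube conditions) on a tiny invented state. The function names, the dictionary encoding of tuples and the role names are assumptions of the example only.

```python
def cardinality_ok(rel_tuples, role, entity_instances, cmin, cmax):
    """Check cmin <= #{r | r[role] = e} <= cmax for every instance e."""
    return all(
        cmin <= sum(1 for r in rel_tuples if r[role] == e) <= cmax
        for e in entity_instances
    )


def disjoint_total_ok(superclass, subclasses):
    """Check that the subclasses are pairwise disjoint subsets of the
    superclass and that together they cover it."""
    covered = set()
    for sub in subclasses:
        if not sub <= superclass or covered & sub:
            return False
        covered |= sub
    return covered == superclass


def cube_ok(facts, levels):
    """Check the cube conditions: every fact has one coordinate per dimension,
    each lying in the level of that dimension, and no two facts share all
    their coordinates."""
    seen = set()
    for coords in facts.values():
        if set(coords) != set(levels):
            return False
        if any(coords[d] not in levels[d] for d in levels):
            return False
        key = tuple(coords[d] for d in sorted(levels))
        if key in seen:
            return False
        seen.add(key)
    return True


# A tiny invented state for the telephone-call example.
days = {"d1", "d2"}
points = {"p1", "p2"}
calls = {
    "c1": {"Date": "d1", "Source": "p1", "Dest": "p2"},
    "c2": {"Date": "d1", "Source": "p2", "Dest": "p1"},
}
# Instances of the Date relationship as role -> instance mappings.
date_tuples = [{"call": c, "day": v["Date"]} for c, v in calls.items()]
levels = {"Date": days, "Source": points, "Dest": points}
```

On this state, cardinality_ok(date_tuples, "call", calls, 1, 1) and cube_ok(calls, levels) both hold, whereas adding a second call with the coordinates of c1 would violate the second cube condition.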
5 Related Work
Several proposals for a conceptual model exist in the data warehouse field. Only the proposals by (Golfarelli et al. 1998, Sapia et al. 1998, Tryfona et al. 1999, Husemann et al. 2000, Zepeda & Celma 2006) address the conceptual model in a genuine fashion. The proposals by (Perez et al. 2005, Berenguer et al. 2005) address a UML model based on the star schema (Perez et al. 2005) and propose quality indicator metrics for the conceptual model (Berenguer et al. 2005), although their work is based on UML modeling.
In (Golfarelli et al. 1998), a Dimensional Fact Model (DFM) is constructed from an operational ER schema based on requirement analysis. The construction methodology is well defined. It is relational and is based on the star schema. DFM does not support generalisation/specialisation hierarchies or many-to-many relationships. In a similar manner, Zepeda and Celma (Zepeda & Celma 2006) presented a Model Driven Architecture (MDA) for producing candidate multidimensional schemas from an operational ER schema based on requirement analysis. Each candidate schema is based on the star schema. However, their model supports generalisation hierarchies and many-to-many relationships. A mapping is presented for transforming a candidate (multidimensional) ER schema into cubes, dimensions, levels, and measures. In both DFM and MDA, no aggregation is defined at the conceptual schema level.
In (Kimball 1997, 1996), a multidimensional modeling manifesto using a multidimensional view of enterprise data is proposed; it is a relational implementation in the form of a “star schema”. This approach is not conceptual, in the sense that it is not independent of the implementation.
A multidimensional conceptual model called MultiDimER, based on the ER model, has been proposed in (Malinowski & Zimanyi 2006). The model is based on star and snowflake schemas. Features such as generalisation/specialisation hierarchies, composite attributes, aggregations, etc. have not been considered in this model. The model is well defined. It is based on the ER model and its logical representations. A conceptual model proposed by (Abello et al. 2006) is based on UML and its extensions; it emphasises part-whole relationships for aggregation but does not support aggregations at the schema level.
None of these proposals addresses the conceptual structure of aggregation. They only derive a basic multidimensional schema from the given ER schema. Moreover, all these models require design steps such as information analysis and requirement analysis and specification (Golfarelli et al. 1998, Husemann et al. 2000) to be carried out manually. Only the proposal by (Franconi & Sattler 1999) for a data warehouse conceptual model presents the structure of multidimensional aggregation, and it automates the construction of a multidimensional conceptual schema from an ER diagram. The CGMD model is purely conceptual and addresses the issues from data warehouse construction to aggregations and view management. CGMD takes care of all constraints of the standard ER model in addition to the multidimensional constraints. This indicates that CGMD is syntactically and semantically richer than the other models.
6 Evaluation of the CGMD model
In this section, we evaluate the CGMD data model according to criteria found in the literature for a multidimensional conceptual model. We also compare CGMD with other models. For the evaluation, we consider some of the criteria listed in (Blaschka et al. 1998), the nine requirements introduced in (Pedersen & Jensen 1999), and several requirements found in (Abello et al. 2001) for a data warehouse multidimensional model. We also consider some additional requirements which are important for a data warehouse multidimensional conceptual model. All these requirements are listed below in no particular order.
1. Implementation independent (Blaschka et al.
1998):
2. Explicit Separation of Structure and Con-
tents (Blaschka et al. 1998):
3. Explicit hierarchies (Blaschka et al. 1998, Peder-
sen & Jensen 1999): A model should support the
explicit hierarchy in the dimension.
4. Symmetric treatment for dimensions and mea-
sures (Blaschka et al. 1998, Pedersen & Jensen
1999): A model should allow measures to be
treated as dimensions and vice versa.
5. Multiple hierarchies in dimension (Pedersen &
Jensen 1999):
6. Dimension/level attributes (Abello et al. 2001):
A model should specify the attributes that do
not define hierarchies.
7. Support for aggregation (Pedersen & Jensen
1999): A model should be able to provide mean-
ingful aggregations.
8. Complex Measures (Blaschka et al. 1998): A
model should support multiple and complex mea-
sures for the same fact (cube).
9. Handling different levels of granularity (Pedersen
& Jensen 1999):
10. Support for non-onto hierarchies (Pedersen &
Jensen 1999): A model should support non-onto
(unbalanced) hierarchies, i.e., hierarchies with
paths of different lengths.
11. Support for non-strict hierarchies (Pedersen &
Jensen 1999): A model should support non-strict
hierarchies.
12. Support for many-to-many relationships (Peder-
sen & Jensen 1999):
13. Generalisation/specialisation hierarchies (Abello et al. 2001): A model should support generalisation/specialisation (is-a) relationships.
14. Handling change over time (Pedersen & Jensen
1999):
15. Handling uncertainty
16. Multi-cube/fact schema (Abello et al. 2001): A model should support multiple cubes/facts in a schema.
Since in CGMD one cube/fact is based on another, it (CGMD) allows
17. Summarisability: A model should support summarisation (Lenz & Shoshani 1997).
18. User defined Aggregation functions (Abello et al.
2001): A model should support user defined ag-
gregation functions.
19. Drill-across (Abello et al. 2001): A model should
allow to drill-across (sharing dimensions).
20. Dimensionless aggregation
21. Measureless aggregation
22. Aggregation from aggregation (view over view)
The CGMD model fulfills all the above requirements
1–22 except requirements 4, 12 and 14. Requirements
4 and 12 are partially supported; however,
requirement 4 can be fully supported if the measure
is used as a coordinate. Requirement 14 is not
supported, as no syntactical provision is made for
changing dimensions/levels in CGMD. In addition,
the CGMD model supports aggregations at any higher
level (ignoring intermediate levels). This is one
of the important characteristics of the CGMD model.
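As a rough illustration of aggregating at any higher level while
ignoring intermediate levels, the sketch below composes roll-up
functions along a Day, Month, Year path of a Date dimension and
aggregates call counts directly from Day to Year. It is only a sketch
under stated assumptions: the dictionaries, the sample values and the
helper names `compose` and `aggregate` are hypothetical and are not
part of the CGMD definition.

```python
from collections import defaultdict

# Hypothetical roll-up functions, one per pair of adjacent levels.
day_to_month = {"01/01": "Jan", "31/01": "Jan", "01/12": "Dec"}
month_to_year = {"Jan": "Yr1", "Dec": "Yr1"}


def compose(lower, upper):
    """Compose two roll-up functions into one that skips the middle level."""
    return {x: upper[y] for x, y in lower.items() if y in upper}


def aggregate(cells, rollup, agg=sum):
    """Group measures by the image of their coordinate under `rollup`."""
    groups = defaultdict(list)
    for coordinate, measure in cells.items():
        if coordinate in rollup:  # roll-up functions may be partial
            groups[rollup[coordinate]].append(measure)
    return {level: agg(values) for level, values in groups.items()}


# Base cube: number of calls per day (made-up values).
calls_by_day = {"01/01": 10, "31/01": 5, "01/12": 7}

# Aggregate straight from Day to Year, ignoring the Month level.
day_to_year = compose(day_to_month, month_to_year)
calls_by_year = aggregate(calls_by_day, day_to_year)

# Aggregate to the intermediate level as well, for comparison.
calls_by_month = aggregate(calls_by_day, day_to_month)
```

For a distributive function such as sum, aggregating the monthly
totals once more along Month to Year gives the same yearly result,
which is the situation of requirement 22 (aggregation from
aggregation). An average of monthly averages, by contrast, can differ
from the yearly average.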
7 Comparison of CGMD with other models
In this section, we evaluate other models against the
same requirements listed in Section 6, and compare
them with the CGMD model, which is already evaluated
in Section 6.
We consider the models proposed in (Golfarelli
et al. 1998, Sapia et al. 1998, Tryfona et al. 1999,
Husemann et al. 2000) for comparison, as they are
conceptual models in the true sense. However, the
models proposed in (Tsois et al. 2001, Abello et al.
2001, Pei 2003, Jensen et al. 2004) and (Trujillo
et al. 2001) (an object oriented model, an extension
of (Trujillo et al. 2000)) are also taken into
consideration, because they are current
state-of-the-art models and are also conceptual in
some way or other. The table in Figure 6 presents a
summary of the comparison of these models. As before,
if a model meets a particular
requirement/feature/functionality fully, then it is
denoted by “√”. If a model supports the requirement
partially, then it is denoted by “p”, and if a model
does not support it at all, then it is denoted by “x”.
Requirement 1 (Implementation independent) is
partially supported by the model of (Abello et al.
2001), since it is a multi-star schema based on the
concepts of the star model. The remaining models
(Kimball 1996, Golfarelli et al. 1998, Jensen et al.
2004, Pei 2003) cannot be considered implementation
independent, as they are either relational (Kimball
1996), based on the star schema, which is a
relational model (Golfarelli et al. 1998), or
designed for an implementation in a specific domain
(Jensen et al. 2004, Trujillo et al. 2001, Pei 2003),
for example, the clinical domain (Jensen et al. 2004),
and thus provide no support.
Requirement 2 (Explicit Separation of Structure
and Contents), requirement 3 (Explicit hierarchies),
requirement 5 (Multiple hierarchies) and
requirement 8 (Complex measures) are met by all of
the models (except the star schema (Kimball 1996)
for requirement 3), which thus provide full support.
The star schema does not support explicit
hierarchies, and consequently does not support
requirement 3.
Requirement 6 (Dimension/level attributes). Only
one model (Jensen et al. 2004) does not specify the
attributes that do not define hierarchies, and hence
provides no support. The remaining models specify the
non-dimension attributes, and thus provide full
support.
Requirement 4 (Symmetric treatment for dimensions
and measures). Only one model (Jensen et al. 2004)
captures this feature, by means of a derivation
mechanism, thus providing full support. Some models
(Abello et al. 2001, Tryfona et al. 1999) do not
treat measures explicitly as dimensions, but do so
conditionally when the measure is used to identify a
cell, thus providing partial support. The remaining
models (Golfarelli et al. 1998, Sapia et al. 1998,
Husemann et al. 2000, Tsois et al. 2001, Trujillo
et al. 2001, Pei 2003) do not consider this feature
in the framework and thus provide no support.
Requirement 7 (Support for correct aggregation) is
met by very few models (Abello et al. 2001, Jensen
et al. 2004), either by derivation through some
operations (Abello et al. 2001) or by restricting
hierarchies to be strict, covering and onto through
some derivations, so that data will not be double
counted; these models thus provide full support. Only
one model (Tsois et al. 2001) does not support this
requirement, because it includes many-to-many
relationships between facts and dimensions and among
the hierarchies. The remaining models partially
support this feature, either by restricting the
dimensions/hierarchies and aggregation functions
(Golfarelli et al. 1998) or by restricting
hierarchies to be strict, onto and covering, and
restricting aggregation functions (Sapia et al. 1998,
Husemann et al. 2000, Tsois et al. 2001, Trujillo
et al. 2001, Pei 2003).
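The double counting that strict hierarchies are meant to prevent can
be made concrete with a small sketch. It is an illustration under
assumed data only: the phone points `p1` to `p3`, the call counts and
the helpers `rollup_sum` and `is_strict` are hypothetical and are not
taken from any of the compared models.

```python
def rollup_sum(measures, parents):
    """Add each child's measure to every parent listed for that child."""
    totals = {}
    for child, value in measures.items():
        for parent in parents.get(child, ()):
            totals[parent] = totals.get(parent, 0) + value
    return totals


def is_strict(parents):
    """A hierarchy is strict when no child has more than one parent."""
    return all(len(ps) <= 1 for ps in parents.values())


# Number of calls per phone point (made-up values).
calls = {"p1": 100, "p2": 50, "p3": 30}

# Strict hierarchy: every point rolls up to exactly one customer type.
strict = {"p1": ["Consumer"], "p2": ["Consumer"], "p3": ["Business"]}

# Non-strict hierarchy: p2 rolls up to two customer types.
non_strict = {
    "p1": ["Consumer"],
    "p2": ["Consumer", "Business"],
    "p3": ["Business"],
}

strict_totals = rollup_sum(calls, strict)
non_strict_totals = rollup_sum(calls, non_strict)

# With the strict hierarchy the grand total is preserved; with the
# non-strict one the calls of p2 are counted under both parents.
grand_total = sum(calls.values())
strict_grand_total = sum(strict_totals.values())
non_strict_grand_total = sum(non_strict_totals.values())
```

In this sketch the strict roll-up preserves the grand total of 180
calls, whereas the non-strict roll-up yields 230, because the 50 calls
of `p2` are counted twice.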
Requirement 9 (Handling different levels of
granularity). Very few models (Tsois et al. 2001,
Abello et al. 2001) capture different levels of
granularity (i.e., measures at different levels of
granularity), either by aggregation (Tsois et al.
2001) or by specialising the cells depending on
whether the measure is derived or not. Only one model
(Jensen et al. 2004) captures this feature partially,
through some derivation. The rest of the models do
not capture this feature, and hence provide no
support.
Requirement 10 (Support for non-onto hierarchies)
is fully supported by only three models (Trujillo
et al. 2001, Tsois et al. 2001, Jensen et al. 2004).
The remaining models provide no support.
Requirement 11 (Support for non-strict hierarchies)
is fully supported by only two models (Jensen et al.
2004, Tsois et al. 2001). The remaining models
provide no support.
Requirement 12 (Support for many-to-many
relationships). The model of (Tsois et al. 2001)
supports many-to-many relationships between facts and
dimensions but does not support many-to-many
relationships between hierarchies. Of the other
models, only four (Tryfona et al. 1999, Trujillo
et al. 2001, Abello et al. 2001, Jensen et al. 2004)
support many-to-many relationships both between facts
and dimensions and between hierarchies. The rest do
not support many-to-many relationships.
Requirement 13 (Generalisation/specialisation
hierarchies). The support provided by (Sapia et al.
1998, Tryfona et al. 1999, Abello et al. 2001,
Trujillo et al. 2001, Jensen et al. 2004) is
considered partial, because
generalisation/specialisation is considered in the
hierarchy but is rather kept to distinguish the
contents. The remaining models do not support this
feature.
Requirement 14 (Change over time in data) and
Requirement 15 (Uncertainty in data) are supported by
the model of (Jensen et al. 2004) by attaching a time
tag attribute to dimension values for probabilistic
measurements of occurrences of facts and dimension
values. The model of (Abello et al. 2001) supports
requirement 14 only by changing the schema with
appropriate time tags, but does not support
requirement 15. The remaining models support neither
feature.
Requirement 16 (Multi-cube/fact schema). Only one
model (Abello et al. 2001) allows multiple stars in a
single schema. However, which dimensions/levels
belong to which star schema is not clearly reflected
in this model in any way, so it provides partial
support. The remaining models do not meet this
requirement, and thus provide no support.
Requirement 17 (Summarisability). Only the models of
(Abello et al. 2001) and (Jensen et al. 2004) provide
full support, either by applying some algebraic
operations to make part-whole relationships between
levels and then applying aggregation functions
(Abello et al. 2001), or by computing a weighting
factor between facts and dimensions and making a full
covering relationship between levels in the
aggregation path (Jensen et al. 2004). Some models,
such as (Trujillo & Palomar 1998, Trujillo et al.
2000, 2001), (Golfarelli et al. 1998) and (Tryfona
et al. 1999), specify possible functions that can be
applied in order to support summarisation, but do not
provide a specific operation for applying them, and
thus provide only partial support. The remaining
models do not support summarisation.
Requirement 18 (User defined aggregation functions).
Only the model of (Abello et al. 2001) supports this
feature, since it is based on UML, which allows user
defined operations. The rest of the models provide no
support.
Requirement 19 (Drill-across). Only the model of
(Abello et al. 2001) allows drill-across, because
dimensions are shared among its multiple stars, thus
providing full support. Some models, such as Star
(Kimball 1996) (constellation) and DFM (Golfarelli
et al. 1998), share dimensions but limit the drilling
mechanism, and hence provide partial support. The
remaining models provide no support.
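To make the notion of drill-across through shared dimensions more
tangible, the following minimal sketch combines the measures of two
cubes on the coordinates they have in common. It assumes that both
cubes are indexed by the same pair of levels (weekday and customer
type); the second cube `complaints`, all values and the helper
`drill_across` are hypothetical and serve only as an illustration.

```python
def drill_across(cube_a, cube_b):
    """Pair up the measures of two cubes on their common coordinates."""
    shared = cube_a.keys() & cube_b.keys()
    return {coord: (cube_a[coord], cube_b[coord]) for coord in sorted(shared)}


# Two cubes over the same dimensions (weekday, customer type).
calls = {
    ("Mon", "Consumer"): 120,
    ("Mon", "Business"): 300,
    ("Tue", "Consumer"): 90,
}
complaints = {
    ("Mon", "Consumer"): 4,
    ("Tue", "Consumer"): 1,
    ("Tue", "Business"): 2,
}

# Only coordinates present in both cubes appear in the result.
combined = drill_across(calls, complaints)
```

Cells that occur in only one of the two cubes are dropped here; a
different design could keep them with a missing value for the other
measure.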
Requirement 20 (Dimensionless aggregation) and
Requirement 22 (Aggregation from aggregation) are
not met by any of the other models. Requirement 21
(Measureless aggregation) is partially supported by
only two models, star (Kimball 1996) and DFM
(Golfarelli et al. 1998), which explore the
possibility of having a factless (measureless) fact
but do not address or reflect aggregation in any way.
Model / Requirements (Criteria/Features/Functionalities)
Model    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
star     x  p  x  p  √  p  √  x  x  x  x  x  p  x  x  p  p  x  p  x  p  x
DFM      √  √  √  x  √  √  p  √  x  p  x  x  x  x  x  p  p  x  p  x  p  x
ME/R     x  √  √  x  √  √  x  √  x  x  x  x  p  x  x  p  x  x  x  x  x  x
starER   x  √  √  p  √  √  p  √  x  x  x  √  p  x  x  x  p  x  x  x  x  x
DWPM     x  x  √  x  √  √  p  √  x  p  x  x  p  x  x  x  p  x  x  x  x  x
OOCM     x  √  √  x  √  √  p  √  x  √  x  √  p  x  p  p  x  x  x  x  x  x
MAC      x  √  √  x  √  √  x  √  √  √  √  √  x  x  x  x  x  x  x  x  x  x
YAM      √  √  √  p  √  √  √  √  √  x  x  √  p  p  x  √  √  √  √  x  x  x
GOLAP    x  √  √  x  √  √  p  √  x  p  x  √  x  x  x  x  p  x  x  x  x  x
EMDM     x  √  √  √  √  √  p  √  p  √  √  √  x  √  √  x  p  x  x  x  x  x
CGMD     √  √  √  p  √  √  √  √  √  √  √  p  √  x  √  √  √  √  √  √  √  √
Figure 6: Comparison between CGMD and other Multidimensional Models
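The entries of Figure 6 can also be tallied mechanically. The sketch
below is only an illustration: it encodes each row as a string of 22
marks, writing `y` for full support, `p` for partial support and `x`
for no support, and it assumes that every mark lost in the extracted
table was a full-support mark. The names `ROWS`, `tally` and `unmet`
are hypothetical.

```python
from collections import Counter

# One row per model, one mark per requirement 1..22.
# y = full support, p = partial support, x = no support.
ROWS = {
    "star":   "x p x p y p y x x x x x p x x p p x p x p x",
    "DFM":    "y y y x y y p y x p x x x x x p p x p x p x",
    "ME/R":   "x y y x y y x y x x x x p x x p x x x x x x",
    "starER": "x y y p y y p y x x x y p x x x p x x x x x",
    "DWPM":   "x x y x y y p y x p x x p x x x p x x x x x",
    "OOCM":   "x y y x y y p y x y x y p x p p x x x x x x",
    "MAC":    "x y y x y y x y y y y y x x x x x x x x x x",
    "YAM":    "y y y p y y y y y x x y p p x y y y y x x x",
    "GOLAP":  "x y y x y y p y x p x y x x x x p x x x x x",
    "EMDM":   "x y y y y y p y p y y y x y y x p x x x x x",
    "CGMD":   "y y y p y y y y y y y p y x y y y y y y y y",
}


def tally(row):
    """Count full, partial and missing support in one table row."""
    marks = row.split()
    if len(marks) != 22:
        raise ValueError("each row must hold exactly 22 marks")
    return Counter(marks)


def unmet(row):
    """Return the requirement numbers that are not fully supported."""
    return [i for i, mark in enumerate(row.split(), start=1) if mark != "y"]


summary = {model: tally(row) for model, row in ROWS.items()}
cgmd_unmet = unmet(ROWS["CGMD"])
```

Under this encoding, `cgmd_unmet` lists requirements 4, 12 and 14,
which agrees with the evaluation of CGMD given in Section 6.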
8 Conclusions and Future work
There are several proposals on multidimensional
modeling and data warehouse design. So far, there is
no consensus on a modeling and design method (Rizzi
& Abello 2006). The CGMD data model gives a uniform
way of modeling multidimensional concepts, data
warehouse design and aggregations, thus generalising
the design of data warehouses and providing a uniform
way for view management. It is a framework in which
to translate and compare the conceptual properties
and expressive power of different models and related
extensions. The CGMD model is intended to help
cost-effective design of the data warehouse, update
propagation and view management of multidimensional
data. Our future work is based on the translation of
CGMD into Description Logic, a language for reasoning
mechanism.
Acknowledgements
The major portion of this work was supported by the
School of Computer Science, University of Manchester,
United Kingdom, and the Free University of
Bolzano, Italy.
References
Abello, A., Samos, J. & Saltor, F. (2001), YAM (yet
another multidimensional): An extension of UML,
in ‘Technical Report LSI-01-43-R of Departament
de Llenguatges i Sistemes Informàtics’.
Abello, A., Samos, J. & Saltor, F. (2006), Yam2: A
multidimensional conceptual model extending uml,
in ‘Information Systems, 31(6)’, pp. 541–567.
Agrawal, R., Gupta, A. & Sarawagi, S. (1997), Model-
ing multidimensional databases, in ‘Proc. of ICDE-
97’.
Berenguer, G., Romero, R., Trujillo, J., Serrano, M.
& Piattini, M. (2005), A set of quality indicators and
their corresponding metrics for conceptual models
for data warehouses, in ‘Proc. International Con-
ference on Data Warehousing and Knowledge Dis-
covery (DaWaK’05)’, pp. 95–104.
Blaschka, M., Sapia, C., Hofling, G. & Dinter, B.
(1998), Finding your way through multidimen-
sional data models, in ‘Proc. 9th International
Workshop on Database and Expert Systems Ap-
plications (DEXA’98)’, pp. 198–203.
Borgida, A., Lenzerini, M. & Rosati, R. (2003), De-
scription logics for databases., in F. Baader, D. Cal-
vanese, D. McGuinness, D. Nardi & P. Patel-
Schneider, eds, ‘Description Logic Handbook’,
Cambridge University Press, chapter 16, pp. 462–
484.
Calvanese, D., Lenzerini, M. & Nardi, D. (1998), De-
scription logics for conceptual data modeling, in
‘Chomicki, Jan and Saake, Gunter, editors 1998,
Logics for Databases and Information Systems.
Kluwer’, pp. 229–264.
Elmasri, R. & Navathe, S., eds (2000), Fundamen-
tals of Database Systems, Third Edition, Addison-
Wesley.
Franconi, E. & Kamble, A. (2003), The GMD data
model for multidimensional information: a brief in-
troduction, in ‘Proc. of 5th International Confer-
ence on Data Warehousing and Knowledge Discov-
ery (DaWaK-03)’, pp. 55–65.
Franconi, E. & Kamble, A. (2004a), Data warehouse
conceptual data model, in ‘Proc. of 16th
International Conference on Scientific and
Statistical Database Management (SSDBM)’.
Franconi, E. & Kamble, A. (2004b), The GMD
data model and algebra for multidimensional in-
formation, in ‘Proc. of 16th International Con-
ference on Advanced Information Systems Engi-
neering (CAiSE 2004), Riga, Latvia, June 7-11’,
pp. 446–462.
Franconi, E. & Sattler, U. (1999), A data warehouse
conceptual data model for multidimensional aggre-
gation, in ‘Proc. of the Workshop on Design and
Management of Data Warehouses (DMDW-99)’,
pp. 13–1–13–10.
Golfarelli, M., Maio, D. & Rizzi, S. (1998), ‘The di-
mensional fact model: a conceptual model for data
warehouses’, IJCIS 7(2-3), 215–247.
Husemann, B., Lechtenborger, J. & Vossen, G.
(2000), Conceptual data warehouse modeling, in
‘Proc. of the International Workshop on Design and
Management of Data Warehouses (DMDW’2000),
Stockholm, Sweden, June 5-6’, pp. 6–1–6–11.
Jensen, C., Kligys, A., Pedersen, T. & Timko,
I. (2004), Multidimensional data modeling for
location-based services, in ‘The VLDB Journal
Vol.13’, pp. 1–21.
Kimball, R. (1996), The Data Warehouse Toolkit,
John Wiley & Sons, USA.
Kimball, R. (1997), ‘A dimensional modeling mani-
festo’, DBMS Magazine, August 10(9), 58–70.
Lez, H. J. & Shoshani, A. (1997), Summarizability in
OLAP and statistical databases, in ‘Proc. 9th In-
ternational Conference on Scientific and Statistical
Database Management (SSDBM)’.
Malinowski, E. & Zimanyi, E. (2006), Hierarchies in
a multidimensional model: From conceptual mod-
eling to logical representation, in ‘Data and Knowl-
edge Engineering, In press’.
Moody, D. & Kortink, M. (2000), From enterprise
models to dimensional models: A methodology
for data warehouse and data mart design, in
‘Proc. of the International Workshop on Design and
Management of Data Warehouses (DMDW’2000),
Stockholm, Sweden, June 5-6’, pp. 5–1–5–12.
Pedersen, T. & Jensen, C. (1999), Multidimensional
data modeling for complex data, in ‘Proc. of 15th
IEEE International Conference on Data Engineering
(ICDE-99)’, pp. 336–345.
Pei, J. (2003), A general model for online analytical
processing of complex data design and evolution,
in ‘Proc. of 22nd International Conference on Con-
ceptual Modeling (ER2003)’.
Perez, J., Berlanga, R. & Pedersen, T. (2005), A
relevance-extended multi-dimensional model for a
data warehouse contextualized with documents, in
‘Proc. 8th ACM International Workshop on Data
Warehousing and OLAP (DOLAP’05)’, pp. 19–28.
Phipps, C. & Davis, K. C. (2002), Automating
data warehouse conceptual schema design and evo-
lution, in ‘Proc. of the International Workshop
on Design and Management of Data Warehouses
(DMDW’2002)’, pp. 23–32.
Rizzi, S. & Abello, A. (2006), Research in data ware-
house modeling and design: Dead or alive?, in
‘Proc. 9th ACM International Workshop on Data
Warehousing and OLAP (DOLAP’06)’, pp. 3–10.
Sapia, C., Blaschka, M., Hofling, G. & Dinter, B.
(1998), Extending the er model for the multidi-
mensional paradigm, in ‘Proc. of ER Workshop’,
pp. 105–116.
Trujillo, J. & Palomar, M. (1998), An object ori-
ented approach to multidimensional databases con-
ceptual modelling, in ‘Proc. of First International
ACM Workshop Data Warehousing and OLAP’,
pp. 16–21.
Trujillo, J., Palomar, M. & Gomez, J. (2000),
Applying object oriented conceptual modeling
techniques to the design of multidimensional databases
and OLAP applications, in ‘Proc. of the First In-
ternational Conference on Web-Age Information
Management (WAIM’2000), Shanghai, China, June’,
pp. 83–94.
Trujillo, J., Palomar, M., Gomez, J. & Song, I.
(2001), Designing data warehouses with oo con-
ceptual models, in ‘IEEE Computer Vol.34(12)’,
pp. 66–75.
Tryfona, N., Busborg, F. & Christiansen, J. (1999),
starER: A conceptual model for data warehouse de-
sign, in ‘Proc. of ACM Second International Work-
shop on Data Warehousing and OLAP (DOLAP’99),
Kansas City, Missouri, USA, November’, pp. 3–8.
Tsois, A., Karayiannidis, N. & Sellis, T. (2001),
MAC: Conceptual data modelling for OLAP, in
‘Proc. of the International Workshop on Design and
Management of Data Warehouses (DMDW-2001)’,
pp. 5–1–5–13.
Zepeda, L. & Celma, M. (2006), A model driven ap-
proach for data warehouse conceptual design, in
‘Proc. 7th IEEE International Baltic Conference
on Databases and Information Systems (DBIS’06)’,
pp. 114–121.

The Holiday-Nonholiday level aggregates all holidays and non-holidays.. one-to-many relationships hold. Aggregation: We now perform the analysis of telephone calls along the arbitrary dimensions. generalisation/secialisation. hence. So.1 duration 1. this basic multidimensional entity may be useful for analysing the nature of telephone calls by considering.1 duration 1. the entity Calls represents a basic cube whose dimensions are Date. Land Line. Pointtype aggregates four basic point types (partitions) namely. Point. For example. An entity Point is partitioned (according to an attribute ”type”—a discriminator (Elmasri & Navathe 2000)) into four basic points and two higher level points. This part of the diagram makes still use of standard constructs.1 Source Figure 2: The Conceptual Schema for the Basic Multidimensional information for the base data considered in Figure 1 points.day/month/year no of calls holiday? weekday Day Date 1.1 Dest Point Day Date Calls 1. Taking these constraints into account. Thus. Figure 4 presents the conceptual schema . i. Level building: For building the aggregation level hierarchies for each dimension.1 Dest Point 1. Direct Line & PABX. however. we consider the following: • discriminator of an entity (Elmasri & Navathe 2000) • generalisation/specialisation hierarchy: creating a single entity of all subclasses (possibly disjoint) of a superclass. is exemplified in Figure 2. handling them is straightforward as suggested in (Moody & Kortink 2000). The conceptual multidimensional data model for this information (base data) we have obtained. Cell. Figure 3 presents the multidimensional conceptual schema including level hierarchies for each dimension. it is represented by means of a weak entity with respect to its dimensions. first three constraints namely. Similarly. hierarchy including the levels DAY. Dest and Source (identifying relationships) which are restricted to the basic levels Day. Month. 
the dimension related to the origin and the destination of the calls with respect to the type of phone point (associated to consumer or business customers). A conceptual schema for this query includes the definition of the basic cube (Figure 2) and the definition of the aggregation along the definitions of associated levels. discriminator. In this example.. and gain Point (associated entities) respectively. In running example. • one-to-many relationship • partial relationship • many-to-many relationship: converting into oneto-many relationships which are then converted into levels as suggested in (Moody & Kortink 2000). among others. However. partial relationships. The bold arrows (from lower level to higher level) denote hierarchy. we do not have many-to-many relationships. Qtr. Year) are created. Holiday or Nonholiday is an optional level as the relationship between Day and Holiday or Nonholiday is partial.e. Outer boxes indicate levels. and Customertype aggregates higher level two (partitions) points types (partitions) viz Consumer and Business. and inner boxes are their elements.1 type Source d Business d DirectLine PABX address code Calls Consumer d Cell LandLine Figure 1: The Conceptual Schema for the base data of telephone calls no of calls 1. The levels “Pointtype” and “Customertype” are created from the partitions of ”Point” entity. A basic multidimensional entity such as Calls described in the diagram of the figure 2 is using a standard star schema—i.e. Date dimension multiple hierarchies (for instance. in this case. a new aggregated entity Calls-by-Day-and-Pointtype denoting aggregations according to the basic level Day and the level Pointtype along the dimensions Date and Source respectively. a query “Analyse telephone calls by day and point type?” is a bi-dimensional cube along the Date and the Source dimensions involving the level Day and the level Pointtype respectively.

. Jan Feb Mar ... A simple aggregation aggregates the collection of objects that are in the extension of the aggregated entities. since entities Mon. by interleaving partitioning and simple aggregations.Sun form a partition of the entity Day.. Obviously. 01/01 . total no of calls and average duration in Figure 5 or Figure 4. 31/01 .. Oct Nov Dec d . The entity Calls-by-Weekday-and-Customertype denotes all the cells of a cube whose coordinates are the weekdays of the date of the calls. and can be computed with associated aggregation functions. and the customer types of the originators of the calls.... 31/01 . d . a new aggregated entity. Qtr4/Yr1 .. 01/12 ...e. Oct Nov Dec .. for example. A multidimensional aggregated entity is an entity itself in the ER diagram.. Figure 5 presents the conceptual schema in the variant of an ER model. d Month Jan Feb Mar .e.1 Source type d d Weekday Tue Wed Thu Fri Sat Sun Customertype Consumer Business d ... extension) holds the necessary constraints enforced for a cube by the GMD-based semantics (Franconi & Kamble 2003.1 duration Dest address code Point Day d Holiday-NonHoliday Holiday NonHoliday d Mon Year Yr1 .e.e. d Qtr d d Pointtype Qtr1/Yr1 . d . This indicates that the aggregations can be computed from pre-computed aggregations. The basic constructs of the ER schema are entities. “analysis of telephone calls by week day and customer type”.. d DAY 01/01 . one aggregating all customer phone points and the other aggregating all business phone points. one for all the Tuesdays. sum(no of calls) and average(duration) respectively) and can be part of further relationships or constraints. 31/01 . ERroles are the edges (links) between entities and relationships and are labeled with number restrictions called cardinality constraints.day/month/year holiday? weekday no of calls 1.. a first extension to the standard ER Model can be seen with simple aggregated entities—i. 
This particular way of presenting aggregation (entity) is adapted from UML (Unified Modeling Language) syntax. one for all the Mondays.. So.. On the other hand. as analysed by (Golfarelli et al. 01/12 . which represent dimensional levels built from the basic dimensional entities Day and Point respectively.. In this. etc. and it can have attributes (for instance.. relationships.. Now consider a multidimensional aggregated view....2 Extensions to the ER Model we are able to construct level hierarchies starting from some basic dimensional level... for this aggregated cube (query) in the variant of an Entity-Relationship model. An is-a link constraint is drawn as an arrow from more specific entity (subclass) to more general entity called superclass (respectively from more specific to more general relationship)... An attribute is drawn as a circle or oval outside or around the attribute symbol (attribute name). Qtr1/Yrn . the Weekday entity denotes exactly seven objects. 31/12 Figure 3: The Multidimensional Conceptual Schema for the data considered in Figure 1. 2.. Such an entity (i. i... in our example. Qtr4/Yrn Cell LandLine DirectLine PABX d ... composing telephone calls along the Date and the Source dimensions involving levels Weekday and Customertype respectively. 1998). d . non-dimensional aggregations— Weekday and Customertype............ This bi-dimensional aggregated view is actually computed from an aggregated cube of Figure 4.. The disjoint-total constraint is drawn with a . the aggregated entity Customertype denotes exactly two objects. Entity is drawn as a rectangle around the entity symbol (entity name). The conceptual schema for this aggregated view includes the definition of the basic cube and the definition of aggregation. whereas relationship between the entities is drawn as a diamond around the symbol (relationship name). 3 Syntax of the CGMD Data Model As described above. the functional dependencies exist among the levels of a hierarchy. 
Yrn Date Calls 1.. A second extension to the standard ER model is the multidimensional aggregated entity exemplified in Figure 5 by the entity Calls-by-Weekdayand-Customertype and in Figure 4 by the entity Calls-by-Day-and-Pointtype. 2004b).. and attributes.. say Calls-by-Weekday-and-Customertype (along with the definitions of the level Weekday and the level Customertype) denoting aggregations according to the level Weekday and the level Customertype along the Date and the Source dimensions respectively. i.1 1.. d .

. .U) ≤ cmax(E. S such that R S). cmax(U ) >∈ N × N where cmin(E.1 d Mon Tue Wed Thu Fri Sat Sun total no of calls average duration Consumer Business Weekday Calls-by-Weekday-and-Customertype Customertype Figure 5: The Conceptual Data Warehouse Schema for multidimensional aggregated view presenting Calls by Weekday and Customertype—an aggregated cube (view) circle having ”d” inside. . U ). • R is a finite set of relationship names. . the syntax of an Extended EntityRelationship (EER) model is as follows. En E where Ei = Ej for i = j.1 Dest Point address code type Day Date 1. E. . 1 ≤ i ≤ k. whereas multidimensional aggregated entity is drawn as shadowed rectangle attached with diamond. card.1 Dest address code Point type Source d Day Date Calls 1. . k = 1.R. Ei ∈ E and Ui ∈ U for each i. (respectively between two relationships R. R. U. . R. Weak entities are represented as double-lined rectangles. • { · | models aggregation over E. Definition 1 (EER schema) An EER schema is constructed over the signature S = < E. • U is a finite set of ER-role names. . . 1 ≤ i ≤ n. .1 Calls 1. Ek ∈ E for each k. • a finite set of disjoint-total constraints between more specific entities E1 . Aggregated entity (simple aggregation) is drawn as rectangle attached with diamond.R. .R. { · | > where | } • E is a finite set of entity names. connecting subclasses with the edges and a superclass with a double-lined arrow from the circle to a superclass.date/month/year holiday? weekday no of calls duration 1. Uk : Ek ] where R ∈ R. .1 Source d Consumer d Cell LandLine Business d DirectLine PABX total(no of calls) average(duration) Calls-by-Day-and-Pointtype Pointtype Figure 4: A cube composing calls by level Day and level Pointtype along Date and Source dimensions respectively date/month/year holiday? weekday no of calls 1. . A.1 duration 1. • V is a finite set of domain names. R ∈ R. .U) for each E ∈ E. • a finite . 
• ⊆ E ×E ∪ R×R and R An EER schema E over the signature S is • a finite set of entities E ∈ E.U) = < cmin(E. • a finite set of is-a link constraints between two entities E1 and E2 such that E1 E2 . Vi ∈ V. . U ∈ U. arity k such that R = [U1 : E1 . V. E ∈ E is in E for each i. . j ≤ n. each associated with an arity k. i ≤ n. . En and a more general entity E such that E1 E. | } • card is a function such that card(E.set of attribute constraints Ai ∈ A such that E = {A1 : V1 . • A is a finite set of attribute names. n is a binary relation over E . Formally. whereas identifying relationships are denoted by double-lined diamonds. An : Vn } where Ai ∈ A. . • a finite set of relationship constraints R of an . .

Dn and n entities L1 . for each i ∈ {1. The cardinality constraints (number restrictions) are associated to ER-role in order to restrict the number of times participation of each instance of an entity via that ER-role in instances of the relationship. . . 4 Semantics of the CGMD Data Model The semantics of an EER Schema is given in terms of legal data warehouse states. . . where VI is disjoint from any other WI such that W ∈ V – EI ⊆ ∆ for each E ∈ E. where EI is disjoint from any other E’I such that E’∈ E – AI ⊆ ∆ × V I for each A ∈ A. Such participation is represented by an ER-role. . each of which represents an association among different combination of instances of the entities that participate in the relationship. . i.e.• a finite set of aggregation links between entities. For we consider GMD. Γ. E2 in E such that E1 E2 Similarly RI ⊆ RI for each is-a link between 1 2 relationships R1 . .. . k. Γ. . Each instance of n-dimensional aggregation is called a cell which is a composition of n elements (instances) of n levels (one element/instance per level) of n dimensions involved in the aggregation. R2 i E such that R1 R2 . A relationship denotes a set of tuples called its instances. . A fact is represented as weak entity (aggregated fact as aggregated weak entity). i. the logical multidimensional data model introduced in (Franconi & Kamble 2003). The GMD introduces a notion of data warehouse state. An is-a link is modelled by and is used to denote the inclusion between two entities (respectively between two relationships) and therefore more specific entity (respectively relationship) inherits properties of more general entity (respectively relationship). An n-dimensional aggregation is represented by an aggregated weak entity connecting n relationships (dimensions). ·I > over the signature < E. We consider as a starting point the ER semantics introduced in (Calvanese et al. Before giving the formal semantics of the EER model. r[Ui ] = ei ∈ Ei for each i. 
each aggregation link in E is drawn as an edge with attached diamond at one end • a finite set of simple aggregations G ∈ E involving n entities F1 . where ei ∈ EI . respectively. n entities (levels) by connecting them via n circles (one per relationship and per entity). . i A tuple r ∈ RI over ∆ can be viewed as a function that maps each ER-role Ui to ei ∈ Ei and is denoted by [U1 :e1 . U. and RI are called instances of E. Each instance of weak entity is a composition of instances of participating entities (one instance per entity). each having minimum and maximum cardinalities equal to 1. . coordinates and measures. Each ER-role is assigned a unique name. and for some V ∈ V – RI ⊆ ∆ × . . Real. A data warehouse state is a collection of cells (with their dimensions and measures). . A data warehouse state I = < ∆. cells. Simple aggregation involves a finite set of entities on which it is based on. .Uk :ek ]. Definition 2 (EER Semantics) A data warehouse state I = < ∆. The dimensions are represented as relationships (weak relationships) and levels are represented as entities (also including aggregated entities). An aggregated cube is computed from a cube on which the aggregation is based on. and a weak entity (fact on which aggregation is based on) connecting to it (with line). A central element in GMD is a cube. ·I > is said to be legal for an EER schema E. n levels (each being an entity or aggregated entity) and a fact (weak entity) on which aggregation is based on. card. An entity denotes a set of objects called instances that have common properties. Fn (each being connected with an aggregation link to G) • a finite set of aggregations G involving (connecting) n relations D1 . The minimum cardinality is either 0 (zero) or 1 (one) and maximum cardinality is either 1 or ∞. V. Simple aggregation is represented by an aggregated entity which is a composition of entities (to which it is connected with lines). .e. aggregation. In rest of the paper. 
The elementary properties are modelled with attributes whose values belong to one of several predefined domains such as Integer. A data warehouse state is legal if it satisfies the above cube conditions. . .e. R. A roll-up link between two levels (entities) is modeled by a roll-up function ρ which maps lower level elements (instances) to higher level elements (instances). . The elements of EI . A. That is an n-dimensional aggregation involves n dimensions. . A cube is computed from a cube. recasted to cope with multidimensional information. if it satisfies the following: • EI ⊆ EI for each is-a link in E between two 1 2 entities E1 . and R. dimensions.. An entity (level) directly connected to a (weak) relationship (dimension) is called a basic level for that dimension. n levels (one per dimension) and a fact it is based on. i. facts.. an n-dimensional aggregation is represented by an aggregation (aggregated weak entity) involving n dimensions (each being a relationship). k}. i = 1. Number of ER-roles associated to a relationship is called the arity of that relationship. . . Weak entity is a dependent entity which is identified by considering the primary keys of participation of other entities via relationships (called weak relationships) to which it is connected via ER-roles. we describe intuitively the components of the EER Schema.Uk :ek ]. we will many times use only ”aggregation” to refer multidimensional or n-dimensional aggregation. . A cube defined on all specified dimensions with their basic levels is called a basic cube. AI . GMD abstracts notions such as levels. The properties that are due to relations to other entities are modelled through the participation of the entities in the relationships. A. cube.. data warehouses which conform to the constraints imposed by the schema. • Γ • · I a finite set of (concrete) domains an interpretation function such that – VI ⊆ Γ for each V ∈ V. 
× ∆ = ∆k for each k-ary relationship R ∈ R such that a tuple r ∈ RI is of the form [U1 :e1 . . { · | > with respect to the EER | } schema E is constituted by • ∆ a nonempty finite set assumed to be different from all domains. multiple level hierarchies.. Each entity can participate in a relationship more than once. i. . it is called an aggregated cube. 1998).Ln and a weak entity F.. otherwise. String.e. . ... or Boolean.
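The notions introduced above (entities as sets of instances, relationship tuples viewed as functions from ER-role names to instances, arity, is-a links, and minimum/maximum cardinalities) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the entity and role names (`Product`, `Day`, `of_product`, `on_day`), the sample data, and the helper functions are hypothetical and are not part of the EER or CGMD definitions.

```python
import math

# Entities are interpreted as sets of instances taken from the domain.
entities = {
    "Product": {"p1", "p2"},
    "Day": {"d1"},
}

# A relationship is interpreted as a collection of tuples. Each tuple maps an
# ER-role name to the instance playing that role, i.e. [U1:e1, ..., Uk:ek].
sale = [
    {"of_product": "p1", "on_day": "d1"},
    {"of_product": "p2", "on_day": "d1"},
]


def arity(relationship):
    """Number of ER-roles of the relationship (read from its first tuple)."""
    return len(relationship[0]) if relationship else 0


def within_cardinality(relationship, role, entity_instances, cmin, cmax):
    """Check cmin <= #{r | r[role] = e} <= cmax for every instance e."""
    for e in entity_instances:
        count = sum(1 for r in relationship if r[role] == e)
        if not (cmin <= count <= cmax):
            return False
    return True


def is_a_holds(sub_instances, super_instances):
    """An is-a link requires the sub-entity's instances to be a subset."""
    return sub_instances <= super_instances


# Example checks on the hypothetical state above.
sale_arity = arity(sale)
product_exactly_once = within_cardinality(sale, "of_product", entities["Product"], 1, 1)
day_at_most_once = within_cardinality(sale, "on_day", entities["Day"], 0, 1)
day_at_least_once = within_cardinality(sale, "on_day", entities["Day"], 1, math.inf)
```

In this toy state every product takes part in exactly one tuple, so a (1, 1) cardinality on `of_product` is satisfied, while the single day occurs in two tuples and therefore violates a maximum cardinality of 1 but satisfies (1, ∞).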

or ρLp . . Lp (each one being an entity or simple aggregation) in E for i = 1. .Lj (from Li to any higher level Lj if L there is a level Lk along the path between Li and Lj ) inductively as follows: ρ∗ i . . it is a relational implementation in the form of “star schema”. a particular EER diagram denotes a set of data warehouse states. although their work is based on UML modeling. . the following must hold: EI i EI ⊆ EI i ∩ EI j for each i = 1. n =∅ for each i = j..Lq (x) = ρLr . ∀f. ln .Ek in E • cmin(E.Ls (x) = y... The model is based on star and snowflake schema. A mapping is presented for transformation of candidate (multidimensional) ER schema to cube. . . × EI for each relationship R in 1 k E connected to entities E1 . . Tryfona et al. | } ↔ Each aggregated cell is an aggregation of cells whose coordinates roll-up to the coordinates associated with an aggregated cell on which it is based on. .Lj such that I ρLi . and measures. Sapia et al. . . ∧ Dn (f ) = ln ∧ Ln (ln ) 2. .Lq ∪ ρLr . . conform to the diagram it self —i. 5 Related Work EI ⊆ EI ∪ . Thus. F(f ) ∧ F(f ) ∧ D1 (f ) = l1 ∧ D1 (f ) = l1 ∧ . U ) ≤ #{r ∈ RI | r[U ] = e} ≤ cmax(E. i. R. if they (data warehouse states) satisfy the cube (together with the aggregated cube) conditions imposed by the GMD schema. a multidimensional modeling manifesto using multidimensional view of enterprise data has been proposed. ∪ EI 1 n • for two connected levels Li . Dn and n levels R1 . 2000. dimensions. 1999. Vj ∈ V for j = 1. The proposals by (Perez et al. .Lj (x) = y for each x ∈ Li and y ∈ LI Li . Lj (each one being an entity or simple aggregation) in E there must be a (possibly partial) roll-up function ρLi . . . ln . j Lj ∈ E. . In both DFM and MDA. The only proposals by (Golfarelli et al.Ls )(x) = y iff ρLp . or ρLp . If a diagram is inconsistent. 2005) based on star schema.Lj L for each k such • for each fact F ∈ F (being a weak entity) in E with p dimensions D1 . AI (r) ∈ VI for each r ∈ RI .Lq (x) = y and ρLr . 
. max) associated with ER-role U where cmin(E. . levels. Dn |Rn } . . . . The construction methodology is well defined.. l1 . . U ) = max • for each disjoint-total construct in E where E is a superclass and E1 . . According to GMD. where A ∈ A is an attribute of R with domain V ∈ V • RI ⊆ EI × . . G = F {D1 |R1 . . 1998. . We define reflexive transitive closure of roll-up function ρ∗ i . . 1998. . 2005) for the conceptual model. DFM does not support generalisation/speciaisation hierarchies and many-to-many relationships. . . . R. 1998).En are subclasses (partitions). .e.. ..• AI (e) ∈ VI for each e ∈ EI . . In (Kimball 1997.. .Li = id L ρ∗ i . no aggregation is defined at conceptual schema level. . F(f ) → ∃l1 . m. a particular EER schema is a set of legal data warehouse states. .e. Husemann et al. ∀f. Dp |Lp } such that n≤p the following must hold: ∀g. U ) for each U ∈ U. and and • for each aggregation G in E involving n dimensions D1 . p. the set of all possible data warehouse states which conform to the constraints imposed by the GMD schema. where F = E {D1 |L1 . ∧ Dn (f ) = ln ∧ Dn (f ) = ln → f = f Several proposals on a conceptual model exist in the data warehouse field. Dp (each one being an identifying relationship) and corresponding p levels L1 . . 2005) address UML model (Perez et al. GI (g) I g = { | F (f )∧ |f I I ∗ } h=1. 1996). However. . they are legal data warehouse states. R. . Mj ∈ A. A multidimensional conceptual model called MultiDimER model based on ER model has been proposed in (Malinowski & Zimanyi 2006). D1 (f ) = l1 ∧L1 (l1 )∧ .Lj = L that Li Lk where (ρLp . . It is relational and is based on star schema. Zepeda & Celma 2006) address the conceptual model in a real fashion..Lq (x) = ⊥ and ρLr . . a Dimensional Fact Model (DFM) is constructed from an operational ER schema based on requirement analysis. a model supports generalisation hierarchies and many-to-many relationships. .Ls (x) = ⊥. . . . . 
where A ∈ A is an attribute of E with domain V ∈ V. . . . In the similar manner. Each of the candidate schema is based on star schema.Lk ◦ ρ∗ k . . In (Golfarelli et al. Rn (one per dimension) and a fact F: . Zepeda and Celema (Zepeda & Celma 2006) presented a Model Driven Architecture (MDA) for producing candidate multidimensional schemas from operational ER schema based on requirement analysis. This approach is not conceptual in the sense that it is not independent of the implementation. U ) = min and cmin(E. the following holds (GMD cube conditions): 1. . .Ls (x) = y k ρLi . . f .p (ρLh . for each e ∈ EI . . . { · | denotes aggregation. R. The features . 2005. Similarly. . Berenguer et al. then no data warehouse may conform to it. and propose the quality indicator metrics (Berenguer et al. .Rh (Dh (f )) = Dh (g))| for each n ≤ p. and cardinality constraint card(U ) = (min. associated to R ∈ R and E ∈ E in E.
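The roll-up conditions above can be made concrete with a short Python sketch. It assumes that roll-up functions between directly connected levels are stored as partial maps, computes the reflexive transitive closure by following roll-up links, and groups basic cells under the coordinates they roll up to. The level names, the sample data, and the functions `rollup_star` and `aggregate` are assumptions made for illustration; the sketch also assumes the level graph is acyclic, otherwise the recursion would not terminate.

```python
from collections import defaultdict

# Hypothetical roll-up functions between directly connected levels,
# stored as partial maps (a missing key means "undefined").
ROLLUP = {
    ("Day", "Month"): {"d1": "m1", "d2": "m1", "d3": "m2"},
    ("Month", "Year"): {"m1": "y1", "m2": "y1"},
}


def rollup_star(x, src, dst):
    """Reflexive transitive closure of the roll-up functions.

    Returns the element of level dst that x (at level src) rolls up to,
    or None if no path defines it. Raises ValueError if two paths give
    different results, since the union of partial functions is only
    defined where they agree or where one of them is undefined.
    """
    if src == dst:
        return x  # identity on the same level
    results = set()
    for (lower, higher), mapping in ROLLUP.items():
        if lower == src and x in mapping:
            y = rollup_star(mapping[x], higher, dst)
            if y is not None:
                results.add(y)
    if len(results) > 1:
        raise ValueError(f"inconsistent roll-up paths for {x!r}")
    return results.pop() if results else None


def aggregate(cells, dims, basic_levels, target_levels):
    """Group basic cells by the coordinates their own coordinates roll up to."""
    groups = defaultdict(list)
    for cell in cells:
        coords = tuple(
            rollup_star(cell[d], basic_levels[d], target_levels[d]) for d in dims
        )
        if None not in coords:
            groups[coords].append(cell)
    return dict(groups)


# Basic cells with one dimension ("date" at level Day) and one measure.
cells = [
    {"date": "d1", "qty": 2},
    {"date": "d2", "qty": 3},
    {"date": "d3", "qty": 5},
]
by_month = aggregate(cells, ["date"], {"date": "Day"}, {"date": "Month"})
month_totals = {k: sum(c["qty"] for c in v) for k, v in by_month.items()}
by_year = aggregate(cells, ["date"], {"date": "Day"}, {"date": "Year"})
year_totals = {k: sum(c["qty"] for c in v) for k, v in by_year.items()}
```

Each aggregated cell is represented here simply as the list of basic cells whose coordinates roll up to it; applying `sum` is one possible aggregation function, chosen only for the example.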

The CGMD takes care of all constraints of the standard ER model in addition to multidimensional constraints. we evaluate other models against the same requirements listed in Section 6. We consider the models proposed in (Golfarelli et al. Complex Measures (Blaschka et al. In addition. 1998. etc have not been considered in this model. Symmetric treatment for dimensions and measures (Blaschka et al. 1998). nine requirements introduced in (Pedersen & Jensen 1999). Aggregation from aggregation (view over view) The CGMD model fulfills all the above requirements 1–22 except requirements 4. i. requirement analysis and specifications. Dimension/level attributes (Abello et al. Measureless aggregation 22. Tryfona et al. emphasizing on part-hole relationships for aggregation but does not support aggregations at the schema level. it (CGMD) allows 17. etc (Golfarelli et al. 4. Multiple hierarchies in dimension (Pedersen & Jensen 1999): 6. 2001): A model should allow to drill-across (sharing dimensions). 2001): A model should support genenalisation/specialisation (is-a) relationships. 18. Pedersen & Jensen 1999): A model should support the explicit hierarchy in the dimension. 2001. They only derive basic multidimensional schema from the given ER schema. composite attributes. Since in CGMD one cube/fact is based on another. 20. Sapia et al. Summarisability: A model should support summarisation (Lez & Shoshani 1997). 11. All these requirements are randomly listed below. 7. We also consider some additional requirements which are also important for a data warehouse multidimensional conceptual model. The CGMD model is purely conceptual and addresses all the issues from data warehouse construction to aggregations and view management. Explicit Separation of Structure and Contents (Blaschka et al. 1998. 9. we evaluate the CGMD data model according to certain criteria found for a multidimensional conceptual model in the literature. 12 and 14. 12. 2000) manually. 
Support for aggregation (Pedersen & Jensen 1999): A model should be able to provide meaningful aggregations. Pei 2003. Jensen et al. and it automates the construction of multidimensional conceptual schema from an ER diagram. This is one of the important characteristics of the CGMD model. Table in Figure 6 presents summary of comparison of these models. 1999. 1998): 2. 7 Comparison of CGMD with other models In this section. 1998. We also compare CGMD with other models. Husemann et al. and are also conceptual in some way or other. Pedersen & Jensen 1999): A model should allow measures to be treated as dimensions and vice versa.such as generalisation/specialisation hierarchies. Generalisation/specialisation hierarchies (Abello et al. Support for non-strict hierarchies (Pedersen & Jensen 1999): A model should support non-strict hierarchies. Support for many-to-many relationships (Pedersen & Jensen 1999): 13. 2000)) are also taken into consideration for comparison because they are current state-of-the-art models. and compare them with the CGMD model which is already evaluated in Section 6. As before. 14. requirement 4 can be fully supported if the measure is used as coordinate. The requirement 14 is not supported as no syntactical provision is made for changing dimensions/levels in CGMD. models proposed in (Tsois et al. 2001): A model should specify the attributes that do not define hierarchies. It is based on the ER model and its logical representations. 2001) (an object oriented model— extension of (Trujillo et al. and if a model does not support it at all then it is denoted by “x”. Drill-across (Abello et al. all these models need to specify design methodology such as information analysis. However.. 2001. . 1998. Explicit hierarchies (Blaschka et al. we consider some criteria listed in (Blaschka et al. 1998): A model should support multiple and complex measures for the same fact (cube). 8. and (Trujillo et al.e. then it is √ denoted by ” ”. 1998): 3. 
Implementation independent (Blaschka et al. Handling uncertainty 16. hierarchies with paths of different lengths. 2001): A model should support multiple cubes/facts in schema. Requirements 4 and 12 are partially supported. 2000) for comparison as they are conceptual models in true sense. Multi-cube/fact schema (Abello et al. Husemann et al. 2004). 2006) is based on UML and its extensions. The only proposal by (Franconi & Sattler 1999) for data warehouse conceptual model presents the structure of multidimensional aggregation. Handling change over time (Pedersen & Jensen 1999): 15. aggregations. Dimensionless aggregation 21. Moreover. and several requirements found in (Abello et al. None of these proposals addresses conceptual structure of aggregation. 2001) for a data warehouse multidimensional model. Abello et al. 1. Support for non-onto hierarchies (Pedersen & Jensen 1999): A model should support non-onto (unbalanced) hierarchies. 1998. 2001): A model should support user defined aggregation functions. A conceptual model proposed by (Abello et al. Handling different levels of granularity (Pedersen & Jensen 1999): In this section. This shows CGMD is syntactically and semantically richer than the other models. For evaluation. the CGMD model supports aggregations at any higher level (ignoring intermediate levels). 19. The model is well defined. 6 Evaluation of the CGMD model 10. If a model supports the requirement partially then it denoted by “p”. 5. however. User defined Aggregation functions (Abello et al. if a model meets a particular requirement/feature/functionality fully.
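The support levels stated for CGMD can be cross-checked mechanically against the comparison matrix of Figure 6. The sketch below transcribes that matrix as it appears in this paper, using the same symbols (√ for full support, p for partial support, x for no support), and derives which requirements a model does not fully support. The helper names (`profile`, `not_fully_supported`, `only_fully_supported_by`) are illustrative only.

```python
MODELS = ["star", "DFM", "ME/R", "starER", "DWPM", "OOCM",
          "MAC", "YAM", "GOLAP", "EMDM", "CGMD"]

# One string per requirement 1..22, one symbol per model in MODELS order,
# transcribed from Figure 6: "√" full, "p" partial, "x" no support.
ROWS = [
    "x√xxxxx√xx√",  # 1
    "p√√√x√√√√√√",  # 2
    "x√√√√√√√√√√",  # 3
    "pxxpxxxpx√p",  # 4
    "√√√√√√√√√√√",  # 5
    "p√√√√√√√√√√",  # 6
    "√pxpppx√pp√",  # 7
    "x√√√√√√√√√√",  # 8
    "xxxxxx√√xp√",  # 9
    "xpxxp√√xp√√",  # 10
    "xxxxxx√xx√√",  # 11
    "xxx√x√√√√√p",  # 12
    "pxppppxpxx√",  # 13
    "xxxxxxxpx√x",  # 14
    "xxxxxpxxx√√",  # 15
    "pppxxpx√xx√",  # 16
    "ppxppxx√pp√",  # 17
    "xxxxxxx√xx√",  # 18
    "ppxxxxx√xx√",  # 19
    "xxxxxxxxxx√",  # 20
    "ppxxxxxxxx√",  # 21
    "xxxxxxxxxx√",  # 22
]


def profile(model):
    """Map requirement number -> support symbol for one model."""
    col = MODELS.index(model)
    return {req: row[col] for req, row in enumerate(ROWS, start=1)}


def not_fully_supported(model):
    """Requirements the model supports only partially or not at all."""
    return sorted(req for req, s in profile(model).items() if s != "√")


def only_fully_supported_by(model):
    """Requirements fully supported by this model and by no other model."""
    col = MODELS.index(model)
    return [
        req
        for req, row in enumerate(ROWS, start=1)
        if row[col] == "√" and row.count("√") == 1
    ]


cgmd_gaps = not_fully_supported("CGMD")
cgmd_unique = only_fully_supported_by("CGMD")
```

With the matrix as transcribed, `cgmd_gaps` contains requirements 4, 12 and 14, which agrees with the statement that CGMD fulfils all requirements except these three.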

2001) or by computing weighting factor between facts and dimensions and makes full covering relationship between levels in the aggregation path (Jensen et al. 2004). 2001) supports manyto-many relationships between facts and dimensions but does not support many-to-many relationships between hierarchies. 1999. 2001) supports this feature since it based on UML which allows user defined operations. Remaining models do not support both features. thus.e. 1998). which dimensions/levels belong to which star schema are not clearly reflected in this model in any way. 1998) share dimensions but limit drilling mechanism and hence provide partial support. 2001. 1998. Remaining models (Kimball 1996. Some models such as Star (Kimball 1996) (constellation) and DFM (Golfarelli et al. 2001. Remaining models do not support summarisation. Trujillo et al. 1998. providing full support. 2000. Jensen et al. Requirement 12 (Support for many-to-many relationships). 2004. Remaining models do not meet this feature. Requirement 4 (Symmetric treatment for dimensions and measures). Remaining models do not support this feature. 2004) captures this feature by means of derivation mechanism.. 2004) fully support this features. thus providing partial support. covering and onto through some derivations so that data will not be double counted. 1999) do not consider measure explicitly as dimensions but conditionally if the measure is used to identify a cell. thus providing partial support. 2004). Requirement 13 (Generalisation/specialisation hierarchies). do not consider this feature in the framework and thus provide no support. Only one model (Tsois et al. clinical domain (Jensen et al. Trujillo et al. 1998) or hierarchies to strict. Jensen et al. Star schema does not support explicit hierarchy. and hence provide no support. 2001. Requirement 16 (Multi-cube/fact schema). 2001. thus providing full support. Only one model (Jensen et al. 2001. and thus provide no support. thus provide full support. 
Requirement 20 (Dimensionless aggregation) and Requirement 22 (aggregation from aggregations) are not met by any of the other models. Husemann et al. only three models (Trujillo et al. Jensen et al. A model of (Abello et al. thus provide full support. because generalisation/specialisation is considered in the hierarchy but is rather kept to distinguish the contents. Only one model (Abello et al. 2000. 2001. thus provide no support. Pei 2003).Requirement 1 (Implementation independent) is partially supported by a model of (Abello et al. 2001. 2001. 2001. 2004) of the other models support both manyto-many relationships between facts and dimensions and between hierarchies. Jensen et al. Trujillo et al. hence provides no support. The only models such as (Abello et al. since it is a multi-star schema based on concepts of star model. Remaining models specify the nondimension attributes. Only one model (Jensen et al. Trujillo et al. Remaining models (Golfarelli et al. 2004) provide full support. thus provide full support. 1998)). 2004) by attaching a time tag attribute to dimension values for probabilistic measurements of occurrences of facts and dimension values. 2004. 2004) either by derivation through some operations (Abello et al. Requirement 14 (Change over time in data) and Requirement 15 (Uncertainty in data) are supported by a model (Jensen et al. Some models such as (Trujillo & Palomar 1998. Requirement 21 (measureless aggregation) is partially supported by only two models star (Kimball 1996) and DFM (Golfarelli et al. Trujillo et al. 2001. and consequently does not support the requirement 3. 2001). onto and covering. only one model (Jensen et al. Remaining models provide no support. Tsois et al. and (Tryfona et al. Trujillo et al. 2001. thus providing partial support. 2001. 2000. Pei 2003). (Golfarelli et al. 2004) does specify the attributes that do not define hierarchies. Rests do not support manyto-many relationships. Remaining models provides no support. 
2001. requirement 5 (Multiple hierarchies) and requirement 8 (Complex measures) are met by all of the models (except star schema (Kimball 1996) for requirement 3). Requirement 9 (different levels of granularity) A very few models (Tsois et al. . 2001) or by restricting hierarchies to strict. 2001). Rest of the models do not capture this feature. but do not provide with a specific operation application. Requirement 11 (Support for non-strict hierarchies) is fully supported by only two models (Jensen et al. 1998. 2001. The only model of (Abello et al. The only model of (Abello et al. 1998. for example. 1998) by exploring the possibility of having factless (measureless) fact but they do not address or reflect aggregation in any way. Pei 2003). Remaining models provide no support. A model of (Tsois et al. 2001) captures different levels of granularity (i. Abello et al. 2004) is considered partial. 2001) does not support this requirement because of including many-to-many relationships between facts and dimensions and among the hierarchies. Tryfona et al. Abello et al. Abello et al. measures at different levels of granularity) either by aggregation (Tsois et al. Tsois et al. Tsois et al. 2001) and (Jensen et al. Requirement 6 (Dimension/level attributes). 2001) supports requirement 14 only by changing the schema with appropriate time tags. 2001) or by specialising the cells depending on whether the measure is derived or not. Requirement 2 (Explicit Separation of Structure and Contents). 2001). thus provide only partial support. 2004) captures this feature partially through some derivation. Sapia et al. 1999) specify possible functions that can be applied in order to support summarisation. Rest of the models provide no support. Pei 2003) implementation. and restricting aggregation functions (Sapia et al. can not be considered implementation independent as they are either relational (Kimball 1996) or based on star schema ((Golfarelli et al. 
Requirement 17 (Summarisability). Tsois et al. but does not support requirement 15. 1998. Jensen et al. requirement 3 (Explicit hierarchies). 2001) allows drill-across because of sharing multi-star dimensions. For Requirement 10 (Support for non-onto hierarchies). 2001) allows multiple stars in a single schema. Remaining models partially support this feature either by restricting the dimension/hierarchies and aggregation functions (Golfarelli et al. Support provided by (Sapia et al. Requirement 18 (user defined aggregation function). a relational model or designed for a specific domain (Jensen et al. Requirement 7 (Support for correct aggregation) is met by a very few models (Abello et al. Golfarelli et al. Husemann et al. Some models (Abello et al. Tryfona et al. either by applying some algebraic operations to make part-whole relationships between levels and then applying aggregation functions (Abello et al. However. Requirement 19 (Drill-across). Only four models (Tryfona et al. 1999. 2004.
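Several of the points above turn on whether a hierarchy is strict and onto, because aggregating along a hierarchy that lacks these properties can count data more than once. The following Python sketch illustrates the issue. It assumes the usual reading of these terms (strict: no element has more than one parent at the next level; onto: every element of the parent level has at least one child), which this paper does not spell out. The clinical sample data and the function names are hypothetical.

```python
# child -> set of parents between two adjacent levels (hypothetical data).
PARENTS = {
    "diabetes_type1": {"diabetes"},
    "diabetes_type2": {"diabetes"},
    "pregnancy_diabetes": {"diabetes", "pregnancy_related"},
}
PARENT_LEVEL = {"diabetes", "pregnancy_related", "cancer"}


def is_strict(parents):
    """True if no child rolls up to more than one parent."""
    return all(len(ps) <= 1 for ps in parents.values())


def is_onto(parents, parent_level):
    """True if every element of the parent level has at least one child."""
    covered = set().union(*parents.values()) if parents else set()
    return parent_level <= covered


def rolled_up_totals(measure, parents):
    """Sum a measure at the parent level, adding a child to each parent."""
    totals = {}
    for child, value in measure.items():
        for parent in parents.get(child, ()):
            totals[parent] = totals.get(parent, 0) + value
    return totals


measure = {"diabetes_type1": 3, "diabetes_type2": 4, "pregnancy_diabetes": 2}
totals = rolled_up_totals(measure, PARENTS)
base_total = sum(measure.values())
rolled_total = sum(totals.values())
```

Because `pregnancy_diabetes` has two parents, its value is added twice and the rolled-up grand total exceeds the base total. Restricting hierarchies to strict ones, or weighting the child-parent links, are the kinds of remedies discussed above.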

I. ‘Description Logic Handbook’. (2004b). (1996). & Timko. (2003). Samos. J. Description logics for conceptual data modeling. R. of 16th International Conference on Advanced Information Systems Engineering (CAiSE 2004). . E. Fundamentals of Database Systems. There are several proposals on multidimensional modeling and data warehouse design. G. E. data warehouse design and aggregations. Pedersen. 541–567. J. University of Manchester. update propagation and view management of multidimensional data. Samos. Cambridge University Press. A. of 16th International International Conference on Scientific and Statistical Database Management (SSDBM)’. G. B. & Kamble. pp. J. pp. (2000). Sapia. Agrawal. E.. S. R. Finding your way through multidimensional data models. Baader. IJCIS 7(2-3). (2006)..13’. U. A. 1–21. Blaschka. John Wiley & Sons. Sweden. (1998). It is a framework where to translate and compare conceptual properties and expressive power of different expressivities of the models and related extensions. editors 1998. S. in ‘Information Systems. So far. Yam (yet another multidimensional): An extension of uml. 215–247. Gunter. Berenguer. Riga. D. C. of the International Workshop on Design and Management of Data Warehouses (DMDW’2000). R. in ‘Techniocal Report LSI-01-43-R of Department de Llenguates i Sistems Informatics’. pp. J. Stockholm.. pp. McGuinness. Jan and Saake. M. there is no consensus on modeling and design method yet (Rizzi & Abello 2006). Italy. eds (2000). A data warehouse conceptual data model for multidimensional aggregation. 446–462. & Shoshani. & Saltor. & Sattler.. in ‘Proc.. C. The CGMD data model gives uniform way of modeling multidimensional concepts. M. (1999). T. Lechtenborger. Lenzerini. D. The CGMD model is to help cost effective design of the data warehouse. 198–203. 
55–65.

Req. star   DFM    ME/R   starER DWPM   OOCM   MAC    YAM    GOLAP  EMDM   CGMD
1    x      √      x      x      x      x      x      √      x      x      √
2    p      √      √      √      x      √      √      √      √      √      √
3    x      √      √      √      √      √      √      √      √      √      √
4    p      x      x      p      x      x      x      p      x      √      p
5    √      √      √      √      √      √      √      √      √      √      √
6    p      √      √      √      √      √      √      √      √      √      √
7    √      p      x      p      p      p      x      √      p      p      √
8    x      √      √      √      √      √      √      √      √      √      √
9    x      x      x      x      x      x      √      √      x      p      √
10   x      p      x      x      p      √      √      x      p      √      √
11   x      x      x      x      x      x      √      x      x      √      √
12   x      x      x      √      x      √      √      √      √      √      p
13   p      x      p      p      p      p      x      p      x      x      √
14   x      x      x      x      x      x      x      p      x      √      x
15   x      x      x      x      x      p      x      x      x      √      √
16   p      p      p      x      x      p      x      √      x      x      √
17   p      p      x      p      p      x      x      √      p      p      √
18   x      x      x      x      x      x      x      √      x      x      √
19   p      p      x      x      x      x      x      √      x      x      √
20   x      x      x      x      x      x      x      x      x      x      √
21   p      p      x      x      x      x      x      x      x      x      √
22   x      x      x      x      x      x      x      x      x      x      √

(Rows: Requirements/Criteria/Features/Functionalities 1–22; columns: models.)

Figure 6: Comparison between CGMD and other Multidimensional Models

8 Conclusions and Future work Calvanese. in ‘Proc. References Abello. A. A.. in ‘Chomicki. 13–1–13–10.. chapter 16. & Navathe. 58–70. (1998). Golfarelli. Lez. of ICDE97’. Maio. D. Franconi. Data warehouse conceptual data model. A. Romero. M.. (2005).. Third Edition. in ‘The VLDB Journal Vol. USA. & Nardi. S. in ‘Proc. (2004a). Modeling multidimensional databases. in ‘Data and Knowledge Engineering. (1997). Description logics for databases. G. United Kingdom and Free University of Bolzano. (2006). in F. & Kamble. R. of 5th International Conference on Data Warehousing and Knowledge Discovery (DaWak-03)’. A. in ‘Proc. (2001). R. 95–104. A set of quality indicator and their corresponding metrics for conceptual models for data warehouses. M. in ‘Proc. The GMD data model for multidimensional information: a brief introduction. International Conference on Data Warehousing and Knowledge Discovery (DaWak’05)’. Kimball. Hofling. & Rosati. In press’. Yam2: A multidimensional conceptual model extending uml. June 5-6’. M. & Vossen. Franconi. & Kamble. (1997). Franconi. DBMS Magazine. Franconi. 9th International Workshop on Database and Expert Systems Applications (DEXA’98)’. H. M. D. eds. Lenzerini. Multidimensional data modeling for location-based services. A.
9 th International Conference on Scientific and Statistical Database Management (SSDBM)’. in ‘Proc. Summarizabilty in olap and statistical databases. June 7-11’. & Piattini. generalising the design of data warehouse and providing the uniform way for view management. 229–264. PatelSchneider. (2004). Thus. ‘The dimensional fact model: a conceptual model for data warehouses’. Nardi & P. A. Kimball. (1997). The Data Warehouse Toolkit. D. The GMD data model and algebra for multidimensional information. Husemann. E. August 10(9). Latvia. (1998). 31(6)’.. & Dinter. 462– 484. Logics for Databases and Information Systems. & Zimanyi. & Rizzi. A. Kligys. in ‘Proc... pp. in ‘Proc. Jensen. Gupta. AddisonWesley. Malinowski. Our future work is based translation of CGMD into Description Logic–a language for reasoning mechanism. F. Kluwer’. pp. pp. (2003). Trujillo. E.. 6–1–6–11. E. pp. in ‘Proc. Borgida. Calvanese... Conceptual data warehouse modeling. & Sarawagi. Elmasri. J. R. Hierrachies in a multidimensional model: From conceptual modeling to logical representation. Serrano. ‘A dimensional modeling manifesto’. D. & Saltor. Abello. pp. pp. Acknowledgements The major portion of this work was supported by School of Computer Science. of the Workshop on Design and Management of Data Warehouses (DMDW-99)’.. B. F.

of the International Workshop on Design and Management of Data Warehouses (DMDW’2000). Multidimensional data modeling for complex data. S. & Kortink. (1998). Rizzi. & Sellis.. N. B. . pp. June’. in ‘Proc. & Palomar. in ‘Proc. M. pp. C. An object oriented approach to multidimensional databases conceptual modelling. USA. Sanghai. 114–121.. A. Berlanga. 3–8. & Pedersen. A relevence-extended multi-dimensional model for a data warehouse contextualized with documents. T. Kansas City. M. of ACM Second International Workshop on Data Warehousing and OLAP (DOAP’99). I. pp. Sapia. 19–28. Trujillo. Busborg. 66–75.34(12)’. J. C. & Abello. 16–21. pp. 7th IEEE International Baltic Conference on Databases and Information Systems (DBIS’06)’. & Jensen... in ‘Proc. (2003). MAC: Conceptual data modelling for OLAP. C. F. (2000). China. M. Research in data warehouse modeling and design: Dead or alive?. Applying object oriented conceptual modeling techniqyes to the design of multidimensional databases and olap applications. in ‘Proc. pp. & Dinter. (1998). of ER Workshop’. M. Zepeda. of 15th IEEE Intenational Conference on Data Enginerring (ICDE-99)’. T. C. November’. N. pp. Extending the er model for the multidimensional paradigm. Karayiannidis. From enterpise models to dimensional models: A methodology for data warehouse and data mart design. J.. pp. in ‘Proc. in ‘Proc. Palomar. Stockholm. Tsois. K. of the First International Conference on Web-Age Information Management (WAIM’2000). J. in ‘Proc. R. Sweden. Pei. & Gomez. in ‘Proc. of the International Workshop on Design and Management of Data Warehouses (DMDW’2002)’. J. (2005). (2006). Missouri. & Song. D. Pedersen. J. in ‘Proc. A. 5–1–5–13. Phipps. Automating data warehouse conceptual schema design and evolution. Gomez. 83–94.. & Celma. pp. Trujillo. in ‘Proc. Trujillo. June 5-6’. (2000). of 22nd International Conference on Conceptual Modeling (ER2003)’. (2001). 23–32. 
of the International Workshop on Design and Management of Data Warehouses (DMDW-2001)’. Perez. L. Hofling. pp. (2001). 3–10. (2006). 8th ACM International Workshop on Data Warehousing and OLAP (DOLAP’05)’. & Christiansen. 105–116. T. J. (1999). Designing data warehouses with oo conceptual models. M. Palomar. & Davis. J. pp. in ‘Proc. pp. M. (1999). in ‘Proc. J. (2002). A model driven approach for data warehouse conceptual design. 336–345. Blaschka.Moody. starer: A conceptual model for data warehouse design... G. 9th ACM International Workshop on Data Warehousing and OLAP (DOLAP’06)’. 5–1–5–12. pp. of First International ACM Workshop Data Warehousing and OLAP’. in ‘IEEE Computer Vol. Tryfona. A general model for online analytical processing of complex data design and evolution.