You are on page 1of 64

Database Systems Journal vol. V, no.

3/2014 1

Database Systems Journal BOARD


Director
Prof. Ion Lungu, PhD, University of Economic Studies, Bucharest, Romania

Editors-in-Chief
Prof. Adela Bara, PhD, University of Economic Studies, Bucharest, Romania
Prof. Marinela Mircea, PhD, University of Economic Studies, Bucharest, Romania

Secretaries
Lect. Iuliana Botha, PhD, University of Economic Studies, Bucharest, Romania
Lect. Anda Velicanu, PhD, University of Economic Studies, Bucharest, Romania

Editorial Board
Prof. Ioan Andone, PhD, A.I.Cuza University, Iasi, Romania
Prof. Anca Andreescu, PhD, University of Economic Studies, Bucharest, Romania
Prof. Emil Burtescu, PhD, University of Pitesti, Pitesti, Romania
Joshua Cooper, PhD, Hildebrand Technology Ltd., UK
Prof. Marian Dardala, PhD, University of Economic Studies, Bucharest, Romania
Prof. Dorel Dusmanescu, PhD, Petrol and Gas University, Ploiesti, Romania
Prof. Marin Fotache, PhD, A.I.Cuza University, Iasi, Romania
Dan Garlasu, PhD, Oracle Romania
Prof. Marius Guran, PhD, University Politehnica of Bucharest, Bucharest, Romania
Lect. Ticiano Costa Jordão, PhD-C, University of Pardubice, Pardubice, Czech Republic
Prof. Brijender Kahanwal, PhD, Galaxy Global Imperial Technical Campus, Ambala, India
Prof. Dimitri Konstantas, PhD, University of Geneva, Geneva, Switzerland
Prof. Hitesh Kumar Sharma, PhD, University of Petroleum and Energy Studies, India
Prof. Mihaela I.Muntean, PhD, West University, Timisoara, Romania
Prof. Stefan Nithchi, PhD, Babes-Bolyai University, Cluj-Napoca, Romania
Prof. Corina Paraschiv, PhD, University of Paris Descartes, Paris, France
Davian Popescu, PhD, Milan, Italy
Prof. Gheorghe Sabau, PhD, University of Economic Studies, Bucharest, Romania
Prof. Nazaraf Shah, PhD, Coventry University, Coventry, UK
Prof. Ion Smeureanu, PhD, University of Economic Studies, Bucharest, Romania
Prof. Traian Surcel, PhD, University of Economic Studies, Bucharest, Romania
Prof. Ilie Tamas, PhD, University of Economic Studies, Bucharest, Romania
Silviu Teodoru, PhD, Oracle Romania
Prof. Dumitru Todoroi, PhD, Academy of Economic Studies, Chisinau, Republic of Moldova
Prof. Manole Velicanu, PhD, University of Economic Studies, Bucharest, Romania
Prof. Robert Wrembel, PhD, University of Technology, Poznan, Poland

Contact
Calea Dorobanţilor, no. 15-17, room 2017, Bucharest, Romania
Web: http://dbjournal.ro
E-mail: editordbjournal@gmail.com; editor@dbjournal.ro
2 Database Systems Journal vol. V, no. 3/2014

CONTENTS

Labeling Consequents of Fuzzy Rules Constructed by Using Heuristic Algorithms of


Possibilistic Clustering............................................................................................................. 3
Dmitri A. VIATTCHENIN, Stanislau SHYRAI, Aliaksandr DAMARATSKI

BI solutions for modern management of organizations - Cognos ..................................... 14


Ruxandra OPREA, Ionuț LĂCĂTUȘ, Cătălin HUȚANU

Business Intelligence overview .............................................................................................. 23


Dragoş AGIU, Vlad MATEESCU, Iulia MUNTEAN

Comparative study on software development methodologies ............................................ 37


Mihai Liviu DESPA

Social BI. Trends and development needs ........................................................................... 57


Corneliu Mihai PREDA
Database Systems Journal vol. V, no. 3/2014 3

Labeling Consequents of Fuzzy Rules Constructed by Using


Heuristic Algorithms of Possibilistic Clustering

Dmitri A. VIATTCHENIN1,2, Stanislau SHYRAI1, Aliaksandr DAMARATSKI2


1
Belarusian State University of Informatics and Radio-Electronics, Minsk, Belarus
2
United Institute of Informatics Problems of the National Academy of Sciences of Belarus,
Minsk, Belarus
viattchenin@mail.ru, ashaman410@gmail.com, a.damaratski@gmail.com

The paper deals with the problem of automatic labeling output variables in Mamdani-type
fuzzy rules generated by using heuristic algorithms of possibilistic clustering. The labeling
problem in fuzzy clustering and basic concepts the heuristic approach to possibilistic
clustering are considered in brief. Labeling consequents procedure is proposed. Experimental
results are presented shortly and some preliminary conclusions are made.
Keywords: Data Mining, Clustering, Fuzzy Rule, Consequent, Labeling

1 Introduction
Fuzzy classifiers play an important
role in different data mining approaches.
If xˆ 1 is Bl1 and  and xˆ m1 is Blm1
, (1)
Thus, the problem of generation of fuzzy then y1 is C1l and  and y c is C cl
rules is one of more than important where input variables xˆ t1 , t1  1,  , m1 are
problems in the development of fuzzy antecedents and output variables yl ,
classifiers. l  1,  , c are consequents of fuzzy rules,
There are a number of approaches to
learning fuzzy rules from data based on Blt1 , t1  {1,  , m1 } and C ll , l  {1,  , c} are
techniques of evolutionary or neural fuzzy sets that define an input and output
computation, mostly aiming at optimizing space partitioning. A fuzzy classifier which
parameters of fuzzy rules. From other is described by a set of fuzzy classification
hand, fuzzy or possibilistic clustering rules with the form (1) is the multiple inputs,
seems to be a very appealing method for multiple outputs system.
learning fuzzy rules since there is a close The principal idea of extracting fuzzy
and canonical connection between fuzzy classification rules based on fuzzy clustering
clusters and fuzzy rules. The fact was was outlined in [1] and the idea is the
shown in [1]. following. Each fuzzy cluster is assumed to
Let us consider in brief some basic be assigned to one class for classification
concepts. We assume that the training set and the membership grades of the data to the
contains n data pairs. Each pair is made clusters determine the degree to which they
of a m1 -dimensional input-vector and a can be classified as a member of the
с -dimensional output-vector. We assume corresponding class. So, with a fuzzy cluster
that the number of rules in the fuzzy that is assigned to the some class we can
inference system rule base is с . So, associate a linguistic rule. The fuzzy cluster
Mamdani and Assilian’s [2] fuzzy rule l is projected into each single dimension
within the fuzzy inference system is leading to a fuzzy set on the real numbers.
written as follows: An approximation of the fuzzy set by
projecting only the data set and computing
the convex hull of this projected fuzzy set or
approximating it by a trapezoidal or
triangular membership function is used for
4 Labeling Consequents of Fuzzy Rules Constructed by Using
Heuristic Algorithms of Possibilistic Clustering

the fuzzy rules obtaining. objective function that evaluates the


The idea of extracting fuzzy classification partition of the data into a given number of
rules based on possibilistic clustering [3] fuzzy clusters.
is similar to the idea of deriving fuzzy All objective function-based fuzzy
rules based on fuzzy clustering. clustering algorithms can in general be
On the other hand, a heuristic approach to divided into two types: object versus
possibilistic clustering was outlined in [4] relational.
and the approach was developed in other The object data clustering methods can be
publications. Moreover, a method of the applied if the objects are represented as
rapid extracting fuzzy rules based on points in some multidimensional space
results of the heuristic possibilistic I m1 ( X ) . In other words, the data which is
clustering of the training data set was also composed of n objects and m1 attributes is
proposed in [4] and the method is very
effective in comparison with the method denoted as Xˆ nm1  [ xˆit1 ] , i  1,  , n ,
based on fuzzy clustering results. The t1  1,  , m1 and the data are called
idea of deriving fuzzy classification rules sometimes the two-way data [5]. Let
from the training data can be formulated X  {x1 ,..., x n } is the set of objects. So, the
as follows: the training data set is divided two-way data matrix can be represented as
into homogeneous group and a fuzzy rule follows:
is associated to each group.  xˆ11 xˆ12  xˆ1m1 
However, names should be assigned to  
each output variable yl , l  1, , c . The  xˆ 2 xˆ22  xˆ2m1 
Xˆ n m1  2 .
process of assigning names to output    
variables is connected with the problem  xˆ1 xˆn2  xˆnm1 
 n
of interpretation of classification results (2)
and a labeling procedure in clustering. So, the two-way data matrix can be
The main goal of this paper is a represented as Xˆ  ( xˆ 1 ,  , xˆ m1 ) using n -
consideration of an approach to automatic
labeling consequents of fuzzy rules dimensional column vectors xˆ t1 ,
generated by heuristic algorithms of t1  1,  , m1 , composed of the elements of
possibilistic clustering. The contents of the t1 -th column of X̂ .
this paper is as follows: in the second The traditional optimization methods of
section a labeling problem in fuzzy fuzzy clustering are based on the concept of
clustering is described, in the third fuzzy c -partition [1]. The initial set
section basic concepts of the heuristic X  {x1 ,..., x n } of n objects represented by
approach to possibilistic clustering are the matrix of similarity coefficients, the
considered, in the fourth section a matrix of dissimilarity coefficients or the
labeling procedure for fuzzy rules matrix of object attributes, should be divided
consequents is described, in the fifth a into c fuzzy clusters. Namely, the grade u li ,
numerical example of application of the
1  l  c , 1  i  n to which an object xi
proposed procedure to fuzzy rules
generated from the Anderson’s Iris data belongs to the fuzzy cluster Al should be
set are given, and some final remarks are determined. For each object xi , i  1,  , n
stated in the sixth section. the grades of membership should satisfy the
conditions of a fuzzy c -partition:
c
2. Related works
The most widespread approach in fuzzy
 u li  1 , 1  i  n , 0  u li  1 , 1  l  c . (3)
l 1
clustering is the optimization approach. In other words, the family of fuzzy sets
Most optimization fuzzy clustering P( X )  { A l | l  1, c, c  n} is the fuzzy c -
algorithms aim at minimizing an partition of the initial set of objects
Database Systems Journal vol. V, no. 3/2014 5

X  {x1 ,..., x n } if condition (3) is met. c


 li  0 , 0  li  1 . (5)
Fuzzy c -partition P ( X ) may be l 1

described with the aid of a partition So, the family of fuzzy sets
l
matrix Pcn  [u li ] , l  1,  , c , i  1,  , n .  ( X )  { A | l  1, c, c  n} is the possibilistic
The set of all fuzzy c -partitions will be partition of the initial set of objects
denoted by  . So, the fuzzy problem X  {x1 ,..., x n } if condition (5) is met.
formulation in cluster analysis can be Obviously that the conditions of the
defined as the optimization task possibilistic partition (5) are more flexible
Q  extr under the constraints (3), than the conditions of the fuzzy c -partition
P ( X )
(3).
where Q is a fuzzy objective function.
In order to be applying the found cluster
The best known optimization approach to prototype as classifiers, they need to be
fuzzy clustering is the method of fuzzy given reasonable names. One can then use
c -means [6]. The FCM-algorithm is these names as column titles of the
based on an iterative optimization of the membership matrix when using the recall
fuzzy objective function, which takes the function of the FCM-algorithm. This helps
form: in the interpretation of the results.
c n 2
Q FCM ( P, )    u li x i   l , The process of assigning class names to
l 1 i 1 cluster prototypes is called labeling. A
(4) labeling method for the fuzzy c -means
where u li , l  1,  , c , i  1,  , n is the method is to inspect the cluster prototypes
membership degree, xi , i  {1,  , n} is the and their respective membership values of
data point,   { 1 ,  ,  c } is the set the various attributes, and to assign a label
manually.
fuzzy clusters prototypes, and   1 is the
However, usually, it is already known when
weighting exponent. training a classifier which objects belong to
The purpose of the classification task is which classes. This information can be taken
to obtain the solutions P ( X ) and into account to use so as automatic the fuzzy
 1 ,  ,  c which minimize equation (4). c -means labeling process. The
Some other similar objective function- corresponding labeling procedure is
based fuzzy clustering algorithms are described in [7] in detail. The principal idea
considered in [1], [5] and [6] in detail. of the procedure is to present a sample of
However, the condition of fuzzy c - objects to the FCM-classifier whose class
partition is very difficult from essential membership are known in the hard form of 0
positions. So, a possibilistic approach to or 1 values and have also been calculated by
clustering was proposed by the procedure. By means of the given cluster
Krishnapuram and Keller in [3] and membership values for each fuzzy cluster
developed by other researchers. Major prototype, the fuzzy cluster prototypes can
algorithms of possibilistic clustering are be associated with their respective classes.
objective function-based procedures. On the other hand, all objective function-
A concept of possibilistic partition is a based fuzzy clustering algorithms are
basis of possibilistic clustering methods iterative procedures and the initial fuzzy c -
and membership values li , l  1,, c , partition P ( X ) is initialized randomly. So,
i  1,, n can be interpreted as the values coordinates of fuzzy clusters prototypes and
of typicality degree. For each object xi , values of membership functions will be
i  1, , n the grades of membership different in each experiment for the same
should satisfy the conditions of a data set, because the result of classification
possibilistic partition: is sensitive to initialization. Moreover,
major objective function-based fuzzy
6 Labeling Consequents of Fuzzy Rules Constructed by Using
Heuristic Algorithms of Possibilistic Clustering

clustering algorithms are need for using  μ l ( xi ), xi  Aαl


some validity measures [1] for μli   A , (6)
0, otherwise
determining the most “plausible” number
c of fuzzy clusters in the sought fuzzy where Aαl  {xi  X | μ Al ( xi )  α} , α  (0,1] is
c -partition P ( X ) . So, a problem of rapid
the α -level of a fuzzy set Al and the α -
automatic labeling is arises. level is the support of the fuzzy α -cluster
The effective labeling procedure for
A(lα ) , Aαl  Supp( A(lα ) ) . The value of α is the
heuristic algorithms of possibilistic
clustering was proposed in [8] and the tolerance threshold of fuzzy α -cluster
procedure is the basis of the labelling elements.
procedure for consequents of derived Let { A(1α ) ,..., A(nα ) } be the family of fuzzy α -
fuzzy rules. However, basic definitions of
clusters for some α . The point τ el  Aαl , for
the heuristic approach to possibilistic
clustering should be considered in the which
first place. τ el  arg max μli , xi  Aαl ,
xi

(7)
3. Basic concepts of the heuristic is called a typical point of the fuzzy α -
approach to possibilistic clustering
cluster A(lα ) , α  (0,1] , l  [1, n] . A set
Let us remind the basic concepts of the
heuristic method of possibilistic K ( A(lα ) )  {τ1l ,  , τ ll } of typical points of the
clustering [4]. Let X  {x1 ,..., x n } be the
fuzzy cluster A(lα ) is a kernel of the fuzzy
initial set of elements and
T : X  X  [0,1] some fuzzy tolerance on cluster and card K ( A(lα ) )   l is a cardinality
X with μT ( x i , x j )  [0,1] , xi , x j  X of the kernel. If the fuzzy cluster have an
being its membership function. Let α be unique typical point, then l  1 .
the α -level value of the fuzzy tolerance Let R zα ( X )  { A(lα ) | l  1, c, 2  c  n} be a
T , α  (0,1] . Columns or rows of the family of fuzzy α -clusters for some value of
fuzzy tolerance matrix are fuzzy sets tolerance threshold α , which are generated
{ A1 ,..., A n } on the universal set X . Let by a fuzzy tolerance T on the initial set of
Al , l  {1,  , n} be a fuzzy set on X with elements X  {x1 ,..., xn } . If condition
c
μ Al ( xi )  [0,1] , xi  X being its  μli  0 , xi  X ,
l 1
membership function. The α -level fuzzy
(8)
 
set A(lα )  ( xi , μ Al ( xi )) | μ Al ( xi )  α, xi  X
is met for all A(lα ) , l  1, c , c  n , then the
is fuzzy α -cluster. So, A(lα )  Al , family is the allotment of elements of the set
α  (0,1] , Al  { A1 ,  , An } and μ Al ( xi ) is X  {x1 ,..., xn } among fuzzy α -clusters
the membership degree of the element { A(lα ) , l  1, c, 2  c  n} for some value of the
xi  X for some fuzzy α -cluster A(lα ) , tolerance threshold α . It should be noted
α  (0,1] , l  {1,  , n} . The membership that several allotments Rz (X ) can exist for
degree will be denoted μli in further some tolerance threshold α . That is why
symbol z is the index of an allotment.
considerations. The membership degree
Obviously, the definition of the allotment
of the element xi  X for some fuzzy α -
among fuzzy clusters (8) is similar to the
cluster A(l ) , α  (0,1] , l  {1,  , n} can be definition of the possibilistic partition (5).
defined as a So, the allotment among fuzzy clusters can
be considered as the possibilistic partition
Database Systems Journal vol. V, no. 3/2014 7

and fuzzy clusters in the sense of (6) are Thus, the conditions (9) and (10) can be
elements of the possibilistic partition. generalized for a case of particularly
Allotment separate fuzzy clusters. So, a condition
c
RI ( X )  { A(l ) | l  1, n,   (0,1]} of the set
 card ( Al )  card ( X ),
l 1
of objects among n fuzzy clusters for
some tolerance threshold   (0,1] is the A(l )  Rс( z ) ( X ), , (11)
initial allotment of the set X  {x1 ,..., xn } .   (0,1], card ( Rc( z ) ( X ))  c
In other words, if initial data are and a condition
represented by a matrix of some fuzzy T card ( Al  Am )  w, A(l ) , A(m ) ,
then lines or columns of the matrix are , (12)
l  m,   (0,1]
fuzzy sets Al  X , l  1, n and  -level
are generalizations of conditions (9) and
fuzzy sets A(l ) , l  1, c ,   (0,1] are
(10). Obviously, if w  0 in conditions (11)
fuzzy clusters. These fuzzy clusters and (12) then conditions (9) and (10) are
constitute an initial allotment for some met. The adequate allotment Rс( z ) ( X ) for
tolerance threshold  and they can be
considered as clustering components. If some value of tolerance threshold   (0,1] is
some allotment a family of fuzzy clusters which are
 l
Rс ( z ) ( X )  { A( ) | l  1, c, c  n} corresponds elements of the initial allotment RI (X ) for
to the formulation of a concrete problem, the value of  and the family of fuzzy
then this allotment is an adequate clusters should satisfy the conditions (11)
allotment. In particular, if a condition and (12). So, the construction of adequate
c allotments Rс( z ) ( X )  { A(l ) | l  1, c, c  n} for
 Al  X ,
l 1 every  is a trivial problem of
(9) combinatorics.
and a condition Allotment RP ( X )  { A(l ) | l  1, c} of the set
card ( Al  Am )  0, A(l ) , A(m ) , of objects among the minimal number c ,
,
l  m,   (0,1] 2  c  n of fully separate fuzzy clusters for
(10) some tolerance threshold   (0,1] is the
are met for all fuzzy clusters A(l ) , l  1, c principal allotment of the set X  {x1 ,..., xn } .
of some allotment Several adequate allotments can exist. Thus,
 l
Rс ( z ) ( X )  { A( ) | l  1, c, c  n} for a value the problem consists in the selection of the
unique adequate allotment Rc (X ) from the
  (0,1] , then the allotment is the
set B of adequate allotments,
allotment among fully separate fuzzy 
clusters. B  {Rc ( z ) ( X )} , which is the class of possible
Fuzzy clusters in the sense of definition solutions of the concrete classification
(6) can have an intersection area. If the problem. The selection of the unique
intersection area of any pair of different adequate allotment Rc (X ) from the set
fuzzy clusters is an empty set, then
B  {Rc( z ) ( X )} of adequate allotments must
conditions (9) and (10) are met and fuzzy
clusters are called fully separate fuzzy be made on the basis of evaluation of
clusters. Otherwise, fuzzy clusters are allotments. In particular, the criterion
called particularly separate fuzzy clusters c 1 nl
F ( Rc( z ) ( X ),  )    li    c ,
and w  {0,  , n} is the maximum number l 1 nl i 1

of elements in the intersection area of (13)


different fuzzy clusters. For w  0 fuzzy where c is the number of fuzzy clusters in
clusters are fully separate fuzzy clusters. the allotment Rс( z ) ( X ) and nl  card ( Al ) ,
8 Labeling Consequents of Fuzzy Rules Constructed by Using
Heuristic Algorithms of Possibilistic Clustering

A(l )  Rc( z ) ( X ) is the number of elements  D-AFC-PS(c)-algorithm: using the


partially supervised construction of the
in the support of the fuzzy cluster A(l ) , allotment among given number c of
can be used for evaluation of allotments. partially separate fuzzy clusters.
Maximum of criterion (13) corresponds On the other hand, the family of direct
to the best allotment of objects among c prototype-based heuristic algorithms of
fuzzy clusters. So, the classification possibilistic clustering includes
problem can be characterized formally as  D-AFC-TC-algorithm: using the
determination of the solution Rc (X ) construction of the allotment among an
satisfying unknown number c of fully separate
Rc ( X )  arg  max F ( Rc( z ) ( X ), ) , fuzzy clusters;
Rc ( z ) ( X )B  D-PAFC-TC-algorithm: using the
(14) construction of the principal allotment
The problem of cluster analysis can be among an unknown minimal number
defined in general as the problem of of at least c fully separate fuzzy
discovering the unique allotment Rc (X ) , clusters;
 D-AFC-TC(α)-algorithm: using the
resulting from the classification process
construction of the allotment among an
and detection of fixed or unknown
unknown number c of fully separate
number c of fuzzy clusters can be
fuzzy clusters with respect to the
considered as the aim of classification.
minimal value  of the tolerance
Thus, the problem of cluster analysis can
threshold.
be defined as the problem of discovering
It should be noted that these direct
the unique allotment Rc ( X ) , resulting prototype-based heuristic possibilistic
from the classification process and clustering algorithms are based on a
detection of fixed or unknown number c transitive closure of an initial fuzzy
of fuzzy α -clusters can be considered as tolerance relation.
the aim of classification. On the other hand, a family of direct
Direct heuristic algorithms of prototype-based heuristic possibilistic
possibilistic clustering can be divided clustering algorithms based on a transitive
into two types: relational versus approximation of a fuzzy tolerance is
prototype-based. A fuzzy tolerance proposed in [9].
relation matrix is a matrix of the initial So, the matrix of memberships Rc ( X )  [ μli ] ,
data for the direct heuristic relational
the value α of the tolerance threshold and
algorithms of possibilistic clustering and
a matrix of attributes (2) is a matrix for the set of kernels {K ( A(1α ) ),, K ( A(cα ) )} are
the prototype-based algorithms. In results of classification. The results will be
particular, the group of direct relational constant in each experiment for the same
heuristic algorithms of possibilistic data set, because the sought clustering
clustering includes structure Rc ( X ) of the set of objects X is
 D-AFC(c)-algorithm: using the based directly on the formal definition of
construction of the allotment among fuzzy cluster and the possibilistic
given number c of partially memberships are determined directly from
separate fuzzy clusters; the values of the pairwise similarity of
 D-PAFC-algorithm: using the objects.
construction of the principal The training data matrix (2) and clustering
allotment among an unknown results are a basis for constructing of
minimal number of at least c fully Mamdani-type fuzzy rules (1). The
separate fuzzy clusters; corresponding methodology described in [4]
in detail.
Database Systems Journal vol. V, no. 3/2014 9

go to step 1.2;
4. A labeling procedure for fuzzy rules 1.3 Check the following condition:
consequents if the typical point  l is labeled
Fuzzy classifier can be generated directly then l : l  1 and go to step 1.2
by some heuristic algorithm of else go to step 1.4;
possibilistic clustering [4]. A fuzzy rule is 1.4 Check the following condition:
associated to each fuzzy α -cluster of the if all typical points  l , l  1, c
obtained allotment, Rc ( X ) . So, a number are labeled then go to step 2.
of fuzzy rules is equal to a number of 2. Perform the following operations for
fuzzy α -clusters and equal to a number each output variable yl , l  1, c and
of output variables y l , l  1,  , c . each typical point  l , l  1, c :
The results obtained from heuristic 2.1 Let l : 1 ;
algorithms of possibilistic clustering are 2.2 A label of typical point  l
stable. The set of kernels should be assigned to output
1 c
{K ( A(α ) ),  , K ( A( α ) )} and the set of labels variable yl ;
{label 1,  , label c}are inputs for a 2.3 Check the following condition:
labeling procedure [8]. We assume that a if a condition l  c is met then
condition card K ( A(lα ) )   1 is met for l : l  1 and go to step 2.2 else
stop.
each kernel K ( A(lα ) ) , l  1, c . In other
words, the set of typical points { 1 ,  ,  c } That is why the proposed labeling procedure
is given. for consequents of fuzzy rules can be
Each output variable { y1 ,  , yc } considered as an extended version of the
corresponds to a fuzzy α -cluster of the procedure for labeling fuzzy α -clusters [8].
obtained allotment, Rc ( X ) . There is a
5. An illustrative example
two-step procedure which can be The Anderson’s Iris database [10] is the
described as follows: most known database to be found in the
pattern recognition literature. The data set
1. Perform the following operations represents different categories of Iris plants
for each typical point  l , l  1, c and having four attribute values. The four
each label label m , m  1, c : attribute values represent the sepal length,
1.1 Let l : 1 and m : 1 ; sepal width, petal length and petal width
1.2 Check the following condition: measured for 150 irises. It has three classes
if  l corresponds to label m Setosa, Versicolor and Virginica, with 50
then the label label m is the samples per class. Examples of records in
the database are presented in Table 1.
label for the typical point  l and
go to step 1.3 else m : m  1 and

Table 1. Examples of records in the Iris database


Numbers Attributes Labels of
of objects Sepal length Sepal width Petal length Petal width classes
… … … … … …
18 5.1 3.3 1.7 0.5 SETOSA
… … … … … …
48 5.5 2.6 4.4 1.2 VERSICOLOR
… … … … … …
10 Labeling Consequents of Fuzzy Rules Constructed by Using
Heuristic Algorithms of Possibilistic Clustering

108 5.6 2.8 4.9 2.0 VIRGINICA


… … … … … …

The Anderson’s Iris data form the matrix The matrix of fuzzy tolerance
of attributes Xˆ 1504  [ xˆit1 ] , i  1,,150 , T  [ T ( xi , x j )] was obtained after
t1  1, ,4 , where the sepal length is application of complement operation
denoted by x̂1 , sepal width – by x̂ 2 , petal T ( xi , x j )  1   I ( xi , x j ) , (17)
length – by x̂ 3 and petal width – by x̂ 4 . to the matrix of fuzzy intolerance
The data was normalized as follows: I  [  I ( xi , x j )] , i, j  1, ,150 obtained from
xˆit1 previous operations. The labeled training
xit1  .
max xˆit1 data is shown in Fig. 1.
i
By executing the D-AFC(c)-algorithm [4]
(15) for c  3 using the normalized Euclidean
So, each object can be considered as a distance (16), we obtain that the typical
fuzzy set xi , i  1, ,150 and
point of the first class τ 1 is the object x23 ,
xit1   xi ( x t1 )  [0,1] , i  1, ,150 ,
the typical point of the second class τ 2 is the
t1  1, ,4 , are their membership object x95 , and the typical point of the third
functions. The matrix of coefficients of class τ 3 is the object x98 . The clustering
pair wise dissimilarity between objects
result is presented in Fig. 2. The set of labels
I  [  I ( xi , x j )] , i, j  1, ,150 can be
is {SETOSA, VERSICOLOR,
obtained after application of some VIRGINICA} and these labels were
distance to the matrix of normalized data assigned to corresponding output variables.
X 1504  [  xi ( x t1 )] , i  1, ,150 , t1  1, ,4 . The performance of the generated fuzzy
In particular, the normalized Euclidean classifier is shown in Fig. 3 in which
distance [11] l  1,  ,3 is the number rule.
1 m1
e( xi , x j )    2
  xi ( x t1 )   x j ( x t1 ) .
m1 t11
(16)
was applied to the normalized data.

Fig. 1. The training data set


Database Systems Journal vol. V, no. 3/2014 11

Fig. 2. The clustering result

Fig. 3. Performance of the fuzzy classifier which was generated from Anderson’s Iris data

So, the result of application the proposed database. These perspectives for
labeling procedure seems to be investigations are of great interest both from
satisfactory. the theoretical point of view and from the
practical one as well.
6 Conclusions
The fast procedure for labeling Acknowledgment
consequents of fuzzy rules generated by This investigation was done in framework of
using heuristic possibilistic clustering a project of the scientific program
results is proposed in the paper. Stability “Monitoring-SG” of the Union State of
of results of heuristic possibilistic Russia and Belarus.
clustering is a basis of the developed
procedure. The proposed procedure can References
be very useful in process of adding new [1] F. Höppner, F. Klawonn, R. Kruse and
records to database. For the purpose, all T. Runkler, Fuzzy Cluster Analysis:
records in a database can be classified by Methods for Classification, Data
using heuristic possibilistic clustering and Analysis and Image Recognition,
fuzzy classifier can be generated on a Chichester, Wiley Intersciences, 1999.
basis of clustering results. That is why [2] E.H. Mamdani and S. Assilian, “An
the corresponding label can be assigned Experiment in Linguistic Synthesis with
to a new record which added to the a Fuzzy Logic Controller”, International
12 Labeling Consequents of Fuzzy Rules Constructed by Using
Heuristic Algorithms of Possibilistic Clustering

Journal of Man-Machine Studies, Outline for an Approach to Automatic


Vol. 7, No. 1, 1975, pp. 1-13. Labeling for Interpretation of Heuristic
[3] R. Krishnapuram and J.M. Keller, “A Possibilistic Clustering Results,”
th
Possibilistic Approach to Clustering”, Proceedings of the 12 Int. Conf. on
IEEE Transactions on Fuzzy Systems, Pattern Recognition and Information
Vol. 1, No. 2, 1993, pp. 98-110. Processing, pp. 290-294, Minsk,
[4] D.A. Viattchenin, A Heuristic Belarus, 28-30 May 2014.
Approach to Possibilistic Clustering: [9] D.A. Viattchenin and A. Damaratski,
Algorithms and Applications, “Direct Heuristic Algorithms of
Heidelberg, Springer, 2013. Possibilistic Clustering Based on
[5] M. Sato-Ilic and L.C. Jain, Transitive Approximation of Fuzzy
Innovations in Fuzzy Clustering, Tolerance”, Informatica Economica
Heidelberg, Springer, 2006. Journal, Vol. 17, No. 3, 2013, pp. 5-15.
[6] J.C. Bezdek, Pattern Recognition with [10] E. Anderson, “The Irises of the Gaspe
Fuzzy Objective Function Algorithms, Peninsula”, Bulletin of the American Iris
New York, Plenum Press, 1981. Society, Vol. 59, No. 1, 1935, pp. 2-5.
[7] DataEngine: Tutorials and Theory, [11] A. Kaufmann, Introduction to the
Aachen, MIT GmbH, 1999. Theory of Fuzzy Subsets, New York,
[8] D.A. Viattchenin, A. Damaratski, E. Academic Press, 1975.
Nikolaenya and S. Shyrai, “An
Database Systems Journal vol. V, no. 3/2014 13

Dmitri A. VIATTCHENIN was born on 13.12.1972 in Moscow, Russia. In


1994 he graduated the Department of Mathematical Modelling and Data
Analysis of the Faculty of Applied Mathematics and Informatics in
Belarusian State University. He defended his PhD in the field of philosophy
of sciences and technology in 1998. He got the title of Associate Professor in
the specialty informatics and computer engineering in 2013. He is currently
an associate professor in the Department of Software Information
Technology of the Belarusian State University of Informatics and Radio-Electronics and a
leading researcher in the Laboratory of Information Protection Problems of the United
Institute of Informatics Problems of the National Academy of Sciences of Belarus. His
research interests include techniques of fuzzy and possibilistic clustering, fuzzy control,
intelligent decision making systems, simulation of complex systems, and philosophical
aspects of artificial intelligence. He is the author of over 135 publications including papers in
international and domestic journals and conference proceedings, 3 monographs and 1 edited
volume. He is also a reviewer various journals, for example Fuzzy Sets and Systems,
Neurocomputing, Journal of Information and Organizational Sciences.

Stanislau SHYRAI was born on 28.09.1991 in Stavropol, Russia. He


graduated from the Faculty of Computer Systems and Networks of the
Belarusian State University of Informatics and Radio-Electronics in 2014. At
present he is a lecturer in the Department of Software Information Technology
of the Belarusian State University of Informatics and Radio-Electronics. His
interest domains related to computer science are: software development
methodologies, intuitionistic fuzzy sets, fuzzy cluster analysis, fuzzy
classifiers, information systems and databases.

Aliaksandr DAMARATSKI was born on 29.11.1983 in Mogilev, Belarus.


He has graduated the Faculty of Computer Design of Belarusian State
University of Informatics and Radio-Electronics in 2005. He is currently a
junior researcher in the Laboratory of Information Protection Problems of
the United Institute of Informatics Problems of the National Academy of
Sciences of Belarus. His work focuses on possibilistic clustering, image
processing, decision making, developing software, and computer networks
analysis. He is the author of more than 25 papers in journals and conference proceedings.
14 BI solutions for modern management of organizations - Cognos

BI solutions for modern management of organizations - Cognos

Ruxandra OPREA, Ionuț LĂCĂTUȘ, Cătălin HUȚANU


1, 2, 3
University of Economic Studies, Bucharest, Romania
rux.oprea@yahoo.com, ionutlacatus@gmail.com, catalin@hutanu.ro

Business Intelligence has evolved in the last decade increasingly relying on real-time data.
Business analysis has become essential. This involves actions in response to results analysis
and instant change of business processes parameters making BI beneficial for several
reasons.
Keywords: Business Intelligence, Performance Management, Cognos

1 Introduction
Business Intelligence (BI) has two
different basic meanings related to use of
large number of business information in
concrete performance indicators and finally
the organizational knowledge. BI
the term intelligence [1]. The first implementations are efforts involving
meaning, less frequent, is referring to the multiple issues, from organizational strategy
capacity of human intelligence in to organizational processes and
business or in activities. Business management, from application management
Intelligence is a new and powerful area of to information infrastructure changes. BI
investigation of the applicability of projects do not aim to teach managers how
human cognitive abilities and artificial to make the right decisions; instead they
intelligence technologies in management help them make decisions based on facts and
and decision support in various business figures, not on assumptions. Companies
issues. The second meaning displayed in collect vast amounts of data through
Fig.1 refers to intelligence as information transactional systems (e.g. ERP, CRM, and
valued for its use and relevance. SCM) that have implemented over the years
and which they use daily to perform a
variety of corporate functions. Before the
notion of BI was launched, there was no
concept that allowed the use of large
amounts of data by integrating and
transforming them into information [3].
Development of BI concepts and
technologies creates a management
environment where current and new data can
be used to improve the quality in the process
of decision making. In addition, the
Fig.1 Information and business existence of large volumes of transactional
intelligence [2] data, especially transactional data with a
high degree of specificity and particularity,
Application. BI applications include creates opportunities for management to
decision support systems, query and improve forecast accuracy.
reporting tools, online analytical
processing and also forecasting and data BI Tools:
mining systems. Ultimately, the final  Spreadsheets
results of BI implementations are depth  Query software and OLAP (Online
analysis, refining and concentrating a analytical processing)
Database Systems Journal vol. V, no. 3/2014 15

 Digital Dashboards (an executive Applications for social media make it


information system user interface is possible to analyze the attitude of customers,
designed to be easy to read) products and associated brands and
 Data mining (process of extracting emerging issues on the organization or
patterns from large volumes of data market segment.
by combining methods from statistics IBM Cognos Business Intelligence
and artificial intelligence with those provides reports, analysis, dashboards and
of the database management) scoreboards to help support the way people
 Decision Engineering (framework think and work when they are trying to
that unifies a number of best practices understand business performance. You can
for organizing decision making) freely explore information, analyze key facts
 Process mining (extracting and quickly collaborate to align decisions
knowledge from the events recorded with key stakeholders [5].
by the information system) [3]  Reports equip users with the data they
 Business performance management is necessitate to make fact-grounded
a set of management and analytic conclusions [4].
processes that enable management of  Dashboards avail users access, interact
an organization's performance in and personalize content in a way that
order to achieve one or more goals. assists on how they make decisions help
users access, interact and personalize
content in a way that supports how they
make decisions.
 Analysis capabilities provide access to
data from multiple angles and
perspectives so you can watch and study
it to make informed decisions [4].
 Collaboration capabilities include
communication tools and social
networking to fuel the exchange of
thoughts during the decision-making
process.
 Scorecarding capabilities automate the
capture, management and monitoring of
business metrics so you can compare
Fig.2 Product capabilities [4] them with your strategic and operational
targets [5].
Traditional BI capabilities are extended
beyond reporting, analysis and creating 2. IBM Cognos Business Intelligence
dashboards and scorecards, which are Cognos Incorporated was an Ottawa,
represented in figure 2. Those who use BI Ontario-based company making business
can also plan and conduct scenario intelligence and performance management
modelling, real-time monitoring and software. “Founded in 1969, at its peak
predictive analytics. Managers want to be Cognos employed almost 3,500 people and
able to perform their own analyzes and served more than 23,000 customers in over
view data in real time [6]. They want to 135 countries until being acquired by IBM
connect predictive analytics and social on January 31, 2008. While no longer an
media analysis with traditional BI reports independent company, the Cognos name
and dashboards to get a complete view of continues to be applied to IBM's line of
the business, customers, competitors and business intelligence and performance
market. management products” [11].
16 BI solutions for modern management of organizations - Cognos

IBM Cognos Business Intelligence is an and monitor outcomes and metrics so that
integrated business intelligence suite that they can prepare efficient business decisions
offers a broad range of functionality to [8].
help understanding an organization’s IBM Cognos 8 integrates the following
information. Everyone in the system can business intelligence activities, shown in
use IBM Cognos 8 to view or create table 1, in one Web-based solution.
business reports, examine information,
Table 1 Components and activities [8]
IBM Cognos Connection Publishing, managing, and viewing
content
IBM Cognos Insight Managed workspaces
IBM Cognos Workspace Interactive workspaces
IBM Cognos Workspace Advanced Ad hoc querying and data exploration
IBM Cognos Report Studio Managed reporting
IBM Cognos Event Studio Event management and alerting
IBM Cognos Metric Studio Scorecarding and metrics
IBM Cognos for Microsoft(tm) Office Works with IBM Cognos BI content in
Microsoft Office
IBM Cognos Query Studio Ad-hoc querying
IBM Cognos Analysis Studio Data exploration
on Cognos BI servers. This software allows
multiple users to download and install IBM
“IBM Cognos Connection is the Web Cognos Insight on their computers from the
portal for IBM Cognos Business Cognos Connection interface.”
Intelligence.” It is the beginning spot to In IBM Cognos Workspace, the user can
access BI information and the create sophisticated interactive workspaces
functionality of IBM Cognos BI. The using IBM Cognos content, as well as
Web portal can be used to publish, find, external information sources such as TM1
manage, organize, and view Websheets and CubeViews, according to
organization's business intelligence specific information needs. The user can
content, such as reports, scorecards, and view and open favourite workspaces and
agents. Possessing the necessary reports, manage the content in the
permissions, various studios can be workspaces, and e-mail the workspaces.
accessed from the portal and this can be Also, can be used comments, activities and
used for content administration, including social software like IBM Connections for
programming and distributing reports, collaborative decision making.
and creating businesses. In IBM Cognos With IBM Cognos Workspace Advanced,
Insight, the user can analyze information, the user can perform advanced data
explore scenarios, and influence exploration and create simple reports. When
decisions by creating personal or the user is in a workspace in IBM Cognos
managed workspaces. The interactive Workspace and wants to perform deeper
workspaces can be used to pass the analysis and report authoring, he can
results to managers. Because Cognos seamlessly graduate to Cognos Workspace
Insight supports write-back, the user can Advanced, where he can perform more
also use these workspaces to collect and advanced data exploration, such as adding
consolidate management targets, additional measures, conditional formatting,
commitments, and forecasts. “IBM and advanced computations [7]. The user
Cognos Insight is provided with IBM can also launch Cognos Workspace
Cognos BI. IBM Cognos Connection Advanced from the IBM Cognos
Installer for Cognos Insight can be used Connection portal. With Cognos Workspace
in order to install provisioning software Advanced, the user can create reports with
Database Systems Journal vol. V, no. 3/2014 17

relational or dimensional data sources, access to all IBM Cognos report content,
and that shows the data in lists, crosstabs, including data, metadata, headers, footers,
and charts. He can also use his external and charts. The user can use predefined
data source. With this feature, a report reports, or create new content using IBM
authored in IBM Cognos Report Studio Cognos Query Studio, IBM Cognos
can be accessed; the objects that can be Analysis Studio, or IBM Cognos Report
inserted only in Report Studio can be Studio. By importing content into Microsoft
seen (such as map). However, he cannot Excel spreadsheet software, the user can
modify these objects. work with the data and leverage Microsoft
Using Report Studio, report authors Excel's formatting, calculation, and
create, edit, and distribute a wide series presentation capabilities. The user can also
of professional reports. They can also use the formatting and charting features of
define corporation standard report Microsoft Excel. By importing content into
templates for use in Query Studio, and Microsoft PowerPoint and Microsoft Word,
edit/modify reports created in Query various reports and charts can be included to
Studio or in Analysis Studio. enhance the presentations and documents.
Report Studio can be used for reports Using Query Studio, users with little or no
which are intended for a large audience, training can quickly design, create and save
exist long enough to require maintenance reports to meet reporting needs not covered
for updateing requirements and data or by the standard, professional reports created
require a thorough check on the in Report Studio [8].
appearance. Report Studio provides In Analysis Studio, users can explore,
powerful features, such as transferring in analyze, and compare dimensional data.
burst, prompts, maps and create advanced Analysis Studio provides access to
graphics and offers many ways to dimensional, OLAP (online analytical
customize reports. processing), and dimensionally modelled
Event Studio is an action oriented agent relational data sources. Analyses created in
which notifies the users when an Analysis Studio can be opened in Report
bussiness event occurs. Agents can Studio and used to build professional reports
publish details to the portal can deliver [12].
alerts by e-mail, can run and distribute IBM Cognos Transformer is a multi-
reports based on events and can monitor dimensional data modelling component
the status of events. designed for use with IBM Cognos Business
In Metric Studio, the user can create and Intelligence. This component is useful to
deliver a customized scorecarding create a multi-dimensional model: a
environment for monitoring and business presentation of the information in
analyzing metrics throughout one or more different data sources that share
organization. Users can monitor, analyze, common data [8]. After the user add the
and report on time-critical information by needed metadata from IBM Cognos
using scorecards based on cross- Business Intelligence packages, reports, and
functional metrics. other various data sources, model the
In IBM Cognos for Microsoft Office, dimensions, customize the measures, and
the user can work with secure IBM apply IBM Cognos BI secured views with
Cognos Business Intelligence content in dimensional filtering, he can create IBM
his familiar Microsoft Office Cognos PowerCubes based on this model.
environment. He can extract report These cubes can be deployed in order to
content from a variety of IBM Cognos support OLAP reporting and analysis around
applications, including IBM Cognos BI the globe.
and IBM Cognos PowerPlay. IBM IBM Cognos Business Intelligence
Cognos for Microsoft Office provides Administrators ensure that IBM Cognos BI
18 BI solutions for modern management of organizations - Cognos

runs without interruption and at optimal downloading - can view the downloaded
performance. They can: pages and interact with them without the
 Define connections to data sources of need to wait for complete download of
the organization the report.
 Define security permissions for users 2. Dashboards
and groups in the organization  With the ability to create dashboards,
 Specify distribution lists, contacts and Cognos Business Intelligence allows
printers users to create and customize their ways
 Manage servers and dispatchers, and of viewing dashboards that they can
can finely adjust performance for shape the needs.
IBM Cognos BI  Historical data with variable data and
 Predefine links to an entire package what-happens-if scenarios provides
that authors can easily add it to their extended insight of performance.
reports  Portable dashboards can be created that
 Customize the appearance and can support decisions on when and
functionality of IBM Cognos BI where to create
IBM Cognos BI is secured by setting 3. Analysis
permissions and by enabling user  With analysis capabilities of Cognos
authentication. When anonymous access Business Intelligence, users can use
is enabled, IBM Cognos BI can be used analytical reporting, trend analysis,
as a specific user without authentication. statistical analysis and some other
In IBM Cognos BI, administrators define intuitive tools that can easily analyze
permissions so that users can access information.
functions. For example, to edit a report  Financial and business analysis can
using IBM Cognos Report Studio, estimate the short-term business through
security permissions and the appropriate advanced analytics and predictive
license are required. In addition, each statements what-happens-if.
entry in IBM Cognos Connection is  Support for operational and strategic
secured by defining those who can read decision cycles may engage the right
it, edit and run [8]. people at the right time for analysis.
IBM Cognos BI comes with the  Cogons Business Intelligence comes
following capabilities: with options for mobile devices, analysis
1. Reports and offline interactivity along with
 Cognos Business versions for Microsoft Office to the
Intelligence features professional appropriate user needs.
report generation capabilities that are 4. Collaboration
easy to use and help minimize the  The ability to form communities, to
effort to build them. capture annotations, views and share
 Users can create their opinions help streamline and improve
own reports (queries) or edit existing decision-making groups.
reports.  Workflow and ability to handle tasks
 Collaborative are designed to connect users and to
reporting features built into Cognos enhance activities coordination.
Business Intelligence helps users to  Using IBM Connections and its
communicate between them to make social capabilities, users can share their
decisions and to gain additional opinions and request ideas.
insights. 5. Scorecarding
 Users can access  Using measurement capabilities inside
reports on mobile devices and can (scorecarding) of Cognos Business
already interact with the report while Intelligence, organization can compare
Database Systems Journal vol. V, no. 3/2014 19

performance indicators with strategic - Measurement indices appear as weak


goals. (red), medium (yellow) or excellent
 Facilities available that can analyze (green).
the organizational strategy can help - For each metric, we can see if the
departments and employees set their situation improves, stays the same or
priorities. gets worse.
 Model based on web interface - When we position the pointer over the
increase management efficiency and title of a metric, we can see a chart with
encourages adoption by users. historical performance metric. It extends
a floating menu when you pause the
3. Example – Performance Monitoring cursor over a metric or above diagram.
With IBM Cognos Metric Studio, you
can track the performance of an Procedure
organization compared with its 1. Opening IBM Cognos Connection from
objectives. In a quick glance, decision- the web browser and access the URL
makers at every level of the organization obtained from the administrator. URL is like
can see the status of the organization and this: http://nume_server/cognos
then react or make a plan. A metric is a 2. On the Welcome to IBM Cognos, choose
key indicator of measures that compare Manage my measurement indices.
actual results with the results of the 3. Choose Samples, Models, GO Metrics.
target. An index of measurement registers 4. Viewing a map of strategies: On the left
also who is responsible for the outcome pane, choose Scorecards tab and then GO
and impact index [8]. A scorecard is a Consolidated. In the right pane, select
collection of performance metrics and Diagrams. In GO Strategy map (figure 3),
projects that reflect the strategic goals of we see how we can quickly assess the
a department in an organization. The performance of each purpose of the
following example shows how you can: organization. Major indicators show the
 examine a strategy map for a visual status of a specific strategy. Smaller
representation of the strategy and indicators that appear in strategy show trend.
objectives of this strategy for an Measured values of indices that do not
organization register a good performance will appear in
 examine and understand the red. Indicators of status and trends, which
performance of a metric on a appear in red, indicate possible problem
scorecard areas. For example, in Production and
 create an action on an index Distribution function of strategy map, in
measuring Control product quality measuring index,
 add a metric to watch list Return quantity % is red, representing a
Suppose that we are a sales manager for a measure of poor performance.
region in Sample Outdoors Company.
Regularly consult a scorecard that
contains metrics for the company's sales.
Measurement indices provides us a quick
comparison between actual sales and the
company's objectives it. The amount
returned is one of the indices of
measurement. To perform this exercise,
you need to have licensing and security
permissions appropriate for this
functionality.
Things to note:
20 BI solutions for modern management of organizations - Cognos

Fig.4 Diagram tab

To get more information on this metric,


Fig.3 GO Strategy map pause the cursor over Asia Pacific -
Quantity returned %. (figure 5)
5. Exploring scorecard: In the left pane,
click Scorecards and expand GO
Consolidated. Note that Sample
Outdoors Company has scorecards for
each of the four functions of the
company: Finance, Sales and Marketing,
Production and Distribution and Human
Resource. Expand Production and
distribution, choose Asia Pacific and
then choose Metrics tab showing all
indices of measurement associated with
the Asia Pacific region. Note that
Returns by reason index in Asia Pacific
% wrong product ordered is in red. To Fig.5 Metric tooltip
understand the cause of the problem, we
further explore this metric: press Asia In the left pane, under production and
Pacific - Amount refunded %, then the distribution, press Asia Pacific and then on
Diagrams tab (figure 4). Expand Metrics tab (figure 6) to view the metric
measurement index in chart by clicking related to the amount refunded. Note that the
on the arrow next to the metric Asia metric condition Asia Pacific - Return by
Pacific Returns by reason in % Orders reason % wrong product ordered is poor.
failed. We can see what data can be
found in the index measuring Quantity
returned %. We can also see which of
the return reasons is a major problem.

Fig.6 Metric tab

6. We click on the metric Asia Pacific -


Return by reason % - Wrong product
ordered. In Metric Studio, tabs represent
Database Systems Journal vol. V, no. 3/2014 21

questions that we might put, as we try to distribution, we click on Asia Pacific and
solve a problem or understand the then tab Metrics; we click the measuring
information. index Asia Pacific - Returns by Reason %
7. In the right pane, the History tab - wrong product ordered and on the
(figure 7), we click on the list. The superior toolbar choose Add the watch list
information in the History tab, answer the (figure 8), and press OK. In the left panel,
question "When?". We see the actual we click on My Folders on the Watch List.
values on the target metric data for Index Asia Pacific - Returns by Reason %
previous periods. Also see half-yearly - wrong product ordered appears on our
and annual summaries. list of supervisors

Fig.8 Watch list

4. OBIEE vs. COGNOS


In Business Intelligence Tools 2014 survey
was intended to compare various Business
Intelligence tools, a comparison that ignores
Fig.7 History tab the supplier. Comparison of Obie vs.
fourteen other tools Cognos Business
8. We click on the Diagrams tab. Intelligence in more than 100 selection
Diagrams tab information helps us criteria. The criteria we are talking here are
understand the question "How?" by essential for obtaining success with Business
comparing our indices measuring other Intelligence [9]. For example: OBIEE
metrics. Impact diagrams show the support or Cognos-based reporting roles and
relationship between the metric Return reporting components can be reused in the
by reason % and other indices. implementation of business intelligence,
9. Create action for one of our business whether they are used in web reports,
analysts, in which he investigates the dashboards, analysis, intranet or mobile
performance of products returns: click the version?
Actions tab. Choose new action. Tip: The survey results revealed that Obie and
You must select a metric to see new Cognos have scored almost equal if all
action icon. In the Name box, type criteria are taken into account. But in
Kazumi Uragome the person who will different categories, each one has different
own action. To announce the Kazumi score, and also the costs each are different
what to do, in the Description box, type [9].
Please investigate. Near Planned End
box, we click the Calendar icon and References
select a date a week away from today. [1] - Business Intelligence: Concepts,
Hint: You may need to scroll down to Components, Techniques and Benefits,
see Planned End box, and then we click http://www.techrepublic.com/resource-
on OK. library/whitepapers/business-
10. Add index measurement Asia Pacific intelligence-concepts-components-
- Returns by reason % - wrong techniques-and-benefits/
product ordered on our list of [2] - Building Business Intelligence,
supervisors to monitor it in the future http://exonous.typepad.com/mis/2004/03
easily in the following way: in the left /building_busine.html
panel, the scorecard Production and
22 BI solutions for modern management of organizations - Cognos

[3] - BUSINESS INTELLIGENCE, ter/c8bi/v8r4m0/topic/com.ibm.swg.im.c


http://ie2.wikispaces.com/Business+I ognos.wig_cr.8.4.0.doc/wig_cr.html
ntelligence [9] - OBIEE VS COGNOS,
[4] - Making Business Intelligence a Part http://www.passionned.com/business-
of Your Organization, intelligence/business-intelligence-
http://socialmediatoday.com/docmark tools/business-intelligence-tools-
ting/1711106/making-business- comparison/obiee-vs-cognos/
intelligence-part-your-organisation [10] - cognosvsobiee-120628035521-phpa
[5] - Business intelligence software with pp01, http://www.share-
reporting, analysis, dashboards and pdf.com/c47886b35f4e47739ab9efa9e82
scorecarding capabilities, http://www- c3e53/cognosvsobiee-120628035521-
03.ibm.com/software/products/sv/busi phpapp01.htm
ness-intelligence [11] - Cognos, http://en.wikipedia.org/wiki
[6] - What is Business Intelligence, /Cognos
http://www.selectbs.com/products- [12] - IBM Cognos Business Intelligence,
general/what-is-business-intelligence http://en.wikipedia.org/wiki/IBM_Cogno
[7] - IBM® Cognos® 8 Business s_Business_Intelligence
Intelligence,
http://public.dhe.ibm.com/software/da
ta/cognos/documentation/docs/en/8.4.
0/wig_cr.pdf
[8] - IBM® Cognos® 8 Business
Intelligence Getting Started,
http://publib.boulder.ibm.com/infocen

Ruxandra OPREA graduated from the Faculty of Mathematics and


Informatics of the Ovidius University, Constanța, in 2012. She is currently
graduating from the Databases for Business Support master program organized by the
Academy of Economic Studies of Bucharest.

Ionuț LĂCĂTUȘ graduated from the Faculty of Mathematics and


Informatics of the Ovidius University, Constanța, in 2012. He is currently
graduating from the Databases for Business Support master program
organized by the Academy of Economic Studies of Bucharest.

Cătălin HUȚANU graduated from the Faculty of Mathematics and


Informatics of the Ovidius University, Constanța, in 2012. He is currently
graduating from the Databases for Business Support master program
organized by the Academy of Economic Studies of Bucharest.
Database Systems Journal vol. V, no. 3/2014 23

Business Intelligence overview


Dragoş AGIU, Vlad MATEESCU, Iulia MUNTEAN
University of Economic Studies,Bucharest, Romania

agiu.corneliudragos@yahoo.com, vlad.mateescu30@gmail.com,
muntean.iulia12@gmail.com

The need for information has always been present, but what defines the present time is the
big amount of data available everywhere and the need to have all the answers we need in
a very short time. Because of the agglomeration of markets and the dynamics of economic
environment, the ability to collect data and turn it into useful information for decisional
process may be the element which makes the difference. Thus, the term of “business
intelligence” was first introduced by Gartner Group in the mid of 90’s, but it still exists
since the 70’s, when the reporting systems were static, two dimensional, with no
analytical skills.
Keywords: Business Intelligence, management, Data Mining, CRM, ERP

The concept of BI managers by using BI systems is

1 Business Intelligence can be defined


theoretically as the use of high class
software or business applications or
the use of values to make better decisions
given by monitoring the risks that
may threaten the organization’s
strategic objectives and financial
losses.
for the company, as IBM confirms. 2. BI solutions integrate all data for
Technical and practical, Business analysis: standard relational data,
Intelligence are tools for collecting, exports of texts, Microsoft Excel
processing and analyzing data. This way, data up to XML data streams,
the company can evaluate the results and which are stored in data warehouse
interpret them . or operational systems. This way,
Based on the newest technologies, BI they are always available for the
systems are essential for the decisional use in BI applications.
level efficiency, but also for improving 3. The costs for implementing a BI
relations with the clients, employees and system are low, large investments
suppliers by: facilitating the decisional are not required in hardware
process, increasing productivity of equipment and the training for the
employees, lower costs, increasing the next users can be done in a short
relationship with the partners and business time, all of them being minor
development. investments, which will be
To understand the importance of the BI recovered in first moths.
trend, it’s important to know what are the 4. Decrease the influence of power
most relevant benefits after using this type games in the company
of systems, as follows: 5. Avoiding decisional problems.
1. BI systems enable effective risk
management: the value brought to
24 Business Intelligence overview

Fig. 1 Business Intelligence Process

2. Choosing a BI solution Intelligence strategy is to know where the


company wants to reach out and which
It’s already known that today’s economic are the effects expected after
activity can generate a lot of data implementation.
volumes. Each one represents a tiny part To obtain the necessary information and
of the business and it is found in various to present them “user-friendly” is an
locations or departments, or often in important step to pass in desire to solve
different geographical regions. the problem of BI in an organization. For
With BI, as we’ve already mentioned, all example, data difficult to understand
data are collected, processed in may discourage users from using them.
information, information that will be well Here, it should be also noted that “dirty”
analyzed and used for the next decisions. data in the system must be identified,
Under the current business environment, removed and replaced.
the quality and timeliness of information It is essential that the organization should
is not just a choice between hard profit not forget the benchmarks for
and loss, but a matter of survival and performance – measuring process will
training. require both the company past and future.
The benefits of a BI system are obvious – Visions must be very clear, so the
analysts are optimistic, showing that the measure progress must be easy to do.
coming years millions of people will Once established where the company is
become regular users of the system. in the chosen moment for analysis, what
Around this concept however is now does it want to be done, it’s time to
seeking the most appropriate strategy for decide how to reach the established goal.
choosing a business intelligence system The standards can be set, BI Competence
and of course, this strategy will be Centers can be implemented, or BI in
customized to the needs of each cloud or framework can be used, all
organization. according to the needs of the company.
A BI system without a clear end goal can
certainly give a yield, however will never 3. BI solutions for organization
guide the organization where desired, management
because no one knows the ultimate goal.
So, the first step in building a Business
Database Systems Journal vol. V, no. 3/2014 25

Business intelligence applications include (KPI) for an organization or business in


decision support systems, query and the form of questions that require specific
reporting tools, online analytical answers and are based on facts. Some
processing (OLAP) and data mining and examples of key performance indicators
forecasting systems. But ultimately the include: the evolution of income, profits
final results of implementing business earned by a line of business, and cost
intelligence are in depth analysis, refining projections regarding business questions
and concentrating a large number of including management problems with
business intelligence in concrete numerical answers: "how much are we
performance indicators and ultimately, buying from a supplier in one year?" and
organizational knowledge. The business so on.
intelligence projects do not aim to teach In defining business intelligence solutions
managers how to make the right decisions within an organization, the focus should
- but they help to understand the figures fall on functional analysis, the design
and the decisions they take in their solution and the outcome or information.
analysis. One of the best strategies in the
Basically companies collect vast amounts establishment of such a solution is the
of data through transactional systems (eg strategy of "top-down", which starts from
ERP, CRM, SCM) that have been the Executive Management that the
implemented over time and which they use company and their needs for information
to perform daily a variety of corporate and ends with the information technology
functions. The existence of such large and integration of multiple data sources to
volumes of transactional data creates meet the information needs of
opportunities for management and management.
improves forecast accuracy. However, research reveals that more than
Business Intelligence wants to eliminate half of Business Intelligence projects hit a
assumptions and uncertainties in decision- low degree of acceptance or fail. What
making processes, both tactical and factors influence the implementation
strategic level. At the tactical level, BI negative or positive?
helps optimize business processes or The most important factor is definitely the
product lines by identifying trends, project sponsor, and this variable will be
changes or behaviors which need bet on the functions of the organization's
improving management and control needs. A Business Intelligence application
functions. Regarding the strategic level, BI is divided into three levels: operational
can provide significant value increased by reporting, analysis and strategy. If the
different alignment of business processes beneficiary is defining realistic
and product lines with the strategic expectations that are based on clear
objectives of the organization through an objectives and requirements, they create
integrated performance management prerequisites for a successful
framework and systematic analysis. It is implementation. In addition, there are
essential to consider that Business limitations related to human resources.
Intelligence has tended to shift from Knowledge of business will be provided
reporting past events to forecast and by the client. These projects involve a
prediction. team from the implementer, and one from
Defining BI solutions begins with strategic the customer consisting of user’s key
performance measurement requirements of functional areas (finance, sales, middle
the organization and not necessarily with management and top management) which,
the technical details related to it. In besides regular duties, should work with
general, performance evaluation can be consulting solution implemented.
defined as key performance indicators
26 Business Intelligence overview

In terms of the defining trends of BI the valuable information hidden in large


solutions in companies in the current year amounts of data at its disposal to make
is important to note the following: critical business decisions.
 Companies choose mobility - inflexible  Data from social media environment
switching solutions that create static become useful tools for the business –
reports to user-driven BI solutions will more and more, social media is becoming
accelerate as more organizations recognize an essential component for the presence of
the importance of obtaining data that each a company on the market and contributes
user be able to base their decisions. to a more direct and authentic
Therefore BI solutions adapt the way communication with the public. At the
people work, giving them access to same time, however, beyond the number
information wherever they are at any given of likes or followers a company earns on
time. For both the large companies and various platforms of social networking,
small ones the speed of the business world social media becomes a relevant source of
means that mobile business intelligence information that helps measuring the
solutions are commonly used and not just reputation of the company in online
occasionally. Business users want to be market or it can reveal unexpected insights
able to access information in the context of considering the composition of the
the natural flow of their day and not just audience.
when they are physical in their offices.  Moving to cloud – a few years ago, the
Devices such as a PC Tablet for example, cloud system was an interesting subject in
are ideal for mobile BI applications and almost all the segments of the software
they support and encourage collaborative except for BI. 2014 will be, most likely the
decision-making process. The data brings year when the cloud will become an
out its secrets - is obvious that those important element of BI solutions. The
organizations who use data obtained from change will be determined by the
daily activities to make decisions are more international market maturation and by the
successful, while those companies that do companies which realize that the costs of
not, see their market position threatened. If thir own infrastructure are bigger than a
traditionally the data analysis process is cloud solution. Because the BI solutions
the responsibility of an “analyst” expert, are limited, the organizations prefer their
today this process counts more often as a people to spend time exploring data,
responsibility of a regular user of business testing new ideas and highlight business
skills. The objective of the most successful insights instead of being stuck with
Business Intelligence solutions today is to infrastructure activities and upgrading
offer any user, regardless of their software.
technological skills, the power to harness
Database Systems Journal vol. V, no. 3/2014 27

Fig. 2 Customer&Market Intelligence

4. Life cycle of a BI system that the project or projects implicated in


the process to have maximum profitability.
Business Inteligence is a strategic initiative Company managers with each project
which helps the organizations to measure manager should adopt a specific
the effectiveness of their plans on the methodology based on the needs they
market. know they have.
A succesful company must know how to
plan and how to address a BI strategy so

1.Analysis 2.Design

5.Evolution 3.Development

4.Implementation

Fig. 3. The five main phases of a BI system


28 Business Intelligence overview

4.1. The analysis stage selected instruments should be jointly


The aim is one of the most important managed by the Center of Excelence
things in any project, but sometimes it (COE) and the IT department (according
should be put aside and addressed in a to the BI architectural standards) and it
different stage than the one in the should include the active participation of
beginning. Any project must be compared the end users, to make sure that the
from a technical and an organizational expectations are the same with the
point of view, but it must coincide with the needs.The plan should also decide what
primary aim of the company. This will be information sources are necessary to
able to give a clearer view over what it is support the solicited key peformance
expected to be done during the following indicators, including their quality, as well
period of time. as any other necessary transformations for
Any BI project has to clearly justify the the analysis.
cost and the benefits of solving a business 4.3. The development stage
problem. The analysis is carried out by a In this stage, all the information flow from
predefined set of key performance the organization must be modeled. It is
indicators (KPI) and it is requested by the critical to create the prototypes and a
end users.The analysis stage produces the testing environment to verify and compare
design of different components of the the target objects of the company.
solution with the relevant information The data infrastructure could count as
sources. Because of the dynamic nature of much as 70% in the effort and the costs of
BI projects, the changes in the objectives, this stage. The previous stage, as well as
people, estimates, technologies, users and this one, are usually the ones who
sponsors can have a serious impact in the consume the most time and resources
success of the project. during the development cycle. The
4.2. The design stage workload involved in building the solution
Based on the complexity of the solution, afterall the data is in place depends on the
the BI technologies are carefully complexity of the project. Only a simple
chosen.One of the usual methods of trying configuration could be necessary or a full
out solutions is building prototypes. This customization may be needed.
way, it is possible to adjust the requests The specification that gives us what kind
based on the expectations. of data we need for the development and
The key performance indicators should be must be stored can be grouped in a
defined without taking account of the metadata model. Further more, the
current informations, the aim is to capture specification needed for the delivery of the
the business needs, even though the metadata to the clients must be analysed.
support for these needs is unavailable at If a metadata deposit is acquired, this will
the time being. The plan should include a have to be extended with characteristics
high level design of different components solicited by the BI applications. If a
of the solution, including the relevant metadata deposit is built, the database
information sources. Then, the team must be designed based on this deposit.
members of the project, including the key The database design schema must coincide
managers and the IT department should all with the access specification of the
agree in a formal way over the plan and business.
the success criteria. According to how precise the data and the
The projectation stage should include the requests for data transofrmation during the
adequate selection of BI technologies, analysis, an ETL instrument may or may
based on the needs of the users and the not be the best solution. In both cases,
complexity of the implementation. The preprocessing the data and writing the
Database Systems Journal vol. V, no. 3/2014 29

extensions for the instruments are technical support department for a better
frequently necessary.The real challenge for guidance of the end user.
the BI applications comes from the BI that 4.5. The evolution stage
is hidden in the data of the organization, This stage can be called, contrary to the
that can only be discovered with data title, the consumption stage because the
mining instruments. Developping metadata user uses the information received to
deposits becomes a subproject of the main change the business and make decisions.
BI project. The main goals of this stage are:
4.4. The implementation stage  Measuring the success of a project
After all the BI components were  Extending the application to the
thoroughly tested, the application is enterprise level
implemented at user level. Regardless of
the technology used, the success of a  Increasing the exchange of
project will depend on the training done at information between business and
the user level and the support of a functional, both internally and
dedicated team, especially in the first stage externally.
of implementation. This phase requires an Business-oriented aspects are often
iterative approach with extra training ignored at this stage - since the BI solution
sessions so that customer needs are met. is technically capable, the user is free to
This step also requires a preliminary make any decision. Many projects stalled
development of specific reports and in this phase because responsibility of
analysis for the business users type. This passing to the user is not officially sent to
will create a foundation for future a business manager. Therefore a good
advanced analysis. study is dividing this stage in several sub-
All support and guidance phases to show more clearly how this
operations will be provided by the IT team methodology can lead to the end of a life
and the users will serve as consultants cycle of BI.
during the execution of this stage. Of Shown below it is presented the schema of
course, the IT team will have to train the the evolution stage subdivisions:

a.Descovery b.Access

e.Change c.Decision
Evolution

d.Division

Fig. 6 Business cycle.

a. The discovery stage


30 Business Intelligence overview

Often the organization does not understand something did not come out as expected
how the solution can lend itself to the organizational changes can occur.
exact needs of the customers. End users e. The changing stage
along with the first version of the solution Changes made can trigger a fundamental
should create the main environment for process of reengineering. On this stage all
using an effective solution, but for there technical resources together with COE
has to be a mutual agreement over a basic shall review the issues and help resolve
system where the solution can run at first. them.
b. The access stage Any BI cycle that ends should start again
Having identified the indicators and the from the first stage, but the methodology
valuable information during the discovery approach will have a new level of focus:
stage, the end users begin to follow, - The analysis stage
understand and manage the information - The reevaluation stage
leading to deeper perspectives in the - The modifying stage
chosen business solution. The users can - The optimization stage
ask for assistance or contact the - The adaptability stage
organization for further clarification. This determines that the benefits of the
c. The decision stage experiences should be added again to a
The end users make definite decisions process and maintain the relevant BI cycle.
based on the newly obtained information. Using a BI methodology helps the
The Center of Excellence (COE) may be organization to understand and produce a
involved in verifying the solution so that sequence of steps to develop and
the user will make a better decision. implement a successful Business
d. The division stage Intelgence site. Some methodologies may
The decisions and analyzes made are serve as a guide for consuming the
distributed in the company to better resources effectively and/or funding from
analyze whether the BI solution given to other successful companies.
the user was indeed a good one, and if
(Online Analytical Processing) as well as
the data mining concept.
5. BI Components: OLAP, Data Mining

The relational model is the most common a. OLAP


model in the representation of databases.
Although it has its strengths, it does not According to [5], OLAP is an interactive
handle complex queries on large data sets. technology that allows the user to make
The concept of Business Intelligence can the following operations:
be divided into three parts, data collection, - Fast and dynamic analysis of
data analysis and reporting. The main tool aggregate data;
for data analysis is the cube, which is a - Viewing the information from
multidimensional data structure built on a multiple perspectives and
data warehouse. The cube is used for data dimensions;
clustering on multiple dimensions and - Analysis of trends during
selecting a subdomain of interest. The significant time periods;
selected data can be interpreted with data
mining tools, a concept used to find trends An OLAP application is designed to allow
and patterns in data structures. users to browse, retrieve and present
In the following sections we will analyze specific data of the business. The tables in
an approach that uses cubes, OLAP the relational model use only one or two
dimensions, which does not correspond to
Database Systems Journal vol. V, no. 3/2014 31

the complex data of a company. In the dimensions as are needed for the business
OLAP concept a cube is a model model. For dimensions with large number
specifically designed to work with data of records, hierarchical dimensions are
stored on multiple dimensions. The made. They use a field named "parent"
concept of a dimension used with the which groups all fields of the same type.
OLAP technology refers to a characteristic For example, if a store sells shirts with
of the data, not an actual dimension in three colors, red, green and purple, instead
space. of having one field named "Color", there
For a data model to be defined, it is will be a parent-field called “Shirt” and the
necessary to define the model structure child-fields will be "Red", "Green" and
(the objects of the model and the relations "Violet".
between them), the operators (those acting There are two main models for
on the structure), and the integrity implementing cubes, MOLAP
constraints (rules and constraints imposed (Multidimensional OLAP) and ROLAP
to ensure the correctness of the model). (Relational OLAP). MOLAP assumes that
The structure of the model consists of a cube is multidimensional and has the
dimensions (structures spanning different data stored in a multidimensional way.
hierarchical levels that the data are Thus, the data is copied from the data-
grouped by) and tables of facts (containing store to the data-store of the cube, and
measures and foreign keys to dimension aggregates of different size combinations
tables). A time dimension, for example, are pre-calculated and stored in the cube at
may include days of the week, months, or a given vector. This means that the
even years. response time will be very short, but there
Generally in Business Intelligence there is a possibility that explosion of data could
are used three types of schemes: star occur when all combinations of size
schema, snowflake schema and aggregations are stored in the cube.
multidimensional cube. In a star schema ROLAP, as the name suggests, uses the
type (Fig. 5), there is a table of facts, of relational model and considers that it is for
which the dimension tables are connected the best to keep the data stored in the data
to. Dimension tables are not connected warehouse. As in the MOLAP, the data
among them, otherwise the scheme would can be aggregated and pre-calculated using
be called snowflake. A star schema does materialized views. [6]
not correspond to the third normal form, b. Data-Mining
since the dimension tables are made up of Data Mining is the process of finding
several adjacent tables. But this is hidden patterns and associations,
preferable, due to the loss of performance constructing analytical models, achieving
when the schema is in third normal form classification and prediction and
and running operations on large data sets. presenting the obtained results. In other
If it is wanted a variation in the third words, the process of data mining can be
normal form, a snowflake schema can be viewed as examining past data to find
used. [6] useful information that can be used as a
Given the fact that the users who use the guide for the future. Thus, data mining
OLAP technology want quick answers to gets the answers to questions like "What
questions like "How many bikes have sold happened in the past?", "Why it
to customers in Amsterdam in the last four happened?" and "What is likely to happen
months?" several queries are performed on again in the future?”.
large databases and united, in order to To be effective, the process of data mining
provide an answer. For this reason, it is needs "clean" and organized data,
preferred to use a multidimensional model, therefore preferring them to be stored in
the Cube. A cube is consists of as many data warehouses, because they have
32 Business Intelligence overview

already been cleaned, processed and that trends and patterns found,
organized by ETL processes. answer the questions defined in the
According to [7], data mining process is first stage .
divided into the following steps: 6. Launching and updating the model
1. Defining the problem - as in any – in the last step, the model is
Business Intelligence process, first launched in execution so that the
up, is the fully understanding of the users can use it. While in use, it is
problem and the questions need possible that a new set of questions
answers. It is recommended to which need to be answered could
focus on what we want to ask, not occur and the model must be
how we will respond. modified.
2. Preparing data - once the necessary The functions that a system based on
information is well defined, the data mining process performs, are:
data is being prepared. Data mining - Classification - maps data into
algorithms are very intelligent in predefined groups and classes
solving the problem, but will fail if - Clustering - groups similar data
they receive the data in a foreign into clusters
format. As mentioned, it is - Analysis of linkages - discovers
recommended that the data should relationships linking certain
be retrieved from a data data
warehouse. - Finds data that misbehaves
3. Exploring Data - to understand the from the general behavior of
result of the data mining process, the data.
the data must first be covered and A natural question is "why data
studied. This is usually done using mining?". Data mining can be
one of the two methods : successfully used in data analysis
a) Means and extremes: on processes and support of decision
different sets of media making. For example, in a system of
numbers, the means are marketing and management analysis,
calculated and the standard the first elements that are studied, are
deviation from the mean the modes of the origin of data
too. Also, the minimum and (transactions, payments, gift vouchers
maximum values are and so on). The second step is to find
determined. This process clusters of customers who have
helps to understand the data common features (ways of spending
range used for working. the money, income, etc.). Further,
b) Basic statistics: different operations marketing analysis can be
data sets are taken and performed, establishing associations
simple statistical operations and correlations between product sales
are performed. so that predictions based on these
4. Creating the data mining model - assumptions can be made. Also, by
queries are created using DMX clustering and classification, the types
language (Data Mining Extensions) of customers that buy certain products
or using a software wizard. can be determined and may make
5. Exploration and validation of the assumptions about what factors will
model – once the model is build, it attract new customers. [8]
should be explored to see the
trends and patterns that the model A data mining system is generally the
produces and validate it, to ensure architecture shown in Figure 7.
Database Systems Journal vol. V, no. 3/2014 33

User interface

Results evaluation

Data Mining Engine

Preprocessing the data: cleaning, selection, filtering şi


integration

Database Data warehouse Files

Fig 7 The architecture of a data mining system

6. Building a Successful BI solution, ready to be used. But, so


There is no certain recipe or template for far, there hasn’t been invented a
success on how to build a business, as well size that can fit for everyone. Each
as building a model of Business company is complex and has its
Intelligence does not have a classified own terms and objectives.
number of standard steps. This depends 3. Understanding complexity -
very much on the company, market, although it may seem easy on
people involved, and defining the problem. paper, a Business Intelligence
According to [8], there is still a lot of ideas solution is complex and this must
that can help a company to go right into be understood from the beginning,
creating a successful BI. in order to avoid unpleasant
1. Choosing good key performance situations that may arise later.
indicators (KPI - Key Performance 4. Creativity – sometimes it is needed
Indicators) - start from the question to look at things from another
"How do you measure a company's perspective. This is especially
success?". KPIs are metrics and important in BI, where a book
measurements that can indicate cannot ensure success. If a team
whether a company is successful or can come up with new ideas that do
not. It is known that if we started not comply with a pattern defined
with the right KPIs selected, the by someone else, this should be
business will benefit in the long encouraged.
term. 5. Choosing a good team - as a sequel
2. Adjusting recipe - there are plenty to those mentioned in the previous
of tips, or even complete BI paragraph, choosing a team that has
34 Business Intelligence overview

a good grasp of knowledge about convince those skeptical persons of


the data problem, is a step towards the importance of implementation.
to finding the best solution for the 7. Conclusions
company. Organizations today can store and access a
6. Study technologies - Business big piece of information gathered over the
Intelligence is a rapidly evolving last decade. Business Intelligence is the
field, there are numerous books, ability to bring to account the true value of
studies, or blogs about this area. It this information and this is becoming
is encouraged to keep up with new increasingly important. Unfortunately, not
appearances. all companies today have developed this
7. Learning from mistakes - part of the organizational strategy.
sometimes there are pitfalls and Following the steps mentioned, having a
mistakes that cannot be avoided. center of excellence, plus a BI strategy
The key is to remember what based on certain standards and
happened, how it happened, why methodologies chosen correctly,
and how the problem can be companies can reach the top. They can
avoided in the future. This applies recover their investment through different
to any field, but is especially suppliers, they can reduce costs and they
8. important in Business Intelligence, can especially learn how to use the
is avoiding to repeat the error. company's capital.
9. Understanding the company - every As possibilities in business intelligence get
company is different and before to be a top management priority,
starting the development of companies will need more flexible IT
Business Intelligence is necessary solutions that meet the needs of the BI, not
to know in detail how the company only in short term, but also with the
works. How the company is split, company's growth they need more
how decisions are made, how the complex solutions.
employees work, are things that There are three main factors to be taken
must be included in the planning into account when selecting a business
process. intelligence solution:
10. Creation of well-defined phases - - Involvement of business intelligence
opportunities for Business across the enterprise - BI solutions that
Intelligence solutions are endless. make work easier for all employees
Executive management has less must be found, not just for a few, in
patience for huge projects and order to work, partake, understand and
prefers to see small results over interpret the data and information in
time, rather than at the end. the organization.
Therefore, the vision of business - Integration with other systems and
intelligence should be divided into applications - BI solutions must be
several phases. Each phase must sought to integrate well with the rest of
contain a well-defined purpose, the enterprise business systems in
achievable goals and benefits that order to be used in the future.
can be demonstrated to all. - Flexibility of business intelligence -
11. Support from the executive - as when requirements change or when
Business Intelligence solutions new functionality is added, BI
touch several departments of the solutions that mold on such changes
organization, this may be seen with must be researched.
skepticism. An influential person in You don’t have to make the main purpose
the company who understands the of a company to implement Business
importance of the solution can Intelligence, but this should be the way to
Database Systems Journal vol. V, no. 3/2014 35

guide the company towards an ideal of [5] Introducing Students to Business


success. Intelligence:Acceptance and Perceptions
REFERENCES of OLAP Software, Mike Hart, Farhan
[1] Esat, Michael Rocha, and Zaid Khatieb,
http://www.gartner.com/technology/about. Department of Information Systems
jsp University of Cape Town, South Africa
[2] [6] Business Intelligence:
http://www.webopedia.com/TERM/B/Busi Multidimensional Data Analysis, Per
ness_Intelligence.html Westerlund
[3] http://incomemagazine.ro/articole/7- [7] Microsoft's Business Intelligence, Ken
tendin-e-definitorii-pentru-business- Withee
intelligence-in-acest-an# [8] Business Intelligence for Dummies,
[4] Swain Scheps
http://www.cio.com/article/40296/Busines [9] Data Mining & Business Intelligence:
s_Intelligence_Definition_and_Solutions Practical Tools, Motaz K. Saad
36 Business Intelligence overview

Dragoş AGIU - As a person I developed abilities for studying


mathematics and informatics, so as for my education I studied the "CA
Rosetti" High School - mathematics-informatics profile, 2005-2009.
After that I went at the Bucharest Academy of Economic Studies,
Department of Cybernetics, Statistics and Economic Informatics,
Economic Informatics specialization, 2009-2012, Bachelor Degree. To
complete my studies in this range of knowledge I attended also the Academy of Economic
Studies, Department of Cybernetics, Statistics and Economic Informatics the Database
Master, 2012-2014 in Bucharest. For 2 years I’ve been working in a IT company and my
skills in computer science are growing every day.I intend to reach a high level in a IT
company or anywhere I can get the chance. I want to prove to myself that I can give all that is
needed when it comes to involvement in a company. For my hobbies I practice martial arts,
swimming and motorcycle riding.

Vlad MATEESCU - My passion for informatics has started in high


school and it continued after graduation, as I’ve attended The Faculty of
Automatic Control and Computers within the Politehnica University of
Bucharest (between 2008 and 2012) and the Business Support Databases
master program within the University of Economic Studies (between 2012
and 2014). I am currently working at an IT company as a software
developer, cultivating new skills every day, both professionally and
personally. Although my job consists in creating software applications, I am interested in all
the stages of a project and would like to work as a software analyst one day. In my free time, I
enjoy travelling, photography and live music.

Iulia MUNTEAN - The passion I have for IT has started 2 years ago,
during a project I made for school. That’s the moment when I decided I
would love to work in this domain. Since then, I started to learn everything
about programming. I am currently working as a web programmer and I
love this job. Every day is a new day to apply what I learned theoretical in
the past and see it’s working.In my spare time, I like reading historical
books, travel and don’t miss any chance to learn something new.
Database Systems Journal vol. V, no. 3/2014 37

Comparative study on software development methodologies

Mihai Liviu DESPA


mihai.despa@yahoo.com

Bucharest University of Economic Studies

This paper focuses on the current state of knowledge in the field of software development
methodologies. It aims to set the stage for the formalization of a software development
methodology dedicated to innovation orientated IT projects. The paper starts by depicting
specific characteristics in software development project management. Managing software
development projects involves techniques and skills that are proprietary to the IT industry.
Also the software development project manager handles challenges and risks that are
predominantly encountered in business and research areas that involve state of the art
technology. Conventional software development stages are defined and briefly described.
Development stages are the building blocks of any software development methodology so it is
important to properly research this aspect. Current software development methodologies are
presented. Development stages are defined for every showcased methodology. For each
methodology a graphic representation is illustrated in order to better individualize its
structure. Software development methodologies are compared by highlighting strengths and
weaknesses from the stakeholder’s point of view. Conclusions are formulated and a research
direction aimed at formalizing a software development methodology dedicated to innovation
orientated IT projects is enunciated.
Keywords: software development, project management, development methodology

1 Introduction
The management process of a
software development projects follows
development projects are notorious for
frequently changing initial planning and
specifications. What do you believe are the
the basic rules of project management but main reasons for changing specifications
also includes particular features. A during the development stage? Project
software development project manager managers responded on 3 out of the 11
has to deal with challenges and setbacks groups. The 3 groups are: PMO – project
that are proprietary to the IT industry. management office, Agile Project
Also in software development there are Management Group and International
benefits and strong points that help ease Society for Professional Innovation
the burden of management. Management. Based on the information
collected from the LinkedIn discussions and
Software development projects are on the author’s own experience as software
notorious for frequently changing initial development project manager specifications
planning and specifications. In order to change due to the fact that:
identify the main reasons for changing  the project owner identifies new
specification during the development business opportunities and decides to
stage of a software product debates were integrate them into the software
started on LinkedIn project management being developed;
groups. Debates were initiated on 11  due to the technical nature of
LinkedIn groups starting the discussion software development projects there
with the following introduction: Software is a lack of shared understanding of
38 Comparative Study on Software Development Methodologies

expected outcomes; development in order to meet the project


 original planning was based on owner’s requirements and in order
specifications that were coordinate effectively the project team.
misinterpreted by the project
manager or poorly illustrated by Software development projects involve
the project owner; project teams that are made up of highly
 project team is unable to skilled and highly trained individuals with
implement planned functionalities predominantly technical backgrounds.
due to lack of expertise or Highly skilled and highly trained individuals
technological limitations; will require significant compensation for
 the context in which the software their work and that will translate into high
is going to be used changes thus man-hour or man-day rates. So the project
generating the need for the manager is under considerable pressure to
software to change; provide accurate time estimates in terms of
 new technology or software required man-days as every inconsistency
product is launched on the market. will generate significant additional costs.
Also highly skilled individuals don’t
Changing specifications has a negative integrate well in a team as they have a
impact on the project management tendency of being arrogant and self-
process as it reduces predictability and absorbed. The project manager should be
exercises pressure on the budget and able to exploit their ego in the best interest
deadlines. of the project and mitigate disputes and
opinion clashes that occur between team
The software development field is members.
characterized by high dynamics of
technology and standards. Programming Software development projects are often
languages evolve, new frameworks arise implemented by teams that have members
and fall with astonishing speed, user distributed all over the globe. Building
interfaces become more and more diverse software does not require formal face-to-
as software is required to work on a face communication. Task assignment and
larger array of devices. PHP server-side task tracking is done by using online
scripting language registered 16 releases management tools like Pivotal Tracker,
on new and improved versions in 2014 Basecamp or Producteev. Code version
alone [1]. The software development control is ensured by using versioning tools
community widely accepted Phalcon and like SVN or GIT. File sharing is
Laravel, as two of the most powerful accomplished by using tools like Dropbox,
PHP frameworks. Both were released in Google Drive or Box. Online meetings can
2012. The project manager has to keep up take place using Skype video conferences.
with the latest trends in software
Table 1. Software development projects characteristics
Characteristic Positive Impact Negative Impact
jeopardize deadlines
frequently changing results in exceeding the project
- budget
specifications
causes stress and discontent for
the development team
software can become obsolete
generates new
high dynamics of by the time it hits the market
opportunities in terms of
technology and standards software developers have to
design and codding
invest a lot of time in
Database Systems Journal vol. V, no. 3/2014 39

researching new technologies


increases the likelihood of high cost generated by human
skilled workforce achieving innovative resources
results
work can be performed monitoring and control
around the clock becomes more difficult
globally distributed teams
cultural diversity nurtures integrating new code is more
creativity challenging

Table 1 summarizes the impact that behaviour patterns. The project team is
software development characteristics have responsible for evaluating the requirements
on the project management and implicitly from a technical perspective. The project
on the project team. The only team will have to research the frameworks,
characteristic that does not have a positive API’s, libraries, versioning tools and
impact is frequently changing of hosting infrastructure that will be required
specifications. in order to build the software product.

2. Software development stages Planning is the stage where all the


Building a software product is a process elements are set in order to develop the
consisting of several distinct stages. Each software product. Planning starts with
stage has its own deliverables and is bound defining the overall flow of the application.
by a specific time frame. Depending on the Next step is to breakdown the flow into
project, certain stages gain additional smaller, easier to manage subassemblies.
weight in the overall effort to implement For each subassembly a comprehensive set
the software product. of functionalities has to be defined. Based
on the required functionality a database
Research is the stage where the project structure is designed. Taking into account
owner, the project manager and the project the overall flow of the application, the
team gather and exchange information. subassemblies, functionalities and database
The project owner is responsible for structure, the project manager together
formulating requirements and passing them with the project team have to choose the
on to the project manager. In order to technology that will be employed to
properly formulate requirements the develop the application. Also the project
project owner has to first define a set of manager should decide on the best suited
goals. Then he has to envision the way a management methodology and the proper
software product will help him achieve work protocol for the project at hand.
those goals. In the research stage the
project owner will try to find people or Design is the stage where the layout of the
companies with similar goals and application is created. Web applications
document the way those people or and mobile applications tend to grant more
companies acted upon fulfilling their goals. impotence to layout than desktop
The project manager is responsible for applications. Depending on the nature of
receiving the requirements from the project the application designs can range from
owner, evaluating them and passing them rough and functionality driven to complex
to the project team as technical and artistic. An accounting application will
specification. The project manager has to only require basic graphic design but an
be able to evaluate the requirements from online museum will require high end
both a business perspective and a technical design work. In an accounting application
perspective. The project manager has to design has to emphasize and enhance
research market characteristics and user functionality whereas in an online museum
40 Comparative Study on Software Development Methodologies

functionality has to be tailored in order to issue. Design errors are actually


fit the design. The graphic design can inconsistency between what the project
overlap with the planning and with the owner requested and what the project team
programming stage. The graphic design ended up implementing. Design errors
stage is important because it will display to occur in the planning stage, have a
the project owner a preview of the significant impact on the project and are
application before it is actually built. At usually harder to fix. Identifying design
this stage usually the project owner comes errors is considerably more efficient when
up with new requirements that have to be the project owner is involved as he is the
summited to research and planning. one that formulated the application
requirements.
Development is the stage where code is
written and the software application is Setup is the stage where the application is
actually built. The development stage starts installed on the live environment. The
with setting up the development setup stage precedes the actual exploitation
environment and the testing environment. of the software product. The setup entails
The development environment and the test configuring the live environment in terms
environment should be synchronized using of security, hardware and software
always the same protocol. Code is written resources. Back-up procedures are defined
on the development environment and and tested. The actual setup of the software
uploaded on the test environment using the product includes copying the source code,
synchronization protocol. Another importing the database, installing third
important aspect of the development stage party applications if required, installing
is progress monitoring. The project cron-jobs if required and configuring API’s
manager has to determine actual progress if required. Once the application is
and evaluate it against the initial planning. installed it will go through another full
The project manager should constantly testing cycle. When testing is completed
update the project owner on the overall content is added to the application.
progress. When writing code the software
developers should also perform debugging Maintenance is the stage that covers
operation in order to upload clean and bug software development subsequent to the
free updates on the testing environment. application setup and also the stage
Software developers should also comment responsible for ensuring that the
their code so that they can easily decipher application is running within the planned
later or make it easy to understand for parameters. Ensuring that the application is
other developers. running properly is done by monitoring the
firewall, mail, HTTP, FTP, MySQL and
Testing is the stage where programming SSH error logs. Also monitoring traffic
and design errors are identified and fixed. data will provide valuable input on
Programming errors are scenarios were the potential issues that may affect the
application crashes or behave in a way it application’s performance. An important
was not supposed to according to the part of the maintenance stage consists of
designed architecture. Programing errors systematically testing functionalities for
also consist in security or usability issues. errors that were not identified in the testing
If the application is vulnerable to attacks stage or for issues that are not displayed in
and can therefore allow attackers access to the error logs. The maintenance stage also
private data, then that is regarded as a provides the opportunity to add new
programming error. If users have problem features of functionality to the software
with slow response time from the application. Adding new code or changing
application than that also is a programming the old code will have to be submitted to
Database Systems Journal vol. V, no. 3/2014 41

research, planning, programing, testing and their core characteristics. The analysis
setup. includes specifying the scale of the project
the methodology is suited for, the stage
The above mentioned stages are generally project owner feedback is delivered and a
agreed by the software development graphic representation of the methodology.
community as being the cornerstones of For coherence reasons stages defined in the
every software development project. second section of the article are also used
Depending of software development in depicting the graphic representations of
methodology they may be found under the methodologies.
different naming conventions, they may be
overlapping changing order or missing Waterfall is the first methodology
altogether. generally acknowledged as being dedicated
to software development. Its principals are
3. Current software development for the first time described by Winston W.
methodologies Royce even though the actual term
A software development methodology is a waterfall is not used in the article [24]. It
set of rules and guidelines that are used in emphasizes meticulous planning and it
the process of researching, planning, outputs comprehensive documentation.
designing, developing, testing, setup and The Waterfall methodology is linear-
maintaining a software product. The sequential process where every stage starts
methodology also includes core values that only after the previous has been completed.
are upheld by the project team and tools Each stage has its own deliverables. The
used in the planning, development and Waterfall methodology is predictable and
implementation process. This paper values rigorous software planning and
reviews 20 of the most popular software architecture.
development methodologies and highlights

Fig. 1. Waterfall methodology

The project owner’s feedback is received software development projects where


after the software application is completely requirements are clear and detailed
developed and tested. The Waterfall planning can be easily drafted for the entire
methodology is suitable for small scale project.
42 Comparative Study on Software Development Methodologies

Prototyping is a methodology that evolved specifications as it acts as baseline for


out of the need to better define communication between project team and
specifications and it entails building a project owner. The prototype is not meant
demo version of the software product that to be further developed into the actual
includes the critical functionality. Initial software product. Prototypes should be
specifications are defined only to provide built fast and most of the times they
sufficient information to build a prototype. disregard programming best practices [2].
The prototype is used to refine

Fig. 2. Prototyping methodology

The project owner’s feedback is received software application one step at the time in
after the prototype is completed. The the form of an expanding model [3]. Based
Prototyping methodology is suitable for on initial specification a basic model of the
large scale projects where is almost application is built. Unlike the prototype,
impossible to properly define exhaustive the model is not going to be discarded, but
requirements before any actual codding is is instead meant to be extended. After the
performed. Prototyping methodology is model is tested and feedback is received
also suitable for unique or innovative from the project owner specifications are
projects where no previous examples exist. adjusted and the model is extended. The
process is repeated until the model
Iterative and incremental is a becomes a fully functional application that
methodology that relies on building the meets all the project owner’s requirements.

Fig. 3. Iterative and incremental methodology

The project owner’s feedback is received methodology has 4 major phase: planning,
after each iteration is completed. The risk analysis, development and evaluation.
Iterative and incremental methodology Project will fallow each phase multiple
emphasizes design over documentation and times in the above mentioned order until
is suitable for medium and large projects. the software application is ready to be
setup on the live environment. The Spiral
Spiral is a methodology that focuses on methodology emphasizes risk analysis and
identifying objectives and analysing viable always evaluates multiple alternatives
alternatives in the context well documented before proceeding to implementing one.
project constrains [4]. The Spiral
Database Systems Journal vol. V, no. 3/2014 43

Fig. 4. Spiral methodology

The project owner’s feedback is received much faster development and higher-
after the first iteration of the spiral is quality results than those achieved with the
completed. The Spiral methodology is traditional methodologies. It is designed to
suitable for medium and large scale take the maximum advantage of powerful
projects. It has also proven more effective development software [15]. Rapid
in implementing internal projects as application development imposes less
identifying risks proprietary to your own emphasis on planning tasks and more
organization is easier. emphasis on development. Development
cycles are time boxed and multiple cycles
Rapid application development is a can be developed at the same time.
development lifecycle designed to give

Fig. 5. Rapid application development methodology

The project owner’s feedback is received projects with the constraint that projects
after each module is completed. The Rapid have to be broken down into modules.
application development methodology is
suitable for small, medium and large scale Extreme programming breaks the
conventional software development
44 Comparative Study on Software Development Methodologies

process into smaller more manageable and the other is supervising. They change
chunks. Rather than planning, analysing, roles at regular intervals. For reducing the
and designing for the entire project at once, number of errors it relies heavily on unit
extreme programming exploits the testing and developers are required to write
reduction in the cost of changing software tests before writing the actual code. There
to do all of these activities a little at a time, is a collective ownership code policy
throughout the entire software where any developer can change any code
development process [5]. It enforces pair sequence even if it was not written by him.
programming where two developers use The project owner is the one that decides
the same computer. One is writing code the priority of the tasks.

Fig. 6. Extreme programming methodology

Extreme programming methodology V-Model methodology is a software


requires that a represented of the project development process which is an extension
owner is always with the development of the waterfall model [16]. It emphasizes
team in order to have access to continuous thorough testing by pairing each software
and relevant feedback. The Extreme development stage with a matching phase
programming methodology is suitable for of testing.
small, medium and large scale projects.

Fig. 7. V-Model methodology

The project owner’s feedback is received suitable for small and medium scale
in the form of acceptance testing after the projects.
entire application is completed. The V-
Model development methodology is Scrum is a methodology for incrementally
building software in complex
Database Systems Journal vol. V, no. 3/2014 45

environments [6]. Software requirements assessed in daily meetings that are


are formulated and prioritized by the confined to 15 minutes and are known as
product owner and are called stories. All Daily Scrum. Task assignment is not done
the stories make up the Product Backlog. by the project manager or by any other
The Scrum methodology adopted a time individual. Scrum development teams are
box approach where development cycles self-organized and task assignment is a
known as Sprints take no more than 4 process where every team member is
weeks and end with a working version of involved. The team efforts are kept on
the application. All the stories of a sprint track by a Scrum Master.
make up the Sprint’s Backlog. Progress is

Fig. 8. Scrum methodology

The project owner’s feedback is received methodology is to construct software with


at the end of each sprint. The Scrum no defects during development [7].
methodology is suitable for small, medium Cleanroom methodology relies on a box
and large scale projects. structure method to design the software
product. Quality control is performed by
Cleanroom is a methodology that focuses using mathematic models and it also
on defect prevention. The motivation for introduces a statistical approach to testing.
this approach is the fact that defect Developers do not test the code that is the
prevention is much less expensive than testing team’s responsibility.
defect removal. The goal of the Cleanroom

Fig. 9. Cleanroom methodology

The project owner’s feedback is received


at the end of each increment. The Dynamic systems development
Cleanroom methodology is suitable for methodology is focused on developing
small, medium and large scale projects. application systems that truly serve the
46 Comparative Study on Software Development Methodologies

needs of the business [17]. Dynamic standards at the beginning of the project
systems development methodology is an and it sets non-negotiable deadlines.
iterative development model that uses a Testing is done early and continually
timebox approach and MoSCoW task throughout the entire development cycle.
prioritization. It defines strict quality

Fig. 10. Dynamic systems development methodology

Project team and project owner share a the-box roadmaps for different types of
workplace, physical or virtual, in order to software projects and it provides guidance
facilitate efficient feedback at every stage for all aspects of a software project. It does
of the project. The Dynamic systems not require the project team to engage in
development methodology is suitable for any specific activity or produce any
medium and large scale projects. specific artefact. It provides guidelines that
help the project manager tailor the process
Rational unified process methodology if none of the out-of-the-box roadmaps
provides a disciplined approach to software suits the project or organization [8].
development. It comes with several out-of-

Fig. 11. Rational unified process methodology

The project owner’s feedback is received is suitable for small, medium and large
as agreed at the start of the project as the scale projects.
methodology does not enforce a rule on
project team – project owner collaboration. Lean software development is a product
The Rational unified process methodology development paradigm with an end-to-end
Database Systems Journal vol. V, no. 3/2014 47

focus on creating value for the project also insures that team members are
owner and for the end user, eliminating motivated by empowering them to make
waste, optimizing value streams, significant decisions regarding the
empowering people and continuously application. The Lean software
improving [9]. Value is defined as development methodology does not
something that the project owner would enforce a certain process in terms of
pay for. Anything that is not adding value conducting the project. Project manager
is considered waste and has to be and team members are free to use any
discarded. Lean software development process they see fit as long as it stays true
delivers early iterations of working code. It to core lean development principals.

Fig. 12. Lean software development methodology

The project owner’s feedback is central to new functionality. If the tests work then
the Lean software development there is no need to write any code as the
methodology. The Lean software functionality already exists, it was just not
development methodology is suitable for know to the developer. This scenario is
small, medium and large scale projects. often encountered when dealing with
legacy code. If tests do not compile then
Test-driven development is a the developer will write the code and run
methodology developed around unit the tests again. The process is repeated
testing. Before writing any actual code the until all requirements are met [10].
developers write automated test cases for

Fig. 13. Test-driven development methodology

The project owner’s feedback is received projects. It is also recommended for


after the code tested successfully. The scenarios where developers have to work
Test-driven development methodology is with legacy code.
suitable for medium and large scale
48 Comparative Study on Software Development Methodologies

Behaviour-driven development acceptance test scenarios developers will


methodology is developed around implement functionality. When
acceptance testing. The project owner functionality in developed it is tested using
writes the requirements in the form of the same acceptance testing scenarios. If it
acceptance tests using a standard format. passes the tests code is moved on the live
Requirements are defined as user stories environment. The entire process is
and include a title, a narrative part and repeated until all requirements are met.
acceptance criteria [18]. Based on the

Fig. 14. Behaviour-driven development methodology

The project owner’s feedback is received Feature-driven development is a


after the code tested successfully. The methodology focused on actual
Behaviour-driven development functionality. Each feature in the Feature-
methodology is suitable for small, medium driven development methodology reads as
and large scale projects. It is also a requirement which is understandable by
recommended for scenarios where the project owner, it has true business
developers have to work with legacy code. meaning and describes true business value
[11].

Fig. 15. Feature-driven development methodology

The project owner’s feedback is received ways of handling requirements [12]. Based
after the application is already setup but on project owner’s requirements a
constant interaction between development metamodel is define. The metamodel is
team and project owner takes place actually a platform independent model that
throughout the entire length of the project. can be migrated onto any environment.
The Feature-driven development UML is usually used for building the
methodology is suitable for small, medium metamodel. The metamodel is then
and large scale projects. converted into a model that is specific to a
certain platform. Based on the model the
Model-driven engineering is a complex actual code is then generated.
methodology that uses domain models as

Fig. 16. Model-driven development methodology


Database Systems Journal vol. V, no. 3/2014 49

process, are the most important factor in


The project owner’s feedback is received any software project. Crystal Methods is a
after the code tested successfully. The myriad of methodology elements and does
not tackle every project in the same
Model-driven engineering methodology is manner but instead uses custom tailored
suitable for small, medium and large scale processes and tools depending on the
projects. It is also recommended for project’s profile and scale. Large or safety
projects that will have long exploitation critical projects require more methodology
period as metamodels can be easily elements than small non-critical projects.
migrated and adapted to new technologies. With Crystal Methods, organizations only
develop and use as much methodology as
Crystal Methods is a family of their business needs demand [19]. Crystal
methodologies that developed around the uses an iterative approach but does not
theory that people, and not tools or enforce a release with every iteration.

Fig. 17. Crystal Methods methodology

The project owner’s feedback is received requirement determination by involving


after each iteration is finished. The Crystal end users, project owner and project team
Methods methodology is suitable for small, in a series of freely interacting meetings
medium and large scale projects. It has a [13]. End user and project owner are also
different approach depending on the scale heavily involved in the design and
of the project. development stages. Joint Application
Development methodology uses
Joint application development is a prototyping as the basis for the actual
methodology that focuses on system software development.

Fig. 18. Joint application development methodology

The project owner’s feedback is received development methodology is suitable for


at every JAD meeting and after the medium and large scale projects.
prototype is finished. The Joint application
50 Comparative Study on Software Development Methodologies

Adaptive software development is a project’s mission. It is a timeboxed model


methodology that was built as a response that values delivering features and accepts
to an economy that is increasingly changes in all stages of the project.
changing and evolving [14]. Adaptive Adaptive software development responds
software development is based on iterative accepts risks and handles them efficiently.
development and is oriented on the

Fig. 19. Adaptive software development methodology

The project owner’s feedback is received of programmers work on writing,


after each iteration is finished. The debugging and testing software without
Adaptive software development expecting any direct compensation. Most
methodology is suitable for small, medium open source software developers will never
and large scale projects. meet face to face and yet find a way to
collaborate in harmony. Open source
Open source software development is a software development has the advantage of
decentralized methodology with no central comprehensive testing as code is reviewed
authority, project owner, no compensation by a large number of developers and also
for the project team, no accountability and benefits from around the clock work on the
yet with a high success rate [20]. Open project as developers are geographically
source software development defies scattered all around the globe.
traditional economic theory as thousands

Fig. 20. Open source software development methodology

In open source software development there The methodology is suitable for small,
is no project owner to provide feedback. medium and large scale projects.
Database Systems Journal vol. V, no. 3/2014 51

Microsoft Solutions Framework is a both lightweight and heavyweight


deliberate and disciplined approach to implementation so it can be applied in an
technology projects based on a defined set agile manner or using a waterfall approach.
of principles, models, disciplines, It fosters open communication and
concepts, guidelines, and proven practices empowers team members but at the same
from Microsoft [21]. Microsoft Solutions time establishes clear accountability and
Framework methodology has versions for shared responsibility.

Fig. 21. Microsoft Solutions Framework methodology

The project owner’s feedback is received They were not included in the current
after deployment. The Microsoft Solutions comparative study as they are variation of
Framework is suitable for small, medium other methodologies and their
and large scale projects. characteristics were already submitted to
analysis.
Apart from the above mentioned
methodologies a further research of the 4. Strengths and weaknesses
topic might consider analysing the Software development methodologies
following methodologies: follow one of two paths: heavyweight or
 Dual Vee model – a variation of the lightweight. Heavyweight methodologies
V model; are derived from the waterfall model and
 Evolutionary project management – emphasizes detailed planning, exhaustive
forerunner of the current agile specifications and detailed application
methodologies [23]; design. Lightweight methodologies are
 Agile Unified Process – a derived from the Agile model and promote
simplified version of the Rational working software, individuals and
Unified Process; interactions, acceptance of changing
 Essential Unified Process – a requirements and user feedback. The
variation of the Rational Unified Microsoft Solutions Framework includes
Process; variations for both heavyweight and
 Open Unified Process - a variation lightweight philosophies.
of the Rational Unified Process
[22].

Table 2. Software development methodologies characteristics


Methodology Characteristics Strengths Weaknesses
- comprehensive - easy to manage - working code is delivered
documentation - easy to understand for the late in the project
- meticulous planning project owner and team - does not cope well with
Waterfall
- linear-sequential process changing requirements
- each phase has its own - low tolerance for design
deliverables and planning errors
- build one or more demo - accurate identification of - leads to unnecessary
versions of the software application requirements increase of the
product - early feedback from the application’s complexity
Prototyping - project owner is actively project owner - increased programming
involved - improved user experience effort
- prototypes are meant to be - early identification of - costs generated by
discarded missing or redundant building the prototype
52 Comparative Study on Software Development Methodologies

- writing code is valued over functionality


writing specifications
- build an initial model that - continuous feedback -each iteration is a rigid
is meant to be extended in from the project owner structure that resembles a
successive iterations - multiple revisions on the small scale waterfall
Iterative and
- emphasizes design over entire application and on project
incremental
documentation specific functionality
- project owner is actively - working code is delivered
involved early in the project
- focuses on objectives, - working code is delivered - costs generated by risk
alternatives and constrains early in the project handling
- has 4 major phase: - minimizes risk - dependent on accurate
planning, risk analysis, - strong documentation risk analysis
development and evaluation
Spiral
- emphasises risk analysis
- evaluates multiple
alternatives before
proceeding to the planning
stage
- less emphasis on planning - applications are - poor documentation
tasks and more focus on developed fast - high development costs
Rapid application development - code can be easily reused
development - code integration issues
- timebox approach - application has to be
broken into modules
- pair programming - application gets very fast - lack of documentation
- unit testing in the production - developers reluctance to
- fast consecutive releases environment pair programing
- collective ownership - frequent releases of - developers reluctance to
- on-site project owner working code write tests first and code
Extreme
- open workspace - reduced number of bugs later
programming
- project owner decides the - smooth code integration - requires frequent
priority of tasks - continuous feedback meetings
from the project owner - lack of commitment to a
well-defined product leads
to project owner reluctance
- introduces testing at every - low bug rate - vulnerable to scope creep
development stage -easy to understand and - relies heavily on the
V-Model
- highlights the importance use initial set of specifications
of maintenance
- iterative development - deliver products in short - lack of documentation
- timebox approach known cycles - requires experienced
as Sprints - enables fast feedback developers
- daily meetings to assess - rapid adaptation to - hard to estimate at the
progress known as Daily change beginning the overall effort
Scrum required to implement
Scrum
- self organizing large projects; thus cost
development team estimates are not very
- tasks are managed using precise
backlogs; product backlog
and sprint backlog

- iterative development - considerable reduction in - increased development


- box structure method bug rate costs
- using mathematic models - higher quality software - increased time to market
Cleanroom in quality control products for software product
- statistical approach to - requires highly skilled
testing highly experienced
developers
Dynamic systems - iterative development - focusses on addressing - requires large project
development - MoSCoW prioritisation of effectively the business teams at it has multiple
Database Systems Journal vol. V, no. 3/2014 53

method tasks needs roles to cover


- timebox approach - post project - requires very skilled
- non-negotiable deadlines implementation developers
- strict quality standards set performance assessment
at the beginning of the - complete documentation
project - active user involvement
- project team and project
owner share a workplace
(physical or virtual)
- test early and continually

- iterative development - accurate and - requires highly qualified


- prioritize risk handling comprehensive professionals
- adequate business documentation - development process is
modelling - efficient change request complex and poorly
Rational Unified
- change management management organized
Process
- performance testing - efficient integration of
new code
- enables reuse of code and
software components
- iterative development - reduced project time and - project is highly
- discards all components cost by eliminating waste dependable on individual
that do not add value to the - early delivery of working team members
Lean software product code - a team member with
development - amplify learning - motivated project team strong business analysis
- customer focus skills required
- team empowerment
- continuous improvement
- unit testing - less time spent on - tests are focused on
-testing scenarios are debugging syntax and overlook actual
developed before actual - higher quality code functionality
coding - by designing tests the - requires more code than
Test-driven
- repeated short development developer empathizes with most methodologies
development
cycles the user - the developer is actually
- suitable for debugging - less defects get to the end the one doing the testing
legacy code developed with user - writing unit tests
other techniques increases costs
- unit testing - easy to maintain - project owners are
- focuses on business value - usability issues are reluctant to write
Behavior-driven - genuine collaboration discovered early behaviour scenarios
development between business and - reduced defect rate
development - easy to integrate new
code
- iterative development - multiple teams can work - individual code
- application is broken down simultaneously on the ownership
into features project - iterations are not well
- no feature should take - scales well to large teams defined
Feature-driven - good progress tracking
development longer than two weeks to
implement and reporting capabilities
- uses milestones to evaluate - easy to understand and
progress adopt

- uses domain model - high degree of - requires considerable


- models are automatically abstraction technical expertise
transformed into working - increased productivity - documentation is
Model-driven code - delivers products with a readable only by domain
engineering - knowledge is encapsulated high degree of experts
in high level models compatibility and - is difficult to implement
- emphasizes reuse of portability version control on
standardized models - shorter time to market modelling environment
54 Comparative Study on Software Development Methodologies

- lowers maintenance costs


- focusses on people and - easy to implement - critical decisions
skill and not on process - frequent delivery of regarding the architecture
- more than one iteration in a working code of the application are made
Crystal Methods
release - developers have by individuals and not by
Methodology
- different approaches dedicated timeslots to the entire team
depending on the projects reflect on possible code
size and criticality improvements
- emphasises system - accelerates design - relies heavily on the
requirement determination - enhances quality success of the group
- involves the project owner - promotes teamwork with meetings
Joint Application and end user in the design the customer - does not have a
Development and development - creates a design from the documented approach for
- JAD meetings customer's perspective stages that follow system
- prototyping - lowers maintenance costs requirements
determination and design
- iterative development - effective handling of - low risk handling
- focusses on the final goal change and scope creep - uses assumptions and
Adaptive software of the project - easy to understand and predictions
development - feature based implement - lacks tangible
- timeboxed - enables innovation documentation
- risk driven
- iterative development - low costs - low accountability for
- geographically distributed - highly motivated and submitted code
Open source
teams dedicated developers - no central management
software
- collaborative work - comprehensive testing as authority
development
code is reviewed by a large - unstructured approach to
number of developers development
- has versions for both - supports multiple process - difficult to setup and
lightweight and heavyweight approaches configure
implementation - solid risk handling
Microsoft - fosters open policies
Solutions communication - built to respond
Framework - empower team members effectively to change
and establishes clear - reduces team size
accountability and shared
responsibility
degree of flexibility as changes occur
Out of a total of 20 software development often.
methodologies that were analysed 6 were
based on the heavyweight model and 13
were based on the lightweight model. One 5. Conclusions
methodology offered support for both a Software development methodologies
lightweight and a heavyweight approach. follow two major philosophies:
In order to choose the appropriate software heavyweight and lightweight.
development methodology for a project Heavyweight methodologies are suitable
one should consider: project owner profile, for projects where requirements are
developer’s technical expertise, project unlikely to change and the software
complexity, budget and deadlines. An complexity allows for detailed planning.
innovative software development project is Heavyweight methodologies are easy to
difficult to match with one of the existing understand and implement. They provide
development methodologies. Innovative solid documentation and appeal to project
software development projects require owners because they are well structured
writing comprehensive documentation in and showcase tangible deliverables for
order to patent any original output that every stage of the project. With
might result. It also requires a considerable heavyweight methodologies the project
Database Systems Journal vol. V, no. 3/2014 55

manager can easily perform tracking,


evaluation and reporting. The project [1] http://php.net/ChangeLog-5.php
owner is considerably involved only in the [2] J. E. Cooling, T. S. Hughes, “The
research and planning stages. Lightweight emergence of rapid prototyping as a real-
methodologies are suitable for projects time software development tool”,
were specifications are unclear or are Proceedings of the Second International
likely to change due to project internal or Conference on Software Engineering for
external factors. Lightweight Real Time Systems, 18-20 Sep. 1989,
methodologies are based on an incremental Cirencester, UK, Publisher: IET, 1989, pg.
approach were software is delivered in 60-64
multiple consecutive iterations, all of them [3] C. Larman, V. R. Basili, “Iterative and
being functional versions of the Incremental Development: A Brief
application. Lightweight methodologies History”, Computer, vol. 36, no. 6, pg. 47-
provide great flexibility and can easily 56, 2003, doi:10.1109/MC.2003.1204375
adapt to change. They promote early [4] B.W. Boehm, “A spiral model of
delivery of working code, self-organizing software development and enhancement”,
teams and adaptive planning. The project Computer, vol. 21, no. 5, pg. 61-72, 1988,
owner is heavily involved in all the stages doi: 10.1109/2.59
of the project as its input and feedback is [5] K. Beck, “Embracing change with
critical for the success of lightweight extreme programming”, Computer, vol. 32
methodologies. When choosing a software , no.10, pg. 70 – 77, 1999, doi:
development methodology project owner 10.1109/2.796139
profile, developer’s technical expertise, [6] L. Rising, N. S. Janoff, “The Scrum
project complexity, budget and deadlines Software Development Process for Small
must be taken into account. Often no Teams”, IEEE Software, vol. 17, no. 4, pg.
methodology will fit perfectly the profile 26-32, 2000, doi:10.1109/52.854065
of a specific project. Then the best [7] A. Spangler, “Cleanroom software
matching methodology should be used or engineering-plan your work and work your
in the case of experienced project teams plan in small increments”, IEEE
and project managers a combination of Potentials, vol.15, no. 4, pg. 29 – 32, 1996,
methodologies could be introduced. In the doi: 10.1109/45.539962
case of innovative software development [8] G. Pollice, Using the Rational Unified
projects a new methodology is required. Process for Small Projects: Expanding
This topic is a subject for further research Upon eXtreme Programming, Rational
in the software development field. Software White Paper, 2001
[9] C. Ebert, P. Abrahamsson, N. Oza,
6 Acknowledgment “Lean Software Development”, IEEE
This paper was co-financed from the Software, vol. 29, no. 5, pg. 22-25, 2012,
European Social Fund, through the [10] E. M. Maximilien, L. Williams,
Sectorial Operational Programme Human “Assessing test-driven development at
Resources Development 2007-2013, IBM”, Proceedings. 25th International
project number Conference on Software Engineering,
POSDRU/159/1.5/S/138907 "Excellence ICSE 2003, 3-10 May 2003, Portland,
in scientific interdisciplinary research, USA, Publisher: IEEE, 2003, doi:
doctoral and postdoctoral, in the economic, 10.1109/ICSE.2003.1201238, pg.564-569
social and medical fields -EXCELIS", [11] D. J. Anderson, Feature-Driven
coordinator The Bucharest University of Development, Microsoft Corporation, Oct.
Economic Studies. 2004
[12] J. Bezivin, Model Driven
References Engineering: An Emerging Technical
56 Comparative Study on Software Development Methodologies

Space, International Summer School, using natural language processing”,


Summer School on. Generative and Proceedings of the 50th International
Transformational Techniques. in Software Conference on Objects, Models,
Engineering, 4-8 Jul. 2005, Braga, Components, Patterns, TOOLS 2012, 29-
Portugal, Publisher: Springer Berlin 31 May 2012, Prague, Czech Republic,
Heidelberg, pg. 36-64, doi: Publisher: Springer Berlin Heidelberg,
10.1007/11877028_2. 2012, doi: 10.1007/978-3-642-30561-0_19,
[13] E. W. Duggana, C. S. Thachenkaryb, pg. 269-287
“Integrating nominal group technique and [19] J. A. Livermore, “Factors that Impact
joint application development for Implementing an Agile Software
improved systems requirements Development Methodology”, Proceedings
determination”, Information & of IEEE SoutheastCon, SECON 2007, 22-
Management, vol. 41, no. 4, pg. 399–411, 25 Mar. 2007, Richmond, USA, Publisher:
2004, DOI: 10.1016/S0378- IEEE, 2007, doi:
7206(03)00080-6 10.1109/SECON.2007.342860, pg.82-86
[20] G. Madey V. Freeh R. Tynan, “The
[14] D. Riehle, “A comparison of the value open source software development
systems of Adaptive Software phenomenon an analysis based on social
Development and Extreme Programming: network theory”, Proceedings of the 8th
How methodologies may learn from each Americas Conference on Information
other”, Proceedings of the First Systems, AMCIS, 9-11 Aug. 2002, Dallas,
International Conference on Extreme USA, 2002, pg. 1806-1813.
Programming and Flexible Processes in [21] G. Lory, D. Campbell, A. Robin, G.
Software Engineering, XP 2000, 21-23 Simmons, P. Rytkonen, Microsoft
Jun. 2000, Cagliari, Italy, pg. 35-50 Solutions Framework version 3.0
[15] J. Martin, Rapid application Overview, White Paper, June 2003.
development, Publisher: Macmillan [22] R. Balduino, Introduction to OpenUP
Publishing, 1991, pg. 788, ISBN:0-02- (Open Unified Process)
376775-8 http://www.eclipse.org/epf/general/OpenU
[16] S. Mathur, S. Malik, “Advancements P.pdf
in the V-Model”, International Journal of [23] S. Woodward, “Evolutionary project
Computer Applications, vol. 1, no. 12, pg. management”, Computer, vol. 32, no. 10,
29-34, 2010, doi: 10.5120/266-425 pg. 49-57, 1999, doi: 10.1109/2.796109
[17] J. Stapleton, P. Constable, DSDM, [24] W. W. Royce, “Managing the
dynamic systems development method: the development of large software systems:
method in practice, Publisher: Cambridge Concepts and techniques”, IEEE
University Press, 1997, pg. 192, ISBN-10: WESCON, vol. 26, no. 8, pg. 1-9, 1970
0201178893
[18] M. Soeken, R. Wille, R. Drechsler,
“Assisted behavior driven development

Mihai Liviu DESPA has graduated the Faculty of Cybernetics, Statistics


and Economic Informatics from the Bucharest Academy of Economic
Studies in 2008. He has graduated a Master’s Program in Project
Management at the Faculty of Management from the Bucharest Academy
of Economic Studies in 2010. He is a PHD Student at the Economic
Informatics PHD School and he is currently Project Manager at GDM Webmedia SRL. His
main field of interest is project management for software development.
Database Systems Journal vol. V, no. 3/2014 57

Social BI. Trends and development needs

Corneliu Mihai PREDA


Academy of Economic Studies, Bucharest
mihaicpreda@gmail.com

This paper tries to show that the need to implement BI solutions that integrate social networks
data is increasingly more acute. Companies need to understand that efficient decision making
has to take into account the importance that social medial has nowadays. In addition, the
software developing companies have to keep up to date with new trends generated within
these communication channels and adapt their solutions to this dynamic environment.
Keywords: Business Intelligence, Social Network, Social BI, Market Trends

1 Introductory notes
Social BI is a management technique
that integrates the sharing of social
resources in order to improve
Currently, through the contribution of
social networks to the scope of BI, the
focus is increasing on their ability to bring
a lot of customers for small companies that
existing company projects, products or are producing for niche markets. In
business processes. This type of BI is addition to this, they provide analysis of
usually handled by software products that trends and statistics with greater relevance
offer the possibility to analyze social to the company and the end customer that
media indicators while, at the same time, is involved in creating the products for
managing projects like a traditional which he is the target of. The next figure
solution. Also, the main characteristic of allows us to see the features outlined by
this type of BI is that it welcomes the social media networks into the BI domain
involvement of the client from the starting [2].
point of business processes and long time As networks and internet applications are
before the product development end with becoming increasingly complex, given
its market release. increasingly more detailed data can be
Social BI is seen as an ideal way to take collected about users/consumers and sent
full advantage of human capital, both to companies operating in narrow markets,
inside and outside the company. From the which they can use for commercial
employee’s point of view, this kind of purposes (e.g. loyalty campaigns,
application offers transparent information promotions, etc.).
distribution across project assignments, no At the same time, user interaction can lead
matter if they are involved or would like to to favorable information dissemination
be involved in these projects. Also, it about the company which can become
creates a widely accepted environment for major waves, this being among the most
merit acknowledgement. On the other effective ways for market advertisement
hand, outside stakeholders (mainly clients) today. Also, this information can turn into
receive the means to actively give reliable sources for consumers who want to
feedback on product/services development, document about a specific product that
ensuring themselves that they want to purchase. Propagation speed
they will have a higher level of satisfaction of this medium is constantly increasing,
of their needs [5. actively using friend networks and
acquaintances of users.
58 Social BI. Trends and development needs

Fig. 1. BI – Social media

Companies need to recognize the need for to develop into a significant source of
virtual existence in the social environment personal data at the individual, opinion,
because the potential for loyalty and sales relationship and interests level. As a result,
increase is considerable. Even if the the lasting parallel existence of these two
purpose is just notifying the end user that concepts, led inevitably to the compund
the company exists is huge and at the same concept of Social BI.
time, the effects are measurable and the Social BI, conceptually wants to extract
measurements reliable. relevant information that would lead to
The fact that both companies that are effective decisions, all based on data from
trying to sell products or services and also social media. Also, it seeks the
social networks have as a common goal the participation in the design of BI solutions
meeting of the needs of the same that contain tools for analysis and
customer/end user, will lead over time to measurement of social metrics. In the
the strengthening of relations of present context, in which more companies
interdependence. This will actively be succeed in implementing BI systems, the
reflected in new products for BI techniques addition of this new component leads to
and processes. These tools will provide the accentuated competition and a major
link that will connect the customer to the benefit for the end customer [2].
companies that serve him and bring
increased efficiency and competitive 2 BI Interaction Management
advantage for businesses that will invest in
these solutions. Management Systems of social networking
BI areas of influence and social media as an integral part of BI, is a group of
were, since their appearance, the subject of applications or methods used to manage
continuous development and research. and follow business processes in a
While BI aims to help decision-making at distributed environment. These systems
company level by providing reports can be manually or automatically
containing relevant data, social media aims maintained and they allows managers to
track, aggregate, publish and take part in
Database Systems Journal vol. V, no. 3/2014 59

many social media channels using a single able to adjust or program the
software product.[4] occurrence of one or more messages;
The operating processes have three main  Ability to manage social data. The
features: system allows managers to view
 Connection to social media aggregated data and have various
channels; analytical reports with different levels
 Ability to quickly publish from of complexity and detail.
anywhere inside these channels being

Fig. 2. BI interactions

Each of these basic functions can be reach the end user. Interacting with social
extended to a very high level of detail. media, CRM uses information provided by
Social media platforms are proving to be it in order to publish messages that should
an important component of any marketing reach both current customers as well as
strategy. At the same time, marketing has prospective customers. By integrating
come to use many of the BI tools to reach these two components, potential customers
for the target audience, often extracting will know what to expect if they choose a
data directly from the internal business product provided by a certain company.
processes. Consequently, we see the need Inside the social media component, gaining
for marketing to track customer response credibility is a matter not only of content,
in real time. but also of a good social interaction. A
CRM (customer relationship management) common mistake is to approach social
focuses on providing services, loyalty media as platforms to send some mass e-
techniques and sales analysis that address mails and not as some communication
actual client needs in a more direct way processes. They should be considered more
that before on what they consider to be the like publishing tools through participation.
main issues of company-customer As mentioned previously, CRM integration
relationship (prices, warranties, shipping, with social media result in a long term
etc.). marketing strategy. The problem here is
CRM is fundamentally different from the huge amount of data coming through
social media components in that it is much these two channels. This is the time when
less immediate. This is the mechanism BI techniques must intervene in order to
through which contracts and products interpret the data into relevant reports
60 Social BI. Trends and development needs

available for the decision-making factor. Oracle and IBM which provide huge
What BI cannot do, however, is improve processing capabilities and complex
the quality of input data. External analysis.[3]
consultants are needed to standardize the
data stream so that it becomes valuable for 3 Social media dynamics and associated
the company. The existence of these metrics
expensive consultants does not preclude
Regarding the most important changes that
the use of free, but very simple services
(e.g. the number of brand occurrence over social media has been through since its
a given period of time). Top market share appearance, we can summarize the
service providers in this area are SAP, following in Table 1.[2]

Table 1. Evolution of social media


Beginning of social media Present social media
Social Media Guru: the man who has Social Media Analytics, BI. Given that data
numerous communication strategies and is an are widely available through various services
expert in creating accounts. A very short and interfaces is difficult to shape and
phase in which sales people tried to monitor communication strategies.
artificially increase their level of apparent
technical training.
Web Tools for distribution to multiple social Distribution coupled with analytics. These
media sites. Useful for a while, but many have as a consequence products that include
sites offer plug-ins that allow postings. embedded analytics service.
Web Analytics. Although web analytics are Social Media Analytics. These scan
still in use they do not provide information on comments about a particular brand or product
visitor opinions. not only to refrain opinions, but also to track
trends.
Many social sites. The area of social YouTube, Twitter, LinkedIn, Facebook.
networks was more diverse in the past. The Strongest sites have persisted and currently
success of a website is not determined by have the most numerous users. Although
technology, but by its users. there are possible failures or hindered daily
traffic, users’ loyalty is unquestionable.
Spam and numerous users. Earlier Facebook Target audience. Since then, marketing is
or Twitter users prided themselves with many trying to attract audiences using content
friends and large volumes of messages sent. relevant to them, and by using means of
increasing ingenuity.
Public image. Having access to large Content. People are now more motivated to
quantities of information there was a need for seek customized content.
more detailed material.
Quick and impulsive sales. The motto "buy Greeting the client. Consumers expect to be
fast" or "limited stock" does not apply online. greeted with a concept before the acquisition
It is easy for users to search for similar items. that will integrate him in a community.
Unmanageable data. Previously, the data Big Data. Today, data is formatted and
came in a large variety of formats and there available for distribution via various
were very few methods of integration. interfaces. There are also many instruments
of conversion from one data format to
another.
Database Systems Journal vol. V, no. 3/2014 61

Regarding the most important social media this can be achieved by the same methods
metrics that companies have to be of filtering online content.
responsive to, consultants identify the
following:
4 Development possibilities
1. Monitoring views of a target Characteristics of social media together
demographic in a certain geographical area with design elements of BI currently
by filtering used keywords in natural contribute as a foundation for any social BI
language to identify opinions of strategy. As areas of development for the
satisfaction or negative feelings associated above mentioned tools, we mention the
with a product or company. This process following:
can prevent a viral explosion of negative
opinions. If negative opinions are needed Users and customers: the inclusion of
direct intervention is needed to solve valid social media related set of data available
issues of the most influential individuals. for analysis can attract new BI users within
2. Identifying cases where negative an organization. The importance attached
opinions could turn into a PR crisis is to social interaction between users is an
achieved through effective monitoring of example of the new design requirements of
social media to identify key individuals BI. Also, if the BI system allows and
before the problem gets proportions. encourages interaction with customers, this
3. Identify individuals, blogs and web will generate new technical needs.
applications that are most influential to a Products and services: along with the
brand should lead to discussions on topics availability of data from social media, new
with which they have shown interest. products and services within the BI can be
4. Real time tracking of the evolution offered for use. Inside this context,
of views and opinions, especially about development directions include the
new products and services leads to discovery of methods to integrate
shortening the time between the signaling traditional BI results and analysis related to
of the problem and its solution. the structure of social networks or opinions
5. Measuring the link between trends. In addition, the limitations
marketing efforts and customer response regarding the quality and security of data,
at certain times of day and divided into will constantly update the need for
geographical areas leads to the appropriate tools.
customizing level that marketing Processes: some BI processes should be
campaigns need to be more efficient. adapted for the integration of social media
6. Determining the type of media and functionality. However, the research and
platforms that have the most success development requirements are relatively
allows the company to focus its efforts. For low in this case, contrary to the situation in
example, a company may divert resources which the BI system is used directly in the
to video campaigns because this means of interaction with the clients’ calls via live
communication has proved most effective channels.
in attracting clients. Data: nearly all of the social media
7. Monitoring competitors must be impacts data management architecture.
done using all available means to redeploy There will be a need for a broad spectrum
their tested business formula into the approach to BI tools if they are to process
company’s own processes. both traditional transactional data and data
8. Documenting the industry sector in from social media. Social data type is
which the company operates leads to rapid extremely dynamic and therefore cannot
identification of growth opportunities and guarantee an accurate mapping over a long
62 Social BI. Trends and development needs

period of time. There is a need for BI in particular the performance of the


continuous information for BI solution organization. [1]
developers on use cases that may arise. In
terms of quality and/or legality of data
5 Conclusions
sources, at first glance, social media
provides individualized, but non-standard The importance and practical application
and poor quality data. Therefore, the need of BI solutions that integrate ways to
to develop methods of aggregation will measure and analyze data from sources in
lead to insight into the trends observed at the field of social networking, records an
the population level. Also, the Wikipedia upward trend. As a result, we see an
example convinces us that making increase of awareness regarding the
common and mutual control over user- importance of these tools from both
generated content can lead to a continuous companies that are competing in different
increase in data quality. markets and also from the developers
IT&C: huge data volumes and their involved in research and development.[4]
frequent updates involved in social media We conclude that the domain of "Social
have a greater and greater impact on BI", although recently emerged as a
IT&C. These unstructured data require concept, is one of great growth potential,
specific software to be processed. representing the future of any effective
Currently the most popular management marketing campaign.
concept for this is the "Big Data", but this
is a domain anticipated to give rise to References
further developments.
Techniques: similar characteristics [1] Dinter, Barbara; Lorenz, Anja “Social
between different social environments on Business Intelligence: a Literature Review
the high volume of data, unstructured data, and Research Agenda”, 33rd ICIS 2012
quickly updated content, result in a major (International Conference on Information
need for research on analysis techniques. Systems)
Unlike IT&C, which requires de novo [2]”Social Media and Business
development of software products, social Intelligence” First Edition, CIO
BI needs to improve the current, widely Whitepapers, Ventus Publishing ApS 2012
used ETL process. [3] “Social Media and Business
Governance: social media and BI systems Intelligence:Creating the
can generate the emergence of new roles IntegratedCustomer Hub”, Oracle White
and responsibilities within organizations. It Paper, September 2012
can identify a clear impact on [4] “Unlock the value of social media data
organizational rules and define the – Social Intelligence”, Viewpoint Paper,
compliance with those rules. In parallel Hewlett-Packard Development Company,
with the use or distribution of information October 2013
through these social channels, problems [5] http://www.techopedia.com/
related to the legal aspects of global
interactions may pose a few problems.
Strategy: BI strategy, as part of the
company strategy, aims to provide support
for optimal decisions on the market.
Effective use of social data provides many
ways to contribute to the overall objectives
such as customer satisfaction. One
direction of research and development
could be measuring the contribution social
Database Systems Journal vol. V, no. 3/2014 63

Corneliu Mihai PREDA graduated from the Faculty of Finance,


Insurance, Banking and Stock Markets of the Bucharest University of
Economic Studies in 2008 and the Capital Markets Master’s program in
2010 with a paper called „Testing the informational efficiency of stock
markets”. He is currently a final year student at the Database Business
Support Master’s program at the Faculty of Cybernetics, Statistics and
Economic Informatics. His past working experience include Business
Planning and Financial Analisys and also SAP/ABAP Technical
Development.

You might also like