
2008 International Conference on Computer Science and Software Engineering

Study and Realization of Supermarket BI System Based on Data Warehouse and Web Technique

XUE Hong, LIU Zai-wen, MENG Hai-yang


College of Information Engineering, Beijing Technology and Business University,
Beijing 100048, China
Email: hongxue6@yahoo.com.cn

Abstract

A business intelligence (BI) system with a browser/server (B/S) framework was developed from supermarket sales data. The data import and export service tool of Microsoft SQL Server 2005 was selected as the ETL tool. Through extraction, transformation, cleansing and loading, the mass data needed for analysis and decision-making were separated from the transactional database, and a data warehouse oriented to the supermarket sales theme was established in the Microsoft Analysis Services development environment. An on-line analytical processing (OLAP) system was designed using web techniques, through which decision-making knowledge is presented on web interfaces. Using a clustering analysis data mining algorithm, customers were classified by assessing their purchasing behavior and consumption habits, and a different sales promotion method was adopted for each class. The business intelligence system makes the fullest use of the data in the transactional databases, transforming them into information and further into knowledge. The system provides effective support for managers' decision-making in supermarket sales.

1. Introduction

The level of information management in Chinese corporations is not high at present. Data are usually not transformed into information, and decision-making management systems are not adopted, so historical data are used inefficiently in most corporations. In this paper, a supermarket business intelligence system based on data warehouse and Web techniques is studied using the key techniques of business intelligence, data warehousing, OLAP, data mining, artificial intelligence and the Web. It provides effective support for managers' decision-making in supermarket sales.

2. Design of data warehouse

Because the data warehouse is the foundation of the whole supermarket business intelligence system, its structural design plays an important part in system implementation. The data import and export service tool of Microsoft SQL Server 2005 was selected as the ETL tool. Through extraction, transformation, cleansing and loading, the mass data needed for analysis and decision-making were separated from the transactional database, and a data warehouse oriented to the supermarket sales theme was established in the Microsoft Analysis Services development environment.

2.1. Design of system star model

A star model was adopted in the system to reduce the scanning time of the fact table and improve query performance, as shown in Figure 1. The sales fact table is at the center, and every dimension table is linked to the fact table by its primary key. The levels of each dimension were divided according to the system requirements [1].

2.2. Design of fact table, dimension tables and metadata models

The fact table is situated at the center of the multidimensional data model and is the biggest table in the supermarket data warehouse. It records a large amount of detailed information about basic operations, including the primary keys of the related dimension tables and the measure values. The structure of the sales fact table is shown in Table 1. Customer, sales promotion activity, product, time and store dimension tables were established according to the functional design of the system.
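As an illustration of the star model described above, the following minimal sketch builds a reduced version of it in an in-memory SQLite database and runs one aggregate query across two dimensions. The column names follow the paper's fact table and dimension hierarchies; the use of SQLite instead of SQL Server, the table names `time_dim`, `product_dim` and `sales_fact`, the restriction to two dimension tables, and the sample rows are all assumptions made only for this example.

```python
import sqlite3


def build_star_schema(conn):
    """Create a reduced star schema: one fact table and two dimension tables."""
    conn.executescript("""
        CREATE TABLE time_dim (
            TimeId INTEGER PRIMARY KEY, Year INTEGER, Quarter INTEGER, Month INTEGER);
        CREATE TABLE product_dim (
            ProductId INTEGER PRIMARY KEY, Category TEXT, ProductName TEXT);
        CREATE TABLE sales_fact (
            TimeId INTEGER REFERENCES time_dim(TimeId),
            ProductId INTEGER REFERENCES product_dim(ProductId),
            CustomerId INTEGER, StoreId INTEGER, PromotionId INTEGER,
            Sales REAL, Costs REAL, Profits REAL, Units REAL);
    """)


def load_sample_rows(conn):
    """Insert a few made-up rows (hypothetical data, not from the paper)."""
    conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?)",
                     [(1, 2007, 1, 1), (2, 2007, 2, 4)])
    conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                     [(10, "Grocery", "Rice"), (11, "Cosmetics", "Lipstick")])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
                     [(1, 10, 100, 1, 1, 50.0, 30.0, 20.0, 5),
                      (2, 10, 101, 1, 1, 70.0, 40.0, 30.0, 7),
                      (2, 11, 100, 1, 2, 200.0, 120.0, 80.0, 2)])


def profit_by_year_and_category(conn):
    """Join the fact table to its dimensions and aggregate the measures."""
    query = """
        SELECT t.Year, p.Category, SUM(f.Sales), SUM(f.Profits)
        FROM sales_fact AS f
        JOIN time_dim AS t ON f.TimeId = t.TimeId
        JOIN product_dim AS p ON f.ProductId = p.ProductId
        GROUP BY t.Year, p.Category
        ORDER BY t.Year, p.Category
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
for row in profit_by_year_and_category(conn):
    print(row)
```

Each dimension is reached from the fact table by a single join on its key, which is the property the star model relies on to keep analytical queries simple.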
﹡ This paper was supported by the Education Committee
Foundation of Beijing (200775) and (KM200610011006).

978-0-7695-3336-0/08 $25.00 © 2008 IEEE
DOI 10.1109/CSSE.2008.877

Authorized licensed use limited to: GOVERNMENT COLLEGE OF TECHNOLOGY. Downloaded on February 27, 2009 at 09:29 from IEEE Xplore. Restrictions apply.
Sales fact table (center): TimeId, ProductId, CustomerId, StoreId, PromotionId, Sales, Costs, Profits, Units
Time dimension (*TimeId): Year > Quarter > Month
Customer dimension (*CustomerId): Country > State > City > CustomerName
Product dimension (*ProductId): Family > Department > Category > Subcategory > BrandName > ProductName
Store dimension (*StoreId): Country > State > City > StoreName
Promotion dimension (*PromotionId): PromotionName

Figure 1. Structure of system star model

Table 1. Structure of sale fact table

Field name     Data type    Explanation
TimeId         Int(4)       Time code
ProductId      Int(4)       Product code
CustomerId     Int(4)       Customer code
StoreId        Int(4)       Store code
PromotionId    Int(4)       Sales promotion activity code
Sales          Money(8)     Sales
Costs          Money(8)     Sales cost
Profits        Money(8)     Sales profit
Units          float(8)     Sales quantity

Metadata describe the attributes of an object and all information relating to the object. The metadata of an object were saved in the repository when the object was established in the Analysis Services development environment. The supermarket data warehouse mainly included the theme metadata model, fact metadata model, dimension metadata model and data member metadata model. The theme metadata model is shown in Table 2.

Table 2. Metadata table of sale theme

Name                  Sales
Description           Merchandise sales data recorded by every POS machine
Purpose               Analyzing the sales status and sales promotion of the supermarket
Contact               Sales manager of every store
Dimension             Time, product, customer, store and sales promotion
Fact                  Sale fact table
Measurement values    Sales cost, Sales, Sales profit, Sales quantity

3. Design of on-line analytical processing system

3.1. Establishment of multidimensional data cube

The design work of the on-line analytical processing system is mainly to establish the multidimensional data cube [2]. The data were preprocessed by transformation and cleansing and then loaded into the SQL database. The system used the wizard provided by Analysis Manager to establish the multidimensional data cube of the data warehouse and connect it to the data source.

In the system, the MOLAP data storage type was chosen and aggregations were established in the Set Aggregation Options panel. To set appropriate aggregation options, storage space was weighed against performance. The data can be observed from any angle through the "browsing data" menu option of the system, as shown in Figure 2.

Figure 2. Browse of system multidimensional data

3.2. On-line analytical processing

The OLAP operations on the multidimensional data cube were realized using the Cube Browser control in Visual Basic. A level of a dimension is expanded or collapsed by double-clicking the (+) sign in the table. Dragging a dimension or a measure from the upper part of the window into the grid below it adds a dimension or replaces a displayed dimension. Different combinations of dimensions and

measures are defined as slices of the multidimensional data cube. The data of the cube can be observed from different angles when the position of any dimension is changed, which realizes rotation. Drilling is a process in which the analysis server returns data combined according to the structure of the multidimensional data cube. When any cell is right-clicked and drilling is chosen, the system returns the data that produced the aggregation in the form of a simple recordset, as shown in Figure 3.

Figure 3. Drilling operation of system multidimensional data

3.3. Design of front-end presentation system based on B/S pattern

In the OLAP front-end presentation system based on the B/S pattern, the user sends a request from the Web browser, which is linked to the Web server by HTTP. The Web server forwards the request to the application server, which is linked to the data warehouse. The data warehouse returns the requested data to the application server. The required results are obtained by the analysis program on the application server. Finally, the data are sent to the Web browser by the Web server. The data warehouse based on Web techniques can span departments, regions and companies, and can be queried by external and internal users using Intranet techniques. Users have different access permissions.

VB.Net was adopted as the development tool of the system. The interface between the Web and the data warehouse and the front-end presentation system based on the B/S pattern were established using ASP, ADO MD and MDX, so that the data warehouse can be queried from a Web browser. The code of the interface is as follows:

    ' Linking OLAP server: create an ADO MD cellset and point it at the cube database
    Set mycst = Server.CreateObject("ADOMD.Cellset")
    mycst.ActiveConnection = "Data Source=olapservername; Provider=MSOLAP; Initial Catalog=databasename"

4. Clustering analysis of merchandise sales promotion

Supermarket managers are concerned with obtaining the greatest profit from the point of view of market sales, which is achieved by strengthening the management of all kinds of merchandise and attracting the most customers in a fiercely competitive price war. In management work, the emphasis of supermarket sales analysis is on applying the proper sales promotion strategy to the appropriate customers in order to increase the sales profit of the supermarket. In this paper, using a clustering analysis data mining algorithm, customers were classified by assessing their purchasing behavior and consumption habits, and a different sales promotion method was adopted for each class.

4.1. Determining the clustering data

Determining the clustering data is the first step in establishing the clustering model. Choosing the basic variables is very important for a good classification. Fourteen kinds of clustering data were chosen in the system: CustomerID, Total_sales, Total_revenues, Total_items, Rs_Advertisement, Rs_MemberCard, Rs_Discount, Rs_Profits, Marital_status, Yearly_income, Gender, Education, Age and Own_child.

4.2. Determining the number of classes

The basic variables for classifying customers were product data, because customers were divided into groups by their degree of interest in one kind of product or in many kinds of products. They were divided into five types in the system.

4.3. Dynamic clustering analysis algorithm of merchandise sales promotion

Because the system involves a large quantity of data, the paper adopted the data mining algorithm of dynamic clustering analysis. Its advantages are that the computational workload is small, little computer memory is occupied and the method is simple.
The dynamic clustering analysis algorithm is as follows:

• The original data are standardized. The original n × p matrix is translated into an n × n matrix.
• The samples are initially classified by choosing a predetermined number of agglomeration (seed) points. If the initial number of classes is K and x_{ij} denotes the standardized value of index j of sample i, let

$$ SUM(i) = \sum_{j=1}^{m} x_{ij}, $$

$$ MA = \max_i SUM(i), \qquad MI = \min_i SUM(i). $$

All samples are classified into K types. For each sample x_i, the following value is calculated:

$$ \frac{(K-1)\,(SUM(i) - MI)}{MA - MI} + 1. $$

If the integer nearest to this value is k, sample x_i is assigned to type k, where 1 ≤ k ≤ K.
• The centre of gravity of every type is calculated and regarded as a new agglomeration point. The distance between every sample and each new agglomeration point is calculated, and the sample is then assigned to the type to which the closest agglomeration point belongs. If the calculated centres of gravity are exactly the same as the old agglomeration points, the calculation has converged and the classification function reaches a fixed value; the classification is then complete. Otherwise, the calculation of step three is repeated.

4.4. Analyzing the results classified by the clustering analysis algorithm

Customers of the first type account for a very large proportion. Their consumption concentrates mostly on groceries and household goods. They are named the mass type. The sales promotion method is mostly advertising.

Customers of the second type have a very strong interest in purchasing groceries, baby goods and household goods; female customers with children account for the larger proportion. They are named the family type. The sales promotion method is mostly discounting.

Customers of the third type are mostly women who like window-shopping. They are named the diversified type. The sales promotion methods are advertising and member-card merchandise.

Customers of the fourth type are interested in groceries. They are mostly young students and single people. Their earnings are not high. The total amount of merchandise these customers buy is average, but the profit obtained is below the average level. The sales promotion method is mostly discounting.

Customers of the fifth type are very distinctive. Their purchases concentrate mostly on high-consumption goods, for example mobile telephones, high-grade cosmetics and CDs. Their earnings are high. Although the total amount of merchandise these customers buy is small, the profit obtained is higher. The sales promotion method is mostly member-card merchandise.

5. Conclusion

The business intelligence system based on data warehouse and web techniques was developed from supermarket sales data. Advanced business intelligence techniques were applied to business management, bringing intelligent management ideas and optimization functions. By fully mining existing data resources, supermarket users can capture, analyze and communicate information and discover many previously unrecognized data relationships, which helps supermarket managers make better business decisions.

6. References

[1] Cunningham C, Song IY, and Chen PP, Data warehouse design to support customer relationship management analyses[J], Journal of Database Management, 2006, 17(2), pp.62-84.

[2] Alfredo Cuzzocrea, Domenico Sacca, and Paolo Serafino, Semantics-aware advanced OLAP visualization of multidimensional data cubes[J], International Journal of Data Warehousing and Mining, 2007, 3(4), pp.1-30.

[3] Horng-Jinh Chang, Lun-Ping Hung, and Chia-Ling Ho, An anticipation model of potential customers' purchasing behavior based on clustering analysis and association rules analysis[J], Expert Systems with Applications, 2007, 32(3), pp.753-764.
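The dynamic clustering procedure of Section 4.3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the input is taken to be already standardized, distances are Euclidean (the paper does not name a distance), "nearest integer" rounds halves upward, a type that receives no samples is simply dropped, and the function names `initial_labels`, `centroids` and `dynamic_clustering` are hypothetical.

```python
import math


def initial_labels(samples, K):
    """Initial classification: scale each row sum SUM(i) linearly into [1, K]."""
    sums = [sum(x) for x in samples]
    ma, mi = max(sums), min(sums)
    if ma == mi:                      # all row sums equal: put everything in type 1
        return [1] * len(samples)
    labels = []
    for s in sums:
        value = (K - 1) * (s - mi) / (ma - mi) + 1
        labels.append(int(math.floor(value + 0.5)))   # assumed rounding rule
    return labels


def centroids(samples, labels, K):
    """Centre of gravity of every non-empty type, keyed by type number."""
    result = {}
    for k in range(1, K + 1):
        members = [x for x, lab in zip(samples, labels) if lab == k]
        if members:                   # assumption: empty types are dropped
            result[k] = [sum(col) / len(members) for col in zip(*members)]
    return result


def dynamic_clustering(samples, K, max_iter=100):
    """Reassign samples to the closest centre until the centres stop changing."""
    labels = initial_labels(samples, K)
    centers = centroids(samples, labels, K)
    for _ in range(max_iter):
        labels = [min(centers, key=lambda k: math.dist(x, centers[k]))
                  for x in samples]
        new_centers = centroids(samples, labels, K)
        if new_centers == centers:    # converged: centres equal the old seed points
            break
        centers = new_centers
    return labels, centers


# Made-up standardized samples with two indices each (hypothetical data).
samples = [[0.0, 0.0], [0.5, 0.5], [8.0, 8.0], [8.5, 7.5]]
labels, centers = dynamic_clustering(samples, K=2)
print(labels, centers)
```

The `max_iter` bound is a safeguard added for the sketch; the procedure in the paper simply repeats step three until the centres of gravity no longer change.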
