A Project-Oriented Data Warehouse For Construction: Thammasak Rujirayanyong, Jonathan J. Shi

Automation in Construction 15 (2006) 800–807
www.elsevier.com/locate/autcon
A project-oriented data warehouse for construction

Thammasak Rujirayanyong a, Jonathan J. Shi b
a Department of Civil Engineering, Rangsit University, Patum-Thani, Thailand
b Department of Civil and Architectural Engineering, Illinois Institute of Technology, Chicago, IL, USA
Accepted 16 November 2005
Abstract
A construction organization generates a large amount of operational data that are distributed across various functional systems to support its
daily operations. Although those data may be useful for future projects, they are not widely collected and centrally stored in the
organization. This research presents a Project-oriented Data Warehouse (PDW) for contractors. PDW is designed with dimensional data models
consisting of 26 tables. Sixteen of the tables are dimension tables for storing general descriptive information, and the other ten are fact tables for
detailing various facts that are captured in the lifecycle of construction projects. PDW can be directly populated with data from existing
operational systems, such as P3 files, MS Access, P3/e databases, and Excel files. It maintains each data item in the context of its associated project so
that a user can retrieve a specific piece of information plus any background information on the related project. PDW has been populated with data
from three sample projects. Through the user interface, a user can generate query reports of interest as needed. The presented warehouse structure and data
models are scalable. They may be adopted by medium or large contractors for developing company-level data facilities.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Database; Information system; Decision support system; Construction; Data warehousing
1. Introduction
Every day, organizations large and small create billions of bytes
of data about various aspects of their business, such as
customers, products, operations and people [1]. An organization
needs to access a variety of information to support either
its daily operations or its business decisions. Construction
organizations also generate a large amount of operational data
that are distributed across various functional databases. These
data play an important role in securing a project's completion on
time, within budget, and in accordance with design specifications [2].
The information systems in an organization are generally
divided into two categories: operation support systems (OSS)
and decision support systems (DSS). OSS serve the need of
running the daily operations of the business. DSS provide
historical information for analyzing the business so that
important business decisions can be made appropriately. Many
companies have realized the importance of the hidden treasure of
information, which can significantly improve the quality of
decisions [3]. Moreover, unlike consumable resources, information as
an organization's intangible asset can be reused over
and over without losing its value. Instead, it may even be
enriched in the process. Therefore, it is in the interest of an
organization to collect and store the information for future use.
Corresponding author. E-mail address: jonathan.shi@iit.edu (J.J. Shi).
0926-5805/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.autcon.2005.11.001
Data warehousing is a relatively new technology that has evolved over the last
decade. It is intended to provide all users in an organization with
timely access to whatever level of information they need [1]. A
data warehouse provides an architectural model for the flow of
data from operational systems already in place to decision-support
environments [4,5]. It is periodically populated with
data from operational systems such as equipment management,
accounting, material inventory, and customer
management systems. Essentially, a data warehouse collects all
of the relevant data into one central system, organizes the data
efficiently so that it is consistent and easy to retrieve, keeps old
data for historical analysis, and makes the data convenient to
access and use so that users can do it themselves [6]. William
Inmon, who coined the term "data warehouse" in 1990, defined
a data warehouse as a subject-oriented, integrated, nonvolatile,
and time-variant collection of data in support of management
decisions [7].
Data warehousing has become popular among organizations
that seek competitive advantage [8,9]. A data warehouse
essentially holds the business intelligence (BI) for the enterprise
to enable strategic decision making. It contains critical
measurements of the business processes stored along business
dimensions. The Data Warehousing Institute (TDWI) defines BI as
the processes, tools, and technologies that are required to turn
data into information and to turn information into knowledge
and effective business plans. Objective information obtained
from the warehouse allows the business to examine its strategy
and to build up its competitive advantage in the market.
Historical data show how successful the business was in the
past; the current information tells where the business stands; and
the sum of past and present information helps to position the
business for the future. In general, information provides the
basis for decision makers to gain a better knowledge of the
business and to develop superior business strategies accordingly.
To succeed in today's business environment, we must
understand how we performed in the past, where we are, and
how to correctly position ourselves for the future.
Data warehousing is gaining popularity as organizations
realize the benefits of having a central database for supporting
efficient management functions. It has become an instant
phenomenon in many large organizations [10]. More than half
of the companies in the United States have committed to
implementing the technology [11]. Despite its popularity in
manufacturing and other business sectors, studies and implementations
are still very limited in the construction industry. In
1997, Decker et al. reported a cost engineering data warehouse
for supporting cost estimating for Amoco Corporation [4]. Chau
et al. in 2002 developed a prototype of a material inventory data
warehouse for inventory management in Hong Kong [12]. To
take full advantage of the technology, more research is
needed on how to collect and store company-wide construction
data for future use and how to develop a data warehouse for
assisting construction business decision making.
Current industry practice shows that many local
databases are maintained by different offices in a construction
company to support its management functions. For instance, the
estimating office keeps a historical cost database; the operations
office maintains a project database; and the equipment
department runs a database system for all equipment the
company owns. Each of these databases may be associated with
a specific operational application. For example, project data is
stored in Primavera Project Planner (P3) and accounting data is
maintained in the selected accounting system. These operational
systems and data sources are segregated along the physical
boundaries between management offices. The segregation
causes many problems for construction business, including:
• It prevents data sharing between management functions;
• Multiple entries for the same data are generated at various locations;
• It causes misunderstandings if the same data at different locations are not updated simultaneously;
• It slows down the decision-making process if data obtained from different sources conflict with each other; and
• It does not support advanced analysis and complex queries that are essential for supporting business decisions.
Construction business is project-oriented. All data can be
associated with related projects and all decisions are made for
projects. Collecting and maintaining data in relation to projects
is therefore practical and logical. This research develops a
company-level data warehouse named Project-oriented Data Warehouse
(PDW) for large and medium contractors.
2. The development of a project-oriented data warehouse (PDW)
The PDW organizes construction data in the context of its
associated projects. Its architecture is shown in Fig. 1 with four
major components: data sources, data staging area, data storage
servers, and data access. The data source component includes
source databases that supply data to the warehouse. The data
staging area is an intermediate database which is used for
transferring data from its sources to the warehouse. The data
storage servers store the data in the warehouse. The data access
component provides an interface for end users to retrieve data,
to process, organize or analyze data, and to export data to
external environments as appropriate. Data marts and data
mining tools may be added to the system for advanced data
retrieval and analysis.

Fig. 1. The project-oriented data warehouse architecture.
[Figure: architecture diagram with four columns labeled Data Source, Data Staging Area, Data Storage Server, and Data Access; box labels include Project performance, Estimate, Material, Contract & Bidding, External DB, ETL, PDW, Data Mart, Query/Reporting, Analysis, and Data Mining.]
2.1. Construction project data sources
While designing a data warehouse, the first challenge is to
determine what data will be uploaded to the warehouse. Two
distinct approaches may be used to determine the corresponding
strategy: need-based and availability-based. The
need-based approach examines what data will be needed in the
future based on the nature of the business, so that the needed data
can be collected and uploaded to the warehouse. The
availability-based approach examines what data is currently
available in the operational systems, and the available data is
selected for the warehouse. Some data loaded into the
warehouse may not have any immediate use but may prove
useful in the future, because it is much cheaper to store data than
to collect it later.
Construction project data is generated along the project life cycle,
from bidding to construction. This research classifies
project data into four categories: performance, materials,
estimates, and bidding/contracts. Project performance data
may exist in a large variety of formats or even in different systems
such as Primavera Project Planner (P3) or Microsoft Project.
However, many construction applications have been re-invented
in recent years around a client-server architecture
with central databases. For example, Primavera has released its
on-line project planning, scheduling and management tools for
Enterprise (P3/e) with a central database which can be
supported by either Oracle or MS SQL Server [13,14].
Estimating data is usually created in spreadsheets (e.g., MS
Excel), while material management systems and contract/
bidding databases generally operate in relational database
systems. The PDW will include these major types of project
data.
2.2. Dimensional modeling

The data models in the PDW must be compatible with how
project data is currently maintained and how it may be used in
the future. Moreover, the data in the warehouse must be
structured to meet the management needs at both the project and
company levels.
Two major types of data models are widely used for
constructing data warehouses. An Entity-Relationship (ER)
model removes all redundancies in the data. It provides
advantages in transactional processing by making transactions
simple and deterministic. On the other hand, dimensional
modeling enables speedy access and queries. Although it may
use more space to store data, it provides one of the most
practical techniques for delivering data to end users in a data
warehouse [15].
A dimensional data model is composed of a central fact table
and a set of surrounding dimension tables, each of which
corresponds to one of the components or dimensions of the fact
table. The fact table is the primary table that contains
quantitative or factual data of the business, although it may
also contain textual attributes in order to limit the number of
dimension tables. A dimension table contains descriptive
attributes for constraining and grouping data. Its size is usually
smaller than that of a fact table. Each dimension table is defined
with a primary key field. The fact table uses foreign key fields to
reference its dimension tables. A dimensional data model
is scalable to allow new fact and dimension tables to be added as
needed. An example data model with a star schema is shown in
Fig. 2 with one fact table and four dimension tables. The fact
table details the cost, duration, and quantity for each activity.
The four dimension tables describe: the time dimension when
an activity is constructed, the activity dimension which defines
its characteristics, the WBS dimension which defines the
relationships between activities or processes through the
work breakdown structure (WBS), and the project dimension
that describes the general project information.
A star schema can be refined to a snowflake schema which
can support hierarchical dimension tables. A star schema offers
better performance than a snowflake schema and is relatively
easier to manage. A snowflake schema increases the number of
joins and can slow down queries, but it provides a necessary
logical separation of data for complex business. Considering the
complexity of construction data, a snowflake schema is used in
this research, with an example shown in Fig. 3. In this
example, the project dimension is further broken down with a
lower-level dimension, the category dimension, which allows a
more specific description of project types.

Fig. 2. A star schema data model.
[Figure: Cost Fact (Date, Project_key, WBS_key, Activity_key, Cost, Duration, Quantity) linked to four dimension tables: Calendar (Date, Month, Year), Activity (Activity_key, Activity_name, Activity_code, Activity_type), Project (Project_key, Project_name, Description, Category), and WBS (WBS_key, WBS_name, WBS_code).]

Fig. 3. A snowflake schema data model.
[Figure: the same Cost Fact, Calendar, Activity, and WBS tables as in Fig. 2; the Project table (Project_key, Project_name, Description, Category_key) is further linked to a Category table (Category_key, Category).]

2.3. The PDW database structure

A construction project is commonly broken down into
controllable work packages/activities by following a work
breakdown structure (WBS), which may also be associated
with an organization breakdown structure (OBS) to detail
management responsibilities. Activities are assigned
account numbers for monitoring and control purposes. The
essential data of each activity includes its as-planned and as-built
schedules, costs, resources, and change orders incurred
during construction.
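As an illustration of the dimensional models in Section 2.2, the snowflake schema of Fig. 3 can be sketched with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: the table and column names follow Fig. 3, but the column types, the sample rows, and the use of SQLite (rather than the MS SQL Server 2000 environment used for the PDW) are assumptions made for illustration.

```python
import sqlite3

# In-memory database. Names follow Fig. 3; types and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE project (
    project_key  INTEGER PRIMARY KEY,
    project_name TEXT,
    description  TEXT,
    category_key INTEGER REFERENCES category(category_key)  -- snowflake link
);
CREATE TABLE calendar (date INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE wbs (wbs_key INTEGER PRIMARY KEY, wbs_name TEXT, wbs_code TEXT);
CREATE TABLE activity (
    activity_key  INTEGER PRIMARY KEY,
    activity_name TEXT,
    activity_code TEXT,
    activity_type TEXT
);
CREATE TABLE cost_fact (
    date         INTEGER REFERENCES calendar(date),
    project_key  INTEGER REFERENCES project(project_key),
    wbs_key      INTEGER REFERENCES wbs(wbs_key),
    activity_key INTEGER REFERENCES activity(activity_key),
    cost REAL, duration REAL, quantity REAL
);
-- Hypothetical sample rows.
INSERT INTO category VALUES (1, 'Bridge'), (2, 'Building');
INSERT INTO project VALUES (1, 'Project A', '', 1), (2, 'Project B', '', 2);
INSERT INTO calendar VALUES (20020105, 1, 2002);
INSERT INTO wbs VALUES (1, 'Substructure', '1.1');
INSERT INTO activity VALUES (1, 'Pour footing', 'A100', 'Task'),
                            (2, 'Erect forms', 'A110', 'Task');
INSERT INTO cost_fact VALUES (20020105, 1, 1, 1, 1000.0, 5, 20),
                             (20020105, 1, 1, 2, 500.0, 2, 10),
                             (20020105, 2, 1, 1, 750.0, 3, 15);
""")

# Grouping by project category needs two joins (fact -> project -> category),
# which is the extra join cost a snowflake schema adds over a star schema.
cost_by_category = conn.execute("""
    SELECT c.category, SUM(f.cost)
    FROM cost_fact AS f
    JOIN project AS p ON p.project_key = f.project_key
    JOIN category AS c ON c.category_key = p.category_key
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()
print(cost_by_category)
```

In the star schema of Fig. 2, the category would be a plain text column of the project table, so the same report would need only one join.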
After a careful examination of the available project data and
the essential data for potential future uses, the PDW is designed
with 26 tables: 16 dimension tables and 10 fact tables. The 16
dimension tables as summarized in Table 1 provide descriptive
information about a construction project including: its owner,
geographical location, time, cost accounts, materials, suppliers,
relationships between activities, organizational responsibility,
subcontractors, and activities. The 10 fact tables as summarized
in Table 2 detail project information in 10 categories:
bid, change order, contract, estimate, expense,
material, relationship, resource, schedule, and subcontract.
The relationships between the 26 tables are shown in Fig. 4
in which the boxes represent the fact tables and the ovals
represent the dimension tables. Some dimension tables are
shared by multiple fact tables such as project dimension,
activity dimension, and geography dimension.
Table 1
The 16 dimension tables
Dimension table | Table description
--- | ---
Activity | General activity data
Activity predecessor | Predecessors of activities
Awarded project | Awarded project name from project database
Calendar | Date when an event occurs
Change extra | Change and extra work data
Cost account | Cost accounts of a project
Expense | Types of expense information
Geography | Geographic information of parties and projects
Material | General materials information
OBS | Assigned management responsibilities to the WBS
Owner | Owner information of a project
Project | General data of participated projects
Resource | Labor and equipment information
Subcontractor | Subcontractor information
Supplier | Supplier information
WBS | WBS of a project

The details of the 26 tables are not provided in this article due
to a limit on space. Interested readers may write to the authors
for more information. To explain the relationship between a fact
table and its dimension tables, an example schema for materials
is extracted, as shown in Fig. 5, with six dimension tables:
geography, project, supplier, material, cost account, and
calendar. The relationships between the tables are
established by the primary keys in the dimension tables and the
foreign keys in the fact table.
Table 2
The 10 fact tables
Fact table | Table description
--- | ---
Bid | Bid data for the projects for which the company has submitted bids
Change order | Change and extra work
Contract | Contract data for all projects
Estimate | Estimate data of all submitted bids
Expense | Expense data (material, subcontractor and other) assigned to activities
Material | Material data delivered to the completed projects
Relationship | Logical relationship between activities
Resource | Detailed labor and equipment data assigned to activities
Schedule | As-built and as-planned activity schedule data for the completed projects
Subcontract | Subcontract data of a project

Fig. 4. The 26 tables and their relationships.
[Figure: diagram connecting the 10 fact tables (boxes) to the 16 dimension tables (ovals).]

To further illustrate the difference between a dimension table
and a fact table, the material dimension table and the material
fact table in the schema are detailed in Tables 3 and 4.
The material dimension table in Table 3 has four fields:
Material_wk, Material_ID, Material_name, and Unit. It contains
the textual attributes for constraining and grouping
materials in queries. The fact table in Table 4 contains 10 fields
which provide detailed material and usage information. The
first six fields in Table 4 are the foreign keys for referencing
the six relevant dimension tables. The other four fields in the
table contain fact information including: unit price, quantity,
cost and cost discount for each material. The last two rows in the
fact table show that PO 1007 includes two purchase items. From
their Material_wk values of 11 and 7 (column 6), we know that the two
items are Door and Paint by reference to the material
dimension table (Table 3). Similarly, by referencing the other
five dimension tables, we can determine the project ID, the cost
account, the time the PO was placed, the time that goods were
received, and the suppliers.
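The lookup described above can be sketched as a join between the material fact table and the material dimension table. The sketch below is a minimal illustration that assumes SQLite through Python's sqlite3 module in place of MS SQL Server 2000; the lowercase table names material_dim and material_fact and the column types are assumptions, while the rows are the last three rows of Table 4 and the matching rows of Table 3. The other five dimension tables are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE material_dim (
    material_wk   INTEGER PRIMARY KEY,   -- primary key of the dimension table
    material_id   TEXT,
    material_name TEXT,
    unit          TEXT
);
CREATE TABLE material_fact (
    project_wk       INTEGER,
    cost_account_wk  INTEGER,
    order_date_wk    INTEGER,
    received_date_wk INTEGER,
    supplier_wk      INTEGER,
    material_wk      INTEGER REFERENCES material_dim(material_wk),  -- foreign key
    po_number        INTEGER,
    unit_price       REAL,
    quantity         REAL,
    discount         REAL
);
""")

# Rows copied from Table 3 (only the materials needed here).
conn.executemany("INSERT INTO material_dim VALUES (?, ?, ?, ?)", [
    (7, "P005", "Paint", "gal"),
    (10, "WA5", "Interior wall", "sf"),
    (11, "DR200", "Door", "each"),
])
# Last three rows of Table 4.
conn.executemany("INSERT INTO material_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", [
    (2, 7, 20020515, 20020522, 4, 10, 1006, 3.00, 800.00, 16.00),
    (2, 5, 20020520, 20020527, 4, 11, 1007, 150.00, 14.67, 0.00),
    (2, 4, 20020520, 20020529, 4, 7, 1007, 25.00, 52.00, 1.04),
])

# Resolve the purchase items of PO 1007 through the Material_wk foreign key.
po_items = conn.execute("""
    SELECT d.material_name, d.unit, f.quantity, f.unit_price
    FROM material_fact AS f
    JOIN material_dim AS d ON d.material_wk = f.material_wk
    WHERE f.po_number = ?
    ORDER BY f.material_wk DESC
""", (1007,)).fetchall()
print(po_items)
```

The same pattern, joining on Project_wk, Supplier_wk, or the two date keys, would resolve the remaining descriptive information for each purchase item.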
2.4. The development of the PDW
A data warehouse is a central database residing on
servers, and users access and retrieve data from their client
computers. The PDW must therefore be constructed in a client-server
environment. After the warehouse architecture and data models
have been designed, a database management system must be
selected for constructing the warehouse. MS SQL Server 2000
was chosen in this research because it supports client-server
data warehouse applications and comes with the Data
Transformation Services (DTS) package. Implementing the PDW
involves two major tasks: (1) creating the warehouse
structure with the 26 tables; and (2) designing the strategy and tools
for populating the warehouse.
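The second task, populating the warehouse, is described in the remainder of this section as an extraction, transformation, and loading (ETL) process supported by an audit record for incremental loading. The following is a minimal sketch of that idea, not the DTS-based tool developed in this research: it assumes SQLite through Python's sqlite3 module, a list of dictionaries standing in for a source system, and hypothetical names (run_etl, etl_audit, source_row_id) and source formats that do not come from the PDW.

```python
import sqlite3
from datetime import datetime, timezone


def run_etl(conn, source_rows):
    """Load only the source rows that the audit table has not seen yet."""
    loaded = {r[0] for r in conn.execute("SELECT source_row_id FROM etl_audit")}

    # Extract: pull the not-yet-loaded rows into a staging list,
    # leaving the source itself untouched.
    staging = [row for row in source_rows if row["row_id"] not in loaded]

    # Transform: reconcile formats (date text -> integer key,
    # trimmed material name, price text -> number).
    cleaned = [
        (
            row["row_id"],
            int(row["order_date"].replace("-", "")),
            row["material"].strip().title(),
            float(row["unit_price"].lstrip("$")),
        )
        for row in staging
    ]

    # Load: write the facts, then record the movement in the audit table.
    run_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO material_fact VALUES (?, ?, ?, ?)", cleaned
        )
        conn.executemany(
            "INSERT INTO etl_audit VALUES (?, ?)",
            [(c[0], run_at) for c in cleaned],
        )
    return len(cleaned)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE material_fact (
    source_row_id INTEGER, order_date_wk INTEGER,
    material_name TEXT, unit_price REAL
);
CREATE TABLE etl_audit (source_row_id INTEGER PRIMARY KEY, loaded_at TEXT);
""")

# Hypothetical source rows, e.g. as read from a spreadsheet.
source = [
    {"row_id": 1, "order_date": "2002-01-05", "material": " steel rebar ", "unit_price": "$50.00"},
    {"row_id": 2, "order_date": "2002-01-12", "material": "structural concrete", "unit_price": "$65.00"},
]
first_run = run_etl(conn, source)    # both rows are new
source.append(
    {"row_id": 3, "order_date": "2002-02-03", "material": "wire mesh", "unit_price": "$10.00"}
)
second_run = run_etl(conn, source)   # only the appended row is loaded
print(first_run, second_run)
```

Keeping the audit record in the same transaction as the load means a failed run leaves neither facts nor audit entries behind, so the next run can simply retry.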
Fig. 5. The material fact table and its supporting dimension tables.

Table 3
The material dimension table
Material_wk | Material_ID | Material_name | Unit
--- | --- | --- | ---
1 | RB10 | Steel rebar | ton
2 | C101 | Structural concrete | cu yd
3 | PC103 | Barrier pre-cast concrete | each
4 | HR1 | Hand rails | lf
5 | C104 | Topping concrete | cu yd
6 | WM3 | Wire mesh | sf
7 | P005 | Paint | gal
8 | BK213 | Brick | sf
9 | CAP11 | Carpet | sf
10 | WA5 | Interior wall | sf
11 | DR200 | Door | each
Creating a warehouse structure is similar to creating a
database in any database environment. Implementing the PDW
starts with creating the 26 tables in the MS SQL Server 2000
environment. In the process, each table is defined with the
corresponding number of fields and the attributes of each field.
Additionally, each dimension table is defined with a primary
key field that maps to the foreign key(s) in other tables for
defining the relationships between the tables. After the 26 tables
are created, the relationships between the tables are created
according to the primary keys and foreign keys.
A data warehouse must be uploaded with data from the
operating database systems. Populating a warehouse starts with
extracting data from its sources. Then, the extracted data must
be processed and checked for correctness before it is uploaded
to the warehouse. In general, populating a warehouse involves
executing an Extraction, Transformation, and Loading (ETL)
process as follows [16]:
• Extract: extraction is the action of pulling data from a source system or systems;
• Transform: transformation reconciles data type and format differences from different sources, resolves uniqueness issues, and ensures conformity. A major task is to format the extracted data to meet the requirements of the warehouse; and
• Load: loading moves the data to the dimension or fact tables in the warehouse.
In order to avoid interference with the source systems, a
temporary working area is needed to host the extracted data, and
it is commonly called the data staging area, which some
writers also refer to as the construction site of a data warehouse
[15]. The warehouse shows what data is needed and how it is to
be stored; and the sources of the different data may be located at
various servers across the company's computer network. After
all data is extracted from multiple sources to be held at one
place, data transformation can be efficiently performed. The
data staging area requires reconciling the data structures of the
source systems with the data structures in the warehouse. It is
usually created with flat files and/or databases to meet the
needs.
In order to automate a data warehouse population process, an
ETL procedure must be developed. Developing an ETL tool
may consume about half of the time of a warehouse project. An
ETL tool must map the source and the destination for each piece
of data. It must be specified with the correct paths of the data
sources and the corresponding destinations so that it can pull the
data from the given sources and send it to the right destinations
in the warehouse. Moreover, the ETL tool must also clearly
define what data is to be pulled from each source and what
transformation is to be performed on the data. The DTS package of
MS SQL Server 2000 provides a collection of objects and
tools that allow users to import, export, and transform
heterogeneous data between one or multiple types of data
formats, such as MS SQL Server, MS Excel, MS Access, and text
files. It provides an efficient means for developing ETL
tools.
An audit database is usually created to keep a record of the
operations of each ETL process, such as the time when the
populating process is executed and the specific data
movements and transformations that are performed. A data
warehouse must be updated periodically. Each time an
ETL process is executed, the audit database is updated so
that it can assist in detecting what data has already been moved to the
warehouse before a new update takes place. Therefore, it can
support the incremental loading approach used in the PDW, which
ensures that only new data is uploaded to the warehouse in
every populating process.

Table 4
The material fact table
Project_wk | Cost_Account_wk | Order_Date_wk | Received_date_wk | Supplier_wk | Material_wk | PO_Number | Unit_price | Quantity | Discount
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | 1 | 20020105 | 20020112 | 1 | 1 | 1001 | $50.00 | 110.00 | $2.20
1 | 1 | 20020112 | 20020120 | 1 | 2 | 1002 | $65.00 | 346.15 | $6.92
1 | 3 | 20020203 | 20020210 | 3 | 3 | 1003 | $810.00 | 20.00 | $0.00
1 | 2 | 20020215 | 20020119 | 2 | 4 | 1004 | $10.00 | 120.00 | $2.40
1 | 2 | 20020215 | 20020222 | 2 | 5 | 1004 | $55.00 | 196.36 | $3.93
1 | 4 | 20020215 | 20020225 | 2 | 7 | 1004 | $25.00 | 8.00 | $0.00
2 | 1 | 20020412 | 20020417 | 1 | 1 | 1005 | $50.00 | 30.00 | $0.60
2 | 1 | 20020412 | 20020421 | 1 | 2 | 1005 | $65.00 | 21.54 | $0.00
2 | 7 | 20020515 | 20020522 | 4 | 10 | 1006 | $3.00 | 800.00 | $16.00
2 | 5 | 20020520 | 20020527 | 4 | 11 | 1007 | $150.00 | 14.67 | $0.00
2 | 4 | 20020520 | 20020529 | 4 | 7 | 1007 | $25.00 | 52.00 | $1.04

Fig. 6. The sample query schema and report.

2.5. Querying a warehouse

Retrieving information from PDW is achieved by queries.
Common queries are divided into three types: the drill-up type for
summary information, the drill-down type for detailed information
on a particular project, and the drill-across type for retrieving
information across multiple fact tables or projects.
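The three query types can be sketched as follows. This is an illustrative sketch only, assuming SQLite through Python's sqlite3 module rather than the MS SQL Server 2000 interface; the two simplified fact tables, their columns, and all rows are hypothetical and far smaller than the PDW tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_wk INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE material_fact (project_wk INTEGER, po_number INTEGER, cost REAL);
CREATE TABLE subcontract_fact (project_wk INTEGER, subcontractor TEXT, amount REAL);
-- Hypothetical sample rows.
INSERT INTO project VALUES (1, 'Project A'), (2, 'Project B');
INSERT INTO material_fact VALUES (1, 1001, 5500.0), (1, 1003, 16200.0), (2, 1006, 2400.0);
INSERT INTO subcontract_fact VALUES (1, 'Sub X', 40000.0), (2, 'Sub Y', 12000.0);
""")

# Drill up: summary information, one row per project.
drill_up = conn.execute("""
    SELECT p.project_name, SUM(m.cost)
    FROM material_fact AS m
    JOIN project AS p ON p.project_wk = m.project_wk
    GROUP BY p.project_wk, p.project_name
    ORDER BY p.project_wk
""").fetchall()

# Drill down: detailed records of one particular project.
drill_down = conn.execute("""
    SELECT po_number, cost
    FROM material_fact
    WHERE project_wk = ?
    ORDER BY po_number
""", (1,)).fetchall()

# Drill across: combine two fact tables through the shared project dimension.
# Each fact table is aggregated first so that rows are not double-counted.
drill_across = conn.execute("""
    SELECT p.project_name, m.material_cost, s.subcontract_amount
    FROM project AS p
    JOIN (SELECT project_wk, SUM(cost) AS material_cost
          FROM material_fact GROUP BY project_wk) AS m
      ON m.project_wk = p.project_wk
    JOIN (SELECT project_wk, SUM(amount) AS subcontract_amount
          FROM subcontract_fact GROUP BY project_wk) AS s
      ON s.project_wk = p.project_wk
    ORDER BY p.project_wk
""").fetchall()

print(drill_up, drill_down, drill_across, sep="\n")
```

The drill-across query works only because both fact tables share the project dimension, which is why shared dimension tables matter in the PDW design.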
MS SQL Server 2000 provides a user interface for creating
queries. With the interface, creating a query report involves
identifying one or more fact tables and the relevant dimension tables, and
then selecting fields from these tables to form a new table as the
query report. By executing the query, the database management
system retrieves the relevant data from the warehouse and
generates a report consisting of the retrieved records. A created
query can be saved as a sample report template. Without
recreating the query, a user can directly execute a query
template to generate an updated query report. Common queries
may be created and saved as sample reports. Query reports can
be exported to external systems as needed, such as MS Excel,
MS Access, MS Query and Query editor, or any in-house
applications.
An example is used here to illustrate the process for creating
a query. The example searches for the suppliers who delivered
pre-cast concrete to the Canal bridge project, together with the relevant cost
data. To create the query, the material fact table is first selected.
The following dimension tables are also selected: Project,
Cost Account, Material, and Supplier. Checking the columns in
these tables makes them appear in the query report. The
selected columns include: Supplier Name, Material Name, Unit,
Quantity, and Unit cost. Additionally, the Canal bridge project
and Pre-cast concrete are set as the query constraints in the
Project and Cost Account tables, respectively. The total cost is
calculated by multiplying Quantity and Unit cost. Fig. 6 shows
the query schema and the generated report. The report, shown
in the table under the query schema in Fig. 6, contains four
records which show that two companies supplied the pre-cast
concrete products to the project.
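The same query can be written out in SQL. The sketch below is a hedged approximation of the example rather than the query of Fig. 6: it assumes SQLite through Python's sqlite3 module, simplified tables with hypothetical column names, and invented sample rows (the supplier names and quantities are not those of the report in Fig. 6).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_wk INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE cost_account (cost_account_wk INTEGER PRIMARY KEY, account_name TEXT);
CREATE TABLE material (material_wk INTEGER PRIMARY KEY, material_name TEXT, unit TEXT);
CREATE TABLE supplier (supplier_wk INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE material_fact (
    project_wk INTEGER, cost_account_wk INTEGER, supplier_wk INTEGER,
    material_wk INTEGER, quantity REAL, unit_price REAL
);
-- Hypothetical sample rows.
INSERT INTO project VALUES (1, 'Canal bridge'), (2, 'Other project');
INSERT INTO cost_account VALUES (1, 'Structural concrete'), (3, 'Pre-cast concrete');
INSERT INTO material VALUES (2, 'Structural concrete', 'cu yd'),
                            (3, 'Barrier pre-cast concrete', 'each');
INSERT INTO supplier VALUES (1, 'Supplier A'), (3, 'Supplier B');
INSERT INTO material_fact VALUES
    (1, 3, 3, 3, 20.0, 810.0),    -- Canal bridge, pre-cast, Supplier B
    (1, 3, 1, 3, 5.0, 800.0),     -- Canal bridge, pre-cast, Supplier A
    (1, 1, 1, 2, 346.15, 65.0),   -- Canal bridge, other cost account
    (2, 3, 3, 3, 10.0, 810.0);    -- other project
""")

# Select report columns from the fact and dimension tables, constrain on the
# Project and Cost Account dimensions, and compute total cost per record.
report = conn.execute("""
    SELECT s.supplier_name, m.material_name, m.unit,
           f.quantity, f.unit_price,
           f.quantity * f.unit_price AS total_cost
    FROM material_fact AS f
    JOIN project AS p ON p.project_wk = f.project_wk
    JOIN cost_account AS a ON a.cost_account_wk = f.cost_account_wk
    JOIN material AS m ON m.material_wk = f.material_wk
    JOIN supplier AS s ON s.supplier_wk = f.supplier_wk
    WHERE p.project_name = ? AND a.account_name = ?
    ORDER BY s.supplier_name
""", ("Canal bridge", "Pre-cast concrete")).fetchall()

for row in report:
    print(row)
```

Passing the two constraints as parameters mirrors the saved query template idea: the same statement can be re-executed for another project or cost account without being rewritten.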
3. Conclusions
Historical project data can assist construction managers in
answering questions about the business, the performance of
operations of interest, business trends, and what can be done to
improve the business. Although the data is there, it is always a
challenge to find the needed data in time when a decision must be
made. Decisions can be slowed down by inconsistent and/
or inaccurate data. Data warehousing provides the
technology for storing historical construction project data
which can be extracted from existing operational databases/
systems. However, unlike many other application systems, an
organization cannot simply buy a data warehouse off the shelf.
Instead, a proper design is needed to ensure that the data
structure of the warehouse addresses the nature of the
organization and meets its business needs, so that the data
captured in the warehouse reflects what is available and
what will be needed by the company.
PDW underlines the project-oriented nature of the construction
business. It is intended to provide a robust tool for
collecting, storing, and utilizing historical construction project
data. The PDW architecture and data models presented in this
paper may be adopted by large and medium contractors. PDW
can serve as a central data facility for users to retrieve the right
data for making business decisions. Moreover, the quality data
available in the PDW will provide useful information for
conducting in-depth business analyses or data mining studies.
The PDW warehouse architecture is scalable, with new
components to be added as needed, such as new data sources,
data access modules, data marts, and new fact and dimension tables.
To meet particular business analysis needs, data marts may
be developed in future research for areas such as productivity, contract
performance, and resource allocation. Those functions can be
added to the data access component in the architecture with the
data to be migrated from the warehouse.
References
[1] K. Orr, Data Warehouse Technology: Revised Edition 2000, White Paper, The Ken Orr Institute, Kansas, 1996.
[2] I. Ahmed, Data warehousing in construction organizations, Construction Congress VI 2000 Proceeding, ASCE, 2000, pp. 194–203.
[3] N. Goyal, Data Warehousing Lecture Notes, Birla Institute of Technology & Science, Pilani, India, 2003.
[4] K. Decker, A. Oaks, M. Salinas, Building a Cost Engineering Data Warehouse, AACE International Transactions, IM.06, AACE International, Morgantown, 1997.
[5] I. Manning, Data warehousing: what is it? http://www.csesolutions.com/data_warehousing.htm, 2002.
[6] M. Corey, M. Abbey, I. Abramsom, Oracle 8 Data Warehousing: A Practical Guide to Successful Data Warehouse Analysis, ORACLE Press, 1998.
[7] W.H. Inmon, Building the Data Warehouse, John Wiley and Sons, Inc., New York, 1993.
[8] R. Adhikari, Migrating legacy data, Software Magazine 16 (1) (January 1996) 75–80.
[9] J. Kador, One on one, Midrange Systems 8 (20) (October 1995).
[10] L. Miller, S. Nilakanta, Data warehouse modeler: a CASE tool for warehouse design, Proc. of the 31st Hawaii Intl. Conf. on Sys. Sciences, IEEE Computer Society Press, 1998.
[11] P. Ponniah, Data Warehousing Fundamentals, John Wiley & Sons, Inc., New York, 2001.
[12] K.W. Chau, Y. Cao, M. Anson, J. Zhang, Application of data warehouse and decision support system in construction management, Automation in Construction (12) (2002) 213–224.
[13] Primavera Systems, Inc., Primavera Enterprise: Administrator's Guide, Pennsylvania, 2002.
[14] Primavera Systems, Inc., Primavera Enterprise: User's Guide, Pennsylvania, 2002.
[15] R.L. Kimball, L. Reeves, M. Ross, W. Thornthwaite, The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses, John Wiley & Sons, Inc., New York, 1998.
[16] M. Chaffin, B. Knight, T. Robinson, Professional SQL Server 2000 DTS, Wrox Press Inc., Illinois, 2000.
