You are on page 1of 14

Building and Environment 205 (2021) 108267

Contents lists available at ScienceDirect

Building and Environment


journal homepage: www.elsevier.com/locate/buildenv

Building life-span prediction for life cycle assessment and life cycle cost
using machine learning: A big data approach
Sukwon Ji, Bumho Lee, Mun Yong Yi *
Graduate School of Knowledge Service Engineering, Department of Industrial & Systems Engineering, KAIST, Republic of Korea

A R T I C L E I N F O A B S T R A C T

Keywords: Life cycle assessment (LCA) and life cycle cost (LCC) are two primary methods used to assess the environmental
Building life span and economic feasibility of building construction. An estimation of the building’s life span is essential to carrying
Life cycle cost out these methods. However, given the diverse factors that affect the building’s life span, it was estimated
Life cycle assessment
typically based on its main structural type. However, different buildings have different life spans. Simply
Big data
Machine learning
assuming that all buildings with the same structural type follow an identical life span can cause serious esti­
Deep neural network mation errors. In this study, we collected 1,812,700 records describing buildings built and demolished in South
Korea, analysed the actual life span of each building, and developed a building life-span prediction model using
deep-learning and traditional machine learning. The prediction models examined in this study produced root
mean square errors of 3.72–4.6 and the coefficients of determination of 0.932–0.955. Among those models, a
deep-learning based prediction model was found the most powerful. As anticipated, the conventional method of
determining a building’s life expectancy using a discrete set of specific factors and associated assumptions of life
span did not yield realistic results. This study demonstrates that an application of deep learning to the LCA and
LCC of a building is a promising direction, effectively guiding business planning and critical decision making
throughout the construction process.

1. Introduction construction sector consumes 35% of global energy use and nearly 40%
of energy-related CO2 emissions [3]. Furthermore, the operation phase
1.1. Importance of building life span in architectural engineering accounts for 80–90% of the total energy consumed by the entire life
cycle [4]. In the case of apartments made from reinforced concrete (RC),
From planning to design, construction, maintenance, and disposal, the operation phase contributes to 75.4% of the carbon emissions pro­
decision making is essential at every stage in the building construction duced over the entire life cycle [5]. Thus, for energy saving and envi­
industry. Various economic and environmental feasibility evaluation ronment protection, it is highly important that LCA is conducted
methods are used to execute major decisions including the selection of accurately.
structures, materials, and construction methods throughout the design LCC is an economic analysis method that calculates all or part of the
and construction of a building. Among them, life cycle assessment (LCA) total costs of construction, operation, maintenance, and disposal
and life cycle cost (LCC) analysis are two primary verification methods incurred by a project during its useful life. In LCC, the costs incurred in
commonly used across the industry - in particular, LCA for environ­ the operating phase, which include expense created from operating a
mental verification and LCC for economic verification [1]. building with its resource consumption such as energy and water, and in
The LCA method evaluates the environmental aspects of a product the maintenance phase, which include expense from sustaining building
system at all phases of its life cycle [2]. In the construction sector, LCA condition and quality, account for approximately 40% of all costs over
assesses the impact of a building on the environment in all phases the life cycle of a building [6]. Calculating the operation and mainte­
including material manufacturing, material transportation, construc­ nance costs requires accurate reference data on the building’s life span
tion, operation and maintenance, demolition, and construction waste and the repair cycles of building materials, construction methods, and
disposal. According to the 2017 UNEP Global Status Report, the building equipment.

* Corresponding author.
E-mail address: munyi@kaist.ac.kr (M.Y. Yi).

https://doi.org/10.1016/j.buildenv.2021.108267
Received 14 June 2021; Received in revised form 13 August 2021; Accepted 17 August 2021
Available online 19 August 2021
0360-1323/© 2021 Elsevier Ltd. All rights reserved.
S. Ji et al. Building and Environment 205 (2021) 108267

In the construction industry, an essential component of LCA and LCC column spacing, and floor height of a building will affect its func­
is the life span of a building, which must be considered in the calculation tional life span.
of the environmental load and maintenance and operation costs. It
serves as a key factor that determines the number of repairs required. In addition, there are various other concepts of life span, including
Also, building life span has an absolute influence on the energy economic, cultural, design, and tax life spans. The economic life span is
consumed during the service cycle of the building, total cost and governed by economics. A building is an assembly constructed of
quantity of building repairs, and total environmental load [7]. In addi­ numerous materials using various methods that have diverse life cycles.
tion, as there are few reference data sources describing the standard life In addition to the original purpose of a building, which is to shield the
span of a building other than those sources assumed from the main inhabitants from the outside environment, buildings are assets that have
structure of buildings, there are many cases in which LCC calculations a strong purpose as a financial investment. Therefore, even if a building
omit the maintenance cost during the fulfilment of value engineering structure or component is defective, it may not be demolished;
(VE) in the design phase. conversely, it may be demolished before reaching its life span limit even
However, although a building’s life span strongly influences both the if its condition is sufficient for its intended purpose. The cultural life
LCA and LCC outcomes, it is unrealistic to consider all the factors that span refers to the cultural relevance of a building. Some buildings are
affect it. Hence, the life span is a component typically assumed to be the preserved well beyond their life span either by artistry or through
same for all the buildings with the same main structure type or other continuous renovation as a heritage site. The design life span is related
similar incidental factors. In LCA, the operation life is often assumed to to the relevant design trends. Public buildings, for example, are designed
be 30–50 years, and the guidelines of building operation and mainte­ to help them survive for a long time, whereas the design of commercial
nance are quite simple. However, some materials do not meet the typical buildings (and thus their life spans) changes depending on their sur­
service life period criteria due to mismatched exposure or usage char­ roundings, technology, and originality. The tax life span refers to the
acteristics, making it difficult to use a simplified description to fully period the relevant tax law of the land is applicable to a building. In
express its maintenance conditions [8]. In practice, the life spans of South Korea, for example, the tax law prescribes a tax life span of 40
buildings vary considerably due to different influencing factors, and the years for major structures such as steel–RC (reinforced concrete), RC,
use of standard assumptions often lead to grossly inaccurate results [9]. and steel frame buildings [10]. Thus, the actual life span of a building is
not simply the sum of the internal factors of the building itself but also
related to the site and surrounding environment (such as weather, hu­
1.2. Complexity of building life-span prediction midity, salinity, and disaster), extent of maintenance, building-related
regulations and guidelines, re-development and reconstruction stan­
There are three broad categories of building life span: physical life dards, asset value, and social demands and implications.
span, social life span, and functional life span.
1.3. Limitations of existing building life-span prediction methods
1. Physical life span refers to the period a building gets demolished
because the major components of the building physically reach their Various researchers have attempted to predict the building life span
life span and are no longer technically useable. It can be further via building materials and construction methods. Many have only
divided into the structural, facility, interior, and exterior life spans. studied the life of materials and construction methods located in a
The structural life span refers to the number of years until the specific region. It has been difficult to analyse all the factors influencing
physical strength of the building wanes. It is a systemic description of building life span because each material and method used to construct a
the performance deteriorations, defects in sections, and poor con­ building has distinct characteristics and are subject to different envi­
struction conditions associated with the material types and their ronmental conditions. Therefore, methods have been developed for
characteristics. In addition, the facility life span, which is concep­ predicting the life span of specific buildings according to specific factors
tually related to the idea of physical life span, refers to the number of by setting the goal and scope of the prediction according to the purpose
years until the facility inside a building reaches a state in which it of the study. Recently, hybrid methods that consider both the internal
cannot be physically repaired. For example, according to structural and external factors of a building to predict its life span have been
standards and specifications, RC-framed apartment buildings in developed [43,51–54]. However, most of these methods subjectively
South Korea, which serves as a backdrop for our research, have a select important factors from numerous external factors and assign
structural life span of 60–100 years. In contrast, the facility life spans quantified weights based on the opinions of researchers or participating
of these buildings will vary depending on their material character­ experts. Even in the case of the modeling-based approach [e.g., 52–54],
istics. In general, the facility life span is calculated to be approxi­ most researchers selected a model simply based on the life span distri­
mately one-third of the structural life span of a building. bution and determined factors deemed to have high weights to the
2. Social life span refers to the period a building lasts until human modeling. These methods suffer from inconsistency in assigning the
desire or legal requirement dictates replacement. It is dependent on weights across the countries or regions in which the internal and
changes in social conditions or needs. Apart from the physical or external factors of buildings are different, not to mention the subjec­
functional life span, the social life span includes the life span for re- tivity of participants. Therefore, the results of different studies are
development, reconstruction, and remodelling in accordance with a inherently empirical and difficult to standardize, making it impractical
national policy or social demand. In addition, a building’s social life to predict the actual life span of any given building using one of those
span may be affected by the adoption of new safety regulations or methods.
standards related to asbestos, fire resistance, or seismic resistance. In summary, the objective of this study is to identify the primary
3. The functional life span refers to the period before which a building factors that affect the life span of a building by applying machine
loses its functional value due to societal or lifestyle changes. For learning techniques to a large scale real-world dataset, which is the
example, commercial buildings, factories, and hospital buildings are entire set of building permission registry data accumulated since 1950s,
subject to particularly rapid changes in function. Furthermore, the era from which the current form of country-wide building registra­
housing needs vary over an approximately 10-year cycle due to tion system began in South Korea. To examine the comparative advan­
population changes or resident preference changes. As buildings are tage of our approach, we employ the current wide-spread method of
directly affected by technological innovation or change in life style, a calculating building life span using the linear regression technique, in
plan that will help a building withstand changes in function or use is addition to deep learning and traditional machine learning techniques.
important. In general, the spatial composition, centre (core) location, The ability to accurately predict the life span of a building can

2
S. Ji et al. Building and Environment 205 (2021) 108267

dramatically improve the output accuracy of LCA and LCC and help expectancy was typically assumed to be 100 years. Additionally, the life
building owners, contractors, designers, and other stakeholders make spans of 40, 45, and 150 years have been infrequently used in accor­
well-informed decisions. dance of national corporate tax law, depending on major structural type
variation (see Table 1).
2. Related work
2.2. Actual building life span and main influencing factors
2.1. Expected building life span
O’Connor [41] conducted a demolition survey to identify the age,
Until now, there has been no consensus in the literature on the
type, structural material, and reason for demolition of 227 buildings in a
calculation of building life span [21,22]. The life span of buildings used
major North American city. The life spans of most of the demolished
in prior LCA studies [19,21–40] varies from 40 to 150 years. Most LCA
steel and concrete buildings were less than 50 years and no significant
and LCC practitioners do not estimate the actual life span of buildings
relationship was found between structural material and average service
but only apply default values obtained from the relevant structural
life. Moreover, ‘local re-development’ (34%), ‘lack of maintenance’
calculation codes [e.g., 20]. When wooden or high-strength concrete
(24%), and ‘building no longer suitable for intended use’ (22%) were the
structures were used to fabricate the main structure of a building, its life
most important external factors affecting building life. Dias [42]

Table 1
Building life span and basis of estimation in existing LCA and LCC studies.
Building life span Basis of estimation Title Country Year Main frame type
(years)

1 50 Simple assumption Energy use during the life cycle of single-unit dwellings: Sweden 1997 Wood
Examples [21]
2 50 Economic building life span Life cycle assessment of four multi-family buildings [22] Sweden 2001 Lightweight concrete &
in Sweden concrete, RC, Wood, Steel
columns and concrete
3 40 National corporate tax law Method of economic analysis for remodelling of The Republic 2005 RC
deteriorated apartments using the life cycle costing [23] of Korea
4 50 Buidling life expectancy in Comparison of environmental effects of steel and US 2005 Steel frame, cast-in-place
midwestern U.S. concrete-framed buildings [24] concrete frame
5 50 Simple assumption Life cycle assessment of office buildings in Europe and the US & Europe 2006 Steel framed RC
United States [25]
6 100 Simple assumption Comparison of the life cycle assessments of an insulating US 2006 Wood, Insulating concrete
concrete form house and a wood frame house [26] form
7 50 Simple assumption Sustainability based on LCM of residential dwellings: a Spain 2008 Brick
case study in Catalonia, Spain [27]
8 50 Simple assumption Life cycle assessment in buildings: state of the art and Spain 2009 RC
simplified LCA methodology as a complement for
building certification [28]
9 50 Simple assumption The environmental impact of the construction phase: An Spain 2010 Block
application to composite walls from a life cycle
perspective [29]
10 50 Simple assumption Life cycle assessment of a house with alternative exterior Portugal 2011 RC
walls (Comparison of three impact assessment methods)
[30]
11 50 (existing The lower limit of the Life cycle CO2 evaluation on reinforced concrete The Republic 2011 RC
building) lifespan of high-strength structures with high-strength concrete [31] of Korea
100 (high- concrete structure
strength concrete
building)
12 50 Simple assumption Assessment of the environmental performance of Austria 2012 RC, Brick, Timber
buildings: A critical evaluation of the influence of
technical building equipment on residential buildings
[32]
13 50 Prior work Environmental impacts of the UK residential sector Life UK 2012 Brick and Block
cycle assessment of houses [19]
14 50 Simple assumption Life cycle assessment of the air emissions during building Hong Kong 2013 RC
construction process: A case study in Hong Kong [33]
15 45 A durable period of a build Sustainability life cycle cost analysis of roof The Republic 2014 RC
waterproofing methods considering LCCO2 [34] of Korea
16 50 Prior work Life cycle assessment (LCA) of roof waterproofing systems The Republic 2014 RC
for reinforced concrete building [35] of Korea
17 50 (Alt1) Simple assumption Life cycle cost optimization within decision making on The Czech 2014 RC
30 (Alt2) alternative designs of public buildings [36] Republic
18 50–150 Prior work Life cycle assessment (LCA) of building refurbishment: A Spain 2017 RC, etc.
literature review [37]
19 100 Simple assumption Life cycle assessment of building materials for a single- Sweden 2019 Wood, Concrete
family house in Sweden [38]
20 100 The expected building LCC Estimation Model: A Construction Material The Czech 2019 Masonry
lifetime in Czech Republic Perspective [39] Republic
21 40 Simple assumption Life cycle greenhouse gas emission and cost analysis of The 2020 Prefabricated concrete
prefabricated concrete building façade elements [40] Netherlands

Note: As shown in the table above, there are many cases in the LCC-related studies in which the life span was estimated to be a certain number of years using a simple
assumption (without providing any explicit reference). In the LCA-related research, most studies assumed the life span to be 50 years by referring to prior work. These
approaches are still widely used today.

3
S. Ji et al. Building and Environment 205 (2021) 108267

surveyed the conditions of several buildings under 125 years old in a • Authorised Register on Demolition Permit: When the owner or
humid, tropical environment to find that the change in function of or manager of a building has lost all or part of the building owing to a
lack of investment in a building reduced its life span and that new disaster or other destruction, they shall prepare an ‘Architecture Loss
planning regulation or archaeological heritage had an impact on Declaration’ and file it with the Governor of the Special Self-
extending the life span. Liu et al. [43] used an improved hedonic model Governing Province or the head of the District (called Si, Gun, Gu
to investigate the factors influencing the life spans of 1732 demolished in Korean) within a week of the scheduled demolition date.
buildings from 2008 to 2010 in seven communities of Jiangbei District,
Chongqing, China. The average service life of the buildings was deter­ The South Korean construction administration system, named
mined to be 34 years—much shorter than the design life span in China. Seumteo, has been established to handle almost all construction-related
They also noted that external influences were more important than in­ administrative duties. The construction data describing building per­
ternal influences, and that internal factors excluding floor area were less mits, construction reports, completion (Approval of Use), maintenance,
important than expected. Grant [9] modelled nine combinations of and demolition are provided to the private sector through an open data
building envelopes using five service life models. The study results service named the Architectural Data Private Open System [16]. Even in
indicated that the life span of a building varied with its conditions, and South Korea, where the information technology industry is relatively
the annual cumulative life cycle and life cycle effects depended mainly advanced, such large-scale data on buildings have not been released to
on the predicted life span of the materials and the environmental indi­ the private sector until recently. Although many other countries have
cator used. not yet opened administrative building data to the private sector, more
and more countries are expected to share the construction- and
3. Methodology demolition-related information freely in the form of raw data as time
goes on.
3.1. Big data approach The service life (or service period) of a building is the period for
which it remains durable, stable, and resilient [17]. The international
3.1.1. Data sources for building life-span prediction standard ISO 15686:2014 documents an analysis of this concept [18].
In many countries, to be in compliance with the building codes, Realistically, the building service life or building life span generally
building permits, construction permits, and construction reports are refers to the period between its construction and demolition, as the
required when constructing a building. Plan checks, which verify the actual life span does not solely depend on stability and durability.
compliance of buildings with the area plan requirement, are also being Therefore, the life span of a building can be considered as the period
implemented as part of the approval process [11]. Furthermore, builders between the date of approval of use and the date of demolition from an
must typically procure permits for use upon building completion and administrative perspective. Any country can therefore analyse and
apply for a permit prior to building demolition. In the United States, a predict the actual life span of a building in its territory by utilising its
building permit must be obtained before construction in accordance mandatory and practical administrative data, which has been accumu­
with the Law on Spatial Planning and Construction. Completed build­ lated by law as part of the construction permitting process. The larger
ings can be occupied and operated once a use permit has been obtained the quantity of data and the longer the period of recording, the more
indicating that all the relevant technical documents, standards, regula­ accurate a prediction based on this data can be.
tions, and norms have been observed. Finally, a demolition permit must
be obtained to demolish a building. These permits can be obtained from 3.1.2. Big data characteristics and challenges of the study data
the Department of Town Planning, Housing and Communal Affairs and The collection of the Korean dataset described in 3.1.1 involves a
Ecology. In the United Kingdom, a planning permit or development government-wide effort on establishing an integrated database from all
approval is necessary to construct or extend a building (including major the records produced in South Korea in relation to construction. Given
renovations), as well as to demolish a building in some jurisdictions [12, that various records and data from all over the country have been
13]. Under the Building Law of the People’s Republic of China, prior to accumulated over a long time period, the dataset reflected several
the commencement of construction work, the construction agency must characteristics of big data and consequently a big-data analysis strategy
apply for a construction permit with the department-in-charge of the was required to accomplish the intended purpose of this research. Below
government construction administration of the relevant administrative are the characteristics and the challenges of the integrated records that
district. When the construction is completed, a ‘Construction Process we encountered. We addressed the characteristics according to the 7Vs
Completion’ certificate is issued following an inspection [14]. In South presented in the work of Sivarajah et al. [55].
Africa, major permit stages and obligation items in construction, usage,
and demolition are specified in accordance with the National Building 1. Volume: The Seumteo database we accessed is established with all
Regulations and Building Standards Act (Act No. 103, 1977) [15]. In the construction related documents registered to the local govern­
most countries, the construction permit or declaration is mandatory, and ment offices throughout the nation. The partial database of the de­
although it may take slightly different names and forms, it involves molition records that this research mainly utilizes includes
nearly the same criteria. 1,812,700 cumulative demolition cases arranged with 78 different
In South Korea, the following permit-related records are created with building characteristics from the 1950s to the current date. As a full
regard to building construction and demolition: data record of construction events, the data used in this research can
be considered as one of the largest datasets used for analysing con­
• Authorised Register on Building Permit: A document detailing the struction and demolition events.
‘Building Permit’ expressing permission from the Governor of the 2. Variety: The database includes data with a different aspect of
Special Self-Governing Province or the head of a Si/Gun/Gu, by building characteristics. For example, characteristics such as an
which a person who intends to construct or repair a building is address, lot type, and environmental grade are recorded to describe
authorised. administrational information, while structures, number of floors,
• Authorised Register on Approval of Use: After the completion of and parking space type are recorded to describe the physical aspects
building construction, the owner of the building must apply for an of the building. All these different types of information are evaluated
‘Approval of Use’ in order to occupy the building by attaching a and registered from different procedures performed by different
supervision completion report and a construction completion docu­ groups of people with different professions. The variety of the type
ment prepared by the construction supervisor. and the source of this information cause challenges in processing and

4
S. Ji et al. Building and Environment 205 (2021) 108267

analysing source data, therefore cautious selection, imputation, and


integration of the data are necessary.
3. Veracity: The dataset has been collected for decades while the
registration procedure, form, input type of raw data have been
changed over time. This characteristic produces discrepancy in the
data quality and bias depending on the input methodology utilized.
Old data get corrupted or lost in physical form, producing data
sparsity when they are integrated and transferred to a large elec­
tronic database. Therefore, the validation and the control of sparse
data is necessary to keep an adequate level of veracity. For the same
reason, the database has been set up based on the information sub­
mitted to government offices under strict legal restrictions. There
also are several audit processes in place to verify the validity and
reliability of registered information and data, so that the Seumteo
database can be considered one of the most reliable construction
information databases in South Korea.
4. Velocity: As the Seumteo database accumulates the registration re­
cords submitted and approved by the local government offices, the
dataset gets updated on a monthly basis. To provide a more accurate
estimation of life span, the utilization of new data is essential so that
the model can respond to changes in the building characteristics or
new trends in the construction industry. Therefore, methodologies
with real-time analytical and evidence-based planning are necessary.
The machine learning-based approach in this research is an adequate
choice to address such challenges.
5. Variability: In terms of variability, there is a limitation in the data­
base and the dataset we utilize. It can be asserted that the meaning in
data changes as the requirements, terms, structures of construction
registration change over time. However, the Seumteo database aims
to provide a unified set of data for its users so that the variability will
be managed according to the legislation.
6. Visualization: Due to the sheet volume of the data and the diverse set
of features, it is a big challenge to represent key information in the Fig. 1. Flow of building life-span prediction approach using big data.
dataset in a pictorial or graphical layout. The dataset we utilize
typically requires the use of a geographic information system to from Step 1. There was no specific thresholds (e.g., minimum RMSE)
produce real-time, geographical rendering of registered building but each iteration required adjusting parameters for the model to get
information. a better outcome, which was indicated by the difference between the
7. Value: One of our goals in this research is to show the discrepancy estimated life span and the actual life span. Thus, if the subsequent
between life span calculated by conventional estimation methodol­ trial did not produce a better estimation result than the previous one,
ogy and the actual life span recorded. The Seumteo database can the parameters of the trial were disregarded and a new trial was
contribute to the comparison with the actual data recorded in the continued. Note that even if the problem is not solved by running the
dataset. In general, to estimate or forecast an accurate life span of a process once from beginning to end, it is not a failure; the results can
building, it is necessary to identify key building characteristics and be used to improve the model.
trace their influences on the life span. The Seumteo database in­ 5. Once an acceptable prediction was achieved, the model could be
cludes a huge amount of data regarding the buildings and con­ deployed in LCA and LCC analyses and used to support decision
struction activities in South Korea that would help to validate and making, and the results could be fed back into the model to better
find independent variables that affect the building life span. understand the building characteristics.

3.1.3. Focus and approach of analysis Indeed, every step of this process explores the data. Thus, the more
In this research, we chose South Korea as the research backdrop for iterations are run, the more information is generated, resulting in better
this study due to the extent and availability of its construction admin­ models [44].
istration records as well as the comprehensibility of data by the authors.
The overall flow of the big data analysis approach used to predict a 3.2. Building life span in South Korea
building’s life span for use in LCA or LCC is shown in Fig. 1.
The big data analysis approach was implemented in the following In South Korea, the life span of a building is considerably shorter
steps: than the one assigned to it according to its main structure type. Most
buildings in South Korea consist of RC, steel frame, or steel frame–RC
1. An understanding of the characteristics of the target building group structures; among these, 90% are RC structures [35]. Because limited
was developed. types of structures have been used in South Korea, a large amount of
2. Big data were collected describing as many factors as possible that data are available for each structure. Over the period covered in this
were likely to affect the life span of the buildings. study, there has been no change in the durability of buildings, and no
3. Predictive models suitable for data preparation and big data char­ drastic changes in construction methods, materials, or real estate and
acteristics were developed. building laws have occurred that would significantly affect the life span
4. Several iterations, from the previous stage to the predictive models of buildings in South Korea. In addition, the mandatory legal enforce­
were run until a model with satisfactory performance was achieved. ment of construction practices in South Korea means that there is almost
If the prediction was not acceptable, all steps were repeated starting no omission of building data.

5
S. Ji et al. Building and Environment 205 (2021) 108267

3.3. Exploratory study based on building registration records Table 3


Building-related information for life-span prediction.
To identify the characteristics affecting the life spans of buildings, Factors Details Source
the average life spans of each main frame type in each region were
1 Internal Building area Sum of the floor area of all EAIS
examined using data from an authorised register on demolition permits floors in all buildings on a site
in the Electronic Architectural Administration Information System 2 Internal Building floor area Floor area of building(s) is the EAIS
(EAIS). These data, described in detail in Appendix, constitute a total of sum of the gross horizontal
1,812,700 cumulative demolition records of buildings since the 1950s to areas of several floors
3 Internal Building structure Main structure types that make EAIS
December 2019, along with the characteristics of the buildings (i.e. main type up a building
frame type, site, gross area, etc.). In this study, after removing cases 4 Internal Roof structure type Types of major structures that EAIS
including missing values, misinputted data values and outliers from the make up the roof of a building
database, 971,514 building records were used to analyse building life 5 Internal Number of storeys Number of storeys above EAIS
above ground ground of a building
spans by major structural type and regions. The number of registered
6 Internal Number of storeys Number of storeys EAIS
records and the average life span from each period are listed in Table 2. under ground underground of a building
The EAIS databased built on the records which were constructed 7 Internal Number of Number of elevators in a EAIS
manually from 1950s by local governments personnel. The old hand­ elevators building
written data was transformed and digitized so that the database included 8 Internal Number of outdoor Number of outdoor parking EAIS
parking spaces spaces of a building
a large number of missing and misinputted values. Among 1,812,700 9 Internal Area of outdoor Area of outdoor parking spaces EAIS
records, 692,894 records with missing values, 148,284 records with parking spaces
misinputted data variables and 8 outliers, which have extremely long 10 External Approval date of Completion date of EAIS
life span (>200) were removed. Our primary focus of this research was use construction
11 External Building Type Detached building; Aggregate EAIS
to propose a better methodology to estimate building life span suitable
building
for LCC and LCA for general buildings, excluding extremely old build­ 12 External Building ledger General building; Section for EAIS
ings not typical for ordinary residence purposes. Type describing the title
Interestingly, according to Table 2, fewer than 250 buildings per year 13 External Metropolitan city Metropolitan city where EAIS
were demolished between the 1950s and 1970s. However, the reason for building is located
14 External City or county City or county where building EAIS
this aberration appears to be that many records from immediately after
is located
the end of the Korean War (1953) were lost in the process of digital­ 15 External District District where building is EAIS
isation, as it was often difficult to record data electronically because the located
existing handwritten records were too old to legitimise their contents. In 16 External Site Type Site; Forest land; Road; Rice EAIS
paddy; Field
addition, we have presumed that the number of demolition records
17 External Extra parcels Number of extra parcels EAIS
would only increase as we approach the modern age because more data 18 External Building Main building; Sub-building EAIS
would be stored in a computerised format, and there would be less data subordination type
loss. Thus, as the data are more recent, they are more likely to be 19 External Major use Major use of building EAIS
complete with no missing values and therefore to be included in the classification
20 External Seismic regional Classification of seismic hazard RBSS
analysis. In this study, the actual life span was calculated as the differ­
classification by location of buildings
ence between the year of demolition and the year of approval of use. 21 External Climatic regional Classification according to the ESDSB
classification heat perfusion rate table of
buildings by region
3.4. Machine learning-based building life-span estimation using building
registration records
analyses were performed using Python 3.5 on an Intel i5 8th-Gen CPU
The primary objective of this study was to predict the most realistic computer with 16 GB RAM running Windows 10. The software package
building life span using various machine learning techniques. Building was managed using Anaconda 4.8.4, and Pycharm 2020.02 was used as
life span was predicted based on an authorised register of the demolition the IDE for the actual programming and various library applications.
permit dataset from EAIS, seismic regional classification from Regula­
tion of Building Structure Standards (RBSS), and climate regional clas­
sification and insulation requirement information from Energy Saving 3.5. Development of machine-learning-based life-span estimation model
Design Standards of Buildings (ESDSB) [45,46]. A prediction model
containing overall 21 sets of building information as the independent 3.5.1. Model development overview
variables was thus developed using 19 sets of building information Machine learning commonly involves training a model using data
provided by EAIS and 2 sets of building information on seismic and and evaluating the ability of the model to generalise to new test data. In
climatic conditions provided by RBSS and ESDSB. The information addition to the classification of data, machine learning can be flexibly
describing these independent variables is provided in Table 3. The applied to solve various problems due to its ability to identify key re­
lationships between variables and predict patterns through data
learning [47]. Because machine learning makes it easy to analyse data
Table 2
using large volumes of multi-dimensional variables that are otherwise
Demolition records and average life cycle by period within the records to be
analysed. difficult to handle using traditional statistical methods, we considered
machine learning a suitable method for developing a prediction model
Period Number of demolition records Average building life span
using the data collected in this study.
(Cases) (Years)
When applying machine learning, the conditions required to obtain
1950s 53 12.7
optimal results depend on the problem and the characteristics of the
1960s 34 12.8
1970s 237 22.9 data. Therefore, it is necessary to apply techniques that are specific to
1980s 2324 20.3 the problem. Because the life cycle of a building is a continuous variable,
1990s 11,741 21.8 we constructed a prediction model using simple regression method as a
2000s 357,849 24.9 baseline and compared it with three other state-of-the-art machine
2010s 599,276 30.0
learning models to assess their potential improvements relatively.

6
S. Ji et al. Building and Environment 205 (2021) 108267

3.5.2. Regression model set to adjust the model performance and training time. To pre-process
For the baseline condition, the linear regression method from the the data, we used the pre-processing and scaler functions of Scikit-
Scikit-Learn Python library was used. A linear regression model seeks to Learn. The training and test data sets were normalised to values be­
describe the relationship between independent variables (x1 , ⋅ ⋅ ⋅ , xn ) tween 1 and 0 using the MinMaxScaler method provided with the pre-
and a continuous dependent variable (y) by fitting a linear equation processing package. Additionally, we sought the optimal model
y = w1 × x1 + ⋅⋅⋅ + wn × xn + b) to the data. The linear regression
(̂ parameter using the GridsearchCV function and Bayesian optimiser
module provided by Scikit-Learn uses the singular value decomposition packages provided in Scikit-Learn.
(SVD) method to calculate the pseudo-inverse matrix to build a model.
Like traditional statistical analysis, a model that minimises the root 3.5.7. Model training and evaluation
mean square error (RMSE) and maximizes the coefficient of determi­ Building life spans were trained and predicted by developing the
nation (R2) is considered to be more accurate. regression, XGBoost, LightGBM, and DNN models using the demolition
permit dataset. 971,514 cases, which included data from January 1950
3.5.3. XGBoost ensemble model to December 2019 were used to train the prediction models. For test
Extreme gradient boosting (XGBoost) is an improved model of the data, we acquired a recently generated dataset separately to test the
decision tree boosting technique presented by Chen and Guestrin [48]. performance of the model trained with records up to the 2010s. The
Boosting is an ensemble technique that combines numerous weak clas­ most recently recorded authorised register on demolition permit dataset
sifiers to create a strong classifier that can compensate for the errors (38,676 cases) from January 2020 to June 2020 was used as a test set.
from the previous steps while sequentially executing the weak-tree The new test dataset was preprocessed in the same manner as training
classifiers. In this manner, XGBoost improves the slow learning speed data. Overall, a total of 1,010,190 records from January 1950 to June
of boosting through parallelisation and is accordingly gaining popularity 2020 were used in this experiment. The resulting RMSE and R2 were
by proving its excellence in the data-analysis-competition platform calculated and compared to verify the performance of each model.
Kaggle (http://www.kaggle.com).
4. Results
3.5.4. LightGBM ensemble model
The light gradient boosting machine (LightGBM) algorithm de­ 4.1. Exploratory building life-span analysis
creases the data processing time using the gradient-based one-side
sampling and exclusive feature binding techniques. Its speed is thus far We conducted an exploratory analysis of the data to develop an
greater than that of the XGBoost and quantile gradient boosting initial understanding of how building life span changes by main frame
regression tree models, which use the existing gradient boosting deci­ type and regional differences.
sion tree [49].
4.1.1. Analysis of number of buildings by main frame
3.5.5. Deep neural network model The number of buildings classified by main frame type and their
As an alternative to XGBoost and LightGBM, we employed an arti­ corresponding percentages of representation in the data are shown in
ficial neural network (ANN) based on TensorFlow to implement a deep Fig. 3. There were 25 types of main frames in the data. In Fig. 3, the main
learning neural network (DNN) model. TensorFlow has become the most frame types that accounted for less than 1% of the data, including steel,
used machine learning software library since Google released it as a free cement block, and container, are grouped together and indicated as
and open-source library in November 2015 [50]. The ANN is a data ‘Others’. With 36%, brick buildings accounted for the largest proportion
processing system consisting of multiple layers, variable connection of the data, followed by block, RC, general wood, lightweight steel, steel,
strengths, a transition function, and a learning algorithm, and has a steel pipe, masonry, and others. Thus, RC buildings, which were
structure in which weights are repeatedly adjusted through the input assumed to be predominant, accounted for the third largest proportion
and output data values to eventually reflect the relationship between the of buildings in the data set. Although RC has been widely used since the
two. 1950s, brick, block, wood, lightweight steel, etc. are suitable for low-rise
In this study, we constructed a hidden layer in the form of an inverted buildings, are fireproof, durable, and easy to construct, and as a result
pyramid to output a single dependent variable using a limited number of were widely used after the Korean War.
independent variables. Thus, five hidden layers were used to minimise Fig. 4 illustrates the percentages of buildings by region in South
the mean squared error between the predicted and actual building life Korea according to main frame type. Brick buildings constitute the
spans (Fig. 2). The rectified linear unit (ReLU) was used as the activation largest share of building frame type in of South Korea overall, and in
function of the hidden layer, and the Adadelta optimiser provided by Seoul, Gyeonggi, Daegu, and Chungbuk. However, block buildings are
Keras was used as the gradient boost optimiser. the most common in Busan and Jeju, whereas wooden buildings are
most common in Gangwon, Gyeongnam, Jeonnam, and Jeonbuk, and
3.5.6. Pre-processing and model parameter tuning lightweight steel buildings are most common in Jeonbuk. It is presumed
We pre-processed and optimised the model parameters using a data that the preference for a particular frame type varies according to the

Fig. 2. Visual representation of the structure of a deep neural network.

7
S. Ji et al. Building and Environment 205 (2021) 108267

Fig. 3. Number of buildings according to main frame type.

site conditions in each region. span for the eight most used main frame types, excluding the two of the
top ten that accounted for less than 1% in total. As indicated in the
4.1.2. Analysis of building life span by main frame type graph, the building life spans exhibited large variations according to
To examine how the building life span changes by main frame type, main frame type. This overall trend reveals that the differences between
the average of building life span was computed for the ten most used the main frame types have a significant influence on the average life
main frame types identified. Fig. 5 illustrates the average building life span. In addition, it is commonly understood that there can be a

Fig. 4. of buildings in South Korean regions by main frame type.

8
S. Ji et al. Building and Environment 205 (2021) 108267

Fig. 5. Distribution of average building life span by main frame type.

Fig. 6. Average building life span according to region.

significant difference between the durability life span of a building and and Jeonnam (31 years). Notably, there were no specific factors dis­
its actual life span according to the main frame type. For example, a tinguishing different regions. Therefore, it cannot be concluded that the
wooden building is generally known to have a life span of 50–100 years characteristics of a region (i.e. local vs. city, regional building prefer­
but has an actual average building life span of 52.6 years, which is close ence, etc.) are key factors in determining the life span of a building.
to the lower limit. The assumed life spans of block and brick buildings However, in the case of Sejong City, the average life span was found to
are approximately 50 years each, whereas their actual average life spans be far lower than the other regions, a difference that may reflect the
were found to be 31.3 and 29.3 years, respectively. Buildings made of rapid, large-scale demolition of existing buildings occurred recently in
RC, which is the most used material for new buildings today, have an accordance with the creation of a newly planned city established by the
assumed durability life span of 50–100 years, but an actual average life Korean government policy.
span of only 22.8 years.
4.1.4. Analysis of building life span according to main frame type by region
4.1.3. Analysis of building life span by region Fig. 7 depicts the average life spans of buildings according to the
The average building life cycle and its distribution were analysed by most commonly used main frame types throughout South Korea. In all
region according to the classification of metropolitan cities and districts the regions, wooden buildings were found to have the highest average
in South Korea. As indicated in Fig. 6, there was little variation between building life span. Although there were slight differences between re­
regions in terms of average life cycle. In particular, there were five re­ gions, brick and block buildings had the second and third longest life
gions with an average building life of more than 30 years: Jeonbuk (34.2 spans, respectively. In most areas, RC buildings and gauge steel build­
years), Busan (32.6 years), Gyeongnam (32.4 years), Seoul (31.4 years), ings had the fourth and fifth longest life spans, respectively. There were

9
S. Ji et al. Building and Environment 205 (2021) 108267

Fig. 7. Average building life span by region for each main frame type.

model estimation due to its great volume and large variance from the
Table 4 real data. Therefore, it is necessary to use a metric robust to generalized
Performance comparison between machine learning models.
predictions. Other than prediction precision, we also needed to check if
Model RMSE R2 the regression model is well representing the observed value and dis­
Linear regression 4.53 0.934 tribution of life span. Therefore, we present R2 value (also known as the
XGBoost 4.60 0.932 coefficient of determination). There are arguments in prior studies that
LightGBM 4.33 0.939 SMAPE (symmetric mean absolute percentage error) should be used for
DNN 3.72 0.955
such criteria, however, recent comparative research claimed superiority
of R2 over SMAPE [56]. The RMSE of the linear regression model,
no significant differences between the life spans of buildings with the XGBoost regression model, LightGBM model, and DNN model were
same main frame type according to region. The average RC building life distributed between 3.72 and 4.6, and the coefficient of determination
span in Busan was the longest at 26.3 years, followed by Seoul and was greater than 0.9. The examined models were all able to predict the
Jeonbuk, both with an average life span of 24.3 years, and Daegu with life span of buildings in South Korea reasonably well and the DNN model
the third longest life span of 23.9 years. The region with the shortest was the best among them.
average building life span was Gyeonggi Province at 19.9 years. The performance of the XGBoost model was slightly weaker than the
Notably, Seoul is the largest city, with the largest number of residents regression model. According to prior research, XGBoost is expected to
and buildings in South Korea, while Daegu and Busan are also large show similar prediction performance as the LightGBM model because
cities with relatively high population and building densities. Therefore, they both are based on gradient boosting [49]. It is difficult to explain
the life spans of buildings must be long. However, it is difficult to make a precise reasons for the weaker performance of XGBoost, but the
blanket statement based on this simple analysis because Jeonbuk is not a
densely populated area.
Table 5
Key variables for each model sorted in the order of importance.

4.2. Analysis of prediction models and comparison of their performances Regression model Variables

Linear regression Date of use permit


The analysis results obtained using the four predictive models are Number of storeys above ground
shown in terms of RMSE and R2 in Table 4. As the RMSE represents an Number of storeys underground
Number of elevators
error value, the smaller the number, the smaller the error. In regression Metropolitan city
model evaluation with machine learning methodology, there is no XGBoost Date of use permit
consensus on a standard metric to assess the results of regression itself. Roof structure type
The mean square error (MSE) and its rooted variant (RMSE), or the mean Major use
Building area
absolute error (MAE) and its percentage variant (MAPE) were often
Building floor area
employed to evaluate prediction outcomes [56]. In this research, we LightGBM City or county (Si, Gun, Gu)
used RMSE to evaluate the model outcome because the metric is robust Date of use permit
to the influence of outliers and their errors [57]. Even though we stan­ District (Dong)
dardized values and preprocessed outliers and abnormal values in our Major use
Main frame type
dataset, there is the possibility that a few large values may distort the

10
S. Ji et al. Building and Environment 205 (2021) 108267

implementation details of the two algorithms seem responsible for the classified key factors (or features) while providing more accurate results
subtle difference between the two gradient-boosting models we have than other prediction methods. Because machine learning methods can
utilized in the study. The two algorithms are different in how they utilize patterns that exist in the given data set that are otherwise difficult
manage the nodes in splitting decision trees based on the gradient. to identify, they represent a suitable approach to solving highly complex
LightGBM uses Gradient-based One-Side Sampling (GOSS) to filter out problems, including building life span prediction. Thus, as demonstrated
the data instances for finding a split value. This approach applies a in this study, we can build a computational model using machine
multiplier constant to nodes with a small gradient, which allows learning methods that is more accurate than the currently employed life
focusing on under-trained features. However, XGBoost uses a pre-sorted cycle pattern analysis method (which is based on assumptions regarding
algorithm so that the model can be robust to overfitting but may building characteristics), providing improved predictive power over
disregard instances with small gradients. general regression analysis.
To further understand the comparative characteristics of the models, However, it should be noted that the methods evaluated in this study
we examine the key features of each model we constructed. Except for have several potential limitations. First, it is difficult to predict the life
the DNN model, the importance of each variable to the models can be span of buildings constructed using innovative methods and materials,
determined based on the weight or importance of the associated feature as there are no actual life span data that reflect the life span of such
in each model. Table 5 lists five main variables with the strongest in­ futuristic buildings. Innovative methods have the potential to drastically
fluence on the predicted values for each model. increase the general life span of a building. Another limitation may arise
Table 5 reveals that each model has a different set of important from scenarios in which current building laws have been amended or a
features. They all agree that the date of use approval is an important natural calamity has occurred, suddenly impacting building life spans.
variable. However, there are significant differences in the combination However, the current trend of using RC, steel, and wood is not likely to
of the five variables and their importance levels across the prediction change for a while. Moreover, although a few housing laws and building
models. Therefore, it shows that each model operates in a distinct way, laws may be enacted, altered, or abolished, no change in the basic
implying that it is important to consider a diverse set of variables and the construction paradigm is likely to occur. Authorities have endeavoured
influence of those variables together. To more directly examine the to prolong the life spans of recently constructed buildings by responding
problem of conventional life-span estimation, which employ main frame to the predictable functional and environmental changes and improving
type and region as the independent variables to estimate building life structural durability. If changes occur in the future that directly affect
span, we applied those two variables to the same machine learning building life span, such as the introduction of completely different
estimation protocols we used in this research. Table 6 shows the RMSE construction methods like building 3D printing technology, populari­
and R2 values for each of the predictive models using either or both of zation of new materials, or revision of re-development and reconstruc­
the major structural types of buildings and regions, which are most tion laws that affects building profitability, it will be necessary to
commonly employed features in determining the life span of a building consider a new hybrid research method that includes a more conven­
in LCA or LCC studies. tional factor method as well as the big data approach employed in this
As shown in Table 6, RMSE values are significantly greater than the study.
values presented in Table 4, revealing that there are large discrepancies It is important to note that the proposed methodology and prediction
between real building life span and estimated building life span, which is models are estimated based on the prior records of the buildings that are
predicted based on one of, or both of, the structural and regional factors. effectively demolished. Therefore, the prediction may not reflect the
Therefore, the proposed methodology of this research should be characteristics of existing buildings with long life spans and buildings
considered superior over the traditional estimation approaches. with different feature variances which may occur in the future. To
mitigate the bias and the discrepancy between the model prediction and
5. Discussion such untrained life span cases, we constructed and trained the models
with the records from 1950 to 2019, and have the model predict the life
In this study, the prediction of building life span was attempted for span of the buildings demolished in 2020, which are not reflected in the
the first time based on real-world administrative big data that spans over trained models. Even though, there still exists the potential limitation of
65 years of a whole nation. Big data analysis performed in this study model performance for cases with different feature variances, our study
showed that the actual life spans of the building differ substantially from has demonstrated that the proposed performs better than the conven­
those calculated using the mainframe type or region variables, indi­ tional approach. Moreover, there is a room for improvement as the
cating that the current estimation practices simply based on those var­ construction datasets are continuously being collected and applied to
iables are far from the reality. Despite its widespread use, it is not the model training.
appropriate to perform an LCA or LCC using the simple estimation of a The most practical problem that we wanted to address in this
building’s life span that is significantly different from the actual one. research was to introduce an improved methodology that estimates the
Machine learning methods, especially the DNN, depend less on pre- realistic life span necessary for accurate LCC and LCA results. Thus, we
aimed to employ and test a methodology that is universally applicable to
Table 6 the life span estimation modeling for LCC and LCA. As a data-driven
Performance comparison between machine learning models estimated with approach, our proposed model may not be permanent and have limi­
mainframe type and region. tations in predicting unobserved cases. However, our study results show
Factor variable Model RMSE R2
that our methology based on the utilization of big data and machine
learning is good enough to enhance the outcome of LCC and LCA cal­
Main frame Type Linear regression 18.40 0.080
culations. The methodology in this study can be employed universally,

XGBoost 20.30 − 0.315
LightGBM 20.31 − 0.316 with only minor adjustments for different datasets.
DNN 19.76 − 0.246 Building construction requires compulsory administrative processes
Region Linear regression 18.03 − 0.037 from building permits to construction reports to authorisation of use to
XGBoost 17.90 − 0.023 demolition. Therefore, if a computerised administrative registration
LightGBM 17.90 − 0.023
DNN 17.97 − 0.030
process for maintenance work such as waterproofing, facilities, and
Main frame Type & Region Linear regression 18.19 − 0.005 replacement of internal and external equipment is implemented, big
XGBoost 18.96 − 0.148 data can be collected on the replacement cycle and cost of materials or
LightGBM 20.13 − 0.294 construction methods during the maintenance stage. Thus, life expec­
DNN 19.18 0.175

tancy could be accurately predicted using big data for construction

11
S. Ji et al. Building and Environment 205 (2021) 108267

methods or building materials using the same method presented in this Table A1 (continued )
study. In addition, including the required quantity and cost in the record Factors Data Type
items generated during the maintenance process would help more Number of storeys
accurately analyse the economic and environmental impacts of building underground
life cycle characteristics. Number of elevators 0–93
Number of outdoor 0–9012
parking spaces
6. Conclusion Outdoor parking area 0–495,420
Date of approval of use 1800–2018
The objectives of this study are to investigate how much the pre­ Building type ‘Detached building’, ‘multi-unit dwelling’
Building ledger type ‘Section for describing the title’, ‘General
sumed building life span, which has been commonly utilized to perform
building’
LCA or LCC analyses, differs from actual building life span and to find a Metropolitan city ‘Seoul’, ‘Gyeonggi’, ‘Gangwon’, ‘Daegu’,
better approach that can accurately predict building life span, thereby ‘Incheon’, ‘Gwangju’, ‘Ulsan’, ‘Gyeongnam’,
contributing to more accurate practices of LCA or LCC analysis. In prior ‘Jeju’, ‘Daejeon’, ‘Chungbuk’, ‘Chungnam’,
LCA or LCC studies, the life span was often assigned to be either 50 or ‘Jeonbuk’ ‘, ‘Jeonnam’, ‘Gyeongbuk’, ‘Busan’,
‘Sejong’
100 years old depending on the structures. However, our analysis results City or county code (Si, 11,110–50130
show that the actual life span of the building was very different from that Gun, Gu)
of the 50 years or 100 years that was commonly used in those studies. District code (Dong) 00000–47029
Therefore, there is a dire need to reflect a realistic building life span in Site type 0,1,2
Extra parcels 0–4734
the LCA and LCC, rather than rely on an incorrectly presumed one.
Building subordination ‘Main building’, ‘Sub building’
Responding to this need, we have explored the possibility of applying type
prominent machine learning approaches to the data set of South Korean Major use ‘Multi-unit dwelling’, ‘detached building’, ‘class
building registration and demolition records (971,514 cases). Our study classificatiral accon 2 neighbourhood living facilities’, ‘educational
results clearly show that the actual building life spans are very different research and welfare facilities’, ‘class 1
neighbourhood living facilities’, ‘neighbourhood
from those presumed numbers. In the case of a reinforced concrete living facilities’, ‘animals and plants related
structure building, which is currently the most used structure in South facilities’, ‘warehouse facilities’, ‘factory’,
Korea, the average life span of the actual buildings was 22.8 years, with ‘business facilities’, ‘offices’, ‘automotive
a difference of 27.2 years from the standard reference of 50 years. In the facilities’, ‘correctional and military facilities’,
‘accommodation facilities’, ‘education and
case of the brick structure building, which accounts for 36% of the total
research facilities’, ‘dangerous goods storage and
buildings in the study, the life span of the building was 29.3 years, with a treatment facilities’, ‘public facilities’, ‘elderly
difference of 20.7 years. people facilities’, ‘excretion and waste disposal
In this study, we have also sought an effective way to predict realistic facilities’, ‘tourist rest facilities’, ‘cultural and
building life spans by applying the latest machine learning prediction assembly facilities’, ‘sales and sales facilities’,
‘religious facilities’, ‘amusement facilities’,
methods. Building life spans were predicted using four prediction ‘medical facilities’, ‘retail stores’, ‘other
models including linear regression, XGBoost, LightGBM, and Deep warehouse facilities’, ‘exercise facilities’, ‘singing
Neural Network (DNN). With this big data approach, even the basic practice grounds’, ‘training facilities’, ‘sales
linear regression model showed a fairly powerful prediction accuracy of facilities’, ‘restaurants’, ‘multi-family housing’,
‘oil sales office’, ‘academy’, ‘joint house’,
93.4%. Moreover, among the alternative machine learning models
‘congratulations’, ‘repair store’, ‘greenhouse’,
examined in this study, the prediction accuracy of the latest DNN model ‘community office’, ‘dormitory’, ‘public office’,
was 95.5%, which was 2.2% better than the linear regression model. ‘market’, ‘religious assembly hall’, ‘apartment’,
Overall, the big data approach shows that those machine learning ‘cemetery related facilities’, ‘general restaurant’,
models are far more accurate in predicting actual life spans of buildings, ‘manufacturing establishment’, ‘quarantine
hospital’, ‘transport facility’, ‘multiple housing’,
relative to the current practices relying on the presumed numbers, and
‘shop’, ‘church’, ‘police box’, ‘general business
DNN is the most effective model among those models examined in this facility’, ‘multi-family housing’, ‘gymnasium’,
study. The findings of this study contribute to the prediction of realistic ‘broadcasting and communication facilities’,
building life spans, opening up a new horizon for various evaluations ‘temporary buildings’, ‘power generation
facilities’, ‘shelters’, ‘golf driving range’, ‘power
and decision making practices in the building construction industry.
plants’, ‘kindergarten’, ‘dangerous goods
storage’, ‘health centre’, ‘research institute’,
Table A1 ‘military facility’, ‘other class 1 neighbourhood
Data and type by factor facility’, ‘funeral facility’, ‘geneommodation
facility’, ‘industrial exhibition centre’, ‘cosmetic
Factors Data Type shop’, ‘wholesale market’, ‘day-care centre’,
‘hospital’, ‘apse chapel’, ‘tourist
Building area (m2) 0–6,316,730 accommodation’, ‘living area training facility’,
Building floor area 0–204,930,809 ‘garage’, ‘prison’, ‘storage shop’, ‘general
(m2) factory’, ‘slaughterhouse’, ‘liquefied gas
Main frame ‘RC’, ‘Brick’, ‘Steel frame’, ‘Light gauge steel’, handling station’, ‘parking lot’, ‘children-related
‘Block’, ‘General wood’, ‘Masonry’, ‘General facilities’, ‘water purification plants’, ‘waste
steel’, ‘Stone’, ‘Steel pipe’, ‘Wood’, ‘Concrete’, treatment facilities’, ‘resource recycling
‘Steel RC’, ‘Precast concrete’, ‘Steel concrete’, facilities’, ‘hotels’, ‘gas stations’, ‘inns’,
‘Steel house’, ‘Log’, ‘Cement block’, ‘subsidiary facilities’, ‘living facilities’, ‘temples’,
‘Prefabricated panel’, ‘Steel pipe’, ‘Soil brick’, ‘social welfare facilities’, ‘camping facilities’,
‘Structure consisting of beams, columns, and ‘bowling alley’, ‘reading room’, ‘science hall’,
slabs’, ‘Container’, ‘Steel RC synthesis’, ‘Truss ‘post office’, ‘other sales and sales facilities’,
wood’ ‘athletic stamp’, ‘substation’, ‘airport facilities’,
Roof structure type ‘RC’, ‘Other roof type’, ‘Slate’, ‘Tile’ ‘entertainment tavern’, ‘excretion treatment
Number of storeys 1–274 facility’, ‘youth hostel’, ‘chicken factory’,
above ground ‘newspaper’, ‘waste recycling facility’,
0–51 ‘exhibition hall’, ‘theatre (movie theatre)’, ‘drug
(continued on next column) (continued on next page)

12
S. Ji et al. Building and Environment 205 (2021) 108267

Table A1 (continued ) [3] Un Environment and International Energy Agency, Global Status Report 2017:
towards a Zero-Emission, Efficient, and Resilient Buildings and Construction
Factors Data Type
Sector, United National Environment Programme, 2017.
clinic’, ‘welfare facilities’, ‘broadcasting [4] B. Unalan, H. Tanrivermis, M. Bulbul, A. Celani, A. Ciaramella, Impact of embodied
stations’, ‘other public facilities’, ‘other animals carbon in the life cycle of buildings on climate change for a sustainable future, Int.
and plants related facilities’ J. Hous. Sci. Appl. 40 (1) (2016) 61–71.
[5] A Trend Analysis of Building Life Cycle Assessment (LCA), Ministry of
Seismic regional I, II
Environment, Korea Environmental Industry Technology Institute, Republic of
classification
Korea, 2018.
Climatic regional Central region 1, central region 2, southern
[6] R. Pernetti, F. Garza, G. Paoletti, D2.2: Spreadsheet with LCCs. A Database for
classification region Benchmarking Actual NZEB Lifecycle Costs of the Case Studies, Horizon 2020
* Some data outliers were found, but the frequency of data with a building area Framework, Programme of the European Union, 2018.
[7] Committee on Trade and Investment, Life Cycle Assessment Best Practices of ISO
of 1,000,000 m2 or more is 1 and that with a total floor area of 3,000,000 m2 or 14040 Series, 2006.
more is 2, the data frequency of more than 50 storeys above ground level is 2, [8] A. Grant, R. Ries, Impact of building service life models on life cycle assessment,
that of more than 10 basement storeys is 7. It was confirmed that there were only Build. Res. Inf. 41 (2) (2013) 168–186, https://doi.org/10.1080/
four data frequencies exceeding 1000 other parcels and had little effect on the 09613218.2012.730735.
overall big data analysis. [9] A. Grant, R. Ries, C. Kibert, Life cycle assessment and service life prediction: a case
study of building envelope materials, J. Ind. Ecol. 18 (2) (2014) 187–200, https://
Table A2 doi.org/10.1111/jiec.12089.
[10] [Table A5] Table of Standard Useful Life and Scope of Useful Life of Buildings,
Earthquake area and area coefficient
Rules of the Corporate Tax Act, Ministry of Economy and Finance, Republic of
Korea, Order No. 792, [Revised] 21 April 2020.
Earthquake area Administrative district Earthquake [11] Demolition Plan Check/Permit Application (PDF). Santa Monica. 2017-06-05.
classification area coefficient [Retrieved] 2020-01-27.
[12] Planning permission—UK government [Retrieved] 27 November 2020, www.gov.
I City Seoul, Busan, Incheon, Daegu, 0.22 g uk.
Daejeon, Gwangju, Ulsan, [13] R. Harwood, Planning permission, International Specialized Book Services (24
Sejong September 2015). ISBN 9781780434919, [Retrieved] 5 March 2017.
Province Gyeonggi, South Gangwon1), [14] Building Act. People’s Republic of China. [Revised] 22 April 2011.
Chungbuk, Chungnam, [15] National Building Regulations and Building Standards ACT. Act No. 103,
Jeonbuk, Jeonnam, Government Gazette, Republic of South Africa, 1977.
Gyeongbuk, Gyeongnam [16] Architectural data private open system, Electronic Architectural administration
Information System (EAIS). [Retrieved], https://open.eais.go.kr/. (Accessed 3
II Province North Gangwon2), Jeju 0.14 g
November 2020).
Annotation
[17] Gobierno de Espãna, Código Técnico de la Edificación. DB-SE, Ministerio de la
1) South Gangwon: Gangneung, Donghae, Samcheok, Wonju, Taebaek, Yeongwol,
Vivienda, Madrid, 2010.
Jeongseon [18] Iso, Building Construction – Service Life Planning – Part 4: Service Life Planning
2) North Gangwon: Sokcho, Chuncheon, Goseong, Yanggu, Yangyang, Inje, Using Building Information Modeling, 2014, p. 2014. ISO 156864.
Cheorwon, Pyeongchang, Hwacheon, Hongcheon, Hoengseong [19] R.M. Cuéllar-Franca, A. Azapagic, Environmental impacts of the UK residential
sector: life cycle assessment of houses, Build. Environ. 54 (2012) 86–99, https://
Table A3 doi.org/10.1016/j.buildenv.2012.02.005.
[20] B. Palacios-Munoz, B. Peuportier, L. Gracia-Villa, B. López-Mesa, Sustainability
Classification by region according to climate in Energy Saving Design assessment of refurbishment vs. new constructions by means of LCA and durability-
Standards of Buildings based estimations of buildings lifespans: a new approach, Build. Environ. 160
(2019) 106203, https://doi.org/10.1016/j.buildenv.2019.106203.
[21] K. Adalberth, Energy use during the life cycle of single-unit dwellings: Examples,
Region Administrative district Build, Environ. Times 32 (4) (1997) 321–329, https://doi.org/10.1016/S0360-
classification 1323(96)00069-8.
[22] K. Adalberth, A. Almgren, E.H. Petersen, Life cycle assessment of four multi-family
Central region 1 Gangwon (excluding Goseong, Sokcho, Yangyang, Gangneung,
buildings, Int. J. Low Energy Sustain. Build. 2 (1) (2001) 1–21.
Donghae, and Samcheok), Gyeonggi (Yeoncheon, Pocheon,
[23] B. Son, A method of economic analysis for remodeling of deteriorated apartments
Gapyeong, Namyangju, Uijeongbu, Yangju, Dongducheon, using the life cycle costing, Architectural Institute of Korea 21 (2001) 73–81.
Paju), Chungbuk (Jecheon), Gyeongbuk (Bonghwa, [24] A.A. Guggemos, A. Horvath, Comparison of environmental effects of steel-and
Cheongsong) concrete-framed buildings, J. Inf. Syst. 11 (2) (2005) 93–101, https://doi.org/
Central region 2 Seoul, Daejeon, Sejong, Incheon, Gangwon-do (Goseong, 10.1061/(ASCE)1076-0342(2005)11:2(93).
Sokcho, Yangyang, Gangneung, Donghae, Samcheok), [25] S. Junnila, A. Horvath, A.A. Guggemos, Life-cycle assessment of office buildings in
Gyeonggi (excluding Yeoncheon, Pocheon, Gapyeong, Europe and the United States, J. Inf. Syst. 12 (1) (2006) 10–17, https://doi.org/
Namyangju, Uijeongbu, Yangju, Dongducheon, Paju), 10.1061/(ASCE)1076-0342(2006)12:1(10).
Chungbuk (excluding Jecheon), Chungnam, Gyeongbuk (except [26] M.L. Marceau, M.G. VanGeem, Comparison of the life cycle assessments of an
Bonghwa, Cheongsong, Uljin, Yeongdeok, Pohang, Gyeongju, insulating concrete form house and a wood frame house, J. ASTM Int. (JAI) 3 (9)
Cheongdo, Gyeongsan), Jeonbuk, Gyeongnam (Geochang, (2006) 1–11, https://doi.org/10.1520/JAI13637.
Hamyang) [27] O. Ortiz, C. Bonnet, J.C. Bruno, F. Castells, Sustainability based on LCM of
residential dwellings: a case study in Catalonia, Spain, Build. Environ. 44 (3)
Southern region Busan, Daegu, Ulsan, Gwangju, Jeonnam, Gyeongbuk (Uljin,
(2009) 584–594, https://doi.org/10.1016/j.buildenv.2008.05.004.
Yeongdeok, Pohang, Gyeongju, Cheongdo, Gyeongsan),
[28] I.Z. Bribián, A.A. Usón, S. Scarpellini, Life cycle assessment in buildings: state-of-
Gyeongnam (Excluding Geochang and Hamyang)
the-art and simplified LCA methodology as a complement for building certification,
Build. Environ. 44 (12) (2009) 2510–2520, https://doi.org/10.1016/j.
buildenv.2009.05.001.
[29] O. Ortiz, J.C. Pasqualino, G. Díez, F. Castells, The environmental impact of the
Declaration of competing interest construction phase: an application to composite walls from a life cycle perspective,
Resour. Conserv. Recycl. 54 (11) (2010) 832–840, https://doi.org/10.1016/j.
resconrec.2010.01.002.
The authors declare that they have no known competing financial [30] H. Monteiro, F. Freire, Life-cycle assessment of a house with alternative exterior
interests or personal relationships that could have appeared to influence walls: comparison of three impact assessment methods, Energy Build. 47 (2012)
572–583, https://doi.org/10.1016/j.enbuild.2011.12.032.
the work reported in this paper. [31] S. Tae, C. Baek, S. Shin, Life cycle CO2 evaluation on reinforced concrete structures
with high-strength concrete, Environ. Impact Assess. Rev. 31 (3) (2011) 253–260,
References https://doi.org/10.1016/j.eiar.2010.07.002.
[32] A. Passer, H. Kreiner, P. Maydl, Assessment of the environmental performance of
buildings: a critical evaluation of the influence of technical building equipment on
[1] C.M. Mah, T. Fujiwara, C.S. Ho, Life cycle assessment and life cycle costing toward
residential buildings, Int. J. Life Cycle Assess. 17 (9) (2012) 1116–1130, https://
eco-efficiency concrete waste management in Malaysia, J. Clean. Prod. 172 (2018)
doi.org/10.1007/s11367-012-0435-6.
3415–3427, https://doi.org/10.1016/j.jclepro.2017.11.200.
[33] X. Zhang, L. Shen, L. Zhang, Life cycle assessment of the air emissions during
[2] A.A. Jensen, J. Elkington, K. Christiansen, L. Hoffmann, B.T. Møller, A. Schmidt,
building construction process: a case study in Hong Kong, Renew. Sustain. Energy
F. van Dijk, Life Cycle Assessment (LCA): A Guide to Approaches, Experiences and
Rev. 17 (2013) 160–169, https://doi.org/10.1016/j.rser.2012.09.024.
Information Sources, European Environment Agency, 1998.

13
S. Ji et al. Building and Environment 205 (2021) 108267

[34] S. Kim, G.H. Kim, Y.D. Lee, Sustainability life cycle cost analysis of roof [46] [Table A10] Earthquake Zones and Regional Coefficients, Regulation of Building
waterproofing methods considering LCCO2, Sustainability 6 (1) (2014) 158–174, Structure Standards (RBSS). Order No. 777. Ministry of Land, Infrastructure and
https://doi.org/10.3390/su6010158. Transport, Republic of Korea, [Revised] 9 Nov 2020.
[35] S. Ji, D. Kyung, W. Lee, Life cycle assessment (LCA) of roof-waterproofing systems [47] D. Bzdok, N. Altman, M. Krzywinski, Statistics versus machine learning, Nat.
for reinforced concrete building, Adv. Environ. Res. 3 (4) (2014) 367–377, https:// Methods 15 (4) (2018) 233–234.
doi.org/10.12989/aer.2014.3.4.367. [48] T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, In: Proceedings of
[36] R.S. Heralova, Life cycle cost optimisation within decision making on alternative the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining,
designs of public buildings, Procedia Eng 85 (2014) 454–463, https://doi.org/ August 2016, San Francisco, pp. 785–794.
10.1016/j.proeng.2014.10.572. [49] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, T.Y. Liu, Lightgbm: a highly
[37] A. Vilches, A. Garcia-Martinez, B. Sanchez-Montanes, Life cycle assessment (LCA) efficient gradient boosting decision tree, in: Advances in Neural Information
of building refurbishment: a literature review, Energy Build. 135 (2017) 286–301, Processing Systems, 2017, pp. 3146–3154.
https://doi.org/10.1016/j.enbuild.2016.11.042. [50] A. Geron, Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow:
[38] B. Petrovic, J.A. Myhren, X. Zhang, M. Wallhagen, O. Eriksson, Life cycle Concepts, Tools, and Techniques to Build Intelligent Systems, second ed., O’Reilly,
assessment of building materials for a single-family house in Sweden, Energy California, 2019.
Procedia 158 (2019) 3547–3552, https://doi.org/10.1016/j.egypro.2019.01.913. [51] A.J.P. Ibáñez, J.M.M. Bernal, M.J.C. de Diego, F.J.A. Sánchez, Expert system for
[39] V. Biolek, T. Hanák, LCC estimation model: a construction material perspective, predicting buildings service life under ISO 31000 standard. Application in
Buildings 9 (8) (2019) 182, https://doi.org/10.3390/buildings9080182. architectural heritage, J. Cult. Herit. 18 (2016) 209–218.
[40] C. Zhang, M. Hu, X. Yang, A. Amati, A. Tukker, Life cycle greenhouse gas emission [52] A. Miatto, H. Schandl, H. Tanikawa, How important are realistic building lifespan
and cost analysis of prefabricated concrete building façade elements, J. Ind. Ecol. assumptions for material stock and demolition waste accounts? Resour. Conserv.
24 (2020) 1016–1030, https://doi.org/10.1111/jiec.12991. Recycl. 122 (2017) 143–154.
[41] J. O’Connor, Survey on actual service lives for North American buildings. In: [53] C.J. Chen, Y.K. Juan, Y.H. Hsu, Developing a systematic approach to evaluate and
Woodframe Housing Durability and Disaster Issues Conference, October 2004, Las predict building service life, J. Civ. Eng. Manag. 23 (7) (2017) 890–901.
Vegas, US, pp. 1–9. [54] W. Zhou, A. Moncaster, D.M. Reiner, P. Guthrie, Estimating lifetimes and stock
[42] W.P.S. Dias, Factors influencing the service life of buildings, Engineer 46 (4) turnover dynamics of urban residential buildings in China, Sustainability 11 (13)
(2013) 1–7. (2019) 3720.
[43] G. Liu, K. Xu, X. Zhang, G. Zhang, Factors influencing the service lifespan of [55] U. Sivarajah, M.M. Kamal, Z. Irani, V. Weerakkody, Critical analysis of Big Data
buildings: an improved hedonic model, Habitat Int. 43 (2014) 274–282, https:// challenges and analytical methods, J. Bus. Res. 70 (2017) 263–286.
doi.org/10.1016/j.habitatint.2014.04.009. [56] D. Chicco, M.J. Warrens, G. Jurman, The coefficient of determination R-squared is
[44] F. Provost, T. Fawcett, Data Science for Business: what You Need to Know about more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis
Data Mining and Data-Analytic Thinking, O’Reilly Media Inc., California, 2013. evaluation, PeerJ Computer Science 7 (2021) e623.
[45] [Table A1] Table of Heat Perfusion Rates for Building Parts by Region, Energy [57] T. Chai, R.R. Draxler, Root mean square error (RMSE) or mean absolute error
Saving Design Standards of Buildings (ESDSB). Notice No. 2017-881. Ministry of (MAE)?–Arguments against avoiding RMSE in the literature, Geosci. Model Dev.
Land, Infrastructure and Transport, Republic of Korea, [Revised] 28 December (GMD) 7 (3) (2014) 1247–1250.
2017.

14

You might also like