Abstract: A large volume of images is collected during postdisaster building reconnaissance. For both older and new buildings, the structural
drawings are an essential record of the structural information needed to extract valuable lessons to improve future performance. With older
construction, these drawings often need to be captured as multiple photographs, herein referred to as partial drawing images (PDIs), taken at a
close distance to ensure critical details are legible. However, the ability to use PDIs is quite limited due to the time-consuming process of
manually classifying such photographs and the challenge of identifying their spatial arrangement. The authors offer a new solution to automatically recover high-quality structural drawing images. First, PDIs are identified within the set of collected images using an image classification algorithm, a convolutional neural network. Then, using the structure-from-motion algorithm, the geometric relationship between each set of PDIs and the corresponding physical drawing is computed to identify their arrangement. Finally, high-quality full drawing images
are reconstructed. The capabilities of the technique are demonstrated using real-world images gathered from past reconnaissance missions and
newly collected PDIs. DOI: 10.1061/(ASCE)CP.1943-5487.0000798. © 2018 American Society of Civil Engineers.
1 Professor, Dept. of Civil and Environmental Engineering, Univ. of Waterloo, ON, Canada N2L 3G1 (corresponding author). Email: cmyeum@uwaterloo.ca
2 Graduate Student, Lyles School of Civil Engineering, Purdue Univ., West Lafayette, IN 47907.
3 Professor, School of Mechanical Engineering, Purdue Univ., West Lafayette, IN 47907.
4 Professor, Lyles School of Civil Engineering, Purdue Univ., West Lafayette, IN 47907.
Note. This manuscript was submitted on November 26, 2017; approved on June 5, 2018; published online on October 5, 2018. Discussion period open until March 5, 2019; separate discussions must be submitted for individual papers. This paper is part of the Journal of Computing in Civil Engineering, © ASCE, ISSN 0887-3801.

Introduction

After a natural event such as an earthquake, windstorm, or associated events such as tsunami and storm surge inundation, structural reconnaissance teams are dispatched to collect perishable scientific data on the conditions of these structures before any damaged structures are destroyed or repaired. Collecting such data offers valuable opportunities to learn about the event and its consequences and to identify potential gaps in existing research or in building codes through the study of structural performance during the event. From these studies, important lessons about the performance of the buildings, those that exhibit damage as well as those that do not, can be learned. The high costs associated with such events, in terms of both the loss of life and the impact on the livelihood of the local community, make the gathering of such data imperative. Many engineering organizations and communities around the world have sponsored and established reconnaissance teams to collect such valuable information from each disaster and preserve it for future study (EERI 2009; UC CEISMIC 2011; DataCenterHub 2014; DesignSafe-CI 2016).

In a typical mission, each team visits between 5 and 10 buildings a day and collects various forms of data within a short period. Some examples of data collection formats appeared in Applied Technology Council (1989, 2004) and American Concrete Institute (2017). However, research teams may adopt enhanced forms of data collection to suit the needs of the effort. Among the diverse data formats, photographs offer the most convenient and efficient way of recording and preserving abundant information about the building (EERI 2009; UC CEISMIC 2011; DataCenterHub 2014; DesignSafe-CI 2016). Meaningful scenes of damaged and undamaged buildings and their components are captured using multiple images taken from different angles and positions. In addition, the photographs typically encompass several types of metadata that are needed for documentation, such as structural drawings, measurement values, or signs. Of all these forms of data, structural drawings are the focus of this paper. Such drawings provide critical information that allows the engineer to infer the behavior of structural and nonstructural components during extreme events and better understand the cause of the damage shown in the collected images. In many cases, especially with older buildings, drawings are not likely to be found in digital format. Since it is difficult to carry several sets of drawings home or to create complete physical or digital copies on site, teams often capture structural drawings as segmented photographs. Multiple photographs of small portions of the structural drawings are taken at a close distance to ensure the details are legible (DataCenterHub 2014).

However, in their segmented format, denoted as partial drawing images (PDIs), the structural drawings are difficult to interpret, requiring a significant investment of the engineer's time. These drawings often represent only a small portion of the large volume of images collected in the field. Manually categorizing and grouping them into sets from the same drawing is quite a challenging task when one considers the possible number of buildings involved. Moreover, such drawings are often printed on large engineering paper and contain markers with small-sized contents including text, digits, or lines. Thus, many images may need to be collected (e.g., 20–40 images for each drawing) at a close distance in order to legibly capture all of the details contained in such a drawing. These manually captured images are often inconsistent in terms of perspective distortion and level of detail when they are captured from different angles or from different distances. Thus, in addition to the difficulties associated with their segmented forms, it is laborious to recognize the proper arrangement of the images and align them. These issues are magnified by the tendency to continue to store these drawings as PDIs. This means that each
Reconstruction of Full Drawing Images

This section describes how multiple FDIs were constructed from the entire set of PDIs that were classified using the approach discussed in the previous section. An overview of the steps in this procedure is shown in Fig. 1.

Fig. 1. Overview of the technique for constructing FDIs. Once sets of PDIs captured from the same drawings are obtained in Step 1, the corresponding FDI in each set is generated using Steps 2–6.

For each building, multiple physical drawings associated with different floors or details were often available. Thus, in Step 1, the PDIs collected from the same drawing were matched and grouped across all PDIs. Since each of the PDIs contained only a small part of a full drawing, and yet had a similar appearance to other drawings across the set, it was challenging to manually group them. In this step, PDIs collected from the same drawing were classified as such by automatically matching the common visual features present in the PDIs. These extracted visual features were also reused for computing the projection matrix in Step 2. Conventional visual features and descriptors [e.g., scale-invariant feature transform (SIFT) or speeded-up robust features (SURF)] can be used for this process. Both SIFT and SURF automatically extract visual features (and descriptors) invariant to scaling, rotation, and illumination, allowing accurate matching of identical local points across images (PDIs in this study) (Lowe 2004; Bay et al. 2008). Ideally, the PDIs would only share common features with those in the same group. However, separate drawings for the same building (e.g., for different floors) are likely to have some repeated or duplicate regions such as text descriptions, digits, or graphics (Fig. 4). These will potentially generate spurious correspondences between PDIs even when they are captured from different drawings.

matrix can be obtained from the matching results. Further technical details are provided in Clauset et al. (2004).

In the subsequent steps, processing takes place within individual groups of PDIs obtained from Step 1. In Step 2 a projection matrix of each PDI is estimated. The projection matrix describes the mapping of a pinhole camera from three-dimensional (3D) points in the scene to two-dimensional (2D) points in an image, establishing a 2D–3D relationship. The projection matrix contains intrinsic camera parameters such as focal length and principal points, and extrinsic orientation including rotation angles and camera location in 3D. SfM automatically computes the 3D geometry of the scene (point cloud), projection matrix, and lens distortion parameters. Recently, in conjunction with the development of unmanned aerial vehicle (UAV) technology, SfM has emerged as a technique for constructing a 3D point cloud from 2D images; popular commercial software packages include Pix4D, PhotoScan, and ContextCapture, and noncommercial software includes VisualSFM and OpenMVG (Moulon et al. 2013). Regardless of the differences in functionalities and algorithms implemented in these tools or their optimization for specific applications, in general the input to SfM is a set of images, and the outputs are the associated projection matrices (constructed based on several internal and external camera parameters) and the 3D point cloud generated from the scene. Nearly all available commercial and noncommercial SfM software can export these outputs in a readable format. In SfM, all of these parameters and data are automatically estimated by matching the features detected in Step 1.
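To make the 2D–3D mapping concrete, the pinhole projection described above, x ~ P X with P = K[R | t], can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation (which used MATLAB and VisualSFM); the focal length, principal point, and test points below are hypothetical values chosen for clarity.

```python
# Minimal pinhole-camera projection sketch: a 3x4 projection matrix
# P = K [R | t] maps homogeneous 3D points to homogeneous 2D points.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def project(P, X):
    """Project a 3D point X = (x, y, z) to 2D pixel coordinates with a
    3x4 projection matrix P; the final division by w is the
    homogeneous-to-Cartesian conversion."""
    Xh = [[X[0]], [X[1]], [X[2]], [1.0]]          # homogeneous 3D point
    u, v, w = (row[0] for row in matmul(P, Xh))   # homogeneous 2D point
    return (u / w, v / w)

# Hypothetical intrinsics K: focal length 1000 px, principal point (640, 480).
K = [[1000.0, 0.0, 640.0],
     [0.0, 1000.0, 480.0],
     [0.0, 0.0, 1.0]]
# Extrinsics [R | t]: identity rotation, camera at the world origin.
Rt = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0]]
P = matmul(K, Rt)

print(project(P, (0.2, -0.1, 2.0)))  # point 2 m in front of the camera -> (740.0, 430.0)
```

An SfM package estimates K, R, and t (and lens distortion) for every image automatically; the sketch only shows what the exported projection matrix does once it is available.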
      Y_min = min(Y_min, Xn(2))
      X_max = max(X_max, Xn(1))
      Y_max = max(Y_max, Xn(2))
   else
      continue
s = min(FDI_W/(X_max − X_min), FDI_H/(Y_max − Y_min))
return s, X_min, Y_min

Here, cornerPoints_i is a 2 × 4 matrix having the four corner pixel locations for the ith image. The function hom(x) is used to convert a vector x from Cartesian coordinates to homogeneous coordinates, and nhom(x) is used to implement this transformation in the reverse direction. The maximum dimensions of the FDIs in the width and height directions are denoted as FDI_W and FDI_H, respectively. If the condition α_i > α_t in the for loop is satisfied, the ith image is included in stitchSet. Once the scaling factor and translation parameters are computed, the final homography can be updated as follows (Hartley and Zisserman 2003):

H_iπ̃ = H_iπ̃ [1 0 X_min; 0 1 Y_min; 0 0 1] [s 0 0; 0 s 0; 0 0 1]   (6)

Again, with the homography matrix in Eq. (6), the points on each image corresponding to a certain point on the FDI can be identified. By using the homography matrix on each image, the projection of each image (removing perspective distortion) and its alignment on the plane of the FDI are completed through this homography transformation. Only the images included in stitchSet are used for constructing the FDI.

Step 6 is to stitch and blend the aligned projected images to construct FDIs. For their seamless composition, the authors utilized the gain compensation and multiband blending developed in Brown and Lowe (2007). Illumination of the images (here, PDIs) varies depending on the angles and camera locations, and thus each image has a different overall gain (intensity), causing artifacts (banding) on the FDIs if they are not adjusted. Gain compensation is a process to normalize such gain variations between images. By minimizing the intensity difference of overlapping pixels from multiple images, the proper gain for each image is optimized (Brown and Lowe 2007). After tuning the gain for each image (here, projected PDIs), they are seamlessly composed as one single FDI. However, although the gains of all images are normalized, they may still have inconsistencies between images due to vignetting (the intensity varies toward the edge of the image), out-of-focus portions, or errors in radial calibration. The multiband blending technique facilitates a smooth transition between images to create high-quality FDIs. Images are divided into multiple frequency bands, and weight maps for each band are calculated to determine the contribution of each PDI to the final FDI.

quality of the resulting image may be greatly reduced. In order to fully demonstrate the capabilities of this drawing classification and reconstruction technique, the authors separated this study into two experiments. First, in "Representative Problem Demonstrating Full Drawing Image Reconstruction," reconstruction of a representative drawing is considered to illustrate the technique on a realistic case following the guidelines set forth herein. Then "Training of a Partial Drawing Classifier" shows the capability of the PDI classification using actual reconnaissance images collected from past earthquake events. However, due to insufficient quality of the PDIs for the FDI construction (because they were not captured in the field using the guidelines), the authors focused on demonstrating the performance of PDI classification using these real-world PDIs. In "Implementation of the Developed Technique," the authors demonstrate how FDIs are obtained from a set of reconnaissance images taken without using the recommended guidelines. It will be seen that though the quality of the resulting FDI is impacted by the lack of a regulated collection technique, the image is successfully reconstructed and is complete and sufficient for typical uses.

Representative Problem Demonstrating Full Drawing Image Reconstruction

Description of the Experiment

The authors first demonstrated the capability of FDI reconstruction from a typical set of PDIs. For data preparation, four different digital drawings were printed on Arch E1 paper, 76.2 × 106.7 cm (30 × 42 in.). Each drawing had fine text and thin lines, for which the details cannot be captured or included in one photograph. Each drawing was smoothed out and placed on a table. Fig. 2 shows one of the drawings used for this demonstration.

Fig. 2. Structural drawing printed on large engineering paper.

A total of 152 images were sequentially collected from the four different drawings. The number of PDIs for each drawing was approximately 40 images (not uniform across different drawings). Fig. 3 shows sample PDIs for each drawing, where the images in
each column are captured from the same drawing and denoted as Drawings 1–4. PDIs were taken using a handheld camera (Nikon D90, Nikon Corporation, Tokyo) with an 18–105 mm lens, and zoom and flash functions were not utilized. The resolution of each image was 4,288 × 2,848 pixels. To simulate conditions of data collection in the field, the images were captured by a human photographer (no tripod or mechanical device held the camera) while minimizing perspective distortion, producing roughly uniform resolution across each drawing region. As per the guidelines, the images were collected with sufficient overlap in the horizontal and vertical directions between the images. In Fig. 3, the last image in each column shows an oblique image, which was captured at a large angle.

The authors implemented the technique using MATLAB and VisualSFM (MATLAB 2016). VisualSFM is free noncommercial software with a graphical user interface (GUI) and is widely used by many researchers due to its accuracy and speed using NVIDIA CUDA (Wu 2013). Although its source code is not available in the public domain, it provides a command-line operation without a GUI. Extraction of SIFT features and descriptors and computation of projection matrices were conducted with VisualSFM through command lines in MATLAB (Wu 2013). To access the processed data, the authors built a tool to read the output files (in NVM format) in MATLAB. For grouping the PDIs, each image was paired with the next five images in the sequence of collected PDIs (to replicate procedures in the field), followed by constructing an adjacency matrix. The adjacency matrix became an input for the community detection technique, producing sets of PDIs that were collected from the same drawing (Newman and Girvan 2004). Projection matrices were computed from each set of PDIs using VisualSFM and then used for reconstructing the corresponding FDI. For blending the images, the image blender function in OpenCV, called MultiBandBlender, was converted to an executable file to be used in MATLAB (MATLAB 2016; OpenCV 2017).

Drawing Reconstruction Results

The four constructed FDIs are shown in Fig. 4. The overall performance of FDI reconstruction was quite successful. In Fig. 4, the images on the right are magnified areas corresponding to the boxes on each FDI, respectively. Small text and digits as well as thin lines are clearly legible. These FDIs were automatically constructed from a mixed set of the collected PDIs without any manual processing. However, the final four FDIs in Fig. 4 were manually rotated and cropped to contain only the drawing areas for better visualization.

Fig. 4. All four FDIs, automatically constructed using PDIs: (a) through (d) correspond to Drawings 1–4 in Fig. 3, respectively.

Training of a Partial Drawing Classifier

Reconnaissance Image Data Set

For demonstration of PDI classification, the authors gathered an extensive collection of reconnaissance images that were taken by researchers and practitioners after past earthquake events (Yeum 2016; Yeum et al. 2016, 2017a, c). These images are publicly available at DataCenterHub. Some of the authors of this study have taken a significant role in collecting these data in the field, and the needs of those field missions provide motivation for this work (Shah et al. 2015; Sim et al. 2015, 2016; NCREE 2016). The authors utilized images collected from earthquake reconnaissance missions in Düzce, Turkey, in 1999; Bingöl, Turkey, in 2003; Peru in 2007; Haiti in 2010; Nepal in 2015; Taiwan in 2016; and Ecuador in 2016 (Shah et al. 2015; Sim et al. 2015, 2016; NCREE 2016). Because PDIs are stored apart from the other images on DataCenterHub, manual labeling was already performed and no additional manual labeling was required for this study. A total of 31,173 images were downloaded for these events, including 2,232 PDIs. Some sample images are shown in Fig. 5.

Configuration

In this study, the authors implemented the established ImageNet CNN model called AlexNet (TorontoNet in Caffe), framed in the MatConvNet library (Vedaldi and Lenc 2015). AlexNet exhibited superior performance in the 2012 ImageNet image classification competition and has been widely used as a benchmark for newly developed CNN models. The details of its network architecture are explained in LeCun et al. (1990), Krizhevsky et al. (2012), and Goodfellow et al. (2016). Although several advanced CNN architectures have been proposed and customized for specific applications, the authors chose AlexNet here because it is an established method. Because the positive and negative images in this study have an obvious visual boundary, the use of the original architecture is acceptable (Goodfellow et al. 2016; Vedaldi and Lenc 2015).
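The grouping step described in the implementation above, pairing each PDI with the next five images in capture order, building an adjacency matrix from the feature-matching results, and extracting the set of PDIs belonging to each drawing, can be sketched in Python. In this sketch, connected components stand in for the Newman–Girvan community detection the authors actually used, and `same_drawing` is a hypothetical stand-in for SIFT-based feature matching.

```python
# Sketch of Step 1 grouping: compare each image with the next `window`
# images in capture order, record matches in an adjacency matrix, and
# extract groups. Connected components are used here as a simplified
# stand-in for community detection (Newman and Girvan 2004).

def group_pdis(n, matches, window=5):
    """matches(i, j) -> True if images i and j share enough features."""
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(i + 1 + window, n)):
            if matches(i, j):
                adj[i][j] = adj[j][i] = True
    # Depth-first search for connected components of the match graph.
    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            comp.append(u)
            stack.extend(v for v in range(n) if adj[u][v] and v not in seen)
        groups.append(sorted(comp))
    return groups

# Toy example: images 0-3 come from one drawing, images 4-6 from another.
same_drawing = lambda i, j: (i < 4) == (j < 4)
print(group_pdis(7, same_drawing))  # [[0, 1, 2, 3], [4, 5, 6]]
```

Community detection is more robust than plain connected components when spurious matches (e.g., repeated title blocks across drawings) create weak links between groups, which is why the authors used it; the sliding five-image window keeps the number of pairwise comparisons linear in the number of PDIs.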
For training a classifier, labeled images were first transformed into inputs for the CNNs. Regardless of the category, all images were isotropically (preserving their aspect ratio) resized so that the shorter side of each image became 256 pixels, followed by cropping the center square region to create 256 × 256 pixel square images. This is based on the fact that objects of interest (herein, drawings) do not span the entire image area but rather are often concentrated at the center of images. The authors further implemented data augmentation, which is a commonly used method that expands the training set to avoid overfitting by modifying the existing input data without changing their labels. First, 227 × 227 patches were randomly cropped from the 256 × 256 images, and those patches were randomly flipped in the horizontal direction in each epoch. Additionally, random color shifting was applied to vary (jitter) the intensities of the RGB images to expand the training images as a part of data augmentation (Krizhevsky et al. 2012). To train the classifiers and test their capabilities, all labeled images were divided into groups of 50%, 25%, and 25% for training, validating, and testing, respectively. The validation set was used for monitoring the network accuracy during training.

The original implementation for the ImageNet image classification classified 1,000 output classes using the last 1,000-way softmax layer (Krizhevsky et al. 2012). In this study, it was converted to a logistic loss layer to conduct binary classification. The layers were initialized with a Gaussian distribution with a zero mean and variance equal to 0.1 (Krizhevsky et al. 2012; He et al. 2015; Krähenbühl et al. 2015). The hyperparameters were the same as those used in AlexNet. The authors trained the models using stochastic gradient descent with a batch size of 512 images, momentum of 0.9, and weight decay of 0.0005. The authors trained the network for 120 epochs, and the learning rate was logarithmically decreased from 10^−2 to 10^−5 during training. The authors chose to preset these values based on some preliminary tests conducted with an initial sample of training images. A PC workstation having a Xeon E5-2620 (Intel Corporation, US) and a Tesla K40c (NVIDIA Corporation, US) GPU with 12 GB of video memory was used for training and testing the algorithm. The MatConvNet library installed on MATLAB, version 2017, was used for this study (Vedaldi and Lenc 2015).

Classification Results

In this study, PDI classification successfully attained a relatively high accuracy. Among a total of 546 PDIs for testing (25% of all PDIs), the authors obtained a true positive recall (correct classification of PDIs) of 96.15% (525/546 images) and a precision of 0.941 (defined as the number of true positives over the number of detected positives). These high recall and precision values indicate that most PDIs among the set of images can be accurately detected and the rate of false positive errors (non-PDIs detected as positive) is very low. Figs. 6(a and b) show samples of false positive and false negative images. With only a glance, the images in Fig. 6(a) do look like PDIs. Also, because the images used for training and testing were collected by many different field engineers after several different earthquake events in various countries, the performance of the classifier is expected to be similar to these results on future real-world data sets, without bias toward any particular data set.

Implementation of the Developed Technique

In this section, the authors demonstrate the capability of the technique developed herein using real-world images collected during past building reconnaissance missions. This example is intended to demonstrate its implementation on an actual image set collected in the field, with no particular regard for the specific performance.

Fig. 5. Sample images used for training a binary PDI classifier: (a) PDIs (positive); and (b) the rest of the images except PDIs (negative). [Images from Shah et al. 2015; Sim et al. 2015, 2016; NCREE 2016. License: Creative Commons Attribution License (CC BY-SA 3.0 US, https://creativecommons.org/licenses/by-sa/3.0/us/).]
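The preprocessing arithmetic described in "Training of a Partial Drawing Classifier", isotropic resize to a 256-pixel shorter side, a centered 256 × 256 crop, and random 227 × 227 patches with horizontal flips, can be sketched as follows. The helper functions below are hypothetical and operate on sizes and coordinates only; actual pixel resampling was done by the authors' MATLAB/MatConvNet pipeline.

```python
# Sketch of the CNN input preprocessing: isotropic resize so the shorter
# side is 256 px, center 256x256 crop, then a random 227x227 training
# patch with an optional horizontal flip (data augmentation).
import random

def resized_size(w, h, short=256):
    """New (w, h) after an isotropic resize so that min(w, h) == short."""
    scale = short / min(w, h)
    return round(w * scale), round(h * scale)

def center_crop_box(w, h, size=256):
    """(left, top, right, bottom) of the centered size x size crop."""
    left, top = (w - size) // 2, (h - size) // 2
    return left, top, left + size, top + size

def random_patch(size=256, patch=227, rng=random):
    """Random top-left corner and flip flag for a patch x patch crop,
    drawn fresh in each training epoch."""
    x = rng.randrange(size - patch + 1)
    y = rng.randrange(size - patch + 1)
    return x, y, rng.random() < 0.5   # flip horizontally with p = 0.5

# A 4,288 x 2,848 reconnaissance image (the resolution used in the paper):
print(resized_size(4288, 2848))   # (385, 256)
print(center_crop_box(385, 256))  # (64, 0, 320, 256)
```

Cropping and flipping multiply the effective number of distinct training inputs (30 × 30 crop positions × 2 flips per image) without changing any labels, which is the overfitting-avoidance argument made above.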
Performance was instead discussed in the previous sections. Engineers collecting these data during past reconnaissance missions were not aware of the technique developed in this paper. Thus, the PDIs were collected under less than ideal conditions (without following the image collection guideline suggested in "Collection of Partial Drawing Images"). If crumpled regions or objects on drawings were not removed, the corresponding areas on the projected images (images in Step 5) would be shifted due to their different elevations from the projected plane, producing relief displacements on the images. Relief displacement is scale variation on photographs caused by height variation of the objects or terrain (Mikhail et al. 2001). Cases with such traits would not yield perfectly aligned FDIs with high quality, such as those constructed in "Representative Problem Demonstrating Full Drawing Image Reconstruction."

Fig. 6. Samples of (a) false positive; and (b) false negative images. [Images from Shah et al. 2015; Sim et al. 2015, 2016; NCREE 2016. License: Creative Commons Attribution License (CC BY-SA 3.0 US, https://creativecommons.org/licenses/by-sa/3.0/us/).]

As mentioned in the introduction, each reconnaissance team collects a large volume of images from multiple buildings. Here, it was assumed that the set of images collected from the same building were already separated. Since modern reconnaissance images contain date and time information (metadata collected by the camera itself) and are captured in chronological order, images from the same building can easily be separated using the difference of image collection times. The image set used for this demonstration was collected from a single building (Yujing vocational school teaching building) after the Taiwan earthquake in 2016. Fig. 7 shows all of the 323 images collected from this building, including 145 PDIs marked by boxes. This collection of images is publicly available from DataCenterHub (Purdue University and NCREE 2016). This image set was selected because a large and relatively complex set of PDIs was available for testing. The images in this set were not used for training the PDI classifiers in "Reconstruction of Full Drawing Images."

Fig. 7. Original 323 images collected from a single building after the 2016 Taiwan earthquake. [Images from NCREE 2016. License: Creative Commons Attribution License (CC BY-SA 3.0 US, https://creativecommons.org/licenses/by-sa/3.0/us/).]

After processing, it was found that PDIs were classified accurately. Images classified as PDIs are marked with solid dots in each image in Fig. 7. Among 145 images classified as PDIs, 143 PDIs were correctly classified, yielding a precision of 98.6%. This result shows that the trained classifier can be readily used for any PDI data collected from future events. After classification, the FDIs were constructed, with two representative samples shown in Fig. 8. The FDIs in Fig. 8 show that the compiled PDIs were not precisely aligned with each other. The FDI will thus have irregular resolution because the number of PDIs was insufficient and because the drawing was not placed on a flat surface at the time the PDIs were collected. However, the visual contents were legible and appropriate for typical uses. Based on the assumption that the PDIs fully cover the entire drawing, an insufficient number of PDIs indicates that the PDIs do not have much overlap with each other. This causes an inaccurate estimation of the projection matrix of each image. Also, capturing PDIs without flattening the drawings produces relief displacement, mentioned at the beginning of this section. Both cases cause misalignment of the PDIs on the corresponding FDIs. Therefore, users should follow the recommended guidelines, using the suggestions in "Collection of Partial Drawing Images."

Fig. 8. Two samples of constructed FDIs generated from PDIs classified from the images in Fig. 7. [Images from NCREE 2016. License: Creative Commons Attribution License (CC BY-SA 3.0 US, https://creativecommons.org/licenses/by-sa/3.0/us/).]

This demonstration also shows the possibility of applying the technique developed herein to existing legacy data sets to classify drawing images and recover their full drawing images. Ever more past and future postdisaster images are being gathered and published (EERI 2009; UC CEISMIC 2011; DataCenterHub 2014; DesignSafe-CI 2016). Such automation will have substantial implications for organizing and reusing those large amounts of data in a rapid manner to support future research.

the valuable image contributions from the Center for Earthquake Engineering and Disaster Data (CrEEDD) at Purdue University (DataCenterHub), the EUCentre (Pavia, Italy), the Instituto de Ingenieria of UNAM (Mexico), FEMA, and the Earthquake Engineering Research Institute (EERI) collections.
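The building-separation heuristic mentioned in "Implementation of the Developed Technique", splitting chronologically ordered images into per-building sets wherever the gap between consecutive capture times is large, can be sketched in Python. The 30-minute threshold below is an assumed value for illustration, not one stated in the paper.

```python
# Sketch of separating images by building using camera timestamps:
# images are captured in chronological order, so a large gap between
# consecutive capture times marks a move to the next building.
from datetime import datetime, timedelta

def split_by_building(timestamps, gap=timedelta(minutes=30)):
    """Split a chronologically sorted list of capture times into
    per-building groups wherever the gap exceeds the threshold."""
    groups = [[timestamps[0]]] if timestamps else []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > gap:
            groups.append([])       # travel gap: start a new building
        groups[-1].append(curr)
    return groups

# Hypothetical capture times for five images from two buildings.
times = [datetime(2016, 2, 10, 9, 0), datetime(2016, 2, 10, 9, 5),
         datetime(2016, 2, 10, 9, 12),       # building 1
         datetime(2016, 2, 10, 10, 30),      # gap > 30 min: building 2
         datetime(2016, 2, 10, 10, 34)]
print([len(g) for g in split_by_building(times)])  # [3, 2]
```

In practice the capture times would be read from each image's EXIF metadata, and the gap threshold would be tuned to the pace of the mission (5–10 buildings per day).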