
Deep Learning Spatial Signature

Inverted GANs for isovist representation in architectural floorplan

Mikhael Johanes1, Jeffrey Huang2


1,2École Polytechnique Fédérale de Lausanne
1{mikhael.johanes|jeffrey.huang}@epfl.ch

The advances of Generative Adversarial Networks (GANs) have provided a new experimental ground for creative architectural processes. However, the analytical potential of the latent representation of GANs is yet to be explored for architectural spatial analysis. Furthermore, most research on GANs for floorplan learning in architecture uses images as its main representation medium. This paper presents an experimental framework that uses one-dimensional periodic isovist samples and GAN inversion to recover their latent representation. Access to the GANs' latent space opens up the possibility of discriminative tasks such as classification and clustering analysis. The resulting latent representation is investigated to discover its analytical capacity in extracting isovist spatial patterns from thousands of floorplans. In this experiment, we hypothetically conclude that the spatial signature of an architectural floorplan can be derived from the degree of regularity of its isovist samples in the latent space structure. The findings of this research will enable a new data-driven strategy to measure spatial quality using isovists and provide a new way of indexing architectural floorplans.

Keywords: Machine Learning, Isovist, Latent Representation, GANs Inversion, Spatial Signature.
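The one-dimensional periodic isovist samples named in the abstract can be illustrated with a small ray-casting sketch: it records the clipped visibility distance in 256 directions around a viewpoint inside a wall polygon. The function names and the square test room are illustrative assumptions, not the authors' sampling pipeline (which, as described later, uses Grasshopper scripts).

```python
import math

def ray_distance(origin, angle, p1, p2):
    """Distance along a ray from `origin` at `angle` to segment p1-p2 (inf if missed)."""
    ox, oy = origin
    dx, dy = math.cos(angle), math.sin(angle)
    ex, ey = p2[0] - p1[0], p2[1] - p1[1]
    denom = dx * ey - dy * ex
    if abs(denom) < 1e-12:                     # ray parallel to the segment
        return math.inf
    wx, wy = p1[0] - ox, p1[1] - oy
    t = (wx * ey - wy * ex) / denom            # distance travelled along the ray
    u = (wx * dy - wy * dx) / denom            # position along the segment
    return t if t >= 0 and 0 <= u <= 1 else math.inf

def isovist_function(origin, polygon, n=256, radius=5.0):
    """Periodic isovist function: clipped visibility distance in n directions
    around `origin`, inside a simple wall polygon (list of (x, y) vertices)."""
    edges = list(zip(polygon, polygon[1:] + polygon[:1]))
    return [
        min(radius, min(ray_distance(origin, 2 * math.pi * k / n, a, b) for a, b in edges))
        for k in range(n)
    ]
```

At the center of a hypothetical 10 m x 10 m room, every one of the 256 samples is clipped to the 5 m radius, while an off-center viewpoint produces an asymmetric profile; plotting the samples in polar coordinates recovers the isovist shape.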

INTRODUCTION
Generative adversarial networks (GANs) are the current generative machine learning model that has gained popularity in the architecture community due to its capacity to learn and generate novel data from a given set of examples (Goodfellow et al., 2014). GANs are composed of two competing neural networks, a generator G and a discriminator D. The generator is trained to generate synthetic data from an input noise z drawn from a latent space with a prior distribution; Gaussian and uniform distributions are the most commonly chosen priors for the latent space. Several experiments have been done on using GANs for architectural purposes (Chaillou, 2019; Newton, 2019; Huang et al., 2021). However, most research on GANs for learning floorplans in architecture uses images as its primary representational medium, as most GANs are developed for the image-oriented domain. This research differs from previous work by introducing isovist representation as its input data to bypass the figurative 2D representation of the architectural floorplan and provide a more perceptual reading of architectural space (Benedikt, 1979).

Once trained, GANs provide a mapping from the prior distribution of the latent space to the data distribution of the training data. However, the reverse mapping from the data to the latent space is not directly available. GAN inversion is a technique to recover the noise vector z in the latent space of a trained GAN from a given input data to provide such
Volume 2 – Co-creating the Future – eCAADe 40 | 621


mapping (Creswell and Bharath, 2019; Xia et al., 2022). This research is particularly interested in the discriminative capacity of the GAN's latent space to map isovist samples taken from approximately 4,000 architectural floorplans (Kalervo et al., 2019).

An isovist is defined as the set of areas visible from a given point of view in an environment, providing a model of the distribution of space (Benedikt, 1979). Isovist representation offers a numerical description of spatial configuration, such as area, perimeter, compactness, and occlusivity, from which certain architectural qualities can be mathematically derived (Ostwald and Dawes, 2020). To provide a machine-learnable isovist, we represent the isovist as a discrete periodic function that characterizes a panoramic view of space (Figure 1). The advancement of machine learning provides a new way to leverage the statistical patterns of the isovist representation without high-level formalization in recognizing and classifying architectural space (Sedlmeier and Feld, 2018; Johanes and Huang, 2021).

Figure 1: Isovist periodic function

This research explores the latent representation of isovists in a GANs model to provide an unsupervised inference of visuospatial quality derived from the geometry of the floorplan. We begin by customizing and training a GAN model to learn the discrete periodic representation of isovist training samples. Then, an inverse GAN projection is performed to embed a given isovist with its latent representation, and the relation between the structure of the GANs' latent space and the generated isovists is explored. Our experiment shows that the GANs' latent space can be used to infer a scalar measure of spatial signature distinctive for its isovist representation and provides a framework for indexing and measuring the experiential similarity of architectural floorplans.

ISOVIST REPRESENTATION IN GAN'S LATENT SPACE
Isovists can capture spatial patterns as they encode the geometry of a floorplan into a set of local perceptual moments (Psarra, 2009). Several spatial qualities, such as the degree of interestingness, pleasure, beauty, spaciousness, complexity, and clarity, have been derived from basic isovist shape descriptions (Franz and Wiener, 2008). The isovist essentially captures the spatial qualities that are morphologically defined. The isovist periodic function thus translates the shape of the isovist into discrete quantitative values for the machine to learn to identify, extract, and compare the embedded spatial qualities (Johanes and Huang, 2021). Spatial classification using isovists and machine learning has been done in urban and architectural contexts with relatively limited data (Leduc, Chaillou and Ouard, 2011; Sedlmeier and Feld, 2018). This research aims to expand the research on unsupervised deep machine learning of isovists in an architectural context by leveraging the GAN latent space's discriminative properties using an inversion technique.

GAN inversion (Xia et al., 2022) is a technique to recover the latent vector of a given input data so that it can be reconstructed back by the GAN, and it provides insight into the interpretation of the GAN's latent space. In providing GAN inversion, an additional encoder network can be trained by using the generated output x and latent vector z as training data (Perarnau et al., 2016). However, our experiments with the encoder technique did not yield good results for isovist data. The bad results could be attributed to the highly diverse isovist samples, which make the training difficult for the encoder. The best results we obtained are by using the optimization method (Karras et al., 2020), in which an individual latent vector is initialized and optimized to reach the target reconstructed isovist.

Research on the exploration of GANs' latent space with a low-dimensional latent space and 1D output data is rarely available, as most developed models concentrate on the image-related domain. The common focus of the research is to find



directions in the latent space that are correlated with human-interpretable generated image transformations such as translation, zooming, and adding or removing elements (Voynov and Babenko, 2020). The developed techniques usually require human interpretation to evaluate the image's semantic content. Such techniques are challenging to implement in this experiment, as the semantic content of isovists is not as interpretable as that of images.

Alternatively, we explore the correlation between the structure of the prior distribution of the GANs' latent space and the generated isovists. The density of samples of a Gaussian distribution in high-dimensional space is concentrated around a hypersphere surface with a radius of the square root of its dimension d, as shown in Figure 2 (Kilcher, Lucchi and Hofmann, 2018). As such, trained GANs with a Gaussian prior will generate the most meaningful samples if the input latent vector length, defined as the Euclidean norm, is near the radius of the hypersphere. This effect is more apparent for the high-dimensional latent spaces used by most GANs that generate images (White, 2016). An interpolation toward the center of the latent coordinates will generate unintelligible results, as the center has the lowest sampling density in this hypersphere (Kilcher, Lucchi and Hofmann, 2018).

Figure 2: Gaussian prior distribution in high-dimensional space is concentrated around a hypersphere's surface

In our experiment, the latent space dimension d=16 is relatively low for generating the 256 discrete features of the periodic isovist sample. A low latent space dimension is possible due to exponentially fewer features in isovists than in 2D images. The relation between the latent space and the generated data is easier to establish in a low-dimensional latent space, as the visualization is often more straightforward. This research aims to discover patterns in the latent space structure, such as the dimension axes and the Euclidean norm.

The intuition is that in a relatively low-dimensional latent space, the density of the sampling points will be closer to the center of the coordinates, and there is a possibility that the samples generated around the center will converge into what we can call mean samples (Kilcher, Lucchi and Hofmann, 2018). The smaller the Euclidean norm of the latent vector, the closer it gets to the mean samples, which can be interpreted as generic isovists. We therefore aim to explore this hypothetical pattern to determine the degree of regularity of isovists from the GANs' latent space and propose a framework for extracting this information for indexing and searching purposes.

While the isovist samples are considerably simpler than 2D images, the variation in the dataset is very high, which can be tricky for GANs training. We implement progressive growing GANs (Karras et al., 2018) using 1D convolutional layers to stabilize the training and improve the variability of the generated isovists. In order to avoid mode collapse and measure the performance of the GANs in learning the distribution of the dataset, the number of statistically-different bins (NDB) is used (Richardson and Weiss, 2018). This metric provides a domain-agnostic technique to compare the distribution of the training samples to that of the generated samples. A high NDB score indicates a discrepancy between the training and learned samples, thus measuring the GAN's ability to learn the entire dataset distribution.

GANS INVERSION ENCODING EXPERIMENT
The framework of the GANs inversion isovist experiment consists of (a) isovist sampling, (b) GANs training and inversion, and (c) latent space interpretation and



architectural encoding (Figure 3). The dataset preparation and isovist sampling are performed using Grasshopper scripts by parsing the dataset's SVG files, stochastically sampling the isovists from the floor area, and turning them into periodic functions with 256 discrete features. The sampling collects 5 m radius isovists with a density of one sample point per 1 m². The resulting dataset is comprised of a 300k training set and a 34k evaluation set of isovist samples. A spatial representation of an isovist can be recovered from the function by plotting it in polar coordinates.

Figure 3: GANs inversion experiment framework

A 1D convolutional implementation of a progressive growing GAN is implemented in a Python environment using the PyTorch machine learning framework, based on a simplified version of ProGAN (Liu, 2021). We use a 16-dimensional latent space to generate the 256 features of the periodic isovist function. The model learns to generate isovists progressively from low resolution to full resolution, which stabilizes training and allows the model to learn the full distribution of the dataset. We measure the learning results using an NDB score with 10,000 samples and 100 bins. At the end of the training, fewer than 1% statistically-different bins between training and generated samples are achieved.

Latent space interpretation
The trained GANs provide a starting point for the subsequent latent space exploration and inversion. The relatively low dimensionality allows the exploration of the individual axes zi of the latent space (Figure 4). The diagram shows the interpolation of each individual axis from −√d to √d, crossing the middle point of the latent space. As predicted in the previous section, the interpolation toward the center of the latent space generates an average isovist, however still intelligible, that looks almost like a rectangle. The interpolation shows inward convergence toward the center and divergence as it moves away from the center. A further interpolation beyond √d does not significantly change the generated isovists, which is consistent with the prediction that the most meaningful samples will be generated around the hypersphere's surface (White, 2016). The finding provides a foundation for a machinic intuition in quantifying a degree of isovist regularity according to its latent features' distance to the center of the latent space.

GANs inversion
The position of the latent representation of an isovist in the trained GAN's latent space provides hypothetical discriminative properties to measure the regularity of the isovist. An optimization technique is used to recover latent vectors from a set of input isovists. For image data, the optimization is performed by minimizing the reconstruction loss of the generated output, a combination of a pixel-wise mean squared error (MSE) distance with a perceptual loss (Karras et al., 2020). The lack of perceptual neural network models for isovist representation forces us to rely upon a single reconstruction error. After



Figure 4: Interpolation of the latent space along each individual axis
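The single-axis sweep shown in Figure 4 can be sketched as follows. The linear-tanh generator is a randomly initialized stand-in for the paper's trained 1D convolutional generator; only d = 16 and the 256 output features follow the text, and everything else is an illustrative assumption.

```python
import numpy as np

d, n_out = 16, 256                    # latent dimension and isovist features, as in the paper
rng = np.random.default_rng(0)

# Stand-in generator: a fixed random linear map with tanh squashing.
# This is only a placeholder for the trained 1D convolutional generator G.
W = rng.normal(size=(n_out, d)) / np.sqrt(d)

def G(z):
    """Map a latent vector z (shape (d,)) to a synthetic isovist (shape (n_out,))."""
    return np.tanh(W @ z)

def interpolate_axis(i, steps=9):
    """Sweep axis z_i from -sqrt(d) to +sqrt(d) with all other axes at zero,
    mirroring the single-axis interpolation of Figure 4."""
    values = np.linspace(-np.sqrt(d), np.sqrt(d), steps)
    return np.stack([G(np.eye(d)[i] * v) for v in values])

frames = interpolate_axis(0)          # (steps, n_out): one generated isovist per step
```

Plotting each row of `frames` in polar coordinates would reproduce the kind of strip shown in the figure; the middle step passes through the latent space's center, where the sketch's generator returns the mean-like all-zero output.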

Figure 5: Inverting latent vectors from input isovists
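The inversion pictured in Figure 5 can be sketched as an optimization loop: minimize an MAE reconstruction loss over the latent vector, initialize it at the latent space's center, and decay the learning rate, as the surrounding text describes. The random linear-tanh generator and all hyperparameters here are placeholder assumptions, not the trained model.

```python
import numpy as np

d, n_out = 16, 256
rng = np.random.default_rng(1)
W = rng.normal(size=(n_out, d)) / np.sqrt(d)   # stand-in for the trained generator

def G(z):
    return np.tanh(W @ z)

def invert(target, steps=3000, lr0=2.0):
    """Recover a latent vector for `target` by gradient descent on the MAE
    reconstruction loss, starting from the latent space's center and
    decaying the learning rate over the run."""
    z = np.zeros(d)
    for t in range(steps):
        y = G(z)
        residual = np.sign(y - target) * (1.0 - y**2)  # d(MAE)/dy times tanh' at Wz
        grad = W.T @ residual / n_out                  # chain rule back to z
        z -= lr0 * (1.0 - t / steps) * grad
    return z

z_true = rng.normal(size=d)
target = G(z_true)              # an isovist the stand-in generator can reproduce
z_rec = invert(target)
mae = float(np.mean(np.abs(G(z_rec) - target)))
```

As in the paper's experiments, the reconstruction is approximate rather than exact; batching several latent vectors at once, as the text mentions, would amortize the cost of the projections.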

several experiments, we obtained the best results using the mean absolute error (MAE) as a reconstruction loss instead. We adopt some techniques from (Karras et al., 2018): ramping up noise during the optimization and using a scheduled learning rate for a more stable result. The latent vectors are initialized from the latent space's center coordinates, and the optimization is performed in batches to improve the efficiency of the projections.

The inversion process can recover the latent vector from a set of input isovists to some degree, as shown in Figure 5. The lack of a perceptual loss and the highly diverse isovists make a perfect inversion of the isovist almost impossible to achieve. However,



the reconstruction of the isovists still shows that the recovered latent vector can generate most of the geometry of the isovists. Therefore, the recovered latent vector is used as an embedding for the isovists, and the relation between the latent vectors and the isovists in the architectural plan is explored.

Architectural encoding
The exploration of the latent space indicates that the Euclidean norm of the latent vector to the center of the latent space provides discriminative properties for the isovist's distinctive quality. We sort the projected latent codes from the evaluation floorplan dataset based on their Euclidean norm and plot the isovists from both ends of the latent representation to clarify our hypothesis (Figure 6). The mapping reveals that most isovists with the largest Euclidean norms for their latent vectors lie in relatively unique vantage points, such as near small windows, between two openings, next to walls, or on balconies. On the other hand, the isovists whose encodings have the minimum Euclidean norms mostly sit in the center of the main spaces, capturing the common situation of the plan.

The generated isovists show the contrast between these two sets of isovists. The isovists whose latent codes sit near the center of the latent space resonate with the notion of 'spatial center' introduced in (Franz and Wiener, 2008) as the common referencing point to depict the overall morphological feature of spaces. Quite the opposite, the isovists whose embeddings are located far from the center of the latent space seem to represent the floorplan's salient spaces.

The common spatial signature provides an analog to architectural typology, where the typical pattern among collective architecture is established. We develop a framework to index architectural plans based on this latent space signature, representing the regularity of architectural moments among different floorplans, thus providing a novel way to index architectural space.

Figure 6: Architectural mapping of isovist latent signature
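The sorting procedure behind Figure 6 — ranking the inverted latent codes of a floorplan's isovists by their Euclidean norm — can be sketched as follows. The random codes and the median-based scalar index are illustrative stand-ins for the embeddings actually recovered by GAN inversion.

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-ins for the inverted latent codes of one floorplan's isovists:
# in the paper these come from GAN inversion; here they are random 16-d vectors.
codes = rng.normal(size=(500, 16))

norms = np.linalg.norm(codes, axis=1)      # distance to the latent space's center
order = np.argsort(norms)

generic = order[:10]      # smallest norms: mean-like isovists in main spaces
salient = order[-10:]     # largest norms: distinctive vantage points
plan_signature = float(np.median(norms))   # one illustrative scalar index per plan
```

Plotting the isovists at both ends of `order` back onto the floorplan reproduces the kind of mapping shown in the figure; the scalar index then allows plans to be compared and sorted.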



Figure 7: t-SNE visualization of latent-encoded floorplans
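A dimensionality-reduction sketch in the spirit of Figure 7: the paper projects signature encodings with t-SNE (scikit-learn's `sklearn.manifold.TSNE` would be the usual drop-in); a plain PCA via SVD is used here to keep the sketch free of extra dependencies, and the random signature matrix is a placeholder for the real encodings.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical per-floorplan signature encodings, e.g. the sorted Euclidean
# norms of each plan's inverted isovist codes (random placeholders here).
signatures = rng.normal(size=(200, 32))

# 2-D map for plotting. Center the data, take the top two right-singular
# vectors, and project: a linear stand-in for the paper's t-SNE embedding.
X = signatures - signatures.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
embedding = X @ Vt[:2].T          # (200, 2) coordinates for the scatter plot
```

Scattering `embedding` and coloring the points by cluster or signature value yields a typological overview of the floorplan collection comparable to the figure.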



TOWARD MACHINE'S INTUITION OF SPACE
The sampling probability of the latent space should match the corresponding statistical density in the training data (Karras, Laine and Aila, 2021). Intuitively, the use of a normal distribution for the latent space would assume uniformity of the generated examples near the center of the latent space. However, in high-dimensional space, the sampling density is relatively concentrated on the surface of a hypersphere and leaves the center of the latent space almost empty. Our experiments utilize a relatively lower-dimensional latent space than image-generating GANs, thus facilitating the interpretation of the latent space structure.

The experiment reveals a glimpse of the latent space structure of 1D convolutional progressive growing GANs trained on hundreds of thousands of isovist samples collected from thousands of housing floorplans. The Euclidean norm of the latent vector of the generated isovists indicates a degree of variability that can be translated into spatial measures. The mapping reveals that both unique and regular spatial signatures can be recognized by sorting the projections of the isovists of a floorplan according to their Euclidean norm. The found spatial signatures are potentially usable for spatial indexing and recognition and provide a direction for the discovery of a machinic intuition of space that is data-driven and statistical. Figure 7 shows the t-SNE visualization of architectural floorplans classified by their spatial signature encoding. The developed signature encoding provides a promising typological reading of architectural moments that drives the clusterization and ordering of the floorplans.

CONCLUSION AND FUTURE WORK
While most research on GANs for architectural floorplans has focused on images and their generative capacity, this research expands the use of GANs to isovist representation learning. The unsupervised nature of GANs training acts as a tool for the self-organizing mapping of isovist data in the latent space. Accessing and interpreting this latent space will give insight into this structure and facilitate different discriminative tasks. This research investigates the GANs' latent space's discriminative properties to provide a method to measure spatial quality using isovists and explore a new way to index an architectural floorplan.

The current stage of the research explores an experimental framework for isovist sampling, GANs inversion, and latent space interpretation in isovist-based architectural analysis. The discovered spatial signature from the GAN's inverted latent embedding is used to encode architectural plans and provide a means of spatial indexing. The experiments show a correlation between the Euclidean norm of the isovists' latent embeddings and their architectural properties in the floorplan. The measure indicates a degree of regularity of the isovists and suggests a basis for the development of the unsupervised discovery of spatial signatures.

Future work is to further examine the semantic patterns embedded in the GAN's latent space and develop different mapping strategies to facilitate the discovery of such patterns in the architectural floorplan. Research on the unsupervised discovery of richer semantic properties of the latent space is needed to improve the utility of this method. The quality of the generated isovists and the inversion techniques also need to be improved. Furthermore, as research on GANs interpretability is still improving, the disentanglement of isovist features in the GANs' latent space could also define the future direction of this research.

REFERENCES
Benedikt, M.L. (1979) 'To take hold of space: isovists and isovist fields', Environment and Planning B: Planning and Design, 6(1), pp. 47–65.
Chaillou, S. (2019) AI + Architecture: Towards a New Approach. Harvard University, Graduate School of Design.



Creswell, A. and Bharath, A.A. (2019) 'Inverting the Generator of a Generative Adversarial Network', IEEE Transactions on Neural Networks and Learning Systems, 30(7), pp. 1967–1974.
Franz, G. and Wiener, J.M. (2008) 'From Space Syntax to Space Semantics: A Behaviorally and Perceptually Oriented Methodology for the Efficient Description of the Geometry and Topology of Environments', Environment and Planning B: Planning and Design, 35(4), pp. 574–592.
Goodfellow, I. et al. (2014) 'Generative adversarial nets', in Advances in Neural Information Processing Systems, pp. 2672–2680.
Huang, J. et al. (2021) 'On GANs, NLP and Architecture: Combining Human and Machine Intelligences for the Generation and Evaluation of Meaningful Designs', Technology|Architecture + Design, 5(2), pp. 207–224.
Johanes, M. and Huang, J. (2021) 'Deep Learning Isovist: Unsupervised Spatial Encoding in Architecture', in ACADIA 2021 - REALIGNMENTS: Toward Critical Computation. Online + Global.
Kalervo, A. et al. (2019) 'CubiCasa5K: A Dataset and an Improved Multi-task Model for Floorplan Image Analysis', in Image Analysis: 21st Scandinavian Conference, SCIA 2019. Norrköping, Sweden: Springer-Verlag, pp. 28–40.
Karras, T. et al. (2018) 'Progressive Growing of GANs for Improved Quality, Stability, and Variation', arXiv:1710.10196 [cs, stat] [Preprint].
Karras, T. et al. (2020) 'Analyzing and Improving the Image Quality of StyleGAN', arXiv:1912.04958 [cs, eess, stat] [Preprint].
Karras, T., Laine, S. and Aila, T. (2021) 'A Style-Based Generator Architecture for Generative Adversarial Networks', IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(12), pp. 4217–4228.
Kilcher, Y., Lucchi, A. and Hofmann, T. (2018) 'Semantic Interpolation in Implicit Models', arXiv:1710.11381 [cs, stat] [Preprint].
Leduc, T., Chaillou, F. and Ouard, T. (2011) 'Towards a "typification" of the Pedestrian Surrounding Space: Analysis of the Isovist Using Digital Processing Method', in Advancing Geoinformation Science for a Changing World. Springer, Berlin, Heidelberg (Lecture Notes in Geoinformation and Cartography), pp. 275–292.
Liu, B. (2021) Progressive-GAN-pytorch. Available at: https://github.com/odegeasslbc/Progressive-GAN-pytorch (Accessed: 17 January 2022).
Newton, D. (2019) 'Generative Deep Learning in Architectural Design', Technology|Architecture + Design, 3(2), pp. 176–189.
Ostwald, M.J. and Dawes, M.J. (2020) 'the', Nexus Network Journal, 22(1), pp. 211–228.
Perarnau, G. et al. (2016) 'Invertible Conditional GANs for image editing', arXiv:1611.06355 [cs] [Preprint].
Psarra, S. (2009) Architecture and Narrative: The Formation of Space and Cultural Meaning. Milton Park, Abingdon, Oxon; New York, NY: Routledge.
Richardson, E. and Weiss, Y. (2018) 'On GANs and GMMs', in Bengio, S. et al. (eds) Advances in Neural Information Processing Systems. Curran Associates, Inc.
Sedlmeier, A. and Feld, S. (2018) 'Discovering and Learning Recurring Structures in Building Floor Plans', in Kiefer, P. et al. (eds) Progress in Location Based Services 2018. Cham: Springer International Publishing (Lecture Notes in Geoinformation and Cartography), pp. 151–170.
Voynov, A. and Babenko, A. (2020) 'Unsupervised Discovery of Interpretable Directions in the GAN Latent Space', in Proceedings of the 37th International Conference on Machine Learning. PMLR, pp. 9786–9796.
White, T. (2016) 'Sampling Generative Networks', arXiv:1609.04468 [cs, stat] [Preprint].
Xia, W. et al. (2022) 'GAN Inversion: A Survey', IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–17.
