
Victor Okhoya, May 2016

CARNEGIE MELLON UNIVERSITY


DOCTOR OF PROFESSIONAL PRACTICE
EXPERIMENT REPORT

EXPERIMENT 5:
SPACE PLANNING USING MACHINE LEARNING AND GENETIC ALGORITHMS

ABSTRACT

The aim of this experiment is to establish whether machine learning and genetic algorithms can
recognize and generate architectural space configurations on a design grid. Although these
configurations can have varying patterns, it is assumed that they have a complex underlying logic that
data science methods can detect. In concrete terms, these configurations could represent functional
architectural floor plans. Thus, if data science algorithms can recognize and generate space
configurations on a design grid, this suggests that they can learn to recognize and generate
architectural floor plans.

PROBLEM STATEMENT

The study is motivated by the ability of machine learning algorithms to perform tasks such as visual
pattern recognition and the generation of musical harmonies. The question is whether this ability can be
extended to the recognition and generation of architectural space plans. Blessing and Wen,¹ for example,
describe the use of Support Vector Machines for classifying artworks by different prominent artists. Can
this be extended so that a machine learning algorithm can recognize architectural space plans on a
design grid? Wiggins et al. discuss the use of genetic algorithms to generate music.² Can genetic
algorithms also be used to generate architectural space plans?

This experiment is a proof of concept exercise that seeks to establish if the premise of using machine
learning and genetic algorithms to recognize and generate spatially variant configurations is plausible.
The broad idea is that architectural space plans can be represented as colored squares on a design grid
(see Figure 1) where the different colors represent the different space types of the space plan. The
design grid can be encoded into a linear string (see Figure 2) and several such strings can be used to
create a training set (see Figure 3). Machine learning algorithms can then be run on this training set to
see how well they perform in autonomously recognizing which configurations represent functional
space plans.
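The encoding step can be sketched in Python as follows. This is a minimal illustration only: it assumes a row-major reading order (left to right, top to bottom) and integer color codes, neither of which the report specifies, and the function name `encode_grid` is hypothetical.

```python
def encode_grid(grid):
    """Flatten a 2D grid of integer color codes into a linear string.

    Assumes row-major order (left to right, top to bottom); the report
    does not state which reading order was actually used.
    """
    return "".join(str(cell) for row in grid for cell in row)


# A 3x3 two-color grid; 1 = black and 0 = white is an assumed color coding.
grid = [
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
]
encoded = encode_grid(grid)
print(encoded)  # 001010010
```

Grids with more than ten space types would need a separator or a fixed-width code per cell, since single digits would no longer be unambiguous.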

In addition, binary logistic regression can be used on the data set to establish a relationship, expressed
as an equation, between the output (the classifier) and all inputs. This equation can then be used as the
basis of a fitness function that can be used to run a genetic algorithm. The genetic algorithm will
produce viable candidates for evaluation. Finally, the trained machine learning algorithm can be re-run
on the outputs from the genetic algorithm until it finds viable candidates. The computer outputs these
viable candidates for a human agent to validate.

¹ Blessing, A. & Wen, K., 2010.
² Wiggins, G. et al., 1998.


Figure 1. Design grid of colored squares.

Figure 2. Encode design grid into linear string.

Figure 3. Encoded training set.

METHOD

The experiment described here (Figure 4) is a simplified prototype for a larger, more determinative
study. The larger study will, however, need substantially more data and so the current experiment helps
define the process and metrics in anticipation of the larger study.

As a proof of concept a simple configuration of a design grid, a 3x3 square, is used. The coloring is
restricted to two colors: black and white. A ‘preferred’ configuration is drawn onto 100 design grids
(Figure 5) and a random configuration is drawn on another 100 design grids (Figure 6). The design grids
are then encoded in binary and classified. Microsoft Excel is used to create and encode the design grids
(Figure 7). The encoded design grids are exported as a CSV file and run in the machine learning tool
Weka. The results are shown in Table 1.
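The report builds the dataset in Microsoft Excel. As an alternative illustration, the same kind of labeled CSV file could be assembled in Python. The sketch below is not the author's workflow: the column names `x1` to `x9` and `class`, the label strings, and the row-major cell order are all assumptions, and only two sample grids (the first grids shown in Figures 5 and 6) are included.

```python
import csv
import io

# Hypothetical column names; the report does not list the CSV header.
HEADER = [f"x{i}" for i in range(1, 10)] + ["class"]


def grid_to_row(grid, label):
    """Flatten a 3x3 binary grid (row-major, assumed) and append its class label."""
    return [cell for row in grid for cell in row] + [label]


def build_csv(samples):
    """Return CSV text with a header row and one row per (grid, label) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for grid, label in samples:
        writer.writerow(grid_to_row(grid, label))
    return buffer.getvalue()


samples = [
    ([[0, 0, 1], [0, 1, 0], [0, 1, 0]], "preferred"),  # first grid of Figure 5
    ([[1, 0, 0], [0, 0, 1], [1, 0, 0]], "random"),     # first grid of Figure 6
]
csv_text = build_csv(samples)
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of file paths; the text can be saved to disk before loading it into a tool such as Weka.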


Figure 4. Using machine learning algorithms to detect space plans and genetic algorithms to generate space plans.
0 0 1
0 1 0
0 1 0

1 1 0
0 0 1
0 0 0

1 1 0
0 0 1
0 0 0

0 0 0
1 1 0
0 0 1

0 0 1
0 1 0
0 1 0

Figure 5. 'Preferred' configurations.


1 0 0
0 0 1
1 0 0

0 1 0
0 1 0
0 1 0

0 0 0
1 0 0
0 0 0

1 1 0
0 1 0
0 0 0

0 0 0
0 1 0
1 0 1

Figure 6. Random configurations.


Figure 7. Encoding the design grid in MS Excel.

Figure 8. Binary Logistic Regression in PSPP

ALGORITHM                       ACCURACY    KAPPA STATISTIC
Zero R                          56.12%      0
Naïve Bayes                     51.02%      -0.03
SMO (Support Vector Machine)    61.22%      38.78
Multilayer Perceptron           87.76%      0.75
J48 (Tree)                      81.63%      0.64
Random Forest                   89.80%      0.79

Table 1. Machine learning results on configuration dataset.
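For readers unfamiliar with the two metrics in Table 1, the sketch below shows how accuracy and Cohen's kappa are computed from a confusion matrix. Kappa corrects accuracy for the agreement expected by chance and cannot exceed 1, which is why a classifier that always predicts the majority class (as Zero R does) scores a kappa of 0 regardless of its accuracy. The confusion matrices here are hypothetical counts chosen for illustration; they are not taken from the experiment, and the function name is an assumption.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of instances of true class i that were
    predicted as class j. Assumes chance agreement is below 1.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, not the experiment's data.
majority_only = [[60, 0], [40, 0]]   # always predicts the first class
balanced = [[45, 5], [7, 43]]        # a reasonably good classifier

print(accuracy_and_kappa(majority_only))
print(accuracy_and_kappa(balanced))
```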

The dataset is also run in PSPP (Figure 8), a statistics program, using Binary Logistic Regression. The
following equation is derived:

$$\log\left(\frac{p}{1-p}\right) = -0.94 - 1.41x_1 + 1.01x_2 + 0.94x_3 + 0.6x_4 - 0.24x_5 + 1.84x_6 - 0.66x_7 + 0.06x_8 + 0.94x_9$$

where p is the probability that a given configuration is a ‘preferred’ configuration. This means that we
can create an objective function that seeks to maximize

$$p = \frac{e^x}{1 + e^x}$$


where $x = -0.94 - 1.41x_1 + 1.01x_2 + 0.94x_3 + 0.6x_4 - 0.24x_5 + 1.84x_6 - 0.66x_7 + 0.06x_8 + 0.94x_9$.
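As a worked illustration, the probability p can be evaluated for any encoded grid with a few lines of Python. The coefficients are those of the regression equation in the report; the mapping of x1 to x9 onto grid squares in row-major order is an assumption, as is the function name `preferred_probability`.

```python
import math

# Coefficients from the report's regression equation: intercept, then x1..x9.
INTERCEPT = -0.94
COEFFS = [-1.41, 1.01, 0.94, 0.6, -0.24, 1.84, -0.66, 0.06, 0.94]


def preferred_probability(cells):
    """Return p = e^x / (1 + e^x) for a list of nine binary cell values.

    The order of `cells` is assumed to be row-major; the report does not
    state how grid squares map to x1..x9.
    """
    if len(cells) != len(COEFFS):
        raise ValueError("expected nine cell values")
    x = INTERCEPT + sum(c * v for c, v in zip(COEFFS, cells))
    # 1 / (1 + e^-x) is algebraically the same as e^x / (1 + e^x).
    return 1.0 / (1.0 + math.exp(-x))


# First grid of Figure 5, flattened row by row: x = -0.94 + 0.94 - 0.24 + 0.06 = -0.18
cells = [0, 0, 1, 0, 1, 0, 0, 1, 0]
print(round(preferred_probability(cells), 4))
```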

In addition, we introduce the area factor a. Since the ‘preferred’ configuration always has three black
squares, we seek to maximize the objective function

$$p + 3 - |a - 3|$$

where a is the number of black squares in the configuration. We then run a genetic algorithm
developed in Python on the dataset with the aim of maximizing the objective function (see Figure 10).
The output from the genetic algorithm is gathered into a test set CSV file, and the algorithm already
trained in Weka is run on this test set. Any positively classified configurations are viable
candidates for our ‘preferred’ configurations.
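The following is a minimal sketch of how such an objective function could drive a genetic algorithm. It is a generic implementation written for illustration and is not the code listing of Figure 10: the population size, number of generations, retention fraction, mutation rate, single-point crossover, and all function names are assumptions, as is the coding of black squares as 1.

```python
import math
import random

# Coefficients from the report's regression equation: intercept, then x1..x9.
INTERCEPT = -0.94
COEFFS = [-1.41, 1.01, 0.94, 0.6, -0.24, 1.84, -0.66, 0.06, 0.94]


def fitness(cells):
    """Objective p + 3 - |a - 3|, where a is the number of black (1) squares."""
    x = INTERCEPT + sum(c * v for c, v in zip(COEFFS, cells))
    p = 1.0 / (1.0 + math.exp(-x))
    a = sum(cells)
    return p + 3 - abs(a - 3)


def evolve(pop_size=50, generations=40, retain=0.2, mutate=0.05, seed=0):
    """Evolve 9-bit configurations and return them sorted best first.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(9)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        # Keep the fittest individuals unchanged as parents (elitism).
        parents = ranked[: max(2, int(pop_size * retain))]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, 8)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each gene independently with probability `mutate`.
            child = [1 - g if rng.random() < mutate else g for g in child]
            children.append(child)
        population = parents + children
    return sorted(population, key=fitness, reverse=True)


best = evolve()[0]
print(best, round(fitness(best), 3))
```

Because p lies strictly between 0 and 1, the objective is always below 4, and any configuration with exactly three black squares scores above 3. The area term therefore dominates, and p only ranks configurations that have the same number of black squares.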

RESULTS

Several Weka machine learning algorithms were run on the configurations dataset. The results are
shown in Table 1. The Random Forest (89.80% accuracy, kappa 0.79) and Multilayer Perceptron (87.76%
accuracy, kappa 0.75) algorithms give the most promising results. The dataset was then run in PSPP and
the equation described above was derived as the basis of the fitness function. The genetic algorithm in
Figure 10 was used to generate test candidates, which were then tested with the machine learning
algorithm in Weka. From a dataset of 100 candidates generated by the genetic algorithm, the seven shown
in Figure 9 were identified as best meeting the criteria of the preferred configuration (excluding any
false negatives). Five of the seven are in fact exactly correct.

Figure 9. Results of machine learning test on genetic algorithm output.

In conclusion, it appears that the machine learning algorithms have good potential for detecting
architectural space plans. The genetic algorithm also appears to generate viable space plan
candidates. The simplicity of the prototype study means that broad conclusions cannot yet be
drawn. It does, however, suggest that a more definitive study should be undertaken as a next step.


Figure 10. Genetic Algorithm code listing in python 3.

³ Based on code retrieved on 5 May 2016 from http://lethain.com/genetic-algorithms-cool-name-damn-simple/

REFERENCES

Blessing, A., & Wen, K. (2010). Using machine learning for identification of art paintings. Technical report,
Stanford University.

Phon-Amnuaisuk, S., Law, E., & Kuan, H. (2007). Evolving music generation with SOM-fitness genetic
programming. Applications of Evolutionary Computing, 557-566.

Wiggins, G. A., Papadopoulos, G., Phon-Amnuaisuk, S., & Tuson, A. (1998). Evolutionary methods for
musical composition. 
