
Knowledge-Based Systems 13 (2000) 459–470

www.elsevier.com/locate/knosys

Composition Analyzer: support tool for composition analysis on painting masterpieces

S. Tanaka a,*, J. Kurumizawa a, S. Inokuchi b,1, Y. Iwadate a

a ATR Media Integration and Communications Research Labs., 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
b Department of Systems and Human Science, Osaka University, 1-3 Machikaneyama-cho, Toyonaka City, Osaka 560-8531, Japan

* Corresponding author. Tel.: +81-774-95-1465; fax: +81-774-95-1408. E-mail address: gon@mic.atr.co.jp (S. Tanaka).
1 Tel.: +81-6-6850; fax: +81-6-6850-6371.

Abstract

In this paper, we propose a tool for extracting compositional information from pictures called the Composition Analyzer. This tool extracts such compositional information as the sizes, shapes, proportions, and locations of figures, by two processes. More specifically, it first segments a picture into figures and a ground by a figure extraction method we developed. It then extracts the above compositional information from the figures based on the Dynamic Symmetry principle. The extracted compositional information is used to refine the picture, and as such, facilitates the production of multimedia for non-professionals. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Composition; Paintings; Attractive region

Derived from 'Composition Analyzer: Computer Supported Composition Analysis on Masterpieces', published in the Proceedings of the Third Conference on Creativity and Cognition, Loughborough, UK, October 10-13, 1999, pp. 68-75. Reproduced with permission from ACM © 1999.

1. Introduction

Most problems faced by non-professional multimedia authors in creating good titles are not technological in nature, but rather due to a lack of expertise or knowledge about multimedia designs. Most commercial authoring tools have few functions to support such users in achieving their goals [1]. Consequently, non-professional authors have been suffering from not only problems in understanding tool functions, but also in deciding design details.

We believe that the multimedia elements of professional products, such as color combinations, textures, compositions, and lighting effects, encompass a lot of the professional techniques or expert knowledge developed throughout the history of art. Consequently, providing these elements with appropriate tools to non-professional authors can navigate these non-professionals towards creating better products [2]. On this assumption, we have been developing a creative learning environment to help authors with the production of better images (see Fig. 1) [2].

Fig. 1. Cyber Atelier.

In previous research, Nakakoji et al. developed a knowledge-based color critiquing support system, eMMaC, which critiques the use of color in a title and suggests appropriate color usage [1]. This system utilizes theories and guidelines on human color perception, cultural associations of color, and appropriate color combinations, which have been studied in visual communications design, to construct a rule base. Our system is an example-based system rather than a rule-based system.

In this paper, we present one of the tools the above environment has, i.e. the "Composition Analyzer", which extracts compositional information from pictures, such as the shapes, proportions and locations of figures.

This paper is organized as follows: Section 2 presents an overview of the Composition Analyzer, Section 3 describes the composition analysis processes of this tool, and Section 4 introduces a system that utilizes compositional information extracted by the Composition Analyzer to refine a picture.

2. Composition analyzer

Composition involves many aspects; however, it can roughly be said that composition is a plan for arranging objects in a picture with a good balance [3-6]. Any picture emotionally affects its viewers differently depending on how the objects in the picture are composed. The composition can create not only emotional effects but also rhythm or dynamics in the picture. It is therefore important to determine where objects should be located, and also what sizes or shapes these objects should have [6].

Throughout the history of art, the golden section has been used to make the most beautiful and ideal proportions of

architectures or art work [5]. Such proportions had already been used by the ancient Egyptian civilization. In the Middle Ages, they were called "Divina Proportions" and people thought God blessed them. These beautiful proportions have had a significant influence on art work. Many authors who have created masterpieces have used these proportions to compose what they wanted to express in their paintings. In particular, Dutch Masters in the 17th century, such as Rembrandt and Vermeer, maintained this traditional method to create their art work [5]. For example, "Lady Seated at the Virginals" and "The Love Letter" by Vermeer are typical examples of the golden section. In the modern era, Seurat, Cezanne, Dali, Picasso, and Mondrian have composed their paintings based on this idea [5]. Therefore, by using the idea of the golden section, it is possible to analyze relationships among objects in a picture.

The Composition Analyzer extracts compositional information from a picture based on the idea of the golden section. It first segments a picture into figures and a ground by a figure extraction method. It then extracts the sizes, shapes, locations, and proportions of the extracted figures based on the idea of the golden section. In the next section, the above composition analysis process is described in detail.

3. Composition analysis process

3.1. Figure-ground segmentation

There are regions able to be recognized as the figures of a picture and regions able to be recognized as the ground of the picture [7]. We assume that when a viewer looks at a picture, he/she first looks over the whole picture and recognizes figures in the picture, and then, the viewer moves his/her attention according to the level of attractiveness of the figures. Therefore, it can be said that two processes are performed in evaluating the level of attractiveness in the visual system.

For instance, in Fig. 2, it is very easy to tell what the most attractive object would be. In this case, "X" would be the most attractive, because it is completely different from the other objects. But before this could be concluded, i.e. "X" being the most attractive object, the viewer would obviously have had to segment the picture into figures ("X" and "L") and the ground (the white area). He/she would then have had to evaluate the attractiveness of each figure, and finally, to select "X" as the most attractive object.

Fig. 2. An example picture of a search problem.

To confirm whether or not the above assumption is valid, we carried out experiments on tracking the eye movements of 10 people while they were looking at pictures. We also carried out experiments in which the subjects discriminated each of the same pictures into figures and a ground. An eye tracking camera system was used to track the eye movements. For the latter experiments, we showed the subjects the segmentation results for each picture, and then asked them to select the regions that were parts of the figures. The Edge Flow model was used for the segmentation [12]. From the results, we found that the subjects mostly paid attention to those regions they had selected as figures in the second experiments. The subjects first recognized figures of the picture. Then, they evaluated the level of attractiveness for each figure based on the physical stimulus, the meaning of the figure, or their interest. Finally, they moved their attention according to their evaluation results. Fig. 3 shows an example of a result from the experiments. As a result of the above experiments, we confirmed that our assumption is valid.

Fig. 3. An example of a result from experiments.

In fact, it has been found that the V4 cortex in the human visual system plays an important role in figure-ground segmentation [8]. The V4 cortex is sensitive to many kinds of information, both in the spatial domain and in the spectral domain, relevant to object recognition [8]. In the spatial domain, many V4 cells exhibit some length, width, orientation, direction of motion, and spatial frequency selectivity. In the spectral domain, they are tuned to the wavelength [8]. In particular, it has been found that most V4 cells respond best to a receptive field stimulus if there is a spectral difference between the receptive field stimulus and its surroundings [8]. These findings indicate that one of the contributions of the V4 cortex to visual processes is figure-ground segmentation.

At the V4 cortex, no semantic information is processed. Consequently, no attractiveness evaluation is performed based on the meanings of scenes or the viewer's interests at this stage. Accordingly, it is possible for a picture to be segmented into figures and a ground based only on physical features, such as the spectral domain (color) and the spatial domain (texture) processed by the V4 cortex.

From the above considerations, we use the color contrast and texture contrast of regions for figure-ground segmentation.

3.1.1. Contrast parameter definition

Two types of contrast can be considered for picture regions. One is a local contrast, i.e. the difference between a region and its surroundings. The other is a global contrast, i.e. the difference between a region and the whole picture. Here, we use both the local contrast and the global contrast.

In addition to the above types of contrast, focus is another important factor for an enhancement of the contrast [6]. A focused region is more attractive than a blurred region. Furthermore, the contour of a focused region is sharp; the contour of a blurred region is not.

From the above considerations, the following parameters are used for figure-ground segmentation.

1. Local color contrast:

f_{i,1} = \frac{ColorDif_i - \min_k(ColorDif_k)}{\max_j(ColorDif_j) - \min_k(ColorDif_k)}   (1)

ColorDif_i = w_i \frac{1}{n_i} \sum_{j=1}^{n_i} RgnColDif_{i,j}   (2)

RgnColDif_{i,j} = \sqrt{(Lt_i - Ls_j)^2 + (at_i - as_j)^2 + (bt_i - bs_j)^2}   (3)

w_i = \frac{1}{|e_i - 2|} \cdot \frac{tl_i}{l_i}   (4)

where
f_{i,1}: the local color contrast of region i,
RgnColDif_{i,j}: the color difference between region i and neighboring region j, which touches region i,
w_i: the penalty coefficient of region i,
e_i: the Euler number of the mask image of region i,
tl_i: the length of the border line between region i and its neighboring regions,
l_i: the length of the contour of region i,
Lt_i, at_i, bt_i: the color value of region i in the L*a*b* color space,
Ls_j, as_j, bs_j: the color value of neighboring region j in the L*a*b* color space,
n_i: the number of neighboring regions of region i.

2. Local texture contrast:

f_{i,2} = \frac{TexDif_i - \min_k(TexDif_k)}{\max_j(TexDif_j) - \min_k(TexDif_k)}   (5)

TexDif_i = w_i \frac{1}{n_i} \sum_{j=1}^{n_i} RgnTexDif_{i,j}   (6)

RgnTexDif_{i,j} = \sqrt{\sum_{k=1}^{nf} (Tt_{i,k} - Ts_{j,k})^2}   (7)

where
f_{i,2}: the local texture contrast of region i,
RgnTexDif_{i,j}: the Euclidean distance between the texture feature vectors of region i and neighboring region j,
Tt_{i,k}: the texture feature vector of region i,
Ts_{j,k}: the texture feature vector of neighboring region j,
nf: the number of elements in the texture feature vector.

3. Global color contrast:

f_{i,3} = \frac{GRgnColDif_i - \min_k(GRgnColDif_k)}{\max_j(GRgnColDif_j) - \min_k(GRgnColDif_k)}   (8)

GRgnColDif_i = w_i \sqrt{(Lt_i - L_{av})^2 + (at_i - a_{av})^2 + (bt_i - b_{av})^2}   (9)

where
f_{i,3}: the global color contrast of region i,
L_{av}, a_{av}, b_{av}: the average color value of the picture.

4. Global texture contrast:

f_{i,4} = \frac{GRgnTexDif_i - \min_k(GRgnTexDif_k)}{\max_j(GRgnTexDif_j) - \min_k(GRgnTexDif_k)}   (10)

GRgnTexDif_i = w_i \sqrt{\sum_{k=1}^{nf} (Tt_{i,k} - Tav_k)^2}   (11)
where
f_{i,4}: the global texture contrast of region i,
Tav: the average texture feature vector of the picture.

5. Sharpness of contour:

f_{i,5} = \frac{Focus_i - \min_k(Focus_k)}{\max_j(Focus_j) - \min_k(Focus_k)}   (12)

Focus_i = w_i \frac{1}{n_i} \sum_{j=1}^{n_i} |\nabla Rc_{i,j}(x, y)|   (13)

|\nabla Rc_{i,j}(x, y)| = \sqrt{Rcx_{i,j}^2(x, y) + Rcy_{i,j}^2(x, y)}   (14)

where
f_{i,5}: the sharpness of the contour of region i,
Focus_i: the average edge magnitude of the contour of region i (in Eq. (13), n_i denotes the number of pixels on the contour),
|\nabla Rc_{i,j}(x, y)|: the edge magnitude of pixel j on the contour of region i,
Rcx: the gradient in the x direction,
Rcy: the gradient in the y direction.
For the color difference calculation, we use the CIE L*a*b* color space. This is because color differences in this color space generally correspond to the human visual sense [7]. For texture features, a multi-resolution representation based on Gabor filters is used. Gabor features have been used to characterize the underlying texture information of given regions [10,11]. Because Gabor filters are not orthogonal to each other in nature, however, redundant information exists in the filtered images. In order to reduce such redundant information, the filter parameters must be chosen by using the algorithm presented in [9]. This algorithm ensures that the half-peak magnitudes of the filter responses in the frequency spectra touch each other, as shown in Fig. 4. To represent texture features, we use 24 filters consisting of four scaled and six oriented filters.

Fig. 4. Response of a Gabor filter bank.
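To make the texture representation concrete, the following Python sketch builds such a four-scale, six-orientation bank with NumPy. The kernel size, dyadic wavelength spacing, and envelope width below are illustrative assumptions; the actual parameters are chosen by the algorithm of [9], which is not reproduced here.

import numpy as np

def gabor_kernel(ksize, sigma, theta, wavelength, gamma=0.5):
    # Real part of a Gabor filter: a cosine carrier under a Gaussian envelope.
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate to orientation theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def gabor_bank(scales=4, orientations=6):
    # 24 filters: four scales x six orientations, as described above.
    bank = []
    for s in range(scales):
        wavelength = 4.0 * 2 ** s        # assumed dyadic spacing of scales
        sigma = 0.56 * wavelength        # assumed envelope width
        for o in range(orientations):
            theta = o * np.pi / orientations
            bank.append(gabor_kernel(31, sigma, theta, wavelength))
    return bank

# A region's texture feature vector (Tt_i, nf = 24 elements) can then be taken
# as, e.g., the mean response energy of the region's pixels for each filter.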
A penalty coefficient is employed based on the following two characteristics of figures.

1. A closed or surrounded region is apt to be regarded as a figure [6].
2. A figure is seen as having a contour; the ground is not [6].

Concerning item (1) above, an Euler number is calculated for each mask image of a region. This number is calculated by subtracting the number of holes in the objects from the number of objects in the picture [15]. For instance, if a picture has one object and the object has two holes, then the Euler number of the picture will be -1. Based on this characteristic, the penalty coefficient for item (1) (the left factor of Eq. (4)) is calculated such that the greater the number of holes in the region, the smaller the contrast value will be.

For item (2) above, there are regions that touch the edge(s) of a picture. Such regions have a high possibility of being the ground of the picture. To represent whether or not a region is completely surrounded by other regions, the length of the tangent line to the surrounding regions is measured (excluding the inside of the region, see Fig. 5). Then, this length is divided by the length of the contour of the region. By multiplying every parameter by this value (the right factor of Eq. (4), see Fig. 5), the contrast value of a region is forced to be small if the region touches the edge(s) of the picture.

Fig. 5. Definition of the contour and the tangent line of a region.
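As a minimal sketch of Eqs. (1)-(4), the following fragment computes the penalty coefficient and the local color contrast for a set of segmented regions. The region representation (per-region mean L*a*b* values, adjacency lists, Euler numbers, and border/contour lengths) is assumed to be supplied by the segmentation stage; the function names are ours, for illustration only.

import numpy as np

def penalty(euler, tangent_len, contour_len):
    # Eq. (4): regions with holes (|e_i - 2| grows with each hole) and regions
    # touching the picture edge (tl_i / l_i < 1) receive a smaller coefficient.
    return (1.0 / abs(euler - 2)) * (tangent_len / contour_len)

def local_color_contrast(lab, neighbors, w):
    # lab:       (N, 3) array of mean L*a*b* values, one row per region
    # neighbors: neighbors[i] lists the indices of regions touching region i
    # w:         (N,) penalty coefficients from Eq. (4)
    color_dif = np.empty(len(lab))
    for i, nb in enumerate(neighbors):
        d = np.linalg.norm(lab[nb] - lab[i], axis=1)   # Eq. (3), per neighbor
        color_dif[i] = w[i] * d.mean()                 # Eq. (2)
    # Eq. (1): min-max normalization over all regions of the picture.
    return (color_dif - color_dif.min()) / (color_dif.max() - color_dif.min())

# The remaining parameters f_{i,2}..f_{i,5} follow the same pattern of a
# penalized average followed by min-max normalization, with texture distances,
# picture-wide averages, and contour edge magnitudes in place of RgnColDif.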
3.1.2. Discrimination function

In order to analyze how people achieve figure-ground segmentation, we collected data on figure regions and ground regions of 100 pictures, by performing subjective experiments with 15 people. In the experiments, we showed complete original pictures individually on a CRT (2048 x 2048 resolution) and segmented regions on another CRT, and asked the subjects whether each segmented region was a part of the figure or not.

The Edge Flow model was used for the segmentation [12]. This model utilizes a predictive coding model to identify the direction of change in the color and texture at each image location on a given scale, and constructs an edge flow vector. By iteratively propagating the edge flow, boundaries are detected (at those image locations encountering two opposite directions of flow in the stable state) [12]. Then, the area surrounded by the boundaries is extracted as a region. This method can segment a picture into regions whose colors and textures are homogeneous. Therefore, the method suits our purpose, i.e. figure-ground segmentation based on the color and texture contrasts of regions.

In our experiments, the optimized parameters for the Edge Flow model were chosen for each picture during the segmentation. After the experiments were completed, the values of the defined contrast parameters (the number of data sets was 1162) were measured for each region. Then, principal component analysis (PCA for short) was carried out on the data to detect characteristics of the figure region and the ground region. Fig. 6 shows the results of the PCA. From the PCA results, we found one possibility for discriminating the figure region from the ground region.

Fig. 6. Result of PCA.

As a result of the experiments, we constructed a discrimination function by applying the Maximum Likelihood method to the PCA results. Let X be a feature vector for a region consisting of the principal component values, and let c_1 and c_2 represent the category for the figure region and the category for the ground region.

If and only if the following condition is satisfied, X is a figure region.

p(X|c_1) P(c_1) \geq p(X|c_2) P(c_2)   (15)

where
P(c_i): the occurrence probability of category c_i,
p(X|c_1): the probability that X is a part of the figure,
p(X|c_2): the probability that X is a part of the ground.

Note that P(c_i) is an unknown variable and it varies depending on the picture. There are pictures that have more figure regions than ground regions, and vice versa. Moreover, there are pictures in which the number of figure regions and the number of ground regions are the same. Therefore, let us assume that P(c_1) is equal to P(c_2). In fact, P(c_1) and P(c_2) were 0.48 and 0.52, respectively, in the experiments.

On the above assumption, Eq. (15) becomes as follows:

p(X|c_1) \geq p(X|c_2)   (16)

Assuming that p(X|c_i) can be represented as a K-dimensional normal distribution, p(X|c_i) becomes as follows:

p(X|c_i) = \frac{1}{(2\pi)^{K/2} |V_i|^{1/2}} \exp\left(-\frac{1}{2}(X - \bar{X}_i)' V_i^{-1} (X - \bar{X}_i)\right)   (17)

where
V_i: the covariance of the principal values in the figure or the ground data,
\bar{X}_i: the average principal values of the figure or the ground data.

By substituting Eq. (17) into Eq. (15) and taking the logarithm of both sides of Eq. (15), the following formula results.

\log\frac{|V_1|}{|V_2|} + (X - \bar{X}_1)' V_1^{-1} (X - \bar{X}_1) - (X - \bar{X}_2)' V_2^{-1} (X - \bar{X}_2) \leq 0   (18)

Here, we define our figure-ground discrimination function by using Eq. (18).

Dis(X) = \log\frac{|V_1|}{|V_2|} + (X - \bar{X}_1)' V_1^{-1} (X - \bar{X}_1) - (X - \bar{X}_2)' V_2^{-1} (X - \bar{X}_2)   (19)

If Dis(X) \leq 0, then X is a part of the figure.
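A minimal sketch of Eqs. (17)-(19) in Python: given the class means and covariances estimated from the figure and ground data, the discriminant is a difference of Mahalanobis distances plus a log-determinant term. NumPy is assumed; the function name is ours.

import numpy as np

def dis(x, mean1, cov1, mean2, cov2):
    # Eq. (19); class 1 is the figure category, class 2 the ground category.
    d1, d2 = x - mean1, x - mean2
    value = np.log(np.linalg.det(cov1) / np.linalg.det(cov2))
    value += d1 @ np.linalg.solve(cov1, d1)   # (X - X_1)' V_1^{-1} (X - X_1)
    value -= d2 @ np.linalg.solve(cov2, d2)   # (X - X_2)' V_2^{-1} (X - X_2)
    return value

# A region with feature vector x is classified as figure when dis(x) <= 0,
# i.e. when the figure Gaussian assigns it at least the likelihood that the
# ground Gaussian does (Eq. (16) with P(c1) = P(c2)).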
3.1.3. Figure extraction method

A figure extraction method is proposed here. The method has the following process (a skeleton of this pipeline is sketched after the parameter values below).

1. Segment a picture into plural regions by the Edge Flow model with the optimum parameters for the picture.
2. Measure the contrast parameters of the regions.
3. Transform the contrast parameters of the regions.
4. Transform the parameters to the principal component space.
5. Select those regions whose evaluation values of the discrimination function are less than zero.

Here, we use every principal component (five principal components) for the discrimination. From the results of the experiments, the following parameters for the discrimination function were obtained.
X' = [ x_1  x_2  x_3  x_4  x_5 ]

A =
[  0.46   0.46   0.22   0.51  -0.51 ]
[  0.52  -0.29  -0.13   0.48   0.63 ]
[  0.35   0.68  -0.22  -0.49   0.35 ]
[  0.45  -0.39  -0.60  -0.27  -0.46 ]
[  0.43  -0.30   0.73  -0.44  -0.01 ]

\bar{X}_1 = [ 0.96  -0.05   0.13   0.01   0.0 ]'

\bar{X}_2 = [ -0.66   0.03  -0.09  -0.01   0.0 ]'

V_1 =
[  2.71  -0.02  -0.21  -0.09   0.03 ]
[ -0.02   1.35   0.0    0.04   0.01 ]
[ -0.21   0.0    0.53  -0.01   0.03 ]
[ -0.09   0.04  -0.01   0.41   0.0  ]
[  0.03   0.01   0.03   0.0    0.19 ]

V_2 =
[  1.66   0.07   0.01   0.05  -0.02 ]
[  0.07   0.94   0.01  -0.03  -0.01 ]
[  0.01   0.01   0.56   0.01  -0.02 ]
[  0.05  -0.03   0.01   0.43   0.0  ]
[ -0.02  -0.01  -0.02   0.0    0.20 ]
Fig. 7 shows examples of results with this method.

Fig. 7. Figure extraction results.

We carried out experiments with this method in order to ascertain how precisely the method can extract the figure regions that viewers will recognize. In these experiments, the method could extract the figure regions selected by human subjects with 80% accuracy [15].

3.2. Composition analysis

As mentioned in the previous section, the golden section has been used to create the most beautiful and ideal proportions of architectures and art work [5]. In addition, many famous painters have used the idea of the golden section to compose objects in their paintings. Therefore, by using the idea of the golden section, it is possible to analyze relationships among objects in a picture.

A proportion is considered to be an attribute of a form because ratio formulas are often used to determine visual order or visual balance [14]. In ancient Greek civilization, architectures or sculptures were built based on the following proportions to create visual balance [5].

1. 1:1
2. 1:\sqrt{2}
3. 1:\sqrt{3}
4. 1:\sqrt{4}
5. 1:\sqrt{5}
6. 1:1.618 (the golden proportion)

A rectangle whose proportion is one of the above first five proportions is called a root rectangle. A rectangle whose proportion is the same as (6) is called a golden rectangle [5]. Jay Hambidge, who was a professor at Yale University, found that these rectangles can be created based on squares. According to the Dynamic Symmetry principle he proposed, these rectangles can be subdivided into smaller, similar rectangles that have parallel or perpendicular diagonals.

The aspect ratio of a canvas actually has almost the same proportion as one of the above proportions. There are three standard types of canvas currently in use: Figure, Paysage, and Marine. Paysage is a \sqrt{2} rectangle, Marine is a golden rectangle, and Figure is a rectangle made by combining two golden rectangles at their long edges [5]. Therefore, applying the Dynamic Symmetry principle to a canvas makes it possible to determine not only the size of an object but also its location, to maintain a good visual balance. Consequently, it becomes possible to extract compositional information from a picture by applying the principle to extracted figures.
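For illustration, the canvas type could be detected by comparing the aspect ratio of a picture against these proportions, as in the hedged sketch below. The value 2/phi (approximately 1.236) used for the Figure type, two golden rectangles joined at their long edges, is inferred from the description above rather than quoted from the paper.

import math

PHI = (1 + math.sqrt(5)) / 2      # 1.618..., the golden proportion

CANVAS_TYPES = {
    "Paysage": math.sqrt(2),      # root-2 rectangle
    "Marine": PHI,                # golden rectangle
    "Figure": 2 / PHI,            # two golden rectangles on their long edges
}

def detect_canvas_type(width, height, tol=0.05):
    # Return the standard canvas type whose proportion best matches, if any.
    ratio = max(width, height) / min(width, height)
    name, target = min(CANVAS_TYPES.items(), key=lambda kv: abs(kv[1] - ratio))
    return name if abs(target - ratio) <= tol else None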
3.2.1. Composition extraction method

The way to subdivide a rectangle according to the Dynamic Symmetry principle is as follows: first, a diagonal line is drawn across the rectangle. Second, perpendicular to this diagonal, another line is constructed. This line intersects the long side of the rectangle and divides it into two smaller similar rectangles. The same procedure can be performed on the smaller rectangles (see Fig. 8).

Fig. 8. Subdivision process based on the Dynamic Symmetry principle.

As mentioned in the previous section, the Figure type canvas is a rectangle made by combining two golden rectangles at their long edges [5]. Therefore, it is necessary for the Figure type canvas to be split in half at the beginning, in order to apply the Dynamic Symmetry principle. Moreover, the center of a picture is also important; therefore, the center lines are drawn at the beginning.

Here, we propose the following composition extraction method (a sketch of the subdivision step is given after the procedure).

Procedure:
Detect the canvas type
Draw the center lines
IF the type is Figure
    Subdivide the picture into two golden rectangles
    Make the two rectangles the targets
ELSE
    Make the whole picture the target
END IF
WHILE it is not stable
    FOR the target rectangles in the picture
        Divide the rectangles into smaller rectangles
        FOR the smaller rectangles
            IF a rectangle is too small
                Ignore the rectangle
            ELSE
                IF the rectangle is occupied by the figure at the rate of the threshold
                    Ignore the rectangle
                ELSE IF the rectangle is occupied by the ground at the rate of the threshold
                    Ignore the rectangle
                ELSE
                    Make the rectangle the target
                END IF
            END IF
        END FOR
    END FOR
END WHILE
Extract the regions that are constructed by base lines and occupied by the figure at the rate of the threshold

Compositional information has now been extracted from the picture, consisting of base lines and an abstracted figure.
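The geometric core of the procedure is the subdivision step of Section 3.2.1. For an axis-aligned w x h rectangle with w >= h, the perpendicular dropped from a corner onto the diagonal meets the long side at distance h^2/w, cutting off a smaller rectangle similar to the original. The sketch below encodes this; the rectangle representation and the minimum-size cutoff are illustrative.

def subdivide(rect):
    # One Dynamic Symmetry step: split rect = (x, y, w, h) into a similar
    # "reciprocal" rectangle and the remainder.
    x, y, w, h = rect
    if w >= h:
        cut = h * h / w                            # landscape: cut the width
        return (x, y, cut, h), (x + cut, y, w - cut, h)
    cut = w * w / h                                # portrait: cut the height
    return (x, y, w, cut), (x, y + cut, w, h - cut)

def subdivide_recursively(rect, min_side=16.0):
    # "IF a rectangle is too small: ignore" from the procedure above; the
    # figure/ground occupancy tests are omitted in this sketch.
    x, y, w, h = rect
    if min(w, h) < min_side:
        return []
    a, b = subdivide(rect)
    return [a, b] + subdivide_recursively(a, min_side) + subdivide_recursively(b, min_side)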

Fig. 9 shows an example of a result.


As a result of the method, we can explore how the figure was drawn. In the example of Fig. 9, it is possible to see that the body line of Venus was determined based on a base line of the golden section, and that the location of each part of the body was also determined by golden section points. By using this tool, the user can learn how professional painters maintain the visual balance of a picture.

Fig. 9. Result of the composition extraction method.
3.2.2. Experiment of extracting compositional information from masterpieces

In order to ascertain how applicable the proposed method is to existing paintings, we performed experiments on extracting the compositions of 100 paintings by the method. We collected paintings from the beginning of the Renaissance to the Modern era, all of which were available on CD-ROM [16]. In the experiments, we asked professional painters to judge whether or not the method could extract proper golden section points and base lines able to explain how the subjects within the paintings were drawn.

The results of the experiments are shown in Table 1.

Table 1
Result of experiments

Era                        Century   Success rate
Beginning of Renaissance   15        0.25
Mid-Renaissance            16        0.80
Baroque                    17        0.70
Romanticism                18        0.13
Impressionism              19        0.76
Modern                     20        0.75
From the results in Table 1, the average of the success rates is 56.5%, and the success rates differ depending on the era. To understand these results, we investigated the history of paintings.

In fact, the golden section was not a popular technique for determining the composition of paintings at the beginning of the Renaissance, and it was only used by painters in Northern Europe [13]. Then, the golden section gradually spread out to all of Europe, and in the mid-Renaissance period, it became a common technique among painters and architects, especially those who designed churches or painted pictures of altars. In the Baroque period, the Dutch masters in particular used this traditional technique to compose their paintings. In Romanticism and Realism, artists were not at all concerned about composition; they simply wanted to compose their works without considering composition. With Impressionism, artists came to realize the importance of composition again, and the golden section was revived. This movement is still going on.

Considering the above, it stands to reason why we got the results shown in Table 1. What is more interesting, however, is that this engineering approach can give quantitative evidence to support theories in Art.

The average of the success rates in Table 1 is 56.5%; however, the method can be useful enough to collect a sufficient amount of compositional information considering the existing number of paintings.

4. Image re-composer

Here, we introduce an application system that uses the compositional information extracted by the Composition Analyzer. The system is called "Image Re-Composer", and is a post-production tool that decomposes a picture and regenerates it as a new improved picture according to compositional information extracted from masterpieces (see Fig. 10). Since the tool allows the user to generate different pictures depending on specified compositions, the user can experiment with a variety of good compositions on an original picture. As a result, the user can learn how masters have maintained the visual balance of pictures.

Fig. 10. Image re-composer.

For the image recomposition, the system asks the user to input three pictures: an original picture, a ground picture, and a guide picture. The ground picture is a picture onto which the system recomposes figures of the original picture. Both the original picture and ground picture are input into the system by using an image scanner or by specifying them as files. The guide picture is a picture with compositional information, and provides recomposition guidelines for the system.

An Image Database is available for guide picture retrieval. When the user searches for pictures, the user can specify an author's name, a picture's name, a type of picture (portrait, scenery, group, etc.), or how many or what kinds of objects there are.

After the user inputs the original picture, the system tries to extract figure objects from the picture by the figure extraction method. Then, the user chooses objects which are to be recomposed by the system. Image Re-Composer finally recomposes the selected objects according to the compositional information of the guide picture.

The following sections explain the above processes in detail.

4.1. Object selection

After the user inputs a picture having the desired objects to recompose, Image Re-Composer tries to extract those objects by the figure extraction method. However, because the extraction method will not always give perfect results, the user is asked by the system to correct the results when necessary. The system shows the extraction results with figure regions in color and ground regions in gray. If the user is not satisfied with the results, he/she can correct any one of them by selecting or de-selecting regions that have been mis-discriminated by the system. The system then asks the user to discriminate each object that the user wants to recompose. The user can discriminate each object by selecting the multiple regions constituting the object.

When the user extracts an object from the input picture, the object appears in an object browser of the Image Re-Composer control panel, as shown in Fig. 11. This tool registers all of the objects that the user previously extracted, and it allows the user to specify objects extracted from different pictures in order to recompose them within a picture. Similar browsers are also available for the ground picture and the guide picture.

If the user selects two objects, the system gives the objects IDs in order to establish the correspondence between objects in the guide pictures and the user-selected objects (see the left side of Fig. 11). Currently, the system allows the user to select at most two objects at the same time, because
we have experimentally found that it is better to recompose three or more objects as a group of objects rather than to recompose them individually.

Fig. 11. Image re-composer control panel.

4.2. Guide picture search

When the user retrieves a guide picture from the image database, the user can specify what kinds of objects and how many objects he/she wants to recompose as keywords. The system retrieves pictures that match the specified keywords, and then matches the shapes between the user-selected objects and the retrieved objects. Finally, the system shows the user guide pictures whose objects are as similar to the user-selected objects as possible. This function is provided to assist the user as much as possible in not specifying a bad combination. For example, when the user tries to recompose a standing figure, it is not good for the system to recommend the composition of a sitting figure, because the result would obviously be bad.

For the shape matching, we employ P-type Fourier descriptors [17] to represent the shapes, and measure the similarity as the Euclidean distance between vectors whose items are the above Fourier descriptors.
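The P-type descriptors of [17] are not reproduced here; the sketch below substitutes the classic complex-contour Fourier descriptor to show the matching scheme, describing each closed boundary by normalized FFT magnitudes and comparing them by Euclidean distance as in the text.

import numpy as np

def fourier_descriptor(contour, n_coeff=16):
    # contour: (N, 2) array of boundary points of a closed shape.
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z)
    # Drop the DC term (translation) and divide by the first harmonic (scale);
    # taking magnitudes discards the starting point and rotation.
    return np.abs(coeffs[1:n_coeff + 1]) / np.abs(coeffs[1])

def shape_distance(contour_a, contour_b):
    # Euclidean distance between descriptor vectors, used to rank how well
    # objects in a candidate guide picture match the user-selected objects.
    return np.linalg.norm(fourier_descriptor(contour_a)
                          - fourier_descriptor(contour_b))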
4.3. Image re-composition

For the image recomposition, the system adjusts the sizes and locations of the selected objects according to the specified guide picture, and composes the objects onto the specified ground picture.

For the size adjustment, the system calculates a scale coefficient for each object as follows. Let x be the number of pixels of a selected object, m the ratio between the size of an object within the guide picture and the size of the whole guide picture, n the number of pixels of the ground picture, and s the scale coefficient for the selected object.

s = m \cdot n / x   (20)

Then, the system scales the object with the calculated coefficient.

For the location adjustment, we calculate a new location with the following equations. Let w_g be the width of the guide picture, h_g the height of the guide picture, (x_g, y_g) the center of gravity of an object in the guide picture, w_b the width of the ground picture, h_b the height of the ground picture, and (x_f, y_f) the new center of gravity of the user-specified object.

x_f = w_b \cdot x_g / w_g   (21)

y_f = h_b \cdot y_g / h_g   (22)
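Eqs. (20)-(22) amount to a few lines. The helper below bundles them for clarity; its name and argument grouping are ours, not the system's interface.

def recompose_params(x, m, n, guide_size, guide_center, ground_size):
    # x: pixel count of the selected object; m: object-to-picture size ratio
    # in the guide picture; n: pixel count of the ground picture.
    wg, hg = guide_size        # guide picture width and height
    xg, yg = guide_center      # centroid of the object in the guide picture
    wb, hb = ground_size       # ground picture width and height
    s = m * n / x              # Eq. (20): scale coefficient
    xf = wb * xg / wg          # Eq. (21): new horizontal center of gravity
    yf = hb * yg / hg          # Eq. (22): new vertical center of gravity
    return s, (xf, yf)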
4.4. Image re-composed results

Fig. 12 shows some examples of results recomposed by the system. As shown in Fig. 12, the user can experiment with a variety of compositions on the same objects. Therefore, the system is useful for non-professionals to learn how to maintain the visual balance within a picture.

Fig. 12. Re-composed results.

5. Conclusion

In this paper, we described a tool for extracting the compositional information of pictures. The compositional information is extracted from a picture by two processes: figure extraction and composition analysis on the extracted
figures by using the Dynamic Symmetry principle. The result of the analysis is used in an application system, Image Re-Composer, which is a tool for refining a picture according to compositional information extracted from masterpieces. This tool can also help non-professionals learn how to maintain visual balance, since they are able to explore a variety of compositions of professional works. Furthermore, the tool can be used by researchers working in art societies to analyze paintings.

However, the compositional information extracted by the Composition Analyzer is static information. A dynamic composition also exists that represents the context of a picture [4]. This composition is known as the "Leading eye". In this case, the artist usually leads the attention of viewers from the main object in his/her work to various points in the picture, or leads the attention of viewers from a sub-object to the main object, by controlling the level of attraction of the sub-objects [4]. In order to extract this information, it is necessary to evaluate the level of attractiveness of regions properly. Future work will therefore involve such attractiveness evaluation.

References

[1] K. Nakakoji, B.N. Reeves, A. Aoki, H. Suzuki, K. Mizushima, eMMaC: knowledge-based color critiquing support for novice multimedia authors, Proc. ACM Multimedia '95, 1995, pp. 467-476.
[2] A. Plante, S. Tanaka, S. Inoue, M-Motion: a creative and learning environment facilitating the communication of emotions, Proc. CGIM '98, 1998, pp. 77-80.
[3] D.A. Dondis, A Primer of Visual Literacy, The MIT Press, Cambridge, MA, 1974.
[4] Shikaku Design Kenkyusho Corporation, Essence of Composition, 1995 (in Japanese).
[5] Yanagi Ryo, Golden Section, Bijyuthu Shuppan Sha, 1998 (in Japanese).
[6] R.D. Zakia, Perception and Imaging, Focal Press, 1997.
[7] Tadasu Oyama, Shogo Imai, Tenji Wake, Handbook of Sensation and Perception, Seishin Shobo, 1996 (in Japanese).
[8] R. Desimone, S.J. Schein, J. Moran, L.G. Ungerleider, Contour, color and shape analysis beyond the striate cortex, Vision Research 25 (1985) 441-452.
[9] B.S. Manjunath, W.Y. Ma, Texture features for browsing and retrieval of image data, IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (8) (1996) 837-842.
[10] D. Dunn, W.E. Higgins, Optimal Gabor filters for texture segmentation, IEEE Transactions on Image Processing 4 (7) (1995) 947-964.
[11] A.K. Jain, F. Farrokhnia, Unsupervised texture segmentation using Gabor filters, Pattern Recognition 24 (12) (1991) 1167-1186.
[12] W.Y. Ma, B.S. Manjunath, Edge flow: a framework of boundary detection and image segmentation, Proc. CVPR '97, 1997, pp. 744-749.
[13] T. Kanbayashi, K. Shioe, K. Shimamoto, Handbuch der Kunstwissenschaft, Keisou Shobou, 1997 (in Japanese).
[14] C. Wallschlaeger, C. Busic-Snyder, Basic Visual Concepts and Principles, McGraw-Hill, New York, 1992.
[15] S. Tanaka, Y. Iwadate, S. Inokuchi, A figure extraction method based on the color and texture contrasts of regions, Proc. ICIAP '99, 1999, pp. 12-17.
[16] Planet Art, A Gallery of Masters, 1997.
[17] Y. Uesaka, A new Fourier descriptor applicable to open curves, IEICE Transactions J67-A (3) (1983) 166-173.
