Lav Varshney
Abstract—The problems of face metamorphosis and face caricature are discussed. First, the
general problem of image metamorphosis, composed of the image warping and color
interpolation steps, is formulated. A field warping algorithm based on corresponding feature
primitives for image warping, and its use for face morphing, is presented. Extensions to
morphing among multiple input images are also provided. A method for generating continuous-tone
face caricatures using the same field warping algorithm is given, along with some
discussion of the generation, selection, and use of prototype faces for caricature generation. A
possible application of the face morphing and caricature techniques, the development of a face
beautifier, is discussed, although no implementation is given. Finally, face morphing and face
caricature are framed in a mathematical context as candidate operations for a formulation of
face space as a vector space. Step-by-step examples of face morphing and face caricature are
also given.
1. Image Metamorphosis
Image metamorphosis is an operation that allows a combination of images to be formed,
so that the combination is recognized as a natural image as well. In fact, image metamorphosis,
or simply morphing, as it is commonly called, allows an entire continuum of “in-between”
images to be formed. Traditionally, the primary application of morphing has been in the
entertainment industry, where it has been used to striking effect in film and television as a visual
effects tool for fluidly transforming one image into another. Classic examples include Michael
Jackson's Black or White and Steven Spielberg's Indiana Jones and the Last Crusade. The
morphing operation is actually composed of two coupled operations, namely image warping and
color interpolation. The geometric operations of image warping allow the features of the images
being morphed to stay aligned, whereas the color interpolation, or cross-fading, operation blends
color.
Figure 1. Forward mapping of input pixels A-G to output pixels A'-G'.
For digital imagery, the output values, indexed by ordered sets of real numbers, must be assigned
to discrete output pixel locations. There are a couple of drawbacks that arise from discrete
implementations of forward mapping. The first problem is that some pixels in the output image
may not be assigned any values, resulting in holes. As seen in Figure 1, the output pixel F’ is not
assigned a value. The second problem is that multiple input pixels may be assigned to the same
output pixel, resulting in overlap. As seen in Figure 1, the output pixel G’ is assigned two values
[1].
Inverse mapping implementations of image warping are able to overcome these two
problems. Inverse mapping operates by projecting each output coordinate into the input image
via the mapping transformations. The value of the pixel at that input point is copied onto the
output pixel. The output pixels are centered on integer coordinate values, and are projected onto
the input at real-valued positions. Figure 2 demonstrates the inverse mapping operation.
Figure 2. Inverse mapping of output pixels A'-G' onto input pixels A-G.
Unlike the forward mapping scheme, inverse mapping guarantees that all output pixels are
computed [1]. The overlap problem from forward mapping, however, is manifested in another
form. Evaluating the digital input image at arbitrary real-valued locations requires the input
image to be resampled. The simplest resampling method is point sampling, taking the value of
the closest pixel; however, this results in aliasing, referred to as the "jaggies" (after the jagged
edges induced in images) by the computer graphics community.
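The inverse mapping scheme with point sampling can be sketched as follows; the function name and the translation transform are illustrative (not from the text), and NumPy is assumed.

```python
import numpy as np

def inverse_warp(src, transform_inv, out_shape):
    """Inverse-map each output pixel into the source image.

    transform_inv maps an output (row, col) to a source (row, col);
    point sampling (nearest pixel, clamped to the image) is used here.
    """
    out = np.zeros(out_shape, dtype=src.dtype)
    rows, cols = src.shape
    for r in range(out_shape[0]):
        for c in range(out_shape[1]):
            sr, sc = transform_inv(r, c)
            # round to the nearest source pixel and clamp to the image
            sr = min(max(int(round(sr)), 0), rows - 1)
            sc = min(max(int(round(sc)), 0), cols - 1)
            out[r, c] = src[sr, sc]
    return out

# Example: translate by one row and two columns; every output pixel
# receives a value, so holes cannot occur, unlike forward mapping.
src = np.arange(16).reshape(4, 4)
shifted = inverse_warp(src, lambda r, c: (r - 1, c - 2), (4, 4))
```

Because the loop visits every output pixel exactly once, neither holes nor overlaps can arise; the aliasing of point sampling remains.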
Filters may be used to mitigate the effects of aliasing; however, there is a tradeoff between
blurring and aliasing, and the use of filtering increases the computational complexity.
Commonly used filtering methods include bilinearly interpolating the value of the four closest
pixels, and computing the weighted sum of a pixel neighborhood, using weights that are derived
from a normalized Gaussian function.
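The first of these filtering methods can be illustrated with a short sketch; the helper name is hypothetical and NumPy is assumed.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample img at a real-valued (y, x) location by bilinearly
    interpolating the four closest pixels."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, img.shape[0] - 1)
    x1 = min(x0 + 1, img.shape[1] - 1)
    fy, fx = y - y0, x - x0
    # interpolate along x on the top and bottom rows, then along y
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bot = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bot

img = np.array([[0.0, 10.0], [20.0, 30.0]])
# halfway between all four pixels yields their average
val = bilinear_sample(img, 0.5, 0.5)
```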
The simplest mapping functions include translations, rotations, dilations, affine
transformations, and perspective transformations. More complicated mapping functions are
generally based on warping locations of features in the input image. In order to develop these
mapping functions, it is necessary to locate salient features in the input image.
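A simple instance of such a mapping function is an affine transformation; the sketch below (illustrative matrix and offset, NumPy assumed) inverts x' = Ax + t so that output coordinates can be projected back into the input, as inverse mapping requires.

```python
import numpy as np

# An affine mapping x' = A x + t; here A is a 90-degree rotation and
# t a translation, both chosen purely for illustration.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
t = np.array([4.0, 0.0])

def affine_inv(pt):
    # invert x' = A x + t:  x = A^{-1} (x' - t)
    return np.linalg.solve(A, np.asarray(pt, dtype=float) - t)

# the output point (4, 2) came from the input point returned here
src_pt = affine_inv([4.0, 2.0])
```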
3. Feature Determination
Determining the mapping functions typically used for image morphing requires an
animator or other human operator to establish corresponding feature points of the images,
though there has been much work on automating this task for specific image classes, such as
human faces, e.g., [2]. Figure 3 shows an example of the features that may be either manually or
automatically specified.
Figure 3. Delineation of face features. (a) Original Image [4]. (b) Image with centers of eyes and edges
of mouth delineated.
Other features that are typically delineated in faces include the boundary between hair and
background, hairline, eyebrows, outline of eyes, outline of nose, outline of mouth, boundary
between face and background, chin, and ears. When morphing between other types of images,
boundaries are still the typical features that are marked. Generally, more features provide better
results; however, reasonable results can be obtained with a minimal set of features, such as those
shown in Figure 3. An additional benefit of a reduced feature set is reduced computational
complexity.
The Beier-Neely approach to image morphing [3] is based on fields of influence around
two-dimensional control primitives, lines that delineate features. Consequently, this method of
morphing is referred to as field morphing. First, warping with a single pair of corresponding
control lines is considered.
The feature line in the source image, or input image, is defined by its endpoints P’ and
Q’. The corresponding feature line in the destination image, the image towards which the source
image is to be morphed, has endpoints P and Q. These two line segments define an inverse
mapping from the destination image pixel coordinate X to the source image pixel coordinate X’.
The mapping is given in (1)-(3).
u = ((X − P) · (Q − P)) / |Q − P|²   (1)

v = ((X − P) · perpendicular(Q − P)) / |Q − P|   (2)

X' = P' + u (Q' − P') + v perpendicular(Q' − P') / |Q' − P'|   (3)
The dot product is the standard Euclidean inner product, the vector norm is the standard
Euclidean norm, and the perpendicular( ) operation gives a vector that is perpendicular to, and
the same length as, the input vector. Although there are two such vectors, the particular choice is
not important as long as the same choice is always made. The value u is the normalized distance
along the line, and v is the perpendicular distance from the line. A pictorial representation of this
mapping is given in Figure 4.
Figure 4. Beier-Neely warping transformation for a single pair of corresponding lines. (Figure 1 in [3])
The mapping leads to the warping algorithm given as pseudocode in Figure 5. Point sampling is
used for simplicity and clarity, at the expense of aliasing.
For each pixel X in destinationImage
    Calculate u and v according to (1) and (2)
    Calculate X' according to (3)
    Round elements of X' to nearest integers, Z'
    destinationImage(X) = sourceImage(Z')
end

Figure 5. Pseudocode for single-line Beier-Neely warping.
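The mapping of equations (1)-(3) can be sketched in Python as follows; NumPy is assumed, the function names are illustrative, and the perpendicular( ) choice is fixed to a 90-degree counterclockwise rotation.

```python
import numpy as np

def perpendicular(v):
    """A vector perpendicular to v with the same length (fixed choice)."""
    return np.array([-v[1], v[0]])

def single_line_map(X, P, Q, Pp, Qp):
    """Map destination pixel X to source coordinate X' for one line pair:
    (P, Q) is the line in the destination image, (Pp, Qp) the
    corresponding line in the source image, as in equations (1)-(3)."""
    X, P, Q, Pp, Qp = (np.asarray(a, dtype=float) for a in (X, P, Q, Pp, Qp))
    u = np.dot(X - P, Q - P) / np.linalg.norm(Q - P) ** 2          # eq. (1)
    v = np.dot(X - P, perpendicular(Q - P)) / np.linalg.norm(Q - P)  # eq. (2)
    # eq. (3): reconstruct the source position from (u, v)
    return Pp + u * (Qp - Pp) + v * perpendicular(Qp - Pp) / np.linalg.norm(Qp - Pp)

# When the two lines coincide, every pixel maps to itself.
Xp = single_line_map([3, 4], [0, 0], [10, 0], [0, 0], [10, 0])
```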
Simple geometric examples of single line correspondence Beier-Neely warping are shown in
Figures 6 and 7. With only a single line of correspondence, only these simple transformations
are possible.
Figure 6. Rotation example of Beier-Neely warping (Figure 2 in [3]). Arrows indicate P’Q’ and PQ
respectively. (a) Source image. (b) Destination image.
Figure 7. Dilation/Translation example of Beier-Neely warping (Figure 2 in [3]). Arrows indicate P’Q’
and PQ respectively. (a) Source image. (b) Destination image.
When multiple corresponding feature lines are provided, much more complicated
mappings are possible. In order to use multiple features, the basic Beier-Neely warping is
expanded so that a displacement for each of the lines is calculated and these displacements are
combined through a weighted sum. The weights are spatially variant, so that the lines closest to
a particular point are given more weight. Additionally, longer lines may be given more weight
than shorter lines.
The weighting parameter for a particular line is given by
weight = (length^p / (a + dist))^b   (4)
where length is the length of the line and dist is the distance from the pixel to the line. The
parameters p, a, and b may be used to change the relative effects of the various lines. The
parameter a determines the effect that distance has; if a is nearly zero, then a point right on the
line will have nearly infinite weight, while increasing a makes the warp smoother. The
parameter b determines how fast the effect of distance falls off. The parameter p relates the
length of the line to its weight. Figure 8 demonstrates the algorithm for multiple line Beier-
Neely warping.
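The behavior of equation (4) can be illustrated with a short sketch; the parameter values for p, a, and b are illustrative, not prescribed by the text.

```python
def line_weight(length, dist, p=0.5, a=0.01, b=2.0):
    """Spatially variant weight from equation (4); p, a, b values
    here are illustrative defaults."""
    return (length ** p / (a + dist)) ** b

# A pixel close to a line receives far more weight than a distant one,
# and a longer line receives more weight than a shorter one (p > 0).
near = line_weight(length=10.0, dist=0.1)
far = line_weight(length=10.0, dist=5.0)
```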
A pictorial representation of the multiple line Beier-Neely algorithm is given in Figure 9, using
the two line case as a representative.
Figure 9. Beier-Neely warping transformation for two pairs of corresponding lines. (Figure 3 in [3])
A simple example of the two line Beier-Neely algorithm is given in Figure 10. As seen, even
with the two line case, a complicated warping operation is possible.
Figure 10. Simple example of two line Beier-Neely warping (Figure 4 in [3]). (a) Source image. (b)
Destination image.
6. Morphing Between Two Faces
In this section, a complete step-through of the morphing of two faces is presented. We
start with two original faces, shown in Figure 11.
(a) (b)
Figure 11. Original Images. (a) Image A [4]. (b) Image B [4].
The next step is to mark the corresponding features in the two images, the Pi’ and Qi’.
Here we consider the centers of the eyes and the edges of the mouth. Images with delineated
features are shown in Figure 12.
Figure 12. Original Images with delineated features. (a) Image A. (b) Image B.
Next, an intermediate set of features is derived from the features of the two input images.
Here, a half-and-half endpoint interpolation method is used to generate the intermediate features.
This is shown in Figure 13.
Figure 13. Intermediate feature set derived by endpoint interpolation.
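The endpoint interpolation step can be sketched as follows; the feature coordinates are illustrative, not taken from the figures.

```python
import numpy as np

# Endpoint interpolation of corresponding feature lines: the intermediate
# endpoints are the half-and-half average of the endpoints from A and B.
features_A = np.array([[30.0, 40.0], [50.0, 40.0]])  # illustrative endpoints
features_B = np.array([[34.0, 44.0], [54.0, 48.0]])
intermediate = 0.5 * features_A + 0.5 * features_B
```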
The next step is to warp the two input images towards the intermediate feature set. This
is accomplished using the Beier-Neely two line warping algorithm. The results of this warping
are shown in Figure 14.
Figure 14. Input images warped towards the intermediate feature set. (a) Warped Image A.
(b) Warped Image B.
The final step is to cross-dissolve the two warped images to get the final morphed result.
Figure 15 shows the result of this cross-dissolve.
Figure 15. Final Morphed Image, 50% Image A and 50% Image B.
Varying the percentages of images A and B in the morph allows a continuum of faces to
be created. Figure 16 shows a series of faces created from the input images shown in Figure 11.
Figure 16. Series of Morphed Faces. (a) 100% B, 0% A. (b) 80% B, 20% A. (c) 60% B, 40% A.
(d) 40% B, 60% A. (e) 20% B, 80% A. (f) 0% B, 100% A.
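The cross-dissolve underlying such a series can be sketched as follows; NumPy is assumed and the image values are illustrative constants.

```python
import numpy as np

def cross_dissolve(warped_A, warped_B, alpha):
    """Blend two feature-aligned (warped) images; alpha is the
    fraction of image A in the result."""
    return alpha * warped_A + (1.0 - alpha) * warped_B

A = np.full((2, 2), 100.0)
B = np.full((2, 2), 200.0)
half = cross_dissolve(A, B, 0.5)  # 50% A, 50% B
```

Sweeping alpha from 0 to 1 produces the continuum of in-between faces described above.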
7. Morphing Multiple Faces
The morphing operation that was presented in the previous sections was a binary
operation, morphing together two face images. An extension to morphing together multiple
faces is referred to as polymorphing. In this case, the n input images I1, I2, …, In may be
formulated as vertices of an (n − 1)-dimensional simplex. All of the in-between images are points
in the simplex, indexed by barycentric coordinates that are all greater than or equal to zero and
in total sum to one [5]. With this barycentric coordinate system, the set of representable images is
exactly like the set of perceivable colors in the barycentric coordinate system developed by
Newton [6]. The color system of Newton is shown in Figure 17. The intermediate shape is
derived by a barycentric coordinate weighted linear interpolation among the features of all input
images. Cross-fading is also proportional to the barycentric coordinates, which meet the axioms
of probability. An example is shown in Figure 18.
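The barycentric weighting can be sketched as follows, using the proportions of Figure 18; the constant images are illustrative stand-ins.

```python
import numpy as np

# Barycentric weights for a three-image polymorph: nonnegative and
# summing to one, so the blend stays inside the simplex of inputs.
weights = np.array([0.15, 0.60, 0.25])  # 15% A, 60% B, 25% C
assert weights.min() >= 0 and abs(weights.sum() - 1.0) < 1e-12

# three illustrative constant images, stacked along a new first axis
imgs = np.stack([np.full((2, 2), v) for v in (10.0, 20.0, 30.0)])
# weighted cross-fade: contract the weights against the image stack
blend = np.tensordot(weights, imgs, axes=1)
```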
Figure 18. The morph of three faces, 15% A, 60% B, 25% C. (a) Input Image A [4]. (b) Input Image B
[4]. (c) Input Image C [4]. (d) Warped A. (e) Warped B. (f) Warped C. (g) Final Morphed Output.
8. Face Caricature
Face caricatures have a long history, a prime example being in political cartoons, as in
Figure 19. The method used to create caricatures involves exaggerating features that deviate
from the prototype, or average face. The central Beier-Neely geometric warping technique may
not only be used for face morphing, but also for generating photographic quality (as opposed to
line art as in Figure 19) caricatures [9].
(a) (b)
Figure 19. Face Caricature. (a) Original, John Kennedy Portrait [7]. (b) Caricature, John Kennedy
Caricature [8].
In order to generate a caricature, a prototype face from the class of faces to which the face being
caricatured belongs is required. Prototypes may be generated by polymorphing together a large
collection of faces in the class. Examples of prototype faces are shown in Figure 20.
Figure 20. Examples of Prototype Faces. From top left: El Greco adult male, East Asian adult male,
Modigliani adult male, Chimpanzee, European adult male, European older male, 1950's female 'pin-up',
European adult female, European female child (Figure 1 in [10]).
Once the prototype face has been chosen, the next step is to delineate corresponding features in
the face to be caricatured and the prototype. Rather than determining an intermediate face, as in
morphing, the difference between the source face and the prototype is computed. This difference
is multiplied by a caricaturization factor and added to the source face. Then the Beier-Neely
warping is used to warp the source face towards the exaggerated face. Although color
exaggeration is possible, it will not be considered in this study. The next section shows an
example of caricature generation.
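The feature exaggeration step can be sketched as follows, under the convention that the source-minus-prototype difference is scaled and added back to the source; the function name and coordinates are illustrative.

```python
import numpy as np

def exaggerate(source_feats, prototype_feats, factor):
    """Exaggerated feature positions: scale the deviation of the source
    features from the prototype and add it back to the source.
    Negative factors move towards the prototype (anti-caricature)."""
    return source_feats + factor * (source_feats - prototype_feats)

src = np.array([[30.0, 40.0]])    # illustrative source feature
proto = np.array([[32.0, 38.0]])  # illustrative prototype feature
exag = exaggerate(src, proto, 2.0)
```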
Figure 21. Face to be caricatured [4]. (a) Original face. (b) Face marked with delineated features.
The next step is to select a prototype face that is in the same class as the face to be caricatured.
Since the face to be caricatured is a white female, a white female prototype is used.
Corresponding features are also delineated on the prototype. Figure 22 shows the
prototype, and the prototype with marked features.
Figure 22. White female prototype [11]. (a) Original prototype. (b) Prototype with delineated features.
Next, the difference in features between the source face and the prototype is calculated and
exaggerated. Figure 23 shows the feature exaggeration process for a 200% caricature.
Figure 23. Feature exaggeration process for a 200% caricature.
Finally, the source image is warped to the exaggerated destination using the Beier-Neely field
warping. The result of the 200% caricature is shown in Figure 24.
Figure 24. Result of the 200% caricature.
Figure 25. Caricature sequence. (a) 300% caricature. (b) 200% caricature. (c) 100% caricature. (d) 0%
caricature. (e) 100% anti-caricature. (f) 200% anti-caricature. (g) 300% anti-caricature.
Figure 26. Another caricature sequence. (a) 300% caricature. (b) 200% caricature. (c) 100% caricature.
(d) 0% caricature. (e) 100% anti-caricature. (f) 200% anti-caricature. (g) 300% anti-caricature.
lower facial bones lengthening. Desirable female dimorphic features include smallness in the
bony features of the lower face, a flat middle face, large lips, and width and height in the cheeks
[14]. In order to exaggerate desirable dimorphic features, a particular form of caricature may be
used.
Due to the difficulty of determining the exact transformations required to
produce a beautified face, the beautifier is not implemented here, though it may be of future interest.
Figure 27. The inadequacy of matrix addition as a vector addition operation in face space. (a) One face,
k. (b) Another face, l. (c) Sum, k + l, using matrix addition. Note that this face has been
normalized for display.
It is seen that the sum of two faces is not a face, so the closure property of vector addition is
not met. Furthermore, matrix-scalar multiplication is not necessarily a good scalar
multiplication operation for faces over any field of scalars, because it makes no change in the
face that is observed: it merely scales intensity, which, if normalized for display, is a trivial
operation.
It seems that a better vector representation for faces would be one based on features, such
as locations of the eyes, mouth, nose, and face edge, as well as skin color, rather than one based
on pixel representation of face images. In order to develop a vector representation for faces
based on features, it would be desirable to have a way to combine faces, a vector addition
operation, and a way to magnify or minify a face, a scalar multiplication operation.
Face space is closed under the face morphing operation, as was seen in Figure 15.
Therefore, using this operation as a binary operator over the set, face space is cast as a groupoid
algebraic structure. Furthermore, it can easily be shown that binary morphing satisfies the
commutativity property, so face space can be cast as a commutative groupoid. By treating the
binary morphing operation as a special case of the polymorph operation, something resembling
the associativity property may also be obtained; it is believed that minor adjustments would
yield actual associativity, in which case face space would be a commutative semigroup. The use
of different proportions of different faces is analogous to scalar multiplication, and satisfies the
associativity of scalar multiplication property.
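At the feature level, the commutativity and polymorph-associativity claims can be checked numerically in a small sketch; a convex combination of illustrative feature coordinates stands in for the full morph, and NumPy is assumed.

```python
import numpy as np

def morph_features(f1, f2, alpha=0.5):
    """Binary morph on feature vectors: a convex combination, acting
    on feature locations only (a stand-in for the full morph)."""
    return alpha * f1 + (1.0 - alpha) * f2

k = np.array([30.0, 40.0, 55.0])  # illustrative feature coordinates
l = np.array([34.0, 44.0, 51.0])
m = np.array([28.0, 46.0, 53.0])

# the 50/50 binary morph is commutative
commutes = np.allclose(morph_features(k, l), morph_features(l, k))

# nesting binary morphs with suitable reweighting reproduces the
# equal-weight three-face polymorph, the associativity-like property
poly = (k + l + m) / 3.0
nested = morph_features(morph_features(k, l, 0.5), m, 2.0 / 3.0)
```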
Alternatively, a modification of the face caricature operation may be suitable as a scalar
multiplication operation. Caricature clearly meets the associativity of scalar multiplication
property. The result of a caricature is that the output perceptually looks more like the person
than the original does, and thus ties in philosophically with scalar multiplication.
Further investigation into the possibility of using operations like morphing and caricature
as operations in a vector space may yield a solid mathematical foundation for face space.
References
[1] G. Wolberg, Digital Image Warping, Los Alamitos, California: IEEE Computer Society
Press, 1990.
[2] C.-H. Lin and J.-L. Wu, “Automatic Facial Feature Extraction by Genetic Algorithms,” IEEE
Trans. Image Processing, vol. 8, no. 6, pp. 834-845, June 1999.
[3] T. Beier and S. Neely, “Feature Based Image Metamorphosis,” in Computer Graphics
(Siggraph'92 Proceedings), vol. 26, pp. 35-42, July 1992.
[4] F. Samaria and A. Harter, “Parameterisation of a stochastic model for human face
identification,” in 2nd IEEE Workshop on Applications of Computer Vision, Dec. 1994.
[5] S. Lee, G. Wolberg, and S.Y. Shin, “Polymorph: Morphing Among Multiple Images,” IEEE
Computer Graphics and Applications, pp. 60-73, Jan./Feb. 1998.
[6] I. Newton, The First Book of Opticks: or, A treatise of the reflexions, refractions, inflexions
and colours of light, Sam. Smith, and Benj. Walford, London, 1704.
[7] Center for the Study of Intelligence, “John F. Kennedy 1961-63,” in “Our First Line of
Defense” Presidential Reflections on US Intelligence, [Online] Available:
http://www.cia.gov/csi/monograph/firstln/kennedy.html.
[8] T. Heinz, “Drawing John Kerry,” [Online] Available:
http://robrogers.com/aaec03/drawkerry.html.
[9] P.J. Benson and D.I. Perrett, “Synthesising continuous-tone caricatures,” Image and Vision
Computing, vol. 9, no. 2, pp. 123-129, Apr. 1991.
[10] B. Tiddeman and D. Perrett, “Moving Facial Image Transformations Based On Static 2D
Prototypes,” in Proc. 9th Int. Conf. in Central Europe on Computer Graphics, Visualization
and Computer Vision 2001, Pilsen, Czech Republic, Feb. 5-9, 2001.
[11] D.A. Rowland, D.I. Perrett, D.M. Burt, K.J. Lee, and S. Akamatsu, “Transforming Facial
Images in 2 and 3-D,” in Imaging 97 Proceedings, Monte Carlo, pp. 159-175, 1997.
[12] R. Thornhill and K. Grammer, “The Body and Face of Woman: One Ornament that Signals
Quality?,” Evolution and Human Behavior, vol. 20, pp. 105-120, 1999.
[13] A.J. O'Toole, T. Price, T. Vetter, J.C. Bartlett, and V. Blanz, “Three-dimensional shape and
two-dimensional surface textures of human faces: The role of ‘averages’ in attractiveness
and age,” Image and Vision Computing Journal, vol. 18, pp. 9-19, 1999.
[14] R. Thornhill and S.W. Gangestad, “Facial Attractiveness,” Trends in Cognitive Sciences,
vol. 3, no. 12, pp. 452-460, Dec. 1999.
[15] R. Sekuler and R. Blake, Perception, Boston: McGraw-Hill, 2002, pp. 235-236.
Appendix A – Beier-Neely Image Warping
function EE = bnWarp(C,m,n,P,Q,Pp,Qp,p,a,b)
%BNWARP Beier-Neely field warping (point sampling)
%EE = BNWARP(C,m,n,P,Q,Pp,Qp,p,a,b)
%EE is the output image
%C is the input image
%[m,n] is the size of the output image
%P and Q specify the demarcated lines in the input image (one line per row)
%Pp and Qp specify the desired locations of the lines in the output
%p, a, and b specify how the different lines are combined

numLines = size(P,1);
EE = zeros(m,n);
for ii = 1:m
    for jj = 1:n
        %pixel location
        X = [jj ii];
        DSUM = [0 0];
        weightsum = 0;
        for kk = 1:numLines
            %output (destination) line and its perpendicular
            Q_P = Qp(kk,:) - Pp(kk,:);
            pQ_P = [-Q_P(2) Q_P(1)];
            u = dot(X-Pp(kk,:),Q_P)/(norm(Q_P)^2);
            v = dot(X-Pp(kk,:),pQ_P)/norm(Q_P);
            %input pixel from which to get value, if this were the only line
            Qs_Ps = Q(kk,:) - P(kk,:);
            pQs_Ps = [-Qs_Ps(2) Qs_Ps(1)];
            Xp = P(kk,:) + u*Qs_Ps + v*pQs_Ps/norm(Qs_Ps);
            Dd = Xp - X;
            %distance from the pixel to the output line
            if u < 0
                dist = norm(X - Pp(kk,:));
            elseif u > 1
                dist = norm(X - Qp(kk,:));
            else
                dist = abs(v);
            end
            %weight from (4)
            weight = (norm(Q_P)^p/(a + dist))^b;
            DSUM = DSUM + weight*Dd;
            weightsum = weightsum + weight;
        end
        %determine the location in the input image from which to get value
        Xp = X + DSUM/weightsum;
        %round to integer and clamp to the image bounds
        xp1 = min(max(round(Xp(1)),1),n);
        xp2 = min(max(round(Xp(2)),1),m);
        %set output image pixel value to value in the input image selected
        EE(ii,jj) = C(xp2,xp1);
    end
end