Articulated 3D human model and its animation for testing and learning algorithms of multi-camera systems
Ondřej Mazaný


CTU–CMP–2007–02 January 15, 2007


Available at

Thesis Advisor: Tomáš Svoboda

The work has been supported by the Czech Academy of Sciences under Project 1ET101210407. Tomáš Svoboda acknowledges support of the Czech Ministry of Education under Project 1M0567.

Research Reports of CMP, Czech Technical University in Prague, No. 2, 2007

Published by
Center for Machine Perception, Department of Cybernetics
Faculty of Electrical Engineering, Czech Technical University
Technická 2, 166 27 Prague 6, Czech Republic
fax +420 2 2435 7385, phone +420 2 2435 7637, www:


Declaration

I declare that I have written my diploma thesis independently and that I have used only the materials (literature, projects, software, etc.) listed in the attached list.

In Prague, on ........................ signature ........................


Acknowledgments

I thank my advisers Tomáš Svoboda and Petr Doubek for their mentoring and the time spent helping me with my work. I thank my wife Eva and my parents for their support during my studies. Finally I give thanks to Jesus Christ for the faith that gives me hope and purpose to study.


Abstract

This document describes a software package for creating realistic simple human animations. The animations generated for several cameras are intended for testing and learning of tracking algorithms in a multi-camera system where ground-truth data are needed. We made a human skeletal model and designed a way to animate it by scripts using motion capture data. The texture of the 3D human model is obtained from captured images. Transformations between computer vision and computer graphics are discussed in detail. We designed our own algorithm for automatic rigging of the mesh model with the bones of the skeletal model. The package uses the open source 3D modeling software Blender and the scripting language Python. The steps of the design are covered in this thesis together with the usage of the software package.


Contents

1 Introduction
2 Articulated graphic model
3 Blender and Python overview
  3.1 Blender coordinate system
  3.2 Blender scene and objects
  3.3 Blender materials and textures
  3.4 Blender camera
4 Skeleton model definition
5 Rigging the mesh with bones
6 Texturing
  6.1 Used approach
  6.2 The problems of our method
  6.3 Determining the visibility of faces
  6.4 The best visibility algorithm
  6.5 Counting the visible pixels
  6.6 Conclusions (for texturizing approach)
7 Animation
  7.1 Motion capture data
  7.2 Usage of motion captured data
8 Description of the software package
  8.1 Mesh data format
  8.2 Animation XML file
  8.3 Camera configuration files
  8.4 Exported animation
  8.5 Run parameters
9 Used mathematics and transformations
  9.1 Blender's math
  9.2 Blender Euler rotations
  9.3 Blender camera model
  9.4 Using projection matrices with OpenGL
  9.5 Bone transformations
  9.6 Vertex deformations by bones
  9.7 Configuring the bones
  9.8 Idealizing the calibrated camera
10 The package usage
  10.1 Importing mesh and attaching to the skeleton
  10.2 Fitting the model into views and texturing
  10.3 Loading animation, rendering and animation data export
11 Results and conclusions
  11.1 Conclusions

Chapter 1

Introduction

Tracking in computer vision is still a difficult problem and in general it remains largely unsolved. Prior knowledge of the model can simplify the process of learning and make tracking more robust. The 3D estimation of a human pose from monocular video is often poorly constrained, and prior knowledge can resolve the ambiguities. The priors of a human model can sufficiently constrain the problem [13]. Monocular tracking using only one camera is possible with knowledge of the tracked object model, as described in [14]. The dependencies between the joints also help to determine valid poses. Constraints for hierarchical joint structures, tracked using existing motion tracking algorithms, solve the problem of false positive classification of poses that are unreachable for the human body, as shown by Herda, Urtasun and Fua [6]. Due to the lack of data for joint ranges, the authors tracked all the possible configurations for hand joints and applied the acquired data to constrain the motion tracker.

Dimitrijevic, Lepetit and Fua used generated human postures to obtain template human silhouettes for specific human motions [3]. They used these templates to detect silhouettes in images in both indoor and outdoor environments where background subtraction is impossible. This approach is robust with respect to the camera position and allows detecting a human body without any knowledge about camera parameters. Since the detected templates are projections of posed 3D models, it is easy to estimate the full 3D pose of the detected body from 2D data. The 3D model of the tracked object can also be a side product of the tracking algorithms when learning the model that is going to be tracked.

Another use is in testing computer vision algorithms. It is easier to define one's own scene with one's own model, generate an animation and then verify the information gained by the algorithms. The noise in real data makes detecting silhouettes and events in the scene harder, and therefore it is convenient to have ground truth data without the noise for testing the algorithms.

Developing a simple system for creating realistic animations may speed up the process of testing multi-camera systems. This saves the time needed for capturing new data from the real environment. For safety reasons it is preferable to simulate dangerous scene scenarios, as is often done in movies nowadays. Modern computer animations allow us to generate realistic scenes, and therefore we can work with synthetic data as well as with real data. But creating a realistic 3D animation is often manual work that starts with modeling a 3D mesh and finishes with its animation. This is the general process of creating animations used by computer artists, which we tried to simplify and automate.

Our task was to design a realistic articulated human model and ways of animating it. This human model is viewed from different angles by different cameras. The idea was to use any of the existing software for generating photorealistic scenes, such as Blender and Povray. For testing and learning the algorithms we need several human models with the same animations and poses. It is important for us to be able to change the model easily. It is true that the major changes can be done by changing the textures, but sometimes a change of the whole model may be needed.

We chose Blender [1] and Python [10] for their features and availability. Blender is mainly a 3D modeling and animation tool for creating realistic scenes, but it has more features. Blender supports scripting in the Python language; this allows camera settings or objects in the scene to be modified easily. Python is built into Blender for Blender's internal scripting. Blender also supports the use of an external Python installation with external Python modules. Blender's GUI was used during preparation of the scenes and the human model. We took advantage of Blender's skeletal animation. We also used Matlab for some computations and testing.

Chapter 2

Articulated graphic model

Articulated graphic models are widely used in computer graphics and therefore many approaches have been developed in this branch. These models are often called avatars, virtual humans, humanoids, virtual actors or animated characters. This sort of animation is often called character animation. Character animation is used in computer games, modeling software, VRML (virtual reality mark-up language) etc.

Most 3D modeling and animation software tools are supplied with several models prepared for animation and also with character animation tools. Blender has less sophisticated character animation support than commercial software, but allows similar functions with good results. Blender's character animation is intended for use in the game engine, so the focus is on speed instead of visual quality. However, Blender's functions give us everything we need to build an animated human character with sufficient results. Building the model is similar to other commercial 3D modeling software tools.

Articulated character models usually consist of:

1. Mesh (wire-frame, polygon-based) model.
2. Virtual bones.
3. Material, textures.

Some software also allows simulating muscles. In our work the Blender articulated model consists of these three parts only.

The mesh model is a net of connected triangles or polygons and can contain separated parts, see Figure 2.1. These polygons are also called faces of the mesh. Each face is defined by 3, 4 or more vertices. A vertex is a point in 3D space. To be able to determine which way a face is facing, each face has to have its normal. Besides the coordinate parameters x, y, z, a vertex can also have its normal as a parameter. We can imagine the mesh model as the skin of the model. It describes the surface of the model in the model's rest position (see Figure 5.5), when there is no animation applied.
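The relation between a face and its normal can be illustrated with a minimal sketch. It assumes a planar face whose vertices are listed counter-clockwise when seen from outside (right-hand rule); the function name `face_normal` is hypothetical and is not part of Blender or of the package described here.

```python
import math

def face_normal(vertices):
    """Return the unit normal of a planar face given at least 3 vertices (x, y, z).

    Only the first three vertices are used. Counter-clockwise vertex order
    (seen from outside) is assumed to define the outward direction.
    """
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = vertices[:3]
    # Two edge vectors spanning the face.
    ux, uy, uz = bx - ax, by - ay, bz - az
    vx, vy, vz = cx - ax, cy - ay, cz - az
    # Cross product u x v is perpendicular to the face.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate face: the first three vertices are collinear")
    return (nx / length, ny / length, nz / length)

# A triangle lying in the xy plane; its normal points along +z.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
normal = face_normal(triangle)
```

Reversing the vertex order flips the sign of the normal, which is why a consistent vertex order matters when deciding where a face is facing.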

Figure 2.1: Example of a mesh model of a head. The lines between vertices show the edges of the polygons.

Figure 2.2: Example of a textured cube mesh (on the left) and the texture (on the right) with unwrapped UV coordinates.

The material specifies how the model will be visualized: how it will reflect lights during rendering, how it will cast shadows, the color of the mesh polygons and the translucency. A texture can be used simultaneously with the material. The texture is often a bitmap which is mapped onto the mesh model. Mapping the texture onto the mesh can be done in several ways. The most common way, also used in our work, is UV mapping, see Figure 2.2. UV stands for 2D coordinates in the texture space. Usually each face vertex has its UV coordinates. This defines the correspondence between the texture pixels and the face surface.

Virtual bones (further only bones) are used to deform the mesh, see Figure 2.3. It is possible to animate each vertex separately, but this is unusable in the case of thousands of vertices. Instead, you attach the vertices to the bones, so that any change of the bone position will affect the vertices. This is a common way of character animation and it imitates real body movement as well.

Figure 2.3: The bones in Blender and the effect on the mesh in a new pose.

Attaching the vertices to the bones is called rigging. The most difficult problem is to find the correspondences between vertices and bones, and this has not been fully solved yet. One way, still often used especially when precise results are needed, is to do it manually. At the beginning, the animator uses any automatic tool and then corrects the mistakes. This is unusable if you need to change the mesh. The other possibility is to use envelopes. The envelope of a bone defines the area affected by the bone. This may be useful in the case of very simple meshes like robot models. Blender allows attaching vertices to the closest bone, but we found this function not working properly in our case.

We wrote our own rigging function, which we describe later in Chapter 5. It finds the correspondences between the vertices and the bones, and creates vertex groups using these correspondences. Each vertex group belongs to a different bone. It is possible to attach one vertex to several bones. The final deformation is then constructed from the deformations of all these bones. We achieved better results, especially if the vertices are close to two bones.

To get an articulated model we need to define a mesh, textures and bones. Blender is not supplied with such models as other software is, but Blender offers tools for creating, animating and rendering a model. Graphic artists often create a realistic model by hand, which is very time consuming. Artists create human models according to their own knowledge about the model and body deformations. The artists get good-looking models because they work on all details. For our purposes we need only an approximate model. We need to change both the model and the animation easily and quickly. We automated the whole process using available software and methods.

The project MakeHuman [7] is an open source project which offers a tool for creating realistic humanoid models for computer graphics. This project does not contain bones or textures but offers a wide variety in making humanoid mesh models. In the current release, the project allows creating only a mesh model. The maintainers of the MakeHuman project are currently working on a muscle engine, which will simulate the musculature. Modeling a human mesh model is much easier using this tool than using Blender, so we used MakeHuman to create the mesh models. It is possible to export the model into the Maya .obj format and reuse it in Blender. The advantages of using MakeHuman include a convenient pose for rigging the mesh and the possibility of defining a generic skeleton of virtual bones for the meshes generated by MakeHuman.

The articulation is done through bones. The skeleton defines virtual bone connections. Creating a skeleton for a character usually depends on the levels of articulation and the purposes of animation. There exists a standard, H-Anim [4], for humanoid animations in VRML, which defines joints and some feature points on a human body. The feature points are derived from the CAESAR project, a database of measured human data. We used the joints and the feature points from H-Anim as a template when designing the mesh skeleton. H-Anim joint nomenclature was used for naming the bones. Using MakeHuman allows generating a skeleton that matches most of the created meshes. The final skeleton was defined in order to match the mesh created by MakeHuman and to be easy to animate with BVH files (more in Chapter 7).

The final articulation should be rotation of the bones only. It is possible to change the length of the bones or their position during animation, but this makes no sense in the case of a human body with constant bone lengths. However, this may be useful during fitting the model into captured images as described in Chapter 6. Blender has an inverse kinematics solver. The pose of the character may be changed quickly by grabbing the hands or feet instead of setting bone rotations.

The textures have the biggest influence on the appearance of the final rendered images. For example, computer games work only with simple models but the final result looks realistic. The key is the proper texture. In our work we need to be able to change the textures of the model. We set up a multi-camera system for capturing images of a real human in order to texture our model. The algorithm used is based on determining the visible and hidden pixels in each camera and chooses the best view for texturing each particular mesh face. More is described in Chapter 6.

Chapter 3

Blender and Python overview

Blender [1] is open source software for 3D modeling and animation. We do not describe all the Blender features; we offer only an overview. We advise reading the Blender manual [2] to learn how to use Blender. The BlenderQuickStart.pdf file, which is shipped with Blender and can be found in the Blender installation folder, contains a brief tour of Blender usage. More detailed information may be found in the Blender Manual [1] and in the Blender Python API [1]. Some functions are accessible only through hot-keys or mouse clicking and can be found neither in the menus nor, unfortunately, in the Blender Python API.

We used Blender as the rendering and animation engine. Blender supports inverse kinematics, tools for character animation and others. Blender supports external raytracers as well as the internal renderer with support for radiosity or raytracing. Blender also has an anti-aliasing feature called oversampling, motion blur, and a gaussian filter. Blender has panorama rendering that can simulate fish-eye cameras. The whole animation composed of several frames can be rendered into one file (the avi or quicktime format) or into single images.

Blender's user interface and rendering are based on OpenGL [11]. The Blender Python API provides a wrapper of the OpenGL functions and allows using OpenGL for creating a user-defined interface in Python scripts. We used OpenGL in our work for rendering in the texturing algorithm, as will be described in Chapter 6.

Blender uses Python for both internal and external scripting. It is possible to write a Python expression in the Blender text box or run an external Python script directly from Blender. A Python installation is not needed, because Blender has Python built in. Blender can use an external Python installation to extend functions and libraries. We used Blender version 2.42a and Python version 2.4. The choice of Blender version is important, because Blender is under massive development, undergoing changes of the philosophy and data structures. For example, we could not use the bones setting from our previous work [8].

3.1 Blender coordinate system

Blender uses a right-handed coordinate system as shown in Figure 3.1. The units in Blender are abstract. In our work we perceive Blender units as metric units for easier expression in the real world. The coordinates can be absolute (in the world space), or relative to parent objects or object origins.

Blender expresses rotations in Euler angles, quaternions, and matrices. Rotations used in Blender's GUI are entered in degrees. For the sake of completeness, we should mention that Blender internally calculates in radians. Blender uses Euler rotations for general objects in the Python API. Quaternions are used, for example, for posing the bones. More about rotations and transformations in Blender will be described in Chapter 9.

Figure 3.1: Right-handed coordinate system.

3.2 Blender scene and objects

The scene in Blender consists of objects. Each object contains data for its own specification. Data associated with the objects can be either a Mesh, Lamp, Camera, or Armature. The basic parameters of the objects include location and rotation (defined by Euler angles). These parameters can be absolute (in the world space) or relative. Objects can be parents or children of other objects.

The Mesh is a wire-frame model built of polygons (Blender calls them faces). Armature objects hold bones in Blender (they are equivalent to the skeleton). For example, our mesh model is composed of two objects: the object with Mesh data, and the object with Armature data. The Armature object is the parent of the Mesh object and controls the deformation of the mesh vertices. An example of scene contents is shown in Figure 3.2.

Figure 3.2: The example of a scene structure in Blender, where the eye symbol stands for a camera, the bone symbol for bones, axes for general objects, the symbol of a man for an armature object, and spheres stand for materials. The vertex groups shown as small balls will be described in Chapter 5.

The bones in Armature objects define the rest pose of the skeleton. In fact three parameters define the bone: the head of the bone, the tail of the bone and the roll parameter, which rotates the bone around its y axis. The bones can be children of other bones and may inherit the transformations of their parent. If bones are connected to the parent, the bone head automatically copies the location of the parent's bone tail as shown in Figure 3.3.

The vertex groups split the mesh vertices into several groups that are used later for rigging with the bones. They have the same names as the bones.

Figure 3.3: Bones in Blender; the child bone is connected to the parent bone and copies the parent transformation.

3.3 Blender materials and textures

Blender materials have many features but we use only a few of them in our work. The materials are used for binding the textures. The textures must be used together with materials in Blender. The materials in Blender also specify the mapping of the texture onto an object. The method which we use is UV mapping. This material option must be set explicitly, because it is not the default texture mapping in Blender.

Materials in our work are set to be shadeless. This means that the material is insensitive to lights and shadows and the texture is mapped without any intensity transformation. The final textured object will appear the same from different views even if the illumination differs. This makes recognizing learned patterns of an object easier. We set this option automatically in our scripts. No other changes are needed in the default materials.

The materials can also be used for simulating human skin, which does not behave as a general surface. The project MakeHuman has a plugin for Blender for simulating the skin. We do not use it in our work.

An important thing for usage of the Blender Python API is the linking of the materials. The materials can be linked to both the general Blender Object and the Mesh object. We link materials to the Mesh object. Up to 16 materials can be linked to an object. This also limits the number of textures for a uniform-material object.

The textures which we use are normal images loaded in Blender. The other texture usage is displacement maps. Obviously, different cameras have different sensitivity to colors. The light sources in the scene also affect the illumination in the captured image. An object textured with images of varied illumination will appear inconsistent in the final rendering. It is recommended to align color intensities between images before mapping the textures.

3.4 Blender camera

The camera in Blender is similar to a pinhole camera or perspective projection. The camera is configured like other objects in the Blender scene. The main parameters are location and rotation in the world space. The view is along the camera's negative z axis. The camera parameter lens sets the viewing angle. The value 16 corresponds to a 90 degree angle as shown in Figure 3.4. The Blender camera can be switched to orthographic.

Figure 3.4: Definition of lens parameter of Blender camera.

The drawback of the Blender camera is in simulating a real camera projection. The Blender camera always has an ideal perspective projection with the principal axis in the image center. The inverse of the camera world transformation matrix can be used to get the transformation from the world space to the camera local space. This inverse matrix equals the [R, −C] expression in computer vision. This is described in more detail in Chapter 9.

Other parameters which define the projection are independent of camera objects and can be set up in the Blender render parameters. These parameters are the width and height of the image and the aspect ratio. More about the camera model and related transformations will be described in Chapter 9.
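The two camera facts above (lens 16 gives a 90 degree viewing angle, and the inverse of the camera's world transform maps world points into camera space) can be illustrated with a small sketch. The relation `fov = 2 * atan(16 / lens)` is an assumption chosen to agree with the stated lens value; the function names, the column-vector convention and the row layout of the rotation matrix are also assumptions and may differ from Blender's own matrix layout.

```python
import math

def lens_to_fov_degrees(lens):
    """Viewing angle in degrees for a 'lens' value (assumed relation)."""
    return math.degrees(2.0 * math.atan(16.0 / lens))

def world_to_camera(camera_rotation, camera_location, point):
    """Express a world-space point in the camera's local space.

    camera_rotation: 3x3 rotation part of the camera's world transform
                     (camera-to-world), given as a list of rows.
    camera_location: camera position C in world space.
    Returns R^T (X - C), the rigid inverse of the camera's world transform.
    """
    d = [p - c for p, c in zip(point, camera_location)]
    # Multiply by the transpose of the rotation: result[i] = sum_j R[j][i] * d[j].
    return tuple(
        sum(camera_rotation[j][i] * d[j] for j in range(3)) for i in range(3)
    )

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

fov = lens_to_fov_degrees(16.0)
# An unrotated camera placed 5 units up the z axis sees the world origin
# at negative z, i.e. in front of it, since the view is along -z.
origin_in_camera = world_to_camera(IDENTITY, (0.0, 0.0, 5.0), (0.0, 0.0, 0.0))
```

Larger lens values give a narrower viewing angle under this relation, which matches the usual behaviour of a longer focal length.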

Chapter 4

Skeleton model definition

We use the MakeHuman project [7] for obtaining the 3D human mesh model. Using the MakeHuman project we can generate a mesh using several targets for the legs, hands, face features etc. Models can be exported into the .obj file format, which Blender can import. These models are intended to be used primarily in computer graphics and may lack anatomical correctness. These models only describe the surface of the model. They are without bones or materials. These models can be imported into Blender, where we attach bones for model articulation. The advantage is that the model can be changed easily. The pose and the proportions of most models created in MakeHuman are approximately constant. This allows us to define a general skeleton for most of the models.

The H-Anim standard [4] defines levels of articulation, bones (joints), dimensions, and feature points of a human body for virtual humans (humanoids). But this standard is hardly applicable to other data structures, where the philosophy of character animation is different from VRML (Virtual Reality Mark-up Language). However, the basic ideas and definitions of this standard can be converted for usage in Blender. We follow the H-Anim recommendations and use them as a template for the skeleton model. We adjusted the H-Anim definitions for easier use with the motion capture format BVH. The names of our bones correspond to the joint points of the H-Anim standard.

We defined the general skeleton, which we generate by script, so it fits most of the MakeHuman meshes. The skeleton definition is ambiguous. The locations are defined in Table 4.1 and the skeleton visualization is in Figure 4.1. The bones are defined by pairs of bone head and bone tail, see Figure 3.3. The bone head is a joint location where the transformation applies. It is the origin of the bone space. The bone head together with the tail define the bone's y axis. Bones also have a roll parameter, which can be specified to rotate the bone space around the bone y axis. The advantage is having accurate lengths and positions of the bones. This can be used, for example, if the computation of the final hand position in a new pose is needed.

Bone name                    Parent bone
HumanoidRoot
sacroiliac                   HumanoidRoot
vl5                          sacroiliac
vt3                          vl5
HumanoidRoot_to_l_hip        HumanoidRoot
l_hip                        HumanoidRoot_to_l_hip
l_knee                       l_hip
l_ankle                      l_knee
HumanoidRoot_to_r_hip        HumanoidRoot
r_hip                        HumanoidRoot_to_r_hip
r_knee                       r_hip
r_ankle                      r_knee
vl5_to_l_sternoclavicular    vl5
l_sternoclavicular           vl5_to_l_sternoclavicular
l_shoulder                   vl5_to_l_sternoclavicular
l_elbow                      l_shoulder
l_wrist                      l_elbow
vl5_to_r_sternoclavicular    vl5
r_sternoclavicular           vl5_to_r_sternoclavicular
r_shoulder                   vl5_to_r_sternoclavicular
r_elbow                      r_shoulder
r_wrist                      r_elbow

[Head (x, y, z) and Tail (x, y, z) coordinate columns of Table 4.1 are not recoverable here.]

Table 4.1: The bone locations in a rest pose. The bones can be connected together (see the parent bone column). The joint locations are dimensionless (the units are abstract) but can be perceived as meter units.

The imported mesh is sized and rotated so that it fits the generated skeleton. Blender has an importing script for obj files but it allows rotating the mesh before importing. We advise using the wrapper function written in our scripts to avoid improper functionality. Our function uses the same script supplied with Blender.

The final non-textured articulated model is finished by attaching (parenting) the bones to the mesh.
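The parent column of Table 4.1 describes a tree of bones. As a minimal sketch of how such a hierarchy can be stored and traversed, the dictionary below copies the parent links of the spine and the left arm from the table; the underscores in the names follow H-Anim joint naming and are an assumption, and `PARENT` and `chain_to_root` are hypothetical names, not part of the package.

```python
# Parent links for a subset of the skeleton (spine and left arm).
# None marks the root bone, which has no parent.
PARENT = {
    "HumanoidRoot": None,
    "sacroiliac": "HumanoidRoot",
    "vl5": "sacroiliac",
    "vt3": "vl5",
    "vl5_to_l_sternoclavicular": "vl5",
    "l_sternoclavicular": "vl5_to_l_sternoclavicular",
    "l_shoulder": "vl5_to_l_sternoclavicular",
    "l_elbow": "l_shoulder",
    "l_wrist": "l_elbow",
}

def chain_to_root(bone):
    """List the bone followed by all of its ancestors up to the root."""
    chain = [bone]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

wrist_chain = chain_to_root("l_wrist")
```

Walking this chain from the root down is the order in which inherited bone transformations would be accumulated, for example when the final hand position in a new pose is needed.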

Figure 4.1: The generic skeleton model as shown in Blender's top view.

Chapter 5
Rigging the mesh with bones

In this chapter we describe the problems of rigging the mesh with the bones, and also our own algorithm, which we found suitable for rigging MakeHuman meshes with our general skeleton. The need for a new rigging algorithm arose when the Blender tools did not work well. Blender tools may attach mesh vertices to an improper bone (if the envelopes option is used) or they may not attach all vertices (if the closest bones function is used).

The rigging process (skinning the skeleton) is attaching the vertices to the bones. Usually the armature object consists of several bones. One vertex can be deformed by several bones. The vertices are attached in the armature (Blender's skeleton) rest pose. The rest pose is a default pose without rotations or other bone transformations. It is defined only by the bone locations, lengths and the roll parameter. No transformations are applied to the vertices in the rest pose.

The rigging is done by parenting an armature object (Blender's skeleton object) to a mesh object. With the new Blender version it is possible to use the armature object as a modifier instead of as a mesh parent object. Modifiers are applied on a mesh in order to modify it in several ways. We did not use this option. The armature deform options are: using the envelopes and using the vertex groups. These options can be mixed.

The envelope defines an area around the bone. All the vertices in this area are affected by the bone transformations. The advantage of envelopes is quick and simple usage. You can place the bone into the desired area and start using it. The problem is in the envelope shape. Often vertices which you do not want to attach are assigned to a bone. This is shown in Figure 5.1.

The better way is using the vertex groups. The vertex groups define directly the correspondences between the vertices and the bones. A vertex group must have the same name as the corresponding bone. You can add vertices to or remove them from the vertex group and directly control which vertex will be affected by the bone. The problem is that the vertices must be added to the vertex groups explicitly. Blender can also group the vertices automatically according to the closest bones. This function can miss some vertices, as shown in Figure 5.1, so that they do not correspond to any bone and stay unaffected.

Figure 5.1: Bad deformation in the new pose. Using envelopes option on the left and the vertex groups created from the closest bones function on the right.

In order to make a correct new mesh pose, the proper correspondences of vertices with bones must be found. Therefore we found an algorithm which is suitable for our general skeleton. In our algorithm we use the vertex groups but select the group vertices using our own classification function. We also take advantage of the mesh symmetry in our algorithm. The vertices can have a weight parameter which defines the influence of the bone. This is useful in the case of a vertex transformation by two bones. Vertices at the border of two bones are deformed more naturally if they are attached to both bones with different weights, see Figure 5.2 for comparison. The vertices with a higher angle to the bone's y axis are less affected by setting scaled weight.

The algorithm in pseudo-code looks as follows:

    for bone in armature.bones :
        create_vertex_group( bone.name )
    for vertex in mesh.vertices :
        J_max = 0.0
        weight = 1.0
        for bone in armature.bones :
            v1 = bone.head − vertex.location
            v2 = bone.tail − vertex.location
            vb = bone.tail − bone.head
            J = AngleBetweenVectors( v1, v2 )
            if bone not at the same side as the vertex then
                J = 0.0
            if J_max < J then
                J_max = J
                best_bone = bone
                vertex_group = bone.vertex_group
                weight = AngleBetweenVectors( −v1, vb ) / 60.0
        assign_vertex_to_group( vertex, vertex_group, weight )
        if has_parent( best_bone ) and ( weight < 1.0 ) then
            assign_vertex_to_group( vertex, parent( best_bone ).vertex_group, 1.0 − weight )

See Figure 5.3 for better understanding. The main idea of the vertex classification is finding the bone whose head and tail are seen from the vertex under the biggest angle, as shown in Figure 5.3. If the angle α is bigger than the angle β, then the vertex belongs to the parent bone and vice versa. We use a Blender Python API function as the AngleBetweenVectors function. The function returns the absolute angle in degrees. The angle can also be computed as the arc cosine of the normalized dot product of the two vectors, $\arccos \frac{a \cdot b}{|a|\,|b|}$. We test each vertex if it lies on the right (positive x) or the left (negative x) side of the y axis. We do not allow the vertices on the left side to be attached to bones with joints on the right side. The weight factor is decreased for vertices with larger angle γ between the vb vector and −v1 (the weight is clamped in the range <0, 1>). If the weight is less than 1.0 and the bone has a parent bone, then the vertex is also added to the parent bone group with the remaining weight. How the vertices are deformed is described in Chapter 9. The simulation of the algorithm is in Figure 5.4.

Figure 5.2: The vertex deformations with each vertex attached to one bone on the left. On the right, vertices at the bones border are attached to both bones with different weights.
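The classification above can also be tried outside Blender. The following is a minimal sketch in plain Python, not the thesis implementation: bones are assumed to be dictionaries with `name`, `head`, `tail` and `parent` keys, and the symmetry test `same_side` is a hypothetical stand-in for the "same side" check of the pseudo-code (a bone lying entirely at positive x rejects vertices at negative x, and the other way round).

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def angle_between(a, b):
    """Absolute angle between two 3D vectors in degrees (0 for a zero vector)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    if norm == 0.0:
        return 0.0
    cosine = sum(x * y for x, y in zip(a, b)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def same_side(bone, vertex, eps=1e-6):
    """Assumed symmetry test: a bone lying entirely at x > 0 (or x < 0)
    rejects vertices on the opposite side; central bones accept all."""
    xs = (bone["head"][0], bone["tail"][0])
    if all(x > eps for x in xs):
        return vertex[0] >= -eps
    if all(x < -eps for x in xs):
        return vertex[0] <= eps
    return True


def classify_vertex(vertex, bones):
    """Return {bone name: weight} for one vertex, following the pseudo-code."""
    best, best_angle, weight = None, 0.0, 1.0
    for bone in bones:
        v1 = sub(bone["head"], vertex)
        v2 = sub(bone["tail"], vertex)
        vb = sub(bone["tail"], bone["head"])
        angle = angle_between(v1, v2) if same_side(bone, vertex) else 0.0
        if angle > best_angle:
            best, best_angle = bone, angle
            minus_v1 = tuple(-x for x in v1)
            # weight is clamped to the range <0, 1>
            weight = min(1.0, angle_between(minus_v1, vb) / 60.0)
    if best is None:
        return {}
    groups = {best["name"]: weight}
    if best["parent"] is not None and weight < 1.0:
        # the remaining weight goes to the parent bone group
        groups[best["parent"]] = 1.0 - weight
    return groups


BONES = [
    {"name": "upper", "head": (0.0, 0.0, 0.0), "tail": (0.0, 1.0, 0.0), "parent": None},
    {"name": "lower", "head": (0.0, 1.0, 0.0), "tail": (0.0, 2.0, 0.0), "parent": "upper"},
]
print(classify_vertex((0.2, 1.5, 0.0), BONES))
```

For the example vertex the child bone `lower` wins the angle test and the vertex is shared with the parent bone, the two weights summing to one.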

Classifying only by angle gives good results with our bone configuration. This algorithm was tested with MakeHuman meshes and our skeleton model. The final rigging results are shown in Figure 5.5, with rest poses and new poses for MakeHuman generated meshes.

Figure 5.3: The rigging algorithm is based on finding the maximum angle between bone's head, tail and vertex.

Figure 5.4: The simulation of the vertex classification for two different bone configurations. Each panel shows a parent bone and a child bone; red shows the border for attaching the vertices to the bones.

Figure 5.5: The rest pose and the new pose for rigged meshes with the skeleton.

Chapter 6
Texturing

Following the chapters above, we can create an articulated mesh model. This model looks like a figurine because it does not have any texture or special material attached. The texture differentiates the model from others and provides more detailed information. In order to have a realistic human model we choose digitally captured images of people for textures.

6.1 Used approach

Mapping a texture onto a model is a well-known task in computer graphics and most of the literature about 3D graphics covers it. We use the UV mapping method for mapping images onto mesh faces. UV means that each vertex of a face has its UV coordinates in the texture space. Faces have an associated texture. The texture mapping is an affine transformation defined by the vertex UV coordinates. The texture is interpolated during the mapping.

We use a multi-camera system in order to capture human images synchronously from different angles. We use several images from different angles to cover the whole mesh. The mesh model must be posed and fitted into the captured images before texturing. This task is hard to solve automatically. We use manual posing and fitting using the Blender interface, see Figure 6.1.

Figure 6.1: The manually positioned and posed mesh model into images using the Blender's user interface. The four cameras were used in this example.

The cameras in the system are calibrated and synchronized, but they do not have ideal parameters. The real cameras suffer from radial distortion and skew. We cannot simulate a real camera in Blender, so before fitting the model into images we must adapt the images and adjust the camera parameters. First the camera parameters must be computed and converted to Blender camera parameters. Then the radial distortion and skew must be removed from the images. Because we fit the object in the view of Blender cameras, we must also shift the image origin to match the Blender camera projection. The transformations are discussed in Chapter 9.

Our problem is different illumination in the images and the image noise. Another problem is that the entire body surface cannot be visible (for example, in the standing pose the bottom of the feet is hidden). The next problem is occlusion of mesh faces from a camera view.

An algorithm with good results covering all these problems was developed by Niem and Broszio [9]. They use a sequence of images from a stationary calibrated camera. They apply a maximal texture resolution approach for the best face camera view classification, but they extended the method by regrouping the faces in order to avoid boundary distortion. They synthesize a new texture for hidden parts. Another algorithm was presented by Starck and Hilton [12]. In their model-based reconstruction of people they compose the captured images into a single texture. We build on the same foundations as used in these two works.

There are two possibilities of using the captured images as textures. The first is using one bitmap as the texture for the whole mesh. All images are then composed to a single bitmap. This imposes static unwrapped vertex UV coordinates (if a new model is to be used, the old UV coordinates become invalid). This is not the case for MakeHuman meshes because they are built of a constant number of vertices with constant ordering. A single bitmap allows easier manipulation with the texture and easier filtering in texture space. The drawback of a single bitmap texture is a more difficult implementation. The distortion at boundaries arises due to limited accuracy and the problems mentioned above.

The second option, which we use, is using several textures for one mesh.

This method is also often used for manual texturing. It allows different materials to be specified for skin, clothes, hair etc. We use this approach but we do not use different materials. We use each captured image as a texture, so the final model contains several textures. This makes coding easier without any manipulations with images, but it has many drawbacks.

6.2 The problems of our method

The biggest challenges in texturing the model are the estimation of the best face visibility in the cameras and a texturing method for hidden parts. First, the different illumination in the images causes steps and distortion at boundaries. From one camera the same object may look darker or brighter than from another. This can be partly solved by color intensity alignment. The camera calibration is never perfect, so the object projection from one camera will not match the projection from another camera exactly. It must be considered which philosophy of the texture model would be better: the model with one material and one texture, or the model with several materials and textures.

Objects viewed from different angles often have occluded parts. The hidden parts can be as large as limbs or as small as fingers. We can only guess the texture for these parts. The texture for hidden parts would be synthesized from the closest visible parts. We expect that occlusions are caused by the same or similar body parts. The back side of a limb is then occluded by the front side. The probability of the same surface for the back side as for the front side is high for human objects. Of course there are exceptions, such as the head or colorful miscellaneous clothes. For most real human bodies we can expect small changes in texture in a small area. An appropriate camera arrangement is important. This imposes poses where the hands are raised and do not occlude the body at least from one view. We did not solve these problems; they need to be addressed in future development.

6.3 Determining the visibility of faces

The problem of visibility is well known in computer graphics. The rendering process must properly solve this task. A lot of algorithms were developed for the visibility test. Some use the z-buffer, some test polygon normals, some test and cull the viewing frustum. These algorithms are now often implemented directly in hardware graphic accelerators in order to speed up the rendering process.

The simple method for testing the visibility is: render the object and read the pixels in the result. The rendered pixels must contain information about the faces. This has some drawbacks. Small polygons can be occluded even if they are visible in a camera.

Python script interpretation is too slow for a rendering algorithm. Instead of writing our own functions, we use OpenGL rendering capabilities which take advantage of hardware accelerators. We use the OpenGL [11] features, the hardware accelerated rendering, to determine the visibility of faces. OpenGL is turned to the simplest rendering mode with z-buffer testing and without shading. All the OpenGL features which change the face color during rendering (like shading) must be set off. The face index is encoded into the RGB polygon color value. The final rendered bitmap contains the face indices of visible faces coded as colors. We compute the number of visible pixels in order to measure the visibility, and for determining the amount of occlusion we count the hidden (occluded) pixels. This has some drawbacks. We have only pixel accuracy for testing and small polygons can overlap each other. Polygon edges can be overlapped by neighbor polygons.

6.4 The best visibility algorithm

Some small faces may not be classified as visible due to limited pixel accuracy. Other faces can be hidden or occluded. It is better to have these faces textured for a better visual quality. The texture can be synthesized for occluded faces, but for easier coding we pretend that the polygons are visible from the same view as their neighbor polygons. Our approach is similar to Starck and Hilton [12]. In our algorithm we simply search in neighbor faces for a camera with the best visibility for most of the faces. We pretend that the faces are visible. The implementation is complicated because we use several bitmaps instead of a single bitmap. The problem is searching for neighbor faces, because the data accessed through the Blender Python API has no such relation as a neighbor face. We must test the face vertices for the same location.

The algorithm for estimating the best face visibility in camera views is the following (pseudo-code, simplified for reading):

    % CLASSIFICATION OF THE FACES USING VISIBILITY
    % DEFINED BY VISIBLE PIXELS
    visible_faces = []
    hidden_faces = []
    for face in mesh.faces :
        selected_image = None
        best_visibility = visibility = 0
        for image in images :
            [visible_pixs, hidden_pixs] = image.pixels_of( face )
            if hidden_pixs > 0 then
                visibility = visible_pixs / hidden_pixs
            else
                visibility = visible_pixs
            if visibility > best_visibility and ( visibility > MIN_VIS ) then
                best_visibility = visibility
                selected_image = image
        if selected_image then
            set_Texture( selected_image )
            UV = selected_image.get_UV_coordinates( face )
            set_Face_UV_coordinates( face, UV )
            visible_faces.append( face )
        else
            hidden_faces.append( face )

    % FINDING THE BEST VIEW FOR UNCLASSIFIED FACES
    loop_count = MAX_OF_LOOPS
    while hidden_faces and loop_count > 0 :
        new_visible_faces = []
        for face in hidden_faces :
            neighbors = get_Neighbors( visible_faces, face )
            if neighbors then
                selected_image = get_Most_Used( neighbors )
                set_Texture( selected_image )
                UV = selected_image.get_UV_coordinates( face )
                set_Face_UV_coordinates( face, UV )
                hidden_faces.remove( face )
                new_visible_faces.append( face )
        loop_count = loop_count − 1
        visible_faces = new_visible_faces

The part of the algorithm that finds a proper texture for unclassified faces must be constrained to avoid infinite loops, because some faces may not have neighbor faces (the mesh can contain separated parts). Another reason is the slow speed of the algorithm. In our results we use the value 10 for the constant MAX_OF_LOOPS. We also specify the constant MIN_VIS to avoid primary texturing from views with occluded faces. The value of MIN_VIS was set to 0.5 in our work.

This algorithm is neither sophisticated nor quick, but it shows a possible approach to the problem. The slowest part is searching for the neighbor faces. Because we do not have any relation between mesh vertices and mesh faces, we need to compare the location of each face vertex with the vertices of the rest of the faces. This could be done once and more effectively than we do. We search only in the set of newly retrieved visible faces in order to eliminate visible faces which do not have hidden neighbor faces, thus the searching set is reduced.

Figure 6.2 shows the images used for textures and the final textured model. The texture distortion at boundaries and the distortion caused by different illumination are present in the final result. This is caused by capturing source images from angles with different illumination and by different camera types. An additional error is that the manually fitted model does not exactly match the real object in the scene, as shown in Figure 6.1.
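The remark that the neighbor search "could be done once and more effectively" can be illustrated as follows. This is only a sketch under assumptions, not part of the package: faces are assumed to be given as lists of (x, y, z) vertex locations, faces are treated as neighbors when they share a (rounded) vertex location, and `build_neighbor_map` and `fill_hidden` are hypothetical helper names.

```python
from collections import Counter, defaultdict


def build_neighbor_map(faces, precision=6):
    """Map each face index to the set of faces sharing a vertex location.

    Locations are rounded to `precision` decimals and used as dictionary
    keys, so the whole relation is built in one pass over the mesh.
    """
    by_location = defaultdict(set)
    for index, face in enumerate(faces):
        for x, y, z in face:
            key = (round(x, precision), round(y, precision), round(z, precision))
            by_location[key].add(index)
    neighbors = defaultdict(set)
    for sharing in by_location.values():
        for index in sharing:
            neighbors[index].update(sharing - {index})
    return neighbors


def fill_hidden(face_image, neighbors, max_loops=10):
    """Give unclassified faces (value None) the image most used by neighbors."""
    result = dict(face_image)
    for _ in range(max_loops):
        updates = {}
        for face, image in result.items():
            if image is not None:
                continue
            votes = Counter(result[n] for n in neighbors.get(face, ())
                            if result[n] is not None)
            if votes:
                updates[face] = votes.most_common(1)[0][0]
        if not updates:  # nothing left to classify, or only isolated faces
            break
        result.update(updates)
    return result


# Three quads in a strip plus one separated quad.
FACES = [
    [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],
    [(1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0)],
    [(2, 0, 0), (3, 0, 0), (3, 1, 0), (2, 1, 0)],
    [(10, 0, 0), (11, 0, 0), (11, 1, 0), (10, 1, 0)],
]
NEIGHBORS = build_neighbor_map(FACES)
print(fill_hidden({0: "cam1", 1: None, 2: None, 3: None}, NEIGHBORS))
```

The loop limit plays the same role as MAX_OF_LOOPS: the separated quad never receives a texture, and the iteration still terminates.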

6.5 Counting the visible pixels

Using the calibrated cameras we can render the mesh model into images and estimate the vertex UV coordinates. As mentioned above we use OpenGL to speed up the process. The visible pixels are obtained by the following process:

1. Set the projection transformations using the camera's projection matrix. Render the model in OpenGL render mode with depth (z-buffer) test and without shading. Render each mesh face with a different color (the face index is encoded in the color RGB values).

2. Using the OpenGL feedback mode, process the model and obtain the image coordinates for each face. The OpenGL feedback mode returns data of the OpenGL pipeline right before rasterizing. This is suitable for obtaining correct face window coordinates which correspond to the previously rendered data.

3. Process every face of the model in render mode with depth testing turned off. Turning the depth testing off causes the faces to be fully rendered as they fit into the view. Compare the newly rendered area with the data from the first step. Count the equal color values as visible pixels and the others as hidden.

The use of the camera projection matrix for controlling the OpenGL view-port transformations will be described in detail in Chapter 9. The drawbacks are in the pixel accuracy and rasterizing. The polygon edges are discretized and therefore can overlap each other. Very small faces rendered as a single pixel can be occluded by their neighbors. As a result, a face fully visible in a view can be counted as partly visible or hidden.

6.6 Conclusions (for texturizing approach)

This part of our work was the most complicated to code effectively in Python. For speed reasons we used the built-in support for OpenGL and took advantage of its fast rendering. This rapidly sped up the whole process. We expect a well posed object for selecting the source texture for unclassified faces. The camera locations and orientations are also important. We did not investigate the problems with texture distortion at edges or distortion caused by different illumination. Our approach can be extended in future development. The model fitting can be done semi-automatically as presented by Starck and Hilton [12]. In the future the algorithm presented by Niem and Broszio [9] can be fully implemented. This will require synthesizing the texture for hidden parts and would allow the usage of a sequence of images from a single camera. The implementation in Blender shows the possible usage.
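The colour coding of face indices and the pixel counting of Section 6.5 can be sketched without OpenGL. The example below is an illustration under assumptions: the rendered image of step 1 is represented as a nested list of RGB tuples, the area covered by a face in step 3 as a boolean mask, and the code 0 (black) is reserved for the background, which is a choice made here and not stated in the text.

```python
def index_to_rgb(index):
    """Encode a face index as a 24-bit RGB colour (code 0 is the background)."""
    code = index + 1
    if not 0 < code < 1 << 24:
        raise ValueError("face index out of range for 24-bit colour")
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


def rgb_to_index(rgb):
    """Decode a colour back to the face index; returns -1 for background."""
    red, green, blue = rgb
    return ((red << 16) | (green << 8) | blue) - 1


def count_pixels(depth_tested, face_mask, face_index):
    """Count visible and hidden pixels of one face.

    depth_tested: image from step 1 (rows of RGB tuples, depth test on).
    face_mask:    rows of booleans, True where the face alone covers the
                  pixel when rendered with depth testing off (step 3).
    """
    colour = index_to_rgb(face_index)
    visible = hidden = 0
    for image_row, mask_row in zip(depth_tested, face_mask):
        for pixel, covered in zip(image_row, mask_row):
            if covered:
                if pixel == colour:
                    visible += 1
                else:
                    hidden += 1
    return visible, hidden


C0, C1 = index_to_rgb(0), index_to_rgb(1)
IMAGE = [[C0, C0, C1],
         [C0, C1, C1]]
MASK_FACE_0 = [[True, True, True],
               [True, False, False]]
print(count_pixels(IMAGE, MASK_FACE_0, 0))
```

In the example one pixel of face 0 is covered by face 1, so the face is counted as partly hidden.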

Figure 6.2: The four source images for texturing and the final rendered model in a new pose.

Chapter 7
Animation

Animation is complex enough to be studied as a separate subject. We will focus only on the issues which we applied in our work. Blender offers many more animation tools than we used (for example nonlinear animation and actions).

As mentioned in previous chapters, we use a skeleton for animation. The object which holds the bones in Blender is called an armature. An armature contains several bones used for animation. Bones have pose channels which define the changes against the rest pose. A pose channel is an interpolated curve which allows smooth animation. The curve is defined at desired frames and the rest of the curve is interpolated. The channels can be for a change in location, size or rotation. The bone rotation is expressed in quaternions. For a bone's pose, there is one channel for each quaternion element. Channel interpolated curves can be viewed and edited in Blender's IPO curve editor window (where IPO stands for interpolated curve). The number of frames and the frame rate can be set in Blender's Button window. Besides that, it is possible to animate vertices separately and directly by changing their position. Textures can also be animated, which can be useful for example for animation of face gestures.

7.1 Motion capture data

We prefer to use motion capture data for a realistic representation of human movements. For motion capture data we use the BVH format. This format is widely used by animation software and can be imported into Blender. This has several advantages. BVH files can be obtained from the Internet without the need to capture our own data. We use the BVH format because Blender has an import script for BVH files. It must be noted that this script does not work correctly in our version of Blender with our BVH data, even though the data are correct. The script may omit some characters in joint names during import. It is always better to check the joint names after import. The proper interpretation of the data is not easy.



The BVH file format was originally developed by Biovision, a motion capture services company. The name BVH stands for Biovision hierarchical data. Its disadvantage is the lack of a full definition of the rest pose (the format has only translational offsets of child segments from their parent; no rotational offset is defined). A BVH file consists of two parts, the header section with the joint definitions and the section with the captured motion data. See the example:
    HIERARCHY
    ROOT Hips
    {
        OFFSET 0.00 0.00 0.00
        CHANNELS 6 Xposition Yposition Zposition ...
                 ... Zrotation Xrotation Yrotation
        JOINT Chest
        {
            OFFSET 0.00 5.21 0.00
            CHANNELS 3 Zrotation Xrotation Yrotation
            JOINT Neck
            {
                OFFSET 0.00 18.65 0.00
                CHANNELS 3 Zrotation Xrotation Yrotation
                JOINT Head
                {
                    OFFSET 0.00 5.45 0.00
                    CHANNELS 3 Zrotation Xrotation Yrotation
                    End Site
                    {
                        OFFSET 0.00 3.87 0.00
                    }
                }
            }
        }
    }
    MOTION
    Frames: 2
    Frame Time: 0.033333
    8.03 35.01 88.36 ...
    7.81 35.10 86.47 ...

The header section begins with the keyword HIERARCHY. The following line starts with the keyword ROOT followed by the name of the root segment. The hierarchy is defined by curly braces. The offset is specified by the keyword OFFSET followed by the X, Y and Z offset of the segment from its parent. Note that the order of the rotation channels appears a bit odd: it goes Z rotation, followed by the X rotation and finally the Y rotation. The BVH format uses this rotation data order. The world space is defined as a right-handed coordinate system with the Y axis as the world up vector. Thus the BVH skeletal segments are aligned along the Y axis (this is the same as in our skeleton model). The motion section begins with the keyword MOTION, followed by a line indicating the number of frames (Frames: keyword) and a line giving the frame time (Frame Time: keyword), from which the frame rate follows. The rest of the file contains the motion data. Each line contains one frame, and its tabulated values are the data for the channels defined in the header section.
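To make the structure of the format concrete, the following is a minimal reading sketch in Python. It is not the Blender import script: `parse_bvh` is a hypothetical helper, the sample file is a shortened hierarchy written for this illustration, and naming end sites with the suffix `_end` is an assumption taken from the joint names used later in our dictionary.

```python
SAMPLE_BVH = """\
HIERARCHY
ROOT Hips
{
  OFFSET 0.00 0.00 0.00
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Chest
  {
    OFFSET 0.00 5.21 0.00
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.00 18.65 0.00
    }
  }
}
MOTION
Frames: 2
Frame Time: 0.033333
8.03 35.01 88.36 0.0 0.0 0.0 1.0 2.0 3.0
7.81 35.10 86.47 0.0 0.0 0.0 4.0 5.0 6.0
"""


def parse_bvh(text):
    """Return (joints, frame_time, frames) parsed from BVH text."""
    tokens = text.split()
    if not tokens or tokens[0] != "HIERARCHY":
        raise ValueError("missing HIERARCHY keyword")
    joints, stack, i = [], [], 1
    while tokens[i] != "MOTION":
        tok = tokens[i]
        if tok in ("ROOT", "JOINT", "End"):
            parent = stack[-1]["name"] if stack else None
            # "End Site" has no name of its own; derive one from the parent
            name = parent + "_end" if tok == "End" else tokens[i + 1]
            joint = {"name": name, "parent": parent, "offset": None, "channels": []}
            joints.append(joint)
            stack.append(joint)
            i += 2
        elif tok == "OFFSET":
            stack[-1]["offset"] = tuple(float(t) for t in tokens[i + 1:i + 4])
            i += 4
        elif tok == "CHANNELS":
            count = int(tokens[i + 1])
            stack[-1]["channels"] = tokens[i + 2:i + 2 + count]
            i += 2 + count
        elif tok == "{":
            i += 1
        elif tok == "}":
            stack.pop()
            i += 1
        else:
            raise ValueError("unexpected token: " + tok)
    if tokens[i + 1] != "Frames:" or tokens[i + 3:i + 5] != ["Frame", "Time:"]:
        raise ValueError("malformed MOTION header")
    n_frames = int(tokens[i + 2])
    frame_time = float(tokens[i + 5])
    values = [float(t) for t in tokens[i + 6:]]
    width = sum(len(joint["channels"]) for joint in joints)
    if len(values) != n_frames * width:
        raise ValueError("motion data do not match the channel count")
    frames = [values[k * width:(k + 1) * width] for k in range(n_frames)]
    return joints, frame_time, frames


JOINTS, FRAME_TIME, FRAMES = parse_bvh(SAMPLE_BVH)
print([joint["name"] for joint in JOINTS], 1.0 / FRAME_TIME)
```

Each frame is returned as a flat list whose values follow the channel order of the header, so the first six values of a frame belong to the root segment.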


Usage of motion captured data

We import BVH data using Blender's import script for BVH files. It creates empty objects in Blender's scene. The empty objects copy the hierarchy of the BVH file (but, as noted above, some characters from joint names may be missing after import). The animation data are imported and animation channels for the objects are created. There is no standard for joint names or hierarchy in the BVH format, but the data that we use all have the same hierarchy and the same joint names. Our data correspond with our skeleton model, but the joint names differ. We use a dictionary for the correspondences between our bones and the captured data joints. Because we set only rotations for the bone poses (this makes sense for a human skeleton), it is not a problem if the imported data are in a different scale. But the different scale between the data and our skeleton model causes a problem when we move the whole skeleton object. Therefore we compute the scale factor from the known height of our skeleton and the expected height of the captured subject. We estimate the expected height of the captured subject from the end site joints of the ankle and the head. Then the change in location is scaled by this factor. After import, we go through the dictionary of bones and joints and configure our bones to be parallel with the corresponding joint links in all frames. We set the bones so that they have the same direction as the axis connecting the corresponding parent joint with the child joint. For the joints whose rest pose in the motion capture hierarchy differs from our rest pose (in our case the ankles), we set only the difference rotation from the rest pose. For some bones of our skeleton we do not have any corresponding joints, and we leave them unaffected. The computation of the needed rotation of the bones is described in Chapter 9. We use the following dictionary (listed as Python code; the # in the code denotes a comment):
    skelDict={
        "sacroiliac":["Hips","Chest"],
        "vl5":["Chest","Neck"],
        "vl5_to_l_sternoclavicular":["Neck","LeftCollar"],
        "l_sternoclavicular":["LeftCollar","LeftShoulder"],
        "l_shoulder":["LeftShoulder","LeftElbow"],
        "l_elbow":["LeftElbow","LeftWrist"],
        "l_wrist":["LeftWrist","LeftWrist_end"],
        "vl5_to_r_sternoclavicular":["Neck","RightCollar"],
        "r_sternoclavicular":["RightCollar","RightShoulder"],
        "r_shoulder":["RightShoulder","RightElbow"],
        "r_elbow":["RightElbow","RightWrist"],
        "r_wrist":["RightWrist","RightWrist_end"],
        "HumanoidRoot_to_l_hip":["Hips","LeftHip"],
        "l_hip":["LeftHip","LeftKnee"],
        "l_knee":["LeftKnee","LeftAnkle"],
        "l_ankle":["LeftAnkle","LeftAnkle_end"],
        "HumanoidRoot_to_r_hip":["Hips","RightHip"],
        "r_hip":["RightHip","RightKnee"],
        "r_knee":["RightKnee","RightAnkle"],
        "r_ankle":["RightAnkle","RightAnkle_end"],
        "vt3":["Head","Head_end"]
    }
    # the bones that will change relatively
    onlyDif=[#"sacroiliac","vl5","vt3",
        "l_ankle","r_ankle",
        #"l_knee","r_knee",
        #"l_hip","r_hip",
        "HumanoidRoot_to_l_hip","HumanoidRoot_to_r_hip"]

Our interpretation of the motion data may be inaccurate. We do not know where exactly the measured joints are located on the human body, so we can make only an approximate reconstruction of the movement. Despite this, we are able to create a realistic animation of the human movement. Errors with limb self-occlusion may occur during the movement; they are caused by the rest pose joint locations in the data being different from those in our model. However, we are able to create a short animation quickly with captured data without manual positioning. You can see the results in Figure 7.1.
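The use of the bone-to-joint dictionary and of the scale factor can be sketched as follows. This is an illustration under assumptions rather than the code of the package: joint positions for one frame are assumed to be available as a dictionary of (x, y, z) tuples, the captured height is assumed to be the difference of the Y coordinates of the head and ankle end sites, and `bone_directions` and `location_scale` are hypothetical helper names. Turning the target direction into a bone rotation is the subject of Chapter 9 and is not shown.

```python
import math


def bone_directions(skel_dict, joint_positions):
    """Unit vector from the parent joint to the child joint for each bone.

    Bones whose joints are missing in the data are skipped, i.e. they
    stay unaffected.
    """
    directions = {}
    for bone, (parent_joint, child_joint) in skel_dict.items():
        if parent_joint not in joint_positions or child_joint not in joint_positions:
            continue
        parent = joint_positions[parent_joint]
        child = joint_positions[child_joint]
        delta = tuple(c - p for p, c in zip(parent, child))
        length = math.sqrt(sum(x * x for x in delta))
        if length > 0.0:
            directions[bone] = tuple(x / length for x in delta)
    return directions


def location_scale(skeleton_height, joint_positions,
                   top="Head_end", bottom="LeftAnkle_end", up_axis=1):
    """Scale factor applied to location changes of the whole skeleton.

    Assumption: the captured height is measured along the up axis (Y)
    between the head end site and one ankle end site.
    """
    captured = joint_positions[top][up_axis] - joint_positions[bottom][up_axis]
    return skeleton_height / captured


SKEL = {"l_knee": ["LeftKnee", "LeftAnkle"], "vt3": ["Head", "Head_end"],
        "l_wrist": ["LeftWrist", "LeftWrist_end"]}
POSITIONS = {"LeftKnee": (1.0, 20.0, 0.0), "LeftAnkle": (1.0, 4.0, 0.0),
             "LeftAnkle_end": (1.0, 0.0, 3.0),
             "Head": (0.0, 64.0, 0.0), "Head_end": (0.0, 70.0, 0.0)}
print(bone_directions(SKEL, POSITIONS))
print(location_scale(1.75, POSITIONS))
```

In the example the wrist joints are absent from the data, so the `l_wrist` bone receives no direction.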

Figure 7.1: The images from animation sequence generated with BVH motion captured data. The model is without a texture.

Chapter 8
Description of the software package

We focused on using open source software during the development. We also used Matlab to speed up the process. We used Matlab instead of a more difficult Python implementation of some math functions, like matrix decompositions, and instead of using other Python packages. In our work we used free third party data for mesh and motion. Motion data are stored in the BVH format as mentioned before, and the mesh is exported from the MakeHuman application into the simple Maya .obj text format. We designed our own formats for storing our own data, such as camera configurations. We use Blender scene files as templates for Blender. The whole package structure is shown in Figure 8.1.

Figure 8.1: The structure of software package.

737917 4. even though this is not expected.918237 v -8. some features are no further available and this file format is deprecated. The faces are defined by indices of vertices. Tag parameters: .963268 v -8.180817 0. The XML tags must correspond to Blender’s data structure. As the Blender changed. 8.2 for example.704912 4. It starts with object’s name human mesh and continues with vertex definitions (starts with the v character).179651 0. Instead. The root tag must be animation.963252 v -8. f 5740// 5749// 5761// 5758// f 5199// 5132// 5749// 5740// f 5206// 5748// 5749// 5132// f 5761// 5749// 5748// 5751// .704847 4.178391 0. The recognized tags are: startframe First frame to be rendered.945174 .161950 0.38 CHAPTER 8.. see Figure 8.162358 0.979013 v -8. the BVH files with motion data must be used in the case of bone driven animation.. The last is the chapter of face definitions (starts with the f character).2 Animation XML file We used an XML file in our previous work for animation definition. Blender automatically linearly interpolates object parameters between frames.737658 4. Parameters: n number of frame endframe Last frame to be rendered. All these settings will be done in this frame.960827 v -8. This file contains animation description and is parsed by Python scripts. You can see the example bellow (shortened printout): # MakeHuman OBJ File # http://www.. Parameters: n number of frame frame This tag tells which frame is going to be configured.162949 0. Faces can be triangles or quads (three or four vertex polygons). We show it here only for completeness.671900 4.1 Mesh data format The data exported from MakeHuman into *.704855 4. DESCRIPTION OF THE SOFTWARE PACKAGE 8.obj file are simple and easy readable.dedalo-3d.. This approach can be still o human_mesh v -8.

Note that camera in Blender captures in its negative z axis. so object 2 units wide by default in x axis will be 4 units wide.z axis f=1 is focal plane distance R= name objects name in Blender’s data structure px objects position in x axis py objects position in y axis pz objects position in z axis rx objects rotation around x axis in degrees ry objects rotation around y axis in degrees rz objects rotation around z axis in degrees sx objects size in x axis sy objects size in y axis sz objects size in z axis 8.0 is the camera centre in extension) are used for storing the configuration of cameras in Blender.0” will produce two times bigger object in x axis then the default. but the name parameter is mandatory. AVI is a commonly used format on Windows platforms avijpeg AVI movie w/ Jpeg images avicodec AVI using win32 codec quicktime Quicktime movie (if enabled) .3 Camera configuration files The camera configuration files (with . Depending on the type of object the following parameters can be passed.y. Camera configuration file is a normal text file with parameters on each line: C=3.0.y.0.0 is camera rotation around its x. Setting sx=” The size of object depends on its default size. CAMERA CONFIGURATION FILES n number of frame 39 object Which object will be set up. Possible formats are: aviraw Uncompressed AVI files.80.z axis k=4:2 is aspect ration u:v size=600x800 is output image size resolution in pixels format=png is image output type format.

-0. <object name="mySkeleton" sx="1.0"?> <animation> We start with first frame. <startframe n="1"/> And the 30th frame will be last.4 Exported animation We also export vertex locations for testing for each frame and the bone poses as well. that we want object mySkeleton to be a 1.2: Example xml animation file. We use simple text file format.verts and contain coordinates in x.0) <object name="mySkeleton" px="0. <object name="mySkeleton" rz="45" py="-1.40 CHAPTER 8. The files with vertex coordinates have extension . <frame n="1"> The order of setting in frame tag doesn’t matter.5783573985100.0" pz="0.-1.8 unit height.0.0) and rotated around z axis by 45 degrees.0"> </object> </frame> <frame n="15"> <object name="mySkeleton" rz="0.2762540280819.z order for each vertex on a separate line: -0. <endframe n="30"/> Now we define the first frame.0" py="0.8" sz="1. .y.8" sy="1. Here we say. The mySkeleton object is the name of armature object in Blender data. DESCRIPTION OF THE SOFTWARE PACKAGE <?xml version="1.8" /> Here we set the position at (0.4156879186630.0"> </object> </frame> <frame n="30"> The final position of object should be at (0. 1. (Deprecated) tga Targa files rawtga Raw Targa files png Png files bmp Bitmap files jpg Jpeg files hamx Hamx files iris Iris files iriz Iris + z-buffer files ftype Ftype file 8.0"> </object> </frame> </animation> Figure 8.

Data are exported for each frame to a separate file. These data can be used for validation of human body detection.

For bones, we export the head and tail locations in the same x, y, z order. Each line has the form name.head=[x, y, z] or name.tail=[x, y, z], where name is the bone name (for example l_knee, r_ankle, l_wrist, l_elbow or vl5). The locations are absolute (in world space) for both bones and vertices. The bone pose data can be used for testing the detected pose.

8.5 Run parameters

The syntax to run Blender with the Python script is:

% Windows
SET bpypath=drive:\path_to_python_scripts
blender_dir\blender.exe template.blend -P %bpypath%\rahat\run.py

% Linux
bpypath=scripts
blender_dir/blender template.blend -P $bpypath/rahat/run.py

This starts Blender in interactive mode with the template.blend file opened and runs the main script with a simple GUI. We use Blender's GUI because some changes must be done manually and the whole package is still under testing. The functions of the GUI can easily be rewritten into another script to automate the whole process of creating animations.

A few unexpected events may occur, for example the script window may be hidden. This is caused by different settings in the template.blend file. It is easy to switch to the script window using the icons, or the default window arrangement can be changed. When you run another Blender script while the main script is running, you may need to switch back to the main script by choosing the active script for the script window. This can be done with the scroll button on the script window panel.
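The exported bone poses of Section 8.4 are plain text, so they can be read back without Blender. The following is a minimal sketch, assuming that every line has the layout name.head=[x, y, z] or name.tail=[x, y, z]; the function names parse_bone_poses and bone_length and the sample values are ours and purely illustrative, they are not part of the package.

```python
import math
import re

# Assumed line layout of the bone pose export: <bone>.<head|tail>=[x, y, z]
_LINE = re.compile(r"^\s*(\w+)\.(head|tail)\s*=\s*\[([^\]]*)\]\s*$")


def parse_bone_poses(text):
    """Return {bone: {"head": (x, y, z), "tail": (x, y, z)}} from exported text."""
    bones = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        name, joint, body = match.groups()
        coords = tuple(float(tok) for tok in re.split(r"[,\s]+", body.strip()) if tok)
        if len(coords) != 3:
            raise ValueError("expected three coordinates in line: %r" % line)
        bones.setdefault(name, {})[joint] = coords
    return bones


def bone_length(bone):
    """Euclidean distance between the head and the tail of one parsed bone."""
    return math.sqrt(sum((t - h) ** 2 for h, t in zip(bone["head"], bone["tail"])))


# Invented sample data, only to show the call sequence.
sample = "l_knee.head=[0.1, -0.5, 0.5]\nl_knee.tail=[0.1, -0.5, 0.1]\n"
poses = parse_bone_poses(sample)
print(sorted(poses), bone_length(poses["l_knee"]))
```

Because the locations are absolute world-space coordinates, the bone length computed this way should stay constant over all frames, which gives a quick sanity check of an export.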

Chapter 9

Used mathematics and transformations

For row vectors we use the bold math font ($\mathbf{v}$) and for matrices the typewriter math font ($\mathtt{M}$). The functions of the Blender Python API (PoseBone.poseMatrix) are written in typewriter font as well. Definitions:

• $\mathtt{M}(object)$ is the object's 4 × 4 transformation matrix to the world space. It is obtained by calling the Python function Object.getMatrix().
• $\mathtt{B}(bone)$ is the bone's 3 × 3 transformation matrix. It describes the orientation of the bone in the rest pose. It is obtained by accessing the attribute Bone.matrix['BONESPACE'].
• $\mathtt{A}(bone)$ is the bone's 4 × 4 transformation matrix to the armature space in the rest pose. It is obtained by accessing the attribute Bone.matrix['ARMATURESPACE'].
• $\mathtt{O}(bone)$ is the bone's 4 × 4 pose matrix. It is a transformation from the bone space to the armature space. It is obtained by accessing the attribute PoseBone.poseMatrix.
• $\mathtt{P}$ is the 3 × 4 projection matrix.
• $\tilde{\mathtt{P}}$ is the projection matrix extended by a row vector to the shape 4 × 4.

9.1 Blender's math

Blender is based on OpenGL and thus also accepts the OpenGL data format and transformations. The matrices are stored in the OpenGL column-major order, which differs from the standard C programming language interpretation of arrays. As a result, the matrices are transposed and the order of multiplication is changed. The vectors are transposed as well (column vectors become row vectors). The matrix parameters array $[a_1, a_2, \ldots]$

represents the following matrix:

\[
\mathtt{M} = \begin{bmatrix} a_1 & a_5 & a_9 & a_{13} \\ a_2 & a_6 & a_{10} & a_{14} \\ a_3 & a_7 & a_{11} & a_{15} \\ a_4 & a_8 & a_{12} & a_{16} \end{bmatrix}. \tag{9.1}
\]

As a result, a translation matrix in Blender has the following shape:

\[
\mathtt{T} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ t_x & t_y & t_z & 1 \end{bmatrix}. \tag{9.2}
\]

This could be transposed, but Blender's Python API uses this matrix array representation and the vectors in Blender are row vectors by default (the Blender math functions do not work properly with column vectors). A vector must therefore be transformed by a Blender matrix in the following order:

\[
\mathbf{v}' = \mathbf{v} \cdot \mathtt{M}. \tag{9.3}
\]

We follow this convention in the text whenever Blender-related transformations appear. Elsewhere we use the common notation of computer vision books such as [5].

The coordinates in Blender can be relative to parent objects or absolute. The absolute coordinates are in the world space. The local coordinates are relative to the object's origin or to its parent. This space is called the local space. The space type must be specified, for example, when the Blender Python API functions Object.getLocation(space) or Object.getEuler(space) are used. Besides that, Blender also recognizes the armature space, the bone space and the pose space. The armature space is used to define the rest bone positions. The coordinates in the armature space are relative to the origin of the armature object. The bone space is defined by the bone configuration. The coordinates in the bone space are relative to the bone heads. Blender denotes the bone joints as the bone head and the bone tail. The pose space is used for vertex transformations from the armature space to the pose space. While the armature space vertex locations define the rest pose, the pose space vertex locations define the new pose.

9.2 Blender Euler rotations

Blender uses Euler rotations to express the object rotation in the world space. The Blender Euler angles are in degrees (this corresponds to the Blender Python Euler object), but the rotation angles used in object parameters are in

radians (this corresponds to the Blender Python Object.Rot* properties). Euler angles suffer from drawbacks such as gimbal lock. We describe here how to transform Blender Euler angles to a rotation matrix. The computation of the rotation matrix is important mainly for computing the camera projection matrix. Assume that we have already converted the angles to radians. The angles $\alpha$, $\beta$, $\gamma$ are rotations around the x, y, z axes. The rotation matrices around the individual axes are simple:

\[
\mathtt{R}_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}, \tag{9.4}
\]
\[
\mathtt{R}_y = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}, \tag{9.5}
\]
\[
\mathtt{R}_z = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{9.6}
\]

The final rotation matrix can be composed as:

\[
\mathtt{R}(\alpha, \beta, \gamma) = \mathtt{R}_z(\gamma) \cdot \mathtt{R}_y(\beta) \cdot \mathtt{R}_x(\alpha). \tag{9.7}
\]

The final matrix is:

\[
\mathtt{R}(\alpha, \beta, \gamma) = \begin{bmatrix}
\cos\gamma\cos\beta & -\sin\gamma\cos\alpha + \cos\gamma\sin\beta\sin\alpha & \sin\gamma\sin\alpha + \cos\gamma\sin\beta\cos\alpha \\
\sin\gamma\cos\beta & \cos\gamma\cos\alpha + \sin\gamma\sin\beta\sin\alpha & -\cos\gamma\sin\alpha + \sin\gamma\sin\beta\cos\alpha \\
-\sin\beta & \cos\beta\sin\alpha & \cos\beta\cos\alpha
\end{bmatrix}. \tag{9.8}
\]

The backward decomposition is more complicated and is not unique. Let $\mathtt{R}$ be a general rotation matrix with the following parameters:

\[
\mathtt{R}(\alpha, \beta, \gamma) = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}. \tag{9.9}
\]

Using the result from (9.8) we can write:

\[
\sin\beta = -g, \qquad \cos\beta = -\sqrt{1 - g^2}. \tag{9.10}
\]

The sign of $\cos\beta$ can also be positive, but we search for only one solution. When we express $\sin\alpha$ and $\cos\alpha$ we get:

\[
\sin\alpha = -\frac{h}{\sqrt{1 - g^2}}, \qquad \cos\alpha = -\frac{i}{\sqrt{1 - g^2}}. \tag{9.11}
\]

Substituting $\sin\alpha$, $\cos\alpha$, $\sin\beta$, $\cos\beta$ back into the rotation matrix gives the identity:

\[
\begin{bmatrix} -\cos\gamma\sqrt{1 - g^2} & \ldots & \ldots \\ -\sin\gamma\sqrt{1 - g^2} & \ldots & \ldots \\ g & h & i \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}. \tag{9.12}
\]

So $\sin\gamma$ and $\cos\gamma$ can be expressed as:

\[
\sin\gamma = -\frac{d}{\sqrt{1 - g^2}}, \qquad \cos\gamma = -\frac{a}{\sqrt{1 - g^2}}. \tag{9.13}
\]

This is correct if $g \neq \pm 1$; otherwise $\cos\beta = 0$ and the rotation matrix is degenerate:

\[
\mathtt{R}(\alpha, \beta, \gamma) = \begin{bmatrix} 0 & x & y \\ 0 & \mp y & \pm x \\ \pm 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix}
0 & -\sin\gamma\cos\alpha \mp \cos\gamma\sin\alpha & \sin\gamma\sin\alpha \mp \cos\gamma\cos\alpha \\
0 & \cos\gamma\cos\alpha \mp \sin\gamma\sin\alpha & -\cos\gamma\sin\alpha \mp \sin\gamma\cos\alpha \\
\pm 1 & 0 & 0
\end{bmatrix} \tag{9.14}
\]

and we get 2 equations for 4 variables. We can choose, for example, $\alpha$ as:

\[
\sin\alpha = 0, \qquad \cos\alpha = 1 \tag{9.15}
\]

and express $\gamma$ as:

\[
\sin\gamma = -b, \qquad \cos\gamma = e. \tag{9.16}
\]

The lines above describe how we compute and decompose the rotation matrix composed of Blender Euler angles. This composition corresponds to the Blender source code of the rotation matrix. Note that Blender uses transposed matrices, so a matrix obtained from the Blender Python functions is transposed with respect to the matrix (9.8) as well.

9.3 Blender camera model

The camera in Blender can be orthographic or perspective. We need only the perspective camera for our purposes, because it is the closest approximation to a real pinhole camera. Projections in Blender are based on OpenGL, so all the transformations are similar to the OpenGL viewing transformations. To be able to describe the camera and to compute the projection matrix, we must understand the differences between the OpenGL and the computer vision camera models. More about camera projections can be found in the book [5] by Hartley and Zisserman.
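Returning to the Euler rotations of Section 9.2, the composition (9.8) and the decomposition (9.10) to (9.16) can be checked numerically. The following is a minimal sketch in plain Python, independent of the Blender API and using the computer vision (column vector) convention; the function names are ours. Because the decomposition is not unique, the recovered angles generally differ from the original ones, but they must reproduce the same matrix.

```python
import math


def euler_to_matrix(alpha, beta, gamma):
    """R = Rz(gamma) * Ry(beta) * Rx(alpha), angles in radians, as in (9.8)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)
    sg, cg = math.sin(gamma), math.cos(gamma)
    return [
        [cg * cb, -sg * ca + cg * sb * sa, sg * sa + cg * sb * ca],
        [sg * cb, cg * ca + sg * sb * sa, -cg * sa + sg * sb * ca],
        [-sb, cb * sa, cb * ca],
    ]


def matrix_to_euler(R, eps=1e-9):
    """Return one (alpha, beta, gamma) solution for a rotation matrix R."""
    (a, b, _c), (d, e, _f), (g, h, i) = R
    root = 1.0 - g * g
    if root > eps:
        # Regular case: negative branch of cos(beta), as in (9.10).
        beta = math.atan2(-g, -math.sqrt(root))
        alpha = math.atan2(-h, -i)   # (9.11), common positive factor dropped
        gamma = math.atan2(-d, -a)   # (9.13)
    else:
        # Degenerate case g = +-1: choose alpha = 0, see (9.15) and (9.16).
        alpha = 0.0
        beta = math.atan2(-g, 0.0)
        gamma = math.atan2(-b, e)
    return alpha, beta, gamma


def max_abs_difference(A, B):
    """Largest element-wise difference of two matrices."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


R = euler_to_matrix(0.3, -0.7, 1.9)
recovered = matrix_to_euler(R)
print(max_abs_difference(R, euler_to_matrix(*recovered)) < 1e-9)
```

Using atan2 instead of separate sine and cosine expressions avoids dividing by $\sqrt{1 - g^2}$ and keeps the quadrant of each angle consistent with the chosen branch.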

The basic camera parameters in Blender are the location and the rotation. Both are related to the world space (as are all objects in a Blender scene). These parameters do not transform world coordinates to camera coordinates but the reverse. The inverse transformation must be used to transform world coordinates to the camera space. The biggest difference between Blender and computer vision is that the Blender camera views along the negative z axis (as is common for OpenGL projections), see Figure 9.1. Cameras in Blender also have clipping planes specified for the closest and the farthest visible objects. This has no influence on the camera model, except that objects outside the clipping planes are not rendered.

Figure 9.1: A common OpenGL camera model used for the perspective camera in Blender.

To avoid possible confusion, we define the notation and conventions used. First, we expect a pinhole camera model as shown in Figure 9.2, where C is the camera center. Note that the camera axes differ from the OpenGL model shown in Figure 9.1. The objects captured by this camera are projected onto the image plane (x, y axes), see Figure 9.2. The camera optical axis passes through the center of the image plane. The origin of the image plane is shifted by the offset $u_0$, $v_0$ (the principal point) from the origin of the image. The image is rasterized to pixels (u, v axes), see Figure 9.3. The origin is at the top left corner. Here we use the coordinates in the same way as for indexing the matrices of stored images: the row is the first index and the column is the second index. The parameter f in Figure 9.2 is the camera focal length; it corresponds to the lens parameter of the Blender camera in Figure 3.4. We can write the transformation between the parameters as:

\[
f = \frac{lens}{16.0}. \tag{9.17}
\]

Figure 9.2: A pinhole camera model.

Figure 9.3: Image plane (x, y axes) and the indexing of the image pixels (u, v).

We need to estimate the projection matrix which is often used in computer vision. The projection matrix describes the projection from the world space to the image pixels. The first step is transforming the world coordinates into the camera space. Because we know the camera location $\mathbf{C}$ and the camera rotation in the world space, we can transform any point into the camera space. We know the Euler angles $\alpha$, $\beta$, $\gamma$ of the Blender camera object, so we can compute the rotation matrix. The rotation to the camera space is its inverse. There is also another rotation of the axes between the Blender camera model and the pinhole model, see Figure 9.1 and Figure 9.2. This is caused by the different orientation of the axes. This rotation can be written as:

\[
\mathtt{T} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}. \tag{9.18}
\]

The rotation from the world space to the camera space is:

\[
\mathtt{R} = \mathtt{T} \cdot \mathtt{R}(\alpha, \beta, \gamma)^\top. \tag{9.19}
\]
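The rotation from the world space to the camera space can be sketched as follows. This is a minimal plain-Python example and not code from the package: it assumes that the inverse of the camera's world rotation is taken as the transpose of $\mathtt{R}(\alpha, \beta, \gamma)$, it uses column vectors, and the names AXIS_SWAP, world_to_camera_rotation and world_to_camera are ours.

```python
import math


def euler_to_matrix(alpha, beta, gamma):
    """R = Rz(gamma) * Ry(beta) * Rx(alpha), angles in radians, as in (9.8)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)
    sg, cg = math.sin(gamma), math.cos(gamma)
    return [
        [cg * cb, -sg * ca + cg * sb * sa, sg * sa + cg * sb * ca],
        [sg * cb, cg * ca + sg * sb * sa, -cg * sa + sg * sb * ca],
        [-sb, cb * sa, cb * ca],
    ]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Axis change between the Blender (OpenGL) camera and the pinhole model, (9.18).
AXIS_SWAP = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, -1.0]]


def world_to_camera_rotation(alpha, beta, gamma):
    """World-to-camera rotation: axis swap times the inverse camera rotation."""
    return matmul(AXIS_SWAP, transpose(euler_to_matrix(alpha, beta, gamma)))


def world_to_camera(point, center, angles):
    """Camera-space coordinates of a world point: R * (x_w - C)."""
    R = world_to_camera_rotation(*angles)
    return matvec(R, [p - c for p, c in zip(point, center)])


# A camera at height 5 with zero rotation looks down the world -z axis,
# so the world origin must end up in front of it (positive z in camera space).
print(world_to_camera([0.0, 0.0, 0.0], [0.0, 0.0, 5.0], (0.0, 0.0, 0.0)))
```

The check at the end illustrates the sign convention: a point in front of the Blender camera has a negative z in Blender's camera space but a positive depth in the pinhole model.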

$\mathbf{C}$ is the vector of the camera location in the world space. The whole transformation of any homogeneous vector $\mathbf{x}_w = [x_w, y_w, z_w, 1]^\top$ to the camera space can be written as:

\[
\mathbf{x}_c = [\mathtt{R}, -\mathtt{R}\mathbf{C}]\,\mathbf{x}_w, \tag{9.20}
\]

where $\mathbf{x}_c = [x_c, y_c, z_c]^\top$. This vector is projected onto the image plane as:

\[
\begin{bmatrix} \lambda x \\ \lambda y \\ \lambda \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \mathbf{x}_c, \tag{9.21}
\]

where f is the focal length. The transformation from normalized coordinates to pixels is the following:

\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} -m_u & 0 & u_0 \\ 0 & m_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{9.22}
\]

and so we can write the Blender camera projection matrix as:

\[
\mathtt{P} = \mathtt{K}\,[\mathtt{R}, -\mathtt{R}\mathbf{C}], \tag{9.23}
\]

where $\mathtt{K}$ is known as the calibration matrix and can be written as:

\[
\mathtt{K} = \begin{bmatrix} -m_u \cdot f & 0 & u_0 \\ 0 & m_v \cdot f & v_0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{9.24}
\]

The parameters of the matrix $\mathtt{K}$ are computed as follows:

\[
u_0 = \frac{height}{2}, \qquad v_0 = \frac{width}{2}. \tag{9.25}
\]

Blender chooses the axis with the maximum resolution, and the other axis is scaled if the aspect ratio is not 1 : 1. The coefficients $m_u$, $m_v$ can be estimated as:

\[
m = \max(width \cdot k_v,\; height \cdot k_u), \qquad m_u = \frac{m}{2 k_u}, \qquad m_v = \frac{m}{2 k_v}. \tag{9.26}
\]

We use this code in our function:

    if width * kv > height * ku:
        mv = width / 2.0
        mu = mv * kv / ku
    else:
        mu = height / 2.0
        mv = mu * ku / kv

where ku : kv is the aspect ratio height : width defined in the Blender render parameters. The projection matrices are then exported into simple text files and can be used by computer vision algorithms.

9.4 Using projection matrices with OpenGL

In the previous section we derived the projection matrix of the Blender perspective camera. We also need to use the projection matrix with OpenGL for rendering into images captured by real cameras. Instead of using the OpenGL utility functions, we must set our own OpenGL view transformations in order to get the projection of a real camera. Before describing the transformations, we give a quick overview of OpenGL. We use the OpenGL terminology here. OpenGL works almost always with homogeneous vector coordinates and with 4 × 4 matrices. All the matrices can be set explicitly, but it is recommended to use the OpenGL utility functions. The object coordinates of any point are transformed as follows:

1. The MODELVIEW matrix transforms object coordinates to eye coordinates, which we can understand, in the computer vision analogy, as coordinates in the camera space. This corresponds to the Blender transformations from the object's local space to the world space and then to the camera local space.
2. The PROJECTION matrix transforms eye coordinates to clip coordinates. These coordinates are clipped against the viewing frustum.
3. The clip coordinates are transformed (divided) by the perspective division to normalized device coordinates. These coordinates are clamped to the interval $\langle -1, 1 \rangle$.
4. Finally, the VIEWPORT transformation is applied to transform the normalized device coordinates to window (pixel) coordinates.

We will write $\mathtt{M}_v$ for the MODELVIEW matrix and $\mathtt{P}_j$ for the PROJECTION matrix. The coordinates $[X_o, Y_o, Z_o]$ of any point (the

object coordinates) are transformed to eye coordinates as:

\[
\begin{bmatrix} X_e \\ Y_e \\ Z_e \\ W_e \end{bmatrix} = \mathtt{M}_v \begin{bmatrix} X_o \\ Y_o \\ Z_o \\ 1 \end{bmatrix}, \tag{9.27}
\]

then to clip coordinates:

\[
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ W_c \end{bmatrix} = \mathtt{P}_j \begin{bmatrix} X_e \\ Y_e \\ Z_e \\ W_e \end{bmatrix}, \tag{9.28}
\]

and finally to the normalized device coordinates:

\[
\begin{bmatrix} X_n \\ Y_n \\ Z_n \end{bmatrix} = \begin{bmatrix} X_c / W_c \\ Y_c / W_c \\ Z_c / W_c \end{bmatrix}. \tag{9.29}
\]

Note that these coordinates are in the interval $\langle -1, 1 \rangle$. Figure 9.4 shows how the normalized device coordinates are displayed in the final window (we can understand the OpenGL window as the image, because we usually render into images). We set the default VIEWPORT transformation by the OpenGL function

    glViewport(0, 0, width, height)

We then get the window coordinates as follows:

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} width/2 \cdot X_n + width/2 \\ height/2 \cdot Y_n + height/2 \\ Z_n/2 + 1/2 \end{bmatrix}. \tag{9.30}
\]

The OpenGL window (image) coordinates x, y have their origin in the bottom left corner (this differs from our indexing of the image as shown in Figure 9.3). The last coordinate z is stored in the z-buffer for the visibility test. During rasterization all three coordinates x, y, z are used.

We want to define the same projection as the real camera has. We use the real camera's projection matrix. The point with world coordinates $[X_w, Y_w, Z_w]$ must have the same location in the OpenGL window (image) as in the captured image (see Figure 9.5). We want to set the OpenGL transformations so that the projected window coordinates x, y correspond to the image pixel coordinates u, v. Figure 9.5 shows the image coordinates u, v used by our pinhole camera model and the coordinates x, y used by OpenGL. The relation is:

\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} v \\ height - u \end{bmatrix}. \tag{9.31}
\]

Figure 9.4: The OpenGL normalized device coordinates and their rasterization in the window (image).

To be able to use a projection matrix $\mathtt{P}$ of size 3 × 4 with OpenGL, we can extend it by the row vector [0, 0, 0, 1] to the size 4 × 4:

\[
\tilde{\mathtt{P}} = \begin{bmatrix} \mathtt{P} \\ 0 \;\; 0 \;\; 0 \;\; 1 \end{bmatrix} \tag{9.32}
\]

to get the transformation:

\[
\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \\ 1 \end{bmatrix} = \tilde{\mathtt{P}} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}. \tag{9.33}
\]

We need to find a matrix $\mathtt{T}$ by which we multiply the matrix $\tilde{\mathtt{P}}$ in order to get the eye coordinates. The eye coordinates must finally project onto the same location in the image as if the point was captured by a real camera:

\[
\begin{bmatrix} X_e \\ Y_e \\ Z_e \\ 1 \end{bmatrix} = \mathtt{T} \begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \\ 1 \end{bmatrix} = \begin{bmatrix} a_x & b_x & c_x & d_x \\ a_y & b_y & c_y & d_y \\ a_z & b_z & c_z & d_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \\ 1 \end{bmatrix}, \tag{9.34}
\]

which gives a set of equations:

\[
\begin{aligned}
X_e &= a_x \tilde{u} + b_x \tilde{v} + c_x \tilde{w} + d_x \\
Y_e &= a_y \tilde{u} + b_y \tilde{v} + c_y \tilde{w} + d_y \\
Z_e &= a_z \tilde{u} + b_z \tilde{v} + c_z \tilde{w} + d_z
\end{aligned} \tag{9.35}
\]

Figure 9.5: Pixel coordinates of the pinhole camera model are in the left image, pixel coordinates of the OpenGL window (image) are on the right.

Before we continue, we define the PROJECTION and VIEWPORT transformations. For the VIEWPORT transformation, we use the transformation defined in (9.30). It is a common way of setting the OpenGL viewport. We set the PROJECTION transformation matrix $\mathtt{P}_j$ to this shape:

\[
\mathtt{P}_j = \begin{bmatrix} \frac{2}{width} & 0 & 0 & 0 \\ 0 & \frac{2}{height} & 0 & 0 \\ 0 & 0 & \frac{2}{f - n} & -\frac{f + n}{f - n} \\ 0 & 0 & 1 & 0 \end{bmatrix}, \tag{9.36}
\]

where n and f are, respectively, the nearest and the farthest value of the viewing distance. Note that the eye coordinates $[X_e, Y_e, Z_e, 1]$ are mapped to the clip coordinates

\[
\left[ X_e \frac{2}{width},\; Y_e \frac{2}{height},\; \frac{2 Z_e - f - n}{f - n},\; Z_e \right], \qquad \frac{2 Z_e - f - n}{f - n} = \frac{2 (Z_e - n)}{f - n} - 1. \tag{9.37}
\]

After the perspective division we get the normalized coordinates:

\[
\left[ \frac{X_e}{Z_e} \frac{2}{width},\; \frac{Y_e}{Z_e} \frac{2}{height},\; \frac{2 Z_e - f - n}{Z_e (f - n)} \right]. \tag{9.38}
\]

If $Z_e = n$, the third clip coordinate equals $-1$; if $Z_e = f$, it equals $1$. In the window coordinates x, y we get the eye coordinates divided by the distance $Z_e$ and shifted, so that the origin is in the middle of the window (image). We get these equations:

\[
x = \frac{X_e}{Z_e} + \frac{width}{2} = \frac{\tilde{v}}{\tilde{w}}, \qquad
y = \frac{Y_e}{Z_e} + \frac{height}{2} = height - \frac{\tilde{u}}{\tilde{w}}. \tag{9.39}
\]

Further, we can form a set of equations using (9.35):

\[
\frac{a_x \tilde{u} + b_x \tilde{v} + c_x \tilde{w} + d_x}{a_z \tilde{u} + b_z \tilde{v} + c_z \tilde{w} + d_z} = \frac{\tilde{v}}{\tilde{w}} - \frac{width}{2}, \qquad
\frac{a_y \tilde{u} + b_y \tilde{v} + c_y \tilde{w} + d_y}{a_z \tilde{u} + b_z \tilde{v} + c_z \tilde{w} + d_z} = \frac{height}{2} - \frac{\tilde{u}}{\tilde{w}}. \tag{9.40}
\]

One solution is:

\[
\begin{aligned}
a_x &= 0, & b_x &= 1, & c_x &= -\tfrac{width}{2}, & d_x &= 0 \\
a_y &= -1, & b_y &= 0, & c_y &= \tfrac{height}{2}, & d_y &= 0 \\
a_z &= 0, & b_z &= 0, & c_z &= 1, & d_z &= 0
\end{aligned} \tag{9.41}
\]

and we can write the matrix $\mathtt{T}$ as:

\[
\mathtt{T} = \begin{bmatrix} 0 & 1 & -width/2 & 0 \\ -1 & 0 & height/2 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{9.42}
\]

This applies if we use the transformations (9.30) and (9.36). We use the matrix $\tilde{\mathtt{P}}$ multiplied by our matrix $\mathtt{T}$ (9.42) as the OpenGL MODELVIEW matrix, and we use our matrix $\mathtt{P}_j$ (9.36) to define the OpenGL PROJECTION matrix. The sequence of OpenGL commands for setting the described transformations looks as follows:

    glViewport(0, 0, width, height)
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()
    glMultMatrixd(Pj)
    glMatrixMode(GL_MODELVIEW)
    glLoadIdentity()
    glMultMatrixd(T)
    glMultMatrixd(P~)

Note that matrices in OpenGL are stored in a different format than the one some programming languages use.
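The claim that the chosen matrices reproduce the real camera's pixel coordinates can be checked numerically without OpenGL. The following plain-Python sketch uses mathematical (column vector) matrices rather than the OpenGL memory layout, emulates the pipeline of (9.27) to (9.30) and compares the window coordinates with the pinhole coordinates u, v. The function names and the example projection matrix are ours and purely illustrative.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def extend_projection(P):
    """Append the row [0, 0, 0, 1] to a 3x4 projection matrix, as in (9.32)."""
    return [list(row) for row in P] + [[0.0, 0.0, 0.0, 1.0]]


def alignment_matrix(width, height):
    """The matrix T of (9.42)."""
    return [
        [0.0, 1.0, -width / 2.0, 0.0],
        [-1.0, 0.0, height / 2.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def projection_matrix(width, height, near, far):
    """The PROJECTION matrix Pj of (9.36)."""
    return [
        [2.0 / width, 0.0, 0.0, 0.0],
        [0.0, 2.0 / height, 0.0, 0.0],
        [0.0, 0.0, 2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 1.0, 0.0],
    ]


def opengl_window_xy(P, point, width, height, near, far):
    """Emulate MODELVIEW, PROJECTION, perspective division and VIEWPORT."""
    modelview = matmul(alignment_matrix(width, height), extend_projection(P))
    eye = matvec(modelview, list(point) + [1.0])
    clip = matvec(projection_matrix(width, height, near, far), eye)
    xn, yn = clip[0] / clip[3], clip[1] / clip[3]
    return width / 2.0 * xn + width / 2.0, height / 2.0 * yn + height / 2.0


def pinhole_uv(P, point):
    """Pixel coordinates (u = row, v = column) given by the projection matrix."""
    u, v, w = matvec(P, list(point) + [1.0])
    return u / w, v / w


# Illustrative camera: K = [[-800, 0, 240], [0, 800, 320], [0, 0, 1]], R = I,
# C = (0, 0, -5), multiplied out to P = K [R, -RC].
P = [[-800.0, 0.0, 240.0, 1200.0], [0.0, 800.0, 320.0, 1600.0], [0.0, 0.0, 1.0, 5.0]]
u, v = pinhole_uv(P, (0.5, -0.2, 1.0))
x, y = opengl_window_xy(P, (0.5, -0.2, 1.0), 640, 480, 0.1, 100.0)
print(abs(x - v) < 1e-9, abs(y - (480 - u)) < 1e-9)
```

The near and far distances do not influence x and y; they only affect the depth value used for the visibility test.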

9.5 Bone transformations

Many developers working with Blender were disappointed by the changes in the Blender 2.4 armature source code, and therefore the Blender developers published a schema explaining how the armature deformations work in Blender. You can see it in Figure 9.6. It describes the code and the data structures, but it does not say how they relate to the Blender Python API or how to compose the matrices. We describe how we compute the matrices shown in the schema in Figure 9.6. We follow the Blender Python notation here so that the equations correspond to the Python code. The vectors bone.head and bone.tail used in this chapter, the locations [x, y, z] of the bone head and tail, are row vectors.

First we need to describe the computation of the quaternions. We can compute the normal vector $\mathbf{n}$ of two normalized vectors $\mathbf{u}$, $\mathbf{v}$ as

\[
\mathbf{n} = \mathbf{u} \times \mathbf{v} \tag{9.43}
\]

and the angle $\theta$ between the vectors as:

\[
\theta = \arccos(\mathbf{u} \cdot \mathbf{v}). \tag{9.44}
\]

Then the quaternion of the rotation from $\mathbf{u}$ to $\mathbf{v}$, with $\mathbf{n}$ normalized to unit length, is

\[
\mathbf{q} = \left[ \cos\frac{\theta}{2},\; \mathbf{n} \sin\frac{\theta}{2} \right]. \tag{9.45}
\]

We then use the Blender Python functions to convert the quaternions to rotation matrices. Here we will write

\[
\mathtt{R}_q(quaternion); \tag{9.46}
\]

by this we mean a rotation matrix composed from a quaternion.

The bone matrix in the bone space is 3 × 3. It describes the orientation of the bone. It is defined by the locations of the bone's head and tail and by the roll parameter, which rotates the bone around its local y axis. We compute the bone matrix in the bone space as

\[
\mathtt{B}(bone) = \mathtt{R}(0, bone.roll, 0) \cdot \mathtt{R}_q(\mathbf{q}), \tag{9.47}
\]

where $\mathtt{R}$ is the matrix from equation (9.8) and $\mathbf{q}$ is the quaternion composed, with $\mathbf{u} = [0, 1, 0]$, as:

\[
\begin{aligned}
\mathbf{v} &= \frac{bone.tail - bone.head}{\lVert bone.tail - bone.head \rVert} \\
\mathbf{n} &= [0, 1, 0] \times \mathbf{v} \\
\theta &= \arccos(\mathbf{u} \cdot \mathbf{v}) \\
\mathbf{q} &= \left[ \cos\frac{\theta}{2},\; \mathbf{n} \sin\frac{\theta}{2} \right].
\end{aligned} \tag{9.48}
\]
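The quaternion construction of (9.43) to (9.48) can be sketched in plain Python as follows. This is an illustration under our own assumptions, not the package code and not the Blender API: the quaternion is stored as [w, x, y, z], the rotation axis is explicitly normalized, the parallel and opposite cases (where the cross product vanishes) are handled separately, and all function names are ours.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def quaternion_between(u, v):
    """Unit quaternion [w, x, y, z] that rotates the direction u onto v."""
    u, v = normalize(u), normalize(v)
    n = cross(u, v)                                  # (9.43)
    sin_theta = math.sqrt(dot(n, n))
    cos_theta = max(-1.0, min(1.0, dot(u, v)))
    if sin_theta < 1e-12:
        if cos_theta > 0.0:
            return [1.0, 0.0, 0.0, 0.0]              # same direction: identity
        # Opposite directions: a half turn around any axis perpendicular to u.
        helper = [1.0, 0.0, 0.0] if abs(u[0]) < 0.9 else [0.0, 1.0, 0.0]
        return [0.0] + normalize(cross(u, helper))
    theta = math.acos(cos_theta)                     # (9.44)
    axis = [x / sin_theta for x in n]                # unit rotation axis
    return [math.cos(theta / 2.0)] + [x * math.sin(theta / 2.0) for x in axis]  # (9.45)


def rotate(q, p):
    """Rotate the 3D vector p by the unit quaternion q."""
    w, r = q[0], q[1:]
    t = [2.0 * x for x in cross(r, p)]
    rt = cross(r, t)
    return [p[k] + w * t[k] + rt[k] for k in range(3)]


def bone_direction_quaternion(head, tail):
    """Quaternion rotating the y axis onto the bone direction, as in (9.48)."""
    return quaternion_between([0.0, 1.0, 0.0], [t - h for h, t in zip(head, tail)])


q = bone_direction_quaternion([0.0, 0.0, 0.0], [1.0, 1.0, 0.0])
print(rotate(q, [0.0, 1.0, 0.0]))
```

Rotating the y axis by the resulting quaternion should give the normalized bone direction, which is a convenient way to verify the sign conventions before the quaternion is passed on to other code.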

We write $\mathtt{R}(0, bone.roll, 0)$ because the Blender Python API works with transposed matrices and we use the Blender functions for matrix operations. In order to get the bone matrix in the armature space we must extend the bone local space matrix to 4 × 4 and include the parent bone transformations. If the bone does not have any parent bone, we can express the bone matrix in the armature space as:

\[
\mathtt{A}(bone) = \begin{bmatrix} \mathtt{B}(bone) & \mathbf{0}^\top \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ bone.head & 1 \end{bmatrix}, \tag{9.49}
\]

where $\mathtt{I}$ is the 3 × 3 identity matrix and $\mathbf{0}$ is the zero vector. If the bone has a parent, the computation of the bone matrix in the armature space is different:

\[
\mathtt{A}(bone) = \begin{bmatrix} \mathtt{B}(bone) & \mathbf{0}^\top \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ bone.head & 1 \end{bmatrix} \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ [0,\; parent.length,\; 0] & 1 \end{bmatrix} \mathtt{A}(parent). \tag{9.50}
\]

Here parent.length is the length of the parent bone, which can be computed as $\lVert parent.tail - parent.head \rVert$.

The quaternions are used to set the bone rotations. Finally, the bone matrix in the pose space differs from the matrix in the armature space by the quaternion pose.rot and by the translation row vector pose.loc, which set the bone to the new pose. If the bone does not have a parent, we can express its pose matrix as:

\[
\mathtt{O}(bone) = \mathtt{R}_q(pose.rot) \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ pose.loc & 1 \end{bmatrix} \mathtt{A}(bone). \tag{9.51}
\]

If the bone has a parent, the matrix is evaluated as:

\[
\mathtt{O}(bone) = \mathtt{R}_q(pose.rot) \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ pose.loc & 1 \end{bmatrix} \begin{bmatrix} \mathtt{B}(bone) & \mathbf{0}^\top \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ bone.head & 1 \end{bmatrix} \begin{bmatrix} \mathtt{I} & \mathbf{0}^\top \\ [0,\; parent.length,\; 0] & 1 \end{bmatrix} \mathtt{O}(parent). \tag{9.52}
\]

The matrices described above are used to compute the bone locations and poses. The description given here refers to Figure 9.6, but covers only the basic deformations that we used in our work. This is enough for our purposes. We only revised the Blender documentation, which is less specific about how the matrices are computed.

9.6 Vertex deformations by bones

We describe here the vertex transformations of any mesh object by armature bones. We describe only the deformation caused by bones using vertex groups

and weights for vertices. The deformation by envelopes, or by a combination of vertex groups with envelopes, is more complicated. This is not a complete description of the Blender bone transformations for vertices, but it completely describes the transformations which we use. We found the transformations by studying the Blender source code.

The armature object is a parent of the mesh or, in the newer Blender philosophy, it is a mesh modifier. A vertex can be a member of several vertex groups (we set a limit of 2 groups). A vertex group only defines the correspondence with a bone. Each group is transformed only by its bone (note that the vertex groups in Blender must have the same names as the bones to which they belong), but a vertex can be a member of several groups. In our work, each vertex has its own weight that defines the influence of the bone.

Assume we have the local space location of a vertex as the vector:

\[
\mathbf{v}_l = [x, y, z, 1]. \tag{9.53}
\]

The vertex is a member of this Mesh object. The world space vertex location is

\[
\mathbf{v}_w = \mathbf{v}_l \cdot \mathtt{M}(object), \tag{9.54}
\]

where $\mathtt{M}(object)$ is the object transformation matrix. The vertex location in the parent's armature space is

\[
\mathbf{v}_a = \mathbf{v}_w \cdot \mathtt{M}^{-1}(armature). \tag{9.55}
\]

Now, for each vertex group of which the vertex is a member, we compute the weighted vector of the relative change with respect to the vertex rest position. The vertex location in the bone space is:

\[
\mathbf{v}_{b_i} = \mathbf{v}_a \cdot \mathtt{A}^{-1}(bone_i). \tag{9.56}
\]

The vertex location in the armature space after applying the pose transformation is

\[
\mathbf{v}_{p_i} = \mathbf{v}_{b_i} \cdot \mathtt{O}(bone_i). \tag{9.57}
\]

The weighted difference of the new location is (using only the x, y, z components)

\[
\mathbf{d}_i^{1..3} = (\mathbf{v}_{p_i}^{1..3} - \mathbf{v}_a^{1..3}) \cdot weight_i. \tag{9.58}
\]

Note that the weight is in the interval $\langle 0, 1 \rangle$. The final vertex location in the armature space is

\[
\tilde{\mathbf{v}}_a = \mathbf{v}_a + \frac{\sum_i \mathbf{d}_i}{\sum_i weight_i}. \tag{9.59}
\]

The vertex location for the new pose is

\[
\tilde{\mathbf{v}}_l = \tilde{\mathbf{v}}_a \cdot \mathtt{M}(armature) \cdot \mathtt{M}^{-1}(object), \tag{9.60}
\]

which is the final vertex local space location after the deformations. The final local space vertex location can also be obtained using the Blender Python function NMesh.GetRawFromObject(name).
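The deformation steps (9.53) to (9.60) can be sketched in plain Python with 4 × 4 matrices in the row vector convention of Section 9.1. This is an illustration only, not the package code: the helper names are ours, the matrix inverse is a generic Gauss-Jordan elimination, and the bones are passed as hypothetical tuples (rest matrix A, pose matrix O, weight).

```python
def vecmat(v, M):
    """Row vector times matrix, v * M, as in (9.3)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n=4):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def translation(tx, ty, tz):
    """Translation matrix in the Blender layout of (9.2)."""
    M = identity()
    M[3][:3] = [tx, ty, tz]
    return M


def inverse(M):
    """Gauss-Jordan inverse of a small square matrix."""
    n = len(M)
    aug = [[float(x) for x in row] + identity(n)[i] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def deform_vertex(v_local, M_object, M_armature, bones):
    """Deform one vertex; bones is a list of (A_rest, O_pose, weight)."""
    v_w = vecmat(list(v_local) + [1.0], M_object)          # (9.54)
    v_a = vecmat(v_w, inverse(M_armature))                 # (9.55)
    total, weight_sum = [0.0, 0.0, 0.0], 0.0
    for A_rest, O_pose, weight in bones:
        v_b = vecmat(v_a, inverse(A_rest))                 # (9.56)
        v_p = vecmat(v_b, O_pose)                          # (9.57)
        for k in range(3):
            total[k] += (v_p[k] - v_a[k]) * weight         # (9.58)
        weight_sum += weight
    if weight_sum > 0.0:
        v_a = [v_a[k] + total[k] / weight_sum for k in range(3)] + [1.0]  # (9.59)
    back = matmul(M_armature, inverse(M_object))
    return vecmat(v_a, back)[:3]                           # (9.60)


rest = translation(0.0, 1.0, 0.0)      # bone head at y = 1 in the rest pose
posed = translation(0.5, 1.0, 0.0)     # the same bone shifted by 0.5 along x
print(deform_vertex([0.0, 2.0, 0.0], identity(), identity(), [(rest, posed, 1.0)]))
```

With a second, unposed bone of equal weight, the vertex moves only half as far, which illustrates the normalization by the sum of the weights in (9.59).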

9.7 Configuring the bones

When we animate our model, we do not change the bone joint locations or the bone sizes. We only rotate the bones to get the desired pose. As described in Chapter 7, we use the correspondences between the empty objects (created by importing the BVH animation data) and our bones. We must estimate the rotation (expressed as a quaternion) which rotates the bone from its rest pose to the new pose. We can compute the bone's head and tail locations in the rest pose as

\[
\begin{aligned}
[bone.head, 1] &= [0, 0, 0, 1]\; \mathtt{O}_{rest}(bone) \\
[bone.tail, 1] &= [0, bone.length, 0, 1]\; \mathtt{O}_{rest}(bone).
\end{aligned} \tag{9.61}
\]

We know the locations of the corresponding empty objects. We will denote them empty.head and empty.tail. In the new pose, the bone y axis must be parallel to the axis connecting the corresponding empty objects. We need a transformation $\mathtt{Q}$ (a rotation) which transforms the vector from the head to the tail so that it is parallel to the vector between the empty objects:

\[
\begin{aligned}
[bone.head_q, 1] &= [0, 0, 0, 1]\; \mathtt{Q} \cdot \mathtt{O}_{rest}(bone) \\
[bone.tail_q, 1] &= [0, bone.length, 0, 1]\; \mathtt{Q} \cdot \mathtt{O}_{rest}(bone).
\end{aligned} \tag{9.62}
\]

The following equality must be valid:

\[
\frac{bone.tail_q - bone.head_q}{\lVert bone.tail_q - bone.head_q \rVert} = \frac{empty.tail - empty.head}{\lVert empty.tail - empty.head \rVert}. \tag{9.63}
\]

Because the sought transformation is only a rotation, we can omit the size of the vectors and write:

\[
[0, 1, 0, 1]\; \mathtt{Q} = (empty.tail - empty.head) \cdot \mathtt{O}^{-1}_{rest}(bone). \tag{9.64}
\]

We compute the vector $\mathbf{x}$

\[
\mathbf{x} = (empty.tail - empty.head) \cdot \mathtt{O}^{-1}_{rest}(bone) \tag{9.65}
\]

and compute the quaternion $\mathbf{q}$ which defines the transformation $\mathtt{Q}$, with $\mathbf{u} = [0, 1, 0]$:

\[
\begin{aligned}
\mathbf{v} &= \frac{\mathbf{x}}{\lVert \mathbf{x} \rVert} \\
\mathbf{n} &= [0, 1, 0] \times \mathbf{v} \\
\theta &= \arccos(\mathbf{u} \cdot \mathbf{v}) \\
\mathbf{q} &= \left[ \cos\frac{\theta}{2},\; \mathbf{n} \sin\frac{\theta}{2} \right].
\end{aligned} \tag{9.66}
\]

We use this quaternion in the bone object's parameter to define the rotation. The quaternion object can be used directly with bones without any transformation to a rotation matrix. If the rotation should be relative to the locations of the empty objects, we use instead:

\[
\begin{aligned}
\mathbf{u} &= \frac{empty.tail_{rest} - empty.head_{rest}}{\lVert empty.tail_{rest} - empty.head_{rest} \rVert} \\
\mathbf{v} &= \frac{empty.tail - empty.head}{\lVert empty.tail - empty.head \rVert} \\
\mathbf{n} &= \mathbf{u} \times \mathbf{v},
\end{aligned} \tag{9.67}
\]

where $empty.tail_{rest}$ and $empty.head_{rest}$ are the locations of the empty objects in the first frame of the animation. We refer to the locations in the first frame as the default locations.

9.8 Idealizing the calibrated camera

Because we did not write our own visualization for fitting the model into the camera views, we must use the Blender camera views. The problem is that the intrinsic parameters of the Blender camera are ideal, so a real camera cannot be simulated in Blender exactly. We also need to separate the extrinsic parameters of the calibrated camera to be able to configure the Blender camera to the same view. We can decompose the first three columns of the camera projection matrix using the RQ decomposition as:

\[
\mathtt{P}_{1..3} = \mathtt{K} \cdot \mathtt{R} \tag{9.68}
\]

and get the translation vector by multiplying the inverse calibration matrix by the last column of the projection matrix:

\[
\mathbf{t} = -\mathtt{K}^{-1} \cdot \mathtt{P}_4. \tag{9.69}
\]

The camera center in the world coordinates is:

\[
\mathbf{C} = \mathtt{R}^\top \cdot \mathbf{t}. \tag{9.70}
\]

The camera center and the rotation matrix are enough to define the Blender camera location and orientation, but the intrinsic parameters of the camera cannot be obtained so easily. The general camera calibration matrix is:

\[
\mathtt{K} = \begin{bmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & * \end{bmatrix}. \tag{9.71}
\]

This differs from the matrix in equation (9.24). We omit the skew in the calibration matrix and estimate the parameters of the Blender

camera so that the final transformation fits the real camera parameters. We can compute the aspect ratio

\[
\frac{k_u}{k_v} = \frac{\mathtt{K}(1, 1)}{\mathtt{K}(2, 2)} \tag{9.72}
\]

and the focal length

\[
f = \mathtt{K}(1, 1)\, \frac{2}{\max(width, height)}. \tag{9.73}
\]

We cannot set the offset of the image pixel coordinate origin in the calibration matrix of the Blender camera. But we can shift the image origin to match the Blender camera calibration matrix. We compute the needed offset as:

\[
shift = \left[ \frac{width}{2}, \frac{height}{2} \right] - \mathtt{K}^{3}_{1..2}. \tag{9.74}
\]

The computed shift was up to 50 pixels in our data. It is apparent that without idealizing and shifting the images it is impossible to fit the object into the views using the Blender camera views.
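The decomposition (9.68) to (9.70) and the idealization (9.72) to (9.74) can be sketched in plain Python. This is a minimal illustration under our own assumptions, not the package code: the RQ decomposition is done by Gram-Schmidt orthogonalization of the rows from the bottom up, which yields a calibration matrix with a positive diagonal and does not resolve sign ambiguities, the camera center is computed with the transpose of the rotation, and the function names and the test camera are ours.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rq_decompose(M):
    """Split a 3x3 matrix into K (upper triangular) and R (orthonormal rows), M = K R."""
    m1, m2, m3 = M
    k33 = math.sqrt(dot(m3, m3))
    r3 = [x / k33 for x in m3]
    k23 = dot(m2, r3)
    rest2 = [x - k23 * y for x, y in zip(m2, r3)]
    k22 = math.sqrt(dot(rest2, rest2))
    r2 = [x / k22 for x in rest2]
    k13, k12 = dot(m1, r3), dot(m1, r2)
    rest1 = [x - k12 * y - k13 * z for x, y, z in zip(m1, r2, r3)]
    k11 = math.sqrt(dot(rest1, rest1))
    r1 = [x / k11 for x in rest1]
    return [[k11, k12, k13], [0.0, k22, k23], [0.0, 0.0, k33]], [r1, r2, r3]


def decompose_projection(P):
    """Return K, R and the camera center C of a 3x4 projection matrix."""
    K, R = rq_decompose([row[:3] for row in P])      # (9.68)
    p4 = [row[3] for row in P]
    # Back substitution for K y = P4, then t = -y as in (9.69).
    y3 = p4[2] / K[2][2]
    y2 = (p4[1] - K[1][2] * y3) / K[1][1]
    y1 = (p4[0] - K[0][1] * y2 - K[0][2] * y3) / K[0][0]
    t = [-y1, -y2, -y3]
    center = [sum(R[i][j] * t[i] for i in range(3)) for j in range(3)]  # (9.70)
    return K, R, center


def idealize(K, width, height):
    """Aspect ratio (9.72), focal length (9.73) and image shift (9.74)."""
    k = [[x / K[2][2] for x in row] for row in K]    # normalize so that K(3,3) = 1
    aspect = k[0][0] / k[1][1]
    focal = 2.0 * k[0][0] / max(width, height)
    shift = (width / 2.0 - k[0][2], height / 2.0 - k[1][2])
    return aspect, focal, shift


# Illustrative camera: rotation by 0.5 rad around z, center C = (1, 2, 3).
c, s = math.cos(0.5), math.sin(0.5)
R0 = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
K0 = [[700.0, 0.0, 250.0], [0.0, 720.0, 330.0], [0.0, 0.0, 1.0]]
C0 = [1.0, 2.0, 3.0]
Rt = [row + [-dot(row, C0)] for row in R0]           # [R, -RC]
P0 = [[dot(krow, col) for col in zip(*Rt)] for krow in K0]
K1, R1, C1 = decompose_projection(P0)
print(C1, idealize(K1, 640, 480))
```

For this illustrative camera the principal point lies 70 and 90 pixels away from the image center, so the images would have to be shifted by that amount before they can be matched with a Blender camera view.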

Figure 9.6: The schema of the Blender bone structure and transformations, describing how the armatures in Blender work. This figure was obtained from the Blender web site [1].

Chapter 10

The package usage

In this chapter we show how to use the software package. We give a quick tour of creating an animation and show the available possibilities. We use Blender in interactive mode with our own user interface because of the need for debugging and testing of the Python scripts. It is easy to write your own Python scripts to automate the whole process instead of using the user interface and clicking on buttons. However, using the buttons is more instructive for learning and understanding the usage and the functions of our package. It also helps to understand the procedure of creating an animation. Unfortunately, this requires at least a minimum knowledge of Blender usage. We expect these inputs to be already available:

• A MakeHuman mesh exported into an *.obj file.
• Captured images from calibrated video together with decomposed and idealized camera configurations in *.cam files. For this, see Section 9.8 in Chapter 9. For the *.cam file format, see Section 8.3 in Chapter 8.
• The BVH motion capture data. Note that some BVH files may not be compatible with our scripts.

If you run Blender with the parameters given in Section 8.5 in Chapter 8, you will get the screen shown in Figure 10.1. The functions of our scripts can be accessed through the buttons in the right panel. The objects in the Blender scene must be selected by a right mouse button click before an operation is chosen. An operation on scene objects can be, for example, attaching the mesh to the skeleton or copying a BVH animation.

Figure 10.1: The Blender window after the start of run.py. The buttons with our functions are on the right (import and export, configuration and export of cameras, mesh import and animation etc.).

10.1 Importing mesh and attaching to the skeleton

It is easy to import any mesh. Choose the Import MH Mesh button and select the proper file to import. The imported mesh should appear. The window may need to be redrawn after the import; you can resize the window or zoom it in order to redraw it. The mesh should also be selected (highlighted). If it is not, it must be selected by clicking with the right mouse button. Right after the import, the mesh is bigger than the skeleton we generate and has a different orientation. The Fit to Skeleton button must be used in order to align the mesh with the skeleton. The skeleton can now be generated using the Generate Skeleton button. The window may need to be redrawn again.

Now the mesh can be attached to the skeleton. Both the mesh and the skeleton object have to be selected. This can be done by holding the Shift key and clicking with the right mouse button on the objects. The subwindow should look like Figure 10.2. Then the Attach Skeleton to Mesh button can be used. The mesh is now attached to the skeleton. To test the result, press Ctrl+Tab to switch to the pose mode. Select a bone with the right mouse button and press the R key. The bone will start rotating. Switching back to the object mode is done again by Ctrl+Tab. After that, the mesh model is prepared for articulation and can be fitted into camera

views in order to cover the model with textures.

Figure 10.2: The Blender window with the skeleton and the mesh object selected.

10.2 Fitting the model into views and texturing

We must admit that it is practically impossible to get precise results by manually fitting the model into a specific pose in the camera views. Nevertheless, we did not write any function for automatic fitting. We can roughly fit the model onto the desired object in the camera views using the Blender interface. We can move, scale and rotate our model, and the bones as well, in order to match the images. The fitting process is the following.

1. Load the images into Blender. Switch the 3D View window to the UV/Image editor window by selecting an icon from the menu (the first icon in the list). From the menu choose Image and Open to load the image into Blender. Then switch the window back to the 3D View.
2. After loading all the images we can set up the cameras in the scene. From the script window choose the button Load and Setup Camera and select the proper *.cam file.
3. Then, from the 3D View window panel, choose View and Camera to get the view from the current camera.

4. Enable the background image in the camera view from the panel by choosing View and then Background image. Enable it and select the image that corresponds to the current camera.
5. Split the window by clicking the middle mouse button on the window border and selecting Split area. In one of the new windows, unlock the camera view by clicking the lock icon (the icon must be unlocked); in the second window, lock the view with the same icon.
6. Load the next camera: select it with the Get next camera button.
7. Continue from the beginning: set the view to the camera view and set the background image.

All the cameras can be processed in the way listed above. Running another script swaps the active script in the script window, so our script can disappear. It can be recovered by choosing our script from the scripts list on the script window's panel.

After configuring the cameras in Blender, we can fit our model into the views using rotation (the R key), moving (the G key) and scaling (the S key) in both object and pose mode. Before attaching images as textures, we need to export the camera projection matrices using Export Camera Projection Matrix. These matrices define the projections of our Blender cameras to the images. We then select an image in the UV/Image editor window and assign the projection matrix with the Assign P matrix to Image button. Finally, we attach the images as textures with the Attach Images As Texture button. The model is then textured from the Blender camera views using the loaded images. You can see the results in Figure 6.1 on page 26.

10.3 Loading animation, rendering and animation data export

We load the BVH files using Blender's import script for BVH motion capture data. It can be accessed through the Blender menu by selecting File, Import, Motion Capture (.bvh); we recommend setting the script scale parameter to 0.1 for better visualization. This script imports the BVH data and creates empty objects connected in the same hierarchy as the BVH data joints. The script sometimes imports joint names incorrectly. We advise checking the empty object names and correcting them if needed (this can be done by pressing the N key and rewriting the name). If the armature object is selected, the Copy BVH movement button can be pressed to copy the imported animation. It is important to set the start and the end frame in the Buttons window in the anim scene panel.
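Because the import script can garble joint names, it may help to list the joint names directly from the BVH file and compare them with the names expected by the bone-to-joint correspondences. The following is a minimal sketch that runs outside Blender. The function names, the sample file content and the `CORRESPONDENCE` dictionary are illustrative assumptions and are not taken from the package; the sketch relies only on the fact that the HIERARCHY part of a BVH file declares joints with the keywords ROOT and JOINT.

```python
import re


def bvh_joint_names(bvh_text):
    """Return the joint names declared in the HIERARCHY part of a BVH file, in file order."""
    names = []
    for line in bvh_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("MOTION"):
            break  # joint declarations end where the motion data begins
        match = re.match(r"(ROOT|JOINT)\s+(\S+)", stripped)
        if match:
            names.append(match.group(2))
    return names


def unmatched_joints(correspondence, joint_names):
    """Return joints referenced by the bone-to-joint dictionary but absent from the file."""
    present = set(joint_names)
    return sorted(j for j in correspondence.values() if j not in present)


# Tiny hand-written BVH fragment used only for illustration.
SAMPLE = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT LeftUpLeg
  {
    OFFSET 3.0 0.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 -15.0 0.0
    }
  }
}
MOTION
Frames: 1
Frame Time: 0.033333
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
"""

# Hypothetical bone -> BVH joint mapping; the dictionary used by the package may differ.
CORRESPONDENCE = {"sacroiliac": "Hips", "l_hip": "LeftUpLeg", "r_hip": "RightUpLeg"}

names = bvh_joint_names(SAMPLE)
print(names)
print(unmatched_joints(CORRESPONDENCE, names))
```

Any joint reported by `unmatched_joints` is a candidate for renaming the corresponding empty object (the N key) before the animation is copied.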

To render the animation for each camera, the Render for each camera button can be used. The Export animation button can be used for exporting the vertex locations through the frames as well as the bone poses. All other Blender features, such as lights, diffuse rendering and shadow casting, can also be used to get better results. These features are well documented in the Blender Manual [2].
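One way to turn the exported data into 2D ground truth is to project the exported vertex locations into each camera with its projection matrix. The sketch below is an illustration under stated assumptions, not the package's export code: it assumes the vertex locations are available as one list of (x, y, z) tuples per frame and that P is a 3x4 projection matrix in the usual computer vision convention. The actual export file format is not reproduced here, and the camera values are hypothetical.

```python
def project_point(P, X):
    """Project a 3D point X = (x, y, z) to image coordinates with a 3x4 matrix P."""
    Xh = (X[0], X[1], X[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    if w == 0:
        raise ValueError("point lies in the principal plane of the camera")
    return (u / w, v / w)


def project_frames(P, frames):
    """Project every vertex of every frame; returns one list of (u, v) pairs per frame."""
    return [[project_point(P, X) for X in vertices] for vertices in frames]


# Hypothetical camera: focal length 800 px, principal point (320, 240),
# placed at the world origin and looking along the +Z axis.
P = [
    [800.0, 0.0, 320.0, 0.0],
    [0.0, 800.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Hypothetical exported vertex locations: two frames with two vertices each.
frames = [
    [(0.0, 0.0, 4.0), (0.5, -0.25, 4.0)],
    [(0.1, 0.0, 4.0), (0.6, -0.25, 4.0)],
]

tracks = project_frames(P, frames)
for frame_index, points in enumerate(tracks):
    print(frame_index, points)
```

A vertex on the optical axis projects to the principal point, which gives a quick sanity check when comparing the projected tracks with the rendered images.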

Chapter 11 Results and conclusions

We extended Blender functionality for the needs of computer vision. We created a general skeleton model which fits mesh models created by the MakeHuman software. We defined the joint locations and the level of articulation. We followed the H-Anim standard and adapted its recommendations for use with BVH motion capture data. We use 22 virtual bones, but only 19 of them are intended for articulation. The skeleton model is shown in Figure 4.1 (page 18). We created the articulated model from the MakeHuman mesh model and our general skeleton model. We automatically attached the mesh vertices to the bones using our own algorithm. Articulated models in new poses are shown in Figure 5.5 (on page 24).

We set up a calibrated multi-camera system for acquiring the images for textures. The acquired images were idealized to match the Blender camera model, which has its principal axis in the center of the image. We manually fitted the articulated model into the camera views using the Blender interface. We then mapped the textures onto the mesh faces, taking into account the visibility of the faces in the camera views. The textured articulated model is shown in Figure 6.2 (on page 31).

For animation we used motion capture data stored in the BVH format. We imported the data into Blender and copied the animation using a dictionary of correspondences between our skeleton model bones and the motion data joints. The animation example is shown in Figure 7.1 (page 36). We can generate a wide variety of real human animations from motion capture data.

11.1 Conclusions

We developed a software package that can be used for quickly creating animations. We can use the outputs of a multi-camera system to adapt our models to real observations. Most of the tasks in the process can be done automatically or can be fully automated by scripting. We used open source software which is free to use. Our models cannot compete with virtual actors used in movies, but they are easy to create and animate.

The main goal was to create a tool by which we can extend the robustness of our tracking algorithms. The outputs can be used for learning human silhouettes from different views, fitting the articulated model into observations, or just testing the algorithms on ground-truth data. The developed software can be used anywhere simple realistic human animations are needed. Using the modeling software Blender, we can extend the animated scene with other objects. We developed a platform which can be extended and improved by more advanced algorithms for texturing or model creation.

We covered the steps of creating the articulated model animations. We described the needed transformations between computer vision and computer graphics. The steps that we used when creating an animation of the articulated model are:
• Obtaining the model representation (mesh model)
• Defining the virtual skeleton
• Setting the textures
• Animation using motion capture data

The reuse of the motion capture data with our model is not easy. However, the motion data allow us to create animations with realistic human motions. The model limbs can penetrate each other during the animation. This error is caused by improper interpretation of the motion data. We did not use any constraints for bone articulation, which could help to keep the bones in valid poses.

We can recommend the following for further development:
• The mesh model should be adapted to the real object captured by the cameras, so that the shape of the model matches the object's shape. The texturing of the model can then be done with less distortion caused by over- or under-fitting the model to the object.
• The H-Anim standard corresponds with human anatomy. The human body feature points defined by H-Anim could be used for automatic model fitting into views.
• It would be better to use our own captured motion data with exactly defined joint locations on the human body. Constraints for joint rotations should also be used.

h-anim. Prague. creation January 2006. Czech Republic. Cambridge.makehuman. 2nd edition.python. Computer Vision and Image Understanding. CTU– CMP–2006–02. Department of Cybernetics. Fua. [2] Blender 2005. [10] Python. Broszio. Design of realistic articulated r y as human model and its animation. Research Report K333–21/06. Cambridge University. 99(2):189–209. Herda.. In Proceedings of the International Workshop on Stereoscopic and Three Dimensional Imaging. 68 . Faculty of Electrical Engineering Czech Technical University. Beijing. ISO/IEC FCD 19774:200x. Lepetit. 2003. Blender online wikipedia documentation. V.blender. Interpreted. Fua. The human model parametric modeling application. http://www. Mapping texture from multiple camera views onto 3D-Object models for computer animation. objective-oriented programming language. Niem and H. China. Hierarchical implicit surface joint limits for human body tracking. Multiple view geometry in computer October 2005. [7] Makehuman (c). http://www. Urtasun. animation. [8] Ondˇej Mazan´ and Tom´ˇ Svoboda. Human body pose recognition using spatio-temporal templates. [4] Humanoid animation (h-anim) standard. http://www. 1995. [5] Richard Hartley and Andrew Zisserman. [6] L.Bibliography [1] Blender. Open Source 3D Graphics modelling. and P.blender. and [3] M. Dimitrijevic. In ICCV workshop on Modeling People and Human Interaction. http://mediawiki. [9] W. http://www.

In International Conference in Computer Vision. and Jon Leach (ed). ICCV. [13] R. USA. Fleet. Fua. Kurt Akeley. IEEE Computer Society. The OpenGL Graphics System: A Specification. Urtasun. SGI. In Third International Symposium on 3D Data Processing. Los Alamitos. pages 915–922. June 2006. [12] Jonathan Starck and Adrian Hilton. volume 02.BIBLIOGRAPHY 69 [11] Mark Segal. J. and Jiˇ´ Matas. and P. A. Piscataway. Model-based multiple view reconstruction of people. Priors for people tracking from small training sets. D. IEEE Computer Society. USA. Hertzmann. October 2005. In International Conference on Computer Vision. Visualization and Transmission (3DPVT). 2003. CA. University of North Carolina. pages 403–410. [14] Karel Zimmermann. Multiview 3D as rı tracking with an incrementally constructed 3D model. 2004. . Tom´ˇ Svoboda.
