Abstract: A videogrammetric technique is proposed for measuring three-dimensional structural vibration response in the laboratory. The
technique is based on the principles of close-range digital photogrammetry and computer vision. Two commercial-grade digital video
cameras are used for image acquisition. To calibrate these cameras and to overcome potential lens distortion problems, an innovative
two-step calibration process including individual and stereo calibration is proposed. These calibrations are efficiently done using a planar
pattern arbitrarily shown at a few different orientations. This special characteristic makes it possible to perform an on-site calibration that
provides flexibility in terms of using different camera settings to suit various application conditions. To validate the proposed technique,
three tests, including sinusoidal motion of a point, wind tunnel test of a cross-section bridge model, and a three-story building model under
earthquake excitation, are performed. Results indicate that the proposed videogrammetric technique can provide fairly accurate displace-
ment measurement for all three tests. The proposed technique is shown to be a good complement to the traditional sensors for measuring
two- or three-dimensional vibration response in the low-frequency range.
DOI: 10.1061/(ASCE)0733-9399(2007)133:6(656)
CE Database subject headings: Photogrammetry; Vibration; Dynamic response; Measurement; Imaging techniques.
between the global coordinate XYZ system and the camera coor-
dinate xyz system. This projection matrix can be expressed as
$$P = KR[I \mid t] \tag{2}$$
The camera calibration matrix $K$ is of the form
$$K = \begin{bmatrix} \alpha_x & s & u_0 \\ 0 & \alpha_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{3}$$
where $\alpha_x$ and $\alpha_y$ = focal lengths of the camera in terms of pixel dimensions in the $u$- and $v$-directions, respectively; $s$ = skew parameter; and $u_0$ and $v_0$ = coordinates of the principal point in terms of pixel dimensions. In Eq. (2), $I$ = 3 × 3 identity matrix; $R$ = 3 × 3 rotation matrix; and $t$ = 3 × 1 translation vector, representing the orientation and the translation between the camera coordinate system and the global coordinate system, respectively.

Fig. 2. (a) Plane pattern used in camera calibration; (b) plane-based camera calibration
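The projection of Eqs. (2) and (3) can be illustrated with a minimal NumPy sketch. The function names and all numerical values below are hypothetical and chosen only for illustration; they are not the parameters of the cameras used in this study.

```python
import numpy as np

def calibration_matrix(alpha_x, alpha_y, s, u0, v0):
    """Intrinsic matrix K in the form of Eq. (3)."""
    return np.array([[alpha_x, s, u0],
                     [0.0, alpha_y, v0],
                     [0.0, 0.0, 1.0]])

def projection_matrix(K, R, t):
    """Projection matrix P = K R [I | t] in the form of Eq. (2)."""
    I_t = np.hstack([np.eye(3), np.asarray(t, dtype=float).reshape(3, 1)])
    return K @ R @ I_t

def project(P, X):
    """Project a 3D point (X, Y, Z) to pixel coordinates (u, v)."""
    m = P @ np.append(np.asarray(X, dtype=float), 1.0)  # homogeneous image point
    return m[:2] / m[2]                                  # divide by the scale factor

# Assumed values: 1,000-pixel focal lengths, no skew, principal point at the
# center of a 1,280 x 720 image, no rotation, 2 m offset along the optical axis.
K = calibration_matrix(1000.0, 1000.0, 0.0, 640.0, 360.0)
R = np.eye(3)
t = [0.0, 0.0, 2.0]
P = projection_matrix(K, R, t)
uv = project(P, [0.1, -0.05, 0.0])
```

With these assumed values, a point on the optical axis projects to the principal point, which gives a quick sanity check on the sign and ordering conventions.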
In total, there are five intrinsic parameters ($\alpha_x$, $\alpha_y$, $s$, $u_0$, and $v_0$) and six extrinsic parameters (three Euler rotation angles and three translation parameters). To determine these 11 parameters, it is necessary to provide at least six pairs of correspondence between nonplanar 3D points and their 2D images (Hartley and Zisserman 2003).

For metric cameras, off-site calibration methods using special equipment such as a multicollimator or goniometer are popular but time-consuming and expensive (Mikhail et al. 2001). For modern close-range photogrammetry using nonmetric cameras, on-site calibration becomes attractive since some comprehensive and easy-to-use analytic techniques have been developed (Gruen and Huang 2001). These techniques can be roughly classified into two categories: photogrammetric calibration and self-calibration. For photogrammetric calibration, a calibration object with precisely known geometry in the 3D space is required. For self-calibration, bundle triangulation is performed using a network of highly convergent overlapping images. In this study, a plane-based camera calibration method (Tsai 1987) that requires only 2D control information is adopted. This technique is more flexible than the photogrammetric calibration and more robust when compared to self-calibration (Zhang 1998). The technique uses a planar calibration pattern such as the one shown in Fig. 2(a). The pattern consists of 30 mm × 30 mm black and white squares printed from a laser printer with a resolution of 1,200 dots per in.

Assume that $n$ images of the pattern shown at different orientations are acquired under a fixed focal length [Fig. 2(b)]. The rotation matrices and the translation vectors for these $n$ images are expressed as $R_i$ and $t_i$, $i = 1, 2, \ldots, n$, respectively. On each of these images, $l$ corner points are selected for calibration. Assuming the global coordinate system is fixed on and rotated with the pattern, these $l$ corner points would have the same coordinates on the $n$ images, $M_j = (X_j, Y_j, 0, 1)^T$, $j = 1, 2, \ldots, l$. Also, since the focal length of the camera is fixed during the image acquisition, the intrinsic matrix $K$ is a constant matrix. The projected image coordinates of the $j$th corner point on the $i$th image from the pinhole model, $m_{ij} = (u_{ij}, v_{ij}, 1)^T$, can be obtained from

$$m_{ij} = KR_i[I \mid t_i]M_j \tag{4}$$

The camera parameters can then be obtained through the following optimization function:

$$\text{Minimize} \sum_{i=1}^{n} \sum_{j=1}^{l} \|\hat{m}_{ij} - m_{ij}\|^2 \tag{5}$$

where $\hat{m}_{ij}$ = actual observed image coordinates for the $j$th control point on the $i$th image. The objective of optimization is to determine a total of $5 + 6n$ parameters (including five intrinsic parameters and six extrinsic parameters for each image) that minimize the sum of squared distances between the projected image points $m_{ij}$ and the actual observed image points $\hat{m}_{ij}$. This optimization problem can be initialized at the algebraic solution suggested by Zhang (1998) for fast convergence.

Compared to metric cameras, commercial cameras might suffer from a few problems that could affect their accuracy when used for photogrammetric applications. One notable problem is associated with lens distortion. Denote the distorted image coordinates of a point as $(\tilde{u}, \tilde{v}, 1)$. The corresponding ideal (or distortion-free) image coordinates $(u, v, 1)$ can be expressed as (Sturm and Maybank 1999)
$$u = \tilde{u} + \bar{u}(k_1 r^2 + k_2 r^4) + k_3(r^2 + 2\bar{u}^2) + 2k_4\bar{u}\bar{v}$$
$$v = \tilde{v} + \bar{v}(k_1 r^2 + k_2 r^4) + k_4(r^2 + 2\bar{v}^2) + 2k_3\bar{u}\bar{v} \tag{6a}$$
$$\bar{u} = \tilde{u} - u_0$$
$$\bar{v} = \tilde{v} - v_0 \tag{6b}$$
$$r^2 = \bar{u}^2 + \bar{v}^2 \tag{6c}$$

where $k_1$ and $k_2$ = coefficients of radial lens distortion; and $k_3$ and $k_4$ = coefficients of decentering lens distortion. To account for the lens distortion problem, the optimization function of Eq. (5) should be extended to include these distortion coefficients

$$\text{Minimize} \sum_{i=1}^{n} \sum_{j=1}^{l} \|\hat{m}_{ij} - \tilde{m}_{ij}\|^2 \tag{7}$$

where $\tilde{m}_{ij}$ = projection of point $M_j$ on the $i$th image according to Eq. (4), followed by distortion according to Eq. (6).

Stereo Calibration

After the two cameras are calibrated individually, all the parameters of the two cameras should be simultaneously optimized based on the epipolar geometry principle (Hartley and Zisserman 2003), which is referred to herein as the stereo calibration. In the following, Superscripts 1 and 2 are used to indicate those parameters associated with the first and the second cameras, respectively. As illustrated in Fig. 3(a), the cameras are indicated by their optical centers $C^1$ and $C^2$. According to the epipolar geometry principle, the two optical centers of the cameras, the point $M$, and its image projection points $m^1$ and $m^2$ should lie on the so-called epipolar plane. This condition unfortunately cannot be guaranteed after individual camera calibration due to imperfection of projection. As a result, the two projective rays defined by the image points and the corresponding camera optical centers cannot converge and meet at point $M$ in the 3D space as shown in Fig. 3(b). This problem arises from errors in the estimated camera parameters. It is possible to minimize this geometric error shown in Fig. 3(c) by performing another optimization for all the camera parameters (Hartley and Zisserman 2003)

$$\text{Minimize} \sum_{i=1}^{n} \sum_{j=1}^{l} \sum_{s=1}^{2} \|\hat{m}_{ij}^{s} - \tilde{m}_{ij}^{s}\|^2 \tag{8}$$

In this optimization, the geometric relationship of the two cameras is established through a relative orientation matrix and a relative translation vector to reduce the number of parameters for optimization.

Fig. 3. Epipolar condition and imperfect projection for the two cameras

Fig. 4. Target point extraction

Target Point Tracking

Measuring displacement at a selected location on an object is realized through the 3D tracking of a target point attached to this location. This target needs to be visible on every frame of the acquired image sequences. Fig. 4(a) shows a target that consists of four 30 mm × 30 mm black and white squares attached on a metal block that is then fixed on a shake table. The intersection point of the four squares is used as a target point whose trajectory will be tracked from the two sequences of images acquired from the two cameras. To automatically track this target point in the image sequences, a tracking algorithm consisting of the following three steps is used: (1) detecting candidate points from the image sequence using an image morphology technique (Gonzalez et al. 2004); (2) identifying most-likely points by a correspondence technique; and (3) locating the target point using the Harris corner detection technique (Harris and Stephens 1988). Fig. 4(c) shows the binary image skeleton of the zoomed-in target in Fig. 4(b) using the image morphology technique. The intersections of the skeletons are identified as candidate points. On the first image of the sequence, a point that is visibly close to the target point is manually selected as a reference point. Most-likely points on the subsequent images are then identified using a correspondence technique. This technique calculates the following correlation coefficient $k_{ab}$ between the $r \times c$ pixel rectangular Mask a and Mask b
$$k_{ab} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} [f_a(i,j) - \bar{f}_a][f_b(i,j) - \bar{f}_b]}{\sqrt{\sum_{i=1}^{r} \sum_{j=1}^{c} [f_a(i,j) - \bar{f}_a]^2 \times \sum_{i=1}^{r} \sum_{j=1}^{c} [f_b(i,j) - \bar{f}_b]^2}} \tag{9}$$

where $f_a(i,j)$ and $f_b(i,j)$ = gray level for the pixel $(i,j)$ in Mask a and Mask b, respectively; and $\bar{f}_a$ and $\bar{f}_b$ denote the mean gray level of Mask a and Mask b, respectively. Among the candidate points on a subsequent image, the most-likely point is the one that gives the largest correlation coefficient. After all most-likely points are identified from the image sequence, the Harris corner detection technique (Harris and Stephens 1988) is applied on the image sequences using these most-likely points as the initial conditions. The Harris corner detection technique can locate corner points in an image with subpixel accuracy as shown in Fig. 4(d).

Reconstruction of 3D Point

After all the camera parameters and the target points on the two image sequences are determined, the next task is to reconstruct the 3D coordinates of the target points from the two image sequences captured by the two cameras. This coordinate reconstruction is done using a nonlinear triangulation method (Hartley and Zisserman 2003). For the 3D point $M = (X, Y, Z, 1)^T$, its undistorted image coordinates from Cameras 1 and 2 can be estimated from the observed distorted image coordinates using Eq. (6) and are expressed as $(u_1, v_1, 1)^T$ and $(u_2, v_2, 1)^T$, respectively. The reprojection from the two cameras can be expressed as the following equations:

$$\lambda_1 [u_1, v_1, 1]^T = P^1 M$$
$$\lambda_2 [u_2, v_2, 1]^T = P^2 M \tag{10}$$

Eq. (10) can be rewritten as the following equations:

$$AM = 0 \tag{11a}$$
$$A = \begin{bmatrix} p^1_1 - u_1 p^1_3 \\ p^1_2 - v_1 p^1_3 \\ p^2_1 - u_2 p^2_3 \\ p^2_2 - v_2 p^2_3 \end{bmatrix} \tag{11b}$$

where $p^j_i$ denotes the $i$th row vector of the $j$th camera's projective matrix. This is a set of four homogeneous equations with only [...] linear triangulation, which does not take into account the geometric error associated with the epipolar constraint. To minimize the geometric error, the following nonlinear optimization should be performed:

$$\text{Minimize} \sum_{i=1}^{2} \|m_i - \hat{m}_i\|^2 = \left(u_1 - \frac{p^1_1 M}{p^1_3 M}\right)^2 + \left(v_1 - \frac{p^1_2 M}{p^1_3 M}\right)^2 + \left(u_2 - \frac{p^2_1 M}{p^2_3 M}\right)^2 + \left(v_2 - \frac{p^2_2 M}{p^2_3 M}\right)^2 \tag{12}$$

The corrected 3D coordinates are the ones that minimize this objective function. In the above optimization, the 3D coordinates obtained from the linear triangulation can be used as the initial value.

Experimental Studies

To evaluate the accuracy of the proposed videogrammetric technique, three dynamic experiments were performed. Two digital video cameras were used for image sequence acquisition. The cameras came with a 1/3 in. 1.18 million-pixel progressive CCD and could record high-definition images with a pixel resolution of 1,280 × 720 at 29.97 fps. They were equipped with 10× optical and 200× digital zoom with a lens of F1.8 and focal length ranging between 5.2 and 52 mm. The two cameras were calibrated prior to each experiment using the planar pattern as shown in Fig. 2(a). A laser pointer was used to synchronize the two cameras during the experiments. The two sequences of images recorded by the two cameras were manually synchronized at the images on which the laser dot was first observed.

Experiment 1: Measurement of Harmonic Motion

In this experiment, one target was fixed on a shake table that could be programmed to move in one or two directions with an arbitrary amplitude and frequency. The two cameras were placed at about 1.8 m away from the target and stood at about 1.5 m above the ground (see Fig. 6). The angle between the two cameras was about 30°. For individual and stereo calibration, the calibration pattern was placed in front of the target. Seven pairs of images were recorded by the two cameras under a fixed focal length. Each image pair corresponded to a different orientation of the
15.6 −0.18 0.33 −0.25 0.46 −0.21 0.58 −0.26 0.61 0.27 1.24
20.8 −0.12 0.31 −0.19 0.36 −0.22 0.51 0.18 0.62 0.26 1.29
Note: SD = standard deviation.
for a vibration frequency of 0.5 Hz and between 0.26 and 0.31 mm for a vibration frequency of 5.0 Hz. The standard deviations of measurement errors, however, increase from about 0.4 mm for 0.5 Hz to about 1.4 mm for 5.0 Hz. The increase of standard deviations is possibly due to the following two reasons: (1) increasing vibration frequency implies a higher target speed that results in more blurry images and larger measurement variation and (2) increasing vibration frequency leads to more pronounced frame correspondence error from the image synchronization. These standard deviations provide an indication of the resolution of this videogrammetric measurement technique for the frequency range of 0.5–5.0 Hz.

Next, the shake table was programmed to move in a 2D circular motion with amplitude of 100 mm and frequency of 1 Hz. Fig. 9 shows the results of videogrammetric measurement as compared to the shake table output. It is seen that the videogrammetric technique is able to track this circular motion quite closely. The largest discrepancy is less than 1 mm on the X–Y plane and less than 0.5 mm along the vertical direction.

Fig. 9. Videogrammetric tracking of a 2D circular motion

The results obtained above were calculated using a Pentium-4 2.6 GHz personal computer with 1.5 GB RAM. The processing time for the two-step calibration of the two cameras using seven pairs of planar pattern images was about 4 min. The time for tracking a target point was about 0.6 s per image frame per camera. The time required for coordinate reconstruction of a 3D point was on the order of 1 ms. These computational time requirements indicate that the proposed videogrammetric technique still cannot be used for real-time measurement.

Experiment 2: Measurement of Bridge Section Model in Wind Tunnel

Fig. 10(a) shows a wind tunnel setup for a bridge sectional model test. The sectional model consisted of two wooden box girders connected by cross bars. The model was rigidly connected to circular rotatable shafts. The shafts were then attached to rigid rectangular bars that were supported by four linear springs to provide necessary stiffness for the section model. Two laser LVDTs were positioned at locations as shown in Fig. 10(b) to measure the displacement time histories during the test. Two targets were placed at locations as shown to validate the applicability of the videogrammetric technique for such a test. The cameras were placed at about 1.2 m from the targets and the angle between the two cameras was about 30°. The focal lengths of the two cameras were both set at 15.6 mm. The model was pulled by a string and released to vibrate under a mild and steady wind condition. As the supporting rectangular bars were rigid, the readings from the laser LVDTs could be linearly scaled to the deformations at the two target locations for comparison. Fig. 11 shows the comparison between the videogrammetric measurement and the LVDT measurement at the two target locations. The close match between the two measured results indicates that the videogrammetric technique is able to track the two targets quite accurately.

Fig. 10. (a) Wind tunnel test for a bridge section model; (b) layout of targets and LVDTs

Fig. 11. Videogrammetric measured responses at Targets A and B

Experiment 3: Measurement of a Three-Story Model under Earthquake Excitation

In this test, the videogrammetric technique was used to measure displacement time histories at a few selected points on a three-story structural model under earthquake excitation. The model was made out of aluminum members with a uniform story height of 0.38 m and a total height of 1.21 m. Four targets, A, B, C, and D, were attached to the base and the three stories as shown in Fig. 12. Four laser LVDTs were set up to measure the displacement time histories for the shake table and the three stories. The setup of the two cameras was the same as that shown in Fig. 6. To excite the model, the NS component of the 1940 El Centro earthquake ground displacement record was input to the shake table. Fig. 13 shows the displacement responses for these four targets, respectively. It is again seen that the displacement time histories measured by the videogrammetric technique match closely with those measured by the LVDTs.
Concluding Remarks