not receive any speckle, its corresponding region B′A′ on the imaging plane receives no speckle either, and a shadow is formed. From similar triangles we get d/d′ = Z_o/f and d/b = (Z_b − Z_o)/Z_b. Hence the offset d′ of the shadow is

d′ = b f (1/Z_o − 1/Z_b)    (3)

Examples of shadow are shown in Figs. 5(b) and 6(d) and further analysed in Section III-A.
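As a quick numerical illustration of Equation (3), the sketch below evaluates the shadow offset; the baseline b ≈ 75 mm and focal length f ≈ 580 px are commonly quoted Kinect v1 figures assumed here for illustration, not values taken from this paper.

```python
import numpy as np

def shadow_offset(b, f, z_o, z_b):
    """Shadow offset d' = b * f * (1/Z_o - 1/Z_b), Equation (3).

    b   : baseline between the IR emitter and the IR camera (metres)
    f   : focal length of the IR camera (pixels), so d' is in pixels
    z_o : depth of the occluding object (metres)
    z_b : depth of the background (metres)
    """
    return b * f * (1.0 / z_o - 1.0 / z_b)

# The shadow narrows as the object recedes towards the background:
# b ~ 75 mm and f ~ 580 px are assumed, commonly quoted Kinect v1 figures.
for z_o in (0.8, 1.2, 1.6):
    print(f"Z_o = {z_o} m -> d' = {shadow_offset(0.075, 580.0, z_o, 2.0):.1f} px")
```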
B. Empirical Models

The above models are based on the geometry of the sensor and its intrinsic parameters. In contrast, empirical models ([4], [9]) directly correlate the observations from several experiments and fit models with two or more degrees of freedom. We summarize the empirical models in Table II.
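To illustrate how such empirical models are obtained (this is a sketch of the fitting process, not one of the surveyed models), the following fits a two-parameter quadratic model to hypothetical axial-noise measurements:

```python
import numpy as np

# Hypothetical (depth, noise-SD) pairs; studies such as [4], [9] collect
# these from planar targets placed at a series of known distances.
z = np.array([0.8, 1.2, 1.6, 2.0, 2.4, 2.8])       # depth (m)
sigma = np.array([1.5, 2.1, 3.4, 5.2, 7.6, 10.3])  # measured noise SD (mm)

# Fit sigma(z) = a*z^2 + c, a model with two degrees of freedom, mirroring
# the quadratic growth of axial noise with depth.
A = np.column_stack([z**2, np.ones_like(z)])
(a, c), *_ = np.linalg.lstsq(A, sigma, rcond=None)
print(f"sigma(z) ~= {a:.2f} * z^2 + {c:.2f}  (mm, z in m)")
```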
C. Statistical Noise Models

The following two statistical noise models have been proposed based on the pin-hole camera model of Kinect (Section II-A).

Spatial Uncertainty Model: In this model Park et al. [4] show that for an image pixel (u, v), the corresponding 3D point (x, y, z) occurs within an uncertainty ellipsoid E. The major axis of E is aligned along the direction of measurement (Z) and has length proportional to Z². In contrast, the minor axes (along X and Y) have lengths proportional to Z. The former accounts for the quadratic Axial Noise while the latter justifies the linear Lateral Noise (Fig. 3).
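This scaling behaviour can be made concrete with a small sketch; the proportionality constants below are arbitrary placeholders, not the calibrated parameters of [4].

```python
import numpy as np

def uncertainty_ellipsoid_axes(z, k_axial=1.4e-3, k_lateral=0.8e-3):
    """Semi-axes of the uncertainty ellipsoid E around a point at depth z:
    the axial (major) axis grows as z^2, the lateral (minor) axes as z.
    k_axial and k_lateral are illustrative constants, not from [4]."""
    return k_axial * z**2, k_lateral * z

for z in (1.0, 2.0, 4.0):  # doubling z quadruples axial, doubles lateral
    sz, sxy = uncertainty_ellipsoid_axes(z)
    print(f"Z = {z:.0f} m: axial {sz * 1000:.1f} mm, lateral {sxy * 1000:.1f} mm")
```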
• Zero Depth (ZD) Value: Kinect fails to estimate depth. This is akin to the NaN (Not-a-Number) [29] of computing. Under a flag, the Kinect for Windows SDK can emit specific NaN values (Fig. 1) for object points that are too near or too far, or for which the depth is unknown. By default, all these are ZD (0) values. Note that a depth value is marked as unknown if it is in the range of Kinect (neither too near nor too far) yet the depth cannot be estimated (because of high surface discontinuity, specularity, shadow, or other reasons).
• Wrong Depth (WD) Value: When Kinect reports a non-zero depth value, its accuracy depends on the depth itself. WD relates to Depth Accuracy (Definition 4).
• Unstable Depth (UD) Value: Kinect reports a non-zero depth value, but that value changes over time even when there is no change of depth in the scene. UD is temporal in nature and relates to Depth Precision (Definition 6).

We summarize the noise characteristics in Table III and discuss their salient properties in the following sections.

A. Spatial Noise

All kinds of noise observable within a single depth frame are spatial in nature. Four parameters, namely, Object Distance, Imaging Geometry, Surface/Medium Property, and Sensor Technology (Table III), control spatial noise.
1) Object Distance: The distance between the object and Kinect significantly influences the following kinds of noise.
Out-of-Range Noise: This is due to the limited depth range of Kinect (Table I). Depth images in Fig. 1(b) and (c) illustrate too near and too far types of out-of-range data.
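As a minimal sketch of separating ZD and out-of-range readings in a raw depth frame: the 800–4000 mm limits below merely stand in for the ranges of Table I (not reproduced here), and a millimetre-valued frame with 0 marking ZD is assumed, as delivered by common Kinect v1 drivers.

```python
import numpy as np

def classify_depth(frame_mm, near_mm=800, far_mm=4000):
    """Split a raw depth frame into ZD, out-of-range, and usable pixels.

    frame_mm : 2D uint16 array of depth values in millimetres; 0 marks
               Zero Depth (ZD), i.e. Kinect could not estimate depth.
    """
    zd = frame_mm == 0                   # ZD: the sensor's NaN analogue
    in_range = (frame_mm >= near_mm) & (frame_mm <= far_mm)
    out_of_range = ~zd & ~in_range       # reported, but too near or too far
    return zd, out_of_range, in_range

# Example on a synthetic frame
frame = np.array([[0, 500, 1500], [2500, 4500, 0]], dtype=np.uint16)
zd, oor, ok = classify_depth(frame)
print(zd.sum(), oor.sum(), ok.sum())     # -> 2 2 2
```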
TABLE II
EMPIRICAL MODELS OF KINECT AND ITS NOISE

Axial Noise: As the distance increases, the accuracy of depth measured by Kinect decreases quadratically (Equation 2). This is called Axial Noise [9] as the above disparity relations are derived along the Z-axis (perpendicular to and moving away from the camera plane). The rate of change of accuracy along radial axes has not been studied in depth, though Nguyen et al. have reported some early results in [9].
2) Imaging Geometry: Imaging geometry deals with the imaging set-up and the geometric structure of the objects.
Shadow Noise: Shadow Noise has been introduced in the Shadow Model from [6] (Section II-A) and illustrated in Figs. 5(b) and 6(d). From Equation 3 we see that the width of the shadow decreases with increasing distance.

Fig. 5. Authors' experiments with lateral noise. (a) RGB image of a cardboard sheet. (b) Depth image. Shadow of the board can be seen on right. (c) Extracted vertical edge pixels and straight line fit for the vertical edges in the image: L1 ≡ y = 0.0185x + 251.63 and L2 ≡ y = 0.005x + 361.17. (d) Lateral Noise given by σ_L1 = 0.9392 and σ_L2 = 0.9229.

Lateral Noise: An example of lateral noise [9] is shown in Fig. 5 for the face of a cardboard sheet placed parallel to the plane of the camera. The depth values along the edges of this face are non-uniform. To quantify lateral noise, we detect the edge pixels [Fig. 5(c)] in the depth image [Fig. 5(b)], fit an edge line to the vertical edge points, and compute the standard deviation σ_L of the edge pixels from the edge line [Fig. 5(d)]. σ_L is defined as the Lateral Noise. Note that the distortion due to the IR lens may affect the magnitude of the noise on the edges. To take care of this we place the cardboard plate at the centre of a scene with a wide field of view, and later crop the region of interest for noise measurement.
Nguyen et al. [9] have experimented with a number of plates (of increasing size for increasing distance, to keep the absolute error significant) at varying distances and angles to show that the lateral noise increases linearly with depth (Fig. 3). Empirically they derive the expression for lateral noise σ_L(Z, θ) at depth Z and angle θ as: σ_L(Z, θ) = [0.8 + 0.035 · θ/(π/2 − θ)] · Z · δf, where δf is the ratio between the pixel size and the focal length of the IR camera.
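A minimal sketch of the two computations above: the measured σ_L from an edge-line fit, and Nguyen et al.'s empirical prediction. The edge pixels and δf below are synthetic placeholders, not our experimental data.

```python
import numpy as np

def lateral_noise_measured(edge_xy):
    """sigma_L: SD of edge pixels about a straight line fitted to them.

    edge_xy : (N, 2) array of (x, y) coordinates of a near-vertical depth
              edge. We fit x = m*y + c (stable for vertical edges) and take
              the SD of the horizontal residuals, as in Fig. 5(c)-(d).
    """
    x, y = edge_xy[:, 0], edge_xy[:, 1]
    m, c = np.polyfit(y, x, 1)                 # least-squares line fit
    return (x - (m * y + c)).std()

def lateral_noise_model(z, theta, delta_f):
    """Nguyen et al. [9]: sigma_L(Z, theta) =
    [0.8 + 0.035 * theta / (pi/2 - theta)] * Z * delta_f,
    where delta_f = pixel size / focal length of the IR camera."""
    return (0.8 + 0.035 * theta / (np.pi / 2 - theta)) * z * delta_f

# Synthetic vertical edge: x scattered about column 240 with SD ~0.9 px.
rng = np.random.default_rng(1)
edge = np.column_stack([240 + rng.normal(0.0, 0.9, 200), np.arange(200.0)])
print(lateral_noise_measured(edge))            # ~0.9 px, cf. Fig. 5(d)

# Model prediction for a plate at 2 m viewed at 30 degrees; delta_f is an
# assumed ratio here, not a calibrated Kinect value.
print(lateral_noise_model(z=2.0, theta=np.radians(30), delta_f=1 / 580.0))
```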
Andersen et al. independently observed this error [7] and report it as Spatial Precision. They measured an error of 15 mm in both the X and Y directions at a distance of 2.2 m.

In-depth analysis of lateral noise, especially for arbitrary edges, its experimental verification, and filtering algorithms for its elimination remain open problems.

Effect of Background Objects: Both the Shadow and the Lateral Noise of an object also depend on the position of the (other) background objects. If the background object is near the shadow-edge or the lateral-edge of the object, the noise is much less than when the background object is far away. We believe this is due to the failure of the triangulated reconstruction when the border-adjoining background object is far away, as it cannot participate in well-formed triangles. No in-depth study of this is available.
3) Surface or Medium Property: The IR emitter in Kinect uses an 830 nm near-infra-red laser [3] for the speckles and gets them reflected back from the object to estimate depth. The IR Reflection, Refraction, Absorption and Radiation properties
TABLE III
CLASSIFICATION AND CHARACTERIZATION OF NOISE IN DEPTH IMAGES
Fig. 8. Residual noise of a plane. Reproduced from [10]. (a) Plane at 86 cm. (b) Plane at 104 cm.

Fig. 10. Authors' experiments on temporal noise. Entropy and SD of each pixel in a depth frame over 400 frames for a stationary wall at 1.6 m. (a) Entropy image. (b) SD image.
Pattern Division Multiplex (PDM) IR Projections: In a PDM scheme every Kinect tries to identify its own dot pattern from all the others' dot patterns in an overlapped region of patterns. Consider two Kinects, one static and the other vibrating at a low spatial frequency. Now the IR camera of each Kinect would see its own dots in high contrast (as both the IR emitter and camera vibrate in sync) compared to the relatively blurred dots of the other Kinect. Though the vibrations cause blurring of the IR pattern, at a moderate frequency of vibration the Kinect's triangulation algorithm for depth estimation can filter out the blurred dots and produce a clear depth view. Butler et al. [17] vary the vibration frequency from 15 Hz to 120 Hz in 10 Hz increments to show that 40 Hz to 80 Hz can be used for shaking without any significant depth distortion.

By vibrating the Kinects at different frequencies and phases, PDM can be extended to a significantly larger number of Kinects whose dot patterns actually overlap (note that one of these Kinects can be static). Up to 6 Kinects fitted with low-frequency off-axis-weight vibration motors are shown to operate simultaneously in [17], [21] with significantly low interference noise. In OmniKinect [20] every Kinect is attached to a vertical rod equipped with vibrators. It uses up to 8 vibrating Kinects and 1 non-vibrating Kinect, and is shown to produce very good results for KinectFusion experiments. Overall, the PDM approach has advantages over SDM and TDM as it allows the use of an arbitrary number of Kinects in widely flexible configurations, needs no modification of Kinect's internal electronics, firmware or host software, and does not degrade the frame rate. Several experiments also quantitatively and qualitatively demonstrate that PDM is effective in reducing noise without compromising sensed depth measurements. However, as the Kinects vibrate, the RGB image is blurred and needs de-blurring for clean-up.

IV. CONCLUSION

In this paper we characterize the noise of Kinect depth images into Spatial, Temporal, and Interference Noise. We observe that four parameters, namely, Object Distance, Imaging Geometry, Surface/Medium Property, and Sensor Technology, control Spatial Noise. Based on these four control parameters and the source of noise we characterize the noise behaviour into Out-of-Range, Axial, Shadow, Lateral, Specular Surface, Non-Specular Surface, Band, Structural, and Residual Noise classes. For each class we summarize the noise behaviour either from others' reports or from our experiments.

Object Distance, Motion, Surface Properties, and Frame Rate control the behaviour of Temporal Noise. We characterize temporal noise in terms of entropy and standard deviation.

While dealing with the issue of noise in a multi-Kinect set-up we analyse the estimators for Interference Noise and survey various imaging techniques to mitigate the interference at source. We classify the techniques for minimization of interference noise into Space Division Multiplex (SDM), Time Division Multiplex (TDM), and Pattern Division Multiplex (PDM) techniques.

To facilitate the above characterization, we also briefly survey a number of noise models for Kinect. We often conduct independent experiments to complete the characterization and to validate the published results. On the way we highlight the scope for further exploration. To the best of our knowledge, this paper is the first to comprehensively discuss the noise behaviour of Kinect depth images in a structured manner. We focus on noise at generation time only and do not study its other aspects. The study and characterization of noise in the IR, skeleton [35] and audio streams, the study of various de-noising algorithms, and the analysis of the effects of the software framework [7] on noise across the Windows SDK [1], OpenNI [23] and OpenKinect [12] libraries would be topics of future research in this regard.

ACKNOWLEDGMENT

The authors acknowledge the reviewers for their valuable suggestions and the TCS Research Scholar Program for financial support.

APPENDIX A
NOTATION AND DEFINITIONS

Definition 1. A Depth Frame D = {d_{x,y} | 1 ≤ x ≤ r, 1 ≤ y ≤ c} is an r × c array of depth values. A sequence of n depth frames D_k = {d^k_{x,y}}, 1 ≤ k ≤ n, is called a Depth Video. We use Depth Image to mean either a Depth Frame or a Depth Video.

Definition 2. The discrete pdf p_{x,y} of depth at pixel (x, y) is given by p_{x,y}(d) = f(d)/n, where f(d) is the frequency of depth value d at pixel (x, y) amongst the n frames. The Entropy at (x, y) is defined as h_{x,y} = −Σ_d p_{x,y}(d) log2 p_{x,y}(d), and the Entropy Image for a Depth Video sequence is H_n = {h_{x,y}}.

Definition 3. The SD Image of a Depth Video is defined as Σ_n = {σ_{x,y}}, where σ_{x,y} = √( Σ_{k=1}^{n} (d^k_{x,y} − μ_{x,y})² / n ) and μ_{x,y} = (Σ_{k=1}^{n} d^k_{x,y}) / n. If d^k_{x,y} is 0 in a frame then that frame is omitted from the computations of μ and σ, and n is adjusted accordingly.
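A minimal sketch of Definitions 2 and 3 for a depth video stacked as an (n, r, c) array, with ZD values omitted and n adjusted per pixel as Definition 3 prescribes; the wall-like test video is synthetic.

```python
import numpy as np

def entropy_and_sd_images(video):
    """Entropy Image H_n (Definition 2) and SD Image (Definition 3) of an
    (n, r, c) depth video; ZD (0) values are dropped and the per-pixel
    frame count n is adjusted accordingly."""
    n, r, c = video.shape
    entropy = np.zeros((r, c))
    sd = np.zeros((r, c))
    for x in range(r):
        for y in range(c):
            d = video[:, x, y]
            d = d[d != 0]                  # omit ZD frames at this pixel
            if d.size == 0:
                continue
            _, counts = np.unique(d, return_counts=True)
            p = counts / d.size            # p(d) = f(d)/n
            entropy[x, y] = -(p * np.log2(p)).sum()
            sd[x, y] = d.std()             # population SD, divisor n
    return entropy, sd

# e.g. 400 frames of a stationary wall, as in the Fig. 10 experiment
video = np.random.default_rng(0).integers(1595, 1606, size=(400, 4, 4))
H, S = entropy_and_sd_images(video)
print(H.mean(), S.mean())
```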
Definition 4. Depth Accuracy⁵ is defined as the closeness between the true depth d_true(p) of a point p and its observed depth d(p). Accuracy is inversely related to the error (for a single observation) or to the variance/SD (for a collection of observations). The true depth may be measured by a higher quality sensor⁶ or estimated statistically from several observations.

Definition 5. Depth Resolution is the minimum difference between two depth values, or the size of the quantization step in depth.

Definition 6. Depth Precision⁷ is defined as the closeness between independently observed depth values d_i(p) over observations i. Precision is inversely related to the variance/SD of the observations.

⁵Depth Accuracy is indicative of the static errors and represents the bias of the estimation results. It can be corrected by calibration.
⁶A PMDVISION 3D time-of-flight (ToF) camera or a LIDAR (LIght and raDAR) laser detector is often used.
⁷Depth Precision is indicative of the dynamic errors and represents the variance of the estimation results. It is improved by filtering methods.
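A minimal sketch contrasting Definitions 4 and 6 on repeated observations of a single point: bias against a reference depth (accuracy, footnote 5) versus the spread of the observations (precision, footnote 7). The readings and the reference value are hypothetical.

```python
import numpy as np

def accuracy_and_precision(observations_mm, true_depth_mm):
    """Bias (Definition 4, static error, footnote 5) and precision as SD
    (Definition 6, dynamic error, footnote 7) of repeated observations."""
    obs = np.asarray(observations_mm, dtype=float)
    bias = obs.mean() - true_depth_mm      # correctable by calibration
    precision_sd = obs.std()               # improvable by filtering
    return bias, precision_sd

# Repeated readings of one point; 1600 mm plays the role of d_true(p),
# e.g. obtained from a higher-quality ToF sensor (footnote 6).
readings = [1612, 1609, 1615, 1611, 1614]
print(accuracy_and_precision(readings, true_depth_mm=1600))
```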
REFERENCES

[1] Microsoft. (Accessed 2013, Jun. 8). Kinect for Windows Sensor Components and Specifications [Online]. Available: http://msdn.microsoft.com/en-us/library/jj131033.aspx
[2] K. Khoshelham and S. O. Elberink, "Accuracy and resolution of Kinect depth data for indoor mapping applications," Sensors, vol. 12, no. 2, pp. 1437–1454, 2012.
[3] CADET: Center for Advances in Digital Entertainment Technologies. (2011). Microsoft Kinect—How Does It Work [Online]. Available: http://www.cadet.at/wp-content/uploads/2011/02/kinect_tech.pdf
[4] J.-H. Park, Y.-D. Shin, J.-H. Bae, and M.-H. Baeg, "Spatial uncertainty model for visual features using a Kinect sensor," Sensors, vol. 12, no. 7, pp. 8640–8662, 2012.
[5] F. Faion, S. Friedberger, A. Zea, and U. D. Hanebeck, "Intelligent sensor-scheduling for multi-Kinect-tracking," in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., Oct. 2012, pp. 3993–3999.
[6] Y. Yu, Y. Song, Y. Zhang, and S. Wen, "A shadow repair approach for Kinect depth maps," in Proc. 11th Asian Conf. Comput. Vis., 2013, pp. 615–626.
[7] M. Andersen, T. Jensen, P. Lisouski, A. Mortensen, and M. Hansen, "Kinect depth sensor evaluation for computer vision applications," Dept. Electr. Comput. Eng., Aarhus Univ., Aarhus, Denmark, Tech. Rep. ECE-TR-6, 2012.
[8] K. Essmaeel, L. Gallo, E. Damiani, G. D. Pietro, and A. Dipanda, "Temporal denoising of Kinect depth data," in Proc. 8th Int. Conf. Signal Image Technol. Internet Based Syst., 2012, pp. 47–52.
[9] C. V. Nguyen, S. Izadi, and D. Lovell, "Modelling Kinect sensor noise for improved 3D reconstruction and tracking," in Proc. 2nd Int. Conf. 3D Imag., Model., Process., Visualizat. Transmiss., 2012, pp. 524–530.
[10] J. Smisek, M. Jancosek, and T. Pajdla, "3D with Kinect," in Consumer Depth Cameras for Computer Vision—Research Topics and Applications (Advances in Computer Vision and Pattern Recognition). New York, NY, USA: Springer-Verlag, 2013, ch. 1, pp. 3–25.
[11] N. Burrus. (Accessed 2013, Jun. 11). Kinect Calibration [Online]. Available: http://nicolas.burrus.name/index.php/Research/KinectCalibration
[12] OpenKinect Community. (Accessed 2013, Jun. 11). Imaging Information [Online]. Available: http://openkinect.org/wiki/Imaging_Information
[13] N. Ahmed. (2012). A System for 360° Acquisition and 3D Animation Reconstruction Using Multiple RGB-D Cameras [Online]. Available: http://www.mpi-inf.mpg.de/~nahmed/casa2012.pdf, unpublished.
[14] N. Ahmed, "Visualizing time coherent 3D content using one or more Kinect," in Proc. ICCMTD Int. Conf. Commun., Media, Technol. Des., 2012, pp. 521–525.
[15] K. Berger et al., "The capturing of turbulent gas flows using multiple Kinects," in Proc. IEEE Int. Conf. Comput. Vis. Workshops, Nov. 2011, pp. 1108–1113.
[16] K. Berger, K. Ruhl, C. Brümmer, Y. Schröder, A. Scholz, and M. Magnor, "Markerless motion capture using multiple color-depth sensors," in Proc. 16th Int. Workshop Model. Visualizat., 2011, pp. 317–324.
[17] A. Butler, S. Izadi, O. Hilliges, D. Molyneaux, S. Hodges, and D. Kim, "Shake'n'sense: Reducing interference for overlapping structured light depth cameras," in Proc. ACM CHI Conf. Human Factors Comput. Syst., 2012, pp. 1933–1936.
[18] M. Caon, Y. Yue, J. Tscherrig, E. Mugellini, and O. A. Khaled, "Context-aware 3D gesture interaction based on multiple Kinects," in Proc. 1st Int. Conf. Ambient Comput., Appl., Services Technol., 2011, pp. 7–12.
[19] J. Kramer, N. Burrus, F. Echtler, D. Herrera C., and M. Parker, Hacking the Kinect. San Diego, CA, USA: Academic, 2012.
[20] B. Kainz et al., "OmniKinect: Real-time dense volumetric data acquisition and applications," in Proc. 18th ACM Symp. Virtual Reality Softw. Technol., 2012, pp. 25–32.
[21] A. Maimone and H. Fuchs, "Reducing interference between multiple structured light depth sensors using motion," in Proc. IEEE Conf. Virtual Reality, Feb. 2012, pp. 51–54.
[22] J. Tong, J. Zhou, L. Liu, Z. Pan, and H. Yan, "Scanning 3D full human bodies using Kinects," IEEE Trans. Visualizat. Comput. Graph., vol. 18, no. 4, pp. 643–650, Apr. 2012.
[23] OpenNI Consortium. (Accessed 2013, Jun. 8). OpenNI: The Standard Framework for 3D Sensing [Online]. Available: http://www.openni.org/
[24] Y. Arieli, B. Freedman, M. Machline, and A. Shpunt, "Depth mapping using projected patterns," U.S. Patent 0 118 123, Apr. 3, 2010.
[25] M. Viager. (2011). Analysis of Kinect for Mobile Robots [Online]. Available: http://www.scribd.com/doc/56470872/Analysis-of-Kinect-for-Mobile-Robots-unofficial-Kinect-data-sheet-on-page-27
[26] Microsoft. (Accessed 2013, Nov. 25). Kinect Fusion Project Page [Online]. Available: http://research.microsoft.com/en-us/projects/surfacerecon/
[27] L. Zelnik-Manor. (2012). Working with Multiple Kinects [Online]. Available: http://webee.technion.ac.il/lihi/Teaching/2012_winter_048921/PPT/Roy.pdf
[28] (Accessed 2013, Nov. 12). ISO 5725-1:1994—Accuracy (Trueness and Precision) of Measurement Methods and Results—Part 1: General Principles and Definitions [Online]. Available: http://www.iso.org/iso/catalogue_detail.htm?csnumber=11833
[29] Standards Committee of the IEEE Computer Society, ANSI/IEEE Standard 754-1985, 1985.
[30] K. Rose. (Accessed 2013, May 23). Materials that Absorb Infrared Rays [Online]. Available: http://www.ehow.com/info_8044395_materials-absorb-infrared-rays.html
[31] T. Mallick, P. P. Das, and A. K. Majumdar, "Estimation of the orientation and distance of a mirror from Kinect depth data," in Proc. 3rd Nat. Conf. Comput. Vis., Pattern Recognit., Image Process. Graph., 2013, pp. 1–15.
[32] A. Maimone and H. Fuchs, "Encumbrance-free telepresence system with real-time 3D capture and display using commodity depth cameras," in Proc. IEEE 10th Int. Symp. Mixed Augmented Reality (ISMAR), Oct. 2011, pp. 137–146.
[33] Y. Schröder, A. Scholz, K. Berger, K. Ruhl, S. Guthe, and M. Magnor, "Multiple Kinect studies," Inst. Comput. Graph. (ICG), TU Braunschweig, Braunschweig, Germany, Tech. Rep. 09-15, Oct. 2011.
[34] Kinect for Windows Team. (Accessed 2013, Jun. 4). Inside the Newest Kinect for Windows SDK: Infrared Control [Online]. Available: http://blogs.msdn.com/b/kinectforwindows/archive/2012/12/07/inside-the-newest-kinect-for-windows-sdk-infrared-control.aspx
[35] M. A. Livingston, J. Sebastian, Z. Ai, and J. W. Decker, "Performance measurements for the Microsoft Kinect skeleton," in Proc. IEEE Virtual Reality, Mar. 2012, pp. 119–120.

Tanwi Mallick received the B.Tech. and M.Tech. degrees in computer science from Jalpaiguri Government Engineering College and the National Institute of Technology, Durgapur, in 2008 and 2010, respectively. From 2010 to 2011, she was with the DIATM College, Durgapur, as an Assistant Professor. She is currently a TCS Research Scholar with the Department of Computer Science and Engineering, IIT Kharagpur. She is interested in image processing and computer vision, and is actively involved in human activity tracking and analysis.

Partha Pratim Das received the B.Tech., M.Tech., and Ph.D. degrees from IIT Kharagpur, in 1984, 1985, and 1988, respectively. He served as a Faculty Member with the Department of Computer Science and Engineering, IIT Kharagpur, from 1988 to 1998, and guided five Ph.D.s. From 1998 to 2011, he was with Alumnus Software Ltd. and Interra Systems, Inc., in senior positions before returning to the Department as a Professor. His current research interests include object tracking and interpretation in videos and medical information processing.

Arun Kumar Majumdar is a Professor with the Department of Computer Science and Engineering, IIT Kharagpur. He received the M.Tech. and Ph.D. degrees in applied physics from the University of Calcutta, and the Ph.D. degree in electrical engineering from the University of Florida, in 1968, 1973, and 1976, respectively. He served as the Deputy Director and Dean of IIT Kharagpur and as head of several departments. His research interests include data and knowledge-based systems, multimedia systems, medical information systems, and information security.