
RESEARCH AND APPLICATION OF OPENCV AND KINECT TO SUPPORT PATH FINDING FOR ROBOTS
Nguyen Phat Nhut, Truong Thi Quynh Huong
Faculty of Information Technology, Lac Hong University

{nhut, huong}@lhu.edu.vn

Abstract
Experts predict that within the next 20 years everyone will want a personal robot in the same way that everyone wants a PC today, and that robots will be at the centre of the next major revolution after the Internet. The trend is already visible: robots are widely used in industry, health care, education and training, entertainment, and security and defence, so the robot market will be enormous.
Within the scope of this project, the group concentrates on image processing for robots. Using 2D and 3D image processing techniques, the robot can recognise the objects around it that it needs to reach, and avoid obstacles on its path while moving autonomously. The robot is given a pair of eyes that see the real world in 3D, with the support of the Kinect sensor and image processing algorithms from computer vision, which are used to control the robot.
Keywords: OpenCV, OpenNI, Kinect, PCL, SIFT.
1 Introduction
Digital image processing (DIP) in general, and object recognition in particular, has been and is being applied very widely. It reaches into every aspect of life, from processing printed matter, newspapers and magazine covers to machine vision in machine learning. It is often so close to everyday life that many people do not notice it: face recognition and moving-object detection in cameras and camcorders, or simply the camera function of a mobile phone, which also integrates image processing and object recognition tools.
For object recognition specifically, the prevailing research direction worldwide is to use invariant features in the image as keypoints for recognition. The most representative matching algorithm that uses keypoints of this kind is SIFT (Scale-Invariant Feature Transform, David Lowe, 1999 and 2004). SIFT can be regarded as the foundation for later applications and algorithms based on invariant image features.
SIFT features do not depend on basic image transformations such as rotation, scaling, or changes in brightness, so the set of features of an image can be taken to represent its content. Recognition can therefore be highly accurate, and an object that is partly occluded in the image can even be recovered. However, SIFT is complex to implement and requires time to study and a good understanding of many component algorithms. SIFT is the best algorithm the group has used for object recognition on the robot, and for recognising and determining the path and heading.
2 Problem statement and related work
It is predicted that within the next 20 years everyone will want a personal robot in the same way that everyone wants a PC today, and that robots will be at the centre of the next major revolution after the Internet. With this trend, together with the traditional applications of robots in industry, health care, education and training, entertainment, and especially security and defence, the robot market will be enormous.
Given the promising development of robotics now and in the future, the authors, who specialise in information technology, focus on computer vision, and specifically on image processing that gives the robot eyes. This is a fascinating field: the robot is no longer a block of metal that finds its way only with electronic sensing, but sees the world around it with a pair of eyes, as a human does.
Some representative projects around the world that have contributed to the development of robots are:
- ROS: open-source software (BSD licence). ROS is the result of research at Willow Garage in collaboration with a laboratory at Stanford University. The project provides the control system of a robot and, on that basis, software packages: the computer vision library OpenCV, planning systems, host management systems, and other technologies used in dozens of research projects and applications around the world.
- Microsoft Robotics Developer Studio: a Windows-based environment for developing applications for robot platforms. The first version of Robotics Studio was released in 2006, and the version currently available is Microsoft Robotics Developer Studio 2008 R3. Robotics Studio includes a visual programming tool as well as a three-dimensional virtual environment for simulating robot physics.
- OROCOS: open-source software for robot control. The project maintains the following OROCOS C++ libraries: the Real-Time Toolkit, the Kinematics and Dynamics Library, the Bayesian Filtering Library, and the Orocos Component Library.
- Kinect: a Microsoft product, released on 4 November 2010, based on camera technology developed by PrimeSense. Kinect lets people interact through gestures, which gives Xbox gamers an engaging experience. Its ability to understand human gestures rests on two main features: depth information for the image (depth map), and the ability to detect and track the human body (body skeleton tracking).
3 Approach
The study builds a program that supports path finding for robots using image processing, with the following functions:
- Object recognition for the robot using image processing (based on the features of the object): SIFT (Scale-Invariant Feature Transform, David Lowe, 1999 and 2004) and OpenCV (computer vision).
- Detection of obstacles in front of the robot using the Kinect device combined with OpenNI, PCL (Point Cloud Library) and cvBlobs [4].
3.1 The OpenCV library
OpenCV is an open-source computer vision library originally developed by Intel. It includes many advanced capabilities, such as finding, tracking and recognising faces and Kalman filtering, as well as a variety of artificial intelligence methods. It also provides basic computer vision algorithms through low-level application programming interfaces. It is packaged and completely free, so users can readily use it for their own purposes [1][2].
3.2 Introduction to the Kinect device
Kinect is a Microsoft product based on camera technology developed by PrimeSense. It is regarded as a peripheral that lets people interact through gestures, which gives Xbox gamers an engaging experience. Its ability to understand human gestures rests on two main features: depth information for the image (depth map), and the ability to detect and track the human body (body skeleton tracking) [7].

Figure 3.1: The Kinect device

Kinect consists of an RGB camera, depth sensors (3D Depth Sensors), a microphone array (Multi-array Mic) and a tilt motor (Motorized Tilt).
- RGB camera: works like an ordinary camera, with a resolution of 640x480 at 30 fps.
- Depth sensor: depth is obtained by combining two components, an infrared projector (IR projector) and an infrared camera (IR camera).
- Microphone array: four microphones arranged along the Kinect as shown in the figure above, used for voice-controlled applications.
- Tilt motor: a fairly small DC motor that tilts the camera up and down so that it has the best viewing angle.
The IR camera and IR projector work together to produce the depth value of the image using PrimeSense's Light Coding technology.

Figure 3.2: The process of acquiring the depth map
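Once a depth map is available, a simple obstacle check can be sketched as follows. This is only an illustration under stated assumptions, not the program used in this work: the depth frame is assumed to be a plain 2D list of distances in millimetres, a value of 0 is assumed to mean "no reading", and the function name `nearest_obstacle_mm`, the 800 mm range and the central window are all hypothetical choices.

```python
from typing import List, Optional


def nearest_obstacle_mm(depth: List[List[int]],
                        max_range_mm: int = 800,
                        window: float = 0.5) -> Optional[int]:
    """Return the nearest valid depth inside the central window of the
    frame if it is closer than max_range_mm, otherwise None.

    Assumptions (not from the paper): depth values are millimetres and
    a value of 0 marks a pixel with no depth reading.
    """
    rows, cols = len(depth), len(depth[0])
    # Central window covering `window` of the height and of the width.
    r0 = int(rows * (1 - window) / 2)
    r1 = rows - r0
    c0 = int(cols * (1 - window) / 2)
    c1 = cols - c0

    nearest: Optional[int] = None
    for r in range(r0, r1):
        for c in range(c0, c1):
            d = depth[r][c]
            if d <= 0:  # skip pixels without a reading
                continue
            if nearest is None or d < nearest:
                nearest = d

    if nearest is not None and nearest < max_range_mm:
        return nearest
    return None
```

Restricting the check to a central window is a design choice for a robot that only needs to know what is directly ahead; a real system would more likely segment the point cloud with PCL.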
3.3 The OpenNI library
OpenNI was regarded as the most capable library before the Kinect SDK beta appeared. It supports several languages on many platforms and makes it easy for programmers to write Kinect applications with natural interaction (NI). The main purpose of OpenNI is to define standard API functions that allow the library to work with middleware in order to extend what Kinect can do [5].
3.4 Point Cloud Library
PCL [6] is a library for n-D point clouds and for image processing in 3D space. It implements many algorithms, such as filtering, surface reconstruction, segmentation and feature estimation. To simplify development, PCL is split into several smaller libraries that can be compiled separately, and it relies on the following libraries:
- Eigen: an open-source library for linear algebra, used in most of the mathematical computations in PCL.
- FLANN (Fast Library for Approximate Nearest Neighbors): supports fast nearest-neighbour search in 3D space.
- Boost: provides shared pointers across all modules and algorithms in PCL, which avoids copying data that is already in the system.
- VTK (Visualization Toolkit): supports many platforms for rendering 3D data, and supports display and estimation of object volume.
- CMinPack: an open-source library for solving nonlinear equations and nonlinear least-squares problems.
4 Object recognition algorithm: SIFT
4.1 Introduction to the SIFT algorithm
A representative and quite effective approach is based on local invariant features in the image: SIFT (Scale-Invariant Feature Transform), introduced by David Lowe (1999, 2004) and improved in many ways since. The features extracted by SIFT are keypoints, each accompanied by a descriptor and a vector whose origin is the keypoint.
The algorithm extracts keypoints and their features in four main stages:
- Scale-space extrema detection
- Keypoint localization (filtering and extracting the keypoints)
- Orientation assignment for the keypoints
- Keypoint descriptor

Figure 4.1: The main steps of the SIFT algorithm
4.2 The algorithm in detail
4.2.1 Detecting local extrema
The first stage finds candidate points that may become keypoints, using a cascade filtering approach based on varying the parameter of a Gaussian filter. In this stage we need to detect locations and scales that are invariant across different views of the same object. Locations that are invariant to scale can be detected by searching for stable features across all possible scales, using a continuous function of scale known as scale space (Witkin, 1983).
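The Gaussian function is the most suitable function for representing the scale space of a 2D image. The scale space of an image is defined as a function $L(x,y,\sigma)$ produced by convolving the input image $I(x,y)$ with a Gaussian $G(x,y,\sigma)$ of variable scale $\sigma$:
$$L(x,y,\sigma) = G(x,y,\sigma) * I(x,y) \qquad (1)$$
where $*$ is the convolution operation in $x$ and $y$, and $G(x,y,\sigma)$ is the Gaussian function:
$$G(x,y,\sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/(2\sigma^2)} \qquad (2)$$

To find keypoints that are highly invariant, the algorithm searches for local extrema of the difference-of-Gaussian (DoG) function, denoted $D(x,y,\sigma)$. This function is computed from the difference between two nearby scales of an image whose scale parameters differ by a constant factor $k$:
$$D(x,y,\sigma) = L(x,y,k\sigma) - L(x,y,\sigma) = \left(G(x,y,k\sigma) - G(x,y,\sigma)\right) * I(x,y) \qquad (3)$$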
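Equations (1) to (3) can be sketched in plain Python as below. This is a minimal illustration, not the OpenCV implementation: it assumes the image is a 2D list of floats, truncates the Gaussian at a radius of about $3\sigma$, normalises the discrete kernel so that it sums to 1, and clamps coordinates at the image border. All function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def gaussian_kernel_1d(sigma: float) -> List[float]:
    """Sampled 1D Gaussian, truncated at about 3*sigma and normalised."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(img: Image, kernel: List[float]) -> Image:
    """Convolve every row with a symmetric kernel, clamping at the border."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for j, w in enumerate(kernel):
                xx = min(max(x + j - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def gaussian_blur(img: Image, sigma: float) -> Image:
    """L = G * I, equation (1). The 2D Gaussian is separable, so the
    image is filtered along rows and then along columns."""
    kernel = gaussian_kernel_1d(sigma)
    tmp = convolve_rows(img, kernel)
    return transpose(convolve_rows(transpose(tmp), kernel))


def difference_of_gaussians(img: Image, sigma: float, k: float) -> Image:
    """D = L(k*sigma) - L(sigma), equation (3)."""
    low = gaussian_blur(img, sigma)
    high = gaussian_blur(img, k * sigma)
    return [[h - l for h, l in zip(hr, lr)] for hr, lr in zip(high, low)]
```

On a uniform image the DoG response is zero everywhere, which is why only structured regions such as blobs and corners produce candidate keypoints.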
The Gaussian is chosen because it is an efficient way to compute $L$ (which also smooths the image). $L$ has to be computed many times anyway to describe features in scale space, and $D$ can then be obtained simply by subtracting image matrices, at low cost.

Figure 4.2: Computing the scale space (L) and the difference function D
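In addition, the DoG function provides a close approximation to the scale-normalised Laplacian of Gaussian, $\sigma^2\nabla^2 G$. Lindeberg showed that normalising the Laplacian by the factor $\sigma^2$ is required for true scale invariance. Mikolajczyk found that the maxima and minima of $\sigma^2\nabla^2 G$ give the most stable (highly invariant) features compared with a range of other functions such as the gradient, Hessian or Harris function.
The relationship between $D$ and $\sigma^2\nabla^2 G$ is as follows:
$$\frac{\partial G}{\partial \sigma} = \sigma\nabla^2 G \qquad (4)$$
Thus $\nabla^2 G$ can be computed from a finite-difference approximation to $\partial G/\partial\sigma$ using the nearby scales $k\sigma$ and $\sigma$:
$$\sigma\nabla^2 G = \frac{\partial G}{\partial \sigma} \approx \frac{G(x,y,k\sigma) - G(x,y,\sigma)}{k\sigma - \sigma} \qquad (5)$$
Therefore:
$$G(x,y,k\sigma) - G(x,y,\sigma) \approx (k-1)\,\sigma^2\nabla^2 G \qquad (6)$$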
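The approximation in (6) can be checked numerically. The sketch below evaluates both sides at a single point using the closed form $\nabla^2 G = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\,G$, which follows from differentiating (2) twice. The function names are hypothetical and the block is a worked check, not part of the SIFT pipeline.

```python
import math


def gaussian(x: float, y: float, sigma: float) -> float:
    """G(x, y, sigma), equation (2)."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def laplacian_of_gaussian(x: float, y: float, sigma: float) -> float:
    """Analytic Laplacian of G: ((x^2 + y^2 - 2 sigma^2) / sigma^4) * G."""
    return (x * x + y * y - 2 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)


def dog(x: float, y: float, sigma: float, k: float) -> float:
    """Left-hand side of (6): G(k*sigma) - G(sigma)."""
    return gaussian(x, y, k * sigma) - gaussian(x, y, sigma)


def scale_normalised_log(x: float, y: float, sigma: float, k: float) -> float:
    """Right-hand side of (6): (k - 1) * sigma^2 * Laplacian of G."""
    return (k - 1) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)


def relative_error(k: float, sigma: float = 1.0) -> float:
    """Relative difference between the two sides of (6) at the origin."""
    exact = dog(0.0, 0.0, sigma, k)
    approx = scale_normalised_log(0.0, 0.0, sigma, k)
    return abs(exact - approx) / abs(exact)
```

Calling `relative_error` with values of `k` closer and closer to 1 gives a shrinking error, which matches the statement that the approximation becomes exact in the limit.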
- Equation (6) shows that when the DoG function is computed at scales that differ by a constant factor $k$, it can be used to approximate the scale-normalised Laplacian of Gaussian. The factor $(k-1)$ is constant over all scales, so it does not affect the location of extrema. The approximation error goes to 0 as $k$ approaches 1. However, Lowe's experiments show that the approximation has almost no effect on extrema detection even when $k$ is chosen fairly far from 1, for example $k = \sqrt{2}$.
- Applying the DoG function to the input image yields a stack of result layers at different scales. The next step is to find the extrema in these layers within each local neighbourhood: every point in a layer is compared with its 8 neighbours in the same layer and with the 9 neighbours in each of the two adjacent layers (see the figure below).
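The comparison with 8 + 9 + 9 = 26 neighbours can be sketched as follows. The sketch assumes the DoG layers are stored as a list of equally sized 2D lists indexed `dog[scale][y][x]`; the function names are hypothetical, and border pixels and the first and last layers are skipped because they lack a full neighbourhood.

```python
from typing import List, Tuple

Layer = List[List[float]]


def is_local_extremum(dog: List[Layer], s: int, y: int, x: int) -> bool:
    """True if dog[s][y][x] is strictly greater or strictly smaller than
    all 26 neighbours in the same layer and the two adjacent layers."""
    value = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return value > max(neighbours) or value < min(neighbours)


def find_extrema(dog: List[Layer]) -> List[Tuple[int, int, int]]:
    """Return (scale, y, x) for every interior local extremum."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_local_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found
```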

Figure 4.3: Finding extrema in the DoG layers
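4.2.2 Keypoint extraction
Interpolating the accurate location of a candidate point: the interpolation uses the Taylor expansion of the difference-of-Gaussian function $D(x,y,\sigma)$:
$$D(X) = D + \frac{\partial D^T}{\partial X} X + \frac{1}{2} X^T \frac{\partial^2 D}{\partial X^2} X \qquad (7)$$
where $D$ and its derivatives are evaluated at the candidate point and $X = (x,y,\sigma)^T$ is the offset from that point. The location of the extremum $\hat{X}$ is found by taking the derivative of this function with respect to $X$ and setting it to zero:
$$\hat{X} = -\left(\frac{\partial^2 D}{\partial X^2}\right)^{-1} \frac{\partial D}{\partial X} \qquad (8)$$

Figure 4.1: Using the Taylor expansion of the DoG function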
Rejecting low-contrast points: points that are sensitive to brightness changes and noise should not become keypoints and are removed from the candidate list. Using the Taylor expansion above, a candidate whose value $|D(\hat{X})|$ is less than 0.03 is discarded; otherwise it is kept at its new location $(y + \hat{x})$ and scale $\sigma$, where $y$ is its old location.
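The refinement in (7) and (8) and the contrast test can be sketched together. The sketch assumes a 3x3x3 block of DoG values around the candidate, indexed `cube[scale][y][x]` with the candidate at the centre, derivatives estimated by central finite differences, and pixel values scaled to the range [0, 1] so that the 0.03 threshold applies. The value at the refined location is taken as $D(\hat{X}) = D + \frac{1}{2}\frac{\partial D^T}{\partial X}\hat{X}$, which is what (7) reduces to after substituting (8). Function names are hypothetical.

```python
from typing import List, Optional, Tuple

Cube = List[List[List[float]]]  # cube[scale][y][x], 3 x 3 x 3


def _det3(m: List[List[float]]) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m: List[List[float]], b: List[float]) -> Optional[List[float]]:
    """Solve m * x = b by Cramer's rule; None if m is singular."""
    d = _det3(m)
    if abs(d) < 1e-12:
        return None
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        solution.append(_det3(mc) / d)
    return solution


def refine_keypoint(cube: Cube, contrast_threshold: float = 0.03
                    ) -> Optional[Tuple[Tuple[float, float, float], float]]:
    """Return ((dx, dy, dsigma), D(X_hat)), or None if the Hessian is
    singular or the refined value fails the contrast test."""
    c = cube
    v = c[1][1][1]
    # Gradient dD/dX by central differences, in the order (x, y, sigma).
    grad = [(c[1][1][2] - c[1][1][0]) / 2.0,
            (c[1][2][1] - c[1][0][1]) / 2.0,
            (c[2][1][1] - c[0][1][1]) / 2.0]
    # Second derivatives d2D/dX2.
    dxx = c[1][1][2] - 2 * v + c[1][1][0]
    dyy = c[1][2][1] - 2 * v + c[1][0][1]
    dss = c[2][1][1] - 2 * v + c[0][1][1]
    dxy = (c[1][2][2] - c[1][2][0] - c[1][0][2] + c[1][0][0]) / 4.0
    dxs = (c[2][1][2] - c[2][1][0] - c[0][1][2] + c[0][1][0]) / 4.0
    dys = (c[2][2][1] - c[2][0][1] - c[0][2][1] + c[0][0][1]) / 4.0
    hessian = [[dxx, dxy, dxs], [dxy, dyy, dys], [dxs, dys, dss]]

    offset = _solve3(hessian, [-g for g in grad])  # equation (8)
    if offset is None:
        return None
    value = v + 0.5 * sum(g * o for g, o in zip(grad, offset))
    if abs(value) < contrast_threshold:
        return None
    return (offset[0], offset[1], offset[2]), value
```

Lowe's paper additionally moves the candidate to a neighbouring sample when any component of the offset exceeds 0.5; that step is left out here to keep the sketch short.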
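Eliminating redundant points along edges: the DoG function responds strongly along edges, where the location along the edge is poorly determined, so candidates on edges are not invariant and are sensitive to noise. To make the selected keypoints more stable, candidates that are hard to localise (that is, whose position changes easily under noise because they lie on an edge) are removed.
A poorly defined peak in the DoG function has a principal curvature across the edge that is much larger than the curvature along the edge, so keypoints lying along the same edge need to be removed. The solution is to use the values of the 2x2 Hessian matrix:
$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \qquad (9)$$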
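One common way to use the Hessian in (9), following Lowe's 2004 paper rather than anything stated explicitly in this text, is to compare the ratio $\mathrm{Tr}(H)^2 / \mathrm{Det}(H)$ with $(r+1)^2 / r$, where $r$ bounds the ratio between the two principal curvatures (Lowe uses $r = 10$). The sketch below assumes a 3x3 patch of DoG values indexed `patch[y][x]` around the keypoint; the function names are hypothetical.

```python
from typing import List, Tuple


def hessian_2x2(patch: List[List[float]]) -> Tuple[float, float, float]:
    """Estimate (Dxx, Dyy, Dxy) at the centre of a 3x3 patch by
    finite differences."""
    v = patch[1][1]
    dxx = patch[1][2] - 2 * v + patch[1][0]
    dyy = patch[2][1] - 2 * v + patch[0][1]
    dxy = (patch[2][2] - patch[2][0] - patch[0][2] + patch[0][0]) / 4.0
    return dxx, dyy, dxy


def passes_edge_test(dxx: float, dyy: float, dxy: float, r: float = 10.0) -> bool:
    """Keep a keypoint only if its principal curvatures are comparable.

    Tr(H)^2 / Det(H) grows with the ratio of the two curvatures, so a
    large value indicates an edge-like point. A non-positive determinant
    means the curvatures have different signs (or one is zero), and the
    point is rejected.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```

A blob-like patch such as a single bright pixel passes the test, while a straight ridge has zero curvature along its length and is rejected.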
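4.2.3 Orientation assignment for keypoints
This step assigns a local orientation to each keypoint. The scale of the keypoint is used to select the Gaussian-smoothed image $L$ with the closest scale, so that all computations are performed in a scale-invariant manner. For each image sample $L(x,y)$ at this scale, let $m(x,y)$ be the gradient magnitude and $\theta(x,y)$ the orientation. These two values are computed as follows:
$$m(x,y) = \sqrt{\left(L(x+1,y) - L(x-1,y)\right)^2 + \left(L(x,y+1) - L(x,y-1)\right)^2}$$
$$\theta(x,y) = \tan^{-1}\frac{L(x,y+1) - L(x,y-1)}{L(x+1,y) - L(x-1,y)} \qquad (10)$$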
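Equation (10) translates almost directly into code. The sketch below assumes the smoothed image is a 2D list indexed `L[y][x]` and uses `math.atan2` in place of a plain arctangent so that the orientation covers the full range from $-\pi$ to $\pi$ and a zero horizontal difference does not cause a division by zero. The function name is hypothetical.

```python
import math
from typing import List, Tuple


def gradient_at(L: List[List[float]], x: int, y: int) -> Tuple[float, float]:
    """Return (magnitude m, orientation theta in radians) at pixel (x, y)
    using the pixel differences of equation (10).

    The pixel must not lie on the image border, because its four
    direct neighbours are needed.
    """
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    magnitude = math.hypot(dx, dy)
    orientation = math.atan2(dy, dx)
    return magnitude, orientation
```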
4.2.4 Building the local descriptor
The previous operations have detected each keypoint and assigned it a location, scale and orientation. These parameters define a repeatable local 2D coordinate system in which to describe the local image region, and therefore provide invariance to these parameters. The next step computes a descriptor for the local image region that is highly distinctive (invariant to changes in brightness, scaling and rotation).
The following figure illustrates how the descriptors are computed.

Figure 4.4: Building the local descriptor (left panel: image gradients; right panel: keypoint descriptor)
4.3 Building the program that locates the object
4.3.1 Method
We build a program that detects and locates an object using image processing techniques. The program searches for the object and, if it is found, sends commands to the underlying device through the RS-232 port so that the robot approaches the object.
The program uses the OpenCV library for image processing and extracts the keypoints of the object with the SIFT algorithm. It then extracts the keypoints and descriptors of each video frame in order to locate the object. Next, the program compares each keypoint with its nearest neighbours to find the best-matching keypoint pairs. If they match, it returns the x, y coordinates of the object. With this method, we divide the screen into 9 regions, as shown in the following figure [3]:

Figure 4.5: The screen divided into 9 regions to locate the object
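The keypoint comparison step of the method can be sketched as nearest-neighbour matching. The text does not say which acceptance rule the program uses, so this sketch assumes Lowe's ratio test (accept a match only if the nearest descriptor is clearly closer than the second nearest, with a ratio of 0.8) and assumes that the object position is the mean of the matched frame keypoints. The function names, the `min_matches` parameter and the data layout are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Descriptor = Sequence[float]
Point = Tuple[float, float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def match_descriptors(model: List[Descriptor], frame: List[Descriptor],
                      ratio: float = 0.8) -> List[Tuple[int, int]]:
    """Return (model_index, frame_index) pairs that pass the ratio test."""
    matches = []
    for mi, m in enumerate(model):
        ranked = sorted((euclidean(m, f), fi) for fi, f in enumerate(frame))
        # Need two candidates to judge whether the best one is distinctive.
        if len(ranked) >= 2 and ranked[0][0] < ratio * ranked[1][0]:
            matches.append((mi, ranked[0][1]))
    return matches


def object_position(matches: List[Tuple[int, int]],
                    frame_keypoints: List[Point],
                    min_matches: int = 3) -> Optional[Point]:
    """Mean (x, y) of the matched frame keypoints, or None when too few
    keypoints match for the object to count as found."""
    if len(matches) < min_matches:
        return None
    xs = [frame_keypoints[fi][0] for _, fi in matches]
    ys = [frame_keypoints[fi][1] for _, fi in matches]
    return sum(xs) / len(xs), sum(ys) / len(ys)
```

In practice SIFT descriptors have 128 dimensions and a library matcher would replace the exhaustive search; the short vectors here only keep the example readable.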
The device control part uses control routines coded so that the robot moves until the coordinates of the detected object fall in region 5. If the detected coordinates are in region 6, the robot moves to the right so that the object appears in region 5. If the object falls in region 2, the robot moves forward to bring the object into region 5. Once the object is in region 5, and if the robot has an arm for grasping, the arm extends to grip the object while the robot moves forward to grip it firmly. While moving forward, the robot uses the Kinect to measure the distance to the object so that the distance matches the reach designed for the robot arm.
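The region logic described above can be sketched as a small lookup. The sketch assumes the 9 regions are numbered row by row from the top left (1 2 3 / 4 5 6 / 7 8 9), which is consistent with region 6 lying to the right of region 5 and region 2 lying above it. Only the behaviour for regions 2, 5 and 6 comes from the text; the commands for the other regions are filled in by symmetry and are assumptions, as are the function names and the command strings.

```python
def region_of(x: float, y: float, width: int, height: int) -> int:
    """Map pixel coordinates to a region number 1..9 on a 3x3 grid,
    numbered row by row from the top-left corner (assumed layout)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the frame")
    col = min(int(3 * x / width), 2)
    row = min(int(3 * y / height), 2)
    return 3 * row + col + 1


def command_for_region(region: int) -> str:
    """Choose a movement command that brings the object towards region 5.

    From the text: region 6 -> move right, region 2 -> move forward,
    region 5 -> grasp. Everything else is an assumption by symmetry.
    """
    if not 1 <= region <= 9:
        raise ValueError("region must be between 1 and 9")
    row, col = divmod(region - 1, 3)
    if col == 0:
        return "left"       # assumption: mirror image of region 6
    if col == 2:
        return "right"
    if row == 0:
        return "forward"
    if row == 2:
        return "backward"   # assumption: mirror image of region 2
    return "grasp"
```

For a 640x480 frame, an object detected at (600, 240) falls in region 6 and produces the command "right"; the command string would then be encoded and sent over the serial link to the robot base.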
5 Experimental results


Figure 5.1: Object found on the left; the robot is commanded to move right
Figure 5.2: Object found on the right; the robot is commanded to move left



References
[1]. Learning OpenCV: Computer Vision with the OpenCV Library, Gary Bradski and Adrian Kaehler, University of British Columbia, M.
[2]. OpenCV 2 Computer Vision Application Programming Cookbook, Robert Laganière, Packt Publishing Ltd, 32 Lincoln Road, Olton, Birmingham, B27 6PA, UK.
[3]. Analysis of Kinect for Mobile Robots, Mikkel Viager, Technical University of Denmark, p. 11.
[4]. A Qualitative Analysis of Two Automated Registration Algorithms in a Real World Scenario Using Point Clouds from the Kinect, Jacob Kjær, June 27, 2011.
[5]. http://openni.org/Documentation
[6]. http://pointclouds.org/documentation
[7]. http://en.wikipedia.org/wiki/Kinect
Figure 5.3: Object found straight ahead; the robot is commanded to go straight
