Boston University
Department of Electrical and Computer Engineering
Boston, MA 02215
∆A′ = ∆A / max_{i,j}(|∆A_{ij}|).  (2)

This ensures that all entries of ∆A′ lie in the interval [−1, 1].

Fig. 3: Architecture of proposed CNN. See Table 1 for details.
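As a concrete illustration, the per-matrix normalization in (2) can be sketched in a few lines of NumPy; the array and function names are ours, and the toy 3×4 matrix is purely illustrative (the sketch assumes a nonzero ∆A):

```python
import numpy as np

def normalize_delta_A(delta_A: np.ndarray) -> np.ndarray:
    """Scale the difference matrix by its largest absolute entry, as in Eq. (2)."""
    peak = np.max(np.abs(delta_A))  # max_{i,j} |dA_ij|; assumed nonzero
    return delta_A / peak

# Toy example: a small difference matrix (values are illustrative only)
dA = np.array([[ 0.2, -0.5, 0.1,  0.0],
               [ 1.5,  0.3, -0.2, 0.4],
               [-0.1,  0.0, 0.6, -1.0]])
dA_norm = normalize_delta_A(dA)
# All entries of dA_norm now lie in [-1, 1]; the largest-magnitude entry maps to +/-1.
```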
We note that sensors closer to the occupant should have larger reading changes between empty and occupied room states than those further away. To leverage this spatial relationship, we reshape the Ns × Nf matrix ∆A′ into a 3D tensor.

2.3. Benchmarks

We benchmark our proposed CNN against support vector regression (SVR) and K nearest neighbors (KNN) regression.
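The exact reshaping and upsampling are not spelled out at this point in the text, but a plausible reading consistent with Table 1 (input sizes 3×4, 7×10, 11×16 and 21×31 for upsampling factors 1, 3, 5 and 10) is that the Ns = 12 sensor readings are placed on the 3 × 4 ceiling grid and bilinearly interpolated, giving maps of size ((3−1)u+1) × ((4−1)u+1) for factor u. A minimal sketch under that assumption, with hypothetical helper names and row-major sensor ordering assumed:

```python
import numpy as np

def to_grid(delta_A_norm: np.ndarray, grid=(3, 4)) -> np.ndarray:
    """Place each sensor's Nf features on its cell of the ceiling grid.

    Input: (Ns, Nf) with Ns = grid[0]*grid[1]; output: (grid_h, grid_w, Nf).
    Row-major sensor ordering is an illustrative assumption.
    """
    Ns, Nf = delta_A_norm.shape
    return delta_A_norm.reshape(grid[0], grid[1], Nf)

def upsample_bilinear(img: np.ndarray, u: int) -> np.ndarray:
    """Bilinear upsampling of an (H, W) map to ((H-1)*u+1, (W-1)*u+1).

    These output sizes match Table 1: 3x4 -> 7x10 (u=3), 11x16 (u=5), 21x31 (u=10).
    """
    H, W = img.shape
    ys = np.linspace(0.0, H - 1, (H - 1) * u + 1)
    xs = np.linspace(0.0, W - 1, (W - 1) * u + 1)
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = (1 - wx) * img[np.ix_(y0, x0)] + wx * img[np.ix_(y0, x1)]
    bot = (1 - wx) * img[np.ix_(y1, x0)] + wx * img[np.ix_(y1, x1)]
    return (1 - wy) * top + wy * bot

# One feature channel from 12 sensors on a 3x4 grid, upsampled by a factor of 3
channel = to_grid(np.random.randn(12, 5))[:, :, 0]
up = upsample_bilinear(channel, 3)
```

Note that grid values are preserved exactly at the original sensor positions; only the in-between cells are interpolated.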
Both methods take the flattened and normalized ∆A′ matrix as the feature vector.

Table 1: Network parameters and average localization errors for different upsampling factors.

Upsampling factor                        1      3      5      10
Input channel size                       3×4    7×10   11×16  21×31
Conv1 kernel                             2×2    4×4    4×4    6×6
Conv1 stride                             (1,1)  (1,1)  (2,2)  (2,2)
Conv2 kernel                             2×2    2×2    3×3    4×4
Conv2 stride                             (1,1)  (1,1)  (1,1)  (2,2)
Dim. hidden layer                        170    300    500    750
Avg. localization error, private (cm)    6.43   6.25   5.69   6.47
Avg. localization error, public (cm)     7.89   7.96   8.04   8.61

In SVR, we train two regressors (with a Gaussian kernel) to estimate the x̂ and ŷ coordinates separately. We determine the optimal values of the two SVR tuning parameters, namely the box constraint C and the margin of tolerance to errors ε, using a grid search and 5-fold cross-validation.

In KNN regression, we use the Euclidean distance in the feature-vector space as the distance metric. For a test sample, we find the K closest samples (in the Euclidean-distance sense) in the training set. Then, we estimate the location of the test sample as the weighted centroid of the ground-truth locations of the K nearest neighbors, where each weight is the reciprocal of the corresponding Euclidean distance. The parameter K is optimized through 5-fold cross-validation.

We also compare the CNN localization performance against our best model-based localization algorithm [20]. This algorithm is based on a light reflection model [17] and assumes the floor to be Lambertian and objects to be flat. It first computes the change of floor albedo from the change in the light transport matrix, together with knowledge of the room dimensions and the locations of the sensors and fixtures, and then uses the centroid of the albedo change as the estimated location.
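The KNN regression benchmark described above can be sketched in a few lines of NumPy. All names are ours, and the small-epsilon guard against zero distances is an illustrative choice not specified in the text:

```python
import numpy as np

def knn_locate(x_test, X_train, locs_train, K=3, eps=1e-9):
    """Estimate an (x, y) location as the inverse-distance-weighted centroid
    of the ground-truth locations of the K nearest training samples
    (Euclidean distance in feature space)."""
    d = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distances
    nn = np.argsort(d)[:K]                         # indices of the K closest
    w = 1.0 / (d[nn] + eps)                        # reciprocal-distance weights
    return (w[:, None] * locs_train[nn]).sum(axis=0) / w.sum()

# Toy example: 4 training samples with 2-D features and known floor locations
X = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.]])
L = np.array([[0., 0.], [100., 0.], [0., 100.], [200., 200.]])
est = knn_locate(np.array([0.1, 0.1]), X, L, K=3)
```

Since the query is nearly on top of the first training sample, the estimate is pulled strongly toward its location, with smaller contributions from the two more distant neighbors.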
3. EXPERIMENTAL RESULTS
3.1. Simulation experiments

To validate the performance of our proposed CNN, we collected datasets in both a Unity3D-simulated environment and a real physical testbed. In the Unity3D simulation, we created a room with tables, doors and a screen as furniture, and a window that allows simulated sunlight to illuminate the interior. The size of the room and the placement of the furniture were chosen to best approximate our real testbed room. Part of the empty floor is used as the test area for data collection (2.8 m × 2 m). We placed 12 LED/sensor pairs on the ceiling on a 3 × 4 grid at a height of 2.71 m, simulated in the same way as described in our previous work [20]. To capture the body shape of an occupant more realistically, we used 8 human avatars that differ in height, weight, gender and clothing. All human avatars are in a standing pose. Our simulated room and the 8 human avatars are shown in Figure 4.

Fig. 4: Simulated room in Unity3D with test area shown (left) and 8 human avatars used in data collection (right).

We collected two datasets for each human avatar: the training set contains 1,530 samples with ground-truth locations on a 45 × 34 grid with 5 cm spacing, and the test set has 63 samples on a 9 × 7 grid with 25 cm spacing. There is no overlap between the training and test set grids. The ground-truth location of a human avatar is the projection of its centroid onto the floor. When placing an avatar at each ground-truth location on the grid, we rotated it around the vertical axis by an angle randomly chosen in the range [0°, 360°] to change its orientation. We followed the steps described in our previous work [20] to modulate the LEDs and obtain a light transport matrix A, and the difference matrix ∆A (by subtracting from A the light transport matrix A0 obtained for the empty room).

We considered two different scenarios when training our proposed CNN and the benchmark models: private and public. In a private scenario, like a home, the system can only be used by a small set of people, and therefore we can train a model with data from all users. In a public scenario, like a store, the model cannot be trained on all users, since there can always be new users never seen by the system.

In the private scenario, out of the 1,530×8 training samples we use only 50×8 samples (50 random samples for each avatar) for training. All the data-driven models are trained using the same set of samples. Then, we test each model on each avatar's 63 test samples (with 25 cm spacing), which are separate from the larger training set of 1,530×8 samples. In the public scenario, we perform a leave-one-person-out cross-validation: we train a model on 7 avatars (50 samples per avatar) and test it on the eighth avatar, and repeat this process 8 times so that each avatar is left out for testing once. The performance of a model is evaluated in terms of the average localization error.

We tested several choices of the upsampling factor for the input tensor of our CNN: 1 (no upsampling), 3, 5 and 10. As the input size is scaled, we scale the network to roughly match the input size by changing the network parameters. The network parameters and average localization errors for the different upsampling factors are shown in Table 1. The network with an upsampling factor of 5 performs best in the private scenario, while performing only slightly worse than the best case in the public scenario.

Figure 5 shows the average localization errors for each human avatar for the proposed CNN (upsampling factor of 5) and the 3 benchmark methods. The CNN approach reduces the average localization error across all avatars by 47.69% and 46.99% in the private and public scenarios, respectively, compared to the best-performing method among SVR, KNN and model-based.

[Fig. 5: plots of average localization error (cm) per avatar; legend entries include SVR and KNN regression.]

transport matrices for each location were then averaged to reduce noise. Before collecting data for each person, we ran the system for several modulation cycles in the empty state and averaged the obtained light transport matrices to obtain A0.

We also considered both private and public scenarios in the testbed experiments. In the private scenario, we randomly selected 50 samples from each person to form the training
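The leave-one-person-out protocol used for the public scenario can be sketched as follows. The `train_fn` and `error_fn` callables are placeholders for fitting and evaluating any of the models above; all names and the toy data are ours:

```python
import numpy as np

def leave_one_person_out(samples_by_person, train_fn, error_fn, n_train=50, seed=0):
    """Public-scenario evaluation: train on all persons but one, test on the
    held-out person, and average the localization error over all folds.

    samples_by_person: list of (features, locations) pairs, one per person.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for held_out in range(len(samples_by_person)):
        X_tr, y_tr = [], []
        for p, (X, y) in enumerate(samples_by_person):
            if p == held_out:
                continue
            # Draw n_train random samples (e.g., 50) from each training person
            idx = rng.choice(len(X), size=min(n_train, len(X)), replace=False)
            X_tr.append(X[idx]); y_tr.append(y[idx])
        model = train_fn(np.concatenate(X_tr), np.concatenate(y_tr))
        X_te, y_te = samples_by_person[held_out]
        errors.append(error_fn(model, X_te, y_te))  # error on the held-out person
    return float(np.mean(errors))

# Toy usage with a trivial "model" that predicts the mean training location
persons = [(np.random.randn(60, 8), np.random.rand(60, 2)) for _ in range(3)]
avg_err = leave_one_person_out(
    persons,
    train_fn=lambda X, y: y.mean(axis=0),
    error_fn=lambda m, X, y: float(np.linalg.norm(y - m, axis=1).mean()),
)
```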
5. REFERENCES

[1] Lionel M Ni, Yunhao Liu, Yiu Cho Lau, and Abhishek P Patil, "Landmarc: indoor location sensing using active rfid," Wireless Networks, vol. 10, no. 6, pp. 701–710, 2004.

[2] Jeffrey Hightower, Roy Want, and Gaetano Borriello, "Spoton: An indoor 3d location sensing technology based on rf signal strength," UW CSE 00-02-02, University of Washington, Department of Computer Science and Engineering, Seattle, WA, vol. 1, 2000.

[3] Roy Want, Andy Hopper, Veronica Falcao, and Jonathan Gibbons, "The active badge location system," ACM Transactions on Information Systems (TOIS), vol. 10, no. 1, pp. 91–102, 1992.

[4] Se-Hoon Yang, Hyun-Seung Kim, Yong-Hwan Son, and Sang-Kook Han, "Three-dimensional visible light indoor localization using aoa and rss with multiple optical receivers," Journal of Lightwave Technology, vol. 32, no. 14, pp. 2480–2485, 2014.

[5] Weizhi Zhang, MI Sakib Chowdhury, and Mohsen Kavehrad, "Asynchronous indoor positioning system based on visible light communications," Optical Engineering, vol. 53, no. 4, pp. 045105, 2014.

[6] Heidi Steendam, "A 3-d positioning algorithm for aoa-based vlp with an aperture-based receiver," IEEE Journal on Selected Areas in Communications, vol. 36, no. 1, pp. 23–33, 2018.

[11] Valery A Petrushin, Gang Wei, and Anatole V Gershman, "Multiple-camera people localization in an indoor environment," Knowledge and Information Systems, vol. 10, no. 2, pp. 229–241, 2006.

[12] Xue Wang and Sheng Wang, "Collaborative signal processing for target tracking in distributed wireless sensor networks," Journal of Parallel and Distributed Computing, vol. 67, no. 5, pp. 501–515, 2007.

[13] Wojciech Zajdel and Ben JA Kröse, "A sequential bayesian algorithm for surveillance with nonoverlapping cameras," International Journal of Pattern Recognition and Artificial Intelligence, vol. 19, no. 08, pp. 977–996, 2005.

[14] May Moussa and Moustafa Youssef, "Smart devices for smart environments: Device-free passive detection in real environments," in Pervasive Computing and Communications, 2009. PerCom 2009. IEEE International Conference on. IEEE, 2009, pp. 1–6.

[15] Eric A Wan and Anindya S Paul, "A tag-free solution to unobtrusive indoor tracking using wall-mounted ultrasonic transducers," in Indoor Positioning and Indoor Navigation (IPIN), 2010 International Conference on. IEEE, 2010, pp. 1–10.

[16] Douglas Roeper, Jiawei Chen, Janusz Konrad, and Prakash Ishwar, "Privacy-preserving, indoor occupant localization using a network of single-pixel sensors," in Advanced Video and Signal Based Surveillance (AVSS), 2016 13th IEEE International Conference on. IEEE, 2016, pp. 214–220.