USING REGION-BASED CONVOLUTIONAL NEURAL NETWORKS

Tianyu Tang, Shilin Zhou*, Zhipeng Deng, Lin Lei, Huanxin Zou
Fig. 2. Detection and annotation results on the test images. A red box denotes a correct localization, a blue box denotes a false alarm, and a black box denotes a missed detection. (a)-(c) are image blocks of the Munich test aerial images. (d)-(h) are images from the collected vehicle data set: (d) shows UAV images, (e) and (f) are satellite images, (g) is an original test image of the Munich dataset (5616×3744 pixels), and (h) is a large satellite image of Tokyo (18239×12837 pixels).
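The correct/false-alarm/missed categories in the caption above correspond to true positives, false positives, and false negatives. As a minimal sketch (not the paper's evaluation code), detections can be scored against ground-truth boxes by greedy IoU matching; the function names and the 0.5 IoU threshold below are illustrative assumptions:

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def score_detections(dets, gts, iou_thr=0.5):
    """Greedy matching: each detection pairs with at most one ground truth.

    Assumes `dets` is sorted by confidence, highest first.
    """
    matched = set()
    tp = 0
    for d in dets:
        best, best_iou = None, iou_thr
        for i, g in enumerate(gts):
            if i in matched:
                continue
            v = iou(d, g)
            if v >= best_iou:
                best, best_iou = i, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(dets) - tp  # false alarms (blue boxes)
    fn = len(gts) - tp   # missed detections (black boxes)
    precision = tp / float(len(dets)) if dets else 1.0
    recall = tp / float(len(gts)) if gts else 1.0
    return precision, recall
```

Sweeping the detector's confidence threshold and recomputing these two numbers at each cut-off is what traces out a PR curve.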
map, can improve the recall rate effectively. Compared with the ACF detector, ACF + Fast R-CNN has a higher precision rate owing to the deep features of Fast R-CNN. This proves that CNN-based methods have superior classification performance to traditional methods. Moreover, our method can process a large-scale aerial image (5616×3744 pixels) in 3.65 seconds.

In Fig. 3, precision and recall (PR) curves for the four test images are shown. These curves show that our method achieves the best recall and precision. To further validate the ability of our method on aerial images of different scales, we resized the images for testing but not for training. The detection results on scaled test image 2 are shown in Fig. 4. Our method performs best at the scale on which it was trained. The performance remains comparable under small scale factors, but our method does not perform well under larger scale factors.

Fig. 3. PR curves of four test images in the Munich dataset with different methods.

Fig. 4. Performance after rescaling the image with different factors.

In conclusion, our method achieves the best results in both speed and accuracy, and has some transfer capability.

CONCLUSION

In this paper, we present a coupled region-based CNN method for fast and accurate vehicle detection in large-scale aerial images. Experimental results show that our method is faster and more accurate than existing algorithms, and is effective for images captured from UAVs or downloaded from Google Earth. However, our method still produces some false detections, as well as missed detections. In future study, we will focus on mining hard negative samples to reduce false detection. Additionally, we will investigate how to improve the robustness of our method for object detection in large-scale satellite imagery.

ACKNOWLEDGMENT

The work is supported by the National Natural Science Foundation of China under Grant 61331015. The authors would like to thank Kang Liu and Gellert Mattyus, who generously provided their image data set with the ground truth.

REFERENCES

[1] J. Leitloff, D. Rosenbaum, F. Kurz, O. Meynberg, and P. Reinartz, “An operational system for estimating road traffic information from aerial images,” Remote Sensing, vol. 6, no. 11, pp. 11315–11341, 2014.
[2] K. Liu and G. Mattyus, “Fast multiclass vehicle detection on aerial images,” IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 9, pp. 1–5, 2015.
[3] T. Moranduzzo and F. Melgani, “Automatic car counting method for unmanned aerial vehicle images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 3, pp. 1635–1647, 2014.
[4] T. Moranduzzo and F. Melgani, “Detecting cars in UAV images with a catalog-based approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 10, pp. 6356–6367, 2014.
[5] Z. Chen, C. Wang, H. Luo, and H. Wang, “Vehicle detection in high-resolution aerial images based on fast sparse representation classification and multiorder feature,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–14, 2016.
[6] H. Y. Cheng, C. C. Weng, and Y. Y. Chen, “Vehicle detection in aerial surveillance using dynamic Bayesian networks,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2152–2159, 2012.
[7] W. Shao, W. Yang, G. Liu, and J. Liu, “Car detection from high-resolution aerial imagery using multiple features,” IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 4379–4382, 2012.
[8] S. Kluckner, G. Pacher, H. Grabner, and H. Bischof, “A 3D teacher for car detection in aerial images,” IEEE International Conference on Computer Vision (ICCV), pp. 1–8, 2007.
[9] A. Kembhavi, D. Harwood, and L. S. Davis, “Vehicle detection using partial least squares,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 6, pp. 1250–1265, 2010.
[10] Z. Chen, C. Wang, C. Wen, and X. Teng, “Vehicle detection in high-resolution aerial images via sparse representation and superpixels,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 1, pp. 1–14, 2015.
[11] X. Chen, S. Xiang, C. L. Liu, and C. H. Pan, “Vehicle detection in satellite images by hybrid deep convolutional neural networks,” IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 10, pp. 1797–1801, 2014.
[12] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Region-based convolutional networks for accurate object detection and segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2016.
[13] R. Girshick, “Fast R-CNN,” IEEE International Conference on Computer Vision (ICCV), 2015.
[14] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–1, 2016.
[15] A. Ghodrati, M. Pedersoli, T. Tuytelaars, A. Diba, and L. Van Gool, “DeepProposal: Hunting objects by cascading deep convolutional layers,” IEEE International Conference on Computer Vision (ICCV), 2015.
[16] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” European Conference on Computer Vision (ECCV), 2014.
[17] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” ACM International Conference on Multimedia, pp. 675–678, 2014.
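Detecting vehicles in very large frames such as the 5616×3744 Munich images is commonly done by splitting the frame into overlapping tiles, running the detector per tile, and shifting each tile's boxes back to global coordinates. The sketch below illustrates that tiling scheme only; the `detect` callable, the 1024-pixel tile size, and the 128-pixel overlap are assumptions for illustration, not values from this paper:

```python
def tile_offsets(size, tile, overlap):
    """Start offsets covering `size` with `tile`-sized windows sharing `overlap` pixels."""
    step = tile - overlap
    offs = list(range(0, max(size - tile, 0) + 1, step))
    if offs[-1] + tile < size:   # ensure the final tile reaches the image border
        offs.append(size - tile)
    return offs

def detect_large_image(image_w, image_h, detect, tile=1024, overlap=128):
    """Run `detect(x0, y0, x1, y1)` on each tile; shift its boxes to global coordinates.

    `detect` is a hypothetical per-tile detector returning (x1, y1, x2, y2, score)
    tuples in tile-local coordinates.
    """
    boxes = []
    for y0 in tile_offsets(image_h, tile, overlap):
        for x0 in tile_offsets(image_w, tile, overlap):
            for (bx0, by0, bx1, by1, score) in detect(x0, y0, x0 + tile, y0 + tile):
                boxes.append((bx0 + x0, by0 + y0, bx1 + x0, by1 + y0, score))
    # Duplicate boxes in the overlap regions would normally be removed
    # afterwards by non-maximum suppression.
    return boxes
```

The overlap is chosen larger than a vehicle so that any object cut by one tile boundary appears whole in a neighboring tile.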