A Review On Content Piracy, Logo Detection and Recognition
D. Jayadevappa, Department of Electronics and Instrumentation Engineering, JSS Academy of Technical Education, Bangalore,
India. E-mail:devappa22@gmail.com
B.R. Prakash, Department of Master of Computer Application, Sri Siddhartha Institute of Technology (SSIT), Tumkur, India.
E-mail:brp.tmp@gmail.com
Abstract--- With the advent of new technologies, the availability of internet bandwidth and the consumption of
internet services are rapidly increasing. To keep pace with this bandwidth, device manufacturers release new
devices every year with higher CPU processing capability and enhanced displays for a rich content viewing experience.
Given these device capabilities and the sheer volume of internet usage, pay television has evolved from a traditional
model of cable and satellite transmission to include over the top (OTT) services. Online OTT services
will eventually change the end user experience by providing personalized content. Though the end user enjoys this
paradigm shift, it raises new challenges for content owners and service providers in protecting content from
piracy. An efficient and effective active visual processing system should be deployed to track illegal redistribution
of pirated premium content in real time. However, such a system faces the particular challenge of accurate
detection and recognition of content. The role of Artificial Intelligence (AI) in the media industry has recently grown
significantly because it enables visual processing without human intervention. Hence, by integrating AI for
automatically tracking illegal redistribution of pirated content, a state-of-the-art visual processing system could be
developed to handle real scenarios. This paper focuses on past research and development in the field of visual object
detection and recognition, where computer vision systems assist in video analytics. The aim of this review is to
summarize and compare some of the approaches tried by previous researchers and to indicate directions for
future research.
Keywords--- Artificial Intelligence, Computer Vision, Over the Top, Machine Learning, Content Delivery Network.
I. INTRODUCTION
Artificial Intelligence (AI) solutions are growing rapidly, and AI looks set to carve out a unique
space in everyday human activities. From self-driving cars, automatic surveillance and online
purchase prediction to security applications, AI is at the core of these technologies. The media industry is no
exception: AI is now being applied to securing content against piracy. Illegal redistribution of
content has become a serious menace and poses a major threat to the content development and distribution business.
Live transmission of real-time events, especially sports, attracts large viewership across the globe. These events
are termed high value. Pirates target these events in particular: they capture the decrypted content, recompress it
and stream it through various methods. Popular methods for streaming this illegal content include
peer-to-peer (P2P) live streaming and web streaming [1]. A content delivery network (CDN) may or may
not be associated with direct web streaming.
TVAnts [2], SopCast [3] and PPLive [4] are popular commercial peer-to-peer streaming
systems. They are regarded as first generation peer-to-peer systems, providing low quality
video content (200-400 Kbit/s). Joost [5], Babelgum [6] and TVUnetworks [7] are regarded as next generation
peer-to-peer systems, providing high quality video content (1-10 Mbit/s) [9]. Most illegal content
redistribution, however, happens through direct web streaming protocols. Unlike peer-to-peer streaming, web streaming
follows a client-server architecture in which users access the illegal content either through a single
streaming server or through a content delivery network (CDN). As the number of users rises, the costs
associated with server network bandwidth and network infrastructure maintenance increase.
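The scaling of server bandwidth with audience size can be made concrete with a rough calculation. The sketch below assumes that every viewer receives a separate unicast copy of the stream from a single server; the function name and the viewer count are hypothetical, and the bitrate is simply picked from the 1-10 Mbit/s range quoted above.

```python
def server_egress_mbps(viewers: int, bitrate_mbps: float) -> float:
    """Aggregate outbound bandwidth a single streaming server must supply
    when each viewer receives its own unicast copy of the stream."""
    if viewers < 0 or bitrate_mbps < 0:
        raise ValueError("viewers and bitrate must be non-negative")
    return viewers * bitrate_mbps


# Illustrative numbers only (assumption): 10,000 viewers at 2 Mbit/s each.
total = server_egress_mbps(10_000, 2.0)
print(total)  # 20000.0 Mbit/s, i.e. 20 Gbit/s
```

Under this assumption the load grows linearly with the number of viewers, which is why operators of such streams tend to move from a single server to a CDN as the audience grows.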
Pirates use a legitimate subscription to capture the content. The live decrypted content is recorded
using a high-end video capture device. It then undergoes limited modification, is recompressed and is finally
streamed to content streaming platforms using standard protocols (for example, Real Time Messaging Protocol and
Real Time Streaming Protocol). Several content streaming platforms are available for streaming illegal
live content (e.g. hdcast.org, jjcast.com). The streaming URL of the content streaming platform, or the link to a
web page that embeds a Flash video player, is then advertised on Google or on social networking sites (Facebook or
Twitter). The retransmission of pirated content may or may not include metadata. However, it is quite difficult
to differentiate actual content from illegal content even using metadata, because the metadata pushed
by pirates is typically neither structured nor consistent.
Video frames need to be analyzed to check whether broadcast video content is original or pirated. Analyzing
video frames becomes more difficult when high value events such as sports are
broadcast on multiple physical channels. Analysis of video frames includes extracting visual objects and
comparing them with the source content. The mapping of visual objects depends on factors such as content category
(sports, reality shows, news, etc.) as discussed in [10], facial recognition [11] and the availability of a logo. The
broadcaster logo is the key to distinguishing pirated content from the original source. The broadcaster logo within the
redistributed content has typically been altered significantly in encoding quality and visibility
(Figure 1), which makes recognition of the broadcaster logo extremely unreliable.
logo of some geometric shape and hence create negative feature. This approach is effective to classify all logos that
belong to same class.
The technique in [15] explains in detail the use of a wavelet-based method and a negative shape-based method.
The negative shape method uses local shape data, while the wavelet-based method uses global data of the logos
available in the image database. The two methods are applied to different kinds of images to understand their
strengths and weaknesses. The wavelet-based approach is observed to perform better than the negative shape approach
when a high amount of random noise is introduced into the images. Based on these results, a new method is proposed
that combines the strengths of the two methods for logo classification while avoiding their weaknesses.
The work in [16] focuses on sign detection and recognition based on a conditional maximum entropy model and a
matching model. Sign recognition matches the sign regions identified by sign detection against the sign classes
available in a database, and the closest match is selected. A mean accuracy of 99.5% is achieved for 35 signs and
90.4% for 65 signs.
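The closest-match step can be illustrated with a nearest-neighbour lookup. This is a minimal sketch, not the matching model of [16]: the feature vectors, the class labels, the use of Euclidean distance and the function name closest_class are all assumptions made for illustration.

```python
import math


def closest_class(region_features, database):
    """Return the label whose reference vector is nearest (Euclidean
    distance) to the feature vector of a detected region."""
    if not database:
        raise ValueError("database is empty")
    return min(database, key=lambda label: math.dist(region_features, database[label]))


# Toy database of reference feature vectors (hypothetical values).
database = {
    "stop": [1.0, 0.0, 0.0],
    "exit": [0.0, 1.0, 0.0],
    "yield": [0.0, 0.0, 1.0],
}

print(closest_class([0.9, 0.2, 0.1], database))  # stop
```

A practical system would normally also reject matches whose distance exceeds some threshold, so that regions belonging to no known class are not forced onto the nearest one.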
The work in [17] describes an algorithm for automatically detecting and recognizing trademarks in videos. SIFT is
used to detect and describe interest points in video frames. Frame selection is then based on an SVM classifier with
an RBF kernel. The system was evaluated on frames containing partial occlusion and achieved 85% precision in
matching.
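For reference, the RBF kernel used by such an SVM measures the similarity of two feature vectors as K(x, y) = exp(-gamma * ||x - y||^2). The sketch below only evaluates this kernel; the value of gamma and the sample vectors are illustrative assumptions, since the parameters used in [17] are not given here.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum similarity of 1.0;
# similarity decays towards 0 as the vectors move apart.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0]))
```

A larger gamma makes the kernel more local, so only very similar descriptors contribute to the decision; in practice it is usually tuned by cross-validation.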
The work in [18] proposes an approach called spatial pyramid mining for logo detection. However, depending on the
image resolution, multiple local features must be mined, which is tedious and limits the approach. The work in [19]
discusses using spatial-spectral saliency (SSS) and partial spatial context (PSC) to achieve good accuracy and
quick logo localization.
In [20], stable regions in the logo are detected using a time-averaged edge algorithm, and logo classification is
then performed using independent component analysis (ICA) architecture 2 on 32x32 macro pixel logo samples. Figure 2
and Figure 3 illustrate the flow charts for logo detection and logo recognition.
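The idea behind time-averaged edges is that a broadcaster logo stays fixed while the programme content moves, so edges that persist after averaging over many frames are likely to belong to the logo. The following is a minimal sketch of that idea and not the exact algorithm of [20]: the forward-difference edge operator, the threshold value and the tiny synthetic frames are assumptions.

```python
def edge_map(frame):
    """Edge strength per pixel as |dx| + |dy| using forward differences."""
    h, w = len(frame), len(frame[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = frame[y][x + 1] - frame[y][x] if x + 1 < w else 0
            dy = frame[y + 1][x] - frame[y][x] if y + 1 < h else 0
            edges[y][x] = abs(dx) + abs(dy)
    return edges


def time_averaged_edges(frames):
    """Average the edge maps of all frames pixel by pixel."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for frame in frames:
        edges = edge_map(frame)
        for y in range(h):
            for x in range(w):
                acc[y][x] += edges[y][x]
    return [[value / len(frames) for value in row] for row in acc]


def stable_mask(frames, threshold):
    """Mark pixels whose averaged edge strength reaches the threshold."""
    return [[value >= threshold for value in row] for row in time_averaged_edges(frames)]


def make_frame(k):
    """Synthetic 3x4 frame: a static bright 'logo' pixel in the top row
    and a bright pixel that moves along the bottom row with k."""
    frame = [[0] * 4 for _ in range(3)]
    frame[0][1] = 90   # static logo pixel
    frame[2][k] = 90   # moving content
    return frame


frames = [make_frame(k) for k in range(3)]
mask = stable_mask(frames, threshold=80)
for row in mask:
    print(row)
```

In this toy input only the pixels around the static logo survive the threshold, because the edges of the moving pixel are spread over different positions and are diluted by the averaging.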
Recently, machine learning techniques have been used for logo feature matching and classification. Support vector
machines (SVM), bag of words (BoW) and neural networks (NN) are commonly used algorithms for logo classification.
Convolutional neural networks (CNN) are on the rise in applications specific to image recognition, and of late
CNNs have been applied directly to logo detection and recognition problems.
The graph transformer network is a learning paradigm introduced in [21], which highlights incorporating
automatic learning to achieve state-of-the-art results in terms of accuracy and performance. A graph transformer
network is integrated with character recognition based on a convolutional neural network for reading bank cheques.
This automated system has been deployed commercially to read bank cheques.
Convolutional neural networks have recently become popular because of their state-of-the-art results on object
detection problems. Bounding boxes are generated using a region proposal method such as selective search [30].
Combining a CNN with region proposals has given good results. Algorithms such as Faster R-CNN and the single shot
multibox detector (SSD) [31] provide high performance and accuracy for object detection problems.
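The propose-then-classify structure of such detectors can be sketched as follows. Both stages are deliberately replaced by stand-ins: fixed-size sliding windows take the place of selective search, and mean brightness takes the place of a CNN score. All function names and the toy image are hypothetical, so this shows only the control flow, not a real detector.

```python
def propose_windows(width, height, size, stride):
    """Stand-in for a region proposal method: fixed-size sliding
    windows returned as (x, y, w, h) bounding boxes."""
    return [(x, y, size, size)
            for y in range(0, height - size + 1, stride)
            for x in range(0, width - size + 1, stride)]


def mean_brightness(image, box):
    """Stand-in for a CNN classifier score: mean pixel value in the box."""
    x, y, w, h = box
    total = sum(image[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return total / (w * h)


def detect(image, size=2, stride=1, score_fn=mean_brightness):
    """Score every proposed box and return the best box with its score."""
    height, width = len(image), len(image[0])
    boxes = propose_windows(width, height, size, stride)
    best = max(boxes, key=lambda box: score_fn(image, box))
    return best, score_fn(image, best)


# Toy 4x4 image with a bright 2x2 "logo" on the right-hand side.
image = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
print(detect(image))  # ((2, 1, 2, 2), 9.0)
```

In a real pipeline the proposal stage would return far fewer, better-targeted boxes, and score_fn would be a trained network that outputs class probabilities for each box.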
IV. CONCLUSION
This paper explained the basic steps involved in creating illegal broadcast content and redistributing it.
It discussed leveraging machine learning techniques to track and detect illegal redistribution of pirated content,
reviewed the major technological perspectives and key progress in logo detection and recognition, and discussed
various approaches for designing an automatic logo detection and recognition system. Content piracy
threats should be dealt with as a high priority to avoid heavy financial losses. This requires a quick and efficient
logo detection and recognition system that can be applied to real-time scenarios. Based on the summary and analysis
discussed, the accuracy and response time of logo detection and classification systems can be strengthened
for real-time scenarios. Future research could be directed at developing robust logo detection and
classification systems with improved detection and classification accuracy.
REFERENCES
[1] Leporini, D. Architectures and protocols powering illegal content streaming over the Internet. International
Broadcasting Convention (IBC), 2015.
[2] TVAnts, http://www.tvants.com.
[3] SOPCast, http://www.sopcast.com.
[4] PPLive, http://www.pplive.com.
[5] Joost, http://www.joost.com.
[6] Babelgum, http://www.babelgum.com.
[7] Tvunetworks, http://www.tvunetworks.com.
[8] Stolikj, M., Jarnikov, D. and Wajs, A. Artificial Intelligence in Media-Making security smarter.
International Broadcasting Convention (IBC), 2017.
[9] Horvath, A., Telek, M., Rossi, D., Veglia, P., Ciullo, D., Garcia, M.A., Leonardi, E. and Mellia, M.
Dissecting pplive, sopcast, tvants. submitted to ACM Conext, 2008.
[10] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R. and Fei-Fei, L. Large-scale Video
Classification with Convolutional Neural Networks. IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2014, 1725-1732.
[11] Parkhi, O.M., Vedaldi, A. and Zisserman, A. Deep Face Recognition. British Machine Vision Conference,
2015.
[12] Psyllos, A.P., Anagnostopoulos, C.N.E. and Kayafas, E. Vehicle logo recognition using a sift-based
enhanced matching scheme. IEEE Transactions on Intelligent Transportation Systems 11 (2) (2010) 322–
328.
[13] Gao, Y., Wang, F., Luan, H. and Chua, T.S. Brand data gathering from live social media streams.
Proceedings of International Conference on Multimedia Retrieval, 2014.
[14] Soffer, A. and Samet, H. Using Negative Shape Features for Logo Similarity Matching. Proceedings ICPR,
1998, 571-573.
[15] Neuman, J., Samet, H. and Soffer, A. Integration of local and Global Shape Analysis for Logo
classification. Pattern recognition letters 23 (2002) 1449-1457.
[16] Mattar, M.A., Hanson, A.R. and Learned-Miller, E.G. Sign Classification using Local and Meta-Features.
IEEE CVPR Workshop on computer vision applications for visually impaired, San Diego, CA, 2005.
[17] Ballan, L., Bertini, M. and Jain, A. A System for Automatic Detection and Recognition of Advertising
Trademarks in Sports Videos. Proc.of ACM International conference on Multimedia (MM), 2008.
[18] Kleban, J., Xie, X. and Ma, W.Y. Spatial Pyramid Mining for Logo Detection in Natural Scenes. Proc. of
IEEE ICME, Hannover, Germany, 2008.
[19] Gao, K., Lin, S., Zhan, Y. and Tang, S. Logo detection based on spatial-spectral saliency and partial spatial
context. Proc. of IEEE ICME, New York, USA, 2009.
[20] Ozay, N. and Sankur, B. Automatic TV logo detection and classification in broadcast videos. IEEE
conference on signal processing, 2009.
[21] LeCun, Y., Bottou, L., Bengio, Y. and Haffner, P. Gradient based learning applied to document
recognition. Proceedings of the IEEE 86 (11) (1998).
[22] Girshick, R., Donahue, J., Darrell, T. and Malik, J. Rich feature hierarchies for accurate object detection
and semantic segmentation. CVPR, 2014.
[23] Iandola, F.N., Shen, A., Gao, P. and Keutzer, K. DeepLogo: Hitting Logo Recognition with the Deep
Neural Network Hammer. https://arxiv.org/abs/1510.02131, 2015.
[24] Hoi, S.C.H., Wu, X., Liu, H., Wu, Y., Wang, H., Xue, H. and Wu, Q. Large-scale Deep Logo Detection
and Brand Recognition with Deep Region-based Convolutional Networks. arXiv preprint
arXiv:1511.02462 (2015).
[25] Sahbi, H., Ballan, L., Serra, G. and Bimbo, A.D. Context-Dependent Logo Matching and Recognition.
IEEE Transactions on Image Processing 22 (2012) 1018-1031.
[26] Kalantidis, Y., Pueyo, L.G., Trevisiol, M., Zwol, R.V. and Avrithis, Y. Scalable triangulation-based logo
recognition. ACM International Conference on Multimedia Retrieval, 2011.
[27] Bagdanov, A.D., Ballan, L., Bertini, M. and Bimbo, A.D. Trademark Matching and Retrieval in Sports
Video Databases. ACM International conference on multimedia information retrieval, 2007, 79-86.
[28] Zhu, G. and Doermann, D. Automatic document logo detection. Conference on Document Analysis and
Recognition 02 (2007), 864–868.
[29] Lowe, D.G. Object recognition from local scale-invariant features. Conference on Computer Vision 2
(1999).
[30] Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T. and Smeulders, A.W.M. Selective search for object
recognition. International Journal of Computer Vision 104 (2) (2013) 154-171.
[31] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y. and Berg, A.C. SSD: Single shot
multibox detector. European Conference on Computer Vision, Springer, 2016, 21-37.