
Jour of Adv Research in Dynamical & Control Systems, Vol. 10, 08-Special Issue, 2018

A Review on Content Piracy, Logo Detection and Recognition
J.P. Kiran Kumar, Department of Computer Science and Engineering, Sri Siddhartha Institute of Technology (SSIT), Tumkur,
India. E-mail: jpkiran@gmail.com

D. Jayadevappa, Department of Electronics and Instrumentation Engineering, JSS Academy of Technical Education, Bangalore,
India. E-mail: devappa22@gmail.com

B.R. Prakash, Department of Master of Computer Application, Sri Siddhartha Institute of Technology (SSIT), Tumkur, India.
E-mail: brp.tmp@gmail.com
Abstract--- With the advent of new technologies, both the availability of internet bandwidth and internet consumption
are increasing rapidly. To keep pace, device manufacturers release new devices every year with higher CPU
processing capability and enhanced displays for a richer content viewing experience. Given these device capabilities
and the sheer volume of internet traffic, pay television has evolved from the traditional model of cable and satellite
transmission to include over the top (OTT) services. Online OTT services will eventually change the end user
experience by providing personalized content. Although end users enjoy this paradigm shift, it poses new challenges
to content owners and service providers in protecting content from piracy. An efficient and effective active visual
processing system should be deployed to track illegal redistribution of pirated premium content in real time.
However, such a system faces the unique challenge of detecting and recognizing content accurately. The role of
Artificial Intelligence (AI) in the media industry has recently grown significantly because it eases visual processing
without human intervention. Hence, by integrating AI to automatically track illegal redistribution of pirated content,
a state of the art visual processing system could be developed to handle real scenarios. This paper focuses on past
research and development in the field of visual object detection and recognition, where computer vision systems
assist in video analytics. The aim of this review is to summarize and compare some of the approaches tried by
previous researchers and to indicate directions for future research.
Keywords--- Artificial Intelligence, Computer Vision, Over the Top, Machine Learning, Content Delivery Network.

I. INTRODUCTION
Artificial Intelligence (AI) solutions are spreading rapidly, and AI looks set to carve out a unique space in
everyday human activities. From self-driving cars, automatic surveillance and online purchase prediction to
security applications, AI is at the core of the technology. The media industry is no exception: AI is now being
applied to secure content against piracy. Illegal redistribution of content has become a serious menace and
threatens the content development and distribution business. Live transmission of real time events, especially
sports, attracts large viewership across the globe, and such events are termed high value. Pirates target these
events in particular: they capture the decrypted content, recompress it and stream it through various methods.
Popular methods for streaming this illegal content are peer-to-peer (P2P) live streaming and web streaming [1].
A content delivery network (CDN) may or may not be involved in direct web streaming.
TVAnts [2], SopCast [3] and PPLive [4] are popular commercial peer-to-peer streaming systems. They are
regarded as first generation peer-to-peer systems and provide low quality video content (200-400 Kbit/s).
Joost [5], Babelgum [6] and TVUnetworks [7] are regarded as next generation peer-to-peer systems and provide
high quality video content (1-10 Mbit/s) [9]. Most illegal content redistribution happens through direct web
streaming protocols. Unlike peer-to-peer streaming, web streaming follows a client-server architecture in which
users access the illegal content either through a single streaming server or through a content delivery network
(CDN). As the number of users rises, the costs associated with server network bandwidth and network
infrastructure maintenance increase.

ISSN 1943-023X 1408


Received: 5 Apr 2018/Accepted: 15 May 2018

Pirates use a legitimate subscription to capture the content. The live decrypted content is recorded with a high
end video capture device, undergoes limited modification, is recompressed and is finally streamed to content
streaming platforms using standard protocols (for example, Real Time Messaging Protocol and Real Time
Streaming Protocol). Several content streaming platforms are available for streaming illegal live content
(e.g. hdcast.org, jjcast.com). The streaming URL of the platform, or the link to a web page that embeds a Flash
video player, is then advertised on Google or on social networking sites (Facebook or Twitter). The
retransmitted pirated content may or may not include metadata. Even with metadata, however, it is quite
difficult to differentiate actual content from illegal content, because the metadata pushed by pirates is neither
structured nor consistent.
Video frames need to be analyzed to check whether broadcast video content is original or pirated. The analysis
becomes more difficult when high value events such as sports are broadcast on multiple physical channels.
Analysis of video frames includes extracting visual objects and comparing them with the source content. The
mapping of visual objects differs based on content category (sports, reality shows, news, etc.) as discussed in
[10], facial recognition [11], availability of a logo, etc. The broadcaster logo is the key to distinguishing pirated
content from the original source. However, the broadcaster logo within redistributed content will have undergone
significant changes in encoding quality and visibility (Figure 1), which severely degrades recognition of the
broadcaster logo.

Figure 1: Distorted Logos due to Rebroadcasting [8]


Machine learning can be leveraged to continuously process the video frames of streams advertised through Google
and social networking sites. The aim of this processing is to identify the broadcaster logo and hence to confirm the
original source of the content.

II. RELATED WORK


Logo Detection and Recognition
Logo detection and recognition is a key problem in many applications across different domains, such as TV logo
classification [20], vehicle logo recognition [12], product brand recognition [13] and piracy control for TV
broadcasting platforms. The difficulty of logo detection and recognition in a video stream increases with the
transformations (rotation, scaling and translation) applied to the logo and with the amount of text and graphics
combined within the logo. Logo detection and recognition can be divided into three parts, as shown in Table 1.

Table 1: Definition of Logo Detection and Recognition

Categorization    Definition
Detection         Detect if any logo of interest is available in the input
Localization      Provides exact location of the logo available in the input
Classification    Matching the logo available in the input with samples
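
The three categories in Table 1 can be illustrated with a small sketch. It is only an illustration and not a method taken from the reviewed papers: it assumes toy binary images stored as nested lists, uses exhaustive template matching as a stand-in technique, and the names `match_score` and `find_logo` and the 0.9 threshold are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]  # toy binary image: rows of 0/1 pixels (assumed format)


def match_score(frame: Image, template: Image, top: int, left: int) -> float:
    """Fraction of template pixels equal to the frame pixels at (top, left)."""
    h, w = len(template), len(template[0])
    same = sum(
        1
        for r in range(h)
        for c in range(w)
        if frame[top + r][left + c] == template[r][c]
    )
    return same / (h * w)


def find_logo(
    frame: Image, templates: Dict[str, Image], threshold: float = 0.9
) -> Tuple[bool, Optional[Tuple[int, int]], Optional[str]]:
    """Return (detected, location, label) for the best-matching template.

    detected -> Detection, location -> Localization, label -> Classification.
    The threshold is a hypothetical value chosen for this illustration.
    """
    best_score, best_loc, best_label = 0.0, None, None
    for label, template in templates.items():
        h, w = len(template), len(template[0])
        # Slide the template over every position where it fits in the frame.
        for top in range(len(frame) - h + 1):
            for left in range(len(frame[0]) - w + 1):
                score = match_score(frame, template, top, left)
                if score > best_score:
                    best_score, best_loc, best_label = score, (top, left), label
    if best_score >= threshold:
        return True, best_loc, best_label
    return False, None, None
```

A single call therefore answers all three questions of Table 1: whether a logo is present, where it is, and which known sample it matches. Real systems replace the pixel comparison with features that tolerate the distortions introduced by rebroadcasting.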

Detection and Recognition Methods


Researchers have experimented with numerous techniques for logo detection and recognition. The paper [14]
discusses the use of positive and negative shape features to classify the logo of interest. The concept of a negative
feature is to carve out some geometric shapes for the given logo. This means that a boundary of some geometric
shape is added around the logo, which creates the negative feature. This approach is effective for classifying all
logos that belong to the same class.
The technique in [15] explains in detail the use of a wavelet based method and a negative shape based method.
The negative shape method uses local shape data, while the wavelet based method uses global data of the logos
available in the image database. The two methods are applied to different kinds of images to understand their
strengths and weaknesses. The wavelet based approach is observed to perform better than the negative shape
approach when a high amount of random noise is introduced into the images. Based on these results, a new method
for logo classification is proposed that combines the strengths of the two methods while avoiding their weaknesses.
The work in [16] focuses on sign detection and recognition based on a conditional maximum entropy model and a
matching model. Sign recognition matches the sign regions identified by sign detection against the sign classes
available in a database, and the closest match is selected. A mean accuracy of 99.5% is achieved for 35 signs and
90.4% for 65 signs.
The work in [17] describes an algorithm to automatically detect and recognize trademarks in videos. SIFT is used
to detect and describe interest points in video frames, and frame selection is then performed by an SVM classifier
with an RBF kernel. The system used frames with partial occlusion and achieved 85% precision in matching.
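
To make the RBF kernel mentioned above concrete, the following minimal sketch evaluates the kernel K(x, y) = exp(-gamma * ||x - y||^2) and the standard kernel SVM decision function. It is not the implementation of [17]: the support vectors, coefficients, bias and `gamma` value are hypothetical placeholders that a real system would obtain by training on feature vectors.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    dual_coefs: Sequence[float],
    bias: float,
    gamma: float = 0.5,
) -> float:
    """Kernel SVM decision value: f(x) = sum_i coef_i * K(sv_i, x) + bias.

    The sign of f(x) gives the predicted class. All parameters here are
    hypothetical; in practice they come from training the classifier.
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias
```

Because the kernel value decays with squared distance, a sample is scored mainly by the support vectors closest to it, which is what lets an RBF SVM draw non-linear decision boundaries.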
The work in [18] proposes an approach called spatial pyramid mining for logo detection. However, depending on the
image resolution, multiple local features need to be mined, which becomes tedious and limits the approach. The
work in [19] discusses the use of spatial spectral saliency (SSS) and partial spatial context (PSC) to achieve good
accuracy and quick logo localization.
In [20], stable regions in the logo are detected using a time-averaged edge algorithm, and logo classification is
then done using Independent Component Analysis (ICA) architecture 2 on 32x32 macro pixel logo samples.
Figure 2 and Figure 3 illustrate the flow charts for logo detection and logo recognition.
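
The idea behind time-averaged edges can be sketched as follows: a static logo produces the same edges in every frame, so its edges survive averaging over time, while edges of moving programme content are averaged away. This is a simplified illustration and not the exact algorithm of [20]; the forward-difference edge operator, the frame format (nested lists of grayscale values) and the threshold are assumptions.

```python
from typing import List

Frame = List[List[float]]  # grayscale frame as rows of pixel values (assumed format)


def edge_map(frame: Frame) -> Frame:
    """Approximate edge strength as |dx| + |dy| using forward differences."""
    h, w = len(frame), len(frame[0])
    edges = [[0.0] * w for _ in range(h)]
    for r in range(h - 1):
        for c in range(w - 1):
            dx = frame[r][c + 1] - frame[r][c]
            dy = frame[r + 1][c] - frame[r][c]
            edges[r][c] = abs(dx) + abs(dy)
    return edges


def time_averaged_edges(frames: List[Frame]) -> Frame:
    """Average the per-frame edge maps over all frames."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for frame in frames:
        edges = edge_map(frame)
        for r in range(h):
            for c in range(w):
                acc[r][c] += edges[r][c]
    return [[value / len(frames) for value in row] for row in acc]


def stable_mask(frames: List[Frame], threshold: float) -> List[List[int]]:
    """Mark pixels whose averaged edge strength reaches the (assumed) threshold."""
    averaged = time_averaged_edges(frames)
    return [[1 if value >= threshold else 0 for value in row] for row in averaged]
```

The resulting mask is a candidate logo region that a later classification stage can crop and compare against known logo samples.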

Figure 2: Flow Chart for Logo Detection [20]


Recently, machine learning techniques have been used for logo feature matching and classification. Support vector
machines (SVM), bag of words (BoW) and neural networks (NN) are commonly used algorithms for logo
classification. Convolutional neural networks (CNN) are on the rise in applications specific to image recognition,
and of late CNNs have been used explicitly to address logo detection and recognition problems.
The graph transformer network is a learning paradigm described in [21], which highlights the incorporation of
automatic learning to achieve state of the art results in terms of accuracy and performance. The graph transformer
network is integrated with character recognition based on a convolutional neural network for reading bank cheques.
This automated system is deployed commercially to read bank cheques.

Figure 3: Flow Chart for Logo Recognition [20]


The system learns continuously and hence keeps improving in accuracy and performance. The work in [22] describes
an object detection algorithm that is simple and scalable. Relative to the previous best result on PASCAL VOC 2012,
this algorithm provides a 30% improvement in object detection. The approach has two parts. First, convolutional
neural networks are applied to bottom-up region proposals in order to detect, localize and segment objects. Second,
it provides a way to train convolutional neural networks when labeled training data is scarce. Because the CNN is
combined with regions, the method is known as R-CNN. R-CNN achieves 53.7% mean average precision on PASCAL VOC
2012, while on the 200-class ILSVRC2013 dataset it achieves a mean average precision of 31.4%. Table 2 summarizes
previous research, highlighting the nature of the data used, the detection technique, the recognition technique and
the accuracy of each algorithm.
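
Since the results above are reported as mean average precision (mAP), a short sketch of how such a figure is computed may help. This is a simplified, non-interpolated variant written for illustration; the PASCAL VOC and ILSVRC benchmarks apply their own interpolation and matching rules, which are omitted here. The input format (a score and a correctness flag per detection) is an assumption.

```python
from typing import Dict, List, Tuple

Detection = Tuple[float, bool]  # (confidence score, is it a correct detection?)


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """Non-interpolated AP: mean of the precision at each correct detection."""
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            true_positives += 1
            precision_sum += true_positives / rank
    # Missed ground-truth objects contribute zero precision.
    return precision_sum / num_ground_truth


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """mAP: the average of the per-class AP values."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps)
```

For example, a class with two ground-truth logos and ranked detections (correct, wrong, correct) has precision 1/1 and 2/3 at the two correct detections, giving an AP of 5/6.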

III. ANALYSIS OF LOGO DETECTION AND RECOGNITION METHODS


This paper has covered various research works describing the techniques, methods and algorithmic approaches used
to extract, detect and classify objects, along with their results on different datasets. David G. Lowe proposed the
Scale Invariant Feature Transform (SIFT) in 1999 [29]. This method is more widely used than other methods because
it is invariant to scaling, rotation, translation and illumination changes in the image. The SIFT algorithm consists
of four stages: scale-space filtering, localization of key points, assignment of orientation to key points, and key
point description and matching. Applications such as video tracking, robot movement, image stitching and object
recognition can make use of the SIFT algorithm. In 2005, the Histogram of Oriented Gradients (HOG) method became
widespread in image processing for object recognition. This method achieves high accuracy by using overlapping
local contrast normalization on uniformly spaced cells of the image.
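
The core of HOG, a magnitude-weighted histogram of gradient orientations per cell followed by contrast normalization, can be sketched as below. This is a simplified illustration under stated assumptions: it uses 9 unsigned orientation bins over 0 to 180 degrees, hard bin assignment without interpolation, and normalizes a single vector rather than overlapping blocks of cells as the full method does.

```python
import math
from typing import List


def cell_histogram(cell: List[List[float]], bins: int = 9) -> List[float]:
    """Histogram of unsigned gradient orientations, weighted by magnitude.

    Gradients use central differences, so border pixels of the cell are skipped.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(vector: List[float], eps: float = 1e-6) -> List[float]:
    """L2 contrast normalization; eps avoids division by zero on flat regions."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]
```

In the full descriptor, the histograms of neighbouring cells are concatenated into overlapping blocks and each block is normalized, which is what gives HOG its robustness to local changes in illumination and contrast.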


Recently, convolutional neural networks have become popular because of their state of the art results on the object
detection problem. Bounding boxes are generated using the selective search [30] region proposal method, and
combining a CNN with region proposals has given good results. Algorithms such as Faster R-CNN and the single shot
multibox detector (SSD) [31] provide high performance and accuracy for object detection problems.

Table 2: Summary of Previous Research on Logo Detection and Recognition

First Author / Year: Forrest N. Iandola [23] / 2015
Research work: DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer
Nature of data: FlickrLogos-32
Detection technique: Deep convolutional neural networks (DCNN)
Recognition technique: Deep convolutional neural networks (DCNN)
Accuracy: Logo classification: 89.6%; logo detection without localization: 73.3%; logo detection with
localization: 74.4%

First Author / Year: Steven C.H. Hoi [24] / 2015
Research work: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks
Nature of data: "logos-18": 18 logo classes, 10 brands, and 16,043 logo objects
Detection technique: Deep Region-based convolutional networks (DRCN)
Recognition technique: Deep Region-based convolutional networks (DRCN)
Accuracy: Region based Convolutional Neural Networks (RCNN) (CaffeNet-w/o-ft): 86.5%; Fast region based
convolutional networks (FRCN) (Caffenet): 93.2%; Spatial pyramid pooling strategies (SPPnet-ZF-w/o-bb): 92.5%

First Author / Year: Hichem Sahbi [25] / 2012
Research work: Context-Dependent Logo Matching and Recognition
Nature of data: FlickrLogos-27 dataset
Detection technique: Context Dependent Similarity
Recognition technique: Context Dependent Similarity
Accuracy: 30 training images per class: 76%

First Author / Year: Yannis Kalantidis [26] / 2011
Research work: Scalable Triangulation-based Logo Recognition
Nature of data: FlickrLogos-27 dataset
Detection technique: Multi-scale Delaunay triangulation approach (msDT)
Recognition technique: Multi-scale Delaunay triangulation approach (msDT)
Accuracy: 30 training images per class; without distractor classes: 56%; with distractor classes: 50%

First Author / Year: Andrew D. Bagdanov [27] / 2007
Research work: Trademark matching and retrieval in sports video databases
Nature of data: Three videos of three different sports
Detection technique: Scale Invariant Feature Transformation (SIFT)
Recognition technique: Scale Invariant Feature Transformation (SIFT)
Accuracy: Recall rate of better than 85%

First Author / Year: Guangyu Zhu [28] / 2007
Research work: Automatic document logo detection
Nature of data: Large collection of real-world documents (Tobacco-800 dataset)
Detection technique: Multi-scale detection approach
Recognition technique: Multi-scale detection approach
Accuracy: 84.2%

First Author / Year: David G. Lowe [29] / 1999
Research work: Object Recognition from Local Scale-Invariant Features
Nature of data: Three model images of rectangular planar objects
Technique: Scale Invariant Feature Transformation (SIFT)
Accuracy: Robust object recognition is achieved in cluttered partially-occluded images with a computation time
of < 2 seconds.

IV. CONCLUSION
This paper explained the basic steps involved in creating illegal broadcast content and redistributing it, and
discussed how machine learning techniques can be leveraged to track and detect illegal redistribution of pirated
content. It reviewed the major technological perspectives and key progress in logo detection and recognition, and
discussed various approaches for designing an automatic logo detection and recognition system. Content piracy
threats should be dealt with as a high priority to avoid heavy financial loss. This requires a quick and efficient
logo detection and recognition system that can be applied to real time scenarios. Based on the summary and analysis
presented, the accuracy and response time of logo detection and classification systems can be strengthened for real
time scenarios. Future research could be directed at developing a robust logo detection and classification system
that improves detection and classification accuracy.

REFERENCES
[1] Leporini, D. Architectures and protocols powering illegal content streaming over the Internet. International
Broadcasting Convention (IBC), 2015.
[2] TVAnts, http://www.tvants.com.
[3] SOPCast, http://www.sopcast.com.
[4] PPLive, http://www.pplive.com.
[5] Joost, http://www.joost.com.
[6] Babelgum, http://www.babelgum.com.
[7] Tvunetworks, http://www.tvunetworks.com.
[8] Stolikj, M., Jarnikov, D. and Wajs, A. Artificial Intelligence in Media-Making security smarter.
International Broadcasting Convention (IBC), 2017.


[9] Horvath, A., Telek, M., Rossi, D., Veglia, P., Ciullo, D., Garcia, M.A., Leonardi, E. and Mellia, M.
Dissecting pplive, sopcast, tvants. submitted to ACM Conext, 2008.
[10] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R. and Fei-Fei, L. Large-scale Video
Classification with Convolutional Neural Networks. IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2014, 1725-1732.
[11] Parkhi, O.M., Vedaldi, A. and Zisserman, A. Deep Face Recognition. British Machine Vision Conference,
2015.
[12] Psyllos, A.P., Anagnostopoulos, C.N.E. and Kayafas, E. Vehicle logo recognition using a sift-based
enhanced matching scheme. IEEE Transactions on Intelligent Transportation Systems 11 (2) (2010) 322–
328.
[13] Gao, Y., Wang, F., Luan, H. and Chua, T.S. Brand data gathering from live social media streams.
Proceedings of International Conference on Multimedia Retrieval, 2014.
[14] Soffer, A. and Samet, H. Using Negative Shape Features for Logo Similarity Matching. Proceedings ICPR,
1998, 571-573.
[15] Neuman, J., Samet, H. and Soffer, A. Integration of local and Global Shape Analysis for Logo
classification. Pattern recognition letters 23 (2002) 1449-1457.
[16] Mattar, M.A., Hanson, A.R. and Learned-Miller, E.G. Sign Classification using Local and Meta-Features.
IEEE CVPR Workshop on computer vision applications for visually impaired, San Diego, CA, 2005.
[17] Ballan, L., Bertini, M. and Jain, A. A System for Automatic Detection and Recognition of Advertising
Trademarks in Sports Videos. Proc. of ACM International conference on Multimedia (MM), 2008.
[18] Kleban, J., Xie, X. and Ma, W.Y. Spatial Pyramid Mining for Logo Detection in Natural Scenes. Proc. of
IEEE ICME, Hannover, Germany, 2008.
[19] Gao, K., Lin, S., Zhan, Y. and Tang, S. Logo detection based on spatial-spectral saliency and partial spatial
context. Proc. of IEEE ICME, New York, USA, 2009.
[20] Ozay, N. and Sankur, B. Automatic TV logo detection and classification in broadcast videos. IEEE
conference on signal processing, 2009.
[21] LeCun, Y., Bottou, L., Bengio, Y. and Haffner, P. Gradient based learning applied to document
recognition. Proceedings of the IEEE 86 (11) (1998).
[22] Girshick, R., Donahue, J., Darrell, T. and Malik, J. Rich feature hierarchies for accurate object detection
and semantic segmentation. CVPR, 2014.
[23] Iandola, F.N., Shen, A., Gao, P. and Keutzer, K. DeepLogo: Hitting Logo Recognition with the Deep
Neural Network Hammer. https://arxiv.org/abs/1510.02131, 2015.
[24] Hoi, S.C.H., Wu, X., Liu, H., Wu, Y., Wang, H., Xue, H. and Wu, Q. Large-scale Deep Logo Detection
and Brand Recognition with Deep Region-based Convolutional Networks. arXiv preprint
arXiv:1511.02462 (2015).
[25] Sahbi, H., Ballan, L., Serra, G. and Bimbo, A.D. Context-Dependent Logo Matching and Recognition.
IEEE Transactions on Image Processing 22 (2012) 1018-1031.
[26] Kalantidis, Y., Pueyo, L.G., Trevisiol, M., Zwol, R.V. and Avrithis, Y. Scalable triangulation-based logo
recognition. ACM International Conference on Multimedia Retrieval, 2011.
[27] Bagdanov, A.D., Ballan, L., Bertini, M. and Bimbo, A.D. Trademark Matching and Retrieval in Sports
Video Databases. ACM International conference on multimedia information retrieval, 2007, 79-86.
[28] Zhu, G. and Doermann, D. Automatic document logo detection. Conference on Document Analysis and
Recognition 02 (2007), 864–868.
[29] Lowe, D.G. Object recognition from local scale-invariant features. Conference on Computer Vision 2
(1999).
[30] Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T. and Smeulders, A.W.M. Selective search for object
recognition. International Journal of Computer Vision 104 (2) (2013) 154-171.
[31] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y. and Berg, A.C. Ssd: Single shot
multibox detector. European Conference on Computer Vision, Springer, 2016, 21-37.
