Professional Documents
Culture Documents
I. I NTRODUCTION
IDEO analytics plays a vital role in detecting and track-
V ing temporal and spatial events in video streams. A
number of preinstalled cameras, as shown in Fig. 1, produces
video data. This data needs processing to generate useful clus-
ters such as classification and tracking of a marked person. As
shown in Fig. 2, video data captured from different cameras
can be used to locate a person of interest. The mapping of
the person is then associated with particular locations visited
along with the time spent at each location. The large amounts
of data makes it nearly impossible for human operators to
manually process this data.
Deep learning-based video analytics systems can involve
many hyper-parameters, including learning rate, activation
Fig. 2. Mapping of marked person.
function and weight parameter initialization. A trial-and-error
approach is mostly followed in selecting these parameters,
which makes it time consuming and at times may provide
inaccurate results. To overcome these challenges, we present a system for
object classification from multiple videos. We propose hyper-
Manuscript received January 29, 2018; accepted May 9, 2018. This paper parameter tuning through a mathematical model to achieve
was recommended by Associate Editor A. Hussain. (Corresponding author:
Muhammad Usman Yaseen.) higher object classification accuracy. The mathematical model
M. U. Yaseen, A. Anjum, and N. Antonopoulos are with the aids in observing the hyper-parameter outcomes on overall
Department of Computing and Mathematics, University of Derby, Derby performance of the learned model. Values of the hyper-
DE22 1GB, U.K. (e-mail: m.yaseen@derby.ac.uk; a.amjum@derby.ac.uk;
n.antonopoulos@derby.ac.uk). parameters are dynamically varied and appropriate parameters
O. Rana is with the Department of Computer Science and Informatics, are selected.
Cardiff University, Cardiff CF10 3AT, U.K. (e-mail: ranaof@cardiff.ac.uk). We have first performed object extraction which are then
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org. scaled and normalized. Each video frame is scaled at a size of
Digital Object Identifier 10.1109/TSMC.2018.2840341 150 × 150. During our experiments, it was observed that deep
2168-2216 c 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
learning networks perform better when input data is provided classification [29], [30]. These hand crafted features are
in the normalized form. combined to generate larger features. These larger features
The system performs training of the model on multiple dis- provide an estimate of appearance and motion information of
tributed processors by utilizing cloud infrastructure. Multiple objects in the video. These larger features are not suitable for
cloud nodes are used for partial model training. The results object classification from large video data. Anjum et al. [1]
from each partial model are then collected at the master. This proposed a system using GPUs to reduce the computational
results in the reduction of the overall training time. The selec- complexity involved in video stream decoding and processing.
tion of appropriate normalization scheme with gradient descent An operator could specify the video file and search criteria to
approach and learning rate helps to move the model score of a client program, video analytics is then performed on the
the system toward stability during training. cloud and results are returned back to the operator after some
We have adapted iterative reduce, an extended form of map- time. However, this paper also involved the use of a shal-
reduce paradigm to perform training quickly and efficiently. low (learning) network and produced high-dimensional feature
The parallel and distributed training also process data rapidly. vectors. Deep learning networks have emerged as influential
The Apache Spark cluster is tuned for maximum resource tools for solving complex problems such as medical imag-
utilization. The proposed system is customizable in terms ing, speech recognition, and classification and recognition of
of scalability, i.e., nodes can be added or removed with the objects [22]–[25], [31]. These networks are capable to perform
addition or deletion of videos. classification and recognition on large-scale data as compared
The evaluation of system is performed on a 100-GB video to shallow networks but require more computational resources
dataset. We present a video object classification pipeline to for training. It also poses many other challenging tasks like
evaluate the proposed system in which objects of interest are hyper-parameter tuning and increasing times for training.
located. We have adopted the techniques from data augmen- Hyper-parameter optimization has been an area of
tation including rotation, flip and skew for training due to the discussion over the years [16] and mainly included racing
limited labeled data for application pipeline. More training algorithms [17] and gradient search [18]. It is now shown
data leads to higher accuracy for the classifier by reducing that random search is better as compared to grid search. The
over-fitting and exposing the network to more training sam- Bayesian optimization methods can perform even better than
ples. Another advantage of applying these transformations is random or grid search. Some of the researchers also proposed
that they make the classifier invariant to typical transforma- methods to perform automatic hyper-parameter optimization.
tions in the target object which is being located in the video The most common implementations of automatic Bayesian
streams. optimization are Spearmint [19], which uses a Gaussian pro-
We have shown that the proposed system performs object cess model [20] and tree Parzen estimator [2], which generates
classification with high accuracy. We also demonstrate exper- a density estimate of each hyper-parameter. These methods
imentally that the distributed training with iterative reduce for have shown competitive results but their acceptance is ham-
automatic video analytics is a promising way of speeding up pered because of high computational requirements and per-
the training process. After training, the classifier can be stored forms best for problems with few numerical hyper-parameters.
locally and uses a match probability to classify objects. On the other hand, the hyper-parameter optimization done
There are mainly three contributions in this paper. manually by human operators is less resource intensive and
1) We devised a mathematical model to observe the out- consumes less time as compared to automated methods. The
comes of various hyper-parameter values on system evaluation of a poor hyper-parameter setting can be quickly
performance. A comparison of different hyper-parameter detected by human operators after a few steps of the stochas-
values has been made and the parameters which give the tic gradient descent algorithm. They can quickly judge that
most optimal performance are selected. network is performing bad and can terminate the evaluation.
2) We scaled and configured the (Apache Spark) cluster for A number of convolutional neural network models have
parallel model training. been proposed in the recent past for object classification.
3) We propose an automatic object classification pipeline Szegedy et al. [3] proposed to modify the CNN by changing
to support large-scale object classification in video data. the end layer of the network with regression. This modifica-
The organization of this paper is as follows. Section II tion resulted in the average precision of 0:305 over 20 classes.
details the related work. Section III explains approach used As opposed to Szegedy’s proposed model, Girshick et al. [4]
in carrying out video analysis, using a convolutional neural adopted a bottom-up region-based deep model called R-CNN
network (CNN) and hyper-parameter tuning for such a net- and an improvement of 30% in accuracy was observed.
work. Section IV explains the architecture and implementation Girshick [5] further improved their method and proposed a
used to realize our proposed system. Section V describes the method called fast R-CNN to detect objects rapidly. This
experimental setup. The results and conclusion are provided method reported higher detection accuracy and performed
in Sections VI and VII, respectively. training in a single stage using a multitask loss. Ren et al. [6]
further improved Girshick’s work and proposed faster R-CNN
and reduced the computation time. They also combined fast
II. R ELATED W ORK R-CNN and region proposal network by sharing their convo-
Recent video analytics systems often use shallow net- lutional features into a single network. This method outper-
works and hand crafted features to perform object formed both R-CNN and fast R-CNN on publicly available
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
YASEEN et al.: DEEP LEARNING HYPER-PARAMETER OPTIMIZATION FOR VIDEO ANALYTICS IN CLOUDS 3
Training DataSst X = x1, x2, . . . , xn (1) here g(.) is the rectified linear unit (ReLU) activation function.
Weights are represented by “W” and biases are represented by
where x1, x2 . . . are decoded frames. The detection of desired “b,” respectively. “*” represents the 2-D convolution operation.
object from whole frame and its extraction through cropping is The inputs are downsampled in case of subsampling layer.
an important preprocessing step for video analysis as shown The output from each layer represents a feature map. Multiple
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
feature maps are extracted from each layer which is helpful The momentum term used in the training of the network is
in detecting multiple features of objects such as lines, edges represented as
and contours.
Instead of using the standard hyperbolic tangent nonlinear- Vt+1 = ρvt − αδL(θt ) (15)
ity, we adopted “ReLU” as suggested by Dahl et al. [12]. Wt+1 = Wt + Vt+1 . (16)
ReLU is much more appropriate than tanh especially in case of
bigger datasets as the network trains much faster. Traditional The softmax layer acting as the last layer of the network is
hyperbolic tangent nonlinearity does not allow training the given as
system on bigger datasets. The ReLU function has a range l(i, xi T) = M(ei , f (xi T)). (17)
of [0, ∞], so it has the capability to model positive real num-
bers. The advantage of using ReLU is that it does not vanish as
the value of “x” increases as compared to sigmodal function. IV. A RCHITECTURE AND I MPLEMENTATION
The max function is The proposed video analysis approach is compute intensive
and operates on large datasets. We have tackled this problem
1 if x > 0; 0 if x < 0. (7)
by optimizing the code, tuning the hyper-parameters properly,
In order to aid generalization we adopted local response and introducing parallelism [32] by using spark. The paral-
normalization. This normalization scheme mimics the behav- lelism is achieved by distributing the dataset into small subsets
ior of real neurons and creates a competition amongst neuron and then passing over these subsets of data to separate neural
outputs for big activities. Max pooling is used to perform network models as shown in Fig. 4. The models are trained in
sample-based discretization or downsampling of an input rep- parallel and the resultant parameters for each model are then
resentation (feature maps from convolutional layer in our iteratively averaged and collected at the master node. This
case). Max pooling reduces the dimensionality, decreases the approach helped in speeding up the network training even on
amount of parameters to learn and reduces the overall cost. larger datasets.
L2 regularization has been added to reduce over-fitting. It The training process starts by first loading the training
tries to penalize network weights that are large. It is given by dataset into the memory. The master node which also acts as
the spark driver loads the initial parameters and the network
configuration. The network configuration of our spark cluster
λ2 θi2 (8) and deep learning model is shown in Table I. The dataset is
i partitioned in a number of subsets. This division is dependent
where θ represents the network weights and λ is lagrange on the configuration of the training master. These subsets of
multiplier which decides how significant this regularization data are distributed to various workers along with the config-
should considered to be. uration parameters. Each worker then performs training on its
The weight and bias deltas for convolutional layers are allocated dataset. Once the training by all the workers is com-
calculated as pleted, the results are averaged and returned to master which
F
has a fully trained network which is used for classification.
Wt , l = LearningRate xi ∗ Dhi + mnW(t−1,l) (9) The master node of spark loads the initial network configu-
i=1 ration and parameters. The master is termed as driver node as
F well because it is responsible to drive other nodes of the clus-
Bt , l = LearningRate Dhi + mnB(t−1,l) . (10) ter by distributing parameters among them. It also contains
i=1 the knowledge that how data is to be divided. On the basis
The weight and bias deltas for subsampling layers are of data division parameter, the dataset is partitioned into the
calculated as subsets. These subsets along with the configuration parameters
F are then distributed among worker nodes. Each worker works
Wt , l = LearningRate ↓ xi ∗ Dhi + mnW(t−1,l) on a partial model and the results are averaged together with
i=1
the help of iterative averaging. The master node then contains
(11) the trained classifier.
The separation of training data into subsets and then training
F
bt , l = LearningRate Dhi + mnb(t−1,l) . (12) the model with these subsets of data by averaging parame-
i=1 ters is a feasible approach for our system because we operate
with limited worker nodes in our cloud and the parameters
The loss function is given by
for estimation are also small. We use the same model for each
L(x) = LearningRate l(i, xi T) (13) worker node but train them on different data shards (mini-
xi −>X xi −>Ti batches). We then obtain the gradient for each split of the
mini-batch from each model and compute the overall aver-
here l(i, xT) is loss function for convolutional neural network
age using parameter averaging. This technique works faster
that we are trying to minimize.
for small networks as in our proposed system and is ideal for
The stochastic gradient descent is represented as
scenarios involving matrix computations which happens quite
Wt+1 = Wt − αδL(θt ). (14) often in convolutional neural networks.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
YASEEN et al.: DEEP LEARNING HYPER-PARAMETER OPTIMIZATION FOR VIDEO ANALYTICS IN CLOUDS 5
TABLE I
C ONFIGURATION OF S PARK C LUSTER AND D EEP N ETWORK This is also quiet important to set the rate of parame-
ter averaging. If this is too low, this will create overhead
in parameter initialization and will cause delay in network
communication. Similarly, if it is high, it will degrade the per-
formance. In the proposed video analytics system, the good
performance is obtained with 16 mini-batches. These mini-
batches are started in an asynchronous fashion which reduces
the delay. The data repartitioning is also a critical parameter
to be defined. It defines when data is to be repartitioned and
plays an important role in utilizing all the resources of the
cluster efficiently. A value of 0.6 is chosen for this.
The compute cluster consists of one master node and eight The locality configuration is also defined as the proposed
worker nodes. The averaging frequency is set to 1 for all the algorithm has high demand of computation, so single task per
experiments. The dataset comprising the size of 100 GB is executor is executed. It is therefore much suitable to shift
divided into various subsets of data. Each subset is further data to executor which is free. The default configuration of
divided into various mini-batches depending upon the config- spark waits for a free executor. This requires the data to be
uration. Training is performed on each subset by allocating copied across the network. Another important note is that
each mini-batch to each worker. Since the dataset is large in we have avoided the allocation of memory on java virtual
size it was not possible to load the whole dataset into memory machine (JVM) heap space by passing pointers for various
at once. So we have first exported the mini-batches of datasets numerical tasks. It is not required to load the data from JVM
to disk (HDFS) known as dataset objects. The datasets are heap to execute operations on it; neither has it required data
exported in batch and serialized form. We have used kryo seri- transmission (processed results) back to JVM. This helps to
alization to perform serialization of our dataset. This approach avoid the data transfer time and a decrease in overall execu-
of saving the dataset to disk is much more efficient and faster tion time of the system. This also avoids memory overhead
as compared to loading the whole dataset in memory. This required for each task.
approach consumes less memory and reduces split overhead. We have employed iterative MapReduce instead of simple
The dataset object has a number of examples based on the size MapReduce for our proposed application. Iterative MapReduce
of dataset object. Kryo serialization takes least amount of time is an advanced form of MapReduce in which multiple passes
to serialize objects and improves performance. It can serialize of the MapReduce operation are performed. Single pass does
objects much quickly and efficiently and offers more compact quite well for the application which are not iterative. As
serialization than Java. The serialization framework provided our application is built upon deep learning algorithm, it is
by Java has high CPU and RAM consumption which makes highly iterative and makes full use of the iterative map-
it inefficient for large-scale data objects. reduce. A sequence of map-reduce operations are performed
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
in which each MapReduce operation is performed in cascaded and evaluate the proposed system. The configuration of the
fashion. cluster is as follows: each node in the cluster possesses a
In the implementation phase, the video dataset is first loaded secondary storage of 100 GB. There are four virtual cen-
into the memory. It is preprocessed so that it can be further tral processing units running at a frequency of 2.4 GHz. The
used for training the deep multilayer network. The preprocess- total main memory has a size of 16 GB. The results gen-
ing starts with frame decoding using FFMPEG library [13]. erated by these experiments will help to deploy the system
The objects of interest are then detected and extracted from the on a much bigger infrastructure as per requirements of an
video frames by using Haar cascade classifier. Haar cascade application.
classifier is built on top of Haar features which are generated The video dataset which is used to train and test the system
from the objects in video frames to perform detection. is generated in a constrained environment. The streams are
The objects which are extracted necessitate the use of N- captured with individuals facing toward the camera. However,
dimensional arrays which could hold the pixel values. We have it also contains frames which have individuals with side, front
made the use of nd4j for Java [14]. It consumes minimum and rear poses. Most of the video streams do not pose illu-
memory and supports fast numerical computing for Java. The mination or other challenges. The test dataset comprises of
loading of data into the memory and training of the network 88 432 video frames.
are handled by two separate processes. This makes the data The input video streams are H.264 format encoded. The
loading process simple and is supported by the nd4j library. frame rate for each video stream in our database is 25 frames/s.
The data after loading into the memory is normalized. This The data rate is 421 kb/s and the bit rate of video streams is
normalization of data helps to train the neural network prop- 461 kb/s, respectively. These video streams are decoded to pro-
erly as it is based upon gradient descent optimization approach duce separate video frames. The video stream of one minute
for network training. The gradient descent approach having of length generates a decoded frame set of 1500 frames. The
their activation functions in this range helps to improve the data size of each video frame is 371 kB.
performance. “Apache Spark” is adopted for parallel and distributed train-
A dataset iterator is defined to iterate over the data present ing of the deep network. The video dataset is loaded in spark
in the memory. The iterator fetches the data from memory which executes executors to perform the network training. The
in a vectorized format. The iterator moves on to the dataset dataset objects are used by the executors to execute training
objects which contain multiple training examples along with of the network. The iterative MapReduce framework utilized
their labels. An n-dimensional array is created to store exam- in this paper executes multiple analysis tasks. These analysis
ples and labels. The high volumes of data makes it infeasible to tasks are executed in multiple stages. The analysis tasks are
load the data into the memory at once. So many mini-batches rescheduled if a task failure occurs.
are created. These mini-batches help to tackle the memory The spark context is utilized to load the video dataset and
requirements problem. A value of 12 for the mini-batch is is then stored into multidimensional arrays. The multidimen-
used in our system. sional arrays represents the data in the form of tensors which
The value of learning rate has been selected to be 0.0001. are then passed through multiple layers for training. The start-
We have selected this value carefully on the basis of exper- ing layer of convolutional neural network has a dimension of
imentation. We observed during the experiments that a high 150 × 150 × 1. It has 96 kernels in it. The stride of the kernels
value of learning rate can cause divergence and the divergence is set to be 4 × 4 with the kernel size of 11 × 11 × 1. The
can stop the learning. On the other hand, setting learning rate layer following the first convolutional layer has 256 kernels in
to a small value causes slow convergence. it with a stride of 2 and has a size of 1 × 1. The remaining
layers has a total of 284 kernels in them. These convolutional
layers operate on nonzero bias.
V. E XPERIMENTAL S ETUP There is a max-pooling layer next to the convolutional lay-
The details of our experimental setup which has been used ers with a size of 3 × 3 as shown in Fig. 5. The convolutional
to deploy the system is presented in this section. The main layers and the pooling layer are followed by the fully con-
focus of the results generated by using this experimental setup nected layers. The fully connected layers have a total of 4096
is accuracy of the proposed algorithm, scalability, precision neurons in them. The kernels and neurons of the subsequent
and performance of the system. The accuracy of the system layers have a connection with the previous layers. We have
is measured by precision, recall, and F1 score. The scalability also added local response normalization layers, max pooling
and performance is demonstrated by analyzing aspects of the layers and added ReLU as nonlinearity layer.
system including transfer time of data to cloud node and the
overall analysis time.
The proposed architecture for analyzing video streams con- VI. E XPERIMENTAL R ESULTS
sists of cloud resources. The compute nodes have multicores In this section, we present the results of the proposed
for processing in which most of the video analytics operations system using the experimental setup detailed in Section V.
are performed. In order to execute the experiments, we con- We first analyze the results generated by tuning the hyper-
structed a cluster of eight nodes on the cloud infrastructure. parameters of deep model to various values and propose the
The multiple instances running the cloud have OpenStack [21] parameters which could potentially produce best results. The
with Ubuntu version of 15.04. This cluster is used to deploy trained system on the proposed parameters is then evaluated
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
YASEEN et al.: DEEP LEARNING HYPER-PARAMETER OPTIMIZATION FOR VIDEO ANALYTICS IN CLOUDS 7
A. Hyper-Parameter Tuning
with different performance characterization including accu- There are a number of parameters which can be tracked dur-
racy, scalability and performance of the system. The precision, ing the training of a deep network. These parameters provide
recall, and F1 score are also considered as the performance intuitions about the settings of different hyper-parameters and
characterization. The scalability of the system is analyzed by help to make a decision that whether the setting should be
measuring the time to transfer data to cloud and overall time changed in order to have more efficient learning. The parame-
of analysis of data. The results from the object classification ters are tracked and represented in the form of graphs over
pipeline are presented at the end of the section. multiple time stamps in order to observe the trend in the
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
YASEEN et al.: DEEP LEARNING HYPER-PARAMETER OPTIMIZATION FOR VIDEO ANALYTICS IN CLOUDS 9
Fig. 13(a)–(c) shows the standard deviations of layer acti- Each video stream in our database has a frame per second
vations, gradients, and updates of parameters for the first rate of 25. These videos are decoded to produce separate video
convolution layer of the network. The proposed system made frames. The total number of decoded video frames is directly
use of off heap memory and most of the memory is not allo- proportional to the duration of video stream being analyzed.
cated on the JVM heap but outside of the JVM. This helps to The video stream of one minute of length generates a decoded
perform the numerical operations faster as data needs not to be dataset of 1500 frames.
copied to and from the JVM and pointers can be passed around The size of the input dataset varies from five gigabytes
for numerical computations avoiding data copying issue. to hundred gigabytes. The large number of frames are bun-
dled with the help of a batch process. The bundled frames
C. Performance Characterization and Scalability are shifted to cloud infrastructure for processing. The time
of the System required to bundle the frames is proportional to the input video
The accuracy of the proposed system is measured by frames size. The dataset ranging from ten gigabytes to hundred
the following performance characterization: recall, precision gigabytes requires a bundling time of 0.25–3.8 h. Inclusion of
(positive prediction value), and F1 score. The test dataset com- larger data increases the data bundling time.
prises of 88 432 video frames in total. The precision is turned The bundled video frames are then transferred to cloud for
out to be 0.9708. The recall of the system is recorded to be analysis. There are many factors on which the transfer time
0.9636. And the F1 score is found to be 0.9672. The recall to cloud depends including bandwidth of the network, block
and the F1 score are calculated by the following equations: size and also the amount of data which is to be transferred.
An estimated transfer time for different data sizes is shown in
Recall = TP/(TP + FN) (18) Fig. 14. It was observed that the transfer time for a dataset
F1 = 2TP/(2TP + FP + FN). (19) size of 20–100 GB varied from 0.36 to 2.18 h. The data trans-
It was observed from the results that there is also some fer time has also been measured with a block size of 256 MB
miss-classification of the video frames as well. Few objects are and its effect is shown in Fig. 14. In order to measure the
recorded as false positives in the system. There can be number time of network training, multiple tests are carried out on
of things which could be the reason for the miss-classification. multiple dataset sizes. We have then calculated the average
Some miss-classifications could be due to the variance in the execution time of each dataset size and plotted in Fig. 15.
pose, illumination conditions, and blur effects. As the training It was seen from the results that the increase in dataset size
of the classifier was performed on the dataset which was cap- directly increases time of execution.
tured under strict controlled conditions, high variance could
lead to the miss-classification of various subjects.
The scalability is tested by executing it on distributed infras- D. Video Object Classification Pipeline
tructure over multiple nodes. The system is evaluated mainly The trained classifier from cloud is saved locally and is fur-
on the following parameters: 1) transfer time of data to cloud ther used to locate the objects of interest. The target object
nodes; 2) total time of analysis; and 3) analysis time with vary- which is to be located from the video streams is passed
ing dataset sizes. Spark executes many executors and these through the trained classifier to perform classification. The tar-
executors accesses an resilient distributed dataset object in get object which is to be classified is passed through the same
each iteration. Spark has a cache manager which handles the preprocessing steps to make it appropriate for the classifier. It
iterations outcomes in memory. If the data is not required is also scaled and normalized to make it appropriate for the
anymore, it is stored on disk. classifier.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
YASEEN et al.: DEEP LEARNING HYPER-PARAMETER OPTIMIZATION FOR VIDEO ANALYTICS IN CLOUDS 11
[27] X. Cui, V. Goel, and B. Kingsbury, “Data augmentation for deep Ashiq Anjum received the B.S. degree in electrical
neural network acoustic modeling,” IEEE/ACM Trans. Audio, Speech, engineering from the University of Engineering and
Language Process., vol. 23 no. 9, pp. 1469–1477, Sep. 2015. Technology Lahore, Lahore, Pakistan, and the Ph.D.
[28] J. Zhu, N. Chen, H. Perkins, and B. Zhang, “Gibbs max-margin topic degree in computer science in collaboration with
models with data augmentation,” J. Mach. Learn. Res., vol. 15, no. 1, CERN, Genève, Switzerland, and the University of
pp. 1073–1110, 2014. the West of England, Bristol, U.K.
[29] M. U. Yaseen, A. Anjum, and N. Antonopoulos, “Spatial frequency He is a Professor of distributed systems with
based video stream analysis for object classification and recognition in the University of Derby, Derby, U.K. His current
clouds,” in Proc. 3rd IEEE/ACM Int. Conf. BDCAT, Shanghai, China, research interests include data intensive distributed
2016, pp. 18–26. systems and high performance analytics platforms.
[30] M. U. Yaseen, A. Anjum, O. Rana, and R. Hill, “Cloud-based scal-
able object detection and classification in video streams,” Future Gener.
Comput. Syst., vol. 80, pp. 286–298, Mar. 2018
[31] A. R. Zamani et al., “Deadline constrained video analysis via in-
transit computational environments,” IEEE Trans. Services Comput., Omer Rana received the B.S. degree in information
doi: 10.1109/TSC.2017.2653116. systems engineering from the Imperial College of
[32] R. McClatchey, A. Anjum, and H. Stockinger, “Data intensive and net- Science, Technology and Medicine, London, U.K.,
work aware (DIANA) grid scheduling,” J. Grid Comput., vol. 5 no. 1, the M.S. degree in microelectronics systems design
pp. 43–64, 2007. from the University of Southampton, Southampton,
U.K., and the Ph.D. degree in neural computing and
parallel architectures from the Imperial College of
Science, Technology and Medicine.
He is a Professor of performance engineering
with Cardiff University, Cardiff, U.K. His current
research interests include problem solving environ-
ments for computational science and commercial computing, data analysis and
management for large-scale computing, and scalability in high performance
agent systems.