
DISTRIBUTED COMPUTING SYSTEMS IN RESOURCE MANAGEMENT USING AI

Abdelrahman R. S. Almassri
Department of Artificial Intelligence and Data Science, Istanbul Aydin University, Istanbul, Turkey

Abstract

Currently, we are witnessing development in all areas, where Internet-based distributed computing systems (DCS) have become the foundation of the economy. In addition, the emergence of the Internet of Things (IoT) and mobile applications has generated an enormous quantity of data that requires computing resources to process it and derive valuable insights for customers and companies. According to a report from Norton, 21 billion IoT devices will be connected to the Internet by 2025, creating great economic opportunities. Computing models such as cloud and edge computing have revolutionized the way services are provided and consumed by offering flexible, on-demand access to services using the pay-as-you-go model. Moreover, new software and deployment models such as microservices, serverless computing, and Function-as-a-Service (FaaS) are becoming conventional and significantly reduce the complexity of designing and deploying software components. At the same time, this increased connectivity and these heterogeneous workloads require adequate Quality of Service (QoS) levels to meet application requirements. These trends have led to the construction of large-scale data centers and complex multi-level computing infrastructures that require innovative new strategies for managing resources effectively and offering reliable services.

Resource Management Systems (RMS) in DCSs are intermediate systems that implement different tasks such as resource provisioning, monitoring, and others.

1. INTRODUCTION
Cloud computing has emerged as an important model for service-oriented computing. Through increasingly advanced infrastructures and technologies, cloud computing has demonstrated excellent scalability, flexibility, and accessibility. However, due to its low cost and on-demand availability, misuse of cloud resources is common, which reduces resource utilization or even puts the infrastructure at risk. As a result, resource management is the key to providing continuous availability and efficient use.

In general, resource management problems arise everywhere, and they pose several challenges:

1- Resource management is complex: the underlying systems are large and practically impossible to model accurately, so many cloud systems are configured manually and depend mainly on operator skill and experience. For instance, in cluster scheduling, the running time of a task varies with data locality, server characteristics, interactions with other tasks, and interference on shared resources such as CPU caches and network bandwidth [1, 2].

2- Practical systems must make decisions online with noisy inputs and work well under a variety of conditions.

3- Some important performance metrics, such as tail performance [3], are notoriously difficult to optimize in a principled manner.

Considering the above challenges, we believe Reinforcement Learning (RL) approaches represent an important transformation in managing and coordinating nearly all aspects of cloud resources; they have already been applied to other difficult decision-making domains [4, 5, 6]. In particular, reinforcement learning has become an active area of machine learning research [7, 8, 9, 10, 5]. Reinforcement learning deals with agents that learn to make decisions by interacting directly with the environment; given clear feedback on the task, the agent can learn to perform it successfully. Reinforcement learning has a long history [11]. It has also been combined with deep learning techniques in many applications, such as playing video games [7], Computer Go [5], and cooling data centres [12].

2. RELATED WORK
We give a short review of the Reinforcement Learning (RL) techniques on which the two papers under discussion build. The first paper relies on the following RL techniques, which we briefly describe:

Reinforcement Learning. Consider the general setting shown in Figure 1, where an agent interacts with an environment. At each time step t, the agent observes some state s_t and is asked to choose an action a_t. Following the action, the state of the environment transitions to s_{t+1} and the agent receives reward r_t. The state transitions and rewards are stochastic and are assumed to have the Markov property; i.e., the state transition probabilities and rewards depend only on the state of the environment s_t and the action taken by the agent a_t. It is essential to note that the agent can only control its actions; it has no a priori knowledge of which state the environment will transition to or what the reward may be. By interacting with the environment during training, the agent can observe these quantities. The goal of learning is to maximize the expected cumulative discounted reward E[∑_{t=0}^∞ γ^t r_t], where γ ∈ (0, 1) is a factor discounting future rewards.

Figure 1: Reinforcement Learning with policy represented via DNN.
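To make the interaction loop concrete, here is a minimal Python sketch of the setting above. It is an illustration only, not code from either paper; env and agent are hypothetical objects assumed to expose reset(), step(action), and act(state).

    def discounted_return(rewards, gamma=0.99):
        # Cumulative discounted reward: sum over t of gamma^t * r_t.
        return sum((gamma ** t) * r for t, r in enumerate(rewards))

    def run_episode(env, agent, max_steps=1000):
        # Generic agent-environment loop: observe s_t, choose a_t,
        # receive r_t and the next state s_{t+1}; repeat until done.
        state = env.reset()
        rewards = []
        for _ in range(max_steps):
            action = agent.act(state)               # sample a_t from the policy
            state, reward, done = env.step(action)  # transition to s_{t+1}
            rewards.append(reward)
            if done:
                break
        return discounted_return(rewards)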

Policy. The agent picks actions based on a policy, defined as a probability distribution over actions, π(s, a) → [0, 1]; π(s, a) is the probability that action a is taken in state s. In most problems of practical interest there are many possible {state, action} pairs; hence it is impossible to store the policy in tabular form, and it is common to use function approximators [13, 14]. A function approximator has a manageable number of adjustable parameters, θ; we refer to these as the policy parameters and represent the policy as π_θ(s, a). The justification for approximating the policy is that the agent should take similar actions for "close-by" states.

Many forms of function approximators can be used to represent the policy. For instance, linear combinations of features of the state/action space (i.e., π_θ(s, a) = θ^T φ(s, a)) are a popular choice. Deep Neural Networks (DNNs) [15] have recently been used successfully as function approximators to solve large-scale RL tasks [7, 5]. An advantage of DNNs is that they do not need hand-crafted features.
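As a concrete illustration of the linear case, the following sketch turns the scores θ^T φ(s, a) into a softmax probability distribution over actions. It is our illustration, not either paper's code; φ is a hypothetical user-supplied feature function.

    import numpy as np

    def softmax_policy(theta, phi, state, actions):
        # Linear scores theta^T phi(s, a), converted into a probability
        # distribution over the available actions with a softmax.
        scores = np.array([theta @ phi(state, a) for a in actions])
        scores -= scores.max()         # shift for numerical stability
        probs = np.exp(scores)
        return probs / probs.sum()     # pi_theta(s, a) for each action a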

Policy gradient methods. We concentrate on a class of RL algorithms that learn by performing gradient descent on the policy parameters. Recall that the objective is to maximize the expected cumulative discounted reward; the gradient of this objective is given by [11]:

∇_θ E_{π_θ}[∑_{t=0}^∞ γ^t r_t] = E_{π_θ}[∇_θ log π_θ(s, a) Q^{π_θ}(s, a)]. (1)

Here, Q^{π_θ}(s, a) is the expected cumulative discounted reward from (deterministically) choosing action a in state s and subsequently following policy π_θ. The key idea in policy gradient methods is to estimate the gradient by observing the trajectories of executions obtained by following the policy. In the simple Monte Carlo method [16], the agent samples multiple trajectories and uses the empirically computed cumulative discounted reward, v_t, as an unbiased estimate of Q^{π_θ}(s_t, a_t). It then updates the policy parameters via gradient descent:

θ ← θ + α ∑_t ∇_θ log π_θ(s_t, a_t) v_t, (2)

where α is the step size. This update yields the well-known REINFORCE algorithm [17] and can be understood intuitively as follows. The direction ∇_θ log π_θ(s_t, a_t) indicates how to change the policy parameters in order to increase π_θ(s_t, a_t) (the probability of action a_t at state s_t). Equation 2 takes a step in this direction; the size of the step depends on the size of the return v_t. The net effect is to reinforce actions that empirically lead to better returns. The authors use a slight variant [18] that reduces the variance of the gradient estimates by subtracting a baseline value from each return v_t.
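A minimal sketch of this REINFORCE update with a mean-return baseline is shown below. It illustrates Equation 2 rather than reproducing the authors' implementation, and assumes a user-supplied grad_log_pi(theta, s, a) that returns ∇_θ log π_θ(s, a).

    import numpy as np

    def reinforce_update(theta, grad_log_pi, trajectory, alpha=0.01, gamma=0.99):
        # trajectory: list of (state, action, reward) tuples from one rollout.
        rewards = [r for (_, _, r) in trajectory]
        # v_t: empirically computed cumulative discounted return from step t.
        returns, v = [], 0.0
        for r in reversed(rewards):
            v = r + gamma * v
            returns.append(v)
        returns.reverse()
        baseline = np.mean(returns)    # variance-reducing baseline value
        for (s, a, _), v_t in zip(trajectory, returns):
            # Step in the direction that increases log pi_theta(s_t, a_t),
            # scaled by how much better than the baseline the return was.
            theta = theta + alpha * grad_log_pi(theta, s, a) * (v_t - baseline)
        return theta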

The other paper uses Q-learning, one of the best-known reinforcement learning algorithms; its central idea is to select actions based on a Q-table.
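For reference, a minimal tabular Q-learning sketch (our illustration, not the paper's code) could look as follows; the Q-table is a dictionary keyed by (state, action) pairs.

    import random
    from collections import defaultdict

    Q = defaultdict(float)  # the Q-table, indexed by (state, action)

    def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
        # Standard update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
        best_next = max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

    def epsilon_greedy(Q, s, actions, eps=0.1):
        # Explore with probability eps, otherwise act greedily on the Q-table.
        if random.random() < eps:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])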

In 2006, deep learning was established, and it has since developed into a robust set of techniques for learning in deep neural networks. Broadly, it is a collection of computational models composed of multiple processing layers that learn representations of data with multiple levels of abstraction [19]. With enough of these transformations composed, deep learning has produced very promising results for discovering the complicated structures of high-dimensional data, such as images, speech, and audio. Being derived from neural networks, deep learning can serve as a function approximator and can easily exploit the growing amount of computation and data. Building on its exceptional capacity to extract features and approximate functions, deep reinforcement learning (DRL) was proposed by linking conventional reinforcement learning with deep learning.

In 2015, Google proposed a novel algorithm called the deep Q-network (DQN) that uses a Q-network derived from a convolutional neural network (CNN) to approximate the Q-function in Q-learning, and it is capable of human-level performance on many Atari video games using unprocessed pixels as inputs [20]. It is generally acknowledged that reinforcement learning with a nonlinear approximator (e.g., a neural network) suffers from stability problems [21], mostly caused by correlated inputs, potentially oscillating policies, and unstable Q-learning gradients. Fortunately, on the basis of experience replay, Q-network freezing, and reward normalization, DQN addresses these problems; deep learning has likewise dramatically advanced the state of the art in speech recognition, visual object detection, and many other domains, such as drug discovery and genomics.
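Of these stabilization techniques, experience replay is the easiest to sketch: transitions are stored and later sampled at random, breaking the correlation between consecutive inputs. The sketch below is a generic illustration, not DQN's actual implementation; the frozen target network is indicated only by a comment, since its details depend on the neural network library used.

    import random
    from collections import deque

    class ReplayBuffer:
        # Experience replay: store transitions and sample mini-batches
        # at random to decorrelate the inputs to Q-learning.
        def __init__(self, capacity=100000):
            self.buffer = deque(maxlen=capacity)

        def add(self, s, a, r, s_next, done):
            self.buffer.append((s, a, r, s_next, done))

        def sample(self, batch_size=32):
            return random.sample(self.buffer, batch_size)

    # "Q-network freezing": learning targets are computed from a separate
    # target network whose weights are copied from the online network only
    # every N steps, so the targets do not chase a moving estimate.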

3. OVERVIEW AND ARCHITECTURE


We present an overview of intelligent cloud resource management, which has been introduced as a significant shift toward automatically managing and coordinating virtually all aspects of cloud resources, especially resource utilization. Table 1 lists several of the most popular intelligent cloud solutions; it is evident that intelligent cloud resource management has become a current trend in cloud computing.

TABLE 1. Intelligent cloud resource management solutions [22].

NAME | MAIN FEATURE | FOCUS
Microsoft's intelligent cloud | Secure solution with the ability to listen, learn, and predict with artificial intelligence | Flexible infrastructure that can scale on demand; cloud availability and cloud-based infrastructure reliability
Intelligent cloud computing platform of Akamai | Cloud-based platform as a basis for content delivery network and cloud security services | High levels of availability and on-demand scale; around-the-clock insight and control
Apttus Intelligent Cloud platform | Flexible and scalable end-to-end software-as-a-service solution to maximize the complete revenue operation | Recommendation of smart transaction moves with machine learning; effectiveness throughout the sales and revenue cycle
Zoolz Intelligent cloud storage | Combination of conventional data management tools with artificial intelligence as a data management platform | Accurate discovery, fast-track accessibility, and efficient organization of documents with artificial intelligence; adequate security of data with military-grade encryption
Intelligent Cloud Resources Inc | Cloud enabler and cloud broker services for small- and medium-sized corporations | Efficient cloud migration and cloud management; powerful application development and cloud enablement

One of the most prominent algorithm families is deep reinforcement learning (DRL), an important branch of machine learning. Its essence is solving decision-making problems, that is, making decisions automatically and continuously. It commonly consists of four elements: agent, environment, action, and reward. The goal of reinforcement learning is to maximize cumulative rewards, and with its strong capacity for autonomous, efficient learning, DRL can deal with complex and dynamic cloud environments.

FIGURE 2. The architecture of intelligent resource management using deep reinforcement learning.

As Figure 2 shows, an intelligent resource management architecture mainly contains two components: an intelligent resource manager, which is composed of a controller, a monitor, and an allocator, and the IT resources, which consist of extensive resource pools [23]. Clients first communicate with the controller to submit application requests with various demands.

Based on application needs and current resource utilization information, the controller runs an algorithm chosen from its resource schedule algorithm pool to meet application demands while respecting system resource constraints. The resource schedule algorithm pool, which plays an essential role in the intelligent resource management architecture, includes different kinds of algorithms, such as offline algorithms, online algorithms, and algorithms combining both online and offline parts. The monitor is responsible for gathering data on system resource utilization and application quality of service (QoS) to update the controller periodically, and the allocator is in charge of mapping applications to resource pools according to the configuration negotiated by the controller.
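The division of labor among the three components can be sketched as a simple control loop. The class and method names below are hypothetical placeholders chosen for illustration; the paper does not specify an API.

    class IntelligentResourceManager:
        # Controller-monitor-allocator loop: schedule_algorithm is any
        # callable drawn from the resource schedule algorithm pool
        # (offline, online, or a hybrid of the two).
        def __init__(self, schedule_algorithm, monitor, allocator):
            self.schedule = schedule_algorithm
            self.monitor = monitor
            self.allocator = allocator

        def handle_request(self, app_demand):
            usage = self.monitor.current_utilization()       # hypothetical call
            config = self.schedule(app_demand, usage)        # pick a configuration
            self.allocator.map_to_pool(app_demand, config)   # hypothetical call
            return config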

The controller is the key part of the resource management architecture, as it not only determines the (near-)optimal configuration policy but also coordinates with the monitor and allocator to allocate resources intelligently. The heart of the controller is the resource schedule algorithm pool, which contains many control algorithms. The DRL algorithm presented in this paper is an online algorithm that combines reinforcement learning with deep learning to generate a (near-)optimal resource configuration within limited iterations, directly from raw application demands, especially high-dimensional demands. As Figure 2 shows, the controller selects an action according to the deep neural network and then obtains feedback, in the form of a reward and a new environment state, from the application operating environment. The deep neural network is pretrained with a stacked autoencoder (SA) and then optimized using reinforcement learning experience. The reinforcement learning and deep learning parts therefore cooperate fully to process raw application demands and work out a configuration policy intelligently over successive steps.

The resource pool is a fully managed cloud hosting solution with great flexibility. For service providers, a resource pool represents a set of strategies to categorize and control their resources; for users, it is an abstraction used to present and consume resources in a consistent fashion. In general, a resource pool comprises five layers: physical resources, virtual resources, the hypervisor, virtual machines, and applications. The allocator maps applications to the corresponding resource pools and then allocates appropriate resources for execution.

In conclusion, the controller, monitor, and allocator coordinate with one another to allocate resources intelligently while respecting two constraints: the QoS requirements of applications must be met, and the amount of resources utilized must be less than the total amount of resources available in the system.
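These two constraints can be written as a simple feasibility check. The sketch below is our illustration under assumed data shapes: per-application QoS dictionaries and per-resource-type capacity and allocation dictionaries.

    def feasible(allocation, qos_required, qos_achieved, capacity):
        # Constraint 1: every application's QoS requirement must be met.
        if any(qos_achieved[app] < qos_required[app] for app in qos_required):
            return False
        # Constraint 2: per resource type, the total allocated amount must
        # not exceed the amount of available resources in the system.
        for rtype, cap in capacity.items():
            if sum(alloc.get(rtype, 0) for alloc in allocation.values()) > cap:
                return False
        return True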

4. COMPARISON BETWEEN TWO PAPERS


In the approach of [22], an intelligent resource management system for ICN is proposed. The system applies a traffic estimation algorithm, used for the first time in the ICN field, called TP2 to analyze users' request demands, and a two-stage cache placement algorithm assisted by TP2 to realize efficient resource management. TP2 performs better at predicting data traffic with specific name prefixes than the linear model Locality and the non-linear model ARMA, and the proposed cache placement algorithm allocates cache space effectively. The network equipped with the intelligent resource management system shows lower request latency and a higher cache hit ratio than either ABC or ACM. Although the performance improvement comes partly at the cost of increased computational complexity and communication overhead, the intelligent system can perform well as hardware costs fall and hardware performance improves. Future work will focus on real network tests and economic cost calculations covering cache, computation, and communication costs.

Resource management problems in systems and networking frequently manifest as difficult online decision-making tasks where suitable solutions depend on understanding the workload and environment. Inspired by recent advances in deep reinforcement learning for AI problems, the authors of [24] consider building systems that learn to manage resources directly from experience. They present DeepRM, an example solution that translates the problem of packing tasks with multiple resource demands into a learning problem. Their initial results show that DeepRM performs comparably to state-of-the-art heuristics, adapts to different conditions, converges quickly, and learns strategies that are sensible in hindsight.

Information-centric networking (ICN) is a new network architecture based on accessing content. It aims to resolve some of the issues associated with IP networks, increasing content distribution capability and enhancing users' experience. To analyze request patterns and fully utilize popular cached contents, a novel intelligent resource management system is proposed in [22], which enables efficient cache resource allocation in real time based on changing user demand patterns. The system is composed of two parts. The first part is a fine-grained traffic estimation algorithm called Temporal Poisson traffic prediction (TP2) that aims at analyzing the traffic pattern (i.e., the aggregated user request demands) for specific contents. The second part is a collaborative cache placement algorithm based on the traffic estimated by TP2. The experimental results show that TP2 outperforms other related traffic prediction algorithms and that the proposed intelligent system can increase the utilization of cache resources and improve network capacity.

Hongzi Mao, Mohammad Alizadeh, Ishai Menache, and Srikanth Kandula in [24] show that it is possible to apply standard Deep RL methods to large-scale systems. Their early experiments show that the RL agent matches and occasionally outperforms ad-hoc heuristics for a multi-resource cluster scheduling problem. Learning resource management strategies directly from experience, if it can be made to work in a practical context, could offer a real alternative to today's heuristic-based approaches.



5. CONCLUSIONS
The two studies aimed to solve resource management problems in networks
The first study proposed a solution for networks of the type of central information networks (ICN), while the
other study was talking about systems and networks in general. The first study proposed a new intelligent
resource management system that used an algorithm called TP2 and aims to analyze the pattern of "combined
user requests"
While the second study proposed a learning system for managing resources directly from the experience called
DEEPRM, which aims to transform the problem of task packages with multiple resource requirements into a
learning problem.
The first study concluded that the use of the TP2 algorithm achieved better results than the other algorithms
by predicting other similar traffic, while the second study concluded that the use of DEEMRM performed
similar to the latest research methods and showed an adaptation to different conditions.
In conclusion, the first study clarified that the use of the proposed system contributed to increasing the use of
cache resources and improving network capacity, while the second study indicated that the use of the system
could provide a real alternative to approaches based on revealing current affairs.

6. REFERENCES
[1] C. Delimitrou and C. Kozyrakis. Quasar: Resource-efficient and qos-aware cluster management. ASPLOS ’14,
pages 127–144, New York, NY, USA, 2014. ACM.
[2] R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella. Multi-resource packing for cluster schedulers.
SIGCOMM ’14, pages 455–466, New York, NY, USA, 2014. ACM.
[3] J. Dean and L. A. Barroso. The tail at scale. Communications of the ACM, pages 74–80, 2013.
[4] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. A. Riedmiller. Playing atari
with deep reinforcement learning. CoRR, 2013.
[5] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.
[6] P. Abbeel, A. Coates, M. Quigley, and A. Y. Ng. An application of reinforcement learning to aerobatic helicopter
flight. Advances in neural information processing systems, page 1, 2007.
[7] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 2015.
[8] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. P. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu. Asynchronous
methods for deep reinforcement learning. CoRR, 2016.
[9] J. Schulman, S. Levine, P. Moritz, M. I. Jordan, and P. Abbeel. Trust region policy optimization. CoRR,
abs/1502.05477, 2015.
[10] S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou. Reoptimizing data parallel computing. In
NSDI, pages 281–294, San Jose, CA, 2012. USENIX.
[11] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
[12] J. Gao and R. Evans. DeepMind AI reduces Google data centre cooling bill by 40%. https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/.
[13] D. P. Bertsekas and J. N. Tsitsiklis. Neuro-dynamic programming: an overview. In Decision and Control. IEEE, 1995.
[14] I. Menache, S. Mannor, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning.
Annals of Operations Research, (1), 2005.
[15] M. T. Hagan, H. B. Demuth, M. H. Beale, and O. De Jesús. Neural network design. PWS publishing company
Boston, 1996.
[16] W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, (1), 1970.
[17] R. S. Sutton, D. A. McAllester, S. P. Singh, Y. Mansour, et al. Policy gradient methods for reinforcement learning
with function approximation. In NIPS, 1999.
[18] M. T. Hagan, H. B. Demuth, M. H. Beale, and O. De Jesús. Neural network design. PWS publishing company
Boston, 1996.
[19] L. Deng, G. Hinton, and B. Kingsbury, “New Types of Deep Neural Network Learning for Speech Recognition and
Related Applications: An Overview,” Proc. 38th Int’l Conf. on Acoustics, Speech and Signal Processing, 2013.
[20] V. Mnih et al., “Human-Level Control Through Deep Reinforcement Learning,” Nature, vol. 518, no. 7540, 2015,
pp. 529–533.
[21] Y. LeCun, Y. Bengio, and G. Hinton, “Deep Learning,” Nature, vol. 521, no. 7553, 2015, pp. 436–444.
[22] Yu Zhang, Jianguo Yao, and Haibing Guan. Intelligent cloud resource management with deep reinforcement learning. IEEE Cloud Computing, 2017.
[23] S. Caton et al., “A Social Compute Cloud: Allocating and Sharing Infrastructure Resources via
Social Networks,” IEEE Trans. Services Computing, vol. 7, no. 3, 2014, pp. 359–372.
[24] Hongzi Mao, Mohammad Alizadeh, Ishai Menache, and Srikanth Kandula. Resource management with deep reinforcement learning. In Proceedings of the 15th ACM Workshop on Hot Topics in Networks (HotNets), 2016. ACM.
