
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.


IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS
SEEDS: A Solar-Based Energy-Efficient
Distributed Server Farm
Chien-Ming Cheng, Student Member, IEEE, Shiao-Li Tsao, Member, IEEE, and Pei-Yun Lin
Abstract: Distributed renewable energy has emerged as a promising resource because of its environmental friendliness and economic considerations. However, most renewable energy sources are unreliable and may require considerable effort to be efficiently utilized in a computing center for providing services. In this paper, we exploit distributed renewable energy (e.g., solar energy) and peer-to-peer (P2P) technologies to aggregate distributed computing resources to provide an infrastructure called solar-based energy-efficient distributed server (SEEDS) farm for distributed computing and distributed storage. Energy-efficient devices (e.g., embedded devices) powered by solar energy form a P2P computing system to provide their computing resources to end-users. Specifically, this paper uses a Web-based service as a case study. A group of solar-powered embedded devices acts as a front-end cooperative caching system for Web servers. Web objects may be accessed through the distributed caching system without going through servers, and thus we can reduce brown energy consumption of the servers. This paper also develops an analytical model to evaluate the total energy consumption of Web-based services with and without SEEDS. Theoretical and simulation results show that the SEEDS system can support services and achieve significant improvements in energy efficiency by aggregating distributed energy resources.
Index Terms: Distributed computing, distributed energy resources (DERs), distributed storage, peer-to-peer systems.
Manuscript received October 9, 2013; revised March 21, 2014; accepted May 7, 2014. This work was supported in part by MediaTek Inc., and in part by the National Science Council of the Republic of China under Contract 102-2220-E-009-020 and Contract 102-2219-E-009-006. This paper was recommended by Associate Editor M. Celenk.
C.-M. Cheng and S.-L. Tsao are with the Department of Computer Science, National Chiao Tung University, Hsinchu 300, Taiwan (e-mail: zjm@cs.nctu.edu.tw; sltsao@cs.nctu.edu.tw).
P.-Y. Lin is with MediaTek Inc., Hsinchu 300, Taiwan (e-mail: lincs97@gmail.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMC.2014.2329277
2168-2216 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
I. INTRODUCTION
RENEWABLE energy resources such as solar energy and wind energy are widely available on Earth and have attracted much interest in both research and practical applications [1]. The use of renewable energy not only offers economic benefits, but also achieves environmental sustainability. Researchers have investigated the distributed generation of electricity [2] and the local use of renewable energy resources, promoting the concept of distributed energy resources (DERs) [3]. However, most renewable energy sources (e.g., solar and wind) are intermittent and variable in nature. Considerable effort is required to utilize renewable energy in a computing center to provide a reliable power supply system for services and applications. For example, a solar-powered datacenter may require a large-scale installation of solar panels or a supplemental supply of other power sources (e.g., electricity) when solar energy is unavailable. Although recent studies have considered the integration of energy storage technologies (e.g., flywheels and batteries) with intermittent energy resources [4], efficiently using and aggregating distributed renewable energy remains a challenging issue, as these energy resources are typically small in scale and located in a large number of locations.
This paper proposes a novel approach to convert distributed renewable energy (e.g., solar and wind energy) into distributed computing power (e.g., data processing and/or storage) for providing services. Among the various renewable energy technologies, solar energy is easily accessible and available in most areas [5]. Therefore, this paper uses solar energy as an example to present a system called solar-based energy-efficient distributed server (SEEDS) farm. Energy-efficient devices (e.g., low-power embedded devices) which can operate on solar energy provide their computing resources such as processing power and/or storage space to users. These devices are distributed over the world to fully utilize distributed energy. The proposed system applies peer-to-peer (P2P) technology to aggregate distributed computing resources with scalability, self-organization, and fault-tolerance properties. By converting distributed energy into distributed computing power, SEEDS aggregates and utilizes free energy resources, and offers an infrastructure for distributed computing and/or storage.
With proper planning and management, one centralized and large-scale installation of solar panels may achieve better utilization of renewable energy and efficiency of renewable energy conversion. However, a large-scale installation of solar panels usually requires significant investment, and installations of solar panels around the world may not be able to leverage each other due to lack of smart grid infrastructure and inefficiency of long-distance power delivery. Therefore, we suggest a decentralized approach and encourage individuals and companies who are willing to reduce brown energy consumption of computing/storage services to install small-scale renewable energy generation devices at an affordable cost, contribute and share the energy resources (i.e., computing and storage resources) with each other, and form a high-availability computing and/or storage infrastructure. SolarCity [6] also encourages individuals to utilize and share renewable energy, but the service depends on a smart grid infrastructure and might not be sharable among users around the world. The idea behind the SEEDS system is to exploit DERs that are available everywhere but not easy to share over long distances and different locations, and to convert the DERs into distributed computing resources to reduce brown energy consumption of computing/storage services. Moreover, the SEEDS system can expand incrementally, and distributed resources can be aggregated as more SEEDS nodes participate.
We first use a Web-based service as a case study of distributed storage, and also elaborate the design of distributed computation services to illustrate the proposed distributed computing and/or storage infrastructure. A number of energy-efficient devices powered by solar energy act as cache nodes to form a front-end caching system for Web servers. The cache nodes are implemented on low-power embedded devices with a small form factor, and designed to be energy-efficient. This makes it feasible and convenient to incrementally deploy the self-sustained cache nodes, which co-locate with energy resources and utilize distributed energy. With the continuing growth in the number of servers, several studies have reported that servers account for a large portion of IT energy expenditures and exhibit a high growth rate of energy consumption [7]. In addition, most existing servers rely on electricity generated from nonrenewable energy sources (e.g., fossil fuels), also known as brown energy. The proposed caching system, which operates on renewable energy (i.e., green energy), can handle user requests without burdening origin servers. Therefore, the servers can remain in a low-power state (i.e., low operating frequency and voltage) most of the time, effectively reducing brown energy consumption in providing the service. This paper develops an analytical model to estimate the energy consumption of both the conventional service architecture and the proposed system, and presents simulations and a prototype implementation. Results show that SEEDS can provide a distributed storage service and reduce the brown energy consumption of Web-based services.
The rest of the paper is organized as follows. Section II
reviews recent research on renewable energy use in provid-
ing services. Section III presents the system architecture of
the proposed SEEDS system. Section IV presents case studies
of distributed storage and computation. Section V models the
energy consumption of the conventional service architecture
and SEEDS system. Section VI discusses the simulation and
prototype implementation. Finally, Section VII concludes this
paper.
II. RELATED WORK
Because of environmental considerations and economic benefits, solar energy has been used as an energy resource for servers in datacenters [8]. Although solar energy is widely distributed and easily accessible, it is intermittent and highly dependent on weather conditions. There is also a great need for achieving high solar energy conversion efficiency when converting solar energy into electricity [9]. Therefore, a solar-powered datacenter operates depending on the availability of solar energy and the amount of energy supplied to it. GreenSlot [10] and GreenHadoop [11] develop job scheduling policies for a datacenter powered by solar energy and the electricity grid. They predict future solar energy availability and schedule batch jobs to maximize green energy use while minimizing brown energy consumption. GreenSwitch [12] schedules workloads and selects the energy supply from multiple energy sources (e.g., solar, battery, and electricity) to minimize electricity cost. These works attempt to manage workload execution and renewable energy use, and may use other energy supplies (e.g., electricity) when needed.
Fig. 1. SEEDS system architecture.
This paper attempts to exploit and aggregate renewable energy resources that are distributed and available in various locations. Available local renewable energy is directly converted into computing resources. P2P technology is adopted to construct a distributed computing and/or storage infrastructure. When renewable energy is not available in one location, the renewable energy available in other locations can take its place to supply the computing resources. The integration of the distributed computing paradigm with the use of distributed renewable energy can provide benefits to various applications and services with lower brown energy consumption.
In this paper, we use a Web-based service as an example to demonstrate the proposed idea because Web servers consume a significant amount of energy in serving user requests. Researchers have proposed several approaches to reduce the energy consumption of Web servers. These approaches include dynamic voltage scaling [13], request batching [14], and request redirection [15]. Most of these approaches focus on enabling servers to process user requests more efficiently. Our proposed approach, which can be regarded as complementary to existing approaches, attempts to distribute user requests to a number of cache nodes operating on renewable energy, thus minimizing the load on servers and their energy consumption.
III. SEEDS SYSTEM ARCHITECTURE
In the SEEDS system, the participating devices, called the
SEEDS nodes, are distributed worldwide and provide their
computing resources such as processing power and/or storage
space to users. The SEEDS nodes are organized into a P2P
overlay network, called a SEEDS overlay, to form an infras-
tructure for sharing of distributed computing resources. The
available resources that are distributed on SEEDS nodes can
be found in the SEEDS overlay using P2P lookup mecha-
nisms [16]. Fig. 1 shows the SEEDS system architecture.
This paper uses Chord [17] as an example to construct a SEEDS overlay. Chord uses a circular $m$-bit identifier space modulo $2^m$. Each SEEDS node participating in Chord has a unique key, called the SEEDS node identifier (ID), that is mapped to the identifier space and can be generated by hashing the IP address of the SEEDS node. After a SEEDS node participates in the SEEDS overlay, it announces its available resources such as computation resources and stored data in the SEEDS overlay. A resource is identified by a unique resource ID or key mapped to the identifier space. The key can be generated by computing a hash of the resource. A key $k$ is assigned to the successor of the key, the first SEEDS node whose ID is equal to or follows $k$ in the identifier space. Thus, each SEEDS node is responsible for the keys that fall between its predecessor and itself. When a resource is requested, the resource can be located through lookups performed between SEEDS nodes in the SEEDS overlay. A lookup would require $O(\log N)$ messages on average, where $N$ is the number of SEEDS nodes in the Chord overlay.
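As a rough illustration of the key-to-node mapping described above, the following Python sketch hashes a string into an m-bit identifier and finds the successor of a key on the ring. The use of SHA-1, the value m = 6, the example IP address, and the node IDs are assumptions chosen for readability; they are not taken from the SEEDS implementation.

```python
import hashlib
from bisect import bisect_left

M = 6  # assumed identifier length in bits; the ring has 2**M positions


def chord_id(text: str, m: int = M) -> int:
    """Map a string (e.g., an IP address) into the m-bit identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key: int, node_ids: list[int]) -> int:
    """Return the first node ID equal to or following key on the ring."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key)   # first position with ring[idx] >= key
    return ring[idx % len(ring)]   # wrap around past the largest ID


# Hypothetical node IDs on a 6-bit ring.
nodes = [15, 22, 30, 38, 45, 53, 60]
print(successor(27, nodes))   # 30
print(successor(61, nodes))   # 15 (wraps around the ring)
print(0 <= chord_id("192.0.2.10") < 2 ** M)  # True
```

A real Chord node does not hold the full sorted list of IDs; it reaches the successor through its finger table. The sorted list here only makes the ownership rule easy to check.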
In the proposed SEEDS system, each SEEDS node joins and leaves the system depending on the energy supplied to it. When a SEEDS node joins the system, it first constructs a finger table and establishes connections to the successor and predecessor. When a SEEDS node leaves the system because of insufficient energy, it notifies its successor and the successors of its resources. The departing SEEDS node also stores its successor, predecessor, and finger nodes in its local storage. Upon rejoining the system, the SEEDS node first tries to reconnect to those known nodes and then rebuilds its connections. If none of these known nodes is active, the SEEDS node contacts bootstrap nodes, as it does when joining for the first time. This scheme reduces the workload of bootstrap nodes. Similar to conventional Chord-based systems [17], each SEEDS node in the SEEDS overlay periodically performs the Stabilization procedure to check the connections to its successor and predecessor. Each SEEDS node also periodically performs the FixFinger procedure to ensure that the SEEDS nodes in the finger table are correct.
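The rejoin rule described above (try the nodes saved before shutdown, and fall back to bootstrap nodes only if none of them responds) can be sketched as follows. The function name, the node labels, and the `is_active` probe are hypothetical; how liveness is actually checked is not specified here.

```python
def pick_join_contacts(saved_contacts, bootstrap_nodes, is_active):
    """Prefer previously known overlay nodes; fall back to bootstrap nodes."""
    live = [node for node in saved_contacts if is_active(node)]
    return live if live else list(bootstrap_nodes)


# Hypothetical state saved before the node ran out of energy.
saved = ["N22", "N38", "N60"]   # e.g., predecessor, successor, one finger
bootstrap = ["B1"]

print(pick_join_contacts(saved, bootstrap, lambda n: n == "N38"))  # ['N38']
print(pick_join_contacts(saved, bootstrap, lambda n: False))       # ['B1']
```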
IV. CASE STUDIES OF DISTRIBUTED STORAGE AND
DISTRIBUTED COMPUTATION
To illustrate the SEEDS system, we first use a Web-based
service as a case study of distributed storage. A website can
utilize the distributed storage provided by SEEDS nodes. The
SEEDS nodes act as a front-end cooperative caching stor-
age system for Web servers. Then, we elaborate the design
of distributed computation services based on the SEEDS
infrastructure.
A. Overview
For a Web-based service, the SEEDS system consists of three components: servers, clients, and a number of cache nodes. Servers are conventional Web servers from which clients (i.e., users) can access Web objects such as HTML Web pages and images. A website may adopt a clustering architecture and use multiple Web servers to handle client requests [18]. The requests sent to the website can be distributed among the servers of the cluster. Although the servers of a website do not participate in the SEEDS overlay, they also have an ID mapped to the identifier space. The server ID, which can be generated by hashing the website's uniform resource identifier (URI), is used as the group ID for a group of SEEDS nodes. The following section describes group organization. A website stores a number of objects, and each object has a unique object key mapped to the SEEDS overlay. The key can be generated by computing a hash of the object's uniform resource locator (URL).
Fig. 2. Example of the SEEDS system using Chord. Eight SEEDS nodes provide a distributed storage system for two websites (i.e., server IDs 23 and 35).
The cache node (i.e., the SEEDS node) is implemented on an embedded device powered by a solar panel or other distributed renewable energy. The SEEDS nodes are organized into the SEEDS overlay to form a cooperative caching storage system. When receiving requests from clients, SEEDS nodes retrieve the requested objects from servers and cache the objects in local storage to serve subsequent requests for those objects. Rather than keeping only the location of an object on the successor of the object key, SEEDS caches the object itself on the SEEDS node responsible for the key of the object. Each SEEDS node joins and leaves the system depending on the amount of solar energy generated by its solar panel. Fig. 2 shows an example of the SEEDS system.
An agent program is installed on a client machine to help
the client browser process all Web requests and responses. A
single Web page may contain a number of embedded objects.
This paper assumes that each client request is for a single
Web object identied by a URL. In general, when a browser
submits a URL request, it rst checks whether the requested
object is in the local browser cache. If a hit occurs, the object
is directly returned to the browser. Otherwise, the browser
sends out the request. The browser settings are modied by
setting the agent program as the local proxy so that the browser
forwards requests to the agent program instead of sending
requests to the network directly. The agent program converts
the requests into the SEEDS message format, sends them to
the SEEDS overlay to retrieve objects, and processes the
received responses. Upon receiving requests, SEEDS nodes
perform lookups for the requested objects in the SEEDS over-
lay. Based on the lookup results, the agent program may obtain
the requested objects from SEEDS nodes if the objects are
found in the SEEDS overlay or request objects from origin
servers otherwise. The retrieved objects can then be returned
to the browser. To accelerate the retrieval of objects, the agent
program also extracts information about SEEDS nodes in the
overlay from the lookup results and stores that information in
a local resource table.
B. Group Organization
After a SEEDS node has joined a SEEDS overlay, it must join a group and obtain the group information. In a distributed storage system, SEEDS nodes are organized into cache groups to serve particular websites. A SEEDS node may join multiple groups, and a group may serve several websites. A physical node that participates in the SEEDS overlay could be a solar-powered embedded device or a data center powered by renewable energy. To enable the SEEDS system to accommodate heterogeneous nodes, which have different computing and storage capacities, we define a SEEDS node as a logical node that provides a basic unit of computing or storage capacity. The resources of a physical node must constitute an integral number of logical SEEDS nodes, and each SEEDS node participates in the SEEDS system. Although the design allows a group to serve multiple servers, we assume in the rest of this paper, without loss of generality, that each SEEDS node joins only one group and that each group is configured to serve one website. The objects available on a website are cached and stored only on the SEEDS nodes of the group that serves the website. This caching management scheme efficiently locates objects in a SEEDS overlay, as described later.
Each group uses the ID of the served website as its group ID. The directory node of a group is the first SEEDS node whose ID is equal to or follows the group ID in the SEEDS overlay. Note that the directory node of a group may not belong to the group. The directory node maintains a group node list that contains state information about SEEDS nodes in a group (e.g., locations and load status of the SEEDS nodes in the group). The SEEDS nodes in a group also locally store the node list of the group. Each group periodically updates the group node lists stored on the directory node of this group and on the group members. Let $N(G)$ be the set of SEEDS nodes in group $G$, and let $L(G) = \{I(n) \mid n \in N(G)\}$ be the group node list for group $G$, where $I(n)$ is the state information of SEEDS node $n$.
Fig. 2 provides an example of the group organization. Assume that server 23 is served by group 23 with N(G23) = {N15, N22, N38, N60}. The object $o_{27} \in$ O(S23) has a key $k_{27}$ and should be cached on a SEEDS node in N(G23) (i.e., SEEDS node 38), not SEEDS node 30. In addition, the objects on server 23 with keys between SEEDS node 22 and SEEDS node 38 should be cached on SEEDS node 38. Although SEEDS node 30 does not belong to group 23, it is the directory node of group 23 and maintains the group node list L(G23) = {I(N15), I(N22), I(N38), I(N60)}. The list L(G23) is also stored on the group nodes of group 23 (i.e., nodes 15, 22, 38, and 60), as shown in Fig. 2.
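The distinction between the directory node (the successor of the group ID in the whole overlay) and the caching node (the successor of the object key within the group) can be checked with a small Python sketch. The `successor` helper is an illustrative stand-in for a Chord lookup, and the overlay list contains only the node IDs named in the text, not all eight nodes of Fig. 2.

```python
from bisect import bisect_left


def successor(key, node_ids):
    """First node ID equal to or following key on the identifier ring."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, key) % len(ring)]


# Only the node IDs named in the text; the overlay in Fig. 2 has more nodes.
overlay = [15, 22, 30, 38, 60]
group_23 = [15, 22, 38, 60]               # N(G23)

directory_node = successor(23, overlay)   # lookup over the whole overlay
cache_node = successor(27, group_23)      # lookup restricted to the group

print(directory_node)  # 30
print(cache_node)      # 38
```

Restricting the successor search to the group list is what moves the object with key 27 from node 30 to node 38.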
Each SEEDS node must determine which group it belongs to. A SEEDS node could be preconfigured to join a particular group or assigned to a group by the bootstrap node. The bootstrap node maintains a group table that stores state information about groups in the SEEDS overlay, including group ID and group status. The bootstrap node decides which group a SEEDS node should join (for example, a randomly chosen group) and provides the necessary information for joining a group. In most cases, the group assignment occurs only when a SEEDS node joins the system for the first time. When the SEEDS node joins the system again, it joins the same group. A SEEDS node may move from one group to another (for example, to an overloaded group) if a group load balancing mechanism is applied.
The group joining procedure is as follows. When a SEEDS node $n$ wants to join a group $G$, it sends a join request by performing a lookup of group $G$ in the SEEDS overlay. The directory node of group $G$ can be found within an expected $O(\log N)$ node hops using a Chord lookup mechanism when there are $N$ SEEDS nodes in the overlay. Upon receiving the request, the directory node of group $G$ adds SEEDS node $n$ to the node list of group $G$ and then replies to SEEDS node $n$ with the current node list of group $G$. If the directory node does not have the node list of group $G$, it creates a new group node list for group $G$, and SEEDS node $n$ becomes the first node in this list. In the next section, we formulate the group management problem and present our solutions.
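The directory node's side of the join procedure can be sketched in a few lines of Python. The class and method names are hypothetical, and the sketch leaves out message transport and the per-node state information; it shows only the create-on-first-join and reply-with-list behavior described above.

```python
class DirectoryNode:
    """Keeps one node list per group this node is responsible for."""

    def __init__(self):
        self.group_lists = {}   # group ID -> list of member node IDs

    def handle_join(self, group_id, node_id):
        # Create the list on first use, so the first joiner starts the group.
        members = self.group_lists.setdefault(group_id, [])
        if node_id not in members:
            members.append(node_id)
        return list(members)    # reply with a copy of the current node list


directory = DirectoryNode()
print(directory.handle_join(23, 15))  # [15]
print(directory.handle_join(23, 38))  # [15, 38]
```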
C. Group Management
Previous studies [19] indicated that servers may experience different and time-variant workloads, as shown in Fig. 3. Depending on the hardware capacities, geographic locations, and network connections and bandwidth, SEEDS nodes may have various characteristics such as different computing and storage capacities, active working periods, or network latencies to clients. Therefore, group management for assigning newly joined SEEDS nodes or rearranging existing SEEDS nodes to suitable groups significantly influences the overall system performance. As can be seen from the example in Fig. 3, SEEDS node N45 may be more suitable than SEEDS node N38 to join the group of server S35 because the peak workload of the server matches the peak service capacity of the SEEDS node.
Fig. 3. Group management of the SEEDS system.
For simplicity, each server represents a website. Assume that there are $N_s$ servers, $S_1, S_2, \ldots, S_{N_s}$, and $N_n$ SEEDS nodes, $N_1, N_2, \ldots, N_{N_n}$. The workload that server $i$ suffers at time $t$ is $S_i(t)$, and the service capacity of SEEDS node $j$ at time $t$ is $C_j(t)$. $C_j(t)$ can be determined by the computing capacity of SEEDS node $j$, denoted as $C^c_j$, provided that there is sufficient energy to supply SEEDS node $j$ at time $t$, denoted as $C^e_j(t)$. $C^e_j(t)$ is either 1, indicating that sufficient energy is supplied, or 0, which means energy is insufficient. $C_j(t)$ can be calculated as $C_j(t) = C^c_j \times C^e_j(t)$. We define $g_{ij} = 1$ to represent that SEEDS node $j$ is serving server $i$. $\sum_{k=1}^{N_s} g_{kj} = 1$ implies that SEEDS node $j$ serves only one server. We define group $i$ as a group of SEEDS nodes serving server $i$; i.e., $G_i = \{N_k \mid g_{ik} = 1\}$, and the size of group $i$ is $|G_i|$. On the other hand, to determine the hit rate of SEEDS nodes when a client accesses server $i$ through the SEEDS nodes, we have to calculate the objects that are cached on the SEEDS nodes in group $i$. We define the storage capacity of SEEDS node $j$ in terms of the number of cached objects as $C^s_j$, and then the total storage capacity of SEEDS nodes in group $i$ becomes $Z_i = \sum_{N_j \in G_i} C^s_j$. We assume that the access probability of the $i$th popular object is $f(i)$, and that the hit rate of the cached objects on the SEEDS nodes in group $i$ can be derived as $\rho_i = \sum_{k=1}^{Z_i} f(k)$. Based on the above descriptions, we then model the expected access delay of server $i$ at time $t$ as

$$D_i(t) = \rho_i \times \mathrm{Min}\left(D^{\mathrm{hit}}_i(t),\, D^{\mathrm{timeout}}_i\right) + (1 - \rho_i) \times \mathrm{Min}\left(D^{\mathrm{miss}}_i(t),\, D^{\mathrm{timeout}}_i\right),$$

where

$$D^{\mathrm{hit}}_i(t) = 2 \times D_{cG_i} + D^{l}_{G_i} + D^{p}_{G_i}(t),$$
$$D^{\mathrm{miss}}_i(t) = D^{\mathrm{hit}}_i(t) + 2 \times D_{cS_i},$$
$$D^{\mathrm{timeout}}_i = T_{\mathrm{timeout}} + 2 \times D_{cS_i}.$$

In the above equations, $D_{cG_i}$ is the average delay between any client and the SEEDS nodes in group $i$, and $D_{cS_i}$ is the average delay between any client and server $i$. $D^{l}_{G_i}$ is the average latency of an object lookup in group $i$, and $D^{l}_{G_i}$ can be further defined as $(1 - 1/|G_i|) \times D_a$, where $D_a$ is the average one-hop latency in the SEEDS overlay. $D^{p}_{G_i}(t)$ is the average queuing delay of the request that is served by a SEEDS node in group $i$ at time $t$. The model indicates that if the requested object can be found in the SEEDS nodes in group $i$ and if the SEEDS nodes can answer the request within $D^{\mathrm{timeout}}_i$, the client can receive the responses from the SEEDS nodes in $D^{\mathrm{hit}}_i(t)$. If the requested object can be found in the SEEDS nodes in group $i$ but the SEEDS nodes cannot answer the request within $D^{\mathrm{timeout}}_i$, the client sends the request to server $i$ directly and can receive the response after $D^{\mathrm{timeout}}_i$. If the requested object cannot be found through the SEEDS nodes in group $i$, the client receives the object from server $i$ after $\mathrm{Min}(D^{\mathrm{miss}}_i(t), D^{\mathrm{timeout}}_i)$.
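To make the delay model above concrete, the following Python sketch evaluates the expected access delay for one server at one instant. The function and parameter names are illustrative, and the numeric inputs are made-up values in milliseconds rather than measurements.

```python
def hit_rate(popularity, capacity):
    """Sum of the access probabilities f(1), ..., f(Z_i) of the cached objects."""
    return sum(popularity[:capacity])


def expected_delay(hit_prob, d_cg, d_cs, d_a, group_size, d_queue, t_timeout):
    """Expected access delay of one server at one instant."""
    d_lookup = (1 - 1 / group_size) * d_a      # average lookup latency in the group
    d_hit = 2 * d_cg + d_lookup + d_queue      # object served by the group
    d_miss = d_hit + 2 * d_cs                  # object fetched from the server
    d_timeout = t_timeout + 2 * d_cs           # client gives up on the group
    return (hit_prob * min(d_hit, d_timeout)
            + (1 - hit_prob) * min(d_miss, d_timeout))


# Made-up delays in milliseconds.
print(expected_delay(hit_prob=0.5, d_cg=10, d_cs=50, d_a=20,
                     group_size=4, d_queue=5, t_timeout=200))  # 90.0
```

With these inputs the hit path costs 40 ms and the miss path 140 ms, both below the 300 ms timeout bound, so the expected delay is their equally weighted mean.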
The goal of group management is to minimize the average delay when clients access any servers. We can define the objective function as $D_{\mathrm{SEEDS}} = \sum_{i=1}^{N_s} \omega_i \times \bar{D}_i$, where $\omega_i$ is the access probability of server $i$, which can be modeled as $\omega_i = \sum_t S_i(t) / \sum_{k=1}^{N_s} \sum_t S_k(t)$, and $\bar{D}_i$ is the average latency for accessing server $i$ from any client over time $T$; i.e., $\bar{D}_i = \sum_t D_i(t)/T$. Therefore, the optimization of group management can be formulated so as to determine the assignment of all SEEDS nodes to all groups, in order to minimize the average access delay of the SEEDS system. We then can derive an integer linear programming (ILP) model and find the optimal assignment. The ILP model can be written as

$$
\begin{aligned}
\text{Find} \quad & g_{ij}, \quad \forall j \in \{1, \ldots, N_n\} \\
\text{Minimize} \quad & D_{\mathrm{SEEDS}} \\
\text{Subject to} \quad & \sum_{k=1}^{N_s} g_{kj} = 1, \quad \forall j \in \{1, \ldots, N_n\} \\
& g_{ij} \in \{0, 1\}, \quad \forall i \in \{1, \ldots, N_s\},\ \forall j \in \{1, \ldots, N_n\}.
\end{aligned}
$$
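For very small instances, the assignment problem above can be solved by brute force, which also shows why it does not scale: the search space contains one candidate per way of giving each node exactly one server. The Python sketch below is only an illustration of the problem structure, not the ILP solver used in the paper; the objective is a toy stand-in for the average delay, and the weights and node names are hypothetical.

```python
from itertools import product


def exhaustive_assignment(nodes, servers, objective):
    """Try every assignment in which each node serves exactly one server."""
    best, best_cost = None, float("inf")
    for choice in product(servers, repeat=len(nodes)):
        assignment = dict(zip(nodes, choice))   # node -> its single server
        cost = objective(assignment)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


# Toy stand-in for the objective: a server's delay shrinks as its group grows.
weights = {"S23": 0.7, "S35": 0.3}              # hypothetical access probabilities


def toy_objective(assignment):
    size = {server: 0 for server in weights}
    for server in assignment.values():
        size[server] += 1
    return sum(w * 100 / (1 + size[s]) for s, w in weights.items())


best, cost = exhaustive_assignment(["N45", "N53"], list(weights), toy_objective)
print(sorted(best.values()), round(cost, 6))
```

Representing an assignment as one server per node enforces the one-group-per-node constraint by construction. The loop visits `len(servers) ** len(nodes)` candidates, so it is usable only as a reference for tiny cases.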
Due to the complexity of the optimal solution and the difficulty of solving an ILP model in a distributed environment, we propose a greedy algorithm to assign SEEDS nodes to groups. First, each SEEDS node collects the object hit rate, the average processing delay for handling requests at different times, and the average delay between clients and the SEEDS node. The SEEDS node reports the information to the directory node of its group, say group $i$, periodically. The directory node of group $i$ averages the information from the SEEDS nodes in its group and derives $D^{\mathrm{hit}}_i(t)$. The directory node sends the information to the bootstrap node periodically. The bootstrap node further collects the delay information between clients and server $i$, e.g., by querying server $i$, and thus can derive $D^{\mathrm{miss}}_i(t)$ and $D^{\mathrm{timeout}}_i$. The bootstrap node can thus estimate the current $D_{\mathrm{SEEDS}}$. When a SEEDS node, say node $j$, first joins the SEEDS system, the bootstrap node determines what value of $g_{ij}$ minimizes $D_{\mathrm{SEEDS}}$.
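The greedy step performed by the bootstrap node can be sketched as follows: for a newly joined node, try each group in turn and keep the one that gives the lowest estimated average delay. The function names, the `group_delay` callback, the weights, and the toy delay function are assumptions for illustration; the paper derives the delays from the reports collected by the directory nodes.

```python
def d_seeds(assignment, weights, group_delay):
    """Weighted average delay; group_delay(server, members) is caller-supplied."""
    groups = {server: [] for server in weights}
    for node, server in assignment.items():
        groups[server].append(node)
    return sum(w * group_delay(s, groups[s]) for s, w in weights.items())


def greedy_assign(new_node, assignment, weights, group_delay):
    """Place new_node in the group that yields the lowest estimated delay."""
    def cost(server):
        trial = dict(assignment)      # leave the current assignment untouched
        trial[new_node] = server
        return d_seeds(trial, weights, group_delay)
    return min(weights, key=cost)


# Hypothetical inputs: two servers and a delay that shrinks with group size.
weights = {"S23": 0.7, "S35": 0.3}
toy_delay = lambda server, members: 100 / (1 + len(members))

assignment = {}
for node in ["N15", "N22", "N38"]:
    assignment[node] = greedy_assign(node, assignment, weights, toy_delay)
print(assignment)
```

Each decision looks only at the nodes that have already joined, which is why the result can differ from the optimal assignment once more nodes arrive.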
The greedy algorithm cannot guarantee the optimal group assignment, and we thus suggest heuristics to reassign SEEDS nodes to groups. We have two observations. First, when a SEEDS node operates during its peak service period but its serving server has a low workload, the SEEDS node might not fully contribute its service capacity in serving the server. In that case, the utilization of the SEEDS node is usually low. In Fig. 3, when N45 joins the SEEDS overlay, the greedy algorithm may assign N45 to S23 to optimize $D_{\mathrm{SEEDS}}$ at that moment. However, N45's capacity cannot be fully exploited, and we may consider rearranging the group assignments if more SEEDS nodes join the overlay. The second observation is that although a SEEDS node can achieve high utilization, the delay between the SEEDS node and its clients may be large. This indicates that although the peak service period of the SEEDS node matches its server's workload, the network connectivity and bandwidth between the SEEDS node and its serving clients are poor. For example, assigning N53 to S23 would improve utilization, but the delay between N53 and its serving clients would be large. Therefore, a better assignment of N53 should be S35. This suggests that heuristic group management should reassign low-utilization and large-delay SEEDS nodes to other groups in order to maximize the utilization of the SEEDS nodes and to reduce the communication delay between the SEEDS nodes and their serving clients. We maintain two thresholds: U and D. When a SEEDS node's utilization is lower than U, or the delay between a SEEDS node and its serving clients is larger than D, the SEEDS node initiates a group reassignment request to the bootstrap node. The bootstrap node periodically checks the reassignment request queue and determines the new assignment based on the objective function: minimizing $D_{\mathrm{SEEDS}}$.
To avoid interruption of service, a SEEDS node should gracefully transition from the old group to the new group. Moreover, since the bootstrap node might not find better groups for the SEEDS nodes initiating the requests, an exponential backoff mechanism is applied. This means that if a SEEDS node initiates a request but cannot find a more suitable group, it should double the waiting period before it can initiate another reassignment request. This mechanism is meant to avoid frequent reassignment requests and group rearrangement. Section V models the energy saving of the SEEDS infrastructure based on random group assignment and optimal group assignment. Section VI presents a simulation of random group assignment and greedy group assignment schemes and evaluates their performance.
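The reassignment trigger and the doubling of the waiting period can be sketched together in Python. The names and the 60-second base wait are hypothetical, and resetting the wait after a successful move is an assumption of this sketch; the text specifies only the doubling after an unsuccessful request.

```python
def needs_reassignment(utilization, delay, min_utilization, max_delay):
    """True when a node is under-used or too far from its clients."""
    return utilization < min_utilization or delay > max_delay


class ReassignmentBackoff:
    """Doubles the wait after every request that finds no better group."""

    def __init__(self, base_wait):
        self.base_wait = base_wait
        self.wait = base_wait

    def record(self, found_better_group):
        if found_better_group:
            self.wait = self.base_wait   # assumption: reset after a successful move
        else:
            self.wait *= 2
        return self.wait


backoff = ReassignmentBackoff(base_wait=60)        # seconds, hypothetical
print(needs_reassignment(0.2, 30, 0.5, 100))       # True (low utilization)
print([backoff.record(False) for _ in range(3)])   # [120, 240, 480]
print(backoff.record(True))                        # 60
```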
D. Object Retrieval
When a client requests an object on a website (i.e., a URL), the group ID (i.e., server ID) and object key can be computed from the requested URL. The client selects one SEEDS node from its resource table and sends the request to that node. The resource table maintains information, such as IP address, group ID, and service time, about $r$ SEEDS nodes $\{n_1, n_2, \ldots, n_r\}$ in the SEEDS overlay, where $r$ is a system parameter. According to the group ID, the client first selects an active SEEDS node that belongs to the requested group. We suggest that clients keep the average service latencies of the cached SEEDS nodes in the resource table. The average service latency of a cached node is a function of time, so it may differ at different times. When a client sends a request, it picks the SEEDS node with the minimal service latency at that specific time to retrieve the object from the SEEDS overlay. However, a client may not know any SEEDS node of the requested group because it has not yet made a request to the website. In this case, it selects the SEEDS node with the minimal service latency among all the SEEDS nodes in the resource table. After the first request to the website, the client can discover other SEEDS nodes belonging to the group from the reply messages. Thus, subsequent requests for objects on the website can be sent directly to the SEEDS nodes in the group.
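The client-side selection rule described above can be sketched as follows. The dictionary layout of a resource-table entry, the hourly granularity of the latency profile, and the example addresses are assumptions made for illustration; the text only states that the table holds the IP address, group ID, and service time of each cached node.

```python
def select_node(resource_table, group_id, hour):
    """Pick the cached SEEDS node with the lowest expected latency.

    resource_table: list of dicts with keys 'ip', 'group_id', 'active', and
    'latency_by_hour' (24 average latencies in ms, an assumed representation
    of the time-dependent average service latency).
    """
    active = [n for n in resource_table if n["active"]]
    in_group = [n for n in active if n["group_id"] == group_id]
    # If no member of the requested group is known yet, fall back to all nodes.
    candidates = in_group or active
    if not candidates:
        return None  # nothing cached: the caller contacts the origin server
    return min(candidates, key=lambda n: n["latency_by_hour"][hour])


# Hypothetical resource table with three cached nodes.
table = [
    {"ip": "10.0.0.1", "group_id": 23, "active": True, "latency_by_hour": [80] * 24},
    {"ip": "10.0.0.2", "group_id": 23, "active": True, "latency_by_hour": [50] * 24},
    {"ip": "10.0.0.3", "group_id": 7, "active": True, "latency_by_hour": [20] * 24},
]
known_group = select_node(table, 23, 9)    # a member of group 23 is known
unknown_group = select_node(table, 99, 9)  # no member known: global minimum
```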
As an example, suppose that a client requests an object associated with key $k$ on a website served by group $G$ in the SEEDS overlay. Assume that the client sends the request to a SEEDS node $n$ that does not have the object. SEEDS node $n$ checks whether it belongs to group $G$. If $n \in N(G)$, SEEDS node $n$ can determine the SEEDS node in $N(G)$ that is responsible for key $k$ (i.e., the successor of key $k$) using its local group node list, and then directly forwards the request to the successor, say SEEDS node $n_k$. Note that SEEDS node $n$ need not forward the request if it is itself the successor of key $k$. On the other hand, if $n \in N(G')$ where $G' \neq G$, SEEDS node $n$ must perform a lookup for group $G$ to find the directory node of group $G$. This lookup request is routed through the SEEDS overlay based on the Chord lookup mechanism to SEEDS node $n_g$, which is responsible for group $G$. SEEDS node $n_g$ can determine the SEEDS node that is responsible for key $k$ (i.e., SEEDS node $n_k$) according to its group node list, and then forwards the request to SEEDS node $n_k$. The lookup for a group may occur only when the client has not previously retrieved objects from the group. When receiving the request, SEEDS node $n_k$ replies with the requested object to the requesting client if the object is cached on it. Otherwise, an "object not found" message is returned. In both cases, the reply message contains information about SEEDS nodes in group $G$, allowing the client to update its resource table upon receiving the reply. If the object is not found in the SEEDS overlay or the client does not receive a reply within a defined waiting time $T_{wait}$ (e.g., 2 s), the client requests the object from the origin servers.

Fig. 4. Object retrieval in SEEDS. The website requested by client c is served by group G23 with N(G23) = {N15, N22, N38, N60}. The directory node of group G23 is N30, which maintains the group node list L(G23).
As Fig. 4 shows, two steps of object lookup are required for the first request from client c for objects on a website that client c has not yet visited. After that, subsequent requests from client c for objects on the website can be rapidly resolved in the SEEDS overlay by contacting the members of the cache group responsible for the website.
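The successor computation that a group member or a directory node performs on its group node list can be sketched as follows. The node identifiers come from the Fig. 4 caption, whereas the function name and the example keys (40, 61, and 22) are hypothetical; the sketch assumes the usual Chord convention that the successor of a key is the first node whose identifier is greater than or equal to the key, wrapping around the identifier ring.

```python
from bisect import bisect_left


def successor(group_nodes, key):
    """Return the first node ID >= key, wrapping around the identifier ring."""
    ids = sorted(group_nodes)
    idx = bisect_left(ids, key)
    # idx == len(ids) means the key is past the largest ID: wrap to the smallest.
    return ids[idx % len(ids)]


group_g23 = [15, 22, 38, 60]            # N(G23) from Fig. 4
node_for_40 = successor(group_g23, 40)  # falls between N38 and N60
node_for_61 = successor(group_g23, 61)  # past the largest ID, wraps around
node_for_22 = successor(group_g23, 22)  # a key equal to a node ID
```

Because every group member holds the full group node list, this computation is local and needs no overlay routing; only the lookup of the directory node for an unknown group requires Chord routing.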
E. Object Caching
In the SEEDS system, each SEEDS node uses local storage to cache objects from servers and serve client requests for cached objects. The number of objects stored on a SEEDS node is limited by the storage space. Many Web caching systems use cache replacement policies such as least recently used (LRU) and least frequently used (LFU) to maintain the cache storage [20]. These policies can also be applied to a SEEDS system to manage the cached objects in the SEEDS nodes' local storage. To maintain cache coherence, a SEEDS node can periodically (e.g., every few minutes) retrieve a list of modified objects from the servers. If a cached object has been modified, the SEEDS node may retrieve the updated object from the servers immediately, or it can invalidate its cache entry and retrieve the updated object when it later receives a request for this object.
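The combination of LFU replacement and lazy invalidation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class `LFUCache`, its method names, and the measurement of capacity in number of objects are hypothetical choices, and the list of modified URLs is assumed to arrive from the periodic poll of the servers.

```python
class LFUCache:
    """Least-frequently-used object cache with invalidation of modified objects."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of cached objects (assumed unit)
        self.objects = {}         # url -> cached object
        self.hits = {}            # url -> number of times the object was served

    def get(self, url):
        if url not in self.objects:
            return None           # miss: caller replies "object not found"
        self.hits[url] += 1
        return self.objects[url]

    def put(self, url, obj):
        if self.capacity <= 0:
            return
        if url not in self.objects and len(self.objects) >= self.capacity:
            # Evict the least frequently used entry to make room.
            victim = min(self.hits, key=self.hits.get)
            del self.objects[victim]
            del self.hits[victim]
        self.objects[url] = obj
        self.hits.setdefault(url, 0)

    def invalidate(self, modified_urls):
        # Lazy coherence: drop stale entries and refetch on the next request.
        for url in modified_urls:
            self.objects.pop(url, None)
            self.hits.pop(url, None)


cache = LFUCache(capacity=2)
cache.put("/a", "A")
cache.put("/b", "B")
cache.get("/a")          # "/a" is now used more often than "/b"
cache.put("/c", "C")     # evicts "/b", the least frequently used object
cache.invalidate(["/a"]) # the server reported "/a" as modified
```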
F. Case Study of Distributed Computation Services
Analogous to distributed storage, SEEDS nodes can be organized into groups, each of which provides a specific computing
service, say $F(X)$, through a Web service, a remote procedure call (RPC), etc. When a new node joins the SEEDS overlay, it contacts the bootstrap node so that it can be assigned to a service group. Each computing service may experience a different, time-variant workload. In that case, a group management scheme such as the one described in Section IV-C is required to determine the assignment of SEEDS nodes to groups in order to optimize overall system performance. To simplify the explanation of the case study, we assume that there is no group organization in the SEEDS overlay and that all SEEDS nodes can provide computing services.
When a node first joins the SEEDS overlay, it has to download the service programs from peer nodes of the same group. Then, the SEEDS node should announce its available computing resources to the SEEDS overlay. To implement the service, the service program can be developed using a parallel computing package with distributed memory support, such as the message passing interface (MPI). Therefore, SEEDS nodes can work together and act as a parallel computing infrastructure. The quantity of computing resources of a SEEDS node may be expressed in basic computation units, e.g., million instructions per second (MIPS). Then, the number of basic computation units that a SEEDS node can provide becomes a resource key maintained in the SEEDS overlay. When a client sends a computation request to a SEEDS node, the latter acts as a coordinator node: it finds SEEDS nodes with sufficient computation resources, claims the computation resources from those nodes, invokes the parallel processing program, collects the results, and responds to the client. The selection of the coordinator node can follow the mechanism presented in Section IV-D.
To find a certain number of SEEDS nodes with sufficient computation resources in a P2P overlay, one approach is to search for the key (i.e., the amount of resources) in the overlay. When there are $N_n$ SEEDS nodes in the P2P overlay, the overlay spends $O(\log N_n)$ steps to identify the key and then finds the SEEDS node that has the resources. There are two issues with this approach. First, when a resource of the requested size, say $R$, cannot be found, we want to find a larger resource that can fulfill the request. If we apply the best-fit algorithm for allocating the resources, we have to continue the search from $R + 1$ up to the maximum amount of the resource, and each search spends $O(\log N_n)$ steps. Second, if we have to find $K$ SEEDS nodes, each of which has at least $R$ resources, we have to perform $K$ independent searches. One possible solution to improve the performance of the best-fit search is to apply the range query technologies of a structured P2P overlay [21]. For a range query in a P2P overlay, the keys are unhashed. Therefore, if the P2P overlay cannot find $R$ resources, we can follow the finger table of the node that maintains $R$ to find the node maintaining $R + 1$ resources. This approach takes $O(\log N_n)$ steps to find $R$ and as many as $N_n$ steps to find $K$ resources larger than $R$ if $R$ is not available. We can thus significantly reduce the search steps from $K \times N_n \times O(\log N_n)$ to $O(\log N_n) + N_n$. A side effect of storing unhashed keys in a P2P overlay is load imbalance. However, a SEEDS node maintaining the resource key only provides the directory that points to the SEEDS nodes providing the resource, and key maintenance imposes negligible load on the SEEDS node maintaining the key. Moreover, there exist solutions to the load imbalance problem in a structured P2P overlay; for further discussion of these solutions, please refer to [22].

Fig. 5. Example of providing distributed computing services based on the SEEDS infrastructure.
To further improve the resource search, resources can be divided into zones, each of which represents a range of resource amounts that is encoded in the zone's key. For example, assume that the amounts of resources are divided into $Z$ zones. A SEEDS node that maintains the zone key $Z_0$ has a list of the SEEDS nodes with 0 to $Z_0$ resources. A SEEDS node that maintains the zone key $Z_1$ has a list of the SEEDS nodes with $Z_0$ to $Z_1$ resources. Therefore, we may find all SEEDS nodes with sufficient resources from the same zone instead of searching for resources one by one. If there is no resource in the zone, we then search for resources in the next zone. In that case, we further reduce the search for $K$ resources from $O(\log N_n) + N_n$ steps to $O(\log N_n) + Z$ steps. With zone-based resource management, only $Z$ or fewer SEEDS nodes maintain the resource keys, so the loads of SEEDS nodes become unbalanced. This issue can also be solved by applying the load balancing schemes proposed in previous studies. After the coordinator node finds all resources and the SEEDS nodes, it invokes the parallel computation program, collects the results, and responds to the client. After the SEEDS nodes commit to providing the services, they have to announce their unavailability in the P2P overlay. After the SEEDS nodes finish providing the service, they publish their available resources to the P2P overlay again. Fig. 5 shows an example of providing distributed computing services based on the SEEDS infrastructure.
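The zone-based search can be sketched as follows. This is a simplified, single-process illustration: in SEEDS the zone lists would live on the nodes that maintain the zone keys, whereas here they are plain Python lists. The function name, the data layout, the node names, and the resource amounts are hypothetical, and the treatment of amounts that fall exactly on a zone boundary is an assumption.

```python
from bisect import bisect_left


def find_nodes(zone_bounds, zone_lists, required, count):
    """Collect up to `count` nodes that each offer at least `required` units.

    zone_bounds: sorted upper bounds [Z_0, Z_1, ...] of the zones.
    zone_lists:  zone_lists[j] holds (node_id, units) pairs registered in zone j.
    """
    found = []
    # First zone whose upper bound can accommodate the requested amount.
    start = bisect_left(zone_bounds, required)
    for j in range(start, len(zone_bounds)):
        for node_id, units in zone_lists[j]:
            if units >= required:  # only relevant inside the starting zone
                found.append(node_id)
                if len(found) == count:
                    return found
    return found  # may hold fewer than `count` nodes


bounds = [100, 200, 300]  # Z_0, Z_1, Z_2 (hypothetical, e.g., in MIPS)
lists = [
    [("n1", 50), ("n2", 90)],
    [("n3", 120), ("n4", 180)],
    [("n5", 250)],
]
two_nodes = find_nodes(bounds, lists, required=150, count=2)
```

The search starts in the zone that contains the requested amount and moves to the next zone only when the current one cannot supply enough nodes, which is why at most $Z$ zone visits follow the initial overlay lookup.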
V. ANALYTICAL MODEL
This section presents an analytical model to analyze the energy consumption of the conventional and SEEDS architectures. We describe the assumptions and parameters of our model and then compute the energy consumption of different
architectures. The objective of this analysis is to determine the
energy savings achieved by SEEDS.
A. Model Parameters and Assumptions
The total energy consumption required to provide Web-based services is defined as $E_{tot} = E_{base} + E_{serv} + E_{maint}$, where $E_{base}$ is the base energy consumption, $E_{serv}$ is the service energy consumption, and $E_{maint}$ is the system maintenance energy consumption. We focus on the consumption of brown energy (i.e., nonrenewable energy) and thus assume that SEEDS nodes operating on renewable energy account for no energy consumption. In the model, each component accounts for the energy consumed by servers and that consumed in networks. The model does not include the energy consumed by the agent program in user clients because, in our measurements, it is negligible compared to the total energy consumption of clients.
When receiving a request, a server of a website processes the request and returns the requested Web object. Let $O$ be the number of objects available on a website. Assume that the popularity of Web objects generally follows a Zipf-like distribution [23]. The probability of a request for the $i$th most popular object is given by $f(i) = K/i^{\alpha}$, where $K = 1/\left(\sum_{i=1}^{O} (1/i^{\alpha})\right)$ and $\alpha$ is the skewness factor. Thus, the number of requests for the $i$th most popular object is $R(i) = \sum_{t} S(t) \times f(i)$, where $S(t)$ is the workload that the servers of a website experience at time $t$, and $\sum_{i=1}^{O} R(i) = R$. Similar to [24], assume that the energy consumed by the server in serving requests is proportional to the size of the responses. Thus, the energy required to process a request for the object of size $sz(o_i)$ is denoted by $E_p(sz(o_i))$.
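The Zipf-like popularity model can be evaluated numerically as in the sketch below. The function name is hypothetical; the 30 000 objects and the range of skewness factors are taken from the simulation setup in Section VI, and the total request count of one million is an arbitrary illustrative value.

```python
def zipf_popularity(num_objects, alpha):
    """Return [f(1), ..., f(O)] with f(i) = K / i**alpha and sum f(i) = 1."""
    weights = [1.0 / (i ** alpha) for i in range(1, num_objects + 1)]
    k = 1.0 / sum(weights)  # normalization constant K
    return [k * w for w in weights]


f_skewed = zipf_popularity(30000, 0.9)  # strongly skewed requests
f_flat = zipf_popularity(30000, 0.1)    # close to a uniform distribution

total_requests = 1_000_000              # R, illustrative value only
requests_top_object = total_requests * f_skewed[0]  # R(1) = R * f(1)
```

A larger skewness factor concentrates probability mass on the first few ranks, which is what lets a limited cache absorb a large share of the requests.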
To model the energy consumption in networks, we utilize the results in [25] to compute the energy consumption of data transmission in network equipment (i.e., routers). Let $E_{tra}$ be the energy required to transmit one bit in a router and let $H$ be the average number of router hops between any two nodes in router-level Internet topologies. The energy required to transmit message $m$ with size $sz_m$ from source to destination among routers through the networks is given by

$$E_r(sz_m) = E_{tra} \times sz_m \times H. \tag{1}$$
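As a worked instance of (1), the sketch below plugs in the values reported later in Section VI-A: 10 nJ per bit per router, an average of 14.8 hops, and a mean object size of 300 kB. The function name is hypothetical, and the conversion assumes 1 kB = 1000 bytes.

```python
def network_energy_j(size_bits, e_tra_j_per_bit=10e-9, hops=14.8):
    """Energy in joules to move `size_bits` across `hops` routers, per Eq. (1)."""
    return e_tra_j_per_bit * size_bits * hops


object_bits = 300 * 1000 * 8  # 300 kB object, assuming 1 kB = 1000 bytes
energy_per_object = network_energy_j(object_bits)  # roughly 0.355 J
```

Under these assumptions, delivering one average-sized object costs about 0.355 J of network energy, regardless of whether it is served by an origin server or a SEEDS node, because the model uses the same average hop count for any pair of nodes.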
Assume that each SEEDS node uses the same hardware device with a maximum power consumption of $p_n$ watts. Power is supplied to the SEEDS node device by a solar power system with an output power of $p_{sol}$ watts and a power loss, denoted here by $\eta$. The operation time of SEEDS node $n$ per day can be approximated as $t_n = (1 - \eta) \times ((p_{sol} \times h_n)/p_n)$, where $h_n$ is the average peak sun hours (PSHs) at the location of the SEEDS node. The total operation time of SEEDS node $n$ for an estimation time period $T_{est}$ can be computed as $T_n = t_n \times (T_{est}/T_d)$, where $T_d$ is the length of a day. Thus, given $N_n$ SEEDS nodes, the average number of live SEEDS nodes, denoted by $N_l$, can be computed as
$$N_l = \frac{\sum_{n} T_n}{N_n}. \tag{2}$$
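The per-node operation time can be illustrated with the device parameters given later in Section VI-A: a 10 W solar panel, a 25% power loss, and a 3.5 W BeagleBoard. The 4 peak sun hours and the 31-day estimation period are assumed example values, and the function names are hypothetical. The sketch applies the formulas as stated and does not cap the daily operation time at 24 h, since the text does not mention such a cap.

```python
def daily_operation_hours(p_sol_w, loss, peak_sun_hours, p_n_w):
    """t_n: hours per day a node can run on the harvested solar energy."""
    return (1.0 - loss) * (p_sol_w * peak_sun_hours) / p_n_w


def total_operation_hours(t_n_hours, t_est_hours, t_d_hours=24.0):
    """T_n: total operation time over the estimation period T_est."""
    return t_n_hours * (t_est_hours / t_d_hours)


# 10 W panel, 25% loss, 3.5 W device (Section VI-A); 4 PSH is an assumption.
t_n = daily_operation_hours(p_sol_w=10.0, loss=0.25, peak_sun_hours=4.0, p_n_w=3.5)
t_total = total_operation_hours(t_n, t_est_hours=31 * 24.0)
```

With these inputs a node runs for roughly 8.6 h per day, or about 266 h over a 31-day month.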
Furthermore, for the random group assignment, the average number of SEEDS nodes assigned to each of the $N_s$ websites can be computed as $N_l/N_s$. For the optimal group assignment, the average number of SEEDS nodes assigned to the group for website $i$ can be estimated as $N_l \times w_i$, where $w_i = \sum_{t} S_i(t) / \sum_{j} \sum_{t} S_j(t)$ and $S_i(t)$ is the workload that the servers of website $i$ experience at time $t$.
In the cooperative caching system, SEEDS nodes cooperate with each other to cache objects for the website. Given a SEEDS node capacity $C_n$, we assume that the SEEDS nodes can cooperatively cache the $Z = C_n \times N_l(i)$ most popular objects for website $i$, where $N_l(i)$ is the average number of SEEDS nodes for website $i$.
B. Energy Consumption of Conventional Architecture
Let $P_s$ be the base power consumption of a server and assume that the consumption is steady. Given $S_i$ servers of a website $i$ and an estimation time period $T_{est}$, the total base energy consumption of the conventional architecture is given by

$$E^{C}_{base} = P_s \times T_{est} \times S_i. \tag{3}$$
Assume that all requests have the same message size $sz_{req}$. The service energy consumption, which includes the energy consumption of request transmission in networks, request processing on servers, and object transmission in networks, is given by

$$E^{C}_{serv} = \sum_{i=1}^{O} \left[ E_r(sz_{req}) \times R(i) \right] + \sum_{i=1}^{O} \left[ E_p(sz(o_i)) \times R(i) \right] + \sum_{i=1}^{O} \left[ E_r(sz(o_i)) \times R(i) \right]. \tag{4}$$
The conventional approach does not consume energy for system maintenance. Let $sz_{obj}$ denote the average object size. Substituting $\sum_{i=1}^{O} R(i) = R$ into (4) and summing (3) and (4), the total energy consumption of the conventional architecture is given by

$$E^{C}_{tot} = P_s \times T_{est} \times S_i + E_r(sz_{req}) \times R + E_p(sz_{obj}) \times R + E_r(sz_{obj}) \times R. \tag{5}$$
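Equation (5) can be evaluated with a few lines of code. In the sketch below, the function and parameter names are hypothetical, and the processing energy $E_p$ is assumed to be a linear function of the object size with a per-bit coefficient, which is one way to realize the proportionality assumption stated earlier. The numbers in the example call are arbitrary and chosen only to make the arithmetic easy to follow.

```python
def conventional_total_energy(p_s_w, t_est_s, servers, requests,
                              sz_req_bits, sz_obj_bits,
                              e_p_j_per_bit, e_tra_j_per_bit, hops):
    """Total energy in joules of the conventional architecture, per Eq. (5)."""

    def e_r(bits):
        # Network transmission energy, Eq. (1).
        return e_tra_j_per_bit * bits * hops

    def e_p(bits):
        # Assumed linear processing-energy model (proportional to size).
        return e_p_j_per_bit * bits

    base = p_s_w * t_est_s * servers  # Eq. (3)
    service = (e_r(sz_req_bits) + e_p(sz_obj_bits) + e_r(sz_obj_bits)) * requests
    return base + service


# Arbitrary illustrative inputs, not values from the paper.
e_total = conventional_total_energy(
    p_s_w=100.0, t_est_s=10.0, servers=2, requests=5,
    sz_req_bits=10, sz_obj_bits=100,
    e_p_j_per_bit=1.0, e_tra_j_per_bit=1.0, hops=2.0,
)
```

Here the base term contributes 2000 J and each request contributes 20 + 100 + 200 = 320 J, so five requests give 3600 J in total.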
C. Energy Consumption of SEEDS
Let $S'_i$ denote the number of servers required by SEEDS. We assume that fewer servers are required to handle the same number of requests (i.e., $S'_i < S_i$) because the SEEDS system can reduce the servers' workload. The total base energy consumption of SEEDS is given by

$$E^{S}_{base} = P_s \times T_{est} \times S'_i. \tag{6}$$
The service energy consumption consists of the energy required to retrieve cached objects from the SEEDS overlay and the energy required to retrieve the objects that are not cached on SEEDS nodes from servers. Each client performs a cache group lookup for the first request sent to a cache group.
The energy required to transmit the lookup message through networks is $E_r(sz_{req}) \times l$, where $sz_{req}$ is the message size and $l = (1/2) \log N_l$ is the average number of lookup hops in the SEEDS overlay with $N_l$ SEEDS nodes. When a group member receives a request from a client, it may be the successor of the object key with probability $r = 1/N_l(i)$, where $N_l(i)$ is the number of SEEDS nodes of group $i$. Alternatively, it may need to forward the request to the successor of the key with probability $1 - r$. The $Z$ most popular objects cached on SEEDS nodes can be directly returned to clients. For requests for the other objects, an "object not found" message is returned. Thus, the energy required to retrieve objects from the SEEDS overlay is computed as

$$
\begin{aligned}
E^{S}_{serv}(\text{node}) ={}& (E_r(sz_{req}) \times l) \times N_c + \sum_{i=1}^{O} \left[ E_r(sz_{req}) \times (1 + (1 - r)) \times R(i) \right] \\
&+ \sum_{i=1}^{Z} \left[ E_r(sz_{obj}) \times R(i) \right] + \sum_{i=Z+1}^{O} \left[ E_r(sz_{res}) \times R(i) \right]
\end{aligned}
\tag{7}
$$
where $N_c$ is the number of clients and $sz_{res}$ is the size of the response message. To simplify the equation above, we ignore the energy consumption of group lookups (i.e., the first term) because it is typically much smaller than that of object lookups and object transmission.
If the storage capacity of live SEEDS nodes is less than the total number of objects stored on servers (i.e., $Z < O$), requests for the objects with popularity ranks $Z + 1, Z + 2, \ldots, O$ are sent to origin servers. The energy required to retrieve the objects from the servers is computed as

$$
E^{S}_{serv}(\text{server}) =
\begin{cases}
\displaystyle \sum_{i=Z+1}^{O} \left[ E_r(sz_{req}) \times R(i) \right] + \sum_{i=Z+1}^{O} \left[ E_p(sz_{obj}) \times R(i) \right] + \sum_{i=Z+1}^{O} \left[ E_r(sz_{obj}) \times R(i) \right], & \text{if } Z < O \\
0, & \text{otherwise.}
\end{cases}
\tag{8}
$$
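Let $F(l, u) = \sum_{i=l}^{u} f(i)$, so that we have $\sum_{i=1}^{Z} R(i) = R \times F(1, Z)$ and $\sum_{i=Z+1}^{O} R(i) = R \times F(Z + 1, O)$ because $R(i) = R \times f(i)$. Summing (7) and (8), the total service energy consumption of SEEDS is given by

$$
E^{S}_{serv} =
\begin{cases}
\begin{aligned}
& E_r(sz_{req}) \times R \times (2 - r + F(Z+1, O)) \\
& + E_p(sz_{obj}) \times R \times F(Z+1, O) + E_r(sz_{obj}) \times R \\
& + E_r(sz_{res}) \times R \times F(Z+1, O),
\end{aligned} & \text{if } Z < O \\
E_r(sz_{req}) \times (2 - r) \times R + E_r(sz_{obj}) \times R, & \text{otherwise.}
\end{cases}
\tag{9}
$$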
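Equation (9) can be evaluated as in the following sketch. The function and argument names are hypothetical. The per-message energies are passed in directly, and `miss_fraction` stands for $F(Z + 1, O)$, the share of requests for objects that are not cached on SEEDS nodes; setting it to zero reproduces the "otherwise" branch of (9). The numbers in the example calls are arbitrary.

```python
def seeds_service_energy(e_r_req, e_r_obj, e_r_res, e_p_obj,
                         requests, r, miss_fraction):
    """Service energy of SEEDS per Eq. (9), group lookups ignored.

    e_r_req, e_r_obj, e_r_res: network energy for a request, an object, and
    an "object not found" response; e_p_obj: server processing energy per
    object; r: probability that the contacted node is the key's successor.
    """
    # Requests: one hop to the overlay, a forward with probability (1 - r),
    # plus one request to the origin server for every miss.
    energy = e_r_req * requests * (2 - r + miss_fraction)
    # Every request eventually receives one object over the network.
    energy += e_r_obj * requests
    # Misses add server processing and a "not found" response.
    energy += (e_p_obj + e_r_res) * requests * miss_fraction
    return energy


# Arbitrary illustrative inputs, not values from the paper.
e_partial = seeds_service_energy(1.0, 10.0, 2.0, 5.0, requests=100, r=0.25,
                                 miss_fraction=0.2)
e_all_cached = seeds_service_energy(1.0, 10.0, 2.0, 5.0, requests=100, r=0.25,
                                    miss_fraction=0.0)
```

The difference between the two results isolates the cost of cache misses: an extra request, the server processing, and the "object not found" response for 20% of the requests.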
For the system maintenance energy consumption, we compute the energy consumed by networks for transmitting messages to maintain the SEEDS overlay and update the group structures. The Stabilization procedure may require three messages [17]. For the FixFinger procedure, a lookup with an average of $l$ hops is required for a SEEDS node to find the correct finger node for each of its $F$ finger entries. Each SEEDS node periodically updates the group node list by contacting the directory node of its group, with an average of $l$ hops, and retrieving the group node list from the directory node. Assume that each SEEDS node performs the Stabilization, FixFinger, and group update operations every $T_{stab}$, $T_{fix}$, and $T_{upd}$, respectively. Let the message sizes be $sz_{stab}$, $sz_{fix}$, and $sz_{upd}$, respectively. Given $N_n$ SEEDS nodes, the energy consumption of the three maintenance operations can be computed as
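$$
\begin{aligned}
E_{stab} &= \sum_{n=1}^{N_n} \left[ E_r(sz_{stab}) \times 3 \times \frac{T_n}{T_{stab}} \right] \\
E_{fix} &= \sum_{n=1}^{N_n} \left[ E_r(sz_{fix}) \times l \times F \times \frac{T_n}{T_{fix}} \right] \\
E_{upd} &= \sum_{n=1}^{N_n} \left[ E_r(sz_{upd}) \times l \times \frac{T_n}{T_{upd}} + E_r(sz_{gl}) \times \frac{T_n}{T_{upd}} \right].
\end{aligned}
\tag{10}
$$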
Let $T(N_n, T_{maint}) = \sum_{n=1}^{N_n} T_n / T_{maint}$ for a maintenance operation with a time period of $T_{maint}$ and substitute this into (10). The total maintenance energy consumption of SEEDS is then given by

$$
\begin{aligned}
E^{S}_{maint} ={}& E_r(sz_{stab}) \times 3\,T(N_n, T_{stab}) + E_r(sz_{fix}) \times l F \times T(N_n, T_{fix}) \\
&+ E_r(sz_{upd}) \times l \times T(N_n, T_{upd}) + E_r(sz_{gl}) \times T(N_n, T_{upd}).
\end{aligned}
\tag{11}
$$

Finally, the total energy consumption of SEEDS, $E^{S}_{tot}$, can be obtained by summing (6), (9), and (11).
With the energy consumption models for the conventional architecture and SEEDS, it is possible to compute the energy savings achieved by SEEDS. Given the energy consumption $E^{C}_{tot}(i)$ and $E^{S}_{tot}(i)$ for website $i$, define the relative energy savings as $\left( \sum_{i} E^{C}_{tot}(i) - \sum_{i} E^{S}_{tot}(i) \right) / \sum_{i} E^{C}_{tot}(i)$.
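The relative energy savings defined above reduce to a short computation once the per-website totals are known. In the sketch below, the function name is hypothetical and the per-website energy values are made-up numbers in arbitrary units, chosen only to illustrate the ratio.

```python
def relative_energy_savings(conventional, seeds):
    """Relative savings: (sum of E_C_tot - sum of E_S_tot) / sum of E_C_tot.

    conventional, seeds: per-website total energy consumption, same units.
    """
    total_c = sum(conventional)
    total_s = sum(seeds)
    return (total_c - total_s) / total_c


# Hypothetical totals for three websites (arbitrary units).
savings = relative_energy_savings([100.0, 80.0, 120.0], [70.0, 60.0, 65.0])
```

For these made-up inputs the conventional total is 300 and the SEEDS total is 195, giving relative savings of 0.35.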
VI. SIMULATION AND PROTOTYPE IMPLEMENTATION
To evaluate the performance of the conventional service architecture and the proposed SEEDS system, this paper requires a large-scale distributed system simulation that involves various models, such as an earth geography model, a solar power model, a network topology model, a network delay model, a P2P overlay model, a request model, and a service performance and energy model. However, we could not find a single tool that meets all the requirements and is suitable for the simulation. Therefore, we integrated several recognized tools and developed a simulator to combine the data from these tools and generate simulation results. Fig. 6 illustrates the integration of the tools and the SEEDS simulator. A prototype of the SEEDS system has also been implemented. This section describes the simulation methodology and presents the simulation results and prototype implementation.
A. Simulation Model
In the simulations, we used three websites located in different time zones. The locations were Los Angeles (GMT−8), London (GMT±0), and Taipei (GMT+8). Each website used a server cluster to handle requests. We assumed that requests for the objects of a website were distributed to and handled by the servers of the cluster. The number of servers for a website was chosen to be capable of serving 99% of requests. For simplicity, we considered static content for Web-based services. There were a total of 30 000 objects available on each website. For the request model, we used realistic Web trace logs obtained from [19] for time-variant request workloads. The request rates were further scaled to simulate different workloads for the three websites. The size of Web objects followed the Pareto distribution [26] with shape parameter 1.2 and a mean size of 300 kB, based on a statistical analysis of Web pages [27]. The popularity of the $i$th most popular object followed the Zipf-like distribution described in Section V-A.

Fig. 6. Integration of simulation models and tools.
To determine the performance model of a Web server, an Apache HTTP server was set up on a Linux machine, a Supermicro tower server with two 2.2 GHz AMD Opteron dual-core CPUs and 8 GB of RAM. The server was connected to a Gigabit switch. A load testing tool, Apache JMeter [28], which ran on another machine, reported that the server could handle up to 300 requests per second in this configuration. To construct the energy model of a server, a stream of requests for objects with a mean size of 300 kB was sent to the server. The energy consumption of the server was measured over a time interval. The idle energy consumption of the server was subtracted from this energy to determine the additional energy required to process the requests during the interval. The additional energy consumption was then divided by the number of requests served during the interval to obtain the average energy consumed per request.
To construct the performance and energy models of SEEDS nodes, a BeagleBoard [29], a low-power single-board computer with a 600 MHz ARM Cortex-A8 CPU and 512 MB of RAM, served as the reference device of the SEEDS node. The maximum power consumption of the BeagleBoard is approximately 3.5 W. Each SEEDS node was able to serve 25 requests per second. The LFU algorithm served as the cache replacement policy. SEEDS nodes are designed to be distributed over the world and use locally available solar energy. Thus, the solar power model takes the location of SEEDS nodes into account. The locations were chosen based on the Human Development Index (HDI) [30], which ranges between 0 and 1 and seems a reasonable indicator of the availability of solar radiation at a location. We selected 20 cities whose worst PSH (i.e., winter PSH) is greater than zero from countries whose HDI is higher than 0.5. After determining the locations of SEEDS nodes, we calculated the sunrise and sunset times for each location based on an implementation of Jean Meeus's astronomical algorithms [31]. The real monthly average PSH values obtained from [32] were used to determine the amount of solar energy available in each city where a SEEDS node was located. The output power of the solar panel supplying the SEEDS node was assumed to be 10 W, with a power loss of 25% in the solar power system. The energy conversion efficiency of commercially available silicon solar cells is approximately 15% to 20% [33]. Therefore, a panel with a total solar cell area of less than 1000 cm$^2$ is sufficient to supply the energy needs of a given SEEDS node. SEEDS nodes were organized into a SEEDS overlay. For the P2P model, Chord [17] was implemented for the SEEDS overlay.
The operation of the SEEDS system is highly dependent on the geographical location of SEEDS nodes. Therefore, this information is incorporated in the network topology model. To generate location-aware router-level network topologies, we used a data set obtained from [34] for the router locations. This data set, which contains more than 280 000 routers, is based on a real IP geolocation database and a Border Gateway Protocol (BGP) routing table dump. Because it would have been inefficient to construct a network topology consisting of all routers, 95 783 routers in 335 domains were selected from the data set, and the IGen topology generator [35] was used to build a router-level Internet topology. The servers and SEEDS nodes were attached to the geographically closest routers. For the network delay model, an open-source routing solver, C-BGP [36], was used to compute routing paths and network latencies. The average number of hops in the router-level topology was 14.8. The topology data were then fed into the simulator. The router throughput was assumed to be 1 Tb/s. For a high-end router with a throughput of 1 Tb/s and a power consumption of 10 kW (e.g., Cisco CRS-1), the energy required to transmit one bit of data in the router is approximately 10 nJ [37]. Finally, we used 100 clients distributed around the world, each of which was attached to the geographically closest router. The requests were randomly distributed among the clients.
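The per-bit transmission energy quoted above follows from dividing the router's power consumption by its throughput, as the short derivation below shows. It uses only the two figures given in the text (10 kW and 1 Tb/s) and takes $E_{tra}$ as the symbol from (1).

```latex
E_{tra} = \frac{P_{router}}{\text{throughput}}
        = \frac{10\ \text{kW}}{1\ \text{Tb/s}}
        = \frac{10^{4}\ \text{J/s}}{10^{12}\ \text{bit/s}}
        = 10^{-8}\ \text{J/bit}
        = 10\ \text{nJ/bit}
```

Here $P_{router}$ is a label introduced for this derivation only; it does not appear in the model parameters.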
B. Simulation Results
We compared the performance of four different system architectures. The first architecture, denoted as CON, is the conventional service architecture (see [24]), where servers are powered solely by electricity (i.e., brown energy). The second architecture, denoted as GREEN, is the centralized green approach (see [6]), where servers are powered by electricity and partially by green energy. For the green energy supplied to the second architecture, we used the same number of solar panels as in the SEEDS system. We assumed that the solar panels were divided equally among the locations of the three servers. As described previously, the solar energy generated in each location depends on the corresponding PSH. The third architecture, denoted as SEEDS-R, is the proposed SEEDS system using random group assignment. The fourth architecture, denoted as SEEDS-G, is the proposed SEEDS system using greedy group assignment. This section validates the analytical models presented in Section V by comparing the results obtained from the analytical model with those obtained from the simulations. Table I summarizes the values of the model parameters. We evaluated the effect of the skewness factor $\alpha$, which varied from 0.1 to 0.9, on energy consumption. In each node location, four SEEDS nodes were configured as a node cluster. Simulations were conducted for the month of March, with results averaged over 10 runs.

TABLE I
MODEL PARAMETERS AND VALUES

Fig. 7. Theoretical and simulation results of total energy consumption.
Fig. 7 shows the total energy consumption under different skewness factors. The theoretical and simulation results match very well for CON, GREEN, and SEEDS-R. The results of SEEDS-G approach the theoretical results of SEEDS using optimal assignment, denoted as SEEDS-O. For CON and GREEN, the skewness factor has a relatively insignificant effect on the energy consumption because all requests are sent directly to the servers and fulfilled by the servers. GREEN consumes less energy than CON because its servers are partially powered by solar energy. In contrast, the energy consumption of SEEDS-R and SEEDS-G decreases as the request distribution becomes more skewed (i.e., larger $\alpha$). In this case, SEEDS nodes are able to serve more requests for popular objects, resulting in fewer requests processed by the servers and lower energy consumed by the servers than in CON and GREEN. The results show that the SEEDS system can save more energy than the GREEN approach. The first reason is that, in the GREEN approach, the servers at different locations may experience different workloads, and the solar panels co-located with the servers may generate varying amounts of energy. However, the energy generated at one location cannot be shared to support servers at other locations. The second reason is that the GREEN approach has only a few large-scale solar panel installation sites, and the energy generated by the solar panels is influenced by the weather and locations of those sites. Moreover, SEEDS-R consumes more energy than SEEDS-G because random group assignment cannot fully exploit the capacities of SEEDS nodes. When SEEDS nodes in SEEDS-R are randomly assigned to groups, each website is expected to be served by roughly the same number of SEEDS nodes at a given time, regardless of the websites' current workloads. At a given time, the number of requests to a heavily loaded website can be significantly reduced by its front-end SEEDS nodes, while the same number of front-end SEEDS nodes serving an under-loaded website can contribute only a little. As a result, the solar energy of these low-utilization SEEDS nodes cannot be well exploited in providing services. On the other hand, SEEDS-G assigns SEEDS nodes to websites according to time-variant workloads. When a website under heavy load can be served by more SEEDS nodes, the number of requests processed by the servers as well as the energy consumed by the servers can be reduced. Therefore, SEEDS-G consumes less energy than SEEDS-R.

Fig. 8. Theoretical and simulation results of SEEDS energy consumption.
Energy consumption of the SEEDS system consists of the
energy consumed by the servers in processing requests and
the energy consumed by networks in transmitting objects
from the servers or SEEDS nodes to clients, transmitting
requests between system components (i.e., servers, SEEDS
nodes, and clients), and maintaining the SEEDS system. The
energy consumed by the servers and networks is mainly
provided by nonrenewable energy resources. The renewable
energy supplied to SEEDS nodes is not involved in this energy
consumption. Fig. 8 shows the cumulative consumption of
nonrenewable energy under different skewness factors. The
results are normalized by the energy consumption of = 0.1.
For each portion of the consumption, the theoretical results
are close to the simulation results. The energy consumed
by the servers for request processing and the energy consumed
by networks for object transmission account for most
of the SEEDS energy consumption. The energy consumption
for request transmission and system maintenance represents
an extremely small fraction of the total energy consumed in
SEEDS. Fig. 8 reveals that increasing the skewness factor
increases the energy consumption for transmitting objects from
SEEDS nodes to clients and decreases the energy consumption
for processing requests in the servers and transmitting objects
from servers to clients. A larger skewness factor indicates a more
skewed condition, that is, more requests are concentrated on a
few popular objects cached on SEEDS nodes. Therefore, more
energy is required to transmit objects from the SEEDS nodes
to clients. On the other hand, the servers consume less energy
because they have fewer requests to process. The
increasing energy consumption of object transmission from
SEEDS nodes to clients does not increase the total energy
consumption of SEEDS, because the total energy required to transmit
objects from the servers and SEEDS nodes to clients through
networks remains unchanged. Thus, because the energy consumed
in the servers decreases as the skewness factor increases,
the total energy consumption of SEEDS also decreases.
Fig. 9. Theoretical and simulation results of relative energy savings.
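The trend described above can be illustrated with a minimal sketch. It is not the analytical model of this paper: it assumes a Zipf-like popularity law in which the object of rank i is requested with probability proportional to 1/i^alpha (alpha is an assumed name for the skewness factor), it assumes that the most popular objects are the ones cached on SEEDS nodes, and it uses hypothetical per-request energy units (`e_server`, `e_transmit`). Request forwarding and maintenance overheads are omitted because they are reported as a very small fraction of the total.

```python
def zipf_hit_ratio(num_objects, num_cached, alpha):
    """Share of requests that target the num_cached most popular objects.

    Assumes a Zipf-like popularity law: the object of rank i is requested
    with probability proportional to 1 / i**alpha.
    """
    weights = [1.0 / rank ** alpha for rank in range(1, num_objects + 1)]
    return sum(weights[:num_cached]) / sum(weights)


def brown_energy(num_requests, hit_ratio, e_server=1.0, e_transmit=1.0):
    """Split nonrenewable energy into the components discussed in the text.

    e_server and e_transmit are hypothetical per-request energy units,
    not values taken from the paper.
    """
    misses = (1.0 - hit_ratio) * num_requests  # requests served by the servers
    hits = hit_ratio * num_requests            # requests served by SEEDS nodes
    parts = {
        "server_processing": misses * e_server,
        "tx_from_servers": misses * e_transmit,
        "tx_from_nodes": hits * e_transmit,
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        h = zipf_hit_ratio(num_objects=10_000, num_cached=1_000, alpha=alpha)
        e = brown_energy(num_requests=1_000, hit_ratio=h)
        print(f"alpha={alpha}: hit ratio={h:.3f}, total={e['total']:.1f}")
```

In this sketch the sum of the two transmission terms is the same for every hit ratio, so the total falls only because the server-processing term shrinks as the skewness factor grows.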
We compared the theoretical and simulation results of the
relative energy savings achieved by GREEN, SEEDS-R, and
SEEDS-G. Fig. 9 shows the savings in the entire system
under different skewness factors. The theoretical results are
close to the simulation results. The energy savings achieved
by GREEN remain constant. For SEEDS-R and SEEDS-G,
the energy savings in the entire system increase when the
skewness factor increases. This is because the SEEDS sys-
tem reduces the energy consumption of the servers, as shown
previously. SEEDS-G achieves more energy savings than
SEEDS-R. This is because SEEDS nodes in SEEDS-R are
not well assigned to support servers with time-variant work-
loads. Thus, SEEDS-G achieves the largest energy savings. In
particular, the SEEDS system achieves energy savings of up
to 35%. Both theoretical and simulation results demonstrate
the reduction of energy consumption in the proposed SEEDS
system.
We have shown that energy consumption of Web-based ser-
vices can be reduced by the SEEDS system. However, extra
latencies to retrieve an object are incurred when the object is
not cached on SEEDS nodes. For example, when requests for
objects tend toward a uniform distribution, such as at a skewness factor of 0.1, the
latency increases up to 280 ms for the SEEDS system, whereas
the latency in CON is less than 200 ms. To better understand
the tradeoffs between energy saving and extra latency, we evaluated
GREEN and the SEEDS system using the energy-delay
product metric. Fig. 10 shows the energy-delay products normalized
by the products of CON. GREEN has the smallest
energy-delay products because it reduces energy consumption
without increasing access latency. Although SEEDS-R
can reduce energy consumption, it introduces additional access
latencies and thus might not have lower energy-delay products
than CON. The energy consumption in the SEEDS system can
be further reduced by taking time-variant loads into account
for the group assignment of SEEDS nodes. SEEDS-G not only
reduces energy consumption significantly but also achieves better
energy-delay products than SEEDS-R.
Fig. 10. Relative energy-delay product.
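As a worked illustration of the metric, the following sketch computes an energy-delay product normalized by a baseline. The latencies (280 ms and 200 ms) are the figures quoted above; the energy ratio of 0.8 is a hypothetical value chosen only for the example and is not a result of this paper.

```python
def relative_edp(energy, delay, base_energy, base_delay):
    """Energy-delay product of a scheme divided by that of a baseline."""
    return (energy * delay) / (base_energy * base_delay)


def break_even_energy_ratio(delay, base_delay):
    """Largest energy ratio (scheme / baseline) that keeps the relative
    energy-delay product at or below 1 for the given latencies."""
    return base_delay / delay


if __name__ == "__main__":
    # Hypothetical scheme: 80% of the baseline energy, 280 ms vs. 200 ms latency.
    print(relative_edp(energy=0.8, delay=280.0, base_energy=1.0, base_delay=200.0))
    print(break_even_energy_ratio(delay=280.0, base_delay=200.0))
```

With these inputs the relative product is about 1.12, that is, worse than the baseline despite the energy saving. The scheme would have to consume less than about 71% of the baseline energy to break even at this latency, which is consistent with the observation that a scheme with extra latency might not achieve a lower energy-delay product than CON.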
C. Prototype Implementation
A prototype of the SEEDS node has been implemented on
the eBox-4300, an embedded device with a 500 MHz VIA
Eden ULV CPU, 512 MB RAM, and an input voltage/current
requirement of 5 V/3 A. The eBox was powered by a solar
power supply system consisting of a solar panel, a rechargeable
battery, a programmable charge controller, and a voltage converter.
The eBox was connected to the charge controller through
the voltage converter, and the charge controller was connected to the
solar panel and the battery. The solar power supply system performs
four functions: 1) converting the output voltage from the
solar panel into a stable 5 V supply for the eBox; 2) storing
the excess energy generated by the solar panel in the battery
for later use; 3) turning on the eBox when the power is sufficient;
and 4) informing the eBox of the amount of energy left
in the battery through an RS-232 interface on the charge controller
so that the eBox can determine when to shut down. More recently, we used the
BeagleBoard for the SEEDS node device because it offers better
performance and lower power consumption than the eBox.
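Functions 3) and 4) amount to a power-state decision driven by the battery level. The sketch below shows one possible form of that decision. It is an assumption for illustration only: the thresholds, the use of hysteresis (separate turn-on and shut-down levels), and the function name are hypothetical and are not described in the prototype, and the sketch merges the controller-side turn-on and the node-side shutdown into a single function.

```python
def next_power_state(is_on, battery_fraction, on_threshold=0.5, off_threshold=0.2):
    """Return True if the node should be powered on after this reading.

    battery_fraction is the remaining battery energy in [0, 1], as it might
    be reported by the charge controller. Both thresholds are hypothetical.
    Using a turn-on level above the shut-down level avoids rapid on/off
    cycling when the battery hovers around a single threshold.
    """
    if not 0.0 <= battery_fraction <= 1.0:
        raise ValueError("battery_fraction must be within [0, 1]")
    if is_on:
        # Keep running until the battery drops to the shut-down level.
        return battery_fraction > off_threshold
    # Stay off until enough energy has been stored.
    return battery_fraction >= on_threshold


if __name__ == "__main__":
    state = False
    for level in (0.10, 0.40, 0.55, 0.30, 0.15, 0.45):
        state = next_power_state(state, level)
        print(f"battery={level:.2f} -> {'on' if state else 'off'}")
```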
To measure the server energy consumption of the two service
architectures simultaneously, two Apache HTTP servers
were set up on two Lenovo ThinkPad X200 laptop computers
(2.4 GHz Core 2 Duo CPU and 2 GB RAM). One
served as the server in a conventional architecture, whereas
the other functioned as a SEEDS server (the server to which the
proposed SEEDS system was applied). The prototype system
used two SEEDS nodes deployed on two real SEEDS
node devices using the eBox-4300. Because of the limited number
of real SEEDS nodes, we developed a simulator
to simulate a number of virtual SEEDS nodes. The real SEEDS
nodes and virtual SEEDS nodes were assigned to their chosen
locations and configured with corresponding parameters.
They communicated with each other to form a cooperative
caching system for the SEEDS server. The simulator running
on two other computers generated client requests based on
the Zipf-like distribution for the two architectures. In the conventional
architecture, the simulator sent all the requests to the
conventional server. In the SEEDS architecture, the simulator
sent requests to the real SEEDS nodes, virtual SEEDS nodes,
or SEEDS server according to the object retrieval mechanism
described in Section IV-D. The energy consumption of the two
servers was measured using an NI PCI-6115 data acquisition
(DAQ) board, a current probe, and a voltage probe. Each channel
was sampled 50 000 times per second. Fig. 11 shows
the solar power supply system, the prototype implementation,
and a screenshot of the power measurement program. A demo
video of the SEEDS prototype can be found in [38].
Fig. 11. Prototype of SEEDS. (a) Solar power supply system. (b) Prototype implementation. (c) Screenshot of power measurement program (the conventional
server is on the left side and the SEEDS server is on the right side).
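Energy can be obtained from sampled voltage and current traces by summing the instantaneous power over the sampling interval. The sketch below is a minimal illustration of that calculation, assuming the samples are already available as plain Python sequences. It does not use any DAQ driver API, and the function names are hypothetical. Only the sampling rate of 50 000 samples per second comes from the text.

```python
def energy_joules(voltage_samples, current_samples, sample_rate_hz=50_000):
    """Approximate energy (J) as the sum of v * i * dt over all samples.

    voltage_samples are in volts and current_samples in amperes, taken
    at the same instants at sample_rate_hz samples per second.
    """
    if len(voltage_samples) != len(current_samples):
        raise ValueError("voltage and current traces must have equal length")
    dt = 1.0 / sample_rate_hz
    return sum(v * i for v, i in zip(voltage_samples, current_samples)) * dt


def mean_power_watts(voltage_samples, current_samples):
    """Average of the instantaneous power v * i over the trace."""
    if not voltage_samples or len(voltage_samples) != len(current_samples):
        raise ValueError("traces must be non-empty and of equal length")
    total = sum(v * i for v, i in zip(voltage_samples, current_samples))
    return total / len(voltage_samples)


if __name__ == "__main__":
    # One second of a constant, made-up load: 20 V at 1 A.
    volts = [20.0] * 50_000
    amps = [1.0] * 50_000
    print(energy_joules(volts, amps), mean_power_watts(volts, amps))
```

For the constant made-up load in the usage example, one second of samples yields roughly 20 J and a mean power of 20 W.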
VII. CONCLUSION
This paper presents an infrastructure for utilizing DERs.
By exploiting P2P technologies, SEEDS provides distributed
computing and/or storage services with low brown energy
consumption. We have designed and evaluated a distributed
caching storage system for Web-based services as a proof-of-concept
example. In the proposed SEEDS system, energy-efficient
devices utilize locally available renewable energy
resources and cooperate with other devices to form a P2P
caching system. This approach reduces the energy consumed
by servers because most client requests are served by the
caching system. A prototype has been implemented to demon-
strate the energy savings achieved by the proposed system.
Because the prototype is a small-scale implementation, this
paper also evaluates system performance in a large-scale envi-
ronment using simulations. This paper presents an analytical
model to estimate the energy consumed by the conventional
approach and the proposed system. Results show that the
SEEDS system reduces brown energy consumption, with
savings of up to 35%, and achieves acceptable energy-delay products
compared with the conventional service architecture.
ACKNOWLEDGMENT
The authors would like to thank P. Ding, A.-S. Chang, and
Y.-C. Chen for their support in developing the prototype.
REFERENCES
[1] S. R. Bull, "Renewable energy today and tomorrow," Proc. IEEE,
vol. 89, no. 8, pp. 1216–1226, Aug. 2001.
[2] T. Ackermann, G. Andersson, and L. Soder, "Distributed generation:
A definition," Electr. Power Syst. Res., vol. 57, no. 3, pp. 195–204, 2001.
[3] J. Driesen and F. Katiraei, "Design for distributed energy resources,"
IEEE Power Energy, vol. 6, no. 3, pp. 30–40, May/Jun. 2008.
[4] J. M. Carrasco et al., "Power-electronic systems for the grid integration
of renewable energy sources: A survey," IEEE Trans. Ind. Electron.,
vol. 53, no. 4, pp. 1002–1016, Jun. 2006.
[5] N. S. Lewis and D. G. Nocera, "Powering the planet: Chemical challenges
in solar energy utilization," Proc. Nat. Acad. Sci. USA, vol. 103,
no. 43, pp. 15729–15735, Oct. 2006.
[6] (2014, Jan.). SolarCity [Online]. Available: http://www.solarcity.com
[7] J. G. Koomey, "Estimating total power consumption by servers in the
U.S. and the world," Lawrence Berkeley National Laboratory, Berkeley,
CA, Tech. Rep., Feb. 2007.
[8] (2011, Dec.). AISO.net Green Hosting Services [Online]. Available:
http://www.solardatacenter.net/index.html
[9] G. W. Crabtree and N. S. Lewis, "Solar energy conversion," Phys. Today,
vol. 60, no. 3, pp. 37–42, Mar. 2007.
[10] Í. Goiri et al., "GreenSlot: Scheduling energy consumption in green
datacenters," in Proc. ACM Supercomputing (SC), Seattle, WA, USA,
Nov. 2011.
[11] Í. Goiri et al., "GreenHadoop: Leveraging green energy in data-processing
frameworks," in Proc. ACM EuroSys, Bern, Switzerland,
Apr. 2012, pp. 57–70.
[12] Í. Goiri, W. Katsak, K. Le, T. D. Nguyen, and R. Bianchini, "Parasol
and GreenSwitch: Managing datacenters powered by renewable energy,"
in Proc. ASPLOS, Houston, TX, USA, Mar. 2013, pp. 51–64.
[13] V. Sharma, A. Thomas, T. Abdelzaher, K. Skadron, and Z. Lu,
"Power-aware QoS management in web servers," in Proc. IEEE RTSS,
Washington, DC, USA, Dec. 2003, pp. 63–72.
[14] M. Elnozahy, M. Kistler, and R. Rajamony, "Energy conservation policies
for web servers," in Proc. USITS, Berkeley, CA, USA, Mar. 2003.
[15] E. Pinheiro, R. Bianchini, E. V. Carrera, and T. Heath, "Load balancing
and unbalancing for power and performance in cluster-based systems,"
in Proc. COLP, Sep. 2001, pp. 182–195.
[16] E. K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim, "A survey
and comparison of peer-to-peer overlay network schemes," Commun.
Surveys Tuts., vol. 7, no. 2, pp. 72–93, 2005.
[17] I. Stoica et al., "Chord: A scalable peer-to-peer lookup protocol for internet
applications," IEEE/ACM Trans. Netw., vol. 11, no. 1, pp. 17–32,
Feb. 2003.
[18] V. Cardellini, E. Casalicchio, M. Colajanni, and P. S. Yu, "The state of
the art in locally distributed web-server systems," ACM Comput. Surv.,
vol. 34, no. 2, pp. 263–311, Jun. 2002.
[19] (2014, Jan.). Traces Available in the Internet Traffic Archive [Online].
Available: http://ita.ee.lbl.gov/html/traces.html
[20] S. Podlipnig and L. Böszörmenyi, "A survey of web cache replacement
strategies," ACM Comput. Surv., vol. 35, no. 4, pp. 347–398, Dec. 2003.
[21] T. Schütt, F. Schintke, and A. Reinefeld, "Range queries on structured
overlay networks," Comput. Commun., vol. 31, no. 2, pp. 280–291,
Feb. 2008.
[22] D. R. Karger and M. Ruhl, "Simple efficient load balancing algorithms
for peer-to-peer systems," in Proc. ACM Symp. Parallelism Algorithms
Architecture (SPAA), New York, NY, USA, 2004, pp. 36–43.
[23] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web caching
and Zipf-like distributions: Evidence and implications," in Proc. IEEE
INFOCOM, New York, NY, USA, Mar. 1999, pp. 126–134.
[24] P. Bohrer et al., "The case for power management in web servers," in
Power Aware Computing, R. Graybill and R. Melhem, Eds. New York,
NY, USA: Kluwer Academic Publishers, 2002, pp. 261–289.
[25] J. Baliga, R. Ayre, W. V. Sorin, K. Hinton, and R. S. Tucker, "Energy
consumption in access networks," in Proc. OFC, San Diego, CA, USA,
Feb. 2008, pp. 1–3.
[26] M. E. Crovella and A. Bestavros, "Self-similarity in World Wide Web
traffic: Evidence and possible causes," IEEE/ACM Trans. Netw., vol. 5,
no. 6, pp. 835–846, Dec. 1997.
[27] (2011, Dec.). Web Metrics: Size and Number of Resources [Online].
Available: http://code.google.com/speed/articles/web-metrics.html
[28] (2011, Dec.). Apache JMeter [Online]. Available:
http://jakarta.apache.org/jmeter/
[29] (2011, Dec.). BeagleBoard.org [Online]. Available:
http://beagleboard.org/
[30] (2011, Dec.). Human Development Reports [Online]. Available:
http://hdr.undp.org/en/statistics/
[31] (2011, Dec.). Solar Calculation Details [Online]. Available:
http://www.srrb.noaa.gov/highlights/sunrise/calcdetails.html
[32] (2011, Dec.). Gaisma Website [Online]. Available:
http://www.gaisma.com/en/
[33] M. A. Green, K. Emery, Y. Hishikawa, and W. Warta, "Solar cell efficiency
tables (version 37)," Progr. Photovolt. Res. Appl., vol. 19, no. 1,
pp. 84–92, Jan. 2011.
[34] (2011, Dec.). IGen - Topology Generation Through
Network Design Heuristics [Online]. Available:
http://informatique.umons.ac.be/networks/igen/
[35] B. Quoitin, V. Van den Schrieck, P. Francois, and O. Bonaventure, "IGen:
Generation of router-level internet topologies through network design
heuristics," in Proc. ITC, Sep. 2009, pp. 1–8.
[36] B. Quoitin and S. Uhlig, "Modeling the routing of an autonomous system
with C-BGP," IEEE Netw., vol. 19, no. 6, pp. 12–19, Nov. 2005.
[37] J. Baliga, R. Ayre, K. Hinton, W. V. Sorin, and R. S. Tucker, "Energy
consumption in optical IP networks," J. Lightw. Technol., vol. 27, no. 13,
pp. 2391–2403, Jul. 1, 2009.
[38] (2013, Oct.). Demo Video for SEEDS System [Online]. Available:
http://youtu.be/jFM0iswv9gA
Chien-Ming Cheng (S'07) received the B.S. degree
from National Chiao Tung University, Hsinchu,
Taiwan, and the M.S. degree from National Tsing
Hua University, Hsinchu, both in computer science.
He is currently pursuing the Ph.D. degree from the
Department of Computer Science, National Chiao
Tung University.
His current research interests include mobile peer-
to-peer protocols and services.
Shiao-Li Tsao (M'04) received the Ph.D. degree
in engineering science from National Cheng Kung
University, Tainan, Taiwan, in 1999.
He was a Visiting Scholar at Bell Labs,
Lucent Technologies, Murray Hill, NJ, USA, in
1998, a Visiting Professor at the Department of
Electrical and Computer Engineering, University
of Waterloo, Waterloo, ON, Canada, in 2007, and
the Department of Computer Science, ETH Zurich,
Zurich, Switzerland, in 2010, 2011, and 2012–2013.
From 1999 to 2003, he was with the Computers and
Communications Research Labs of Industrial Technology Research Institute
(ITRI), Hsinchu, Taiwan, as a Researcher and a Section Manager. He is
currently an Associate Professor with the Department of Computer Science,
National Chiao Tung University, Hsinchu. He has published over 100 inter-
national journal and conference papers, and has held or applied for 21
U.S. patents. His current research interests include energy-aware computing,
embedded software and systems, and mobile communications and wireless
networks.
Dr. Tsao received the Research Achievement Awards of ITRI, in 2000 and
2004, the Highly Cited Patent Award of ITRI in 2007, the Outstanding Project
Award of Ministry of Economic Affairs (MOEA) in 2003, and the Advanced
Technologies Award of MOEA in 2003. He also received the Young Engineer
Award from the Chinese Institute of Electrical Engineering in 2007, the
Outstanding Teaching Award of National Chiao Tung University, the K. T. Li
Outstanding Young Scholar Award from ACM Taipei/Taiwan chapter in 2008,
and the 2013 Award for Excellent Contributions in Technology Transfer from
National Science Council.
Pei-Yun Lin received the B.S. and M.S. degrees
in computer science from National Chiao Tung
University, Hsinchu, Taiwan, in 2008 and 2010,
respectively.
She is currently a Software Engineer at MediaTek
Inc., Hsinchu.