You are on page 1of 13

Triage: Balancing Energy and Quality of Service in a

Microserver

Nilanjan Banerjee Jacob Sorber Mark D. Corner Sami Rollins † Deepak Ganesan
Department of Computer Science †Department of Computer Science
University of Massachusetts Amherst University of San Francisco
{nilanb, sorber, mcorner, dganesan}@cs.umass.edu srollins@cs.usfca.edu

ABSTRACT General Terms


The ease of deployment of battery-powered and mobile sys- Management, Measurement, Performance
tems is pushing the network edge far from powered infras-
tructures. A primary challenge in building untethered sys- Keywords
tems is offering powerful aggregation points and gateways
between heterogeneous end-points—a role traditionally played Power management, microservers, pervasive computing, sen-
by powered servers. Microservers are battery-powered in- sor networks, low-power computing, quality of service, em-
network nodes that play a number of roles: processing data bedded devices.
from clients, aggregating data, providing responses to queries,
and acting as a network gateway. Providing QoS guarantees 1. INTRODUCTION
for these services can be extremely energy intensive. Since The deployment of battery-powered and mobile systems
increased energy consumption translates to a shorter life- is pushing the network edge far from powered infrastruc-
time, there is a need for a new way to provide these QoS tures. Untethered, mobile, and multi-hop networks support
guarantees at minimal energy consumption. a wide range of applications from wildlife habitat monitor-
This paper presents Triage, a tiered hardware and soft- ing [19] and security surveillance [15], to space exploration
ware architecture for microservers. Triage extends the life- and disaster management [5]. Such applications require the
time of a microserver by combining two independent, but development of highly efficient, long-lived, and low-cost mo-
connected platforms: a high-power platform that provides bile and wireless systems.
the capability to execute complex tasks and a low-power One method for balancing efficiency, cost, and lifetime
platform that provides high responsiveness at low energy is to deploy distributed heterogeneous, hierarchical systems,
cost. The low-power platform acts similar to a medical that combine resource-constrained nodes, such as Motes [23]
triage unit, examining requests to find critical ones, and with resource-rich microservers [10]. Microservers form an
scheduling tasks to optimize the use of the high-power plat- intermediate battery-powered tier between tethered base-
form. The scheduling decision is based on evaluating each stations or access points, and resource constrained nodes.
task’s resource requirements using hardware-assisted profil- Microservers provide services including complex data pro-
ing of execution time and energy usage. Using three mi- cessing [16] and high-capacity storage to augment storage-
croserver services, storage, network forwarding, and query limited nodes [2]. They respond to user queries in a low-
processing, we show that Triage provides more than 300% latency manner [16], provide greater range and coverage [15],
increase in microserver lifetime over existing systems while and act as a gateway between short range and longer-range
providing probabilistic quality of service guarantees. radio networks [10].
Designing long-lived microservers poses a unique chal-
Categories and Subject Descriptors lenge: how can a microserver provide both performance guar-
antees and a long lifetime? These goals are in direct con-
D.4.7 [Operating Systems]: Organization and Design—
flict with one another: supporting performance guarantees
distributed systems, interactive systems, real-time and em-
requires examining each request immediately, as it arrives;
bedded systems; D.4.8 [ Operating Systems]: Performance—
however, providing such vigilance is energy intensive. To
measurements; D.4.4 [Operating Systems]: Communica-
see why, consider the following: normally the microserver
tions Management—network communication
remains in a low-power sleep mode. When a request arrives,
it transitions to a higher-power state, determines if it must
process the request and returns to a low-power state. Un-
Permission to make digital or hard copies of all or part of this work for fortunately, in current microserver designs this duty-cycling
personal or classroom use is granted without fee provided that copies are process is extremely energy intensive—an effect we validate
not made or distributed for profit or commercial advantage and that copies in the second section of this paper. We propose to resolve
bear this notice and the full citation on the first page. To copy otherwise, to this tension through a new architecture, Triage, that pro-
republish, to post on servers or to redistribute to lists, requires prior specific vides the responsiveness of an always-on server, together
permission and/or a fee.
with an order-of-magnitude greater lifetime than existing
MobiSys’07, June 11–13, 2007, San Juan, Puerto Rico, USA.
Copyright 2007 ACM 978-1-59593-614-1/07/0006 ...$5.00. microservers. Triage is predicated on the fundamental prop-
1600
erty that rather than choosing a single hardware platform, No Power
the needs of untethered embedded applications are best met 1400 Management

by combining platforms with complementary hardware char-

Average Power Consumed (mW)


acteristics. 1200
PSM-10s +DVFS

Research Contributions: Triage provides a general mech- 1000


PSM-33s+DVFS

anism for building highly energy-efficient and QoS-aware mi-


croservers suitable for many untethered applications, includ- 800

ing sensor applications, mobile systems, and pervasive com-


600
puting. Our design and implementation offers three novel
contributions. 400 Mote
First, the core of Triage is a new system architecture de-
signed for a combination of platforms with complementary 200

characteristics. The architecture is designed for multi-tiered


0
platforms and uses intelligent schemes to distribute tasks 0 1000 2000 3000
Forwarding Time (ms)
4000 5000 6000

across tiers to save energy. The Triage system targets com-


mon server functions such as storage, forwarding and query
processing. Figure 1: Results of the measurement study
Our second major contribution is a set of lightweight, on-
line profiling and scheduling mechanisms. A key challenge
require much less energy to transition to an active state, but
that Triage addresses is how to place the complex profiling
consume large amounts of power while waiting for requests.
and scheduling logic at the resource constrained platform
In contrast to the high costs of PDA-class devices, small
in order to maximize energy-efficiency without sacrificing
nodes such as Motes provide very efficient idling and power
QoS. We have built a highly optimized in-situ profiling ser-
state transition. This is primarily the result of using a low-
vice, and two intelligent scheduling algorithms: (a) a soft-
power set of components, including a small, simple micro-
realtime scheduler that uses an As-Late-As-Possible (ALAP)
controller, a low-power radio, small amounts of RAM, and
scheduling policy to provide deadline guarantees to tasks at
a minimal operating system. However, low-power platforms
minimal energy cost, and (b) a lifetime scheduler that uses
have insufficient resources to act as a microserver.
a token-bucket energy rate control algorithm to achieve a
We validate these claims through measurements of two
target microserver lifetime while maximizing the number of
possible microserver configurations, a Stargate and a Mote.
tasks that can meet their QoS requirements. This sched-
In each case, the microserver receives a request from a nearby
uler is adapted from previous work on scheduling network
node and responds with a 30kB message. This process re-
transfers in mobile networks [3].
peats for 15 minutes. The Stargate microserver responds
Third, we have built a prototype of Triage on a platform
using an 802.11 radio, and the Mote uses an 802.15.4 low-
combining the Intel/Crossbow Stargate with the TelosB Mote,
power radio. The Stargate can use PSM-10s+DVFS, and
augmented with a custom fabricated interface board for in-
PSM-33s+DVFS (the lowest-power PSM mode supported
situ profiling. On this platform, we show the results of an
by our card). The 10s, and 33s refers to the wakeup in-
extensive set of experiments using different workloads. We
tervals for the WiFi card in PSM mode — i.e., the WiFi
show that Triage provides a 300% increase in microserver
card wakes up every 10s, and 33s to scan for incoming pack-
lifetime over existing systems. It provides probabilistic Qual-
ets. The Mote does not use any power management. We
ity of Service guarantees and in addition meets all lifetime
show the average power each platform consumes, versus the
goals.
amount of time required to forward the message in Figure 1.
In this case, we do not suspend, or shutdown the Stargate,
2. MICROSERVER MEASUREMENTS as the transition times of several seconds would not improve
Microservers typically wait, in a low power state, for an its performance.
external request, transition to a high-power state, then pro- The Stargate is a much more capable platform, with a
cess the request. For the microserver to be useful, it must faster radio and processor, yielding very fast forwarding
have sufficient computational, networking, and storage re- times. However, the high power costs of keeping the plat-
sources to complete the request. Unfortunately, platforms form awake, even with aggressive power-management, gives
that have sufficient resources to act as a microserver incur a very high average power. In contrast, the Mote uses very
large idling energy costs, even in low-power modes, and can little power, but requires an order-of-magnitude more time
require significant energy to transition from a low-power to to forward the packet.
a high-power state. For these measurements, we see that neither of the plat-
Many systems provide a trade-off between the energy needed forms are sufficient to use as an energy-efficient microserver.
for waiting in a low-power state and the cost to transition to While the microserver must contain the resources of the
a mode that can process the request. As an example, con- Stargate to process low-latency requests, it is highly inef-
sider the PDA-class Stargate platform [6]. Though it does ficient when waiting for requests. Our goal is to extend the
consume little energy when in a suspended state, it requires operating modes of a high-power platform into the regimes
several seconds and a large amount of energy to resume op- offered by platforms such as the Mote, while retaining the
eration. This cost is due to a variety of factors including resource-richness of the Stargate.
operating system complexity, the use of large RAM banks,
and processor startup. In contrast, the Stargate can save 3. TRIAGE ARCHITECTURE
energy while waiting using shallower power saving modes,
Triage provides Quality of Service (QoS) and energy-efficiency
such as CPU DVFS [8], or wireless PSM [1]—these modes
through a combination of a high-power, resource-rich plat-
form and a low-power, resource-constrained platform. The
low-power tier, or tier-0, remains always-on ensuring respon-
siveness at minimal energy cost. The high-power tier, or
tier-1, remains in a power saving mode until its resources
are required for a given service. The low-power platform
acts similar to a medical triage unit, examining requests to
Task 2 Task 1
find the critical ones, and as a scheduler to minimize the
number of times the high-power platform is woken. Such a
scheduling mechanism requires accurate in-situ profiling of
the time and energy needs of each task, which is performed
at the low-power platform. The scheduler can optimize for Hardware Power
Wakeup Control
Measurement
different criteria—in this paper we focus on two: minimiz-
ing energy consumption while meeting soft-realtime latency
constraints, and meeting a lifetime goal while satisfying as
many task deadlines as possible.

3.1 Hardware Architecture


Triage employs a tiered hardware platform. The tier-0
platform is a very low-power platform, in this case a TelosB
Mote [23], and tier-1 is a more capable and higher-power Figure 2: Microserver Software Architecture
platform, in this case a Stargate [30]. In Triage, the two
tiers are tightly coupled and directly communicate over a
wired link. This enables the lower tier to trigger the wake- uler to determine where and when to execute the task; (3) if
up of the higher tier when necessary. the scheduler determines that the task should be executed
The TelosB Mote is extremely resource-constrained, but at tier-0, execute it immediately; (4) otherwise write the re-
consumes less than one-tenth the power of the Stargate. quest into the delayed request log. The delayed request log
This platform works well for always-on operation, simple acts as a priority queue—the task with the smallest deadline
packet processing, and providing low-latency responses. The lies at the head of the queue.
Stargate platform is significantly more capable but less re- Triage also uses the delayed request log as a cache for the
sponsive due to the high latency of sleep to active transi- surrogates. This functionality is particularly useful in stor-
tions. We augment this two-tier platform with a custom age applications; a read closely following a write to the same
fabricated interface board that provides necessary voltage data can be serviced from the log. In order to maximize the
conversions for the two platforms, the wakeup interface, as amount of cached data, Triage does not erase the tier-0 log
well as a fuel-gauge chip to measure energy-consumption in- when a batch of requests is played at tier-1. Instead, the pre-
situ. viously committed log entries and cached results are lazily
overwritten by new requests using an LRU eviction policy.
3.2 Software Architecture Though writing into the log consumes energy, we argue that
Figure 2 illustrates the components of the software archi- it is insignificant to the savings from minimising the number
tecture. Tier-0 virtualizes resources available on tier-1 using of tier-1 wakeups.
a collection of surrogates. Surrogates receive requests from To enable applications to compose the functionality of sev-
the network and can service them in three ways: by using eral surrogates, Triage also permits communication between
locally cached information, by performing local execution, surrogates using primitives provided by the operating sys-
or by passing them to tier-1 for execution. The surrogates tem. For instance, a client may query the microserver for
decide on how to service the request based on information information, and request that the results of the query be sent
provided by the profiler and the scheduler. A profiler mea- to another node. This requires a combination of a storage
sures the energy and processing requirements of tasks and a surrogate as well as a forwarding surrogate.
scheduler determines, based on the predicted energy cost of 3.2.2 Scheduler and Profiler
a task, when and where it should be executed to meet QoS
requirements. Requests that have been scheduled for execu- Triage uses a scheduler, running on tier-0, to provide QoS.
tion at tier-1 are written into a log and delayed until their The scheduler relies on a profiler to provide information re-
scheduled execution time. This log also serves as a cache of garding how long each type of task will take to process, and
recent requests. how much energy it will consume. The profiler measures the
execution time of each task and builds a model of task execu-
3.2.1 Surrogates tion time. Using this information, the scheduler determines
Surrogates are small software modules that run on tier-0 where and when to execute each request. The scheduler re-
and provide a service such as storage or forwarding. Though lies on the execution time profiles generated by the profiler.
tier-0 can process simple tasks, such as routing updates However, if a task exceeds the typical execution time, it is
or time synchronization, most tasks require resources only not preempted and still carried on till completion.
available at tier-1. When a request arrives at the microserver The question of where to execute a task can be answered
the surrogate performs the following process: (1) immedi- by comparing the amount of energy required to execute it
ately execute requests from information cached at tier-0; (2) at each platform. This decision may be even simpler if the
if a request cannot be serviced from the cache, ask the sched- task requires the resources of tier-1 and cannot be executed
on tier-0. The question of when to execute requests is more queried (3) the number of objects desired per image for com-
complicated. There are two cases when the scheduler must plex queries (4) latency deadline associated with each query.
wake tier-1 and dispatch outstanding tasks. The first case The query processing surrogate uses the other surrogates to
occurs when the log becomes full. In this case, the scheduler create a complex combination of resources, including pro-
is automatically invoked by the delayed request log; it wakes cessing, routing, and storage. Tier-1 can execute any query
tier-1 and dispatches each outstanding task to the appropri- since it has access to the powerful radio, the primary stor-
ate service. The second case requires the scheduler to con- age system, and a powerful processor. However, tier-0 can
sider QoS constraints. Each request can optionally contain respond to only simple queries. For example, any query that
a soft-realtime deadline, or latency constraint, which indi- can be performed using simple comparisons of cached meta-
cates the time by which a task should be executed. In order data can be performed at the tier-0 system. Therefore, tier-0
to meet the deadlines with maximum energy efficiency, the maintains an index of results stored from previous queries
scheduler will delay execution of tasks as long as possible to in its cache/log.
increase the amount of time tier-1 remains in a low-power When the surrogate receives a query, it first determines
state. However, if a batch is already being processed, all whether the query is a simple query that can be executed
tasks irrespective of their deadlines are executed on tier-1. over data cached in the delayed request log. We assume that
This is done to avoid the high transition cost of waking up tasks are statically mapped into simple queries, which can
tier-1. We describe the algorithms used by the scheduler be executed at either tier, and complex queries that require
and profiler in Section 5. the resources of tier-1. If the query can be executed at tier-0,
it is executed immediately. Otherwise, the surrogate passes
4. EXAMPLE SURROGATES the query and any latency constraint to the scheduler. Once
the scheduler has scheduled the query, it is inserted into the
Some of the common, and basic, functions found in servers
delayed request log.
for sensor networking, mobile networking, and pervasive com-
puting are storage, query processing and forwarding. To this
end, we present three example surrogates: a storage system 5. PROFILING AND SCHEDULING
surrogate, a network forwarding surrogate, and a query pro- At the heart of Triage is a profiling engine that is used to
cessing surrogate. As untethered networks proliferate, this estimate the execution time and energy usage for different
library of surrogates will be expanded, enhanced, and fur- tasks, and a scheduling engine that determines how to meet
ther optimized. QoS constraints. In this section, we describe the algorithms
employed by the profiling engine and the scheduling engine
4.1 Storage System Surrogate to enable energy-efficient scheduling of tasks.
The storage surrogate enables in-network storage appli-
cations. It accepts read, write, and delete requests for the 5.1 Task Profiling Algorithm
tier-1 storage system. Upon receiving a request from the Triage employs a profiler to measure the execution time
network, it first determines whether the request is a read and energy usage for different tasks. In this discussion, we
request that can be satisfied by a recent write cached in the focus on determining the typical energy usage and typical
delayed request log. If so, it immediately provides the re- execution time for each type of task. Such online profiling is
sult. Otherwise, it asks the scheduler to schedule the task. necessary to deal with the variability in execution time and
The scheduler considers any soft real time deadline provided energy usage of tasks that involve a combination of process-
with the task and tells the surrogate when the task has been ing, communication and storage.
scheduled. The surrogate then inserts the task into the de- Online profiling involves two steps, task grouping and pa-
layed request log. rameter estimation. The online profiling engine first iden-
tifies a task as belonging to a certain group based on the
4.2 Network Forwarding Surrogate nature of task. This grouping information is assumed to
The network forwarding surrogate enables efficient rout- be provided a priori by the system designer. We believe
ing by utilizing both the tier-0 and tier-1 network inter- that such a grouping is appropriate since many applica-
faces. When a packet arrives at the surrogate, it exam- tions of microservers involve a small and well-specified set of
ines the destination address, consults a routing table, and tasks. For instance, in a camera sensor network, a typical set
determines over which radios the destination is reachable. of tasks might be {Motion Detection, Face Recognition,
It immediately passes this information, along with any la- Store Image, Send Data}.
tency constraint, to the scheduler which determines which For each of these task groups, the profiler uses a sepa-
radio should be used to send the packet and when the packet rate history of execution times and energy consumption to
should be sent. If scheduler determines that the tier-0 radio build corresponding probability distributions. We focus on
should be used, the packet is sent immediately. Otherwise, the estimation of the typical execution time since a similar
the packet is inserted into the delayed request log. algorithm can be used for estimating typical energy usage.
Let f (ti ) be the probability distribution of time taken to
4.3 Query Processing Surrogate execute task type i. Further, let X i and σi denote the av-
The query processing surrogate provides a simple query erage time and standard deviation for task type i. The pro-
interface for data stored on the microserver. Clients may use filer uses the Chebyschev’s inequality (shown in Equation 1)
simple queries, such as retrieve all images from the last ten to determine an interval of time such that the task executes
seconds, or more complex queries, such as retrieve all images within that interval with probability at least p. Triage conse-
that contain 2 or more objects and are from a particular ge- quently takes the upper bound on the interval as the typical
ographic region. The queries are specified by the following execution time for the task. This is a conservative estimate,
fields (1) the type of query (simple/ complex) (2) the images however for tasks whose execute time distributions are not
known apriori, a conservative strategy is warranted. More- time: 64 61 60 57 50 0
over, typical execution time refers to the time such that p%
of the requests are likely to be satisfied. EB=3 EA=3
D=64 D=60 W=7 B=50
L=61 L=57
σi σi
T ime(i, p) = [X i − √ , Xi + √ ] (1)
1−p 1−p time: 64 61 58 55 48 0

The parameter p can be tuned depending on the guarantee


EB=3 EC=3 EA=3
required by a user. For instance, in the camera network used D=64 D=62 D=60 W=7 B=48
for surveillance, a Face Recognition task might require a L=61 L=58 L=55
tight guarantee as the person might move out of the field of
view of the camera sensors. In this case, p can be set to a
high value, say 0.9. Other tasks such as Send Data might be Figure 3: ALAP Example: The figure shows the execu-
more elastic, and Store Image might not have any deadline tion time and deadline for each task, the wake up latency
at all. for tier-1, and the resulting batching time. A new task
Using information collected after the task executes, the TC is inserted into the scheduling decreasing the batch
profiler builds two kinds of dynamic models: histograms and time.
parametrized models. When prior models of execution time
are unavailable, a simple histogram approximates the prob-
ability distribution, f (ti ), and tracks bins of execution times The scheduling framework that we propose is based on
and energy consumption for each task group. In contrast, the well-known As Late As Possible (ALAP) scheduler [29].
when prior models are available, these can be used to more When a new task arrives, the scheduler first queries the pro-
accurately model task execution time and energy consump- filer for the typical execution time for the task at tier-1.
tion. For instance, the execution time and energy used in Next, the algorithm recomputes the batch time, B, i.e. the
a communication task is a linear function of the number of latest time at which tier-1 can be woken such that all the
bytes transmitted. In this case, the execution times are fit batched tasks and the new task meet their deadlines. Let
to a simple linear model to determine the costs of the two the new task be inserted at index l into the sorted list S
radios in Triage. Our prototype uses parametrized models based on its deadline. The scheduler now needs to ensure
for the storage and network surrogates tuned to the size of that the insertion of the new task does not result in missed
the file and the size of the packet respectively. More com- deadlines for any of the other tasks in the list. The scheduler
plicated models can be built for applications such as video only lowers the batch time and never increases it, hence only
coding, compression, and encryption. For the query pro- the tasks that are before Tl in the list need to be checked for
cessing surrogate we use the histogram-based profiling. deadline violation. Thus, for each task Ti : l ≥ i ≥ 1, the
scheduler sets the latest start time such that it does not vi-
5.2 Task Scheduling Algorithm olate the deadline constraint of Ti or any task with deadline
The scheduler that resides on tier-0 uses the profiling in- after Ti , i.e. L(Ti ) = min(L(Ti ), L(Ti+1 )−E(Ti )). The new
formation about tasks to minimize the number of times tier- batch time, B, is updated to reflect the latest start time for
1 is woken while still satisfying the task deadlines. The the first task in the list, i.e. B = L(T1 )−W . If B ≤ 0, tier-1
Triage scheduler uses different algorithms depending on the is immediately woken and the batch executed. If B > 0, a
optimization criteria. In this work, we discuss two sched- timer will fire at time B and tier-1 will be woken. The time
ulers — the first is optimized to satisfy task deadlines, and required to update the schedule is linear in the number of
the second is optimized to achieve a target lifetime for the tasks currently in the queue.
microserver. While we limit our discussion to these two We illustrate the deadline scheduling algorithm with a
schedulers, we note that alternate schedulers that optimize, simple example, shown in Figure 3. Let there be two batched
or balance, other QoS constraints can be plugged into our tasks, TA with deadline 60 seconds and execution time 3
system. seconds, and TB with deadline 64 seconds and execution
time 3 seconds. The latest start times of the two tasks
5.2.1 Scheduling for Deadline Constraints are L(TA ) = 57 seconds and L(TB ) = 61 seconds respec-
The deadline scheduler tries to minimize the energy con- tively, and the batch time, B is 50 seconds, assuming a tier-1
sumed by the microserver, such that the deadlines of the wakeup time, W of 7 seconds. Now, a new task TC arrives
incoming tasks are met. Let the set of tasks which are al- with deadline 62 seconds and execution time 3 seconds. The
ready batched at tier-0 for delayed execution at tier-1 be task is inserted between TA and TB . The scheduler checks
denoted by S = {T1 , ...., Tk } where task Ti has deadline whether the current batch time satisfies TC ’s deadline, no-
D(Ti ), latest start time L(Ti ), and execution time E(Ti ). tices a violation, and pushes TA forward in the schedule.
The latest start time is the latest time at which a task can Hence, the batch time is set to 48 seconds.
begin executing on tier-1 such that the deadlines of all tasks
after and including itself are met, and the execution time is
5.2.2 Scheduling for a Lifetime Constraint
the time it takes to execute the task on tier-1. L(Ti ) and While the goal of the deadline scheduler is to miss only
E(Ti ) are provided by the profiler. Further, we assume that a small percentage of deadlines while minimizing energy us-
the list S is sorted by deadlines i.e. D(Ti ) > D(Tj ) if i > j. age, the objective of the lifetime scheduler is to meet a target
Let the wakeup time for tier-1 be W , and the current batch lifetime for the microserver while satisfying as many dead-
time, B, correspond to the latest time at which tier-1 needs lines as possible. The scheduler should also be capable of
to be woken up such that the deadlines of all tasks in the handling periods of burstiness. To accomplish this we use
list S can be satisfied. a token-bucket algorithm for the lifetime scheduler. We use
a lifetime scheduler from our previous work on networking
relays [3]. We summarize the algorithm here. Given a tar-
get lifetime, L, and battery capacity, E, energy tokens are
generated at a constant rate of E . The total number of Power Profiling Board TelosB
L
accumulated tokens represents the amount of energy that
is available for use by the system. The amount of energy
used by the system is continually monitored by the energy
profiler, and is queried periodically by the scheduler to de- Battery
termine the rate at which energy is depleted by the system.
The difference between the accumulated energy tokens and Stargate
depleted energy tokens at any time represents the surplus
of energy that can be used by the system to execute the
batched tasks.
The lifetime scheduler builds on the deadline scheduling
algorithm that we described in Section 5.2.1. When a new
task arrives, the deadline scheduling algorithm is used to
queue the task and determine the batch time. When this
batch time becomes zero, the lifetime scheduler checks to see Figure 4: Prototype Triage System
whether there are sufficient energy tokens to wakeup tier-1,
execute all the tasks in the batch, and shutdown tier-1. If so,
tier-1 is woken and the tasks are executed before shutting it Stargate, this minimum latency is 15.6 seconds. In order
down. The energy profiler is queried to determine how much to deal with this problem, more sophisticated profiling tech-
energy was used during this batched processing, and the niques could be used to anticipate the latency requirements
number of available energy tokens is updated accordingly. of upcoming tasks and choose the proper idle state accord-
If the number of energy tokens is insufficient to execute the ingly.
batch, the wakeup of the tier-1 platform is delayed until
sufficient tokens have accumulated. During this period of
waiting for tokens to accumulate, task deadlines could be
6. IMPLEMENTATION
missed, and tasks could be dropped if the size of the task In order to evaluate our approach we have implemented a
queue exceeds the storage capacity of the tier-0 platform. working prototype of Triage, shown in Figure 4.
Neither of our scheduling algorithms take into account the
availability of DVFS states at tier-1. Evaluating and pro-
6.1 Prototype Hardware
filing multiple DVFS states would possibly permit greater We constructed a prototype using a Crossbow Stargate
efficiency by allowing tier-1 to sleep longer, however this (tier-1) [30] and a TelosB Mote (tier-0) [23]. The Stargate
pushes the complexity of the scheduler to O(kn ) for n tasks contains a 32-bit, 400MHz PXA255 XScale processor, 64
and k power modes — the scheduling problem can be shown MB of RAM, 32 MB of internal flash, and a WiFi interface.
to be NP-Hard. We are currently investigating approximate The TelosB Mote contains an 8-bit, 8 MHz microcontroller,
heuristics that are computationally feasible for the tier-0 10kB of RAM, 1 MB of external flash, and an 802.15.4 ra-
node. These heuristics can be implemented on top of the dio. These hardware platforms were chosen because they
scheduling algorithms described in this section to determine handle the range of workloads that we have targeted, are
when tier-1 should be woken up. separated in power consumption by more than an order-of-
magnitude (300mW-3000mW and 20mW-100mW), are eas-
5.2.3 Idle State Management ily programmable, and are well supported. The Stargate
One remaining issue is the state in which the scheduler platform runs Linux, making available a broad range of soft-
leaves tier-1 while it batches new requests at tier-0—there ware tools and services. We used the power-gating technique
are two options, suspension and shutdown. Suspension re- with a relay to switch off power to the CF card when the
quires more idle power than shutting down the platform, Stargate is suspended [22]. This technique brought down
but the transition cost is lower. For example, the Stargate the suspension power of the Stargate to a modest 61 mW.
platform with a CF 802.11 card draws 60.46 mW in suspend We used a power supply board designed for attaching both
mode and costs 3.67 J and 32 J of energy to wake up from boards to a single battery [28]. This board provides an
suspend and shutdown respectively. additional hardware element necessary to Triage, a Maxim
The Triage scheduler determines the appropriate idle state DS2770 fuel gauge chip. The fuel gauge chip gives accurate
for tier-1 based on the expected time between wakeups of readings of the energy left in the battery, and the amount
tier-1. The expected time between wakeups are calculated of energy used by the system during task execution. As
over the last k wakeups, where k is a constant. We estimate described, the scheduler uses this profiling information to
the cost of each state based on the expected idle time, and control the platform energy policy.
choose the state that minimizes this cost. One limitation of our current implementation is the trans-
While this approach is sufficient for many applications, fer speed from the TelosB Mote to the Stargate. The two
the choice of idle state does enforce a minimum latency that devices communicate over a USB line which is limited to 230
the microserver can support. For example, if tier-1 is shut- kbps. However, data needs to be read from the flash and
down due to infrequently arriving tasks, the next task that then transfered over the USB which increases the total time
requires tier-1 will have to wait at least as much time as required for the transfer. As a result, transferring 1024 kB
tier-1 requires to wake from shutdown. In the case of the of batched work along with protocol overhead takes more
BASE STATION We use a camera sensor network application in our evalu-
SENSOR CAMERA ation. The application involves in-network processing, stor-
age and forwarding and forms an adequate testbed for Triage.
802.11b The experimental setup is shown in Figure 5. Motes emulate
Link
cameras in the network and feed images of variable sizes to
802.15.4 Link CLIENT
the microserver. The other nodes in the network are client
Motes attached to a device equipped with a 802.11b inter-
face. Therefore, results of queries can be routed back to a
client using a 802.15.4 or a 802.11b link. All power measure-
HUB/MAU NIC
ments were taken using a NI-PCI 6251 DAQ with a SC-2345
GD

JA
% UTIL IZATI ON

RE

KB
I F

LC
TAB

ENT ER
RUN
NI-PCI signal conditioning unit.
6251 DAQ
M7 N8 O9 PRINT
GD GD GD HELP
BNC
4M b/s GD T2 U3 ALPHA
V0 WX
. YZ SHIF T

MICROSERVER In our evaluation, we compare Triage with three other


systems.
1. PSM-DVFS : This is a single-tiered dual radio system
Figure 5: Experimental Setup which uses WiFi PSM and DVFS (dynamic voltage
frequency scaling) to save power on the Stargate. The
TelosB Max. Power Consumption 100 mW
Stargate Bootup Energy 32 J
system runs a DVFS algorithm which uses previously
Stargate Bootup Time 15.6 s measured data to identify the lowest DVFS state on
Stargate Resume Energy 3.7 J the Stargate where the given deadline can be satisfied.
Stargate Resume Time 2.6 s The wakeup interval in PSM mode for the 802.11 card
Stargate Suspension Power 60.5 mW
is set to the maximum value supported by the card
Stargate Idle Power (WiFi in PSM-idle mode) 912.3 mW
(33 secs). This provides an accurate comparison with
Figure 6: Platform Measurements a system which does not use the hardware architecture
of Triage.
2. WoW* : The system is similar to Wake-on-Wireless
than 150 seconds and wastes a great deal of energy while [26]. The published Wake-on-Wireless system wakes
blocked on serial I/O. We are currently investigating more up when it receives a network packet. WoW*, how-
efficient means of communication. Even with this limitation ever, wakes up tier-1 whenever a task arrives at tier-0.
the current prototype shows extremely high gains in energy Tasks are always executed on tier-1. WoW* is a sys-
efficiency. tem which uses the hardware architecture of Triage
6.2 Surrogates and Log but does not use its software architecture.
As part of this prototype, we implemented three surro- 3. Triage-Batch : The Triage-Batch system uses the same
gates: storage, network forwarding, and query processing. hardware as Triage. However, it does not use any on-
The delayed request log is managed on the TelosB’s flash line profiling, scheduling or caching. It batches a task
storage using a custom designed file system. We imple- as long as its deadline permits and then wakes up tier-1
mented these components as TinyOS modules written in to execute the task.
nesC [9]. The forwarding, storage, and query processing
surrogates comprise 535, 480, and 540 lines of nesC code
7.1 Static Energy Costs
respectively, and the custom file system consists of roughly In order to provide better intuition into the behavior of
1600 lines of code. The profiler and scheduler consist of our prototype, we measure the static energy costs that di-
1360 lines of code. We also implemented an execution en- rectly impact Triage’s performance. These values, shown in
gine that runs on the Stargate and executes tasks when they Figure 6, are the basis for the energy savings achieved by
are received from the Mote 1 . Triage. We observe that the idle cost of the Stargate plat-
form with WiFi card in PSM mode is 6 times the sum of
the suspend power of the Stargate and the maximum power
7. EVALUATION consumed by the TelosB. Therefore, offloading tasks to tier-
We evaluate the performance of Triage through a exten- 0 while keeping tier-1 asleep can lead to substantial energy
sive set of experiments. First, we provide micro benchmarks savings. However, the Stargate’s transition from suspend to
that validate our use of a two-tier platform to achieve a com- active costs as much energy as 36 seconds of computation on
bination of capability, energy-efficiency and responsiveness. tier-0. Transitioning from shutdown to active is equivalent
Second, we evaluate the deadline and lifetime schedulers, to 319 seconds of active tier-0 computation. Therefore, re-
and demonstrate their effectiveness in achieving their goals. placing expensive state transitions at tier-1 with low-power,
Third, we focus on the performance of the profiler, and its tier-0 computation as long as possible can lead to minimal
accuracy when used with the forwarding surrogate. Fourth, energy costs for the system.
we demonstrate how the Triage system is able to adapt itself
to different QoS constraints at minimal energy consumption. 7.2 Soft-Realtime Scheduling
Finally, we show the energy consumption of different inde- In order to evaluate the deadline scheduling algorithm,
pendent components of the system and identify potential we observe how a Triage microserver performs in the pres-
bottlenecks in the system. ence of different latency constraints. In this experiment the
1
The source code for Triage is available at microserver answers queries of client Motes for objects in
http://prisms.cs.umass.edu/hpm/triage.html. images. Such a query is common in surveillance applica-
tions where a user might want to detect movements at a
0.35
1200

Average power consumed (mW)


0.3
1000

0.25
800
Probability

0.2
600

0.15
400 Triage (burst)
Triage−Batch (burst)
0.1
200 Triage (constant)
Triage−Batch (constant)
0.05
0
100 150 200 250 300
0 Average latency constraint(ms)
2 4 6 8 10 12 14 16 18
Time taken (seconds)
Figure 8: Average Power Consumption for the dead-
line scheduler. The deadline scheduler consumes slightly
Figure 7: Histogram of the time taken to execute image
more power in order to meet more deadlines.
classifier on Stargate
1

scene or extract important features of a scene. The objects


in the image are computed using the kmeans classifier at

Percentage of Deadlines met


0.8
the microserver and the classified image is sent back to the
client.
The histogram of the amount of time taken to classify 0.6
random 100-by-100 images using the kmeans algorithm is Triage (constant)
Triage (burst)
shown in Figure 7. The figure illustrates the large variance
0.4 Triage−Batch (constant)
in the amount of time taken to classify an image. Moreover, Triage−Batch (burst)
it is clear that the time taken to classify an image does not
follow a known probability distribution. Hence, accurate
0.2
time profiling for the application is crucial to the success of
the scheduler in meeting the deadline constraint at minimal
energy consumption. The profiler uses a threshold of p = 0.9 0
in Equation 1 to determine the typical execution time of each 100 150 200 250 300
Average Latency Constraint (sec)
task, i.e., at least 90% of the time the microserver will meet
its deadline for tasks.
We evaluate the scheduler on two task arrival distributions— Figure 9: Percentage of Queries executed within dead-
(i). when tasks arrive at a constant rate of one per 30 sec- line. Triage meets more than 95% of all deadlines while
onds (ii). when tasks arrive in bursts of three queries—the Triage-Batch is able to meet only 70% of all deadlines
inter arrival time between bursts follow a Poisson distribu-
tion with mean 30 seconds. While the first scenario exhibits
a constant load on the system which is easy to learn, the termine the computational needs required for these queries,
second task arrival distribution tests loads which are unpre- and the accuracy of the deadline scheduler, which correctly
dictable. Therefore, the system has to accurately predict schedules the wakeup of tier-1 to meet the desired latency
and adapt to the variable load patterns to perform well. constraints. Without profiling the execution time of tasks or
We varied the latency constraint on the task in the ex- the deadline scheduler, the Triage-Batch scheduler is unable
periment and we measure the power consumed by the mi- to determine when to wake tier-1 and regularly misses dead-
croserver and the percentage of tasks completed within the lines, especially when latency constraints are small. This is
deadline. Each point on the x-axis represents experiments because scheduling errors are more likely to occur at the
where the latency constraint is chosen uniformly at random beginning of each batch of tasks. Allowing longer latencies
within the interval [x−30, x+30] seconds. 100-by-100 images results in larger batches of tasks and more tasks that are
are sent to the microserver at a rate of one per minute. The processed ahead of their deadlines. However, it is worth
results are compared with the Triage-Batch system. This is noting that the Triage system only consumes slightly more
because the Triage-Batch system keeps the tier-1 system in power than the Triage-Batch system.
the low-power mode for a longer period of time. The second observation from the experiment is that the
The results of the experiment are shown in Figure 8 and Triage system satisfies all constraints irrespective of the task
Figure 9. The first thing to note in Figure 9 is that for arrival distribution. However, both systems consume more
each workload the Triage deadline scheduler is always able power for the bursty task arrival distribution. This is be-
to meet at least 90% of the latency constraints. Triage is able cause, even though the mean inter-arrival time between bursts
to adapt to bursty unpredictable task arrival distributions. is the same as the interval between arrival of constant rate
This demonstrates both the accuracy of the non-parametric tasks, the number of tasks executed for bursty task arrival
histogram profiler of Triage, which is able to precisely de- is more than the constant task arrival distribution.
100 18000

Battery Capacity Remaining (mAh) Lifetime Constraint


16000 Latency constraint
Triage
80 WoW* 14000

Average Latency (ms)


PSM−DVFS Triage
12000 WoW*
PSM−DVFS
60
10000

8000
40
6000

4000
20
2000

0
0 0 20 40 60 80 100 120
0 10 20 30 40 50 60 Size of Images (KB)
Time (minutes)

Figure 11: Time Profile Accuracy. This shows the for-


Figure 10: Lifetime Scheduler. This figure shows warding surrogate choosing between two radios based on
Triage’s use of the lifetime scheduler for a goal of 60 a latency constraint. For 80KB of data is switches from
minutes. For the first 30 minutes the load is light and the 802.15.4 radio to the 802.11 radio to meet the con-
the microserver accumulates an energy surplus. For the straint.
last thirty minutes it uses the surplus at the detriment
of meeting deadlines.
respectively. Therefore, these systems are left without any
battery for 37% and 65% of the time.
7.3 Lifetime Scheduling
To show that the lifetime scheduler is able to meet a life- 7.4 Task Profiling
time goal, we subject the microserver to a similar experi- The profiling function of the microserver is essential in
ment as before using Query Processing. However, we use providing soft-realtime guarantees. We use simple parame-
simple queries for an images stored in the Stargate storage. terized linear models for the forwarding surrogate and gen-
In order to expedite this experiment, we use a small battery eral histogram models for the more complex tasks like image
capacity of 100 mAhr—enough energy for the microserver processing where good models are not known apriori. The
to operate at maximum load for 9 minutes. We set the life- model parameters are learned by Triage over time. The effi-
time goal of the server to be 60 minutes. For the first 30 cacy of the histogram model was shown in Section 7.2. We
minutes, all queries arrive at a constant rate of one per 30 present the efficacy of the parameterized model here. We
seconds with a latency constraint uniformly distributed over use an experimental setup where variable sized images are
the interval [150, 210] seconds. For the remaining time, the sent from the camera Motes to the microserver at a fixed
server sees a more intense load with queries arriving with rate of one image per minute to be routed to some destina-
latency constraints uniformly distributed over the interval tion Mote. The destination could correspond to a central
[5, 15] seconds. Figure 10 shows the results of this experi- processing server or another node. Since the linear model
ment. The slope of the straight line demonstrates the life- for the amount of time taken to route data for the two ra-
time goal divided by the energy capacity of the server—this dios is a function of the amount of data that Triage wants to
is the overall average power goal. send over the radio, we vary the size of the images sent dur-
Recall that when using the lifetime scheduler, Triage pri- ing the experiment—this corresponds to images of different
oritizes lifetime over latency constraints. It attempts to resolutions required by a client.
meet the latency constraint whenever it can, and the token- Each forwarding request has a fixed latency constraint of
bucket algorithm allows for bursts of short, energy intensive 15 seconds. We show the results of the experiments in Fig-
workloads. For this algorithm to operate correctly, Triage ure 11 and Figure 12. We compare our results with the
must accurately profile the energy use of tasks and track the PSM-DVFS and WoW* systems. Comparing these systems
overall energy consumption of the microserver to account we see several effects. First, Triage is able to profile the
for profiling errors. For the first 30 minutes, the server time required for forwarding tasks correctly and send them
consumes less energy than is required to meet its lifetime by their required latency constraint for data sizes less than
goal. During this time the server’s workload is insufficient 100KB. Triage uses the 802.15.4 radio at the lower data
to drain the bucket, so Triage is free to schedule all tasks rates, and above 80KB of data it must wake the tier-1 system
and therefore meets its deadlines. After operating for 10 to make the latency constraint. Second, as Triage can use
minutes under a more intense workload, Triage continues the TelosB-node alone to transfer data it can achieve 200%
to meet its deadlines, but it begins to consume the surplus increase in lifetime over WoW* system and 500% increase
energy that has accrued. At 45 minutes, Triage runs out in lifetime over PSM-DVFS system. The savings clearly
of surplus energy and begins to sacrifice latency constraints demonstrate the benefit of using a tiered architecture. The
for conserving energy. Note that Triage meets the lifetime architecture provides the microserver with the flexibility of
goal with an excess energy of about 3mAh. The WoW* and using resources on either tier intelligently—leading to sub-
PSM-DVFS systems are unable to meet the lifetime con- stantial energy savings. Third, both WoW* and Triage are
straint. Their batteries die out at 38th and 21st minutes unable to meet the latency constraint for 120KB images.
1200 1200

Average Power Consumed (mW)


Average Power Consumed (mW)
1000 1000
Triage
WoW* Triage
800 PSM−DVFS 800 WoW*
PSM−DVFS

600 600
Suspension
mode
400 400
Shutdown
mode
200 200

0 0
0 20 40 60 80 100 120 0 100 200 300 400 500 600
size of images (KB) QoS Constraint (secs)

Figure 12: Time Profile Power Consumption. This Figure 13: Scaling to QoS constraints. Triage finds the
shows the forwarding surrogate power consumption minimal energy cost at which a QoS constraint can be
based on the amount of data it sends. At 80KB of data it satisfied. The WoW* system and PSM-DVFS system
must switch to the 802.11 radio to meet the latency con- consume 3x and 5x more power than Triage respectively.
straint, thus using more power. The WoW* and PSM-
DVFS solutions use more power as they only use the 3
PSM−DVFS
802.11 radio to send data. 2

Power Consumed (W)


This is due to the USB data transfer bottleneck for multi- 0
0 50 100 150 200 250 300 350 400 450 500
tiered system. The PSM-DVFS system meets all deadlines 3
at a much higher energy cost. 2
WoW*

7.5 Scaling to QoS Constraints 1

The primary goal of Triage is to balance energy consump- 0


0 50 100 150 200 250 300 350 400 450 500
tion of the server with a given QoS constraint. Triage uses a 3
combination of task profiling, scheduling, caching, and idle Triage
2
state management to determine the minimal cost at which
1
a QoS constraint can be satisfied.
We perform the following experiment to validate the above 0
0 50 100 150 200 250 300 350 400 450 500
claim. Camera Motes feed 100-by-100 images into the mi- Time (sec)
croserver at a rate of one per minute. Client Motes request
images from the microserver at a rate of one per 30 seconds.
Each query request is for a single image taken T seconds ago, Figure 14: Power Traces for the Triage, WoW* and
PSM-DVFS systems.
where T is exponentially distributed with a mean 100 min-
utes. This represents applications where newer data is more
valuable to the user than old data. We vary the latency con- We find that the PSM-DVFS system suffers from a huge
straint on the task in the experiment. We compare Triage idle cost of keeping the Stargate platform awake with the
with PSM-DVFS and WoW* systems. The results are shown WiFi card in PSM-idle mode. This problem is solved by the
in Figure 13. We see that Triage consumes 300% less power WoW* system by using the hardware architecture of Triage
than WoW* and 600% less power than the PSM-DVFS sys- and duty-cycling the Stargate. However, the WoW* sys-
tem. Triage uses a combination of serving requests from tem suffers from a huge transition energy cost, since it has
cached data, scheduling and delayed execution to amortize to wake up the Stargate on every task arrival. The above
the transition cost of the tier-1 platform over a long period problem is solved by Triage using its software architecture
of time and hence shows large power savings. and amortizing the transition cost of a long interval of time.
7.6 Component Power Consumption However, we find that the USB-transfer cost is a potential
bottleneck for both the Triage and WoW* systems and the
Finally, we demonstrate the power consumed by inde- performance of Triage could be improved if the bottleneck
pendent components of Triage. We perform an experiment is eliminated using low power DMA (direct memory access)
where 100-by-100 images are sent to the microserver at a between the Mote and the Stargate.
rate of 1 per minute. The client Motes request classified
images with a task arrival rate distributed in the interval
[30, 60] seconds. This application provides for a combina- 8. RELATED WORK
tion of storage, data transfer, processing and data forward- The design and implementation of Triage draws from sev-
ing. The power trace collected is shown in Figure 14. The eral related research areas.
break-down of the average power consumed by different in-
dependent components of the system are show in Figure 15. 8.1 Microservers and Clustering
1200
The Wake-on-Wireless project (WoW) [26] proposes a hi-
TelosB
Average Power Consumed (mW)
Suspend
erarchy of devices for PDAs, including a low-power receiver
1000 Transition that can wake the PDA. Our goal is similar to WoW, to re-
USB−Transfer duce power consumption in battery powered devices. Their
800
Computation+WiFi focus was solely on exploiting low-power radios, whereas
Stargate−idle Triage tackles the significantly broader problem of building
energy-efficient, QoS-aware microservers.
600
Our Turducken system [27] employs multiple hardware
tiers in the context of an always-on laptop system. The sys-
400 tem designer predetermines the tasks to be executed on a
tier. Unlike Triage, in Turducken task sharing is static and
200 the system lacks a clean software architecture for automatic
distribution of tasks among tiers. Moreover, Turducken sys-
tem provides an always-on capability to laptops at minimal
0
Triage WoW* PSM−DVFS energy cost while Triage is an architecture for building a
power efficient general purpose microserver in a static sen-
Figure 15: Break-down of the power consumed by inde- sor network setting.
pendent components of Triage, WoW* and PSM-DVFS Narayanan et. al use a history-based scheme to predict the
systems. effect of application fidelity on resource consumption [20].
The predictors designed in the paper were specific to the
platform and input data on which they are executed. Such
Several sensor network systems utilize a subset of the par- predictors allows a system to learn over time, the behavior of
ticipating nodes as aggregaters, central processing nodes, or an application and can provide feedback about its resource
gateways [10]. This work can be classified into algorithms consumption. The paper uses the predictors in an operating
for networks of homogeneous devices and algorithms for net- system platform to monitor logs and predict how application
works of heterogeneous devices. In homogeneous systems resource consumption varies with fidelity.
such as Heed [31] and LEACH [12], the leader, or cluster-
head, rotates among nodes in the network. The goal is to
8.3 Sensor Platforms
distribute the extra energy drain incurred by the leader. In Recently, many embedded sensor platforms have emerged.
heterogeneous systems, larger, more powerful nodes called These platforms span a broad spectrum of power require-
microservers herd other smaller nodes [11]. Our work fo- ments and functionality. A popular instance of sensor plat-
cuses on the latter scenario and addresses the need for a forms is the family of Motes. These nodes are commercially
power-aware software and hardware architecture to reduce available, widely used, and include the Crossbow MicaZ and
the energy drain on the resource-rich nodes. Mica2Dot as well as the Telos node. All of these nodes con-
sume peak power between 10-100mW and are tuned to be
8.2 Energy Management highly power efficient.
Reducing the power consumption of mobile devices has There are also several more capable but still very power-
been the subject of much research. Approaches include scal- efficient sensor nodes such as the Yale XYZ [18]. This node
ing the CPU voltage and frequency [8], managing wireless has dynamic frequency scaling capability and can operate
interface usage [1], turning off banks of RAM [13], or employ- between 2MHz and 56MHz with a power consumption of
ing microsleep [4, 14]. In a larger device, such as a Stargate, up to 3x greater than a Mote at comparable clock speeds.
these techniques still do not enable a power mode compara- Such intermediate platforms can be used as clusterheads in
ble to a Mote device. While these efforts target optimization applications that have moderate computation requirements.
for laptops and PDAs, our architecture targets microservers Our architecture targets resource-rich but power efficient
for wireless networks, and is designed to exploit a hardware sensor platforms that combine two processing elements—one
platform with complementary components. small and one large. Such architectures have been used in
Papathanasiou and Scott made an observation similar to other research efforts. The Stargate platform [30] incorpo-
ours: batching work, or increasing idle periods, leads to rates a connector to a Mica or Telos Mote, but the intention
greater energy efficiency [21]. However, the goal of the their was to provide a gateway for Mote radios, rather than to
work was to increase burstiness in laptop disk drives. optimize the energy efficiency of the platform. The PASTA
In our previous work, we designed a multi-tiered multi- node is an architecture that combines a trip-wire board with
radio architecture to design solar-powered energy efficient a DSP processor together with a PXA processor [25], and the
DTN routers [3]. The architecture uses a mobility predic- LEAP platform [7] integrates a higher-end processor-radio
tion engine, and a lifetime scheduler to efficiently route net- module (Intel PXA255 XScale running Linux) with a lower-
work packets at minimal energy cost. Unlike Triage, which end processor-radio module (TI MSP430). The mPlatform
is an architecture to build general purpose microservers in a is another modular sensor platform which provides real time
static sensor network setting, the throwbox architecture fo- processing for requests on heterogeneous processors [17]. While
cused on application of a similar hardware architecture in a these efforts employ hierarchical structures, they do not pro-
more mobile scenario. Although the design and algorithms vide software architectures or algorithms for intelligently
were specific to disruption tolerant networking and cannot controlling the use of the hardware infrastructure.
be applied to static sensor networks, the paper shows the
efficacy of the hardware architecture in a mobile network 9. CONCLUSIONS
scenario where contacts are sparse and short-lived. This paper presents the design, implementation, and eval-
uation of Triage, an architecture for QoS-aware, energy- International Conference on Mobile Systems, Applications,
efficient microservers. Our work exploits hierarchical hard- and Services (MobiSys’04), Boston, MA, June 2004.
ware with two connected platforms, one with high capability [5] B. Burns, O. Brock, and B. N. Levine. MV routing and
for executing a batch of tasks and one with high energy- capacity building in disruption tolerant networks. In
Proceedings of IEEE Infocom 2005, March 2005.
efficiency while waiting for new tasks to arrive. A novel as-
[6] Crossbow Technology Inc., San Jose, CA. Stargate
pect of our work is our energy and QoS-optimized scheduler Developer’s Guide, Rev. A edition, September 2004.
that uses extensive in-situ profiling capabilities to efficiently 7430-0317-12.
execute storage, communication and processing tasks. We [7] K. Ho D. McIntire, B. Yip, A. Singh, W. Wu, and W. J.
demonstrate the use of two schedulers in Triage, an As-Late- Kaiser. The Low Power Energy Aware Processing (LEAP)
As-Possible scheduler that optimizes for task deadlines, and Embedded Networked Sensor System. Technical report,
a token-bucket based scheduler that optimizes for node life- UCLA, Los Angeles, CA, 2005.
time. Our results show that Triage provides a 300% increase [8] K. Flautner, S. Reinhardt, and T. Mudge. Automatic
performance-setting for dynamic voltage scaling. In
in microserver lifetime over existing systems and provides Proceedings of the Seventh ACM International Conference
probabilistic quality of service guarantees. on Mobile Computing and Networking (MobiCom’01),
While our Triage prototype demonstrates the potential for Rome, Italy, July 2001.
long-lived QoS-aware microservers, we believe that the ceil- [9] D. Gay, P. Levis, R. V. Behren, M. Welsh, E. Brewer, and
ing is significantly higher than what we have demonstrated D. Culler. The nesC Language: A Holistic Approach to
in this paper. First, we only considered an always-on tier-0 Networked Embedded Systems. In Proceedings of
platform since efficient duty-cycling support is not yet avail- Programming Language Design and Implementation
(PLDI), June 2003.
able for the CC2420 radio on the Telos Mote. We believe
[10] L. Girod, T. Stathopoulos, N. Ramanathan, J. Elson,
that Triage can provide significantly more benefits if tier-0 D. Estrin, E. Osterweil, and T. Schoellhammer. A system
were also duty-cycled, especially when requests are infre- for simulation, emulation, and deployment of heterogeneous
quent. For instance, we can reduce the tier-0 energy by sensor networks. In Proceedings of the Second ACM
a factor of 10 while increasing average latency by 1 sec- Conference on Embedded Networked Sensor Systems,
ond [24]. In such a scenario, a WoW-style system could be Baltimore, MD, 2004.
used to control the wakeups of tier-0 which can consequently [11] R. Govindan, E. Kohler, D. Estrin, F. Bian,
K. Chintalapudi, O. Gnawali, S. Rangwala, R. Gummadi,
wakeup tier-1 when required—this is something we aim to and T. Stathopoulos. Tenet: An architecture for tiered
study as future work. Second, the hardware used in the cur- embedded networks. Technical Report TR-56, CENS,
rent prototype is not ideal for a number of reasons including November 2005.
the large latency in waking the Stargate from suspension, its [12] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan.
low bandwidth available to transfer data from the Telos to Energy-efficient communication protocols for wireless
the Stargate. The large latency to wake the Stargate is the microsensor networks. In Proceedings of the Hawaiian
International Conference on Systems Science, January
greatest limiting factor as it restricts the smallest latency
2000.
requests that the microserver can support. As low-power
[13] H. Huang, P. Pillai, and K. G. Shin. Design and
platforms continue to evolve we expect that most of these implementation of power-aware virtual memory. In
bottlenecks will improve making Triage an even more use- Proceedings of USENIX Technical Conference, San
ful and efficient system for untethered sensor deployments, Antonio, TX, June 2003.
mobile computing, and ubiquitous computing. [14] N. Kamijoh, T. Inoue, C. M. Olsen, M. T. Raghunath, and
C. Narayanaswami. Energy trade-offs in the IBM
wristwatch computer. In Proceedings Fifth International
Acknowledgements Symposium on Wearable Computers, Zurich, Switzerland,
We thanks our anonymous reviewers and our shepherd Mar- October 2001.
garet Martonosi for their valuable feedback and comments. [15] P. Kulkarni, D. Ganesan, and P. Shenoy. Senseye: A
multi-tier camera sensor network. In ACM Multimedia,
This material is based upon work supported by the National
2005.
Science Foundation under awards CNS-0447877 and CNS- [16] M. Li, D. Ganesan, and P. Shenoy. PRESTO:
0546177, CNS-0520729, CNS-0519881, DUE-0416863, CNS- Feedback-driven data management in sensor networks. In
0615075. Any opinions, findings, and conclusions or recom- Proceedings of the 3nd USENIX/ACM Symposium on
mendations expressed in this material are those of the au- Networked Systems Design and Implementation (NSDI
thor(s) and do not necessarily reflect the views of the NSF. 2006), May 2006.
[17] D. Lymberopoulos, B. Priyantha, and F. Zhao. mPlatform:
A Flexible and Efficient Architecture for Sharing Data in
10. REFERENCES Stack-Based Sensor Network Platforms. In
[1] M. Anand, E. B. Nightingale, and J. Flinn. Self-tuning MSR-TR-2006-142, October 2006.
wireless network power management. In Proceedings of the [18] D. Lymberopoulos and A. Savvides. XYZ: A
9th ACM International Conference on Mobile Computing motion-enabled, power aware sensor node platform for
and Networking (MobiCom’03), San Diego, CA, September distributed sensor network applications. In Proceedings of
2003. Information Processing in Sensor Networks (ISPN), Los
[2] R. Balani, S. Han, V. Raghunathan, and M.B.Srivastava. Angeles, CA, April 2005.
Remote Storage for Sensor Networks. [19] A. Mainwaring, J. Polastre, R. Szewczyk, D. Culler, and
UCLA-NESL-200504-09, UCLA, 2005. J. Anderson. Wireless sensor networks for habitat
[3] N. Banerjee, M. D. Corner, and B. N. Levine. An monitoring. In Workshop on Wireless Sensor Networks and
Energy-Efficient Architecture for DTN Throwboxes. In Applications, Atlanta, GA, September 2002.
Proceedings of Infocom, May 2007. [20] D. Narayanan, J. Flinn, and M. Satyanarayanan. Using
[4] L. S. Brakmo, D. A. Wallach, and M. A. Viredaz. history to improve moble application adaptation. In
microSleep: A technique for reducing energy consumption Proceedings of 3rd IEEE Workshop on Mobile Computing
in handheld devices. In Proceedings of the Second Systems, December 2000.
[21] A. E. Papathanasiou and M. L. Scott. Energy efficiency
through burstiness. In Proceedings of the IEEE Workshop
on Mobile Computing Systems and Applications
(WMCSA), Monterey, CA, October 2003.
[22] T. Pering, Y. Agarwal, R. Gupta, and R. Want. Coolspots:
Reducing the power consumption of wireless mobile devices
with multiple radio interfaces. In Proceedings of Mobisys,
June 2006.
[23] J. Polastre, R. Szewczyk, and D. Culler. Telos: Enabling
ultra-low power wireless research. In Proceedings of the
Fourth International Conference on Information
Processing in Sensor Networks: Special track on Platform
Tools and Design Methods for Network Embedded Sensors
(IPSN/SPOTS), April 2005.
[24] Joseph Polastre, Jason Hill, and David E. Culler. Versatile
low power media access for wireless sensor networks. In
SenSys, pages 95–107, 2004.
[25] B. Schott, M. Bajura, J. Czarnaski, J. Flidr, T. Tho, and
L. Wang. A modular power-aware microsensor with 1000x
dynamic power range. In Proceedings of Information
Processing in Sensor Networks (ISPN), Los Angeles, CA,
April 2005.
[26] E. Shih, P. Bahl, and M. J. Sinclair. Wake on Wireless: An
event driven energy saving strategy for battery operated
devices. In Proceedings of the Eighth ACM Conference on
Mobile Computing and Networking, Atlanta, GA,
September 2002.
[27] J. Sorber, N. Banerjee, M. D. Corner, and S. Rollins.
Turducken: Hierarchical power management for mobile
devices. In Proceedings of The Third International
Conference on Mobile Systems, Applications, and Services
(MobiSys ’05), Seattle, WA, June 2005.
[28] J. Sorber, A. Kostadinov, M. Garber, M. D. Corner, and
E. D. Berger. eFlux: A Language and Runtime System for
Perpetual Mobile Systems. In Technical Report 06-61,
University of Massachusetts, Amherst, December 2006.
[29] RA Walker and S Chaudhuri. Introduction to the
scheduling problem. In IEEE Design & Test of Computers,
1995.
[30] R. Want, T. Pering, G. Danneels, M. Kumar, M. Sundar,
and J. Light. The personal server - changing the way we
think about ubiquitous computing. In Proceedings of
Ubicomp 2002: 4th International Conference on Ubiquitous
Computing, Goteborg, Sweden, September 2002.
[31] O. Younis and S. Fahmy. HEED: A hybrid, energy-efficient,
distributed clustering approach for ad-hoc sensor networks.
IEEE Transactions on Mobile Computing, 4(4), October
2004.

You might also like