
Computing Practices

CASAS: A Smart Home in a Box
Diane J. Cook, Aaron S. Crandall, Brian L. Thomas, and Narayanan C. Krishnan
Washington State University

The CASAS architecture facilitates the development and implementation of future smart home technologies by offering an easy-to-install lightweight design that provides smart home capabilities out of the box with no customization or training.

In the past decade, machine learning and pervasive computing technologies have matured to the point where they provide integrated and automated context-aware support in our everyday environments. A smart home is a physical embodiment of such a system. Computer software uses sensors and artificial intelligence techniques to perceive and reason about the state of the home’s physical environment and its residents, and then initiates action to achieve specified goals.

During perception, sensors embedded in the home generate readings while residents perform their daily routines. The sensor readings are collected by a computer network and stored in a database that an intelligent agent uses to generate useful knowledge such as patterns, predictions, and trends. On the basis of this information, a smart home can select and automate actions that meet the goals of the smart home application.

Many researchers view the potential uses of smart home technology for applications such as health monitoring and energy-efficient automation as “extraordinary” [1]. However, to date most implementations are somewhat narrow and are performed primarily in controlled laboratory settings. These limitations are due in large part to the difficulty of creating a fully functional smart home infrastructure.

In fact, although realistic smart home prototypes exist [2,3], implementing these designs is so cumbersome that meetings have been organized to discuss ways to scale such pervasive computing systems and to share valuable data that have been successfully captured in such settings. A recent example of such a meeting is the 2012 National Science Foundation Workshop on Pervasive Computing at Scale (http://sensorlab.cs.dartmouth.edu/NSFPervasiveComputingAtScale).

The goal of the Washington State University’s CASAS project is to design a smart home in a box (SHiB). This smart home kit is small in form, lightweight in infrastructure, extendable with minimal effort, and ready to perform key capabilities out of the box.

ARCHITECTURE
Figure 1 shows the CASAS architecture. The physical layer contains hardware including sensors and actuators. The architecture utilizes a ZigBee wireless mesh which communicates directly with hardware components. A publish/subscribe manager governs the middleware layer. The manager provides named broadcast channels that allow component bridges to publish and receive messages. The middleware provides valuable services, such as adding time stamps to events, assigning universally unique identifiers (UUIDs), and maintaining site-wide sensor state. Every

62 computer Published by the IEEE Computer Society 0018-9162/13/$31.00 © 2013 IEEE

Authorized licensed use limited to: Eskisehir Teknik Universitesi. Downloaded on August 29,2021 at 20:35:39 UTC from IEEE Xplore. Restrictions apply.
component of the CASAS architecture communicates via a customized Extensible Messaging and Presence Protocol (XMPP) bridge to this manager. Examples include the ZigBee bridge; the Scribe bridge, which archives messages in permanent storage; and bridges for each application-layer software component.

The CASAS architecture is easily maintained because the communication bridges use lightweight APIs that support a wide variety of free-form messages. As a result, the middleware is compact and stable: it has had only one update in five years. CASAS is extendable because users can configure and integrate new bridges without changing or even restarting the middleware. In addition, we have designed bridges that link multiple smart homes together, allowing CASAS to scale to communities of smart homes.

As Figure 2a shows, all of the CASAS components fit into a single small box. The physical components in the current box consist of sensors that are

Figure 1. CASAS smart home components. During perception, control flows up from the physical components through the middleware to the software applications. When the smart environment initiates an action, control moves from the application layer to the physical components that automate the action. The goal is that each layer is lightweight, extensible, and ready to use.
[Diagram labels, top to bottom: applications (activity recognition, activity discovery, energy) attached through application bridges; publish/subscribe manager with the Scribe bridge to archive storage and the ZigBee bridge; ZigBee wireless mesh; sensors and actuators.]
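The publish/subscribe flow described in the ARCHITECTURE section can be illustrated with a minimal in-process Python sketch. This is not the CASAS middleware, which communicates over XMPP bridges; the `PubSubManager` class, the `rawevents` channel name, and the message fields are all hypothetical names chosen for illustration. The sketch only mirrors the three services the text attributes to the middleware: adding time stamps, assigning UUIDs, and maintaining site-wide sensor state.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone
from typing import Callable, Dict, List

Message = Dict[str, str]
Handler = Callable[[Message], None]


class PubSubManager:
    """Hypothetical in-process stand-in for a publish/subscribe manager."""

    def __init__(self) -> None:
        # Named broadcast channels, each with a list of subscribed handlers.
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        # Latest message seen from each sensor (site-wide sensor state).
        self.sensor_state: Dict[str, str] = {}

    def subscribe(self, channel: str, handler: Handler) -> None:
        """Register a bridge callback on a named channel."""
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, sensor: str, message: str) -> Message:
        """Stamp an event, record sensor state, and deliver to subscribers."""
        event: Message = {
            "uuid": str(uuid.uuid4()),                        # unique identifier
            "stamp": datetime.now(timezone.utc).isoformat(),  # time stamp
            "channel": channel,
            "sensor": sensor,
            "message": message,
        }
        self.sensor_state[sensor] = message
        for handler in list(self._subscribers[channel]):
            handler(event)
        return event


if __name__ == "__main__":
    manager = PubSubManager()
    archive: List[Message] = []  # plays the role of an archiving bridge
    manager.subscribe("rawevents", archive.append)
    manager.publish("rawevents", "BedMotionSensor", "ON")
    print(archive[0]["sensor"], manager.sensor_state)
```

Because subscribers are plain callables registered at run time, a new bridge can be attached without modifying or restarting the manager, which is the extensibility property the article emphasizes.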

Figure 2. CASAS smart home in a box: (a) SHiB kit and (b) smart home installation site. The sensors come prelabeled with their intended location.

JULY 2013 63

Table 1. Summary of costs for CASAS smart home in a box (SHiB).

Component                      Unit price   Quantity   Total price
Computer server                $350         1          $350
Infrared motion/light sensor   $85          24         $2,040
Door sensor                    $75          1          $75
Relay                          $75          2          $150
Temperature sensor             $75          2          $150
Total cost                                             $2,765
Note: prices are accurate at time of publication.

prelabeled with the intended location. Additional sensors and controllers can be included as needed. The middleware, database, and application components reside on a small, low-power server with an ITX form factor. Although this layout lets each smart home run independently and locally, smart homes can also securely upload events to be stored in a relational database or in the cloud. Table 1 summarizes the prototype costs of the components that comprise our SHiB design.

USABILITY
Our research group has installed 32 smart home testbeds to date. Many of the corresponding datasets are available on the project webpage at ailab.wsu.edu/casas. A total of 19 datasets represent single-resident sites, four represent sites with two residents, and the rest house larger families or residents with pets.

Figure 2b shows a smart home installation site. The CASAS smart home design minimizes installation costs. We can install a new smart home in approximately two hours and can remove the equipment in 30 minutes, with no changes or damage to the home.

Once the smart home system is installed, the residents must maintain the equipment. The CASAS SHiB includes a software agent that alerts residents if sensor battery levels are getting low or if a sensor suddenly stops reporting events. In practice, this seldom happens because the batteries typically last more than a year.

To test the kit’s usability, we conducted a study in an on-campus three-bedroom apartment. We recruited participants to visit the apartment, one at a time, and install a CASAS smart home. The study included 20 participants (eight men and 12 women) aged 21 to 62 years (a mean age of 33 years), with a variety of backgrounds and technological familiarity.

We gave each participant a written document explaining the smart home parts and installation process and a CASAS smart home kit. All of the participants completed the installation without difficulty. The average installation time was just over one hour. On a scale of 1 (simple) to 10 (impossible), participants rated the installation difficulty as 2.53 (σ = 1.07). They noted that the most difficult issue was determining the optimal placement of sensors.

SYSTEM CAPABILITIES
Ideally, a smart home installation can be accomplished using only a toolkit that works out of the box, with no customization or training. We designed two core software components and two applications to meet this goal.

Activity recognition
Intelligent systems that focus on human needs require information about the human’s activities. At the core of these systems, then, is activity recognition [4,5]. Smart home sensors generate events consisting of a date, time, sensor identifier, and sensor message. Activity recognition maps a sequence of sensor data to a corresponding activity label.

The CASAS activity recognition software, called AR, provides real-time activity labeling as sensor events arrive in a stream. To achieve this functionality, we formulated the learning problem as one of mapping the sequence of the k most recent sensor events to a label indicating the activity corresponding to the most recent event in the sequence. The preceding sensor events define the context for this event. For example, the sequence of sensor events consisting of

    2011-06-15 03:38:23.271939 BedMotionSensor ON
    2011-06-15 03:38:28.212060 BedMotionSensor ON
    2011-06-15 03:38:29.213955 BedMotionSensor ON

could be mapped to a Sleep activity label.

We designed a support vector machine (SVM) method for real-time activity recognition. We have tested other machine learning models as well, including naïve Bayes classifiers, hidden Markov models, and conditional random fields. However, we found that SVMs achieve consistently stronger performance than other approaches. In addition, the model quantifies the degree of fit between the data and the activity label it provides, which facilitates additional capabilities such as anomaly detection.

To provide input to the classifiers, we define features describing a data point i that corresponds to a sequence

Table 2. Activity recognition confusion matrix. Rows are ground truth activity labels; columns are automatically generated activity labels.

Ground truth            Bed-toilet     Cook      Eat    Enter    Leave Personal    Phone    Relax    Sleep     Work  Accuracy
activity label          transition                       home     home  hygiene
Bed-toilet transition       18,288      143      261        0        0   22,233        0        3    5,866       38      0.39
Cook                             3  370,300    1,616       11       11      172        4      140       28    1,917      0.99
Eat                             53   20,528    9,871        4        0       41        1      979      118   27,052      0.17
Enter home                       0      195        0    1,606      107        3        0        4       57      126      0.77
Leave home                       0        5        0       59      316        3        0        0        1        4      0.81
Personal hygiene            15,769      928       81        3        3  295,616        0       77    1,216      921      0.94
Phone                            0       21        2        0        0        4        8       34       73    1,072      0.01
Relax                            6    1,282      322       13        0      178        8    2,030    1,459    2,735      0.25
Sleep                       33,900       66       33        1        0      279        0       60   65,189      306      0.65
Work                            37    2,875   10,544       66       17      489       20      497      237   71,684      0.83
Note: the diagonal entries indicate the activities that were correctly categorized; the last column shows the accuracy for each individual activity.
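The feature vector that the Activity recognition section defines for a window of k sensor events can be sketched in Python as follows. This is an illustrative reading of the description, not the AR software itself: the function names are hypothetical, the four time-of-day bins are assumed to be six-hour blocks starting at midnight, the time span is assumed to be measured in seconds, and the window size k is fixed here even though AR adjusts it dynamically. Training the SVM on these vectors is omitted.

```python
from collections import Counter
from datetime import datetime
from typing import Iterator, List, Sequence, Tuple

# (timestamp, sensor identifier, sensor message)
Event = Tuple[datetime, str, str]


def parse_event(line: str) -> Event:
    """Parse 'YYYY-MM-DD HH:MM:SS.ffffff SensorId Message' into an Event."""
    date, time, sensor, message = line.split()
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return stamp, sensor, message


def time_bin(stamp: datetime) -> int:
    """Map a timestamp to one of four equal-length bins (assumed 6 h each)."""
    return stamp.hour // 6


def window_features(window: Sequence[Event], sensors: Sequence[str]) -> List[float]:
    """Build the fixed-dimensional feature vector for one event window."""
    first, last = window[0][0], window[-1][0]
    counts = Counter(sensor for _, sensor, _ in window)
    return [
        time_bin(first),                  # time-of-day bin of the first event
        time_bin(last),                   # time-of-day bin of the last event
        (last - first).total_seconds(),   # time span of the window
        *[counts[s] for s in sensors],    # number of events per sensor
    ]


def sliding_windows(events: Sequence[Event], k: int) -> Iterator[Sequence[Event]]:
    """Yield each run of k consecutive events; the label belongs to the last one."""
    for end in range(k, len(events) + 1):
        yield events[end - k:end]


if __name__ == "__main__":
    lines = [
        "2011-06-15 03:38:23.271939 BedMotionSensor ON",
        "2011-06-15 03:38:28.212060 BedMotionSensor ON",
        "2011-06-15 03:38:29.213955 BedMotionSensor ON",
    ]
    events = [parse_event(line) for line in lines]
    # "KitchenMotionSensor" is a made-up second sensor to show a zero count.
    print(window_features(events, ["BedMotionSensor", "KitchenMotionSensor"]))
```

Keeping the per-sensor counts in a fixed sensor order is what makes the vector fixed-dimensional regardless of which sensors fire in a given window.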

of sensor events. This fixed-dimensional feature vector x_i includes the time of day for the first and last sensor events (discretized into four equal-length bins), the time span of the k-event sensor window, and the number of events for each sensor within the window.

Each vector x_i is tagged with the label y_i, which corresponds to the activity label associated with the last sensor event in the window. Although we could identify a fixed window size k that works well for a given dataset, this approach requires additional user customization. To increase the approach’s generalizability, AR dynamically adjusts the window size k based on the most likely activities that are being observed and their typical duration.

To evaluate AR’s ability to recognize activities out of the box, we collected sensor data in 18 separate smart apartments, each housing one resident and each using the CASAS smart home kit. We manually annotated one month of data to provide ground truth activity labels. We evaluated performance as the percentage of sensor events that were correctly labeled across all of the apartments, with no additional customized training for each apartment. Table 2 shows the confusion matrix we generated from this experiment.

As the matrix indicates, it is easier to recognize some activities than others. This occurs because some activities, such as cooking, have a fairly unique spatiotemporal signature. Other activities are more challenging because they overlap with other activity classes or because not enough training data is available to learn the model. The weighted average accuracy is 84 percent, which indicates that the models are fairly robust even when they are used out of the box in new, distinct home settings.

Activity discovery
Recognizing activities from streaming data introduces new challenges because AR must process data that does not belong to any of the targeted activity classes. One way to handle unlabeled data is to design an unsupervised learning algorithm to discover activities from unlabeled sensor data. Segmenting unlabeled data into smaller classes improves activity recognition performance because the “other” class is no longer the largest, as frequently happens in activity recognition datasets. Another important reason to discover activity patterns from unlabeled data is to characterize and analyze as much behavioral data as possible, not just predefined activity classes. Researchers must examine and model such unlabeled data to get a complete view of everyday life [6].

Like earlier approaches to sequence mining, our activity discovery algorithm, called AD, searches the space of candidate sensor event sequences ordered by increasing sequence length. Because the space of candidate patterns is exponential in the input data size, we use a greedy search to find the sequence pattern that best compresses the input dataset. During discovery, AD scans the entire dataset to create initial patterns of length one.

After this initial pass, AD extends the patterns discovered in the previous iteration by considering events occurring before and after instances of the previous pattern. AD stores the patterns in a beam-limited open list and value-orders them. Once the search terminates and the activity discovery algorithm reports the best pattern found, AD compresses the sensor event data using the best pattern. The compression procedure replaces all instances of the pattern with single event descriptors representing the pattern definition. AD can then invoke


the activity discovery process on the compressed data to find additional activity patterns.

We evaluate candidate patterns based on their ability to minimize the original dataset’s size when it is compressed using the pattern definition. Because AD’s compression replaces each pattern occurrence with a single event labeled with a pattern identifier, it calculates the description length of a pattern P, given input data D, as DL(P) + DL(D|P), where DL(P) is the description length of the pattern definition, and DL(D|P) is the description length of the dataset compressed using the pattern definition. Because human behavior patterns vary greatly, we employ an edit distance measure to determine if a sensor sequence is sufficiently similar to a pattern to be considered an instance of the pattern. This measure counts the minimum number of add, delete, or transpose operations needed to convert a sensor sequence to one that is equivalent to the pattern definition.

Figure 3 provides a visualization of the three top activity patterns that occur when we apply the AD activity discovery algorithm to our combined dataset. The pattern in Figure 3a shows a sequence consisting of motion in the bedroom, followed by motion in the living room, followed by more motion in the bedroom, around 10:20 p.m. Many of these events occur prior to sleeping and thus can represent a person getting ready for bed.

The pattern shown in Figure 3b consists of a front door closing, followed by a series of kitchen events, and then a living room event, usually in the late morning or midafternoon. This could represent a person’s activities upon returning home, such as putting away groceries or getting a drink.

The pattern shown in Figure 3c consists of a sequence of events occurring in the morning that alternate between the bedroom, a work area, and the living room. This pattern might represent gathering items needed for the resident’s daily routine.

Other patterns represent transitions between activities or activities that are recognizable but do not appear on the predefined activity list, such as spending extended time in a secondary bedroom that is used for guests or crafts. Using the AD activity discovery algorithm to find patterns in unlabeled data that would otherwise be labeled as “other” improves activity recognition performance for our smart home datasets by an average of 10 percent [7].

Activity-aware applications
The world’s population is aging, with the estimated number of people over age 85 expected to triple by 2050 [8]. Instead of deploying healthcare reactively, we need innovative and preventive healthcare methods that can be automated and deployed within an individual’s own home.

We installed 20 smart homes at an assisted care facility where residents’ average age is 85 years. Because the CASAS smart home kit is simple to maintain, we can collect data over multiple years, allowing us to monitor behavioral changes that indicate variations in cognitive or physical health. As Figure 4 shows, the monitored parameters include overall activity level, sleep quality, and time spent on individual activities of interest.

In addition, CASAS can provide users with activity-aware health assistance in the form of prompting them to initiate daily activities such as taking medicine, exercising, or talking to their children. Although many reminder systems exist, few take into account an individual’s behavioral patterns to provide context-aware prompts, despite studies indicating that such prompts offer significant advantages over traditional time-based prompts [9].

In the CASAS software, a machine learning algorithm is trained to identify when an individual performs an activity as a function of wall-clock time such as “pick up grandchildren at 2:00 p.m.” and as a function of other activity occurrences such as “take medicine with breakfast.”

An additional application is supporting energy-efficient behavior in the home. Over the past 40 years, energy consumption has increased at a higher rate than population growth, and buildings are now responsible for 40 percent of total energy usage [10]. By identifying activities occurring in the home and concurrently monitoring whole-home

Figure 3. Visualization of discovered patterns: (a) sequence of motions that might represent a person getting ready for bed; (b) sequence of motions that could indicate a person returning home; and (c) sequence of motions that might represent a person getting ready to begin a daily routine. Sensor key: motion, motion (area), door, temperature, light, water and burner, item.
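The compression-based scoring that the Activity discovery section describes, DL(P) + DL(D|P), can be sketched in Python as below. This is a deliberately simplified illustration and not the AD algorithm: description length is approximated by a symbol count times a uniform bits-per-symbol cost, only exact pattern matches are replaced (the edit distance matching is omitted), and candidates are enumerated exhaustively up to a short length instead of using AD's beam-limited greedy search. All function names and the `<P>` token are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Pattern = Tuple[str, ...]


def compress(data: Sequence[str], pattern: Pattern, token: str = "<P>") -> List[str]:
    """Replace each non-overlapping exact occurrence of pattern with one token."""
    out: List[str] = []
    i, n = 0, len(pattern)
    while i < len(data):
        if n > 0 and tuple(data[i:i + n]) == pattern:
            out.append(token)
            i += n
        else:
            out.append(data[i])
            i += 1
    return out


def description_length(data: Sequence[str], pattern: Pattern) -> float:
    """Approximate DL(P) + DL(D|P) in bits under a uniform symbol code."""
    # Alphabet is the set of sensor symbols plus one pattern token (assumption).
    bits_per_symbol = math.log2(len(set(data)) + 1)
    return (len(pattern) + len(compress(data, pattern))) * bits_per_symbol


def best_pattern(data: Sequence[str], max_len: int = 3) -> Optional[Pattern]:
    """Return the candidate (length 2..max_len) with the smallest description length."""
    candidates = {
        tuple(data[i:i + n])
        for n in range(2, max_len + 1)
        for i in range(len(data) - n + 1)
    }
    if not candidates:
        return None
    # Sorting first makes tie-breaking deterministic.
    return min(sorted(candidates), key=lambda p: description_length(data, p))


if __name__ == "__main__":
    # Made-up sensor sequence loosely modeled on the "returning home" pattern.
    events = ["FrontDoor", "Kitchen", "LivingRoom"] * 3 + ["Bedroom"]
    pattern = best_pattern(events)
    print(pattern, compress(events, pattern))
```

After the best pattern is substituted, the same search can be run again on the compressed sequence, which corresponds to the iterative discovery step described in the text.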

66 computer

Authorized licensed use limited to: Eskisehir Teknik Universitesi. Downloaded on August 29,2021 at 20:35:39 UTC from IEEE Xplore. Restrictions apply.
Figure 4. Activity trends for a smart-home resident. The graphs include a plot of number of sensor events over the time period
(top left); daily indicators for activity level, sleep quality, and time out of the home (top middle); comparison of socialization
and sleep quality parameters for a time period (top right); occurrences of tracked activities (bottom left); time spent in
different areas of the home (bottom middle); and health trend over a time period calculated as a function of activity level,
sleep quality, and socialization (bottom right).

Figure 5. Snapshot of the CASAS activity visualizer (screen title: “CASAviz: Web-based Visualization of Behavior Patterns”). The visualizer renders sensor events on a computer or mobile device while plotting usage of resources such as electricity.


Figure 6. Relative activity level (a) as a function of the hour of the day and (b) as a function of the activity class.
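A per-hour activity profile like the one plotted in Figure 6a could be computed along the following lines. The article does not define how "relative activity level" is normalized, so this Python sketch makes an explicit assumption: sensor events are counted per hour of day and divided by the count of the busiest hour, so the peak hour has value 1.0. The function name is hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def relative_activity_by_hour(timestamps: Iterable[datetime]) -> Dict[int, float]:
    """Count events per hour of day and scale so the busiest hour equals 1.0."""
    counts = Counter(stamp.hour for stamp in timestamps)
    peak = max(counts.values(), default=0)
    return {
        hour: (counts[hour] / peak if peak else 0.0)
        for hour in range(24)
    }


if __name__ == "__main__":
    # Made-up event times for illustration only.
    stamps = [
        datetime(2011, 6, 15, 9, 5),
        datetime(2011, 6, 15, 9, 40),
        datetime(2011, 6, 15, 14, 10),
    ]
    profile = relative_activity_by_hour(stamps)
    print({hour: level for hour, level in profile.items() if level > 0})
```

Computing one such profile per home and then comparing them across homes is one way to examine the population-level variation that the POPULATION-WIDE FINDINGS section discusses.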

energy usage, we can predict an activity’s energy consumption. In addition to providing this information to the home’s resident, as Figure 5 shows, the smart home can promote energy-efficient behavior [11] and automate control of selected devices to support more energy-efficient activities.

POPULATION-WIDE FINDINGS
One type of analysis not found in the literature is a population-wide evaluation of resident behavior using smart home data. Although analyzing behavioral features across a larger demographic could benefit researchers in many fields, gathering data at a significant scale has not yet been a practical goal. However, researchers can use CASAS to investigate questions that apply to demographic groups, families, and communities.

As a first step, we consider behavioral properties for the CASAS datasets we have collected. In particular, we want to identify how activity levels vary throughout the day for an entire cohort. We also want to determine how individuals spend their home time in terms of individual activities and how consistent the functions are across the group.

Figure 6 shows the results of these two analyses for the 18 smart apartments included in our study. As Figure 6a indicates, a clear pattern exists for the entire group, with low activity levels early in the day that then increase, peaking at midmorning, midafternoon, and early evening. The exact activity levels vary across the population. This variance might be due either to mobility differences or to sensor granularity within the home.

In contrast, the variance across the population for time devoted to various activities is much smaller. As Figure 6b shows, the most time is dedicated to sleep, while other activities, such as taking medicine (which is typically quick) and cleaning the home (which might not happen as often as other activities), receive less time. Larger variances exist for the enter activity (which considers time spent outside the home) and for bed-toilet transitions, which vary dramatically by age, health, and sleep quality.

Conducting such large-scale analyses will provide a valuable tool for understanding behavior that is central to many research fields, including sociology, psychology, and technology development.

As a next step in our work, we plan to evaluate the ease with which we can incorporate additional sensor modalities such as radio frequency identification and smartphones into the CASAS architecture and to design applications that more extensively utilize device controllers. We also anticipate expanding the data collection to include a greater diversity of resident demographics so that we can perform longitudinal studies. Finally, our future work will focus on designing home automation strategies that provide safe and energy-efficient support of a resident’s daily activities.

Acknowledgments
We thank Jim Kusznir, Allan Drassal, Leah Zulas, and all the members of the CASAS team for their contributions to this work. This material is based on work supported by the US National Science Foundation under grant number 0852172, the Life Sciences Discovery Fund, and National Institute of Biomedical Imaging and Bioengineering (NIBIB) grant number R01EB009675.

References
1. P. Hewitt, “Speech by the Rt Hon Patricia Hewitt MP, Secretary of State for Health, 16 May 2007: Long-term Conditions Alliance Annual Conference,” 2007; http://webarchive.nationalarchives.gov.uk/+/www.dh.gov.uk/en/MediaCentre/Speeches/DH_074812.

2. S. Helal et al., “The Gator Tech Smart House: A Programmable Pervasive Space,” Computer, Mar. 2005, pp. 50-60.
3. B. Logan et al., “A Long-Term Evaluation of Sensing Modalities for Activity Recognition,” Proc. 9th Int’l Conf. Ubiquitous Computing (UbiComp 07), Springer, 2007, pp. 483-500.
4. D.H. Hu, V.W. Zheng, and Q. Yang, “Cross-Domain Activity Recognition Via Transfer Learning,” Pervasive and Mobile Computing, vol. 7, no. 3, 2011, pp. 344-358.
5. T.L.M. van Kasteren, G. Englebienne, and B.J.A. Kröse, “Hierarchical Activity Recognition Using Automatically Clustered Actions,” Proc. 2nd Int’l Conf. Ambient Intelligence (AmI 11), Springer, 2011, pp. 82-91.
6. T. Gu et al., “An Unsupervised Approach to Activity Recognition and Segmentation Based on Object-Use Fingerprints,” Data & Knowledge Eng., vol. 69, no. 6, 2010, pp. 533-544.
7. D. Cook, N. Krishnan, and P. Rashidi, “Activity Discovery and Activity Recognition: A New Partnership,” to appear in IEEE Trans. Systems, Man, and Cybernetics, Part B, 2013; http://ee.wsu.edu/~cook/pubs/smc12.pdf.
8. G.K. Vincent and V.A. Velkoff, “The Next Four Decades: The Older Population in the United States: 2010 to 2050,” US Census Bureau, 2010; www.census.gov/prod/2010pubs/p25-1138.pdf.
9. P. Kaushik, S.S. Intille, and K. Larson, “User-Adaptive Reminders for Home-Based Medical Tasks: A Case Study,” Methods of Information in Medicine, vol. 47, no. 3, 2008, pp. 203-207.
10. L. Pérez-Lombard, J. Ortiz, and C. Pout, “A Review of Building Energy Consumption Information,” Energy and Buildings, vol. 40, no. 3, 2008, pp. 394-398.
11. A. Faruqui, S. Sergici, and A. Sharif, “The Impact of Informational Feedback on Energy Consumption: A Survey of the Experimental Evidence,” Energy, vol. 35, no. 4, 2010, pp. 1598-1608.

Diane J. Cook is the Huie-Rogers Chair Professor in the School of Electrical Engineering and Computer Science at Washington State University. Her research interests include artificial intelligence, machine learning, graph-based relational data mining, smart environments, and robotics. Cook received a PhD in computer science from the University of Illinois. She is an IEEE Fellow. Contact her at cook@eecs.wsu.edu.

Aaron S. Crandall is an assistant research professor in the School of Electrical Engineering and Computer Science at Washington State University. His research interests include the application of artificial intelligence, human factors, and engineering principles to building better smart home systems. Crandall received a PhD in computer science from Washington State University. He is a member of IEEE and ACM. Contact him at acrandal@eecs.wsu.edu.

Brian L. Thomas is a PhD student and IGERT Fellow in the School of Electrical Engineering and Computer Science at Washington State University. His research interests include artificial intelligence, home automation, and computer security. Thomas received a BS in computer science from Washington State University. Contact him at bthomas@eecs.wsu.edu.

Narayanan C. Krishnan is an assistant research professor in the School of Electrical Engineering and Computer Science at Washington State University. His research interests include activity recognition, pervasive computing, pattern recognition, and machine learning for pervasive computing applications. Krishnan received a PhD in computer science and engineering from Arizona State University. Contact him at ckn@eecs.wsu.edu.

Selected CS articles and columns are available for free at http://ComputingNow.computer.org.
