
Voice recognition based wireless home automation system

Article · May 2011


DOI: 10.1109/ICOM.2011.5937116



2011 4th International Conference on Mechatronics (ICOM), 17-19 May 2011, Kuala Lumpur, Malaysia

Voice Recognition Based Wireless Home Automation System

Humaid AlShu’eili, Gourab Sen Gupta, Subhas Mukhopadhyay


School of Engineering and Advanced Technology
Massey University, Turitea Campus, Palmerston North, New Zealand
humaid.shueili@gmail.com, g.sengupta@massey.ac.nz, S.C.Mukhopadhyay@massey.ac.nz

978-1-61284-437-4/11/$26.00 ©2011 IEEE

Abstract— The home automation industry is growing rapidly, fuelled by the need to provide supporting systems for the elderly and the disabled, especially those who live alone. Coupled with this, the world population is known to be ageing. Home automation systems must comply with household standards and be convenient to use. This paper details the overall design of a wireless home automation system (WHAS) which has been built and implemented. The automation centres on recognition of voice commands and uses low-power RF ZigBee wireless communication modules which are relatively cheap. The home automation system is intended to control all lights and electrical appliances in a home or office using voice commands. The system has been tested and verified. The verification tests included a voice recognition response test, an indoor ZigBee communication test, and compression and decompression tests of DPCM (Differential Pulse Code Modulation) speech signals. The tests involved a mix of 35 male and female subjects with different English accents. 35 different voice commands were sent by each person. Thus the test involved sending a total of 1225 commands, and 79.8% of these commands were recognised correctly.

Keywords— Home automation, ZigBee transceivers, voice streaming, ADC, Differential Pulse Code Modulation (DPCM), voice recognition.

I. INTRODUCTION
The demography of the world population shows that the elderly population worldwide is increasing rapidly as a result of the increase in average life expectancy [1]. Caring for and supporting this growing population is a concern for governments and nations around the globe [2]. Home automation is one of the major growing industries that can change the way people live. Some home automation systems target those seeking luxury and sophisticated home automation platforms; others target those with special needs like the elderly and the disabled. The aim of the reported Wireless Home Automation System (WHAS) is to provide those with special needs with a system that can respond to voice commands and control the on/off status of electrical devices, such as lamps, fans, television etc., in the home. The system should be reasonably cheap, easy to configure, and easy to run.

There have been several commercial and research projects on smart homes and voice recognition systems. Figure 1 shows an integrated platform for home security, monitoring and automation (SMA) from uControl [3]. The system is a 7-inch touch screen that can be wirelessly connected to security alarms and other home appliances. Home automation through this system requires holding and interacting with a large panel, which constrains the physical movements of the user [4].

Figure 1: uControl Home Security, Monitoring and Automation (SMA) [3].

Another popular commercially available system for home automation is from Home Automated Living (HAL) [5]. HAL software taps the power of an existing PC to control the home. It provides a speech command interface. A big advantage of this system is that it can send commands all over the house using the existing highway of electrical wires inside the home’s walls. No new wires means HAL is easy and inexpensive to install. However, most of the products sold in the market are expensive and often require significant modification of the home.

The rest of the paper is organised as follows: Section II provides a system overview. The hardware design is detailed in Section III while the software design is detailed in Section IV. The experimental results are discussed in Section V. The paper concludes by looking at the future research and development work required to make the system more versatile.

II. SYSTEM OVERVIEW
The Wireless Home Automation System (WHAS) is an integrated system that provides elderly and disabled people with an easy-to-use home automation system that can be fully operated by speech commands. The system is constructed in a way that is easy to install, configure, run, and maintain. The functional blocks of the overall system are shown in Figure 3.

Figure 3: Functional block diagram of the Wireless Home Automation System (WHAS). Legend: A: Analogue, D: Digital

The system consists of three modules:
• Handheld Microphone Module, which incorporates a microphone with an RF module (ZigBee protocol).
• Central Controller Module (PC based).
• Appliance Control Modules.

Figure 2 illustrates the sequence of activities in the WHAS. The voice is captured using a microphone, sampled, filtered and converted to digital data using an analogue-to-digital converter. The data is then compressed and sent serially as packets of binary data. At the receiving end (Central Controller Module), binary data are converted to analogue, filtered and passed to the computer through the sound card. A Visual Basic application program, running on the PC, uses the Microsoft Speech API library for the voice recognition. Upon recognition of the commands, control characters are sent wirelessly to the specified appliance address. Consequently, appliances can be turned ON or OFF depending on the control characters received.

Figure 2: Sequence of activities in the Wireless Home Automation System

III. HARDWARE DESIGN
In this section we present the hardware descriptions of the three modules that constitute the WHAS.
A. Handheld Microphone Module (MM)
The components of the microphone module are shown in Figure 4. The system captures human voice using a sampling rate ($f_s$) of 8 kHz. It is known that the highest frequency component of the human voice is 20 kHz; however, the most significant part of the information is encoded in frequencies between 6 Hz and 3.5 kHz [6]. To meet the Nyquist sampling criterion, an anti-aliasing filter is used to block all the frequencies above the Nyquist frequency ($F_n$).
$f_s = 2F_n$ (1)
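As a worked instance of equation (1), the figures quoted above can be substituted directly. The roughly 0.5 kHz margin that results is simple arithmetic on those figures, not a filter specification given in the paper.

```latex
F_n = \frac{f_s}{2} = \frac{8\ \text{kHz}}{2} = 4\ \text{kHz},
\qquad
F_n - 3.5\ \text{kHz} = 0.5\ \text{kHz}
```

The band carrying most of the speech information therefore sits just below the Nyquist frequency, which is why an anti-aliasing filter is needed ahead of the ADC.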

Figure 4: Block diagram of the handheld Microphone Module.

The incoming speech wave goes through a low pass filter (Figure 5). A 3-pole Butterworth low pass filter is used as an anti-aliasing filter [7]. The signal is then amplified in order to utilise the full range of the ADC. A voltage divider and a DC blocking capacitor provide a voltage translation from the filters to the ADC. In the microcontroller, data is first converted to digital format using the in-built ADC, and then compressed using the Differential Pulse Code Modulation (DPCM) algorithm. The data is compressed from 12 bits to 6 bits. Data are sent serially from the microcontroller to the ZigBee RF module at a baud rate of 115200 bits/s. This is the maximum configurable baud rate provided by ZigBee [8].

Figure 5: Portable microphone circuit.

B. Central Controller Module
The functional blocks of the central controller module are shown in Figure 6. At the central controller module (coordinator), when data are received, the received bytes are decompressed using the DPCM algorithm [9]. Decompressed data is assigned to the digital-to-analogue converter (DAC). The analogue output of the DAC is filtered and fed to the computer as an analogue signal through the sound card of the PC. The filter and amplifier circuit is shown in Figure 7.

Figure 6: Block diagram of the Central Controller Module.

Figure 7: Filtering and amplification circuit of the received audio.

C. Appliance Control Module
Once the speech commands are recognised, control characters are sent to the specified appliance address through the ZigBee communication protocol. Each appliance that has to be controlled has a relay controlling circuit, shown in Figure 8.

Figure 8: Circuit schematic for appliance control module

IV. SOFTWARE DESIGN
Software design includes ADC sampling and compression/decompression algorithms, transmission and receiving, and voice recognition.
A. ADC sampling and data compression / decompression
The portable microphone module implements a DPCM compression scheme. This compression algorithm is inherently lossy because the difference between each sample and its prediction is quantised to a limited number of levels. The algorithm compresses each ADC sample from 12 bits of data down to a 6-bit code. This code represents the difference between the actual sample and the predicted value of the sample. The predicted sample is obtained from the previous iteration result. The difference between the sample and the predicted value is then quantised. The 6-bit codes are then packed into bytes of data in order to send them serially. In order to calculate the new predicted value, the compression algorithm decodes the difference and adds it to the current predicted value.
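The encoder loop described above, together with the matching decoder, can be sketched in Python as follows. This is a minimal illustration and not the authors' firmware: the uniform quantiser step `STEP`, the mid-scale starting predictor `START`, and the MSB-first packing in `pack_codes` are all assumptions, since the paper does not give the quantisation table or the byte layout.

```python
import math

SAMPLE_MAX = 4095              # full scale of a 12-bit ADC sample
CODE_BITS = 6                  # each sample is sent as a 6-bit code
OFFSET = 1 << (CODE_BITS - 1)  # 32: shifts signed levels -32..31 into 0..63
STEP = 16                      # ASSUMPTION: uniform quantiser step (not from the paper)
START = 2048                   # ASSUMPTION: both ends start predicting from mid-scale


def _clamp(value, low, high):
    return max(low, min(high, value))


def _quantise(diff):
    """Map a signed difference to a 6-bit code in 0..63."""
    level = _clamp(int(round(diff / STEP)), -OFFSET, OFFSET - 1)
    return level + OFFSET


def _dequantise(code):
    """Recover the quantised difference represented by a 6-bit code."""
    return (code - OFFSET) * STEP


def dpcm_encode(samples, start=START):
    """Encode 12-bit samples as 6-bit codes of (sample - predicted)."""
    predicted = start
    codes = []
    for sample in samples:
        code = _quantise(sample - predicted)
        codes.append(code)
        # The encoder decodes its own code so its predictor tracks the decoder's.
        predicted = _clamp(predicted + _dequantise(code), 0, SAMPLE_MAX)
    return codes


def dpcm_decode(codes, start=START):
    """Rebuild approximate 12-bit samples by accumulating decoded differences."""
    predicted = start
    samples = []
    for code in codes:
        predicted = _clamp(predicted + _dequantise(code), 0, SAMPLE_MAX)
        samples.append(predicted)
    return samples


def pack_codes(codes):
    """Pack 6-bit codes MSB-first into bytes (4 codes -> 3 bytes), zero-padding the tail."""
    acc, nbits, out = 0, 0, bytearray()
    for code in codes:
        acc = (acc << CODE_BITS) | (code & 0x3F)
        nbits += CODE_BITS
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


if __name__ == "__main__":
    # A 200 Hz test tone sampled at 8 kHz, centred on mid-scale.
    tone = [2048 + int(1000 * math.sin(2 * math.pi * 200 * n / 8000)) for n in range(80)]
    codes = dpcm_encode(tone)
    restored = dpcm_decode(codes)
    worst = max(abs(a - b) for a, b in zip(tone, restored))
    print(len(pack_codes(codes)), "bytes packed; worst reconstruction error:", worst)
```

Because the encoder updates its predictor from the decoded difference rather than from the true sample, quantisation error does not accumulate between the two ends. A real design would more likely use a non-uniform table tuned to speech, so the step size here should be read as a placeholder.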

Figure 9: DPCM Compression algorithm.

Figure 10: DPCM Decompression Algorithm.

Figure 9 shows the DPCM compression algorithm. At the receiving end, data are decompressed to an approximation of the original form using the DPCM decompression algorithm. Figure 10 shows the decoding algorithm, which matches the received code with the quantised difference and adds this difference to the predictor [10].
B. Voice Recognition Application
The voice recognition application uses the Microsoft Speech API. The application compares incoming speech with a predefined dictionary. The Microsoft Speech API run time environment relies on two main engines: Automatic Speech Recognition (ASR engine) and Text To Speech (TTS engine), as shown in Figure 11. ASR implements the Fast Fourier Transform (FFT) to compute the spectrum of the fingerprint data [4]. Comparing the fingerprint with an existing database returns a string of the text being spoken. This string is represented by a control character that gets sent to the corresponding appliance’s address.

Figure 11: Voice recognition application hierarchy.

The designed graphical user interface (GUI) lets the user select the desired serial communication port and provides a record of all the commands that have been recognised and executed. The application implements the hierarchy shown in Figure 11 and the flow chart shown in Figure 12. When designing the GUI, making it user friendly was a high priority since the target users need to avoid any possible complications in the system. A screen shot of the GUI is shown in Figure 13.

Figure 12: Flow chart of the voice recognition application.

Control characters corresponding to the recognised commands are then sent serially from the central controller module to the appliance control modules that are connected to the home appliances.
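The step from a recognised text string to a control character sent to an appliance address can be illustrated with a small Python sketch. The paper's application is written in Visual Basic on top of the Microsoft Speech API, and its command table is not published, so the phrases, addresses and control characters in `COMMANDS` below are hypothetical, as is the `send` callback that stands in for the serial link to the ZigBee coordinator.

```python
# HYPOTHETICAL command table: phrase -> (appliance address, control character).
# None of these values come from the paper; they only illustrate the mapping.
COMMANDS = {
    "lamp on": (0x01, b"A"),
    "lamp off": (0x01, b"a"),
    "fan on": (0x02, b"B"),
    "fan off": (0x02, b"b"),
}


def dispatch(recognised_text, send, log):
    """Send the control character for a recognised phrase.

    `send(address, control_char)` is a caller-supplied function (an assumption
    standing in for the serial write to the coordinator). `log` is a list that
    records executed commands, mirroring the record kept by the GUI.
    Returns True if a command was sent, False if the phrase was ignored.
    """
    # Normalise case and whitespace before looking the phrase up.
    phrase = " ".join(recognised_text.lower().split())
    entry = COMMANDS.get(phrase)
    if entry is None:
        # Unknown phrases are dropped and nothing is transmitted.
        return False
    address, control_char = entry
    send(address, control_char)
    log.append(phrase)
    return True


if __name__ == "__main__":
    sent, history = [], []
    dispatch("Lamp ON", lambda addr, ch: sent.append((addr, ch)), history)
    dispatch("open door", lambda addr, ch: sent.append((addr, ch)), history)
    print(sent, history)
```

Keeping the table as plain data makes it easy to add appliances without touching the dispatch logic, and returning a flag gives the caller a place to hook in a spoken confirmation later.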
Figure 14: Microphone circuit board with ZigBee module
Figure 13: Voice recognition GUI

Figure 15: Fabricated relay control unit

C. ZigBee RF communication
ZigBee is the communication protocol used in this system. ZigBee offers a maximum data rate of 250 kbps; however, 115200 bps was used for sending and receiving, as this was the highest speed that the UART of the microcontroller could be programmed to operate at.
For each byte transmitted, there is a start bit and a stop bit, so only 8 of every 10 bits on the line carry data. Hence the actual data rate is:
$115200 \times \frac{8}{10} = 92160 \text{ bits/s}$
With 12-bit samples taken at 8 kHz, the amount of data (bits/s) produced by the ADC is:
$8000 \times 12 = 96000 \text{ bits/s}$
Since this exceeds the available data rate, streaming is not possible without compressing the voice data [11]. After compression to 6 bits per sample, the total resultant data rate (bits/s) will be:
$8000 \times 6 = 48000 \text{ bits/s}$
This allows a window for error checking and resending data if necessary.
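The link budget behind this argument can be checked with a few lines of Python. The sketch uses only figures stated in the text (115200 bps UART, one start and one stop bit per byte, 8 kHz sampling, 12-bit samples, 6-bit DPCM codes); the function and variable names are illustrative, and it ignores any ZigBee packet overhead, which the paper does not quantify.

```python
def uart_payload_rate(baud, data_bits=8, frame_bits=10):
    """Useful bits/s on a UART once the start and stop bits are excluded."""
    return baud * data_bits / frame_bits


def stream_rate(sample_rate_hz, bits_per_sample):
    """Bits/s produced by a stream of fixed-width samples."""
    return sample_rate_hz * bits_per_sample


BAUD = 115200          # UART speed used in the system
SAMPLE_RATE_HZ = 8000  # ADC sampling rate

capacity = uart_payload_rate(BAUD)          # payload bits/s the link can carry
raw = stream_rate(SAMPLE_RATE_HZ, 12)       # uncompressed 12-bit samples
compressed = stream_rate(SAMPLE_RATE_HZ, 6) # 6-bit DPCM codes

headroom = capacity - compressed            # bits/s left for checks and resends
utilisation = compressed / capacity         # fraction of the payload rate in use

if __name__ == "__main__":
    print(f"link payload capacity : {capacity:.0f} bits/s")
    print(f"raw ADC stream        : {raw} bits/s (fits: {raw <= capacity})")
    print(f"DPCM stream           : {compressed} bits/s (fits: {compressed <= capacity})")
    print(f"headroom              : {headroom:.0f} bits/s ({utilisation:.1%} utilised)")
```

Under these assumptions the compressed stream uses a little over half of the payload rate, which is the window the text refers to for error checking and retransmission.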

V. EXPERIMENTAL RESULTS AND DISCUSSIONS
The prototype of the system has been fabricated and tested. Figure 14 shows the microphone module. Figure 15 shows the appliance control module.

Figure 16: Results of voice recognition experiments showing percentage of correct recognition for different ethnicity/accent

The graph in Figure 16 and the data in Table I show the response of the speech recognition application to spoken commands. The tests involved 35 subjects; the trials were conducted with people with different English accents. The test subjects were a mix of male and female, and 35 different voice commands were sent by each person. Thus the test involved sending a total of 1225 commands. 79.8% of these commands were recognised correctly. When a command is not recognised correctly, the software ignores the command and does not transmit any signals to the device control modules. The accuracy of the recognition can be affected by background noise, the speed of the speaker, and the clarity of the spoken accent. These factors need to be studied in more detail by conducting more tests. The system was tested in an apartment and performed well up to 40 m. With a clear line-of-sight transmission (such as in a wide open gymnasium) the reception was accurate up to 80 m.

Additional tests are being planned involving a bigger variety of commands.

TABLE I: RESULTS OF VOICE COMMAND RECOGNITION TESTS: PERCENTAGE OF COMMANDS CORRECTLY RECOGNISED

Category   Kiwi   Arab   Filipino   Japan   Thai   African   Pacific Island
Person 1   68     85.7   88.6       70.3    67.7   75.8      96.7
Person 2   86     80     85         77      70     81.8      88
Person 3   57.1   88     80         80      74     85        85
Person 4   90     77     90         67      78.6   82        90
Person 5   85     60     77         90      76     68        82
Average    77.3   78.1   84.1       78.7    73.3   78.5      88.3

VI. CONCLUSIONS AND FUTURE WORK
A home automation system based on voice recognition was built and implemented. The system is targeted at elderly and disabled people. The prototype developed can control electrical devices in a home or office. The system implements Automatic Speech Recognition engines through the Microsoft Speech API. The system implements the wireless network using ZigBee RF modules for their efficiency and low power consumption. Multimedia streaming through the network was implemented with the help of the Differential Pulse Code Modulation (DPCM) compression algorithm, which compresses the speech data to half of its original size. The preliminary test results are promising.

Future work will entail:
• Adding confirmation commands to the voice recognition system.
• Integrating variable control functions to improve the system versatility, such as providing control commands other than ON/OFF commands, for example “Increase Temperature”, “Dim Lights” etc.
• Integration of GSM or mobile server to operate from a distance.
• Design and integration of an online home control panel.

REFERENCES
[1] T. Birtley, (2010) Japan debates care for elderly. [Cited 21/09/2010]. Available: http://www.youtube.com/watch?v=C0UTqfigSec
[2] Population Division, DESA, United Nations. (2009). World Population Ageing: Annual report 2009. [29/07/2010]. Available: http://www.un.org/esa/population/publications/WPA2009/WPA2009_WorkingPaper.pdf
[3] (2010) uControl Home security system website. [Cited 2010 14th Oct]. Available: http://www.itechnews.net/2008/05/20/ucontrol-home-security-system/
[4] R. Gadalla, “Voice Recognition System for Massey University Smarthouse,” M. Eng thesis, Massey University, Auckland, New Zealand, 2006.
[5] (2010) Home Automated Living website. [Cited 2010 14th Oct]. Available: http://www.homeautomatedliving.com/default.htm
[6] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, New Jersey, US: Prentice Hall Inc, 1978.
[7] “XBee-2.5-Manual,” ZigBee RF communication protocol. (2008). Minnetonka: Digi International Inc.
[8] B. Yuksekkaya, A. A. Kayalar, M. B. Tosun, M. K. Ozcan, and A. Z. Alkar, “A GSM, Internet and Speech Controlled Wireless Interactive Home Automation System,” IEEE Transactions on Consumer Electronics, vol. 52, pp. 837-843, August 2006.
[9] F. J. Owens, Signal Processing of Speech, New York, US: McGraw-Hill Inc, 1993.
[10] Voice Recorder Reference Design (AN 278), Silicon Laboratories, 2006.
[11] D. Brunelli, M. Maggiorotti, L. Benini, and F. L. Bellifemine, “Analysis of Audio Streaming Capability of Zigbee Networks,” in EWSN 2008, 2008, LNCS 4913, pp. 189-204.
