IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 8, AUGUST 2016 2679

Low-Power System for Detection of Symptomatic Patterns in Audio Biological Signals
Himanshu S. Markandeya and Kaushik Roy, Fellow, IEEE

Abstract: In this paper, we present a low-power, efficacious, and scalable system for the detection of symptomatic patterns in biological audio signals. The digital audio recordings of various symptoms, such as cough, sneeze, and so on, are spectrally analyzed using a discrete wavelet transform. Subsequently, we use simple mathematical metrics, such as energy, quasi-average, and coastline parameter for various wavelet coefficients of interest depending on the type of pattern to be detected. Furthermore, a mel-frequency cepstrum-based analysis is applied to distinguish between signals, such as cough and sneeze, which have a similar frequency response and, hence, occur in common wavelet coefficients. Algorithm-circuit codesign methodology is utilized in order to optimize the system at algorithm and circuit levels of design abstraction. This helps in implementing a low-power system as well as maintaining the efficacy of detection. The system is scalable in terms of user specificity as well as the type of signal to be analyzed for an audio symptomatic pattern. We utilize multiplierless implementation circuit strategies and the algorithmic modification of mel cepstrum computation to implement a low-power system in the 65-nm bulk Si technology. It is observed that the pattern detection system achieves about 90% correct classification of five types of audio health symptoms. We also scale the supply voltage due to lower frequency of operation and report a total power consumption of ∼184 µW at 700 mV supply.

Index Terms: Low-power circuit, pattern detection, signal processing, VLSI.

Manuscript received December 1, 2014; revised May 25, 2015, August 22, 2015, and November 20, 2015; accepted January 11, 2016. Date of publication February 16, 2016; date of current version July 22, 2016.
The authors are with the Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 479065 USA (e-mail: hmarkand@purdue.edu; kaushik@purdue.edu).
Digital Object Identifier 10.1109/TVLSI.2016.2521869
1063-8210 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

TECHNOLOGY scaling has resulted in the development of novel applications in a wide array of fields. The field of medical systems is no exception to this and has benefitted immensely. In the past decade, rapid advancements in the development of low-power design methodologies have resulted in feasible designs for various wearable and implantable medical systems [1]. Numerous wearable health monitoring systems have been proposed in order to deliver early warning of an impending health condition [2]. These systems monitor various internal as well as external parameters related to the human health, such as temperature, heart rate, and so on. Apart from these parameters, it is well known that acoustic symptoms, such as cough, sneeze, belching, and so on, are early markers of serious health issues, such as influenza, diarrhea, and whooping cough, especially among children [3], [4]. If repetitive occurrence of these symptoms is detected in advance, it is possible for the patient or the healthcare personnel to commence remedial action prior to aggravation of the problem. In the literature, most of the developed systems detect a single acoustic symptom (cough or sneeze) [2], [4]. The Kids Health Monitoring System (KiMS) proposed in [4] uses wearable sensors and acoustic signal processing in order to provide health monitoring in children. Using the neural network-based processing, the KiMS classifies various symptoms and activities and, subsequently, transmits the record to a parent or doctor for further analysis [4]. The use of an artificial neural network ensures a good classification rate. However, it also leads to a higher computational load on the implemented hardware and, hence, higher power consumption. Apart from that, complex training methodology is required in order to train the KiMS system for high efficacy [4]. The high power consumption also implies that the limited energy source, that is the battery, is drained of its energy in a shorter period of time. In the case of wearable products, such draining of battery will lead to functional failures, which are undesirable. Although the power consumption of the system can be reduced by reducing the complexity of computation, this may lead to a reduced efficacy of the system. Reduced efficacy can render the primary function of the product redundant. Hence, there is a need for the algorithm used in the wearable system and its corresponding hardware implementation to be designed in tandem, so that it is possible to maintain a high algorithmic efficacy and acceptable hardware power efficiency. The system should be scalable to detect patterns in a large variety of signals over a wide array of users. Programmability of the system to user desired function is another desirable feature.

In this paper, we have proposed an algorithm and its corresponding circuit to detect symptomatic patterns in human acoustic nonspeech signals. These include audio recordings of cough, sneeze, belch, wheeze, and vomit patterns. These five human nonspeech audio tracks are selected, because they are the most commonly observed signals. They are also known to be symptoms for diseases ranging from influenza, ear infection to serious conditions, such as asthma, bronchitis, stomach flu, and so on. It should be noted that apart from the identified five acoustic symptoms, the proposed system is scalable to other human nonspeech audio as well. In order to correctly classify the type of symptom, the acoustic signal needs to be processed efficiently to cause detection. Complexity of this processing is directly translated into equivalent power consumption of corresponding hardware implemented. In order to design an effective and long-lasting wearable system for symptomatic pattern detection, it is necessary to reduce its power consumption without degrading the efficacy of detection. This puts a stringent design constraint on the power consumed. Therefore, a successful design can be achieved by optimizing algorithmic efficacy and hardware power efficiency during the design
process. Previously, such an approach has been used in the development of implantable systems as well [5], [6]. Using intelligent approximations at the algorithm level and low-power circuit techniques, it was shown that a high efficacy of pattern detection can be achieved while maintaining power efficiency [6].

Our primary contribution, in this paper, is to address two important issues. First, using a single input (human audio recording), multiple symptomatic patterns have been identified with a high efficacy. Second, the implemented hardware has been made scalable over a variety of signals and power efficient. This methodology can be extended to efficaciously detect other symptomatic patterns using power-efficient circuits. We have used the wavelet transform as a mathematical tool to resolve the acoustic signals into their spectral components. Each component can be subsequently identified for a specific pattern. In order to reduce the effect of sporadic spikes and noise in the signal, we have utilized the statistical nature of mathematical metrics, such as average, coastline (CL), and so on. Using such methods, the dominant patterns can be detected and classified efficaciously. Furthermore, we have used processing based on mel cepstrum computation to detect signals, which have indistinguishable frequency spectrum [7]. Mel cepstrum calculation is based on the principle by which a human ear can distinguish between audio patterns and is well known for its use in speech recognition [7]. Using low-power design methodologies, such as multiplierless filters, the power constraints on the design are met. This enhances the feasibility of integrating the system into a wearable product. Design parameter choices have been made at algorithm and circuit levels of abstraction in order to achieve power efficiency in the implementation. In this paper, the algorithm-circuit codesign approach is successfully utilized to not only make the system scalable in terms of signals analyzed but also programmable to patient-specific needs.

The algorithm-circuit codesign points to the methodology, where we make calculated and intelligent approximations and modifications to the mathematical model and the circuit topology to achieve the goal of low power and high efficacy. For instance, ideally, a wavelet transform would be sufficient to decompose a signal into its component frequencies. However, to do that at a lower hardware cost, we make modifications to filter coefficients (algorithm modification) and filter circuit topology (circuit modification) to achieve similar functionality without any degradation in quality at a much lower hardware cost (power).

Section II discusses the existing detection systems in the literature and justifies the signal processing choices used in the proposed system. In Section III, we present the algorithm methodology in detail. The hardware implementation and the low-power circuit level techniques used are explained in Section IV. We present the results in Section V. Finally, the conclusion is drawn in Section VI.

II. BACKGROUND

The main goal of the system proposed in this paper is to detect the symptomatic patterns using acoustic nonspeech human signals. In the literature, several algorithms have been presented in order to process human acoustic signals. These algorithms are used primarily for speech recognition or for the classification of limited patterns viz., cough, sneeze, and so on [8], [9]. These algorithms use signal processing techniques, which are expensive in terms of hardware power consumption. As mentioned in Section I, in order to design a system for a wearable product, it is necessary to optimize power consumption with functional efficacy. Hence, optimal signal processing techniques need to be selected depending on signal analyzed, hardware cost, and computational efficiency. Several mathematical tools, such as fast Fourier transform (FFT), short-time Fourier transform (STFT), wavelet transform, and so on, can be used to spectrally analyze the acoustic signal. Another technique that can be used to analyze audio signals is the S-transform [10]. This is an extension of continuous wavelet transform, where the STFT is calculated over a window of varying width. This gives a better resolution of the signal. However, in this paper, we are proposing a universal system using the algorithm-circuit codesign approach for detecting multiple symptomatic patterns from a single input acoustic signal. These symptoms have specific frequency composition corresponding to each pattern. The audio signal is streamed, and it is essential to preserve the spectral as well as the temporal information in the signal. This can be achieved using wavelet transform. The frequency-resolved signal is subsequently processed using mathematical metrics and mel cepstrum-based analysis in order to cause detection. In this section, we discuss the basic principles of these three methods and justify their usage. These techniques are then utilized in the algorithm methodology, as will be described in Section III.

Discrete wavelet transform (DWT) is a common signal processing tool used for multiresolution analysis of various types of signals. DWT decomposes the input signal into narrow bands of its component frequencies. This decomposition is represented in the form of approximate and detail coefficients. While the approximate coefficients correspond to the low-frequency/coarser variations of the signal, the detail coefficients are the high frequency/finer variations. DWT uses various types of wavelet and scaling function as the basis for signal decomposition. Choosing an appropriate wavelet function is essential for an accurate resolution of the signal. Due to its multiresolution property, DWT helps in preserving both spectral and temporal information in the signal unlike FFT. It also has a better resolution as compared with STFT due to dyadic scaling [11]. Traditionally, wavelet transform has been used extensively in image processing, especially for applications requiring data compression. In recent times, it has also been used in analyzing biological signals in the fields of bioinformatics and neuroscience [5], [12]. Apart from the above-mentioned advantages of using DWT, the hardware implementation of DWT using techniques, such as Mallat’s algorithm or lifting facilitates low-power design. The proposed system has to distinguish and segregate the five acoustic signals efficiently. Due to this requirement and the advantages over FFT/STFT, we select DWT for the spectral resolution of input signals. It is observed that the five types of symptomatic patterns being detected occur in frequency band specific DWT coefficients. The multiresolution property of DWT also filters out the unwanted noise from the signal of interest
MARKANDEYA AND ROY: LOW-POWER SYSTEM FOR DETECTION OF SYMPTOMATIC PATTERNS 2681

effectively. Mallat’s algorithm, used to implement the
wavelet transform, uses lower order filters in combination with
subsampling operation to resolve the signal into very narrow
frequency bands [11]. This is advantageous in implementing
the hardware. However, the wavelet resolved signal needs to
be processed further in order to remove the sporadic spikes
and noise, which might trigger a false detection.
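The filter-and-subsample cascade described above can be illustrated with a minimal sketch. It assumes two-tap Haar filters purely for brevity, which is a substitution and not the Daubechies wavelet used in this paper, and the function names are hypothetical.

```python
import math

# Haar analysis filters: an assumption for brevity, not the paper's wavelet.
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # low-pass (approximation)
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # high-pass (detail)


def filter_and_downsample(x, taps):
    """FIR-filter x with taps and keep every second output sample."""
    out = []
    for n in range(len(taps) - 1, len(x), 2):
        out.append(sum(taps[k] * x[n - k] for k in range(len(taps))))
    return out


def mallat_dwt(x, levels):
    """Return ([D1, ..., D_levels], A_levels) for the input samples x."""
    details = []
    approx = list(x)
    for _ in range(levels):
        # The high-pass branch yields the detail coefficient of this level;
        # the low-pass branch feeds the next, coarser stage.
        details.append(filter_and_downsample(approx, G))
        approx = filter_and_downsample(approx, H)
    return details, approx
```

Each stage halves the sample rate, so deeper detail coefficients cover progressively narrower and lower frequency bands while using the same short filters.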
Another class of signal processing used in this paper is based on simple mathematical metrics. It was shown previously that these mathematical metrics can be used in detecting patterns in neural signals [6]. The acoustic signals in the form of wavelet coefficients have certain patterns corresponding to the symptom to be detected. We know that it is possible to represent a signal in terms of various types of absolute and statistical parameters, such as average, variance, and so on. These statistical parameters computed over a window of data in the wavelet coefficient are defined as the mathematical metrics. In this paper, we use the energy, trace length [coastline (CL)], and quasi-average (QA) in order to analyze the DWT processed data. The reason for this selection is that each symptomatic pattern to be detected has a specific characteristic in the wavelet coefficient. We select the optimal metric depending on this characteristic of the pattern described in Section III. Although only three types of metrics are used in this paper, scalability of the system enables the usage of other mathematical metrics as well. However, these metrics are significantly susceptible to noise and may result in false detections. These metrics are discussed in detail in Section III.

Fig. 1. Block diagram of mel cepstrum coefficient algorithm and mel filter bank spectrum plot.

TABLE I
MAPPING OF DWT COEFFICIENTS TO FREQUENCY
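Table I maps each DWT detail coefficient to a frequency band. A minimal sketch of such a mapping follows, assuming ideal dyadic band splitting and the 11.025 kHz sampling frequency stated in Section III-A; the function name is hypothetical and the values are computed here, not copied from the table.

```python
def dwt_detail_bands(fs_hz, levels):
    """Return {level: (low_hz, high_hz)} for detail coefficients D1..D_levels."""
    bands = {}
    high = fs_hz / 2.0          # the Nyquist frequency bounds D1 from above
    for level in range(1, levels + 1):
        low = high / 2.0        # each stage halves the remaining band
        bands[level] = (low, high)
        high = low
    return bands


# Sampling frequency taken from Section III-A (assumed ideal band edges).
bands = dwt_detail_bands(11025.0, 6)
```

Under these assumptions the D3 band comes out at roughly 689 to 1378 Hz, which agrees with the range quoted for D3 in Section III-F.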
Some of the symptomatic patterns are resolved into the same DWT coefficient. Such patterns cannot be distinguished using these mathematical metrics. For such signals, we propose the mel frequency cepstrum-based analysis.

Mel frequency cepstrum coefficients (MFCCs) are the set of coefficients extracted from an audio signal. They are extensively used in speech or speaker recognition [13], [14]. The principle of MFCC is based on the fact that the sounds generated by the human vocal tract are modulated by the shape of the tract, including the tongue and teeth. This shape is manifested in the form of an envelope of the power spectrum over short periods of time. MFCCs have been shown to accurately represent this envelope [7]. Fig. 1 shows the basic algorithm for computing MFCC. The primary component is the mel filter bank, a set of overlapping bandpass filters which are uniformly spread around the center frequency on the mel scale.

The mel scale describes the human auditory system on a linear scale. The conversion between mel scale (m in mels) and frequency scale (f in Hz) is computed using (1) and (2).

$$M(f) = 1125 \ln\left(1 + \frac{f}{700}\right) \tag{1}$$

$$M^{-1}(m) = 700\left(e^{m/1125} - 1\right) \tag{2}$$

The mel filters with a triangular frequency response are also shown in Fig. 1. Based on (1) and (2), the uniformly spread center frequencies of the mel filters transform to a logarithmic spacing on the frequency scale. This transformation is coherent with the fact that the human cochlea cannot discern the difference between two closely spaced frequencies, especially at higher frequencies. The mel scale assumes an almost linear transfer of power for frequencies under 1 kHz and a logarithmic dependence for higher frequencies, thereby mimicking the human auditory system. The spectral energy in each filter in the mel filter bank is then given to the logarithm block for a nonlinear normalization. Since the filters are overlapped, there is significant correlation between the spectral filter energies. The discrete cosine transform (DCT) block helps in decorrelating the energies in these overlapping bandpass filters. The output coefficients of the DCT block are the mel cepstrum coefficients. The term cepstrum denotes that the operation is the calculation of the spectrum of a spectrum. In general, 26–40 filters are used in the mel filter bank, generating as many mel coefficients. Depending on the complexity of the speech pattern, the specific mel coefficients can be analyzed. Due to its ability to distinguish between two shape modulated signals in the same spectrum, it is an ideal candidate to be used in our proposed system for signals that cannot be resolved using the DWT. It will be shown in Sections III and IV that due to the indistinguishability of the cough and sneeze signals on the frequency spectrum, the MFCC algorithm with appropriate modifications efficiently segregates these signals. Traditionally, MFCC classifies much more complex and variable speech patterns. The proposed algorithm distinguishes only between the cough and the sneeze. Hence, by applying the algorithm-hardware codesign approach, instead of using 26–40 MFCC, the detection of pattern in this algorithm
can be achieved using the reduced number of MFCC. Hence, redundant computation in the MFCC algorithm is avoided based on this algorithmic modification. This manifests directly into power savings in the corresponding hardware implementation. Section III discusses the algorithm methodology and these modifications in detail.

III. ALGORITHM/METHODOLOGY

In this section, we describe the proposed algorithm and the methodology used to modify the various computational tools in order to make them implementable in low-power hardware. In Section II, we described the basics and justified the basis for selecting specific computational techniques used in developing this algorithm. The application of these computations is dependent on the characteristic property of the symptom to be detected. The algorithm methodology is shown in Fig. 2. We also describe the details along with the mapping of algorithm to specific signals as follows.

Fig. 2. Proposed algorithm/methodology.

A. Step 1: Streaming the Input Data
The input data are the human audio recording of various symptomatic patterns, such as cough, sneeze, belch, wheeze, and vomit. The assumption in this paper is that the input signal will be available to the hardware in 10-bit 2’s complement digital format. This digitized signal is streamed at the input of the algorithm at its sampling frequency (11.025 kHz).

B. Step 2: Spectral Resolution Using DWT
As explained in Section II, initially, the input has to be spectrally resolved using the DWT. Its multiresolution ability to retain both temporal and spectral information justifies it to be the ideal choice for spectral resolution as compared with FFT or STFT. The DWT resolves the symptomatic patterns into narrow frequency bands or wavelet coefficients (Di) (Table I). We have used the Daubechies fourth-order wavelet as the wavelet function for computation of the wavelet transform due to its optimal coarseness and smoothness to truly represent the signals of interest. The order of the selected mother wavelet is an algorithmic design decision, which has a direct impact on the complexity of its hardware implementation. The various values of Di are classified as the coefficients of interest for specific symptomatic patterns. For instance, the acoustic patterns corresponding to wheezing and vomiting are resolved in the D5 and D6 wavelet coefficients, respectively. The pattern consistent with burp/belching is found in multiple coefficients (D4 and D5). The cough and sneeze signals have a common frequency spectrum and are resolved into a single coefficient (D3). Another algorithm level design decision is the approximation of the filter coefficients used in computation of DWT. This has a negligible change in their frequency response. However, it will be shown in Section IV that it helps in optimizing the hardware implementation.

Subsequent to the signal decomposition, the spectral as well as the temporal information of the signal is available for further processing. Although the symptomatic patterns are frequency resolved into separate wavelet coefficients, there are several sporadic spikes in the wavelet processed data, which might trigger false detection. Some of the coefficients contain multiple symptoms too, while other patterns are resolved into multiple coefficients. In order to separate out these patterns further and reduce the noisy spikes to avoid false detections, these coefficients are subjected to various mathematical metric-based computation and MFCC-based computation depending on the type of pattern to be detected. These are described in Sections III-C–III-F.

C. Step 3(a): Energy Parameter for DWT Vomit Pattern
In the DWT resolved signal, it is noticed that the coefficient, which represents the acoustic pattern pertaining to vomiting viz. D6, shows a continuous and significantly large increase in amplitude at the onset of the vomit pattern signal. Such increase in amplitude in a specific frequency band corresponds to an increase in the energy content of the signal. Hence, the energy metric is selected to detect the vomit pattern. The energy of the D6 coefficient is computed over adjacent windows of prefixed size (N). The coefficient is divided into windows, and the energy is then computed over that window using (3), where x is the input data (D6)

$$E_{AVG}[n] = \frac{1}{N} \sum_{i=1}^{N} E(i + (n-1) N) \tag{3}$$

where $E(i) = x(i)^2$.
The thresholded value of the energy parameter eventually
helps in detecting a vomit pattern in the acoustic input signal.
The value of N and threshold is based on the training
data.
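The windowed energy of (3) and its thresholding can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the windows are non-overlapping, and the window size and threshold used in any example are placeholders, since the paper fixes N and the threshold from training data.

```python
def windowed_energy(x, n_window):
    """Average energy of each non-overlapping window of x, following (3)."""
    energies = []
    for start in range(0, len(x) - n_window + 1, n_window):
        window = x[start:start + n_window]
        energies.append(sum(v * v for v in window) / n_window)
    return energies


def detect_by_threshold(values, threshold):
    """Flag every window whose metric exceeds the preset threshold."""
    return [v > threshold for v in values]
```

For example, `detect_by_threshold(windowed_energy(d6, 1024), threshold)` would flag the windows of a D6 sequence `d6` whose average energy exceeds `threshold`, where both `d6` and `threshold` are supplied by the caller.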

D. Step 3(b): Coastline Parameter for Wheeze Pattern
The CL parameter of the signal is the magnitude of the trace length of the signal [6], [15]. The use of CL parameter is found to be advantageous in the case of an audio signal, where there is a consistent repetition of a pattern within a certain window size (N), without any significant change in the amplitude of the signal. In the DWT resolved acoustic signal, we observed that the wheezing signal resolved into the D5 wavelet coefficient was an ideal case for the use of CL parameter metric. It should also be noted that due to lower computational complexity, the CL parameter results in a low-power hardware implementation. Equation (4) represents the computation of the CL parameter

$$CL(k) = \sum_{i=1}^{N} \left| x[i + (k-1) N] - x[i - (k-1) N] \right| \tag{4}$$

where x is the input data and N is the window size for kth window. The periodic pattern corresponding to wheezing can be detected by comparing the windowed trace length or the CL parameter with a prefixed threshold.

E. Step 3(c): Quasi-Average for Belch/Burp Pattern
Unlike the vomit and wheezing pattern, it is observed in the DWT decomposition of the acoustic signal that the pattern corresponding to belching/burping is resolved in multiple coefficients viz., D4 and D5. In order to detect this signal, it is necessary to process both the wavelet coefficients with appropriate weights. The averaging function was identified to achieve this detection. However, windowed average would be hardware inefficient. Hence, an algorithm level modification was used viz., QA [5]. QA is computed by modifying the traditional definition of average with an assumption that each element of a continuously moving window is truly represented by the average value of its window. Such an approximation results in an acceptable error of the order of $10^{-6}$. Equation (5) is the mathematical representation of QA

$$W_{k+1} = \frac{1}{w}\left(S_{i:i+w} - W_k + x_{i+w+1}\right) \tag{5}$$

where W is QA of kth window, S is the accumulated sum of the kth window, and w is the window size. The windowed data from D4 and D5 is quasi-averaged over a fixed window size. The QAs are weighted to normalize their values and added for detecting the belching pattern. From the perspective of hardware implementation, the advantage of QA is that the window can move continuously over a data set and results in a memoryless hardware implementation. The weights and window size are fixed by operating the algorithm on the training data.

Based on the above-selected mathematical metrics, three of the symptoms are classified correctly. However, the signal corresponding to the cough and the sneeze occurs in the same DWT coefficient D3. As mentioned in Section III-E, we use the MFCC-based computation to distinguish between the cough and the sneeze. Section III-F explains the modification of MFCC methodology in detail.

Fig. 3. Frequency spectrum of cough and sneeze patterns.

F. Step 3(d): Mel Cepstrum-Based Analysis for Cough and Sneeze Patterns
In the wavelet processed signal, it is observed that the cough and sneeze signals are resolved into the same DWT coefficient (D3). This is because these two symptomatic patterns have a very similar frequency response (Fig. 3). Due to this, standard mathematical metrics described previously, such as energy, CL, and so on, are not suitable for the efficacious classification of these signals. These signals are distinguished on the basis of the shape of the vocal tract while emitting them. Such distinguishability is achieved based on MFCC computation. The spectral envelope, which encodes the vocal tract shape in them, is extracted by the MFCC algorithm. The MFCC algorithm was discussed in Section II (Fig. 1). However, the MFCC algorithm is designed to decode speech patterns, which have a high degree of variability. Comparatively, in this paper, two types of patterns are to be classified. Hence, the overcomputation in the standard MFCC algorithm can be reduced. The algorithm level modifications implemented are as follows. First, since the input acoustic signal has already been resolved in the frequency domain using DWT, the FFT block in the MFCC algorithm is redundant (Fig. 1). Second, in the traditional MFCC, the entire bandwidth of an acoustic signal is divided into approximately 26 mel filters. However, since our coefficient of interest is D3, it is only necessary to have the mel filters that overlap in the frequency band corresponding to D3 (689–1378 Hz), as shown in Fig. 4. This results in the reduced number of mel filters (3). The spectral energies of these filters are computed over a predetermined window size. The energies are passed on to a DCT block that decorrelates them producing the modified cepstrum coefficient. The second and third coefficients of the DCT correspond to the coefficient, which can separate the cough from the sneeze. It should be noted that the reason for omission of the logarithm
Fig. 4. DWT coefficient mapping to mel filters.

Fig. 5. Block diagram of hardware implementation.


block is that the cough and sneeze signals are normalized
during the DWT operation. The purpose of the logarithm block
is to normalize multiple spectral energies nonlinearly. Since we
are using a reduced mel filter bank, the logarithm operation is
found to be redundant. These design decisions at the algorithm
level also help reduce the power consumption in the hardware
implementation.
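The reduced mel cepstrum computation described in this step can be sketched as follows. The sketch assumes that the three triangular filters are spaced uniformly on the mel scale across the D3 band and that an unnormalized DCT-II is used; neither detail is specified in the text, and all function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Frequency in Hz to mels, as in (1)."""
    return 1125.0 * math.log(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Mels to frequency in Hz, as in (2)."""
    return 700.0 * (math.exp(m / 1125.0) - 1.0)


def mel_filter_edges(f_low, f_high, n_filters):
    """(lower, center, upper) edges in Hz of overlapping triangular filters."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    pts = [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]
    # Neighbouring filters share edges, so adjacent triangles overlap.
    return [(pts[i], pts[i + 1], pts[i + 2]) for i in range(n_filters)]


def dct_ii(values):
    """Unnormalized DCT-II (the normalization is an assumption)."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n)]


def reduced_cepstrum(filter_energies):
    """DCT of the mel filter energies with the logarithm stage omitted."""
    return dct_ii(filter_energies)
```

With three filter energies as input, `reduced_cepstrum` returns three coefficients, of which the second and third are the ones the text identifies as separating the cough from the sneeze.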

Fig. 6. Mallat’s algorithm to compute DWT.

G. Step 4: Thresholding
Subsequent to the computation of the mathematical metrics as well as the mel coefficients over the decomposed wavelet coefficients, the processed values are compared with the preset thresholds. These thresholds are fixed based on the training data, which represent a typical case of each type of symptom being detected.
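The CL and QA metrics that feed this thresholding step can be sketched as follows. The sketch rests on two assumptions: the coastline is taken as the sum of absolute differences between consecutive samples in a window, which is the usual meaning of trace length and may differ from the indexing printed in (4), and the quasi-average is written as the running update implied by (5). Function names and weights are hypothetical.

```python
def coastline(x, n_window):
    """Sum of absolute sample-to-sample differences in each window."""
    out = []
    for start in range(0, len(x) - n_window + 1, n_window):
        w = x[start:start + n_window]
        out.append(sum(abs(w[i] - w[i - 1]) for i in range(1, n_window)))
    return out


def quasi_average(x, w):
    """Running quasi-average: the previous average stands in for the
    sample leaving the window, so no sample history is stored."""
    acc = 0.0   # accumulated sum S
    avg = 0.0   # current quasi-average W
    out = []
    for sample in x:
        acc = acc - avg + sample
        avg = acc / w
        out.append(avg)
    return out


def belch_score(qa_d4, qa_d5, weight_d4, weight_d5):
    """Weighted sum of the D4 and D5 quasi-averages."""
    return [weight_d4 * a + weight_d5 * b for a, b in zip(qa_d4, qa_d5)]
```

Each resulting sequence would then be compared against its own preset threshold, with the weights and thresholds chosen from training data as described in the text.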

H. Training
In Section III, it is evident that the proposed algorithm has a number of parameters that are user-specific, such as thresholds, weights, and so on. It should also be noted that although in this paper, we have proposed the detection of five types of patterns corresponding to the symptoms indicative of general health, it is possible to increase the number of symptoms that can be detected. This makes the system scalable. The methodology described above is generic in nature such that it can be applied to other audio biological signals as well. This necessitates a proper training to select optimal wavelet coefficients of interest and set appropriate weights and thresholds in order to have efficacious functionality. A set of data containing various signal patterns to be detected is used as a training set. This data set is subjected to the algorithm described above. The wavelet coefficients are identified, and depending on the nature of the signal, the corresponding mathematical metrics are applied to process the coefficients. The thresholds for each of the processed signals are set, such that they give maximum efficacy in terms of accurate classification. The windowing operation in various blocks is such that the window size is equivalent to 1024 samples of audio input data.

Fig. 7. Block diagram for (a) energy parameter, (b) CL parameter, and (c) QA.

IV. HARDWARE IMPLEMENTATION

In this section, we discuss the circuit level techniques that are used to implement the proposed algorithm into a power-efficient hardware. As explained in Section III-G, certain decisions at the algorithm level of design abstraction were made in order to facilitate the low-power implementation of hardware. Fig. 5 shows the block diagram of the system. These blocks are discussed below.

A. Discrete Wavelet Transform Block
The wavelet transform block is the most computationally intensive block in the system and consumes a significant amount of power. There are various methods available in the literature to implement the DWT block [16]. In this paper, we use Mallat’s algorithm [9]. The DWT block consists of consecutive stages of low-pass (H) and high-pass (G) filters. These cascading stages are separated by intermediate subsampling (Fig. 6), which is achieved by appropriate clocking of the filters in successive stages. The number of filter stages in the DWT block depends on the number of coefficients
of interest in the system. As shown in Section IV, for the purpose of this application, it is necessary to derive six wavelet coefficients. Hence, six cascading stages of H and G are needed. Since the five acoustic patterns are detected using the wavelet coefficients D3 through D6, we need to have five H filters and four G filters. All these filters are of the eighth order due to the use of the Daubechies fourth-order mother wavelet. A standard implementation of nine filters of the eighth order would be computationally intensive in terms of number of multiplications. We utilize the multiplierless techniques of computation sharing multiplier (CSHM) and common subexpression elimination (CSE) to reduce power consumption.

These are well-known low-power methodologies, where the filter coefficients are represented using the minimum number of alphabets and their precomputed products with the input data [17], [18]. The partial products of the input data with the filter coefficients are subsequently computed by shifting and adding these precomputed products and reusing the intermediate sum [5]. The choice of filter coefficients (algorithm level) and multiplierless filter (circuit level) applies the algorithm-

The weights are used to normalize the magnitudes of the two coefficients. The weighted sum is compared with a prefixed threshold to detect the occurrence of a belching or burping pattern.

C. MFCC-Based Analysis
As described in Section III, the MFCC-based analysis uses the DWT as the first stage spectrum instead of the FFT. The number of mel filters in the mel filter bank is reduced due to the resolution of the cough and sneeze signals into a single coefficient of interest. In this paper, three overlapping bandpass filters are used in the mel filter bank. It can be observed from the mel filter bank response in Fig. 1 that for wavelet coefficient D3 (689–1378 Hz), these three mel filters are sufficient. The mel filters are designed for a triangular magnitude response around the center frequency. These filters are of the 16th order, so that the frequency response is closely matching the required triangular response. The coefficients of these filters are adjusted by reducing the number of 1s. This reduces the number of computations without adversely
affecting the frequency response of the filter. These filters are
circuit codesign approach. The wavelet coefficients are nor-
also implemented using the CSHM and CSE methodologies
malized before subsequent processing to reduce data path
in order to reduce the power consumption of the filter. The
width and maintain the correlation.
coefficients of all the filters are successfully represented using
three alphabets for precomputation. The output is subsequently
B. Mathematical Metric Blocks passed to the energy block to calculate the spectral energy
in each of the mel filters. The three filter energies in each
The block diagrams for the mathematical metric blocks are
accumulation window are passed to the DCT block. Due to the
shown in Fig. 7. The energy parameter is computed according
overlapping nature of the mel filters, the outputs are highly cor-
to (3). The block diagram for computation of energy is shown
related. The DCT decorrelates these filter outputs and separates
in Fig. 7(a). It consists of a multiply and accumulate operation,
the spectral envelope into multiple MFCC-based parameters.
which adds the squared value of the input viz., D6 coefficient.
The DCT block is also designed by modifying the coefficient
The D6 window size is chosen in the training phase and
matrix in order to reduce the number of 1s and facilitate the
corresponds to 1024 samples of the digitized input data. The
CSHM-based implementation [17]. The first output coefficient
average energy value is then compared against the threshold
of DCT corresponds to the dc component and can be ignored.
to detect acoustics pertaining to vomiting sound. Energy
The second and third coefficients correspond to the cough and
parameter captures the continuous increase in the amplitude
sneeze patterns, respectively. These MFCC-based parameters
of the low-frequency component in human auditory signal to
are then compared with a threshold, predetermined during
correctly detect this symptom.
training to detect the corresponding pattern. Depending on the
The CL parameter block diagram is shown in Fig. 7(b). The
type of signals being used, it is possible to modify the number
CL parameter is calculated based on (4). The D5 coefficient
of the mel filters in the filter banks and correspondingly modify
is the input to the CL block. The input is delayed by a clock
the hardware implementation to achieve scalability and patient
cycle in order to calculate the difference between two adjacent
specific programmability.
samples. The magnitude of the difference is accumulated over
a prefixed window in order to calculate the trace length of
the signal. This accumulated value is then compared with D. Threshold Block and Clock Circuitry
the threshold for detecting wheezing. Since wheezing signal The threshold block consists of registers that are loaded with
is periodic signal for time duration without any significant the prefixed threshold values corresponding to each individual
increase in amplitude, the CL parameter captures this pattern acoustic pattern to be detected. These threshold values are
accurately. fixed in the training phase. Comparators in the threshold blocks
The block diagram for the quasi-averaging circuit is shown are used to compare and raise the detection flag for each
in Fig. 7(c). In order to enable a memoryless implementation of the symptomatic pattern detected. The clock circuitry is
and a continuously moving average, the average calculated used to synchronize all the operations in the system. The
in the previous window is subtracted from the sum of the input data are streamed in at 11.025 kHz. Each successive
running window instead of the individual data sample. Since coefficient of the wavelet transform is computed at half the
the window size is a power of two, the divider is imple- frequency as that of the previous coefficient. The signal
mented by discarding the appropriate least significant bits. processing block operates at the same frequency as that of
The QA is calculated over two coefficients viz., D4 and D5 . the coefficient they are operating on. A 10-bit counter is used
to produce the clock for the various blocks and the stages of the DWT block.

Fig. 8. Pattern detection in the case of cough and sneeze signals.

TABLE II
CLASSIFICATION ACCURACY FOR FIVE ACOUSTIC SYMPTOMATIC PATTERNS

V. RESULTS

Based on the algorithm described in Section III and the hardware implementation in Section IV, the system for detecting symptomatic patterns in a nonspeech audio signal was simulated. A total of 74 recordings of various acoustic symptomatic patterns were used for testing the accuracy of detection. These recordings consisted of five types of patterns, viz., cough, sneeze, belch, vomit, and wheeze. The audio recordings were downloaded from readily available sound libraries [4], [20]. These recordings were in the “.wav” format. Apart from these signals, another set of data was used in the training phase to determine various parameters. The digital audio signals from the recordings were processed in MATLAB, according to the proposed algorithm. The functionality and the efficacy of the algorithm were evaluated. The cough and sneeze signals, which require the MFCC-based coefficient calculation, were successfully classified. The result for an MFCC-based classification is shown in Fig. 8. As can be seen, the second MFCC-based coefficient is sensitive to the cough pattern, while the third coefficient is sensitive to the sneeze pattern. The first coefficient is ignored and, hence, is not shown in Fig. 8. The classification accuracy is calculated as the percentage of signals classified correctly. These results are summarized in Table II. It is evident that the MFCC-based processing results in 90% correct classification. In the acoustic pattern pertaining to vomiting, the accuracy is observed to be the lowest. The KiMS system shows a similar accuracy (88%–93%) for cough and sneeze as the proposed system [4]. However, it comes at the hardware cost of a power-hungry artificial neural network. The training required by the KiMS system in the case of scaling the number of symptoms to be detected would be much higher. In these respects, the proposed system has an advantage owing to its good efficacy, scalability, and low-power implementation methodology.

The hardware implementation of the system was described using very high speed integrated circuit (VHSIC) Hardware Description Language and synthesized using Synopsys tools in the TSMC 65-nm technology bulk-Si library. The system was optimized for 1 V VDD and 100 kHz fCLK. The extracted circuit was simulated using Nanosim with sample test data to get the power consumption of the system. The 10-bit digital data were streamed into the system and the output was verified for correct operation. To lower the power consumption of the system, VDD was scaled to 700 mV. Due to the quadratic dependence of the dynamic power on the power supply, VDD scaling reduces the dynamic power significantly. The system power was observed to be leakage dominated. Leakage control techniques can be used to further reduce power
consumption. The simulated power and area are tabulated in Table III.

TABLE III
POWER AND AREA RESULTS

The algorithm is designed to be scalable to other acoustic biological signals. Depending on the frequency spectrum of the signal of interest and the pattern to be detected, the wavelet coefficients can be calculated to even more than six stages. The MFCC-based parameters can be used for detecting any signals that occur in the same wavelet coefficients. The algorithm-circuit codesign methodology can be utilized to optimize power consumption and maintain high efficacy. This enables the system to be user-specific as well as scalable to the type of audio signals being used for the detection of symptoms.

VI. CONCLUSION

In this paper, we have proposed a generic system based on wavelet transform, mathematical metrics, and mel cepstrum-based analysis, which can be used to detect symptomatic patterns in audio biological signals. Modifications in the algorithm and the use of low-power methodologies to implement the algorithm in circuitry enable the design of a low-power system. The system can be scaled to include other health markers and can also be made user-specific. The MFCC-based processing, which is generally used for speech or speaker recognition, has been shown to successfully distinguish signals that share the frequency spectrum. The algorithm shows a high classification rate (>75%) with a low-power implementation [4]. We believe that the algorithm-circuit codesign strategy followed in this paper is the basis for designing an efficient low-power health monitoring system.

REFERENCES

[1] S. Patel, H. Park, P. Bonato, L. Chan, and M. Rodgers, “A review of wearable sensors and systems with application in rehabilitation,” J. Neuroeng. Rehabil., vol. 9, no. 21, pp. 1–17, 2012.
[2] A. Pantelopoulos and N. G. Bourbakis, “A survey on wearable sensor-based systems for health monitoring and prognosis,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 1, pp. 1–12, Jan. 2010.
[3] Children’s Health Topics-Infectious Diseases. [Online]. Available: http://www.aap.org/healthtopics/infectiousdiseases.cfm, accessed Oct. 2014.
[4] A. Basak, S. Narasimhan, and S. Bhunia, “KiMS: Kids’ health monitoring system at day-care centers using wearable sensors and vocabulary-based acoustic signal processing,” in Proc. 13th IEEE Int. Conf. e-Health Netw. Appl. Services (Healthcom), Jun. 2011, pp. 1–8.
[5] H. S. Markandeya, G. Karakonstantis, S. Raghunathan, P. P. Irazoqui, and K. Roy, “Low-power DWT-based quasi-averaging algorithm and architecture for epileptic seizure detection,” in Proc. 16th ACM/IEEE Int. Symp. Low Power Electron. Design (ISLPED), Aug. 2010, pp. 301–306.
[6] H. S. Markandeya, S. Raghunathan, P. P. Irazoqui, and K. Roy, “A low-power ‘near-threshold’ epileptic seizure detection processor with multiple algorithm programmability,” in Proc. ACM/IEEE Int. Symp. Low Power Electron. Design (ISLPED), Feb. 2012, pp. 285–290.
[7] S. B. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences,” IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 4, pp. 357–366, Aug. 1980.
[8] J. Hogan and M. Mintchev, “Manometry-based cough identification algorithm,” Int. J. Inf. Theories Appl., vol. 14, no. 2, pp. 127–132, 2007.
[9] S. Matos, S. S. Birring, I. D. Pavord, and D. H. Evans, “Detection of cough signals in continuous audio recordings using hidden Markov models,” IEEE Trans. Biomed. Eng., vol. 53, no. 6, pp. 1078–1083, Jun. 2006.
[10] R. G. Stockwell, L. Mansinha, and R. P. Lowe, “Localization of the complex spectrum: The S transform,” IEEE Trans. Signal Process., vol. 44, no. 4, pp. 998–1001, Apr. 1996.
[11] S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA, USA: Academic, 1999.
[12] P. Lio, “Wavelets in bioinformatics and computational biology: State of art and perspectives,” Bioinformatics, vol. 19, no. 1, pp. 2–9, 2003.
[13] W. Han, C.-F. Chan, C.-S. Choy, and K.-P. Pun, “An efficient MFCC extraction method in speech recognition,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2006, pp. 145–148.
[14] F. Zheng, G. Zhang, and Z. Song, “Comparison of different implementations of MFCC,” J. Comput. Sci. Technol., vol. 16, no. 6, pp. 582–589, Nov. 2001.
[15] A. M. White et al., “Efficient unsupervised algorithms for the detection of seizures in continuous EEG recordings from rats after brain injury,” J. Neurosci. Methods, vol. 152, nos. 1–2, pp. 255–266, 2006.
[16] C. Souani, M. Abid, K. Torki, and R. Tourki, “VLSI design of 1-D DWT architecture with parallel filters,” Integr. VLSI J., vol. 29, no. 2, pp. 181–207, 2000.
[17] G. Karakonstantis and K. Roy, “An optimal algorithm for low power multiplierless FIR filter design using Chebychev criterion,” in Proc. IEEE ICASSP, vol. 2. Apr. 2007, pp. II-49–II-52.
[18] J. H. Choi, N. Banerjee, and K. Roy, “Variation-aware low-power synthesis methodology for fixed-point FIR filters,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 28, no. 1, pp. 87–97, Jan. 2009.
[19] W.-H. Liao and Y.-K. Lin, “Classification of non-speech human sounds: Feature selection and snoring sound analysis,” in Proc. IEEE Int. Conf. Syst., Man Cybern., Oct. 2009, pp. 2695–2700.
[20] Audio Micro Stock Audio Library. [Online]. Available: http://www.audiomicro.com, accessed Oct. 2014.

Himanshu S. Markandeya received the bachelor’s degree in electronics engineering from the University of Mumbai, Mumbai, India, in 2003, the M.Sc. degree in electrical engineering from the Illinois Institute of Technology, Chicago, IL, USA, in 2006, and the Ph.D. degree in electrical engineering from Purdue University, West Lafayette, IN, USA.
He was a Component Design Engineer (Intern) with the Intel Custom Foundry Group, Intel Corporation, Chandler, AZ, USA, in 2013. He was with Diehl Controls North America Ltd., Naperville, IL, USA, as a Hardware Design Engineer, and Tata Consultancy Services, Mumbai, and Autodesk Inc., San Rafael, CA, USA, where he was involved in the software domain. His current research interests include design of algorithms and low-power systems for pattern detection and algorithm-circuit co-design methodologies for biomedical application.

Kaushik Roy (F’01) received the B.Tech. degree in electronics and electrical communications engineering from IIT Kharagpur, Kharagpur, India, and the Ph.D. degree from the Electrical and Computer Engineering Department, University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 1990.
He was with the Semiconductor Process and Design Center, Texas Instruments, Dallas, TX, USA, where he was involved in field-programmable gate array architecture development and low-power circuit design. He joined the Electrical and Computer Engineering Faculty, Purdue University, West Lafayette, IN, USA, in 1993, where he is currently an Edward G. Tiedemann Jr. Distinguished Professor. He was a Research Visionary Board Member with Motorola Labs, Bangalore, India, in 2002. He has authored over 600 papers in refereed journals and conferences, supervised 70 Ph.D. students, co-authored two books entitled Low Power CMOS VLSI Design (John Wiley) and Low Power CMOS VLSI Design (McGraw Hill), and holds 15 patents. His current research interests include spintronics, device-circuit co-design for nanoscale silicon and nonsilicon technologies, low-power electronics for portable computing and wireless communications, and new computing models enabled by emerging technologies.
Dr. Roy received the NSF Career Development Award in 1995, the IBM Faculty Partnership Award, the ATT/Lucent Foundation Award, the SRC Technical Excellence Award in 2005, the SRC Inventors Award, the Purdue College of Engineering Research Excellence Award, the Humboldt Research Award in 2010, the IEEE Circuits and Systems Society Technical Achievement Award in 2010, the Distinguished Alumnus Award from IIT Kharagpur, the Semiconductor Research Corporation Aristotle Award in 2015, best paper awards at the International Test Conference in 1997, the IEEE International Symposium on Quality of IC Design in 2000, the IEEE Latin American Test Workshop in 2003, the IEEE Nano in 2003, the IEEE International Conference on Computer Design in 2004, and the IEEE/ACM International Symposium on Low Power Electronics and Design in 2006, the IEEE Circuits and System Society Outstanding Young Author Award (Chris Kim) in 2005, the IEEE TRANSACTIONS ON VLSI SYSTEMS Best Paper Award in 2006 and 2013, and the ACM/IEEE International Symposium on Low Power Electronics and Design Best Paper Award in 2012. He was a Purdue University Faculty Scholar from 1998 to 2003. He has been a Fulbright-Nehru Distinguished Chair and DoD National Security Science and Engineering Faculty Fellow from 2014 to 2019. He has been on the Editorial Board of the IEEE Design and Test, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, the IEEE TRANSACTIONS ON VLSI SYSTEMS, and the IEEE TRANSACTIONS ON ELECTRON DEVICES. He was a Guest Editor of the Special Issue on Low-Power VLSI in the IEEE Design and Test in 1994, and the IEEE TRANSACTIONS ON VLSI SYSTEMS in 2000, IEEE Proceedings: Computers and Digital Techniques in 2002, and the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS in 2011.
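
The mathematical metric blocks of Section IV-B lend themselves to a short software model. The following Python sketch is illustrative only and is not the authors' implementation. It assumes that the energy parameter is the mean of the squared samples over a window, that the CL parameter is the sum of absolute differences between adjacent samples, and that the quasi-average replaces the oldest sample of a moving sum with the previous average, with the division done by a right shift. The function names, the use of the sample magnitude in the quasi-average, and the weights and threshold in `belch_detected` are hypothetical.

```python
from typing import Iterable, List, Sequence


def window_energy(window: Sequence[int]) -> float:
    """Average energy of one window: multiply-accumulate of squared samples."""
    return sum(x * x for x in window) / len(window)


def coastline(window: Sequence[int]) -> int:
    """Trace length: accumulated magnitude of adjacent-sample differences."""
    return sum(abs(b - a) for a, b in zip(window, window[1:]))


def quasi_average(samples: Iterable[int], log2_window: int = 10) -> List[int]:
    """Memoryless moving average over a window of 2**log2_window samples.

    The previous average is subtracted from the running sum in place of the
    oldest sample, so no sample buffer is needed (assumed update rule).
    """
    running_sum = 0
    average = 0
    out = []
    for x in samples:
        running_sum += abs(x) - average
        # Power-of-two window: divide by discarding least significant bits.
        average = running_sum >> log2_window
        out.append(average)
    return out


def belch_detected(qa_d4: int, qa_d5: int, w4: int = 1, w5: int = 1,
                   threshold: int = 100) -> bool:
    """Compare the weighted sum of two quasi-averages with a threshold.

    Weights and threshold are placeholders; the paper fixes them in training.
    """
    return w4 * qa_d4 + w5 * qa_d5 > threshold
```

For a constant input the quasi-average settles at that value: `quasi_average([3] * 50, log2_window=2)[-1]` returns 3. The same structure needs only one accumulator and one register for the previous average, which is the property the memoryless circuit relies on.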

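
The clock rates of Section IV-D and the band quoted for D3 in Section IV-C both follow from dyadic subsampling of the 11.025 kHz input. The sketch below is a hedged illustration of that arithmetic. It assumes ideal half-band splitting at each stage and models the clock generator as a free-running counter that enables stage k once every 2^k input samples. The paper states only that a 10-bit counter produces the clocks, so the gating rule and all function names here are assumptions.

```python
FS_HZ = 11025.0  # input sample rate stated in the paper


def detail_band_hz(level: int, fs: float = FS_HZ):
    """Nominal (low, high) band edges of detail coefficient D<level>."""
    return fs / 2 ** (level + 1), fs / 2 ** level


def stage_rate_hz(level: int, fs: float = FS_HZ) -> float:
    """Output rate of stage <level>: halved at every successive stage."""
    return fs / 2 ** level


def stage_enabled(counter: int, level: int) -> bool:
    """Hypothetical clock gate: fire when the low <level> counter bits are zero."""
    return counter & ((1 << level) - 1) == 0


def enables_per_wrap(level: int, counter_bits: int = 10) -> int:
    """Number of stage activations during one full wrap of the counter."""
    return sum(stage_enabled(c, level) for c in range(1 << counter_bits))
```

Under these assumptions `detail_band_hz(3)` returns `(689.0625, 1378.125)`, which agrees with the 689–1378 Hz range given for D3, and one wrap of a 10-bit counter spans 1024 input samples, the window size used by the metric blocks.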