
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSEN.2021.3087333, IEEE Sensors
Journal
IEEE SENSORS JOURNAL, VOL. XX, NO. XX, XXXX 2017 1

Towards Development of an ISFET-based Smart pH Sensor: Enabling Machine Learning for Drift Compensation in IoT Applications
Nishad Sahu*, Rishabh Bhardwaj*, Het Shah, Ravindra Mukhiya, Senior Member, IEEE, Rishi Sharma, Member, IEEE, and Soumendu Sinha, Member, IEEE

Abstract— Monitoring of pH is crucial for several chemical and biochemical processes. ISFET (Ion-Sensitive Field-Effect
Transistor)-based pH sensors are promising candidates for pH monitoring applications. However, ISFET devices are
prone to temporal and temperature drifts, which severely affect the precision of pH measurements. In this work, we
collect experimental data of temporal and temperature drifts in an ISFET sensor to formulate an accurate SPICE macro
model, incorporating both temporal and temperature non-idealities. The developed macro model is utilized for generating
training data for state-of-the-art machine learning models for drift compensation, with a primary focus on the temporal
characteristics. We utilize recurrent neural networks (RNNs) to model the temporal characteristics of the ISFET and thus
compensate for the non-ideality. The sensor data is collected in various pH buffer solutions, a data set of sequences
of time-dependent voltage readings generated by the device is assembled, and the RNNs are trained to learn the crucial
features from the data and map them to the precise pH of the solution. We compare two variants of RNNs, i.e., LSTM
(long short-term memory) and GRU (gated recurrent unit), and their bidirectional, low-computational-cost variants,
biLSTM and biGRU. Each model is tested in a memory-constrained environment with 32-bit and 64-bit
floating-point numbers. Empirically, we find biLSTMs to perform best, where the achieved root mean square error (RMSE)
between the model-predicted pH and the true pH of the test solution is less than 0.212 pH, with an average RMSE of 0.126
pH. For temperature drift compensation, we collect data for four different temperatures and adapt well-established MLPs
(Multi-layer Perceptrons) to compensate for the intrinsic temperature drift in the sensor. We observe the average RMSE
between the model-predicted pH and the true pH to be less than 0.286 pH. The developed RNN models were implemented on a Xilinx
ZCU104 FPGA development kit using the PYNQ framework, which demonstrates low power consumption. The proposed
framework establishes the efficacy of Machine Learning (ML) techniques for drift compensation in ISFET-based pH
sensors for deployment in IoT applications.

Index Terms— ISFET, SPICE, machine learning, artificial neural networks, recurrent neural networks, IoT.

*Nishad Sahu and Rishabh Bhardwaj are equally contributing authors.
Nishad Sahu and Het Shah are with the Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani, India.
Rishabh Bhardwaj is with the Information Systems Technology and Design, Singapore University of Technology and Design, Singapore.
Ravindra Mukhiya, Rishi Sharma and Soumendu Sinha are with the Semiconductor Devices Area, CSIR-Central Electronics Engineering Research Institute, Pilani, India.

I. INTRODUCTION

Ion-Sensitive Field-Effect Transistors (ISFETs) are a popular class of electrochemical sensors [1]–[3] that have found wide usage in applications such as DNA detection [1], ion imaging [4], biomedical diagnosis [5], measuring acidity of lubricants [6] and the Human Genome Project [7]. Recently, in the current pandemic times, they have been used for detecting the SARS-CoV-2 virus, which causes the COVID-19 disease [8]. Thus, they have huge potential to be used in upcoming Internet of Things (IoT)-enabled point-of-care (PoC) diagnostic networks.

Smart ISFET pH sensors have the potential to be used in environmental monitoring, biomedical monitoring, food process monitoring, petrochemical industry, pharmaceutical industry, and NBC (Nuclear, Chemical, and Biological) warfare defense systems [9], [10]. ISFET-based pH sensors are suitable for these applications due to their small size in comparison to glass electrode-based pH sensors and their CMOS-compatible fabrication, which allows batch processing, making them economically feasible [10]. However, these sensors are prone to drift with time and are sensitive to temperature variations [11]. Real-time applications expose these sensors to a wide range of temperatures and long operating times, which introduce errors in measurements due to drift in sensor readout. Thus, accounting for the temporal and temperature drift is crucial to obtain accurate measurements. Machine Learning (ML) models are capable of learning and predicting the drift patterns of chemical/biochemical sensors and compensating for them [12], [13], which alleviates the need for complex subsystems reported in literature for sensor drift compensation tasks [14], [15].

Machine learning (ML) is a field of artificial intelligence (AI) that aims to study computer algorithms that automatically learn crucial features from the data and solve the underlying task without being explicitly programmed [16]. In this paper, we use a sub-class of ML algorithms called supervised learning [17]. Such algorithms have access to input and desired

1530-437X (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BOURNEMOUTH UNIVERSITY. Downloaded on July 04,2021 at 08:08:47 UTC from IEEE Xplore. Restrictions apply.

[Figure 1 diagram, titled "ISFET Smart Sensor development and IoT applications". Development blocks: experiments with ISFET device; SPICE model development; time and temperature drift compensation using machine learning; implementing ML algorithms on FPGA/MCU hardware for final System-on-Chip (SoC) ISFET smart sensor. Application blocks: real-time environmental monitoring systems in harsh climate; health monitoring systems; Nuclear, Chemical and Biological (NBC) warfare defense system for extreme hot and cold environment.]

Fig. 1: Overview of the ISFET smart sensor development process and its applications in IoT systems.

outputs, hence the task is to closely determine a function that maps its input to the desired output. Specific to our task of drift compensation, we use recurrent neural networks (RNNs) and multi-layer perceptrons (MLPs) for modeling temporal and temperature drift characteristics, respectively. RNNs belong to a class of neural networks that aims to model the relationship between a sequence input and a single or sequence output [18]. MLP is one of the simplest classes of feed-forward neural networks [16]. ML techniques have become a popular tool to make sensors robust against non-idealities [19]–[21]. Early works on ML-based ISFET temperature drift modeling and compensation were presented by Bhardwaj et al. [19], [20] and Mehta et al. [22], which compared performances of different ML models. Sinha et al. proposed the first model for ISFET temporal drift compensation using an ML technique called Bayesian inference [21]. While ML-based temperature drift compensation has been well explored, no existing work has studied the relevance of Artificial Neural Networks (ANNs) for ISFET temporal drift compensation. Thus, we focus our work on temporal drift modeling and provide an RNN-based framework to model inputs as a sequence. As compared to [21], we do not assume any prior over the temporal characteristics of the device.

In this work, we propose a 4-step procedure towards smart ISFET pH sensor development for IoT applications. The first stage involves the process of data collection. The second stage involves ISFET macro model development in SPICE based on experimental device characteristics, which is used to perform SPICE simulations to generate pH-$V_{ref}$ characteristics of the ISFET. The third stage is dedicated to ML techniques that aim to compensate for temporal and temperature drift in sensor output. The fourth and final stage proposes methods to implement the ML models on an FPGA board to obtain a portable ISFET smart sensor for deployment in IoT applications. The overall smart sensor development process is shown in Fig. 1.

The major contributions of this study are:
1) We report for the first time the development of RNN models, viz., LSTMs, GRUs, biLSTMs, and biGRUs for temporal drift compensation. For temperature drift compensation, we implement feed-forward MLP models.
2) We also report for the first time the implementation of RNN models on a Xilinx ZCU104 FPGA development kit using the PYNQ framework for sensor drift compensation.
3) We have developed a robust SPICE macro model of the ISFET pH sensor using experimentally extracted parameters for temporal and temperature drift characteristics. We use regression analysis to model the ISFET threshold voltage as a function of time, temperature, and pH for developing an accurate SPICE model.

The rest of the paper is organized as follows: Section II discusses the ISFET background theory and data collection methodology. Section III elucidates the method of ISFET SPICE macro model development using the regression technique. Section IV presents the temporal drift compensation model development and Section V discusses the temperature drift compensation using an MLP model. Section VI elucidates the FPGA implementation of the developed RNN models for sensor drift compensation. We present the results and discussion in Section VII and finally conclude with Section VIII.

II. EXPERIMENTAL METHODOLOGY FOR ISFET SENSOR DATA COLLECTION

A. ISFET Theory and Operation

ISFET is a FET-based chemical sensor popularly used for pH sensing applications. In an ISFET, the gate electrode is absent and the gate region is left exposed during device packaging to allow interaction of the gate oxide with analyte solutions. The gate oxide is stacked with a sensing film such as Al2O3, Si3N4, TiO2, etc., to obtain better sensing performance [23], [24]. The threshold voltage of the device is sensitive to the pH of the analyte solution [25]. Fig. 2 shows the schematic of ISFET pH sensors at device and circuit levels. For an ISFET device operated in the linear region, the drain current is given by [26]:

\[ I_{DS} = \mu_n C_{ox} \frac{W}{L} \left[ \left( V_{GS} - V_{T(ISFET)} \right) - \frac{V_{DS}}{2} \right] V_{DS} \tag{1} \]

where $\mu_n$ is the mobility of electrons; $W$ and $L$ are the channel width and channel length, respectively; $C_{ox}$ represents the oxide capacitance per unit area; $V_{T(ISFET)}$ represents the ISFET threshold voltage; $V_{GS}$ is the difference between the gate and source voltage; and $V_{DS}$ is the difference between drain and


source voltage. When the ISFET is operated in a Constant Voltage Constant Current (CVCC) topology [20], $I_{DS}$, the drain voltage ($V_D$) and the source voltage ($V_S$) remain constant. Any change in $V_{T(ISFET)}$ is reflected in the gate voltage, $V_G$. Consequently, $V_G$ can be uniquely mapped to each pH value, which can be used to measure the pH of an unknown solution. A schematic of the topology with component values specific to this study is given in Fig. 3.

[Figure 2 schematic. Device level view: reference electrode (Ref) immersed in pH buffer solution; sensing film in contact with pH buffer solution; insulator; channel; n+ source (S) and drain (D) regions in p-Si; metallization; bulk (B). Circuit level view: ISFET with terminals D, S, B and Ref, connected to power supply and circuit board.]

Fig. 2: Device level and circuit level view of ISFET-based pH sensors.

B. Experimental Setup

A commercial Al2O3 ISFET device (Micropto®, Italy) with gate width = 700 µm and gate length = 20 µm was used for experiments. An Ag/AgCl reference electrode (Metrohm®) was used during sensor characterization and testing. The pH buffer solutions were purchased from Merck®. A Keysight® B2902A Source Measure Unit (SMU) was used for measuring the electrical characteristics of the ISFET device. A Tarsons® SPINOT™ digital magnetic stirrer hotplate was used to control the temperature of pH buffer solutions during measurements. The experimental setup is shown in the Supplementary article (Fig. S1).

[Figure 3 schematic. ISFET (terminals D, S, B, Ref) with pH buffer solution in contact with the ISFET sensing film and reference electrode; stages labeled "Voltage buffer mode" and "Negative feedback mode"; component labels 3.5 kΩ, 0.845 kΩ, 1.5 kΩ and 5 V.]

Fig. 3: CVCC topology for ISFET readout used in this study.

C. Data Collection

We performed four different sets of experiments using the experimental setup:
1) Transfer characteristics measurement: The transfer characteristics of the device were obtained at temperatures of 23°C, 33°C, 43°C and 53°C for pH = 2, 4, 7 and 10. The drain-to-source voltage, $V_{DS}$, was fixed at 0.5 V and the gate-to-source voltage, $V_{GS}$, was varied from 0.0 to 2.1 V.
2) Output characteristics measurement: The output characteristics of the device were obtained at a temperature of 36°C for pH = 2, 4, 7 and 10. $V_{GS}$ was kept constant at 2.0 V and $V_{DS}$ was varied from 0 to 2 V.
3) Temperature drift in CVCC topology output: A set of 20 readings (data points) was taken at temperatures of 15°C, 25°C, 35°C, 45°C and 55°C for pH = 2, 4, 7 and 10.
4) Temporal drift in CVCC topology output: The measurements were done for a time duration of 2-3 hours at 30°C for pH = 2, 4, 7, and 10.

III. ISFET SPICE MACRO MODEL DEVELOPMENT BASED ON REGRESSION ANALYSIS

After obtaining the experimental data of the ISFET sensor, we develop its SPICE macro model using the collected data. We used HSPICE™ software for implementing the mathematical equations governing the electrochemical behavior of the ISFET device.

A. SPICE Macro Model

The methodology for the development of the ISFET macro model differs from previous studies in the modeling of the ISFET threshold voltage [20], [21], [27]. The macro model has 5 terminals, i.e., pH, Drain, Src, Ref, and Blk. The pH of the buffer solution is modeled as a potential source, and the potential at the pH node denotes the pH of the buffer solution. The SPICE macro model has three important components [27]:
1) A level 3 NMOS model for modeling the electronic stage of the ISFET device [28].
2) A time-dependent potential source for modeling the temporal drift in the ISFET device.
3) A temperature- and voltage-dependent voltage source for modeling the electrochemical properties of the ISFET and its temperature drift.

The theoretical background of the temporal and temperature dependence of the ISFET device has been presented by Sinha et al. [27], and a brief description is included in the Supplementary Article (Section SI.B and SI.C, respectively). Fig. 4 shows the schematic of the developed SPICE macro model of the ISFET.

B. Modeling ISFET Temporal Drift

To account for the temporal drift in the SPICE model, we need to include a time-dependent potential source to produce the required potential difference as a function of time. The theory governing this time-dependent potential is briefly discussed in the Supplementary Article (Section SI.B).


[Figure 4 schematic. Terminals: VpH (pH), VDrain (D), VSource (S), VBulk (B), VReference_electrode (REF). Internal elements: NMOS_model, VSurface_Potential, VSurface_Dipole, VTemporal_Drift, VTemperature_Drift, VRef, CGuoy, CHelm and a 1K resistor.]

Fig. 4: Schematic of the ISFET SPICE macro model.

1) Mathematical model and approximations: For a particular temperature and pH, the change in reference electrode voltage (or gate voltage) with time $t$ is given by:

\[ \Delta V_G(t) = c \times \left[ 1 - \exp\left( -\left( \frac{t}{\tau} \right)^{\beta} \right) \right] \tag{2} \]

where $\tau$ represents the structural relaxation time constant, $\beta$ is associated with the dispersive transport and $c$ is a constant. By using Eq. 2 and the available experimental data set, we extract the three unknown parameters $\tau$, $\beta$ and $c$ for pH 2, 4, 7 and 10 at T = 30°C.

2) Regression analysis based parameter extraction in MATLAB: The temporal drift data set was used for extracting the parameters in Eq. 2. The extraction was carried out with the help of the Curve Fitting Toolbox of MATLAB™. We used the method of 'Non Linear Least Squares' with the 'Trust Region' algorithm for performing the regression analysis. The values of the extracted parameters along with their respective goodness-of-fit parameter, Adj. R2, are given in Table I.

3) Incorporating mathematical model and extracted parameters in SPICE macro model file: The time-dependent potential source based on Eq. 2 and the values of the extracted parameters $\tau$, $\beta$, and $c$ are included in the macro model file. It is important to note that these parameters are specific to pH and temperature, as listed in Table I.

TABLE I: Temporal drift parameters for ISFET SPICE macro model extracted from experimental data (at 30°C)

pH    c (or) ΔVG(∞) (in Volts)    β (Unitless)    τ (in hr.)    Adj. R2
2     -0.03101                    0.9568          2.138         0.9963
4     -0.015                      1.833           1.69          0.9959
7     -0.005254                   3.02            1.237         0.9696
10    -0.02328                    0.8963          0.5958        0.9981

C. Modeling ISFET electrochemical properties and temperature drift

The procedure for modeling the electrochemical properties and temperature drift of the ISFET is explained in the following subsections:

1) Mathematical Model: The equation defining the ISFET threshold voltage consists of temperature-dependent terms (refer to Supplementary Article Section SI.C), which cause drift in the sensor characteristics with temperature variation.

[Figure 5 plot: $V_{T(ISFET)}$ (V), 1.0 to 1.8, versus Temperature (C), 18 to 58, with one curve each for pH 2, pH 4, pH 7 and pH 10.]

Fig. 5: $V_{T(ISFET)}$ extracted from experimental transfer characteristics data using Constant Current (CC) method.

2) Threshold voltage, $V_{T(ISFET)}$ extraction: To model $V_{T(ISFET)}$ in the SPICE macro model, we need to incorporate its dependence on pH and T, whose values are extracted from the experimental data. The threshold voltage, $V_{T(ISFET)}$, was extracted from the transfer characteristics data set using the Constant Current (CC) method [29]. Twenty $V_{T(ISFET)}$ values were extracted and plotted against their respective temperature and pH, as shown in Fig. 5.

Fig. 6: Comparison of experimental ISFET threshold voltage with values predicted by the best fit polynomial.

3) Regression analysis in MATLAB: $V_{T(ISFET)}$ is dependent on pH and temperature (refer to Supplementary Article Section SI.A). Instead of using our previous approach to model $V_{T(ISFET)}$ [27], we used regression analysis to simplify the equation that establishes the dependence of $V_{T(ISFET)}$ on T and pH. The Curve Fitting Toolbox of MATLAB was used for regression analysis with the Bisquare Robust technique. We used polynomial regression to model the experimentally extracted $V_{T(ISFET)}$ values and obtained an Adjusted R-square value of 0.9912. The other Goodness of Fit (GoF) parameters for this polynomial are an R-square value of 0.9949, a sum of squared errors (SSE) of 0.003812 and an RMSE of 0.01862. The best fit polynomial surface has been plotted against the experimental data points in Fig. 6.

4) Incorporating mathematical model and extracted parameters in SPICE macro model file: The mathematical expression derived after regression analysis, which models the electrochemical properties and temperature drift of the ISFET, is incorporated in the SPICE macro model.
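As a numerical illustration of the temporal drift source from Section III-B, the sketch below evaluates Eq. 2 with the parameters of Table I. It is only a Python stand-in for the expression that the paper places in the HSPICE macro model file; the function and variable names (`temporal_drift`, `drift_table`, `TABLE_I`) are hypothetical, and time is assumed to be expressed in hours because τ is tabulated in hours.

```python
import math

# Table I parameters at 30 °C: pH -> (c in volts, beta, tau in hours)
TABLE_I = {
    2: (-0.03101, 0.9568, 2.138),
    4: (-0.015, 1.833, 1.69),
    7: (-0.005254, 3.02, 1.237),
    10: (-0.02328, 0.8963, 0.5958),
}


def temporal_drift(t_hr, c, beta, tau):
    """Eq. 2: stretched-exponential change in gate voltage after t_hr hours."""
    if t_hr < 0:
        raise ValueError("time must be non-negative")
    return c * (1.0 - math.exp(-((t_hr / tau) ** beta)))


def drift_table(times_hr):
    """Return {pH: [drift in volts at each requested time]} using Table I."""
    return {
        ph: [temporal_drift(t, *params) for t in times_hr]
        for ph, params in TABLE_I.items()
    }


if __name__ == "__main__":
    # Print the modeled drift in millivolts at a few elapsed times.
    for ph, drifts in drift_table([0.5, 1.0, 2.0, 3.0]).items():
        print(f"pH {ph}:", ", ".join(f"{1e3 * d:.2f} mV" for d in drifts))
```

Two properties follow directly from Eq. 2 and are useful sanity checks: the drift is zero at t = 0, and at t = τ it equals c(1 − e⁻¹) regardless of β.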
5) Fitting model with device characteristics: In this step, we fit some of the electrical device parameters such as $k_p$ (transconductance), $\theta$ (mobility degradation), etc., with the device characteristics obtained through experiments. The values were calculated using simplified equations from experimental data to get an initial guess. Depending on the deviation of the model from the experimental transfer characteristics, we optimized the values to achieve the best fit. The final simulated and experimental IV characteristics are shown in Fig. 7 (output characteristics) and Fig. 8 (transfer characteristics).

6) Fitting model with CVCC topology output: We fit the model with the experimental results obtained from the CVCC readout circuit. We chose the parameter value that gave an acceptable match in both cases, i.e., we gave the device characteristics and the CVCC topology readings equal weights in arriving at the final parameter value. The final simulated and experimental CVCC topology output is shown in Fig. 9.

[Figure 7 plot: $I_{DS}$ (A) versus $V_{DS}$ (V), 0.0 to 2.0 V, at Temperature = 30 C; simulated (sim) and experimental (exp) curves per pH.]

Fig. 7: Comparison of experimental (exp) and simulated (sim) ISFET output characteristics curve at pH 2, 4, 7, 10 and at temperature 30°C.

[Figure 8 plots: four panels, (a) 23°C temperature, (b) 33°C temperature, (c) 43°C temperature, (d) 53°C temperature; each shows $I_{DS}$ (A) versus $V_{Ref}$ (V), 0.6 to 2.1 V, at Vds = 0.5 V, with simulated (sim) and experimental (exp) curves for pH 2, 4, 7 and 10.]

Fig. 8: Comparison of experimental (exp) and simulated (sim) ISFET transfer characteristics curve at pH 2, 4, 7 and 10 for temperatures 23°C, 33°C, 43°C and 53°C.

[Figure 9 plot, titled "Temperature drift measurements in CVCC topology": $V_{Ref}$ (V), 5.9 to 6.7, versus pH, 1 to 11; simulated (sim) and experimental (exp) curves at 15 C, 25 C, 35 C, 45 C and 55 C.]

Fig. 9: Comparison of experimental (exp) and simulated (sim) ISFET CVCC topology readout. The readings were taken at pH 2, 4, 7, 10 and at temperatures 15°C, 25°C, 35°C, 45°C and 55°C.

The developed SPICE macro model includes both the temperature and time drift features of the ISFET device (from Micropto, Italy) that was used for experimentation. The macro model replicates the behavior of the ISFET device to a good degree, which makes it suitable for testing the performance of machine learning algorithms.

IV. TEMPORAL DRIFT COMPENSATION

In this section, we introduce a novel methodology for temporal drift compensation in the device. First, we show how we process the raw sensor data. Next, we provide the problem formulation and quality feature selection as a part of the methodology. Further, we describe the RNN models and the experimental setup, followed by results and discussion on the suitability of the considered RNNs.

A. Data Pre-processing

For ML model parameter learning, it is crucial to have a high quality data set. For a given task, high quality data helps the model to learn generic parameters that perform well not only on the samples seen during training but also on the unseen samples during testing. Next, we describe the data preprocessing method used in this task. As shown in Fig. 10, the horizontal axis corresponds to time stamp T, where each unit is 10 seconds. The orange region from time stamp 0 to 1,000 (or 10,000 seconds) is flagged as a part of our dataset. Thus, for pH 2, 4, 7, and 10, we have 4000 (1000 × 4) data samples¹. Next, we standardize the data by subtracting the mean voltage $\mu$ from each data sample. Standardization is helpful in removing unnecessary biases from the dataset by centering the data on zero. The formula used is as follows:

\[ \mu = \frac{1}{N} \sum_{i=1}^{N} v_i \quad (v_i \text{ is the } i\text{th data sample}) \tag{3} \]

\[ v_i \leftarrow v_i - \mu \quad (\forall i \in \{1, \dots, N\}) \tag{4} \]

¹ We skip the reading taken at time 10,000 seconds.
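The centering step of Eqs. 3 and 4 can be sketched in a few lines of Python. This is an illustrative sketch only: the paper does not state whether the mean is taken per pH curve or over the whole data set, so the helper below simply centers whatever list it is given, and the sample voltages are hypothetical rather than measured values.

```python
def mean_center(readings):
    """Eqs. 3-4: subtract the mean voltage from every reading.

    Returns the centered readings together with the mean, so the offset
    can be added back to a model output if needed.
    """
    if not readings:
        raise ValueError("need at least one reading")
    mu = sum(readings) / len(readings)          # Eq. 3
    return [v - mu for v in readings], mu       # Eq. 4


# Hypothetical V_ref readings in volts, one every 10 s (not measured data).
raw = [6.30, 6.28, 6.27, 6.25, 6.25]
centered, mu = mean_center(raw)
```

After centering, the readings sum to zero up to floating-point error, which is the "centering the data on zero" property the text relies on.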


Fig. 10: Temporal characteristics of ISFET for various pH values. The orange region shows the area where the data samples were taken.

Thus, the sampling frequency is 0.1 Hz. As discussed earlier, we limit $0 \le T \le T_{max}$, where $T_{max}$ = 10,000 seconds (1000 time units).

B. Methodology

In this section, we describe our method to tackle the non-ideal temporal dependence of $V_{ref}$. We describe how to feed higher-quality sequential input instead of using a myriad of $V_{ref}$ values. Additionally, we demonstrate how the availability of auxiliary data further aids the performance.

1) Problem formulation: Let $v_{set} := \{v_t, \dots, v_{t+n}\} \subset \mathbb{R}$ denote the sequence of $V_{ref}$ readings. The reading $v_t$ is taken at time $T = t \in \mathbb{R}_{\ge 0}$, while $v_{t+1}$ onwards are the readings taken at each subsequent time interval until time $T = t + n$. As mentioned in Section IV-A, the time interval between any two subsequent readings is 10 seconds. The task is to predict the pH $\in [0, 14]$. Further, we explain our approach to solving the task.

2) Features: From Fig. 10, we observe the voltage-time curve to be exponentially decaying. Thus, we append $v_{set}$ with the natural logarithm of each voltage it contains. Furthermore, we provide the model with scaled time-stamp information. Hence, we define $V_{set} := \{(v_t, \log v_t, \tfrac{1}{10}), \dots, (v_{t+n}, \log v_{t+n}, \tfrac{10}{10})\}$.

3) Approach: We split the task into two sub-tasks:
1. Prediction: For a given sequence of readings $V_{set}$ taken from time $T = t$ to $T = t + n$, predict $V_{ref}$ at $T = 0$, which we refer to as $V_p$.
2. Projection: Project the predicted $V_{ref}$ to pH using a pH vs $V_p$ plot. This task requires auxiliary data to be obtained².

² Section IV-D elaborates the method of obtaining auxiliary data.

The methodology is summarized in Fig. 11. P and Q represent $V_{set}$ and $V_p$, respectively. P is a sequence of readings sampled from the test set (red-bold-dotted curve). The model takes P as an input and predicts Q, i.e., $V_{ref}$ at $T = 0$. The predicted voltage is then mapped to the corresponding pH value R using the pH vs $V_p$ curve.

Fig. 11: Methodology: Curves in dashed-blue and dotted-bold-red are considered for training and testing, respectively. From the latter curve, a randomly sampled set of ten consecutive $V_{ref}$ readings is called a test sample and denoted by P. The yellow curve is the $V_{ref}$ readings taken at different pH, ambient temperature 30°C, and time $T = 0$. We denote it as the pH vs $V_p$ plot. The model takes P as input (sampled at $T > 0$) and predicts the $V_{ref}$ at $T = 0$, i.e., Q. Q is mapped to the corresponding pH, i.e., R, using the pH vs $V_p$ curve.

Advantages: As compared to the previous approach [21], our procedure has the following advantages:
1. Requires less information: The approach does not require information about the time T at which we start loading $V_{set}$. Moreover, we do not restrict the class of functions as was assumed a priori in [21], i.e., that the pH-time relationship is exponential in time.
2. pH early estimation: Since the approach does not depend on the time T at which readings were taken, we can exploit the benefit of obtaining an early estimation of pH even before the $V_{ref}$ curve stabilizes.

C. Machine Learning Techniques

In this section, we provide background on a class of ML algorithms, i.e., RNNs (LSTM, biLSTM, GRU, biGRU), which we have used to learn the $V_p$-$V_{set}$ relationship from the data.

Fig. 12: The supervised neural network learns to map variable $V_{ref}$ at input to the variable $V_p$ at output.

Neural Network (NN) – Given a dataset, a supervised neural network (NN) algorithm learns to approximate a function $f_W$ that relates the independent variables $X = x_1, \dots, x_n$ to the dependent variables $y$. $f_W$ is a network of neurons (processing units) that connects the input to the output, and $W$ is a set of model parameters. Fig. 12 illustrates the functioning of an ML model such as a supervised NN. The input $V_{ref}$ is fed to the model, which generates its prediction for $V_p$. The loss is a function that penalizes bad predictions. Hence, it takes two inputs, i.e., the true value and the predicted value of $V_p$. In our case, we calculate a regression loss, such as the mean squared error, between the true and predicted values (Eq. 7). The model parameter set $W$ is updated to minimize the loss.
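The feature construction and the projection sub-task of Section IV-B can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a window of ten strictly positive voltage readings (the logarithm is undefined otherwise), it assumes linear interpolation on the pH vs $V_p$ curve (the text does not say how the curve is read), and the calibration points in `CURVE` are hypothetical placeholders. The prediction sub-task, i.e., the trained RNN that maps the features to $V_p$, is deliberately left out.

```python
import math


def build_features(window):
    """Turn ten consecutive V_ref readings into (v, ln v, k/10) triples."""
    if len(window) != 10:
        raise ValueError("expected a window of ten readings")
    if min(window) <= 0:
        raise ValueError("ln(v) needs strictly positive voltages")
    # The third feature is the scaled position in the window: 1/10 ... 10/10.
    return [(v, math.log(v), (k + 1) / 10) for k, v in enumerate(window)]


def project_to_ph(v_p, curve):
    """Map a predicted T = 0 voltage to pH by linear interpolation.

    curve: list of (pH, V_p) calibration points, assumed strictly
    monotonic in voltage so that the mapping is unique.
    """
    pts = sorted(curve, key=lambda p: p[1])      # order by voltage
    if not pts[0][1] <= v_p <= pts[-1][1]:
        raise ValueError("voltage outside calibration range")
    for (ph0, v0), (ph1, v1) in zip(pts, pts[1:]):
        if v0 <= v_p <= v1:
            return ph0 + (ph1 - ph0) * (v_p - v0) / (v1 - v0)


# Hypothetical calibration curve (pH, V_ref at T = 0); not measured data.
CURVE = [(2, 6.00), (4, 6.10), (7, 6.25), (10, 6.40)]
```

With this hypothetical curve, a predicted voltage halfway between the pH 2 and pH 4 calibration points maps to pH 3. Keeping the projection separate from the learned model mirrors the two sub-tasks in the text: only the first one needs training data.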


Recurrent Neural Network (RNN) – RNNs are a class of NNs specialized for modeling sequences of data such as time series. In our case, the sequence is $V_{set}$. An RNN model iterates over the time-steps of a sequence. Each step $t$ takes a representation $h_{t-1}$ of input $v_{t-1}$ from the previous step $t-1$ and generates the hidden representation at step $t$, i.e., $h_t$. The input sequence can be processed as follows:

$$h_t = f_W(h_{t-1}, v_t) \tag{5}$$
$$V_p = f_O(h_n) \tag{6}$$

where $f_W$ is the function with parameters $W$, and $f_O$ is a function with parameters $O$ that learns to map the hidden representation at time $t$ to the output $V_p$. It is noteworthy that in our setting, we calculate the output only for the input at step $t+n$, i.e., $v_{t+n}$. RNNs are designed to encode the necessary information from all the previous inputs $v_t, \ldots, v_{t+n-1}$, use the current input $v_{t+n}$, and fuse them to make robust predictions.

All the models considered in this paper are RNN-based models. The models learn their parameters via optimization algorithms, learning from data, to approximate an underlying function that maps the input to the output.

1) LSTM: As the length of the input sequence grows, standard RNNs find it difficult to bridge the connection between the input at step $t$ and the output $y$ at step $n$, as $(n-t)$ increases. Hence, RNNs struggle with long-term dependencies. Long Short Term Memory networks (LSTMs) [30] are a special kind of RNN explicitly designed to solve the long-term dependency issue by introducing gates. LSTMs possess a memory known as the cell state, which keeps the information about the past. An LSTM comprises 3 gates: 1) Input gate; 2) Forget gate; 3) Output gate. The gates help regulate the flow of information.

$$f_t = \sigma_g(W_f [h_{t-1}, x_t] + b_f) \quad \text{(forget gate)}$$
$$i_t = \sigma_g(W_i [h_{t-1}, x_t] + b_i) \quad \text{(input gate)}$$
$$o_t = \sigma_g(W_o [h_{t-1}, x_t] + b_o) \quad \text{(output gate)}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \sigma_h(W_c [h_{t-1}, x_t] + b_c) \quad \text{(cell state at time } t\text{)}$$
$$h_t = o_t \odot \sigma_c(c_t) \quad \text{(hidden state at step } t\text{)}$$

where $\{W_f, W_i, W_o, W_c\}$ are weight matrices; $\{b_f, b_i, b_o, b_c\}$ are bias vectors; $\sigma_g$ is a sigmoid function; $\{\sigma_h, \sigma_c\}$ are hyperbolic tangent functions. Alternatively, $\sigma_c$ can also be defined as $\sigma_c(x) = x$ [31]; $\odot$ is the element-wise product operation.

2) BiLSTM: Standard LSTMs are unidirectional, i.e., they have access to the information from the past until the current step $t$. As an extension, bidirectional LSTMs (biLSTMs) utilize information from the past as well as from the future. BiLSTMs comprise two unidirectional LSTMs: one encodes the necessary information from step 0 to step $t$, while the other encodes it from step $n$ to step $t$ ($t \le n$) in the backward direction. Separate hidden states are calculated by both networks and appended to predict the output. As shown in Fig. 15, the unidirectional RNN at left (such as LSTM) calculates $h_t^F$ by encoding information from $v_1$ up to $v_t$. We refer to it as the forward-RNN with the parameter set $f_W^F$. The backward-RNN with the parameter set $f_W^B$ at the right encodes information from $v_n$ up to $v_t$ and generates $h_t^B$ in the backward direction. We concatenate the $h_t^F$ and $h_t^B$ vectors and learn a transformation matrix $f_O$ to predict the output $V_p$.

3) GRU: Similar to LSTMs, the gated recurrent unit (GRU) aims to solve long-term dependencies, with a lower number of model parameters as compared to LSTMs. Unlike LSTMs, a GRU does not possess a cell state and uses a hidden state to carry the information forward. Thus, GRUs have only two gates: 1) Reset gate; and 2) Update gate.

$$r_t = \sigma_g(W_r [h_{t-1}, x_t] + b_r) \quad \text{(reset gate)}$$
$$z_t = \sigma_g(W_z [h_{t-1}, x_t] + b_z) \quad \text{(update gate)}$$
$$\hat{h}_t = \phi_h(W_h [x_t, r_t \odot h_{t-1}] + b_h)$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t \quad \text{(hidden state)}$$

where $\{W_r, W_z, W_h\}$ are weight matrices; $\{b_r, b_z, b_h\}$ are bias vectors; $\sigma_g$ is a sigmoid function; $\phi_h$ is a hyperbolic tangent function.

4) BiGRU: As elaborated in Section IV-C.2, biLSTM is a bidirectional variant of the standard unidirectional LSTM. Following a similar analogy, a biGRU is a bidirectional extension of GRU.

D. Experimental Design

In this section, we first obtain a regression curve, i.e., a function that maps $V_p$ at T = 0 to a pH value. The curve is required to carry out the projection step discussed in Section IV-B.3. Further, we show the methodology to train an RNN for the prediction stage. The RNN will map $V_{ref}$ at time T > 0 to $V_{ref}$ at time T = 0 for the same pH of the medium.

Fig. 13: Polynomial regression curve fitting. Green curve shows the curve fitted on the dotted points. The dataset contains 9 samples, i.e., a set of (pH, $V_p$) readings.

Fig. 14: Xilinx ZCU104 FPGA board booted with PYNQ.
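As a minimal sketch of the projection step, assuming the auxiliary data are available as (pH, $V_p$) pairs at T = 0, a polynomial can be fitted by least squares and evaluated at the voltage predicted by the RNN. The voltages below are synthetic placeholders generated from a made-up linear response (0.30 V offset, 55 mV/pH), not measurements or simulation outputs from this work. `fit_projection_curve` is a hypothetical helper name, the degree of 2 is an assumption of this sketch, and NumPy is used purely for illustration.

```python
import numpy as np


def fit_projection_curve(vp, ph, degree=2):
    """Least-squares polynomial that maps V_p (volts) to pH."""
    return np.poly1d(np.polyfit(vp, ph, degree))


# Synthetic auxiliary data: nine (pH, V_p) pairs for pH 2, 3, ..., 10.
ph_grid = np.arange(2.0, 11.0)
vp_grid = 0.30 + 0.055 * ph_grid      # hypothetical sensor response

project = fit_projection_curve(vp_grid, ph_grid)

# Stand-in for the RNN output: the T = 0 voltage that corresponds to pH 7
# under the hypothetical response above.
vp_predicted = 0.30 + 0.055 * 7.0
ph_estimate = float(project(vp_predicted))
```

In this arrangement the RNN only has to recover the drift-free voltage; the mapping from voltage to pH is handled by the fitted curve, which needs just the small auxiliary dataset.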


1) Regression curve for pH vs $V_p$: Since the temporal readings were taken at the ambient temperature of 30 °C, we simulate the ISFET voltage output with a change in pH of the solution at the same ambient temperature. We store the simulation output for pH in {2, 3, ..., 10}. This constitutes our auxiliary data defined in Section IV-B.3. To obtain the pH vs $V_p$ curve at 30 °C, we fit a polynomial function. We choose the degree of the function based on 10-fold cross-validation. We found that a quadratic function minimizes the root mean square error between the true and predicted pH. The curve fitting is performed using the sklearn library in Python [32].

2) Loss functions: A loss function determines the modeling quality of an algorithm. The optimization objective is to minimize the loss function defined over the training samples. For our use case, we define two loss functions, i.e., $L$ and $L_R$. The loss $L$ is the mean squared error regression loss between the actual pH and the predicted value $\widehat{\mathrm{pH}}$, calculated over $n_s$ training samples.

$$L(\mathrm{pH}, \widehat{\mathrm{pH}}) = \frac{1}{n_s} \sum_{i=0}^{n_s-1} (\mathrm{pH}_i - \widehat{\mathrm{pH}}_i)^2 \tag{7}$$

However, such a loss can overfit to give impressive performance over the set of training samples. Such ill-defined loss functions may lead to poor performance over an unseen test set. To resolve such issues, we define $L_R$, which imposes a penalty on the objective function.

$$L_R(\mathrm{pH}, \widehat{\mathrm{pH}}) = L(\mathrm{pH}, \widehat{\mathrm{pH}}) + \lambda \sum \|w_x\|^2 + \mu \sum \|w_b\|^2 \tag{8}$$

where $w_x$ is the set of weights of a model, and $w_b$ is the set of biases. The two summations over $w_x$ and $w_b$ are regularization terms. $\lambda$ and $\mu$ are regularization parameters that decide the relative importance of the regularization terms over $L$.

3) RNNs Setup: We split the dataset mentioned in Section IV-A into packets of 10 contiguous readings. Thus, we get 100 samples per pH. We keep 300 samples from three pH values for the model training phase. We test the model on the left-out pH with 100 samples. For instance, we train an RNN on samples corresponding to pH 2, 4, and 10, and test it on samples for pH 7. This way, each RNN variant is trained and tested 4 times. As shown in Fig. 15 (left), RNN-based models take a $V_{set}$ sequence as input and pass it through the recurrent layer with hyperbolic tangent activation. We feed the output of the recurrent layer to a feed-forward layer that predicts the pH value. The output of the LSTM/GRU is set to a 32-dimensional vector. To keep the shape of this output representation consistent, we reduce the number of output units to half (i.e., 16) in the bidirectional variants. Thus, the concatenated forward and backward vectors provide a 32-dimensional vector. The vector is then fed to an output feed-forward layer to generate $\widehat{\mathrm{pH}}$. For parameter learning, the loss $L$ (or $L_R$ with $\lambda, \mu = 10^{-5}$) is minimized using the ADAM optimizer with a learning rate of 0.001. The number of epochs is 5 and the batch size is kept as 8 samples. The model parameters are initialized using the Glorot uniform initializer [33].

V. TEMPERATURE DRIFT COMPENSATION

We adapt the temperature drift compensation method proposed in [19], which showed the efficacy of artificial neural networks (ANNs). Following its methodology and experimental settings, we train a multilayer perceptron (MLP) regressor to predict the pH of the solution given $V_{ref}$ and temperature readings. MLP is a class of ANN. An MLP consists of at least three layers: an input layer, hidden layer(s), and an output layer. The input layer takes the independent variables, i.e., voltage and temperature, as input. The hidden layer consists of independent processing units (nodes) that linearly combine the input using weights, add a shift (bias) term, and pass the output through a non-linear activation function such as ReLU. The output layer collects all the outputs from the preceding hidden layer and linearly combines them using weights to predict the dependent variable, i.e., pH. In our task, we use only one hidden layer with 128 ($2^7$) nodes. The nodes use the ReLU activation function, i.e., $\max(0, x)$, where $x$ is the input to the function. Fig. 16 depicts the ANN architecture used.

We learn the model parameters, i.e., the weights and biases of each node in the MLP, using temperature drift data obtained through simulations of the developed ISFET SPICE macro model. The data consist of 8,001 samples for each pH (2, 3, 4, 5, 6, 7, 8, 9, 10). We split the data into train and test sets. The train set consists of all the data samples collected for pH 2 through 10, except for 6,001 randomly sampled instances of the pH under testing. Thus, the train set contains 8,001 × 9 − 6,001 = 66,008 instances. The test set contains the 6,001 instances of the specific pH under testing, for example, pH 4. We train the model until the change in the mean squared error (MSE) loss between the predicted and true pH (under testing) is less than 0.01. Each iteration updates the parameters once in the direction that reduces the loss function.

VI. FPGA IMPLEMENTATION AND POWER CONSUMPTION FEASIBILITY STUDY FOR IOT DEPLOYMENT

We implemented the RNN model used for sensor drift compensation on the Xilinx® ZCU104 FPGA development board shown in Fig. 14. First, the FPGA was booted with a PYNQ (Python Productivity for ZYNQ) image. PYNQ is a tool that can directly run Python libraries and code on the FPGA without the need to write an architecture-level hardware description in Verilog. Next, Docker was installed on the PYNQ framework and a compatible TensorFlow Docker container was downloaded [34]. We then built this container and executed it using Python. The RNN models were run in a Python terminal and used for the sensor drift compensation, using the TensorFlow library. In order to measure the power consumption, the onboard PMBus available on the ZCU104 board was used. The real-time data was accessed using a Jupyter notebook running in parallel with the Python terminal, and the power consumption was monitored while the RNN inference ran multiple times. The results of the power consumption study are presented in Table IV.
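The per-inference energy reported in Table IV follows from the measured powers and inference times. The sketch below reproduces that arithmetic for the LSTM and GRU entries of Table IV and extends it to the once-every-15-minutes duty cycle (96 inferences per day) considered in Section VII. Function and variable names are illustrative, and taking the RNN power as the difference between the running and idle (fan on) board readings is an assumption that matches the tabulated values.

```python
# Board-level power readings in watts, taken from Table IV.
P_IDLE_WITH_FAN_W = 11.85                       # board + PYNQ + fan, no RNN
P_RUNNING_W = {"LSTM": 12.16, "GRU": 12.13}     # board + PYNQ + fan + RNN
T_INFERENCE_S = {"LSTM": 0.2033, "GRU": 0.2063}  # time per inference (s)


def rnn_power_w(model):
    """Extra power drawn while the RNN runs (assumed: running minus idle)."""
    return P_RUNNING_W[model] - P_IDLE_WITH_FAN_W


def energy_per_inference_mj(model):
    """E_INF = t x P_RNN, expressed in millijoules."""
    return rnn_power_w(model) * T_INFERENCE_S[model] * 1e3


def daily_energy_j(model, inferences_per_day=96):
    """Energy in joules for one day of periodic inference."""
    return energy_per_inference_mj(model) * inferences_per_day / 1e3


lstm_mj = energy_per_inference_mj("LSTM")
gru_mj = energy_per_inference_mj("GRU")
lstm_daily_j = daily_energy_j("LSTM")
```

Because the idle board power dominates the total, subtracting it isolates the cost attributable to inference, which is the quantity relevant for a purpose-built IoT node without unused peripherals.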


Fig. 15: The left figure shows a representative RNN-based model. Input to the model is $V_{ref}$ ($v_t$), its natural logarithm $\log v_t$ ($u_t$), and time-stamp information $p_i := t/10$. The output of the RNN is 32-dimensional. The center figure shows a 3-dimensional t-SNE plot of the data (best viewed in color). The right figure is its 2-dimensional t-SNE map. Each colored dot represents the true pH value of the model input. We observe that dots with similar values are clustered together.

TABLE II: Mean squared error

                   |                    Standard                     |                   Regularized
Precision  Model   | pH:2   pH:4   pH:7   pH:10  min    max    avg   | pH:2   pH:4   pH:7   pH:10  min    max    avg
32-bit     lstm    | 0.110  0.007  0.036  0.041  0.007  0.110  0.048 | 0.081  0.003  0.028  0.024  0.003  0.081  0.034
32-bit     gru     | 0.142  0.083  0.001  0.654  0.001  0.654  0.220 | 0.301  0.130  0.002  0.570  0.002  0.570  0.250
32-bit     bilstm  | 0.045  0.006  0.003  0.010  0.003  0.045  0.016 | 0.109  0.003  0.002  0.027  0.002  0.109  0.035
32-bit     bigru   | 0.344  0.208  0.001  1.393  0.001  1.393  0.487 | 0.342  0.178  0.001  1.407  0.001  1.407  0.482
64-bit     lstm    | 0.090  0.002  0.014  0.034  0.002  0.090  0.035 | 0.063  0.005  0.013  0.024  0.005  0.063  0.026
64-bit     gru     | 0.341  0.105  0.001  0.644  0.001  0.644  0.273 | 0.333  0.082  0.001  0.830  0.001  0.830  0.312
64-bit     bilstm  | 0.057  0.007  0.003  0.044  0.003  0.057  0.028 | 0.041  0.003  0.008  0.020  0.003  0.041  0.018
64-bit     bigru   | 0.374  0.230  0.001  1.093  0.001  1.093  0.425 | 0.325  0.227  0.001  0.981  0.001  0.981  0.383
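The summary columns of Table II, and the RMSE quoted for the best model in the conclusion, can be cross-checked with a few lines of Python. The sketch below uses the 32-bit standard biLSTM row and assumes that the reported RMSE is the square root of the average MSE; `summarize` is an illustrative helper name.

```python
import math

# Test MSE of the 32-bit, standard-loss biLSTM from Table II,
# keyed by the pH value that was held out for testing.
bilstm_32_standard = {2: 0.045, 4: 0.006, 7: 0.003, 10: 0.010}


def summarize(mse_by_ph):
    """Return the min, max, and average MSE plus the corresponding RMSE."""
    values = list(mse_by_ph.values())
    avg_mse = sum(values) / len(values)
    return {
        "min": min(values),
        "max": max(values),
        "avg": avg_mse,
        "rmse": math.sqrt(avg_mse),   # assumed definition of the reported RMSE
    }


summary = summarize(bilstm_32_standard)
```

Under this assumption the average MSE of 0.016 corresponds to an RMSE of about 0.126 pH, which is the figure reported for the biLSTM.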

Fig. 16: Artificial neural network architecture. $w_1$, $w_2$, and $w_3$ are model parameters to be learned via the training process. $w_1$ and $w_2$ are weights corresponding to the first node in the hidden layer; $b_1$ is the bias term.

Fig. 17: Distribution of predictions made by 32-bit RNNs (best viewed in color).

VII. RESULTS AND DISCUSSION

We compare the models trained for two losses, i.e., $L$ (standard) and $L_R$ (regularized). Furthermore, we compare the performances when the model parameters are restricted to 32-bit and 64-bit floating-point values, the latter occupying twice the hardware memory as compared to the former setting. Table II shows the mean squared error (MSE) between the actual pH and the predicted $\widehat{\mathrm{pH}}$. In the case of LSTM, the average MSE of the regularized models is better than that of their standard counterparts. However, in three of the six remaining model-precision combinations (32-bit GRU, 32-bit biLSTM, and 64-bit GRU), the standard setting outperforms the regularized one. This can be seen as a result of the penalty applied to regularized models, which reduces their capacity to learn parameters with high values. In the case of regularized models with 64-bit values, we find that 3 out of 4 models perform better than their 32-bit analogue. This suggests that the performance lost by restricting large parameter values is compensated by the extra floating-point bits. For the regularized versions, the average loss of the 64-bit biLSTM is the minimum. We observe that GRU variants perform worse than LSTM variants under similar settings. Overall, the 32-bit biLSTM outperforms all the other model-loss-precision settings considered in the experiments. In Fig. 15, we plot a 3-dimensional t-SNE map [35] of the 32-dimensional representations obtained from the data at the output of the biLSTM (32-bit standard). We observe that the same pH values are mapped close to each other. This is further evident from the corresponding 2-dimensional t-SNE plot. Fig. 17 shows the pH-wise distribution of the 32-bit standard model predictions on test-data samples. We observe that most


of the models make significant prediction errors for pH 10; this can also be observed in Table II. We suspect this behavior is a result of a shift in the device temporal characteristics relative to the generic patterns learned from the training data, which comprise pH 2, 4, and 7. All the models trained on samples from pH 2, 4, and 10 are observed to perform well for test samples from pH 7 (MSE ≤ 0.036 pH²).

The temperature compensation results are shown in Table III. The iteration-wise loss for the temperature drift compensation is shown in Fig. 18.

Fig. 18: MSE loss at each learning iteration for temperature drift compensation.

We observe that the MLP is able to compensate for the temperature drift with a mean squared deviation from the true pH of less than 0.18. We also calculate the proportion of variance in the true pH that is explained by the model's independent variables ($R^2$). It is a performance measure of the model on an unseen test set.

$$R^2(\mathrm{pH}, \widehat{\mathrm{pH}}) = 1 - \frac{\sum_{i=1}^{n} (\mathrm{pH}_i - \widehat{\mathrm{pH}}_i)^2}{\sum_{i=1}^{n} (\mathrm{pH}_i - \overline{\mathrm{pH}})^2} \tag{9}$$

where $\widehat{\mathrm{pH}}$ is the model-predicted pH; and for $n$ samples:

$$\overline{\mathrm{pH}} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{pH}_i \tag{10}$$

The MLP obtains an $R^2$ score of 0.991, which shows the effectiveness of the developed model. The best possible value of $R^2$ is 1.

TABLE III: ANN (MLP) based temperature compensation.

pH      2      4      7      10     All
MSE     0.081  0.001  0.068  0.175  0.081
RMSE    0.286  0.031  0.260  0.418  0.286
R2      -      -      -      -      0.991

Table IV provides a summary of the power consumed by the various RNN models when implemented on the ZCU104 FPGA development board. Although the overall power consumption of the development board is high due to several unused onboard peripherals, the extra power consumed by running the onboard RNN computations is relatively low. A smarter way to save onboard power is to train the neural network on a server, as training requires high-end computations, while the deployable IoT solution only performs the inference task. As the device characteristics vary with wear and tear, the onboard network parameters can be fine-tuned accordingly. The average energy per inference (prediction) is shown in Table IV. The per-inference energy consumption of the GRU and LSTM algorithms is 57.76 mJ and 63.02 mJ, respectively. Consider a scenario where the deployed IoT sensor is used once every 15 minutes for a day; the total energy consumed by the sensor will be approximately 6 J for 96 inferences per day³. Commercial batteries have an energy capacity of up to 20,000 J [36], which makes it feasible to perform the RNN inference on battery power for a long time. This analysis supports the applicability of RNNs for IoT deployment on platforms like FPGAs. Our results are in line with reported studies [37]–[39], which have implemented similar RNN models for low-power IoT applications, such as image and text processing on FPGA.

³ Keeping the smart sensor in sleep mode at all other times for near-zero power consumption.

TABLE IV: Summary of power consumption by various RNN models on the Xilinx ZCU104 FPGA Development Board

Quantity                                                                      LSTM    BiLSTM  GRU     BiGRU
Power consumption by FPGA Development Board + PYNQ (W)                        11.55 (common to all models)
Power consumption by FPGA Development Board + PYNQ + FAN (W)                  11.85 (common to all models)
Power consumption by FPGA Development Board + PYNQ + FAN + RNN running (W)    12.16   12.14   12.13   12.13
Power consumption by RNN, P_RNN (W)                                           0.31    0.29    0.28    0.28
Time taken by RNN during inference, t (s)                                     0.2033  0.2080  0.2063  0.2038
Energy consumed per inference by the RNN, E_INF = t x P_RNN (mJ)              63.02   60.32   57.76   58.32

We compare our study with previous state-of-the-art works and present a summary in Table V. Our study employs efficient ML techniques for compensating both temporal and temperature drift and also ensures a low-computation design to facilitate robust implementation, which is suitable for IoT applications.

VIII. CONCLUSION

We have proposed a 4-step method to develop an ISFET-based smart pH sensor that is robust under varying temporal and ambient temperature conditions. We performed experiments to record the temporal and temperature drift behavior of the ISFET sensor, extracted the device parameters, and developed an accurate SPICE macro model of the ISFET device using the experimental data. We simulated the macro model to collect the device characteristics of the ISFET and also used it as a subcircuit model in a CVCC topology to obtain voltage readings for a given pH at different times and temperatures.

We focus on developing a robust ML model to compensate the temporal drift. We split the problem into two sub-problems: prediction and projection. The projection part is solved using regression curve fitting, while the prediction part uses a popular class of machine learning techniques, i.e., RNNs. A rigorous evaluation shows that the bidirectional variant biLSTM achieves the least average root mean square error (RMSE) between true and predicted pH values, i.e., 0.126 pH. The experiments test the model in a memory-constrained environment. In the case of temperature drift, we exploit the well-established multi-layer perceptron (MLP) to compensate


TABLE V: Comparison of our study with other ISFET based sensor development research works. (✓ = addressed, - = not addressed)

Ref. [40]
  Target application: Real-time and non-invasive health care devices, sweat sensing in sports
  Temperature compensation: ✓
  Temporal compensation: -
  Major technology/tools used: InGaZnO material system
  Additional features: Wearable, flexible, low cost and disposable

Ref. [5]
  Target application: Biomedical and personal genome diagnostics
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: Dual-mode sensing array (both optical and chemical sensing mode), correlated double mode (CDS) readout circuit and pipelined ADC
  Additional features: Accurate, high throughput pH sensing and low cost

Ref. [14]
  Target application: Water quality/environmental monitoring
  Temperature compensation: ✓
  Temporal compensation: ✓
  Major technology/tools used: Programmable current mirror circuit and MCU (TI MSP430 series) based ISFET sensory system
  Additional features: Precision pH sensory function, long term monitoring and low power

Ref. [41]
  Target application: Generic applications for data driven sensing devices
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: Sequential bias reconfiguration of ISFET and machine learning (support vector regression and backpropagation neural network)
  Additional features: Self reliable with the help of wireless Android IoT

Ref. [42]
  Target application: Online water pollution monitoring
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: BSS (blind source separation) algorithm to detect ion concentrations and MCU (TI MSP430 series)
  Additional features: Low power, low cost, and small

Ref. [43]
  Target application: Arduino based smart aquaponics system
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: Relay driver circuit for water parameter correction and automatic feeding system, Arduino YUN MCU and Android
  Additional features: Low computation, both photon and pH bi-direction capabilities

Ref. [44]
  Target application: In-situ water environment monitoring
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: MCU (AVR ATMEGA-328 8-bit MCU), Huawei GSM900 module, SPI, mobile networks, Li-Ion batteries, PHP, MySQL and Android (Mini SDK: API 8, OS 2.2)
  Additional features: Low cost, low power and easy to implement and expand

Ref. [45]
  Target application: pH/light bi-functional sensing device for generic IoT applications
  Temperature compensation: -
  Temporal compensation: -
  Major technology/tools used: Dual gate ISFET (DG ISFET), sequential control method and Back Propagation Neural Network (BPNN)
  Additional features: Low cost and low power

Ref. [46]
  Target application: On-chip DNA amplification and detection for Lab-on-a-Chip diagnostics (Point of Care (PoC) diagnostics)
  Temperature compensation: -
  Temporal compensation: ✓
  Major technology/tools used: 2-stage sawtooth oscillator, TSMC 0.18 um M6 CMOS technology, pH to PWM readout circuit, ISFET based linear OTA (Operational Transconductance Amplifier), on-chip R-2R 11-bit DAC, 24-bit slave SPI unit, FPGA development platform terasIC DE2i-150 and Keithley 2602 (for controlling electrode potential)
  Additional features: Portable, merges chemical and temperature sensing in the same pixel, programmable

This Work
  Target application: IoT applications in long term environmental monitoring in a harsh climate, biomedical monitoring and NBC warfare defense systems
  Temperature compensation: ✓
  Temporal compensation: ✓
  Major technology/tools used: Machine learning: Recurrent Neural Networks (RNNs), i.e., BiLSTMs, for temporal drift compensation and an Artificial Neural Network (ANN) multilayer perceptron regressor (MLP) for temperature drift compensation, CVCC readout circuit and MCU/FPGA based hardware
  Additional features: Accurate, programmable, functional in extreme conditions and low computational cost

for the device's non-ideality. We observed that the MLP achieves an RMSE of 0.286 pH. We also implemented the RNN models on the Xilinx ZCU104 FPGA development board using the PYNQ framework and found that they consume very low energy, around 60 mJ per inference (approximately 6 J per day at 96 inferences), and are very suitable for IoT applications. Thus, the results show that low-computation ML algorithms can be used to effectively compensate temporal and temperature drift in ISFET pH sensors deployed in various IoT applications.

ACKNOWLEDGMENT

The authors are grateful to the Director, CSIR-CEERI, Pilani for his constant support and motivation to carry out this work. They are thankful to all the scientific and technical staff members of the Semiconductor Devices Area for the discussions and technical support. They would like to thank Dr. T. Eshwar and Dr. Satyam Srivastava for providing support to carry out the device measurements. The authors are also thankful to Dr. Vinay Chamola for providing the Xilinx ZCU104 FPGA Development Board and Mr. Anubhav Elhence for support in the implementation of RNN models on the FPGA board. They are grateful to CSIR, New Delhi for providing the research facilities and financial support to carry out this research work.

REFERENCES

[1] N. Moser, J. Rodriguez-Manzano, T. S. Lande, and P. Georgiou, “A scalable ISFET sensing and memory array with sensor auto-calibration for on-chip real-time DNA detection,” IEEE Transactions on Biomedical Circuits and Systems, vol. 12, no. 2, pp. 390–401, 2018.
[2] V. Pachauri and S. Ingebrandt, “Biologically sensitive field-effect transistors: from ISFETs to NanoFETs,” Essays in Biochemistry, vol. 60, no. 1, pp. 81–90, 2016.
[3] M. J. Schöning and A. Poghossian, “Recent advances in biologically sensitive field-effect transistors (BioFETs),” Analyst, vol. 127, no. 9, pp. 1137–1151, 2002.
[4] J. Zeng, N. Miscourides, and P. Georgiou, “A 128×128 current-mode ultra-high frame rate ISFET array for ion imaging,” in 2018 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2018, pp. 1–5.
[5] X. Huang, H. Yu, X. Liu, Y. Jiang, M. Yan, and D. Wu, “A dual-mode large-arrayed CMOS ISFET sensor for accurate and high-throughput pH sensing in biomedical diagnosis,” IEEE Transactions on Biomedical Engineering, vol. 62, no. 9, pp. 2224–2233, 2015.
[6] T. Hyodo, M. Yuto, H. Tanigawa, M. Tsuruoka, H. Sakamoto, T. Ueda, K. Kamada, and Y. Shimizu, “Solid-state ISFET-based sensors capable of measuring acidity of lubricants,” ECS Transactions, vol. 98, no. 12, p. 59, 2020.
[7] N. Moser, T. S. Lande, C. Toumazou, and P. Georgiou, “ISFETs in CMOS and emergent trends in instrumentation: A review,” IEEE Sensors Journal, vol. 16, no. 17, pp. 6496–6514, 2016.


[8] G. Seo, G. Lee, M. J. Kim, S.-H. Baek, M. Choi, K. B. Ku, C.-S. Lee, S. Jun, D. Park, H. G. Kim et al., “Rapid detection of COVID-19 causative virus (SARS-CoV-2) in human nasopharyngeal swab specimens using field-effect transistor-based biosensor,” ACS Nano, vol. 14, no. 4, pp. 5135–5142, 2020.
[9] P. J. Bresnahan Jr, T. R. Martz, Y. Takeshita, K. S. Johnson, and M. LaShomb, “Best practices for autonomous measurement of seawater pH with the Honeywell Durafet,” Methods in Oceanography, vol. 9, pp. 44–60, 2014.
[10] P. Bergveld, “Thirty Years of ISFETOLOGY: What Happened in the Past 30 Years and What May Happen in the Next 30 Years,” Sensors and Actuators B: Chemical, vol. 88, no. 1, pp. 1–20, 2003.
[11] S. Jamasb, “Current-mode signal enhancement in the ion-selective field effect transistor (ISFET) in the presence of drift and hysteresis,” IEEE Sensors Journal, 2020.
[12] Q. Liu, X. Li, M. Ye, S. S. Ge, and X. Du, “Drift compensation for electronic nose by semi-supervised domain adaption,” IEEE Sensors Journal, vol. 14, no. 3, pp. 657–665, 2013.
[13] L. Zhang, Y. Liu, Z. He, J. Liu, P. Deng, and X. Zhou, “Anti-drift in e-nose: A subspace projection approach with drift reduction,” Sensors and Actuators B: Chemical, vol. 253, pp. 407–417, 2017.
[14] D. Chen and P. K. Chan, “An intelligent ISFET sensory system with temperature and drift compensation for long-term monitoring,” IEEE Sensors Journal, vol. 8, no. 12, pp. 1948–1959, 2008.
[15] P. K. Chan and D. Chen, “A CMOS ISFET Interface Circuit with Dynamic Current Temperature Compensation Technique,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 54, no. 1, pp. 119–129, 2007.
[16] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[17] M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning. MIT Press, 2018.
[18] Z. C. Lipton, J. Berkowitz, and C. Elkan, “A critical review of recurrent neural networks for sequence learning,” arXiv preprint arXiv:1506.00019, 2015.
[19] R. Bhardwaj, S. Majumder, P. K. Ajmera, S. Sinha, R. Sharma, R. Mukhiya, and P. Narang, “Temperature compensation of ISFET based pH sensor using artificial neural networks,” in 2017 IEEE Regional Symposium on Micro and Nanoelectronics (RSM). IEEE, 2017, pp. 155–158.
[20] R. Bhardwaj, S. Sinha, N. Sahu, S. Majumder, P. Narang, and R. Mukhiya, “Modeling and simulation of temperature drift for ISFET-based pH sensor and its compensation through machine learning techniques,” International Journal of Circuit Theory and Applications.
[21] S. Sinha, R. Bhardwaj, N. Sahu, H. Ahuja, R. Sharma, and R. Mukhiya, “Temperature and temporal drift compensation for Al2O3-gate ISFET-based pH sensor using machine learning techniques,” Microelectronics Journal, vol. 97, p. 104710, 2020.
[22] A. Mehta, H. Ahuja, N. Sahu, R. Bhardwaj, S. Srivastava, and S. Sinha, “Machine learning techniques for performance enhancement of Si3N4-gate ISFET pH sensor,” in 2020 IEEE 17th India Council International Conference (INDICON). IEEE, 2020, pp. 1–7.
[30] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[31] F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, “Learning precise timing with LSTM recurrent networks,” Journal of Machine Learning Research, vol. 3, no. Aug, pp. 115–143, 2002.
[32] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[33] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249–256.
[34] flodutot GitHub repository. (2021) Tensorflow for aarch 64. [Online]. Available: https://github.com/flodutot/tensorflow_aarch64
[35] L. v. d. Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. Nov, pp. 2579–2605, 2008.
[36] U. B. . E. PRODUCTS. (2015) U9vl-j-p battery technical datasheet. [Online]. Available: https://cellpacksolutions.co.uk/wp-content/uploads/2015/06/ultralife-u9vl-jp-technical-data-sheet.pdf
[37] Y. Hao and S. Quigley, “The implementation of a deep recurrent neural network language model on a Xilinx FPGA,” arXiv preprint arXiv:1710.10296, 2017.
[38] A. X. M. Chang, B. Martini, and E. Culurciello, “Recurrent neural networks hardware implementation on FPGA,” arXiv preprint arXiv:1511.05552, 2015.
[39] C. Gao, D. Neil, E. Ceolini, S.-C. Liu, and T. Delbruck, “DeltaRNN: A power-efficient recurrent neural network accelerator,” in Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018, pp. 21–30.
[40] S. Nakata, T. Arie, S. Akita, and K. Takei, “Wearable, flexible, and multifunctional healthcare device with an ISFET chemical sensor for simultaneous sweat pH and skin temperature monitoring,” ACS Sensors, vol. 2, no. 3, pp. 443–448, 2017.
[41] W.-E. Hsu, Y.-H. Chang, and C.-T. Lin, “A machine-learning assisted sensor for chemo-physical dual sensing based on ion-sensitive field-effect transistor architecture,” IEEE Sensors Journal, vol. 19, no. 21, pp. 9983–9990, 2019.
[42] S. Bermejo, G. Bedoya, V. Parisi, and J. Cabestany, “An on-line water monitoring system using a smart ISFET array,” in IEEE 2002 28th Annual Conference of the Industrial Electronics Society. IECON 02, vol. 4. IEEE, 2002, pp. 2797–2802.
[43] E. Galido, L. Tolentino, B. Fortaleza, R. Corvera, A. De Guzman, V. Española, C. Gambota, A. Gungon, K. Lapuz, N. Arago et al., “Development of a solar-powered smart aquaponics system through internet of things (IoT),” Lecture Notes on Research and Innovation in Computer Engineering and Computer Sciences, pp. 31–39, 2019.
[44] F. Cao, F. Jiang, Z. Liu, B. Chen, and Z. Yang, “Application of ISFET microsensors with mobile network to build IoT for water environment monitoring,” in 2014 International Conference on Intelligent Environments. IEEE, 2014, pp. 207–210.
[23] S. Sinha, R. Mukhiya, R. Sharma, P. Khanna, and V. Khanna, “Fabric- [45] W.-E. Hsu, Y.-H. Chang, Y.-J. Huang, J.-C. Huang, and C.-T. Lin, “A
ation, characterization and electrochemical simulation of aln-gate isfet ph/light dual-modal sensing isfet assisted by artificial neural networks,”
ph sensor,” Journal of Materials Science: Materials in Electronics, pp. ECS Transactions, vol. 89, no. 6, p. 31, 2019.
1–12, 2019. [46] M. Cacho-Soblechero, K. Malpartida-Cardenas, C. Cicatiello,
[24] S. Martinoia, G. Massobrio, and L. Lorenzelli, “Modeling isfet micro- J. Rodriguez-Manzano, and P. Georgiou, “A dual-sensing thermo-
sensor and isfet-based microsystems: a review,” Sensors and Actuators chemical isfet array for dna-based diagnostics,” IEEE Transactions on
B: chemical, vol. 105, no. 1, pp. 14–27, 2005. Biomedical Circuits and Systems, vol. 14, no. 3, pp. 477–489, 2020.
[25] M. W. Shinwari, M. J. Deen, and D. Landheer, “Study of the electrolyte-
insulator-semiconductor field-effect transistor (eisfet) with applications
in biosensor design,” Microelectronics Reliability, vol. 47, no. 12, pp.
2025–2057, 2007.
[26] P. Bergveld, “Thirty years of isfetology: What happened in the past 30
years and what may happen in the next 30 years,” Sensors and Actuators
B: Chemical, vol. 88, no. 1, pp. 1–20, 2003.
[27] S. Sinha, N. Sahu, R. Bhardwaj, H. Ahuja, R. Sharma, R. Mukhiya, and
C. Shekhar, “Modeling and simulation of temporal and temperature drift
for the development of an accurate isfet spice macromodel,” Journal of
Computational Electronics, vol. 19, no. 1, pp. 367–386, 2020.
[28] S. Martinoia and G. Massobrio, “A Behavioral Macromodel of the
ISFET in SPICE,” Sensors and Actuators B: Chemical, vol. 62, no. 3,
pp. 182–189, 2000.
[29] A. Ortiz-Conde, F. G. Sánchez, J. J. Liou, A. Cerdeira, M. Estrada,
and Y. Yue, “A review of recent mosfet threshold voltage extraction
methods,” Microelectronics reliability, vol. 42, no. 4-5, pp. 583–596,
2002.

