
Age of Incorrect Information for Remote Estimation of a Binary Markov Source

Clement Kam∗, Sastry Kompella∗, and Anthony Ephremides‡
∗Information Technology Division, Naval Research Laboratory, Washington, DC
‡Electrical and Computer Engineering Department, University of Maryland, College Park, MD

Abstract—For monitoring applications, the Age of Information (AoI) metric has been the primary focus of recent research, but closely related to monitoring is the problem of real-time or remote estimation. Age of Information has been shown to be insufficient for minimizing remote estimation error, but recently a metric known as Age of Incorrect Information (AoII) was proposed that characterizes the cost of a monitor being in an erroneous state over time. In this work, we study the AoII metric in the simple context of monitoring a symmetric binary information source over a delay system with feedback. We compare three different performance metrics: real-time error, AoI, and AoII. For each metric, we formulate the optimal sampling problem as a Markov decision process and apply a dynamic programming algorithm to compute the optimal performance and policy. We also simulate the system for two sampling policies, sample-at-change and zero-wait, and we observe which policy coincides with the optimal policy for each metric. For a variety of delay distributions and AoII penalty functions, we observe that the optimal policies for the real-time error and for AoII are both equal to the sample-at-change policy, whereas the optimal policy for AoI is a threshold policy.

I. INTRODUCTION

In applications ranging from the Internet of Things to timely data analysis, there has been a surge of interest in the freshness of information for real-time remote monitoring applications. These studies have been formalized in the analysis of the Age of Information (AoI) metric [1]–[7]. Although the AoI characterizes freshness according to the data with the most recent generation time, it may be more useful to use the received data to estimate the status of the source in real time. In this work, we focus on this problem of real-time remote estimation and its relation to Age of Information.

There have been a number of studies on real-time remote estimation and Age of Information. In [8], the authors studied the problem of remotely estimating a Wiener process over a delay channel, and they showed that the policy that optimizes the mean squared error (MSE) performance is a signal-dependent threshold policy, and that minimizing MSE is not equivalent to minimizing AoI. Later, they showed in [9] that if the sampling policy must be independent of the observed process, the age-optimal policy is also the MSE-optimal policy. This work was extended to the Ornstein-Uhlenbeck process and nonlinear age metrics in [10]. In [11], two sampling policies, sample-at-change and zero-wait, were compared for the discrete-time case, and the age and error performance were analyzed. This analysis proved once again that age and error performance are not directly correlated, and various effective age metrics were proposed that are designed to correlate with error performance.

In considering alternative metrics to age, there have been some studies on nonlinear age penalties. In [12], the authors propose two related metrics characterizing the possibly nonlinear cost of having stale information. In [13], the authors optimized the continuous-time sampling for non-negative, non-decreasing functions of age. They extended this work in [14] to include possibly negative non-decreasing functions of age, and both continuous- and discrete-time versions of the sampling problem.

In this work, we focus on a recent remote estimation metric called the Age of Incorrect Information (AoII), which was proposed in [15]. In contrast to real-time error performance, AoII is designed to capture the application-dependent cost of not having a correct estimate for some amount of time. The authors of [15] studied a basic AoII metric and optimized the sampling policy with and without energy constraints, and they showed that the AoI-optimal policy does not perform as well as the policy that directly minimizes AoII. One difference in the work presented here is that our system does not allow samples chosen for transmission to be preempted before being received at the monitor.

Our contributions in this work include deriving the optimal sampling policy for the real-time error, AoI, and AoII metrics for remotely estimating a symmetric binary Markov source. We formulate the problems as Markov decision processes, and we compute the optimal policies for the infinite horizon, average cost problem using dynamic programming. We generate numerical results for a variety of delay distributions and AoII penalties, and compare with simulation results for the sample-at-change and zero-wait policies. Our results show that in the cases studied, the optimal sampling policy for AoII coincides with that of the real-time error, which is the sample-at-change policy.

The remainder of the paper is organized as follows. In Sec. II, we present the system model for the remote monitoring system and the source signal. In Sec. III, we present the three performance metrics that we consider, formulate the optimal sampling problem as an MDP for each metric, and provide the dynamic programming solutions. We also describe how the algorithms are altered to handle different delay distributions. In Sec. IV, we present numerical results for each of the metrics and for different delay distributions and AoII penalty functions. Lastly, we present some conclusions based on these results in Sec. V.

Fig. 1. Real-time estimation system with feedback: the source signal $X(t)$ enters the channel, the real-time estimator outputs $\hat{X}(t)$, and an ACK is fed back from the monitor to the source.

Fig. 2. Symmetric binary Markov source: states 0 and 1, with cross-transition probability $p$ and self-transition probability $\bar{p}$.

II. SYSTEM MODEL

A. Remote Monitoring Model

The real-time remote estimation model that we study is shown in Fig. 1. We consider a slotted-time system in which the source decides at the beginning of a slot whether or not to sample a signal $X(t)$, provided the channel is not busy (i.e., a sample being transmitted cannot be preempted). The sample enters the channel, and after a delay of a random number of slots, it is received error-free at the monitor, which uses the samples to estimate the current state of the source. Initially, we consider a geometrically distributed delay, but we also extend the analysis to more general distributions. Once the monitor receives the sample, it sends an instantaneous acknowledgment to the source, letting it know that the channel is free to transmit another sample, which eliminates the need for queueing. This system was considered in our previous work [11], in which we showed that the sampling policy that minimizes AoI does not minimize the real-time error performance, and vice versa.

B. Source Model

The source model is a symmetric binary Markov source shown in Fig. 2, where the probability of changing state is $p$ and of remaining in the same state is $\bar{p} = 1 - p$. The source state transitions occur just before the sample transmission opportunities. In this work, for convenience of describing the problem formulation, we assume $p < 1/2$, for which we showed in [11] that the optimal estimator at the monitor is simply the freshest sample available. If we were to consider the $p > 1/2$ case, we could extend the following analysis using the corresponding best estimator, which alternates every other slot (since the source is more likely to change than to stay the same in consecutive slots).
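As a concrete illustration of this model, the following is a minimal Monte-Carlo sketch in Python (our own code, not the authors' implementation; the slot-boundary conventions are simplified, and triggering a sample whenever the source differs from the current estimate is one natural reading of the sample-at-change policy compared later in Sec. IV):

```python
import random

def simulate(p=0.2, p_s=0.5, T=100_000, policy="change", seed=1):
    """Sketch of the model: binary source with flip probability p,
    geometric delay with per-slot success p_s, and the freshest-sample
    estimator. Returns the average real-time error."""
    rng = random.Random(seed)
    x, x_hat = 0, 0        # source state and monitor estimate
    in_flight = None       # value of the sample currently in the channel
    errors = 0
    for _ in range(T):
        if rng.random() < p:          # transition just before sampling
            x ^= 1
        if in_flight is None:         # channel idle: sampling decision
            if policy == "zero" or (policy == "change" and x != x_hat):
                in_flight = x
        if in_flight is not None and rng.random() < p_s:
            x_hat = in_flight         # delivered; instantaneous ACK
            in_flight = None
        errors += (x_hat != x)
    return errors / T

print(simulate(policy="change"), simulate(policy="zero"))
```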
III. OPTIMAL SAMPLING POLICIES

Our goal is to determine the best sampling policy for the following remote monitoring/estimation performance metrics: 1) real-time error, 2) Age of Information, and 3) Age of Incorrect Information. For each performance metric, we formulate the infinite horizon, discrete-time Markov decision process (MDP), for which we need to specify the state space $\mathcal{S}$, the action space $\mathcal{A}$ of whether to transmit or remain idle when the channel is idle, the cost function $g(s)$, $s \in \mathcal{S}$, and the state transition probabilities $\Pr(s_{n+1} \,|\, s_n, u(s_n))$, $u(s) \in \mathcal{A}$. To determine the probabilities, we first consider a geometrically distributed random delay, where the probability of successful reception at the monitor in a given slot is equal to $p_s$.

A. Real-Time Error

The first performance metric we consider is the average real-time error, defined as $E = \lim_{T \to \infty} \frac{1}{T} \int_0^T \mathbb{1}\{\hat{X}(t) \neq X(t)\}\,dt$. Since we are considering a symmetric binary Markov source, the MDP does not need to track the actual state of the source and the estimate, so we can reduce the dimensions of the state space as follows. We define the states as $s = (s_1 s_2)$, where $s_1$ is 0 if the estimate is correct and 1 if the estimate is incorrect; $s_2$ is i if the transmitter is idle, 0 if the update in transmission is correct, and 1 if the update in transmission is incorrect. For the actions, if $s_2 \neq \mathrm{i}$, there is no action to take since the channel is busy. If $s_2 = \mathrm{i}$, then the action is either to begin transmission or to remain idle until the next decision point at the next slot. (The actions are the same for all three performance metrics.) Here, the cost at each stage is $g(s_1 s_2) = s_1$, i.e., the cost is 1 exactly when the current estimate is incorrect. Lastly, the transition probabilities $\Pr(s_{n+1} \,|\, s_n, u(s_n))$ are functions of the probability $p$ of the source changing its state and the success probability $p_s$ of the geometrically distributed delay (we write $\bar{p} = 1 - p$ and $\bar{p}_s = 1 - p_s$). These probabilities are worked out in the dynamic programming algorithm below.

To solve the average cost MDP, we use the relative value iteration (RVI) [16] algorithm, which, if convergent, yields the optimal average cost (and policy) for finite state and action spaces. We denote the mapping $Th(x) = \min_{u \in \mathcal{A}} \big[ g(x) + \sum_{j \in \mathcal{S}} p_{xj}(u)\, h(j) \big]$, $x \in \mathcal{S}$. This is the dynamic programming mapping that will be applied in our RVI algorithm. Using state 0i as the arbitrary reference state, the $k$th iteration of the RVI for minimizing the real-time error is given as follows (RVI$_E$):

$$Th^k(0\mathrm{i}) = \min\{\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i}),\ \bar{p}_s(\bar{p}h^k(00) + ph^k(11)) + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i}))\}$$

$$h^{k+1}(1\mathrm{i}) = 1 + \min\{\bar{p}h^k(1\mathrm{i}) + ph^k(0\mathrm{i}),\ \bar{p}_s(\bar{p}h^k(10) + ph^k(01)) + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i}))\} - Th^k(0\mathrm{i})$$

$$h^{k+1}(00) = p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i})) + \bar{p}_s(\bar{p}h^k(00) + ph^k(11)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(01) = p_s(\bar{p}h^k(1\mathrm{i}) + ph^k(0\mathrm{i})) + \bar{p}_s(\bar{p}h^k(01) + ph^k(10)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(10) = 1 + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i})) + \bar{p}_s(\bar{p}h^k(10) + ph^k(01)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(11) = 1 + p_s(\bar{p}h^k(1\mathrm{i}) + ph^k(0\mathrm{i})) + \bar{p}_s(\bar{p}h^k(11) + ph^k(00)) - Th^k(0\mathrm{i})$$

where for the states $x\mathrm{i}$, the first term in the min is for remaining idle, and the second term is for sampling and transmitting. All other states do not have an action to take. After the algorithm has converged, the optimal cost is given by $Th^k(0\mathrm{i})$.

As an example of how these transition probabilities are computed, for state 0i, if we choose to remain idle, remaining in state 0i occurs with probability $\Pr(0\mathrm{i} \,|\, 0\mathrm{i}, \mathrm{i}) = \bar{p}$ and changing to state 1i occurs with probability $\Pr(1\mathrm{i} \,|\, 0\mathrm{i}, \mathrm{i}) = p$. However, if we choose to transmit, the transmission may be unsuccessful with probability $\bar{p}_s$, in which case the source state transitions with probability $p$, so $\Pr(11 \,|\, 0\mathrm{i}, \mathrm{t}) = \bar{p}_s p$, and so on.
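To make the recursion concrete, here is a minimal Python sketch of RVI$_E$ (our own illustrative code, not the authors' implementation; the state encoding, tolerance, and stopping rule are our assumptions):

```python
# Relative value iteration for the real-time-error MDP (RVI_E).
# States (s1, s2): s1 = 1 iff the estimate is incorrect; s2 = 'i'
# (channel idle), 0 (update in flight correct), 1 (update incorrect).

def rvi_error(p, p_s, tol=1e-9, max_iter=100_000):
    pb, psb = 1.0 - p, 1.0 - p_s          # pb = 1 - p, psb = 1 - p_s
    states = [(0, 'i'), (1, 'i'), (0, 0), (0, 1), (1, 0), (1, 1)]
    h = {s: 0.0 for s in states}
    for _ in range(max_iter):
        # state 0i: choose between idling and transmitting
        idle0 = pb * h[(0, 'i')] + p * h[(1, 'i')]
        tx0 = psb * (pb * h[(0, 0)] + p * h[(1, 1)]) \
            + p_s * (pb * h[(0, 'i')] + p * h[(1, 'i')])
        T0 = min(idle0, tx0)              # Th(0i), the reference state
        # state 1i: same two actions, with stage cost 1
        idle1 = pb * h[(1, 'i')] + p * h[(0, 'i')]
        tx1 = psb * (pb * h[(1, 0)] + p * h[(0, 1)]) \
            + p_s * (pb * h[(0, 'i')] + p * h[(1, 'i')])
        new = {
            (0, 'i'): 0.0,                # h(0i) = Th(0i) - Th(0i)
            (1, 'i'): 1 + min(idle1, tx1) - T0,
            (0, 0): p_s * (pb * h[(0, 'i')] + p * h[(1, 'i')])
                    + psb * (pb * h[(0, 0)] + p * h[(1, 1)]) - T0,
            (0, 1): p_s * (pb * h[(1, 'i')] + p * h[(0, 'i')])
                    + psb * (pb * h[(0, 1)] + p * h[(1, 0)]) - T0,
            (1, 0): 1 + p_s * (pb * h[(0, 'i')] + p * h[(1, 'i')])
                    + psb * (pb * h[(1, 0)] + p * h[(0, 1)]) - T0,
            (1, 1): 1 + p_s * (pb * h[(1, 'i')] + p * h[(0, 'i')])
                    + psb * (pb * h[(1, 1)] + p * h[(0, 0)]) - T0,
        }
        if max(abs(new[s] - h[s]) for s in states) < tol:
            break
        h = new
    policy = {(0, 'i'): 'idle' if idle0 <= tx0 else 'transmit',
              (1, 'i'): 'idle' if idle1 <= tx1 else 'transmit'}
    return T0, policy

# Expected outcome: idle at 0i, transmit at 1i (sample-at-change).
print(rvi_error(p=0.2, p_s=0.5))
```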

B. Age of Information

The second performance metric is the average Age of Information, which is defined as $\Delta = \lim_{T \to \infty} \frac{1}{T} \int_0^T (t - u(t))\,dt$, where $u(t)$ is the time stamp of the freshest packet at the monitor. To formulate the MDP, the states are defined as $(s_1 s_2)$, where $s_1$ is the current AoI at the start of the slot, and $s_2$ is the current AoI of the update in transmission, or i if the channel is idle. The cost at each stage is equal to the average AoI over the time slot, $g(s_1 s_2) = s_1 + \frac{1}{2}$. The transition probabilities depend only on the probability of successful reception in each slot, since the AoI does not depend on the correctness of the estimate. The transition probabilities are shown in the RVI algorithm below.

Since the RVI requires a finite state space, we truncate the states such that the AoI never exceeds some maximum value $x_{\max}$. We will see in our numerical results that $x_{\max}$ need not be very large (here, 20) to approach the optimal performance. Using state 1i as the arbitrary reference state, the $k$th iteration of the RVI for minimizing AoI is as follows (RVI$_{AoI}$):

$$Th^k(1\mathrm{i}) = \tfrac{3}{2} + \min\{h^k(2\mathrm{i}),\ \bar{p}_s h^k(21) + p_s h^k(1\mathrm{i})\}$$

$$h^{k+1}(x\mathrm{i}) = x + \tfrac{1}{2} + \min\{h^k((x{+}1)\mathrm{i}),\ \bar{p}_s h^k((x{+}1)1) + p_s h^k(1\mathrm{i})\} - Th^k(1\mathrm{i}), \quad 0 \le x < x_{\max}$$

$$h^{k+1}(xy) = x + \tfrac{1}{2} + \bar{p}_s h^k((x{+}1)(y{+}1)) + p_s h^k((y{+}1)\mathrm{i}) - Th^k(1\mathrm{i}), \quad 0 \le x < x_{\max},\ 0 \le y < x$$

$$h^{k+1}(x_{\max}\mathrm{i}) = x_{\max} + \tfrac{1}{2} + \min\{h^k(x_{\max}\mathrm{i}),\ \bar{p}_s h^k(x_{\max}1) + p_s h^k(1\mathrm{i})\} - Th^k(1\mathrm{i})$$

$$h^{k+1}(x_{\max}y) = x_{\max} + \tfrac{1}{2} + \bar{p}_s h^k(x_{\max}(y{+}1)) + p_s h^k((y{+}1)\mathrm{i}) - Th^k(1\mathrm{i}), \quad 0 \le y < x_{\max}$$

$$h^{k+1}(x_{\max}x_{\max}) = x_{\max} + \tfrac{1}{2} + \bar{p}_s h^k(x_{\max}x_{\max}) + p_s h^k(x_{\max}\mathrm{i}) - Th^k(1\mathrm{i})$$
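As a quick check we add here, the per-slot stage cost follows from the area under the age sawtooth during one slot: if the AoI is $s_1$ at the start of the slot, it grows linearly to $s_1 + 1$ by the slot's end, so

$$g(s_1 s_2) = \int_0^1 (s_1 + \tau)\,d\tau = s_1 + \tfrac{1}{2},$$

which also gives the constant $\tfrac{3}{2}$ in $Th^k(1\mathrm{i})$ as the stage cost of the reference state $s_1 = 1$.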

C. Age of Incorrect Information

The third performance metric is the average Age of Incorrect Information, which is generally defined as $\Delta_{\mathrm{AoII}} = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(t)\, e(X(t), \hat{X}(t))\,dt$ [15], where $f(t)$ is a time penalty function and $e(X, \hat{X})$ is an estimate penalty function. Initially, we use the form $\Delta_{\mathrm{AoII}} = \lim_{T \to \infty} \frac{1}{T} \int_0^T (t - v(t))\,\mathbb{1}\{\hat{X}(t) \neq X(t)\}\,dt$, so that the time penalty is just the time since the estimate was last correct ($v(t)$), and the estimate penalty $e(\cdot)$ is just whether or not the estimate is incorrect (this AoII is the form introduced and analyzed in [15]). To formulate the MDP, the states are defined as $(s_1 s_2)$, where $s_1$ is the current AoII at the end of the slot, and $s_2$ is the same as in the real-time error case (i: channel is idle, 0: update in transmission is correct, 1: update in transmission is incorrect). The cost at each stage is equal to the average AoII over the time slot:

$$g(s_1 s_2) = \begin{cases} s_1 - \frac{1}{2} & s_1 \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$

As with AoI, we need a finite state space and therefore truncate the states so that the AoII never exceeds $x_{\max}$. Using state 0i as the arbitrary reference state, the $k$th iteration of the RVI for minimizing AoII is as follows (RVI$_{AoII}$):

$$Th^k(0\mathrm{i}) = \min\{\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i}),\ \bar{p}_s(\bar{p}h^k(00) + ph^k(11)) + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i}))\}$$

$$h^{k+1}(x\mathrm{i}) = x - \tfrac{1}{2} + \min\{\bar{p}h^k((x{+}1)\mathrm{i}) + ph^k(0\mathrm{i}),\ \bar{p}_s(\bar{p}h^k((x{+}1)0) + ph^k(01)) + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k((x{+}1)\mathrm{i}))\} - Th^k(0\mathrm{i}), \quad 0 < x < x_{\max}$$

$$h^{k+1}(x_{\max}\mathrm{i}) = x_{\max} - \tfrac{1}{2} + \min\{\bar{p}h^k(x_{\max}\mathrm{i}) + ph^k(0\mathrm{i}),\ \bar{p}_s(\bar{p}h^k(x_{\max}0) + ph^k(01)) + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(x_{\max}\mathrm{i}))\} - Th^k(0\mathrm{i})$$

$$h^{k+1}(00) = p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(1\mathrm{i})) + \bar{p}_s(\bar{p}h^k(00) + ph^k(11)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(01) = p_s(\bar{p}h^k(1\mathrm{i}) + ph^k(0\mathrm{i})) + \bar{p}_s(\bar{p}h^k(01) + ph^k(10)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(x0) = x - \tfrac{1}{2} + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k((x{+}1)\mathrm{i})) + \bar{p}_s(\bar{p}h^k((x{+}1)0) + ph^k(01)) - Th^k(0\mathrm{i}), \quad 0 < x < x_{\max}$$

$$h^{k+1}(x1) = x - \tfrac{1}{2} + p_s(\bar{p}h^k((x{+}1)\mathrm{i}) + ph^k(0\mathrm{i})) + \bar{p}_s(\bar{p}h^k((x{+}1)1) + ph^k(00)) - Th^k(0\mathrm{i}), \quad 0 < x < x_{\max}$$

$$h^{k+1}(x_{\max}0) = x_{\max} - \tfrac{1}{2} + p_s(\bar{p}h^k(0\mathrm{i}) + ph^k(x_{\max}\mathrm{i})) + \bar{p}_s(\bar{p}h^k(x_{\max}0) + ph^k(01)) - Th^k(0\mathrm{i})$$

$$h^{k+1}(x_{\max}1) = x_{\max} - \tfrac{1}{2} + p_s(\bar{p}h^k(x_{\max}\mathrm{i}) + ph^k(0\mathrm{i})) + \bar{p}_s(\bar{p}h^k(x_{\max}1) + ph^k(00)) - Th^k(0\mathrm{i})$$
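Similarly (our added step), since $s_1$ is the AoII at the end of the slot, the penalty grows linearly from $s_1 - 1$ to $s_1$ over a slot in which the estimate is incorrect, so for $s_1 \neq 0$

$$g(s_1 s_2) = \int_0^1 (s_1 - 1 + \tau)\,d\tau = s_1 - \tfrac{1}{2}.$$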
D. Other Delay Distribution

To determine the behavior for a non-geometrically distributed delay, we consider a random delay with pmf $\Pr(D = d) \ge 0$ for $1 \le d \le d_{\max}$, and $\Pr(D = d) = 0$ otherwise. To handle this, we augment the state spaces above as $(s_1 s_2 s_3)$, where $s_3$ is the residual delay of the update in transmission. We alter the three RVI algorithms above to accommodate the transition to the appropriate state when a sample is taken and a random delay is drawn. These modifications are straightforward and thus are not shown in this paper; an illustrative sketch follows.
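For illustration only, here is one plausible way to encode the augmentation (our own sketch; the pmf representation, state encoding, and backup helper are assumptions, not the authors' implementation):

```python
# Augmented states (s1, s2, s3), where s3 is the residual delay of the
# update in flight. delay_pmf maps d -> Pr(D = d), 1 <= d <= d_max.
delay_pmf = {1: 0.5, 20: 0.5}   # e.g., the two-state delay of Sec. IV-C

def transmit_backup(h, stay, flip, p):
    """DP backup for the transmit action: instead of the p_s / (1 - p_s)
    reception terms, average the relative value function h over the drawn
    delay D. 'stay' and 'flip' are the (s1, s2) pairs reached when the
    source keeps or changes its state; the residual delay d - 1 is stored
    as s3 and counts down by one per slot until the update is delivered."""
    pb = 1.0 - p
    return sum(prob * (pb * h[stay + (d - 1,)] + p * h[flip + (d - 1,)])
               for d, prob in delay_pmf.items())
```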
IV. NUMERICAL RESULTS

A. Geometrically Distributed Delay

We run the RVI algorithms to determine the optimal performance for each metric. The parameters used are $p_s = 0.5$ and $x_{\max} = 20$. We also simulate the system under two sampling policies: 1) sample-at-change, which samples when the source signal changes state, and 2) zero-wait, which samples immediately after the previous sample has been received and the channel is free.

Fig. 3 plots the three performance metrics for the two simulated policies and the optimal performance from the RVI algorithm. For the real-time error and the Age of Incorrect Information, the optimal performance coincides with the sample-at-change policy, while for the Age of Information, the optimal performance coincides with the zero-wait policy. We see that this performance is essentially achieved by the RVI even with $x_{\max}$ only equal to 20. In this and all of the cases tested, the sampling policy induced by the RVI$_E$ and RVI$_{AoII}$ algorithms (shown in Fig. 4) is confirmed to be sample-at-change. The policy at $p = 0.5$ does not affect performance because the signal is then independent of the past.

Fig. 3. Geometrically distributed delay, $p_s = 0.5$: (a) real-time error, (b) Age of Information, (c) Age of Incorrect Information, each versus $p$ for the sample-at-change and zero-wait simulations and the RVI optimum.

Fig. 4. Optimal policy for RVI$_{AoII}$: transmit or idle as a function of the AoII and $p$.

B. Zipf Distributed Delay

We would like to test various other settings to see whether the optimal AoII policy deviates from the optimal $E$ policy. First we consider different delay distributions. Here we use the Zipf distribution, which has the following pmf:

$$\Pr(D = d) = \frac{d^{-a}}{\sum_{i=1}^{d_{\max}} i^{-a}}, \qquad a > 0.$$
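As a quick illustration (our own sketch, with the parameter values stated in the text), the delay pmfs used in our experiments can be generated as follows:

```python
# The three delay pmfs used in the numerical results (illustrative).
def zipf_pmf(a=3.0, d_max=20):
    """Truncated Zipf pmf: Pr(D = d) proportional to d**(-a)."""
    w = [d ** -a for d in range(1, d_max + 1)]
    z = sum(w)
    return {d: w[d - 1] / z for d in range(1, d_max + 1)}

def geometric_pmf(p_s=0.5, d_max=200):
    """Geometric delay, truncated at d_max (remaining tail is negligible)."""
    return {d: (1 - p_s) ** (d - 1) * p_s for d in range(1, d_max + 1)}

two_state_pmf = {1: 0.5, 20: 0.5}     # Pr(D = 1) = Pr(D = 20) = 1/2

print(zipf_pmf()[1], zipf_pmf()[20])  # most of the mass sits on short delays
```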
In our numerical results, we set $a = 3$ and $d_{\max} = 20$. The results of the simulations and RVI algorithms are shown in Fig. 5. The AoI plot is the least affected by the delay distribution, but the values for $E$ and AoII are noticeably lower than in the geometric case in Fig. 3. However, the optimal policies for $E$ and AoII are still the sample-at-change policy.

C. Two-State Delay Distribution

We also consider a simple two-state delay distribution, which has the pmf $\Pr(D = 1) = \Pr(D = 20) = \frac{1}{2}$. The results of the simulations and RVI algorithms are shown in Fig. 6. The AoI plot shows that the zero-wait policy is no longer optimal, as has been shown in [14], and the optimal sampling policy induced by the RVI is a threshold policy (as proven in [14]), where a sample is only taken when $s_1 \ge 8$. However, for $E$ and AoII, the optimal policy is still the sample-at-change policy. Note that for the AoII, the penalty is decreasing as $p$ increases from 0.05 to 0.5, so using the received information to estimate is actually detrimental to AoII performance compared to simply randomly guessing the state, in which case the sampling policy does not impact the AoII performance.

D. Scaled Linear Time Penalty

We also investigate the impact of the time penalty $f(t)$ on the optimal sampling policy. Instead of $f(t) = t - v(t)$, we consider scaling this penalty as $f(t) = \alpha(t - v(t))$, and we update the cost function calculation in the RVI algorithm to be the average penalty over the slot:

$$g(s_1 s_2) = \begin{cases} \alpha\left(s_1 - \frac{1}{2}\right) & s_1 \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$
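A quick sanity check we add here: because every stage cost is scaled by the same constant $\alpha > 0$, the dynamic programming mapping scales uniformly,

$$\min_{u \in \mathcal{A}}\Big[\alpha g(x) + \sum_{j \in \mathcal{S}} p_{xj}(u)\,\alpha h(j)\Big] = \alpha \min_{u \in \mathcal{A}}\Big[g(x) + \sum_{j \in \mathcal{S}} p_{xj}(u)\, h(j)\Big],$$

so the minimizing action is unchanged; the optimal policy must coincide with the $\alpha = 1$ case, and only the average AoII scales by $\alpha$, consistent with the results below.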

Fig. 5. Zipf distributed delay, $a = 3$, $d_{\max} = 20$: (a) real-time error, (b) Age of Information, (c) Age of Incorrect Information.

Fig. 6. Delay with $\Pr(D = 1) = \Pr(D = 20) = \frac{1}{2}$: (a) real-time error, (b) Age of Information, (c) Age of Incorrect Information.

The results for the AoII for $\alpha = 0.5, 1, 3$ are shown in Fig. 7 for the geometrically, Zipf, and two-state distributed delays. Although the AoII values change, the policy that optimizes the AoII is still the sample-at-change policy, with the exception that the two-state delay is a degenerate case where randomly guessing yields better AoII performance.

E. Power Law Time Penalty

We also consider a nonlinear time penalty, following a power law as $f(t) = (t - v(t))^\alpha$. The average penalty over the slot used in the RVI algorithm is now as follows:

$$g(s_1 s_2) = \begin{cases} \dfrac{s_1^{\alpha+1} - (s_1 - 1)^{\alpha+1}}{\alpha + 1} & s_1 \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$
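For completeness (our added step), this stage cost is again the per-slot area under the penalty curve: with the AoII rising from $s_1 - 1$ to $s_1$ across the slot,

$$g(s_1 s_2) = \int_{s_1 - 1}^{s_1} \tau^{\alpha}\, d\tau = \frac{s_1^{\alpha+1} - (s_1 - 1)^{\alpha+1}}{\alpha + 1},$$

which reduces to the linear case $s_1 - \tfrac{1}{2}$ at $\alpha = 1$.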
The results for the AoII for $\alpha = 1, 2, 3$ are shown in Fig. 8 for the geometrically, Zipf, and two-state distributed delays. Again, the AoII values change, but the optimal policy is still the sample-at-change policy (aside from the random-guessing estimator for the two-state case).

V. CONCLUSION

In this work, we studied the problem of remote estimation of a symmetric binary Markov source and various performance metrics. For each metric, we formulated the optimal sampling problems as MDPs and implemented the relative value iteration (RVI), the dynamic programming algorithm for the infinite horizon, average cost stochastic control problem. Our numerical results verified past results that the optimal policy for the Age of Information is a threshold policy (in many cases, the threshold was 0), and the optimal policy for both the real-time error and the Age of Incorrect Information was to sample only when the source signal flipped. We showed that the two policies coincided for different delay distributions¹ and penalty functions. On the one hand, our results suggest that the AoII is a sufficient metric for minimizing the real-time error for this binary Markov source. On the other hand, they suggest

Fig. 7. Scaled linear penalty: average AoII versus $p$ for $\alpha = 0.5, 1.0, 3.0$. (a) Geometric delay, $p_s = 0.5$. (b) Zipf distributed delay, $a = 3$, $d_{\max} = 20$. (c) Delay with $\Pr(D = 1) = \Pr(D = 20) = \frac{1}{2}$.

Fig. 8. Power law penalty: average AoII versus $p$ for $\alpha = 1.0, 2.0, 3.0$. (a) Geometric delay, $p_s = 0.5$. (b) Zipf distributed delay, $a = 3$, $d_{\max} = 20$. (c) Delay with $\Pr(D = 1) = \Pr(D = 20) = \frac{1}{2}$.

that the specifics of the AoII function, such as the time penalty function $f(t)$, do not matter in optimizing the sampling policy for the system studied here, provided the penalty function is non-negative and non-decreasing in time. The similarity in the dynamic programming mapping $Th(x)$ for the two metrics, as seen in a number of identical equations in the RVI$_E$ and RVI$_{AoII}$ algorithms, also supports this hypothesis. We plan to continue investigating the AoII for different source models with more states, different estimation error metrics, and a multiuser setting with different AoII penalty definitions. We also plan to apply reinforcement learning under different models to see how different metrics impact the learning rate.

¹ Aside from the degenerate case where it is better for AoII to randomly guess the state.

ACKNOWLEDGMENT

This work was supported by the Office of Naval Research.

REFERENCES

[1] S. Kaul, R. Yates, and M. Gruteser, "Real-time status: How often should one update?" in Proc. IEEE INFOCOM, Orlando, FL, Mar. 2012, pp. 2731–2735.
[2] C. Kam, S. Kompella, and A. Ephremides, "Age of information under random updates," in Proc. IEEE International Symposium on Information Theory (ISIT), Istanbul, Turkey, Jul. 2013, pp. 66–70.
[3] L. Huang and E. Modiano, "Optimizing age-of-information in a multi-class queueing system," in 2015 IEEE International Symposium on Information Theory (ISIT), Jun. 2015, pp. 1681–1685.
[4] K. Chen and L. Huang, "Age-of-information in the presence of error," in 2016 IEEE International Symposium on Information Theory (ISIT), Jul. 2016, pp. 2579–2583.
[5] C. Kam, S. Kompella, G. D. Nguyen, J. E. Wieselthier, and A. Ephremides, "Age of information with a packet deadline," in 2016 IEEE International Symposium on Information Theory (ISIT), Jul. 2016.
[6] A. M. Bedewy, Y. Sun, and N. B. Shroff, "Optimizing data freshness, throughput, and delay in multi-server information-update systems," in 2016 IEEE International Symposium on Information Theory (ISIT), Jul. 2016, pp. 2569–2573.
[7] R. D. Yates, E. Najm, E. Soljanin, and J. Zhong, "Timely updates over an erasure channel," in 2017 IEEE International Symposium on Information Theory (ISIT), Jun. 2017, pp. 316–320.
[8] Y. Sun, Y. Polyanskiy, and E. Uysal-Biyikoglu, "Remote estimation of the Wiener process over a channel with random delay," in 2017 IEEE International Symposium on Information Theory (ISIT), Jun. 2017, pp. 321–325.
[9] ——, "Optimal sampling and remote estimation of the Wiener process over a channel with random delay," CoRR, vol. abs/1707.02531, 2017, withdrawn. [Online]. Available: http://arxiv.org/abs/1707.02531
[10] T. Z. Ornee and Y. Sun, "Sampling for remote estimation through queues: Age of information and beyond," CoRR, vol. abs/1902.03552, 2019. [Online]. Available: http://arxiv.org/abs/1902.03552
[11] C. Kam, S. Kompella, G. D. Nguyen, J. E. Wieselthier, and A. Ephremides, "Towards an effective age of information: Remote estimation of a Markov source," in 2018 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Apr. 2018.
[12] A. Kosta, N. Pappas, A. Ephremides, and V. Angelakis, "Age and value of information: Non-linear age case," in 2017 IEEE International Symposium on Information Theory (ISIT), Jun. 2017, pp. 326–330.
[13] Y. Sun, E. Uysal-Biyikoglu, R. D. Yates, C. E. Koksal, and N. B. Shroff, "Update or wait: How to keep your data fresh," IEEE Transactions on Information Theory, vol. 63, no. 11, pp. 7492–7508, Nov. 2017.
[14] Y. Sun and B. Cyr, "Sampling for data freshness optimization: Non-linear age functions," CoRR, vol. abs/1812.07241, 2018. [Online]. Available: http://arxiv.org/abs/1812.07241
[15] A. Maatouk, S. Kriouile, M. Assaad, and A. Ephremides, "The age of incorrect information: A new performance metric for status updates," CoRR, vol. abs/1907.06604, 2019. [Online]. Available: http://arxiv.org/abs/1907.06604
[16] D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 1995, vol. 2.

