Nxt forging algorithm: simulating approach

andruiman, andruiman@gmail.com

October 17, 2014
Abstract
In this paper we investigate the properties of the forging algorithm
used in PoS crypto-currency networks such as Nxt. Our approach is
statistical modeling and simulation. We analyze the currently
implemented algorithm and identify some of its weaknesses. We find
that the block generation time depends on the balance distribution
over network accounts and that, even in the simplest case of a single
node, it does not converge to the specified value of 1 minute per block.
We also present some newer regulation techniques which help to avoid
these issues and allow nodes to adapt so that blocks are generated in
the specified average time interval, independent of the balance
distribution, in both the static and dynamic cases.
Keywords: PoS crypto-currencies, forging, statistical simulation
1 Pre-introduction
We begin a series of papers concentrating on the PoS algorithms themselves and their implementations in Nxt. Our final goal is to develop a working model which can simulate different algorithms and approaches quickly and produce analyzable data. We plan to base that model on a mix of mathematical and statistical simulation (as in this paper), formal logic proofs (with the help of the Coq system, http://coq.inria.fr/) and a fast prototyping language (Haskell, http://www.haskell.org/, at the moment). Since we do not yet care much about performance, we believe this is a reasonable choice. Details of our plans are given in the preceding papers [1] and [2].

To simulate something we need it to be predictable, measurable and modifiable. We start with some basic entities of the model and move on to the forging algorithm, which we would like to investigate so that we can play with the parameters and see what happens in the simulation. This paper therefore considers the forging algorithm from different sides. The author must note the outstanding work of mthcl at http://www.docdroid.net/ecmz/forging0-5-2.pdf.html and his precise investigation of the probabilistic properties of the forging algorithm. In our paper we observe some of the same results using statistical simulation and propose some different correcting procedures.

We would like to note that this is not a strict mathematical paper. We skip some details, do not try to prove theorems and do not describe the numerical experiments with great accuracy, although some of the data is of course available. Our goal is to give an impression of different regulating procedures and their results, in order to understand what is worth including in the simulating system and what is not (at least not yet), and which parameters are important or even critical and which are less important for the health of the network.

To support this work please use Nxt address: NXT-L892-ZKXZ-2JJY-AD9JV
2 Introduction
The forging process, considered as the opposite of mining, is used in Proof-of-Stake (PoS) crypto-currency networks to build a blockchain, which is a sequence of blocks containing all the network-specific data in a structured typeclass. For details see https://wiki.nxtcrypto.org. A forging algorithm can, however, be examined from the mathematical point of view, with the goal of constructing optimal and effective core network clients whose collective work leads to the specified network behavior. We divide the paper into sections moving from the easiest case to more realistic ones, discovering the properties that a forging process needs to have.
Let us consider a network of $N$ nodes, where each node corresponds to some user account, but not vice versa (think of sleeping accounts). Each account has some balance value $V_n$; together these do not exceed the total system balance and, taken over all accounts, sum to it: $\sum V_n = V$. So for live accounts (those in nodes) we have the inequality $\sum V_n \le V$. In the Nxt network $V = 10^9$. Further, we denote the blockchain sequence as $B_m$ and define a time interval within which we would like a block to be generated on average: $E t_m = \tau$, where $t_m$ is the interval between blocks $m$ and $(m - 1)$. In Nxt $\tau = 60$ (measured in seconds). We also have the zero block $B_0$, which is called the genesis block. Each block has a special measure called the base target $H_m$. To add a non-deterministic entity we suppose that each node can generate pseudo-random (natural) numbers $p_{nm}$ distributed between 0 and a big enough number, say $P$. They are called hits. In $p_{nm}$, $n$ stands for the node number and $m$ for the current block. Hereafter we assume the uniform distribution with infinitesimal measure $dp/P$ (we may think of $p$ as continuous since $P$ is very big). In Nxt $P = 2^{64} - 1$. The starting base target is defined so that the expectation $E t_1 = \tau$ [3] and is equal to $H_0 = \frac{P}{2 V \tau}$. In Nxt $H_0 = \frac{2^{64}}{2 \cdot 10^9 \cdot 60} = 153722867$. For now we also assume a static blockchain, which means $B_{nm} = B_{n'm} \equiv B_m$ throughout the paper.

[1] http://chepurnoy.org/blog/2014/10/inside-a-proof-of-stake-cryptocurrency-part-1/
[2] http://chepurnoy.org/blog/2014/10/inside-a-proof-of-stake-cryptocurrency-part-2/
The algorithm which is examined here and is currently implemented in Nxt is the following (we will refer to it as the original algorithm):

$$t_{nm} = p_{nm} / (V_n H_{m-1});$$
$$H_{max} = \min(V H_0,\ 2 H_{m-1});$$
$$H_{min} = \max(1,\ H_{m-1}/2);$$
$$H_c = t_{nm} H_{m-1} / \tau;$$
$$H_m = \min(H_{max},\ \max(H_{min},\ H_c)).$$

Substituting $t_{nm}$ and the max's and min's, we rewrite these equations as follows:

$$H_c = p_{nm} / (V_n \tau);$$
$$H_m = \min(V H_0,\ 2 H_{m-1},\ \max(1,\ H_{m-1}/2,\ H_c)).$$
In the next sections we consider the following cases: (1) one node - permanent balance, (2) one node - changing balance, (3) multi-node - permanent balance, (4) multi-node - changing balance.
Another important question we would like to investigate is what we expect from a perfect algorithm. At the moment our expectations are: (1) a perfect algorithm should be immune to the balance distribution, that is, it should not matter how the forging coins are distributed between accounts; (2) it should be immune to sudden changes of the forging balance (due to transactions or just machines being turned off); (3) it should be proportional to the account's forging balance, that is, over the total number of blocks generated we expect the contribution of each node to be proportional to its forging balance.

[3] Hereafter we calculate the mean value as an average within the series and denote $E x_k \equiv \lim_{K \to \infty} \frac{1}{K} \sum_{i \le K} x_i$.
3 One node - permanent balance
So we start our examination with the simplest case: $N = 1$, $V_1 = V$. Rewriting the algorithm we have:

$$H_c = p_m / (V \tau);$$
$$H_m = \min(V H_0,\ 2 H_{m-1},\ \max(1,\ H_{m-1}/2,\ p_m / (V \tau))).$$

Let us further normalize all the calculated quantities by $2 H_0$ to simplify the notation. So, let $p \in U[\varepsilon; 1]$ and set

$$H_0 = 0.5; \quad H_m = \min(V/2,\ 2 H_{m-1},\ \max(H_{m-1}/2,\ p_m))$$

where we introduce $\varepsilon = 1/P > 0$, which is small enough. An analytical solution for $E H_m \equiv \lim_{M \to \infty} \frac{1}{M} \sum_{m \le M} H_m$ is not straightforward, so we will sometimes use numerical results to demonstrate the properties. To get simulation results quickly we use Excel and Gnumeric (maybe not an excellent choice, but sometimes it works). The distribution of $H_m$ we obtained looks like the figure below.

[Figure: simulated distribution of $H_m$]
The distribution of $H_m$ looks pretty smooth and its mean value is around 0.5. However, for one node with permanent balance we may choose not to tune $H_m$ at all, setting $H_m \equiv 0.5$ for all $m$, with the expected result $E[p_m / H_m] = E p_m / H_0 = 0.5 / 0.5 = 1$. As we will see later, keeping $H_m$ close to constant is a good idea when the overall balance is permanent.

But what happens to the mean time interval $\tau_1$? Although $E H_m = 0.5$, $\tau_1$ does not converge to unity and is about 1.3 to 1.4. Here is the distribution of $t_m$:

[Figure: simulated distribution of $t_m$]
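The one-node experiment can be reproduced approximately with a short Python sketch of the normalized recursion (the paper itself used Excel and Gnumeric). The function name, the seed and the number of blocks are arbitrary choices made for this illustration, and the block time is taken as $t_m = p_m / H_{m-1}$ in normalized units.

```python
import random


def simulate_one_node(blocks=200_000, v_cap=0.5e9, seed=1):
    """Return (mean H_m, mean t_m) for the normalized one-node recursion.

    Hypothetical helper; v_cap stands for the V/2 bound and never binds here.
    """
    rng = random.Random(seed)
    h = 0.5                                  # normalized H_0
    sum_h = 0.0
    sum_t = 0.0
    for _ in range(blocks):
        p = 1.0 - rng.random()               # hit, uniform on (0, 1]
        sum_t += p / h                       # t_m = p_m / H_{m-1}
        h = min(v_cap, 2.0 * h, max(h / 2.0, p))
        sum_h += h
    return sum_h / blocks, sum_t / blocks


mean_h, mean_t = simulate_one_node()
print(f"mean H = {mean_h:.3f}, mean t = {mean_t:.3f}")
```

The point of the sketch is the gap between the two averages: the mean base target stays near its starting value while the mean block time ends up noticeably above unity.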
Again, we will not yet try to prove this analytically and instead consider a much easier task: what is $E[p_1 / p_2]$ with both $p_{1,2} \in U[1; P]$? We believe it is

$$E[p_1 / p_2] = \frac{1}{P^2} \int_1^P \int_1^P \frac{p_1}{p_2}\, dp_1\, dp_2 = \frac{1}{P^2} \left( \frac{P^2}{2} - 0.5 \right) \ln P \approx \frac{\ln P}{2}$$

where we neglect the term of order $1/P^2$ in the brackets.
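A quick Monte Carlo check of this estimate, as a sketch only: the helper names are hypothetical, and a small $P = 1000$ is used instead of $2^{64}$ so that the sample mean settles in reasonable time. `exact_ratio` uses $E[p_1] \cdot E[1/p_2]$ for independent uniform variables on $[1; P]$ with the exact normalization $1/(P-1)^2$, which differs from the formula above only by terms that vanish for large $P$.

```python
import math
import random


def mean_ratio(p_max, samples=400_000, seed=2):
    """Sample mean of p1/p2 for independent p1, p2 uniform on [1, p_max]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += rng.uniform(1.0, p_max) / rng.uniform(1.0, p_max)
    return total / samples


def exact_ratio(p_max):
    """E[p1] * E[1/p2] = (P + 1) / 2 * ln(P) / (P - 1)."""
    return (p_max + 1.0) / 2.0 * math.log(p_max) / (p_max - 1.0)


p_max = 1000.0
print(mean_ratio(p_max), exact_ratio(p_max), math.log(p_max) / 2.0)
```

All three printed numbers should be close to each other and well above 1, which is the whole point: the expectation of a ratio is not the ratio of expectations.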
This example shows what we get if we regulate $H_m$ naively by setting $H_m = p_m$; $t_m = p_m / H_{m-1}$: $E t_m$ converges to $\ln P / 2$ for $P$ big enough rather than to unity, although $E p_m / E H_m = 1$. The original algorithm is much more complex to solve analytically, but what we have demonstrated is that, obviously, $E[A/B] \ne EA / EB$ even for statistically independent variables. This question leads further to a theoretical equation for the kernel functions $f(x)$, $g(x)$, which here play the role of probability density functions:

$$E_g x \cdot E_f \left[ \frac{1}{x} \right] = \int (1/x) f(x)\, dx \int x\, g(x)\, dx = 1.$$
So, playing with the simplest case, we have realized that the tuning of $H_m$ should not be so rough. In the next section we try to overcome the problem of one-node time convergence with more interesting methods than merely keeping $H_m$ constant. The main goal is to find a good way of propagating probability from $p_m$ to $t_m$ through $H_m$, caring about $E t_m$ rather than $E H_m$, since the mean time interval is a more valuable property than the mean base target of the block. The latter will play a role when we come to the dynamic blockchain and even the blocktree, where we will be looking for the longest and therefore most trustworthy chain.
4 One node - changing balance
To see how we can regulate $H_m$ when the account's balance changes from block to block, let us introduce a new random entity, say $k_m$. The distribution of $k_m$ is not important for our goal, so let $k_m = k_0 + \alpha q$ with $q \in U[0; 1]$ for further simulations (this is a case where $E k_m$ exists, but in principle we may also be interested in cases where it does not). With the given $k_m$ we set $t_m = (p_m / H_{m-1})\, k_m$, so that $k_m$ scales the block time. We start with the naive approach and let $H_m = H_{m-1} t_m$, supposing that we regulate $H_m$ depending on the time of the last block. We have $H_m = p_m k_m$ and $t_m = \frac{p_m}{p_{m-1}} \frac{k_m}{k_{m-1}}$. That is not very good. We know that $K_p = E\left[ \frac{p_m}{p_{m-1}} \right] \approx \frac{\ln P}{2}$ and could add this coefficient directly to $H_m$: $H_m = p_m k_m K_p$ (which in fact works perfectly), but we do not know anything a priori about the distribution of $k_m$ and the corresponding value of $E\left[ \frac{k_m}{k_{m-1}} \right]$, because it depends on user actions.
So we need a more elegant method to regulate $H_m$. There is no proper way to predict user actions, but we can try to calculate some mean values on the fly to achieve our goal. In doing so we pursue two aims: to estimate the mean value of the account balance, and to keep the estimate relatively local. We will not wait for decades to calculate the correct mean value because, generally speaking, the balance distribution might have no mean value at all. So we use a moving average as a local estimate of the mean. We choose a window within which we calculate the average value of the forging balance and use it for local regulation of $H_m$. We have

$$R_m = \frac{1}{r} \sum_{i=m-r+1}^{m} k_i; \quad H_m = H_0 R_m$$

where $r$ is the window size. We actually get rather good results, with $E t_m$ close to unity and a distribution like:

[Figure: simulated distribution of $t_m$ with $H_m$ regulated by the moving average of $k_m$]
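A minimal Python sketch of this moving-average regulation follows. It assumes the normalized setting of section 3 ($p \in U[0; 1]$, $H_0 = 0.5$), reads the block time as $t_m = (p_m / H_{m-1})\, k_m$, and fills the first window with $E k_m$ as a warm-up. These choices, like the function name and the default parameters, are assumptions made for the illustration.

```python
import random
from collections import deque


def simulate_moving_average(blocks=200_000, r=16, k0=1.0, alpha=1.0, seed=3):
    """Mean block time when H_m = H_0 * (moving average of k over r blocks)."""
    rng = random.Random(seed)
    h0 = 0.5
    # Warm-up assumption: start the window at the mean value of k_m.
    window = deque([k0 + alpha / 2.0] * r, maxlen=r)
    sum_t = 0.0
    for _ in range(blocks):
        h_prev = h0 * sum(window) / r        # H_{m-1} = H_0 R_{m-1}
        p = rng.random()                     # hit
        k = k0 + alpha * rng.random()        # k_m = k_0 + alpha * q
        sum_t += p * k / h_prev              # t_m = (p_m / H_{m-1}) k_m
        window.append(k)                     # oldest k drops out automatically
    return sum_t / blocks


print(simulate_moving_average())
```

With these defaults the printed mean stays close to unity, in line with the result reported above.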
Now it is important to notice that these results are not as excellent as they may seem, for the following reasons: (1) we would still prefer not to know anything about the current forging balance (because this does not carry over easily to the multi-node case) and (2) the distribution of $t_m$ has a long tail and high density below unity. It would be preferable for it to look like a Gaussian distribution around unity. So we continue our investigation with other types of regulation.
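What if we tune $H_m$ directly based on measurements of the block time and define:

$$R_m = \frac{1}{r} \sum_{i=m-r+1}^{m} t_i; \quad H_m = H_0 R_m.$$

In this case we have $E t_m < 1$ with a distribution like

[Figure: simulated distribution of $t_m$ with $H_m$ regulated by the moving average of $t_m$]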
We see that the distribution is better, as it has a shorter tail, but we have over-regulated $H_m$ and obtained a mean time below the goal. Also, the mean block time depends on the distribution of $k_m$, which is unacceptable. This happens because we do not let $t_m$ relax between changes of $H_m$. So let it relax:

$$R_m = \frac{1}{r} \sum_{i=m-r+1}^{m} t_i \ \text{ if } \operatorname{mod}(m, r) = 0; \quad R_m = 1 \ \text{ otherwise}; \quad H_m = H_{m-1} R_m.$$

After this we actually have $E t_m$ a bit above unity, a relatively smooth $H_m$ (it stays constant between changes) and a distribution of $t_m$ with a short tail that is almost uniform before the tail:

[Figure: simulated distribution of $t_m$ with relaxed regulation]

Not bad for now. Let us go on to the multi-node case.
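The relaxed regulation can be sketched as follows, again in the normalized setting and with the block time read as $t_m = (p_m / H_{m-1})\, k_m$. The function name and defaults are hypothetical. Between updates $H$ is left untouched, and every $r$ blocks it is multiplied by the average block time of the window just finished.

```python
import random


def simulate_relaxed(blocks=160_000, r=16, k0=1.0, alpha=1.0, seed=4):
    """Return (mean block time, number of H updates) for relaxed regulation."""
    rng = random.Random(seed)
    h = 0.5                                   # normalized H_0
    window_sum = 0.0
    sum_t = 0.0
    h_changes = 0
    for m in range(1, blocks + 1):
        k = k0 + alpha * rng.random()         # k_m = k_0 + alpha * q
        t = rng.random() * k / h              # t_m = (p_m / H_{m-1}) k_m
        sum_t += t
        window_sum += t
        if m % r == 0:                        # mod(m, r) = 0: apply R_m
            h *= window_sum / r               # H_m = H_{m-1} R_m
            window_sum = 0.0
            h_changes += 1
        # otherwise R_m = 1 and H stays as it is
    return sum_t / blocks, h_changes


print(simulate_relaxed())
```

In this sketch the new $H$ after each update equals the window average of $p_m k_m$, so the mean block time is $E[p k] \cdot E[1 / \overline{p k}]$, which by Jensen's inequality is slightly above unity and approaches it as $r$ grows.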
5 Multi node - permanent balance
The only thing we need from this case is to understand how the mean block time depends on the distribution of the forging balance between nodes. For some regulation methods it actually does depend on it. Suppose the nodes are not concurrent, that is, the first block found is accepted by the system and instantly distributed between the nodes. The winner is then the node which finds a block in the shortest time. In the case of uniform and permanent balances this time is proportional to a random number generated by each node, so $t_{nm} \sim p_{nm}$, where the $p_{nm}$ are still uniformly distributed. So

$$t_m = \min_n t_{nm}.$$
Let us calculate the expectation $E \min_n p_{nm}$. It is

$$N \int_0^1 p\, dp \left( \int_p^1 dq \right)^{N-1} = N \int_0^1 p (1 - p)^{N-1}\, dp = -\int_0^1 p\, d[(1 - p)^N] = \int_0^1 (1 - p)^N\, dp = \frac{1}{N + 1} = \frac{2 E p}{N + 1}.$$
So if we naively put $E t_{nm} = N$ (which is the case when $H \approx H_0$), so that each node's time is inversely proportional to its balance, we have $E t_m = \frac{2N}{N+1} \to 2$ as $N \to \infty$. It is also important that the resulting mean value depends on the balance distribution if we do not tune $H_m$ carefully. Actually, we already have a method to do this even in the case of changing balance.

By the way, we notice that for the original forging algorithm we observe a mean time of $\sim \tau_1 \cdot \beta$, where $\beta \in [1; 2]$ depends on the forging balance distribution. In the real Nxt network the final block time value is around 1.9 at the moment.
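The result $E \min_n p_{nm} = 1/(N+1)$ and the resulting mean block time $2N/(N+1)$ are easy to check numerically. The sketch below assumes $N$ nodes with equal balances, hits uniform on $[0; 1]$ and the normalized $H_0 = 0.5$; the function names are hypothetical.

```python
import random


def expected_min_uniform(n):
    """Closed form from the integral above: E[min of n uniforms] = 1/(n+1)."""
    return 1.0 / (n + 1)


def simulated_block_time(n, samples=100_000, h0=0.5, seed=5):
    """Mean of t_m = min_n p_nm / (H_0 * (1/n)) for n equal-balance nodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        smallest_hit = min(rng.random() for _ in range(n))
        total += n * smallest_hit / h0
    return total / samples


for n in (1, 2, 10, 100):
    print(n, simulated_block_time(n, samples=20_000), 2.0 * n / (n + 1))
```

The simulated column should follow $2N/(N+1)$, moving from 1 for a single node towards 2 as the balance is split between more nodes.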
6 Multi node - changing balance
We use the regulation method of section 4 for each node, supposing that nodes immediately share the solved block and the corresponding $H$ values. So

$$H_{nm} \equiv H_m = R_m H_0;$$
$$R_0 = 1; \quad R_m = \frac{R_{m-1}}{r} \sum_{i=m-r+1}^{m} t_i \ \text{ if } \operatorname{mod}(m, r) = 0; \quad R_m = R_{m-1} \ \text{ otherwise};$$
$$t_{nm} = p_{nm} / (H_{nm} V_n);$$
$$t_m = \min_n t_{nm}.$$

We get a mean value of $t_m$ close to unity and the following simulated distribution:

[Figure: simulated distribution of $t_m$ for the multi-node case with changing balance]

We see that this distribution decreases as the argument grows from zero, but it has almost nothing in common with the Gaussian, which we believe is one of the best shapes for a value that should be close to a constant. Let us try to get it more concentrated around the goal value of unity. To do this we present a simple method which we call pool-in-nodes.
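For reference, the shared-$H$ regulation of this section can be sketched in Python as below. The sketch makes several simplifying assumptions that are not in the text: balances are fixed and normalized to sum to 1, hits are uniform on $(0; 1]$, $H_0 = 0.5$, and the base target used for block $m$ is the one known after block $m - 1$. The function name and defaults are hypothetical.

```python
import random


def simulate_multi_node(stakes, blocks=160_000, r=32, seed=6):
    """Mean block time when all nodes share H_m = R_m * H_0."""
    rng = random.Random(seed)
    total = float(sum(stakes))
    shares = [s / total for s in stakes]      # V_n normalized so that sum = 1
    h0 = 0.5
    big_r = 1.0                               # R_0 = 1
    window_sum = 0.0
    sum_t = 0.0
    for m in range(1, blocks + 1):
        h = big_r * h0                        # base target shared by all nodes
        # t_m = min_n p_nm / (H V_n): the fastest node wins the block
        t = min((1.0 - rng.random()) / (h * v) for v in shares)
        sum_t += t
        window_sum += t
        if m % r == 0:                        # update R once per window
            big_r *= window_sum / r
            window_sum = 0.0
    return sum_t / blocks


print(simulate_multi_node([5, 3, 2]))
print(simulate_multi_node([1] * 10))
```

Both calls should print a value a little above unity regardless of how the stake is split, which is the property the regulation is meant to provide.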
7 mthcl’s algorithm
At https://nxtforum.org/proof-of-stake-algorithm/forging-2088/40/ a new forging algorithm with two extra regulating parameters was proposed. We investigate it as well to reveal its statistical properties. The main idea is that it should be a bit more difficult to decrease the BaseTarget than to increase it. Here, the parameter bias is a number between 0 and 1, e.g. 1/2. This should dramatically decrease the probability of long times between blocks. The constant K is chosen in such a way that the expected time between blocks is 1 minute (so K is a function of bias). It is difficult to calculate K exactly, because the balance equation for the stationary measure of the system is too complicated. So it was proposed to simulate the process and obtain the parameter K numerically.

However, in mthcl's paper [4] we can find an adapted algorithm with one extra parameter, namely $\gamma$, and a second one, $\beta$, defined by $\gamma$. We will use the latter version of the algorithm as it is described in the paper (see pages 21–22):
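$$H_0 = 1;$$
$$t_{nm} = -\ln p / (H_m V_n), \quad p \in U[0; 1];$$
$$t_m = \min_n t_{nm};$$
$$H_{m+1} = H_m \cdot \begin{cases} 2, & \text{if } t_m \ge 2; \\ t_m, & \text{if } t_m \in (1; 2); \\ 1 - \gamma (1 - t_m), & \text{if } t_m \in (1/2; 1]; \\ 1/\beta, & \text{if } t_m \le 1/2; \end{cases}$$
$$\beta = (1 - \gamma/2)^{-1}.$$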
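The piecewise update above translates directly into code. The sketch below is an illustration of the equations as reproduced here, not mthcl's own implementation; the names `mthcl_factor` and `simulate_mthcl` are hypothetical, and the stake shares are assumed to be fixed and to sum to 1.

```python
import math
import random


def mthcl_factor(t, gamma):
    """Multiplier applied to H_m for a block time t (normalized units)."""
    if t >= 2.0:
        return 2.0
    if t > 1.0:
        return t
    if t > 0.5:
        return 1.0 - gamma * (1.0 - t)
    return 1.0 - gamma / 2.0                 # this is 1/beta


def simulate_mthcl(shares, gamma=0.5, blocks=100_000, seed=7):
    """Mean block time with exponential hits, t_nm = -ln(p) / (H_m V_n)."""
    rng = random.Random(seed)
    h = 1.0                                  # H_0 = 1
    sum_t = 0.0
    for _ in range(blocks):
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        t = min(-math.log(1.0 - rng.random()) / (h * v) for v in shares)
        sum_t += t
        h *= mthcl_factor(t, gamma)
    return sum_t / blocks


print(simulate_mthcl([0.5, 0.3, 0.2]))
```

Note that the factor is continuous at $t_m = 1/2$, where both neighbouring branches give $1 - \gamma/2$, and at $t_m = 1$ and $t_m = 2$.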
The simulation results for $H_m$ give the following distribution:

[Figure: simulated distribution of $H_m$ for mthcl's algorithm]

which seems very good and is also presented in the paper as a numerical solution for the PDF of the base target $H$. The algorithm also adapts well to the distribution of the stake between nodes, even when a node's balance fluctuates. The number of generated blocks is quite well proportional to the stake share of the generating node. For the block time distribution we have a picture like the following (it was intentionally cut off at 3):

[Figure: simulated distribution of the block time $t_m$ for mthcl's algorithm, cut off at 3]

with a mean value around unity for $\gamma \approx 0.5$. This distribution is a little similar to what we got in section 6 but with a more steeply descending shape because of the Exp distribution of the hits. However, the distribution above makes small intervals more likely and sometimes allows large intervals. In our simulation the interval $t_m$ ranged up to 20. We suggest that the distribution of the block time should be more concentrated around unity and never (or almost never) leave a reasonable neighborhood of it, and that is the main reason to continue our investigation. Nevertheless, the examined algorithm is better than the original one due to its high stability with respect to the current balance distribution, better proportionality and good mean value. A regulating parameter $\gamma$ taken from the neighborhood of 0.5 works well over a wide range of modified stake distributions. So for a given network it can be chosen once for a long run.

[4] http://www.docdroid.net/ecmz/forging0-5-2.pdf.html
8 Pool-in-nodes
Recall that the mean of uniformly distributed numbers is asymptotically normally distributed. Previously we calculated a node's internal block time as $t_{nm} = p_{nm} / (H_m V_n)$. Now suppose that we distinguish between a block and a sub-block (or, if one prefers, a super-block and just a block). Each sub-block is generated with less difficulty, so the normal base target is multiplied by some predefined number (say 16 or 32), which is equivalent to dividing the hit by the same number. Let us denote this number by $w$. The procedure is then the same: nodes build a sequence of sub-blocks, and after $w$ sub-blocks have been built the real block is generated. The remaining questions of what information should be included in the final block and how the fee should be distributed between the contributing nodes will be analyzed in the next paper. The simple idea is to behave like a pool and distribute the cumulative fee between the nodes which generated at least one sub-block, in proportion to the number of sub-blocks generated.

Each sub-block time is distributed as in section 6 within $[0; P / (w H_m \max_n V_n)]$. While before we regulated $H_m$ to have $E t_m$ close to unity, now we believe that the sum of $w$ such $t$-values goes to unity in a more Gaussian-like way. So we define
$$t_m = \sum_{w' \le w} t^{w'}_m; \quad t^{w'}_m = \min_n t^{w'}_{nm}; \quad t^{w'}_{nm} = p^{w'}_{nm} / (H^{w'}_{nm} V_n); \quad H^{w'}_{nm} = H_{nm} w \equiv H_m w.$$
And we actually got what we hoped for. The distribution looks much more Gaussian and is concentrated around unity:

[Figure: simulated distribution of the block time $t_m$ for pool-in-nodes]

This finishes our examination for now; we will continue it in future works.
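The narrowing effect of pool-in-nodes can be illustrated with a small Python sketch. To isolate the effect of summing $w$ sub-block times, the sketch keeps $H$ fixed and leaves out the regulation of section 6; this simplification, the normalized hits on $[0; 1]$, and the function name and defaults are all assumptions made for the illustration.

```python
import random
import statistics


def block_times(shares, w, blocks=20_000, h=1.0, seed=8):
    """Block times built from w sub-blocks, each with base target h * w."""
    rng = random.Random(seed)
    times = []
    for _ in range(blocks):
        t = 0.0
        for _ in range(w):
            # sub-block time: fastest node, with the base target scaled by w
            t += min(rng.random() / (h * w * v) for v in shares)
        times.append(t)
    return times


shares = [0.5, 0.3, 0.2]
for w in (1, 16):
    ts = block_times(shares, w)
    print(w, statistics.mean(ts), statistics.stdev(ts))
```

The mean block time is the same for both values of $w$, while the standard deviation for $w = 16$ is about a quarter of that for $w = 1$, as expected for a sum of 16 independent terms each scaled by $1/16$.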
9 Conclusion and future work
What we learn from our investigation is the following: (1) the original forging algorithm is not immune to the balance distribution, and even for one node it converges to a mean block time greater than unity; (2) there is an adaptive algorithm which solves both issues and offers regulation that converges to unity and is immune to a changing balance distribution. The number of blocks found is proportional to a node's forging balance, as the local node's time is inversely proportional to it, hits are uniformly distributed and nodes share blocks immediately after they have been generated.

Future work includes: (1) a model of concurrent nodes, in which nodes may choose the block sequence on which to forge based on its cumulative base target; (2) a model for an asynchronous and delayed process of block exchange; (3) analysis of attack opportunities in the different forging models.