One problem with the benchmark was that it used rand to generate input data in a non-portable way
(thanks to Travis Downs for reporting this issue (https://github.com/fmtlib/format-benchmark/issues/12)). This
was fixed by replacing rand with a proper pseudo-random number generator
(https://en.cppreference.com/w/cpp/numeric/random), so the input distribution is now consistent across platforms.
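As a rough illustration of why a standard engine helps, here is a minimal sketch. The helper name generate_input and the choice of std::mt19937 are assumptions, and this is not the benchmark's actual generator. The algorithm of std::mt19937 is fully specified by the C++ standard, so a fixed seed gives the same sequence on every conforming implementation, which rand does not guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not the benchmark's actual input generator.
// The same seed produces the same values on every platform.
std::vector<std::uint32_t> generate_input(std::size_t count,
                                          std::uint32_t seed = 5489u) {
  std::mt19937 engine(seed);
  std::vector<std::uint32_t> values;
  values.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    // Use the raw engine output: the standard distributions
    // (e.g. std::uniform_int_distribution) are allowed to produce
    // different values on different standard library implementations.
    values.push_back(static_cast<std::uint32_t>(engine()));
  }
  return values;
}
```

Note that the sketch deliberately avoids std::uniform_int_distribution, since only the engines, not the distributions, are guaranteed to be reproducible across implementations.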
9,708,172,187 uops_issued.any
8,745,800,873 idq.dsb_uops
1,903,352,743 idq.mite_uops
470,737,031 dsb2mite_switches.penalty_cycles
Converting a hundred million integers to strings per second https://www.zverovich.net/2020/06/13/fast-int-to-string-revisited.html
11,585,171,382 uops_issued.any
13,069,550,403 idq.dsb_uops
335,873,122 idq.mite_uops
11,939,259 dsb2mite_switches.penalty_cycles
As you can see, there is an almost 13% performance difference and a dramatic drop in
dsb2mite_switches.penalty_cycles (and in mite_uops / dsb_uops), which suggests that the mitigation is
working.
Finally, I added digest computation for output strings to prevent compilers from optimizing away parts of
the conversion and to validate the results. This is only a few percent slower than
benchmark::DoNotOptimize even for the fastest case, and it makes the benchmark much more reliable and
arguably more realistic, since the conversion output is unlikely to be discarded in practice.
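The digest idea can be sketched as follows. This is only an illustration under stated assumptions: FNV-1a is used as a stand-in digest and the names fnv1a and convert_and_digest are hypothetical, so the benchmark's actual digest function may differ. Because every output character feeds into the returned value, the compiler cannot drop the conversion, and comparing digests across methods checks that they all produce the same strings.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <vector>

// 64-bit FNV-1a step, used here only as a stand-in digest (an assumption).
inline std::uint64_t fnv1a(std::string_view s, std::uint64_t hash) {
  for (unsigned char c : s) {
    hash ^= c;
    hash *= 1099511628211ull;  // 64-bit FNV prime
  }
  return hash;
}

// Converts every value and folds the produced characters into one digest.
std::uint64_t convert_and_digest(const std::vector<std::uint32_t>& values) {
  std::uint64_t digest = 14695981039346656037ull;  // FNV offset basis
  char buffer[16];  // enough for any 32-bit unsigned value (10 digits)
  for (std::uint32_t value : values) {
    auto [end, ec] = std::to_chars(buffer, buffer + sizeof(buffer), value);
    assert(ec == std::errc());
    digest = fnv1a(
        std::string_view(buffer, static_cast<std::size_t>(end - buffer)),
        digest);
  }
  return digest;
}
```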
Here are the results on Intel Core i7-8569U CPU @ 2.80GHz running macOS and compiled with Apple
clang version 11.0.3 (clang-1103.0.32.62) and libc++:
Speed ratio, as the name suggests, is the fastest method’s speed divided by the current method’s speed, i.e.
how much slower this method is compared to the leader. [c] and [r] mean compile-time and runtime
format string processing, respectively.
As you can see, {fmt}’s format_int remains the fastest on this benchmark, closely followed by
format_to with compile-time processing of format strings. Both can now deliver over 100 million
conversions per second on a medium-spec laptop.
Alf P. Steinbach’s elegantly simple decimal_from, which formats in reverse to avoid size
computation and then calls std::reverse, is in third place (fourth after adding u2985907 ).
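The reverse-then-flip approach can be sketched roughly like this. The function name to_decimal_reversed is hypothetical and the code is an illustration of the general technique, not Alf P. Steinbach’s actual decimal_from.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Illustrative only; not the actual decimal_from implementation.
std::string to_decimal_reversed(std::uint64_t value) {
  std::string out;
  // Emit digits least significant first, so the number of digits
  // does not have to be computed up front.
  do {
    out.push_back(static_cast<char>('0' + value % 10));
    value /= 10;
  } while (value != 0);
  // Flip the digits into the usual most-significant-first order.
  std::reverse(out.begin(), out.end());
  return out;
}
```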
Remarkably, fmt::to_string and fmt::format with compile-time format string processing are only ~7%
slower than libc++’s to_chars even though they return std::string while to_chars operates on a stack-
allocated buffer. This is partly due to small string optimization that avoids dynamic memory allocation.
Another interesting result is that fmt::format with runtime format string processing is faster than
std::to_string and is more than 4x faster than sprintf (previously it was less than 2x faster).
Now let’s repeat the exercise on Linux. Here are the results on Intel Core i9-9900K CPU @ 3.60GHz
running Ubuntu 20.04 and compiled with GCC 9.3.0 and libstdc++:
Here the results are more balanced. fmt::format_int is still the leader but it is only ~21% faster than
std::to_chars .
Interestingly, std::ostringstream is faster than sprintf on this platform (if you reuse the same stream
object).
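Reusing the stream object might look like the sketch below. The helper name format_all is hypothetical and this is not the benchmark’s code. The point is that the stream is constructed once and only its contents are reset between conversions.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper showing one stream object reused for many conversions.
std::vector<std::string> format_all(const std::vector<int>& values) {
  std::ostringstream os;  // constructed once, outside the loop
  std::vector<std::string> result;
  result.reserve(values.size());
  for (int value : values) {
    os.str(std::string());  // discard previous contents, keep the stream
    os << value;
    result.push_back(os.str());
  }
  return result;
}
```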
Runtime format string processing is not as good on gcc as on clang so it might be something worth looking
into. In the meantime it’s possible to use compile-time format string processing in rare cases when
The idea is very simple: it has a small buffer within the object itself and formats backwards in a single
pass, handling pairs of digits at a time to minimize expensive integer divisions (which a compiler turns into
cheaper but still expensive multiplications). The idea of processing pairs of digits comes from the great talk
by Andrei Alexandrescu, Three Optimization Tips For C++
(https://archive.org/details/AndreiAlexandrescu-Three-Optimization-Tips).
format_int is very easy to use (https://godbolt.org/z/2Kp9iZ):
auto f = fmt::format_int(42);
// f.data() is the data, f.size() is the size
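To make the technique itself concrete, here is a simplified sketch for unsigned values. The class name int_formatter and its members are hypothetical, and this is not {fmt}’s actual format_int source. It shows the in-object buffer, the backwards pass, and the two-digit lookup table that halves the number of divisions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical class illustrating the technique; not {fmt}'s format_int.
class int_formatter {
 public:
  explicit int_formatter(std::uint64_t value) {
    // "00" "01" ... "99": both characters of every two-digit number.
    static constexpr char digit_pairs[] =
        "0001020304050607080910111213141516171819"
        "2021222324252627282930313233343536373839"
        "4041424344454647484950515253545556575859"
        "6061626364656667686970717273747576777879"
        "8081828384858687888990919293949596979899";
    std::size_t pos = sizeof(buffer_);  // start at the end, move backwards
    while (value >= 100) {
      // One division by 100 yields two output digits.
      const auto pair = static_cast<std::size_t>(value % 100);
      value /= 100;
      pos -= 2;
      std::memcpy(buffer_ + pos, digit_pairs + pair * 2, 2);
    }
    if (value < 10) {
      buffer_[--pos] = static_cast<char>('0' + value);
    } else {
      pos -= 2;
      std::memcpy(buffer_ + pos, digit_pairs + value * 2, 2);
    }
    begin_ = pos;
  }

  const char* data() const { return buffer_ + begin_; }
  std::size_t size() const { return sizeof(buffer_) - begin_; }
  std::string_view view() const { return {data(), size()}; }

 private:
  char buffer_[20];    // a 64-bit unsigned value has at most 20 digits
  std::size_t begin_;  // offset of the first digit within buffer_
};
```

Storing an offset rather than a pointer into the buffer keeps the object safely copyable in this sketch.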
format_int is more user-friendly and safer than other low-level alternatives because it manages memory
automatically, but you may need to copy the data. Note that neither snprintf nor to_chars gives you the
information on how much storage you need, so you often have to overallocate and then do an extra copy anyway. For
example, here’s how to format into a std::string using std::to_chars :
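A minimal sketch of that pattern, assuming C++17 <charconv>. The helper name int_to_string is hypothetical and this is not necessarily the author’s original listing. It overallocates a stack buffer for the worst case and then copies the result into the string.

```cpp
#include <cassert>
#include <charconv>
#include <limits>
#include <string>
#include <system_error>

// Hypothetical wrapper around std::to_chars.
std::string int_to_string(int value) {
  // Overallocate for the worst case: digits10 + 1 digits plus a sign.
  char buffer[std::numeric_limits<int>::digits10 + 2];
  auto [end, ec] = std::to_chars(buffer, buffer + sizeof(buffer), value);
  assert(ec == std::errc());  // cannot fail with a large enough buffer
  // The extra copy: from the stack buffer into the std::string.
  return std::string(buffer, end);
}
```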
In any case, you don’t need to use these low-level APIs often: they are mostly for implementing higher-level
facilities, and it’s easy to wrap std::to_chars in a more user-friendly API. For example, numeric formatting
in std::format (https://en.cppreference.com/w/cpp/utility/format/format) is defined in terms of
std::to_chars .
Summary
There have been big improvements to standard library implementations in terms of performance (e.g.
std::to_string is much faster now) and the availability of new locale-independent formatting facilities
( std::to_chars ). The open-source {fmt} library (https://github.com/fmtlib/fmt) continues to provide some of
the fastest integer formatters that are often more performant than their standard counterparts.
Benchmarking remains a tricky business, and extra care should be taken to ensure reproducibility of
results, particularly in view of recent hardware bugs and mitigations and increasingly advanced compiler
optimizations.
Update:
Update 2: