fyrsta7 commented Sep 14, 2025

Summary

This PR optimizes the performance of the append function in include/fmt/base.h.
This performance improvement was identified while profiling the spdlog logging library, a major downstream project that bundles fmt. The benchmarks show that this optimization provides significant gains in spdlog's throughput and latency. For context, this change was first submitted to spdlog (see gabime/spdlog#3465) where I was advised to contribute it upstream directly to fmt.
Across 47 test cases from spdlog's benchmark suite, the largest improvement is 16.67% and the largest regression in any case is 0.99%.

Test Plan

  1. Correctness: All existing unit tests in the fmt project pass.
  2. Performance: The performance impact was evaluated using the comprehensive benchmark suite from the spdlog project. This was chosen because spdlog is a key real-world use case, and its benchmarks effectively measure the performance of fmt's core formatting operations under various conditions (single-threaded, multi-threaded, different logging patterns). The commands used were ./bench/bench and ./bench/latency from the spdlog repository.

Performance Evaluation & Results

Testing Protocol:

  • The benchmark was run on an isolated Ubuntu 24.04 server using the spdlog benchmark suite.
  • The first run was discarded to account for cold-start effects.
  • The results below are the average of 5 subsequent runs.
  • The improvement for a test case is calculated as (new_value - old_value) / old_value * 100% if a higher value is better (e.g., throughput), or (old_value - new_value) / new_value * 100% if a lower value is better (e.g., latency). A positive percentage indicates a performance gain.
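
The protocol above can be sketched in a few lines of C++. This is only an illustration of the stated arithmetic, not the script that produced the results; the function names are hypothetical, and it assumes the discarded warm-up run is stored as the first element of each vector.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: mean of all runs except the first (cold-start) run.
double mean_after_warmup(const std::vector<double>& runs) {
  assert(runs.size() > 1);  // need the warm-up run plus at least one real run
  double sum = std::accumulate(runs.begin() + 1, runs.end(), 0.0);
  return sum / static_cast<double>(runs.size() - 1);
}

// Higher is better (e.g. messages per second): relative gain over the old value.
double improvement_higher_is_better(double old_value, double new_value) {
  return (new_value - old_value) / old_value * 100.0;
}

// Lower is better (e.g. elapsed time). As stated in the protocol, the
// denominator here is new_value rather than old_value.
double improvement_lower_is_better(double old_value, double new_value) {
  return (old_value - new_value) / new_value * 100.0;
}
```

Because the lower-is-better formula divides by the new value, it yields a larger percentage than dividing by the old value would. For example, made-up elapsed times of 0.07 s before and 0.06 s after give about 16.67% under this formula, where dividing by the old value would give about 14.29%.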

Results:

| Test Case | Improvement |
| --- | --- |
| overall_throughput_improvement | 3.00% |
| overall_latency_improvement | 3.19% |
| single_threaded.level_off.normal.messages_per_sec | 0.37% |
| single_threaded.level_off.backtrace_on.messages_per_sec | 4.33% |
| single_threaded.rotating_st.normal.messages_per_sec | 2.82% |
| single_threaded.rotating_st.backtrace_on.messages_per_sec | 1.95% |
| single_threaded.basic_st.normal.messages_per_sec | 3.15% |
| single_threaded.basic_st.backtrace_on.messages_per_sec | 3.48% |
| single_threaded.daily_st.normal.messages_per_sec | 3.13% |
| single_threaded.daily_st.backtrace_on.messages_per_sec | 3.12% |
| multi_threaded_1.rotating_mt.normal.messages_per_sec | 1.41% |
| multi_threaded_1.rotating_mt.backtrace_on.messages_per_sec | 3.45% |
| multi_threaded_1.daily_mt.normal.messages_per_sec | 2.73% |
| multi_threaded_1.daily_mt.backtrace_on.messages_per_sec | 12.95% |
| multi_threaded_1.basic_mt.normal.messages_per_sec | 2.48% |
| multi_threaded_1.basic_mt.backtrace_on.messages_per_sec | 2.73% |
| multi_threaded_1.level_off.normal.messages_per_sec | -0.19% |
| multi_threaded_1.level_off.backtrace_on.messages_per_sec | 4.35% |
| multi_threaded_4.rotating_mt.normal.messages_per_sec | 2.49% |
| multi_threaded_4.rotating_mt.backtrace_on.messages_per_sec | 1.82% |
| multi_threaded_4.daily_mt.normal.messages_per_sec | 6.08% |
| multi_threaded_4.daily_mt.backtrace_on.messages_per_sec | 3.00% |
| multi_threaded_4.basic_mt.normal.messages_per_sec | 3.94% |
| multi_threaded_4.basic_mt.backtrace_on.messages_per_sec | -0.69% |
| multi_threaded_4.level_off.normal.messages_per_sec | -0.99% |
| multi_threaded_4.level_off.backtrace_on.messages_per_sec | 4.07% |
| single_threaded.level_off.backtrace_on.elapsed_time | 0.00% |
| single_threaded.rotating_st.normal.elapsed_time | 0.00% |
| single_threaded.rotating_st.backtrace_on.elapsed_time | 0.00% |
| single_threaded.basic_st.normal.elapsed_time | 0.00% |
| single_threaded.basic_st.backtrace_on.elapsed_time | 16.67% |
| single_threaded.daily_st.normal.elapsed_time | 3.33% |
| single_threaded.daily_st.backtrace_on.elapsed_time | 12.50% |
| multi_threaded_1.rotating_mt.normal.elapsed_time | 3.03% |
| multi_threaded_1.rotating_mt.backtrace_on.elapsed_time | 4.76% |
| multi_threaded_1.daily_mt.normal.elapsed_time | 0.00% |
| multi_threaded_1.daily_mt.backtrace_on.elapsed_time | 14.89% |
| multi_threaded_1.basic_mt.normal.elapsed_time | 0.00% |
| multi_threaded_1.basic_mt.backtrace_on.elapsed_time | 0.00% |
| multi_threaded_1.level_off.backtrace_on.elapsed_time | 0.00% |
| multi_threaded_4.rotating_mt.normal.elapsed_time | 1.59% |
| multi_threaded_4.rotating_mt.backtrace_on.elapsed_time | 0.00% |
| multi_threaded_4.daily_mt.normal.elapsed_time | 6.06% |
| multi_threaded_4.daily_mt.backtrace_on.elapsed_time | 2.47% |
| multi_threaded_4.basic_mt.normal.elapsed_time | 1.67% |
| multi_threaded_4.basic_mt.backtrace_on.elapsed_time | 0.00% |
| multi_threaded_4.level_off.backtrace_on.elapsed_time | 0.00% |

vitaut left a comment

Thanks for the PR!

if (free_cap < count) count = free_cap;
auto count = to_unsigned(end - begin);
if (free_cap < count) {
try_reserve(size_ + count);

Let's replace this with

grow_(*this, size_ + count);

to avoid an extra capacity check.
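
The reasoning behind this suggestion can be illustrated with a toy buffer. The sketch below is not fmt's `buffer` class: `toy_buffer`, its `grow` member, and the `grow_calls` counter are hypothetical names used only for illustration. The point it shows is that once the caller has established `free_cap < count`, it can call the growth routine directly instead of going through a reserve-style wrapper that would compare against the capacity a second time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy growable buffer (illustrative only, not fmt's implementation).
class toy_buffer {
 public:
  explicit toy_buffer(std::size_t capacity)
      : storage_(capacity), capacity_(capacity) {}

  void append(const char* begin, const char* end) {
    while (begin != end) {
      std::size_t count = static_cast<std::size_t>(end - begin);
      std::size_t free_cap = capacity_ - size_;
      if (free_cap < count) {
        // Slow path: the capacity check has already been done by this branch,
        // so grow directly rather than re-checking inside a reserve wrapper.
        grow(size_ + count);
        free_cap = capacity_ - size_;
        // A growth routine may provide less room than requested (for example
        // a fixed-size buffer that flushes); copy only what fits and loop.
        if (free_cap < count) count = free_cap;
      }
      // Fast path falls straight through to the copy.
      for (std::size_t i = 0; i < count; ++i) storage_[size_ + i] = begin[i];
      size_ += count;
      begin += count;
    }
  }

  std::string str() const { return std::string(storage_.data(), size_); }
  int grow_calls() const { return grow_calls_; }

 private:
  // Hypothetical growth policy: at least double, or the requested minimum.
  void grow(std::size_t min_capacity) {
    ++grow_calls_;
    capacity_ = std::max(min_capacity, capacity_ * 2);
    storage_.resize(capacity_);
  }

  std::vector<char> storage_;
  std::size_t capacity_;
  std::size_t size_ = 0;
  int grow_calls_ = 0;
};
```

In this sketch, appending three characters to a buffer of capacity 8 never touches `grow`, while a later append that overflows the remaining space calls it exactly once.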

fyrsta7 commented

Thanks for the suggestion!
I've replaced try_reserve with grow_ to avoid the extra capacity check as you pointed out. The code has been updated.

@vitaut vitaut merged commit 4cce5f4 into fmtlib:master Sep 21, 2025
41 checks passed

vitaut commented Sep 21, 2025

Merged, thanks!


fyrsta7 commented Sep 21, 2025

Thank you for your quick action! 🫶

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Nov 2, 2025
# 12.1.0 - 2025-10-29

- Optimized `buffer::append`, resulting in up to ~16% improvement on spdlog
  benchmarks (fmtlib/fmt#4541). Thanks @fyrsta7.

- Worked around an ABI incompatibility in `std::locale_ref` between clang and
  gcc (fmtlib/fmt#4573).

- Made `std::variant` and `std::expected` formatters work with `format_as`
  (fmtlib/fmt#4574,
  fmtlib/fmt#4575). Thanks @phprus.

- Made `fmt::join<string_view>` work with C++ modules
  (fmtlib/fmt#4379,
  fmtlib/fmt#4577). Thanks @Arghnews.

- Exported `fmt::is_compiled_string` and `operator""_cf` from the module
  (fmtlib/fmt#4544). Thanks @CrackedMatter.

- Fixed a compatibility issue with C++ modules in clang
  (fmtlib/fmt#4548). Thanks @tsarn.

- Added support for cv-qualified types to the `std::optional` formatter
  (fmtlib/fmt#4561,
  fmtlib/fmt#4562). Thanks @OleksandrKvl.

- Added demangling support (used in exception and `std::type_info` formatters)
  for libc++ and clang-cl
  (fmtlib/fmt#4542,
  fmtlib/fmt#4560,
  fmtlib/fmt#4568,
  fmtlib/fmt#4571).
  Thanks @FatihBAKIR and @rohitsutreja.

- Switched to global `malloc`/`free` to enable allocator customization
  (fmtlib/fmt#4569,
  fmtlib/fmt#4570). Thanks @rohitsutreja.

- Made the `FMT_USE_CONSTEVAL` macro configurable by users
  (fmtlib/fmt#4546). Thanks @SnapperTT.

- Fixed compilation with locales disabled in the header-only mode
  (fmtlib/fmt#4550).

- Fixed compilation with clang 21 and `-std=c++20`
  (fmtlib/fmt#4552).

- Fixed a dynamic linking issue with clang-cl
  (fmtlib/fmt#4576,
  fmtlib/fmt#4584). Thanks @FatihBAKIR.

- Fixed a warning suppression leakage on gcc
  (fmtlib/fmt#4588). Thanks @ZedThree.

- Made more internal color APIs `constexpr`
  (fmtlib/fmt#4581). Thanks @ishani.

- Fixed compatibility with clang as a host compiler for NVCC
  (fmtlib/fmt#4564). Thanks @valgur.

- Fixed various warnings and lint issues
  (fmtlib/fmt#4565,
  fmtlib/fmt#4572,
  fmtlib/fmt#4557).
  Thanks @LiangHuDream and @teruyamato0731.

- Improved documentation
  (fmtlib/fmt#4549,
  fmtlib/fmt#4551,
  fmtlib/fmt#4566,
  fmtlib/fmt#4567,
  fmtlib/fmt#4578).
  Thanks @teruyamato0731, @petersteneteg and @zimmerman-dev.
mtremer pushed a commit to ipfire/ipfire-2.x that referenced this pull request Nov 6, 2025
- Update from version 11.2.0 to 12.1.0
- Update of rootfile
- so-bump so mpd requires shipping
- Changelog
    12.1.0
	- Optimized `buffer::append`, resulting in up to ~16% improvement on spdlog
	  benchmarks (fmtlib/fmt#4541). Thanks @fyrsta7.
	- Worked around an ABI incompatibility in `std::locale_ref` between clang and
	  gcc (fmtlib/fmt#4573).
	- Made `std::variant` and `std::expected` formatters work with `format_as`
	  (fmtlib/fmt#4574,
	  fmtlib/fmt#4575). Thanks @phprus.
	- Made `fmt::join<string_view>` work with C++ modules
	  (fmtlib/fmt#4379,
	  fmtlib/fmt#4577). Thanks @Arghnews.
	- Exported `fmt::is_compiled_string` and `operator""_cf` from the module
	  (fmtlib/fmt#4544). Thanks @CrackedMatter.
	- Fixed a compatibility issue with C++ modules in clang
	  (fmtlib/fmt#4548). Thanks @tsarn.
	- Added support for cv-qualified types to the `std::optional` formatter
	  (fmtlib/fmt#4561,
	  fmtlib/fmt#4562). Thanks @OleksandrKvl.
	- Added demangling support (used in exception and `std::type_info` formatters)
	  for libc++ and clang-cl
	  (fmtlib/fmt#4542,
	  fmtlib/fmt#4560,
	  fmtlib/fmt#4568,
	  fmtlib/fmt#4571).
	  Thanks @FatihBAKIR and @rohitsutreja.
	- Switched to global `malloc`/`free` to enable allocator customization
	  (fmtlib/fmt#4569,
	  fmtlib/fmt#4570). Thanks @rohitsutreja.
	- Made the `FMT_USE_CONSTEVAL` macro configurable by users
	  (fmtlib/fmt#4546). Thanks @SnapperTT.
	- Fixed compilation with locales disabled in the header-only mode
	  (fmtlib/fmt#4550).
	- Fixed compilation with clang 21 and `-std=c++20`
	  (fmtlib/fmt#4552).
	- Fixed a dynamic linking issue with clang-cl
	  (fmtlib/fmt#4576,
	  fmtlib/fmt#4584). Thanks @FatihBAKIR.
	- Fixed a warning suppression leakage on gcc
	  (fmtlib/fmt#4588). Thanks @ZedThree.
	- Made more internal color APIs `constexpr`
	  (fmtlib/fmt#4581). Thanks @ishani.
	- Fixed compatibility with clang as a host compiler for NVCC
	  (fmtlib/fmt#4564). Thanks @valgur.
	- Fixed various warnings and lint issues
	  (fmtlib/fmt#4565,
	  fmtlib/fmt#4572,
	  fmtlib/fmt#4557).
	  Thanks @LiangHuDream and @teruyamato0731.
	- Improved documentation
	  (fmtlib/fmt#4549,
	  fmtlib/fmt#4551,
	  fmtlib/fmt#4566,
	  fmtlib/fmt#4567,
	  fmtlib/fmt#4578).
	  Thanks @teruyamato0731, @petersteneteg and @zimmerman-dev.
    12.0.0
	- Optimized the default floating point formatting
	  (fmtlib/fmt#3675,
	  fmtlib/fmt#4516). In particular, formatting a
	  `double` with format string compilation into a stack allocated buffer is
	  more than 60% faster in version 12.0 compared to 11.2 according to
	  [dtoa-benchmark](https://github.com/fmtlib/dtoa-benchmark):
	  ```
	  Function  Time (ns)  Speedup
	  fmt11        34.471    1.00x
	  fmt12        21.000    1.64x
	  ```
	  <img width="766" height="609" src="https://github.com/user-attachments/assets/d7d768ad-7543-468c-b0bb-449abf73b31b" />
	- Added `constexpr` support to `fmt::format`. For example:
	  ```c++
	  #include <fmt/compile.h>
	  using namespace fmt::literals;
	  std::string s = fmt::format(""_cf, 42);
	  ```
	  now works at compile time provided that `std::string` supports `constexpr`
	  (fmtlib/fmt#3403,
	  fmtlib/fmt#4456). Thanks @msvetkin.
	- Added `FMT_STATIC_FORMAT` that allows formatting into a string of the exact
	  required size at compile time.
	  For example:
	  ```c++
	  #include <fmt/compile.h>
	  constexpr auto s = FMT_STATIC_FORMAT("{}", 42);
	  ```
	  compiles to just
	  ```s
	  __ZL1s:
	        .asciiz "42"
	  ```
	  It can be accessed as a C string with `s.c_str()` or as a string view with
	  `s.str()`.
	- Improved C++20 module support
	  (fmtlib/fmt#4451,
	  fmtlib/fmt#4459,
	  fmtlib/fmt#4476,
	  fmtlib/fmt#4488,
	  fmtlib/fmt#4491,
	  fmtlib/fmt#4495).
	  Thanks @arBmind, @tkhyn, @Mishura4, @anonymouspc and @autoantwort.
	- Switched to using estimated display width in precision. For example:
	  ```c++
	  fmt::print("|{:.4}|\n|1234|\n", "🐱🐱🐱");
	  ```
	  prints
	  ![](https://github.com/user-attachments/assets/6c4446b3-13eb-43b9-b74a-b4543540ad6a)
	  because `🐱` has an estimated width of 2
	  (fmtlib/fmt#4272,
	  fmtlib/fmt#4443,
	  fmtlib/fmt#4475).
	  Thanks @nikhilreddydev and @localspook.
	- Fixed the interaction between debug presentation, precision, and width for strings
	  (fmtlib/fmt#4478). Thanks @localspook.
	- Implemented allocator propagation on `basic_memory_buffer` move
	  (fmtlib/fmt#4487,
	  fmtlib/fmt#4490). Thanks @toprakmurat.
	- Fixed an ambiguity between `std::reference_wrapper<T>` and `format_as`
	  formatters (fmtlib/fmt#4424,
	  fmtlib/fmt#4434). Thanks @jeremy-rifkin.
	- Removed the following deprecated APIs:
	  - `has_formatter`: use `is_formattable` instead,
	  - `basic_format_args::parse_context_type`,
	    `basic_format_args::formatter_type` and similar aliases in context types,
	  - wide stream overload of `fmt::printf`,
	  - wide stream overloads of `fmt::print` that take text styles,
	  - `is_*char` traits,
	  - `fmt::localtime`.
	- Deprecated wide overloads of `fmt::fprintf` and `fmt::sprintf`.
	- Improved diagnostics for the incorrect usage of `fmt::ptr`
	  (fmtlib/fmt#4453). Thanks @TobiSchluter.
	- Made handling of ANSI escape sequences more efficient
	  (fmtlib/fmt#4511,
	  fmtlib/fmt#4528).
	  Thanks @localspook and @Anas-Hamdane.
	- Fixed a buffer overflow on all emphasis flags set
	  (fmtlib/fmt#4498). Thanks @dominicpoeschko.
	- Fixed an integer overflow for precision close to the max `int` value.
	- Fixed compatibility with WASI (fmtlib/fmt#4496,
	  fmtlib/fmt#4497). Thanks @whitequark.
	- Fixed `back_insert_iterator` detection, preventing a fallback on slower path
	  that handles arbitrary iterators (fmtlib/fmt#4454).
	- Fixed handling of invalid glibc `FILE` buffers
	  (fmtlib/fmt#4469).
	- Added `wchar_t` support to the `std::byte` formatter
	  (fmtlib/fmt#4479,
	  fmtlib/fmt#4480). Thanks @phprus.
	- Changed component prefix from `fmt-` to `fmt_` for compatibility with
	  NSIS/CPack on Windows, e.g. `fmt-doc` changed to `fmt_doc`
	  (fmtlib/fmt#4441,
	  fmtlib/fmt#4442). Thanks @n-stein.
	- Added the `FMT_CUSTOM_ASSERT_FAIL` macro to simplify providing a custom
	  `fmt::assert_fail` implementation (fmtlib/fmt#4505).
	  Thanks @HazardyKnusperkeks.
	- Switched to `FMT_THROW` on reporting format errors so that it can be
	  overridden by users when exceptions are disabled
	  (fmtlib/fmt#4521). Thanks @HazardyKnusperkeks.
	- Improved master project detection and disabled install targets when using
	  {fmt} as a subproject by default (fmtlib/fmt#4536).
	  Thanks @crueter.
	- Made various code improvements
	  (fmtlib/fmt#4445,
	  fmtlib/fmt#4448,
	  fmtlib/fmt#4473,
	  fmtlib/fmt#4522).
	  Thanks @localspook, @tchaikov and @way4sahil.
	- Added Conan instructions to the docs
	  (fmtlib/fmt#4537). Thanks @uilianries.
	- Removed Bazel files to avoid issues with downstream packaging
	  (fmtlib/fmt#4530). Thanks @mering.
	- Added more entries for generated files to `.gitignore`
	  (fmtlib/fmt#4355,
	  fmtlib/fmt#4512).
	  Thanks @dinomight and @localspook.
	- Fixed various warnings and compilation issues
	  (fmtlib/fmt#4447,
	  fmtlib/fmt#4470,
	  fmtlib/fmt#4474,
	  fmtlib/fmt#4477,
	  fmtlib/fmt#4471,
	  fmtlib/fmt#4483,
	  fmtlib/fmt#4515,
	  fmtlib/fmt#4533,
	  fmtlib/fmt#4534).
	  Thanks @dodomorandi, @localspook, @remyjette, @Tomek-Stolarczyk, @Mishura4,
	  @mattiasljungstrom and @FatihBAKIR.

Signed-off-by: Adolf Belka <[email protected]>
Signed-off-by: Michael Tremer <[email protected]>
polter-rnd added a commit to polter-rnd/slimlog that referenced this pull request Dec 15, 2025
polter-rnd added a commit to polter-rnd/slimlog that referenced this pull request Dec 30, 2025