Optimize Fw::StringUtils::string_length with SWAR algorithm #4789
vsoulgard wants to merge 3 commits into nasa:devel
Conversation
Pull request overview
This PR updates Fw::StringUtils::string_length to use a SWAR (word-at-a-time) NUL detection approach to reduce iterations and improve performance for longer strings, and adds unit tests intended to cover additional edge cases.
Changes:
- Replaced the byte-by-byte string_length with an aligned, word-at-a-time SWAR implementation (with a short-string fast path).
- Introduced a NO_ASAN macro and applied it to string_length.
- Added new unit tests for empty strings, various lengths, large strings, and unaligned pointers.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Fw/Types/StringUtils.cpp | Implements the SWAR-based string_length and applies NO_ASAN. |
| Fw/Types/Assert.hpp | Adds the NO_ASAN macro definition (compiler-attribute based). |
| Fw/Types/test/ut/TypesTest.cpp | Adds new unit tests for string_length edge cases. |
```cpp
NO_ASAN
FwSizeType Fw::StringUtils::string_length(const CHAR* source, FwSizeType buffer_size) {
    FwSizeType length = 0;
    FW_ASSERT(source != nullptr);
    for (length = 0; length < buffer_size; length++) {
    ...
    // Fast path for short strings - avoids alignment overhead
    if (buffer_size <= sizeof(std::size_t))
    {
```
string_length is annotated with NO_ASAN, which will suppress AddressSanitizer reports for this function. Since this implementation bounds all reads to buffer_size, ASan findings here are likely to indicate a real caller bug (e.g., passing an incorrect buffer_size) rather than a false positive; disabling ASan could hide those issues. Consider removing NO_ASAN or limiting it to a narrowly-scoped build/config where intentional over-reads actually occur (and document why).
That is a correct statement, but as far as I know, even if the passed buffer_size is incorrect the over-read is safe, since aligned reads do not cross memory page boundaries.
The reason I added the NO_ASAN macro is that an incorrect buffer_size passed to Fw::StringUtils::string_copy causes an ASan finding when running unit tests:
```cpp
char* Fw::StringUtils::string_copy(char* destination, const char* source, FwSizeType num) {
    // Handle self-copy and 0 bytes copy
    if (destination == source || num == 0) {
        return destination;
    }
    FW_ASSERT(source != nullptr);
    FW_ASSERT(destination != nullptr);
    // Copying an overlapping range is undefined
    FwSizeType source_len = string_length(source, num) + 1;  // <-- HERE
    FW_ASSERT(source + source_len <= destination || destination + num <= source);
    char* returned = strncpy(destination, source, static_cast<size_t>(num));
    destination[num - 1] = '\0';
    return returned;
}
```

The destination length (num) is passed as the source length into Fw::StringUtils::string_length, which causes an ASan finding in cases where the destination length (num) is greater than the length of the source buffer.
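To make the caller bug concrete, here is a minimal, self-contained illustration (not F Prime code; `bounded_length` and `demo` are hypothetical helpers) of why bounding the scan of `source` by the destination capacity `num` can over-read:

```cpp
#include <cassert>
#include <cstddef>

// A bounds-to-limit scan in the style of string_length:
// reads at most `limit` bytes of p.
static std::size_t bounded_length(const char* p, std::size_t limit) {
    std::size_t i = 0;
    while (i < limit && p[i] != '\0') {
        ++i;
    }
    return i;
}

static int demo() {
    char source[4] = "abc";  // source buffer holds exactly 4 bytes
    char dest[16];           // destination buffer is larger
    (void)dest;
    // Correct: bound the scan by the SOURCE capacity -> stays in bounds.
    std::size_t ok = bounded_length(source, sizeof(source));  // 3
    // Buggy pattern from string_copy: bound by num = sizeof(dest) = 16.
    // Here the scan would stop at source[3] only because the string is
    // NUL-terminated; with num > sizeof(source) the permitted read range
    // extends past `source`, which is what ASan reports in the unit tests.
    return static_cast<int>(ok);
}
```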
It seems to me the sanitizer is finding a bug in the unit test code here, and we should fix the unit test, rather than disabling the sanitizer. If I understand what is going on, num should never be greater than the minimum of the sizes of the buffers pointed to by destination and source. The unit test code seems to violate this assumption. In general, if we allow this assumption to be violated by disabling the sanitizer, then we are opening the door to a buffer overrun.
In general, disabling the sanitizer for an entire function is something we should avoid -- especially for a function like this that does low-level memory manipulation.
For example, what if someone calls this function with an actual buffer size of 100 for destination and 1000 for source, and provides 1000 as num? We definitely don't want to disable sanitation in that case. I agree with Copilot here:
Since this implementation bounds all reads to buffer_size, ASan findings here are likely to indicate a real caller bug (e.g., passing an incorrect buffer_size) rather than a false positive; disabling ASan could hide those issues.
Thank you for the review! You are right, ASan only triggers if we pass buffer_length greater than the actual buffer size. But that happens not only in Fw::StringUtils::string_copy, but also in StringBase::operator+=:
(fprime/Fw/Types/StringBase.cpp, lines 25 to 28 in ffa5629)
It worked well with the original byte-by-byte comparison, but the SWAR implementation causes an over-read here (optimized strnlen implementations behave similarly). To avoid breaking things, I suggest sticking with the original version and adding the SWAR implementation as a second, high-performance (for long strings) option for cases where the buffer length is known in advance, if you'd be interested in that approach.
Like:

```cpp
FwSizeType Fw::StringUtils::string_length_swar(const CHAR* source, FwSizeType max_buffer_size);
```
OK, thanks. It seems that the SWAR approach is pushing on an issue that has always been in the framework. In the case of the operator+= code cited above, the operation is not safe against a buffer overrun in all cases (even without SWAR), because (1) no buffer size is provided for src, and (2) there is no guarantee that the string pointed to by src is null terminated.
The F Prime team will need to decide how to proceed here.
Change Description
This PR implements the performance optimization proposed in #4788. It replaces the naive byte-by-byte string length calculation with a SWAR (SIMD Within A Register) approach that uses word-sized loads and bitwise NUL detection.
Details:
Rationale
SWAR reduces the number of loop iterations and memory load instructions, thus improving performance. This technique is widely used in high-performance standard libraries.
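The bitwise NUL detection at the heart of SWAR is the classic zero-byte test. As a hedged sketch (the name `has_zero_byte` is illustrative, not part of the PR), the trick works on one machine word at a time:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kOnes  = 0x0101010101010101ULL;
constexpr std::uint64_t kHighs = 0x8080808080808080ULL;

// True iff some byte of v is 0x00.
// v - kOnes sets a byte's high bit when that byte was 0x00 (borrow);
// masking with ~v discards bytes whose high bit was already set in v,
// so the result is nonzero exactly when v contains a zero byte.
constexpr bool has_zero_byte(std::uint64_t v) {
    return ((v - kOnes) & ~v & kHighs) != 0;
}
```

One such test replaces eight byte comparisons, which is where the reduction in loop iterations and load instructions comes from.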
Benchmarks:
(Tested on Linux x86_64 with GCC 14.2.0)
Benchmark Results (aligned)
The new version is slightly slower for very short strings, but much faster for long ones (old vs. new):

| N | Old (ns) | New (ns) |
|---|---|---|
| 2 | 1.28 | 2.01 |
| 4 | 1.76 | 2.27 |
| 8 | 3.06 | 3.01 |
| 16 | 6.05 | 3.43 |
| 32 | 19.30 | 4.31 |
| 64 | 27.97 | 6.37 |
| 128 | 53.45 | 10.36 (x5 faster) |
| 256 | 100.27 | 22.29 |
| 512 | 191.22 | 42.47 |
Benchmark Results (unaligned)
Unaligned calculations are slower than aligned ones because of the byte-by-byte head loop, but still faster than the old version (old vs. new):

| N | Old (ns) | New (ns) |
|---|---|---|
| 2 | 1.33 | 1.91 |
| 4 | 2.03 | 2.26 |
| 8 | 3.55 | 3.02 |
| 16 | 6.65 | 5.18 |
| 32 | 17.12 | 6.10 |
| 64 | 30.34 | 8.44 |
| 128 | 56.18 | 12.60 |
| 256 | 103.61 | 24.40 |
| 512 | 199.45 | 44.11 |
Testing/Review Recommendations
Benchmark Source
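The actual benchmark source is not included in this excerpt. As a rough sketch of how per-call timings like those above can be collected (the function name `bench_strlen`, the use of `std::strlen` as the measured routine, and the iteration count are illustrative assumptions, not the PR's harness):

```cpp
#include <chrono>
#include <cstring>
#include <string>

// Returns the average nanoseconds per call of std::strlen
// on a NUL-terminated string of length n.
static double bench_strlen(std::size_t n, std::size_t iters = 200000) {
    std::string s(n, 'a');           // n non-NUL bytes plus a terminator
    volatile std::size_t sink = 0;   // keep the calls from being optimized away
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) {
        sink = sink + std::strlen(s.c_str());
    }
    auto t1 = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(t1 - t0).count()
           / static_cast<double>(iters);
}
```

A real harness would additionally pin the CPU frequency, warm up caches, and repeat each measurement to reduce noise.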
AI Usage
AI was used for research and testing of ideas. All code implementation, algorithm design, benchmarking, and technical decisions were done by me.