[Exporter.Geneva] Add thread-safety regression tests for MsgPackTraceExporter #3882

Closed
rajkumar-rangaraj wants to merge 3 commits into open-telemetry:main from rajkumar-rangaraj:rajrang/investigateGenevaThreadingIssue

@rajkumar-rangaraj (Member) commented Feb 18, 2026

Adds regression tests that verify MsgPackTraceExporter.SerializeActivity produces valid MessagePack output under concurrent multi-threaded access.

PR #3214 introduced a race condition where multiple threads calling SerializeActivity simultaneously would each invoke CreateFraming(), concurrently mutating the shared prepopulatedFields dictionary. This caused Dictionary internal state corruption and resulted in "Bad forward protocol format" errors on the receiving end.

The fix was delivered in #3881, but no regression tests existed to cover the
multi-threaded scenario. These tests fill that gap.
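
As a rough illustration of the failure mode described above, the sketch below shows one common way to keep a shared dictionary from being mutated by several threads at once: guard a one-time initialization with a lock. This is not the change made in #3881, whose details are not shown here. `FramingSketch`, `EnsureFraming`, and `FieldCount` are hypothetical names; only the `prepopulatedFields` field name is borrowed from the description.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for an exporter that builds shared framing state.
// It does not reproduce the real MsgPackTraceExporter.
public sealed class FramingSketch
{
    private readonly object gate = new object();
    private readonly Dictionary<string, object> prepopulatedFields =
        new Dictionary<string, object>();
    private bool framingCreated;

    // Reads go through the same lock so they never observe a half-built dictionary.
    public int FieldCount
    {
        get
        {
            lock (this.gate)
            {
                return this.prepopulatedFields.Count;
            }
        }
    }

    // Safe to call from many threads: only the first caller mutates the
    // dictionary, and every later caller returns without touching it.
    public void EnsureFraming(IReadOnlyDictionary<string, object> resourceAttributes)
    {
        lock (this.gate)
        {
            if (this.framingCreated)
            {
                return;
            }

            foreach (var attribute in resourceAttributes)
            {
                this.prepopulatedFields[attribute.Key] = attribute.Value;
            }

            this.framingCreated = true;
        }
    }
}
```

Without the lock and the `framingCreated` check, concurrent callers would write to the same `Dictionary` instance at the same time, which `Dictionary<TKey, TValue>` does not support.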

Tests added

  • SerializeActivity_ConcurrentThreads_ProducesValidMsgPack — 8 threads
    serialize activities concurrently and validate that every output is a well-formed
    Fluentd Forward Mode MessagePack message.
  • CreateFraming_CalledMultipleTimes_FieldCountRemainsConsistent — Calls
    CreateFraming() multiple times and asserts that prepopulatedFields.Count
    and bufferPrologue.Length remain stable.
  • SerializeActivity_DifferentThreads_SameFieldCount — 8 threads serialize
    activities concurrently and verify the Map16 field count is identical across
    all threads.
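
The concurrent tests listed above share a common shape: start several threads, release them at the same moment, and compare what each one produced. A minimal sketch of that shape follows, assuming a hypothetical helper named `ConcurrencyHarness.RunConcurrently`; the PR's actual test code is not reproduced here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper, not part of the PR.
public static class ConcurrencyHarness
{
    // Runs `work` on `threadCount` threads that all start together,
    // and returns one result per thread (indexed by thread number).
    public static T[] RunConcurrently<T>(int threadCount, Func<int, T> work)
    {
        var results = new T[threadCount];
        var errors = new ConcurrentQueue<Exception>();
        using var barrier = new Barrier(threadCount);
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            int index = i; // capture a per-thread copy of the loop variable
            threads[i] = new Thread(() =>
            {
                try
                {
                    // Block until every thread is ready, to maximize overlap.
                    barrier.SignalAndWait();
                    results[index] = work(index);
                }
                catch (Exception ex)
                {
                    // Exceptions on worker threads are collected and rethrown later.
                    errors.Enqueue(ex);
                }
            });
            threads[i].Start();
        }

        foreach (var thread in threads)
        {
            thread.Join();
        }

        if (!errors.IsEmpty)
        {
            throw new AggregateException(errors);
        }

        return results;
    }
}
```

A test could pass a delegate that serializes one activity and returns the Map16 field count, then assert that every element of the returned array is equal.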

Merge requirement checklist

  • CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

@rajkumar-rangaraj requested a review from a team as a code owner February 18, 2026 21:57
@github-actions bot requested a review from xiang17 February 18, 2026 21:58
@github-actions bot added the comp:exporter.geneva (Things related to OpenTelemetry.Exporter.Geneva) label Feb 18, 2026
codecov bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.53%. Comparing base (d84e825) to head (99328f3).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3882      +/-   ##
==========================================
- Coverage   71.59%   71.53%   -0.07%     
==========================================
  Files         447      447              
  Lines       17827    17827              
==========================================
- Hits        12764    12753      -11     
- Misses       5063     5074      +11     
Flag Coverage Δ
unittests-Exporter.Geneva 54.45% <ø> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown.
see 4 files with indirect coverage changes

@mattsains (Contributor) left a comment

I'm glad we're adding this test; it's very much needed. If you have time, we could easily do the same for the log exporter too!


// Validate: the serialized data must be valid MessagePack
// representing a proper Fluentd Forward Mode message.
// Under the race condition, the Map16 header field count

I get that this test is a reaction to the bug, but this is the fifth time in this file that the bug is explicitly called out - is that necessary? This test will live on as a general test of multi-threading support, not simply that the exact bug we identified won't come back. In my view, the file inappropriately refers back to the specific issue that caused us to realise the need for this test throughout, and these references should mostly be removed

// In the buggy code, repeated calls can cause prepopulatedFields to grow
// (duplicate entries added from resource attributes), making the Count
// diverge from what's in the prologue.
exporter.CreateFraming();

I don't think it makes sense to call CreateFraming in the tests. Perhaps CreateFraming should be changed to be private.

This test does not do what it says it does - CreateFraming is called twice, but synchronously, so it just does the same thing twice. I think this test should be removed.

using var activitySource = new ActivitySource("ThreadSafetyTest");
using var listener = new ActivityListener
{
ShouldListenTo = _ => true,

The other tests follow a convention of only listening to the specific activity source name of the test, and comments say this is to avoid interference with other tests. However, I am not sure such interference is actually possible, because the listener will not be attached to any other tests' activity sources. Anyway, I'm not sure what to suggest here; I'm just pointing out that it's different from the other tests' layout.

var data = new byte[serialized.Count];
Array.Copy(serialized.Array!, serialized.Offset, data, 0, serialized.Count);

var deserialized = MessagePack.MessagePackSerializer.Deserialize<object>(

To improve this test, you can manually construct a MessagePackReader, and make sure that all the bytes have been consumed by the deserialize call. This way we ensure that there are no extra message pack encoded fields that were left off by an incorrect field count. That would also let you remove the test called CreateFraming_CalledMultipleTimes_FieldCountRemainsConsistent
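
One way the suggestion above might look, as a sketch rather than the PR's code: construct a `MessagePackReader` over the bytes, deserialize once through it, and then check `reader.End`. It assumes the MessagePack-CSharp package already used by the test; `TrailingBytesCheck`, `DeserializesCompletely`, and `Demo` are hypothetical names, and the sample payload is a small map rather than a real Fluentd Forward Mode message.

```csharp
using System;
using System.Collections.Generic;
using MessagePack;

// Hypothetical helper, not part of the PR.
public static class TrailingBytesCheck
{
    // Returns true only when a single Deserialize call consumes the whole buffer.
    public static bool DeserializesCompletely(byte[] data)
    {
        var reader = new MessagePackReader(new ReadOnlyMemory<byte>(data));
        _ = MessagePackSerializer.Deserialize<object>(ref reader);

        // Leftover bytes mean the header announced fewer fields than were written.
        return reader.End;
    }

    // Builds a valid payload and a copy with one stray value appended,
    // then checks that only the first is reported as fully consumed.
    public static bool Demo()
    {
        byte[] valid = MessagePackSerializer.Serialize(
            new Dictionary<string, int> { ["a"] = 1 });

        byte[] withExtra = new byte[valid.Length + 1];
        Array.Copy(valid, withExtra, valid.Length);
        withExtra[valid.Length] = 0xC0; // MessagePack nil, standing in for an uncounted field

        return DeserializesCompletely(valid) && !DeserializesCompletely(withExtra);
    }
}
```

In an xUnit test, the same check would reduce to `Assert.True(reader.End)` placed right after the deserialize call.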

Assert.Contains("env_cloud_role", mapping.Keys);
Assert.Equal("TestService", mapping["env_cloud_role"]);
Assert.Contains("env_cloud_roleInstance", mapping.Keys);
Assert.Equal("Instance123", mapping["env_cloud_roleInstance"]);

I suggest asserting the number of fields in mapping to ensure there are no extra, unintended fields

/// Under the race condition, different threads can end up with different field counts.
/// </summary>
[Fact]
public void SerializeActivity_DifferentThreads_SameFieldCount()

See my comment above discussing a way to roll this test into SerializeActivity_ConcurrentThreads_ProducesValidMsgPack.
