[mcp] search_traces tool (Phase 2 Step 4)#7858

Merged
yurishkuro merged 5 commits into
mainfrom
copilot/implement-phase-2-step-4
Jan 10, 2026

Copilot AI commented Jan 9, 2026

Part of #7827

Implements the search_traces MCP tool as specified in Phase 2 Step 4 of ADR-002. This tool enables LLM agents to query Jaeger traces efficiently using progressive disclosure—returning only lightweight trace summaries without attributes or events.

Changes

Added internal types package

  • SearchTracesInput: Defines tool parameters with support for RFC3339 and relative time formats (-1h, now)
  • Uses a search_depth parameter (instead of limit) to match Query API terminology; whether it acts as a strict SQL LIMIT clause depends on the backend implementation
  • SearchTracesOutput / TraceSummary: Returns trace metadata (IDs, service/span counts, duration, error status)

Added search_traces handler

  • Wraps QueryService.FindTraces() with MCP-optimized response format
  • Parses relative time expressions and duration strings
  • Filters traces by service, operation, attributes, errors, and duration
  • Builds trace summaries by iterating spans to extract root info, service counts, and error status
  • Implements queryServiceInterface for testability

Registered tool in MCP server

  • Wired handler to existing QueryService instance
  • Tool registered with MCP SDK using typed input/output schemas
  • Description: "Find traces matching service, time, attributes, and duration criteria. Returns trace summary only."

Test coverage

  • Comprehensive unit tests achieving 98% code coverage
  • Tests cover all code paths including error cases, validation, attributes handling, duration filtering, and search depth limits

Example Usage

{
  "start_time_min": "-1h",
  "service_name": "frontend",
  "with_errors": true,
  "duration_min": "2s",
  "search_depth": 10
}

Returns:

{
  "traces": [{
    "trace_id": "1a2b3c4d...",
    "root_service": "frontend",
    "root_operation": "/api/checkout",
    "duration_ms": 2450,
    "span_count": 47,
    "service_count": 8,
    "has_errors": true
  }]
}
Original prompt

Implement Phase 2 Step 4 from ADR 002 as part of issue #7827. State "Part of #7827" in the PR description. Run make fmt, make lint, and make test successfully every time before pushing to the remote branch.



Part of #7827

- Added internal/types package with SearchTracesInput and SearchTracesOutput
- Added internal/handlers package with SearchTracesHandler
- Implemented search_traces tool with time parsing (RFC3339 and relative formats)
- Registered search_traces tool in MCP server
- Added comprehensive unit tests for handler and time parsing
- All tests passing, fmt and lint checks pass

Co-authored-by: yurishkuro <[email protected]>
Copilot AI changed the title [WIP] Implement phase 2 step 4 from ADR 002 Implement search_traces MCP tool (Phase 2 Step 4 of ADR-002) Jan 9, 2026
Copilot AI requested a review from yurishkuro January 9, 2026 22:52

codecov Bot commented Jan 9, 2026

Codecov Report

❌ Patch coverage is 97.58065% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.43%. Comparing base (55fbe8c) to head (7e7fa77).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...nsion/jaegermcp/internal/handlers/search_traces.go | 97.47% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7858      +/-   ##
==========================================
- Coverage   95.46%   95.43%   -0.03%     
==========================================
  Files         300      301       +1     
  Lines       15731    15855     +124     
==========================================
+ Hits        15017    15132     +115     
- Misses        560      566       +6     
- Partials      154      157       +3     
| Flag | Coverage Δ |
|---|---|
| badger_v1 | 9.18% <ø> (ø) |
| badger_v2 | 1.93% <ø> (ø) |
| cassandra-4.x-v1-manual | 13.58% <ø> (ø) |
| cassandra-4.x-v2-auto | 1.92% <ø> (ø) |
| cassandra-4.x-v2-manual | 1.92% <ø> (ø) |
| cassandra-5.x-v1-manual | 13.58% <ø> (ø) |
| cassandra-5.x-v2-auto | 1.92% <ø> (ø) |
| cassandra-5.x-v2-manual | 1.92% <ø> (ø) |
| clickhouse | 1.97% <ø> (ø) |
| elasticsearch-6.x-v1 | 17.54% <ø> (ø) |
| elasticsearch-7.x-v1 | 17.57% <ø> (ø) |
| elasticsearch-8.x-v1 | 17.72% <ø> (ø) |
| elasticsearch-8.x-v2 | 1.93% <ø> (ø) |
| elasticsearch-9.x-v2 | 1.93% <ø> (ø) |
| grpc_v1 | 8.84% <ø> (ø) |
| grpc_v2 | 1.93% <ø> (ø) |
| kafka-3.x-v2 | 1.93% <ø> (ø) |
| memory_v2 | 1.93% <ø> (ø) |
| opensearch-1.x-v1 | 17.61% <ø> (ø) |
| opensearch-2.x-v1 | 17.61% <ø> (ø) |
| opensearch-2.x-v2 | 1.93% <ø> (ø) |
| opensearch-3.x-v2 | 1.93% <ø> (ø) |
| query | 1.93% <ø> (ø) |
| tailsampling-processor | 0.55% <ø> (ø) |
| unittests | 94.06% <97.58%> (-0.02%) ⬇️ |


@yurishkuro yurishkuro added the changelog:experimental Change to an experimental part of the code label Jan 10, 2026
Signed-off-by: Yuri Shkuro <[email protected]>
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/types/types.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/server.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/server.go Outdated

github-actions Bot commented Jan 10, 2026

Metrics Comparison Summary

Total changes across all snapshots: 0

Detailed changes per snapshot

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

➡️ View full metrics file

…, add comprehensive tests

- Renamed Limit field to SearchDepth to match Query API terminology
- Updated description from "metadata only" to "trace summary only"
- Removed unnecessary comment line
- Added comprehensive unit tests achieving 98% code coverage
- Created queryServiceInterface for testability
- Added tests for all code paths including error cases, attributes, durations, and search depth handling

Co-authored-by: yurishkuro <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
@yurishkuro yurishkuro changed the title Implement search_traces MCP tool (Phase 2 Step 4 of ADR-002) [mcp] Phase 2 Step 4: search_traces Jan 10, 2026
@yurishkuro yurishkuro marked this pull request as ready for review January 10, 2026 03:14
@yurishkuro yurishkuro requested a review from a team as a code owner January 10, 2026 03:14
@yurishkuro yurishkuro merged commit bf8038e into main Jan 10, 2026
62 checks passed
@yurishkuro yurishkuro deleted the copilot/implement-phase-2-step-4 branch January 10, 2026 03:19
@yurishkuro yurishkuro changed the title [mcp] Phase 2 Step 4: search_traces [mcp] search_traces tool (Phase 2 Step 4) Jan 11, 2026
yurishkuro pushed a commit that referenced this pull request Apr 10, 2026
)

## Which problem is this PR solving?

When an agent calls `search_traces`, the response includes
`service_count` (an integer) but not the actual service names. The agent
knows a trace spans 5 services but has no idea which ones. To find out,
it must call `get_trace_topology` or `get_span_details` for every trace
individually, which defeats the purpose of a lightweight summary
endpoint.

The data is already computed internally. `buildTraceSummary` builds a
`services` map to derive `service_count`, then discards the map keys.
This has been the case since the function was introduced in #7858, where
the map was built but only `len(services)` was surfaced. Subsequent
changes (#7859, #7863, #7916, #8194) restructured the types and renamed
fields but never revisited the service data gap.

## Short description of the changes

- Added `Services []string` to `TraceSummary`, populated from the
existing `services` map
- Sorted alphabetically via `slices.Sort` for deterministic output
across calls
- Added multi-service test with three services in non-alphabetical
order, unique span IDs, and proper parent-child relationships to verify
sort correctness
- Updated existing summary tests to assert on the new field
- `ServiceCount` preserved for backward compatibility

## Use case

An agent investigating a latency spike searches for slow traces. The
summary now returns:

```json
{
  "service_count": 3,
  "services": ["api-gateway", "payment", "user-service"]
}
```

The agent can immediately see that the payment service is involved and
drill into that trace, instead of blindly fetching topology for every
result.

## How was this change tested?

- `go test ./cmd/jaeger/internal/extension/jaegermcp/...` - all passing
- `make lint` - 0 issues
- `make fmt` - clean
- `make test` - 3053 tests passing

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/main/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully: `make lint test`

## AI Usage in this PR (choose one)
- [x] **Light**: AI provided minor assistance (formatting, simple
suggestions)

Signed-off-by: Roshan Singh <[email protected]>
Signed-off-by: Roshan <[email protected]>

Labels

area/storage changelog:experimental Change to an experimental part of the code enhancement

3 participants