Add Go-specific telemetry design document #298

Merged
samikshya-db merged 1 commit into `main` from `telemetry/go-design-doc`
Nov 5, 2025
@samikshya-db
Collaborator

Summary

This PR adds a comprehensive telemetry design document for the databricks-sql-go driver. It adapts the C#/.NET ADBC driver's telemetry design to Go best practices and idiomatic patterns.

Go-Specific Adaptations

This design document has been completely rewritten to align with Go conventions and the existing codebase patterns:

1. Replaced C#/.NET Concepts with Go Equivalents

| C#/.NET Pattern | Go Pattern |
| --- | --- |
| Activity/ActivitySource | context.Context + middleware interceptors |
| ActivityListener | Custom telemetry interceptor pattern |
| async/await | Goroutines and channels |
| ConcurrentDictionary | map with sync.RWMutex |
| IDisposable | Close() methods |
| C# namespaces | Go packages |

2. Applied Go Naming Conventions

  • Unexported types: featureFlagCache, clientManager, metricsAggregator (lowercase for internal types)
  • Exported functions: Following Go conventions for public APIs
  • Idiomatic names: mu for mutex, cfg for config, ctx for context
  • Package naming: Single lowercase word (telemetry)

3. Idiomatic Go Code Patterns

Concurrency & Thread Safety

```go
// Singleton with sync.Once
var (
	managerOnce     sync.Once
	managerInstance *clientManager
)

func getClientManager() *clientManager {
	managerOnce.Do(func() {
		managerInstance = &clientManager{
			clients: make(map[string]*clientHolder),
		}
	})
	return managerInstance
}

// Thread-safe operations with RWMutex.
// The write lock (Lock, not RLock) is taken because get-or-create may mutate the map.
func (m *clientManager) getOrCreateClient(host string /* , ... */) *telemetryClient {
	m.mu.Lock()
	defer m.mu.Unlock()
	// ...
}
```

Context Propagation

```go
// Context-based metric collection
func (i *interceptor) beforeExecute(ctx context.Context, statementID string) context.Context {
	mc := &metricContext{
		statementID: statementID,
		startTime:   time.Now(),
		tags:        make(map[string]interface{}),
	}
	return withMetricContext(ctx, mc)
}
```

Error Handling

```go
// Defer/recover pattern for error swallowing
func recoverAndLog(operation string) {
	if r := recover(); r != nil {
		// Log at trace level only
	}
}

func (i *interceptor) afterExecute(ctx context.Context, err error) {
	defer recoverAndLog("afterExecute")
	// Telemetry logic
}
```
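One detail worth spelling out: `recover` only stops a panic when it is called directly by the deferred function, which `defer recoverAndLog(...)` satisfies. The sketch below demonstrates the swallowing behaviour. `lastRecovered` stands in for the trace-level logger and `riskyTelemetry` is a hypothetical telemetry step; neither comes from the design document.

```go
package main

import "fmt"

// lastRecovered stands in for a trace-level logger in this sketch.
var lastRecovered string

// recoverAndLog must be the deferred function itself: recover returns nil
// when it is called from a function nested below the deferred one.
func recoverAndLog(operation string) {
	if r := recover(); r != nil {
		lastRecovered = fmt.Sprintf("%s: %v", operation, r)
	}
}

// riskyTelemetry is a hypothetical telemetry step that panics.
func riskyTelemetry() {
	defer recoverAndLog("riskyTelemetry")
	panic("simulated telemetry bug")
}

func main() {
	riskyTelemetry() // the panic does not reach the caller
	fmt.Println("driver continues; swallowed:", lastRecovered)
}
```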

4. Async Patterns with Goroutines

```go
// Background flush loop
func (agg *metricsAggregator) flushLoop() {
	ticker := time.NewTicker(agg.flushInterval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			agg.flush(context.Background())
		case <-agg.stopCh:
			return
		}
	}
}

// Async export (fragment: runs inside a method that has ctx and metrics in scope)
go func() {
	defer recoverAndLog("export")
	agg.exporter.export(ctx, metrics)
}()
```

5. Standard Library Integration

  • `net/http`: HTTP client for telemetry export
  • `context.Context`: Cancellation and deadline propagation
  • `time`: Timers, tickers, and duration handling
  • `sync`: Mutexes, WaitGroups, and Once
  • `encoding/json`: Metric serialization

6. Driver Integration Points

In `connector.go`

```go
func (c *connector) Connect(ctx context.Context) (driver.Conn, error) {
	// ... existing code ...

	if c.cfg.telemetryEnabled {
		conn.telemetry = newTelemetryInterceptor(conn.id, c.cfg)
		conn.telemetry.recordConnection(ctx, tags)
	}

	return conn, nil
}
```

In `statement.go`

```go
// Named results let the deferred closure observe the final value of err.
func (s *stmt) QueryContext(ctx context.Context, args []driver.NamedValue) (rows driver.Rows, err error) {
	if s.conn.telemetry != nil {
		ctx = s.conn.telemetry.beforeExecute(ctx, statementID)
		defer func() {
			s.conn.telemetry.afterExecute(ctx, err)
		}()
	}
	// ... existing implementation ...
}
```

7. Testing Strategy

  • Unit tests: Standard `*testing.T` patterns
  • Integration tests: Using `testing.Short()` for skip flags
  • Benchmarks: `BenchmarkXxx` functions to measure overhead
  • Table-driven tests: Go idiomatic test patterns

```go
func BenchmarkInterceptor_Overhead(b *testing.B) {
	// ... setup ...
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Use a per-iteration context so the value chain does not grow with b.N.
		execCtx := interceptor.beforeExecute(ctx, "stmt-123")
		interceptor.afterExecute(execCtx, nil)
	}
}
```

Key Design Features

Per-Host Resource Management

  • Feature Flag Cache: Singleton per host with reference counting (15min TTL)
  • Telemetry Client: One shared client per host to prevent rate limiting
  • Circuit Breaker: Per-host protection against failing endpoints

Privacy & Security

  • ✅ No PII collected (no SQL queries, user data, or credentials)
  • ✅ Tag filtering ensures only approved metrics exported
  • ✅ All sensitive info excluded from Databricks export
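Tag filtering of this kind is often implemented as an allowlist. A minimal sketch, assuming a hypothetical `exportableTags` set and a hypothetical `filterTags` helper; the tag names are invented for illustration, and the approved set would be defined in `tags.go`.

```go
package main

import "fmt"

// exportableTags is a hypothetical allowlist of tag keys approved for export.
var exportableTags = map[string]bool{
	"driver.version":       true,
	"statement.latency_ms": true,
}

// filterTags returns a copy of tags containing only allowlisted keys, so
// anything not explicitly approved is dropped by default.
func filterTags(tags map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(tags))
	for k, v := range tags {
		if exportableTags[k] {
			out[k] = v
		}
	}
	return out
}

func main() {
	filtered := filterTags(map[string]interface{}{
		"driver.version":       "1.0.0",
		"statement.latency_ms": 42,
		"sql.text":             "SELECT 1", // not allowlisted, so it is dropped
	})
	fmt.Println(len(filtered), filtered["driver.version"])
}
```

An allowlist fails closed: a newly introduced tag is not exported until someone adds it deliberately.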

Reliability

  • ✅ All telemetry errors swallowed (never impacts driver)
  • ✅ Circuit breaker prevents cascade failures
  • ✅ Graceful shutdown with proper resource cleanup
  • ✅ Terminal vs retryable error classification

File Structure

```
telemetry/
├── DESIGN.md # This comprehensive design document
├── config.go # Configuration types
├── tags.go # Tag definitions and filtering
├── featureflag.go # Per-host feature flag caching
├── manager.go # Per-host client management
├── circuitbreaker.go # Circuit breaker implementation
├── interceptor.go # Telemetry interceptor
├── aggregator.go # Metrics aggregation
├── exporter.go # Export to Databricks
├── client.go # Telemetry client
├── errors.go # Error classification
└── *_test.go # Test files
```

Alignment with Existing Codebase

This design follows patterns observed in:

  • `connection.go`: Connection lifecycle management
  • `connector.go`: Factory patterns and options
  • `internal/config/config.go`: Configuration structures
  • `internal/client/client.go`: HTTP client patterns

Next Steps

This is a design document only. Implementation will be tracked in separate PRs following the implementation checklist in the design.

Related Work

  • Based on JDBC driver telemetry implementation patterns
  • Adapted from C#/.NET ADBC driver design
  • Follows Go best practices and standard library patterns

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

This design document provides a comprehensive telemetry system design
adapted specifically for the databricks-sql-go driver following Go
best practices and idiomatic patterns.

Key Go-specific adaptations:
- Replaced C# Activity/ActivitySource with context.Context and interceptors
- Used goroutines and channels for async operations
- Applied sync.RWMutex and sync.Once for thread-safe singletons
- Implemented circuit breaker pattern with Go idioms
- Used defer/recover for error handling
- Followed Go naming conventions (unexported types, camelCase)
- Designed around standard library patterns (http.Client, context)
- Included Go-specific testing patterns (unit, integration, benchmarks)

@samikshya-db samikshya-db merged commit cf6e3cc into main Nov 5, 2025
2 of 3 checks passed
@samikshya-db samikshya-db deleted the telemetry/go-design-doc branch November 5, 2025 18:27
@samikshya-db
Collaborator Author

Merging this now; verified against Go best practices.
