Skip to content

Conversation

@RoryCrispin
Copy link
Member

Note to reviewers

These schema changes are committed in the simplest possible way so that we can ship something to core team which will allow them to use the product.

After shipping this, we will rewrite the code such that the columns which are pulled out from the resource attributes are configurable - so that we may then push a PR to the upstream community branch.

There is no intention that these changes remain as a fork on top of the upstream branch.

Follow up with: https://github.com/ClickHouse/data-plane-application/issues/4156

Closes: https://github.com/ClickHouse/data-plane-application/issues/4038

NOTE:
These schema changes are committed in the simplest possible way so that
we can ship something to core team which will allow them to use the
product.

After shipping this, we will rewrite the code such that the columns
which are pulled out from the resource attributes are configurable - so
that we may then push a PR to the upstream community branch.

There is no intention that these changes remain as a fork on top of the
upstream branch.

Please consider this when reviewing.
@CLAassistant
Copy link

CLAassistant commented Apr 12, 2023

CLA assistant check
All committers have signed the CLA.

@RoryCrispin RoryCrispin merged commit d450cea into main Apr 12, 2023
@RoryCrispin RoryCrispin deleted the update-schema branch April 12, 2023 15:25
RoryCrispin added a commit that referenced this pull request May 15, 2023
* Custom ClickHouse Schema changes

NOTE:
These schema changes are committed in the simplest possible way so that
we can ship something to core team which will allow them to use the
product.

After shipping this, we will rewrite the code such that the columns
which are pulled out from the resource attributes are configurable - so
that we may then push a PR to the upstream community branch.

There is no intention that these changes remain as a fork on top of the
upstream branch.

Please consider this when reviewing.
Garbett1 pushed a commit that referenced this pull request Oct 24, 2024
… Histo --> Histogram (open-telemetry#33824)

## Description

This PR adds a custom metric function to the transformprocessor to
convert exponential histograms to explicit histograms.

Link to tracking issue: Resolves open-telemetry#33827

**Function Name**
```
convert_exponential_histogram_to_explicit_histogram
```

**Arguments:**

- `distribution` (_upper, midpoint, uniform, random_)
- `ExplicitBoundaries: []float64`

**Usage example:**

```yaml
processors:
  transform:
    error_mode: propagate
    metric_statements:
    - context: metric
      statements:
        - convert_exponential_histogram_to_explicit_histogram("random", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]) 
```

**Converts:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-31 09:35:25.212037 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
Bucket (32.000000, 64.000000], Count: 10
Bucket (64.000000, 128.000000], Count: 22
Bucket (128.000000, 256.000000], Count: 12
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

**To:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: Histogram
     -> AggregationTemporality: Delta
HistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-30 21:37:07.830902 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
ExplicitBounds #0: 10.000000
ExplicitBounds #1: 20.000000
ExplicitBounds #2: 30.000000
ExplicitBounds #3: 40.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 60.000000
ExplicitBounds #6: 70.000000
ExplicitBounds #7: 80.000000
ExplicitBounds #8: 90.000000
ExplicitBounds #9: 100.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 2
Buckets #4, Count: 5
Buckets #5, Count: 0
Buckets #6, Count: 3
Buckets #7, Count: 7
Buckets #8, Count: 2
Buckets #9, Count: 4
Buckets #10, Count: 21
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

### Testing

- Several unit tests have been created. We have also tested by ingesting
and converting exponential histograms from the `statsdreceiver` as well
as directly via the `otlpreceiver` over grpc over several hours with a
large amount of data.

- We have clients that have been running this solution in production for
a number of weeks.

### Readme description:

### convert_exponential_hist_to_explicit_hist

`convert_exponential_hist_to_explicit_hist([ExplicitBounds])`

the `convert_exponential_hist_to_explicit_hist` function converts an
ExponentialHistogram to an Explicit (_normal_) Histogram.

`ExplicitBounds` is represents the list of bucket boundaries for the new
histogram. This argument is __required__ and __cannot be empty__.

__WARNING:__

The process of converting an ExponentialHistogram to an Explicit
Histogram is not perfect and may result in a loss of precision. It is
important to define an appropriate set of bucket boundaries to minimize
this loss. For example, selecting Boundaries that are too high or too
low may result histogram buckets that are too wide or too narrow,
respectively.

---------

Co-authored-by: Kent Quirk <[email protected]>
Co-authored-by: Tyler Helmuth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants