Skip to content

Conversation

@zeitlinger
Copy link
Member

No description provided.

@zeitlinger zeitlinger requested review from a team July 7, 2023 11:12
zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 10, 2023
zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 10, 2023
zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 10, 2023
zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 10, 2023
| Name | Instrument Type ([*](/docs/general/metrics-general.md#instrument-types)) | Unit | Unit ([UCUM](/docs/general/metrics-general.md#instrument-units)) | Description |
|------------------------|---------------------------------------------------|--------------|-------------------------------------------|------------------------------------------------------------------------------|
| `faas.invoke_duration` | Histogram | milliseconds | `ms` | Measures the duration of the invocation |
| `faas.init_duration` | Histogram | milliseconds | `ms` | Measures the duration of the function's initialization, such as a cold start |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these changes come with explicit bucket boundary advice so that the distribution is useful out of the box?

This question applies to all the histogram metrics that appear in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch - added

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is a breaking change, I think we should treat it similarly to the HTTP semconv breaking changes and give guidance on how to migrate and flag-guard this change.

At a minimum, I'd like a plan for how to roll out this change to instrumentation prior to accepting the PR.

See: https://github.com/open-telemetry/semantic-conventions/tree/main/docs/http

@zeitlinger
Copy link
Member Author

Given this is a breaking change, I think we should treat it similarly to the HTTP semconv breaking changes and give guidance on how to migrate and flag-guard this change.

At a minimum, I'd like a plan for how to roll out this change to instrumentation prior to accepting the PR.

See: https://github.com/open-telemetry/semantic-conventions/tree/main/docs/http

What do you think about re-using the feature flag you linked, i.e. to use seconds if OTEL_SEMCONV_STABILITY_OPT_IN==http?

zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 14, 2023
zeitlinger added a commit to zeitlinger/opentelemetry-java-instrumentation that referenced this pull request Jul 14, 2023
@jsuereth
Copy link
Contributor

What do you think about re-using the feature flag you linked, i.e. to use seconds if OTEL_SEMCONV_STABILITY_OPT_IN==http?

FYI - reusing this is the right thing to do. You can add one for database, fass, rpc, etc. However we may want to tie changing these to an overall effort to stabilize those components.

Do you plan to make progress on this PR or should we close and tackle individual semconv areas separately?

@pyohannes
Copy link
Contributor

Let's discuss how to continue with those changes. Options I see:

  1. Start stabilization efforts for all relates areas (databases, FaaS, RPC) in order to establish a compatibility time window and rules for this change and other breaking changes.

    Probably not realistic at the moment?

  2. Do nothing now and wait for stabilization efforts in related areas to start at some point in the future.

    To my knowledge there's not timeline for starting those efforts. Assuming increasing adoption of semantic conventions, breaking changes in the future will be more painful.

  3. Find a way to do this changes without starting stabilization efforts in related areas.

    Keeps semantic conventions consistent with its own requirements, maybe takes some pain out of future stabilization efforts? However, OTEL_SEMCONV_STABILITY_OPT_IN might be a misleading mechanism to use.

@pyohannes
Copy link
Contributor

pyohannes commented Oct 2, 2023

Discussed in today's semantic convention SIG meeting.

FaaS metrics don't seem to be used currently (at least we didn't find any evidence for it), RPC and database metrics are used by OTel and third-party libraries.

The proposed approach:

  1. Close this PR and open three issues for the respective areas (duration metrics for FaaS, RPC, and databases).
  2. Update FaaS metrics right away, if we don't find any evidence that they're currently used (maybe check with the FaaS SIG).
  3. Have the remaining issues (RPC and databases) tackled during stabilization efforts for those areas. @trask mentioned he plans to start stabilization efforts for database clients after HTTP is stable.

@jsuereth What do you think?

@jsuereth
Copy link
Contributor

jsuereth commented Oct 9, 2023

@pyohannes The plan looks good to me.

@pyohannes
Copy link
Contributor

As proposed in #176 (comment), this PR will be closed and the changes it proposes will be addressed by the following issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants