@jlaundry
Contributor

Summary

The current azure_monitor_logs sink uses the Data Collector API, which has been deprecated and will be removed in September 2026.

This sink uses the replacement Logs Ingestion API.

While I did consider making this a drop-in replacement for the existing sink, users need to make numerous breaking infrastructure changes, including:

  • Creating new Data Collection Endpoint and Data Collection Rule resources
  • Moving from a workspace-based secret key to an OAuth credential (App Registration, Managed Identity, etc.)
  • (optionally) Re-configuring logs to use the built-in tables, instead of _CL custom tables.
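
To make the second bullet concrete, here is a minimal sketch of what an OAuth client-credentials token request looks like when built from the three App Registration values. It assumes the standard Microsoft Entra ID v2.0 token endpoint and the `https://monitor.azure.com/.default` scope; `ClientSecretCredential`, `from_lookup`, and `form_encode` are hypothetical names for illustration only, not code from this PR or from the azure_identity crate.

```rust
/// Percent-encode a value for an application/x-www-form-urlencoded body.
/// RFC 3986 unreserved characters pass through; every other byte becomes %XX.
pub fn form_encode(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Hypothetical holder for the App Registration values (illustration only).
pub struct ClientSecretCredential {
    pub tenant_id: String,
    pub client_id: String,
    pub client_secret: String,
}

impl ClientSecretCredential {
    /// Build from any key lookup; pass `|k| std::env::var(k).ok()` to read
    /// the AZURE_* variables from the process environment.
    pub fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let get = |key: &str| lookup(key).ok_or_else(|| format!("missing {}", key));
        Ok(Self {
            tenant_id: get("AZURE_TENANT_ID")?,
            client_id: get("AZURE_CLIENT_ID")?,
            client_secret: get("AZURE_CLIENT_SECRET")?,
        })
    }

    /// Token endpoint for the tenant (assumed: Entra ID v2.0 endpoint).
    pub fn token_url(&self) -> String {
        format!(
            "https://login.microsoftonline.com/{}/oauth2/v2.0/token",
            self.tenant_id
        )
    }

    /// Form body for the client-credentials grant, scoped to Azure Monitor.
    pub fn token_request_body(&self) -> String {
        format!(
            "grant_type=client_credentials&client_id={}&client_secret={}&scope={}",
            form_encode(&self.client_id),
            form_encode(&self.client_secret),
            form_encode("https://monitor.azure.com/.default"),
        )
    }
}
```

The body would be sent as a POST to `token_url()`, and the returned bearer token replaces the workspace shared key that the old sink used to sign requests.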

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

  1. Following the Tutorial steps, create a Log Analytics workspace, App Registration, Data Collection Endpoint, and Data Collection Rule
  2. Set the AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET environment variables from the App Registration
  3. Use the following vector.yaml:
sources:
  stdin:
    type: stdin

sinks:
  azure:
    type: azure_logs_ingestion
    inputs:
      - stdin
    endpoint: https://dce-e42z.westus2-1.ingest.monitor.azure.com
    dcr_immutable_id: dcr-00000000000000000000000000000000
    stream_name: Custom-vector_CL
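
For readers unfamiliar with the Logs Ingestion API, the three sink options above map directly onto the REST upload URL. The following is a rough sketch of that mapping, assuming API version `2023-01-01`; `LogsIngestionTarget`, `upload_url`, and `request_headers` are hypothetical names, and this is not necessarily how the sink in this PR builds its requests.

```rust
/// The three values from the sink config (hypothetical struct, illustration only).
pub struct LogsIngestionTarget {
    pub endpoint: String,
    pub dcr_immutable_id: String,
    pub stream_name: String,
}

/// API version assumed for this sketch.
pub const API_VERSION: &str = "2023-01-01";

impl LogsIngestionTarget {
    /// URL that a JSON array of log records is POSTed to.
    pub fn upload_url(&self) -> String {
        format!(
            "{}/dataCollectionRules/{}/streams/{}?api-version={}",
            // Tolerate a trailing slash on the configured endpoint.
            self.endpoint.trim_end_matches('/'),
            self.dcr_immutable_id,
            self.stream_name,
            API_VERSION
        )
    }
}

/// Headers for the upload: an OAuth bearer token plus a JSON content type.
pub fn request_headers(bearer_token: &str) -> Vec<(&'static str, String)> {
    vec![
        ("Authorization", format!("Bearer {}", bearer_token)),
        ("Content-Type", "application/json".to_string()),
    ]
}
```

With the config above, `upload_url()` would point at the `Custom-vector_CL` stream of the example Data Collection Rule on the example Data Collection Endpoint.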

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

References

@jlaundry jlaundry requested review from a team as code owners April 20, 2025 23:12
@bits-bot

bits-bot commented Apr 20, 2025

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the "domain: sinks" and "domain: external docs" labels Apr 20, 2025
Signed-off-by: Jed Laundry <[email protected]>
check-spelling run (pull_request_target) for feature-azure_logs_ingestion

Signed-off-by: check-spelling-bot <[email protected]>
on-behalf-of: @check-spelling <[email protected]>
@pront pront self-assigned this Apr 21, 2025
Member

@pront pront left a comment

Hi @jlaundry, thanks for this contribution!

It looks good. Two things:

  • We will need some documentation files. See an example here (all files under website). Note that base/ is generated by make generate-component-docs.
  • Is the intention to completely replace the azure_monitor_logs sink? If that's the case, maybe we can mark the existing one as deprecated in favor of this new sink.

Contributor

@rtrieu rtrieu left a comment

Thank you for your contribution! Approving with one very minor suggestion.

@jlaundry jlaundry changed the title from "feat(azure_logs_ingestion): Initial azure_logs_ingestion sink" to "feat(azure_logs_ingestion sink): Initial azure_logs_ingestion sink" May 13, 2025
Signed-off-by: Jed Laundry <[email protected]>
@pront
Member

pront commented May 13, 2025

Hi @jlaundry, we received this report #23036 and we will be reverting to older azure_* crate versions. Does this affect your PR?

@jlaundry
Contributor Author

Hi @jlaundry, we received this report #23036 and we will be reverting to older azure_* crate versions. Does this affect your PR?

From memory, there is a minor refactor required in https://github.com/jlaundry/vector/blob/02562be6447af36404d8b5668434e317c87a45b2/src/sinks/azure_logs_ingestion/config.rs#L139 to change it back to azure_identity::create_default_credential()?;, and possibly subsequent type changes... but reverting to 0.17 or 0.19 won't fundamentally block this PR, as thankfully I was using the raw REST API 🙂

Probably easiest if you rollback the package first, and then I'll rebase, retest, and push.

@pront
Member

pront commented May 14, 2025

FYI - Reverted deps #23039

@jlaundry jlaundry marked this pull request as draft May 23, 2025 21:06
@jlaundry
Contributor Author

FYI, I started preparing the rebase, but I've seen that the Azure Rust team have recently decided to change how authentication works with the azure_identity SDK: Azure/azure-sdk-for-rust#2283

Depending on what they decide, we may need to explicitly configure credentials, either via the vector config file or environment variables. The azure_blob sink will need a similar change (unless using connection_string config).

So instead of releasing an initial sink, and then requiring another config/environment change, I'll wait until there's stability in the SDK before moving forward with this PR.

(I'm not giving up!!!)

@yoelk

yoelk commented May 29, 2025

Hi @jlaundry,
Thanks a lot for your initiative! At the company I work for, we need exactly this sink.
I went over your PR and I haven't seen explicit auth fields (like connection_string for AzureBlobStorage or shared_key for AzureMonitorLogs). Are they implicit in some way? Or are client secrets not supported for authentication?
I'm asking because our use case is running vector on an AWS machine and connecting to multiple Azure sinks, so if only AAD or ENV_VAR based authentication is currently supported, we wouldn't be able to use it.
Do you consider adding the explicit possibility to use client secrets per sink? I'd be happy to contribute to that.

By the way, I see that the discussion here is closed. Does it mean you will go forward with merging your PR?

Thanks a lot!
Joel

@pront
Member

pront commented Jun 11, 2025

So instead of releasing an initial sink, and then requiring another config/environment change, I'll wait until there's stability in the SDK before moving forward with this PR.

Makes sense. I feel that these crates are unfortunately a bit unstable and each version update is risky. Thank you for your interest in contributing 👍

@Renizmy

Renizmy commented Jul 8, 2025

Hello
@jlaundry, @pront
Is the above-mentioned blocking point still applicable? I don't have a lot of experience in Rust, but I would be interested in working on it.

@pront
Member

pront commented Jul 9, 2025

Hi all, currently the azure_* crates are not in a good state. Coincidentally, @thomasqueirozb was looking at this issue today. He will comment on this PR if we have a solution.

@jlaundry
Contributor Author

For those playing along at home (hi @yoelk @Renizmy), a summary of the current issues and why we're blocked:

  • Vector's azure_blob sink currently uses the azure_storage and azure_storage_blobs crates, which are deprecated/legacy/EOL. The last version released was 0.21.0, which aligns to the 0.21.0 azure_core and azure_identity crates.
  • The proposed replacement azure_storage_blob crate is a ground-up re-implementation, currently in its infancy; Microsoft have a big scary warning that there are bugs, and this crate must not be used in production.
  • In addition, the azure_core and azure_identity 0.22.0 crates changed the Traits of various components, and refactored the project structure, making the updated crates incompatible with the last azure_storage out of the box.
  • I've seen other projects in the same boat do things like compatibility shims to use azure_storage 0.21.0 with more recent azure_core (which is what @thomasqueirozb is working on in chore(azure_blob sink)!: Update azure (0.25) and azure storage (0.21) #23351)... which works, but adds technical debt to each project. The azure_storage crate will need to be removed eventually.
  • But, the bigger issue: for those unfamiliar with Azure workloads, there are a multitude of different ways to get credentials, depending on the deployment type and usage requirements (Managed Identities, Workload Identities, Azure CLI, certificate files... all the way down to good old OAuth Client ID & Secret). Usually, these are abstracted by the language's SDK through the DefaultAzureCredential class.
    • The Go SDK and Python SDK documentation have better descriptions and examples of how this works if you're interested
  • Starting with azure_identity 0.22.0, the Microsoft team decided to make DefaultAzureCredential different for the Rust SDK, and only use development credentials, for unspecified security reasons. While a ChainedTokenCredential was proposed (again, similar to the pattern established in the Go/Python/.NET/JavaScript SDKs), this was also removed.
  • The net result is that Rust projects that upgrade to azure_identity >= 0.22.0 will need to explicitly add configuration for the user to specify what credential type they're intending to use, and then implement a switch to instantiate the appropriate Credential - otherwise, current production deployments that use Managed/Workload Identities or AZURE_* environment variables will just stop working.
  • ... and this morning, I see that Microsoft are considering re-designing the identity library, using cross-compiled .NET code (🫤), so more turbulence is on the horizon...
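
To illustrate the "explicit configuration plus a switch" pattern described in the list above, here is a minimal sketch. Everything in it is hypothetical: `CredentialKind`, the config strings, `TokenSource`, and the stub structs are illustrative stand-ins and are not types from Vector or from the azure_identity crate.

```rust
use std::str::FromStr;

/// Credential types a user could select explicitly in the sink config
/// (hypothetical names; the real set would depend on the SDK).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CredentialKind {
    ClientSecret,
    ManagedIdentity,
    WorkloadIdentity,
    AzureCli,
}

impl FromStr for CredentialKind {
    type Err = String;

    /// Parse the (hypothetical) config value. Unknown values are an error,
    /// so nothing silently falls back to a development-only credential.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "client_secret" => Ok(Self::ClientSecret),
            "managed_identity" => Ok(Self::ManagedIdentity),
            "workload_identity" => Ok(Self::WorkloadIdentity),
            "azure_cli" => Ok(Self::AzureCli),
            other => Err(format!("unknown credential kind: {}", other)),
        }
    }
}

/// Stand-in for whatever token-provider trait the SDK exposes.
pub trait TokenSource {
    fn name(&self) -> &'static str;
}

// Stub credential types; real ones would hold SDK credential objects.
pub struct ClientSecretSource;
pub struct ManagedIdentitySource;
pub struct WorkloadIdentitySource;
pub struct AzureCliSource;

impl TokenSource for ClientSecretSource {
    fn name(&self) -> &'static str { "client_secret" }
}
impl TokenSource for ManagedIdentitySource {
    fn name(&self) -> &'static str { "managed_identity" }
}
impl TokenSource for WorkloadIdentitySource {
    fn name(&self) -> &'static str { "workload_identity" }
}
impl TokenSource for AzureCliSource {
    fn name(&self) -> &'static str { "azure_cli" }
}

/// The "switch": instantiate the credential the user asked for.
pub fn build_credential(kind: &CredentialKind) -> Box<dyn TokenSource> {
    match kind {
        CredentialKind::ClientSecret => Box::new(ClientSecretSource),
        CredentialKind::ManagedIdentity => Box::new(ManagedIdentitySource),
        CredentialKind::WorkloadIdentity => Box::new(WorkloadIdentitySource),
        CredentialKind::AzureCli => Box::new(AzureCliSource),
    }
}
```

The trade-off is the one noted above: users must state their credential type up front, whereas the older DefaultAzureCredential behavior discovered it implicitly.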

What I think this means for this PR, and my thoughts/opinions for the wider project:

  1. Upgrading azure_identity in #23351 will break current users of the azure_blob sink unless they are using a connection_string. Given that this is on the critical path to update Vector to use http 1.x, this is probably still worth doing - but existing users will need to migrate their config to use a connection string.
  2. Once #23351 has been merged, I can then restart development on this PR, and I'll spend some time designing some reusable config options for selecting the appropriate Credential.
  3. Finally, once the azure_storage_blob crate reaches production stability, that's probably the point to migrate the azure_blob sink, and as part of that introduce the various identity config options.

Also note: The existing azure_monitor_logs sink is unaffected by all this drama, because it (only) uses a shared key credential. However, the upstream API is still going to be deprecated September 2026.

@pront
Member

pront commented Sep 4, 2025

  1. Upgrading azure_identity in #23351 will break current users of the azure_blob sink unless they are using a connection_string. Given that this is on the critical path to update Vector to use http 1.x, this is probably still worth doing - but existing users will need to migrate their config to use a connection string.

Hi @jlaundry, I missed the context on this one. How does this break existing users? E.g. assume we keep the connection_string and go ahead with that PR. The old configs will still load. Are you saying that it will break in production? Making connection_string mandatory has other benefits so we will probably go ahead with making it mandatory.

@jlaundry
Contributor Author

jlaundry commented Sep 4, 2025

Hi @jlaundry, I missed the context on this one. How does this break existing users? Making connection_string mandatory has other benefits so we will probably go ahead with making it mandatory.

@pront the current documented behavior of the azure_blob sink is that if the storage_account is specified, it will attempt to load credentials for the account in the following ways, in order:

This is based on the azure_identity <= 0.21.0 behavior of DefaultAzureCredential. Once upgraded, and unless we create our own Credential wrapper struct that re-implements the old behavior, this will change to:

Based on past experience with customers, I expect 60-70% of production users are using connection_string, and the remainder is evenly split between environment variables and Managed Identities - but there's no real way to know for sure until it breaks. I don't think anyone has a valid use case for using an az CLI identity outside development environments.

So yes, I support removing the storage_account field, and forcing everyone to use connection_string until we have a pattern for using environment variables and Managed Identities.

@pront
Member

pront commented Sep 9, 2025

FYI @jlaundry: this #23351 was merged

@sb1-nicolai

Very nice work. Looking forward to seeing this upstream. Right now I am using Logstash with the microsoft-sentinel-log-analytics-logstash-output-plugin plugin as a temporary solution.

@sb1-nicolai

Hi.

Can we expect this to be included in the next release of Vector?

@pront
Member

pront commented Nov 4, 2025

Hi.

Can we expect this to be included in the next release of Vector?

Hi @sb1-nicolai, this depends on @jlaundry and the community. The Vector team is not actively working on this PR.

@jlaundry
Contributor Author

jlaundry commented Nov 5, 2025

While I am still keen to finish this feature in time for the older API deprecation, I don't want to load the project with an unmanageable/unsupported mess. Unfortunately, not much has changed since I wrote #22912 (comment)

The azure_storage_blob crate still doesn't have feature parity with azure_storage_blobs, and the Microsoft Rust team aren't making their roadmap clear. Other projects are continuing to vendor the old azure_storage_blobs crate with shims.

And on the azure_identity side, it's also unclear if they're moving ahead with their (horrible, IMHO) plan to wedge .NET cross-compiled code in, and it appears the product group are resisting adopting the AZURE_TOKEN_CREDENTIALS convention that they've established with the other languages.

@sb1-nicolai if Azure Log Ingestion is important to you and your team, may I please suggest you reach out to your Microsoft CSAM, and ask them to escalate to the product group.

Otherwise... Vector has fantastic support for other cloud logging platforms... 😉

Labels

domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks

Development

Successfully merging this pull request may close these issues.

Azure DCR-based custom logs -> Logs ingestion API

7 participants