@askvinni askvinni commented Aug 1, 2025

Summary

Keep in mind that I'm nowhere near proficient in Rust. I'm starting to look into Arroyo as a possible streaming engine, but noticed it lacks Azure support. This PR attempts to add it.

The change should be fairly self-explanatory, as the object_store crate includes an Azure builder. Since Azure has a ton of different authentication methods, I'm opting to only offer authentication through the env vars that the object_store builder parses automatically:

* AZURE_STORAGE_ACCOUNT_NAME: storage account name
* AZURE_STORAGE_ACCOUNT_KEY: storage account master key
* AZURE_STORAGE_ACCESS_KEY: alias for AZURE_STORAGE_ACCOUNT_KEY
* AZURE_STORAGE_CLIENT_ID: client id for service principal authorization
* AZURE_STORAGE_CLIENT_SECRET: client secret for service principal authorization
* AZURE_STORAGE_TENANT_ID: tenant id used in oauth flows
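
To make the grouping of these variables concrete, here is a minimal std-only sketch that reports which credential group is fully populated. It is not code from this PR and not the object_store crate's logic: `AzureAuth` and `detect_auth` are hypothetical names, and reporting the account key ahead of the service principal when both are set is an arbitrary choice for illustration, not a claim about the crate's precedence.

```rust
use std::collections::HashMap;

const ACCOUNT_NAME: &str = "AZURE_STORAGE_ACCOUNT_NAME";
const ACCOUNT_KEY: &str = "AZURE_STORAGE_ACCOUNT_KEY";
const ACCESS_KEY: &str = "AZURE_STORAGE_ACCESS_KEY"; // alias for the account key
const SERVICE_PRINCIPAL_VARS: [&str; 3] = [
    "AZURE_STORAGE_CLIENT_ID",
    "AZURE_STORAGE_CLIENT_SECRET",
    "AZURE_STORAGE_TENANT_ID",
];

/// Hypothetical summary of which credential group is complete.
#[derive(Debug, PartialEq)]
pub enum AzureAuth {
    AccountKey,
    ServicePrincipal,
    /// Names of the variables that are still unset or empty.
    Incomplete(Vec<&'static str>),
}

/// Inspects a snapshot of the environment (assumption: empty values count as unset).
pub fn detect_auth(env: &HashMap<String, String>) -> AzureAuth {
    let is_set = |key: &str| env.get(key).map_or(false, |v| !v.is_empty());

    let mut missing: Vec<&'static str> = Vec::new();
    if !is_set(ACCOUNT_NAME) {
        missing.push(ACCOUNT_NAME);
    }

    // Either spelling of the account key satisfies the key-based group.
    let has_key = is_set(ACCOUNT_KEY) || is_set(ACCESS_KEY);
    if !has_key {
        // Without a key, all three service principal variables are needed.
        for name in SERVICE_PRINCIPAL_VARS.iter() {
            if !is_set(*name) {
                missing.push(*name);
            }
        }
    }

    if !missing.is_empty() {
        AzureAuth::Incomplete(missing)
    } else if has_key {
        AzureAuth::AccountKey
    } else {
        AzureAuth::ServicePrincipal
    }
}
```

A caller could build the map with `std::env::vars().collect()` and log the result before starting the container, which makes a missing variable easier to spot than a failed request later on.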

How I tested these changes

Unit tests. I also compiled the image and ran it locally against an Azure account, testing both writes to Azure through the filesystem sink and backend checkpointing.

docker run -p 5115:5115 \
  -e ARROYO__CHECKPOINT_URL="https://my-cool-storageaccount.blob.core.windows.net/my-cool-bucket/arroyo-test" \
  -e AZURE_STORAGE_ACCOUNT_NAME= \
  -e AZURE_STORAGE_CLIENT_ID= \
  -e AZURE_STORAGE_CLIENT_SECRET= \
  -e AZURE_STORAGE_TENANT_ID= \
  arroyo-full

To note

The Delta sink is untested; I expect it to work based on the underlying filesystem implementation, but I can't be sure because I don't have a Delta setup. The docs on storage_provider.get_backing_store() confused me a little: an Azure location would typically carry both the container and the storage account, but the object_store crate also has a few examples that use a URI in the format I passed.
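
For reference, this is how I read the URL format used in the test command: the storage account comes from the host name, the container is the first path segment, and the rest is the key prefix. The sketch below is std-only and purely illustrative. `AzureLocation` and `parse_azure_url` are hypothetical names, not functions from Arroyo or object_store, and it only handles the `https://<account>.blob.core.windows.net/<container>/<prefix>` shape.

```rust
/// Hypothetical breakdown of an Azure Blob Storage URL.
#[derive(Debug, PartialEq)]
pub struct AzureLocation {
    pub account: String,
    pub container: String,
    pub prefix: String,
}

/// Splits `https://<account>.blob.core.windows.net/<container>/<prefix>`.
/// Returns None for any other shape (assumption: no other schemes or hosts are handled).
pub fn parse_azure_url(url: &str) -> Option<AzureLocation> {
    let rest = url.strip_prefix("https://")?;
    // The host ends at the first '/', everything after it is the path.
    let (host, path) = rest.split_once('/')?;
    let account = host.strip_suffix(".blob.core.windows.net")?;
    if account.is_empty() {
        return None;
    }

    // First path segment is the container; the remainder is the key prefix.
    let (container, prefix) = match path.split_once('/') {
        Some((container, prefix)) => (container, prefix.trim_end_matches('/')),
        None => (path, ""),
    };
    if container.is_empty() {
        return None;
    }

    Some(AzureLocation {
        account: account.to_string(),
        container: container.to_string(),
        prefix: prefix.to_string(),
    })
}
```

With the checkpoint URL from the command above, this yields the account `my-cool-storageaccount`, the container `my-cool-bucket`, and the prefix `arroyo-test`.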

@askvinni askvinni marked this pull request as draft August 1, 2025 16:31
@askvinni askvinni marked this pull request as ready for review August 1, 2025 17:16

@mwylde mwylde left a comment

Sorry for the slow review, this looks good!

@mwylde mwylde force-pushed the azure-storage-support branch from a4fdb22 to e436d47 Compare August 20, 2025 18:51
@mwylde mwylde merged commit 01f120f into ArroyoSystems:master Aug 20, 2025
3 checks passed