Add Python scripts and documentation for MongoDB Queryable Encryption with Azure Key Vault #19
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,154 @@ | ||
| # MongoDB Queryable Encryption Tutorial (Python) | ||
| **Automatic Client-Side Field Level Encryption with Azure Key Vault – Including CMK Rotation in Atlas** | ||
|
|
||
| ## Overview | ||
|
|
||
| This repository demonstrates how to set up [MongoDB Queryable Encryption (QE)](https://www.mongodb.com/docs/manual/core/queryable-encryption/#std-label-qe-manual-feature-qe) using Python and Azure Key Vault, including secure Data Encryption Key (DEK) management and rewrapping after Customer Master Key (CMK) rotation in MongoDB Atlas. | ||
|
|
||
| Queryable Encryption allows you to **encrypt sensitive data client side**, perform expressive queries on encrypted fields, and manage your encryption keys securely with cloud KMS providers such as Azure Key Vault. | ||
|
|
||
| ## Features | ||
|
|
||
| - **Create encrypted MongoDB collections** with [automatic encryption](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#std-label-qe-csfle-install-library) | ||
| - **Encrypt and decrypt fields transparently** in application code | ||
| - Use [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) for secure key management (CMK) | ||
| - **Rewrap DEKs** (re-encrypt each DEK under the new CMK version) after CMK rotation | ||
| - Full Python demo including helper functions, insertion, and querying | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| ### Software | ||
|
|
||
| - **Python 3** | ||
| - [MongoDB Atlas Cluster](https://www.mongodb.com/cloud/atlas/register) | ||
| - [PyMongo Driver](https://www.mongodb.com/docs/languages/python/pymongo-driver/current/) (`>=4.4`) | ||
| - [pymongocrypt](https://pypi.org/project/pymongocrypt/) (`>=1.6`) | ||
| - Automatic Encryption Shared Library ([crypt_shared](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#automatic-encryption-shared-library)) | ||
|
|
||
| ### Cloud Providers (Azure) | ||
|
|
||
| - [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) with your **CMK** | ||
| - [Register your application in Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app) | ||
| - Assign the application the **Key Vault Administrator** role, or permissions to wrap/unwrap keys | ||
|
|
||
| ### Other Supported KMS Providers | ||
| - AWS, GCP, KMIP, or local (see `.env` placeholders) | ||
|
|
||
| --- | ||
|
|
||
| ## Getting Started | ||
|
|
||
| ### 1. Clone This Repository | ||
alexchengpeng marked this conversation as resolved.
|
||
|
|
||
| ```bash | ||
| git clone https://github.com/<your-org>/<your-repo>.git | ||
| cd <your-repo>/mongodb-qe-tutorial | ||
| ``` | ||
|
|
||
| ### 2. Populate Environment Variables | ||
|
|
||
| Edit the **.env** file and replace all placeholder values (`<Your ...>`) with your credentials. | ||
|
|
||
| ```bash | ||
| # Azure Example: | ||
| export AZURE_TENANT_ID="<Your Azure tenant ID>" | ||
| export AZURE_CLIENT_ID="<Your Azure client ID>" | ||
| export AZURE_CLIENT_SECRET="<Your Azure client secret>" | ||
| export AZURE_KEY_NAME="<Your Azure Key Name>" | ||
| export AZURE_KEY_VERSION="<Your Azure Key Version>" | ||
| export AZURE_KEY_VAULT_ENDPOINT="<Your Azure Key Vault Endpoint>" | ||
| export KEY_VAULT_MONGODB_URI="<Your Atlas Connection String>" | ||
| export MONGODB_URI="<Your Atlas Connection String>" | ||
| export SHARED_LIB_PATH="/full/path/to/mongo_crypt_v1.so" | ||
| ... | ||
| ``` | ||
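Before running the scripts, it can help to confirm that every variable has a real value. The following is a minimal sketch, assuming the variable names from the Azure example above; `missing_env_vars` is a hypothetical helper and not part of this repository.

```python
import os

# Names taken from the Azure example above.
REQUIRED_AZURE_VARS = [
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
    "AZURE_KEY_NAME",
    "AZURE_KEY_VERSION",
    "AZURE_KEY_VAULT_ENDPOINT",
    "KEY_VAULT_MONGODB_URI",
    "MONGODB_URI",
    "SHARED_LIB_PATH",
]


def missing_env_vars(required=REQUIRED_AZURE_VARS, env=None):
    """Return the required names that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        # Placeholders in the template look like "<Your ...>".
        if not value or value.startswith("<Your"):
            missing.append(name)
    return missing
```

Calling `missing_env_vars()` after `load_dotenv()` returns an empty list when everything is set, so a script could exit early with a readable message instead of failing later with a `KeyError`.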
|
|
||
| See `.env` in repo for a full example including other KMS providers. | ||
|
Collaborator
As a best practice, we don't add `.env` files to the repo, so I would remove this. |
||
|
|
||
| ### 3. Install Python Dependencies | ||
|
|
||
| ```bash | ||
| python -m pip install -r requirements.txt | ||
| ``` | ||
|
|
||
| ### 4. Download Automatic Encryption Shared Library | ||
|
|
||
| Follow [these instructions](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#automatic-encryption-shared-library) to download the correct library for your system (`mongo_crypt_v1.so` on Linux, `mongo_crypt_v1.dylib` on macOS), and record its full path as `SHARED_LIB_PATH` in your `.env`. | ||
|
|
||
| --- | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Step 1: Create Key Vault and Encrypted Collection | ||
|
|
||
| This script creates the **key vault collection** (to hold your DEKs) and sets up an **encrypted collection** for your data. | ||
|
|
||
| ```bash | ||
| python create_encrypted_collections.py | ||
| ``` | ||
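The helper module is not shown in this part of the diff. As a rough sketch of what Step 1 needs from it for Azure, PyMongo expects a `kms_providers` dictionary keyed by provider name and a `master_key` dictionary that identifies the CMK. The function names below mirror the calls made in `create_encrypted_collections.py`, but their bodies (and the optional `env` parameter) are assumptions, not the repository's code.

```python
import os


def get_kms_provider_credentials(kms_provider_name, env=None):
    """Build the kms_providers dict (assumed implementation, Azure only)."""
    env = os.environ if env is None else env
    if kms_provider_name != "azure":
        raise ValueError(f"This sketch only covers 'azure', got {kms_provider_name!r}")
    return {
        "azure": {
            "tenantId": env["AZURE_TENANT_ID"],
            "clientId": env["AZURE_CLIENT_ID"],
            "clientSecret": env["AZURE_CLIENT_SECRET"],
        }
    }


def get_customer_master_key_credentials(kms_provider_name, env=None):
    """Build the master_key dict that identifies the CMK (assumed implementation)."""
    env = os.environ if env is None else env
    if kms_provider_name != "azure":
        raise ValueError(f"This sketch only covers 'azure', got {kms_provider_name!r}")
    return {
        "keyName": env["AZURE_KEY_NAME"],
        "keyVersion": env["AZURE_KEY_VERSION"],
        "keyVaultEndpoint": env["AZURE_KEY_VAULT_ENDPOINT"],
    }
```

Keeping the two dictionaries separate matches how they are used: the provider credentials authenticate the application to Azure, while the master key dictionary is stored with each DEK so the driver knows which key version wrapped it.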
|
|
||
| ### Step 2: Insert Encrypted Document | ||
|
|
||
| This script uses automatic encryption to insert a document with encrypted fields. | ||
|
|
||
| ```bash | ||
| python insert_encrypted_doc.py | ||
| ``` | ||
|
|
||
| **Sample output:** | ||
| ```plaintext | ||
| Successfully inserted another patient with ssn: 123-45-6789 | ||
| {...decrypted document...} | ||
| ``` | ||
|
|
||
| ### Step 3: Rotate Your CMK in Azure Key Vault | ||
|
|
||
| - Use the Azure Portal to [rotate your root key](https://learn.microsoft.com/en-us/azure/key-vault/keys/change-key-version). | ||
| - Record the new version in your `.env` if needed. | ||
|
|
||
| ### Step 4: Rewrap Data Encryption Keys (DEKs) | ||
|
|
||
| After CMK rotation, rewrap all the DEKs in MongoDB – they’ll be wrapped under the new version of your master key and remain usable. | ||
|
|
||
| Edit `rewrap_deks.py` with your new CMK details if needed: | ||
|
|
||
| ```bash | ||
| python rewrap_deks.py | ||
| ``` | ||
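`rewrap_deks.py` is not shown in this part of the diff. One way to perform the rewrap is PyMongo's `ClientEncryption.rewrap_many_data_key`. The sketch below assumes an already constructed `ClientEncryption` object and a `new_master_key` dictionary in the same shape used when the DEKs were created, with `keyVersion` set to the rotated version. `rewrap_all_deks` is a hypothetical name.

```python
def rewrap_all_deks(client_encryption, kms_provider_name, new_master_key):
    """Rewrap every DEK in the key vault under new_master_key.

    Returns the number of key documents that were modified.
    """
    # An empty filter matches all key documents in the key vault collection.
    result = client_encryption.rewrap_many_data_key(
        {}, provider=kms_provider_name, master_key=new_master_key
    )
    bulk = result.bulk_write_result
    # bulk_write_result is None when no key documents matched the filter.
    return 0 if bulk is None else bulk.modified_count
```

Rewrapping only re-encrypts the DEKs themselves. The documents in the encrypted collection are untouched, which is why the data remains readable after the operation.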
|
|
||
| --- | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Common Issues | ||
|
|
||
| - **"Not all keys were satisfied":** | ||
| If demo code is run multiple times without dropping collections, documents may be encrypted under keys that are lost or missing. Drop your vault and collection, restart, and generate keys once. | ||
|
|
||
| - **Shared library load errors:** | ||
| Example: | ||
| ``` | ||
| Error while opening candidate for crypt_shared dynamic library [/path/mongo_crypt_v1.so] | ||
| ``` | ||
| - Ensure your library matches your OS and CPU arch (`file mongo_crypt_v1.so`, `uname -a`) | ||
| - Path must be correct and the file must be present | ||
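The path checks above can be sketched as a small pre-flight function. This is an illustration only: `check_crypt_shared_path` is a hypothetical helper, and it verifies the file's presence and extension but not the CPU architecture, which still needs `file` and `uname -a`.

```python
import os
import platform

# Library file extension expected on each operating system.
_SUFFIX_BY_SYSTEM = {"Linux": ".so", "Darwin": ".dylib", "Windows": ".dll"}


def check_crypt_shared_path(path, system=None):
    """Return a list of human-readable problems with the library path."""
    system = platform.system() if system is None else system
    problems = []
    if not os.path.isfile(path):
        problems.append(f"No file found at {path}")
    expected = _SUFFIX_BY_SYSTEM.get(system)
    if expected and not path.endswith(expected):
        problems.append(f"Expected a '{expected}' library on {system}")
    return problems
```

An empty list means both checks passed. Running it against `os.environ["SHARED_LIB_PATH"]` before constructing the encrypted client gives a clearer message than the driver's load error.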
|
|
||
| --- | ||
|
|
||
| ## File Reference | ||
|
|
||
| - `requirements.txt` – Python package requirements | ||
| - `.env` – Environment variables for all supported KMS providers | ||
|
Collaborator
I would remove `.env` here as well, since this file won't be in the repository. |
||
| - `queryable_encryption_helpers.py` – Helper functions for KMS credentials and encryption setup | ||
| - `create_encrypted_collections.py` – Create vault, DEKs, and encrypted collection | ||
| - `insert_encrypted_doc.py` – Insert and query encrypted documents | ||
| - `rewrap_deks.py` – Rewrap DEKs after master key rotation | ||
|
|
||
| --- | ||
|
|
||
| ## References & Documentation | ||
|
|
||
| - [Queryable Encryption Tutorials](https://www.mongodb.com/docs/manual/core/queryable-encryption/tutorials/#queryable-encryption-tutorials) | ||
| - [Queryable Encryption Quick Start](https://www.mongodb.com/docs/manual/core/queryable-encryption/quick-start/#queryable-encryption-quick-start) | ||
| - [MongoDB Atlas](https://www.mongodb.com/docs/atlas/) | ||
| - [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,119 @@ | ||
| from pymongo import MongoClient  # Client class for connecting to MongoDB servers/clusters | ||
| import queryable_encryption_helpers as helpers  # Our helper functions | ||
| import os  # For reading environment variables | ||
| from dotenv import load_dotenv  # Loads variables from a .env file into the environment | ||
|
|
||
| load_dotenv() #Loads the values in a .env file | ||
|
|
||
| # start-setup-application-variables | ||
| kms_provider_name = "azure" | ||
|
|
||
| # URIs for Atlas clusters | ||
| key_vault_uri = os.environ['KEY_VAULT_MONGODB_URI'] # Key Vault Cluster! | ||
| data_uri = os.environ['MONGODB_URI'] # Application Data Cluster! | ||
|
|
||
| key_vault_database_name = "queryable_encryption" | ||
| key_vault_collection_name = "queryable_keyVault" | ||
| key_vault_namespace = f"{key_vault_database_name}.{key_vault_collection_name}" | ||
| encrypted_database_name = "mongoMedicalRecords" | ||
| encrypted_collection_name = "mongoDBpatients" | ||
|
|
||
|
|
||
|
|
||
| kms_provider_credentials = helpers.get_kms_provider_credentials(kms_provider_name) | ||
| customer_master_key_credentials = helpers.get_customer_master_key_credentials(kms_provider_name) | ||
|
|
||
| #Drop old collections for a fresh setup | ||
| data_client = MongoClient(data_uri) | ||
| try: | ||
|     data_client[encrypted_database_name][encrypted_collection_name].drop() | ||
| except Exception: | ||
|     pass | ||
|
|
||
| key_vault_client = MongoClient(key_vault_uri) | ||
| try: | ||
|     key_vault_client[key_vault_database_name][key_vault_collection_name].drop() | ||
| except Exception: | ||
|     pass | ||
|
|
||
| # ---- Ensure the key vault collection has a unique index on keyAltNames ---- | ||
| key_vault_client[key_vault_database_name][key_vault_collection_name].create_index( | ||
| "keyAltNames", | ||
| unique=True, | ||
| partialFilterExpression={"keyAltNames": {"$exists": True}} #Creates a unique index only on documents that actually have keyAltNames (not all do). | ||
| ) | ||
| print("Created unique index on keyAltNames for key vault collection.") | ||
|
|
||
| # Set Up the ClientEncryption Object | ||
| #Initializes an object that lets you securely create and use data encryption keys (DEKs). | ||
| #Uses the key vault, KMS, credentials, and collection namespace. | ||
| client_encryption = helpers.get_client_encryption( | ||
| key_vault_client, | ||
| kms_provider_name, | ||
| kms_provider_credentials, | ||
| key_vault_namespace | ||
| ) | ||
|
|
||
|
|
||
|
|
||
| # ---- Create DEKs with keyAltNames (one per field) ---- | ||
| ssn_altname = f"{encrypted_database_name}.ssn" | ||
| billing_altname = f"{encrypted_database_name}.billing" | ||
|
|
||
| # Create each DEK only once and record its key ID. | ||
| # Each key ID is a BSON Binary (UUID, subtype 4); the IDs are used below in the encrypted fields map. | ||
| ssn_key_id = client_encryption.create_data_key( | ||
| kms_provider_name, | ||
| master_key=customer_master_key_credentials, | ||
| key_alt_names=[ssn_altname] | ||
| ) | ||
| billing_key_id = client_encryption.create_data_key( | ||
| kms_provider_name, | ||
| master_key=customer_master_key_credentials, | ||
| key_alt_names=[billing_altname] | ||
| ) | ||
| print(f"Created SSN Key ID: {ssn_key_id}") | ||
| print(f"Created Billing Key ID: {billing_key_id}") | ||
|
|
||
| # Save the DEK IDs for later use, e.g. by insert_encrypted_doc.py | ||
| with open("ssn_key_id.bin", "wb") as f: | ||
|     f.write(ssn_key_id) | ||
| with open("billing_key_id.bin", "wb") as f: | ||
|     f.write(billing_key_id) | ||
|
|
||
|
|
||
| # start-encrypted-fields-map | ||
|
|
||
| encrypted_fields_map = { | ||
| "fields": [ | ||
| { | ||
| "path": "patientRecord.ssn", | ||
| "bsonType": "string", | ||
| "queries": [{"queryType": "equality"}], | ||
| "keyId": ssn_key_id | ||
| }, | ||
| { | ||
| "path": "patientRecord.billing", | ||
| "bsonType": "object", | ||
| "keyId": billing_key_id | ||
| } | ||
| ] | ||
| } | ||
|
|
||
|
|
||
| # creates a new collection in your MongoDB data cluster. | ||
|
|
||
| try: | ||
|     client_encryption.create_encrypted_collection( | ||
|         data_client[encrypted_database_name], | ||
|         encrypted_collection_name, | ||
|         encrypted_fields_map, | ||
|         kms_provider_name, | ||
|         customer_master_key_credentials, | ||
|     ) | ||
|     print("Encrypted collection created successfully.") | ||
| except Exception as e: | ||
|     print("Unable to create encrypted collection due to the following error:", e) | ||
|
|
||
| data_client.close() | ||
| key_vault_client.close() |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,96 @@ | ||
| from pymongo import MongoClient | ||
| import queryable_encryption_helpers as helpers | ||
| import os | ||
| from dotenv import load_dotenv | ||
| from random import randint | ||
| load_dotenv() | ||
|
|
||
| kms_provider_name = "azure" | ||
| uri = os.environ['MONGODB_URI'] | ||
|
|
||
| key_vault_database_name = "queryable_encryption" | ||
| key_vault_collection_name = "queryable_keyVault" | ||
| key_vault_namespace = f"{key_vault_database_name}.{key_vault_collection_name}" | ||
| encrypted_database_name = "mongoMedicalRecords" | ||
| encrypted_collection_name = "mongoDBpatients" | ||
|
|
||
| kms_provider_credentials = helpers.get_kms_provider_credentials(kms_provider_name) | ||
|
|
||
| # --- Connect to key vault and retrieve DEKs by keyAltName --- | ||
| key_vault_client = MongoClient(os.environ['KEY_VAULT_MONGODB_URI']) | ||
| key_vault_coll = key_vault_client[key_vault_database_name][key_vault_collection_name] | ||
|
|
||
| ssn_key_id = key_vault_coll.find_one({"keyAltNames": "mongoMedicalRecords.ssn"})["_id"] | ||
| billing_key_id = key_vault_coll.find_one({"keyAltNames": "mongoMedicalRecords.billing"})["_id"] | ||
|
|
||
| encrypted_fields_map = { | ||
| f"{encrypted_database_name}.{encrypted_collection_name}": { | ||
| "fields": [ | ||
| { | ||
| "path": "patientRecord.ssn", | ||
| "bsonType": "string", | ||
| "queries": [{"queryType": "equality"}], | ||
| "keyId": ssn_key_id | ||
| }, | ||
| { | ||
| "path": "patientRecord.billing", | ||
| "bsonType": "object", | ||
| "keyId": billing_key_id | ||
| } | ||
| ] | ||
| } | ||
| } | ||
|
|
||
| # Reuse the key_vault_client opened above; passing it explicitly lets the key vault live on a separate cluster (recommended for Atlas multi-region) | ||
|
|
||
| auto_encryption_options = helpers.get_auto_encryption_options( | ||
| kms_provider_name, | ||
| key_vault_namespace, | ||
| kms_provider_credentials, | ||
| encrypted_fields_map=encrypted_fields_map, | ||
| key_vault_client=key_vault_client | ||
| ) | ||
|
|
||
| # Set up the encrypted client | ||
| encrypted_client = MongoClient(uri, auto_encryption_opts=auto_encryption_options) | ||
|
|
||
| # Get the encrypted collection reference | ||
| encrypted_collection = encrypted_client[encrypted_database_name][encrypted_collection_name] | ||
|
|
||
| ssn = f"{randint(100, 999)}-{randint(10,99)}-{randint(1000,9999)}" | ||
| new_patient = { | ||
| "patientName": f"Alice Charles {randint(1, 1000)}", # Randomize | ||
| "patientId": randint(10000000, 99999999), # random patientId | ||
| "patientRecord": { | ||
| "ssn": ssn, # random SSN | ||
| "billing": { | ||
| "type": "Amex", | ||
| "number": "340000000000009" | ||
| }, | ||
| "billAmount": randint(1000, 5000), # Optional: random bill amount | ||
| }, | ||
| } | ||
|
|
||
| result = encrypted_collection.insert_one(new_patient) | ||
| if result.acknowledged: | ||
|     print(f"Successfully inserted another patient with ssn: {ssn}") | ||
|
|
||
| # start-find-document | ||
| find_result = encrypted_collection.find_one({ | ||
| "patientRecord.ssn": ssn | ||
| }) | ||
|
|
||
| print(find_result) | ||
| # end-find-document | ||
|
|
||
| encrypted_client.close() | ||
| key_vault_client.close() | ||
|
|
||
| ''' | ||
| print("Listing all DEKs and their keyAltNames in key vault:") | ||
| for doc in key_vault_coll.find(): | ||
| print("DEK _id:", doc.get("_id"), "keyAltNames:", doc.get("keyAltNames")) | ||
| print("ssn_key_id:", ssn_key_id, type(ssn_key_id)) | ||
| print("billing_key_id:", billing_key_id, type(billing_key_id)) | ||
| ''' |
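The script above looks up each DEK with `find_one(...)["_id"]`, which raises an unhelpful `TypeError` when no key with that `keyAltNames` value exists, for example if `create_encrypted_collections.py` has not been run. A minimal sketch of a more explicit lookup follows; `get_key_id_by_alt_name` is a hypothetical helper, not part of this pull request.

```python
def get_key_id_by_alt_name(key_vault_coll, alt_name):
    """Return the _id of the DEK whose keyAltNames contains alt_name."""
    doc = key_vault_coll.find_one({"keyAltNames": alt_name})
    if doc is None:
        # Fail with a message that points at the likely cause.
        raise LookupError(
            f"No data encryption key with keyAltName {alt_name!r} was found; "
            "create the keys before inserting documents."
        )
    return doc["_id"]
```

With this helper, the two lookups would read `get_key_id_by_alt_name(key_vault_coll, "mongoMedicalRecords.ssn")` and likewise for the billing key.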