Release Tests for AWS Postgres passwordless authentication #381
Draft
raviharshicorp wants to merge 77 commits into main from pravi/IND-5776-release-test
+432
−26
- Added postgres_enable_iam_auth, postgres_use_password_auth, and db_iam_username variables
- Added database_passwordless_aws_use_iam local variable derived from postgres variables
- Pass database_passwordless_aws_use_iam to runtime_container_engine_config module
- Added postgres_passwordless_debug output for troubleshooting
- Clean implementation focused only on PostgreSQL IAM auth (no Redis configs)

- Add postgres_enable_iam_auth, postgres_use_password_auth, db_iam_username variables
- Enable iam_database_authentication_enabled when postgres_enable_iam_auth is true
- Make password generation and usage conditional based on postgres_use_password_auth
- Update outputs to handle conditional password authentication

- Make password output conditional based on postgres_use_password_auth
- Return empty string when using IAM authentication instead of password

…QL IAM auth branch
- Change module references from ref=main to ref=pravi/IND-5776-release-test
- This branch contains the required database_passwordless_aws_use_iam variables
- Fixes 'Unsupported argument' error for database_passwordless_aws_use_iam in runtime_container_engine_config module

- Update user_data_base64 reference to use correct module name 'tfe_init'
- Fixes 'Reference to undeclared module' error for tfe_init_fdo
- The module is correctly named 'tfe_init' not 'tfe_init_fdo'

- Always generate a password as AWS RDS requires it even for IAM auth
- Remove conditional count from random_string resource
- AWS RDS needs password during creation, but IAM auth takes precedence when enabled
- Fixes 'Invalid master password' error when postgres_use_password_auth=false

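The password handling these commits converge on could look roughly like the sketch below. It is an illustration under stated assumptions, not the module's actual code: the resource names, instance sizing, database name, admin username, and variable defaults are hypothetical. Only the two variable names and the `iam_database_authentication_enabled` argument come from the commit messages.

```hcl
variable "postgres_enable_iam_auth" {
  type    = bool
  default = false # assumed default
}

variable "postgres_use_password_auth" {
  type    = bool
  default = true # assumed default
}

# Always generated: RDS requires a master password at creation time,
# even when IAM authentication is enabled.
resource "random_string" "postgres_password" {
  length  = 32
  special = false
}

resource "aws_db_instance" "postgres" {
  # Hypothetical sizing and naming, for illustration only.
  engine              = "postgres"
  instance_class      = "db.t3.medium"
  allocated_storage   = 20
  db_name             = "tfe"
  username            = "tfeadmin"
  password            = random_string.postgres_password.result
  skip_final_snapshot = true

  iam_database_authentication_enabled = var.postgres_enable_iam_auth
}

# The password is only exposed when password authentication is in use;
# otherwise callers receive an empty string.
output "password" {
  value     = var.postgres_use_password_auth ? random_string.postgres_password.result : ""
  sensitive = true
}
```

Keeping the `random_string` resource unconditional avoids the 'Invalid master password' failure, while the output still hides the password from callers in IAM mode.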
- Add postgres_iam_username variable to service_accounts module
- Create inline IAM policy aws_iam_role_policy.postgres_iam_connect when postgres_iam_username is provided
- Pass db_iam_username from main module to service_accounts as postgres_iam_username
- This avoids for_each dependency cycles by using inline policy instead of separate resource
- Enables EC2 instances to authenticate to RDS PostgreSQL using IAM

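A minimal sketch of such an inline policy follows. The `instance_role_name` and `aws_region` variables, the policy name, and the wildcard in place of the database resource ID are assumptions made for illustration. The resource address `aws_iam_role_policy.postgres_iam_connect` and the `postgres_iam_username` variable are taken from the commit message.

```hcl
variable "postgres_iam_username" {
  type    = string
  default = ""
}

variable "instance_role_name" { # hypothetical: name of the EC2 instance role
  type = string
}

variable "aws_region" { # hypothetical: region of the RDS instance
  type = string
}

data "aws_caller_identity" "current" {}

# Inline policy, created only when an IAM database username is provided.
resource "aws_iam_role_policy" "postgres_iam_connect" {
  count = var.postgres_iam_username != "" ? 1 : 0

  name = "postgres-iam-connect" # hypothetical name
  role = var.instance_role_name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "rds-db:connect"
        # The wildcard stands in for the DB resource ID; a real policy
        # would usually scope this to one specific instance.
        Resource = "arn:aws:rds-db:${var.aws_region}:${data.aws_caller_identity.current.account_id}:dbuser:*/${var.postgres_iam_username}"
      }
    ]
  })
}
```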
- Update locals.tf to properly use database_passwordless_aws_use_iam variable
- Add database_passwordless_aws_region to runtime_container_engine_config
- This ensures DATABASE_AUTH_USE_AWS_IAM=true is set in TFE runtime environment
- Fixes 502 errors caused by incorrect database auth configuration

- Add null_resource to create IAM database user with rds_iam role
- Only runs when postgres_enable_iam_auth is true and db_iam_username is set
- Includes proper error handling and idempotency
- Waits for database to be ready before creating user
- This fixes PostgreSQL passwordless authentication by ensuring the IAM user exists in PostgreSQL
Fixes: PostgreSQL IAM authentication 502 errors

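The SQL behind this step might resemble the sketch below, rendered as a Terraform local. It is an assumption about the shape of the statements, not the script from this PR. The local name is hypothetical, and the default username is the one mentioned in a later commit. How the SQL is executed (provisioner, SSM document, or user_data) is left out.

```hcl
variable "db_iam_username" {
  type    = string
  default = "tfe_rds_iam_user"
}

locals {
  # Hypothetical local holding idempotent SQL for the IAM database user.
  create_iam_user_sql = <<-SQL
    DO $do$
    BEGIN
      -- Create the user only if it does not exist yet, so reruns are safe.
      IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${var.db_iam_username}') THEN
        CREATE USER "${var.db_iam_username}";
      END IF;
    END
    $do$;
    -- Membership in rds_iam is what lets the user log in with an IAM token.
    GRANT rds_iam TO "${var.db_iam_username}";
  SQL
}
```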
- Add detailed logging to debug execution
- Auto-install PostgreSQL client if missing
- Fix SQL syntax issues with dynamic EXECUTE statements
- Better error reporting and status messages
- This should help identify why the IAM user creation is not working

- Add debug null_resource to check variable values
- Enhanced logging to see execution environment
- Better error handling and network connectivity checks
- Graceful degradation if psql cannot be installed
- This will help identify why IAM user creation is not working
Debug: PostgreSQL IAM authentication troubleshooting

… user setup
- Remove problematic local-exec provisioner that was failing silently
- Create SSM document that can be executed on EC2 instances with proper credentials
- This approach allows the database user creation to run from TFE instances
- Provides foundation for automated PostgreSQL IAM authentication setup

- Add missing iam_user_setup_status and postgres_iam_setup_ssm_document outputs to enterprisedb module
- This ensures enterprise_db and standard_db objects have consistent structure
- Fixes 'Inconsistent conditional result types' error in locals.tf selection logic
- EDB module returns appropriate defaults since it doesn't support IAM authentication
Fixes: Type mismatch for object attribute 'endpoint' in conditional expression

- Add missing iam_user_setup_status and postgres_iam_setup_ssm_document to default_database
- This ensures all database objects have consistent 7-attribute structure
- Resolves type mismatch when modules fall back to default_database
Root cause: try(module.edb[0], local.default_database) was falling back to 5-attribute default_database when EDB module doesn't exist, causing type mismatch with 7-attribute standard_db

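The type mismatch can be illustrated with a small self-contained sketch. Both branches of a Terraform conditional must have the same object type, so the fallback object needs the same attributes as the real one. Only `endpoint`, `iam_user_setup_status`, and `postgres_iam_setup_ssm_document` are named in the commits. The other attribute names, the values, and the `use_standard_db` toggle are hypothetical.

```hcl
variable "use_standard_db" { # hypothetical toggle standing in for the module selection logic
  type    = bool
  default = true
}

locals {
  # Stand-in for the standard database module's output object (7 attributes).
  standard_db = {
    endpoint                        = "db.example.internal:5432"
    name                            = "tfe"
    username                        = "tfeadmin"
    password                        = "example"
    parameters                      = "sslmode=require"
    iam_user_setup_status           = "created"
    postgres_iam_setup_ssm_document = "example-document"
  }

  # The fallback must carry the same 7 attributes. Dropping the last two
  # is expected to reproduce the 'Inconsistent conditional result types' error.
  default_database = {
    endpoint                        = ""
    name                            = ""
    username                        = ""
    password                        = ""
    parameters                      = ""
    iam_user_setup_status           = ""
    postgres_iam_setup_ssm_document = ""
  }

  database = var.use_standard_db ? local.standard_db : local.default_database
}
```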
…username
- Updated locals.tf to override database username when IAM authentication is enabled
- Set password to null for IAM authentication (no password required)
- This ensures TFE uses the IAM database user (tfe_rds_iam_user) instead of admin user

- Use IAM username (db_iam_username) when IAM authentication is enabled
- Use admin username (database.username) for password-based authentication
- Apply to both settings and tfe_init modules
- Set password to null only when IAM authentication is enabled
- This follows the Redis IAM pattern: separate users for different auth methods

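The selection logic described here could be sketched as follows. The derivation of `database_passwordless_aws_use_iam` and the `admin_username` and `admin_password` variables are assumptions. The PR derives the flag from its postgres variables, but the exact expression is not shown on this page.

```hcl
variable "postgres_enable_iam_auth" {
  type    = bool
  default = false
}

variable "db_iam_username" {
  type    = string
  default = ""
}

variable "admin_username" { # hypothetical stand-in for database.username
  type = string
}

variable "admin_password" { # hypothetical stand-in for the admin password
  type      = string
  sensitive = true
}

locals {
  # Assumed derivation: IAM mode requires the flag and a username.
  database_passwordless_aws_use_iam = var.postgres_enable_iam_auth && var.db_iam_username != ""

  # Separate users for the two authentication methods.
  database_user = local.database_passwordless_aws_use_iam ? var.db_iam_username : var.admin_username

  # A null password means no password setting is passed on in IAM mode.
  database_password = local.database_passwordless_aws_use_iam ? null : var.admin_password
}
```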
- Added database connection parameters to tfe_init module call
- Includes database_host, database_name, admin credentials
- Added database_iam_username for IAM user creation
- Enables automated PostgreSQL IAM user setup in user_data scripts
This ensures the PostgreSQL IAM user is created automatically before TFE startup, eliminating manual steps for release tests.

When using PostgreSQL IAM authentication, TFE's configuration validation still requires a password value to be set, even though the actual authentication uses IAM tokens.
Changed from: database_password = null
To: database_password = ""
This should resolve the 'database password must be set' error while maintaining proper IAM authentication functionality.

Changed database_password back to null when IAM auth is enabled:
- TFE's configuration validation may require null instead of empty string
- Ensures proper IAM authentication flow without password requirement
- Aligns with expected IAM authentication patterns
This should resolve the persistent 'database password must be set' error.

…dation
TFE's configuration validation requires a password field even when IAM authentication is enabled. Use 'aws-iam-auth' placeholder password to pass config validation while maintaining IAM authentication functionality.

Remove placeholder password approach and set database_password to null for IAM authentication. The DATABASE_URL will be generated at runtime with actual AWS RDS IAM authentication tokens in the user_data script.

TFE configuration validation requires a non-empty password field. Set 'IAM_AUTH' placeholder for IAM authentication mode to pass config validation while pgmultiauth handles actual authentication.

Set database_password to null for IAM auth so TFE_DATABASE_PASSWORD is completely excluded from environment variables rather than being set to a placeholder value.

…terprise
- Update postgres_engine_version from '16' to '16.10' for consistency
- Update aurora_postgres_engine_version from '16.9' to '16.10' for consistency
- Ensures uniform PostgreSQL version across all TFE deployments
- Resolves IAM authentication compatibility issues with older PostgreSQL versions

- Set password to 'password' for easy database access during testing
- This allows manual creation of IAM user before automated setup works
- Maintains compatibility with existing IAM authentication flow

- Pass SSM document name to tfe_init module for automated execution
- Update database module output to return just document name
- Integrate IAM user creation into instance startup process
- This fixes 502 errors by ensuring IAM user exists before TFE starts

- Compress SSM document commands to reduce size
- Use fixed password directly in SSM parameters
- Remove verbose logging and comments from SSM script
- Maintain full functionality in compact form

- Add proper error handling and database readiness checks
- Include verbose logging for better debugging
- Add comprehensive user permissions for TFE operations
- Verify user creation after setup
- Handle both Ubuntu and RHEL package installation

- Remove duplicate network_public_subnets declaration
- Fix aws_iam_instance_profile to use correct local variable

- Set network_public_subnets to empty array
- Set aws_iam_instance_profile to empty string
- Disables EC2 instance creation for explorer database

- Add security group rule to allow outbound internet access
- Enables PostgreSQL client package downloads from Ubuntu repositories
- Fixes 'Network is unreachable' errors during remote-exec provisioning

- Rename egress rule to postgres_db_egress as requested
- Add security group rule allowing EC2 instance to connect to PostgreSQL
- Add test script for manual database connectivity testing
- Fixes issue where EC2 in public subnet couldn't reach RDS in private subnet

- Add small t3.micro test VM with PostgreSQL client pre-installed
- Include automated connectivity test script in user_data
- Add outputs for test VM IP and SSH connection command
- Enables isolation testing of database connectivity issues
- VM automatically tests network connectivity and PostgreSQL authentication

- Remove triggers from null_resource to prevent unnecessary recreations
- Update IAM user creation to use heredoc syntax for cleaner SQL execution
- Remove verification section to streamline the process
- Fix quoting issues with direct psql command approach

- Remove triggers from null_resource to prevent unnecessary recreations
- Update IAM user creation to use heredoc syntax for cleaner SQL execution
- Remove verification section to streamline the process
- Fix quoting issues with direct psql command approach
- Clean up test VM outputs that were previously added

- Remove duplicate aws_security_group_rule resource declaration
- Keep only the first postgres_db_egress rule defined earlier in the file
- Fixes terraform validation error about duplicate resource names

- Install hstore, uuid-ossp, and citext extensions during database setup
- These extensions are required for TFE to start successfully
- Add verification query to confirm extensions are installed
- Fixes TFE startup check failure for missing PostgreSQL extensions

- Add CREATEDB privilege to IAM user creation
- Grant CREATE ON DATABASE privilege for schema creation
- Fixes permission denied error when TFE tries to create terraform_enterprise schema
- IAM user now has sufficient privileges for TFE database operations

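The extension and privilege setup from these two commits might translate to SQL along the lines of the sketch below, again rendered as a Terraform local. The local name, the `database_name` variable, and its default are hypothetical. The extension names and the two privileges come from the commit messages. The actual script in this PR is not shown on this page and may differ.

```hcl
variable "db_iam_username" {
  type    = string
  default = "tfe_rds_iam_user"
}

variable "database_name" { # hypothetical: database that TFE connects to
  type    = string
  default = "tfe"
}

locals {
  # Hypothetical local: SQL to be run by an admin user against var.database_name.
  tfe_database_setup_sql = <<-SQL
    -- Extensions TFE checks for at startup.
    CREATE EXTENSION IF NOT EXISTS "hstore";
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
    CREATE EXTENSION IF NOT EXISTS "citext";

    -- Privileges that let the IAM user create the terraform_enterprise schema.
    ALTER USER "${var.db_iam_username}" CREATEDB;
    GRANT CREATE ON DATABASE "${var.database_name}" TO "${var.db_iam_username}";

    -- Verification query: lists whichever of the three extensions are installed.
    SELECT extname FROM pg_extension WHERE extname IN ('hstore', 'uuid-ossp', 'citext');
  SQL
}
```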
- Use current AWS region from data source instead of variable
- Ensures TFE gets the correct region for IAM database authentication
- Fixes configuration issue where region was empty for passwordless auth

- Grant CREATEROLE privilege for more comprehensive database operations
- Add USAGE grants on citext, hstore, and uuid types to allow registry API to use extensions
- Apply permissions to both new and existing users for consistency
- This fixes terraform-registry-api migration failures due to missing citext extension access

- Add cleanup of dirty migration records in schema_migrations table
- Reset version 4 migration state that was causing 'Dirty database version 4' errors
- Clean up before creating IAM user to ensure migrations can run successfully
- This fixes terraform-registry-api startup failures due to corrupted migration state

- Drop and recreate schema_migrations table on every database setup
- This ensures terraform-registry-api always starts with a clean migration state
- Prevents persistent 'Dirty database version 4' errors from previous failed attempts
- More aggressive but reliable approach to migration state cleanup

- Add debugging output to show migration table state before/after cleanup
- Clean up multiple possible migration table names (schema_migrations, schema_version, gorp_migrations, migrations)
- Clear advisory locks that might be preventing migrations
- Grant explicit permissions on schema_migrations table to IAM user
- Show table structure after creation for verification

- Force terminate active database connections that might hold locks
- Clear all advisory locks including golang-migrate specific locks (classid 1410924490)
- Drop and recreate terraform_registry schema completely
- Create fresh schema_migrations table with exact golang-migrate format
- Reset database configuration and reload settings
- Add comprehensive debugging output to track cleanup progress
- Grant full permissions to IAM user on all schemas and tables
This addresses persistent 'Dirty database version 4' errors by ensuring completely clean migration state for terraform-registry-api startup.

- Recreate extensions explicitly in public schema to ensure global access
- Grant USAGE on extension types to both IAM user and public
- Add wait period after terminating connections
- Enhanced debugging for migration state tracking
- Set search_path for IAM user to include terraform_registry schema
- More comprehensive advisory lock cleanup with proper iteration
- Verify extension availability with detailed output
This addresses the 'type citext does not exist' error and should resolve persistent migration state corruption issues.

- Drop and recreate public schema entirely to eliminate ALL existing state
- Force unlock ALL advisory locks including golang-migrate specific locks
- Terminate ALL database connections to ensure clean state
- Create completely fresh schema_migrations table with zero history
- Recreate extensions in clean public schema
- Grant comprehensive permissions to IAM user on all schemas
- Set proper search_path for schema access
- Force PostgreSQL configuration reload
This nuclear approach eliminates any possibility of dirty migration state persistence that was causing repeated 'Dirty database version 4' errors in terraform-registry-api. Clean slate approach.

REVERTED: All complex migration cleanup and schema reset approaches
IMPLEMENTED: Simple and correct solution:
- Create dedicated database _db for IAM user
- Install extensions in both main and IAM databases
- Grant proper permissions to IAM user on dedicated database
- Update TFE configuration to use IAM database when IAM auth enabled
- Added iam_database_name output from database module
- Updated both database_name and pg_dbname in main configuration
This eliminates migration conflicts by giving IAM user clean database with no existing migration state. Much simpler and more reliable than attempting to clean up shared database migration state.

- Added database_iam_name local that correctly resolves IAM database name
- Only applies IAM database for standard database module (not Aurora/mTLS/EDB)
- Updated both database_name and pg_dbname to use database_iam_name local
- Fixes issue where TFE was still connecting to main database instead of IAM database
This ensures terraform-registry-api connects to the clean separate database _db instead of the main database with existing migrations.

The external data source was trying to SSH with wrong key path causing:
- Permission denied (publickey) errors
- Identity file /tmp/postgres_key.pem not accessible
FIXED: Use static output instead since database name is predictable
- Database name is simply _db
- No need for SSH or file reading from EC2 instance
- Removed temporary file creation in script
- Simplified architecture and removed failure point
This resolves the terraform apply failure in release tests.

- Create IAM user first before creating database
- Set IAM user as database owner instead of DB_USERNAME
- Install required extensions in IAM database
- Separate database creation into logical steps
- Ensure proper permissions and extension access

- Create IAM user first before creating database
- Set IAM user as database owner instead of DB_USERNAME
- Install required extensions in IAM database
- Separate database creation into logical steps
- Ensure proper permissions and extension access

- Create IAM user and database in single transaction
- IAM user owns their database from creation
- No migration conflicts - clean slate
- Install extensions directly in IAM database
- Simple, atomic operation

- Connect to 'postgres' system database instead of main TFE database
- Avoid migration conflicts during IAM setup
- Create IAM user and database in clean environment
- Remove extension installation in dirty main database

- Remove debug echo statements exposing database connection details
- Fix terraform formatting issues
- Ensure secure logging practices
- Code follows best practices and naming conventions

- Change postgres setup instance from m5.xlarge to t3.micro
- Reduce volume size from 100GB to 20GB
- Instance is only used for IAM user setup, doesn't need high compute

- Remove local.database_iam_instance_profile which was declared but never used
- Fixes terraform_unused_declarations warning from tflint

Background
This release test exercises passwordless authentication to AWS PostgreSQL using IAM-based tokens.
JIRA: https://hashicorp.atlassian.net/browse/IND-4009
Related PRs
terraform-enterprise: https://github.com/hashicorp/terraform-enterprise/pull/3397
ptfedev-infra: https://github.com/hashicorp/ptfedev-infra/pull/887
terraform-random-tfe-utility: hashicorp/terraform-random-tfe-utility#192
How Has This Been Tested
The release test ran on GitHub: https://github.com/hashicorp/terraform-enterprise/actions/runs/19146818577/job/54726521821
The TFE logs confirmed that the database connection succeeded.
Test Configuration
- Set TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE to true
- Set TFE_DATABASE_PASSWORDLESS_AWS_REGION to aws_region
CI/CD: https://github.com/hashicorp/terraform-enterprise/actions/runs/19464088212/job/55694964359