Skip to content

fix: replace f-string SQL with psycopg2.sql for safe identifier handling#32

Open
wicky-zipstack wants to merge 2 commits intomainfrom
fix/sql-injection-sample-project
Open

fix: replace f-string SQL with psycopg2.sql for safe identifier handling#32
wicky-zipstack wants to merge 2 commits intomainfrom
fix/sql-injection-sample-project

Conversation

@wicky-zipstack
Copy link
Copy Markdown
Contributor

What

  • Replace all f-string SQL interpolation in sample_project.py with psycopg2.sql.SQL, sql.Identifier, and sql.Literal

Why

  • VAPT finding: SQL injection vulnerability in sample project creation code (OR-1226)
  • f-string interpolation used directly in SQL queries for database names, user names, schema names, and passwords
  • Even though inputs are sanitized with re.sub, proper parameterization is the industry standard

How

  • CREATE/DROP DATABASE: sql.Identifier for database names
  • CREATE/DROP USER: sql.Identifier for user names, sql.Literal for passwords
  • GRANT/ALTER SCHEMA: sql.Identifier for schema and user names
  • WHERE datname = ...: sql.Literal for value parameters
  • Schema DDL: sql.Identifier for schema names in DROP/CREATE
  • All changes in the base class SampleProject — covers all 4 sample projects (Jaffle Shop Starter/Final, DVD Rental Starter/Final)

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

  • Low risk. psycopg2.sql generates identical SQL output but with proper quoting. The only difference is identifiers are now double-quoted (e.g., "my_database" instead of my_database), which is valid PostgreSQL and handles reserved words correctly. Sample project creation should work identically.

Database Migrations

  • None

Env Config

  • None

Relevant Docs

Related Issues or PRs

  • Fixes OR-1226: Fix SQL injection in sample_project.py
  • Parent Epic: OR-1219 (Public Repo Launch Checklist)

Dependencies Versions

  • None (psycopg2 already a dependency)

Notes on Testing

  • Create all sample projects (Jaffle Shop Starter, Jaffle Shop Final, DVD Rental Starter, DVD Rental Final)
  • Verify seed data loads correctly
  • Add transformations and run them
  • Verify end-to-end flow works

Screenshots

N/A

Checklist

Replace all raw f-string SQL interpolation in sample_project.py with
psycopg2.sql.SQL, sql.Identifier, and sql.Literal for proper escaping:

- CREATE/DROP DATABASE, CREATE/DROP USER — sql.Identifier for names
- GRANT, ALTER SCHEMA — sql.Identifier for schema and user names
- WHERE datname = ... — sql.Literal for value parameters
- CREATE USER ... PASSWORD — sql.Literal for password value
- Schema DDL (DROP/CREATE SCHEMA) — sql.Identifier for schema names
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR addresses a VAPT-identified SQL injection vulnerability (OR-1226) in sample_project.py by replacing all f-string SQL interpolation with psycopg2.sql.SQL, sql.Identifier, and sql.Literal. The fix is thorough and correctly applied across all SQL-constructing methods in the SampleProject base class (clear_existing_db, create_new_database, grant_permissions, _grant_schema_permissions_on_new_db, and create_schemas).

Key changes:

  • All SQL identifiers (database names, user names, schema names) now use sql.Identifier for proper double-quoting
  • All SQL value parameters (passwords, datname literals) now use sql.Literal
  • Dead code user_check_query (which was defined but never used, and also had a missing-quotes bug) is correctly removed
  • The execute_sql_queries type hint is updated to \"list[sql.Composable]\" (string annotation pending cleanup per prior thread)
  • No functional behavior change — sql.Identifier produces double-quoted identifiers which is valid PostgreSQL and consistent with the sanitized lowercase input names

Confidence Score: 5/5

This PR is safe to merge — it is a targeted security fix with no functional behavior changes and no new issues introduced.

All 15 SQL construction sites are correctly migrated to psycopg2.sql composables. Identifiers use sql.Identifier (double-quoting), values use sql.Literal (proper escaping). Dead code was cleaned up. The only outstanding item (type annotation string-quoting) was acknowledged in a prior thread and deferred to a follow-up — it has no runtime impact. No P0/P1 findings remain.

No files require special attention.

Vulnerabilities

This PR directly remediates a SQL injection vulnerability. All 15 SQL construction sites in sample_project.py now use psycopg2.sql composables — identifiers are wrapped with sql.Identifier (double-quoted) and literal values with sql.Literal (properly escaped). No remaining f-string SQL interpolation was found. No new security concerns were introduced.

Important Files Changed

Filename Overview
backend/backend/application/sample_project/sample_project.py All f-string SQL interpolation replaced with psycopg2.sql composables; dead code (unused user_check_query) removed; all 15 SQL construction sites are correctly migrated with appropriate use of sql.Identifier for identifiers and sql.Literal for values.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant SampleProject
    participant psycopg2_sql as psycopg2.sql
    participant AdminDB as Admin PostgreSQL DB
    participant NewDB as New Sample DB

    Caller->>SampleProject: load_sample_project()
    SampleProject->>SampleProject: create_new_database()
    SampleProject->>psycopg2_sql: sql.SQL("CREATE DATABASE {} TEMPLATE {}").format(sql.Identifier(db), sql.Identifier(template))
    psycopg2_sql-->>SampleProject: Composable (safe, double-quoted identifiers)
    SampleProject->>AdminDB: execute_sql_queries([create_db, create_user, grant])
    AdminDB-->>SampleProject: OK

    SampleProject->>NewDB: _grant_schema_permissions_on_new_db()
    Note over SampleProject,NewDB: Direct connection to new DB
    SampleProject->>psycopg2_sql: sql.SQL("GRANT USAGE ON SCHEMA {} TO {}").format(sql.Identifier(schema), sql.Identifier(user))
    psycopg2_sql-->>SampleProject: Composable
    SampleProject->>NewDB: cursor.execute(composable)
    NewDB-->>SampleProject: OK

    SampleProject->>SampleProject: create_project_connection()
    Note over SampleProject: Updates sample_connection to new DB creds, closes admin connection

    SampleProject->>SampleProject: create_schemas() [non-clone path only]
    SampleProject->>psycopg2_sql: sql.SQL("DROP SCHEMA IF EXISTS {} CASCADE").format(sql.Identifier(schema))
    psycopg2_sql-->>SampleProject: Composable
    SampleProject->>NewDB: execute_sql_queries([drop/create schemas])
    NewDB-->>SampleProject: OK
Loading

Reviews (2): Last reviewed commit: "fix: use proper type annotation for sql...." | Re-trigger Greptile

Address Greptile P2 — type hint was weakened from list[str] to bare
list. Now correctly annotated as list[sql.Composable].
@wicky-zipstack wicky-zipstack requested review from a team as code owners April 8, 2026 11:00
Copy link
Copy Markdown
Contributor

@tahierhussain tahierhussain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants