
Add postgres support #1254

Open
dadmobile wants to merge 28 commits into main from add/postgres_support

Conversation


@dadmobile dadmobile commented Jan 23, 2026

Summary by CodeRabbit

  • New Features

    • Added PostgreSQL database support alongside existing SQLite option
    • Implemented environment-based database configuration for flexible deployment
  • Chores

    • Added asyncpg dependency for PostgreSQL async operations
    • Created database migration for UUID column type optimization in PostgreSQL
    • Refactored database initialization logic to support multiple database backends

sentry bot commented Jan 23, 2026


coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Walkthrough

The pull request extends the system to support PostgreSQL with async driver alongside SQLite, adding database URL normalization, PostgreSQL-specific migrations, a migration utility for table existence checks, environment-based database configuration, dialect-aware ORM operations, and UUID type conversions across user lookups.

Changes

  • Database Configuration — api/transformerlab/db/constants.py, api/transformerlab/db/migration_utils.py
    Introduces PostgreSQL credentials and DATABASE_TYPE environment-driven configuration. Adds a table_exists() utility function supporting SQLite, PostgreSQL, and other databases for Alembic migrations.
  • Session Initialization — api/transformerlab/db/session.py
    Refactors database initialization with DATABASE_TYPE conditionals: SQLite-specific setup (WAL, PRAGMA, migration logic) runs only when DATABASE_TYPE == "sqlite"; a PostgreSQL placeholder is added with simplified initialization.
  • Alembic Environment & Normalization — api/alembic/env.py
    Adds conditional URL normalization: converts sqlite+aiosqlite:// to sqlite:// and postgresql+asyncpg:// to postgresql:// for the Alembic engine, preserving other URLs unchanged.
  • Alembic Migrations, UUID & Refactoring — api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py, api/alembic/versions/c175b784119c_create_oauth_account_table.py, api/alembic/versions/f278bbaa6f67_create_api_keys_table.py, api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py
    A new migration adds PostgreSQL-native UUID column conversion. Existing migrations are refactored to use the centralized table_exists() utility and adopt boolean default values (true/false) for user table fields.
  • Alembic Team Config Migration — api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py
    Replaces ad-hoc SQL queries and defensive checks with SQLAlchemy constructs (sa.update, select-join operations), guarded try/except blocks for index/constraint operations, and SQLite-specific table recreation logic in downgrade.
  • User ID Type Conversions — api/transformerlab/routers/auth/api_key_auth.py, api/transformerlab/routers/quota.py, api/transformerlab/routers/teams.py, api/transformerlab/shared/api_key_auth.py
    Updates user lookups to cast string IDs to uuid.UUID() objects before querying the User.id field, aligning with the UUID storage type.
  • Database Models & Dependencies — api/transformerlab/shared/models/user_model.py, api/pyproject.toml
    Adds dialect-aware upsert logic for OAuth account insertion with PostgreSQL-specific on_conflict_do_update handling. Adds the asyncpg==0.30.0 dependency to the nvidia, rocm, and cpu optional dependency groups.
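The URL normalization described for api/alembic/env.py can be sketched as a small helper. This is an illustrative sketch only — the function name is invented, and the actual env.py may inline this logic rather than factor it out:

```python
def normalize_sync_url(database_url: str) -> str:
    """Map an async SQLAlchemy URL to its synchronous counterpart for Alembic.

    Hypothetical helper sketching the normalization the PR describes:
    async driver prefixes are stripped so Alembic can use a sync engine,
    and unexpected URL formats pass through unchanged.
    """
    if database_url.startswith("sqlite+aiosqlite://"):
        return database_url.replace("sqlite+aiosqlite://", "sqlite://", 1)
    if database_url.startswith("postgresql+asyncpg://"):
        return database_url.replace("postgresql+asyncpg://", "postgresql://", 1)
    # Fallback: use as-is if the format is unexpected
    return database_url
```

Note that the bare postgresql:// form makes SQLAlchemy load its default synchronous PostgreSQL driver, which is why the review below flags the missing psycopg2 dependency.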

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add postgres support' directly and clearly summarizes the main objective of this pull request, which is to introduce PostgreSQL database support alongside the existing SQLite implementation.
Docstring Coverage ✅ Passed Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch add/postgres_support

Comment @coderabbitai help to get the list of available commands and usage tips.

@dadmobile dadmobile removed the blocked label Mar 1, 2026
@dadmobile dadmobile marked this pull request as ready for review March 1, 2026 19:59

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py (1)

164-183: ⚠️ Potential issue | 🟠 Major

Use dialect-aware column introspection instead of unconditional PRAGMA.

The table_exists() function (line 164) is dialect-aware and can return True on SQLite, PostgreSQL, and other databases. However, the else branch (line 181) unconditionally executes PRAGMA table_info(user), which is SQLite-specific and will fail on PostgreSQL or any other database where the table exists.

Suggested fix
     else:
         # Table exists - check the schema
-        result = connection.execute(sa.text("PRAGMA table_info(user)"))
-        existing_columns = [row[1] for row in result.fetchall()]
+        if connection.dialect.name == "sqlite":
+            result = connection.execute(sa.text("PRAGMA table_info(user)"))
+            existing_columns = [row[1] for row in result.fetchall()]
+        else:
+            inspector = sa.inspect(connection)
+            existing_columns = [col["name"] for col in inspector.get_columns("user")]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py`
around lines 164 - 183, The else branch after table_exists(connection, "user")
uses a SQLite-specific PRAGMA ("PRAGMA table_info(user)") which will fail on
non-SQLite dialects; replace the unconditional PRAGMA introspection with a
dialect-aware column lookup using SQLAlchemy's inspector (e.g.
sqlalchemy.inspect or sa.inspect against the connection/bind) to call
get_columns("user") and build existing_columns from the returned column
metadata, or branch on connection.dialect.name and run the appropriate query per
dialect; update the existing_columns assignment (currently produced from
result.fetchall()) to use the inspector's column names so schema checks work
across PostgreSQL, SQLite, and other DBs.
🧹 Nitpick comments (4)
api/transformerlab/db/migration_utils.py (1)

10-10: Add type hint for connection parameter.

The connection parameter is missing a type hint. Per coding guidelines, all function arguments must have type hints.

Suggested fix
+from sqlalchemy.engine import Connection
+
-def table_exists(connection, table_name: str) -> bool:
+def table_exists(connection: Connection, table_name: str) -> bool:

As per coding guidelines: "Mandatory type hints for all Python function arguments and return types".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/db/migration_utils.py` at line 10, Add a type hint for the
connection parameter of table_exists to satisfy the coding guideline requiring
type hints for all arguments: update the signature of table_exists(connection,
table_name: str) -> bool to include an appropriate type for connection (e.g.,
typing.Any or the specific DB connection type used in the project such as
sqlalchemy.engine.Connection or sqlite3.Connection), and add the necessary
import (typing.Any or the concrete type) at the top of the module; keep the
return type bool as-is.
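A fully annotated version of the suggested signature might look like the sketch below. To keep it self-contained it uses raw SQL per dialect and takes the dialect name as a parameter — an assumption for illustration; the real utility presumably reads it from the SQLAlchemy connection (connection.dialect.name) and types the parameter as sqlalchemy.engine.Connection:

```python
from typing import Any
import sqlite3


def table_exists(connection: Any, table_name: str, dialect: str) -> bool:
    """Dialect-aware table-existence check (illustrative sketch).

    `connection` is typed Any here so the example runs against sqlite3;
    in the project it would be a SQLAlchemy Connection.
    """
    if dialect == "sqlite":
        row = connection.execute(
            "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
            (table_name,),
        ).fetchone()
        return row is not None
    if dialect == "postgresql":
        # On PostgreSQL the catalog lives in information_schema instead.
        row = connection.execute(
            "SELECT table_name FROM information_schema.tables WHERE table_name=%s",
            (table_name,),
        ).fetchone()
        return row is not None
    raise ValueError(f"unsupported dialect: {dialect}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT)")
```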
api/transformerlab/shared/models/user_model.py (1)

74-97: Consider extracting common upsert logic to reduce duplication.

The two branches have identical logic except for the import. This could be slightly simplified, though the current approach is functional.

Optional refactor
         # Perform an upsert: insert if not exists, update if conflict on unique constraint.
         # Must use dialect-specific insert for ON CONFLICT DO UPDATE support.
         if DATABASE_TYPE == "postgresql":
             from sqlalchemy.dialects.postgresql import insert as dialect_insert
         else:
             from sqlalchemy.dialects.sqlite import insert as dialect_insert

-        if DATABASE_TYPE == "postgresql":
-            from sqlalchemy.dialects.postgresql import insert as pg_insert
-
-            stmt = (
-                pg_insert(OAuthAccount)
-                .values(user_id=user.id, **create_dict)
-                .on_conflict_do_update(
-                    index_elements=["oauth_name", "account_id"],
-                    set_={k: v for k, v in create_dict.items() if k not in ["id"]},
-                )
-            )
-        else:
-            from sqlalchemy.dialects.sqlite import insert as sqlite_insert
-
-            stmt = (
-                sqlite_insert(OAuthAccount)
-                .values(user_id=user.id, **create_dict)
-                .on_conflict_do_update(
-                    index_elements=["oauth_name", "account_id"],
-                    set_={k: v for k, v in create_dict.items() if k not in ["id"]},
-                )
-            )
+        stmt = (
+            dialect_insert(OAuthAccount)
+            .values(user_id=user.id, **create_dict)
+            .on_conflict_do_update(
+                index_elements=["oauth_name", "account_id"],
+                set_={k: v for k, v in create_dict.items() if k not in ["id"]},
+            )
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/shared/models/user_model.py` around lines 74 - 97, The
upsert logic for OAuthAccount is duplicated for PostgreSQL and SQLite; instead,
determine the dialect-specific insert class once (use DATABASE_TYPE to pick
pg_insert or sqlite_insert), assign it to a common name (e.g., dialect_insert)
and then build the stmt once using that symbol: call
dialect_insert(OAuthAccount).values(user_id=user.id,
**create_dict).on_conflict_do_update(index_elements=["oauth_name","account_id"],
set_={k: v for k, v in create_dict.items() if k != "id"}), removing the
duplicated block that separately constructs stmt for pg_insert and
sqlite_insert.
api/transformerlab/shared/api_key_auth.py (1)

37-103: Consolidate duplicate api_key_auth.py files to eliminate maintenance burden.

Two nearly identical copies of this file exist:

  • api/transformerlab/shared/api_key_auth.py
  • api/transformerlab/routers/auth/api_key_auth.py

The files share all 4 functions with 95%+ identical code. The routers version contains unused code (enumerate() with unused idx variable and debug print() statements), and maintaining separate copies risks divergence.

Move the shared version to the common location and update imports in routers/auth to use it from there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/shared/api_key_auth.py` around lines 37 - 103, Two nearly
identical modules exist; consolidate shared logic into the common implementation
(the one with validate_api_key_and_get_user) and update the router copy to
import from that shared module: replace local definitions of
validate_api_key_and_get_user (and any related helpers like get_key_prefix,
verify_api_key, is_key_expired) in the routers/auth file with imports from the
shared module, remove the duplicate file, and clean up the router copy by
deleting unused enumerate/idx and debug print() statements so only the shared
function is maintained and referenced.
api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py (1)

31-56: Avoid broad except Exception: pass around DDL steps.

Silently ignoring all failures can leave an inconsistent schema and hide real migration breakages. Prefer explicit checks or narrow exception handling.

Also applies to: 104-157

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py` around
lines 31 - 56, The migration currently swallows all errors with broad "except
Exception: pass" around DDL calls (op.drop_index, op.create_index,
op.create_unique_constraint), which can hide real failures; replace these with
explicit existence checks and/or narrow exception handling: obtain the current
connection via op.get_bind(), use sqlalchemy.inspect(connection) (or
connection.dialect/get_inspector) to check for the presence of the
index/constraint (check index names via inspector.get_indexes("config") and
constraints via inspector.get_unique_constraints("config")) and only call
op.create_index/op.drop_index/op.create_unique_constraint when the object is
actually absent/present; if you must catch exceptions, catch specific SQLAlchemy
exceptions (e.g., sqlalchemy.exc.OperationalError or
sqlalchemy.exc.ProgrammingError) and log/raise unexpected errors instead of
silently passing, referencing the functions op.drop_index, op.create_index, and
op.create_unique_constraint to locate the changes.
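The pattern suggested here — check whether the object exists before issuing DDL, and keep any remaining except clause narrow — can be illustrated with plain sqlite3. This is only a sketch of the shape of the fix; the actual migration would use op.get_bind() and sa.inspect() rather than querying sqlite_master directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (id INTEGER, key TEXT)")


def index_exists(conn: sqlite3.Connection, name: str) -> bool:
    # sqlite_master lists indexes; Alembic code would use sa.inspect() instead.
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='index' AND name=?", (name,)
    ).fetchone()
    return row is not None


# Create the index only when absent, instead of `except Exception: pass`.
if not index_exists(conn, "ix_config_key"):
    conn.execute("CREATE INDEX ix_config_key ON config (key)")

# If an exception must be caught, keep it narrow so real failures surface.
try:
    conn.execute("CREATE INDEX ix_config_key ON config (key)")  # already exists
except sqlite3.OperationalError:
    pass  # the one error we expect here; anything else would propagate
```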
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/alembic/env.py`:
- Around line 39-45: Add explicit type annotations to the include_object
function (add argument and return type hints for the signature of include_object
used in alembic env.py) so it satisfies the mandatory type-hint requirement, and
update the DATABASE_URL normalization that replaces "postgresql+asyncpg://" with
"postgresql://" (used to create sync_url) by ensuring the project declares
psycopg2 (or psycopg2-binary) as a dependency; add psycopg2/psycopg2-binary to
the same optional dependency groups that already include asyncpg in
api/pyproject.toml (nvidia/rocm/cpu) so SQLAlchemy can load the synchronous
PostgreSQL driver at migration time.

In `@api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py`:
- Around line 39-43: Before running the ALTER TABLE ... TYPE uuid USING ...
casts, add a validation guard that queries each target column (user.id,
oauth_account.id, oauth_account.user_id) for values that are not valid UUID
strings and aborts the migration with a clear error if any are found; implement
this by using op.get_bind() to execute a SELECT ... WHERE NOT (column ~
'^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$')
for each column, check results, and raise a MigrationError (or call raise
SystemExit) with details if rows exist, then proceed to the existing
op.execute(sa.text(...)) ALTER TABLE statements only when validations pass
(follow the same pattern used in c78d76a6d65c_add_team_id_to_config_table.py).
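A Python-side version of this guard — validating fetched values with uuid.UUID rather than a SQL regex — could look like the sketch below. The function name and the list parameter are illustrative; the real migration would fetch the column values through op.get_bind() before the ALTER TABLE statements:

```python
import uuid


def assert_all_uuid(values: list, column: str) -> None:
    """Raise before ALTER TABLE ... TYPE uuid if any value is not a valid UUID.

    Sketch of the suggested pre-migration guard; `values` stands in for
    rows fetched from the target column via op.get_bind().
    """
    bad = []
    for value in values:
        try:
            uuid.UUID(str(value))
        except (ValueError, TypeError):
            bad.append(value)
    if bad:
        raise RuntimeError(
            f"{column}: {len(bad)} non-UUID value(s), e.g. {bad[0]!r}; aborting migration"
        )


# Valid CHAR(36) UUIDs pass silently; the cast to native uuid is then safe.
assert_all_uuid(["550e8400-e29b-41d4-a716-446655440000"], "user.id")
```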

In `@api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py`:
- Around line 53-55: The current unique constraint created by
op.create_unique_constraint("uq_config_user_team_key", "config", ["user_id",
"team_id", "key"]) is ineffective when user_id or team_id are NULL; instead
create explicit unique indexes to enforce the intended uniqueness across
NULLable columns: add a partial unique index for the fully-populated case (e.g.
pg index on (user_id, team_id, key) WHERE user_id IS NOT NULL AND team_id IS NOT
NULL), plus two additional partial unique indexes to cover the single-NULL cases
(e.g. unique index on (team_id, key) WHERE user_id IS NULL and unique index on
(user_id, key) WHERE team_id IS NULL); replace the op.create_unique_constraint
call with op.create_index/op.execute calls that create these three indexes (give
them clear names instead of "uq_config_user_team_key") so duplicates cannot be
inserted when user_id or team_id are NULL.
- Around line 30-33: The migration currently only drops the index
"ix_config_key" but not the original sa.UniqueConstraint("key"), so you must
also drop the UNIQUE constraint on table "config" before recreating a non-unique
index; update the migration to call op.drop_constraint(..., type_="unique",
table_name="config") using the actual constraint name (or inspect and resolve it
if the name is generated), then recreate a non-unique index
(op.create_index("ix_config_key", "config", ["key"], unique=False)) so the key
column is no longer globally unique and per-team duplicates are allowed.
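The NULL-handling problem behind the first suggestion can be demonstrated with partial unique indexes, which SQLite also supports, so the sketch below is runnable without PostgreSQL. The index names and table shape are illustrative; PostgreSQL syntax for the CREATE UNIQUE INDEX ... WHERE form is analogous:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (user_id TEXT, team_id TEXT, key TEXT)")

# A plain UNIQUE(user_id, team_id, key) treats NULLs as distinct, so rows
# with user_id = NULL and the same key would all insert. Partial unique
# indexes cover the NULL cases explicitly:
conn.execute(
    "CREATE UNIQUE INDEX uq_config_full ON config (user_id, team_id, key) "
    "WHERE user_id IS NOT NULL AND team_id IS NOT NULL"
)
conn.execute(
    "CREATE UNIQUE INDEX uq_config_team_key ON config (team_id, key) "
    "WHERE user_id IS NULL"
)
conn.execute(
    "CREATE UNIQUE INDEX uq_config_user_key ON config (user_id, key) "
    "WHERE team_id IS NULL"
)

conn.execute("INSERT INTO config VALUES (NULL, 't1', 'k')")
try:
    conn.execute("INSERT INTO config VALUES (NULL, 't1', 'k')")  # duplicate
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```

With only the plain unique constraint, the second insert would have succeeded; the partial index rejects it.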

In `@api/transformerlab/db/constants.py`:
- Around line 16-20: DATABASE_URL construction currently interpolates
DATABASE_PASSWORD raw, which breaks when the password contains special
characters; update the code that builds DATABASE_URL to URL-encode the
credentials (at least DATABASE_PASSWORD, and preferably DATABASE_USER) using
urllib.parse.quote_plus (or equivalent), then use the encoded values when
formatting DATABASE_URL so special characters like `@/`:#% are escaped and the
connection string remains valid.
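The encoding step can be sketched as follows; the function name and parameter names are illustrative, not the actual constants.py API:

```python
from urllib.parse import quote_plus


def build_database_url(user: str, password: str, host: str, port: int, name: str) -> str:
    """Build a PostgreSQL URL with URL-encoded credentials (sketch).

    quote_plus escapes characters like @ : / # % that would otherwise
    be parsed as URL structure.
    """
    return (
        f"postgresql+asyncpg://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )


url = build_database_url("app", "p@ss:w/rd#1", "localhost", 5432, "tlab")
```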

In `@api/transformerlab/db/session.py`:
- Around line 141-142: The module reads the module-level variable db in close()
but only assigns it inside init(), risking NameError on early shutdown; declare
and initialize db = None at module scope and ensure init() assigns to the module
variable (use global db if needed) so close() can safely check "if DATABASE_TYPE
== 'sqlite' and db is not None:" without raising; update any references in
init() and close() to the same db symbol.

In `@api/transformerlab/routers/auth/api_key_auth.py`:
- Line 88: The code directly calls uuid.UUID(api_key_obj.user_id) which will
raise ValueError for invalid UUID strings and cause a 500; wrap the conversion
in a try/except (catch ValueError) and treat it as an unauthorized/malformed API
key (return a 401/raise the same auth error used elsewhere), then use the
successfully parsed UUID variable in the select(User).where(...) expression
(replace the inline uuid.UUID(...) call with the parsed UUID variable) so
invalid user_id values are handled gracefully.
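A minimal sketch of that guard, with a hypothetical helper in place of the FastAPI handler: it returns None on malformed IDs so the caller can raise its usual 401 auth error instead of crashing with a 500:

```python
import uuid
from typing import Optional


def parse_user_id(raw: str) -> Optional[uuid.UUID]:
    """Parse api_key_obj.user_id, or return None so the caller responds 401.

    Illustrative only; the real handler would raise the project's standard
    unauthorized exception rather than return None.
    """
    try:
        return uuid.UUID(raw)
    except (ValueError, TypeError):
        return None


user_id = parse_user_id("550e8400-e29b-41d4-a716-446655440000")
bad_id = parse_user_id("not-a-uuid")
```

The parsed user_id variable would then be used in the select(User).where(...) expression in place of the inline uuid.UUID(...) call.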


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 819729d and 9e070d5.

📒 Files selected for processing (15)
  • api/alembic/env.py
  • api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py
  • api/alembic/versions/c175b784119c_create_oauth_account_table.py
  • api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py
  • api/alembic/versions/f278bbaa6f67_create_api_keys_table.py
  • api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py
  • api/pyproject.toml
  • api/transformerlab/db/constants.py
  • api/transformerlab/db/migration_utils.py
  • api/transformerlab/db/session.py
  • api/transformerlab/routers/auth/api_key_auth.py
  • api/transformerlab/routers/quota.py
  • api/transformerlab/routers/teams.py
  • api/transformerlab/shared/api_key_auth.py
  • api/transformerlab/shared/models/user_model.py

Comment on lines +39 to +45
if DATABASE_URL.startswith("sqlite+aiosqlite:///"):
    sync_url = DATABASE_URL.replace("sqlite+aiosqlite:///", "sqlite:///")
elif DATABASE_URL.startswith("postgresql+asyncpg://"):
    sync_url = DATABASE_URL.replace("postgresql+asyncpg://", "postgresql://")
else:
    # Fallback: use as-is if format is unexpected
    sync_url = DATABASE_URL

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and read the relevant lines
if [ -f api/alembic/env.py ]; then
  echo "=== File content around lines 39-45 ==="
  cat -n api/alembic/env.py | sed -n '30,55p'
else
  echo "File not found at api/alembic/env.py"
  # Try to find it
  find . -name env.py -path "*/alembic/*" 2>/dev/null | head -5
fi

Repository: transformerlab/transformerlab-app

Length of output: 1281


🏁 Script executed:

# Check for psycopg2 dependency
echo "=== Checking project dependencies ==="
if [ -f api/pyproject.toml ]; then
  echo "--- pyproject.toml ---"
  cat api/pyproject.toml
elif [ -f pyproject.toml ]; then
  echo "--- pyproject.toml (root) ---"
  cat pyproject.toml
fi

if [ -f api/requirements.txt ]; then
  echo "--- requirements.txt ---"
  cat api/requirements.txt
fi

if [ -f api/requirements-dev.txt ]; then
  echo "--- requirements-dev.txt ---"
  cat api/requirements-dev.txt
fi

Repository: transformerlab/transformerlab-app

Length of output: 2904


🏁 Script executed:

# Check the overall structure of the api directory
echo "=== API directory structure ==="
ls -la api/ 2>/dev/null || echo "api/ directory not accessible"

Repository: transformerlab/transformerlab-app

Length of output: 1027


🏁 Script executed:

# Verify what happens when postgresql:// URL is used without explicit driver
echo "=== Checking SQLAlchemy dialect behavior ==="
# Search for any psycopg2 references in the codebase
rg -i "psycopg" api/ 2>/dev/null || echo "No psycopg references found in api/"

# Check if there's any fallback or conditional logic for database drivers
rg "postgresql" api/ --type py | head -20

Repository: transformerlab/transformerlab-app

Length of output: 1160


🏁 Script executed:

# Double-check all dependencies for any psycopg variants
echo "=== Searching all dependency files for psycopg variants ==="
find api/ -type f \( -name "*.txt" -o -name "*.toml" -o -name "*.cfg" \) -exec grep -l "psycopg\|asyncpg" {} \;

Repository: transformerlab/transformerlab-app

Length of output: 155


🏁 Script executed:

# Check what psycopg reference exists in pyproject.toml
echo "=== Checking for psycopg in pyproject.toml ==="
grep -n "psycopg" api/pyproject.toml

echo -e "\n=== Checking context around psycopg reference ==="
grep -B2 -A2 "psycopg" api/pyproject.toml || echo "No psycopg found in dependencies"

Repository: transformerlab/transformerlab-app

Length of output: 207


🏁 Script executed:

# Verify Ruff compliance for the env.py file
echo "=== Checking Ruff compliance ==="
# Count actual line lengths in the relevant section
awk 'NR>=39 && NR<=45 {print NR": " length($0) " chars: " $0}' api/alembic/env.py

echo -e "\n=== Checking for type hints in env.py ==="
grep -n "def " api/alembic/env.py | head -10

Repository: transformerlab/transformerlab-app

Length of output: 740


🏁 Script executed:

# Check if there's documentation or setup instructions about psycopg2
echo "=== Checking for installation/setup docs ==="
find api/ -type f \( -name "*.md" -o -name "README*" -o -name "setup*" -o -name "install.sh" \) | head -5

# Check install.sh for psycopg2
grep -n "psycopg\|pip\|requirements" api/install.sh | head -20

Repository: transformerlab/transformerlab-app

Length of output: 938


🏁 Script executed:

# Verify the include_object function more thoroughly
echo "=== Full include_object function definition ==="
awk 'NR>=29 && NR<=35' api/alembic/env.py

Repository: transformerlab/transformerlab-app

Length of output: 329


Add mandatory type hints to include_object function and ensure psycopg2 dependency is available for PostgreSQL migrations.

Lines 29-34 define include_object without type hints, violating the mandatory type hints requirement. Add argument and return type annotations.

Additionally, the URL normalization on lines 41-42 converts postgresql+asyncpg:// to postgresql://, which relies on psycopg2 as SQLAlchemy's default PostgreSQL driver. However, psycopg2 or psycopg2-binary is not listed in api/pyproject.toml dependencies, causing PostgreSQL migrations to fail at runtime with ModuleNotFoundError: No module named 'psycopg2'. Add the dependency to the appropriate optional dependency group (nvidia/rocm/cpu) alongside asyncpg.


Comment on lines +39 to +43
op.execute(sa.text('ALTER TABLE "user" ALTER COLUMN id TYPE uuid USING id::uuid'))

# Convert oauth_account.id and oauth_account.user_id from CHAR(36) to native UUID
op.execute(sa.text("ALTER TABLE oauth_account ALTER COLUMN id TYPE uuid USING id::uuid"))
op.execute(sa.text("ALTER TABLE oauth_account ALTER COLUMN user_id TYPE uuid USING user_id::uuid"))

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -path "*/alembic/versions/*" -name "*uuid*.py" -type f

Repository: transformerlab/transformerlab-app

Length of output: 149


🏁 Script executed:

ls -la api/alembic/versions/ | head -20

Repository: transformerlab/transformerlab-app

Length of output: 1151


🏁 Script executed:

# Search for the specific migration file
fd "a1b2c3d4e5f6" api/alembic/versions/

Repository: transformerlab/transformerlab-app

Length of output: 147


🏁 Script executed:

# Get all migration files to understand the naming pattern
git ls-files api/alembic/versions/ | sort

Repository: transformerlab/transformerlab-app

Length of output: 747


🏁 Script executed:

cat -n api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py

Repository: transformerlab/transformerlab-app

Length of output: 2584


🏁 Script executed:

# Check other migrations to see existing patterns for data validation or guards
rg -A 10 "invalid|validate|check" api/alembic/versions/ --type py

Repository: transformerlab/transformerlab-app

Length of output: 1411


🏁 Script executed:

# Check if there are any existing patterns that validate data before migrations
rg -B 5 -A 10 "LIMIT 1|fetchone" api/alembic/versions/ --type py

Repository: transformerlab/transformerlab-app

Length of output: 3348


🏁 Script executed:

# Check the initial migration to understand how user.id and oauth_account columns were created
cat -n api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | head -150

Repository: transformerlab/transformerlab-app

Length of output: 8182


🏁 Script executed:

# Look for how user.id was initially populated
rg -B 5 -A 5 'user.*id' api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | head -50

Repository: transformerlab/transformerlab-app

Length of output: 1336


🏁 Script executed:

# Check if there are any other migrations that populate these tables
cat -n api/alembic/versions/c175b784119c_create_oauth_account_table.py

Repository: transformerlab/transformerlab-app

Length of output: 2483


🏁 Script executed:

# Check the rest of the initial migration to see how user table is defined
sed -n '150,300p' api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py

Repository: transformerlab/transformerlab-app

Length of output: 5417


🏁 Script executed:

# Search specifically for the user table creation
rg -A 20 'create_table.*"user"' api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check if there are any default values or inserts that populate user.id with non-UUID values
rg -B 5 -A 10 'INSERT INTO.*user' api/alembic/versions/ --type py

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check how user.id and oauth_account columns are populated in the application code
find api -name "*.py" -type f | xargs grep -l "user.*id.*uuid\|User.*id" | head -10

Repository: transformerlab/transformerlab-app

Length of output: 504


🏁 Script executed:

# Check if there's any application code that populates user.id
rg -l "CREATE.*user\|INSERT.*user" api --type py

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check the models to see how user.id is defined
fd "models" api --type f -name "*.py" | head -5

Repository: transformerlab/transformerlab-app

Length of output: 307


🏁 Script executed:

# Test PostgreSQL UUID casting behavior with invalid data
python3 << 'EOF'
# According to PostgreSQL documentation, casting an invalid string to UUID will raise an error
# Let's verify the regex pattern suggested in the guard is correct for UUID v1-5

import re

# RFC 4122 UUID v1-5 regex from the suggested guard
uuid_pattern = r'^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$'

test_cases = [
    ('550e8400-e29b-41d4-a716-446655440000', True),   # Valid v4
    ('6ba7b810-9dad-11d1-80b4-00c04fd430c8', True),   # Valid v1
    ('00000000-0000-0000-0000-000000000000', False),   # Invalid version (not 1-5)
    ('550e8400-e29b-61d4-a716-446655440000', False),   # Invalid version (6, outside 1-5)
    ('invalid-uuid-string', False),                    # Invalid format
    ('550e8400e29b41d4a716446655440000', False),       # Missing hyphens
]

for test_uuid, should_be_valid in test_cases:
    matches = bool(re.match(uuid_pattern, test_uuid, re.IGNORECASE))
    print(f"UUID: {test_uuid:40} | Matches: {matches:5} | Expected: {should_be_valid:5} | {'✓' if matches == should_be_valid else '✗'}")
EOF

Repository: transformerlab/transformerlab-app

Length of output: 592


🏁 Script executed:

# Check how user.id is generated in the models
cat -n api/transformerlab/models/users.py

Repository: transformerlab/transformerlab-app

Length of output: 23576


🏁 Script executed:

# Check how oauth_account records are created
rg -B 5 -A 10 "oauth_account\|OAuthAccount" api --type py | head -100

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check the db.py to see how UUIDs are handled
cat -n api/transformerlab/db/db.py | head -80

Repository: transformerlab/transformerlab-app

Length of output: 3275


🏁 Script executed:

# Check if there's any place where oauth_account.id is manually set
rg -B 5 -A 5 "oauth_account.*id\|OAuthAccount" api/transformerlab --type py | head -150

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Look for how oauth_account records are inserted or created
rg "add_oauth_account\|oauth_account" api/transformerlab/shared --type py -B 3 -A 10

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check if there's uuid generation or defaults in the models
cat api/transformerlab/shared/models/models.py | grep -A 30 "class OAuthAccount"

Repository: transformerlab/transformerlab-app

Length of output: 1587


🏁 Script executed:

# Check how oauth_account.id is generated by fastapi-users
rg "SQLAlchemyBaseOAuthAccountTableUUID" api --type py -B 5 -A 15

Repository: transformerlab/transformerlab-app

Length of output: 3086


🏁 Script executed:

# Look at the actual implementation of how IDs are generated in oauth accounts
find api -name "*.py" -type f -exec grep -l "OAuthAccount\|oauth_account" {} \; | xargs grep -l "id.*uuid\|generate\|default" | head -5

Repository: transformerlab/transformerlab-app

Length of output: 307


🏁 Script executed:

# Check if fastapi-users generates the oauth_account.id as UUID
python3 << 'EOF'
# According to fastapi-users documentation:
# SQLAlchemyBaseOAuthAccountTableUUID uses UUID for the id field
# The id is typically auto-generated when creating oauth accounts
# Let's check what fastapi-users does

print("FastAPI-Users behavior:")
print("- SQLAlchemyBaseOAuthAccountTableUUID: defines oauth_account table with UUID id")
print("- oauth_account.id: auto-generated as UUID (not user-provided)")
print("- oauth_account.user_id: references user.id (should be valid UUID)")
print()
print("Risk assessment:")
print("- If all oauth_account.id values were generated by fastapi-users: should be valid UUIDs")
print("- If all oauth_account.user_id values reference existing user.id: should be valid UUIDs")
print("- HOWEVER: if data was manually inserted or corrupted, invalid UUIDs could exist")
EOF

Repository: transformerlab/transformerlab-app

Length of output: 585


🏁 Script executed:

# Check if there are any schema migrations that might have inserted manual data
rg -B 10 -A 10 "INSERT\|execute.*text" api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | grep -A 5 -B 5 "user\|oauth"

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Check if there are any INSERT statements or manual data population in migrations
rg -B 5 -A 10 "INSERT\|VALUES" api/alembic/versions/ --type py | head -50

Repository: transformerlab/transformerlab-app

Length of output: 59


🏁 Script executed:

# Verify the coding guideline about migrations
grep -A 5 "alembic.*migration" /dev/stdin << 'EOF'
`api/alembic/versions/*.py`: Use existing patterns for database tables and do not create any table with a foreign key
Use existing patterns for Alembic migrations and do not create any migration with a foreign key; try to autogenerate the migration if possible
EOF

Repository: transformerlab/transformerlab-app

Length of output: 120


🏁 Script executed:

# Check the coding guideline statement directly
cat << 'EOF'
Coding Guideline for api/alembic/versions/*.py:
"Use existing patterns for database tables and do not create any table with a foreign key"
"Use existing patterns for Alembic migrations and do not create any migration with a foreign key; try to autogenerate the migration if possible"
EOF

Repository: transformerlab/transformerlab-app

Length of output: 360


🏁 Script executed:

# Verify if the migration violates the foreign key guideline
cat api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py

Repository: transformerlab/transformerlab-app

Length of output: 2199


Add a validation guard before casting CHAR(36) columns to UUID type.

The migration converts user.id, oauth_account.id, and oauth_account.user_id from CHAR(36) to native UUID using PostgreSQL's USING ::uuid cast. If any value in these columns is not a valid UUID string, the ALTER TABLE statements will fail and abort the migration.

While FastAPI-users auto-generates valid UUIDs for these columns in normal operation, defensive validation before the schema change is prudent and follows the pattern already established in other migrations (e.g., c78d76a6d65c_add_team_id_to_config_table.py).

Suggested guard
 def upgrade() -> None:
     """Convert CHAR(36) UUID columns to native UUID type on PostgreSQL."""
     connection = op.get_bind()

     if connection.dialect.name != "postgresql":
         # SQLite has no native UUID type — nothing to do
         return

+    invalid_user_id = connection.execute(
+        sa.text(
+            """
+            SELECT id FROM "user"
+            WHERE id IS NOT NULL
+              AND id !~* '^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$'
+            LIMIT 1
+            """
+        )
+    ).fetchone()
+    if invalid_user_id:
+        raise RuntimeError(f'Invalid UUID in "user.id": {invalid_user_id[0]}')
+
     # Convert user.id from CHAR(36) to native UUID
     op.execute(sa.text('ALTER TABLE "user" ALTER COLUMN id TYPE uuid USING id::uuid'))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/alembic/versions/a1b2c3d4e5f6_convert_uuid_columns_for_postgres.py`
around lines 39-43: before running the ALTER TABLE ... TYPE uuid USING ...
casts, add a validation guard that queries each target column (user.id,
oauth_account.id, oauth_account.user_id) for values that are not valid UUID
strings and aborts the migration with a clear error if any are found. Implement
this by using op.get_bind() to execute a SELECT ... WHERE NOT (column ~
'^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$')
for each column, check the results, and raise a RuntimeError (or SystemExit)
with details if any rows exist; proceed to the existing
op.execute(sa.text(...)) ALTER TABLE statements only when all validations pass
(follow the same pattern used in c78d76a6d65c_add_team_id_to_config_table.py).

Comment on lines +30 to +33
    # Drop old unique index on key if it exists, then recreate as non-unique
    try:
        op.drop_index("ix_config_key", table_name="config")
    except Exception:

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find the migration files mentioned in the review
find api -name "*initial_migration_create_all_tables*" -o -name "*add_team_id_to_config_table*"

Repository: transformerlab/transformerlab-app

Length of output: 214


🏁 Script executed:

# List all migration files to understand the structure
ls -la api/alembic/versions/ 2>/dev/null | head -20

Repository: transformerlab/transformerlab-app

Length of output: 1151


🏁 Script executed:

# Search for the specific migration files by their IDs
rg -l "f7661070ec23|c78d76a6d65c" api/alembic/versions/

Repository: transformerlab/transformerlab-app

Length of output: 347


🏁 Script executed:

cat -n api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | grep -A 10 "config" | head -50

Repository: transformerlab/transformerlab-app

Length of output: 3066


🏁 Script executed:

# Get the full initial migration file to see config table creation
wc -l api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py

Repository: transformerlab/transformerlab-app

Length of output: 153


🏁 Script executed:

# Read specific lines around the config table creation
sed -n '1,100p' api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | cat -n

Repository: transformerlab/transformerlab-app

Length of output: 5494


🏁 Script executed:

cat -n api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py

Repository: transformerlab/transformerlab-app

Length of output: 6667


🏁 Script executed:

# Search for any other references to dropping the constraint in this or related migrations
rg -A 5 -B 5 "drop_constraint" api/alembic/versions/

Repository: transformerlab/transformerlab-app

Length of output: 1061


🏁 Script executed:

# Check if there's a way to inspect constraint names in the codebase
rg -B 2 -A 2 "UniqueConstraint" api/alembic/versions/f7661070ec23_initial_migration_create_all_tables.py | head -20

Repository: transformerlab/transformerlab-app

Length of output: 1044


Dropping ix_config_key does not remove the original UNIQUE(key) constraint.

The initial migration creates the config table with both a sa.UniqueConstraint("key") (line 35 in f7661070ec23) and a separate unique index ix_config_key (line 37). These are distinct database objects. The current migration only drops the index but leaves the constraint in place, which means the "key" column remains globally unique. This will block the per-team key schema and prevent duplicate keys across different teams, defeating the migration's purpose.

Suggested fix
+    inspector = sa.inspect(connection)
+    for uc in inspector.get_unique_constraints("config"):
+        if uc.get("column_names") == ["key"] and uc.get("name"):
+            op.drop_constraint(uc["name"], "config", type_="unique")
+            break
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py` around
lines 30-33: the migration currently only drops the index "ix_config_key" but
not the original sa.UniqueConstraint("key"), so you must also drop the UNIQUE
constraint on table "config" before recreating a non-unique index; update the
migration to call op.drop_constraint(..., type_="unique", table_name="config")
using the actual constraint name (or inspect and resolve it if the name is
generated), then recreate a non-unique index (op.create_index("ix_config_key",
"config", ["key"], unique=False)) so the key column is no longer globally unique
and per-team duplicates are allowed.
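The constraint/index distinction is easy to see even on SQLite, where a table-level UNIQUE produces an automatic index separate from any explicitly created one. This is illustrative only; the actual migration must target whatever constraint name PostgreSQL generated:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A table-level UNIQUE(key) creates an automatic index behind the scenes...
conn.execute("CREATE TABLE config (id INTEGER PRIMARY KEY, key TEXT, UNIQUE (key))")
# ...and an explicitly created index is a second, independent object
conn.execute("CREATE UNIQUE INDEX ix_config_key ON config (key)")

names = sorted(row[1] for row in conn.execute("PRAGMA index_list('config')"))
print(names)  # ['ix_config_key', 'sqlite_autoindex_config_1']

# Dropping only the explicit index leaves the constraint's index behind,
# so key remains globally unique
conn.execute("DROP INDEX ix_config_key")
names = sorted(row[1] for row in conn.execute("PRAGMA index_list('config')"))
print(names)  # ['sqlite_autoindex_config_1']
```

After the drop, inserting a duplicate key still raises an IntegrityError, which is exactly the failure mode the review describes for the per-team key schema.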

Comment on lines +53 to +55
    try:
        op.create_unique_constraint("uq_config_user_team_key", "config", ["user_id", "team_id", "key"])
    except Exception:

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py

Repository: transformerlab/transformerlab-app

Length of output: 6667


The unique constraint uq_config_user_team_key is ineffective for rows where user_id or team_id is NULL.

Line 27 adds user_id as nullable, and Line 28 adds team_id as nullable. Line 54 includes both in a unique constraint. Due to SQL NULL semantics, multiple rows with (NULL, team_id, key) or (user_id, NULL, key) or (NULL, NULL, key) combinations can be inserted, since NULL values are not considered equal to each other in unique constraints. The migration also leaves user_id as NULL for all migrated configs, making duplicates possible on the remaining (team_id, key) pairs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/alembic/versions/c78d76a6d65c_add_team_id_to_config_table.py` around
lines 53-55: the unique constraint created by
op.create_unique_constraint("uq_config_user_team_key", "config", ["user_id",
"team_id", "key"]) is ineffective when user_id or team_id are NULL. Instead,
create explicit unique indexes to enforce the intended uniqueness across
nullable columns: add a partial unique index for the fully populated case (e.g.
a PostgreSQL index on (user_id, team_id, key) WHERE user_id IS NOT NULL AND
team_id IS NOT NULL), plus two additional partial unique indexes to cover the
single-NULL cases (e.g. a unique index on (team_id, key) WHERE user_id IS NULL
and a unique index on (user_id, key) WHERE team_id IS NULL); replace the
op.create_unique_constraint call with op.create_index/op.execute calls that
create these three indexes (give them clear names instead of
"uq_config_user_team_key") so duplicates cannot be inserted when user_id or
team_id are NULL.

Comment on lines +16 to +20
if DATABASE_HOST and DATABASE_DB and DATABASE_USER and DATABASE_PASSWORD:
    # Use PostgreSQL if all required credentials are provided
    DATABASE_URL = (
        f"postgresql+asyncpg://{DATABASE_USER}:{DATABASE_PASSWORD}@{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
    )

⚠️ Potential issue | 🟠 Major

URL-encode password to handle special characters.

If DATABASE_PASSWORD contains special characters like @, /, :, %, or #, the connection URL will be malformed and database connections will fail. This is a common issue in production environments.

Proposed fix
+from urllib.parse import quote_plus
+
 # Construct DATABASE_URL based on available configuration
 if DATABASE_HOST and DATABASE_DB and DATABASE_USER and DATABASE_PASSWORD:
     # Use PostgreSQL if all required credentials are provided
+    encoded_password = quote_plus(DATABASE_PASSWORD)
     DATABASE_URL = (
-        f"postgresql+asyncpg://{DATABASE_USER}:{DATABASE_PASSWORD}@{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
+        f"postgresql+asyncpg://{DATABASE_USER}:{encoded_password}@{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if DATABASE_HOST and DATABASE_DB and DATABASE_USER and DATABASE_PASSWORD:
    # Use PostgreSQL if all required credentials are provided
    DATABASE_URL = (
        f"postgresql+asyncpg://{DATABASE_USER}:{DATABASE_PASSWORD}@{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
    )

from urllib.parse import quote_plus

if DATABASE_HOST and DATABASE_DB and DATABASE_USER and DATABASE_PASSWORD:
    # Use PostgreSQL if all required credentials are provided
    encoded_password = quote_plus(DATABASE_PASSWORD)
    DATABASE_URL = (
        f"postgresql+asyncpg://{DATABASE_USER}:{encoded_password}@{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/db/constants.py` around lines 16-20: DATABASE_URL
construction currently interpolates DATABASE_PASSWORD raw, which breaks when the
password contains special characters. Update the code that builds DATABASE_URL
to URL-encode the credentials (at least DATABASE_PASSWORD, and preferably
DATABASE_USER) using urllib.parse.quote_plus (or equivalent), then use the
encoded values when formatting DATABASE_URL so special characters like `@`, `/`,
`:`, `#`, and `%` are escaped and the connection string remains valid.
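For reference, quote_plus escapes every character that would otherwise break URL parsing (the sample password is made up):

```python
from urllib.parse import quote_plus

password = "p@ss:w/rd#1%"
encoded = quote_plus(password)
print(encoded)  # p%40ss%3Aw%2Frd%231%25

# The encoded value drops safely into the connection string
url = f"postgresql+asyncpg://user:{encoded}@localhost:5432/db"
```

Unlike urllib.parse.quote, quote_plus also encodes `/`, which matters here since a slash in the password would otherwise be parsed as part of the path.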

Comment on lines +141 to +142
    if DATABASE_TYPE == "sqlite" and db is not None:
        await db.close()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n api/transformerlab/db/session.py | head -160

Repository: transformerlab/transformerlab-app

Length of output: 6621


Initialize db at module scope to prevent NameError during shutdown.

The db variable is only assigned inside the init() function but is read unconditionally in the close() function (line 141). If close() is called before init() completes—such as during error cleanup or shutdown handlers—accessing db will raise NameError. Initialize db at module scope before any shutdown paths can run.

Suggested fix
 import os
 import shutil
 import aiosqlite
 import subprocess
 import sys
 from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
 from sqlalchemy.orm import sessionmaker

 from transformerlab.db.constants import DATABASE_FILE_NAME, DATABASE_URL, DATABASE_TYPE
 from lab.dirs import get_workspace_dir

+db: aiosqlite.Connection | None = None
📝 Committable suggestion


Suggested change
if DATABASE_TYPE == "sqlite" and db is not None:
    await db.close()

import os
import shutil
import aiosqlite
import subprocess
import sys
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

from transformerlab.db.constants import DATABASE_FILE_NAME, DATABASE_URL, DATABASE_TYPE
from lab.dirs import get_workspace_dir

db: aiosqlite.Connection | None = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/db/session.py` around lines 141-142: the module reads
the module-level variable db in close() but only assigns it inside init(),
risking NameError on early shutdown. Declare and initialize db = None at module
scope and ensure init() assigns to the module variable (use global db if needed)
so close() can safely check "if DATABASE_TYPE == 'sqlite' and db is not None:"
without raising; update any references in init() and close() to the same db
symbol.
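The failure mode is ordinary Python scoping, sketched here with stand-in objects rather than the real aiosqlite API (names are illustrative):

```python
# Without a module-level default, calling close() before init() raises NameError.
db = None  # module-scope default makes shutdown paths safe


def init():
    global db
    db = object()  # stands in for the aiosqlite connection opened here


def close():
    # Mirrors the guarded check: safe even if init() never ran
    if db is not None:
        print("closing db")


close()  # before init(): guard skips, no NameError
init()
close()  # after init(): prints "closing db"
```

The same pattern applies to the async close() in session.py; only the `db = None` default and the `global db` assignment in init() are required.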


    # Get the user
-    stmt = select(User).where(User.id == api_key_obj.user_id)
+    stmt = select(User).where(User.id == uuid.UUID(api_key_obj.user_id))

⚠️ Potential issue | 🟡 Minor

Consider handling invalid UUID format gracefully.

If api_key_obj.user_id contains an invalid UUID string, uuid.UUID() will raise a ValueError. This could result in a 500 error instead of a proper 401 response.

Proposed defensive fix
     # Get the user
-    stmt = select(User).where(User.id == uuid.UUID(api_key_obj.user_id))
+    try:
+        user_uuid = uuid.UUID(api_key_obj.user_id)
+    except (ValueError, TypeError):
+        raise HTTPException(status_code=401, detail="Invalid user reference in API key")
+    stmt = select(User).where(User.id == user_uuid)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/routers/auth/api_key_auth.py` at line 88: the code
directly calls uuid.UUID(api_key_obj.user_id), which raises ValueError for
invalid UUID strings (and TypeError for None) and causes a 500. Wrap the
conversion in a try/except catching ValueError and TypeError and treat failure
as an unauthorized/malformed API key (return a 401 / raise the same auth error
used elsewhere), then use the successfully parsed UUID variable in the
select(User).where(...) expression (replace the inline uuid.UUID(...) call with
the parsed UUID variable) so invalid user_id values are handled gracefully.
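For reference, the two failure modes the defensive fix guards against can be seen directly (sample values are arbitrary):

```python
import uuid

# uuid.UUID raises ValueError on malformed strings,
# which is why the defensive fix catches it
try:
    uuid.UUID("not-a-uuid")
    parsed_bad = True
except ValueError:
    parsed_bad = False

# A well-formed string parses cleanly
parsed = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(parsed_bad, parsed.version)  # False 4
```

Passing None raises TypeError rather than ValueError, which is why the proposed fix catches both exception types.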
