Commits (89)
- b7f3e42 Add version (mina-parham, Mar 6, 2026)
- 0085f7a Add asset version routers (mina-parham, Mar 6, 2026)
- ff23303 Add asset version routers (mina-parham, Mar 6, 2026)
- cea7182 Ui version (mina-parham, Mar 6, 2026)
- 2932b65 Version group (mina-parham, Mar 6, 2026)
- a4a826d Alembic (mina-parham, Mar 6, 2026)
- 929bddf Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 6, 2026)
- 361ee4a Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 9, 2026)
- cf65d08 Ruff (mina-parham, Mar 9, 2026)
- 4d9b36c Merge branch 'add/model-dataset-group' of https://github.com/transfor… (mina-parham, Mar 9, 2026)
- 7216854 Merge conflict (mina-parham, Mar 9, 2026)
- 048fad2 Ruff (mina-parham, Mar 9, 2026)
- bc1d8c4 Prettier (mina-parham, Mar 9, 2026)
- af276e8 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 9, 2026)
- 4cc8e42 Fix alebmic issue (mina-parham, Mar 9, 2026)
- 47e604e Merge conflict (mina-parham, Mar 9, 2026)
- df7059c Fix failed tests (mina-parham, Mar 9, 2026)
- bd82eb0 Merge branch 'main' into add/model-dataset-group (deep1401, Mar 10, 2026)
- c4ae4ee Fix alembic table down version and fix bug in artifacts function (deep1401, Mar 10, 2026)
- d3b04b5 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 12, 2026)
- ff09e24 Add asset version (mina-parham, Mar 12, 2026)
- 64f93fc Update jobs endpoints (mina-parham, Mar 12, 2026)
- 95654d4 Add asset version service (mina-parham, Mar 12, 2026)
- e134fec Update modes endpoint (mina-parham, Mar 12, 2026)
- c9beb7f Update dataset ui (mina-parham, Mar 12, 2026)
- cf0b63f Update save to registry dialog (mina-parham, Mar 12, 2026)
- a768a2e Add dataset modal (mina-parham, Mar 12, 2026)
- 6a81ec5 Add model modal (mina-parham, Mar 12, 2026)
- 486ba4d Update the dataset and model ui (mina-parham, Mar 12, 2026)
- 5e5102a Update modeloo ui (mina-parham, Mar 12, 2026)
- 6430b5d Add version drawer (mina-parham, Mar 12, 2026)
- 3609538 Update endpoints (mina-parham, Mar 12, 2026)
- 3730526 Update alembic version (mina-parham, Mar 12, 2026)
- d3f1820 Add dataset registry multiuser mode (mina-parham, Mar 12, 2026)
- de09a30 Add model registry multi user mode (mina-parham, Mar 12, 2026)
- 81d67e3 Update endpoints (mina-parham, Mar 12, 2026)
- 47cc2e6 Make dataset ui better (mina-parham, Mar 12, 2026)
- 381e4ea Make model ui better (mina-parham, Mar 12, 2026)
- 5010b4f Make dataset ui better (mina-parham, Mar 12, 2026)
- 0f1bf42 Make the model ui better (mina-parham, Mar 12, 2026)
- 07e13ad Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 12, 2026)
- ba28351 Ruff (mina-parham, Mar 12, 2026)
- 359c223 Prettier (mina-parham, Mar 12, 2026)
- 6d39e5b Merge branch 'add/model-dataset-group' of https://github.com/transfor… (mina-parham, Mar 12, 2026)
- f6fddf8 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 13, 2026)
- 4983e88 Merge remote-tracking branch 'origin/main' into add/model-dataset-group (mina-parham, Mar 13, 2026)
- cf2d46c Merge branch 'main' into add/model-dataset-group (deep1401, Mar 13, 2026)
- 9d47020 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 16, 2026)
- a532aa5 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 17, 2026)
- a232fda Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 18, 2026)
- 573f5ca Remove asset version table (mina-parham, Mar 18, 2026)
- 5fe5ccf Merge branch 'add/model-dataset-group' of https://github.com/transfor… (mina-parham, Mar 18, 2026)
- 65d42aa Remove metadata table (mina-parham, Mar 18, 2026)
- 5495c7f Update asset version routers (mina-parham, Mar 18, 2026)
- a5a629d Update jobs routers (mina-parham, Mar 18, 2026)
- db6ea3e Update asset version (mina-parham, Mar 18, 2026)
- b72e103 Add asset group dirs (mina-parham, Mar 18, 2026)
- 9e00a8a Update models (mina-parham, Mar 18, 2026)
- c1131b1 Update data registry ui (mina-parham, Mar 18, 2026)
- 008bd0e Update data dialog (mina-parham, Mar 18, 2026)
- 890dc37 Update model dialog (mina-parham, Mar 18, 2026)
- 576897a Update model modal (mina-parham, Mar 18, 2026)
- e45e694 Update model registry ui (mina-parham, Mar 18, 2026)
- 97f0809 Update asset version drawer (mina-parham, Mar 18, 2026)
- d264f0c Update version ui (mina-parham, Mar 18, 2026)
- 8c9f65b Update endpoints (mina-parham, Mar 18, 2026)
- c6cfb5d Ruff (mina-parham, Mar 18, 2026)
- b8a4668 Prettier (mina-parham, Mar 18, 2026)
- 9d5e99c Prettier (mina-parham, Mar 18, 2026)
- 8a86bde Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 18, 2026)
- d62d451 Fix failed test (mina-parham, Mar 18, 2026)
- 454be16 Merge branch 'add/model-dataset-group' of https://github.com/transfor… (mina-parham, Mar 18, 2026)
- 1e208a4 Update test (mina-parham, Mar 18, 2026)
- b6a540b Make save to registry async (mina-parham, Mar 18, 2026)
- 2e5109a Make the ui better (mina-parham, Mar 18, 2026)
- 8af076f Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 18, 2026)
- 6e0a768 Ruff (mina-parham, Mar 18, 2026)
- 1571740 Prettier (mina-parham, Mar 18, 2026)
- a89875d Merge branch 'add/model-dataset-group' of https://github.com/transfor… (mina-parham, Mar 18, 2026)
- 4bfac62 Revert "Update test" (mina-parham, Mar 18, 2026)
- a93f186 Fix pytest (mina-parham, Mar 18, 2026)
- dc283d0 Use background task (mina-parham, Mar 18, 2026)
- 435b474 Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 18, 2026)
- 8c0ff17 Ruff (mina-parham, Mar 18, 2026)
- 1ffac84 Add model name (mina-parham, Mar 18, 2026)
- d88d566 Add model name in the ui (mina-parham, Mar 18, 2026)
- 03df35b Merge branch 'main' into add/model-dataset-group (mina-parham, Mar 18, 2026)
- 6cfcd2d Ruff (mina-parham, Mar 18, 2026)
- 887f040 Fix tests (mina-parham, Mar 18, 2026)
63 changes: 63 additions & 0 deletions api/alembic/versions/a3d2e5f8c901_create_asset_versions_table.py
@@ -0,0 +1,63 @@
"""create_asset_versions_table

Revision ID: a3d2e5f8c901
Revises: a1b2c3d4e5f6
Create Date: 2026-03-06 12:00:00.000000

"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = "a3d2e5f8c901"
down_revision: Union[str, Sequence[str], None] = "a1b2c3d4e5f6"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    """Create asset_versions table for tracking versioned groups of models and datasets."""
    connection = op.get_bind()

    # Helper function to check if table exists
    def table_exists(table_name: str) -> bool:
Review comment:

Bug: Migration uses sqlite_master, breaking Postgres

Location: api/alembic/versions/a3d2e5f8c901_create_asset_versions_table.py (line 27)

The migration hard-codes a SQLite-only system table query (sqlite_master) to check whether the table exists, so it fails on Postgres with: relation 'sqlite_master' does not exist. AGENTS.md explicitly states that both SQLite and Postgres are supported, so this blocks deployment on Postgres environments; the migration should use dialect-agnostic introspection via sa.inspect.

How to reproduce: run `alembic upgrade head` against a PostgreSQL database. The query `SELECT name FROM sqlite_master WHERE type='table' AND name=:name` raises an error because sqlite_master does not exist in Postgres.

Suggested patch:
-    def table_exists(table_name: str) -> bool:
-        result = connection.execute(
-            sa.text("SELECT name FROM sqlite_master WHERE type='table' AND name=:name"), {"name": table_name}
-        )
-        return result.fetchone() is not None
+    def table_exists(table_name: str) -> bool:
+        return sa.inspect(connection).has_table(table_name)
        result = connection.execute(
            sa.text("SELECT name FROM sqlite_master WHERE type='table' AND name=:name"), {"name": table_name}
        )
        return result.fetchone() is not None

    if not table_exists("asset_versions"):
        op.create_table(
            "asset_versions",
            sa.Column("id", sa.String(), nullable=False),
            sa.Column("asset_type", sa.String(), nullable=False),
            sa.Column("group_name", sa.String(), nullable=False),
            sa.Column("version", sa.Integer(), nullable=False),
            sa.Column("asset_id", sa.String(), nullable=False),
            sa.Column("tag", sa.String(), nullable=True),
            sa.Column("job_id", sa.String(), nullable=True),
            sa.Column("description", sa.String(), nullable=True),
            sa.Column("created_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
            sa.PrimaryKeyConstraint("id"),
        )
        op.create_index("idx_asset_versions_group", "asset_versions", ["asset_type", "group_name"], unique=False)
        op.create_index("idx_asset_versions_tag", "asset_versions", ["asset_type", "group_name", "tag"], unique=False)
        op.create_index("idx_asset_versions_asset_id", "asset_versions", ["asset_id"], unique=False)
        op.create_index(op.f("ix_asset_versions_asset_type"), "asset_versions", ["asset_type"], unique=False)
        op.create_index(op.f("ix_asset_versions_group_name"), "asset_versions", ["group_name"], unique=False)
        op.create_index(op.f("ix_asset_versions_tag_col"), "asset_versions", ["tag"], unique=False)


def downgrade() -> None:
    """Drop asset_versions table."""
    op.drop_index(op.f("ix_asset_versions_tag_col"), table_name="asset_versions")
    op.drop_index(op.f("ix_asset_versions_group_name"), table_name="asset_versions")
    op.drop_index(op.f("ix_asset_versions_asset_type"), table_name="asset_versions")
    op.drop_index("idx_asset_versions_asset_id", table_name="asset_versions")
    op.drop_index("idx_asset_versions_tag", table_name="asset_versions")
    op.drop_index("idx_asset_versions_group", table_name="asset_versions")
    op.drop_table("asset_versions")
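The dialect-agnostic existence check suggested in the review comment above can be sketched as a standalone snippet. This runs against an in-memory SQLite database purely for illustration; the same code path works on Postgres because `sa.inspect()` dispatches to the connection's dialect:

```python
import sqlalchemy as sa

# In-memory SQLite engine purely for illustration; on a real deployment
# the connection would come from op.get_bind() inside the migration.
engine = sa.create_engine("sqlite://")


def table_exists(connection, table_name: str) -> bool:
    # Inspector.has_table() issues the correct introspection query for
    # whichever backend the connection is bound to.
    return sa.inspect(connection).has_table(table_name)


with engine.connect() as conn:
    print(table_exists(conn, "asset_versions"))  # False on a fresh DB
    sa.Table("asset_versions", sa.MetaData(), sa.Column("id", sa.String)).create(conn)
    print(table_exists(conn, "asset_versions"))  # True after creation
```

The same `has_table` call replaces the `sqlite_master` query without any per-dialect branching.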
2 changes: 2 additions & 0 deletions api/api.py
@@ -88,6 +88,7 @@ def _enable_datadog_if_setup():
    api_keys,
    quota,
    ssh_keys,
    asset_versions,
)
from transformerlab.routers.auth import get_user_and_team # noqa: E402

@@ -332,6 +333,7 @@ async def validation_exception_handler(request, exc):
app.include_router(api_keys.router)
app.include_router(quota.router)
app.include_router(ssh_keys.router, dependencies=[Depends(get_user_and_team)])
app.include_router(asset_versions.router, dependencies=[Depends(get_user_and_team)])

worker_process = None

179 changes: 179 additions & 0 deletions api/transformerlab/routers/asset_versions.py
Review comment (Member):

Just adding here what we discussed on Discord.

It works with a sample task, but it doesn't work for a real model that has multiple files inside. When trying to save a real model to the registry in a new group, it created two folders in the models directory, both with the same files, but neither containing the actual model files. It just copied over the checkpoints folder.

(screenshots attached)
@@ -0,0 +1,179 @@
"""
asset_versions.py

API router for managing versioned groups of models and datasets.
"""

from typing import Optional

from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel

from transformerlab.services import asset_version_service


router = APIRouter(prefix="/asset_versions", tags=["asset_versions"])


# ─── Request / Response schemas ───────────────────────────────────────────────


class CreateVersionRequest(BaseModel):
Review comment:

Bug: CreateVersionRequest lacks validation on asset_type and group_name

Location: api/transformerlab/routers/asset_versions.py (line 21)

The Pydantic model accepts any string for asset_type and group_name. The service layer validates asset_type (returning a ValueError that becomes a 400), but group_name has no length or character constraints at all, so versions can be created with empty or arbitrarily long group names, or with special characters that cause problems in URL routing. The schema should constrain asset_type to Literal["model", "dataset"] and add min_length/max_length/pattern validators to group_name, which also documents the valid values in the OpenAPI spec.

How to reproduce: POST /asset_versions/versions with body {"asset_type": "model", "group_name": "", "asset_id": "test"}. The empty group_name is accepted and creates a version with an empty group.

Suggested patch:
-class CreateVersionRequest(BaseModel):
-    asset_type: str  # 'model' or 'dataset'
-    group_name: str
+class CreateVersionRequest(BaseModel):
+    asset_type: Literal["model", "dataset"]
+    group_name: str = Field(..., min_length=1, max_length=255)

    asset_type: str  # 'model' or 'dataset'
    group_name: str
    asset_id: str
    job_id: Optional[str] = None
    description: Optional[str] = None
    tag: Optional[str] = "latest"


class SetTagRequest(BaseModel):
Review comment:

Bug: SetTagRequest has no validation on the tag value at the schema level

Location: api/transformerlab/routers/asset_versions.py (line 30)

SetTagRequest accepts any string for tag. The service layer validates against VALID_TAGS, but invalid tags pass schema validation and produce a 400 from the service instead of a 422 from Pydantic with proper error details, and the OpenAPI docs give clients no hint of the valid values. Use Literal["latest", "production", "draft"] so validation happens at the API boundary and the schema documents itself.

How to reproduce: PUT /asset_versions/versions/model/group/1/tag with body {"tag": "invalid"}. Pydantic accepts it; only the service raises ValueError.

Suggested patch:
-class SetTagRequest(BaseModel):
-    tag: str  # 'latest', 'production', 'draft'
+class SetTagRequest(BaseModel):
+    tag: Literal["latest", "production", "draft"]

    tag: str  # 'latest', 'production', 'draft'
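The two schema-level fixes proposed in the comments above can be sketched together as follows. This is illustrative only: the class names mirror the router's schemas, and it assumes Pydantic v2:

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class CreateVersionRequest(BaseModel):
    # Literal rejects unknown asset types at the API boundary (422),
    # instead of a service-layer ValueError turned into a 400.
    asset_type: Literal["model", "dataset"]
    # Field constraints stop empty or absurdly long group names.
    group_name: str = Field(..., min_length=1, max_length=255)
    asset_id: str
    job_id: Optional[str] = None
    description: Optional[str] = None
    tag: Literal["latest", "production", "draft"] = "latest"


class SetTagRequest(BaseModel):
    tag: Literal["latest", "production", "draft"]


# An empty group_name now fails validation before reaching the service
try:
    CreateVersionRequest(asset_type="model", group_name="", asset_id="test")
except ValidationError as exc:
    print("rejected:", exc.errors()[0]["type"])
```

FastAPI turns the ValidationError into a 422 response automatically, and the Literal values appear in the generated OpenAPI schema.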


# ─── Group endpoints ─────────────────────────────────────────────────────────


@router.get("/groups", summary="List all version groups for a given asset type.")
async def list_groups(asset_type: str = Query(..., description="'model' or 'dataset'")):
try:
return await asset_version_service.list_groups(asset_type)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))


@router.delete(
"/groups/{asset_type}/{group_name}",
summary="Delete all versions in a group.",
)
async def delete_group(asset_type: str, group_name: str):
Review comment:

Security: delete_group and delete_version have no authorization checks

Location: api/transformerlab/routers/asset_versions.py (line 49)

These endpoints accept any authenticated user's request without checking whether the user owns, or has permission to modify, the target versions; get_user_and_team is a dependency but its result is never used. Any authenticated user can therefore delete any version group or individual version, including those created by other users, and there is no audit trail of who deleted what. Delete endpoints should verify that the requester has permission to modify the target group, or at minimum log who performed the deletion.

How to reproduce: as any authenticated user, call DELETE /asset_versions/groups/model/some_group. The group is deleted regardless of who created it.

    try:
        count = await asset_version_service.delete_group(asset_type, group_name)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    return {"status": "success", "deleted_count": count}


# ─── Version CRUD ─────────────────────────────────────────────────────────────


@router.post("/versions", summary="Create a new version in a group.")
async def create_version(body: CreateVersionRequest):
    try:
        result = await asset_version_service.create_version(
            asset_type=body.asset_type,
            group_name=body.group_name,
            asset_id=body.asset_id,
            job_id=body.job_id,
            description=body.description,
            tag=body.tag,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    return result


@router.get(
    "/versions/{asset_type}/{group_name}",
    summary="List all versions in a group.",
)
async def list_versions(asset_type: str, group_name: str):
    try:
        return await asset_version_service.list_versions(asset_type, group_name)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))


@router.get(
    "/versions/{asset_type}/{group_name}/{version}",
    summary="Get a specific version by number.",
)
async def get_version(asset_type: str, group_name: str, version: int):
    try:
        result = await asset_version_service.get_version(asset_type, group_name, version)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if result is None:
        raise HTTPException(status_code=404, detail="Version not found")
    return result


@router.delete(
    "/versions/{asset_type}/{group_name}/{version}",
    summary="Delete a specific version.",
)
async def delete_version(asset_type: str, group_name: str, version: int):
    try:
        deleted = await asset_version_service.delete_version(asset_type, group_name, version)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if not deleted:
        raise HTTPException(status_code=404, detail="Version not found")
    return {"status": "success"}


# ─── Tag management ──────────────────────────────────────────────────────────


@router.put(
    "/versions/{asset_type}/{group_name}/{version}/tag",
    summary="Set a tag on a specific version. Moves the tag from any other version in the group.",
)
async def set_tag(asset_type: str, group_name: str, version: int, body: SetTagRequest):
    try:
        result = await asset_version_service.set_tag(asset_type, group_name, version, body.tag)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if result is None:
        raise HTTPException(status_code=404, detail="Version not found")
    return result


@router.delete(
    "/versions/{asset_type}/{group_name}/{version}/tag",
    summary="Clear the tag from a specific version.",
)
async def clear_tag(asset_type: str, group_name: str, version: int):
    try:
        result = await asset_version_service.clear_tag(asset_type, group_name, version)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if result is None:
        raise HTTPException(status_code=404, detail="Version not found")
    return result


# ─── Resolution ──────────────────────────────────────────────────────────────


@router.get(
    "/resolve/{asset_type}/{group_name}",
    summary="Resolve a group to a specific version. Defaults to 'latest' tag.",
)
async def resolve(
    asset_type: str,
    group_name: str,
    tag: Optional[str] = Query(None, description="Tag to resolve: 'latest', 'production', 'draft'"),
    version: Optional[int] = Query(None, description="Exact version number to resolve"),
):
    try:
        result = await asset_version_service.resolve(asset_type, group_name, tag=tag, version=version)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if result is None:
        raise HTTPException(status_code=404, detail="No matching version found")
    return result


# ─── Bulk lookups (used by list views) ────────────────────────────────────────


@router.get(
    "/map/{asset_type}",
    summary="Get a map of asset_id -> group memberships for annotating list views.",
)
async def get_asset_group_map(asset_type: str):
    try:
        return await asset_version_service.get_all_asset_group_map(asset_type)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
14 changes: 14 additions & 0 deletions api/transformerlab/routers/data.py
@@ -841,6 +841,20 @@ async def dataset_list(generated: bool = True):
    except Exception:
        merged_list = []

    # Augment each dataset with version group info if any
    try:
        from transformerlab.services import asset_version_service

        group_map = await asset_version_service.get_all_asset_group_map("dataset")
        for entry in merged_list:
            dataset_id = entry.get("dataset_id", "")
            if dataset_id in group_map:
                entry["version_groups"] = group_map[dataset_id]
            else:
                entry["version_groups"] = []
    except Exception as e:
        print(f"Warning: could not fetch dataset version groups: {e}")

    if generated:
        return merged_list

40 changes: 37 additions & 3 deletions api/transformerlab/routers/experiment/jobs.py
@@ -35,6 +35,8 @@
    get_job_models_dir,
    get_models_dir,
)
from transformerlab.services import asset_version_service

from transformerlab.services.cache_service import cache, cached

router = APIRouter(prefix="/jobs", tags=["train"])
@@ -1103,7 +1105,7 @@ async def get_artifacts(job_id: str, request: Request):
        from lab.dirs import get_job_artifacts_dir

        artifacts_dir = await get_job_artifacts_dir(job_id)
-        artifacts = await get_artifacts_from_directory(artifacts_dir, storage)
+        artifacts = await get_artifacts_from_directory(artifacts_dir)
    except Exception as e:
        print(f"Error getting artifacts for job {job_id}: {e}")
        artifacts = []
@@ -1419,7 +1421,11 @@ async def save_dataset_to_registry(
      If a dataset with that name already exists, a timestamped suffix is added.
    - mode='existing': Merge into an existing dataset in the registry. target_name must be provided and must
      refer to an existing dataset. Files from the job dataset are copied into the existing dataset directory.

    In both modes a new version entry is recorded in the asset_versions table
    so the asset can be tracked as part of a versioned group.
    """
    from transformerlab.services import asset_version_service

    try:
Review comment:

Bug: Version entry created even when file copy fails

Location: api/transformerlab/routers/experiment/jobs.py (line 1414)

In save_dataset_to_registry (and the model equivalent), when storage.copy_dir fails the exception is caught and only printed (line 1441), then execution continues to create a version entry for an asset that was never actually copied to the registry. The result is a ghost entry in asset_versions pointing at a dataset or model that does not exist, while the API still returns 'success'; users then see versioned assets that 404 when they try to use them. If the copy fails, the version entry should not be created and the endpoint should return an error. Move the versioning call inside the copy try block.

How to reproduce: trigger save_dataset_to_registry where the copy fails (e.g. disk full or a permissions error), and observe that the version entry is still created and 'success' is returned.

        # Secure the source dataset name
@@ -1471,7 +1477,21 @@ async def save_dataset_to_registry(
        except Exception as copy_err:
            print(f"Storage.copy_dir failed: {copy_err}")

-        return {"status": "success", "message": f"Dataset saved to registry as '{final_name}'"}
+        # Create a version entry for the dataset
+        group_name = dataset_name_secure
+        version_entry = await asset_version_service.create_version(
+            asset_type="dataset",
+            group_name=group_name,
+            asset_id=final_name,
+            job_id=job_id,
+            description=f"Created from job {job_id}",
+        )
+
+        return {
+            "status": "success",
+            "message": f"Dataset saved to registry as '{final_name}'",
+            "version": version_entry,
+        }

    except HTTPException:
        raise
@@ -1548,7 +1568,21 @@ async def save_model_to_registry(
        except Exception as copy_err:
            print(f"storage.copy_dir failed: {copy_err}")

-        return {"status": "success", "message": f"Model saved to registry as '{final_name}'"}
+        # Create a version entry for the model
+        group_name = model_name_secure
+        version_entry = await asset_version_service.create_version(
+            asset_type="model",
+            group_name=group_name,
+            asset_id=final_name,
+            job_id=job_id,
+            description=f"Created from job {job_id}",
+        )
+
+        return {
+            "status": "success",
+            "message": f"Model saved to registry as '{final_name}'",
+            "version": version_entry,
+        }

    except HTTPException:
        raise
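The control-flow fix the review above asks for, reduced to a hedged standalone sketch. `copy_dir` and `create_version` below are async stand-ins for the real storage and service calls, with the copy forced to fail to show the new behavior:

```python
import asyncio


async def copy_dir(src: str, dst: str) -> None:
    # Stand-in for storage.copy_dir; simulate a failed copy
    raise OSError("disk full")


async def create_version(**fields) -> dict:
    # Stand-in for asset_version_service.create_version
    return {"version": 1, **fields}


async def save_to_registry(job_id: str, final_name: str) -> dict:
    try:
        await copy_dir(f"/jobs/{job_id}/dataset", f"/registry/{final_name}")
        # Record the version only after the copy succeeded, so a failed
        # copy can no longer leave a ghost entry behind
        entry = await create_version(asset_type="dataset", asset_id=final_name, job_id=job_id)
    except Exception as copy_err:
        # Surface the failure instead of printing it and returning success
        return {"status": "error", "message": f"copy failed: {copy_err}"}
    return {"status": "success", "version": entry}


print(asyncio.run(save_to_registry("42", "my_dataset")))
```

Because the versioning call sits inside the same try block as the copy, a copy failure short-circuits to the error return and no version entry is ever written.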
18 changes: 17 additions & 1 deletion api/transformerlab/routers/model.py
@@ -818,7 +818,23 @@ async def get_model_prompt_template(model: str):
@router.get("/model/list")
async def model_local_list(embedding=False):
# the model list is a combination of downloaded hugging face models and locally generated models
return await model_helper.list_installed_models(embedding)
models = await model_helper.list_installed_models(embedding)

# Augment each model with version group info if any
try:
from transformerlab.services import asset_version_service

group_map = await asset_version_service.get_all_asset_group_map("model")
Bug: loop reuses `model` as iterator over `models` list

Variable shadowing: the loop reuses `model` as the iterator over the `models` list. The outer return still works, but the shadowing hides bugs. Use `m` or `entry` as the loop variable.

Location: api/transformerlab/routers/model.py (lines 826)

Analysis

What fails: The `for model in models` loop shadows the outer `models` variable's items. After the loop, `model` points to the last item. While the return uses `models` (the list), this shadowing makes the code fragile and confusing.
Result: After the loop, `model` references the last element of the list. This doesn't immediately break since `return models` uses the list, but it's error-prone for future edits.
Expected: Use a different loop variable name like `entry` or `m` to avoid shadowing and reduce confusion.
Impact: Code fragility: any future code added after the loop that references `model` will get the last item, not what was intended.
How to reproduce: Read model.py:820-836. The variable `models` is assigned from `list_installed_models`, then `for model in models` reuses `model` as the loop variable.

Patch Details
-        for model in models:
-            model_id = model.get("model_id", "")
-            if model_id in group_map:
-                model["version_groups"] = group_map[model_id]
+        for entry in models:
+            model_id = entry.get("model_id", "")
+            if model_id in group_map:
+                entry["version_groups"] = group_map[model_id]

for model in models:
model_id = model.get("model_id", "")
if model_id in group_map:
model["version_groups"] = group_map[model_id]
else:
model["version_groups"] = []
except Exception as e:
Bug: silent failure on version group augmentation in model and dataset list

The broad `except Exception` catches all errors and only prints. Add proper logging instead of `print`.

Location: api/transformerlab/routers/model.py (lines 833)

Analysis

What fails: Both `model_local_list` and `dataset_list` catch all exceptions from version group augmentation with a broad `except Exception` and only print a warning. Database connection failures, import errors, or schema mismatches are silently swallowed.
Result: Version group data is silently missing from responses. Only a print statement indicates the failure; there is no structured logging for monitoring/alerting.
Expected: Use the application logger (not `print`) at WARNING level, and consider whether some errors (like DB connection failures) should propagate.
Impact: Debugging difficulty: silent failures in production with no structured logging. Missing version data without any user-visible indication.
How to reproduce: Break the asset_versions table (e.g., drop it). Call GET /model/list. The warning is printed to stdout but no error is visible to API callers or monitoring.

print(f"Warning: could not fetch model version groups: {e}")

return models
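Taken together, the two review notes above suggest the augmentation loop below. This is a hedged sketch, not the merged code: it assumes `group_map` maps model IDs to version-group entries, uses a non-shadowing loop variable per the first note, and replaces `print` with the module logger per the second:

```python
import logging

logger = logging.getLogger(__name__)

def augment_with_version_groups(models: list[dict], group_map: dict[str, list]) -> list[dict]:
    """Attach version-group info to each model entry in place.

    Mirrors the diff's loop, but with `entry` instead of `model` so the
    outer list name is never shadowed.
    """
    try:
        for entry in models:
            model_id = entry.get("model_id", "")
            # dict.get with a default collapses the if/else from the diff.
            entry["version_groups"] = group_map.get(model_id, [])
    except Exception:
        # Structured warning with a stack trace, instead of a bare print().
        logger.warning("could not attach model version groups", exc_info=True)
    return models

models = [{"model_id": "llama-3"}, {"model_id": "phi-2"}]
group_map = {"llama-3": ["llama-group"]}
augment_with_version_groups(models, group_map)
print(models[0]["version_groups"])  # → ['llama-group']
```

Whether the broad `except Exception` should remain at all (rather than letting DB failures propagate) is the open question raised in the second review note; the sketch keeps it only to preserve the endpoint's best-effort behavior.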


@router.get("/model/provenance/{model_id}")