UN-3266 [FIX] Re-indexing issue in Prompt Studio#1907
UN-3266 [FIX] Re-indexing issue in Prompt Studio#1907harini-venkataraman merged 153 commits intomainfrom
Conversation
Conflicts resolved: - docker-compose.yaml: Use main's dedicated dashboard_metric_events queue for worker-metrics - PromptCard.jsx: Keep tool_id matching condition from our async socket feature - PromptRun.jsx: Merge useEffect import from main with our branch - ToolIde.jsx: Keep fire-and-forget socket approach (spinner waits for socket event) - SocketMessages.js: Keep both session-store and socket-custom-tool imports + updateCusToolMessages dep - SocketContext.js: Keep simpler path-based socket connection approach - usePromptRun.js: Keep Celery fire-and-forget with socket delivery over polling - setupProxy.js: Accept main's deletion (migrated to Vite)
for more information, see https://pre-commit.ci
… into feat/execution-backend
for more information, see https://pre-commit.ci
… into feat/execution-backend
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
for more information, see https://pre-commit.ci
Collapse multi-line `<Typography.Text>null</Typography.Text>` JSX to a single line so biome's formatter passes in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a defensive guard in `UsageHelper.get_usage_by_model()` that drops `Usage` rows where `usage_type == "llm"` and `llm_usage_reason` is empty. Per the Usage model contract, an empty reason is only valid when `usage_type == "embedding"`; an empty reason combined with `usage_type == "llm"` is a producer-side bug (an LLM call site forgot to pass `llm_usage_reason` in `usage_kwargs`). Without this guard the row surfaces in API deployment responses as a malformed bare `"llm"` bucket with no token breakdown alongside the legitimate `"extraction_llm"` bucket. The guard logs a warning on every dropped row so future producer regressions are detectable. Adds three regression tests in `backend/usage_v2/tests/test_helper.py` that stub `account_usage.models` and `usage_v2.models` in `sys.modules` so the helper can be imported without Django being set up: - `test_unlabeled_llm_row_is_dropped` — bare "llm" bucket disappears - `test_embedding_row_is_preserved` — guard is scoped to LLM rows - `test_all_three_llm_reasons_coexist` — extraction/challenge/summarize Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
- legacy_executor: extract _run_pipeline_answer_step helper to drop _handle_structure_pipeline cognitive complexity from 18 to under 15 - legacy_executor: bundle 9 prompt-run scalars into a prompt_run_args dict so _run_line_item_extraction has 8 params (was 15, limit 13) - legacy_executor: merge implicitly concatenated log string - structure_tool_task: extract _write_pipeline_outputs helper used by both _execute_structure_tool_impl and _run_agentic_extraction to remove the duplicated INFILE / COPY_TO_FOLDER write block (fixes the 6.1% duplication on new code) - test_context_retrieval_metrics: use pytest.approx for float compare, drop unused executor local, drop always-true if is_single_pass Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
…ming Drop _inject_context_retrieval_metrics and its call site in _handle_single_pass_extraction. The helper was timing a second fs.read against a warm cache (the cloud plugin had already read the file to build its combined prompt) and reporting that under context_retrieval, which is a fabricated number, not a measurement. The cloud plugin is the source of the file read for single-pass and is responsible for populating context_retrieval in its returned metrics. Updated the docstring to spell out the contract. Also fix misleading "Completed prompt" streaming in the table and line-item extraction wrappers: the message was firing on both the success and failure branches, and on failure the user never saw the error (it only went to logger.error). Move the success-only message into the success branch and stream the error at LogLevel.ERROR on the failure branch. Fall back to "unknown error" when the plugin returns an empty result.error. Drop the now-orphan TestInjectContextRetrievalMetrics test class (six tests calling the deleted method) and update the module docstring. Surviving classes (TestSinglePassChunkSizeForcing, TestPipelineIndexUsageKwargsPropagation) cover unrelated invariants and are kept. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary by CodeRabbit
WalkthroughThe changes introduce a new extraction-status API endpoint that allows background workers to report document extraction completion status tied to specific profiles and text-extraction configurations. The system atomically persists this status to prevent concurrent overwrites. Executor logic is updated to skip unnecessary re-indexing when documents are already indexed and reindex is disabled. Changes
Sequence DiagramsequenceDiagram
participant Executor as Executor
participant Callback as ide_callback
participant Client as PromptStudioAPIClient
participant API as extraction-status<br/>Endpoint
participant Helper as IndexHelper
participant DB as Database
Executor->>Executor: Check is_document_indexed()
alt Document Not Indexed
Executor->>Executor: perform_indexing()
Executor->>Callback: ide_index_complete(callback_kwargs)
else Document Indexed & reindex=False
Executor->>Callback: ide_index_complete(callback_kwargs)
end
Callback->>Callback: Extract x2text_config_hash,<br/>enable_highlight
Callback->>Client: mark_extraction_status(document_id,<br/>profile_manager_id, ...)
Client->>API: POST /extraction-status/
API->>API: Validate required fields
API->>Helper: mark_extraction_status(...)
Helper->>DB: select_for_update().get_or_create()
DB-->>Helper: IndexManager instance (locked)
Helper->>Helper: Read existing<br/>extraction_status dict
Helper->>Helper: Merge/update entry<br/>for x2text_config_hash
Helper->>DB: save(update_fields=<br/>["extraction_status"])
DB-->>Helper: Confirm update
Helper-->>API: Success/Result
API-->>Client: {"success": true}
Client-->>Callback: Response
Callback->>Callback: Log result (warn on error)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
| Filename | Overview |
|---|---|
| backend/prompt_studio/prompt_studio_core_v2/internal_views.py | Adds new extraction_status POST endpoint mirroring the existing pattern of index_update/indexing_status; resolves ProfileManager by PK, delegates to PromptStudioIndexHelper.mark_extraction_status, and returns {success: bool}. Clean and consistent with the file's existing conventions. |
| backend/prompt_studio/prompt_studio_core_v2/internal_urls.py | Wires the new extraction-status/ URL path; change is minimal and correct. |
| backend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.py | Replaces update_or_create(defaults={extraction_status: {...}}) with select_for_update().get_or_create() + in-place dict merge + save(update_fields=[...]), correctly fixing the dict-clobber bug for multi-hash scenarios without changing single-hash behavior. |
| workers/ide_callback/tasks.py | Adds the missing mark_extraction_status call in ide_index_complete's success branch; correctly guarded behind 'if x2text_config_hash and profile_manager_id' and wrapped in isolated try/except so failure is non-fatal to primary indexing. |
| workers/shared/clients/prompt_studio_client.py | Adds mark_extraction_status client method with correct payload construction matching the new backend endpoint's expected JSON fields. |
| workers/executor/executors/legacy_executor.py | Adds early-return in _handle_index when doc_id_found=True and reindex=False; safe because both primary callers (_handle_structure_pipeline at line 841 and the IDE Index button at prompt_studio_helper.py:545) explicitly pass reindex=True. |
| workers/tests/test_legacy_executor_index.py | Adds test_already_indexed_no_reindex_short_circuits covering the new early-return path, and strengthens test_reindex_passed_through with an explicit perform_indexing.assert_called_once() assertion. |
Sequence Diagram
sequenceDiagram
participant UI as Prompt Studio UI
participant BE as Backend (Django)
participant EW as Executor Worker
participant CB as ide_callback Worker
participant VDB as Vector DB
participant ORM as Database (ORM)
UI->>BE: POST /index (Index button, reindex=True)
BE->>EW: dispatch ide_index (reindex=True in index_params)
EW->>EW: _handle_extract → extracted_text
EW->>VDB: is_document_indexed(doc_id)
Note over EW: reindex=True → skip early-return guard
EW->>VDB: perform_indexing(doc_id, extracted_text)
EW-->>CB: ide_index_complete(result_dict, cb_kwargs)
CB->>BE: POST /indexing-status/ (mark_document_indexed)
CB->>BE: POST /index/ (update_index_manager)
CB->>BE: POST /extraction-status/ NEW
BE->>ORM: select_for_update().get_or_create(doc, profile)
BE->>ORM: merge extraction_status dict + save()
CB-->>UI: WebSocket index_document completed
UI->>BE: POST /answer-prompt (Answer Prompt)
BE->>BE: check_extraction_status(x2text_config_hash)
Note over BE: Now finds COMPLETED marker - skips re-extraction
BE->>EW: dispatch answer_prompt (no re-index)
EW-->>CB: ide_prompt_complete
CB-->>UI: WebSocket fetch_response completed (with valid context)
Reviews (2): Last reviewed commit: "Merge branch 'main' into fix/agentic-exe..." | Re-trigger Greptile
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
workers/tests/test_legacy_executor_index.py (1)
227-255: Consider asserting VectorDB cleanup in the new short-circuit test.Since Line 249 exercises the new early-return branch, it would be useful to assert the mocked VectorDB is still closed in that path.
✅ Suggested test tweak
- mock_vdb_cls.return_value = MagicMock() + mock_vdb = MagicMock() + mock_vdb_cls.return_value = mock_vdb mock_get_fs.return_value = MagicMock() @@ assert result.success is True assert result.data[IKeys.DOC_ID] == "doc-already-indexed" mock_index.is_document_indexed.assert_called_once() mock_index.perform_indexing.assert_not_called() + mock_vdb.close.assert_called_once()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workers/tests/test_legacy_executor_index.py` around lines 227 - 255, The test test_already_indexed_no_reindex_short_circuits should also assert that the VectorDB instance is cleaned up when the executor short-circuits: after calling executor.execute(ctx) add an assertion that the mocked VectorDB (mock_vdb_cls.return_value) had its close/cleanup method invoked (e.g., assert mock_vdb_cls.return_value.close.called or assert_called_once()), so locate the mock_vdb_cls usage in this test and verify its returned mock was closed when mock_index.is_document_indexed returned True and perform_indexing was not called.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@workers/tests/test_legacy_executor_index.py`:
- Around line 227-255: The test test_already_indexed_no_reindex_short_circuits
should also assert that the VectorDB instance is cleaned up when the executor
short-circuits: after calling executor.execute(ctx) add an assertion that the
mocked VectorDB (mock_vdb_cls.return_value) had its close/cleanup method invoked
(e.g., assert mock_vdb_cls.return_value.close.called or assert_called_once()),
so locate the mock_vdb_cls usage in this test and verify its returned mock was
closed when mock_index.is_document_indexed returned True and perform_indexing
was not called.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 65586c7c-474a-4856-b5de-bf896366d0c7
📒 Files selected for processing (7)
backend/prompt_studio/prompt_studio_core_v2/internal_urls.pybackend/prompt_studio/prompt_studio_core_v2/internal_views.pybackend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.pyworkers/executor/executors/legacy_executor.pyworkers/ide_callback/tasks.pyworkers/shared/clients/prompt_studio_client.pyworkers/tests/test_legacy_executor_index.py
chandrasekharan-zipstack
left a comment
There was a problem hiding this comment.
@harini-venkataraman the methods mention "mark_extraction_status" but its called from ide_index_complete. Does this mean that extraction is not handled in a separate worker but clubbed with indexing?
backend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.py
Show resolved
Hide resolved
Yes, ide_index is a operation that runs extract+index in the same Celery task — see _handle_ide_index in legacy_executor.py. Marking extraction_status from the index callback is because by the time ide_index_complete fires, extraction has already succeeded as part of the same task. |



What
Fixes two Prompt Studio IDE regressions introduced by the Phase 4 async-executor migration:
Both symptoms share a single root cause that this PR addresses surgically with three changes.
Why
Pre-Phase-4 behavior (worked)
The synchronous
PromptToolHTTP path markedIndexManager.extraction_status[x2text_config_hash] = {extracted: True, ...}as part of indexing. On the next Answer Prompt,check_extraction_statusfound the COMPLETED marker and skipped re-extraction. The extract.txtstayed byte-identical to what was indexed, so thedoc_idhash computed at retrieval time matched thedoc_idthe VDB was keyed under.Phase-4 behavior (broken)
Phase 4 replaced the synchronous HTTP call with
ExecutionDispatcher.dispatch()to the executor worker. The new callback chain (ide_index_complete) callsmark_document_indexed(Redis cache) andupdate_index_manager(raw_index_id) — but never callsmark_extraction_status. The worker client didn't even have that method.How
Three surgical changes:
Change 1 — Wire
mark_extraction_statusthrough the Phase 4 callback chain (primary fix)v1/prompt-studio/extraction-status/ininternal_views.py+internal_urls.pythat resolvesprofile_manager_id→ProfileManagerand delegates toPromptStudioIndexHelper.mark_extraction_status.No changes to
build_index_payload— all four requiredcb_kwargsfields were already stashed during Phase 4 (lines 577-594).Change 2 — Fix
_handle_indexunconditional re-index (defense-in-depth)In
workers/executor/executors/legacy_executor.py,_handle_indexnow early-returns with the existingdoc_idwhendoc_id_found and not reindex, preventing re-writing the same chunks into the VDB on cache misses. The Index button dispatches withreindex=True(prompt_studio_helper.py:545).Change 3 — Fix
mark_extraction_statusdict-replacement bugIn
backend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.py, replaced theupdate_or_create(defaults={"extraction_status": {hash: data}})pattern with aselect_for_update().get_or_create()+ in-place dict merge +save(update_fields=["extraction_status"]).Can this PR break any existing features
No. Here's why each change is safe:
ide_index_complete. The new call is wrapped in a log-only try/except, so even if the backend endpoint is unreachable or returns an error, primary indexing is unaffected and the existing success path continues. The new endpoint is purely additive.doc_id_found=TrueANDreindex=False. The Prompt Studio Index button dispatches withreindex=True, so the user-facing "Index" action is unchanged. This guard matches the code's clear original intent (why else wouldis_document_indexedbe called and logged?). Existing testtest_reindex_passed_throughupdated to explicitly assertperform_indexingis still called whenreindex=True.Database Migrations
None. The
extraction_statusJSONField andIndexManagerunique constraint already exist inprompt_studio/prompt_studio_index_manager_v2/migrations/0002_indexmanager_extraction_status.py.Env Config
None.
Relevant Docs
architecture-migration-phases.md(Phase 4 async-executor migration)backend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.py(helper docstrings)Related Issues or PRs
fix/agentic-executor-queuebranch (this PR's base).PromptToolHTTP calls withExecutionDispatcher.dispatch().Dependencies Versions
None changed.
Notes on Testing
Automated
Results:
test_legacy_executor_index.py: 13/13 passed (including newtest_already_indexed_no_reindex_short_circuits)test_sanity_phase2.py: passingtest_sanity_phase4.py: passingtest_answer_prompt.py/test_sanity_phase3.pywere verified to exist on the baseline (pre-stash) and are unrelated to this fix.Manual E2E repro (primary validation)
docker-compose upbackend + workers stackide_callbackworker logs; the first Answer Prompt should NOT show a re-index round-trip (no secondide_indexdispatch)reindex=True) — confirm the executor still runs full re-indexing and produces a valid resultMulti-profile regression check (Change 3)
mark_extraction_statusthrough two different profiles on the same document (or through two different x2text configs)IndexManager.extraction_statuscontains BOTH hash entries rather than just the latestScreenshots
N/A — purely backend fix, no UI surface changed.
Checklist
Files Modified
backend/prompt_studio/prompt_studio_core_v2/internal_views.py— newextraction_statusPOST viewbackend/prompt_studio/prompt_studio_core_v2/internal_urls.py— wire new URLbackend/prompt_studio/prompt_studio_index_manager_v2/prompt_studio_index_helper.py— dict-merge fixworkers/shared/clients/prompt_studio_client.py— newmark_extraction_statusclient methodworkers/ide_callback/tasks.py— callmark_extraction_statusfromide_index_completesuccess branchworkers/executor/executors/legacy_executor.py—_handle_indexearly-return guardworkers/tests/test_legacy_executor_index.py— new test for early-return, strengthened reindex test