Merged
153 commits
2da4907
Execution backend - revamp
harini-venkataraman Feb 19, 2026
41eeef8
async flow
harini-venkataraman Feb 19, 2026
f66dfb2
Streaming progress to FE
harini-venkataraman Feb 24, 2026
95c6592
Removing multi hop in Prompt studio ide and structure tool
harini-venkataraman Feb 25, 2026
d8cc6cc
Merge origin/main into feat/execution-backend
Deepak-Kesavan Feb 28, 2026
44a2b3f
Merge remote-tracking branch 'origin/main' into feat/execution-backend
Deepak-Kesavan Mar 2, 2026
2f4f2dc
UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
Deepak-Kesavan Mar 2, 2026
d041201
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
0a0cfb1
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 2, 2026
a4e1fd7
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 2, 2026
ae77d6a
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
5c22956
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
3cc3213
Removed redundant envs
harini-venkataraman Mar 2, 2026
d0532f8
Removed redundant envs
harini-venkataraman Mar 2, 2026
6173df5
Removed redundant envs
harini-venkataraman Mar 3, 2026
bbe6f58
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 3, 2026
a3dc912
Removed redundant envs
harini-venkataraman Mar 3, 2026
98c8071
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 3, 2026
21157ac
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 3, 2026
0216b59
Removed redundant envs
harini-venkataraman Mar 3, 2026
db81b9d
Removed redundant envs
harini-venkataraman Mar 3, 2026
e1da202
Removed redundant envs
harini-venkataraman Mar 3, 2026
d119797
Removed redundant envs
harini-venkataraman Mar 3, 2026
fbadbf8
Removed redundant envs
harini-venkataraman Mar 3, 2026
882296e
Removed redundant envs
harini-venkataraman Mar 4, 2026
6d3bbbf
Removed redundant envs
harini-venkataraman Mar 4, 2026
292460b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 4, 2026
f35c0e6
Removed redundant envs
harini-venkataraman Mar 4, 2026
9bcb458
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 4, 2026
0cbd10a
adding worker for callbacks
harini-venkataraman Mar 4, 2026
2b1ab1e
adding worker for callbacks
harini-venkataraman Mar 5, 2026
4122f08
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 5, 2026
1ceb352
adding worker for callbacks
harini-venkataraman Mar 5, 2026
d69304d
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 5, 2026
7c1266b
adding worker for callbacks
harini-venkataraman Mar 5, 2026
0b84d9e
adding worker for callbacks
harini-venkataraman Mar 5, 2026
5b0629d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 5, 2026
98ee4b9
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 6, 2026
2dffcef
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 6, 2026
3b35fb2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2026
1ab6031
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 6, 2026
15c3daf
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 6, 2026
7ae1a74
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2026
fbf9c29
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 9, 2026
ec2f762
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 9, 2026
d6a3c5e
adding worker for callbacks
harini-venkataraman Mar 9, 2026
5c23ab0
adding worker for callbacks
harini-venkataraman Mar 9, 2026
525024f
adding worker for callbacks
harini-venkataraman Mar 9, 2026
a8cbce1
adding worker for callbacks
harini-venkataraman Mar 9, 2026
549f17a
adding worker for callbacks
harini-venkataraman Mar 9, 2026
f9b86a9
adding worker for callbacks
harini-venkataraman Mar 10, 2026
5369e5a
adding worker for callbacks
harini-venkataraman Mar 10, 2026
b5205ff
adding worker for callbacks
harini-venkataraman Mar 10, 2026
9659661
fix: write output files in agentic extraction pipeline
harini-venkataraman Mar 11, 2026
67eef62
UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in te…
harini-venkataraman Mar 11, 2026
3f4cc7d
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 11, 2026
a563a35
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
9b422da
Update docs
harini-venkataraman Mar 11, 2026
6a6e8e9
Merge branch 'feat/async-prompt-service-v2' of github.com:Zipstack/un…
harini-venkataraman Mar 11, 2026
817fc1c
UN-3266 fix: remove dead code with undefined names in fetch_response
harini-venkataraman Mar 11, 2026
d9bc50f
Un 3266 fix security hotspot tmp paths (#1851)
harini-venkataraman Mar 11, 2026
b715f64
UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
harini-venkataraman Mar 11, 2026
e9c23b2
UN-3266 fix: resolve SonarCloud code smells in PR #1849
harini-venkataraman Mar 11, 2026
f59755a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
4bf9736
UN-3266 fix: wrap long log message in dispatcher.py to fix E501
harini-venkataraman Mar 11, 2026
0531870
UN-3266 fix: resolve remaining SonarCloud S117 naming violations
harini-venkataraman Mar 11, 2026
a2edb23
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
3f86131
UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
harini-venkataraman Mar 11, 2026
45e61c4
UN-3266 fix: resolve SonarCloud cognitive complexity and code smell v…
harini-venkataraman Mar 11, 2026
6391c6c
UN-3266 fix: remove unused RetrievalStrategy import from _handle_answ…
harini-venkataraman Mar 11, 2026
0af0484
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
807e405
UN-3266 fix: rename UsageHelper params to lowercase (N803)
harini-venkataraman Mar 11, 2026
9bdb3f5
UN-3266 fix: resolve remaining SonarCloud issues from check run 66691…
harini-venkataraman Mar 11, 2026
18eafe9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
7a01a35
UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
harini-venkataraman Mar 11, 2026
3e5ce31
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 12, 2026
e3ca0c6
fix: resolve Biome linting errors in frontend source files
harini-venkataraman Mar 12, 2026
db3d8c2
fix: replace dynamic import of SharePermission with static import in …
harini-venkataraman Mar 12, 2026
a62a9fd
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 12, 2026
b3a90af
fix: resolve SonarCloud warnings in frontend components
harini-venkataraman Mar 12, 2026
4200ac1
Merge branch 'main' into feat/async-prompt-service-v2
ritwik-g Mar 12, 2026
1c58eb9
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 18, 2026
8fdb680
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
79adb41
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
9749083
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2026
e8515d5
Address PR #1849 review comments: fix null guards, dead code, and tes…
harini-venkataraman Mar 19, 2026
2be161b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2026
7a740a2
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
3d9f540
Fix missing llm_usage_reason for summarize LLM usage tracking
harini-venkataraman Mar 23, 2026
26d8c4a
UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
harini-venkataraman Mar 23, 2026
4879b10
Fixing API deployment response mismatches
harini-venkataraman Mar 23, 2026
8057527
Fix single-pass extraction showing only one prompt result in real-time
harini-venkataraman Mar 25, 2026
d96a521
Move summarize from sync Django plugin to executor worker for IDE index
harini-venkataraman Mar 25, 2026
a40b681
Address PR #1849 review comments: null guards, thread safety
harini-venkataraman Mar 25, 2026
4966919
Add documentation to ExecutionResponse DTO describing result structure
harini-venkataraman Mar 25, 2026
8e29665
Fix PR review issues: IDOR, null guards, rollback, spinner, summarize…
harini-venkataraman Mar 26, 2026
58825ef
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 26, 2026
e1cec00
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 26, 2026
53fe9fc
Fix CI, tests, and add async prompt studio improvements
harini-venkataraman Mar 26, 2026
1468a97
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 26, 2026
9964c43
Fix pre-existing biome CI errors: import ordering and formatting
harini-venkataraman Mar 26, 2026
44f72f8
Fix ruff F821: add missing transaction import in prompt_studio_helper
harini-venkataraman Mar 26, 2026
bdf2916
Add input validation guards to bulk_fetch_response endpoint
harini-venkataraman Mar 26, 2026
3989ad4
Merge branch 'main' into feat/async-prompt-service-v2
kirtimanmishrazipstack Mar 31, 2026
0424443
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 31, 2026
28f2224
IDE Call backs
harini-venkataraman Mar 31, 2026
834df68
Sonar issues fix
harini-venkataraman Mar 31, 2026
8ed6b47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
2b333a3
Fix ruff errors: restore summary_profile variable, suppress TC001 in …
harini-venkataraman Mar 31, 2026
0628ed1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
666563a
Update bun.lock to match package.json dependency ranges
harini-venkataraman Mar 31, 2026
e489e45
Fix all biome lint warnings: empty blocks, missing braces, forEach re…
harini-venkataraman Mar 31, 2026
0aae584
Move ExecutionContext import into TYPE_CHECKING block
harini-venkataraman Mar 31, 2026
2ba2c21
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
3e7f808
Fix SonarQube issues: duplication, naming, nesting, unused var
harini-venkataraman Mar 31, 2026
42eaed8
Replace worker-ide-callback Dockerfile with worker-unified
harini-venkataraman Mar 31, 2026
ac58c0e
Add celery_executor_agentic queue to executor worker
harini-venkataraman Apr 2, 2026
8114849
Fixing email enforce type
harini-venkataraman Apr 3, 2026
2b35695
Removing line-item from select choices
harini-venkataraman Apr 3, 2026
0deb08d
Merge main
harini-venkataraman Apr 3, 2026
b5afee1
Update workers/shared/enums/worker_enums_base.py
harini-venkataraman Apr 3, 2026
19ea4fc
Update backend/workflow_manager/workflow_v2/workflow_helper.py
harini-venkataraman Apr 3, 2026
c6cdffb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 3, 2026
0879e82
Fix false success logs and silent failures in ETL destination pipelines
harini-venkataraman Apr 3, 2026
822e040
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 3, 2026
1eae4e2
Merge branch 'main' into fix/agentic-executor-queue
kirtimanmishrazipstack Apr 3, 2026
802eddb
Revert ETL destination pipeline changes — deferring to next cut
harini-venkataraman Apr 3, 2026
0b16930
Fix false success logs and missing data in ETL destination pipelines
harini-venkataraman Apr 6, 2026
0bea4cc
Fix missing context_retrieval metric for single pass extraction
harini-venkataraman Apr 6, 2026
b53b37b
Fix Unstructured IO adapter PermissionError on remote storage
harini-venkataraman Apr 6, 2026
13f25d4
Defer subscription usage tracking to IDE callback workers
harini-venkataraman Apr 6, 2026
7344e61
Fix missing embedding metadata in API deployment with chunking
harini-venkataraman Apr 6, 2026
c02ef1b
Fix email enforce type returning "NA" string and surface null in FE
harini-venkataraman Apr 6, 2026
c7aacc8
Feat/line item executor plugin (#1899)
harini-venkataraman Apr 6, 2026
a0884e7
Guard against undefined connector type in PostHog event lookup
harini-venkataraman Apr 6, 2026
3dd8b56
Add worker-executor-v2 service to docker-compose under workers-v2 pro…
harini-venkataraman Apr 6, 2026
7757586
Feat/line item executor plugin (#1900)
harini-venkataraman Apr 6, 2026
98ec941
Task pipeline fixes
harini-venkataraman Apr 6, 2026
8334515
Merge branch 'fix/agentic-executor-queue' of github.com:Zipstack/unst…
harini-venkataraman Apr 6, 2026
0a91221
Fixing null fonts
harini-venkataraman Apr 6, 2026
e1bbc80
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman Apr 6, 2026
6f2ce13
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
0533ced
Fix biome formatting in DisplayPromptResult
harini-venkataraman Apr 6, 2026
9a6f31a
Drop unlabeled LLM rows from per-model usage breakdown
harini-venkataraman Apr 6, 2026
095c7d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
e35af2f
Fix Sonar issues: cognitive complexity, params, dup, test smells
harini-venkataraman Apr 6, 2026
7421f3b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
1a79030
Addressing greptile comments
harini-venkataraman Apr 6, 2026
5c3b67c
Addressing greptile comments
harini-venkataraman Apr 6, 2026
adda29e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
1b0a1e1
Address PR review on legacy_executor single-pass and extraction strea…
harini-venkataraman Apr 7, 2026
10b2431
Fixing re-indexing marker
harini-venkataraman Apr 8, 2026
f1f071e
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman Apr 8, 2026
5 changes: 5 additions & 0 deletions backend/prompt_studio/prompt_studio_core_v2/internal_urls.py
@@ -10,6 +10,11 @@
     path("output/", internal_views.prompt_output, name="prompt-output"),
     path("index/", internal_views.index_update, name="index-update"),
     path("indexing-status/", internal_views.indexing_status, name="indexing-status"),
+    path(
+        "extraction-status/",
+        internal_views.extraction_status,
+        name="extraction-status",
+    ),
     path(
         "profile/<str:profile_id>/",
         internal_views.profile_detail,
67 changes: 67 additions & 0 deletions backend/prompt_studio/prompt_studio_core_v2/internal_views.py
@@ -154,6 +154,73 @@ def index_update(request):
         )


+@csrf_exempt
+@require_http_methods(["POST"])
+def extraction_status(request):
+    """Mark IndexManager.extraction_status for a document+profile pair.
+
+    Called by the ide_callback worker after a successful ide_index run so
+    that subsequent Answer Prompt dispatches can short-circuit extraction
+    via PromptStudioIndexHelper.check_extraction_status.
+
+    Expected JSON payload:
+        {
+            "document_id": str,
+            "profile_manager_id": str,
+            "x2text_config_hash": str,
+            "enable_highlight": bool,
+            "extracted": bool (optional, default true),
+            "error_message": str | null (optional)
+        }
+    """
+    data, err = _parse_json_body(request)
+    if err:
+        return err
+
+    document_id = data.get("document_id", "")
+    profile_manager_id = data.get("profile_manager_id", "")
+    x2text_config_hash = data.get("x2text_config_hash", "")
+    enable_highlight = data.get("enable_highlight", False)
+    extracted = data.get("extracted", True)
+    error_message = data.get("error_message")
+
+    if not document_id or not profile_manager_id or not x2text_config_hash:
+        return JsonResponse(
+            {
+                "success": False,
+                "error": (
+                    "document_id, profile_manager_id, and x2text_config_hash "
+                    "are required"
+                ),
+            },
+            status=status.HTTP_400_BAD_REQUEST,
+        )
+
+    try:
+        from prompt_studio.prompt_profile_manager_v2.models import ProfileManager
+        from prompt_studio.prompt_studio_index_manager_v2.prompt_studio_index_helper import (
+            PromptStudioIndexHelper,
+        )
+
+        profile_manager = ProfileManager.objects.get(pk=profile_manager_id)
+        success = PromptStudioIndexHelper.mark_extraction_status(
+            document_id=document_id,
+            profile_manager=profile_manager,
+            x2text_config_hash=x2text_config_hash,
+            enable_highlight=enable_highlight,
+            extracted=extracted,
+            error_message=error_message,
+        )
+        return JsonResponse({"success": success})
+
+    except Exception as e:
+        logger.exception("extraction_status internal API failed")
+        return JsonResponse(
+            {"success": False, "error": str(e)},
+            status=status.HTTP_500_INTERNAL_SERVER_ERROR,
+        )
+
+
 @csrf_exempt
 @require_http_methods(["POST"])
 def indexing_status(request):
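The payload contract described in the `extraction_status` docstring can be exercised without Django. The following is a minimal, framework-free sketch of the required-field check and the defaults the view applies; `parse_extraction_payload` and `REQUIRED_FIELDS` are hypothetical names introduced for illustration and are not part of this PR.

```python
from __future__ import annotations

from typing import Any

# Fields the view rejects with HTTP 400 when missing or empty.
REQUIRED_FIELDS = ("document_id", "profile_manager_id", "x2text_config_hash")


def parse_extraction_payload(
    data: dict[str, Any],
) -> tuple[dict[str, Any] | None, str | None]:
    """Hypothetical helper: return (normalized_payload, error).

    Exactly one element of the returned tuple is None.
    """
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        return None, (
            "document_id, profile_manager_id, and x2text_config_hash are required"
        )

    normalized = {
        "document_id": data["document_id"],
        "profile_manager_id": data["profile_manager_id"],
        "x2text_config_hash": data["x2text_config_hash"],
        # Optional fields fall back to the same defaults as the view.
        "enable_highlight": data.get("enable_highlight", False),
        "extracted": data.get("extracted", True),
        "error_message": data.get("error_message"),
    }
    return normalized, None
```

Keeping the check in a pure function like this would make the 400 path testable without a request object, though the PR itself performs the check inline in the view.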
@@ -97,11 +97,6 @@ def mark_extraction_status(
             with transaction.atomic():
                 document = DocumentManager.objects.get(pk=document_id)

-                args = {
-                    "document_manager": document,
-                    "profile_manager": profile_manager,
-                }
-
                 # Build extraction status data
                 status_data = {
                     "extracted": extracted,
@@ -112,13 +107,23 @@ def mark_extraction_status(
                 if not extracted and error_message:
                     status_data["error"] = error_message

-                defaults = {"extraction_status": {x2text_config_hash: status_data}}
-
-                index_manager, created = IndexManager.objects.update_or_create(
-                    **args,
-                    defaults=defaults,
+                # Lock the row (or create an empty one) so concurrent callers
+                # merge into the same dict rather than clobbering each other.
+                index_manager, created = (
+                    IndexManager.objects.select_for_update().get_or_create(
+                        document_manager=document,
+                        profile_manager=profile_manager,
+                        defaults={"extraction_status": {}},
+                    )
                 )

+                # Merge in place — update_or_create(defaults=...) would replace
+                # the whole dict and wipe any prior hash entries.
+                extraction_status = dict(index_manager.extraction_status or {})
+                extraction_status[x2text_config_hash] = status_data
+                index_manager.extraction_status = extraction_status
+                index_manager.save(update_fields=["extraction_status"])
+
                 logger.info(
                     f"Index manager {index_manager} {index_manager.index_ids_history}"
                 )
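The difference between the removed `update_or_create(defaults=...)` call and the new merge can be shown with plain dictionaries. This is a simplified sketch that models an `IndexManager` row as a dict; `replace_status` and `merge_status` are hypothetical names, and the row lock taken by `select_for_update()` is not modeled here.

```python
import copy


def replace_status(row: dict, config_hash: str, status_data: dict) -> dict:
    """Old behaviour: overwrite the whole extraction_status dict."""
    updated = copy.deepcopy(row)
    updated["extraction_status"] = {config_hash: status_data}
    return updated


def merge_status(row: dict, config_hash: str, status_data: dict) -> dict:
    """New behaviour: keep prior hash entries and add or overwrite one key."""
    updated = copy.deepcopy(row)
    merged = dict(updated.get("extraction_status") or {})
    merged[config_hash] = status_data
    updated["extraction_status"] = merged
    return updated


# A row that already records a successful extraction for one config hash.
row = {"extraction_status": {"hash-a": {"extracted": True}}}

replaced = replace_status(row, "hash-b", {"extracted": True})
merged = merge_status(row, "hash-b", {"extracted": True})

print(sorted(replaced["extraction_status"]))  # ['hash-b']
print(sorted(merged["extraction_status"]))    # ['hash-a', 'hash-b']
```

With replacement, the entry for `hash-a` is lost, so a later status check for that configuration would presumably miss and trigger extraction again; with the merge, both entries survive.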
13 changes: 12 additions & 1 deletion workers/executor/executors/legacy_executor.py
@@ -1026,9 +1026,20 @@ def _handle_index(self, context: ExecutionContext) -> ExecutionResult:
                 doc_id_found,
                 reindex,
             )
+            if doc_id_found and not reindex:
+                shim.stream_log(
+                    "Document already indexed in vector store; skipping re-index."
+                )
+                logger.info(
+                    "Skipping re-index: doc_id=%s already in vector DB and "
+                    "reindex=False",
+                    doc_id,
+                )
+                return ExecutionResult(success=True, data={IKeys.DOC_ID: doc_id})
+
             if doc_id_found and reindex:
                 shim.stream_log("Document already indexed, re-indexing...")
-            elif not doc_id_found:
+            else:
                 shim.stream_log("Indexing document for the first time...")
             shim.stream_log("Indexing document into vector store...")
             index.perform_indexing(
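The branching added to `_handle_index` reduces to a small decision table over `doc_id_found` and `reindex`. The sketch below restates it as a pure function; `IndexAction` and `decide_index_action` are hypothetical names used only to illustrate the logic, not code from the executor.

```python
from enum import Enum


class IndexAction(Enum):
    SKIP = "skip"                # early return, perform_indexing not called
    REINDEX = "reindex"          # already indexed, caller asked to re-index
    FIRST_INDEX = "first_index"  # not yet in the vector store


def decide_index_action(doc_id_found: bool, reindex: bool) -> IndexAction:
    """Mirror the order of checks in the patched _handle_index."""
    if doc_id_found and not reindex:
        return IndexAction.SKIP
    if doc_id_found and reindex:
        return IndexAction.REINDEX
    # The else branch covers every case where the document was not found,
    # regardless of the reindex flag.
    return IndexAction.FIRST_INDEX


for found in (True, False):
    for flag in (True, False):
        print(found, flag, decide_index_action(found, flag).value)
```

Only the `SKIP` outcome avoids the call to `perform_indexing`; the other two differ only in the log message that is streamed.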
25 changes: 25 additions & 0 deletions workers/ide_callback/tasks.py
@@ -139,7 +139,7 @@


 @app.task(name="ide_index_complete")
 def ide_index_complete(

Check failure on line 142 in workers/ide_callback/tasks.py
SonarQubeCloud / SonarCloud Code Analysis: Refactor this function to reduce its Cognitive Complexity from 16 to the 15 allowed.
See more on https://sonarcloud.io/project/issues?id=Zipstack_unstract&issues=AZ1slUys5MaBOhPBUGw2&open=AZ1slUys5MaBOhPBUGw2&pullRequest=1907

     result_dict: dict[str, Any],
     callback_kwargs: dict[str, Any] | None = None,
 ) -> dict[str, Any]:
@@ -211,6 +211,31 @@
             profile_manager_id,
         )

+        # Mark extraction_status so subsequent Answer Prompt dispatches
+        # can short-circuit re-extraction. The Phase 4 backend payload
+        # already stashes x2text_config_hash and enable_highlight in
+        # cb_kwargs for exactly this purpose. Failure here is non-fatal:
+        # primary indexing already succeeded above.
+        x2text_config_hash = cb.get("x2text_config_hash", "")
+        enable_highlight = cb.get("enable_highlight", False)
+        if x2text_config_hash and profile_manager_id:
+            try:
+                api.mark_extraction_status(
+                    document_id=document_id,
+                    profile_manager_id=profile_manager_id,
+                    x2text_config_hash=x2text_config_hash,
+                    enable_highlight=enable_highlight,
+                    organization_id=org_id,
+                )
+            except Exception:
+                logger.warning(
+                    "Failed to mark extraction_status for document %s "
+                    "profile %s; primary indexing succeeded.",
+                    document_id,
+                    profile_manager_id,
+                    exc_info=True,
+                )
+
         # Handle summary index tracking via backend endpoint
         # (requires PromptIdeBaseTool + IndexingUtils which need Django ORM)
         summary_profile_id = cb.get("summary_profile_id", "")
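The callback treats the extraction-status update as a best-effort follow-up: a failure is logged with a traceback but does not fail the task. A generic version of that pattern might look like the sketch below; `run_best_effort` is a hypothetical helper, and the PR writes the `try`/`except` inline instead.

```python
import logging
from collections.abc import Callable

logger = logging.getLogger(__name__)


def run_best_effort(step: Callable[[], object], description: str) -> bool:
    """Run a secondary step; log and swallow any exception it raises.

    Returns True if the step completed, False if it raised.
    """
    try:
        step()
    except Exception:
        # exc_info=True attaches the traceback to the warning record.
        logger.warning(
            "Failed to %s; primary work already succeeded.",
            description,
            exc_info=True,
        )
        return False
    return True


def failing_step() -> None:
    raise RuntimeError("backend unreachable")


ok = run_best_effort(lambda: None, "mark extraction_status")
failed = run_best_effort(failing_step, "mark extraction_status")
print(ok, failed)  # True False
```

One consequence of this design, under the assumption that the status is only written here, is that a swallowed failure leaves the status unmarked, so a later dispatch would simply not get the short-circuit.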
28 changes: 28 additions & 0 deletions workers/shared/clients/prompt_studio_client.py
@@ -15,6 +15,7 @@
 _OUTPUT_ENDPOINT = "v1/prompt-studio/output/"
 _INDEX_ENDPOINT = "v1/prompt-studio/index/"
 _INDEXING_STATUS_ENDPOINT = "v1/prompt-studio/indexing-status/"
+_EXTRACTION_STATUS_ENDPOINT = "v1/prompt-studio/extraction-status/"
 _PROFILE_ENDPOINT = "v1/prompt-studio/profile/{profile_id}/"
 _HUBSPOT_ENDPOINT = "v1/prompt-studio/hubspot-notify/"
 _SUMMARY_INDEX_KEY_ENDPOINT = "v1/prompt-studio/summary-index-key/"
@@ -71,6 +72,33 @@ def update_index_manager(
         }
         return self.post(_INDEX_ENDPOINT, data=payload, organization_id=organization_id)

+    def mark_extraction_status(
+        self,
+        document_id: str,
+        profile_manager_id: str,
+        x2text_config_hash: str,
+        enable_highlight: bool,
+        organization_id: str | None = None,
+        extracted: bool = True,
+        error_message: str | None = None,
+    ) -> dict[str, Any]:
+        """Mark IndexManager.extraction_status for a document+profile pair.
+
+        Called from the ide_index_complete callback so that subsequent
+        Answer Prompt dispatches can short-circuit re-extraction.
+        """
+        payload = {
+            "document_id": document_id,
+            "profile_manager_id": profile_manager_id,
+            "x2text_config_hash": x2text_config_hash,
+            "enable_highlight": enable_highlight,
+            "extracted": extracted,
+            "error_message": error_message,
+        }
+        return self.post(
+            _EXTRACTION_STATUS_ENDPOINT, data=payload, organization_id=organization_id
+        )
+
     def mark_document_indexed(
         self,
         org_id: str,
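One way to check what `mark_extraction_status` sends is to substitute a client whose `post` records its arguments instead of making an HTTP call. The sketch below assumes only the method body shown in the diff; `RecordingClient` is a hypothetical stand-in, and the real client's base class, authentication, and return shape are not modeled.

```python
from __future__ import annotations

from typing import Any

_EXTRACTION_STATUS_ENDPOINT = "v1/prompt-studio/extraction-status/"


class RecordingClient:
    """Hypothetical stand-in: records posts instead of sending them."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any], str | None]] = []

    def post(
        self, endpoint: str, data: dict[str, Any], organization_id: str | None = None
    ) -> dict[str, Any]:
        self.calls.append((endpoint, data, organization_id))
        return {"success": True}  # assumed response shape

    def mark_extraction_status(
        self,
        document_id: str,
        profile_manager_id: str,
        x2text_config_hash: str,
        enable_highlight: bool,
        organization_id: str | None = None,
        extracted: bool = True,
        error_message: str | None = None,
    ) -> dict[str, Any]:
        # Same payload construction as the method added in the diff.
        payload = {
            "document_id": document_id,
            "profile_manager_id": profile_manager_id,
            "x2text_config_hash": x2text_config_hash,
            "enable_highlight": enable_highlight,
            "extracted": extracted,
            "error_message": error_message,
        }
        return self.post(
            _EXTRACTION_STATUS_ENDPOINT, data=payload, organization_id=organization_id
        )


client = RecordingClient()
client.mark_extraction_status("doc-1", "profile-1", "hash-1", False, organization_id="org-1")
endpoint, payload, org = client.calls[0]
print(endpoint, payload["extracted"], payload["error_message"], org)
```

Note that `error_message` is always included in the payload, as `None` on the success path, which matches the `str | null (optional)` entry in the endpoint's docstring.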
32 changes: 32 additions & 0 deletions workers/tests/test_legacy_executor_index.py
@@ -220,6 +220,38 @@ def test_reindex_passed_through(self, mock_get_fs, mock_indexing_deps):
         assert result.success is True
         init_call = mock_index_cls.call_args
         assert init_call.kwargs["processing_options"].reindex is True
+        # reindex=True with already-indexed doc must still call perform_indexing
+        mock_index_cls.return_value.perform_indexing.assert_called_once()
+
+    @patch(_PATCH_FS)
+    def test_already_indexed_no_reindex_short_circuits(
+        self, mock_get_fs, mock_indexing_deps
+    ):
+        """doc_id already in VDB and reindex=False → skip perform_indexing.
+
+        This is the defense-in-depth guard introduced for the IDE
+        re-indexing fix: even if the Redis cache misses and Answer Prompt
+        re-dispatches index, the executor must not re-write the same chunks
+        into the vector store.
+        """
+        mock_index_cls, mock_emb_cls, mock_vdb_cls = mock_indexing_deps
+        _register_legacy()
+        executor = ExecutorRegistry.get("legacy")
+
+        mock_index = _setup_mock_index(mock_index_cls, "doc-already-indexed")
+        mock_index.is_document_indexed.return_value = True
+        mock_emb_cls.return_value = MagicMock()
+        mock_vdb_cls.return_value = MagicMock()
+        mock_get_fs.return_value = MagicMock()
+
+        # reindex defaults to False
+        ctx = _make_index_context()
+        result = executor.execute(ctx)
+
+        assert result.success is True
+        assert result.data[IKeys.DOC_ID] == "doc-already-indexed"
+        mock_index.is_document_indexed.assert_called_once()
+        mock_index.perform_indexing.assert_not_called()


 # --- 5. VectorDB.close() always called ---