[Bug]: SDK API DELETE /datasets/{id}/chunks does not actually stop parsing task #11745

@tedhappy

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

cfdcceb

RAGFlow image version

v0.22.1

Other environment information

- Hardware parameters: Windows 11
- OS type: Windows
- Others: Using SDK API via HTTP

Actual behavior

Bug Description

When calling the SDK API DELETE /api/v1/datasets/{dataset_id}/chunks to stop document parsing, the background task continues and eventually completes, overwriting the CANCEL status with DONE.
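For reference, the stop call described above can be built roughly as follows. This is a minimal sketch, not taken from the RAGFlow docs: the helper name `build_stop_request`, the `Authorization: Bearer` header, and the JSON body key `document_ids` (inferred from the reproduction steps) are assumptions, and the base URL and API key are placeholders.

```python
import json
import urllib.request


def build_stop_request(base_url, api_key, dataset_id, document_ids):
    """Build the DELETE request that is expected to stop parsing.

    Hypothetical helper: the auth header format and the body shape are
    assumptions, not confirmed against the server code.
    """
    url = f"{base_url}/api/v1/datasets/{dataset_id}/chunks"
    body = json.dumps({"document_ids": list(document_ids)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Usage against a live server (placeholders, not executed here):
# req = build_stop_request("http://<host>:<port>", "<API_KEY>", "<dataset_id>", ["<doc_id>"])
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

Sending this while a document is in the RUNNING state and then checking the document status afterwards should be enough to observe the behavior reported here.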

Root Cause Analysis

Looking at the source code:

SDK API (api/apps/sdk/doc.py):

info = {"run": "2", "progress": 0, "chunk_num": 0}
DocumentService.update_by_id(id, info)
settings.docStoreConn.delete({"doc_id": doc[0].id}, search.index_name(tenant_id), dataset_id)

The SDK API only updates the database status but does NOT call cancel_all_task_of() to send the cancellation signal via Redis.

Expected behavior

The SDK API should call cancel_all_task_of(doc_id) to properly stop the background parsing task.
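The difference between the two behaviors can be illustrated with a small self-contained simulation. It does not use RAGFlow code: `FakeStore`, `stop_status_only`, `stop_with_signal`, and `run_worker` are hypothetical stand-ins, and the `cancel_flags` set only plays the role that the Redis cancellation signal sent by `cancel_all_task_of()` is described as playing in this report.

```python
class FakeStore:
    """Stand-in for the document table plus the Redis cancel flags."""

    def __init__(self):
        self.status = {}           # doc_id -> "RUNNING" | "CANCEL" | "DONE"
        self.cancel_flags = set()  # doc_ids whose tasks were told to stop


def stop_status_only(store, doc_id):
    # Mirrors the reported behavior: only the database row is updated.
    store.status[doc_id] = "CANCEL"


def stop_with_signal(store, doc_id):
    # Mirrors the expected behavior: update the row AND signal the worker.
    store.status[doc_id] = "CANCEL"
    store.cancel_flags.add(doc_id)  # stand-in for cancel_all_task_of(doc_id)


def run_worker(store, doc_id, steps, stop_at, stop_fn):
    """Simulate a parsing task; the client calls stop_fn at step `stop_at`."""
    store.status[doc_id] = "RUNNING"
    for step in range(steps):
        if step == stop_at:
            stop_fn(store, doc_id)
        if doc_id in store.cancel_flags:
            return store.status[doc_id]  # worker saw the signal and exits
        # ... parse one chunk here ...
    store.status[doc_id] = "DONE"        # overwrites whatever status was set
    return store.status[doc_id]


print(run_worker(FakeStore(), "doc1", 5, 2, stop_status_only))
print(run_worker(FakeStore(), "doc1", 5, 2, stop_with_signal))
```

In the first run the worker never learns about the stop request, so it finishes and overwrites CANCEL with DONE. In the second run the worker sees the flag and exits with the status left at CANCEL.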

Steps to reproduce

```Markdown
1. Upload a document to a dataset
2. Start parsing the document (status becomes RUNNING)
3. Call SDK API `DELETE /api/v1/datasets/{dataset_id}/chunks` with document_ids to stop parsing
4. Observe that the document parsing continues and eventually completes (status becomes DONE instead of CANCEL)
```

Additional information

Root Cause Analysis

The SDK API endpoint DELETE /datasets/{dataset_id}/chunks only updates database status but does NOT call cancel_all_task_of() to stop the background task.

SDK API (api/apps/sdk/doc.py, lines 842-844):

info = {"run": "2", "progress": 0, "chunk_num": 0}
DocumentService.update_by_id(id, info)
settings.docStoreConn.delete({"doc_id": doc[0].id}, search.index_name(tenant_id), dataset_id)
