Decoupling dependent chunks by ahgpt · Pull Request #788 · inclusionAI/AWorld

ahgpt · 2026-02-27T07:24:12Z

No description provided.

…ocking the caller. Added a new method for handling summary tasks asynchronously and improved error logging for background operations.

…d `get_knowledge_chunk` method to return `None` for backward compatibility and added logging for deprecation. Removed unused chunk-related methods and adjusted memory store initialization to support SQLite or in-memory options. Cleaned up environment configuration by removing obsolete parameters.
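The SQLite-or-in-memory selection mentioned in the commit message above might look roughly like the sketch below. This is not the PR's `_create_memory_store`: the two store classes and the variable names `MEMORY_STORE_BACKEND` and `MEMORY_STORE_SQLITE_PATH` are hypothetical placeholders, since the real names are not shown on this page.

```python
import os
import sqlite3
from typing import Mapping, Optional


class InMemoryStore:
    """Hypothetical stand-in for an in-memory backend."""

    def __init__(self) -> None:
        self.items = []


class SQLiteStore:
    """Hypothetical stand-in for a SQLite-backed backend."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.conn = sqlite3.connect(path)


def create_memory_store(env: Optional[Mapping[str, str]] = None):
    """Pick a backend from environment-style settings.

    Both variable names read here are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    backend = env.get("MEMORY_STORE_BACKEND", "inmemory").strip().lower()
    if backend == "sqlite":
        return SQLiteStore(env.get("MEMORY_STORE_SQLITE_PATH", ":memory:"))
    # Anything else falls back to the in-memory backend.
    return InMemoryStore()
```

Passing the settings in as a mapping keeps the selection logic testable without touching the process environment.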

gemini-code-assist · 2026-02-27T07:24:43Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a fundamental change in how knowledge and tool results are managed within the system. By moving away from granular chunk-based retrieval and indexing, the process of offloading and accessing context information becomes more streamlined and efficient. The focus shifts to treating entire artifacts as units, simplifying the architecture and improving responsiveness by offloading heavy operations like summary generation to background tasks. This change aims to enhance the system's performance and maintainability by reducing complexity in knowledge management.

Highlights

Chunk-based Retrieval Deprecation: The system has been refactored to deprecate and remove chunk-based knowledge retrieval. This simplifies knowledge handling by treating entire tool results as single artifacts, eliminating the need for chunking, embedding, and vector store indexing for context offloading.
Simplified Knowledge Offloading: The KnowledgeService.offload_by_workspace method no longer relies on complex chunking or indexing. It now processes the entire tool result as a single artifact, returning either the full content (if small) or a lightweight index for larger results.
Memory Store Initialization Refinement: Memory store initialization now dynamically selects between SQLite and InMemory backends based on environment variables, and the RetrieverFactory initialization has been removed, aligning with the deprecation of chunk-based retrieval.
Background Summary Generation: Summary generation for agent task memory is now executed as a background task using asyncio.create_task, preventing blocking of the main thread and improving performance.
Removal of Mem0Memory: The Mem0Memory implementation has been entirely removed, indicating a shift away from the mem0 library for memory management.
Tool Definition Updates: The get_knowledge_chunk tool has been removed from ContextKnowledgeAction and its corresponding execution logic. Existing calls to get_knowledge_chunk in example tools now log a deprecation warning and fall back to get_knowledge_by_lines.
New Documentation for Context Offloading: A new markdown document has been added, providing detailed information on the updated context offloading strategy, emphasizing the direct saving of tool results as single artifacts.
Environment Variable Cleanup: Numerous environment variables related to RAG, chunking, embeddings, and vector stores have been removed from the example .env_example file, reflecting the removal of these features.
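As a rough illustration of the "Simplified Knowledge Offloading" highlight, the sketch below stores a whole tool result as one artifact and returns either the full content or a short index. It is a toy model, not the code of `KnowledgeService.offload_by_workspace`: the `Workspace` class, the `MAX_INLINE_CHARS` threshold, and the index format are all assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical threshold; the PR summary does not state the real cut-off.
MAX_INLINE_CHARS = 2000


@dataclass
class Workspace:
    """Toy workspace mapping artifact id -> full tool result."""

    artifacts: Dict[str, str] = field(default_factory=dict)

    def add_artifact(self, content: str) -> str:
        artifact_id = uuid.uuid4().hex[:8]
        self.artifacts[artifact_id] = content
        return artifact_id


def offload_tool_result(workspace: Workspace, content: str) -> str:
    """Store the whole result as one artifact; return content or an index."""
    artifact_id = workspace.add_artifact(content)
    if len(content) <= MAX_INLINE_CHARS:
        # Small results go back to the caller unchanged.
        return content
    lines = content.splitlines()
    preview = lines[0][:80] if lines else ""
    # Large results are replaced by a lightweight pointer, with no chunks.
    return (
        f"[artifact {artifact_id}] {len(content)} chars, {len(lines)} lines\n"
        f"preview: {preview}\n"
        "Read a line range from the artifact for details."
    )
```

No chunking, embedding, or vector index is involved; the only lookup key is the artifact id.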

Changelog

aworld/core/context/amni/__init__.py
- Removed import of Chunk from retrieval.chunker.
- Updated get_knowledge_chunk abstract method to return Optional[Any] and added a deprecation note.
- Removed the need_index method and its associated comment, marking it as deprecated.
aworld/core/context/amni/config.py
- Added imports for MemoryStore and InMemoryMemoryStore.
- Introduced a new helper function _create_memory_store to dynamically select memory backend (SQLite or InMemory).
- Removed the initialization of RetrieverFactory from init_middlewares.
aworld/core/context/amni/services/knowledge_service.py
- Removed import of Chunk from retrieval.chunker.
- Modified get_knowledge_chunk method to return Optional[Any] and added a note indicating that chunk-based retrieval is no longer supported and will always return None.
- Removed several private methods related to chunk processing and indexing, including _need_index, _format_chunk_content, _get_knowledge_index_context, _get_artifact_statistics, _load_artifact_index_context, _load_artifact_content_context, and _load_artifact_chunks_by_workspace.
- Refactored offload_by_workspace to disable indexing (build_index=False) and to build a lightweight knowledge index without chunk details.
- Refactored load_context_by_workspace to no longer rely on chunk-level retrieval, instead querying artifacts directly and building a lightweight index based on artifact summaries.
- Removed an unused import of logger from aworld/logs/util in get_todo.
aworld/core/context/amni/tool/context_knowledge_tool.py
- Removed the GET_KNOWLEDGE_CHUNK tool action definition.
- Removed the elif block that handled ContextKnowledgeAction.GET_KNOWLEDGE_CHUNK in the do_step method.
aworld/core/context/amni/worksapces.py
- Removed imports for Chunk, ChunkIndex, ArtifactStats, SearchResults, and RetrieverFactory.
- Removed _retriever, retriever property, and vector_collection property from Workspace class.
- Updated add_artifact method to no longer call self.retriever.async_insert, indicating that retrieval/chunk index is no longer maintained.
- Removed all methods related to artifact chunk CRUD operations, such as get_artifact_chunks_index, async_query_artifact_index, search_artifact_chunks_index, search_artifact_chunks, get_artifact_chunk, get_artifact_chunks_by_range, get_artifact_chunk_indices_by_range, get_artifact_chunks_head_and_tail, and get_artifact_chunk_indices_middle_range.
aworld/memory/main.py
- Modified _add method to run _summary_agent_task_memory as a background task using asyncio.create_task.
- Added a new asynchronous method _run_summary_in_background to handle background summary generation and error logging.
- Changed the role from 'assistant' to 'user' for summary messages in _gen_multi_rounds_summary.
aworld/memory/mem0/mem0_memory.py
- Removed the entire file, indicating the removal of Mem0Memory implementation.
aworld/runners/event_runner.py
- Initialized resp = None before the try block in do_run.
docs/_ahdocs/upgrade_context_offload_memory.md
- Added a new documentation file detailing the new context offload strategy, focusing on saving entire tool results as single artifacts and accessing them via knowledge interfaces.
examples/skill_agent/.env_example
- Removed numerous environment variables related to AMNI RAG configuration, chunking, embeddings, vector stores (Chroma, Elasticsearch), and rerankers.
examples/skill_agent/mcp_tools/contextserver.py
- Removed the entire file, including get_knowledge, get_knowledge_chunk, get_knowledge_by_lines, grep_knowledge, list_knowledge_info, add_knowledge, update_knowledge, and add_todo tools.
examples/skill_agent/quick_start.py
- Updated the user_input example in the if __name__ == '__main__': block.
examples/xbench/mcp_tools/contextserver.py
- Modified get_knowledge_chunk to log a deprecation warning and fall back to calling get_knowledge_by_lines with a default range.
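The last changelog entry describes a deprecation shim: the old tool keeps its name, logs a warning, and delegates to line-based access. A minimal sketch of that pattern follows. The in-memory `_KNOWLEDGE` dict, the parameter names, and the default range of lines 1 to 100 are assumptions for illustration, not values taken from the PR.

```python
import logging
from typing import Dict

logger = logging.getLogger("contextserver_sketch")

# Toy artifact storage standing in for the real workspace.
_KNOWLEDGE: Dict[str, str] = {"k1": "alpha\nbeta\ngamma\ndelta"}

# Hypothetical default range used by the fallback.
DEFAULT_START_LINE = 1
DEFAULT_END_LINE = 100


def get_knowledge_by_lines(knowledge_id: str, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line (1-based, inclusive)."""
    lines = _KNOWLEDGE[knowledge_id].splitlines()
    return "\n".join(lines[start_line - 1:end_line])


def get_knowledge_chunk(knowledge_id: str, chunk_index: int = 0) -> str:
    """Deprecated shim: chunk_index is ignored because chunks no longer exist."""
    logger.warning(
        "get_knowledge_chunk is deprecated; use get_knowledge_by_lines instead"
    )
    return get_knowledge_by_lines(knowledge_id, DEFAULT_START_LINE, DEFAULT_END_LINE)
```

Existing callers keep working, but they receive a line range rather than a chunk, so any code that depended on chunk boundaries still needs to be migrated.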

Activity

The pull request introduces significant architectural changes to knowledge and context management.
No specific human comments or reviews were provided in the context.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively decouples the system from the chunk-based knowledge retrieval mechanism, which is a significant and positive refactoring. The changes simplify the knowledge service and workspace implementations by removing complex chunking, indexing, and retrieval logic. I've also noticed a good improvement in aworld/memory/main.py where summary generation is now non-blocking.
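The non-blocking summary generation praised above can be sketched with `asyncio.create_task`. The class below is a simplified stand-in, not the code in aworld/memory/main.py: `MemorySketch`, `_summarize`, `drain`, and the sleep that imitates a slow model call are all assumptions. The sketch also keeps a strong reference to each task, which the asyncio documentation recommends so that a pending task is not garbage-collected.

```python
import asyncio
import logging
from typing import List, Set

logger = logging.getLogger("memory_sketch")


class MemorySketch:
    def __init__(self) -> None:
        self.items: List[str] = []
        self.summaries: List[str] = []
        # Strong references keep pending tasks from being garbage-collected.
        self._background: Set[asyncio.Task] = set()

    async def _summarize(self, item: str) -> str:
        await asyncio.sleep(0.01)  # stands in for a slow LLM call
        if not item:
            raise ValueError("nothing to summarize")
        return item[:10]

    async def _run_summary_in_background(self, item: str) -> None:
        try:
            self.summaries.append(await self._summarize(item))
        except Exception:
            # Nobody awaits this task, so log the failure instead of raising.
            logger.exception("background summary failed")

    async def add(self, item: str) -> None:
        self.items.append(item)
        task = asyncio.create_task(self._run_summary_in_background(item))
        self._background.add(task)
        task.add_done_callback(self._background.discard)

    async def drain(self) -> None:
        """Wait for outstanding summaries (useful in tests and on shutdown)."""
        if self._background:
            await asyncio.gather(*list(self._background))


async def demo() -> MemorySketch:
    memory = MemorySketch()
    await memory.add("a long tool result")
    await memory.add("")  # fails in the background and is only logged
    assert memory.summaries == []  # add() returned before any summary ran
    await memory.drain()
    return memory
```

Running `asyncio.run(demo())` shows the trade-off: `add` returns immediately, but errors surface only in the log, so the wrapper's error handling is what keeps failures visible.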

My main feedback is regarding some leftover parameters in a refactored function signature in knowledge_service.py, which could lead to confusion and is a breaking change that needs to be propagated to its call site. Please see the specific comment for details.

gemini-code-assist · 2026-02-27T07:27:47Z

aworld/core/context/amni/services/knowledge_service.py

                                        top_k: int = 20, load_content: bool = True, load_index: bool = True, 
                                        search_by_index: bool = True) -> str:


The parameters load_content, load_index, and search_by_index are no longer used in the new implementation of this function. To improve code clarity and maintainability, they should be removed from the function signature.

Note that this is a breaking change. You will also need to update the call to this function in aworld.core.context.amni.ApplicationContext.load_context_by_workspace to remove these arguments.

Suggested replacement for the two signature lines quoted above:

top_k: int = 20) -> str:
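To see why the reviewer calls this a breaking change, consider the sketch below. Python raises `TypeError` when a function receives a keyword argument it does not declare, so a call site that still forwards a removed flag fails after the signature is trimmed. The function names and the leading `query` parameter are hypothetical, because the page shows only the tail of the real signature.

```python
from typing import Optional


def load_context_old(query: Optional[str] = None, top_k: int = 20,
                     load_content: bool = True, load_index: bool = True,
                     search_by_index: bool = True) -> str:
    """Old shape: the last three flags are accepted but never read."""
    return f"query={query!r}, top_k={top_k}"


def load_context_new(query: Optional[str] = None, top_k: int = 20) -> str:
    """Trimmed shape along the lines of the review suggestion."""
    return f"query={query!r}, top_k={top_k}"


def call_site_before_update() -> str:
    # A caller that still forwards a removed flag; this raises TypeError.
    return load_context_new("report", top_k=5, load_index=True)
```

Both versions return the same result for valid arguments, so the only migration work is deleting the stale keyword arguments at each call site.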

ahgpt added 3 commits February 27, 2026 14:39

Implement background summary generation in AworldMemory to prevent blocking the caller. Added a new method for handling summary tasks asynchronously and improved error logging for background operations.

ce099bf

fixed

36a570d

ahgpt requested a review from tallate February 27, 2026 07:24

fixed

c07f37f

gemini-code-assist bot reviewed Feb 27, 2026

tallate approved these changes Feb 27, 2026

tallate merged commit af19b29 into main Feb 27, 2026
1 check passed

tallate merged 4 commits into main from feat/memory_file_support
