Proposal: Simplify `ToolDefinition` Architecture

## Problem Statement

The current tool system has multiple overlapping concepts (`Tool`, `ToolDefinition`, `ToolBase`) that make the architecture confusing and inconsistent:

1. **Inconsistent tool definition patterns**: Tools are defined in two different ways:
   - **Direct instances**: `FinishTool = ToolDefinition(...)` with executor already initialized
   - **Subclasses**: `class BashTool(ToolDefinition): ...` with custom `.create()` method

2. **Ambiguous `.create()` contract**: 
   - Direct instance tools (like `FinishTool`, `ThinkTool`) have a `.create()` method that raises `NotImplementedError`
   - Subclass tools (like `BashTool`, `FileEditorTool`) implement `.create()` to initialize the executor
   - This makes the contract unclear and forces defensive programming

3. **Confusion about naming**:
   - `Tool` (spec.py): Serializable spec with `name` and `params`
   - `ToolDefinition`: Concrete tool that can be executed
   - `ToolBase`: Abstract base class for tools
   - The relationship and purpose of each is not immediately clear

## Current Architecture

```python
# spec.py - Serializable spec passed to Agent
class Tool(BaseModel):
    name: str
    params: dict[str, Any]

# tool.py - Abstract base
class ToolBase[ActionT, ObservationT](ABC):
    name: str
    description: str
    action_type: type[Action]
    observation_type: type[Observation] | None
    executor: ToolExecutor | None  # Optional!
    
    @abstractmethod
    def create(cls, *args, **kwargs) -> Sequence[Self]:
        ...

# tool.py - Concrete implementation
class ToolDefinition[ActionT, ObservationT](ToolBase):
    @classmethod
    def create(cls, *args, **kwargs) -> Sequence[Self]:
        raise NotImplementedError(...)  # Hacky!
```

**Usage Pattern 1 (Direct Instance):**
```python
# builtins/finish.py
FinishTool = ToolDefinition(
    name="finish",
    action_type=FinishAction,
    observation_type=FinishObservation,
    description=TOOL_DESCRIPTION,
    executor=FinishExecutor(),  # Already initialized
)
```

**Usage Pattern 2 (Subclass):**
```python
# tools/execute_bash/definition.py
class BashTool(ToolDefinition[ExecuteBashAction, ExecuteBashObservation]):
    @classmethod
    def create(cls, conv_state, **params) -> Sequence["BashTool"]:
        executor = BashExecutor(working_dir=conv_state.workspace.working_dir, ...)
        return [cls(
            name="execute_bash",
            action_type=ExecuteBashAction,
            observation_type=ExecuteBashObservation,
            description=TOOL_DESCRIPTION,
            executor=executor,  # Initialized in create()
        )]
```

## Constraints to Satisfy

1. **Serializable and lazy**: Tools passed to `Agent` class must be serializable (for remote conversation) and lazy (should not materialize until agent is initialized)
   - Currently satisfied by `Tool` (spec) with `name` and `params`
   
2. **Two-level architecture**: We need at least two things:
   - A "spec" version of the tool that's serializable (`Tool`)
   - Actual tool instance that can be executed (`ToolDefinition`)

3. **Flexible executor initialization**: Ability to decide whether to provide tool executor during tool initialization (e.g., for sharing executors among multiple tool instances)

## Proposed Solution

**Make all tools follow the subclass pattern with a clear contract:**

1. **All tools should be subclasses of `ToolDefinition`**, not direct instances
2. **Make `.create()` an `@abstractmethod`** in `ToolBase` so the contract is explicit
3. **Each tool class implements `.create()`** which returns instances with executor populated
4. **For simple built-in tools**, create minimal subclasses instead of direct instances

### Example Refactoring

**Before (Direct Instance):**
```python
FinishTool = ToolDefinition(
    name="finish",
    action_type=FinishAction,
    observation_type=FinishObservation,
    description=TOOL_DESCRIPTION,
    executor=FinishExecutor(),
)
```

**After (Subclass):**
```python
class FinishTool(ToolDefinition[FinishAction, FinishObservation]):
    @classmethod
    def create(cls, conv_state=None, **params) -> Sequence["FinishTool"]:
        if params:
            raise ValueError("FinishTool doesn't accept parameters")
        return [cls(
            name="finish",
            action_type=FinishAction,
            observation_type=FinishObservation,
            description=TOOL_DESCRIPTION,
            executor=FinishExecutor(),
            annotations=ToolAnnotations(...),
        )]
```

## Benefits

1. **Predictable pattern**: All tools follow the same subclass pattern
2. **Clear contract**: `.create()` is `@abstractmethod`, making the interface explicit
3. **Consistent naming**: Tool classes have reliable names (e.g., `BashTool.name`)
4. **Better for registration**: Registry can reference tools by class name without instantiating
5. **Type safety**: Eliminates the `NotImplementedError` hack
6. **Maintains flexibility**: Executor can still be initialized in `.create()` with custom parameters

## Alternative Considered

**Option B: Lazy initialization with futures/proxies**

Figure out some hacks for `ToolDefinition.create()` to return a serializable "future" that materializes on demand:

```python
# Hypothetical proxy pattern
class ToolProxy(BaseModel):
    _tool_class: type[ToolDefinition]
    _params: dict[str, Any]
    
    def materialize(self, conv_state) -> ToolDefinition:
        return self._tool_class.create(conv_state, **self._params)[0]
```

**Why we prefer Option A (subclass pattern):**
- Simpler and more straightforward
- No magic or proxy objects
- Existing registry pattern already handles lazy initialization
- Better aligns with Python idioms

## Implementation Plan

1. Refactor built-in tools (`FinishTool`, `ThinkTool`) to be subclasses
2. Make `ToolBase.create()` an `@abstractmethod`
3. Remove the `NotImplementedError` in `ToolDefinition.create()`
4. Update tests to verify all tools follow the pattern
5. Update documentation to clarify the architecture

## Open Questions

1. Should we rename `ToolDefinition` to something clearer (e.g., `ToolClass`, `ToolFactory`)?
2. Should we consider making `executor` non-optional in `ToolDefinition` (with a separate "ToolSpec" for serialization)?
3. How do we handle backward compatibility for users who may have created direct `ToolDefinition` instances?

## Related

- Original discussion: https://github.com/OpenHands/software-agent-sdk/pull/862
- TODO comment in code: https://github.com/OpenHands/agent-sdk/issues/493

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Proposal: Simplify `ToolDefinition` Architecture #970

Problem Statement

Current Architecture

Constraints to Satisfy

Proposed Solution

Example Refactoring

Benefits

Alternative Considered

Implementation Plan

Open Questions

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposal: Simplify ToolDefinition Architecture #970

Description

Problem Statement

Current Architecture

Constraints to Satisfy

Proposed Solution

Example Refactoring

Benefits

Alternative Considered

Implementation Plan

Open Questions

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Proposal: Simplify `ToolDefinition` Architecture #970