feat(backup): add user asset backup & migration module #2457
leoleils wants to merge 21 commits into agentscope-ai:main
Add a complete backup and migration system for CoPaw workspace assets, enabling daily auto-backup, cross-device migration, and selective export/import of user data.

New module: src/copaw/backup/
- models.py: data models (AssetType, AssetManifest, ExportOptions, etc.)
- errors.py: InvalidAssetPackageError, IncompatibleVersionError, etc.
- sanitizer.py: preference file sensitive field redaction
- version_checker.py: version compatibility, strict validation, migration
- exporter.py: asset export engine with concurrency-safe reads
- importer.py: asset import engine with conflict resolution
- scheduler.py: backup scheduler with retention policy

CLI commands:
- copaw assets export/import/verify
- copaw backup run/list/restore
- Support --all flag for multi-agent batch operations

Integration:
- src/copaw/app/crons/backup_job.py: cron job builder
- src/copaw/app/workspace/service_factories.py: service registration
- src/copaw/cli/main.py: lazy-loaded command registration

Asset types: preferences, memories, skills, tools, global_config

Security:
- Sensitive fields auto-redacted on export, preserved on import
- ZIP path traversal prevention, size limits (500MB/1GB)
- SHA256 checksum verification per asset

Tests: 93 tests (14 property-based + unit), all passing

Pre-commit: flake8, mypy, black pass; pylint 9.83/10
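The package checks named under Security (path traversal prevention, size limits, per-asset SHA256) could look roughly like the sketch below. It is an illustration only, not the code in this PR: `is_safe_member`, `verify_package`, the `checksums` mapping, and the reading of 500MB/1GB as the archive limit and the total uncompressed limit are all assumptions.

```python
import hashlib
import io
import zipfile
from pathlib import PurePosixPath

# Assumed interpretation of the "500MB/1GB" limits in the description.
MAX_ARCHIVE_BYTES = 500 * 1024 * 1024
MAX_TOTAL_BYTES = 1024 * 1024 * 1024


def is_safe_member(name: str) -> bool:
    """Reject absolute paths, drive-letter paths, and '..' components."""
    normalized = name.replace("\\", "/")
    if normalized.startswith("/") or ":" in normalized.split("/")[0]:
        return False
    return ".." not in PurePosixPath(normalized).parts


def verify_package(
    data: bytes,
    checksums: dict,
    max_archive: int = MAX_ARCHIVE_BYTES,
    max_total: int = MAX_TOTAL_BYTES,
) -> list:
    """Return a list of problems; an empty list means every check passed."""
    if len(data) > max_archive:
        return ["archive exceeds size limit"]
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        # Sizes come from the ZIP headers, so nothing is extracted yet.
        if sum(info.file_size for info in infos) > max_total:
            return ["uncompressed content exceeds size limit"]
        for info in infos:
            if not is_safe_member(info.filename):
                problems.append(f"unsafe path: {info.filename}")
        names = {info.filename for info in infos}
        for name, expected in checksums.items():
            if name not in names:
                problems.append(f"missing asset: {name}")
            elif is_safe_member(name):
                actual = hashlib.sha256(zf.read(name)).hexdigest()
                if actual != expected:
                    problems.append(f"checksum mismatch: {name}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a `verify`-style command report everything wrong with a package in a single pass.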
Hi @leoleils, this is your 8th Pull Request. 🙌

Join Developer Community

Thanks so much for your contribution! We'd love to invite you to join the official CoPaw developer group! You can find the Discord and DingTalk group links under the "Developer Community" section on our docs page:

We truly appreciate your enthusiasm and look forward to your future contributions! 😊

We'll review your PR soon.
Code Review
This pull request introduces a comprehensive backup and migration system for CoPaw, including engines for asset export and import, sensitive data sanitization, and version compatibility management. It also adds a backup scheduler with retention policies and new CLI commands for manual asset operations. The review feedback highlights opportunities to improve error visibility by logging tool collection failures, refine exception handling during manifest validation, and eliminate code duplication by centralizing shared utility functions. Additionally, it is recommended to use English for all error messages to maintain consistency across the codebase.
src/copaw/backup/exporter.py
Outdated
    except (json.JSONDecodeError, UnicodeDecodeError):
        pass
Silently ignoring JSON parsing errors when collecting tools can be confusing for users, as they might not realize why their tools were not backed up. It's better to log a warning to provide visibility into such issues.
    - except (json.JSONDecodeError, UnicodeDecodeError):
    -     pass
    + except (json.JSONDecodeError, UnicodeDecodeError) as exc:
    +     logger.warning("Failed to parse tools from %s, skipping tool collection: %s", agent_json, exc)
src/copaw/backup/importer.py
Outdated
    except (json.JSONDecodeError, Exception) as exc:
        raise InvalidAssetPackageError(
            f"Invalid manifest.json: {exc}",
        ) from exc
Catching the broad Exception class can hide unexpected bugs and make debugging more difficult. When validating the manifest with Pydantic's model_validate, a pydantic.ValidationError is the expected exception on failure. Please catch more specific exceptions here. You may need to add from pydantic import ValidationError at the top of the file.
src/copaw/backup/scheduler.py
Outdated
    def _get_agent_id(workspace_dir: Path) -> str:
        """Read agent id from workspace agent.json."""
        import json

        agent_json = workspace_dir / "agent.json"
        if agent_json.exists():
            try:
                data = json.loads(agent_json.read_text(encoding="utf-8"))
                return data.get("id", "unknown")
            except (json.JSONDecodeError, OSError):
                pass
        return "unknown"
This _get_agent_id function is a duplicate of the one in src/copaw/backup/exporter.py. To follow the DRY (Don't Repeat Yourself) principle and improve maintainability, this function should be moved to a shared utility module (e.g., src/copaw/backup/utils.py) and imported in both scheduler.py and exporter.py.
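A shared helper along the lines suggested here might look like the following sketch. The module location (`src/copaw/backup/utils.py`) comes from the comment above; the public name `get_agent_id`, the `default` parameter, and the extra guard for non-object JSON are assumptions added for illustration.

```python
import json
from pathlib import Path


def get_agent_id(workspace_dir: Path, default: str = "unknown") -> str:
    """Read the agent id from <workspace_dir>/agent.json.

    Falls back to `default` when the file is missing, unreadable,
    not valid JSON, or does not contain an "id" key.
    """
    agent_json = workspace_dir / "agent.json"
    try:
        data = json.loads(agent_json.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # OSError covers a missing or unreadable file; ValueError covers
        # both json.JSONDecodeError and UnicodeDecodeError.
        return default
    if isinstance(data, dict):
        return str(data.get("id", default))
    return default
```

Both `scheduler.py` and `exporter.py` could then import this one function instead of keeping private copies.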
src/copaw/backup/version_checker.py
Outdated
        """
        match = _VERSION_RE.match(version_str)
        if not match:
            raise ValueError(f"无效的版本格式: {version_str}")
This error message is in Chinese, which is inconsistent with the rest of the codebase (docstrings, comments, other logs, and CLI help text are in English). For consistency and to make the code more accessible to a wider range of contributors, it's recommended to use English for all code-level messages. This applies to other Chinese-language strings in this module and in importer.py as well. If internationalization is a goal, using a dedicated i18n library would be a better approach.
    - raise ValueError(f"无效的版本格式: {version_str}")
    + raise ValueError(f"Invalid version format: {version_str}")
    def _get_workspace_dir(agent_id: str = "default") -> Path:
        """Resolve workspace directory for the given agent."""
        try:
            config = load_config()
            if agent_id in config.agents.profiles:
                ref = config.agents.profiles[agent_id]
                return Path(ref.workspace_dir).expanduser()
        except Exception:
            pass
        return WORKING_DIR


    def _get_all_agent_ids() -> list[str]:
        """Return all configured agent IDs."""
        try:
            config = load_config()
            return list(config.agents.profiles.keys())
        except Exception:
            return ["default"]
The helper functions _get_workspace_dir and _get_all_agent_ids are duplicated from src/copaw/cli/assets_cmd.py. To avoid code duplication and improve maintainability, these functions should be extracted into a shared utility module within the cli package, such as src/copaw/cli/utils.py, and then imported where needed.
- Remove unused variable in test_sanitizer_properties.py
- Remove unused import (Any) in test_scheduler.py
- Remove unused variable (export_result) in test_scheduler.py
- Simplify empty list assertion in test_scheduler.py
- Remove unused imports (_MIGRATIONS, get_migration_path) in test_version_checker_properties.py

- Remove unused imports (json, ConflictInfo, ExportResult) in test_models
- Simplify empty list assertions to use implicit booleaness
- Fix unused arguments in MockMemoryManager (test_integration)
- Refactor roundtrip test to reduce statement count (R0915)
- Extract _get_agent_id to shared backup/utils.py (DRY)
- Replace broad Exception catch with ValidationError in importer
- Add warning log for tool collection parse failures in exporter
- Use English for error messages in version_checker
- Extract CLI helpers to avoid duplication between assets/backup cmds

- Remove unused InsufficientStorageError import (test_importer.py)
- Remove unused result variable (test_importer.py)
- Remove unused HealthCheck import (test_importer_properties.py)
- Add pylint disable for intentional protected-access in tests
- Rename unused conflicts variable to _conflicts

- Fix E1133 non-iterable in assets_cmd.py (extract types_list)
- Add 'from exc' to all raise SystemExit(1) in CLI commands
- Remove unused imports in test_exporter.py
- Add pylint disable for unused-argument in FakeMemoryManager
- Prefix unused result variable with underscore in test_exporter
- Add pylint disable for protected-access in test_importer.py

- Fix F821: add missing ExportResult import in integration tests
- Fix E501: break all long lines to ≤79 chars across test files
- Fix F841: remove unused variable in test_exporter
- Fix C0411: correct import order (pydantic before copaw) in importer
- Fix E1133: use types_list pattern for non-iterable in assets_cmd
- Fix W0613: add pylint disable for unused args in test mocks
- Fix W0212: add pylint disable for protected-access in tests
- Run black + add-trailing-comma to match CI formatting

- Move pylint disable comments to the def line (not return type) for too-many-branches and too-many-statements
- Move protected-access disable to the access line in service_factories
- All pre-commit checks now pass locally: black, flake8, mypy, pylint

# Conflicts:
#   src/copaw/cli/main.py

- Core backup module: models, sanitizer, version checker, exporter, importer, scheduler
- Backup cron job registration
- CLI commands: copaw assets export/import/verify, copaw backup run/list/restore
- REST API router: /backup/export, /import, /list, /restore, /config
- Backup service factory registered in workspace
- Console backup page with auto-backup settings, history, export/import
- Agent selector for export with select-all support
- i18n translations for en, zh, ja, ru
- 94 unit + property-based tests (hypothesis)

…rt, export result with download button
…port success message
e7737c5 to 47e0a88
Description
Add a complete backup and migration system for CoPaw workspace assets. Users can export/import preferences, memories, skills, tools, and global config as portable ZIP packages, with daily auto-backup via cron, version compatibility checking, sensitive field redaction, and multi-agent batch support.
Related Issue: Relates to user requests for workspace backup and cross-device migration capability.
Security Considerations: Sensitive fields (api_key, bot_token, password, etc.) are auto-redacted on export. On import, existing sensitive values are preserved rather than overwritten by the redacted placeholders. ZIP path traversal prevention and size limits (500MB/1GB) are enforced, and each asset is verified against a SHA256 checksum.
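The redact-on-export, preserve-on-import behaviour described above can be sketched as follows. This is a hedged illustration and not the PR's sanitizer: the placeholder string, the function names `redact` and `merge_preserving_secrets`, and the three-key `SENSITIVE_KEYS` set (the description lists these and "etc.") are assumptions, and only nested dictionaries are handled.

```python
from typing import Any

REDACTED = "***REDACTED***"  # hypothetical placeholder value
SENSITIVE_KEYS = {"api_key", "bot_token", "password"}  # partial, assumed list


def redact(value: Any) -> Any:
    """Return a copy of `value` with sensitive dict fields replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    return value


def merge_preserving_secrets(imported: Any, existing: Any) -> Any:
    """Apply imported settings, keeping local values behind placeholders.

    Wherever the imported data holds the placeholder, the value already
    present locally is kept (None if there is no local value).
    """
    if isinstance(imported, dict):
        local = existing if isinstance(existing, dict) else {}
        return {
            key: merge_preserving_secrets(item, local.get(key))
            for key, item in imported.items()
        }
    if imported == REDACTED:
        return existing
    return imported
```

With this shape, importing a package on a new device leaves secrets unset until the user supplies them, while importing onto an existing workspace never replaces a working credential with a placeholder.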
Type of Change
Component(s) Affected
Checklist
- Ran pre-commit run --all-files locally and it passes
- Ran tests locally (pytest or as relevant) and they pass
Testing