Core: Enable hash-based change detection for notebook translator #203
Merged: skytin1004 merged 25 commits into Azure:main from skytin1004:enable-notebook-translator on Aug 17, 2025
+1,286
−283
… up-to-date notebooks
- Add calculate_string_hash for per-cell change detection
- Store original_hash/language_code in notebook.metadata.coopTranslator
- Store source_hash in each markdown cell’s metadata (cell.metadata.coopTranslator)
- Reuse unchanged cells; only retranslate modified cells
- Skip notebook translation when original_hash matches unless update=True
Affects: JupyterNotebookTranslator, TranslationManager, metadata_utils
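As a rough illustration of the per-cell reuse described in this commit, the sketch below hashes each markdown cell's source and retranslates only cells whose hash is not found in the previous translation. Only `calculate_string_hash`, `source_hash`, and the `coopTranslator` metadata key come from the commit message; the SHA-256 choice, the `translate_cells` helper, its parameters, and the pass-through of non-markdown cells are assumptions made for the example, not the PR's actual implementation.

```python
import hashlib


def calculate_string_hash(text: str) -> str:
    """Return a hex digest of the text (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def translate_cells(source_cells, previous_cells, translate):
    """Hypothetical helper: reuse markdown cells whose source hash is unchanged.

    `previous_cells` is the cell list of an earlier translated notebook;
    `translate` is any callable mapping source text to translated text.
    """
    # Index earlier translations by the source hash stored in their metadata.
    reusable = {}
    for cell in previous_cells:
        meta = cell.get("metadata", {}).get("coopTranslator", {})
        if "source_hash" in meta:
            reusable[meta["source_hash"]] = cell

    result = []
    for cell in source_cells:
        if cell.get("cell_type") != "markdown":
            result.append(cell)  # assumption: non-markdown cells pass through
            continue
        source = cell["source"]
        text = "".join(source) if isinstance(source, list) else source
        digest = calculate_string_hash(text)
        if digest in reusable:
            result.append(reusable[digest])  # unchanged: keep old translation
        else:
            new_cell = dict(cell)
            new_cell["source"] = translate(text)
            new_cell["metadata"] = {
                **cell.get("metadata", {}),
                "coopTranslator": {"source_hash": digest},
            }
            result.append(new_cell)
    return result
```

Keying the lookup on the hash rather than the cell position means a cell that merely moved within the notebook is still reused; whether the PR matches by position or by hash is not stated in the commit message.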
- Update error messages from Computer Vision to Azure AI Service
- Change environment variable references from AZURE_COMPUTER_VISION_KEY to AZURE_AI_SERVICE_API_KEY
- Rename AzureComputerVisionConfig class to AzureAIVisionConfig
- Update docstrings and comments to reflect Azure AI Service branding
Notebooks were always considered outdated because _is_translation_outdated was looking for HTML comment metadata instead of JSON metadata format.
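To show why a comment-based lookup never matches a notebook, here is a minimal sketch of the two lookups side by side. The comment pattern `CO_OP_TRANSLATOR_METADATA` and both function names are hypothetical stand-ins; only the `metadata.coopTranslator` location is taken from the commits above.

```python
import json
import re

# Hypothetical comment pattern standing in for Markdown-style metadata.
COMMENT_RE = re.compile(r"<!--\s*CO_OP_TRANSLATOR_METADATA:(.*?)-->", re.DOTALL)


def read_comment_metadata(text: str):
    """Markdown-style lookup: parse JSON embedded in an HTML comment."""
    match = COMMENT_RE.search(text)
    return json.loads(match.group(1)) if match else None


def read_notebook_metadata_from_text(text: str):
    """Notebook-style lookup: read metadata.coopTranslator from the JSON document."""
    try:
        notebook = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(notebook, dict):
        return None
    return notebook.get("metadata", {}).get("coopTranslator")


notebook_text = json.dumps(
    {
        "cells": [],
        "metadata": {"coopTranslator": {"original_hash": "abc", "language_code": "ko"}},
    }
)

# The comment-based lookup finds nothing in a notebook, so a check built on it
# would report the translation as outdated every time.
print(read_comment_metadata(notebook_text))
print(read_notebook_metadata_from_text(notebook_text))
```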
- docs: fix CLI option inconsistencies in command-reference.md
- docs: resolve README.md markdown linting errors
- docs: add beta warning for evaluation functionality
- fix: remove duplicate method definition in font_config.py
- docs: add evaluation command documentation and examples
Collaborator
Author
I have reviewed the changes and everything looks good.
Labels
- build: Related to the build process, dependency management, and CI/CD configurations
- core: Related to any changes in core source files
- documentation: Improvements or additions to documentation
- tests
Enable hash-based change detection for notebook translator
Purpose
Implements intelligent change detection for Jupyter notebook translation to optimize translation efficiency and ensure translations are always up-to-date. Previously, notebook translations couldn't detect when source files changed, leading to unnecessary re-translations or outdated translations.
Description
This PR adds comprehensive change detection capabilities to the notebook translator:
Key Features:
- … coopTranslator metadata to translated notebooks for tracking
Technical Changes:
- … notebook_utils.py to metadata_utils.py for better organization
- … add_notebook_metadata(), read_notebook_metadata(), and is_notebook_up_to_date() functions
- … JupyterNotebookTranslator to automatically add metadata after translation
- … retranslate_outdated_files() to use appropriate translator based on file extension
Related Issue
Closes #[issue_number] (if applicable)
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
This change is backward compatible. Existing translated notebooks without metadata will be treated as outdated and retranslated with the new metadata system.
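The backward-compatibility behavior can be pictured as a small decision function. This is a sketch only: `needs_translation` and its parameters are hypothetical names rather than the signature of `is_notebook_up_to_date()`, and it assumes the stored metadata is a dict carrying the `original_hash` key mentioned in the commits.

```python
def needs_translation(source_hash, translated_metadata, update=False):
    """Hypothetical helper: decide whether a notebook should be (re)translated.

    `translated_metadata` is the coopTranslator dict read from the translated
    notebook, or None when the notebook predates the metadata system.
    """
    if update:
        return True  # an explicit update request always retranslates
    if not translated_metadata:
        return True  # no metadata: treat the translation as outdated
    # Otherwise retranslate only when the source content has changed.
    return translated_metadata.get("original_hash") != source_hash
```

Under these assumptions, a legacy notebook is retranslated once, picks up metadata in the process, and is skipped on later runs until its source changes.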
Type of change
Checklist
Before submitting your pull request, please confirm the following:
Additional context
Files Modified:
- src/co_op_translator/utils/common/metadata_utils.py - Added notebook metadata functions
- src/co_op_translator/core/llm/jupyter_notebook_translator.py - Added metadata integration
- src/co_op_translator/core/project/translation_manager.py - Enhanced retranslation logic
- tests/ - Added comprehensive test coverage
Testing:
Performance Impact:
This enhancement brings notebook translation in line with the existing markdown translation optimization, providing a consistent and efficient translation experience across all supported file types.
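For the extension-based translator selection listed under Technical Changes, a minimal sketch might look like the following. The `select_translator` helper and its parameters are hypothetical, and treating every non-.ipynb file as markdown is an assumption for the example; the actual logic inside retranslate_outdated_files() is not shown in this PR description.

```python
from pathlib import Path


def select_translator(file_path, notebook_translator, markdown_translator):
    """Hypothetical helper: pick a translator from the file extension."""
    if Path(file_path).suffix.lower() == ".ipynb":
        return notebook_translator
    # Assumption: anything that is not a notebook goes to the markdown translator.
    return markdown_translator
```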