Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Merge Protections: your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit: this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
✅ DCO Check Passed. Thanks @cau-git, all your commits are properly signed off. 🎉
…gine-object-detection
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
… engines

- Add abstract get_label_mapping() method to BaseObjectDetectionEngine
- Implement label loading from config.json in OnnxRuntimeObjectDetectionEngine
- Refactor LayoutObjectDetectionModel to use engine-provided labels instead of hardcoded mapping
- Centralizes label mapping logic in the inference engine layer

This eliminates hardcoded label dictionaries and makes label mappings configurable through model configs.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
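The label-mapping change described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `load_id2label` and `ConfigLabelEngine` are hypothetical names, the base class here is only a stand-in for the one named in the commit, and reading the `id2label` key (the usual Hugging Face config convention) is an assumption about what the engine reads from config.json.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


def load_id2label(config_path: Path) -> Dict[int, str]:
    """Read a class-index -> label-name mapping from a model config file.

    Assumption: the config stores labels under "id2label" with string keys,
    as Hugging Face model configs usually do.
    """
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    raw = config.get("id2label")
    if not raw:
        raise ValueError(f"No 'id2label' entry found in {config_path}")
    # JSON object keys are always strings, so convert them back to ints.
    return {int(idx): str(name) for idx, name in raw.items()}


class BaseObjectDetectionEngine(ABC):
    """Stand-in for the base engine; only the label accessor is sketched."""

    @abstractmethod
    def get_label_mapping(self) -> Dict[int, str]:
        """Return the mapping from class index to label name."""


class ConfigLabelEngine(BaseObjectDetectionEngine):
    """Hypothetical engine that takes its labels from <model_dir>/config.json."""

    def __init__(self, model_dir: Path) -> None:
        self._labels = load_id2label(model_dir / "config.json")

    def get_label_mapping(self) -> Dict[int, str]:
        # Return a copy so callers cannot mutate the engine's own mapping.
        return dict(self._labels)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = {"id2label": {"0": "text", "1": "table"}}
        (Path(tmp) / "config.json").write_text(json.dumps(config), encoding="utf-8")
        print(ConfigLabelEngine(Path(tmp)).get_label_mapping())
```

With the mapping owned by the engine, a model stage only needs to call `get_label_mapping()` and no longer has to carry its own label dictionary.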
Implement TransformersObjectDetectionEngine as a PyTorch-based alternative to ONNX Runtime. Works as drop-in replacement for both layout and table detection models with support for CPU, CUDA, and MPS devices.

- Add TransformersObjectDetectionEngine with AutoModelForObjectDetection
- Update TransformersObjectDetectionEngineOptions (score_threshold, torch_dtype)
- Update factory to instantiate Transformers engine
- Switch OBJECT_DETECTION_LAYOUT_HERON preset to use Transformers by default
- Add logging configuration to layout_object_detection_example.py

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
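Supporting CPU, CUDA, and MPS in one engine usually needs some device-resolution step before the model is loaded. The sketch below shows one way that could look; `resolve_device` is a hypothetical helper, and the fallback policy (prefer CUDA, then MPS, then CPU) is an assumption, not something the commit states.

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Pick the device an engine should run on.

    The availability flags are passed in so the function stays pure; in a
    PyTorch setup they would typically come from torch.cuda.is_available()
    and torch.backends.mps.is_available().
    """
    available = {"cpu": True, "cuda": cuda_available, "mps": mps_available}
    if requested == "auto":
        # Assumed preference order: CUDA first, then MPS, then CPU.
        for candidate in ("cuda", "mps", "cpu"):
            if available[candidate]:
                return candidate
    if requested not in available:
        raise ValueError(f"Unsupported device {requested!r}; expected one of {sorted(available)} or 'auto'")
    # Fall back to CPU when the requested accelerator is not present.
    return requested if available[requested] else "cpu"


if __name__ == "__main__":
    print(resolve_device("auto", cuda_available=False, mps_available=True))
```

Keeping the availability checks outside the function makes the policy easy to test on machines without a GPU.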
cau-git added a commit that referenced this pull request on Feb 11, 2026
…mily with HF Transformers and ONNX runtime

Implements runtime abstraction for image classification models with support for both ONNX Runtime and HuggingFace Transformers engines. Users can switch between engines without model retraining, similar to the object detection abstraction (#2959).

Key components:
- BaseImageClassificationEngine with factory pattern
- OnnxRuntimeImageClassificationEngine and TransformersImageClassificationEngine implementations
- Shared HfVisionModelMixin for common HF model utilities
- Engine-specific configuration options
- Test suite and example demonstrating runtime engine switching

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
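The "base engine with factory pattern" idea named above can be illustrated with a small registry-based factory. Everything here is a sketch under assumptions: the `EngineOptions` dataclass, the `register_engine` decorator, `create_engine`, and the two dummy engines are hypothetical names and do not reproduce the classes added in the commit.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List, Type


@dataclass(frozen=True)
class EngineOptions:
    """Hypothetical options object; engine_type selects the runtime."""

    engine_type: str


class BaseImageClassificationEngine(ABC):
    """Stand-in base class: every runtime exposes the same predict() call."""

    def __init__(self, options: EngineOptions) -> None:
        self.options = options

    @abstractmethod
    def predict(self, images: List[bytes]) -> List[str]:
        """Return one label per input image."""


_ENGINES: Dict[str, Type[BaseImageClassificationEngine]] = {}


def register_engine(engine_type: str) -> Callable[[type], type]:
    """Class decorator that records an engine under its engine_type key."""

    def decorator(cls: type) -> type:
        _ENGINES[engine_type] = cls
        return cls

    return decorator


@register_engine("onnxruntime")
class DummyOnnxEngine(BaseImageClassificationEngine):
    def predict(self, images: List[bytes]) -> List[str]:
        # A real engine would run an ONNX session here.
        return ["from-onnxruntime"] * len(images)


@register_engine("transformers")
class DummyTransformersEngine(BaseImageClassificationEngine):
    def predict(self, images: List[bytes]) -> List[str]:
        # A real engine would run a PyTorch model here.
        return ["from-transformers"] * len(images)


def create_engine(options: EngineOptions) -> BaseImageClassificationEngine:
    """Factory: look up the engine class for the requested runtime."""
    try:
        engine_cls = _ENGINES[options.engine_type]
    except KeyError:
        raise ValueError(
            f"Unknown engine_type {options.engine_type!r}; known: {sorted(_ENGINES)}"
        ) from None
    return engine_cls(options)


if __name__ == "__main__":
    for engine_type in ("onnxruntime", "transformers"):
        engine = create_engine(EngineOptions(engine_type=engine_type))
        print(engine_type, engine.predict([b"image-bytes"]))
```

Because callers only see the base class, switching runtimes comes down to changing `engine_type` in the options.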
cau-git added a commit that referenced this pull request on Feb 18, 2026
…ication) and KServe v2 API support (#2979)

* feat: Inference engines abstraction for image classification model family with HF Transformers and ONNX runtime

  Implements runtime abstraction for image classification models with support for both ONNX Runtime and HuggingFace Transformers engines. Users can switch between engines without model retraining, similar to the object detection abstraction (#2959).

  Key components:
  - BaseImageClassificationEngine with factory pattern
  - OnnxRuntimeImageClassificationEngine and TransformersImageClassificationEngine implementations
  - Shared HfVisionModelMixin for common HF model utilities
  - Engine-specific configuration options
  - Test suite and example demonstrating runtime engine switching

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing files and re-export for backward compat

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Don't run with OCR in the example.

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove excess onnxruntime related options for inputs and outputs

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: centralize torch compile defaults with DOCLING_INFERENCE_COMPILE_TORCH_MODELS

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: Add Kserve2 API engine for image classifier and object detection models (#2999)

* fix: add failed pages to DoclingDocument for page break consistency (#2939)

* fix: add failed pages to DoclingDocument for page break consistency

  When some PDF pages fail to parse, they were not added to DoclingDocument.pages, causing page break markers to be incorrect during export. This adds failed/skipped pages with their size info (if available) to maintain correct page numbering and structure.

  - Add _add_failed_pages_to_document() method in StandardPdfPipeline
  - Add test cases for failed page handling
  - Add test cases for normal page handling (regression test)
  - Add test PDF files

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: ensure resource cleanup and simplify type hints

  - Wrap page_backend usage in try-finally to guarantee unload (prevents resource leaks).
  - Simplify redundant 'float | None | None' type hint.

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: add groundtruth for normal_4pages.pdf and exclude failing PDFs from e2e test

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: ensure correct status assertion for failed pages in tests

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

  ---------

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: Use timezone-aware datetime (#2947)

* Use timezone-aware datetime for profiling timestamps

  Updated timestamp recording to use timezone-aware datetime.

  Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* run formatter

  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

  ---------

  Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
  Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

* fix(asciidoc): handle commas in image alt text (#2983)

* Fix: Handle commas in AsciiDoc image alt text

  - Modified _parse_picture() to gracefully handle alt text containing commas
  - Commas in alt text are now preserved instead of causing ValueError
  - Added test case with realistic auto-generated alt text
  - split('=', 1) prevents issues when values contain '=' characters

* DCO Remediation Commit for n0rdp0l <n90.w135@gmail.com>

  I, n0rdp0l <n90.w135@gmail.com>, hereby add my Signed-off-by to this commit: ee75249

  Signed-off-by: n0rdp0l <n90.w135@gmail.com>

* style: fix ruff formatting in test_backend_asciidoc.py

  Signed-off-by: n0rdp0l <n90.w135@gmail.com>

  ---------

  Signed-off-by: n0rdp0l <n90.w135@gmail.com>
  Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

* chore: bump version to 2.73.1 [skip ci]

* First attempt at establishing API Kserve2 facet

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* refactor: improve KServe v2 engine implementation after code review

  - Add comprehensive error handling to KserveV2HttpClient
  - Catch and wrap Timeout, ConnectionError, HTTPError with context
  - Validate response formats with clear error messages
  - Refactor URL building to eliminate duplication
  - Extract _build_model_url() helper method
  - Single source of truth for infer_url and model_metadata_url
  - Make URL required parameter (remove default localhost:8000)
  - Update ApiKserveV2*EngineOptions to require explicit URL
  - Add preset validation with helpful error messages
  - Rename constants for clarity: TRITON_* → KSERVE_V2_*
  - Add comment explaining KServe v2 uses Triton type system
  - Improve error messages with actual values
  - Show counts, shapes, and supported types in validation errors
  - Document official KServe Python SDK alternative
  - Note async-only requirement and alpha status
  - Update tests for required URL parameter

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Cleanup in kserve http helper and options

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Further cleanup

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix for remote-services on tablemodel

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: improved deserialization of engine_options (#3008)

* add registry of discriminated subclasses

  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix detection of engine_type value

  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

  ---------

  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Add options serialization improvements

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

  ---------

  Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>
  Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
  Signed-off-by: n0rdp0l <n90.w135@gmail.com>
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
  Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com>
  Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
  Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
  Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com>
  Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
  Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>

* Fixes from review

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* DCO Remediation Commit for Christoph Auer <cau@zurich.ibm.com>

  I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: 4cdb01e

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* DCO Remediation Commit for Christoph Auer <60343111+cau-git@users.noreply.github.com>

  I, Christoph Auer <60343111+cau-git@users.noreply.github.com>, hereby add my Signed-off-by to this commit: e293ba3

  Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add fallback for API variants

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Recreate uv.lock

  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>
Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: n0rdp0l <n90.w135@gmail.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com>
Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
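The KServe v2 refactor above mentions a single URL-building helper behind both the inference and the metadata endpoint. A minimal sketch of that idea follows. The path layout (`/v2/models/<name>[/versions/<version>]` for metadata, with `/infer` appended for inference) comes from the KServe v2 REST protocol; the function names and signatures are assumptions for illustration and are not the PR's `_build_model_url()` implementation.

```python
from typing import Optional
from urllib.parse import quote


def build_model_url(base_url: str, model_name: str, model_version: Optional[str] = None) -> str:
    """Build the per-model base URL of a KServe v2 REST endpoint.

    The base URL is required on purpose: there is no localhost default.
    """
    if not base_url:
        raise ValueError("base_url must be provided explicitly")
    if not model_name:
        raise ValueError("model_name must not be empty")
    # Percent-encode path segments so unusual model names cannot break the path.
    url = f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    if model_version:
        url += f"/versions/{quote(model_version, safe='')}"
    return url


def infer_url(base_url: str, model_name: str, model_version: Optional[str] = None) -> str:
    """URL to POST inference requests to."""
    return build_model_url(base_url, model_name, model_version) + "/infer"


def model_metadata_url(base_url: str, model_name: str, model_version: Optional[str] = None) -> str:
    """URL to GET model metadata (inputs, outputs, datatypes) from."""
    return build_model_url(base_url, model_name, model_version)


if __name__ == "__main__":
    print(infer_url("http://inference.example:8000/", "layout", "1"))
    print(model_metadata_url("http://inference.example:8000", "layout"))
```

Deriving both URLs from one helper means a change to versioning or escaping only has to be made once.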
This PR implements generic inference engines for the object-detection model family.
It defines the required option types, base classes, factories, an ONNX Runtime inference engine, and a stage for layout models (in ONNX format) to demonstrate the principle (WIP).
TODO:
What this PR does not include
For reference: https://github.com/docling-project/docling/blob/main/docs/usage/model_catalog.md
Checklist: