
feat(flatbuffer_direct): add QLinear ops and quant-chain fusion #873

Merged
PINTO0309 merged 2 commits into main from feat-fdirect-int8 on Feb 15, 2026

Conversation

PINTO0309 (Owner) commented on Feb 14, 2026

Summary

This PR extends flatbuffer_direct for quantized ONNX graphs and documents the updated support matrix.

face_recognition_sface_2021dec_int8.onnx.zip

What was added

  1. Added direct lowering for quantized operators in flatbuffer_direct:

    • QuantizeLinear -> QUANTIZE
    • DequantizeLinear -> DEQUANTIZE
    • QLinearAdd -> ADD
    • QLinearMul -> MUL
    • QLinearConv -> CONV_2D / DEPTHWISE_CONV_2D
    • QLinearMatMul -> FULLY_CONNECTED
  2. Added flatbuffer_direct-only preprocess fusion:

    • DequantizeLinear -> BatchNormalization -> PRelu -> QuantizeLinear
    • BN params are folded and rewritten into Mul + Add form before lowering (see the folding sketch after this list).
  3. Added supporting direct builders and registry coverage:

    • BatchNormalization direct builder (MUL + ADD form)
    • Flatten direct builder (RESHAPE form)
    • op registry validators for quantized op constraints (const inputs, rank/shape checks, axis checks)
  4. Updated README support status section for flatbuffer_direct.
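
To make the fusion in item 2 concrete, here is a minimal sketch of the BatchNormalization folding (the helper name and layout are illustrative, not the actual rule code):

```python
import numpy as np

def fold_batch_norm(gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into an equivalent Mul + Add pair.

    y = gamma * (x - mean) / sqrt(var + eps) + beta
      = scale * x + bias
    """
    scale = gamma / np.sqrt(var + eps)  # per-channel Mul constant
    bias = beta - mean * scale          # per-channel Add constant
    return scale, bias
```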

Implementation details

Quantized builder layer

  • New file: onnx2tf/tflite_builder/op_builders/quantized.py
  • Implements quant parameter handling (scale, zero_point, quantized_dimension) for both:
    • raw quantized path (QLinear*)
    • dequant/quant bridge path (DequantizeLinear / QuantizeLinear)
  • Handles:
    • dtype derivation from zero_point
    • quant metadata propagation across intermediate tensors
    • bias scale derivation for int32 bias (input_scale * weight_scale)
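
A rough sketch of the dtype and bias-scale rules above, with hypothetical function names (only the arithmetic follows the PR description):

```python
import numpy as np

def dtype_from_zero_point(zero_point: np.ndarray) -> np.dtype:
    # The quantized tensor dtype is derived from the zero_point constant:
    # an int8 zero_point implies an INT8 tensor, a uint8 one implies UINT8.
    return zero_point.dtype

def quantize_bias(bias_f32: np.ndarray, input_scale: float,
                  weight_scale: np.ndarray) -> np.ndarray:
    # TFLite int32 bias uses scale = input_scale * weight_scale
    # (per-channel when weight_scale is per-channel) and zero_point = 0.
    return np.round(bias_f32 / (input_scale * weight_scale)).astype(np.int32)
```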

Dispatch/validation layer

  • Updated onnx2tf/tflite_builder/op_registry.py with new DispatchEntry records and validators:
    • _validate_quantize_dequantize_linear
    • _validate_qlinear_binary
    • _validate_qlinear_conv
    • _validate_qlinear_matmul
    • _validate_batch_norm
    • _validate_flatten
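
For illustration only (the real signatures in op_registry.py may differ), a validator in this style checks static constraints before dispatch; here a hypothetical sketch of `_validate_qlinear_binary` rejecting nodes whose quant params are not constants:

```python
def _validate_qlinear_binary(node, graph):
    """Hypothetical sketch: QLinearAdd/QLinearMul take quant params at
    inputs 1, 2, 4, 5, 6, 7 (a/b/c scale and zero_point); all of them
    must be graph constants for direct lowering."""
    for idx in (1, 2, 4, 5, 6, 7):
        name = node.input[idx]
        if name not in graph.constants:  # 'constants' is an assumed lookup
            return f"{node.op_type}: quant param '{name}' must be constant"
    return None  # supported
```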

Preprocess rules

  • New rule file: onnx2tf/tflite_builder/preprocess/rules/quant_chain_fusion.py
  • Registered via:
    • onnx2tf/tflite_builder/preprocess/rules/__init__.py
    • onnx2tf/tflite_builder/preprocess/__init__.py
  • Rule ID: quant_chain_fusion_wave3
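
The rule body is not shown in this PR description; as a hypothetical sketch of the matcher such a rule might register (names and graph API assumed):

```python
RULE_ID = "quant_chain_fusion_wave3"
CHAIN = ("DequantizeLinear", "BatchNormalization", "PRelu", "QuantizeLinear")

def matches_chain(node, consumers):
    """Walk single-consumer edges from `node` and compare op_types
    against the DQ -> BN -> PRelu -> Q pattern."""
    cur = node
    for expected in CHAIN:
        if cur is None or cur.op_type != expected:
            return False
        nxt = consumers.get(cur.output[0], [])
        cur = nxt[0] if len(nxt) == 1 else None
    return True
```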

Tests

Added/updated tests

  • tests/test_tflite_builder_direct.py
    • Added qlinear conv chain model
    • Added qlinear matmul/fc chain model
    • Included both in direct operator smoke coverage
  • tests/test_tflite_builder_preprocess.py
    • Added test_quant_chain_fusion_wave3_rewrites_dq_bn_prelu_q_chain

Test results

  • pytest -q tests/test_tflite_builder_preprocess.py tests/test_tflite_builder_direct.py
    • 72 passed

Real model verification

Validated with:

  • face_recognition_sface_2021dec_int8.onnx
  • command: python -m onnx2tf -i face_recognition_sface_2021dec_int8.onnx -o /tmp/onnx2tf_sface_pr_check -tb flatbuffer_direct --report_op_coverage -n

Coverage report highlights:

  • conversion_error = null
  • graph_summary.supported_nodes = 278 / 278
  • graph_summary.unsupported_nodes = 0
  • graph_summary.coverage_ratio = 1
  • quant_chain_fusion_wave3 applied (matched_patterns=26, rewritten_patterns=26)

Additional updates

  • Version bump:
    • pyproject.toml: 2.0.11
    • onnx2tf/__init__.py: 2.0.11
  • README Docker tag examples aligned to 2.0.11.

PINTO0309 (Owner, Author) commented on Feb 14, 2026

Supplement for additional commit 1090987.

Additional implementation (more INT8-oriented)

  • Added direct lowering of PRelu to PRELU in flatbuffer_direct
    • onnx2tf/tflite_builder/op_builders/elementwise.py:215
  • For quantized input (INT8/UINT8), alpha (slope) is converted to a quantized constant with quantization metadata (see the sketch after this list)
    • onnx2tf/tflite_builder/op_builders/elementwise.py:185
  • Added PRelu dispatch and validator in the registry
    • onnx2tf/tflite_builder/op_registry.py:664
    • onnx2tf/tflite_builder/op_registry.py:940
  • Removed PRelu expansion from pseudo_ops_wave1; PRelu is now handled by the direct builder without decomposition
    • onnx2tf/tflite_builder/preprocess/rules/pseudo_ops.py:489
  • Added PRELU builtin options mapping in model writer
    • onnx2tf/tflite_builder/model_writer.py:202
  • Updated README support matrix
    • README.md:280 (builtin count 38)
    • README.md:315 (PRelu row)
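
A minimal sketch of the slope quantization mentioned above (the scale choice and helper are assumptions, not the exact code at elementwise.py:185):

```python
import numpy as np

def quantize_alpha(alpha_f32: np.ndarray, dtype=np.int8):
    """Quantize the PRelu slope to int8/uint8 and return the quantized
    data plus the (scale, zero_point) metadata attached to the constant."""
    qmin, qmax = np.iinfo(dtype).min, np.iinfo(dtype).max
    zero_point = 128 if np.dtype(dtype) == np.uint8 else 0
    max_abs = float(np.max(np.abs(alpha_f32)))
    scale = max_abs / (qmax - zero_point) if max_abs > 0 else 1.0
    q = np.round(alpha_f32 / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(dtype), (scale, zero_point)
```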

Tests

  • Existing + newly added tests:
    • tests/test_tflite_builder_direct.py:881 (verifies PRELU builtin emission)
  • Result:
    • pytest -q tests/test_tflite_builder_preprocess.py tests/test_tflite_builder_direct.py tests/test_tflite_builder_op_coverage.py
    • 76 passed

Real model verification (face_recognition_sface_2021dec_int8.onnx)

  • Conversion succeeded (--report_op_coverage): unsupported 0 / coverage 1.0
  • All 27 PRelu nodes are dispatch_mode=builtin
  • Number of float-output ops in generated TFLite:
    • Before: 223 / 335
    • After: 115 / 227
    • Main effect: decomposed PRelu (Neg/Relu/Mul/Sub) was consolidated into a single PRELU op

PINTO0309 merged commit 4bdb913 into main on Feb 15, 2026
3 checks passed
PINTO0309 deleted the feat-fdirect-int8 branch on February 15, 2026 at 00:38