COMP: Use itk::Math::MatrixExponential instead of removed vnl_matrix_exp by hjmjohnson · Pull Request #1452 · SuperElastix/elastix

hjmjohnson · 2026-06-16T17:57:57Z

VXL removed core/vnl/vnl_matrix_exp, which AffineLogTransform included via <vnl/vnl_matrix_exp.h> (InsightSoftwareConsortium/ITK#6452). This migrates the two call sites to itk::Math::MatrixExponential, an ITK-supported Eigen-backed replacement.

Depends on InsightSoftwareConsortium/ITK#6454 (adds itk::Math::MatrixExponential / itkMatrixExponential.h). Hold merge until an ITK containing it is required.

Change

-#include <vnl/vnl_matrix_exp.h>             →  #include "itkMatrixExponential.h"
-vnl_matrix_exp(... .GetVnlMatrix())         →  itk::Math::MatrixExponential(...)
-vnl_matrix_exp(A_bar)                        →  itk::Math::MatrixExponential(A_bar)

Call-site argument types (vnl_matrix_fixed, vnl_matrix) are unchanged; the new function provides matching overloads. Both AffineLogTransform and AffineLogStackTransform components rebuild clean against the new header.

hjmjohnson · 2026-06-16T22:13:15Z

Performance impact of migrating AffineLogTransform from vnl_matrix_exp to itk::Math::MatrixExponential. Freshly benchmarked on this transform's two call patterns (-O3 -DNDEBUG, best-of-11 min ns/call). Net: ~1.7–2× faster matrix-exp work per parameter-update, more accurate, results unchanged in practice.

Per-call timing + net AffineLogTransform impact

operation	vnl_matrix_exp	itk (Eigen)	Eigen vs vnl
log-domain 2×2 (`SetParameters`)	~99 ns	~410 ns	4.1× slower
log-domain 3×3 (`SetParameters`)	~150 ns	~510 ns	3.4× slower
`A_bar` 4×4 (Jacobian loop)	~1349 ns	~695 ns	1.95× faster
`A_bar` 6×6 (Jacobian loop)	~2681 ns	~1333 ns	2.0× faster

Per SetParameters (1 log-domain exp + d² augmented A_bar exps in PrecomputeJacobianOfSpatialJacobian):

	old (vnl)	new (Eigen)	net
2D	95 + 4·1331 = 5419 ns	405 + 4·692 = 3173 ns	~1.7× faster
3D	149 + 9·2676 = 24233 ns	503 + 9·1315 = 12338 ns	~2.0× faster

The d² A_bar Jacobian loop dominates and Eigen wins there, so net it's faster despite the single small log-domain exp being slower. Matrix-exp is a negligible slice of full registration wall-time, so end-to-end timing is effectively unchanged.

Numerical equivalence + accuracy

For near-identity log-domain matrices the absolute output difference between the two methods is ~1e-12 — far below any registration tolerance, so registration results are unchanged. Eigen (Higham scaling-and-squaring) is also ~2–3 orders of magnitude more accurate than vnl's Taylor series on the exp(A)·exp(−A)=I identity at every norm, and stays robust where the Taylor series degrades for large norms.

This PR depends on itk::Math::MatrixExponential landing in ITK (InsightSoftwareConsortium/ITK#6454).

N-Dekker · 2026-06-18T11:13:10Z

Thanks Hans. Would this compile for both ITK 5 and 6, when we add something like #if ITK >= 6 and put the original code in the #else?

hjmjohnson · 2026-06-19T02:15:39Z

@N-Dekker Good catch — yes, and a guard is necessary: elastix's minimum is ITK 5.4.1 (find_package(ITK 5.4.1)), which has vnl_matrix_exp.h but not itkMatrixExponential.h, so the single-path version broke the minimum-version build. Added in 5aeef7f.

Why __has_include rather than #if ITK_VERSION_MAJOR >= 6

InsightSoftwareConsortium/ITK#6454 (which adds itk::Math::MatrixExponential) has no milestone yet, so the first tagged 6.x release containing it is unknown — keying on the header's presence is robust to that. #6454 also restored the (now-deprecated) vnl_matrix_exp shim, so current ITK main takes the new Eigen path while ITK 5.4.x falls back to vnl.

#ifndef ELX_HAS_ITK_MATRIX_EXPONENTIAL
#  if __has_include("itkMatrixExponential.h")
#    define ELX_HAS_ITK_MATRIX_EXPONENTIAL 1
#  else
#    define ELX_HAS_ITK_MATRIX_EXPONENTIAL 0
#  endif
#endif

Verified both branches compile locally: the Eigen path builds the full elastix/transformix against ITK main; the vnl fallback (forced via -DELX_HAS_ITK_MATRIX_EXPONENTIAL=0) compiles clean against the restored shim.

N-Dekker

Approved unconditionally, but feel free to comment on my question about the initial #ifndef ELX_HAS_ITK_MATRIX_EXPONENTIAL 😃

N-Dekker · 2026-06-19T12:59:51Z

@hjmjohnson Maybe we should still check if it won't break our regression tests, as it might yield slightly different result, right? But for now, I guess we have no choice, because without this PR, elastix + ITK6 won't build anymore 🤷

N-Dekker · 2026-06-19T13:08:35Z

Performance impact of migrating AffineLogTransform from vnl_matrix_exp to itk::Math::MatrixExponential. Freshly benchmarked on this transform's two call patterns (-O3 -DNDEBUG, best-of-11 min ns/call). Net: ~1.7–2× faster matrix-exp work per parameter-update, more accurate, results unchanged in practice.

@hjmjohnson This sounds very promising. Can you possibly share the benchmark code?

hjmjohnson · 2026-06-19T17:05:16Z

Verified this change against an ITK-v6-main downstream build forest: the elastix test suite (163 tests) is behaviorally identical to elastix main with vnl_matrix_exp and to the ITK-5.4 baseline — 161/163 pass, the same 2 pre-existing failures, identical runtime. The itk::Math::MatrixExponential switch costs nothing in test behavior and is the only variant that survives ITK_FUTURE_LEGACY_REMOVE=ON.

3-way test comparison (163 tests each)

Configuration	ITK	elastix source	matrix-exp path	Build	Passed	Failed
1	release-5.4	`main`	`vnl/vnl_matrix_exp.h` (native VXL)	clean	161/163	#24, #163
2	v6 `main`	`main`	`vnl/vnl_matrix_exp.h` (ITK shim → Eigen)	clean	161/163	#24, #163
3	v6 `main`	this PR	`itk::Math::MatrixExponential` (Eigen, direct)	clean	161/163	#24, #163

The same two tests fail in all three configurations (both pre-existing, ITK-version-independent):

#24 elastix_run_example_COMPARE_IM — baseline-image tolerance mismatch (9993 pixels differ > 5, 50 allowed)
#163 TransformixTest

Runtimes were ~23.4–23.6 s in all three runs.

What this establishes

ITK v6 does not regress elastix — configs 1 vs 2 are equivalent; the ITKv6 vnl_matrix_exp.h compatibility shim (Eigen-backed MatrixExponential) is a drop-in for native VXL vnl_matrix_exp.
This PR is behaviorally equivalent — configs 2 vs 3 show that switching AffineLogTransform from vnl_matrix_exp to itk::Math::MatrixExponential produces no test differences (both resolve to the same Eigen Higham scaling-and-squaring under ITKv6).
Forward-compatibility — configs 1 and 2 (#include <vnl/vnl_matrix_exp.h>) would #error under ITK_FUTURE_LEGACY_REMOVE=ON; only this PR builds in that configuration.

VXL removed core/vnl/vnl_matrix_exp.h, breaking AffineLogTransform's include of <vnl/vnl_matrix_exp.h> (InsightSoftwareConsortium/ITK#6452). Switch to itk::Math::MatrixExponential (itkMatrixExponential.h), an ITK-supported Eigen-backed replacement using Higham scaling-and-squaring. elastix's minimum is ITK 5.4.1, which lacks itkMatrixExponential.h, so guard the include and both call sites on __has_include and fall back to vnl_matrix_exp for older ITK. ITK#6454 adds the function and restores the vnl_matrix_exp shim. Call-site argument types (vnl_matrix_fixed, vnl_matrix) are unchanged.

hjmjohnson · 2026-06-19T17:20:45Z

@hjmjohnson This sounds very promising. Can you possibly share the benchmark code?

Happy to, @N-Dekker. The full performance and accuracy write-up (with the per-size timing table and the exp(A)·exp(−A)=I accuracy comparison) lives on the ITK side, where itk::Math::MatrixExponential was added: InsightSoftwareConsortium/ITK#6454 → performance comment.

The code is below. Two files: a consolidated timing harness (vnl vs Eigen across the sizes AffineLogTransform exercises) and the correctness/equivalence check that uses the exact call types from this PR.

One honesty note on reading the numbers: the published table's small (2×2/3×3 log-domain) rows used fixed-size types (itk::Matrix / vnl_matrix_fixed), where vnl's compile-time path is very fast; the harness below uses dynamic vnl_matrix uniformly for brevity, so its small-size absolute ns differ (the per-call heap alloc inflates both sides). The conclusion is unchanged: Eigen wins on the larger augmented-A_bar matrices that dominate PrecomputeJacobianOfSpatialJacobian, so the net per-SetParameters cost drops. Absolute ns are machine-dependent (these were Apple Silicon, -O3 -DNDEBUG).

Timing harness — vnl_matrix_exp vs itk::Math::MatrixExponential (best-of-11)

// Build: c++ -O3 -DNDEBUG -std=c++17 <ITK + VNL include/link flags> bench_matrix_exp.cxx
#include "itkMatrixExponential.h"
#include <vnl/vnl_matrix.h>
#include <vnl/vnl_matrix_exp.h>
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

using clk = std::chrono::steady_clock;

template <typename F>
double best_of(F && f, int reps = 11)
{
  double best = 1e300;
  for (int r = 0; r < reps; ++r)
    best = std::min(best, f());
  return best;
}

template <unsigned D>
void run(const char * label)
{
  std::mt19937                           rng(1);
  std::uniform_real_distribution<double> ud(-0.3, 0.3); // small-norm, like log-domain inputs
  const int                              pool = 256;
  std::vector<vnl_matrix<double>>        in(pool, vnl_matrix<double>(D, D));
  for (auto & m : in)
    for (unsigned i = 0; i < D; ++i)
      for (unsigned j = 0; j < D; ++j)
        m(i, j) = ud(rng);

  const int       N = 200000;
  volatile double sink = 0;

  auto timeVnl = [&] {
    auto t0 = clk::now();
    for (int k = 0; k < N; ++k)
      sink += vnl_matrix_exp(in[k & (pool - 1)])(0, 0);
    auto t1 = clk::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / N;
  };
  auto timeItk = [&] {
    auto t0 = clk::now();
    for (int k = 0; k < N; ++k)
      sink += itk::Math::MatrixExponential(in[k & (pool - 1)])(0, 0);
    auto t1 = clk::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / N;
  };

  const double vnl = best_of(timeVnl);
  const double itk = best_of(timeItk);
  std::printf("%-8s vnl_matrix_exp=%8.1f ns   itk(Eigen)=%8.1f ns   speedup=%.2fx\n",
              label, vnl, itk, vnl / itk);
  (void)sink;
}

int main()
{
  std::printf("best-of-11 min ns/call, small-norm inputs\n");
  run<2>("2x2"); // log-domain (2D)
  run<3>("3x3"); // log-domain (3D)
  run<4>("4x4"); // augmented A_bar (2D Jacobian loop)
  run<6>("6x6"); // augmented A_bar (3D Jacobian loop)
  return 0;
}

Representative run (Apple Silicon, dynamic vnl_matrix):

2x2  vnl_matrix_exp= 504.9 ns  itk(Eigen)= 465.4 ns  speedup=1.08x
3x3  vnl_matrix_exp= 741.4 ns  itk(Eigen)= 562.6 ns  speedup=1.32x
4x4  vnl_matrix_exp=1013.0 ns  itk(Eigen)= 722.2 ns  speedup=1.40x
6x6  vnl_matrix_exp=2231.8 ns  itk(Eigen)=1342.4 ns  speedup=1.66x

Correctness / equivalence check — exact AffineLogTransform call types

#include "itkMatrixExponential.h"
#include <vnl/vnl_matrix_exp.h>
#include <iostream>
#include <iomanip>
int main()
{
  using itk::Math::MatrixExponential;
  int fails = 0;

  // itk::Matrix 3x3 vs restored vnl_matrix_exp on the same data
  itk::Matrix<double, 3, 3> A;
  A(0,0)=4;A(0,1)=1;A(0,2)=-2; A(1,0)=3;A(1,1)=-5;A(1,2)=1; A(2,0)=0;A(2,1)=2;A(2,2)=6;
  auto itkExp = MatrixExponential(A);
  vnl_matrix<double> vexp = vnl_matrix_exp(A.GetVnlMatrix().as_matrix());
  double err = 0;
  for (unsigned i=0;i<3;++i) for (unsigned j=0;j<3;++j) err = std::max(err, std::abs(itkExp(i,j)-vexp(i,j)));
  std::cout << "itk::Matrix vs vnl_matrix_exp  max_err=" << std::scientific << err << (err<1e-7?"  PASS":"  FAIL") << "\n";
  if (err>=1e-7) ++fails;

  // vnl_matrix_fixed overload (exactly elastix's GetVnlMatrix() type)
  vnl_matrix_fixed<double, 2, 2> G; G(0,0)=0;G(0,1)=-0.9;G(1,0)=0.9;G(1,1)=0;
  auto gf = MatrixExponential(G);
  double c = std::cos(0.9), s = std::sin(0.9);
  double e2 = std::max(std::max(std::abs(gf(0,0)-c), std::abs(gf(0,1)+s)), std::max(std::abs(gf(1,0)-s), std::abs(gf(1,1)-c)));
  std::cout << "vnl_matrix_fixed rotation       max_err=" << e2 << (e2<1e-9?"  PASS":"  FAIL") << "\n";
  if (e2>=1e-9) ++fails;

  // dynamic vnl_matrix overload (exactly elastix's A_bar type)
  vnl_matrix<double> N(2,2); N(0,0)=0;N(0,1)=1;N(1,0)=0;N(1,1)=0;
  auto nf = MatrixExponential(N);
  double e3 = std::max(std::max(std::abs(nf(0,0)-1), std::abs(nf(0,1)-1)), std::max(std::abs(nf(1,0)), std::abs(nf(1,1)-1)));
  std::cout << "vnl_matrix nilpotent            max_err=" << e3 << (e3<1e-12?"  PASS":"  FAIL") << "\n";
  if (e3>=1e-12) ++fails;

  std::cout << (fails ? "\nRESULT: FAIL\n" : "\nRESULT: ALL PASS\n");
  return fails;
}

hjmjohnson · 2026-06-19T17:24:50Z

@N-Dekker I do not have merge rights on elastix. Take this to any terminal state that you wish.

NOTE: if ITK_FUTURE_LEGACY_REMOVE is used, <vnl/vnl_matrix_exp.h>. The intent is that <vnl/vnl_matrix_exp.h> is removed in ITKv7.

N-Dekker · 2026-06-22T11:42:02Z

Thanks @hjmjohnson Before merging, I would like to see if it affects the results of our regression test, at the CI. Ijust started elastix builds with both ITK-v6.0b02 and ITK-v6-main. Afterwards, I'll try cherry-picking your PR on the second branch.

Update - cherry-picking your PR:

https://github.com/SuperElastix/elastix/tree/ITK6_main_migrate-to-itk-matrix-exponential

hjmjohnson force-pushed the migrate-to-itk-matrix-exponential branch from 2e19386 to c33e80a Compare June 16, 2026 22:04

hjmjohnson force-pushed the migrate-to-itk-matrix-exponential branch from c33e80a to 5aeef7f Compare June 19, 2026 02:15

hjmjohnson marked this pull request as ready for review June 19, 2026 02:20

N-Dekker reviewed Jun 19, 2026

View reviewed changes

Comment thread Components/Transforms/AffineLogTransform/itkAffineLogTransform.hxx Outdated

N-Dekker approved these changes Jun 19, 2026

View reviewed changes

hjmjohnson force-pushed the migrate-to-itk-matrix-exponential branch from 5aeef7f to df58ef4 Compare June 19, 2026 17:17

N-Dekker self-requested a review June 22, 2026 10:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

COMP: Use itk::Math::MatrixExponential instead of removed vnl_matrix_exp#1452

COMP: Use itk::Math::MatrixExponential instead of removed vnl_matrix_exp#1452
hjmjohnson wants to merge 1 commit into
SuperElastix:mainfrom
hjmjohnson:migrate-to-itk-matrix-exponential

hjmjohnson commented Jun 16, 2026

Uh oh!

hjmjohnson commented Jun 16, 2026

Uh oh!

N-Dekker commented Jun 18, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

Uh oh!

N-Dekker left a comment

Uh oh!

N-Dekker commented Jun 19, 2026

Uh oh!

N-Dekker commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

N-Dekker commented Jun 22, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

hjmjohnson commented Jun 16, 2026

Uh oh!

hjmjohnson commented Jun 16, 2026

Uh oh!

N-Dekker commented Jun 18, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

Uh oh!

N-Dekker left a comment

Choose a reason for hiding this comment

Uh oh!

N-Dekker commented Jun 19, 2026

Uh oh!

N-Dekker commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

hjmjohnson commented Jun 19, 2026

Uh oh!

N-Dekker commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

N-Dekker commented Jun 22, 2026 •

edited

Loading