
Conversation

Contributor

@marcorudolphflex marcorudolphflex commented Nov 5, 2025

Greptile Overview

Updated On: 2025-11-05 16:45:19 UTC

Greptile Summary

This PR replaces the autograd scipy convolution implementation with a custom FFT-based convolution to improve tracing performance. The main changes include:

  • Removed dependency on autograd.scipy.signal.convolve and replaced with custom _fft_convolve_general function using FFT convolution
  • Added _normalize_axes helper function to handle axis specification and validation
  • Added comprehensive tests for convolution with explicit axes specification
  • Updated CHANGELOG to document the performance improvement

Issues found:

  • Dead code in _normalize_axes (lines 74-82): bounds checking occurs after modulo normalization, so the checks can never trigger (see the sketch below)
  • Extra blank line in CHANGELOG.md
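
A minimal sketch of the dead-code pattern (hypothetical helper names, not the PR's actual code): once an axis index has been normalized with modulo, a subsequent bounds check can never fire, so validation has to happen before normalization.

```python
def _normalize_axis_buggy(ax: int, ndim: int) -> int:
    ax = ax % ndim  # maps every integer into [0, ndim), so ...
    if not 0 <= ax < ndim:  # ... this check is dead code and never raises
        raise ValueError(f"Invalid axis {ax} for ndim {ndim}.")
    return ax


def _normalize_axis_checked(ax: int, ndim: int) -> int:
    if not -ndim <= ax < ndim:  # validate before normalizing
        raise ValueError(f"Invalid axis {ax} for ndim {ndim}.")
    return ax + ndim if ax < 0 else ax
```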

Confidence Score: 3/5

  • This PR is mostly safe to merge but contains dead code that should be addressed
  • The FFT-based convolution implementation appears algorithmically sound and includes appropriate tests. However, the dead code in bounds checking (lines 74-82) indicates incomplete validation logic that could mask invalid inputs. While this won't cause runtime errors due to the modulo operation preventing out-of-bounds access, it means the function won't properly reject invalid axis specifications, which could lead to confusing behavior.
  • Pay close attention to tidy3d/plugins/autograd/functions.py - the bounds checking logic in _normalize_axes should be fixed or removed

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| tidy3d/plugins/autograd/functions.py | 3/5 | replaced autograd.scipy.signal.convolve with custom FFT-based implementation for performance; added helper functions _normalize_axes and _fft_convolve_general; contains dead code in bounds checking (lines 74-82) |
| tests/test_plugins/autograd/test_functions.py | 5/5 | added test class TestConvolveAxes with parametrized tests for convolution with explicit axes, including validation against numpy and gradient checks |
| CHANGELOG.md | 4/5 | added entry for improved autograd convolution speed; contains extra blank line on line 57 |

Sequence Diagram

sequenceDiagram
    participant User
    participant convolve
    participant _normalize_axes
    participant pad
    participant _fft_convolve_general
    participant fftn/ifftn

    User->>convolve: array, kernel, axes, mode, padding
    convolve->>convolve: validate kernel dimensions (must be odd)
    convolve->>_normalize_axes: array.ndim, kernel.ndim, axes
    _normalize_axes->>_normalize_axes: normalize axes with modulo
    _normalize_axes->>_normalize_axes: validate axes (length match, uniqueness)
    _normalize_axes-->>convolve: axes_array, axes_kernel
    
    alt mode is "same" or "full"
        loop for each axis pair
            convolve->>pad: working_array, pad_width, padding mode
            pad-->>convolve: padded array
        end
        convolve->>convolve: set effective_mode ("valid" for "same", "full" for "full")
    end
    
    convolve->>_fft_convolve_general: working_array, kernel, axes, effective_mode
    _fft_convolve_general->>_fft_convolve_general: reorder axes (batch vs conv)
    _fft_convolve_general->>_fft_convolve_general: expand dimensions for broadcasting
    _fft_convolve_general->>fftn/ifftn: compute FFT convolution
    fftn/ifftn-->>_fft_convolve_general: full convolution result
    
    alt effective_mode is "valid"
        _fft_convolve_general->>_fft_convolve_general: slice valid region
    end
    
    _fft_convolve_general->>_fft_convolve_general: convert to real if inputs were real
    _fft_convolve_general-->>convolve: result
    convolve-->>User: convolved array
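As a rough illustration of the flow traced above (a minimal sketch, not the PR's actual _fft_convolve_general, which additionally handles batch axes, padding modes, and next_fast_len sizing), an N-D FFT convolution in "valid" mode for real inputs can look like this:

```python
import numpy as np


def fft_convolve_valid(array: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Minimal N-D FFT convolution in 'valid' mode (real inputs, kernel no larger than array)."""
    fft_shape = [n + k - 1 for n, k in zip(array.shape, kernel.shape)]
    # convolution theorem: multiply spectra, then invert
    full = np.fft.irfftn(
        np.fft.rfftn(array, fft_shape) * np.fft.rfftn(kernel, fft_shape),
        fft_shape,
    )
    # 'valid' keeps outputs where the kernel fully overlaps the array:
    # indices [k - 1, n) along each axis of the full-length result
    return full[tuple(slice(k - 1, n) for n, k in zip(array.shape, kernel.shape))]
```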

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 2 comments

@marcorudolphflex marcorudolphflex force-pushed the FXC-3961-faster-convolutions-for-tidy-3-d-plugins-autograd-filters branch from 4df0cd5 to 1255f7f on November 5, 2025 16:55
Contributor

github-actions bot commented Nov 5, 2025

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

  • tidy3d/plugins/autograd/functions.py (87.6%): Missing lines 51-54,57,71,79,85,126,159,171

Summary

  • Total: 89 lines
  • Missing: 11 lines
  • Coverage: 87%

tidy3d/plugins/autograd/functions.py

Lines 47-61

  47     """Normalize the axes specification for convolution."""
  48 
  49     def _normalize_single_axis(ax: SupportsInt, ndim: int, kind: str) -> int:
  50         if not isinstance(ax, int):
! 51             try:
! 52                 ax = int(ax)
! 53             except Exception as e:
! 54                 raise TypeError(f"Axis {ax!r} could not be converted to an integer.") from e
  55 
  56         if not -ndim <= ax < ndim:
! 57             raise ValueError(f"Invalid axis {ax} for {kind} with ndim {ndim}.")
  58         return ax + ndim if ax < 0 else ax
  59 
  60     if axes is None:
  61         if ndim_array != ndim_kernel:

Lines 67-75

  67         axes_kernel = tuple(range(ndim_kernel))
  68         return axes_array, axes_kernel
  69 
  70     if len(axes) != 2:
! 71         raise ValueError("'axes' must be a tuple of two iterable collections of axis indices.")
  72 
  73     axes_array_raw, axes_kernel_raw = axes
  74 
  75     axes_array = tuple(_normalize_single_axis(ax, ndim_array, "array") for ax in axes_array_raw)

Lines 75-83

  75     axes_array = tuple(_normalize_single_axis(ax, ndim_array, "array") for ax in axes_array_raw)
  76     axes_kernel = tuple(_normalize_single_axis(ax, ndim_kernel, "kernel") for ax in axes_kernel_raw)
  77 
  78     if len(axes_array) != len(axes_kernel):
! 79         raise ValueError(
  80             "The number of convolution axes for the array and kernel must be the same, "
  81             f"got {len(axes_array)} and {len(axes_kernel)}."
  82         )

Lines 81-89

  81             f"got {len(axes_array)} and {len(axes_kernel)}."
  82         )
  83 
  84     if len(set(axes_array)) != len(axes_array) or len(set(axes_kernel)) != len(axes_kernel):
! 85         raise ValueError("Convolution axes must be unique for both the array and the kernel.")
  86 
  87     return axes_array, axes_kernel
  88 

Lines 122-130

  122     array_conv_shape = array_reordered.shape[num_batch_array:]
  123     kernel_conv_shape = kernel_reordered.shape[num_batch_kernel:]
  124 
  125     if any(d <= 0 for d in array_conv_shape + kernel_conv_shape):
! 126         raise ValueError("Convolution dimensions must be positive; got zero-length axis.")
  127 
  128     fft_axes = tuple(range(-num_conv_axes, 0))
  129     fft_shape = [next_fast_len(n + k - 1) for n, k in zip(array_conv_shape, kernel_conv_shape)]
  130     use_real_fft = fft_shape[-1] % 2 == 0  # only applicable in this case

Lines 155-163

  155     ifft_fun = irfftn if use_real_fft else ifftn
  156     full_result = ifft_fun(product, fft_shape, axes=fft_axes)
  157 
  158     if mode == "full":
! 159         result = full_result
  160     elif mode == "valid":
  161         valid_slices = [slice(None)] * full_result.ndim
  162         for axis_offset, (array_dim, kernel_dim) in enumerate(
  163             zip(array_conv_shape, kernel_conv_shape)

Lines 167-175

  167             axis = full_result.ndim - num_conv_axes + axis_offset
  168             valid_slices[axis] = slice(start, start + length)
  169         result = full_result[tuple(valid_slices)]
  170     else:
! 171         raise ValueError(f"Unsupported convolution mode '{mode}'.")
  172 
  173     return np.real(result)
  174 

@marcorudolphflex marcorudolphflex force-pushed the FXC-3961-faster-convolutions-for-tidy-3-d-plugins-autograd-filters branch 7 times, most recently from 799eaf0 to 24ea2ac on November 6, 2025 14:11
Collaborator

@yaugenst-flex yaugenst-flex left a comment


Thanks @marcorudolphflex, this is great! As a general observation, we should keep in mind that this new implementation is slower for small arrays/kernels and faster for larger ones. Since for small convolutions both methods are fast in absolute terms, and for large cases (where the cost of the convolution becomes noticeable) FFT will be significantly faster, I think this is definitely a better default. It could be interesting to check with a little bit of benchmarking where this crossover lies and compare that to typical sizes in topopt problems, where the arrays are generally on the order of a few hundred pixels in x and y and kernels are probably on the order of ~10 pixels.
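
A hypothetical micro-benchmark for locating that crossover, using scipy's reference implementations rather than the PR's code:

```python
import timeit

import numpy as np
from scipy.signal import convolve, fftconvolve

rng = np.random.default_rng(0)
for n, k in [(64, 5), (256, 9), (512, 15)]:  # n x n array, k x k kernel
    a, w = rng.random((n, n)), rng.random((k, k))
    t_direct = timeit.timeit(lambda: convolve(a, w, method="direct"), number=5)
    t_fft = timeit.timeit(lambda: fftconvolve(a, w), number=5)
    print(f"n={n:4d} k={k:3d}  direct={t_direct:.3f}s  fft={t_fft:.3f}s")
```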

"""Normalize the axes specification for convolution."""

def _normalize_single_axis(ax: int, ndim: int, kind: str) -> int:
if not isinstance(ax, int):
Collaborator

Are we sure we want to be this strict with the int type here? This would break if you pass something like axes=np.arange(3) for example because those are np.int64. I think this would be a pretty common case.
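
A quick illustration of the point, with operator.index as one lenient-but-safe alternative to a strict isinstance check:

```python
import operator

import numpy as np

ax = np.arange(3)[0]        # np.int64, a common way axes get produced
print(isinstance(ax, int))  # False: numpy integer scalars do not subclass int
print(operator.index(ax))   # 0: __index__ accepts any integer-like type
# operator.index(1.5) would raise TypeError, so floats are still rejected
```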

Comment on lines 174 to 274
class TestConvolveAxes:
    @pytest.mark.parametrize("mode", ["valid", "same", "full"])
    @pytest.mark.parametrize("padding", ["constant", "edge"])
    def test_convolve_axes_val(self, rng, mode, padding):
        """Test convolution with explicit axes against NumPy implementations."""
        array = rng.random((2, 5))
        kernel = rng.random((3, 3))
        axes = ([1], [1])

        conv_td = convolve(array, kernel, padding=padding, mode=mode, axes=axes)

        working_array = array
        scipy_mode = mode
        if mode in ("same", "full"):
            pad_width = kernel.shape[1] // 2 if mode == "same" else kernel.shape[1] - 1
            working_array = pad(array, (pad_width, pad_width), mode=padding, axis=1)
            scipy_mode = "valid"

        working_array_np = np.asarray(working_array)
        kernel_np = np.asarray(kernel)
        conv_length = np.convolve(working_array_np[0], kernel_np[0], mode=scipy_mode).shape[0]

        expected = np.empty((array.shape[0], kernel.shape[0], conv_length))
        for i in range(array.shape[0]):
            for j in range(kernel.shape[0]):
                expected[i, j] = np.convolve(
                    working_array_np[i],
                    kernel_np[j],
                    mode=scipy_mode,
                )

        npt.assert_allclose(conv_td, expected, atol=1e-12)

    def test_convolve_axes_grad(self, rng):
        """Test gradients of convolution when specific axes are provided."""
        array = rng.random((2, 5))
        kernel = rng.random((3, 3))
        check_grads(convolve, modes=["rev"], order=2)(
            array,
            kernel,
            padding="constant",
            mode="valid",
            axes=([1], [1]),
        )
Collaborator

Not really sure I understand this test. Wouldn't we want to parametrize over axes? That's what I would expect from the naming of the test at least. And we'd want to parametrize over array and kernel too? Why are these things parametrized in the value test but not in the gradient test?

More generally, isn't this test replicating much of TestConvolve? Wouldn't we just want to add a parametrization over axes there?

Contributor Author

@marcorudolphflex marcorudolphflex Nov 7, 2025

Yes, it would make sense to vary the axes/kernel params here.
The reason for this test is to specifically check that convolution works with explicitly specified axes. We could do a unified test with axes being None and non-None, but then we would probably just have to skip the parameter combinations that don't make sense. Alternatively, we could use some shared helpers to check the results and avoid duplicate code. Not sure what is more elegant here; see the sketch below.
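
A sketch of what the unified parametrization could look like (shapes, the skip condition, and the rng fixture are illustrative, not the actual test suite):

```python
import pytest

from tidy3d.plugins.autograd.functions import convolve


@pytest.mark.parametrize("axes", [None, ([1], [1]), ([0, 1], [0, 1])])
@pytest.mark.parametrize("mode", ["valid", "same", "full"])
def test_convolve_val(rng, axes, mode):
    array, kernel = rng.random((5, 5)), rng.random((3, 3))
    if axes is None and array.ndim != kernel.ndim:
        pytest.skip("axes=None requires matching dimensionality")
    result = convolve(array, kernel, mode=mode, axes=axes)
    # ... assert against a shared numpy reference helper, as in TestConvolve ...
```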

Comment on lines 324 to 326
raise ValueError(
    f"Even-sized kernel along axis {ax_kernel} not supported for 'same' mode."
)
Collaborator

Does numpy also complain about this?

Contributor Author

No, it silently returns a result that is shifted by half a sample (floor division), i.e. it assumes center = floor(M/2) for a kernel of length M - which I dislike... But I guess we want to be compatible with that?
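
A small demonstration of that behavior; the expected output follows from center = floor(M/2) as described above:

```python
import numpy as np

x = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # unit impulse at index 2
k = np.array([0.5, 0.5])                 # even length, M = 2
print(np.convolve(x, k, mode="same"))
# With center = floor(M/2) = 1 this gives [0. 0.5 0.5 0. 0.],
# i.e. the response is shifted half a sample relative to the impulse.
```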

Collaborator

Hmm yeah I think it's probably fine as-is, we're not really trying to match the scipy or numpy interface here

CHANGELOG.md Outdated
- Unified run submission API: `web.run(...)` is now a container-aware wrapper that accepts a single simulation or arbitrarily nested containers (`list`, `tuple`, `dict` values) and returns results in the same shape.
- `web.Batch(ComponentModeler)` and `web.Job(ComponentModeler)` native support
- Simulation data of batch jobs are now automatically downloaded upon their individual completion in `Batch.run()`, avoiding waiting for the entire batch to reach completion.
- Improved speed of autograd tracing for convolutions.
Collaborator

This change is not only related to autograd tracing, right?

@marcorudolphflex
Contributor Author

It could be interesting to check with a little bit of benchmarking where this crossover lies and compare that to typical sizes in topopt problems [...]

[benchmark plot: direct vs. FFT convolution runtime]

You are right, that's really size-dependent. In this test I used square 2D inputs/kernels.

@marcorudolphflex
Contributor Author

Tweaked it a little bit (direct real FFT and scipy.fft.next_fast_len), got this:

[updated benchmark plot]
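
For context, a small illustration of the two tweaks (scipy.fft API, not the PR's code):

```python
import numpy as np
from scipy.fft import irfft, next_fast_len, rfft

# next_fast_len pads an FFT length up to a nearby fast composite size:
print(next_fast_len(509))  # 512

# rfft/irfft exploit real-input symmetry, roughly halving the FFT work:
x = np.random.default_rng(0).random(8)
print(np.allclose(irfft(rfft(x), len(x)), x))  # True
```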

@yaugenst-flex
Collaborator

tweaked it a little bit (direct real fft and scipy.fft.next_fast_len)

Nice! I'd say for any realistic use case this would be strictly better than before!

@marcorudolphflex marcorudolphflex force-pushed the FXC-3961-faster-convolutions-for-tidy-3-d-plugins-autograd-filters branch from 24ea2ac to f1454dc on November 7, 2025 14:36