base: develop
feat(tidy3d): FXC-3961-faster-convolutions-for-tidy-3-d-plugins-autograd-filters #2963
Conversation
3 files reviewed, 2 comments
Force-pushed from 4df0cd5 to 1255f7f
Diff Coverage (origin/develop...HEAD, staged and unstaged changes)

Summary

`tidy3d/plugins/autograd/functions.py` (lines prefixed with `!` lack test coverage; `...` marks elided lines):

Lines 47-88:

```python
 47 """Normalize the axes specification for convolution."""
 48
 49 def _normalize_single_axis(ax: SupportsInt, ndim: int, kind: str) -> int:
 50     if not isinstance(ax, int):
!51         try:
!52             ax = int(ax)
!53         except Exception as e:
!54             raise TypeError(f"Axis {ax!r} could not be converted to an integer.") from e
 55
 56     if not -ndim <= ax < ndim:
!57         raise ValueError(f"Invalid axis {ax} for {kind} with ndim {ndim}.")
 58     return ax + ndim if ax < 0 else ax
 59
 60 if axes is None:
 61     if ndim_array != ndim_kernel:
...
 67     axes_kernel = tuple(range(ndim_kernel))
 68     return axes_array, axes_kernel
 69
 70 if len(axes) != 2:
!71     raise ValueError("'axes' must be a tuple of two iterable collections of axis indices.")
 72
 73 axes_array_raw, axes_kernel_raw = axes
 74
 75 axes_array = tuple(_normalize_single_axis(ax, ndim_array, "array") for ax in axes_array_raw)
 76 axes_kernel = tuple(_normalize_single_axis(ax, ndim_kernel, "kernel") for ax in axes_kernel_raw)
 77
 78 if len(axes_array) != len(axes_kernel):
!79     raise ValueError(
 80         "The number of convolution axes for the array and kernel must be the same, "
 81         f"got {len(axes_array)} and {len(axes_kernel)}."
 82     )
 83
 84 if len(set(axes_array)) != len(axes_array) or len(set(axes_kernel)) != len(axes_kernel):
!85     raise ValueError("Convolution axes must be unique for both the array and the kernel.")
 86
 87 return axes_array, axes_kernel
```

Lines 122-130:

```python
122 array_conv_shape = array_reordered.shape[num_batch_array:]
123 kernel_conv_shape = kernel_reordered.shape[num_batch_kernel:]
124
125 if any(d <= 0 for d in array_conv_shape + kernel_conv_shape):
!126     raise ValueError("Convolution dimensions must be positive; got zero-length axis.")
127
128 fft_axes = tuple(range(-num_conv_axes, 0))
129 fft_shape = [next_fast_len(n + k - 1) for n, k in zip(array_conv_shape, kernel_conv_shape)]
130 use_real_fft = fft_shape[-1] % 2 == 0  # only applicable in this case
```

Lines 155-173:

```python
155 ifft_fun = irfftn if use_real_fft else ifftn
156 full_result = ifft_fun(product, fft_shape, axes=fft_axes)
157
158 if mode == "full":
!159     result = full_result
160 elif mode == "valid":
161     valid_slices = [slice(None)] * full_result.ndim
162     for axis_offset, (array_dim, kernel_dim) in enumerate(
163         zip(array_conv_shape, kernel_conv_shape)
...
167     axis = full_result.ndim - num_conv_axes + axis_offset
168     valid_slices[axis] = slice(start, start + length)
169     result = full_result[tuple(valid_slices)]
170 else:
!171     raise ValueError(f"Unsupported convolution mode '{mode}'.")
172
173 return np.real(result)
```
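For context, these snippets implement the standard FFT convolution recipe: zero-pad both operands to a fast transform length of at least `n + k - 1`, multiply the spectra, inverse-transform, and slice out the requested mode. A minimal, self-contained 1D sketch of that recipe (illustrative only, not the PR's multi-axis, batch-aware implementation):

```python
import numpy as np
from scipy.fft import next_fast_len


def fft_convolve_1d(array, kernel, mode="full"):
    """Minimal 1D FFT convolution: pad, multiply spectra, inverse transform, slice."""
    n, k = len(array), len(kernel)
    fft_shape = next_fast_len(n + k - 1)  # fast composite length >= full output size
    product = np.fft.rfft(array, fft_shape) * np.fft.rfft(kernel, fft_shape)
    full = np.fft.irfft(product, fft_shape)[: n + k - 1]
    if mode == "full":
        return full
    if mode == "valid":
        start = min(n, k) - 1  # region where array and kernel fully overlap
        return full[start : start + abs(n - k) + 1]
    raise ValueError(f"Unsupported convolution mode '{mode}'.")


# Agrees with the direct implementation up to floating-point error:
x, h = np.random.rand(100), np.random.rand(7)
assert np.allclose(fft_convolve_1d(x, h, "valid"), np.convolve(x, h, "valid"))
```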
Force-pushed from 799eaf0 to 24ea2ac
yaugenst-flex left a comment:
Thanks @marcorudolphflex, this is great! As a general observation, we should keep in mind that this new implementation is slower for small arrays/kernels and faster for larger ones. Since both methods are fast in absolute terms for small convolutions, and FFT is significantly faster for the large cases (where the cost of the convolution actually becomes noticeable), I think this is definitely a better default. It could be interesting to benchmark where this crossover lies and compare it to typical sizes in topopt problems, where arrays are generally on the order of a few hundred pixels in x and y and kernels are on the order of ~10 pixels.
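A rough way to locate that crossover, as a sketch (this uses `scipy.signal.convolve`'s `method` switch as a proxy for the direct vs. FFT code paths; the sizes are illustrative):

```python
import numpy as np
from timeit import timeit
from scipy.signal import convolve

rng = np.random.default_rng(0)
kernel = rng.random((11, 11))  # ~10 px kernel, typical for topopt filters

for n in (32, 64, 128, 256, 512):
    x = rng.random((n, n))
    t_direct = timeit(lambda: convolve(x, kernel, method="direct"), number=5)
    t_fft = timeit(lambda: convolve(x, kernel, method="fft"), number=5)
    print(f"n={n:4d}  direct={t_direct:.4f}s  fft={t_fft:.4f}s")
```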
| """Normalize the axes specification for convolution.""" | ||
|
|
||
| def _normalize_single_axis(ax: int, ndim: int, kind: str) -> int: | ||
| if not isinstance(ax, int): |
Are we sure we want to be this strict with the `int` type here? This would break if you pass something like `axes=np.arange(3)`, for example, because those elements are `np.int64`. I think this would be a pretty common case.
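For illustration, a quick check of the failure mode (behavior shown for typical 64-bit NumPy builds):

```python
import numpy as np

ax = np.arange(3)[0]        # np.int64 -- a very common way to pass axis indices
print(isinstance(ax, int))  # False: np.int64 is not a Python int
print(int(ax))              # 0: the try/int(ax) coercion in the updated snippet accepts it
```

`operator.index(ax)` would be another idiomatic coercion here, since it accepts any integer-like type (anything implementing `__index__`) while still rejecting floats.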
```python
# Excerpt from the test suite; assumes the surrounding module imports
# pytest, numpy as np, numpy.testing as npt, autograd.test_util.check_grads,
# and convolve/pad from tidy3d.plugins.autograd.functions.
class TestConvolveAxes:
    @pytest.mark.parametrize("mode", ["valid", "same", "full"])
    @pytest.mark.parametrize("padding", ["constant", "edge"])
    def test_convolve_axes_val(self, rng, mode, padding):
        """Test convolution with explicit axes against NumPy implementations."""
        array = rng.random((2, 5))
        kernel = rng.random((3, 3))
        axes = ([1], [1])

        conv_td = convolve(array, kernel, padding=padding, mode=mode, axes=axes)

        working_array = array
        scipy_mode = mode
        if mode in ("same", "full"):
            pad_width = kernel.shape[1] // 2 if mode == "same" else kernel.shape[1] - 1
            working_array = pad(array, (pad_width, pad_width), mode=padding, axis=1)
            scipy_mode = "valid"

        working_array_np = np.asarray(working_array)
        kernel_np = np.asarray(kernel)
        conv_length = np.convolve(working_array_np[0], kernel_np[0], mode=scipy_mode).shape[0]

        expected = np.empty((array.shape[0], kernel.shape[0], conv_length))
        for i in range(array.shape[0]):
            for j in range(kernel.shape[0]):
                expected[i, j] = np.convolve(
                    working_array_np[i],
                    kernel_np[j],
                    mode=scipy_mode,
                )

        npt.assert_allclose(conv_td, expected, atol=1e-12)

    def test_convolve_axes_grad(self, rng):
        """Test gradients of convolution when specific axes are provided."""
        array = rng.random((2, 5))
        kernel = rng.random((3, 3))
        check_grads(convolve, modes=["rev"], order=2)(
            array,
            kernel,
            padding="constant",
            mode="valid",
            axes=([1], [1]),
        )
```
Not really sure I understand this test. Wouldn't we want to parametrize over axes? That's what I would expect from the naming of the test, at least. And we'd want to parametrize over array and kernel too? Why are these things parametrized in the value test but not in the gradient test?
More generally, isn't this test replicating much of TestConvolve? Wouldn't we just want to add a parametrization over axes there?
Yes, it would make sense to vary the axes/kernel parameters here.
The reason for this test is to specifically check that convolution works with explicitly specified axes. We could do a unified test with the axes parameter being None and non-None, but then we would probably just skip combinations that don't make sense for a given parameter set. Alternatively, we could use shared helpers to check the results and avoid duplicate code (see the sketch below). Not sure what is more elegant here.
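One possible shape for that unified test, as a sketch (the `assert_matches_numpy` helper is hypothetical shorthand for the NumPy reference comparison above; the skip condition is illustrative):

```python
class TestConvolve:
    @pytest.mark.parametrize("axes", [None, ([1], [1]), ([0, 1], [0, 1])])
    @pytest.mark.parametrize("mode", ["valid", "same", "full"])
    @pytest.mark.parametrize("padding", ["constant", "edge"])
    def test_convolve_val(self, rng, mode, padding, axes):
        array = rng.random((2, 5))
        kernel = rng.random((3, 3))
        if axes == ([0, 1], [0, 1]) and mode == "same":
            # illustrative skip: the NumPy reference above only pads along one axis
            pytest.skip("reference comparison not implemented for 2D 'same'")
        # hypothetical shared helper wrapping the convolve-vs-NumPy check
        assert_matches_numpy(array, kernel, mode=mode, padding=padding, axes=axes)
```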
tidy3d/plugins/autograd/functions.py (outdated diff)

```python
raise ValueError(
    f"Even-sized kernel along axis {ax_kernel} not supported for 'same' mode."
)
```
Does numpy also complain about this?
No, it silently returns a result that is shifted by half a sample (floor division), i.e. it assumes center = floor(M/2) for kernel length M, which I dislike... but I guess we want to be compatible with that?
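For illustration, the shift is easy to see with an impulse (the printed values are what NumPy's `'same'` slicing of the full convolution produces):

```python
import numpy as np

delta = np.zeros(5)
delta[2] = 1.0

# Odd-length kernel: the response stays centered on the impulse.
print(np.convolve(delta, [1.0, 2.0, 1.0], mode="same"))  # [0. 1. 2. 1. 0.]
# Even-length kernel: the response spills to one side, a half-sample shift.
print(np.convolve(delta, [1.0, 1.0], mode="same"))       # [0. 0. 1. 1. 0.]
```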
Hmm, yeah, I think it's probably fine as-is; we're not really trying to match the scipy or numpy interface here.
CHANGELOG.md (outdated diff)

```markdown
- Unified run submission API: `web.run(...)` is now a container-aware wrapper that accepts a single simulation or arbitrarily nested containers (`list`, `tuple`, `dict` values) and returns results in the same shape.
- `web.Batch(ComponentModeler)` and `web.Job(ComponentModeler)` native support
- Simulation data of batch jobs are now automatically downloaded upon their individual completion in `Batch.run()`, avoiding waiting for the entire batch to reach completion.
- Improved speed of autograd tracing for convolutions.
```
This change is not only related to autograd tracing, right?
You are right, that's really size-dependent. In this test I used square 2D inputs/kernels.
Nice! I'd say for any realistic use case this would be strictly better than before!
Force-pushed from 24ea2ac to f1454dc


Greptile Overview
Updated On: 2025-11-05 16:45:19 UTC
Greptile Summary
This PR replaces the autograd scipy convolution implementation with a custom FFT-based convolution to improve tracing performance. The main changes include:
- Removed `autograd.scipy.signal.convolve` and replaced it with a custom `_fft_convolve_general` function using FFT convolution
- Added a `_normalize_axes` helper function to handle axis specification and validation

Issues found:

- `_normalize_axes` (lines 74-82): bounds checking occurs after modulo normalization, so the checks will never trigger

Confidence Score: 3/5

- `tidy3d/plugins/autograd/functions.py`: the bounds checking logic in `_normalize_axes` should be fixed or removed

Important Files Changed

File Analysis

- `tidy3d/plugins/autograd/functions.py`: adds `_normalize_axes` and `_fft_convolve_general`; contains dead code in bounds checking (lines 74-82)
- Test file: adds `TestConvolveAxes` with parametrized tests for convolution with explicit axes, including validation against numpy and gradient checks
```mermaid
sequenceDiagram
    participant User
    participant convolve
    participant _normalize_axes
    participant pad
    participant _fft_convolve_general
    participant fftn/ifftn

    User->>convolve: array, kernel, axes, mode, padding
    convolve->>convolve: validate kernel dimensions (must be odd)
    convolve->>_normalize_axes: array.ndim, kernel.ndim, axes
    _normalize_axes->>_normalize_axes: normalize axes with modulo
    _normalize_axes->>_normalize_axes: validate axes (length match, uniqueness)
    _normalize_axes-->>convolve: axes_array, axes_kernel
    alt mode is "same" or "full"
        loop for each axis pair
            convolve->>pad: working_array, pad_width, padding mode
            pad-->>convolve: padded array
        end
        convolve->>convolve: set effective_mode ("valid" for "same", "full" for "full")
    end
    convolve->>_fft_convolve_general: working_array, kernel, axes, effective_mode
    _fft_convolve_general->>_fft_convolve_general: reorder axes (batch vs conv)
    _fft_convolve_general->>_fft_convolve_general: expand dimensions for broadcasting
    _fft_convolve_general->>fftn/ifftn: compute FFT convolution
    fftn/ifftn-->>_fft_convolve_general: full convolution result
    alt effective_mode is "valid"
        _fft_convolve_general->>_fft_convolve_general: slice valid region
    end
    _fft_convolve_general->>_fft_convolve_general: convert to real if inputs were real
    _fft_convolve_general-->>convolve: result
    convolve-->>User: convolved array
```
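To tie the diagram to the API, a usage sketch (the import path follows the module shown above; the output shape follows the expected array constructed in `test_convolve_axes_val`):

```python
import numpy as np
from tidy3d.plugins.autograd.functions import convolve

array = np.random.rand(2, 5)   # batch of 2 one-dimensional signals
kernel = np.random.rand(3, 3)  # batch of 3 odd-length kernels

# Convolve along axis 1 of both; the leading axes act as batch dimensions.
out = convolve(array, kernel, padding="constant", mode="same", axes=([1], [1]))
print(out.shape)  # (2, 3, 5): one "same"-length result per array/kernel pair
```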