@lucascolley
Contributor

@lucascolley lucascolley commented Apr 3, 2025

closes gh-17

Left to-do here:

I think only the last item is blocking for this PR, the rest could be left for a follow-up.

@lucascolley lucascolley marked this pull request as ready for review April 3, 2025 12:15
@lucascolley lucascolley changed the title from "torch windows" to "scipy: torch windows" Apr 3, 2025
@lucascolley
Contributor Author

scipy\meson.build:276:9: ERROR: Dependency "blas" not found, tried pkgconfig

@rgommers
Owner

rgommers commented Apr 4, 2025

Ah this will require more changes I think. I had Windows working with Flang a while back, but I only occasionally open my Windows dev setup.

@lucascolley
Contributor Author

lucascolley commented Apr 14, 2025

progress update:

@lucascolley lucascolley changed the title from "scipy: torch windows" to "scipy: test windows support" Apr 27, 2025
@rgommers
Owner

Yes, we do need clang-cl indeed (and flang). Switching to netlib BLAS will be fine I think, although OpenBLAS should work as well.

Windows is a bit time-consuming to get to work. I did pretty much have it working a while back. The clang-cl error sounds a bit mysterious, it should work if MSVC is installed. Perhaps to avoid this hurdle, building with -Duse-pythran=false will help.

@lucascolley
Contributor Author

Yes, the error did not seem fundamental; it will be possible to replicate the conda-forge setup somehow.

I am without a Windows box until July now, so this may have to wait a while longer.

@lucascolley lucascolley changed the title from "scipy: test windows support" to "scipy: test windows (and macOS) support" May 5, 2025
@lucascolley
Contributor Author

Left to-do here:

  • figure out why openblas doesn't work with --setup-args=-Dblas=blas --setup-args=-Dlapack=lapack on Windows (meson doesn't find cblas)
  • figure out how to remove the need for cp ../.pixi/envs/default/Library/bin/openblas.dll build-install/Lib/site-packages/scipy/linalg/openblas.dll (we want the DLL accessible from envs other than the default env)
  • get -editable and -nogil builds working on Windows
  • figure out how to resolve jax-cpu being in the array-api env, yet not supporting Windows

I think only the last item is blocking for this PR, the rest could be left for a follow-up.

@rgommers this is ready for review.

@rgommers
Owner

rgommers commented Jul 9, 2025

I tried lucascolley/scipy@b41bab6 — is that what you had in mind?

os.add_dll_directory() should get an absolute path as an argument for it to do anything.

I think only the last item is blocking for this PR, the rest could be left for a follow-up.

Agreed. I can look into the openblas.dll issue, it feels like a pixi bug to me at first sight but it could also be related to the current multi-env setup relying on RPATHs somehow.

@lucascolley
Contributor Author

Since we are using the dll outside of the build env, it seems like we either need to

  1. make the dll available outside of the build env
  2. make the other envs aware of the build env

This PR does (1) by using a cp from the build env. Maybe there should be a way to tell Pixi to handle this? I'm not sure though.

(2) could maybe be handled by os.add_dll_directory() or similar.
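For illustration, option (1) amounts to something like the following Python equivalent of the `cp` shim. This is only a sketch: the helper name `copy_dll` and its parameters are hypothetical, and the real PR uses a plain `cp` from the build env rather than this function.

```python
import shutil
from pathlib import Path


def copy_dll(build_env_bin: Path, package_dir: Path, name: str = "openblas.dll") -> Path:
    """Copy `name` from the build env's bin directory into `package_dir`.

    Hypothetical helper mirroring the `cp` workaround; not part of SciPy or Pixi.
    """
    src = build_env_bin / name
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found in the build env")
    package_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 copies the file with its metadata and returns the destination
    return Path(shutil.copy2(src, package_dir / name))
```

With the paths from the to-do list, this would be called as `copy_dll(Path("../.pixi/envs/default/Library/bin"), Path("build-install/Lib/site-packages/scipy/linalg"))`.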

@rgommers
Owner

rgommers commented Jul 9, 2025

  • figure out why openblas doesn't work with --setup-args=-Dblas=blas --setup-args=-Dlapack=lapack on Windows (meson doesn't find cblas)

This looks fine:

$ pixi ls blas --platform win-64
Package      Version  Build                 Size       Kind   Source
blas-devel   3.9.0    31_hc0f8095_openblas  17.3 KiB   conda  blas-devel
libblas      3.9.0    31_h11dc60a_openblas  3.8 MiB    conda  libblas
libcblas     3.9.0    31_h9bd4c3b_openblas  3.8 MiB    conda  libcblas
libopenblas  0.3.29   pthreads_head3c61_0   3.8 MiB    conda  libopenblas
openblas     0.3.29   pthreads_h4a7f399_0   258.1 KiB  conda  openblas

I can't see anything wrong yet, guess I need to open a Windows machine to be helpful.

(2) could maybe be handled by os.add_dll_directory() or similar.

(2) is the way to do it, (1) is a hacky workaround. This should be done automatically normally; IIRC it's the python package that adds the environment's libdir to its search path. Which would explain the problem: if python is in a non-default env, then the build env's libdir goes missing. So yeah, os.add_dll_directory(path_to_builddir/lib) seems like the thing to do.
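A minimal sketch of option (2), assuming the build env's library directory is known at import time. The helper name `add_build_env_dll_dir` is hypothetical, and where such a call would live (for example in a test runner or a `sitecustomize`) is not decided in this thread. The path is resolved first because `os.add_dll_directory()` needs an absolute path.

```python
import os
import sys
from pathlib import Path


def add_build_env_dll_dir(libdir):
    """Register `libdir` as a DLL search directory on Windows.

    Returns the handle from os.add_dll_directory(), or None when not on
    Windows or when the directory does not exist. Hypothetical helper.
    """
    path = Path(libdir).resolve()  # make the path absolute before registering it
    if sys.platform != "win32" or not path.is_dir():
        return None
    return os.add_dll_directory(str(path))
```

The returned handle has a `close()` method that removes the directory from the search path again, so the result can be kept around if the registration should be temporary.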

@lucascolley
Contributor Author

I can't see anything wrong yet, guess I need to open a Windows machine to be helpful.

I discussed this for a bit with Axel here. It may be related to the fact that only the netlib variant ships a cblas.pc on conda-forge.

@rgommers
Owner

rgommers commented Jul 9, 2025

It may be related to the fact that only the netlib variant ships a cblas.pc on conda-forge.

Yeah I know, that's normal. That shouldn't cause any failures; if you look for cblas in scipy/meson.build you will see that only for Netlib BLAS we're trying to detect cblas.

@lucascolley
Contributor Author

lucascolley commented Jul 9, 2025

you will see that only for Netlib BLAS we're trying to detect cblas

Hmm, I don't understand. Isn't https://github.com/scipy/scipy/blob/main/scipy/meson.build#L284-L290 exactly what we hit when we pass -Dblas=blas?

To clarify, this is about building against openblas (not netlib) when we have an openblas variant of blas in the env, by passing -Dblas=blas (this works on unix and IIUC is what we do by default in pixi run build above).

@rgommers
Owner

rgommers commented Jul 9, 2025

Oh right, sorry I can't read - we need the _netlib variants instead of the _openblas ones by default to build. But that's going to interfere with having openblas by default at test time.

On Linux this works:

$ pixi shell
$ pkg-config --cflags cblas
-I/home/rgommers/code/pixi-dev-scipystack/scipy/.pixi/envs/default/include

On Windows it doesn't. This is a weird asymmetry, you can see it in the conda-metadata browser:

There's something off on Linux as well though:

$ pkg-config --libs cblas
-L/home/rgommers/code/pixi-dev-scipystack/scipy/.pixi/envs/default/lib -lopenblas
(scipy) (base) ~/code/pixi-dev-scipystack/scipy/scipy (main)$ pkg-config --libs blas
-L/home/rgommers/code/pixi-dev-scipystack/scipy/.pixi/envs/default/lib -lopenblas
(scipy) (base) ~/code/pixi-dev-scipystack/scipy/scipy (main)$ pkg-config --libs cblas
-L/home/rgommers/code/pixi-dev-scipystack/scipy/.pixi/envs/default/lib -lopenblas

So we're still linking against openblas directly despite doing -Csetup-args=-Dblas=blas, which is wrong. This BLAS switching doesn't work at all unfortunately; it only works when the libblas implementation is switched out in the same environment.

For Windows support let's go with -Dblas=openblas then; the rest needs a larger overhaul.
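The check done by eye on the `pkg-config --libs` output above can be sketched in Python as follows. The function name `linked_libraries` is hypothetical; it only parses a string and does not invoke `pkg-config` itself.

```python
import shlex


def linked_libraries(pkg_config_libs: str) -> list[str]:
    """Return the library names (the -l flags) from `pkg-config --libs` output."""
    return [tok[2:] for tok in shlex.split(pkg_config_libs) if tok.startswith("-l")]


# Output of `pkg-config --libs blas` quoted in the comment above
libs = linked_libraries(
    "-L/home/rgommers/code/pixi-dev-scipystack/scipy/.pixi/envs/default/lib -lopenblas"
)
print(libs)  # ['openblas']
```

Seeing `openblas` here for the generic `blas` and `cblas` names is what shows that the build links against OpenBLAS directly.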

@lucascolley
Contributor Author

lucascolley commented Jul 9, 2025

thanks for clearing that up! I was rather puzzled at the whole blas-switching situation, but I think it makes sense now. I thought that building against openblas on unix was intentional, while you thought it wasn't happening.

So we're still linking against openblas directly despite doing -Csetup-args=-Dblas=blas, which is wrong.

So, in effect, we may as well be passing -Dblas=openblas for all platforms? (Not that that helps simplify things while we have the cp shim for openblas.dll).

@rgommers
Owner

rgommers commented Jul 9, 2025

So, in effect, we may as well be passing -Dblas=openblas for all platforms?

Yes, that's fine in this PR.

@rgommers
Owner

@lucascolley does the whole test suite pass for you locally? The CI job only runs linalg tests and they work, but running pixi r test just resulted in a crash for me locally that took the OS 10 minutes to recover from.

@lucascolley
Contributor Author

😅 that sounds bad... I can try it out later today.

What kind of crash? Possibly related to building against OpenBLAS instead of Netlib BLAS, or not?

@rgommers
Owner

Don't know yet - the terminal disappeared completely, so now trying module by module to figure out where it is unhappy.

@lucascolley
Contributor Author

no terminal crash for me on pixi run test on my Windows box!

However, the final test summary did not print, instead pytest crashed with this after reaching 100%:

Traceback (most recent call last):
  File "E:\dev\pixi-dev-scipystack\scipy\.pixi\envs\default\Lib\pathlib.py", line 1311, in mkdir
    os.mkdir(self, mode)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'E:\\dev\\pixi-dev-scipystack\\scipy\\scipy\\.pytest_cache\\v\\cache'
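The traceback does not say why the cache directory already existed, so the following is only a generic illustration of the failure mode, not a diagnosis of pytest. A plain `Path.mkdir()` raises `FileExistsError` on an existing directory, while `exist_ok=True` tolerates it. The helper name `ensure_dir` is hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_dir(path: Path) -> Path:
    """Create `path` if needed; tolerate it already existing as a directory."""
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / ".pytest_cache" / "v" / "cache"
    ensure_dir(cache)
    ensure_dir(cache)  # second call is a no-op rather than an error
    try:
        cache.mkdir()  # plain mkdir on an existing directory raises
    except FileExistsError:
        print("plain mkdir raised FileExistsError")
```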

@rgommers
Owner

After scipy/scipy#23313 the build log looks pretty decent.

I can't quite track down the crashes, but they may have to do with:

lld-link: warning: ignoring unknown argument '-Wl,-defaultlib:D:/pixi-dev/scipy/.pixi/envs/default/lib/clang/19/lib/windows/clang_rt.builtins-x86_64.lib'

That comes from the conda-forge activation script:

(scipy) PS D:\pixi-dev\scipy\scipy> $env:LD
lld-link.exe
(scipy) PS D:\pixi-dev\scipy\scipy> $env:LDFLAGS
 -Wl,-defaultlib:D:/pixi-dev/scipy/.pixi/envs/default/lib/clang/19/lib/windows/clang_rt.builtins-x86_64.lib

not sure what's wrong with it.
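One possible reading, not confirmed in this thread: `-Wl,` is the compiler-driver convention for forwarding an argument to the linker, so a linker that receives the flag directly (here `lld-link`) may not recognise the prefix. The sketch below only separates such wrapped flags from the rest of an `LDFLAGS` string so they can be inspected; the function name `split_driver_flags` is hypothetical.

```python
def split_driver_flags(ldflags: str) -> tuple[list[str], list[str]]:
    """Split LDFLAGS into (direct flags, flags wrapped in the -Wl, driver prefix)."""
    direct, wrapped = [], []
    for flag in ldflags.split():
        (wrapped if flag.startswith("-Wl,") else direct).append(flag)
    return direct, wrapped


# Value of $env:LDFLAGS shown in the comment above
ldflags = " -Wl,-defaultlib:D:/pixi-dev/scipy/.pixi/envs/default/lib/clang/19/lib/windows/clang_rt.builtins-x86_64.lib"
direct, wrapped = split_driver_flags(ldflags)
print(len(direct), len(wrapped))  # 0 1
```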

@rgommers
Owner

rgommers commented Jul 10, 2025

Okay, I identified at least one problematic test that is hanging or crashing:

scipy\optimize\tests\test_least_squares.py::TestLM::test_workers

@rgommers
Owner

Just started testing PyTorch, the build seems broken:

$ pixi r test-torch -s fft
_________________________________ ERROR collecting build-install/Lib/site-packages/scipy/fft/tests/test_helper.py __________________________________
ImportError while importing test module 'D:\pixi-dev\scipy\scipy\build-install\Lib\site-packages\scipy\fft\tests\test_helper.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
..\..\..\..\.pixi\envs\torch\Lib\importlib\__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
scipy\fft\__init__.py:91: in <module>
    from ._fftlog import fht, ifht, fhtoffset
scipy\fft\_fftlog.py:10: in <module>
    from ._fftlog_backend import fhtoffset
scipy\fft\_fftlog_backend.py:4: in <module>
    from ..special import loggamma, poch
scipy\special\__init__.py:785: in <module>
    from . import _ufuncs
E   ImportError: DLL load failed while importing _ufuncs: The specified module could not be found.

@lucascolley
Contributor Author

lucascolley commented Jul 10, 2025

Right, this is the same story as openblas.dll (I assume)

Owner

@rgommers rgommers left a comment

Thanks Lucas, did a bunch more testing on Windows and macOS, and it looks pretty happy - let's get this in.

@rgommers rgommers merged commit 68295d6 into rgommers:main Jul 11, 2025
8 checks passed
@rgommers
Owner

Right, this is the same story as openblas.dll (I assume)

Yep, it was

Development

Successfully merging this pull request may close these issues.

PyTorch windows!

3 participants