PY_SSIZE_T_CLEAN error when loading Triton-generated extension on Windows (Python 3.12.9, VS2022, PyTorch 2.8+cu129, triton-windows 3.4) #163

@happye

Description

Environment
OS: Windows 11, x64.
GPU: 4080s, Driver Version: 581.29, CUDA Version: 13.0
ComfyUI embedded Python: 3.12.9 at G:\Tools\ComfyUI_wan2.1\python.
PyTorch: 2.8+cu129.
triton-windows: 3.4.
CUDA: Toolkit 12.4.
MSVC: Visual Studio 2022 (x64 Native Tools Command Prompt used).
vcredist already installed.

Reproduction steps
Start VS2022 x64 Native Tools Command Prompt.

Ensure embedded Python is used: G:\Tools\ComfyUI_wan2.1>.\python\python.exe -m pip show torch triton-windows.

Run the README test script test_triton.py (exact code from the triton-windows README) using the embedded Python: .\python\python.exe test_triton.py

Observed result (critical log)
The compilation phases create the cuda_utils and __triton_launcher libraries, but loading them fails with a traceback ending in: SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats

A sysconfig check printed Python 3.12.9 and returned Py_ENABLE_SHARED=None and LIBDIR=None for the embedded Python.
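The exact check script is not included in this report. A minimal sketch of an equivalent check, run with the embedded interpreter, could look like the following. The helper name `collect_build_info` is hypothetical, and INCLUDEPY and the include path are extra values added here for illustration. Py_ENABLE_SHARED and LIBDIR are commonly None on Windows builds of CPython, so that result may not by itself indicate a problem.

```python
import sys
import sysconfig


def collect_build_info():
    """Gather interpreter details that matter when compiling C extensions."""
    names = ("Py_ENABLE_SHARED", "LIBDIR", "INCLUDEPY")
    # get_config_var returns None for any variable the build does not define.
    info = {name: sysconfig.get_config_var(name) for name in names}
    info["version"] = sys.version.split()[0]
    info["executable"] = sys.executable
    info["include_path"] = sysconfig.get_paths()["include"]
    return info


if __name__ == "__main__":
    for key, value in collect_build_info().items():
        print(f"{key} = {value}")
```

Running it as .\python\python.exe followed by the script name confirms which interpreter and which header directory are actually in use.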

What I tried

Deleted caches: %USERPROFILE%\.triton\cache and %LOCALAPPDATA%\Temp\torchinductor_, and removed temp files with __triton_launcher / cuda_utils prefixes.

Uninstalled VS2019 and installed VS2022; ran tests from VS2022 dev prompt to ensure cl/link come from VS2022.

Cleared leftover env vars (VCINSTALLDIR, VCToolsVersion, WindowsSdkDir, WindowsSDKVersion) in-session and invoked vcvars64.bat.

Verified include/libs placed into embedded Python per README guidance (include and libs for Python 3.12 series).

Reinstalled triton-windows in the same session and repeated test.

Confirmed that compilation still runs but loading fails with the PY_SSIZE_T_CLEAN error, despite the full cleanup and the VS2022 environment.
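For repeatability, the cache cleanup from the steps above can be sketched as a small script. This is an illustration and not part of triton-windows: the helper name `clear_triton_caches` is hypothetical, the torchinductor directory suffix is matched with a wildcard because the full name is not given here, and the system temp directory is assumed to be the same %LOCALAPPDATA%\Temp folder. It does not remove the loose temp files with __triton_launcher / cuda_utils prefixes.

```python
import shutil
import tempfile
from pathlib import Path


def clear_triton_caches(home=None, temp_dir=None, dry_run=True):
    """Find (and optionally delete) Triton and TorchInductor cache directories.

    Assumed locations, based on the paths quoted in this report:
      <home>/.triton/cache
      <temp_dir>/torchinductor_*   (suffix matched with a wildcard)
    Returns the list of directories that were found.
    """
    home = Path(home) if home else Path.home()
    temp_dir = Path(temp_dir) if temp_dir else Path(tempfile.gettempdir())

    targets = [home / ".triton" / "cache"]
    targets += sorted(temp_dir.glob("torchinductor_*"))

    found = [path for path in targets if path.is_dir()]
    if not dry_run:
        for path in found:
            shutil.rmtree(path, ignore_errors=True)
    return found


if __name__ == "__main__":
    # Dry run by default: only list what would be removed.
    for path in clear_triton_caches():
        print("would remove:", path)
```

Calling `clear_triton_caches(dry_run=False)` performs the deletion; listing first makes it easier to confirm that only cache directories are affected.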

From the project README, I followed or considered these points: embedded Python requires copying the include and libs folders into the Python folder; the triton and torchinductor caches must be deleted after changing the compiler, Python, or Triton; and the README explicitly lists the PY_SSIZE_T_CLEAN error as a symptom, recommending cache deletion and further diagnostics.
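One further diagnostic is to confirm that the copied include and libs folders match the interpreter's minor version. The sketch below is an assumption-laden illustration, not something the README prescribes. The helper name `check_embedded_python` is hypothetical. It relies on patchlevel.h, which ships with the CPython headers and defines PY_VERSION, and it assumes the import library is named like python3*.lib.

```python
import re
import sys
from pathlib import Path


def _minor(version):
    """Reduce '3.12.9' to '3.12' so patch-level differences are ignored."""
    return ".".join(version.split(".")[:2])


def check_embedded_python(python_dir, expected_version=None):
    """Return a list of problems found in <python_dir>/include and <python_dir>/libs."""
    python_dir = Path(python_dir)
    expected = expected_version or sys.version.split()[0]
    problems = []

    if not (python_dir / "include" / "Python.h").is_file():
        problems.append("include/Python.h is missing")

    # patchlevel.h records which CPython release the headers belong to.
    patchlevel = python_dir / "include" / "patchlevel.h"
    if patchlevel.is_file():
        text = patchlevel.read_text(errors="replace")
        match = re.search(r'#define\s+PY_VERSION\s+"([^"]+)"', text)
        header_version = match.group(1) if match else "unknown"
        if _minor(header_version) != _minor(expected):
            problems.append(
                f"headers are for {header_version}, interpreter is {expected}"
            )
    else:
        problems.append("include/patchlevel.h is missing")

    # Assumed naming: python3.lib and/or python312.lib in the libs folder.
    if not list((python_dir / "libs").glob("python3*.lib")):
        problems.append("no python3*.lib import library in libs/")

    return problems


if __name__ == "__main__":
    # sys.prefix is the embedded Python folder when run with that interpreter.
    for problem in check_embedded_python(sys.prefix):
        print("problem:", problem)
```

An empty result only shows that the files are present and from the same minor series; it does not rule out other causes of the load failure.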
