Skip to content

Eval bug: gpt-oss reasoning_effort does nothing #15130

@createthis

Description

@createthis

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA RTX PRO 6000 Blackwell Workstation Edition, compute capability 12.0, VMM: yes
version: 6097 (9515c61)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

dual EPYC 9355, blackwell 6000 pro

Models

gpt-oss-120b-F16.gguf

Problem description & steps to reproduce

reasoning_effort seems to have no effect.

Startup command:

./build/bin/llama-server \
    --model /data/gpt-oss-120b-GGUF/gpt-oss-120b-F16.gguf \
    --alias gpt-oss-120b-F16 \
    --no-webui \
    --numa numactl \
    --threads 32 \
    --ctx-size 131072 \
    --n-gpu-layers 37 \
    -ot "exps.*\.blk.*\.ffn_.*=CUDA0" \
    --no-op-offload \
    -ub 4096 -b 4096 \
    --seed 3407 \
    --temp 0.6 \
    --top-p 1.0 \
    --log-colors \
    --flash-attn \
    --host 0.0.0.0 \
    --jinja \
    --chat-template-kwargs '{"reasoning_effort": "high"}' \
    --port 11434

First Bad Commit

No response

Relevant log output

output says:


Knowledge cutoff: 2024-06
Current date: 2025-08-06

Reasoning: medium

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions