
Conversation

@mattt (Owner) commented Nov 3, 2025

Fixes #16

@mattt requested a review from Copilot on November 3, 2025 at 12:01

Copilot AI left a comment

Pull Request Overview

This PR changes temperature handling in the Llama language model implementation so that temperature is applied only for non-greedy sampling modes, and adds a test covering greedy sampling with a temperature parameter.

  • Removed the unconditional temperature application that ran before the sampling mode switch
  • Added conditional temperature application for the .topK and .nucleus sampling modes only (see the sketch after this list)
  • Added a test case for greedy sampling with a temperature parameter
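
To make the shape of the change concrete, here is a minimal sketch of a sampler chain built against the llama.cpp C API, with temperature added only in the non-greedy branches. The `SamplingMode` enum and `makeSampler` function are illustrative assumptions, not the PR's actual code:

```swift
// Minimal sketch (not the PR's exact code): temperature is added to the
// llama.cpp sampler chain only for non-greedy sampling modes.
// (The llama.cpp bindings are assumed to be imported; the module name
// depends on the package setup.)

enum SamplingMode {
    case greedy
    case topK(Int32)
    case nucleus(Float)
}

func makeSampler(mode: SamplingMode, temperature: Float?) -> OpaquePointer? {
    let sampler = llama_sampler_chain_init(llama_sampler_chain_default_params())

    switch mode {
    case .greedy:
        // No temperature sampler in this branch; the unconditional
        // temperature application this PR removes is what triggered
        // the GGML assertion reported in #16.
        llama_sampler_chain_add(sampler, llama_sampler_init_top_k(1))
        llama_sampler_chain_add(sampler, llama_sampler_init_top_p(1.0, 1))
    case .topK(let k):
        if let temperature {
            llama_sampler_chain_add(sampler, llama_sampler_init_temp(temperature))
        }
        llama_sampler_chain_add(sampler, llama_sampler_init_top_k(k))
    case .nucleus(let p):
        if let temperature {
            llama_sampler_chain_add(sampler, llama_sampler_init_temp(temperature))
        }
        llama_sampler_chain_add(sampler, llama_sampler_init_top_p(p, 1))
    }

    // Final distribution sampler; 0xFFFF_FFFF is LLAMA_DEFAULT_SEED.
    llama_sampler_chain_add(sampler, llama_sampler_init_dist(0xFFFF_FFFF))
    return sampler
}
```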

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| Sources/AnyLanguageModel/Models/LlamaLanguageModel.swift | Refactored temperature handling to be conditional on sampling mode, removing it from greedy sampling |
| Tests/AnyLanguageModelTests/LlamaLanguageModelTests.swift | Added a test case for greedy sampling with a temperature parameter |
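
As a rough idea of what the new test might look like, here is a hypothetical sketch using the Swift Testing framework; the test name, model path, and the `LlamaLanguageModel` initializer are all assumptions rather than the repository's actual code:

```swift
import Testing
@testable import AnyLanguageModel

// Hypothetical regression test: greedy sampling with a temperature set
// should generate a response without tripping the GGML assertion.
@Test func greedySamplingWithTemperature() async throws {
    let model = LlamaLanguageModel(modelPath: "/path/to/model.gguf")  // assumed initializer
    let session = LanguageModelSession(model: model)
    let options = GenerationOptions(sampling: .greedy, temperature: 0.7)
    let response = try await session.respond(to: "Say hello.", options: options)
    #expect(!response.content.isEmpty)
}
```
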
Comments suppressed due to low confidence (2)

Sources/AnyLanguageModel/Models/LlamaLanguageModel.swift:316

  • Greedy sampling silently ignores the temperature parameter when provided. This creates inconsistent behavior where users can set a temperature with greedy sampling but it has no effect. Consider either: (1) explicitly validating and rejecting the combination of greedy sampling with temperature, or (2) documenting this behavior clearly in the API documentation. The OllamaLanguageModel implementation handles this by explicitly forcing temperature to 0.0 for greedy sampling.
                case .greedy:
                    llama_sampler_chain_add(sampler, llama_sampler_init_top_k(1))
                    llama_sampler_chain_add(sampler, llama_sampler_init_top_p(1.0, 1))
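
For illustration, the reviewer's option (1) might look something like the following guard; the error type and message are hypothetical, and the comment notes the alternative that OllamaLanguageModel reportedly takes:

```swift
// Hypothetical validation per the reviewer's option (1): reject the
// greedy + temperature combination instead of silently ignoring it.
// (The alternative, per the comment, is what OllamaLanguageModel does:
// force the effective temperature to 0.0 when sampling greedily.)
if case .greedy = sampling, let temperature, temperature != 0.0 {
    throw LlamaLanguageModelError.invalidGenerationOptions(  // assumed error type
        "temperature has no effect with greedy sampling; omit it or set it to 0.0"
    )
}
```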

Sources/AnyLanguageModel/Models/LlamaLanguageModel.swift:460

  • Same comment as above: greedy sampling silently ignores the temperature parameter at this second call site as well.
                    case .greedy:
                        llama_sampler_chain_add(sampler, llama_sampler_init_top_k(1))
                        llama_sampler_chain_add(sampler, llama_sampler_init_top_p(1.0, 1))


@mattt merged commit e65c288 into main on Nov 3, 2025. 9 checks passed.
@mattt deleted the mattt/greedy-temperature branch on November 3, 2025 at 12:04.

Development

Successfully merging this pull request may close these issues:

GGML assertion fails when adding sampling: .greedy in GenerationOptions for LlamaLanguageModel
