private_gpt/settings/settings.py
Lines changed: 4 additions & 0 deletions
@@ -241,6 +241,10 @@ class OllamaSettings(BaseModel):
         1.1,
         description="Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)",
     )
+    request_timeout: float = Field(
+        120.0,
+        description="Time elapsed until ollama times out the request. Default is 120s. Format is float. ",
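For context, here is a minimal sketch of how a float timeout setting with a 120-second default behaves. It uses a standard-library dataclass as a stand-in for the Pydantic `BaseModel`/`Field` pair in the diff. The class name `OllamaTimeoutSketch` and the rule that rejects non-positive values are assumptions for illustration, not part of this change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OllamaTimeoutSketch:
    """Hypothetical stand-in for the request_timeout field on OllamaSettings."""

    # Same default as the diff: 120 seconds, stored as a float.
    request_timeout: float = 120.0

    def __post_init__(self) -> None:
        # Coerce ints such as 300 to float so the stored type is consistent.
        # object.__setattr__ is needed because the dataclass is frozen.
        object.__setattr__(self, "request_timeout", float(self.request_timeout))
        # Assumption: a zero or negative timeout is treated as a config error.
        if self.request_timeout <= 0:
            raise ValueError("request_timeout must be a positive number of seconds")


# Default value, and an override such as one might set for a slow model.
default_settings = OllamaTimeoutSketch()
slow_model_settings = OllamaTimeoutSketch(request_timeout=300)
```

In the real settings class, Pydantic would presumably handle the type coercion itself. The explicit `float(...)` call here only mimics that behaviour.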
settings-ollama.yaml
Lines changed: 6 additions & 5 deletions
@@ -14,11 +14,12 @@ ollama:
   llm_model: mistral
   embedding_model: nomic-embed-text
   api_base: http://localhost:11434
-  tfs_z: 1.0  # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
-  top_k: 40  # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
-  top_p: 0.9  # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
-  repeat_last_n: 64  # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
-  repeat_penalty: 1.2  # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
+  tfs_z: 1.0  # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
+  top_k: 40  # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
+  top_p: 0.9  # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
+  repeat_last_n: 64  # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
+  repeat_penalty: 1.2  # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
+  request_timeout: 120.0  # Time elapsed until ollama times out the request. Default is 120s. Format is float.
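A rough sketch of where a value like `request_timeout` typically ends up: as the timeout on the HTTP call to the Ollama server at `api_base`. It uses only the Python standard library, not whatever client the project actually wires the setting into. The helper names `build_generate_request` and `send` are hypothetical, and the `/api/generate` path and payload shape are assumed from Ollama's REST API.

```python
import json
import urllib.request


def build_generate_request(api_base: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an assumed Ollama-style /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, request_timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON reply.

    The timeout is in seconds and applies to blocking socket operations
    (connecting, each read), not to the call as a whole.
    """
    with urllib.request.urlopen(request, timeout=request_timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Values mirror settings-ollama.yaml; nothing is sent until send() is called.
request = build_generate_request("http://localhost:11434", "mistral", "Hello")
```

Calling `send(request, request_timeout=300.0)` would give a slow response more room before the call fails. Whether that is needed depends on the model and hardware.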