179 commits
f0cf772
fix: lazy load cost_calculator.py
AlexsanderHamir Nov 18, 2025
216b08d
fix: lazy-load Prometheus
AlexsanderHamir Nov 18, 2025
e8a6a07
fix: lazy load litellm_logging
AlexsanderHamir Nov 19, 2025
fa55864
fix: lazy load utils.py imports
AlexsanderHamir Nov 19, 2025
b3b8612
fix: lazy load tiktoken and default_encoding imports
AlexsanderHamir Nov 19, 2025
13128a3
refactor: add helper functions for cached lazy imports
AlexsanderHamir Nov 19, 2025
55ca1f1
feat: lazy load HTTP handlers to reduce import-time memory cost
AlexsanderHamir Nov 19, 2025
6a8b4b6
fix: lazy load caching classes to reduce import-time memory cost
AlexsanderHamir Nov 19, 2025
726bb49
refactor: make lazy imports cleaner
AlexsanderHamir Nov 21, 2025
505c598
fix: lazy load LLMClientCache
AlexsanderHamir Nov 21, 2025
98fc291
Merge remote-tracking branch 'origin/main' into litellm_memory_import…
AlexsanderHamir Nov 22, 2025
da97d2c
fix: lazy load COHERE_EMBEDDING_INPUT_TYPES, GuardrailItem, and remov…
AlexsanderHamir Nov 22, 2025
efcc634
Lazy load litellm.types.utils imports to reduce import-time memory cost
AlexsanderHamir Nov 22, 2025
f8b80bc
Lazy load provider_list and priority_reservation_settings
AlexsanderHamir Nov 22, 2025
44df16e
Lazy load types.secret_managers.main imports
AlexsanderHamir Nov 22, 2025
b03746b
Delay client import to reduce early import memory usage
AlexsanderHamir Nov 22, 2025
f6d9136
Lazy load BytezChatConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
3f4fce4
Lazy load CustomLLM to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
eb4ed12
Lazy load AmazonConverseConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
05e1b9b
Lazy load OpenAILikeChatConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
50cd4dd
Lazy load AiohttpOpenAIChatConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
d9b8d04
Lazy load GaladrielChatConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
8ef0fbd
Lazy load GithubChatConfig, CompactifAIChatConfig, and EmpowerChatConfig
AlexsanderHamir Nov 22, 2025
1894bdb
Lazy load HuggingFaceChatConfig, OpenrouterConfig, AnthropicConfig, a…
AlexsanderHamir Nov 22, 2025
d8a8f8b
Lazy load PredibaseConfig, ReplicateConfig, and SnowflakeConfig
AlexsanderHamir Nov 22, 2025
19f6e4e
Remove duplicate DatabricksConfig import
AlexsanderHamir Nov 22, 2025
83a7823
Lazy load HuggingFaceEmbeddingConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
7e678df
Lazy load 28 additional config classes to reduce import-time memory u…
AlexsanderHamir Nov 22, 2025
50f80c5
Lazy load 10 rerank config classes to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
557d218
Add rerank configs to TYPE_CHECKING block
AlexsanderHamir Nov 22, 2025
c10fde8
Lazy load 10 more config classes (vertex, bedrock, anthropic, togethe…
AlexsanderHamir Nov 22, 2025
6b44a4f
Lazy load 6 more bedrock config classes to reduce import-time memory …
AlexsanderHamir Nov 22, 2025
af1b943
Lazy load AnthropicModelInfo to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
53a28b3
Add lazy loading handler for AnthropicModelInfo
AlexsanderHamir Nov 22, 2025
8c44ef7
Lazy load AI21Config alias to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
e57a297
Add lazy loading handler for AI21Config alias
AlexsanderHamir Nov 22, 2025
98d09fd
Lazy load PalmConfig (deprecated provider) to reduce import-time memo…
AlexsanderHamir Nov 22, 2025
67ad9ac
Add lazy loading handler for PalmConfig
AlexsanderHamir Nov 22, 2025
6333476
Add lazy loading handler for PalmConfig (fix)
AlexsanderHamir Nov 22, 2025
6897bd2
Lazy load all deprecated provider configs to reduce import-time memor…
AlexsanderHamir Nov 22, 2025
5f946d7
Add lazy loading handler for AlephAlphaConfig
AlexsanderHamir Nov 22, 2025
60c42e5
Add lazy loading handler for AlephAlphaConfig (fix)
AlexsanderHamir Nov 22, 2025
f847937
Lazy load bedrock_tool_name_mappings to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
ff8cc60
Add lazy loading handler for bedrock_tool_name_mappings
AlexsanderHamir Nov 22, 2025
8d8d6a7
Lazy load AmazonInvokeConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
8f821b9
Add AmazonInvokeConfig to TYPE_CHECKING block
AlexsanderHamir Nov 22, 2025
7b4b7f1
Lazy load MistralEmbeddingConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
a1c6b40
Add MistralEmbeddingConfig to TYPE_CHECKING block
AlexsanderHamir Nov 22, 2025
a0fc114
Lazy load OpenAITextCompletionConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
deb0d9f
Add lazy loading handler and TYPE_CHECKING for OpenAITextCompletionCo…
AlexsanderHamir Nov 22, 2025
5597104
Add lazy loading handler for OpenAITextCompletionConfig
AlexsanderHamir Nov 22, 2025
b77d130
Lazy load VoyageContextualEmbeddingConfig to reduce import-time memor…
AlexsanderHamir Nov 22, 2025
b5652c3
Add lazy loading handler and TYPE_CHECKING for VoyageContextualEmbedd…
AlexsanderHamir Nov 22, 2025
61a05a8
Add lazy loading handler for VoyageContextualEmbeddingConfig
AlexsanderHamir Nov 22, 2025
f04c4cd
Lazy load AzureOpenAIResponsesAPIConfig to reduce import-time memory …
AlexsanderHamir Nov 22, 2025
4863404
Lazy load AzureOpenAIOSeriesResponsesAPIConfig to reduce import-time …
AlexsanderHamir Nov 22, 2025
83de911
Add lazy loading handler and TYPE_CHECKING for AzureOpenAIOSeriesResp…
AlexsanderHamir Nov 22, 2025
f70f6d9
Add lazy loading handler for AzureOpenAIOSeriesResponsesAPIConfig
AlexsanderHamir Nov 22, 2025
929d47a
Lazy load OpenAIOSeriesConfig, OpenAIO1Config, and openaiOSeriesConfi…
AlexsanderHamir Nov 22, 2025
a606086
Add lazy loading handlers and TYPE_CHECKING for OpenAIOSeriesConfig
AlexsanderHamir Nov 22, 2025
08a123d
Add lazy loading handlers for OpenAIOSeriesConfig, OpenAIO1Config, an…
AlexsanderHamir Nov 22, 2025
1e9785b
Lazy load AzureOpenAIO1Config to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
2d934ed
Add lazy loading handler for AzureOpenAIO1Config
AlexsanderHamir Nov 22, 2025
4bc699f
Lazy load GradientAIConfig to reduce import-time memory usage
AlexsanderHamir Nov 22, 2025
ff198cf
Add lazy loading handler and TYPE_CHECKING for GradientAIConfig
AlexsanderHamir Nov 22, 2025
ee64008
Add lazy loading handler for GradientAIConfig
AlexsanderHamir Nov 22, 2025
2ade09a
Lazy load OpenAIGPTConfig and openAIGPTConfig to reduce import-time m…
AlexsanderHamir Nov 22, 2025
2cab36a
Add lazy loading handlers and TYPE_CHECKING for OpenAIGPTConfig
AlexsanderHamir Nov 22, 2025
2d70bfc
Add lazy loading handlers for OpenAIGPTConfig and openAIGPTConfig
AlexsanderHamir Nov 22, 2025
5fbc62a
Lazy load OpenAIGPT5Config and openAIGPT5Config to reduce import-time…
AlexsanderHamir Nov 22, 2025
303ee35
Add lazy loading handlers and TYPE_CHECKING for OpenAIGPT5Config
AlexsanderHamir Nov 22, 2025
3b50af0
Add lazy loading handlers for OpenAIGPT5Config and openAIGPT5Config
AlexsanderHamir Nov 22, 2025
f3dfa46
Lazy load OpenAIGPTAudioConfig and openAIGPTAudioConfig to reduce imp…
AlexsanderHamir Nov 22, 2025
f80de69
Add OpenAIGPTAudioConfig to TYPE_CHECKING block
AlexsanderHamir Nov 22, 2025
02d8391
Lazy load NvidiaNimConfig and nvidiaNimConfig to reduce import-time m…
AlexsanderHamir Nov 22, 2025
6b5899c
Add lazy loading handlers and TYPE_CHECKING for NvidiaNimConfig
AlexsanderHamir Nov 22, 2025
56388cd
Add lazy loading handlers for NvidiaNimConfig and nvidiaNimConfig
AlexsanderHamir Nov 22, 2025
838400e
Refactor dotprompt lazy loading into separate function
AlexsanderHamir Nov 22, 2025
6800220
Refactor logging integrations lazy loading into separate function
AlexsanderHamir Nov 22, 2025
0891eb4
Refactor type items lazy loading into separate function
AlexsanderHamir Nov 22, 2025
02ea75c
Refactor core helpers and OpenAI-like configs lazy loading into separ…
AlexsanderHamir Nov 22, 2025
34471cd
Refactor small provider chat configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
117f657
Refactor data platform configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
9b70fa7
Refactor HuggingFace configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
7ec1448
Refactor Anthropic configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
3da8a8f
Refactor Triton configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
1003c67
Refactor AI21 configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
f59b831
Refactor Ollama configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
8d05cc4
Refactor Sagemaker configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
a9b3089
Refactor Cohere chat configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
2adfea3
Refactor rerank configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
cfcd7af
Refactor Vertex AI configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
72b2ab6
Refactor Amazon Bedrock configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
1e34287
Refactor deprecated provider configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
88bc7f0
Refactor Azure Responses API configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
6997fe8
Refactor OpenAI O-Series configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
851646d
Refactor OpenAI GPT configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
a0090cd
Refactor NvidiaNim configs lazy loading into separate function
AlexsanderHamir Nov 22, 2025
21b62f4
Refactor miscellaneous transformation configs lazy loading into separ…
AlexsanderHamir Nov 22, 2025
e98c236
Move lazy import helper to separate file with fully lazy loading
AlexsanderHamir Nov 23, 2025
85306d4
Move _lazy_import_litellm_logging to separate file
AlexsanderHamir Nov 23, 2025
b9cace7
Move _lazy_import_utils to separate file
AlexsanderHamir Nov 23, 2025
bc3d487
Move _lazy_import_http_handlers to separate file
AlexsanderHamir Nov 23, 2025
ed1eee3
Move _lazy_import_caching to separate file
AlexsanderHamir Nov 23, 2025
9875905
Move _lazy_import_types_utils to separate file
AlexsanderHamir Nov 23, 2025
87b8b15
Move _lazy_import_ui_sso to separate file
AlexsanderHamir Nov 23, 2025
2dd9532
Move _lazy_import_secret_managers to separate file
AlexsanderHamir Nov 23, 2025
f63cd5f
Move _lazy_import_logging_integrations to separate file
AlexsanderHamir Nov 23, 2025
8c89ff5
Move _lazy_import_nvidia_nim_configs to separate file
AlexsanderHamir Nov 23, 2025
1385253
Move remaining lazy import helper functions to separate file
AlexsanderHamir Nov 23, 2025
6090e0b
Update all lazy import functions to use _get_litellm_globals()
AlexsanderHamir Nov 23, 2025
15c42c4
Lazy load CerebrasConfig
AlexsanderHamir Nov 23, 2025
fdba04a
Lazy load BasetenConfig, SambanovaConfig, FireworksAIConfig, SambaNov…
AlexsanderHamir Nov 23, 2025
d6be221
Lazy load FriendliaiChatConfig, XAIChatConfig, AIMLChatConfig, VolcEn…
AlexsanderHamir Nov 23, 2025
f943f82
Lazy load Azure OpenAI configs, AzureOpenAIError, HerokuChatConfig, a…
AlexsanderHamir Nov 23, 2025
284fb9e
Lazy load HostedVLLM, Llamafile, LiteLLMProxy, DeepSeek, LMStudio, Ns…
AlexsanderHamir Nov 23, 2025
cbd7ebb
Lazy load Nebius, Wandb, DashScope, Moonshot, DockerModelRunner, V0, …
AlexsanderHamir Nov 23, 2025
a83c805
Lazy load BaseFilesConfig, AllowedModelRegion, and KeyManagementSyste…
AlexsanderHamir Nov 23, 2025
540be15
Add lazy import helper for main module functions
AlexsanderHamir Nov 23, 2025
c1a8997
Remove from .main import * and add essential direct imports
AlexsanderHamir Nov 23, 2025
00041c9
Add lazy loading handler for main module functions in __getattr__
AlexsanderHamir Nov 23, 2025
b42e0ee
optimize lazy load fallback
AlexsanderHamir Nov 24, 2025
cdbd78e
Lazy load anthropic_tokenizer.json to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
b458774
Optimize lazy loading for get_llm_provider and fix circular imports
AlexsanderHamir Nov 24, 2025
047cbf4
Lazy load model_cost to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
d054657
Lazy load batches.main to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
62141b3
Lazy load DatadogLLMObsInitParams and DatadogInitParams to reduce imp…
AlexsanderHamir Nov 24, 2025
dcbd8e0
Lazy load TritonGenerateConfig and TritonInferConfig to reduce import…
AlexsanderHamir Nov 24, 2025
ef6a16f
Lazy load GeminiModelInfo to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
d2bd71d
Lazy load assistants.main to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
668e4a5
Lazy load OpenAIImageVariationConfig to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
fa00a74
Lazy load DeepgramAudioTranscriptionConfig to reduce import-time memo…
AlexsanderHamir Nov 24, 2025
8d92e83
Lazy load TopazModelInfo to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
8face9c
Lazy load TopazImageVariationConfig to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
7fa8555
Lazy load OpenAIResponsesAPIConfig to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
65aec69
Fix circular import between custom_logger and custom_batch_logger
AlexsanderHamir Nov 24, 2025
ab32d1d
Lazy load LlmProviders and PriorityReservationSettings, fix circular …
AlexsanderHamir Nov 24, 2025
32e44e6
refactor: lazy load async client cleanup registration to reduce impor…
AlexsanderHamir Nov 24, 2025
6033919
refactor: lazy load timeout decorator to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
1982f4e
refactor: lazy load VertexAITextEmbeddingConfig to reduce import-time…
AlexsanderHamir Nov 24, 2025
8f032f5
refactor: lazy load TwelveLabsMarengoEmbeddingConfig to reduce import…
AlexsanderHamir Nov 24, 2025
c31d706
refactor: lazy load NvidiaNimEmbeddingConfig to reduce import-time me…
AlexsanderHamir Nov 24, 2025
e095fd6
refactor: lazy load KeyManagementSettings to reduce import-time memor…
AlexsanderHamir Nov 24, 2025
699da8c
refactor: lazy load httpx to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
54f3d21
refactor: lazy load PromptSpec to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
9bd0a4e
refactor: lazy load Router to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
9a429af
refactor: lazy load images.main and fix circular import
AlexsanderHamir Nov 24, 2025
31964f6
refactor: lazy load videos.main to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
47d99a3
refactor: lazy load rerank_api.main to reduce import-time memory cost
AlexsanderHamir Nov 24, 2025
a7dda99
refactor: lazy load anthropic experimental, responses, and containers…
AlexsanderHamir Nov 24, 2025
e13e688
refactor: lazy load OCR, search, realtime, and fine-tuning modules
AlexsanderHamir Nov 24, 2025
e93939b
refactor: lazy load anthropic_interface module
AlexsanderHamir Nov 24, 2025
e8599d4
refactor: lazy load vector stores, passthrough, and google_genai in _…
AlexsanderHamir Nov 24, 2025
7c36b71
refactor: lazy load all LLM handlers and related imports in main.py
AlexsanderHamir Nov 24, 2025
fd1258d
fix: add CreateFileRequest to lazy loading in __init__
AlexsanderHamir Nov 24, 2025
0b3df8c
fix: add azure_chat_completions to __getattr__ in main.py
AlexsanderHamir Nov 24, 2025
271a114
refactor: lazy load openai in @client decorator to reduce import-time…
AlexsanderHamir Nov 24, 2025
423327e
refactor: remove unused _service_logger import from utils.py
AlexsanderHamir Nov 24, 2025
44ee508
refactor: lazy load audio_utils.utils to reduce import-time memory
AlexsanderHamir Nov 24, 2025
66d4a0b
refactor: remove unused litellm.llms imports to reduce import-time me…
AlexsanderHamir Nov 24, 2025
78c2a8f
refactor: lazy load CachingHandlerResponse and LLMCachingHandler
AlexsanderHamir Nov 24, 2025
1a016ef
refactor: lazy load CustomGuardrail to reduce import-time memory
AlexsanderHamir Nov 24, 2025
f547c25
refactor: lazy load CustomLogger to reduce import-time memory
AlexsanderHamir Nov 24, 2025
fee857c
fix: use lazy loader for LLMCachingHandler in async wrapper
AlexsanderHamir Nov 24, 2025
be109b0
refactor: remove unused BaseVectorStore import from utils.py
AlexsanderHamir Nov 24, 2025
90921bf
refactor: lazy load get_litellm_metadata_from_kwargs to reduce import…
AlexsanderHamir Nov 24, 2025
1452d93
refactor: lazy load CredentialAccessor to reduce import-time memory
AlexsanderHamir Nov 24, 2025
8a7f4de
refactor: lazy load exception_mapping_utils functions to reduce impor…
AlexsanderHamir Nov 24, 2025
72c7d17
fix: lazy load exception_type in main.py to fix import error
AlexsanderHamir Nov 24, 2025
dfbfb47
refactor: lazy load get_llm_provider to reduce import-time memory
AlexsanderHamir Nov 24, 2025
c82a377
fix: lazy load get_llm_provider in main.py to fix import error
AlexsanderHamir Nov 24, 2025
674eccc
refactor: lazy load get_supported_openai_params to reduce import-time…
AlexsanderHamir Nov 24, 2025
7bb735f
refactor: lazy load convert_dict_to_response functions to reduce impo…
AlexsanderHamir Nov 24, 2025
1503b0d
refactor: lazy load get_api_base to reduce import-time memory
AlexsanderHamir Nov 24, 2025
ea622ce
refactor: lazy load llm_response_utils and redact_messages functions …
AlexsanderHamir Nov 24, 2025
2b40b56
fix: move TYPE_CHECKING block after typing import to fix NameError
AlexsanderHamir Nov 24, 2025
68d0902
refactor: lazy load CustomStreamWrapper to reduce import-time memory
AlexsanderHamir Nov 24, 2025
90ceb20
refactor: lazy load BaseGoogleGenAIGenerateContentConfig to reduce im…
AlexsanderHamir Nov 24, 2025
ea248c8
refactor: lazy load BaseOCRConfig to reduce import-time memory
AlexsanderHamir Nov 24, 2025
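Most of the commits above apply the same two mechanics in `litellm/__init__.py`: symbols move into a `TYPE_CHECKING` block so type checkers still see them, and a module-level `__getattr__` (PEP 562) resolves and caches them on first access. A minimal, self-contained sketch of that pattern — `decimal.Decimal` stands in for a heavy config class, and the throwaway `lazy_demo` module stands in for `litellm/__init__.py` (the real name-to-module table lives in `litellm/_lazy_imports.py`):

```python
import importlib
import sys
import types
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type checkers see the real symbol; nothing is imported at runtime.
    from decimal import Decimal

# In litellm this code lives in the package __init__.py; here we build a
# throwaway module object so the sketch is runnable anywhere.
mod = types.ModuleType("lazy_demo")

# name -> (module, attribute); the real table maps config classes to modules
mod._LAZY_ATTRS = {"Decimal": ("decimal", "Decimal")}

def _module_getattr(name):
    """PEP 562 __getattr__: runs only when `name` is not in the module dict."""
    try:
        module_name, attr = mod._LAZY_ATTRS[name]
    except KeyError:
        raise AttributeError(name)
    value = getattr(importlib.import_module(module_name), attr)
    setattr(mod, name, value)  # cache: later accesses bypass __getattr__
    return value

mod.__getattr__ = _module_getattr  # PEP 562 honors this on module objects
sys.modules["lazy_demo"] = mod

# First attribute access triggers the import; until then decimal stays unloaded.
print(mod.Decimal("1.5"))  # -> 1.5
```

After the first access the resolved class sits in the module dict, so subsequent lookups never hit `__getattr__` again — which is why the "cached lazy imports" helper commits matter for hot paths.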
1,563 changes: 1,118 additions & 445 deletions litellm/__init__.py

Large diffs are not rendered by default.

1,412 changes: 1,412 additions & 0 deletions litellm/_lazy_imports.py

Large diffs are not rendered by default.

11 changes: 9 additions & 2 deletions litellm/images/main.py
@@ -6,11 +6,16 @@
 import httpx
 
 import litellm
-from litellm import Logging, client, exception_type, get_litellm_params
+from litellm.utils import exception_type, get_litellm_params
+# client is imported from litellm as it's a decorator
+from litellm import client
 from litellm.constants import DEFAULT_IMAGE_ENDPOINT_MODEL
 from litellm.constants import request_timeout as DEFAULT_REQUEST_TIMEOUT
 from litellm.exceptions import LiteLLMUnknownProvider
-from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
+# Logging is imported lazily when needed to avoid loading litellm_logging at import time
+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+    from litellm.litellm_core_utils.litellm_logging import Logging, Logging as LiteLLMLoggingObj
 from litellm.litellm_core_utils.mock_functions import mock_image_generation
 from litellm.llms.base_llm import BaseImageEditConfig, BaseImageGenerationConfig
 from litellm.llms.custom_httpx.http_handler import AsyncHTTPHandler, HTTPHandler
@@ -263,6 +268,8 @@ def image_generation(  # noqa: PLR0915
 
     litellm_params_dict = get_litellm_params(**kwargs)
 
+    # Import Logging lazily only when needed
+    from litellm.litellm_core_utils.litellm_logging import Logging
     logging: Logging = litellm_logging_obj
     logging.update_environment_variables(
         model=model,
2 changes: 2 additions & 0 deletions litellm/integrations/custom_batch_logger.py
@@ -10,6 +10,8 @@
 
 import litellm
 from litellm._logging import verbose_logger
+# Import CustomLogger lazily to break circular dependency:
+# custom_logger -> caching.caching -> gcs_cache -> gcs_bucket_base -> custom_batch_logger -> custom_logger
 from litellm.integrations.custom_logger import CustomLogger
 
 
9 changes: 7 additions & 2 deletions litellm/integrations/custom_logger.py
@@ -16,7 +16,12 @@
 from pydantic import BaseModel
 
 from litellm._logging import verbose_logger
-from litellm.caching.caching import DualCache
+# Lazy import DualCache to break circular dependency:
+# custom_logger -> caching.caching -> gcs_cache -> gcs_bucket_base -> custom_batch_logger -> custom_logger
+if TYPE_CHECKING:
+    from litellm.caching.caching import DualCache
+else:
+    DualCache = Any  # Will be imported lazily when needed
 from litellm.constants import DEFAULT_MAX_RECURSE_DEPTH_SENSITIVE_DATA_MASKER
 from litellm.types.integrations.argilla import ArgillaItem
 from litellm.types.llms.openai import AllMessageValues, ChatCompletionRequest
@@ -289,7 +294,7 @@ async def async_dataset_hook(
     async def async_pre_call_hook(
         self,
         user_api_key_dict: UserAPIKeyAuth,
-        cache: DualCache,
+        cache: "DualCache",  # Use string annotation to avoid import at module level
         data: dict,
         call_type: CallTypesLiteral,
     ) -> Optional[
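The `custom_logger.py` change above is the standard recipe for breaking an import cycle: the real import runs only under `TYPE_CHECKING`, a runtime placeholder keeps the name resolvable, and the quoted annotation is never evaluated at import time. A self-contained sketch of the same recipe — `DualCache` here is only the placeholder name, so no litellm import is needed at runtime:

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only type checkers execute this branch, so the runtime cycle
    # custom_logger -> caching -> ... -> custom_logger never forms.
    from litellm.caching.caching import DualCache
else:
    DualCache = Any  # runtime placeholder; real class imported where needed

class CustomLogger:
    # The quoted annotation stays a plain string at runtime,
    # so it never forces the import either.
    async def async_pre_call_hook(self, cache: "DualCache", data: dict) -> dict:
        return data

print(CustomLogger.async_pre_call_hook.__annotations__["cache"])  # -> DualCache
```

Callers that actually need the cache object import it inside the function body, paying the cost only on first use.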
10 changes: 9 additions & 1 deletion litellm/integrations/prometheus.py
@@ -24,7 +24,6 @@
 from litellm.types.integrations.prometheus import *
 from litellm.types.integrations.prometheus import _sanitize_prometheus_label_name
 from litellm.types.utils import StandardLoggingPayload
-from litellm.utils import get_end_user_id_for_cost_tracking
 
 if TYPE_CHECKING:
     from apscheduler.schedulers.asyncio import AsyncIOScheduler
@@ -778,6 +777,9 @@ async def async_log_success_event(self, kwargs, response_obj, start_time, end_ti
         model = kwargs.get("model", "")
         litellm_params = kwargs.get("litellm_params", {}) or {}
         _metadata = litellm_params.get("metadata", {})
+        # Lazy import to avoid loading utils.py at import time (60MB saved)
+        from litellm.utils import get_end_user_id_for_cost_tracking
+
         end_user_id = get_end_user_id_for_cost_tracking(
             litellm_params, service_type="prometheus"
         )
@@ -1164,6 +1166,9 @@ async def async_log_failure_event(self, kwargs, response_obj, start_time, end_ti
             "standard_logging_object", {}
         )
         litellm_params = kwargs.get("litellm_params", {}) or {}
+        # Lazy import to avoid loading utils.py at import time (60MB saved)
+        from litellm.utils import get_end_user_id_for_cost_tracking
+
         end_user_id = get_end_user_id_for_cost_tracking(
             litellm_params, service_type="prometheus"
         )
@@ -2249,6 +2254,9 @@ def prometheus_label_factory(
     }
 
     if UserAPIKeyLabelNames.END_USER.value in filtered_labels:
+        # Lazy import to avoid loading utils.py at import time (60MB saved)
+        from litellm.utils import get_end_user_id_for_cost_tracking
+
         filtered_labels["end_user"] = get_end_user_id_for_cost_tracking(
             litellm_params={"user_api_key_end_user_id": enum_values.end_user},
             service_type="prometheus",
4 changes: 3 additions & 1 deletion litellm/litellm_core_utils/litellm_logging.py
@@ -58,7 +58,6 @@
 from litellm.integrations.custom_logger import CustomLogger
 from litellm.integrations.deepeval.deepeval import DeepEvalLogger
 from litellm.integrations.mlflow import MlflowLogger
-from litellm.integrations.prometheus import PrometheusLogger
 from litellm.integrations.sqs import SQSLogger
 from litellm.litellm_core_utils.get_litellm_params import get_litellm_params
 from litellm.litellm_core_utils.llm_cost_calc.tool_call_cost_tracking import (
@@ -3457,6 +3456,9 @@ def _init_custom_logger_compatible_class(  # noqa: PLR0915
         _in_memory_loggers.append(_literalai_logger)
         return _literalai_logger  # type: ignore
     elif logging_integration == "prometheus":
+        # Lazy import to avoid loading prometheus.py and utils.py at import time (60MB saved)
+        from litellm.integrations.prometheus import PrometheusLogger
+
         for callback in _in_memory_loggers:
             if isinstance(callback, PrometheusLogger):
                 return callback  # type: ignore
8 changes: 6 additions & 2 deletions litellm/litellm_core_utils/token_counter.py
@@ -15,7 +15,7 @@
     cast,
 )
 
-import tiktoken
+# tiktoken is imported lazily when needed to avoid loading it at import time
 
 import litellm
 from litellm import verbose_logger
@@ -28,7 +28,7 @@
     MAX_TILE_HEIGHT,
     MAX_TILE_WIDTH,
 )
-from litellm.litellm_core_utils.default_encoding import encoding as default_encoding
+# default_encoding is imported lazily when needed to avoid loading tiktoken at import time
 from litellm.llms.custom_httpx.http_handler import _get_httpx_client
 from litellm.types.llms.anthropic import (
     AnthropicMessagesToolResultParam,
@@ -532,6 +532,8 @@ def count_tokens(text: str) -> int:
            return len(enc.ids)
 
    elif tokenizer_json["type"] == "openai_tokenizer":
+        # Import tiktoken lazily to avoid loading it at import time
+        import tiktoken
        model_to_use = _fix_model_name(model)  # type: ignore
        try:
            if "gpt-4o" in model_to_use:
@@ -550,6 +552,8 @@
    else:
 
        def count_tokens(text: str) -> int:
+            # Import default_encoding lazily to avoid loading tiktoken at import time
+            from litellm.litellm_core_utils.default_encoding import encoding as default_encoding
            return len(default_encoding.encode(text, disallowed_special=()))
 
    return count_tokens
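The commit messages quantify savings ("60MB saved" for `utils.py`). A rough way to see what a single import costs, using only the standard library — each measurement runs in a fresh interpreter so earlier imports don't mask the cost, numbers vary by platform, and `json` here is just a stand-in for a heavy dependency like `tiktoken`:

```python
import subprocess
import sys

def import_peak_kib(module_name: str) -> float:
    """Peak traced allocation (KiB) for importing `module_name`
    in a fresh interpreter, measured with tracemalloc."""
    code = (
        "import tracemalloc, importlib\n"
        "tracemalloc.start()\n"
        f"importlib.import_module({module_name!r})\n"
        "print(tracemalloc.get_traced_memory()[1])\n"
    )
    out = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return int(out.stdout.strip()) / 1024

print(f"json: {import_peak_kib('json'):.0f} KiB")
```

For attributing import *time* rather than memory, `python -X importtime -c "import litellm"` gives a per-module breakdown on stderr.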
2 changes: 1 addition & 1 deletion litellm/llms/azure/azure.py
@@ -1020,7 +1020,7 @@ async def aimage_generation(
         headers: dict,
         client=None,
         timeout=None,
-    ) -> litellm.ImageResponse:
+    ) -> ImageResponse:
 
         response: Optional[dict] = None
         try:
4 changes: 2 additions & 2 deletions litellm/llms/azure_ai/embed/handler.py
@@ -58,7 +58,7 @@ async def async_image_embedding(
         data: ImageEmbeddingRequest,
         timeout: float,
         logging_obj,
-        model_response: litellm.EmbeddingResponse,
+        model_response: EmbeddingResponse,
         optional_params: dict,
         api_key: Optional[str],
         api_base: Optional[str],
@@ -138,7 +138,7 @@ async def async_embedding(
         input: List,
         timeout: float,
         logging_obj,
-        model_response: litellm.EmbeddingResponse,
+        model_response: EmbeddingResponse,
         optional_params: dict,
         api_key: Optional[str] = None,
         api_base: Optional[str] = None,
@@ -80,7 +80,7 @@ def transform_response(
         encoding: Any,
         api_key: Optional[str] = None,
         json_mode: Optional[bool] = None,
-    ) -> litellm.ModelResponse:
+    ) -> ModelResponse:
         return AmazonConverseConfig.transform_response(
             self,
             model,
2 changes: 1 addition & 1 deletion litellm/llms/bedrock/image/amazon_titan_transformation.py
@@ -7,7 +7,7 @@
 
 from openai.types.image import Image
 
-from litellm import get_model_info
+from litellm.utils import get_model_info
 from litellm.types.llms.bedrock import (
     AmazonNovaCanvasImageGenerationConfig,
     AmazonTitanImageGenerationRequestBody,
2 changes: 1 addition & 1 deletion litellm/llms/jina_ai/embedding/transformation.py
@@ -11,7 +11,7 @@
 
 import httpx
 
-from litellm import LlmProviders
+from litellm.types.utils import LlmProviders
 from litellm.secret_managers.main import get_secret_str
 from litellm.llms.base_llm.chat.transformation import BaseLLMException
 from litellm.llms.base_llm import BaseEmbeddingConfig
2 changes: 1 addition & 1 deletion litellm/llms/openai/openai.py
@@ -28,8 +28,8 @@
 from typing_extensions import overload
 
 import litellm
-from litellm import LlmProviders
 from litellm._logging import verbose_logger
+from litellm.types.utils import LlmProviders
 from litellm.constants import DEFAULT_MAX_RETRIES
 from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
 from litellm.litellm_core_utils.logging_utils import track_llm_api_timing
2 changes: 1 addition & 1 deletion litellm/llms/openai_like/chat/handler.py
@@ -10,8 +10,8 @@
 import httpx
 
 import litellm
-from litellm import LlmProviders
 from litellm.llms.bedrock.chat.invoke_handler import MockResponseIterator
+from litellm.types.utils import LlmProviders
 from litellm.llms.custom_httpx.http_handler import AsyncHTTPHandler, HTTPHandler
 from litellm.llms.databricks.streaming_utils import ModelResponseIterator
 from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
4 changes: 3 additions & 1 deletion litellm/llms/ovhcloud/chat/transformation.py
@@ -7,7 +7,9 @@
 from typing import Optional, Union, List
 
 import httpx
-from litellm import ModelResponseStream, OpenAIGPTConfig, get_model_info, verbose_logger
+from litellm.utils import ModelResponseStream, get_model_info
+from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
+from litellm._logging import verbose_logger
 from litellm.llms.ovhcloud.utils import OVHCloudException
 from litellm.llms.base_llm.base_model_iterator import BaseModelResponseIterator
 from litellm.llms.base_llm.chat.transformation import BaseLLMException
3 changes: 2 additions & 1 deletion litellm/llms/together_ai/chat.py
@@ -8,7 +8,8 @@
 
 from typing import Optional
 
-from litellm import get_model_info, verbose_logger
+from litellm.utils import get_model_info
+from litellm._logging import verbose_logger
 
 from ..openai.chat.gpt_transformation import OpenAIGPTConfig
 
Expand Down
3 changes: 2 additions & 1 deletion litellm/llms/vertex_ai/common_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,8 @@
import httpx

import litellm
from litellm import supports_response_schema, supports_system_messages, verbose_logger
from litellm.utils import supports_response_schema, supports_system_messages
from litellm._logging import verbose_logger
from litellm.constants import DEFAULT_MAX_RECURSE_DEPTH
from litellm.litellm_core_utils.prompt_templates.common_utils import unpack_defs
from litellm.llms.base_llm.base_utils import BaseLLMModelInfo, BaseTokenCounter
Expand Down
2 changes: 1 addition & 1 deletion litellm/llms/vertex_ai/files/handler.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@

import httpx

from litellm import LlmProviders
from litellm.integrations.gcs_bucket.gcs_bucket_base import (
GCSBucketBase,
GCSLoggingConfig,
Expand All @@ -17,6 +16,7 @@
OpenAIFileObject,
)
from litellm.types.llms.vertex_ai import VERTEX_CREDENTIALS_TYPES
from litellm.types.utils import LlmProviders

from .transformation import VertexAIJsonlFilesTransformation

Expand Down
4 changes: 2 additions & 2 deletions litellm/llms/vertex_ai/fine_tuning/handler.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
ResponseSupervisedTuningSpec,
ResponseTuningJob,
)
from litellm.types.utils import LiteLLMFineTuningJob
from litellm.types.utils import LiteLLMFineTuningJob, LlmProviders


class VertexFineTuningAPI(VertexLLM):
Expand All @@ -30,7 +30,7 @@ class VertexFineTuningAPI(VertexLLM):
def __init__(self) -> None:
super().__init__()
self.async_handler = get_async_httpx_client(
llm_provider=litellm.LlmProviders.VERTEX_AI,
llm_provider=LlmProviders.VERTEX_AI,
params={"timeout": 600.0},
)

Expand Down
@@ -8,7 +8,7 @@
 import httpx
 
 import litellm
-from litellm import EmbeddingResponse
+from litellm.types.utils import EmbeddingResponse
 from litellm.llms.custom_httpx.http_handler import (
     AsyncHTTPHandler,
     HTTPHandler,
@@ -6,7 +6,7 @@
 
 from typing import List
 
-from litellm import EmbeddingResponse
+from litellm.types.utils import EmbeddingResponse
 from litellm.types.llms.openai import EmbeddingInput
 from litellm.types.llms.vertex_ai import (
     ContentType,
@@ -175,7 +175,7 @@ async def aimage_generation(
         vertex_project: Optional[str],
         vertex_location: Optional[str],
         vertex_credentials: Optional[VERTEX_CREDENTIALS_TYPES],
-        model_response: litellm.ImageResponse,
+        model_response: ImageResponse,
         logging_obj: Any,
         model: str = "imagegeneration",  # vertex ai uses imagegeneration as the default model
         client: Optional[AsyncHTTPHandler] = None,
@@ -147,13 +147,13 @@ async def async_multimodal_embedding(
         optional_params: dict,
         litellm_params: dict,
         data: dict,
-        model_response: litellm.EmbeddingResponse,
+        model_response: EmbeddingResponse,
         timeout: Optional[Union[float, httpx.Timeout]],
         logging_obj: LiteLLMLoggingObj,
         headers={},
         client: Optional[AsyncHTTPHandler] = None,
         api_key: Optional[str] = None,
-    ) -> litellm.EmbeddingResponse:
+    ) -> EmbeddingResponse:
         if client is None:
             _params = {}
             if timeout is not None:
@@ -125,7 +125,7 @@ async def handle_count_tokens_request(
         headers = {"Authorization": f"Bearer {access_token}"}
 
         # Get async HTTP client
-        from litellm import LlmProviders
+        from litellm.types.utils import LlmProviders
 
         async_client = get_async_httpx_client(llm_provider=LlmProviders.VERTEX_AI)
 
2 changes: 1 addition & 1 deletion litellm/llms/vertex_ai/vertex_ai_partner_models/main.py
@@ -6,7 +6,7 @@
 import httpx  # type: ignore
 
 import litellm
-from litellm import LlmProviders
+from litellm.types.utils import LlmProviders
 from litellm.types.llms.vertex_ai import VertexPartnerProvider
 from litellm.utils import ModelResponse
 
4 changes: 2 additions & 2 deletions litellm/llms/vertex_ai/vertex_embeddings/embedding_handler.py
@@ -137,7 +137,7 @@ async def async_embedding(
         self,
         model: str,
         input: Union[list, str],
-        model_response: litellm.EmbeddingResponse,
+        model_response: EmbeddingResponse,
         logging_obj: LiteLLMLoggingObject,
         optional_params: dict,
         custom_llm_provider: Literal[
@@ -152,7 +152,7 @@
         gemini_api_key: Optional[str] = None,
         extra_headers: Optional[dict] = None,
         encoding=None,
-    ) -> litellm.EmbeddingResponse:
+    ) -> EmbeddingResponse:
         """
         Async embedding implementation
         """