2 changes: 2 additions & 0 deletions docs/open_source/scan/scan_llm/index.md
@@ -74,6 +74,7 @@ Require `openai>=1.0.0`
```python
import os
from giskard.llm import set_llm_model
from giskard.llm.embeddings.openai import set_embedding_model

os.environ['AZURE_OPENAI_API_KEY'] = '...'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://xxx.openai.azure.com'
@@ -82,6 +83,7 @@ os.environ['OPENAI_API_VERSION'] = '2023-07-01-preview'
# You'll need to provide the name of the model that you've deployed
# Beware: the model provided must support function calling
set_llm_model('my-gpt-4-model')
set_embedding_model('my-embedding-model') # Optional
```

::::::
3 changes: 3 additions & 0 deletions docs/open_source/setting_up/index.md
@@ -1,6 +1,7 @@
# 🤖 Setting up the LLM Client

This guide focuses primarily on configuring and using the various supported LLM clients that power Giskard's LLM-assisted functionalities. These clients include:

- OpenAI GPT models (such as GPT-3.5 and GPT-4)
- Azure OpenAI
- Mistral
@@ -26,6 +27,7 @@ giskard.llm.set_default_client(oc)
```python
import os
from giskard.llm import set_llm_model
from giskard.llm.embeddings.openai import set_embedding_model

os.environ['AZURE_OPENAI_API_KEY'] = '...'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://xxx.openai.azure.com'
@@ -35,6 +37,7 @@ os.environ['OPENAI_API_VERSION'] = '2023-07-01-preview'
# You'll need to provide the name of the model that you've deployed
# Beware: the model provided must support function calling
set_llm_model('my-gpt-4-model')
set_embedding_model('my-embedding-model') # Optional
```

## Mistral Client Setup
76 changes: 42 additions & 34 deletions docs/open_source/testset_generation/testset_generation/index.md
@@ -1,30 +1,30 @@
# 🎯 RAGET Testset Generation

Waiting to collect data from production to evaluate your RAG agents extensively is a risky business. But building
an in-house evaluation dataset is a painful task that requires manual curation and review.

To help with this, the Giskard python library provides **RAGET: RAG Evaluation Toolkit**, a toolkit to evaluate RAG
agents **automatically**.

> ℹ️ You can find a [tutorial](../../../reference/notebooks/RAGET.ipynb) where we demonstrate the capabilities of RAGET
> with a simple RAG agent built with LlamaIndex
> on the IPCC report.


(q_types)=

## What does RAGET do exactly?

RAGET can automatically generate a list of `question`, `reference_answer` and `reference_context` from the knowledge
base of the RAG. It relies on a chain of LLM operations to generate realistic questions across different types.
You can then use this **generated test set to evaluate your RAG agent**.

By default, RAGET will create multiple types of questions. Each of them is designed to target and evaluate specific
components of the RAG system (for example: the retriever, the generation, or the quality of your knowledge base
chunks). During evaluation, RAGET will use a mapping between question types and RAG components to identify possible
weaknesses affecting a specific component of your RAG agent.

RAGET is capable of targeting and evaluating the following components of a RAG agent:

- **`Generator`**: the LLM used inside the RAG to generate the answers
- **`Retriever`**: fetches relevant documents from the knowledge base according to a user query
- **`Rewriter`** (optional): rewrites the user query to make it more relevant to the knowledge base or to account for
@@ -33,6 +33,7 @@ RAGET is capable of targeting and evaluating the following components of a RAG a
- **`Knowledge Base`**: the set of documents given to the RAG to generate the answers

These are the question types currently supported by RAGET:

```{list-table}
:header-rows: 1
:widths: 20, 50, 25
@@ -48,7 +49,7 @@ These are the question types currently supported by RAGET:
- Questions made more complex by paraphrasing

*Example: What is the capital of the country of Victor Hugo?*
- `Generator`
* - **Distracting questions**
- Questions made to confuse the retrieval part of the RAG with a distracting element from the knowledge base but irrelevant to the question

@@ -60,21 +61,19 @@ These are the question types currently supported by RAGET:
*Example: I am planning a trip to Europe, what is the capital of France?*
- `Generator`
* - **Double questions**
- Questions with two distinct parts to evaluate the capabilities of the query rewriter of the RAG

*Example: What is the capital and the population of France?*
- `Generator`, `Rewriter`
* - **Conversational questions**
- Questions asked as part of a conversation: the first message describes the context of the question, which is asked in the last message. These also test the rewriter

*Example: (two separate messages)*
- *I would like to know some information about France.*
- *What is its capital city?*
- `Rewriter`
```

## Before starting

Before starting, make sure you have installed the LLM flavor of Giskard:
@@ -100,6 +99,7 @@ giskard.llm.set_llm_api("openai")
oc = OpenAIClient(model="gpt-4-turbo-preview")
giskard.llm.set_default_client(oc)
```

::::::
::::::{tab-item} Azure OpenAI

@@ -108,6 +108,7 @@ Require `openai>=1.0.0`
```python
import os
from giskard.llm import set_llm_model
from giskard.llm.embeddings.openai import set_embedding_model

os.environ['AZURE_OPENAI_API_KEY'] = '...'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://xxx.openai.azure.com'
@@ -117,10 +118,12 @@ os.environ['OPENAI_API_VERSION'] = '2023-07-01-preview'
# You'll need to provide the name of the model that you've deployed
# Beware: the model provided must support function calling
set_llm_model('my-gpt-4-model')
set_embedding_model('my-embedding-model') # Optional
```

::::::
::::::{tab-item} Mistral

```python
import os
from giskard.llm.client.mistral import MistralClient
@@ -133,6 +136,7 @@ giskard.llm.set_default_client(mc)

::::::
::::::{tab-item} Ollama

```python
from openai import OpenAI
from giskard.llm.client.openai import OpenAIClient
@@ -143,6 +147,7 @@ _client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")
oc = OpenAIClient(model="gemma:2b", client=_client)
giskard.llm.set_default_client(oc)
```

::::::
::::::{tab-item} Claude 3

@@ -180,6 +185,7 @@ giskard.llm.set_default_client(GeminiClient())

::::::
::::::{tab-item} Custom Client

```python
import giskard
from typing import Sequence, Optional
@@ -239,12 +245,10 @@ set_default_client(MyLLMClient())
::::::
:::::::

## Prepare your Knowledge Base

Prepare your data or knowledge base in a pandas `DataFrame`. Then, create a
{class}`~giskard.rag.knowledge_base.KnowledgeBase` instance with the `from_pandas` method.

By default, we will use all columns in your data frame to populate your knowledge base. If only certain columns in your
dataframe are relevant for the generation of questions, make sure to specify the columns
@@ -260,16 +264,16 @@ df = pd.read_csv("path/to/your/knowledge_base.csv")
knowledge_base = KnowledgeBase.from_pandas(df, columns=["column_1", "column_2"])
```
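If your documents are not yet in a file, any pandas `DataFrame` with text columns works as input. A minimal sketch, assuming an in-memory list of documents (the `text` column name here is an arbitrary choice, not a requirement):

```python
import pandas as pd

# Build a knowledge base DataFrame directly from in-memory documents;
# the "text" column name is illustrative, any column name works.
documents = [
    "We offer free shipping on all orders over $50.",
    "Once shipped, you will receive a confirmation email with a tracking number.",
]
df = pd.DataFrame({"text": documents})
print(df.shape)  # (2, 1)
```

You would then build the knowledge base with `KnowledgeBase.from_pandas(df, columns=["text"])`, as in the snippet above.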


## Generate a test set

By default, **RAGET automatically generates 6 different [question types](q_types)** (these can be selected if needed, see [advanced question generation](advanced_config)). The total number of questions is
divided equally between each question type. To make the question generation more relevant and accurate, you can also
provide a description of your agent.

```python
# Generate a testset with 10 questions & answers for each question types (this will take a while)
testset = generate_testset(
knowledge_base,
num_questions=60,
language='en', # optional, we'll auto detect if not provided
agent_description="A customer support chatbot for company X", # helps generating better questions
@@ -298,55 +302,59 @@ df = loaded_testset.to_pandas()

Here's an example of a generated question:

| question | reference_context | reference_answer | metadata |
|----------|-------------------|------------------|----------|
| For which countries can I track my shipping? | Document 1: We offer free shipping on all orders over \$50. For orders below \$50, we charge a flat rate of \$5.99. We offer shipping services to customers residing in all 50 states of the US, in addition to providing delivery options to Canada and Mexico. Document 2: Once your purchase has been successfully confirmed and shipped, you will receive a confirmation email containing your tracking number. You can simply click on the link provided in the email or visit our website's order tracking page. | We ship to all 50 states in the US, as well as to Canada and Mexico. We offer tracking for all our shippings. | {"question_type": "simple", "seed_document_id": 1, "topic": "Shipping policy"} |

The test set contains 5 columns:

- `question`: the generated question
- `reference_context`: the context that can be used to answer the question
- `reference_answer`: the answer to the question (generated with GPT-4)
- `conversation_history`: not shown in the table above; contains the history of the conversation with the agent as a list. It is only relevant for conversational questions and is an empty list otherwise.
- `metadata`: a dictionary with various metadata about the question, including the `question_type`, the `seed_document_id` (the id of the document used to generate the question) and the `topic` of the question
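Because the test set converts to a plain pandas `DataFrame` (via `to_pandas`), you can slice it by question type through the `metadata` column. A minimal sketch on a toy frame shaped like the table above (the rows are made up for illustration; a real frame would come from `testset.to_pandas()`):

```python
import pandas as pd

# Toy rows mimicking the test set structure described above.
df = pd.DataFrame([
    {"question": "What is the capital of France?",
     "metadata": {"question_type": "simple", "seed_document_id": 1, "topic": "Geography"}},
    {"question": "What is the capital and the population of France?",
     "metadata": {"question_type": "double", "seed_document_id": 1, "topic": "Geography"}},
])

# Filter by question type via the metadata dictionary
simple = df[df["metadata"].apply(lambda m: m["question_type"] == "simple")]
print(len(simple))  # 1
```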

(advanced_config)=

### Advanced configuration of the question generation

By default, the test set contains all question types. **You can change this by providing question generators** to
the `giskard.rag.generate_testset` function. Generators are available inside the `question_generators` module. For
instance, to generate only complex and double questions, use the following:

```python
from giskard.rag.question_generators import complex_questions, double_questions

testset = generate_testset(
knowledge_base,
question_generators=[complex_questions, double_questions],
)
```

You can also implement custom question generators by implementing the interface defined
by {class}`~giskard.rag.question_generators.QuestionGenerator`.
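As a purely conceptual sketch, a custom generator essentially maps knowledge-base documents to question records. The method name and record shape below are illustrative assumptions, not the actual `QuestionGenerator` contract; check the class reference for the real interface:

```python
# Conceptual sketch only: the real contract is defined by
# giskard.rag.question_generators.QuestionGenerator. The generate()
# method and record shape here are illustrative assumptions.
class KeywordQuestionGenerator:
    """Toy generator: turns each document mentioning a keyword into a question."""

    def __init__(self, keyword: str):
        self.keyword = keyword

    def generate(self, documents):
        for i, doc in enumerate(documents):
            if self.keyword in doc.lower():
                yield {
                    "question": f"What does the documentation say about {self.keyword}?",
                    "reference_context": doc,
                    "metadata": {"question_type": "custom", "seed_document_id": i},
                }

gen = KeywordQuestionGenerator("shipping")
records = list(gen.generate(["We offer free shipping on orders over $50."]))
print(len(records))  # 1
```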


## What’s next: evaluate your model on the generated testset

Once you have generated the test set, you can evaluate your RAG agent using the `giskard.rag.evaluate` function.
Detailed instructions can be found in the [RAGET Evaluation](../rag_evaluation/index.md) section.

## Frequently Asked Questions

> #### ℹ️ What data is sent to LLM providers?
>
> In order to perform tasks with language-model-assisted detectors, we send the following information to the selected language model provider (e.g. OpenAI, Azure OpenAI, Ollama, Mistral):
>
> - Data provided in your knowledge base
> - Text generated by your model
> - Model name and description

> #### 🌎 Will the test set generation work in any language?
>
> Yes, you can specify the language of the generated questions when calling the `generate_testset` function.
> Ultimately, the quality of the generated questions will depend on the LLM performance in the desired language.

## Troubleshooting

If you encounter any issues, join our [Discord community](https://discord.gg/fkv7CAr3FE) and ask questions in
our #support channel.
21 changes: 20 additions & 1 deletion giskard/llm/embeddings/openai.py
@@ -6,6 +6,25 @@
from ..client import get_default_llm_api
from .base import BaseEmbedding

_default_embedding_model = "text-embedding-ada-002"


def get_embedding_model() -> str:
return _default_embedding_model


def set_embedding_model(model: str):
"""
Set the default embedding model to be used with OpenAI/AzureOpenAI client.

Parameters
----------
model : str
Model name (e.g. 'text-embedding-ada-002' or 'text-embedding-3-large').
"""
global _default_embedding_model
_default_embedding_model = model


class OpenAIEmbedding(BaseEmbedding):
def __init__(self, client, model: str, batch_size=40):
@@ -48,6 +67,6 @@ def try_get_openai_embeddings() -> Optional[OpenAIEmbedding]:
else:
client = OpenAI()

return OpenAIEmbedding(client=client, model="text-embedding-ada-002")
return OpenAIEmbedding(client=client, model=get_embedding_model())
except ImportError:
return None