19 | 19 | @component
20 | 20 | class AzureOpenAIChatGenerator(OpenAIChatGenerator):
21 | 21 |     """
22 |    | -    A Chat Generator component that uses the Azure OpenAI API to generate text.
   | 22 | +    Generates text using OpenAI's models on Azure.
23 | 23 |
24 |    | -    Enables text generation using OpenAI's large language models (LLMs) on Azure. It supports `gpt-4` and
25 |    | -    `gpt-3.5-turbo` family of models accessed through the chat completions API endpoint.
   | 24 | +    It works with the gpt-4 and gpt-3.5-turbo family of models and supports streaming responses
   | 25 | +    from the OpenAI API. It uses [ChatMessage](https://docs.haystack.deepset.ai/docs/data-classes#chatmessage)
   | 26 | +    format for input and output.
26 | 27 |
27 |    | -    Users can pass any text generation parameters valid for the `openai.ChatCompletion.create` method
28 |    | -    directly to this component via the `generation_kwargs` parameter in `__init__` or the `generation_kwargs`
29 |    | -    parameter in `run` method.
   | 28 | +    You can customize how the text is generated by passing parameters to the
   | 29 | +    OpenAI API. Use the `**generation_kwargs` argument when you initialize
   | 30 | +    the component or when you run it. Any parameter that works with
   | 31 | +    `openai.ChatCompletion.create` will work here too.
30 | 32 |
31 |    | -    For more details on OpenAI models deployed on Azure, refer to the Microsoft
32 |    | -    [documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/).
   | 33 | +    For details on OpenAI API parameters, see
   | 34 | +    [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).
33 | 35 |
34 |    | -    Key Features and Compatibility:
35 |    | -    - Primary Compatibility: Designed to work seamlessly with the OpenAI API Chat Completion endpoint.
36 |    | -    - Streaming Support: Supports streaming responses from the OpenAI API Chat Completion endpoint.
37 |    | -    - Customizability: Supports all parameters supported by the OpenAI API Chat Completion endpoint.
38 |    | -
39 |    | -    Input and Output Format:
40 |    | -    - ChatMessage Format: This component uses the ChatMessage format for structuring both input and output, ensuring
41 |    | -      coherent and contextually relevant responses in chat-based text generation scenarios.
42 |    | -    - Details on the ChatMessage format can be found [here](https://docs.haystack.deepset.ai/v2.0/docs/data-classes#chatmessage).
43 |    | -
44 |    | -
45 |    | -    Usage example:
   | 36 | +    ### Usage example
46 | 37 |
47 | 38 |     ```python
48 | 39 |     from haystack.components.generators.chat import AzureOpenAIChatGenerator
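
For reference while reviewing, here is a minimal usage sketch of the class as documented above. It is an illustration, not part of the commit: the endpoint and the `gpt-35-turbo` deployment name are placeholders, and it assumes the `AZURE_OPENAI_API_KEY` environment variable is set.

```python
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Placeholder endpoint and deployment name; adjust to your Azure resource.
# The API key is read from the AZURE_OPENAI_API_KEY environment variable.
client = AzureOpenAIChatGenerator(
    azure_endpoint="https://example-resource.azure.openai.com/",
    azure_deployment="gpt-35-turbo",
)

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]

# Any parameter accepted by openai.ChatCompletion.create can be passed
# per call through generation_kwargs, as the docstring describes.
response = client.run(messages, generation_kwargs={"max_tokens": 100, "temperature": 0.2})
print(response["replies"])
```

The component returns a dictionary with a `replies` list of `ChatMessage` objects, matching the ChatMessage input/output contract stated in the docstring.
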
@@ -87,37 +78,37 @@ def __init__(
 87 |  78 |         """
 88 |  79 |         Initialize the Azure OpenAI Chat Generator component.
 89 |  80 |
 90 |     | -        :param azure_endpoint: The endpoint of the deployed model, e.g. `"https://example-resource.azure.openai.com/"`
 91 |     | -        :param api_version: The version of the API to use. Defaults to 2023-05-15
    |  81 | +        :param azure_endpoint: The endpoint of the deployed model, for example `"https://example-resource.azure.openai.com/"`.
    |  82 | +        :param api_version: The version of the API to use. Defaults to 2023-05-15.
 92 |  83 |         :param azure_deployment: The deployment of the model, usually the model name.
 93 |  84 |         :param api_key: The API key to use for authentication.
 94 |     | -        :param azure_ad_token: [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id)
 95 |     | -        :param organization: The Organization ID, defaults to `None`. See
 96 |     | -            [production best practices](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
 97 |     | -        :param streaming_callback: A callback function that is called when a new token is received from the stream.
 98 |     | -            The callback function accepts StreamingChunk as an argument.
 99 |     | -        :param timeout: The timeout in seconds to be passed to the underlying `AzureOpenAI` client, if not set it is
100 |     | -            inferred from the `OPENAI_TIMEOUT` environment variable or set to 30.
101 |     | -        :param max_retries: Maximum retries to establish a connection with AzureOpenAI if it returns an internal error,
102 |     | -            if not set it is inferred from the `OPENAI_MAX_RETRIES` environment variable or set to 5.
103 |     | -        :param generation_kwargs: Other parameters to use for the model. These parameters are all sent directly to
104 |     | -            the OpenAI endpoint. See OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat) for
105 |     | -            more details.
    |  85 | +        :param azure_ad_token: [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id).
    |  86 | +        :param organization: Your organization ID. Defaults to `None`. For help, see
    |  87 | +            [Setting up your organization](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
    |  88 | +        :param streaming_callback: A callback function called when a new token is received from the stream.
    |  89 | +            It accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
    |  90 | +            as an argument.
    |  91 | +        :param timeout: Timeout for OpenAI client calls. If not set, it defaults to either the
    |  92 | +            `OPENAI_TIMEOUT` environment variable or 30 seconds.
    |  93 | +        :param max_retries: Maximum number of retries to contact OpenAI after an internal error.
    |  94 | +            If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5.
    |  95 | +        :param generation_kwargs: Other parameters to use for the model. These parameters are sent directly to
    |  96 | +            the OpenAI endpoint. For details, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).
106 |  97 |             Some of the supported parameters:
107 |  98 |             - `max_tokens`: The maximum number of tokens the output text can have.
108 |     | -            - `temperature`: What sampling temperature to use. Higher values mean the model will take more risks.
    |  99 | +            - `temperature`: The sampling temperature to use. Higher values mean the model takes more risks.
109 | 100 |                 Try 0.9 for more creative applications and 0 (argmax sampling) for ones with a well-defined answer.
110 |     | -            - `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
111 |     | -                considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens
112 |     | -                comprising the top 10% probability mass are considered.
113 |     | -            - `n`: How many completions to generate for each prompt. For example, if the LLM gets 3 prompts and n is 2,
114 |     | -                it will generate two completions for each of the three prompts, ending up with 6 completions in total.
    | 101 | +            - `top_p`: Nucleus sampling is an alternative to sampling with temperature, where the model considers
    | 102 | +                tokens with a top_p probability mass. For example, 0.1 means only the tokens comprising
    | 103 | +                the top 10% probability mass are considered.
    | 104 | +            - `n`: The number of completions to generate for each prompt. For example, with 3 prompts and n=2,
    | 105 | +                the LLM will generate two completions per prompt, resulting in 6 completions total.
115 | 106 |             - `stop`: One or more sequences after which the LLM should stop generating tokens.
116 |     | -            - `presence_penalty`: What penalty to apply if a token is already present at all. Bigger values mean
117 |     | -                the model will be less likely to repeat the same token in the text.
118 |     | -            - `frequency_penalty`: What penalty to apply if a token has already been generated in the text.
119 |     | -                Bigger values mean the model will be less likely to repeat the same token in the text.
120 |     | -            - `logit_bias`: Add a logit bias to specific tokens. The keys of the dictionary are tokens, and the
    | 107 | +            - `presence_penalty`: The penalty applied if a token is already present.
    | 108 | +                Higher values make the model less likely to repeat the token.
    | 109 | +            - `frequency_penalty`: Penalty applied if a token has already been generated.
    | 110 | +                Higher values make the model less likely to repeat the token.
    | 111 | +            - `logit_bias`: Adds a logit bias to specific tokens. The keys of the dictionary are tokens, and the
121 | 112 |                 values are the bias to add to that token.
122 | 113 |         """
123 | 114 |         # We intentionally do not call super().__init__ here because we only need to instantiate the client to interact
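
To complement the parameter list above, a minimal sketch of init-time configuration. It is not part of the commit: the callback, the deployment name, and the specific `generation_kwargs` values are illustrative assumptions; each parameter shown is documented in the docstring above.

```python
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage, StreamingChunk


def print_chunk(chunk: StreamingChunk) -> None:
    # Invoked once per streamed chunk; `content` carries the text delta.
    print(chunk.content, end="", flush=True)


# Assumes AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are set in the
# environment; "gpt-35-turbo" is a placeholder deployment name.
client = AzureOpenAIChatGenerator(
    azure_deployment="gpt-35-turbo",
    timeout=30.0,                  # seconds; otherwise OPENAI_TIMEOUT or 30
    max_retries=5,                 # otherwise OPENAI_MAX_RETRIES or 5
    streaming_callback=print_chunk,
    generation_kwargs={
        "temperature": 0.9,        # more creative sampling
        "presence_penalty": 0.5,   # discourage tokens already present
        "stop": ["\n\n"],          # stop at the first blank line
    },
)

client.run([ChatMessage.from_user("Write a haiku about retrieval.")])
```

Defaults set in `__init__` apply to every call; passing the same keys to `run` via `generation_kwargs` overrides them for that call only.
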