# openai_v2v_python

An extension for integrating OpenAI's next generation of **multimodal** AI into your application, providing configurable AI-driven features such as conversational agents, task automation, and tool integration.

## Features

- OpenAI **Multimodal** Integration: Leverage GPT **multimodal** models for voice-to-voice as well as text processing.
- Configurable: Easily customize API keys, model settings, prompts, temperature, etc.
- Async Queue Processing: Supports real-time message processing with task cancellation and prioritization.

## API

Refer to the `api` definition in [manifest.json](manifest.json) and the default values in [property.json](property.json).

| **Property**     | **Type**  | **Description**                                                            |
|------------------|-----------|----------------------------------------------------------------------------|
| `api_key`        | `string`  | API key for authenticating with OpenAI                                     |
| `temperature`    | `float64` | Sampling temperature; higher values mean more randomness                   |
| `model`          | `string`  | Model identifier (e.g., `gpt-4o-realtime-preview`)                         |
| `max_tokens`     | `int64`   | Maximum number of tokens to generate                                       |
| `system_message` | `string`  | Default system message to send to the model                                |
| `voice`          | `string`  | Voice the OpenAI model speaks with, such as `alloy`, `echo`, `shimmer`, etc. |
| `server_vad`     | `bool`    | Flag to enable or disable OpenAI's server-side VAD (voice activity detection) |
| `language`       | `string`  | Language the OpenAI model responds in, such as `en-US`, `zh-CN`, etc.      |
| `dump`           | `bool`    | Flag to enable or disable audio dump for debugging purposes                |

### Data Out:

| **Name**    | **Property** | **Type** | **Description**    |
|-------------|--------------|----------|--------------------|
| `text_data` | `text`       | `string` | Outgoing text data |

### Command Out:

| **Name** | **Description**                           |
|----------|-------------------------------------------|
| `flush`  | Response after flushing the current state |

### Audio Frame In:

| **Name**    | **Description**                        |
|-------------|----------------------------------------|
| `pcm_frame` | Audio frame input for voice processing |

### Audio Frame Out:

| **Name**    | **Description**                           |
|-------------|-------------------------------------------|
| `pcm_frame` | Audio frame output after voice processing |

### Azure Support

This extension also supports Azure OpenAI Service. The property settings are as follows:

```json
{
  "base_uri": "wss://xxx.openai.azure.com",
  "path": "/openai/realtime?api-version=xxx&deployment=xxx",
  "api_key": "xxx",
  "model": "gpt-4o-realtime-preview",
  "vendor": "azure"
}
```
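
As a rough illustration of how the Azure properties above fit together, the sketch below checks that the five keys are present and joins `base_uri` and `path` into one WebSocket URL. The helper `build_realtime_url` and the `REQUIRED_AZURE_KEYS` tuple are hypothetical names introduced here for illustration; they are not part of the extension, and the extension itself may assemble the URL differently.

```python
import json

# Hypothetical helper, not part of the extension. The key names are taken
# from the Azure property example above; treating all five as required is
# an assumption made for this sketch.
REQUIRED_AZURE_KEYS = ("base_uri", "path", "api_key", "model", "vendor")


def build_realtime_url(props: dict) -> str:
    """Validate the Azure properties and join `base_uri` with `path`."""
    missing = [key for key in REQUIRED_AZURE_KEYS if not props.get(key)]
    if missing:
        raise ValueError(f"missing properties: {', '.join(missing)}")
    if not props["base_uri"].startswith(("ws://", "wss://")):
        raise ValueError("base_uri should be a WebSocket URI (ws:// or wss://)")
    # Strip slashes at the join point so exactly one separates the two parts.
    return props["base_uri"].rstrip("/") + "/" + props["path"].lstrip("/")


if __name__ == "__main__":
    # Placeholder values ("xxx") are kept exactly as in the example above.
    props = json.loads(
        """
        {
          "base_uri": "wss://xxx.openai.azure.com",
          "path": "/openai/realtime?api-version=xxx&deployment=xxx",
          "api_key": "xxx",
          "model": "gpt-4o-realtime-preview",
          "vendor": "azure"
        }
        """
    )
    print(build_realtime_url(props))
```

Failing early on a missing key or a non-WebSocket `base_uri` tends to give a clearer error than a rejected connection attempt later on.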