Breeze Automatic is a server that powers conversational AI experiences, built around a Pipecat-based voice agent for real-time voice assistants.
The application's core is a standalone voice agent built on the Pipecat framework. It's launched as a subprocess by the main FastAPI server and handles the end-to-end voice conversation flow, including:
- Speech-to-Text (STT)
- Language Model (LLM) interaction with dynamic tool use
- Text-to-Speech (TTS)
- Dual-Mode Operation: Can run in `live` mode with real-time data fetching or `test` mode using dummy data.
- Dynamic Tool Loading: The voice agent dynamically loads tools based on the operating mode and provided credentials; for example, Juspay and Breeze tools are only loaded in `live` mode with valid tokens (an example tool definition follows this list).
- Multi-Provider Analytics: Integrates with both the Juspay and Breeze APIs to fetch a wide range of analytics data, including sales, orders, marketing, and checkout metrics.
- Personalized Prompts: The agent's system prompt can be personalized with the user's name for a more engaging experience.
- Environment-Driven Configuration: All sensitive keys and settings are managed via environment variables.
- Modular & Scalable Architecture: The project is structured for clarity, maintainability, and easy extension with new tools or providers.
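For illustration, each tool is ultimately exposed to the LLM as a callable function definition. A system tool such as `get_current_time` (listed in the project structure below) might look roughly like this, using the common OpenAI-style function-calling schema; the exact format and registration mechanism used by the Pipecat agent may differ:

```python
# Illustrative tool definition in the OpenAI function-calling style.
# The actual schema and registration used by the Pipecat agent may differ.
from datetime import datetime, timezone

GET_CURRENT_TIME_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Return the current UTC date and time as an ISO 8601 string.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}


def get_current_time() -> str:
    """Handler the agent invokes when the LLM calls the tool."""
    return datetime.now(timezone.utc).isoformat()
```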
The project is organized into the main FastAPI server (`app/`) and the Pipecat voice agent (`app/agents/voice/automatic/`).

```
.
├── app/
│ ├── main.py # FastAPI app, agent endpoint, and subprocess management
│ ├── api/ # API clients for Juspay, Breeze, etc.
│ │ ├── juspay_metrics.py
│ │ └── breeze_metrics.py
│ └── agents/voice/automatic/ # Pipecat Voice Agent
│ ├── __init__.py # Main agent logic, pipeline definition
│ ├── prompts.py # System prompts for the agent
│ └── tools/ # Tool definitions for the agent
│ ├── __init__.py # Dynamic tool initializer
│ ├── system/ # System tools (e.g., get_current_time)
│ ├── dummy/ # Dummy tools for test mode
│ ├── juspay/ # Real-time Juspay analytics tools
│ └── breeze/ # Real-time Breeze analytics tools
├── static/
│ └── client.html # HTML client for testing
├── requirements.txt
└── run.py # Script to run the server
```
You will need:
- Python 3.8+
- Access to the Azure OpenAI and Daily.co APIs, with valid keys.
To set up the project:
- Clone the repository.
- Create and activate a virtual environment.
- Install dependencies: `pip install -r requirements.txt`
- Set up environment variables: create a `.env` file in the project root with the following variables (an example `.env` follows this list):
  - `DAILY_API_KEY`: Required.
  - `AZURE_OPENAI_API_KEY`: Required.
  - `AZURE_OPENAI_ENDPOINT`: Required.
  - `GOOGLE_CREDENTIALS_JSON`: Required. Path to your Google Cloud credentials JSON file.
  - `GEMINI_API_KEY`: Required for the Gemini Live Proxy.
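For reference, a `.env` might look like this (placeholder values; substitute your own keys and paths):

```
DAILY_API_KEY=your-daily-api-key
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
GOOGLE_CREDENTIALS_JSON=/path/to/google-credentials.json
GEMINI_API_KEY=your-gemini-api-key
```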
Start the server by executing the `run.py` script:

```
python run.py
```

The server will start on `http://0.0.0.0:8000` by default.
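A minimal sketch of what `run.py` could look like, assuming the FastAPI instance is exposed as `app` in `app/main.py` (the real script may add reload, logging, or other options):

```python
# run.py — minimal launcher sketch. Assumes app/main.py exposes a FastAPI
# instance named `app`; the actual script may differ.
import uvicorn

if __name__ == "__main__":
    # Bind to all interfaces on port 8000, matching the default address above.
    uvicorn.run("app.main:app", host="0.0.0.0", port=8000)
```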
A voice session flows as follows:
- A client sends a POST request to the `/agent/voice/automatic` endpoint on the FastAPI server (see the example request after this list).
- The payload includes the `mode` (`live` or `test`) and various tokens/IDs (`eulerToken`, `breezeToken`, `shopId`, etc.).
- The server creates a new Daily.co video room for the voice session.
- It then launches the Pipecat voice agent as a new subprocess, passing the mode, tokens, and shop details as command-line arguments.
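For example, a `live` session could be requested like this. Field names follow the list above; `userName` is an assumed field for prompt personalization, and the exact payload and response schema are defined by the endpoint in `app/main.py`:

```python
# Illustrative request to start a voice session; not the definitive schema.
import requests

payload = {
    "mode": "live",                    # or "test" to use dummy tools
    "eulerToken": "<juspay-token>",
    "breezeToken": "<breeze-token>",
    "shopId": "<shop-id>",
    "userName": "Asha",                # assumed field used to personalize the prompt
}

resp = requests.post("http://localhost:8000/agent/voice/automatic", json=payload)
resp.raise_for_status()
# The response is expected to include what the client needs to join the
# Daily.co room created for this session (exact fields depend on the endpoint).
print(resp.json())
```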
On the agent side:
- Inside the agent's `__init__.py`, the `initialize_tools` function is called.
- This function checks the `mode` and the presence of tokens to decide which toolsets to load (a sketch follows this list):
  - System tools are always loaded.
  - In `test` mode, dummy tools are loaded.
  - In `live` mode, if tokens are present, the corresponding real-time Juspay and Breeze tools are loaded.
- The agent's system prompt is personalized with the user's name if provided.
- The agent connects to the Daily room and begins the conversation, now equipped with the appropriate set of tools for the session.
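A minimal sketch of this dynamic loading logic, with illustrative names rather than the project's actual implementation in `app/agents/voice/automatic/tools/__init__.py`:

```python
# Simplified sketch of mode- and credential-based tool loading. All names are
# illustrative, not the project's actual tools/__init__.py implementation.
from typing import List, Optional

SYSTEM_TOOLS = ["get_current_time"]            # always loaded (tools/system/)
DUMMY_TOOLS = ["dummy_sales", "dummy_orders"]  # canned data for test mode (tools/dummy/)


def juspay_tools(euler_token: str) -> List[str]:
    # Placeholder: the real tools call the Juspay analytics API using this token.
    return ["juspay_sales_metrics", "juspay_order_metrics"]


def breeze_tools(breeze_token: str) -> List[str]:
    # Placeholder: the real tools call the Breeze analytics API using this token.
    return ["breeze_checkout_metrics", "breeze_marketing_metrics"]


def initialize_tools(mode: str, euler_token: Optional[str], breeze_token: Optional[str]) -> List[str]:
    """Return the toolset the agent should expose for this session."""
    tools = list(SYSTEM_TOOLS)                 # system tools are always available
    if mode == "test":
        tools += DUMMY_TOOLS                   # no external calls in test mode
    elif mode == "live":
        if euler_token:
            tools += juspay_tools(euler_token)
        if breeze_token:
            tools += breeze_tools(breeze_token)
    return tools


# A live session with only a Juspay token gets system + Juspay tools.
print(initialize_tools("live", euler_token="tok_123", breeze_token=None))
```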
This architecture allows for clean separation of concerns and enables the creation of highly contextual and capable voice assistants.