Playground

Compare models and deployments side-by-side in real-time, debug runtime modules, and optimize LLM parameters within your Infralo workspace.

                  Workspace Playground
                           │
      ┌────────────────────┼────────────────────┐
      ▼                    ▼                    ▼
   Column 1             Column 2             Column 3
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│  Model A    │      │  Model B    │      │Deployment C │
├─────────────┤      ├─────────────┤      ├─────────────┤
│  Response   │      │  Response   │      │  Response   │
│  - Latency  │      │  - Latency  │      │  - Latency  │
│  - Tokens   │      │  - Tokens   │      │  - Tokens   │
│  - Cost     │      │  - Cost     │      │  - Debug    │
└─────────────┘      └─────────────┘      └─────────────┘

The Workspace Playground is an interactive console that allows AI developers and engineers to test, compare, and optimize models and deployments configured in Infralo.

Instead of writing custom scripts or calling multiple API endpoints manually, you can run prompts side-by-side, inspect real-time latencies and token expenditures, and debug custom runtime modules in a unified interface.

Core Capabilities

1. Multi-Model Comparison

You can compare up to three models or virtual deployments simultaneously. When you submit a prompt, the playground sends requests concurrently to each configured column, rendering their streaming outputs side-by-side.

Mixed Selection: Compare a raw model (e.g., gpt-4o-mini) directly against a virtual deployment (e.g., customer-support-lb) to test load balancing, fallbacks, and routing logic.
Sync / Add / Remove: Quickly add or remove comparison columns using the controls at the top of the playground.

2. Advanced Parameters

The playground separates configuration parameters into two levels of control:

Global Settings (Applies to all columns)

Access these by clicking the Settings button in the top action bar. They configure the overall context for the entire comparison session:

System Instructions: Define global system instructions or prompts (e.g. "You are a helpful assistant...") that are prepended to the context of every model call.
Max History Window: A slider to restrict conversation history to the last $N$ turns, allowing you to control context window sizes and optimize token costs.
Streaming (SSE): Globally toggle Server-Sent Events (SSE) streaming. If enabled, text renders token-by-token. (Note: If any of the selected models do not support streaming, this toggle will be auto-disabled with a warning).

Model-Specific Settings (Applies to individual columns)

Access these by clicking the Gear icon next to each column's model selector dropdown. Enable Advanced Config to override parameters independently for that specific model:

Temperature: Adjust creativity and randomness (from 0 to 2).
Max Output Tokens: Limit the maximum length of the generation response.
Top P: Control diversity of token selection via nucleus sampling (from 0 to 1).

3. Telemetry & Cost Tracking

For every message received, the playground captures and displays detailed transaction metadata:

Latency (ms): The total round-trip time of the generation request.
Token Breakdown: Exact counts for input, output, and cached tokens.
Cost Estimation: Real-time dollar cost calculated based on your provider rates and cache savings.
Actual Model Used: Displays the concrete provider model that ultimately served the request (extremely useful for verifying failover routing or fallback rules inside Deployments).

4. Runtime Module Debugging

When a column is mapped to a Virtual Deployment configured with pre- or post-execution Runtime Modules, the playground acts as an interactive debugger:

Execution Log: View a sequence of all executed modules, showing their execution order, latency, status (success, skipped, error), and error details.
Payload Inspection: Open the visual inspector to view the raw JSON payloads captured immediately before and immediately after each runtime module executes. This makes it easy to verify schema transformations, guardrails, or credential injections.

Using the Playground

Setting up Comparison Columns

Navigate to your workspace and select Playground from the sidebar.
Select a model or deployment in the first column's dropdown.
Click Add Model at the top right to create additional comparison columns.
Customize settings for each model by clicking the gear icon next to the dropdown in each column.

Running Prompts

Enter your prompt in the text area at the bottom of the screen.
Press Enter or click the send button.
The responses will stream in real-time across all active columns.
Analyze the responses side-by-side to compare output quality, latency, and costs.

Clearing History

To wipe the conversation history for all active columns and start a fresh session, click Clear Chat at the top right.

State Persistence

The playground has a built-in session-level state cache. If you navigate to other sections of Infralo (e.g., to adjust a deployment configuration or whitelist a new LLM) and then return to the Playground, your active columns, selected models, input prompt, and advanced settings are preserved.

Note: Clearing or reloading the browser tab will reset the playground state to the workspace defaults.

On this page