Workspaces

Understand workspaces as the core unit of logical resource isolation, team collaboration, and environment segregation in Infralo.

Workspaces are the primary structural container in Infralo. They act as logical partitions to organize, isolate, and control access to your machine learning resources.

By grouping models, routing rules, API keys, member roles, and telemetry into independent environments, workspaces allow organizations to establish strict operational boundaries.

                Workspace

    ┌────────┬────────┼────────┬────────┐
    ▼        ▼        ▼        ▼        ▼
 Models  Deployments API Keys Members Observability

1. Overview

Infralo is built to support multi-tenant and multi-team environments. Workspaces exist to provide complete logical isolation for different projects, teams, or lifecycle stages within a single organization:

  • Resource Isolation: Models whitelisted in one workspace are completely separate from other workspaces. A Virtual API Key created in Workspace A cannot authenticate requests or access resources in Workspace B.
  • Environment Segregation: You can define separate workspaces for Development, Staging, and Production. This ensures that experimental models or credentials used by developers do not interfere with live customer-facing deployments.
  • Team Boundaries: Isolate different teams (e.g., Customer Support AI vs. Internal BI Search) to prevent access cross-contamination and ensure clear ownership.
  • Independent Permissions: Assign roles to users on a workspace-by-workspace basis. A user might be an administrative manager in a test environment, but hold read-only viewer rights in production.
  • Isolated Observability: Logs, metrics, and billing details are partitioned by workspace. You can trace API request streams, analyze latencies, and audit token expenditures for individual applications without global noise.

2. Workspace Resources

Every workspace acts as a self-contained hub for several core Infralo features. Rather than defining configuration globally, you configure these resources inside the context of a workspace:

Workspace LLM Collection

A whitelist of models enabled for the workspace. This collection decouples applications from the global provider keys, allowing workspace administrators to specify exactly which models are available to the workspace's applications.

Deployments

Virtual load-balanced routing endpoints. Deployments group multiple whitelisted models together under a single endpoint alias and apply failover, round-robin weights, fallback rules, caching, and custom runtime plugins.

  • See Deployments for detailed routing and balancing configurations.

Virtual API Keys

Secure, custom credentials (prefixed with vk_...) generated for client applications to call the gateway. Virtual keys are restricted by endpoint scopes (e.g., chat only), model scopes, or deployment scopes.

Members

The set of global users assigned to the workspace. Administrators assign workspace-specific roles (owner, admin, member) to manage who can edit configurations or view logs.

Observability

Real-time usage telemetry, latencies, cache metrics, error distributions, cost tracking, and transaction request logs scoped strictly to the workspace.

Settings

The administrative console where workspace owners and administrators can edit basic metadata (name, description), configure the workspace-wide Response Cache (toggling cache activation and defining Time-To-Live / TTL durations), or permanently delete the workspace.


3. Workspace Lifecycle

Setting up a workspace is the entry point to using Infralo. The diagram below illustrates how multiple platform features connect sequentially when provisioning a new environment:

Create Workspace


Invite Members


Whitelist Models


Create Deployments


Generate Virtual API Keys


Applications Send Requests


Monitor with Observability
  1. Create Workspace: A platform administrator or tenant member creates a new workspace container and defines its basic details.
  2. Invite Members: Workspace administrators add team members to the workspace and define their roles to delegate setup responsibilities.
  3. Whitelist Models: Admins select LLMs from the global provider catalog and whitelist them for the workspace, defining local limits and availability.
  4. Create Deployments: High-availability endpoints are configured, attaching whitelisted models to load balancers, caching rules, and runtime security modules.
  5. Generate Virtual API Keys: Scoped API credentials are created for client applications, locking down access to specific models, deployments, or routes.
  6. Applications Send Requests: Client applications swap their raw provider keys for the virtual keys and point their SDKs to the Infralo gateway.
  7. Monitor with Observability: AI engineers and operations teams monitor incoming requests, audit trace executions, analyze token costs, and tune performance.

4. Best Practices

For secure and efficient workspace operations, Infralo recommends the following practices:

  • Strict Environment Isolation: Never mix development tests and production traffic in the same workspace. Maintain separate workspaces for dev, staging, and prod to prevent credential exposure and logging confusion.
  • Enforce Least-Privilege Access: Limit the number of workspace owner and admin roles. Assign the standard member role to developers so they can view settings and logs without modifying routing or whitelists.
  • Keep Whitelists Minimal: Only whitelist the models a workspace actually needs. This prevents developers or automated apps from invoking expensive, unapproved models.
  • Prefer Deployments in Production: Avoid giving client applications direct access to individual models. Route production traffic through Virtual Deployments to benefit from automated load balancing, caching, failover, and runtime guards.
  • Monitor Logs & Cost Trends: Review workspace observability metrics weekly to identify high-cost outliers, monitor cache efficiency, and troubleshoot application errors.

On this page