Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/NVIDIA/OpenShell/llms.txt

Use this file to discover all available pages before exploring further.

OpenShell handles inference traffic through two paths: requests to external hosts like api.openai.com, and requests to inference.local, a special endpoint exposed inside every sandbox.

Two routing paths

PathHow it works
External endpointsTraffic to external hosts is treated like any other outbound request. It is allowed or denied by network_policies. See Policies for details.
inference.localA special HTTPS endpoint exposed inside every sandbox. The privacy router strips the sandbox-supplied credentials, injects the configured backend credentials, and forwards the request to the managed model endpoint.

How inference.local works

When code inside a sandbox calls https://inference.local, the privacy router intercepts the request and routes it to the backend configured for that gateway. OpenShell applies the configured model to generation requests and supplies the provider credentials itself — no sandbox code needs access to the real API key. If code calls an external inference host directly, that traffic bypasses inference.local entirely and is evaluated only by network_policies.
PropertyDetail
CredentialsNo sandbox API keys needed. Credentials come from the configured provider record.
ConfigurationOne provider and one model define sandbox inference for the active gateway. Every sandbox on that gateway sees the same inference.local backend.
Provider supportNVIDIA NIM, any OpenAI-compatible provider, and Anthropic all work through the same endpoint.
Hot-refreshProvider credential changes and inference updates propagate within about 5 seconds by default, without recreating sandboxes.
The client-supplied model and api_key values sent to inference.local are not forwarded upstream. The privacy router injects the real credentials from the configured provider and rewrites the model before forwarding.

Supported API patterns

The patterns accepted by inference.local depend on the provider type configured for the gateway.
PatternMethodPath
Chat CompletionsPOST/v1/chat/completions
CompletionsPOST/v1/completions
ResponsesPOST/v1/responses
Model DiscoveryGET/v1/models
Model DiscoveryGET/v1/models/*
Requests to inference.local that do not match the configured provider’s supported patterns are denied.

Next steps

Configure inference routing

Set up the provider and model behind inference.local.

Sandbox policies

Control which external inference endpoints sandboxes can reach.