Introduction

Orqen documentation

Orqen cuts your LLM bill by removing the tokens your agent doesn't need — before every call to the provider. It works with the SDK you already use (Anthropic, OpenAI, or AWS Bedrock). Update the key and endpoint, keep your existing request shape.

Quickstart

Send your first request through Orqen in under 5 minutes.

Authentication

How Orqen API keys work and how to manage your provider keys.

API Reference

Full reference for every endpoint, parameter, and response field.

What is Orqen?

When your LLM agent sends every tool, every schema, long history, large tool outputs, and old image context on every API call, you pay for tokens that do not help the response and the model has more noise to reason through.

Orqen sits between your agent and your LLM provider. On each request, it builds one intent-aware plan, routes the model if asked, forwards the relevant tool subset, compresses schemas and tool results, trims stale multimodal context, assembles the final prompt, validates critical terms, and forwards the optimized payload to the model.

Observed from live Orqen traffic: 437 calls, $30.01 saved on provider bills, 7,688,105 tokens removed, and 53.9% less tool context per call on average.

Intent plan

One request-level plan coordinates tool routing, compression, model routing, and safety.

Payload compression

Tool results, schemas, old context, and images are compacted. Long sessions are summarized in tiers — recent turns kept verbatim, older history compressed or summarized — so prompt size stays stable as sessions grow.

Final validation

Critical IDs, URLs, constraints, and required tool schema fields are checked before forwarding.

Quality feedback loop

Correction signals in the next user message, tool recall misses, and HTTP errors trigger immediate aggressiveness clamping. A weekly calibrator sets the long-term baseline per API key.

How it fits into your stack

Orqen supports the three main SDK families directly. Point your existing client at Orqen with a key and endpoint update — usually api_key and base_url — while your messages and tools keep their current format. Anthropic Messages, Bedrock Converse, and OpenAI Chat Completions are accepted in their usual shapes.

See the provider migration examples for side-by-side before/after snippets for each SDK.

Compatible with:

Anthropic SDKOpenAI Python SDKOpenAI Node.js SDKAWS Bedrock SDK (boto3)LangChainLlamaIndexFrameworks with OpenAI-compatible APIsAny HTTP client

Supported LLM providers

Orqen forwards requests to your LLM provider using your own API keys. Supported providers include:

OpenAI (GPT-4o, GPT-4.1, o-series, …)
Anthropic (Claude Sonnet, Haiku, Opus, …)
AWS Bedrock (Claude, Llama, Titan, …)
Google Gemini (Gemini 2.x Flash, Pro, …)
Groq (Llama 3.3, Mixtral, …)
Mistral, Together AI, Fireworks, OpenRouter, Cohere

Start the quickstart API reference →