Orqen Docs

Core Concepts

Payload optimization

Orqen optimizes the full agent payload before it reaches the model. That includes tools, schemas, tool results, history, images, model choice, reconstruction, validation, and recovery.

Orqen snapshots, optimizes, validates, then forwards each request.

One intent-aware plan coordinates every stage

Before forwarding a request, Orqen builds one request plan. The same plan guides tool routing, compression, summarization, reconstruction, validation, and recovery so those stages stay aligned instead of making isolated decisions.

What happens on a request

1. Snapshot and understand the request

Orqen snapshots the original payload, reads the current goal, and optionally enriches high-value requests before changing anything.

2. Decide the right cleanup level

Orqen chooses how much history to keep, when to compress, how many tools to forward, how to trim schemas, and how strict validation should be.

3. Optimize the whole payload

Orqen deduplicates prompts, compresses images and tool results, manages hot/warm/cold history, routes relevant tools, and trims schemas after routing.

4. Rebuild and validate

Orqen assembles the final model-facing request, checks critical terms and tool schemas, and restores context if validation fails.
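
The four steps above can be pictured as a snapshot, optimize, validate, fall-back loop. The following is a minimal sketch under stated assumptions, not Orqen's implementation: RequestPlan, optimize, validate, and process are hypothetical names, the plan fields are illustrative, and falling back to the full snapshot is a simplification of "restores context".

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestPlan:
    # Hypothetical knobs, loosely named after the decisions in step 2.
    hot_turns: int          # how many recent messages to keep verbatim (>= 1)
    max_tools: int          # how many tools to forward
    critical_terms: tuple   # terms that must survive optimization


def optimize(payload: dict, plan: RequestPlan) -> dict:
    """Toy optimizer: keep only recent messages and the first few tools."""
    out = dict(payload)
    out["messages"] = payload["messages"][-plan.hot_turns:]
    out["tools"] = payload["tools"][:plan.max_tools]
    return out


def validate(payload: dict, plan: RequestPlan) -> bool:
    """Check that every critical term still appears in some message."""
    text = " ".join(m["content"] for m in payload["messages"])
    return all(term in text for term in plan.critical_terms)


def process(payload: dict, plan: RequestPlan) -> dict:
    snapshot = copy.deepcopy(payload)     # step 1: snapshot before changing anything
    candidate = optimize(payload, plan)   # step 3: optimize under the single plan
    if validate(candidate, plan):         # step 4: validate the rebuilt request
        return candidate
    return snapshot                       # recovery: fall back to the original payload
```

Because the same plan object is passed to both optimize and validate, the two stages cannot disagree about what counts as critical, which is the point of building one plan per request.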

Intent-aware tool routing

Tool routing is one part of the payload optimization layer. Orqen does more than compare text similarity: it interprets the current request and builds an intent frame such as:

{
  "domain": "weather",
  "action": "forecast",
  "slots": { "location": "Sittingbourne Kent" },
  "side_effect_allowed": false,
  "previous_tool_error": false,
  "confidence": 0.73
}

That frame is matched against tool capability cards derived from function names, descriptions, schemas, required inputs, and optional routing examples.

{
  "type": "function",
  "function": {
    "name": "open_meteo_weather",
    "description": "Get real weather forecast for a city.",
    "x-orqen-examples": [
      "weather in London",
      "forecast for Sittingbourne Kent"
    ],
    "parameters": {
      "type": "object",
      "properties": { "city": { "type": "string" } },
      "required": ["city"]
    }
  }
}
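
To make the matching step concrete, here is a rough sketch that scores an intent frame against a capability card. It uses plain token overlap as a stand-in; the source does not describe Orqen's actual scoring, and the function names (tokens, card_tokens, frame_tokens, score) are hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def card_tokens(tool: dict) -> set:
    """Build a capability card from name, description, examples, and parameter names."""
    fn = tool["function"]
    parts = [fn["name"], fn.get("description", "")]
    parts += fn.get("x-orqen-examples", [])
    parts += list(fn.get("parameters", {}).get("properties", {}))
    return tokens(" ".join(parts))


def frame_tokens(frame: dict) -> set:
    """Collect tokens from the frame's domain, action, and slot values."""
    parts = [frame["domain"], frame["action"], *frame["slots"].values()]
    return tokens(" ".join(parts))


def score(frame: dict, tool: dict) -> float:
    """Fraction of frame tokens covered by the card (assumed metric, 0.0 to 1.0)."""
    wanted = frame_tokens(frame)
    if not wanted:
        return 0.0
    return len(wanted & card_tokens(tool)) / len(wanted)
```

With the frame and the open_meteo_weather definition shown above, every frame token appears on the card, while an unrelated file-listing tool would share none. The routing examples matter here: without them, "Sittingbourne Kent" would not appear on the card at all.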

Adaptive K

Orqen chooses how many tools to forward based on confidence, risk, and recovery signals:

Crisp single intent: 1 tool. Example: list files, weather in London.
Moderate confidence: 2-3 tools. Enough room for close alternatives.
Multi-step or failed retry: up to 4 tools. Protects recall when the agent is recovering.
Side effects: minimum 3 tools. Write/send/execute operations are widened unless confidence is very high.
Timeout or error: all tools. Fail-open behavior keeps customer requests reliable.
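
The rules above could be expressed as a single function. This is an illustrative sketch only: choose_k is a hypothetical name, and the numeric thresholds (0.85, 0.60, 0.95) are assumptions, since the source gives no cut-off values.

```python
def choose_k(total_tools: int, confidence: float, *, multi_step: bool = False,
             failed_retry: bool = False, side_effects: bool = False,
             routing_failed: bool = False) -> int:
    """Pick how many tools to forward, following the Adaptive K rules."""
    # Assumed thresholds; the real values are not documented here.
    CRISP, MODERATE, VERY_HIGH = 0.85, 0.60, 0.95

    if routing_failed:
        return total_tools            # timeout or error: fail open, forward everything

    if confidence >= CRISP:
        k = 1                         # crisp single intent
    elif confidence >= MODERATE:
        k = 2                         # moderate confidence: room for a close alternative
    else:
        k = 3

    if multi_step or failed_retry:
        k = 4                         # widen to protect recall during recovery

    if side_effects and confidence < VERY_HIGH:
        k = max(k, 3)                 # write/send/execute: at least 3 unless very confident

    return min(k, total_tools)        # never ask for more tools than exist
```

Under these assumed thresholds, the example frame with confidence 0.73 and no side effects would forward 2 tools.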

Conversation history management

For multi-turn agents, prompt size grows with each exchange. Orqen manages history in three tiers:

Hot (recent turns): kept verbatim in the forwarded payload.
Warm (older turns): compressed and deduplicated, with semantically equivalent content collapsed.
Cold (early turns): summarized, replaced by a compact LLM-generated summary of the chunk.
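
As a rough illustration of the three tiers, the sketch below partitions a list of turns by position. The function name split_tiers and the window sizes (4 hot, 8 warm) are assumptions; the source does not say how Orqen draws the boundaries.

```python
def split_tiers(turns: list, hot: int = 4, warm: int = 8) -> dict:
    """Partition turns (oldest first) into cold, warm, and hot tiers."""
    n = len(turns)
    hot_start = max(0, n - hot)             # last `hot` turns stay verbatim
    warm_start = max(0, hot_start - warm)   # the `warm` turns before that get compressed
    return {
        "cold": turns[:warm_start],         # everything earlier gets summarized
        "warm": turns[warm_start:hot_start],
        "hot": turns[hot_start:],
    }
```

A short conversation falls entirely into the hot tier, so nothing is compressed or summarized until the history outgrows the hot window.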

For very long sessions (100+ turns), summaries are merged hierarchically. Each merge call handles exactly two summaries, so no single LLM call sees unbounded input and early context is never silently truncated.
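
A pairwise hierarchical merge can be sketched as follows. The name merge_hierarchically is hypothetical, and merge_pair stands in for whatever LLM call produces a combined summary; it is passed in as a parameter so the sketch runs without a model.

```python
from typing import Callable, List


def merge_hierarchically(summaries: List[str],
                         merge_pair: Callable[[str, str], str]) -> str:
    """Reduce summaries level by level; each merge_pair call sees exactly two inputs."""
    if not summaries:
        return ""
    level = list(summaries)
    while len(level) > 1:
        next_level = []
        # Merge neighbours two at a time, preserving chronological order.
        for i in range(0, len(level) - 1, 2):
            next_level.append(merge_pair(level[i], level[i + 1]))
        if len(level) % 2 == 1:
            next_level.append(level[-1])   # odd one out is carried up unchanged
        level = next_level
    return level[0]
```

For n summaries this makes n - 1 merge calls, and every summary contributes to the final result, so no early chunk is dropped to fit a size limit.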

Learning loop

Orqen stores privacy-safe optimization traces: detected intent, selected tools, top candidates, recall, compression strategy, reconstruction strategy, shadow proactive rebuild, and recovery signals. This lets Orqen calibrate the system from real data without storing raw prompts.

x-orqen-tools-input:  51
x-orqen-tools-output: 1
x-orqen-prune-ratio:  1/51
x-orqen-routing:      semantic
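
If these values are exposed as HTTP headers (an assumption; the page does not say where they appear), a client could read them as in the sketch below. The function name routing_stats is hypothetical, and the prune ratio is interpreted as output/input, which matches the 1/51 sample above.

```python
from fractions import Fraction


def routing_stats(headers: dict) -> dict:
    """Parse the x-orqen-* routing values from a header mapping."""
    h = {k.lower(): v.strip() for k, v in headers.items()}   # header names are case-insensitive
    tools_in = int(h["x-orqen-tools-input"])
    tools_out = int(h["x-orqen-tools-output"])
    return {
        "tools_in": tools_in,
        "tools_out": tools_out,
        "pruned": tools_in - tools_out,                 # tools removed before forwarding
        "kept": Fraction(h["x-orqen-prune-ratio"]),     # parses strings such as "1/51"
        "routing": h["x-orqen-routing"],
    }
```

For the sample values above, this reports 51 tools in, 1 tool out, and 50 pruned.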

Best practice

Write tool descriptions with explicit scope, keep required schema fields accurate, and preserve meaningful IDs/URLs in user-visible turns. Add x-orqen-examples when two tools have similar names or overlapping domains.
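
Two of these practices can be checked mechanically before tools are registered. The following is a small lint sketch, not an Orqen feature: lint_tools is a hypothetical helper, and "similar names" is approximated as sharing at least one name token.

```python
import re
from itertools import combinations


def name_tokens(tool: dict) -> set:
    """Split a function name such as open_meteo_weather into lowercase tokens."""
    return set(re.findall(r"[a-z0-9]+", tool["function"]["name"].lower()))


def lint_tools(tools: list) -> list:
    """Return human-readable warnings for a list of tool definitions."""
    problems = []
    for tool in tools:
        fn = tool["function"]
        params = fn.get("parameters", {})
        props = params.get("properties", {})
        # Every required field should actually be defined in the schema.
        for field in params.get("required", []):
            if field not in props:
                problems.append(f"{fn['name']}: required field '{field}' is not in properties")
    # Tools whose names overlap benefit from explicit routing examples.
    for a, b in combinations(tools, 2):
        if name_tokens(a) & name_tokens(b):
            for tool in (a, b):
                if not tool["function"].get("x-orqen-examples"):
                    problems.append(
                        f"{tool['function']['name']}: name overlaps another tool, "
                        "consider adding x-orqen-examples"
                    )
    return problems
```

Running a check like this in CI would catch a stale required list or an ambiguous pair of tools before they affect routing.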