When a request reaches Metnos, we try to recognise it before bothering the LLM planner. If we have seen it before and have already settled on a good way to answer it, we replay that answer and get back to the user in half a second instead of ten. No language models on the critical path, just memory.
The Metnos planner is a local Qwen 3.6 35B-A3B: it thinks well but takes
about twelve seconds to decide the first step of a turn. For many
requests this wait is disproportionate. “What time is it?”
does not need an LLM: it needs get_now. Even
“download this page and describe it in two lines” does
not need the planner if it is a request we make often and whose
sequence we already know: get_urls followed by
describe_entries.
Hence the question: how do we recognise that a request is already known? And how do we accept that almost known is enough? The answer is the introvertive fast-path, organised in two layers. Each captures a different degree of certainty, and both converge on the same outcome: the assistant executes the right sequence without thinking twice.
| Layer | What it recognises | How | Cost per hit |
|---|---|---|---|
| L0 | A request already solved successfully, identical or semantically very close | Exact hash (0a) + BGE-M3 cosine (0b) | < 5 ms (hash) / < 150 ms (cosine) |
| L1 | An intent belonging to a cluster confirmed by positive user feedback | Intent hash + cluster_id lookup | ~30 ms |
The flow is strictly sequential: first L0 is attempted, then L1 if L0 missed. If both miss, the LLM planner (Engine v2) takes over as always.
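The sequential flow can be pictured with a minimal sketch. The function name `resolve`, the `Plan` type and the three callables are hypothetical stand-ins for illustration, not the Metnos API.

```python
from typing import Callable, Optional

# Hypothetical representation: a plan is an ordered list of executor names.
Plan = list[str]


def resolve(
    query: str,
    l0_lookup: Callable[[str], Optional[Plan]],
    l1_lookup: Callable[[str], Optional[Plan]],
    planner: Callable[[str], Plan],
) -> tuple[str, Plan]:
    """Try L0 first, then L1, and fall back to the LLM planner on a double miss."""
    plan = l0_lookup(query)
    if plan is not None:
        return "L0", plan
    plan = l1_lookup(query)
    if plan is not None:
        return "L1", plan
    # Both layers missed: only now is the slow planner invoked.
    return "planner", planner(query)
```

The planner is passed as a callable so that it is only invoked when both lookups return `None`, which keeps it off the critical path for known requests.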
L0 (fastpath.py)
The first layer lives in runtime/engine/fastpath.py and manages
a SQLite database (fastpaths.sqlite) of plans already executed.
Entries are produced automatically: every time a turn
completes successfully (full plan from the engine, L1 hit, or promotion
from cosine 0b), Metnos records the canonical query, its hash, the BGE-M3
embedding, the complete plan (framework) and the intent (verb + object).
No approval is needed: the chains are made of executors already vetted and
tested.
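As an illustration of the record described above and of the two lookup steps named in the layer table (exact hash 0a, then cosine 0b), here is a minimal sketch. The table schema, the canonicalisation rule, the 0.9 threshold and the function names are all assumptions made for the example; real embeddings would come from BGE-M3, while the sketch takes plain lists of floats.

```python
import hashlib
import json
import math
import sqlite3
from typing import Optional


def canonical(query: str) -> str:
    # Assumed canonicalisation: lowercase and collapse whitespace.
    return " ".join(query.lower().split())


def qhash(query: str) -> str:
    return hashlib.sha256(canonical(query).encode("utf-8")).hexdigest()


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def open_store() -> sqlite3.Connection:
    # Hypothetical schema; an in-memory database stands in for fastpaths.sqlite.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE fastpaths ("
        "hash TEXT PRIMARY KEY, query TEXT, embedding TEXT, plan TEXT)"
    )
    return db


def record(db: sqlite3.Connection, query: str, embedding: list[float], plan: list[str]) -> None:
    db.execute(
        "INSERT OR REPLACE INTO fastpaths VALUES (?, ?, ?, ?)",
        (qhash(query), canonical(query), json.dumps(embedding), json.dumps(plan)),
    )


def lookup(db: sqlite3.Connection, query: str, embedding: list[float],
           threshold: float = 0.9) -> Optional[tuple[str, list[str]]]:
    # Phase 0a: exact match on the hash of the canonical query.
    row = db.execute("SELECT plan FROM fastpaths WHERE hash = ?", (qhash(query),)).fetchone()
    if row:
        return "0a", json.loads(row[0])
    # Phase 0b: best cosine similarity over all stored embeddings.
    best_score, best_plan = 0.0, None
    for emb, plan in db.execute("SELECT embedding, plan FROM fastpaths"):
        score = cosine(embedding, json.loads(emb))
        if score > best_score:
            best_score, best_plan = score, plan
    if best_plan is not None and best_score >= threshold:
        return "0b", json.loads(best_plan)
    return None
```

A linear scan is enough for a store capped at a few hundred rows; a larger store would probably want a vector index instead.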
The lookup proceeds in two phases:
/admin/praxis console.
undo_last_turn and get_inputs do not generate fastpaths, because their semantics depend on the turn context.
A plan carrying an absolute date (since_iso="2026-06-11") is not recorded, because replay on a different day would execute a frozen time window. Relative dates (time_window="today") remain cacheable, because replay re-resolves them correctly.
L1 (autopath.py)
The second layer lives in runtime/engine/autopath.py and
operates on a different logic: it does not repeat the same
query, but generalises to a cluster of intents confirmed by
positive user feedback.
When the user gives a “✓” feedback, Metnos records the turn's framework, its hash and the semantic cluster (intent hash + cluster ID). After a minimum number of confirmations (configurable, default 1) on the same framework hash and cluster, the plan becomes a reusable autopath: the next time an intent in the same cluster arrives, the plan is replayed without going through the planner.
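The confirmation counting described above might look like the following sketch. The class `AutopathStore` and its methods are hypothetical names; only the idea of counting confirmations per cluster and framework hash, with a default minimum of 1, comes from the text.

```python
from collections import Counter
from typing import Optional


class AutopathStore:
    """Counts positive feedback per (cluster, framework hash) pair."""

    def __init__(self, min_obs: int = 1):
        self.min_obs = min_obs
        self._confirmations: Counter = Counter()  # (cluster_id, framework_hash) -> count
        self._plans: dict[str, list[str]] = {}    # framework_hash -> plan

    def confirm(self, cluster_id: str, framework_hash: str, plan: list[str]) -> None:
        # Called when the user gives positive feedback on a turn.
        self._confirmations[(cluster_id, framework_hash)] += 1
        self._plans[framework_hash] = plan

    def lookup(self, cluster_id: str) -> Optional[list[str]]:
        # Keep only the frameworks that reached the confirmation threshold.
        candidates = [
            (count, fh)
            for (cid, fh), count in self._confirmations.items()
            if cid == cluster_id and count >= self.min_obs
        ]
        if not candidates:
            return None
        # Assumption: when several frameworks qualify, prefer the most confirmed one.
        _, best = max(candidates)
        return self._plans[best]
```

With the default `min_obs=1`, a single confirmation is enough for the plan to be replayed the next time the same cluster is hit.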
L0 fastpaths age and die deterministically, with no LLM involved. The
nightly task_state_reaper job applies three aging rules and
four death conditions.
| Rule | Criterion | Default | Env |
|---|---|---|---|
| Never reused | Created more than N days ago but never served a second time | 14 days | METNOS_FASTPATH_GRACE_DAYS |
| Stale | Last use more than N days ago | 30 days | METNOS_FASTPATH_STALE_DAYS |
| LRU cap | Total entries above the cap; least recently used are pruned | 500 | METNOS_FASTPATH_MAX |
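The three aging rules in the table above could be applied along these lines. This is a sketch under assumptions: the `Entry` fields are hypothetical, and `n_uses` is taken to count every time the entry was served, so a value of 1 or less means "never reused".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Entry:
    id: int
    created: datetime
    last_used: datetime
    n_uses: int  # assumed to include the first serve


def prune(entries: list[Entry], now: datetime, grace_days: int = 14,
          stale_days: int = 30, max_entries: int = 500) -> set[int]:
    """Return the ids of the entries that the aging rules would delete."""
    doomed: set[int] = set()
    for e in entries:
        never_reused = e.n_uses <= 1 and now - e.created > timedelta(days=grace_days)
        stale = now - e.last_used > timedelta(days=stale_days)
        if never_reused or stale:
            doomed.add(e.id)
    # LRU cap: among the survivors, keep only the most recently used ones.
    survivors = sorted(
        (e for e in entries if e.id not in doomed),
        key=lambda e: e.last_used,
        reverse=True,
    )
    doomed.update(e.id for e in survivors[max_entries:])
    return doomed
```

The function is pure and takes `now` as an argument, which keeps the rules deterministic and easy to test.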
| Code | Cause | Inheritance |
|---|---|---|
| C1 | A tool in the plan no longer exists in the catalogue (retired, renamed, archived). Replay would fail. | No |
| C2 provenance | The fastpath was promoted to a synthetic executor (see §9) and that executor is now in the catalogue. | Yes |
| C2 name | An executor named {verb}_{object} matching the intent exists, but no tool in the plan belongs to that family. The fastpath would shadow the executor. | Yes |
| C2 prefilter | For multi-step plans: the deterministic routing prefilter on the canonical query indicates that a single executor now covers the intent (even under a different name). | Yes |
When a fastpath dies by supersession (C2), its usage counts
(n_uses) are transferred to the heir executor via the
executor aging system. Accumulated demand is not lost.
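To make condition C1 and the inheritance of usage counts concrete, here is a small sketch. The function names and the plain dictionary of executor counters are hypothetical; the executor aging system itself is not shown.

```python
def missing_tools(plan_tools: list[str], catalogue: set[str]) -> list[str]:
    """C1 check: tools in the plan that are no longer in the catalogue."""
    return [tool for tool in plan_tools if tool not in catalogue]


def transfer_uses(fastpath_uses: int, heir: str, executor_uses: dict[str, int]) -> None:
    """C2 inheritance: add the dead fastpath's n_uses to its heir executor."""
    executor_uses[heir] = executor_uses.get(heir, 0) + fastpath_uses


# Example: describe_entries has been retired, so the plan can no longer be replayed.
catalogue = {"get_now", "get_urls"}
dead = missing_tools(["get_urls", "describe_entries"], catalogue)
```

A non-empty result from `missing_tools` means the fastpath dies without an heir, while `transfer_uses` applies only to the C2 cases.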
Recognising the request is half the work. The other half is
reconstructing the concrete argument values: which paths, which URLs,
which date, which threshold. Metnos has a deterministic extractor
(args_extractor.py) that works by rules:
URL (https://...), path (~/... or /..., with the “home” shortcut becoming ~/), email, numbers, file extensions (“PDF file” becomes *.pdf), dates (today/yesterday/tomorrow/day-after-tomorrow in Italian and English, mapped to ISO format), time windows (“this week”, “last 24 hours”, “last 7 days”).
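A rule-based extractor of this kind can be sketched in a few lines. The regular expressions, the list of known extensions and the output keys below are illustrative assumptions, not the rules in args_extractor.py, and only English phrases are covered.

```python
import re
from datetime import date, timedelta

URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Hypothetical set of extensions; "PDF file" becomes "*.pdf".
EXT_RE = re.compile(r"\b(pdf|txt|csv|png|jpg) files?\b")
# Longest phrase first, so "day after tomorrow" wins over "tomorrow".
RELATIVE_DAYS = {"day after tomorrow": 2, "yesterday": -1, "tomorrow": 1, "today": 0}


def extract_args(text: str, today: date) -> dict:
    """Extract concrete argument values from a request using fixed rules."""
    args: dict = {}
    urls = [u.rstrip(".,;)") for u in URL_RE.findall(text)]
    if urls:
        args["urls"] = urls
    emails = EMAIL_RE.findall(text)
    if emails:
        args["emails"] = emails
    lowered = text.lower()
    ext = EXT_RE.search(lowered)
    if ext:
        args["pattern"] = f"*.{ext.group(1)}"
    for phrase, offset in RELATIVE_DAYS.items():
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            # Relative dates are resolved against the day of the call.
            args["date"] = (today + timedelta(days=offset)).isoformat()
            break
    return args
```

Passing `today` explicitly mirrors the point made earlier about relative dates: the same request replayed on another day resolves to a different ISO date.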
Fast-path parameters are controlled by METNOS_*
environment variables. A TOML file
(~/.config/metnos/runtime.toml) provides persistent values;
the default hardcoded in the module is the last safety net.
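One way to read this layering is as a three-step lookup. The sketch below assumes that environment variables take precedence over the TOML file, which in turn takes precedence over the hardcoded default; it also assumes the persisted values are keyed by the same variable names. Parsing the TOML file is left out, and `resolve_setting` is a hypothetical helper.

```python
from typing import Mapping

# Defaults as listed in this document; the module-level dict is an assumption.
DEFAULTS = {
    "METNOS_FASTPATH_STALE_DAYS": 30,
    "METNOS_FASTPATH_GRACE_DAYS": 14,
    "METNOS_FASTPATH_MAX": 500,
}


def resolve_setting(name: str, env: Mapping[str, str],
                    persisted: Mapping[str, int],
                    defaults: Mapping[str, int] = DEFAULTS) -> int:
    """Return the first value found in: environment, persisted TOML, defaults."""
    for source in (env, persisted, defaults):
        if name in source:
            return int(source[name])
    raise KeyError(name)
```

In real use `env` would be `os.environ` and `persisted` the parsed content of runtime.toml.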
| Variable | Default | Meaning |
|---|---|---|
| METNOS_FASTPATH_STALE_DAYS | 30 | Calendar days after which an unused entry is pruned |
| METNOS_FASTPATH_GRACE_DAYS | 14 | Grace days for never-reused entries |
| METNOS_FASTPATH_MAX | 500 | Maximum rows (LRU cap) |
| Variable | Default | Meaning |
|---|---|---|
| METNOS_AUTOPATH_MIN_OBS | 1 | Minimum positive observations to promote an autopath |
| METNOS_AUTOPATH_TTL_ANTI | 2592000 (30 d) | Anti-autopath duration in seconds |
| METNOS_AUTOPATH_TTL_REPEAT | 3600 (1 h) | Soft window for repeated feedback |
| Variable | Default | Meaning |
|---|---|---|
| METNOS_FP_PROMOTE_MIN_CLUSTER | 3 | Minimum distinct fastpaths in the cluster |
| METNOS_FP_PROMOTE_MIN_USES | 15 | Minimum cumulative usage |
| METNOS_FP_PROMOTE_MIN_AGE_DAYS | 30 | Minimum cluster age |
| METNOS_FP_PROMOTE_MAX_PER_NIGHT | 3 | Maximum new emissions per night |
| METNOS_FASTPATH_AUTOPROMOTE | off | Enables Tier 2 auto-promotion (no human approval) |
When a group of recurring L0 fastpaths shares the same plan structure
(framework hash) and the same intent, a nightly job
(task_fastpath_promotion) evaluates them as candidates for
becoming a full synthetic executor. Promotion is
cluster-based, never per-instance: at least 3 distinct
fastpaths, 15 cumulative uses and 30 days of age are required. Only
multi-step patterns are promoted: single-step ones already have their
executor, and the value of a fastpath there lies in skipping the LLM, not in skipping the plan.
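The cluster thresholds above can be expressed as a single predicate. This is a sketch under assumptions: the `Fastpath` fields are hypothetical, and cluster age is measured here from the oldest member, which the text does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Fastpath:
    id: int
    n_uses: int
    created: datetime


def is_promotable(cluster: list[Fastpath], n_steps: int, now: datetime,
                  min_cluster: int = 3, min_uses: int = 15,
                  min_age_days: int = 30) -> bool:
    """Check a cluster of fastpaths sharing one framework hash and intent."""
    if not cluster or n_steps < 2:
        # Single-step plans are never promoted.
        return False
    distinct = len({fp.id for fp in cluster})
    total_uses = sum(fp.n_uses for fp in cluster)
    # Assumption: the cluster is as old as its oldest fastpath.
    age = now - min(fp.created for fp in cluster)
    return (
        distinct >= min_cluster
        and total_uses >= min_uses
        and age >= timedelta(days=min_age_days)
    )
```

The nightly cap (at most 3 new emissions) would sit outside this predicate, in whatever loop ranks the eligible clusters.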
Candidates appear in /admin/proposals and in the unified hub. Approval writes a
synt_pending/ marker that starts the full synthesis pipeline
(5 stages + test + signing + installation).
Every emitted candidate records the IDs and canonical hashes of the
source fastpaths in a promotions table. When the executor
enters the catalogue, C2 provenance-based death is exact: the link
between the original fastpath and its heir is recorded, not inferred
from the name.
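The recorded link between source fastpaths and their heir could be kept in a table like the one sketched below. The column names and helper functions are hypothetical; only the existence of a promotions table holding fastpath IDs and canonical hashes comes from the text.

```python
import sqlite3


def open_promotions() -> sqlite3.Connection:
    # Hypothetical schema: one row per source fastpath of an emitted candidate.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE promotions ("
        "executor TEXT, fastpath_id INTEGER, canonical_hash TEXT)"
    )
    return db


def record_candidate(db: sqlite3.Connection, executor: str,
                     sources: list[tuple[int, str]]) -> None:
    """Store the (fastpath id, canonical hash) pairs behind a candidate."""
    db.executemany(
        "INSERT INTO promotions VALUES (?, ?, ?)",
        [(executor, fid, chash) for fid, chash in sources],
    )


def superseded(db: sqlite3.Connection, catalogue: set[str]) -> dict[int, str]:
    """Map each fastpath to its heir, once the heir is in the catalogue."""
    rows = db.execute("SELECT fastpath_id, executor FROM promotions").fetchall()
    return {fid: executor for fid, executor in rows if executor in catalogue}
```

Because the lookup goes through recorded IDs, a renamed executor or a coincidental name match does not affect which fastpaths are retired.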
© 2026 Roberto Brunialti · Metnos documentation