Tier ladder
Lumen routes across 4 cost/quality tiers, plus an anthropic-only brand cascade for Pro+ customers.
Pareto frontier
| Tier | Model | $/1k tokens | Min quality | Max context |
|--------|------------------|-------------|-------------|-------------|
| nano | gemini-2.5-flash | $0.001 | 0.60 | 30k |
| small | gpt-4o-mini | $0.003 | 0.80 | 128k |
| mid | gemini-2.5-pro | $0.012 | 0.90 | 2M |
| high | claude-opus-4-8 | $0.060 | 0.98 | 200k |
These are the cheapest model at each quality floor — anything cheaper at the same quality is dominated and excluded.
Brand cascade (Pro+ only)
| Tier | Model | $/1k tokens | Min quality | Max context |
|-----------------|--------------------|-------------|-------------|-------------|
| anthropic-fast | claude-haiku-4-5 | $0.005 | 0.78 | 200k |
| anthropic-mid | claude-sonnet-4-6 | $0.015 | 0.92 | 200k |
| (anthropic-high)| claude-opus-4-8 | $0.060 | 0.98 | 200k |
Set brand_pref="anthropic" in your request to route through this cascade exclusively.
How a tier is selected
1. Classify difficulty: length + tool-use + json mode + reasoning/code/creative signals -> a 0..1 score.
2. Compute effective floor: max(difficulty_score, your quality_floor).
3. Filter: keep only tiers whose min_quality >= effective_floor AND max_context >= request size.
4. Pick cheapest: lowest cost_per_1k from the filtered set.
5. Budget check: if max_cost_usd was set and the chosen tier exceeds it, downgrade to the highest-quality tier that fits the budget. If nothing fits, return HTTP 402.
Plan ceilings
Each plan caps which tier you can route to:
- Free — nano only
- Starter — through high
- Pro / Scale — through high + brand_pref enabled
- Enterprise — through high + custom routing