Lumen API autonomous LLM cost-router
F1838-C · Public launch · Powered by Elite AI Empire

Pay 30-70% less
without sacrificing quality.

Every LLM call is a coin-flip on cost. You pick a model and hope. Lumen picks the cheapest model that meets your quality floor — per request, automatically. OpenAI-compatible. Hash-chained audit log.

Start free — 100 requests/day, no credit card Quickstart
# Drop-in. Change two lines.
from openai import OpenAI

client = OpenAI(
    base_url="https://lumen-api.eliteaiempire.com/v1",
    api_key="lumen_sk_...",
)
resp = client.chat.completions.create(
    model="auto",  # ← Lumen picks the model per request
    messages=[...],
)
# resp.lumen tells you the tier, cost, and audit hash.
Autonomous routing

Cost classifier per request

Length, tool-use, JSON mode, reasoning/code/creative signals — every request is scored in microseconds before it hits a model. Easy lookups go to a $0.001 tier. Hard prompts cascade to frontier.
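To make the idea of per-request scoring concrete, here is a rough sketch of how such signals could be folded into a single 0-1 difficulty score. It is not Lumen's classifier: the function name `difficulty_score`, the keyword lists, and every weight are hypothetical placeholders chosen only so the parts sum to 1.0.

```python
# Hypothetical keyword hints; a real classifier would use richer features.
REASONING_HINTS = ("prove", "derive", "step by step", "explain why")
CODE_HINTS = ("```", "function", "traceback", "refactor")
CREATIVE_HINTS = ("poem", "story", "slogan")


def difficulty_score(prompt: str, uses_tools: bool = False, json_mode: bool = False) -> float:
    """Combine simple request signals into a difficulty score in [0, 1]."""
    text = prompt.lower()
    # Length signal: saturates at 4000 characters (assumed cutoff).
    score = min(len(prompt) / 4000, 1.0) * 0.3
    if uses_tools:
        score += 0.2
    if json_mode:
        score += 0.1
    if any(hint in text for hint in REASONING_HINTS):
        score += 0.2
    if any(hint in text for hint in CODE_HINTS):
        score += 0.15
    if any(hint in text for hint in CREATIVE_HINTS):
        score += 0.05
    return min(score, 1.0)
```

A short factual lookup scores close to zero under these weights, while a long tool-using prompt that asks for step-by-step reasoning about code scores near the top of the range.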

Quality-floor cascade

Won't degrade below threshold

You set a quality floor (default 0.75). Lumen picks the cheapest tier that clears it and never goes below. A mid-difficulty prompt with a 0.9 floor? You get gemini-2.5-pro, not flash. No surprises.

Cryptographic audit

Hash-chained log of every call

Each response includes an audit_hash. The on-disk log is SHA-256 hash-chained with a server secret. Tamper-evident. Look up any past call at /v1/audit/<hash>. SOC2-ready.
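For readers unfamiliar with hash chaining, the sketch below shows why such a log is tamper-evident. It is an illustration under stated assumptions, not Lumen's implementation: it assumes each entry's hash is an HMAC-SHA256, keyed with the server secret, over the previous entry's hash plus a canonical JSON encoding of the record. The genesis value, the record fields, and the helper names `chain_hash`, `append_entry`, and `verify_chain` are all hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical starting value for the first entry


def chain_hash(secret: bytes, prev_hash: str, record: dict) -> str:
    """Hash one log record together with the previous entry's hash."""
    payload = prev_hash.encode() + json.dumps(record, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def append_entry(log: list, secret: bytes, record: dict) -> str:
    """Append a record to the in-memory log and return its audit hash."""
    prev_hash = log[-1]["audit_hash"] if log else GENESIS
    audit_hash = chain_hash(secret, prev_hash, record)
    log.append({"record": record, "audit_hash": audit_hash})
    return audit_hash


def verify_chain(log: list, secret: bytes) -> bool:
    """Recompute every hash in order; any edited record breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        expected = chain_hash(secret, prev_hash, entry["record"])
        if not hmac.compare_digest(expected, entry["audit_hash"]):
            return False
        prev_hash = entry["audit_hash"]
    return True
```

Because every hash depends on the one before it, changing any past record invalidates that entry and every entry after it, and without the secret an attacker cannot recompute a consistent chain.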

How much would you save?

Paste your monthly OpenAI/Anthropic bill and answer two questions. We model the routing for you.

Start free →

Empire dog-food (we eat our own routing)

Lumen is the routing brain behind these production properties, all running on the public API right now:

Pricing

Free forever for low-volume use. Three paid tiers; upgrade in-app, billed by Stripe.

Free

Test, prototype, hobby projects.

$0 /mo
No credit card
  • 100 requests/day
  • 3,000 requests/month
  • nano tier only (gemini-flash)
  • Cache + audit log
Start free

Starter

First production app.

$9 /mo
Cancel anytime
  • 5,000 requests/day
  • 150,000 requests/month
  • All 4 tiers (nano → high)
  • $50/mo budget cap
  • Email support
Start free, upgrade later

Scale

Heavy production, 7-figure traffic.

$99 /mo
Unlimited budget
  • 500,000 requests/day
  • 15M requests/month
  • No budget cap
  • Highest queue priority
  • Slack support
Start free, upgrade later

Enterprise? Custom contracts, SOC2 docs, dedicated routing pool, net-30. Contact sales →

How Lumen compares

Lumen | OpenRouter | Helicone | Portkey
Picks model for you: ✓ auto | marketplace | no | no
Quality-floor cascade
Cryptographic audit log: ✓ hash-chained | logs only | logs only
OpenAI-compatible
Multi-provider failover: ✓ 8 vendors | single
Free tier: 100/day | credits | 10k/mo | 10k/mo
Geopolitical filter (no CN/RU)

Full comparison →

FAQ

Will my code work without changes?

Yes — Lumen speaks OpenAI chat-completions wire format. Change base_url to https://lumen-api.eliteaiempire.com/v1 and use your lumen_sk_… key. model="auto" turns on routing; you can also pin a specific tier.

Which models does Lumen route to?

Gemini 2.5 Flash, GPT-4o-mini, Gemini 2.5 Pro, Claude Opus 4.8 by default. Pro+ unlocks brand_pref="anthropic" which routes through Haiku 4.5, Sonnet 4.6, Opus 4.8. Full list at tier-ladder.

How do you decide which tier?

Each request is scored 0-1 for difficulty (length, tool-use, JSON-mode, reasoning/code/creative signals). The cheapest tier whose min_quality clears max(score, your_floor) wins. Read quality-floor for details.
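The selection rule in this answer can be sketched as follows. This is a minimal illustration of "cheapest tier whose min_quality clears max(score, your_floor)", not the production router: the intermediate tier names `low` and `mid`, the `min_quality` values, and the relative costs are hypothetical placeholders, and the fallback when nothing clears the bar is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_quality: float    # quality level this tier is assumed to deliver
    relative_cost: float  # illustrative cost rank, not a real price


# Hypothetical ladder, ordered nano -> high; all numbers are placeholders.
LADDER = [
    Tier("nano", 0.60, 1.0),
    Tier("low", 0.75, 3.0),
    Tier("mid", 0.90, 20.0),
    Tier("high", 1.00, 100.0),
]


def pick_tier(score: float, floor: float = 0.75, ladder=LADDER) -> Tier:
    """Return the cheapest tier whose min_quality clears max(score, floor)."""
    required = max(score, floor)
    eligible = [t for t in ladder if t.min_quality >= required]
    if not eligible:
        # Assumed fallback: nothing clears the bar, so use the strongest tier.
        return max(ladder, key=lambda t: t.min_quality)
    return min(eligible, key=lambda t: t.relative_cost)
```

With these placeholder numbers, `pick_tier(0.5, floor=0.9)` skips the two cheapest tiers and lands on the third, while `pick_tier(0.2, floor=0.5)` stays on the cheapest one.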

What if I want to force a specific model?

Pass model="nano" or model="gpt-4o-mini". The router will use exactly that tier. Your plan's tier ceiling still applies.

Do you train on my data?

No. Lumen is a routing layer: we pass requests through to vendor APIs and never retain prompts or responses in training pipelines. Cached responses have a 1-hour TTL, after which they are evicted.

What about streaming?

Streaming works (stream: true) but bypasses the response cache. Audit log entries are still written.

Can I bring my own keys?

Not on Free/Starter. Pro+ customers can request BYO-key routing — we use your vendor accounts instead of ours, and you pay only the Lumen routing fee. Email support to enable.

Audit log — what do I do with it?

Every lumen.audit_hash in a response points to a SHA-256-chained on-disk entry recording tier/model/cost/tokens/timing. Use it for SOC2 evidence, dispute resolution, or "which model answered this critical user question?" debugging.
