Routing, tools, rate limits, billing, and logs — every piece a startup needs to ship an AI product, in one SDK call.
Every model. Every provider. One SDK that routes both.
Each one ships independently. Spin one up, the rest still work. Click any node to deep-dive.
One SDK call → any model from any provider. Same model across deployments, with failover.
We host. We auth your end users. We verify each tool call. You ship the agent.
Three gate types: rate, wallet balance, usage-per-time. Set per-platform or per-end-user.
You handle payments. You top up via API. We debit. End users can call Assistiv directly.
Full trace per request. Tokens, latency, cost on every step. Replay + eval built-in.
Your app makes one call. We check the user's tier, route to a provider, run any tools, debit the wallet, and write a log row. You handle the feature. We handle the rest.
Each one ships independently. Each one replaces a stack of glue code you'd otherwise maintain. Click any product to deep-dive.
Every model. Every provider. One SDK that routes both.
Hosted MCP. We host the servers, handle end-user auth, and verify every tool call.
Per-platform AND per-end-user. Rate, wallet-balance, and usage-per-time gates.
Per-user wallets you top up via API. Plus client-side calling — end users hit Assistiv directly.
LangSmith-grade observability. Full traces, token + latency + cost, replay, eval.
Two routes in one. (1) Every model from every provider — OpenAI, Anthropic, Google, Bedrock, Mistral, Groq — through one OpenAI-compatible call. (2) The same model across provider deployments — gpt-4o on OpenAI ↔ Azure, claude-3.5 on Anthropic ↔ Bedrock — with priority and failover.
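A minimal sketch of what an OpenAI-compatible request with cross-provider failover might look like. Everything beyond the standard OpenAI schema (`route`, `priority`, `failover`) is an assumed field name for illustration, not the real Assistiv API.

```typescript
// Assumed shape: standard OpenAI chat body plus a routing hint.
// "route" and its fields are hypothetical names for illustration.
const body = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize this ticket." }],
  // Try the OpenAI deployment first, fail over to Azure on error.
  route: { priority: ["openai", "azure"], failover: true },
};

console.log(JSON.stringify(body.route.priority)); // the fallback order
```

The point of the sketch: your app code stays on the OpenAI request shape; the fallback order is a declarative hint rather than retry logic you write yourself.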
We host the MCP server. We handle end-user OAuth in your dashboard, under your brand. We verify every tool call before it runs — scopes, allowlists, dry-run when you want it. You write zero OAuth callbacks, store zero refresh tokens, and audit every action.
Three kinds of gate. Rate limits (RPM, TPM, sliding window, token bucket). Wallet-balance gates (block when below threshold). Usage-per-time budgets ($X per day/week/month). Set them on the platform, on each end-user, or both. Enforced before the provider sees the call.
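The three gate kinds compose into a single config per platform or per end user. This object shape is a sketch of the idea; the real field names may differ.

```typescript
// Illustrative gate config combining all three kinds described above.
// Field names are assumptions for the sketch.
const gates = {
  rate: { rpm: 60, tpm: 100_000, algorithm: "token-bucket" },
  walletBalance: { blockBelowCents: 50 },     // block when wallet < $0.50
  usageBudget: { limitUsd: 10, per: "day" },  // hard $10/day cap
  scope: "end-user",                          // or "platform", or both
};

console.log(Object.keys(gates).join(","));
```

Because all three are evaluated before the provider sees the call, a blocked request costs you nothing in tokens.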
Each end-user gets a wallet. You handle payments however you want — Stripe, Paddle, invoicing, internal credits — and call walletTopUp() to credit the wallet. We debit per token, per tool, per request — atomic, audit-logged. End users can also call Assistiv directly with their own sk-eu_ key, bypassing your server entirely.
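An in-memory sketch of the credit/debit flow. `walletTopUp()` is named in the docs above, but its real signature is unknown, so this local version is purely illustrative of the accounting, not the SDK call.

```typescript
// Local stand-in for per-user wallets: userId -> balance in cents.
const wallets = new Map<string, number>();

// Illustrative only: the real walletTopUp() is an API call, not local state.
function walletTopUp(userId: string, cents: number): number {
  const next = (wallets.get(userId) ?? 0) + cents;
  wallets.set(userId, next);
  return next;
}

// Debit is all-or-nothing: insufficient balance blocks the request.
function debit(userId: string, cents: number): boolean {
  const bal = wallets.get(userId) ?? 0;
  if (bal < cents) return false;
  wallets.set(userId, bal - cents);
  return true;
}

walletTopUp("user_42", 500); // e.g. your Stripe webhook credits $5.00
debit("user_42", 37);        // a request's token + tool cost is debited
console.log(wallets.get("user_42")); // 463
```

Your payment stack stays entirely on your side; only the top-up call crosses the boundary.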
Every request becomes a full trace: each LLM call, retry, and tool step in the chain. Tokens, latency, cost, and status attached at every step. Filter by model, user, route, error. Replay any trace. Score traces against datasets. Built-in — not a separate SaaS.
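A sketch of what one trace's step rows might look like and the kind of aggregation the dashboard runs over them. The `Step` shape is an assumption for illustration.

```typescript
// Assumed step shape: each LLM call, retry, or tool step in the chain,
// with tokens, latency, cost, and status attached.
type Step = {
  kind: "llm" | "tool" | "retry";
  model?: string;
  tokens: number;
  latencyMs: number;
  costUsd: number;
  status: "ok" | "error";
};

const trace: Step[] = [
  { kind: "llm", model: "gpt-4o", tokens: 812, latencyMs: 640, costUsd: 0.004, status: "ok" },
  { kind: "tool", tokens: 0, latencyMs: 120, costUsd: 0.0001, status: "ok" },
];

// Roll up cost per trace — the same reduction backs filter-by-cost views.
const totalCost = trace.reduce((sum, s) => sum + s.costUsd, 0);
console.log(totalCost.toFixed(4));
```

One canonical row per step is also what makes replay and dataset scoring possible: the trace is the input.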
Every product in the suite replaces something you'd otherwise glue together. Here's the receipt.
Small surface area means we can move fast. Here's the last six weeks.
Three new hosted MCP servers. Same OAuth flow, same wallet debits.
Pin requests to US, EU, or AU regions. Per-key, per-user, or per-tier.
Issue refunds, holds, and credits via a single endpoint. Auditable.
New limit kind alongside sliding-window and fixed-budget. Mix freely.
Stream one canonical row per request to your warehouse. Real-time.
Two more options for the priority list. BYOK or Assistiv-managed.
Generous free tier across every product. Pay-as-you-go after that — top up the wallet, no monthly base, no "contact sales."
For prototyping. All 5 products on, generous limits across the board.
For shipping. No monthly base. Top up the wallet, drain on usage.
For real volume. Negotiated commission, dedicated support, SOC 2 docs.
We onboard 1–2 indie startups a week. Reply within a day, integrated within a week.