Routing, tools, rate limits, billing, and logs — every piece a startup needs to ship an AI product, in one SDK call.
Every model. Every provider. One SDK that routes both.
Each one ships independently. Spin one up, the rest still work. Click any node to deep-dive.
One SDK call → any model from any provider. Same model across deployments, with failover.
We host. We auth your end users. We verify each tool call. You ship the agent.
Three gate types: rate, wallet balance, usage-per-time. Set per-platform or per-end-user.
You handle payments. You top up via API. We debit. End users can call Assistiv directly.
Full trace per request. Tokens, latency, cost on every step. Replay + eval built-in.
Your app makes one call. We check the user's tier, route to a provider, run any tools, debit the wallet, and write a log row. You handle the feature. We handle the rest.
Each one ships independently. Each one replaces a stack of glue code you'd otherwise maintain. Click any product to deep-dive.
Every model. Every provider. One SDK that routes both.
Hosted MCP. We host the servers, handle end-user auth, and verify every tool call.
Per-platform AND per-end-user. Rate, wallet-balance, and usage-per-time gates.
Per-user wallets you top up via API. Plus client-side calling — end users hit Assistiv directly.
LangSmith-grade observability. Full traces, token + latency + cost, replay, eval.
Two routes in one. (1) Every model from every provider — OpenAI, Anthropic, Google, Bedrock, Mistral, Groq — through one OpenAI-compatible call. (2) The same model across provider deployments — gpt-4o on OpenAI ↔ Azure, claude-3.5 on Anthropic ↔ Bedrock — with priority and failover.
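A minimal sketch of what an OpenAI-compatible request with cross-provider failover might look like. Everything beyond the standard OpenAI schema (`route`, `priority`, `failover`) is an assumed field name for illustration, not the real Assistiv API.

```typescript
// Assumed shape: standard OpenAI chat body plus a routing hint.
// "route" and its fields are hypothetical names for illustration.
const body = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize this ticket." }],
  // Try the OpenAI deployment first, fail over to Azure on error.
  route: { priority: ["openai", "azure"], failover: true },
};

console.log(JSON.stringify(body.route.priority)); // the fallback order
```

The point of the sketch: your app code stays on the OpenAI request shape; the fallback order is a declarative hint rather than retry logic you write yourself.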
We host the MCP server. We handle end-user OAuth in your dashboard, under your brand. We verify every tool call before it runs — scopes, allowlists, dry-run when you want it. You write zero OAuth callbacks, store zero refresh tokens, and audit every action.
Three kinds of gate. Rate limits (RPM, TPM, sliding window, token bucket). Wallet-balance gates (block when below threshold). Usage-per-time budgets ($X per day/week/month). Set them on the platform, on each end-user, or both. Enforced before the provider sees the call.
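The three gate kinds compose into a single config per platform or per end user. This object shape is a sketch of the idea; the real field names may differ.

```typescript
// Illustrative gate config combining all three kinds described above.
// Field names are assumptions for the sketch.
const gates = {
  rate: { rpm: 60, tpm: 100_000, algorithm: "token-bucket" },
  walletBalance: { blockBelowCents: 50 },     // block when wallet < $0.50
  usageBudget: { limitUsd: 10, per: "day" },  // hard $10/day cap
  scope: "end-user",                          // or "platform", or both
};

console.log(Object.keys(gates).join(","));
```

Because all three are evaluated before the provider sees the call, a blocked request costs you nothing in tokens.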
Each end-user gets a wallet. You handle payments however you want — Stripe, Paddle, invoicing, internal credits — and call walletTopUp() to credit the wallet. We debit per token, per tool, per request — atomic, audit-logged. End users can also call Assistiv directly with their own sk-eu_ key, bypassing your server entirely.
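An in-memory sketch of the credit/debit flow. `walletTopUp()` is named in the docs above, but its real signature is unknown, so this local version is purely illustrative of the accounting, not the SDK call.

```typescript
// Local stand-in for per-user wallets: userId -> balance in cents.
const wallets = new Map<string, number>();

// Illustrative only: the real walletTopUp() is an API call, not local state.
function walletTopUp(userId: string, cents: number): number {
  const next = (wallets.get(userId) ?? 0) + cents;
  wallets.set(userId, next);
  return next;
}

// Debit is all-or-nothing: insufficient balance blocks the request.
function debit(userId: string, cents: number): boolean {
  const bal = wallets.get(userId) ?? 0;
  if (bal < cents) return false;
  wallets.set(userId, bal - cents);
  return true;
}

walletTopUp("user_42", 500); // e.g. your Stripe webhook credits $5.00
debit("user_42", 37);        // a request's token + tool cost is debited
console.log(wallets.get("user_42")); // 463
```

Your payment stack stays entirely on your side; only the top-up call crosses the boundary.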
Every request becomes a full trace: each LLM call, retry, and tool step in the chain. Tokens, latency, cost, and status attached at every step. Filter by model, user, route, error. Replay any trace. Score traces against datasets. Built-in — not a separate SaaS.
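A sketch of what one trace's step rows might look like and the kind of aggregation the dashboard runs over them. The `Step` shape is an assumption for illustration.

```typescript
// Assumed step shape: each LLM call, retry, or tool step in the chain,
// with tokens, latency, cost, and status attached.
type Step = {
  kind: "llm" | "tool" | "retry";
  model?: string;
  tokens: number;
  latencyMs: number;
  costUsd: number;
  status: "ok" | "error";
};

const trace: Step[] = [
  { kind: "llm", model: "gpt-4o", tokens: 812, latencyMs: 640, costUsd: 0.004, status: "ok" },
  { kind: "tool", tokens: 0, latencyMs: 120, costUsd: 0.0001, status: "ok" },
];

// Roll up cost per trace — the same reduction backs filter-by-cost views.
const totalCost = trace.reduce((sum, s) => sum + s.costUsd, 0);
console.log(totalCost.toFixed(4));
```

One canonical row per step is also what makes replay and dataset scoring possible: the trace is the input.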
Every product in the suite replaces something you'd otherwise glue together. Here's the receipt.
Small surface area means we can move fast. Here's the last six weeks.
Three new hosted MCP servers. Same OAuth flow, same wallet debits.
Pin requests to US, EU, or AU regions. Per-key, per-user, or per-tier.
Issue refunds, holds, and credits via a single endpoint. Auditable.
New limit kind alongside sliding-window and fixed-budget. Mix freely.
Stream one canonical row per request to your warehouse. Real-time.
Two more options for the priority list. BYOK or Assistiv-managed.
Generous free tier across every product. Pay-as-you-go after that — top up the wallet, no monthly base, no "contact sales."
For prototyping. All 5 products on, generous limits across the board.
For shipping. No monthly base. Top up the wallet, drain on usage.
For real volume. Negotiated commission, dedicated support, SOC 2 docs.
We onboard 1–2 indie startups a week. Reply within a day, integrated within a week.