API Gateway
ApiGatewayClient is the SDK 3.0 client for metered AI calls from the browser: SPA, Chrome extension, Telegram Mini App. The proxy forwards the request to OpenAI/Anthropic/any configured HTTP API using server-held keys, debits one credit from the user’s balance for the provider’s query_type, and returns the response as-is (including text/event-stream).
The point of routing through us is that your provider API keys never leave our server. In a browser-only app you can’t ship OPENAI_API_KEY to the client; the gateway is what makes metered AI possible without a backend of your own.
When to use. Only when your paywall has tokenization enabled and at least one API Provider configured. Without them /balances returns an empty array and gateway.call() fails with provider-disabled.
Headless / your own backend? Don’t use ApiGatewayClient server-side — your backend already has the provider keys and can call OpenAI/Anthropic directly with lower latency. Debit credits after success via billing.debitTokens() (apiKey + identity). The gateway exists for browser scenarios; routing through it from your own server adds a network hop for no benefit.
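The "debit after success" ordering for a server-side caller can be sketched as follows. This is a minimal sketch, not the SDK's implementation: the source names billing.debitTokens() but not its arguments, so the provider call and the debit are both injected as hypothetical callbacks (`callProvider`, `debit`), and `callThenDebit` is an illustrative name.

```typescript
// Hypothetical callback types: the source names billing.debitTokens() but not
// its arguments, so both steps are injected rather than called directly.
type ProviderCall<T> = () => Promise<T>;
type Debit = () => Promise<void>;

// Run the provider call first and debit only if it resolved.
async function callThenDebit<T>(callProvider: ProviderCall<T>, debit: Debit): Promise<T> {
  const result = await callProvider(); // a throw here skips the debit entirely
  await debit();
  return result;
}
```

In a real backend, `callProvider` would wrap your direct OpenAI/Anthropic request and `debit` would wrap the billing.debitTokens() call with your apiKey and the user's identity.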
Quick start
import {
  AuthClient,
  BillingClient,
  QuotaExceededError
} from '@monetize.software/sdk/core';

const auth = new AuthClient({ paywallId: 'pw_123' });
const billing = new BillingClient({ paywallId: 'pw_123', auth });

// The factory automatically wires Bearer and balance state from this billing.
const gateway = billing.createApiGatewayClient();

billing.onBalanceChange((balances) => {
  console.log('Quota:', balances); // [{ type: 'standard', count: 42 }, ...]
});

try {
  const res = await gateway.call({
    providerId: 'prov_openai_chat',
    path: 'v1/chat/completions',
    body: {
      model: 'gpt-4',
      messages: [{ role: 'user', content: 'Hello' }]
    }
  });
  const data = await res.json();
} catch (e) {
  if (e instanceof QuotaExceededError) {
    // User is out of quota: show the upgrade UI.
    // `paywall` is your paywall UI instance; it is not created in this snippet.
    paywall.open();
  } else throw e;
}
In a Chrome extension
In an MV3 extension you use @monetize.software/sdk-extension, where BillingClient/AuthClient are proxied to the offscreen document. The gateway factory billing.createApiGatewayClient() is not available there — paywall.billing is a remote proxy and does not mirror it. Instead, construct ApiGatewayClient yourself from @monetize.software/sdk/core and pass the extension’s auth as the source: the Bearer is still resolved once in offscreen via getAccessToken(), so there is a single AuthClient and no token duplication.
import { PaywallUI } from '@monetize.software/sdk-extension';
// Import BOTH symbols from the same core entry (see the warning below).
import { ApiGatewayClient, QuotaExceededError } from '@monetize.software/sdk/core';
import type { AuthClient } from '@monetize.software/sdk/core';

const paywall = new PaywallUI({ paywallId: 'pw_123', apiOrigin, auth: true });

const gateway = new ApiGatewayClient({
  paywallId: 'pw_123',
  apiOrigin, // same custom_domain as PaywallUI
  auth: paywall.auth as unknown as AuthClient, // RemoteAuthClient, duck-typed; Bearer comes from offscreen
  onChargeSuccess: () => {
    // The extension billing lives in offscreen, so force a refetch to update the UI counter.
    void paywall.billing.getBalances({ force: true }).catch(() => {});
  },
  onQuotaExceeded: () => paywall.open() // 402 → show the paywall
});

try {
  const res = await gateway.call({ providerId: 'prov_openai_chat', path: 'v1/chat/completions', body });
} catch (e) {
  if (e instanceof QuotaExceededError) paywall.open();
  else throw e; // don't swallow unrelated errors
}
Import ApiGatewayClient and QuotaExceededError from the same @monetize.software/sdk/core entry. The extension content bundle inlines its own copy of sdk. If you construct the gateway from one copy and check instanceof QuotaExceededError against another (e.g. a symbol re-exported from sdk-extension), the class identities differ and instanceof silently returns false. Constructing the gateway from @monetize.software/sdk/core keeps it aligned with the error you catch. This is also why sdk-extension deliberately does not re-export these symbols.
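The identity mismatch described above can be reproduced without the SDK. The sketch below creates two structurally identical error classes to stand in for two inlined bundle copies; `makeErrorClass`, `CoreCopy`, `InlinedCopy`, and `isQuotaExceeded` are illustrative names, not SDK exports. The duck-typed guard is a possible defensive fallback that relies only on the documented code and status fields; importing both symbols from the same entry remains the primary fix.

```typescript
// Each call returns a fresh class: same name and shape, different identity.
function makeErrorClass() {
  return class QuotaExceededError extends Error {
    readonly code = 'not_enough_queries';
    readonly status = 402;
  };
}

const CoreCopy = makeErrorClass();    // stands in for the copy your code imports
const InlinedCopy = makeErrorClass(); // stands in for a copy inlined in another bundle

const thrown: unknown = new InlinedCopy('out of quota');

// Identity check fails because the constructors differ.
const matchesByIdentity = thrown instanceof CoreCopy;

// Hypothetical fallback: match on the documented fields instead of the class.
function isQuotaExceeded(e: unknown): e is { code: 'not_enough_queries'; status: 402 } {
  if (typeof e !== 'object' || e === null) return false;
  const candidate = e as { code?: unknown; status?: unknown };
  return candidate.code === 'not_enough_queries' && candidate.status === 402;
}

const matchesByShape = isQuotaExceeded(thrown);
```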
Principles
- Raw Response. gateway.call() returns a plain fetch Response, not a wrapper. The caller decides: .json(), .text(), .body.getReader(), async iteration; everything works out of the box. The SDK doesn’t ship its own SSE parser: parse raw chunks yourself or use your existing streaming library.
- Authorization from AuthClient. If billing was created with auth, the gateway automatically sends Authorization: Bearer <access_token>; lazy refresh works just like for every other SDK request.
- Optimistic balance decrement. On success, decrementBalanceLocal() reduces cachedBalances for the matching query_type immediately; listeners get the update instantly. On a 402 the SDK auto-fires refreshBalances() so the UI gets a fresh snapshot.
- 402 → QuotaExceededError. The backend returns 402 with details.balances, details.queryType, details.currentBalance; the SDK parses them and throws a typed error. PaywallUI catches it automatically and opens the upgrade modal; a headless caller handles it itself.
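The 402 → typed error step from the last principle might look roughly like this. It is a sketch under stated assumptions, not the SDK's code: the response body is assumed to be JSON of the form { details: { balances, queryType, currentBalance } }, and `QuotaExceededSketch`, `QuotaDetails`, and `toQuotaError` are hypothetical names.

```typescript
interface Balance {
  type: string;
  count: number;
}

// Local stand-in for the SDK error, carrying the fields the docs list.
class QuotaExceededSketch extends Error {
  readonly code = 'not_enough_queries';
  readonly status = 402;
  balances: Balance[];
  queryType: string;
  currentBalance: Balance | null;

  constructor(balances: Balance[], queryType: string, currentBalance: Balance | null) {
    super('Out of quota for ' + queryType);
    this.balances = balances;
    this.queryType = queryType;
    this.currentBalance = currentBalance;
  }
}

interface QuotaDetails {
  balances?: Balance[];
  queryType?: string;
  currentBalance?: Balance | null;
}

// Assumed body shape: { details: { balances, queryType, currentBalance } }.
function toQuotaError(status: number, body: unknown): QuotaExceededSketch | null {
  if (status !== 402) return null;
  const details: QuotaDetails = (body as { details?: QuotaDetails } | null)?.details ?? {};
  const balances = Array.isArray(details.balances) ? details.balances : [];
  const queryType = typeof details.queryType === 'string' ? details.queryType : 'unknown';
  // If currentBalance is missing, fall back to the matching entry in balances.
  const match = balances.find((b) => b.type === queryType) ?? null;
  return new QuotaExceededSketch(balances, queryType, details.currentBalance ?? match);
}
```

A headless caller could use the same fields (queryType, currentBalance) to decide which upgrade offer to show.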
Streaming (SSE)
The backend proxy passes text/event-stream responses through unchanged, with the same chunks the provider sent.
const controller = new AbortController();

const res = await gateway.call({
  providerId: 'prov_openai_chat',
  path: 'v1/chat/completions',
  body: {
    model: 'gpt-4',
    stream: true,
    messages: [{ role: 'user', content: 'Tell me a story' }]
  },
  signal: controller.signal
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value, { stream: true });
  // chunk is raw SSE text such as "data: {...}\n\n"; a chunk can end mid-event,
  // so buffer across reads before parsing
}
signal: AbortSignal is supported. controller.abort() cancels the upstream request; the backend then cancels the provider request and logs the stream as aborted.
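Because the SDK leaves SSE parsing to the caller, and a network chunk can end in the middle of an event, a small buffering parser is usually needed. The following is a minimal sketch that extracts only `data:` payloads and ignores other SSE fields (event, id, retry); `createSseDataParser` is a hypothetical helper, not part of the SDK.

```typescript
// Returns a function that accepts decoded text chunks and returns the
// complete `data:` payloads found so far, keeping any partial event buffered.
function createSseDataParser(): (chunk: string) => string[] {
  let buffer = '';
  return (chunk: string): string[] => {
    // Normalize CRLF so a single separator check covers both line endings.
    buffer = (buffer + chunk).replace(/\r\n/g, '\n');
    const events = buffer.split('\n\n');
    // The last piece is an incomplete event (or ''); keep it for the next chunk.
    buffer = events.pop() ?? '';
    const payloads: string[] = [];
    for (const event of events) {
      const dataLines = event
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).replace(/^ /, ''));
      // Multiple data lines in one event are joined with newlines.
      if (dataLines.length > 0) payloads.push(dataLines.join('\n'));
    }
    return payloads;
  };
}

// Usage: feed each decoded chunk from the reader loop into the parser.
const parse = createSseDataParser();
const first = parse('data: {"a":1}\n\ndata: {"b"'); // second event is still partial
const second = parse(':2}\n\n');                    // completes the second event
```

Inside the read loop, `parse(chunk)` would replace the "parse as usual" step, and each returned payload can be passed to JSON.parse when the provider sends JSON.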
Balances
BillingClient caches balances in memory with a 5-second TTL and notifies listeners when they change. You can also read them manually:
const balances = await billing.getBalances();
// [{ type: 'free', count: 100 }, { type: 'standard', count: 9 }]
const sync = billing.getCachedBalances(); // null if never loaded
const off = billing.onBalanceChange((balances) => {
  /* re-render counter */
});
off(); // unsubscribe
await billing.refreshBalances(); // force re-fetch
After every successful gateway.call() balances are decremented locally, without a second round-trip to the server. The only authoritative source is the server-side balance; the local cache is a UX facade.
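The local decrement plus listener notification can be pictured with a small in-memory cache. This is an illustrative sketch only: `LocalBalanceCache` and its method names are hypothetical, and the real BillingClient also applies the TTL and server refresh that are omitted here.

```typescript
interface Balance {
  type: string;
  count: number;
}
type Listener = (balances: Balance[]) => void;

class LocalBalanceCache {
  private balances: Balance[] | null = null;
  private listeners: Listener[] = [];

  // Replace the snapshot with authoritative server data and notify listeners.
  set(balances: Balance[]): void {
    this.balances = balances.map((b) => ({ ...b }));
    this.emit();
  }

  // Synchronous read; null until the first snapshot arrives.
  get(): Balance[] | null {
    return this.balances;
  }

  // Subscribe; the returned function unsubscribes.
  onChange(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Optimistic step: subtract one credit for queryType, never below zero.
  decrementLocal(queryType: string): void {
    if (!this.balances) return;
    this.balances = this.balances.map((b) =>
      b.type === queryType ? { ...b, count: Math.max(0, b.count - 1) } : b
    );
    this.emit();
  }

  private emit(): void {
    const snapshot = this.balances;
    if (!snapshot) return;
    this.listeners.forEach((l) => l(snapshot));
  }
}
```

Because the cache is only a UX facade, a later server snapshot passed to `set()` simply overwrites whatever the optimistic decrements produced.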
gateway.call() options
| Field | Type | What it does |
|---|---|---|
| providerId | string | API provider UUID from the dashboard (API Providers section) |
| path | string? | Path after the provider (v1/chat/completions, messages, …). Concatenated via / |
| method | 'GET' \| 'POST' | Defaults to POST |
| body | unknown \| FormData \| Blob \| ReadableStream \| string | Object → JSON.stringify + Content-Type: application/json. FormData/Blob: the browser sets the boundary. ReadableStream/string: passed through |
| headers | Record<string,string> | Extra headers. Doesn’t overwrite Authorization or X-Paywall-Id |
| signal | AbortSignal | Cancel the request (important for long streams) |
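Two of the rules in the table, path concatenation via / and extra headers never overwriting Authorization or X-Paywall-Id, can be sketched as plain functions. This is one plausible reading of those rules, not the SDK's internals: `joinPath`, `mergeHeaders`, and the example base URL are all hypothetical.

```typescript
// Join a base URL and a provider path with exactly one slash between them.
function joinPath(base: string, path?: string): string {
  if (!path) return base;
  return base.replace(/\/+$/, '') + '/' + path.replace(/^\/+/, '');
}

// Header names the caller is not allowed to override (compared case-insensitively).
const RESERVED_HEADERS = ['authorization', 'x-paywall-id'];

// Copy caller headers except reserved ones, then apply the SDK-owned headers on top.
function mergeHeaders(
  sdkHeaders: Record<string, string>,
  extra: Record<string, string> = {}
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const name of Object.keys(extra)) {
    if (RESERVED_HEADERS.indexOf(name.toLowerCase()) === -1) {
      merged[name] = extra[name];
    }
  }
  return { ...merged, ...sdkHeaders };
}

// Usage with a made-up gateway base URL.
const url = joinPath('https://gateway.example/prov_openai_chat/', '/v1/chat/completions');
const headers = mergeHeaders(
  { Authorization: 'Bearer real-token', 'X-Paywall-Id': 'pw_123' },
  { authorization: 'Bearer spoofed', 'X-Trace': '1' }
);
```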
QuotaExceededError
class QuotaExceededError extends PaywallError {
  code: 'not_enough_queries';
  status: 402;
  balances: Balance[]; // every balance the user has at the moment of 402
  queryType: string; // which query_type ran out
  currentBalance: Balance | null; // entry for queryType, usually { type, count: 0 }
}
Standard host-app handler:
import { QuotaExceededError } from '@monetize.software/sdk/core';

try {
  const res = await gateway.call({ providerId, body });
  /* ... */
} catch (e) {
  if (e instanceof QuotaExceededError) {
    // Balances have already been refreshed via refreshBalances(); onBalanceChange
    // emitted a new snapshot already. Open the paywall.
    paywall.open();
    return;
  }
  throw e;
}