Async + Webhook Execution
Use responseMode: "async" when you can’t or shouldn’t hold an HTTP connection open for the duration of a compliance run (server-to-server pipelines, mobile apps, batch workflows). The API returns immediately with a jobId, processes the check in the background, and delivers the result to your webhook URL when complete. A polling endpoint exists as a recovery and verification fallback.
How it works
Submit a check with responseMode: "async" and a webhookUrl. You get back 202 + jobId immediately. When the check completes, ZebraTruth POSTs the HMAC-signed report to your webhookUrl (primary delivery). A polling endpoint at GET /v1/compliance/jobs/{jobId} exists as a recovery / verification fallback.
Step 1 — Submit the check
webhookUrl is required when responseMode is async — the API returns 400 without it.
callbackId is your opaque tag, echoed back in the webhook payload and the polling response. Use it to correlate the result with whatever record triggered the check.
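A minimal sketch of building the submission request in Python. Only responseMode, webhookUrl, and callbackId come from this page; the base URL, the POST path, the bearer-token header, and the `content` field are placeholders assumed for illustration, so substitute the values from your API reference.

```python
import json
import urllib.request

# Hypothetical values: the real base URL, submit path, and auth scheme
# are not stated in this section.
API_BASE = "https://api.example.com"
SUBMIT_PATH = "/v1/compliance/check"


def build_async_request(api_key, check_fields, webhook_url, callback_id=None):
    """Build (but do not send) an async submission request."""
    if not webhook_url:
        # Mirrors the documented rule: async without webhookUrl is a 400.
        raise ValueError("webhookUrl is required when responseMode is 'async'")
    payload = dict(check_fields)
    payload["responseMode"] = "async"
    payload["webhookUrl"] = webhook_url
    if callback_id is not None:
        payload["callbackId"] = callback_id  # opaque tag echoed back to you
    return urllib.request.Request(
        API_BASE + SUBMIT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_async_request(
    "test-key",
    {"content": "example text"},  # hypothetical check field
    "https://hooks.example.com/zebratruth",
    callback_id="order-1234",
)
body = json.loads(req.data)
```

Failing locally on a missing webhookUrl saves a round trip that would end in a 400 anyway.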
Response — 202 Accepted
Store the webhookSecret keyed by jobId. You’ll use it to verify the HMAC on the eventual webhook delivery. It is returned only in this 202 response; the polling endpoint never re-exposes it.
The 202 returns in well under a second; orchestration runs in the background.
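Because the secret appears only once, it is worth persisting the moment the 202 is parsed. The sketch below uses an in-memory dict purely as a stand-in for a durable store; the helper names are hypothetical, and it assumes jobId and webhookSecret are top-level fields of the 202 body.

```python
# In-memory stand-in for a durable store (database, secrets manager).
_secrets_by_job = {}


def remember_job(accepted):
    """Persist the webhookSecret from a parsed 202 response body."""
    job_id = accepted["jobId"]
    _secrets_by_job[job_id] = accepted["webhookSecret"]
    return job_id


def secret_for(job_id):
    """Look up the secret later, when the webhook for this job arrives."""
    return _secrets_by_job.get(job_id)
```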
Step 2 — Receive the webhook
When the check completes, ZebraTruth POSTs to your webhookUrl:
On failure — compliance.failed
Verifying the signature
rawBody must be the unparsed request body bytes. Do not parse and re-stringify the JSON before verifying: even semantically equivalent reformatting changes the byte representation and breaks the HMAC.
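A hedged sketch of the verification step in Python. This section does not state the hash algorithm, the signature encoding, or the header that carries the signature, so the code assumes HMAC-SHA256 with a hex digest and leaves header extraction to the caller; adjust both to whatever the API reference specifies.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, secret: str, received_signature: str) -> bool:
    """Return True if the signature matches the raw request bytes.

    Assumes HMAC-SHA256 with a lowercase hex digest (not confirmed by
    this page).
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

Pass the bytes exactly as they arrived on the wire. Running the body through a JSON parser and serializer first typically changes whitespace or key order, and the check then fails even for a genuine delivery.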
Delivery guarantees
- At-least-once. Use eventId to deduplicate on your side.
- Retry policy. Up to 4 attempts (initial + 3 retries) with delays 1 s, 5 s, 25 s. After exhaustion the delivery is recorded as dead in our internal log.
- Timeout. Each delivery attempt has a 10-second per-request timeout. Make your webhook handler fast: return 2xx and process asynchronously if the work is heavy.
If your endpoint responds with anything other than a 2xx status code, we treat it as a failure and retry.
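The guarantees above suggest a handler that acknowledges quickly and tolerates duplicates. The following is a sketch only: it assumes eventId is a top-level field of the payload, uses an in-memory set and queue as stand-ins for durable storage and a real job queue, and expects the signature to have been verified before it is called.

```python
import json
import queue

seen_event_ids = set()      # stand-in for a durable dedup store
work_queue = queue.Queue()  # heavy processing happens off the request path


def handle_delivery(raw_body: bytes) -> int:
    """Return the HTTP status to send back for one verified delivery."""
    event = json.loads(raw_body)
    event_id = event["eventId"]
    if event_id in seen_event_ids:
        return 200  # duplicate from a retry: acknowledge and do nothing
    seen_event_ids.add(event_id)
    work_queue.put(event)  # a worker picks this up later
    return 200
```

Returning 200 for a duplicate matters: any other status would be treated as a failed delivery and trigger another retry.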
Step 3 — Polling (fallback)
Polling exists as a recovery and verification mechanism. Webhooks are the recommended primary delivery channel.
Use cases for polling
- The webhook delivery failed all 4 attempts and you need to recover the result.
- You want to verify the data in the webhook matches what’s actually on our side (defence-in-depth).
- Your client cannot receive webhooks at all (no public URL, behind NAT). Async is still useful for the “submit and check later” pattern; you simply rely on polling for delivery.
- You want to reconcile what was charged across a recent batch.
Endpoint
Recommended cadence
First poll at 30 seconds, then every 30 seconds. Most fast-mode checks complete in 60–110 seconds; full-mode in 90–240 seconds. The polling endpoint enforces a hard floor: a minimum of 5 seconds between polls per (tenantId, jobId). Polls faster than that get 429 with Retry-After guidance.
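The cadence above can be sketched as a small loop. The `fetch` callable is hypothetical: it stands for whatever HTTP client code performs GET /v1/compliance/jobs/{jobId} and returns a status code, headers, and parsed body. The sketch also assumes the body carries a `status` field and that Retry-After is expressed in seconds.

```python
import time

TERMINAL = {"completed", "failed"}


def poll_until_done(fetch, job_id, interval=30.0, max_polls=20, sleep=time.sleep):
    """Poll until the job reaches a terminal state.

    `fetch(job_id)` is caller-supplied and returns
    (http_status, headers, body).
    """
    for _ in range(max_polls):
        sleep(interval)  # first poll after 30 s, then every 30 s
        status, headers, body = fetch(job_id)
        if status == 429:
            # Polled too fast: wait the extra time the server asked for.
            sleep(float(headers.get("Retry-After", 5)))
            continue
        if status != 200:
            raise RuntimeError(f"unexpected HTTP {status} for job {job_id}")
        if body.get("status") in TERMINAL:
            return body
    raise TimeoutError(f"job {job_id} still in flight after {max_polls} polls")
```

Injecting `sleep` keeps the loop testable without real waiting; in production the default `time.sleep` applies.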
Response shapes
While the job is in flight (queued or running):
completed:
failed:
Tenant scoping
A request for a jobId that doesn’t belong to the authenticated tenant returns 404 Job not found. Job existence is never leaked across tenants.
After 90 days
Job rows and their report blobs are retained for 90 days. After that:
- The polling endpoint returns 410 Gone if the row is still there but the report blob has been pruned.
- The polling endpoint returns 404 Not Found if the row itself has been swept.
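A client can fold the status codes described on this page into a handful of outcomes. The labels below are illustrative names chosen for this sketch, not values returned by the API.

```python
def classify_poll_response(http_status: int) -> str:
    """Map a polling HTTP status to a client-side outcome label."""
    if http_status == 200:
        return "ok"
    if http_status == 410:
        return "report_pruned"  # row exists, report blob is past retention
    if http_status == 404:
        # Swept after retention, or the jobId belongs to another tenant;
        # the API deliberately does not distinguish the two.
        return "not_found"
    if http_status == 429:
        return "rate_limited"  # polled faster than the 5-second floor
    return "unexpected"
```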
Streaming an in-flight async job
If you submitted with responseMode: "async" but later want a live progress feed, open an SSE connection to:
The stream emits a job.queued or job.started event on connect, then the terminal complete or error event when the job reaches that state. Live per-agent events (agent.started / agent.completed) are not replayed; for full per-agent streaming, use responseMode: "stream" from the start.
The resume stream times out after 4 minutes; if reached it emits an error event with code stream_timeout and closes. Recover by polling the GET endpoint.
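A rough sketch of consuming the resume stream and deciding when to fall back to polling. It parses already-received SSE lines rather than opening a connection, and it assumes each event's data is a JSON object with the error code in a `code` field; neither detail is confirmed by this page. The parsing is simplified compared with the full SSE specification.

```python
import json


def read_resume_stream(lines):
    """Return (event_name, data) for the first terminal event, else (None, {})."""
    event_name, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line ends the current event.
            if event_name in ("complete", "error"):
                data = json.loads("\n".join(data_parts)) if data_parts else {}
                return event_name, data
            event_name, data_parts = None, []
    return None, {}


def should_fall_back_to_polling(event_name, data):
    """True if the stream ended without a result (timeout or early close)."""
    if event_name is None:
        return True
    return event_name == "error" and data.get("code") == "stream_timeout"
```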
Operational notes
- Function timeout ceiling. Each async job runs inside a Vercel function with a 300-second maxDuration. Jobs that exceed it (extremely long full-mode runs with all jurisdictions and platforms) may be killed mid-orchestration, leaving the row in running until a sweeper flips it to failed with code vercel_timeout. We’re working on a queue-backed worker for unbounded async; talk to us if you hit this.
- Idempotency. Pass an Idempotency-Key header on the POST to deduplicate retries safely.
- No queue priority yet. Async jobs run as soon as the request arrives; there is no per-tier queue prioritisation.
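For the idempotency note, the key only helps if every retry of the same logical submission carries the same value. A minimal sketch, assuming a caller-supplied `send(payload, headers)` function (hypothetical) that performs the POST and raises ConnectionError on transient failure; the use of a UUID as the key format is also an assumption.

```python
import uuid


def submit_with_retries(send, payload, max_attempts=3):
    """Call `send` until it succeeds, reusing one Idempotency-Key throughout."""
    # Generated once per logical submission, not once per attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(payload, headers)
        except ConnectionError as exc:
            last_error = exc  # transient failure: retry with the same key
    if last_error is None:
        raise ValueError("max_attempts must be at least 1")
    raise last_error
```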