Service-level objectives
This page documents the latency and availability targets that the DocGen API is engineered to meet during v1. Numbers are measured against production telemetry on the generation worker and the public API edge, sampled per output format and per request size.
These targets are aspirational v1 SLOs, not contractual SLAs. Customer-contract SLAs are negotiated per engagement through your account team. The numbers may be revised as v1 traffic patterns stabilize.
PDF latency
PDF latency is measured from request acknowledgment to the moment the rendered artifact is available for download. Targets are broken out by size class because PDF rendering cost scales roughly linearly with page count once headers, fonts, and images are cached.
| Size class | p50 | p95 | p99 |
|---|---|---|---|
| Small (<10 pages) | 1.5s | 3s | 6s |
| Medium (10–50 pages) | 4s | 8s | 15s |
| Large (>50 pages) | 12s | 25s | 45s |
Documents that embed large external images, or font subsets that miss the render cache, trend toward the upper percentiles (p95 to p99) of their size class. Repeated generations from the same template warm the cache and trend toward p50.
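As a minimal sketch, the table above can be encoded as a lookup so a client can pick the target that applies to a given page count. The names LatencyTarget, PDF_TARGETS, size_class, and pdf_target are hypothetical and not part of the DocGen API. The sketch also assumes that a document of exactly 50 pages counts as medium.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyTarget:
    """Latency targets in seconds for one size class."""
    p50: float
    p95: float
    p99: float


# Values copied from the PDF latency table (seconds).
PDF_TARGETS = {
    "small": LatencyTarget(1.5, 3.0, 6.0),
    "medium": LatencyTarget(4.0, 8.0, 15.0),
    "large": LatencyTarget(12.0, 25.0, 45.0),
}


def size_class(pages: int) -> str:
    """Map a page count to a size class.

    Assumption: a 50-page document is classed as medium.
    """
    if pages < 1:
        raise ValueError("pages must be at least 1")
    if pages < 10:
        return "small"
    if pages <= 50:
        return "medium"
    return "large"


def pdf_target(pages: int) -> LatencyTarget:
    """Return the latency targets that apply to a document of this length."""
    return PDF_TARGETS[size_class(pages)]
```

For example, pdf_target(120).p95 returns the large-class p95 target in seconds.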
HTML and MDX latency
HTML and MDX outputs skip the PDF rasterization step, so latency is dominated by template compile and merge-data evaluation rather than page count.
| Format | p50 | p95 | p99 |
|---|---|---|---|
| HTML / MDX | 0.4s | 1s | 2.5s |
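To compare your own samples against these targets, one option is to compute percentiles over a set of observed latencies. The sketch below uses the nearest-rank method, which is an assumption: this page does not state which percentile method the published numbers use. The names HTML_MDX_TARGETS, nearest_rank, and check_targets are hypothetical.

```python
# Targets in seconds, keyed by integer percentile, copied from the table above.
HTML_MDX_TARGETS = {50: 0.4, 95: 1.0, 99: 2.5}


def nearest_rank(samples, pct: int) -> float:
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def check_targets(samples, targets):
    """Return {percentile: True/False} for whether each target is met."""
    return {pct: nearest_rank(samples, pct) <= limit
            for pct, limit in targets.items()}
```

A result such as {50: True, 95: True, 99: False} would point at a slow tail rather than a general slowdown.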
Async batch SLO
Batch jobs are scheduled onto the shared generation worker pool. The published targets apply to the batch as a whole, not to individual documents within it.
| Metric | Target |
|---|---|
| Time-to-first-document (TTFD) | p95 ≤ 30s after job enqueue |
| Per-org sustained throughput | p95 ≥ 600 docs/hr |
| 1k-document batch completion | p95 ≤ 90 min |
TTFD measures the gap between the API accepting the batch and the first child document reaching GENERATED. Sustained throughput is measured over a rolling one-hour window so a brief burst that drains the rate-limit budget does not count against the SLO.
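Both measurements can be sketched from timestamps you record yourself. The code below assumes you have the job enqueue time and the time each child document reached GENERATED; the function names ttfd_seconds and rolling_throughput are hypothetical, and the window is treated as half-open (older than one hour is excluded), which is an assumption about the boundary.

```python
from datetime import datetime, timedelta


def ttfd_seconds(enqueued_at: datetime, generated_times) -> float:
    """Seconds between job enqueue and the first document reaching GENERATED."""
    return (min(generated_times) - enqueued_at).total_seconds()


def rolling_throughput(generated_times, window=timedelta(hours=1)):
    """For each completion, count documents completed in the window ending at it.

    Returns one count per completion, in chronological order.
    """
    ordered = sorted(generated_times)
    counts = []
    start = 0
    for end, t in enumerate(ordered):
        # Drop completions that are a full window or more older than t.
        while ordered[start] <= t - window:
            start += 1
        counts.append(end - start + 1)
    return counts
```

With a one-hour window each count is directly a docs/hr figure, so a short burst followed by a throttled lull is averaged over the hour rather than judged on its own.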
Availability
The DocGen API targets 99.9% availability per calendar quarter, excluding pre-announced maintenance windows. Availability is measured at the public API edge: a probe that can authenticate, submit a generation request, and retrieve the artifact counts as an "up" sample.
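As a rough illustration of the definition above, availability can be computed as the fraction of probe samples that were "up", ignoring samples taken inside pre-announced maintenance windows. This is a sketch under stated assumptions, not the production measurement: the sample format, the availability function name, and the half-open treatment of maintenance windows are all hypothetical.

```python
from datetime import datetime


def availability(samples, maintenance_windows=()):
    """Fraction of probe samples that were up, outside maintenance windows.

    samples: iterable of (timestamp, is_up) pairs.
    maintenance_windows: iterable of (start, end) pairs; start is inclusive
    and end is exclusive (an assumption).
    """
    counted = [
        is_up
        for ts, is_up in samples
        if not any(start <= ts < end for start, end in maintenance_windows)
    ]
    if not counted:
        raise ValueError("no samples outside maintenance windows")
    return sum(counted) / len(counted)
```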
Status, incidents, and scheduled maintenance windows are published at https://status.propper.ai.
The quarterly error budget at 99.9% is approximately 2 hours, 11 minutes (0.1% of a 91-day quarter). Budget consumption is reviewed monthly and drives reliability investment for the following quarter.
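The budget figure follows from simple arithmetic, sketched here for reference. The 91-day quarter is an assumption; calendar quarters run from 90 to 92 days, so the exact budget shifts by a minute or two either way. The function names are hypothetical.

```python
from datetime import timedelta


def error_budget(availability_target: float, period: timedelta) -> timedelta:
    """Allowed downtime in a period for a given availability target."""
    return period * (1 - availability_target)


def as_hours_minutes(delta: timedelta):
    """Split a duration into whole (hours, minutes), dropping leftover seconds."""
    total_minutes = int(delta.total_seconds() // 60)
    return divmod(total_minutes, 60)


# Assumption: a 91-day quarter.
budget = error_budget(0.999, timedelta(days=91))
hours, minutes = as_hours_minutes(budget)
print(f"{hours} h {minutes} min")
```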
What is excluded
The targets above describe steady-state operation of the DocGen API itself. They do not cover:
- Customer-owned components such as outbound webhook receivers, Stripe redirects, or signer email delivery. These have their own targets documented under webhooks.
- DocuSign or Conga compatibility shims that proxy to a third-party vendor. Latency for those endpoints inherits the vendor's own SLO; see the migration-compatibility guide for details.
- Custom template logic that performs synchronous external HTTP calls during render. The render budget assumes deterministic templates whose only inputs are the supplied merge data and bundled assets.
How to measure on your side
If you want to track these SLOs from the client, the X-Request-Id header returned on every response is the join key. Pair it with the created_at and generated_at timestamps on the document resource to compute end-to-end latency. The median you measure should track the published p50 within roughly 100ms; larger divergences typically indicate either a network path issue or a client-side queuing layer between your app and the API.
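A minimal client-side sketch of that comparison follows. It assumes created_at and generated_at arrive as ISO 8601 strings and that you have already collected document resources keyed by their X-Request-Id value; neither the timestamp format nor the dictionary shape is specified on this page, and the function names are hypothetical.

```python
from datetime import datetime
from statistics import median


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def generation_latency(doc: dict) -> float:
    """Seconds between created_at and generated_at on one document resource."""
    delta = parse_ts(doc["generated_at"]) - parse_ts(doc["created_at"])
    return delta.total_seconds()


def p50_divergence(docs_by_request_id: dict, published_p50: float) -> float:
    """Measured median latency minus the published p50, in seconds."""
    latencies = [generation_latency(d) for d in docs_by_request_id.values()]
    return median(latencies) - published_p50


def within_tolerance(divergence: float, tolerance: float = 0.1) -> bool:
    """True when the divergence is inside the roughly 100 ms band."""
    return abs(divergence) <= tolerance
```

Keeping the request ID as the dictionary key makes it straightforward to look up the slowest samples and quote their IDs when raising a support question.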