Service-level objectives

This page documents the latency and availability targets that the DocGen API is engineered to meet during v1. Numbers are measured against production telemetry on the generation worker and the public API edge, sampled per output format and per request size.

Aspirational v1 SLO, not a contractual SLA

These targets are aspirational v1 SLOs measured against production telemetry, not contractual SLAs. Customer-contract SLAs are negotiated per engagement through your account team. Numbers may revise as v1 traffic patterns stabilize.

PDF latency

PDF latency is measured from request acknowledgment to the moment the rendered artifact is available for download. Numbers are broken out by page count because PDF rendering cost scales close to linearly with pages once headers, fonts, and images are cached.

Size class	p50	p95	p99
Small (<10 pages)	1.5s	3s	6s
Medium (10–50 pages)	4s	8s	15s
Large (50+ pages)	12s	25s	45s

Documents that embed large external images, or font subsets that miss the render cache, tend toward the p95 to p99 end of their size class. Repeated generations from the same template warm the cache and tend toward p50.
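As a rough illustration of how the table might be applied on the client side, the sketch below maps a page count to a size class and reports the tightest percentile target a measured latency falls within. The helper names are hypothetical and not part of the DocGen API. Treating a document of exactly 50 pages as Medium is an assumption, since the table lists 50 under both Medium and Large.

```python
from typing import Optional

# Targets copied from the PDF latency table, in seconds.
PDF_TARGETS = {
    "small": {"p50": 1.5, "p95": 3.0, "p99": 6.0},
    "medium": {"p50": 4.0, "p95": 8.0, "p99": 15.0},
    "large": {"p50": 12.0, "p95": 25.0, "p99": 45.0},
}


def size_class(pages: int) -> str:
    """Map a page count to a size class (hypothetical helper).

    Assumption: a document of exactly 50 pages counts as Medium.
    """
    if pages < 1:
        raise ValueError("pages must be a positive integer")
    if pages < 10:
        return "small"
    if pages <= 50:
        return "medium"
    return "large"


def tightest_target_met(pages: int, latency_s: float) -> Optional[str]:
    """Return the tightest percentile target the latency is within, or None."""
    targets = PDF_TARGETS[size_class(pages)]
    for name in ("p50", "p95", "p99"):
        if latency_s <= targets[name]:
            return name
    return None


print(tightest_target_met(8, 2.2))     # p95
print(tightest_target_met(120, 50.0))  # None
```

A single request landing outside p99 does not by itself mean a target was missed; the targets describe the distribution across many requests.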

HTML and MDX latency

HTML and MDX outputs skip the PDF rasterization step, so latency is dominated by template compile and merge-data evaluation rather than page count.

Format	p50	p95	p99
HTML / MDX	0.4s	1s	2.5s

Async batch SLO

Batch jobs are scheduled onto the shared generation worker pool. The published targets apply to the batch as a whole, not to individual documents within it.

Metric	Target
Time-to-first-document (TTFD)	p95 ≤ 30s after job enqueue
Per-org sustained throughput	p95 ≤ 600 docs/hr
1k-document batch completion	p95 ≤ 90 min

TTFD measures the gap between the API accepting the batch and the first child document reaching GENERATED. Sustained throughput is measured over a rolling one-hour window so a brief burst that drains the rate-limit budget does not count against the SLO.
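The two measurements described above can be sketched from timestamps a client might collect for a batch. This is a minimal sketch with hypothetical function names; it assumes you have recorded when the API accepted the batch and when each child document reached GENERATED, and it only counts documents in the trailing hour rather than judging them against the throughput target.

```python
from datetime import datetime, timedelta

TTFD_TARGET_S = 30  # p95 target from the table above, in seconds


def ttfd_seconds(enqueued_at, generated_times):
    """Seconds from batch acceptance to the first child reaching GENERATED."""
    return (min(generated_times) - enqueued_at).total_seconds()


def docs_in_trailing_hour(generated_times, now):
    """Count documents that reached GENERATED in the hour ending at `now`."""
    window_start = now - timedelta(hours=1)
    return sum(1 for t in generated_times if window_start < t <= now)


# Illustrative data: five documents, the first finishing 18 s after enqueue.
enqueued = datetime(2024, 1, 1, 12, 0, 0)
generated = [enqueued + timedelta(seconds=18 + 6 * i) for i in range(5)]

ttfd = ttfd_seconds(enqueued, generated)
print(ttfd, ttfd <= TTFD_TARGET_S)  # 18.0 True
print(docs_in_trailing_hour(generated, enqueued + timedelta(hours=1)))  # 5
```

Because the targets are stated at p95, a meaningful comparison needs TTFD values from many batches rather than a single one.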

Availability

The DocGen API targets 99.9% availability per calendar quarter, excluding pre-announced maintenance windows. Availability is measured at the public API edge: a probe that can authenticate, submit a generation request, and retrieve the artifact counts as an "up" sample.
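One way to picture the measurement is as the fraction of "up" probe samples, with samples taken during pre-announced maintenance left out. The sketch below is illustrative only: the function name and the sample format are assumptions, not a description of the production probe.

```python
from datetime import datetime


def availability(samples, maintenance_windows=()):
    """Fraction of 'up' probe samples outside maintenance windows.

    samples: iterable of (timestamp, is_up) pairs.
    maintenance_windows: iterable of (start, end) pairs, end exclusive.
    """
    counted = [
        is_up
        for ts, is_up in samples
        if not any(start <= ts < end for start, end in maintenance_windows)
    ]
    if not counted:
        raise ValueError("no samples outside maintenance windows")
    return sum(counted) / len(counted)


# Illustrative data: one failed probe falls inside a maintenance window.
samples = [
    (datetime(2024, 1, 1, 0, 0), True),
    (datetime(2024, 1, 1, 0, 5), False),  # inside the window below
    (datetime(2024, 1, 1, 0, 10), True),
    (datetime(2024, 1, 1, 0, 15), True),
]
window = [(datetime(2024, 1, 1, 0, 4), datetime(2024, 1, 1, 0, 9))]

print(availability(samples, window))  # 1.0
print(availability(samples))          # 0.75
```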

Status, incidents, and scheduled maintenance windows are published at https://status.propper.ai.

The quarterly error budget at 99.9% is approximately 2 hours, 11 minutes. Budget consumption is reviewed monthly and drives reliability investment for the following quarter.
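The budget figure follows from multiplying the length of the quarter by the allowed unavailability (0.1%). The sketch below works this out for a specific quarter; the function name is hypothetical, and for simplicity it does not subtract maintenance windows from the period.

```python
from datetime import date, timedelta


def error_budget(start: date, end: date, target: float = 0.999) -> timedelta:
    """Allowed downtime for the period [start, end) at the given target."""
    period = timedelta(days=(end - start).days)
    return period * (1 - target)


# Q1 2025 spans 90 days (1 January up to, but not including, 1 April).
budget = error_budget(date(2025, 1, 1), date(2025, 4, 1))
print(budget)  # 2:09:36
```

A 90-day quarter gives roughly 2 hours 10 minutes and a 92-day quarter roughly 2 hours 12 minutes, which brackets the approximate figure quoted above.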

What is excluded

The targets above describe steady-state operation of the DocGen API itself. They do not cover:

Customer-owned components such as outbound webhook receivers, Stripe redirects, or signer email delivery. These have their own targets documented under webhooks.
DocuSign or Conga compatibility shims that proxy to a third-party vendor. Latency for those endpoints inherits the vendor's own SLO; see the migration-compatibility guide for details.
Custom template logic that performs synchronous external HTTP calls during render. The render budget assumes deterministic templates whose only inputs are the supplied merge data and bundled assets.

How to measure on your side

If you want to track these SLOs from the client, the X-Request-Id header returned on every response is the join key. Pair it with the created_at and generated_at timestamps on the document resource to compute end-to-end latency. The values you measure should track the published p50 within roughly 100ms; large divergences typically indicate either a network path issue or a client-side queuing layer between your app and the API.
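A client-side calculation along these lines might look like the sketch below. It assumes that created_at and generated_at are ISO 8601 strings with an explicit UTC offset and that you have stored each document resource keyed by its X-Request-Id value; the sample data, the helper names, and the nearest-rank percentile method are illustrative choices, not part of the API.

```python
import math
from datetime import datetime


def latency_seconds(doc: dict) -> float:
    """End-to-end latency from the document resource's timestamps."""
    created = datetime.fromisoformat(doc["created_at"])
    generated = datetime.fromisoformat(doc["generated_at"])
    return (generated - created).total_seconds()


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical samples for small PDFs, keyed by X-Request-Id.
docs_by_request_id = {
    "req-1": {"created_at": "2024-05-01T12:00:00+00:00",
              "generated_at": "2024-05-01T12:00:01.400+00:00"},
    "req-2": {"created_at": "2024-05-01T12:01:00+00:00",
              "generated_at": "2024-05-01T12:01:01.550+00:00"},
    "req-3": {"created_at": "2024-05-01T12:02:00+00:00",
              "generated_at": "2024-05-01T12:02:02.900+00:00"},
}

latencies = [latency_seconds(d) for d in docs_by_request_id.values()]
observed_p50 = percentile(latencies, 50)

PUBLISHED_P50_S = 1.5  # small PDF p50 from the table above
print(observed_p50)                                # 1.55
print(abs(observed_p50 - PUBLISHED_P50_S) <= 0.1)  # True
```

Three samples are far too few for a stable percentile; in practice you would aggregate over a window of traffic per format and size class before comparing against the published numbers.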
