Rate limits
How much traffic the public API accepts today, how the per-API-key limits are enforced, and how to design a client that backs off correctly under throttling.
This page is the source of truth for the throttle behaviour. The 429 envelope, the rate_limited and too_many_failures codes, and the Retry-After / X-RateLimit-* headers are documented end-to-end below.
Today
Two distinct ceilings are active:
| Limit | Value | Where it applies |
|---|---|---|
| Reads (GET /...) | 600 req/min sustained, burst 100 | Per API key. Returns 429 with code rate_limited when exceeded. |
| Writes (POST / PUT / PATCH / DELETE) | 120 req/min sustained, burst 30 | Per API key. Returns 429 with code rate_limited. |
| Imports trigger (POST /imports, POST /imports/{id}/start, POST /imports/{id}/cancel) | 10 req/min sustained, burst 5 | Per API key. Returns 429 with code rate_limited. |
| Failed auth attempts | 5 / minute / IP | Returns 429 with code too_many_failures when exceeded. Cools down after 60 seconds. |
| Body size | 5 MB | Per request, on every endpoint. Larger payloads return 413. Use Imports instead. |
| Batch size | 500 items | Per POST /products/batch call. |
| Imports, line count | No hard cap | NDJSON files past ~1M lines may take noticeably longer to process. Split into multiple imports if you can. |
| Concurrent imports | No hard cap | Be reasonable: running dozens of large imports in parallel for the same company is fine for short bursts, but not as a steady state. |
The buckets are independent: saturating reads will not throttle your writes, and vice-versa.
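Because the buckets are independent, a client can track them separately. The sketch below maps a request to the bucket it most likely counts against, using the values from the table. The function name `bucket_for`, the assumption that paths carry no version prefix, and the assumption that import triggers count only against the imports bucket are illustrative and not confirmed by this page.

```python
import re

# Sustained (req/min) and burst values copied from the limits table.
BUCKETS = {
    "imports": {"sustained_per_min": 10, "burst": 5},
    "writes": {"sustained_per_min": 120, "burst": 30},
    "reads": {"sustained_per_min": 600, "burst": 100},
}

# The three import-trigger routes named in the table:
# /imports, /imports/{id}/start, /imports/{id}/cancel
_IMPORT_TRIGGER = re.compile(r"^/imports(/[^/]+/(start|cancel))?/?$")


def bucket_for(method: str, path: str) -> str:
    """Guess which per-key bucket a request counts against (hypothetical helper)."""
    method = method.upper()
    if method == "POST" and _IMPORT_TRIGGER.match(path):
        return "imports"
    if method in {"POST", "PUT", "PATCH", "DELETE"}:
        return "writes"
    # Everything else is treated as a read; the table only lists GET explicitly.
    return "reads"
```

A client that keeps one local counter per bucket name can then slow its writes without touching its read traffic.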
Plan against the documented values
The documented limits are the contract. Treat the table above as the hard cap and don't build retry budgets that rely on extra headroom.
Granularity: per-API-key
Limits apply per API key, not per company. Each key holds its own counters, so provisioning extra keys widens your effective budget and isolates a noisy integration from the rest of your traffic. If you have a backfill cron that occasionally bursts and a customer-facing storefront that must never see 429, mint a separate key for each.
The trade-off: there is no global per-company cap. We expect the per-key budget to be enough for any realistic sync; if a single integration legitimately needs more, mint additional keys for it rather than asking us to lift the limits.
The 429 response
When a request is throttled, the API responds with 429 Too Many Requests and the standard error envelope. The error code is rate_limited for the per-API-key buckets and too_many_failures for the auth-layer IP throttle. See Humind error codes.
Response headers
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 12
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1714060800
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before retrying. Always set on 429. Honor it as the floor for your backoff. |
| X-RateLimit-Limit | Sustained per-minute cap for this endpoint group. |
| X-RateLimit-Remaining | Requests left in the current window. Drops to 0 when the cap is hit. |
| X-RateLimit-Reset | Unix epoch (seconds) at which the window resets and the budget refills. |
Burst values aren't headerized
X-RateLimit-Limit exposes only the sustained per-minute cap. The burst budget (e.g. 30 burst over 120 sustained writes) is not echoed in headers; treat it as additional short-window headroom, not as a documented runtime signal.
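As an illustration of reading these headers on the client, here is a minimal sketch that parses them into a small record and computes the time left until the window resets. `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, and the sketch assumes the headers arrive as a plain string-to-string mapping.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]        # X-RateLimit-Limit (sustained per-minute cap)
    remaining: Optional[int]    # X-RateLimit-Remaining
    reset_epoch: Optional[int]  # X-RateLimit-Reset (Unix seconds)
    retry_after: Optional[int]  # Retry-After (seconds)

    def seconds_until_reset(self, now: Optional[float] = None) -> Optional[float]:
        """Seconds until the window refills, never negative."""
        if self.reset_epoch is None:
            return None
        now = time.time() if now is None else now
        return max(0.0, self.reset_epoch - now)


def _to_int(value) -> Optional[int]:
    # Missing or malformed headers become None instead of raising.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset_epoch=_to_int(lowered.get("x-ratelimit-reset")),
        retry_after=_to_int(lowered.get("retry-after")),
    )
```

Since the burst budget is not exposed, this record only ever reflects the sustained cap.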
Response body
{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "request_id": "req_8f3a1c2d4e5b6a7f",
    "details": {
      "retry_after": 12
    }
  }
}
The Retry-After header and details.retry_after always agree. Read either one; the header is canonical.
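A minimal sketch of pulling the useful fields out of a throttled response, preferring the header and falling back to the body. The helper name `throttle_info` and the 1-second fallback when neither value can be read are assumptions, not documented behaviour.

```python
import json
from typing import Mapping, Optional, Tuple


def throttle_info(
    status: int, headers: Mapping[str, str], body_text: str
) -> Optional[Tuple[Optional[str], int, Optional[str]]]:
    """Return (code, wait_seconds, request_id) for a 429, else None."""
    if status != 429:
        return None

    # Tolerate a body that is not the expected JSON envelope.
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = {}
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    details = error.get("details")
    if not isinstance(details, dict):
        details = {}

    # The header is canonical; details.retry_after is the fallback.
    lowered = {k.lower(): v for k, v in headers.items()}
    wait = None
    for candidate in (lowered.get("retry-after"), details.get("retry_after")):
        try:
            wait = int(candidate)
            break
        except (TypeError, ValueError):
            continue
    if wait is None:
        wait = 1  # assumed fallback, not part of the documented contract

    return error.get("code"), wait, error.get("request_id")
```

Returning `request_id` alongside the wait makes it easy to log it on every 429.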
Implementing backoff
Read Retry-After on every 429, wait at least that long, then double the wait on each subsequent failure up to a 60-second cap. Stop after 5 retries; if you're still throttled, the problem isn't transient.
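To make the arithmetic concrete, the following sketch computes the wait before each retry for a given Retry-After value. `backoff_schedule` is a hypothetical helper; it assumes the same Retry-After value on every attempt and keeps that value as the floor even when it exceeds the 60-second cap.

```python
from typing import List


def backoff_schedule(retry_after: int, max_retries: int = 5, cap: int = 60) -> List[int]:
    """Wait (in seconds) before each retry: doubled per attempt, capped, floored at Retry-After."""
    return [
        max(retry_after, min(retry_after * 2 ** attempt, cap))
        for attempt in range(max_retries)
    ]


print(backoff_schedule(12))  # [12, 24, 48, 60, 60]
```

With Retry-After: 12, a client that exhausts all five retries has waited 204 seconds in total before giving up.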
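Node
async function withBackoff(fetchOnce, { maxRetries = 5, cap = 60_000 } = {}) {
  let attempt = 0
  while (true) {
    const res = await fetchOnce()
    // Anything other than 429 (success or a different error) goes back to the caller.
    if (res.status !== 429) return res
    // Retry budget spent: hand back the last 429.
    if (attempt >= maxRetries) return res
    // Retry-After is in seconds; fall back to 1s if it is missing or unparsable.
    const floor = (Number(res.headers.get('retry-after')) || 1) * 1000
    // Double per attempt up to the cap, but never wait less than Retry-After.
    const wait = Math.max(floor, Math.min(floor * 2 ** attempt, cap))
    await new Promise(r => setTimeout(r, wait))
    attempt++
  }
}
Python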
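import time
import requests  # `call` is expected to return a requests.Response

def with_backoff(call, max_retries=5, cap=60):
    attempt = 0
    while True:
        res = call()
        # Anything other than 429 goes straight back to the caller.
        if res.status_code != 429:
            return res
        # Retry budget spent: hand back the last 429.
        if attempt >= max_retries:
            return res
        # Retry-After is in seconds; fall back to 1s if missing or unparsable.
        try:
            retry_after = max(1, int(res.headers.get('Retry-After', '1')))
        except ValueError:
            retry_after = 1
        # Double per attempt up to the cap, but never wait less than Retry-After.
        wait = max(retry_after, min(retry_after * (2 ** attempt), cap))
        time.sleep(wait)
        attempt += 1
Ruby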
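def with_backoff(max_retries: 5, cap: 60)
  attempt = 0
  loop do
    res = yield
    # Anything other than 429 goes straight back to the caller.
    return res if res.code.to_i != 429
    # Retry budget spent: hand back the last 429.
    return res if attempt >= max_retries
    # nil.to_i and non-numeric strings both give 0, so fall back to 1s.
    retry_after = res['Retry-After'].to_i
    retry_after = 1 if retry_after.zero?
    # Double per attempt up to the cap, but never wait less than Retry-After.
    wait = [retry_after, [retry_after * (2 ** attempt), cap].min].max
    sleep wait
    attempt += 1
  end
end
Best practices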
- Honor Retry-After. Treat it as the floor of your backoff window, not a hint. Retrying earlier just bounces another 429 and wastes budget.
- Exponential backoff with a cap. Start at Retry-After, double each retry, cap at 60s. Five retries max; past that the issue isn't transient.
- Prefer batch over loops. POST /products/batch costs one request for up to 500 products, vs. 500 single POST /products calls. The loop burns your write budget in seconds.
- Use Imports for big pushes. 50,000 products is one POST /imports call, not 100 batch requests or 50,000 single ones. Imports also bypass the body-size and per-call limits.
- Cache reads merchant-side. If your code reads the same product multiple times in a short window, cache the result. The product API isn't a database; round-tripping it on every page render burns latency and quota.
- Don't retry writes blindly. Pair every retry with an Idempotency-Key so the second call can't double-create. See Idempotency.
- Spread bulk traffic out. If you have to do thousands of writes, drip them at one or two per second rather than firing them all in a single burst: bursts get clipped, drips don't.
- Log request_id on every 429. Including it in support tickets makes the log lookup instant.
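Two of the practices above, dripping writes and pairing retries with an idempotency key, can be sketched together. `drip_writes` and its `send(item, idempotency_key)` callback are hypothetical; the use of a UUID as the key value is an assumption, so check the Idempotency page for the accepted format.

```python
import time
import uuid
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def drip_writes(
    items: Iterable[T],
    send: Callable[[T, str], R],
    per_second: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[R]:
    """Call send(item, idempotency_key) at a steady pace and collect the results."""
    interval = 1.0 / per_second
    results: List[R] = []
    next_slot = clock()
    for item in items:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)
        # One key per logical write, created outside `send`, so any retry
        # logic inside `send` reuses the same key instead of minting a new one.
        key = str(uuid.uuid4())
        results.append(send(item, key))
        # If `send` ran long, restart pacing from now rather than bursting to catch up.
        next_slot = max(next_slot + interval, clock())
    return results
```

The default of two writes per second matches the 120 req/min sustained write limit; `send` would typically wrap a backoff helper and set the Idempotency-Key header from the key it receives.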
Avoiding rate limits with batches and imports
The most reliable way not to hit a rate limit is to use the right endpoint for the volume you're moving.
| Need | N | Use |
|---|---|---|
| Push products | 1 | POST /products |
| Push products | 2–500 | POST /products/batch |
| Push products | > 500 | POST /imports (NDJSON) |
| Read a single product | 1 | GET /products/{id} |
| Read products by filter | many | GET /products?... (server-side filtering, not a client-side fan-out) |
| Read products by external_id | many | GET /products/api:<external_id> per ID, cache aggressively. For full backfill, use Imports. |
The pattern: anything that scales with N should be one call, not N calls.
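The push rows of the table can be expressed as a small dispatcher. This is a sketch only: `endpoint_for_push` and `to_ndjson` are hypothetical helpers, exactly 500 items is treated as a batch because the batch limit is 500, and the record fields in the usage line are made up for illustration.

```python
import json
from typing import Any, Iterable, Mapping

BATCH_MAX = 500  # "Batch size" row in the limits table


def endpoint_for_push(n: int) -> str:
    """Pick the endpoint for pushing n products, following the table."""
    if n < 1:
        raise ValueError("nothing to push")
    if n == 1:
        return "POST /products"
    if n <= BATCH_MAX:
        return "POST /products/batch"
    return "POST /imports"


def to_ndjson(records: Iterable[Mapping[str, Any]]) -> str:
    """Serialise records as NDJSON: one compact JSON object per line."""
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


print(endpoint_for_push(50_000))        # POST /imports
print(to_ndjson([{"id": 1}, {"id": 2}]), end="")
```

Whatever the volume, the request count stays at one, which is the point of the table.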
Concurrent connections
There's no explicit limit on concurrent connections per API key today. Very high concurrency tends to show up as elevated latency rather than as throttling, so use moderate parallelism even where 429s aren't enforced.
A few rules of thumb:
- Sustained traffic above ~50 requests/second to the same company starts pushing P95 latency up. Keep concurrency moderate (a handful of in-flight requests at a time, not hundreds).
- Reuse HTTP connections. Open one keep-alive client at startup and reuse it for every call. Spinning up a fresh TCP/TLS handshake per request is wasteful and slow.
- Connection pooling on the merchant side helps more than parallelism. A pool of 4–8 connections that drips requests steadily outperforms a flood of 100 concurrent requests, both in throughput and in playing nicely with the documented limits.
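One way to keep concurrency moderate is to run calls through a small fixed-size worker pool. The sketch below uses only the standard library; `run_bounded` is a hypothetical helper, and the default of 4 workers is taken from the low end of the 4–8 range mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

R = TypeVar("R")


def run_bounded(calls: Iterable[Callable[[], R]], max_in_flight: int = 4) -> List[R]:
    """Run zero-argument callables with at most `max_in_flight` executing at once.

    Results come back in submission order. An exception raised by a call
    propagates when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(call) for call in calls]
        return [future.result() for future in futures]
```

Each callable would normally issue its request through a single keep-alive HTTP client created at startup, so the pool bounds in-flight requests while connections are reused.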
Next
- Errors: error format, the full list of HTTP status codes and Humind error codes, including rate_limited and too_many_failures.
- Conventions: request format, body size limits, idempotency.
- Imports: the right tool for moving large catalogs without burning write budget.