Rate limits

How much traffic the public API accepts today, how the per-API-key limits are enforced, and how to design a client that backs off correctly under throttling.

This page is the source of truth for the throttle behaviour. The 429 envelope, the rate_limited and too_many_failures codes, and the Retry-After / X-RateLimit-* headers are documented end-to-end below.

Today

The following limits are active:

| Limit | Value | Where it applies |
| --- | --- | --- |
| Reads (GET /...) | 600 req/min sustained, burst 100 | Per API key. Returns 429 with code rate_limited when exceeded. |
| Writes (POST / PUT / PATCH / DELETE) | 120 req/min sustained, burst 30 | Per API key. Returns 429 with code rate_limited. |
| Imports trigger (POST /imports, POST /imports/{id}/start, POST /imports/{id}/cancel) | 10 req/min sustained, burst 5 | Per API key. Returns 429 with code rate_limited. |
| Failed auth attempts | 5 / minute / IP | Returns 429 with code too_many_failures when exceeded. Cools down after 60 seconds. |
| Body size | 5 MB | Per request, on every endpoint. Larger payloads return 413. Use Imports instead. |
| Batch size | 500 items | Per POST /products/batch call. |
| Imports, line count | No hard cap | NDJSON files past ~1M lines may take noticeably longer to process. Split into multiple imports if you can. |
| Concurrent imports | No hard cap | Be reasonable: running dozens of large imports in parallel for the same company is fine for short bursts, but not as a steady state. |

The buckets are independent: saturating reads will not throttle your writes, and vice-versa.
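
To make the bucket split concrete, here is a minimal sketch that maps a request to the bucket it would count against. The helper name `bucket_for` and the `LIMITS` dictionary are hypothetical. The sketch assumes paths are given relative to the API root and that an import trigger counts only against its own bucket, which this page does not state explicitly.

```python
import re

# Sustained (req/min) and burst values copied from the limits table above.
LIMITS = {
    "reads": {"per_minute": 600, "burst": 100},
    "writes": {"per_minute": 120, "burst": 30},
    "imports_trigger": {"per_minute": 10, "burst": 5},
}

# POST /imports, POST /imports/{id}/start, POST /imports/{id}/cancel
_IMPORT_TRIGGER = re.compile(r"^/imports(/[^/]+/(start|cancel))?/?$")


def bucket_for(method: str, path: str) -> str:
    """Return the name of the bucket a request counts against (hypothetical helper)."""
    method = method.upper()
    if method == "GET":
        return "reads"
    if method == "POST" and _IMPORT_TRIGGER.match(path):
        # Assumption: trigger calls are metered by their own bucket only.
        return "imports_trigger"
    if method in {"POST", "PUT", "PATCH", "DELETE"}:
        return "writes"
    raise ValueError(f"no documented bucket for {method}")
```

A client can use the returned name to pick which local counter to charge before it sends the request.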

Plan against the documented values

The documented limits are the contract. Treat the table above as the hard cap and don't build retry budgets that rely on extra headroom.
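
One way to plan against the documented values is to pace requests on the client before they ever reach the API. The sketch below is a simple token bucket sized from the table (for example 120 req/min sustained, burst 30 for writes). The class name `TokenBucket` and its parameters are hypothetical, and this page does not say which algorithm the server uses, so treat it as one reasonable reading of "sustained plus burst" rather than a mirror of the server. It is not thread-safe.

```python
import time


class TokenBucket:
    """Client-side limiter: refills at per_minute / 60 tokens per second, holds at most `burst`."""

    def __init__(self, per_minute, burst, clock=time.monotonic):
        self.rate = per_minute / 60.0      # tokens added per second
        self.capacity = float(burst)       # maximum tokens that can accumulate
        self.tokens = float(burst)         # start with a full burst budget
        self.clock = clock
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def delay_for_next(self):
        """Seconds to wait before one more request fits; 0.0 if it fits now."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate

    def acquire(self, sleep=time.sleep):
        """Wait until a token is available, then consume it."""
        wait = self.delay_for_next()
        if wait > 0:
            sleep(wait)
            self._refill()
        self.tokens -= 1.0
```

Calling `TokenBucket(120, 30).acquire()` before each write lets the first 30 calls through immediately and then spaces the rest half a second apart.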

Granularity: per-API-key

Limits apply per API key, not per company. Each key holds its own counters, so provisioning extra keys widens your effective budget and isolates a noisy integration from the rest of your traffic. If you have a backfill cron that occasionally bursts and a customer-facing storefront that must never see 429, mint a separate key for each.

The trade-off: there is no global per-company cap. We expect the per-key budget to be enough for any realistic sync; if a single integration legitimately needs more, mint additional keys for it rather than asking us to lift the limits.

The 429 response

When a request is throttled, the API responds with 429 Too Many Requests and the standard error envelope. The error code is rate_limited for the per-API-key buckets and too_many_failures for the auth-layer IP throttle. See Humind error codes.

Response headers

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 12
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1714060800

| Header | Meaning |
| --- | --- |
| Retry-After | Seconds to wait before retrying. Always set on 429. Honor it as the floor for your backoff. |
| X-RateLimit-Limit | Sustained per-minute cap for this endpoint group. |
| X-RateLimit-Remaining | Requests left in the current window. Drops to 0 when the cap is hit. |
| X-RateLimit-Reset | Unix epoch (seconds) at which the window resets and the budget refills. |
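
As an illustration of reading these headers, the sketch below parses the three X-RateLimit-* values and computes how long remains until the window resets. `RateLimitState` and `parse_rate_limit_headers` are hypothetical names. The sketch returns `None` when a header is absent, because this page only shows the headers on a 429 and does not say whether they accompany other responses.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitState:
    limit: int        # X-RateLimit-Limit: sustained per-minute cap
    remaining: int    # X-RateLimit-Remaining: requests left in the window
    reset_epoch: int  # X-RateLimit-Reset: Unix time (seconds) of the next refill

    def seconds_until_reset(self, now=None):
        """Seconds from `now` (default: current time) to the reset, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_epoch - now)


def parse_rate_limit_headers(headers):
    """Return a RateLimitState, or None if any header is missing or not an integer."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    try:
        return RateLimitState(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_epoch=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```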

Burst values aren't headerized

X-RateLimit-Limit exposes only the sustained per-minute cap. The burst budget (e.g. 30 burst over 120 sustained writes) is not echoed in headers. Treat it as additional short-window headroom, not as a signal you can read at runtime.

Response body

json
{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "request_id": "req_8f3a1c2d4e5b6a7f",
    "details": {
      "retry_after": 12
    }
  }
}

The Retry-After header and details.retry_after always agree. You can read either one, but the header is canonical.
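
A minimal sketch of that precedence, assuming the envelope shape shown above: prefer the Retry-After header and fall back to error.details.retry_after only if the header is missing or unparsable. The function name `retry_after_seconds` and the 1-second default are assumptions for illustration.

```python
import json


def retry_after_seconds(headers, body_text, default=1.0):
    """Prefer the Retry-After header; fall back to error.details.retry_after."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except (TypeError, ValueError):
                break  # not a plain number of seconds; try the body instead
    try:
        payload = json.loads(body_text)
        return max(0.0, float(payload["error"]["details"]["retry_after"]))
    except (ValueError, KeyError, TypeError):
        # Body missing, not JSON, or not shaped like the error envelope.
        return default
```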

Implementing backoff

Read Retry-After on every 429, wait at least that long, then double the wait on each subsequent failure up to a 60-second cap. Stop after 5 retries; if you're still throttled, the problem isn't transient.

Node

js
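// Retries fetchOnce() on 429, honoring Retry-After as the floor of each wait.
async function withBackoff(fetchOnce, { maxRetries = 5, cap = 60_000 } = {}) {
  let attempt = 0
  while (true) {
    const res = await fetchOnce()
    // Anything other than a throttle goes back to the caller untouched.
    if (res.status !== 429) return res
    // Out of retries: surface the final 429.
    if (attempt >= maxRetries) return res
    // Retry-After is in seconds; fall back to 1s if missing or unparsable.
    const floor = (Number(res.headers.get('retry-after')) || 1) * 1000
    // Double per attempt up to the cap, but never wait less than Retry-After.
    const wait = Math.max(floor, Math.min(floor * 2 ** attempt, cap))
    await new Promise(r => setTimeout(r, wait))
    attempt++
  }
}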

Python

python
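import time

def with_backoff(call, max_retries=5, cap=60):
    """Retry call() on 429. `call` returns a requests-style response."""
    attempt = 0
    while True:
        res = call()
        if res.status_code != 429:
            return res
        if attempt >= max_retries:
            return res  # out of retries: surface the final 429
        try:
            # Retry-After is in seconds; treat a missing or zero value as 1.
            retry_after = int(res.headers.get('Retry-After', '1')) or 1
        except ValueError:
            retry_after = 1  # header present but not a whole number of seconds
        # Double per attempt up to the cap, but never wait less than Retry-After.
        wait = max(retry_after, min(retry_after * (2 ** attempt), cap))
        time.sleep(wait)
        attempt += 1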

Ruby

ruby
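# Retries the given block on 429, honoring Retry-After as the floor of each wait.
def with_backoff(max_retries: 5, cap: 60)
  attempt = 0
  loop do
    res = yield
    return res if res.code.to_i != 429
    return res if attempt >= max_retries
    # Retry-After is in seconds; nil.to_i is 0, so a missing header falls back to 1.
    retry_after = res['Retry-After'].to_i
    retry_after = 1 if retry_after.zero?
    # Double per attempt up to the cap, but never wait less than Retry-After.
    wait = [retry_after, [retry_after * (2**attempt), cap].min].max
    sleep wait
    attempt += 1
  end
end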
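
To see what the doubling rule costs in wall-clock time, the schedule can be tabulated. This is a small worked example, assuming every 429 in the sequence carries the same Retry-After value. In practice the helpers above re-read the header on each response. `backoff_schedule` is a hypothetical name.

```python
def backoff_schedule(retry_after, max_retries=5, cap=60):
    """Waits (seconds) before each retry: start at Retry-After, double, cap at `cap`."""
    return [min(retry_after * 2 ** attempt, cap) for attempt in range(max_retries)]


print(backoff_schedule(12))  # [12, 24, 48, 60, 60]
print(backoff_schedule(1))   # [1, 2, 4, 8, 16]
```

With Retry-After: 12, a request that stays throttled through all five retries spends 204 seconds waiting before the caller sees the final 429, so size upstream timeouts accordingly.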

Best practices

  • Honor Retry-After. Treat it as the floor of your backoff window, not a hint. Retrying earlier just bounces another 429 and wastes budget.
  • Exponential backoff with a cap. Start at Retry-After, double each retry, cap at 60s. Five retries max; past that, the issue isn't transient.
  • Prefer batch over loops. POST /products/batch costs one request for up to 500 products, vs. 500 single POST /products calls. The loop burns your write budget in seconds.
  • Use Imports for big pushes. 50,000 products is one POST /imports call, not 100 batch requests or 50,000 single ones. Imports also bypass the body-size and per-call limits.
  • Cache reads merchant-side. If your code reads the same product multiple times in a short window, cache the result. The product API isn't a database: round-tripping it on every page render burns latency and quota.
  • Don't retry writes blindly. Pair every retry with an Idempotency-Key so the second call can't double-create. See Idempotency.
  • Spread bulk traffic out. If you have to do thousands of writes, drip them at one or two per second rather than firing them all in a single burst. Bursts get clipped; drips don't.
  • Log request_id on every 429. Including it in support tickets makes the log lookup instant.
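
The advice on retrying writes can be sketched as follows: generate one Idempotency-Key per logical write and reuse it on every attempt. Everything here other than the Idempotency-Key header name is an assumption for illustration. `send` is a hypothetical callable returning `(status_code, response_headers)`, and a UUID is used as the key value only as a placeholder, since the accepted key format is described on the Idempotency page, not here.

```python
import time
import uuid


def create_with_retries(send, payload, max_retries=5, cap=60, sleep=time.sleep):
    """Retry a write on 429, reusing one Idempotency-Key for every attempt.

    `send(payload, headers)` is a hypothetical callable that performs the HTTP
    request and returns `(status_code, response_headers)`.
    """
    # One key per logical write, generated once and reused on every retry.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_retries + 1):
        status, response_headers = send(payload, headers)
        if status != 429 or attempt == max_retries:
            return status
        retry_after = float(response_headers.get("Retry-After", 1))
        # Double per attempt up to the cap, but never wait less than Retry-After.
        sleep(max(retry_after, min(retry_after * 2 ** attempt, cap)))
```

Generating the key outside the retry loop is the point of the sketch: a fresh key per attempt would let a retried create succeed twice.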

Avoiding rate limits with batches and imports

The most reliable way not to hit a rate limit is to use the right endpoint for the volume you're moving.

| Need | N | Use |
| --- | --- | --- |
| Push products | 1 | POST /products |
| Push products | 2–500 | POST /products/batch |
| Push products | 500+ | POST /imports (NDJSON) |
| Read a single product | 1 | GET /products/{id} |
| Read products by filter | many | GET /products?... (server-side filtering, not a client-side fan-out) |
| Read products by external_id | many | GET /products/api:<external_id> per ID, cache aggressively. For full backfill, use Imports. |

The pattern: anything that scales with N should be one call, not N calls.
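
The push rows of the table can be expressed as a small decision helper. This is a sketch only: `plan_product_push` and `to_ndjson` are hypothetical names, the payload shapes are placeholders, and how the NDJSON file is actually handed to POST /imports is covered on the Imports page, not here.

```python
import json

BATCH_SIZE = 500  # documented cap per POST /products/batch call


def plan_product_push(products):
    """Return a list of (endpoint, payload) calls for pushing `products`."""
    n = len(products)
    if n == 0:
        return []
    if n == 1:
        return [("POST /products", products[0])]
    if n <= BATCH_SIZE:
        return [("POST /products/batch", products)]
    # More than 500 items: one import instead of many batch calls.
    return [("POST /imports", to_ndjson(products))]


def to_ndjson(products):
    """Serialise one JSON object per line, each line newline-terminated."""
    return "".join(json.dumps(p) + "\n" for p in products)
```

Whatever the size of the input, the plan contains at most one call, which is the property the table is aiming for.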

Concurrent connections

There's no explicit limit on concurrent connections per API key today. Very high concurrency tends to show up as elevated latency rather than as throttling, so use moderate parallelism even where 429s aren't enforced.

A few rules of thumb:

  • Sustained traffic above ~50 requests/second to the same company starts pushing P95 latency up. Keep concurrency moderate (a handful of in-flight requests at a time, not hundreds).
  • Reuse HTTP connections. Open one keep-alive client at startup and reuse it for every call. Spinning up a fresh TCP/TLS handshake per request is wasteful and slow.
  • Connection pooling on the merchant side helps more than parallelism. A pool of 4–8 connections that drips requests steadily outperforms a flood of 100 concurrent requests, both in throughput and in playing nicely with the documented limits.
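
A bounded worker pool is one simple way to keep concurrency in the "handful of in-flight requests" range described above. In this sketch `fetch_all` and `fetch_one` are hypothetical names. In a real client `fetch_one` would issue the request through a single shared keep-alive HTTP client, and the default of 6 workers is just a point inside the 4–8 range mentioned in the list.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch_one, ids, max_workers=6):
    """Call `fetch_one(id)` for every id with at most `max_workers` in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `ids`.
        return list(pool.map(fetch_one, ids))
```

The pool size caps concurrency but not request rate, so it still needs to be paired with backoff on 429.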

Next

  • Errors: error format, the full list of HTTP status codes and Humind error codes, including rate_limited and too_many_failures.
  • Conventions: request format, body size limits, idempotency.
  • Imports: the right tool for moving large catalogs without burning write budget.

Released under the proprietary Humind license.