Rate limit design
Arvexi enforces rate limits at the organization level, not per API key: all keys belonging to the same organization share a single rate limit budget. This gives teams flexibility to distribute requests across multiple keys and services, but it also means a single runaway integration can exhaust the quota for the entire organization, so budget requests across your integrations accordingly.
Rate limits are tracked in the database using a sliding window counter, not in-memory state. This makes the system serverless-safe. Limits are enforced consistently even when requests are handled by different serverless function instances, different regions, or after cold starts. There is no Redis dependency.
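As an illustration (not Arvexi's actual implementation), a sliding window counter weights the previous fixed window's count by how much of it still overlaps the sliding 60-second window, avoiding both the burst-at-boundary problem of fixed windows and the storage cost of logging every request. A minimal in-memory Python sketch:

```python
import time

WINDOW = 60  # seconds

# org_id -> {"start": window start, "curr": current count, "prev": previous count}
# In-memory sketch only: the production design persists these counters in the
# database so every serverless instance reads and updates the same state.
_counters = {}

def allow(org_id, limit, now=None):
    """Sliding window counter: estimate the request rate as the previous
    window's count (weighted by remaining overlap) plus the current window's
    count, and admit the request only if the estimate is under the limit."""
    now = time.time() if now is None else now
    c = _counters.setdefault(org_id, {"start": now, "curr": 0, "prev": 0})
    elapsed = now - c["start"]
    if elapsed >= WINDOW:
        # Roll the fixed window forward; a gap of two full windows means
        # the previous window contributes nothing.
        c["prev"] = c["curr"] if elapsed < 2 * WINDOW else 0
        c["curr"] = 0
        c["start"] += (elapsed // WINDOW) * WINDOW
        elapsed = now - c["start"]
    weight = 1 - elapsed / WINDOW
    estimated = c["prev"] * weight + c["curr"]
    if estimated >= limit:
        return False
    c["curr"] += 1
    return True
```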
Thresholds
Default rate limits vary by plan:
- Standard: 1,000 requests per minute.
- Professional: 2,500 requests per minute.
- Enterprise: 5,000 requests per minute, with custom limits available on request.
Write operations (POST, PUT, PATCH, DELETE) count the same as read operations (GET). Bulk endpoints that process multiple records in a single request count as one request toward the limit.
Rate limit headers
Every API response includes three headers that tell you where you stand against your limit:
- X-RateLimit-Limit: Your maximum requests per minute.
- X-RateLimit-Remaining: How many requests you have left in the current window.
- X-RateLimit-Reset: Unix timestamp (in seconds) when the current window resets.
Use these headers to implement client-side throttling. A well-behaved integration checks X-RateLimit-Remaining before each batch and backs off when it approaches zero, rather than hitting the limit and handling errors.
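One way to implement that check in Python, assuming a requests-style response whose headers behave like a dict (the helper name and the floor of 5 remaining requests are our choices, not part of the API):

```python
import time

def throttle_delay(headers, floor=5, now=None):
    """Return seconds to pause before the next batch, based on the
    rate limit headers. Backs off when fewer than `floor` requests
    remain in the current window; otherwise returns 0."""
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    if remaining >= floor:
        return 0.0
    now = time.time() if now is None else now
    # Sleep until the window resets (never a negative delay).
    return max(0.0, reset_at - now)
```

Before each batch, call `time.sleep(throttle_delay(resp.headers))` with the headers from your most recent response.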
Handling 429 responses
When you exceed the rate limit, Arvexi returns a 429 Too Many Requests response with a JSON body:
```json
{
  "error": "rate_limit_exceeded",
  "message": "Organization rate limit exceeded. Retry after 12 seconds.",
  "retry_after": 12
}
```

The retry_after field (also sent as a Retry-After header) tells you how many seconds to wait before retrying. Implement exponential backoff with jitter for production integrations:
- Wait for retry_after seconds plus a random jitter of 0–2 seconds.
- Retry the request. If you receive another 429, double the wait time (up to a maximum of 60 seconds).
- After five consecutive 429 responses, stop retrying and alert your monitoring system.
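Those steps can be sketched in Python, assuming a requests-style response with status_code and json() (the helper name and the RuntimeError are illustrative, not part of the API):

```python
import random
import time

MAX_WAIT = 60        # cap on the doubled wait, per the steps above
MAX_CONSECUTIVE = 5  # consecutive 429s before giving up

def call_with_backoff(send, sleep=time.sleep):
    """Retry send() (a callable that performs the request and returns the
    response) following the backoff steps above."""
    wait = None
    consecutive = 0
    while True:
        resp = send()
        if resp.status_code != 429:
            return resp
        consecutive += 1
        if consecutive >= MAX_CONSECUTIVE:
            # Five consecutive 429s: stop and let monitoring take over.
            raise RuntimeError("rate limited after 5 consecutive 429 responses")
        if wait is None:
            wait = resp.json().get("retry_after", 1)  # server's hint on the first 429
        else:
            wait = min(wait * 2, MAX_WAIT)            # double, capped at 60 seconds
        sleep(wait + random.uniform(0, 2))            # plus 0-2 seconds of jitter
```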
Idempotency keys
For write operations, include an Idempotency-Key header with a unique value (typically a UUID v4). Arvexi stores each key for 24 hours. If a request arrives with an idempotency key that was already used, Arvexi returns the original response without re-executing the operation.
This is critical for safely retrying failed requests. If your network connection drops after sending a POST request but before receiving the response, you cannot know whether the server processed it. Resending the same request with the same idempotency key guarantees the operation happens exactly once.
Idempotency keys are also used internally for webhook deliveries. Each webhook payload includes an idempotency_key field that your endpoint should use for deduplication, following the same 24-hour TTL window.
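A minimal dedup sketch for a webhook endpoint, using in-memory state for illustration (an endpoint running across multiple instances would use a shared store such as a database table with the same 24-hour TTL):

```python
import time

TTL_SECONDS = 24 * 60 * 60  # match the 24-hour key window

_seen = {}  # idempotency_key -> first-seen timestamp

def is_duplicate(idempotency_key, now=None):
    """Return True if this webhook payload's idempotency_key was already
    processed within the last 24 hours; otherwise record it and return False."""
    now = time.time() if now is None else now
    # Expire keys older than the TTL before checking.
    for key, seen_at in list(_seen.items()):
        if now - seen_at > TTL_SECONDS:
            del _seen[key]
    if idempotency_key in _seen:
        return True
    _seen[idempotency_key] = now
    return False
```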
Rules for idempotency keys:
- Must be unique per operation. Reusing a key with a different request body returns a 422 Unprocessable Entity error.
- Maximum length is 255 characters. Arvexi recommends UUID v4 format.
- Keys expire after 24 hours. After expiry, the same key can be reused for a new operation.
- GET requests do not require idempotency keys (reads are naturally idempotent).
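Generating and attaching a key per the rules above can look like this sketch (the Bearer auth scheme shown is an assumption; see the authentication article for specifics):

```python
import uuid

def idempotent_headers(api_key):
    """Build headers for a write request. Generate a fresh UUID v4 per
    operation, but reuse the SAME key when retrying the SAME request after
    a network failure, so the server can return the original response
    instead of re-executing the operation."""
    return {
        "Authorization": f"Bearer {api_key}",   # assumed auth scheme
        "Idempotency-Key": str(uuid.uuid4()),   # unique per operation, well under 255 chars
        "Content-Type": "application/json",
    }
```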
Statement timeout
Every database query executed by the API has a statement timeout of 60 seconds. If a query exceeds this limit, the database cancels it and Arvexi returns a 503 Service Unavailable response:
```json
{
  "error": "statement_timeout",
  "message": "The query exceeded the 60-second statement timeout.",
  "retry_after": 5
}
```

Statement timeouts protect the database from runaway queries that could degrade performance for all tenants. If you encounter this error consistently, the most common causes are:
- Overly broad queries: Add filters (date ranges, entity IDs, status) to reduce the result set. Use pagination for large collections.
- Missing indexes: Contact support if you believe a query pattern should be faster. Arvexi monitors slow query logs and adds indexes proactively, but custom report queries may surface edge cases.
- Large batch operations: Split bulk creates or updates into smaller batches (100–500 records per request).
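The batch-splitting advice can be sketched as a small helper (the 250-record default is one choice within the suggested 100–500 range):

```python
def chunked(records, size=250):
    """Yield successive batches of `size` records so each bulk request
    stays well inside the statement timeout."""
    for i in range(0, len(records), size):
        yield records[i:i + size]
```

Each yielded batch then becomes its own bulk create or update request.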
Connection pooling
Arvexi uses PgBouncer in transaction mode for connection pooling between the application and PostgreSQL. The pool is configured with:
- Maximum connections: 25 per pool.
- Minimum connections: 5 (kept warm to avoid cold-start latency).
- Idle timeout: 30 seconds. Idle connections beyond the minimum are closed after this period.
In a serverless environment, each function instance opens its own connection to PgBouncer, which then multiplexes those connections into the smaller pool of actual database connections. This prevents the “too many connections” problem that plagues serverless architectures with direct database access.
If PgBouncer cannot allocate a connection within 10 seconds (all 25 are in use), the API returns a 503 Service Unavailable response with a Retry-After: 5 header. This is rare under normal load but can occur during traffic spikes.
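Under stated assumptions, these settings map onto a pgbouncer.ini roughly like the following sketch (the option names are real PgBouncer settings; the exact mapping to Arvexi's deployment is an assumption):

```ini
[pgbouncer]
pool_mode = transaction       ; transaction-mode pooling
default_pool_size = 25        ; maximum connections per pool
min_pool_size = 5             ; kept warm to avoid cold-start latency
server_idle_timeout = 30      ; close idle server connections after 30s
query_wait_timeout = 10       ; give up after 10s waiting for a connection
```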
Error handling summary
The two error codes most relevant to rate limiting and infrastructure constraints:
- 429 Too Many Requests: You have exceeded your organization’s rate limit. Wait for the Retry-After duration and retry with exponential backoff.
- 503 Service Unavailable: The request failed due to a statement timeout or connection pool exhaustion. Wait for the Retry-After duration and retry. If the error persists, reduce your request volume or simplify the query.
Both error types include a Retry-After header and a retry_after field in the JSON body. Your integration should handle these gracefully. They are expected under heavy load, not signs of a bug.
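A single helper can recognize both retryable statuses and extract the wait, again assuming a requests-style response (the helper name is illustrative):

```python
RETRYABLE = {429, 503}

def retry_delay(resp):
    """Return seconds to wait before retrying, or None if the response is
    not a retryable 429/503. Prefers the Retry-After header, falling back
    to the retry_after field in the JSON body."""
    if resp.status_code not in RETRYABLE:
        return None
    header = resp.headers.get("Retry-After")
    if header is not None:
        return int(header)
    return int(resp.json().get("retry_after", 5))
```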
For authentication errors and API key management, see the API authentication and keys article. For webhook-specific delivery and retry behavior, see the webhook setup and events article.