Rate Limits

AGNT enforces rate limits to keep the platform stable and fair. The limits are generous for normal usage — if you're hitting them, something is probably wrong with your integration.

Standard Rate Limit

1,000 requests per minute per organization.

This applies across all API endpoints. The limit is measured over a rolling 60-second window, not a fixed clock minute.
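To stay under the limit proactively, a client can track its own request timestamps over the same kind of rolling window. The following is a minimal client-side sketch, not AGNT's server-side algorithm (which this page does not describe). The class name RollingWindowLimiter and its parameters are hypothetical, and the 60-second window is an assumption derived from "per minute".

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Client-side guard: allow at most `limit` requests in any `window` seconds.

    Hypothetical helper for illustration; the server's own accounting may differ.
    """

    def __init__(self, limit=1000, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self):
        """Record a request and return True if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            self._sent.append(now)
            return True
        return False
```

Call try_acquire() before each request and delay the request when it returns False. Because the limit is per organization, every process that shares the organization's credentials counts against the same budget, so a single in-process limiter like this one only helps when one process does most of the sending.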

If you exceed the limit, you'll receive a 429 Too Many Requests response. Back off and retry.
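One common way to back off is exponential delay with random jitter. This page does not prescribe a specific strategy, so treat the sketch below as one reasonable option. The send callable, the function name, and the delay values are all assumptions chosen for illustration.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning (status, body).
    Returns the last (status, body) pair seen.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: the upper bound doubles on
            # each attempt, and the actual wait is a random point below it.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    return status, body
```

The jitter spreads out retries from clients that were throttled at the same moment, so they do not all return at once and trip the limit again.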

Chat Processing Lock

Chat message processing uses a distributed lock to prevent race conditions. Each chat allows only one processing request at a time.

When a message is being processed:

  • The chat acquires a Redis distributed lock
  • The lock has a 30-second timeout
  • Any additional processing requests to the same chat receive a 409 Conflict response

Example response body (JSON):

{
  "error": "Chat is currently being processed",
  "error_code": "CHAT_PROCESSING_LOCKED"
}

Lock Lifecycle

The lock releases automatically in three scenarios:

  1. Success — Processing completes normally, lock releases immediately
  2. Error — Processing fails, lock releases immediately
  3. Timeout — If neither success nor error occurs within 30 seconds, the lock auto-releases as a safety valve
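The three release paths can be illustrated with a small in-memory model. This is not AGNT's implementation: the real lock lives in Redis, and the class and method names here (ChatLockModel, ChatLocked, process) are hypothetical. The model only mirrors the behavior described above: one holder per chat, immediate release on success or error, and expiry after 30 seconds.

```python
import time


class ChatLocked(Exception):
    """Raised by the model where the real API would return 409 Conflict."""


class ChatLockModel:
    """In-memory model of a per-chat lock with a timeout (illustration only)."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self._expires_at = {}  # chat_id -> time at which the lock auto-releases

    def acquire(self, chat_id):
        """Take the lock and return True, or return False if it is still held."""
        now = self.clock()
        expires_at = self._expires_at.get(chat_id)
        if expires_at is not None and now < expires_at:
            return False
        # Either no lock exists or the old one timed out (scenario 3).
        self._expires_at[chat_id] = now + self.timeout
        return True

    def release(self, chat_id):
        self._expires_at.pop(chat_id, None)

    def process(self, chat_id, handler):
        """Run handler() under the chat's lock."""
        if not self.acquire(chat_id):
            raise ChatLocked(chat_id)
        try:
            return handler()
        finally:
            # Runs on success (scenario 1) and on error (scenario 2).
            self.release(chat_id)
```

The timeout path matters for clients because it bounds how long a chat can stay locked: under this model, a chat whose processing request hangs becomes available again no later than 30 seconds after the lock was taken.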

Handling 409 Responses

Don't retry 409s in a tight loop. The previous request is still processing. Wait a reasonable interval (1-2 seconds) and retry, or better yet, design your integration to avoid concurrent sends to the same chat.
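A paced retry along these lines might look like the sketch below. The send callable and the function name are hypothetical. The error_code value comes from the response body shown earlier, and the 1 to 2 second pause follows the interval suggested above.

```python
import random
import time

LOCKED = "CHAT_PROCESSING_LOCKED"


def send_when_unlocked(send, max_attempts=5, sleep=time.sleep):
    """Retry send() while the chat is locked, pausing 1-2 s between tries.

    `send` is a hypothetical zero-argument callable returning (status, body),
    where body is the parsed JSON response (a dict) or None.
    """
    for attempt in range(max_attempts):
        status, body = send()
        locked = status == 409 and (body or {}).get("error_code") == LOCKED
        if not locked:
            # Success, or an error this helper does not handle: return as is.
            return status, body
        if attempt < max_attempts - 1:
            sleep(random.uniform(1.0, 2.0))
    return status, body
```

Checking error_code in addition to the status keeps the helper from retrying a 409 that means something else. The attempt cap ensures the caller eventually sees the failure instead of waiting indefinitely.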

If you're building a real-time UI, disable the send button while a message is in flight. If you're building a backend integration, queue messages per chat and process them sequentially.
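For a backend integration, per-chat sequencing can be sketched with asyncio. This version uses one asyncio.Lock per chat rather than an explicit queue: waiters on an asyncio.Lock are served in the order they started waiting, so messages for a chat go out one at a time in submission order, while different chats proceed in parallel. The PerChatSender class and the send coroutine are hypothetical.

```python
import asyncio
from collections import defaultdict


class PerChatSender:
    """Serialize sends per chat while letting different chats run in parallel."""

    def __init__(self, send):
        # `send` is a hypothetical coroutine function: send(chat_id, message).
        self._send = send
        # One lock per chat, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    async def submit(self, chat_id, message):
        """Send `message` once every earlier message for this chat has finished."""
        async with self._locks[chat_id]:
            return await self._send(chat_id, message)
```

This only serializes sends within a single process. If several workers can send to the same chat, they need a shared queue or another form of coordination, and the 409 handling above remains the fallback. The lock table also grows with the number of distinct chats, so a long-running service would want to evict idle entries.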