Rate limits
Discord enforces two layers of rate limits that any production bot has to
respect: a global ceiling (~50 requests/sec per token to most routes)
and per-route buckets (varying per endpoint, with `:id`-aware
discrimination: `/channels/A/messages` and `/channels/B/messages` are
separate buckets). When you exceed either, Discord responds with HTTP 429
and a `retry_after` value, sent in the JSON body and mirrored in the `Retry-After` header.
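To make the per-route distinction concrete, here is a minimal sketch of how a client might derive a bucket key from a request. `routeBucketKey` and `MAJOR_RESOURCES` are hypothetical names, not part of discord-mcp or @discordjs/rest. The rule used here (keep the ID directly after `channels`, `guilds`, or `webhooks`; collapse every other numeric ID to `:id`) is a simplification. Discord's `X-RateLimit-Bucket` response header remains the authoritative bucket identity.

```typescript
// Hypothetical helper: derive a client-side bucket key from method + path.
// Assumption: only the ID right after these top-level resources splits buckets.
const MAJOR_RESOURCES = new Set<string>(["channels", "guilds", "webhooks"]);

function routeBucketKey(method: string, path: string): string {
  const segments = path.split("/").filter(Boolean);
  const normalized = segments.map((segment, i) => {
    const isSnowflake = /^\d{17,20}$/.test(segment);
    if (!isSnowflake) return segment;
    // Keep the major parameter (e.g. the channel ID); collapse all other IDs.
    const isMajor = i === 1 && MAJOR_RESOURCES.has(segments[0]);
    return isMajor ? segment : ":id";
  });
  return `${method.toUpperCase()} /${normalized.join("/")}`;
}
```

Two channels yield two keys, while different message IDs inside one channel share a key, which matches the `/channels/A/messages` versus `/channels/B/messages` example above.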
discord-mcp’s job is to (1) respect those limits proactively, (2) handle the inevitable 429 when proactivity fails, and (3) give the agent a clear signal when it’s pushing too hard.
The model
```mermaid
graph LR
  Tool[Tool] --> Bulkhead[Bulkhead<br/>concurrency cap]
  Bulkhead --> Circuit[Circuit breaker<br/>per-route]
  Circuit --> Retry[Retry<br/>429-aware]
  Retry --> DJS[discordjs/rest]
  DJS --> Discord[Discord API]
  DJS -. djs queue disabled .-> Note[retries: 0<br/>cockatiel owns retry]
```

Three quotas live in different layers:
- Bulkhead (concurrency): the `MCP_BULKHEAD_LIMIT` semaphore caps in-flight requests. Default 100; tighten to 20–30 for early “agent is over-parallelizing” signals.
- Per-route circuit breaker: opens after 10 failures on a single route in a sliding window. Stops hammering a flapping endpoint before Discord’s Cloudflare layer treats us as abusive.
- 429-aware retry: when Discord returns 429, the retry layer waits `retry_after` seconds (plus jitter) before re-issuing. Bounded by `MCP_RETRY_MAX_ATTEMPTS`.
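The "10 failures in a sliding window" rule for the circuit breaker can be illustrated with a small counter. This is a sketch only, not Cockatiel's breaker implementation: `RouteFailureWindow` is a hypothetical class, and the 60-second window length is an assumption, since the text above gives the threshold but not the window size.

```typescript
// Hypothetical sliding-window failure counter; create one instance per route.
class RouteFailureWindow {
  private failures: number[] = []; // timestamps (ms) of recent failures
  private readonly threshold: number;
  private readonly windowMs: number;

  // threshold = 10 comes from the text; windowMs = 60_000 is an assumed value.
  constructor(threshold = 10, windowMs = 60_000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  /** Record a failure and report whether the breaker should now be open. */
  recordFailure(nowMs: number): boolean {
    this.failures.push(nowMs);
    return this.isOpen(nowMs);
  }

  /** True while at least `threshold` failures fall inside the window. */
  isOpen(nowMs: number): boolean {
    // Drop failures that have aged out of the window.
    this.failures = this.failures.filter((t) => nowMs - t < this.windowMs);
    return this.failures.length >= this.threshold;
  }
}
```

Keeping one window per route is what stops a single flapping endpoint from tripping the breaker for healthy routes.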
See Operations → Resilience for the exact env var contract.
@discordjs/rest queue is DISABLED
By default, @discordjs/rest ships with its own retry queue: when it sees
a 429, it queues subsequent requests and retries the failing one
automatically. Sounds good, but it conflicts with Cockatiel:
- djs-rest retries inside its own request method (a single `await` in the tool blocks for the full backoff).
- Cockatiel wraps that await with another retry layer.
- A 429 fires; djs-rest waits `retry_after`; Cockatiel’s timeout fires during the wait; the request appears to time out; Cockatiel retries. Now you have two unbounded retry sources fighting each other.
Plan 8 Phase C disabled the djs-rest queue by passing retries: 0 to
the @discordjs/rest constructor. Cockatiel is the single retry
authority; djs-rest just makes the HTTP call and bubbles 429s up
immediately.
```ts
// From packages/mcp-core/src/rest/resilient.ts (paraphrased)
import { REST } from '@discordjs/rest';

const rest = new REST({
  version: '10',
  retries: 0, // ← cockatiel owns retry
  globalRequestsPerSecond: 0, // ← cockatiel/bulkhead owns concurrency
});
```

The `globalRequestsPerSecond: 0` similarly disables djs-rest’s internal
global-rate-limit awareness; we let cockatiel + bulkhead handle that too.
Result: one retry layer, predictable backoff, clean traces.
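With a single retry authority, the 429 path is short enough to sketch in full. The code below is a minimal illustration, not the Cockatiel policy discord-mcp actually uses. `isRateLimited`, `retryDelayMs`, and `withRateLimitRetry` are hypothetical names, the 250 ms jitter ceiling is an assumed value, and `maxAttempts` is treated as a count of total attempts, which is an assumption about how `MCP_RETRY_MAX_ATTEMPTS` is interpreted.

```typescript
// Hypothetical error shape; the real 429 mapping lives in rest/errors.ts.
interface RateLimited {
  retryAfterSeconds: number;
}

function isRateLimited(err: unknown): err is RateLimited {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { retryAfterSeconds?: unknown }).retryAfterSeconds === "number"
  );
}

/** Delay before the next attempt: Discord's retry_after plus bounded jitter. */
function retryDelayMs(
  retryAfterSeconds: number,
  jitterMs = 250, // assumed jitter ceiling
  random: () => number = Math.random,
): number {
  return Math.ceil(retryAfterSeconds * 1000 + random() * jitterMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** The only retry loop: the wrapped call must not retry on its own. */
async function withRateLimitRetry<T>(call: () => Promise<T>, maxAttempts: number): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Non-429 errors and exhausted budgets propagate to the caller unchanged.
      if (!isRateLimited(err) || attempt >= maxAttempts) throw err;
      await sleep(retryDelayMs(err.retryAfterSeconds));
    }
  }
}
```

Because the wrapped call returns or throws immediately, an outer timeout measures one HTTP round trip rather than a hidden inner backoff.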
Webhook bypass
Webhook routes (`/webhooks/{id}/{token}` for execution,
`/webhooks/{id}` for management) use:
- A per-webhook rate limit (typically 30/min) that’s not shared with the bot’s global bucket.
- A different identity (the webhook ID/token, not the bot token).
Practical implication: high-volume publishing (announcements, notifications, log forwarding) should use webhooks, leaving the bot’s rate budget free for interactive work (commands, moderation, slash responses).
The trade-off: webhooks can’t react, can’t moderate, can’t read history — they’re write-only. Use them for the publish-heavy half of your workload and let the bot handle everything else.
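As a rough illustration of the publishing path, the sketch below builds an execute-webhook request without touching the bot token. `buildWebhookRequest` and `WebhookRequest` are hypothetical names, and the payload is reduced to a plain `content` string; the discord-mcp webhook tools may construct the request differently.

```typescript
interface WebhookRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

/** Build an execute-webhook request; the webhook ID and token are the only credentials. */
function buildWebhookRequest(
  webhookId: string,
  webhookToken: string,
  content: string,
): WebhookRequest {
  return {
    url: `https://discord.com/api/v10/webhooks/${webhookId}/${encodeURIComponent(webhookToken)}`,
    init: {
      method: "POST",
      // No Authorization header: the token in the URL identifies the webhook.
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ content }),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. A 429 on this route counts against the webhook's own limit, so it should still be retried using the returned `retry_after`.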
See webhook-execute recipe for
worked patterns.
Tuning by workload
High-volume agent (announcement bot, log forwarder)
- `MCP_BULKHEAD_LIMIT=200`: raise concurrency cap; you’re publish-heavy.
- `MCP_RETRY_MAX_ATTEMPTS=4`, `MCP_RETRY_BASE_DELAY_MS=500`: be patient with occasional 429s; you have time.
- Use webhooks for the publishing path; reserve the bot for control plane.
Low-volume agent (moderation, dashboards)
- Defaults are fine. `MCP_BULKHEAD_LIMIT=100` is way more than needed.
- Tighten if you want a clear “stop” signal: `MCP_BULKHEAD_LIMIT=20` hits saturation fast if the agent loops, which makes the bug visible.
Pipeline-heavy agent (multi-tool atomic operations)
- Be careful with `MCP_BULKHEAD_LIMIT`: one pipeline that fans out to N children competes with itself for slots. See Architecture → Pipeline and Operations → Resilience for the pipeline + bulkhead interaction notes.
- Min sane value: 10. A bulkhead of 1 deadlocks the pipeline tool itself.
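The self-competition is easiest to see with a toy slot pool. The sketch assumes the pipeline tool holds one slot for its whole duration while its children draw from the same pool; that is an inference from the deadlock note above, not a confirmed detail of the implementation. `SlotPool` and `admittedChildren` are hypothetical names. This pool rejects instead of queueing; with a queueing bulkhead the rejected children would wait forever instead.

```typescript
// Hypothetical non-queueing bulkhead: a counter with a hard cap.
class SlotPool {
  private inFlight = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  /** Take a slot if one is free; never waits. */
  tryAcquire(): boolean {
    if (this.inFlight >= this.limit) return false;
    this.inFlight++;
    return true;
  }

  release(): void {
    if (this.inFlight > 0) this.inFlight--;
  }
}

/** How many of `children` get a slot while the parent pipeline holds its own. */
function admittedChildren(limit: number, children: number): number {
  const pool = new SlotPool(limit);
  if (!pool.tryAcquire()) return 0; // the pipeline itself could not start
  let admitted = 0;
  for (let i = 0; i < children; i++) {
    if (pool.tryAcquire()) admitted++;
  }
  return admitted;
}
```

With a limit of 1 the parent takes the only slot and no child can run. With a limit of 10, a pipeline gets at most 9 children in flight at once.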
Dev / test
- `MCP_RETRY_MAX_ATTEMPTS=1` to surface failures immediately.
- `MCP_CIRCUIT_ENABLED=false` so a flaky test doesn’t open the breaker and corrupt subsequent test runs.
- `MCP_TIMEOUT_DEFAULT_MS=5000` to fail fast on hangs.
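The workload profiles in this section can be kept as data and merged into an environment before launching the server. This is only a sketch of one way to organize them: the profile names and the `envForProfile` helper are hypothetical, while the variable names and values are the ones listed above.

```typescript
type EnvOverrides = Readonly<Record<string, string>>;

// Values copied from the profiles in this section; the profile names are made up.
const PROFILES: Readonly<Record<string, EnvOverrides>> = {
  highVolume: {
    MCP_BULKHEAD_LIMIT: "200",
    MCP_RETRY_MAX_ATTEMPTS: "4",
    MCP_RETRY_BASE_DELAY_MS: "500",
  },
  lowVolumeStrict: { MCP_BULKHEAD_LIMIT: "20" },
  devTest: {
    MCP_RETRY_MAX_ATTEMPTS: "1",
    MCP_CIRCUIT_ENABLED: "false",
    MCP_TIMEOUT_DEFAULT_MS: "5000",
  },
};

/** Merge a profile under a base environment; values already set in `base` win. */
function envForProfile(
  profile: string,
  base: Record<string, string> = {},
): Record<string, string> {
  const overrides = PROFILES[profile];
  if (overrides === undefined) throw new Error(`unknown profile: ${profile}`);
  return { ...overrides, ...base };
}
```

Letting explicit values win means an operator can still pin a single variable, for example `envForProfile("devTest", { MCP_TIMEOUT_DEFAULT_MS: "2000" })`, without editing the profile.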
Observability
When the system pushes too hard, the metrics show it:
- `mcp.bulkhead.rejected.count` rising → tighten parallelism in the agent loop, or raise the bulkhead.
- `mcp.circuit.transitions{to="open"}` firing → a route is genuinely failing; the breaker is doing its job. Investigate the route.
- `mcp.deadletter.count{error_code="rate_limited"}` → the agent exhausted retries on 429s; consider bumping `MCP_RETRY_MAX_ATTEMPTS` or batching via dedicated bulk tools (`messages_bulk_delete` instead of N individual deletes).
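The same triage can be written down as a small function, which is handy for alert annotations or a runbook script. `PressureSnapshot` and `pressureHints` are hypothetical; the sketch assumes you have already read the three counters as deltas over some interval from your metrics backend.

```typescript
// Hypothetical snapshot of counter deltas over one scrape interval.
interface PressureSnapshot {
  bulkheadRejected: number; // mcp.bulkhead.rejected.count
  circuitOpened: number; // mcp.circuit.transitions{to="open"}
  deadletterRateLimited: number; // mcp.deadletter.count{error_code="rate_limited"}
}

/** Translate counter deltas into the remediation hints described above. */
function pressureHints(s: PressureSnapshot): string[] {
  const hints: string[] = [];
  if (s.bulkheadRejected > 0) {
    hints.push("reduce agent parallelism or raise MCP_BULKHEAD_LIMIT");
  }
  if (s.circuitOpened > 0) {
    hints.push("a route is failing; investigate it before retrying");
  }
  if (s.deadletterRateLimited > 0) {
    hints.push("raise MCP_RETRY_MAX_ATTEMPTS or batch via bulk tools");
  }
  return hints;
}
```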
See Operations → Telemetry for the full metric catalog and recommended dashboards.
Source map
| Concern | File |
|---|---|
| REST adapter (cockatiel + djs) | rest/resilient.ts |
| Cockatiel policy chain | rest/policy.ts |
| Error mapping (429 → DiscordRateLimitError) | rest/errors.ts |
Related
- Operations → Resilience: retry/circuit/bulkhead env vars in detail.
- Operations → Telemetry: how to observe rate-limit pressure.
- Architecture → Error handling: `DISCORD_RATE_LIMITED`, `BULKHEAD_SATURATED`, `CIRCUIT_OPEN` recovery hints.
- `webhook-execute` recipe: using webhooks to bypass the bot bucket.