Telemetry (OpenTelemetry)
Telemetry (OpenTelemetry)
Section titled “Telemetry (OpenTelemetry)”discord-mcp ships with OpenTelemetry traces and metrics built on top of the official OTel JS SDK. This page covers how to enable telemetry, where to send the data, and what signals you get out of the box.
The SDK is disabled by default: if you do not set OTEL_ENABLED=true the
server runs with no exporter wired and zero overhead. All telemetry
configuration is environment-variable driven (no code changes needed).
Quickstart
Section titled “Quickstart”Pick the exporter that matches your environment.
Set the OTLP endpoint and your team API key as headers, then start the server:
export OTEL_ENABLED=trueexport OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.ioexport OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=YOUR_API_KEY,x-honeycomb-dataset=discord-mcpexport OTEL_SERVICE_NAME=discord-mcpexport OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
npx discord-mcpTraces appear in the Honeycomb UI under the discord-mcp dataset within
~10 seconds (the OTLP exporter batches).
Save as docker-compose.telemetry.yml:
services: jaeger: image: jaegertracing/all-in-one:1.62 ports: - "16686:16686" # UI - "4317:4317" # OTLP gRPC
otel-collector: image: otel/opentelemetry-collector-contrib:0.110.0 command: ["--config=/etc/otelcol-contrib/config.yaml"] volumes: - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml ports: - "4318:4318" # OTLP HTTP (what discord-mcp speaks) - "8888:8888" # Internal collector metrics depends_on: - jaegerotel-collector-config.yaml:
receivers: otlp: protocols: http: endpoint: 0.0.0.0:4318
processors: batch:
exporters: otlp/jaeger: endpoint: jaeger:4317 tls: insecure: true
service: pipelines: traces: receivers: [otlp] processors: [batch] exporters: [otlp/jaeger]Then:
docker compose -f docker-compose.telemetry.yml up -dexport OTEL_ENABLED=trueexport OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobufnpx discord-mcp# Jaeger UI: http://localhost:16686For dev / debugging only — no collector required:
export OTEL_ENABLED=trueexport OTEL_CONSOLE_EXPORTER=truenpx discord-mcp 2>otel.jsonlThe console exporter writes spans as JSON to stderr (stdout is reserved
for JSON-RPC frames). Pipe stderr to a file or tail it. Useful when you
don’t have a collector handy and just want to confirm that spans fire.
Available signals
Section titled “Available signals”Tool-level metrics
Section titled “Tool-level metrics”| Metric | Type | Description |
|---|---|---|
mcp.tool.duration_ms | Histogram | Wall-clock duration of each tool call (milliseconds). |
mcp.tool.calls | Counter | Total tool calls, labelled by status ∈ {ok, tool_error, error}. |
mcp.tool.errors | Counter | Subset of calls that ended in error (tool_error or thrown). |
Common labels on every metric: mcp.tool.name, mcp.tool.category,
mcp.tool.idempotent, mcp.transport, status.
Resilience metrics
Section titled “Resilience metrics”| Metric | Type | Description |
|---|---|---|
mcp.circuit.transitions | Counter | Circuit breaker state transitions (closed → open, open → half-open, half-open → closed). Labelled by from, to, route. |
mcp.bulkhead.rejected.count | Counter | Calls fast-rejected because the bulkhead semaphore was full. Labelled by route. |
mcp.deadletter.count | Counter | Calls that exhausted retries / circuit-rejected and surfaced to the client as a structured error. Labelled by tool, category, error_code. |
Every tool invocation produces an mcp.tool.<tool_name> SERVER span with:
- Standard attributes:
mcp.tool.name,mcp.tool.category,mcp.tool.idempotent,mcp.transport,mcp.request_id(if known). - Span event
mcp.tool.argswith attributemcp.args.redactedcontaining the JSON-stringified, redacted arg payload (see Audit → Privacy redaction for the policy). - Status:
OKon success,ERRORwithtool returned isErrorfor structured tool errors, or the thrown exception message for crashes.
If OTEL_ENABLED=true AND the underlying Discord REST call fires under the
active span context, you also get a child CLIENT span from
@opentelemetry/instrumentation-undici with the standard
http.request.method, url.full, http.response.status_code attributes plus
the discord.route (route-redacted, e.g. POST /channels/:id/messages).
Dashboards (text-only sketches)
Section titled “Dashboards (text-only sketches)”For Grafana + Prometheus, useful starter queries:
- Tool error rate, last 5m:
sum by (mcp.tool.name) (rate(mcp_tool_errors_total[5m])) / sum by (mcp.tool.name) (rate(mcp_tool_calls_total[5m])) - p95 tool latency:
histogram_quantile(0.95, sum by (mcp.tool.name, le) (rate(mcp_tool_duration_ms_bucket[5m]))) - Circuit open events:
sum by (route) (rate(mcp_circuit_transitions_total{to="open"}[1h])) - Bulkhead saturation:
sum by (route) (rate(mcp_bulkhead_rejected_count_total[5m]))
For Honeycomb, useful starting BubbleUp / triggers:
- Slow
mcp.tool.<name>spans (P95 > 1s). mcp.tool.errors> 10 per minute.mcp.deadletter.count> 0 (every dead-letter is a real failure the client saw).
The shipped repository does NOT include exported board JSON — every deployment has different SLOs. Use the queries above as a starting point and tune.
Sampling
Section titled “Sampling”The default trace sampler is parentbased_always_on: if an incoming trace
context is set (parent), defer to it; otherwise sample 100%. This is fine for
low-volume bot deployments. For high-volume servers, switch to:
export OTEL_TRACES_SAMPLER=parentbased_traceidratioexport OTEL_TRACES_SAMPLER_ARG=0.05 # 5% samplingMetrics are NOT affected by trace sampling — counters and histograms are always reported.
Reference: env vars
Section titled “Reference: env vars”| Var | Default | Description |
|---|---|---|
OTEL_ENABLED | false | Master switch. When false, SDK is not booted. |
OTEL_SERVICE_NAME | discord-mcp | service.name resource attribute. |
OTEL_SERVICE_VERSION | (package version) | service.version resource attribute. |
OTEL_EXPORTER_OTLP_ENDPOINT | (unset) | OTLP collector endpoint (e.g. http://localhost:4318). |
OTEL_EXPORTER_OTLP_PROTOCOL | http/protobuf | One of http/protobuf, http/json, grpc. |
OTEL_EXPORTER_OTLP_HEADERS | (unset) | Comma-separated key=value pairs (e.g. for vendor auth). |
OTEL_TRACES_SAMPLER | parentbased_always_on | One of the standard OTel samplers. |
OTEL_TRACES_SAMPLER_ARG | 1 | Ratio for ratio-based samplers. |
OTEL_CONSOLE_EXPORTER | false | When true, also writes spans to stderr as JSON (debug aid). |
Related
Section titled “Related”- Resilience — circuit/bulkhead/retry knobs that emit the resilience metrics above.
- Audit — mutating-call trail; uses the same redaction policy as span events.
- Architecture → Middleware chain — where the telemetry middleware sits in the call path.