Claude Code Adds OpenTelemetry Logging, Boosting Detection Engineers' Insight
Anthropic's Claude Code now ships with native OpenTelemetry support, emitting structured telemetry for tool calls, prompts, permission decisions and API requests, according to Brandon Lyons.
Key Facts
- Key company: Claude Code
Anthropic’s decision to embed native OpenTelemetry (OTel) support in Claude Code brings generative-AI tooling into the enterprise observability stack, giving detection engineers a data stream that connects endpoint telemetry with model-level activity. As Brandon Lyons explains, Claude Code now emits two distinct OTel streams, metrics and logs, that together cover tool invocations, prompt content, permission decisions, and API calls. The metrics stream, enabled through OTEL_METRICS_EXPORTER, aggregates counters such as session counts, token usage, and lines of code modified on a 60-second interval, a high-level view suited to spotting anomalous cost spikes. By contrast, the logs stream, enabled through OTEL_LOGS_EXPORTER, exports per-action records in near real time, every five seconds, preserving the full context of each bash command, tool decision, or error. This dual-stream architecture gives security teams the “big picture” alerts that metrics provide while retaining the granular event data needed for forensic analysis.
Configuring Claude Code for full-fidelity telemetry takes deliberate effort, a design choice Lyons emphasizes as a guard against accidental exposure. Both exporters are disabled by default; administrators must set CLAUDE_CODE_ENABLE_TELEMETRY=1 and explicitly point OTEL_EXPORTER_OTLP_ENDPOINT at a collector, typically a gRPC endpoint like http://collector.company.com:4317. The recommended deployment model pushes a managed-settings JSON file to /Library/Application Support/ClaudeCode/managed-settings.json on macOS (or the equivalent path on other platforms), so that end users cannot override the telemetry flags. An additional flag, OTEL_LOG_USER_PROMPTS=1, captures raw prompt content, a capability that Lyons notes is valuable for detecting prompt-injection attempts that leave no downstream behavioral trace. By centralizing these settings, enterprises can enforce consistent observability without relying on developers to opt in voluntarily.
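As a rough illustration of the configuration described above, the following Python sketch renders a managed-settings document. The environment variable names and the endpoint come from the article; the top-level `env` key, the `otlp` exporter values, and the use of OTEL_EXPORTER_OTLP_PROTOCOL are assumptions about how such a file is typically laid out, and `build_managed_settings` is a hypothetical helper, so check the result against the product documentation before deploying it.

```python
import json


def build_managed_settings(endpoint: str, log_prompts: bool = False) -> dict:
    """Return a settings dict that turns on both OTel exporters.

    Assumption: the managed settings file accepts an "env" object whose
    entries are applied as environment variables.
    """
    env = {
        "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
        # Assumption: "otlp" selects the OTLP exporter for each stream.
        "OTEL_METRICS_EXPORTER": "otlp",
        "OTEL_LOGS_EXPORTER": "otlp",
        # Assumption: the collector speaks gRPC, as in the article's example.
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
    }
    if log_prompts:
        # Raw prompt capture is opt-in because prompts may contain sensitive text.
        env["OTEL_LOG_USER_PROMPTS"] = "1"
    return {"env": env}


settings = build_managed_settings(
    "http://collector.company.com:4317", log_prompts=True
)
rendered = json.dumps(settings, indent=2)
print(rendered)
```

The rendered JSON would then be distributed to the managed-settings path through whatever device-management tooling the organization already uses.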
The practical impact of this telemetry becomes clear when comparing traditional endpoint detection and response (EDR) data with Claude Code’s enriched logs. An EDR system will show a cascade of child processes—bash shells, keychain accesses, Git commands—but it lacks the narrative that explains why those commands were issued. Lyons illustrates a scenario where a detection engineer investigates a Claude‑driven setup of a new GitLab environment. Without OTel logs, the sequence of keychain lookups and bash history searches appears suspicious in isolation. With Claude Code’s logs, the engineer can trace each command back to the originating prompt, the tool chain that executed it, and the authorization source that granted permission, turning a false positive into a verified, legitimate automation run.
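The tracing step in that scenario can be sketched in a few lines of Python. The record layout here (`session_id`, `ts`, `event`, `detail`) and the sample values are hypothetical simplifications invented for illustration, not the actual log schema; the idea is only that each action is attributed to the most recent prompt logged in the same session.

```python
from collections import defaultdict

# Hypothetical, simplified log records; real OTel log records carry more fields.
events = [
    {"session_id": "s1", "ts": 1, "event": "user_prompt",
     "detail": "Set up the new GitLab environment"},
    {"session_id": "s1", "ts": 2, "event": "tool_decision",
     "detail": "Bash accepted"},
    {"session_id": "s1", "ts": 3, "event": "tool_result",
     "detail": "Bash: keychain lookup"},
    {"session_id": "s2", "ts": 1, "event": "tool_result",
     "detail": "Bash: history search"},
]


def attribute_to_prompts(records):
    """Map each non-prompt event to the latest prompt in the same session."""
    last_prompt = {}
    attributed = defaultdict(list)
    # Sort so that events are processed in time order within each session.
    for rec in sorted(records, key=lambda r: (r["session_id"], r["ts"])):
        sid = rec["session_id"]
        if rec["event"] == "user_prompt":
            last_prompt[sid] = rec["detail"]
        else:
            prompt = last_prompt.get(sid, "<no prompt logged>")
            attributed[(sid, prompt)].append(rec["detail"])
    return dict(attributed)


timeline = attribute_to_prompts(events)
for (sid, prompt), actions in timeline.items():
    print(sid, "|", prompt, "->", actions)
```

Actions that land under `<no prompt logged>` are the ones worth a closer look, since nothing in the session explains why they ran.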
While the addition of OTel logging fills a critical visibility gap, Lyons cautions that the data is not a silver bullet. Metrics, by design, aggregate over time and can mask the precise order of events. That makes them useful for flagging outliers, such as a session cost that spikes an order of magnitude above a user’s baseline, but insufficient for root-cause analysis. Conversely, logs provide the necessary detail but generate a higher volume of data that must be ingested, stored, and correlated with existing security information and event management (SIEM) pipelines. Lyons recommends a hybrid approach: enable both exporters, use metrics to surface anomalies, and then drill down into the corresponding logs for context. He also notes that the current implementation does not capture every possible internal state of the model, leaving a narrow blind spot for attacks that manipulate the model’s reasoning without triggering an explicit tool call or API request.
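The hybrid workflow can be illustrated with a minimal Python sketch: flag sessions whose cost exceeds ten times the user's median historical cost, then pull only the log records for those sessions. The data shapes, sample figures, the choice of median as the baseline, and the helper names `flag_cost_outliers` and `logs_for_sessions` are all assumptions made for the example rather than anything prescribed by Lyons or by Claude Code.

```python
from statistics import median


def flag_cost_outliers(history, current, factor=10.0):
    """Return session ids whose cost exceeds factor x the user's median cost."""
    flagged = []
    for session_id, (user, cost) in current.items():
        past = history.get(user, [])
        if not past:
            continue  # no baseline yet, so nothing to compare against
        if cost > factor * median(past):
            flagged.append(session_id)
    return flagged


def logs_for_sessions(log_records, session_ids):
    """Drill down: keep only the log records for the flagged sessions."""
    wanted = set(session_ids)
    return [rec for rec in log_records if rec["session_id"] in wanted]


# Hypothetical per-user cost history (from metrics) and current sessions.
history = {"alice": [0.40, 0.55, 0.50], "bob": [1.20, 0.90, 1.10]}
current = {"s-100": ("alice", 6.10), "s-101": ("bob", 1.30)}

# Hypothetical log records keyed by session.
log_records = [
    {"session_id": "s-100", "event": "tool_result"},
    {"session_id": "s-101", "event": "api_request"},
    {"session_id": "s-100", "event": "api_request"},
]

flagged = flag_cost_outliers(history, current)
context = logs_for_sessions(log_records, flagged)
print(flagged, len(context))
```

Keeping the two steps separate mirrors the division of labor in the text: the cheap aggregate check runs continuously, and the expensive log query runs only for sessions it flags.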
Looking ahead, the industry is watching how Claude Code’s telemetry will influence the broader AI-observability landscape. VentureBeat’s recent coverage of the enterprise observability market, which highlighted the rivalry between Dynatrace and New Relic, underscores a growing appetite for standardized, high-resolution data from AI workloads. By adopting OpenTelemetry, a widely supported open-source standard, Anthropic positions Claude Code to integrate with existing observability stacks, potentially setting a precedent for other AI platform providers. If detection engineers can reliably correlate Claude Code’s logs with traditional endpoint data, the result could be a more cohesive security posture that treats AI agents as first-class citizens in the threat model rather than as opaque black boxes.
Sources
Reporting based on verified sources and public filings. Sector HQ editorial standards require multi-source attribution.