Claude Powers New Security Scanner, Tackles LLM “p‑hacking” with Local Code Skills

8% of free Claude Skills harbor security flaws, according to recent findings, prompting Anthropic to roll out a new scanner that leverages Claude’s local code abilities to curb LLM “p‑hacking.”

Key Facts

•Key company: Claude

Anthropic’s response to the security gap in its free Claude Skills marketplace comes in the form of a locally‑run scanner that leverages the same “Claude Code” execution environment that powers its enterprise‑grade security offering. In a March 13 post on Hacker News, developer ayame0328 detailed how they built a hybrid scanner that combines static pattern matching with Claude’s semantic analysis, using only the SKILL.md definition of each public skill — a method that mirrors Anthropic’s internal approach but is fully customizable by independent developers [report]. The author notes that the scanner’s real breakthrough is its ability to mitigate “p‑hacking,” the phenomenon where identical prompts yield divergent outputs, by employing multi‑stage self‑verification and quantitative confidence scoring, features previously reserved for the paid Enterprise version [report].

The need for such a tool is underscored by recent findings that 8 % of free Claude Skills contain exploitable security flaws, a figure that aligns with Snyk’s broader research indicating that roughly 36.8 % of publicly available Skills suffer from similar issues [report]. Anthropic’s own documentation admits that the marketplace lacks a formal review process and that “security verification of SKILL.md is not performed,” leaving the onus of vetting on end users [report]. By offering a scanner that runs locally, Anthropic sidesteps the latency and privacy concerns of cloud‑based analysis while giving developers the ability to audit code before deployment, a capability that could become a de‑facto standard for LLM‑driven extensions.

From a product‑strategy perspective, the move differentiates Anthropic’s free tier from its Enterprise offering, which already includes a proprietary detection rule set defined internally by Anthropic and a self‑verification pipeline that produces a standardized report format [report]. The Skills‑based scanner, by contrast, lets users define their own detection rules, customize report outputs, and handle false positives through iterative verification loops. While the Enterprise version remains priced as part of Anthropic’s team plans, the open‑source scanner democratizes access to high‑grade security tooling without the associated cost [report].

Industry observers have noted that Anthropic’s recent model upgrades—Claude Opus 4.6 with a 1 million‑token context window and “agent teams” capabilities—are aimed at closing the functional gap with OpenAI’s Codex suite [VentureBeat][ZDNet][CNET]. The addition of a local security scanner complements these enhancements by addressing a critical non‑functional requirement: safety. As enterprises increasingly embed LLMs into CI/CD pipelines, the ability to automatically flag and remediate “p‑hacking” vulnerabilities could become a decisive factor in vendor selection. Anthropic’s strategy of bundling advanced code‑execution features with a self‑service security layer may thus reinforce its positioning in the emerging AI‑augmented development market.

Claude Powers New Security Scanner, Tackles LLM “p‑hacking” with Local Code Skills

Key Facts

Sources

🏢Companies in This Story

Related Stories