How AI Agents Chain Vulnerabilities: From Recon to Root

Revelion Team · 15 March 2026 · 8 min read

Vulnerability chaining is how real attackers operate. They don't find a single critical vulnerability and call it a day. They piece together multiple low- and medium-severity findings into a complete attack path that ends in full compromise. This is the gap that traditional scanners cannot bridge: they test each parameter, each endpoint, each configuration in isolation. Revelion's agent architecture was built specifically to close that gap. Here are three patterns we see repeatedly, where individual findings look unremarkable but the chain between them is devastating.

Chain 1: Information Disclosure to Privilege Escalation

The recon agent begins by mapping the target's HTTP response headers and fingerprinting the technology stack. In this pattern, the server returns an X-Powered-By: Express 4.17.1 header alongside a Server: nginx/1.18.0 header. On its own, this is a low-severity information disclosure finding. Most scanners flag it, recommend removing the header, and move on. CVSS 2.1 at best. Nobody is losing sleep over it.

But Revelion's recon agent doesn't just flag the header. It propagates the version data to a shared context layer accessible by all active agents. The CVE correlation agent picks up the Express version, cross-references it against known vulnerabilities, and identifies a prototype pollution vulnerability affecting that specific release. Now the injection agent has a concrete hypothesis to test: not a generic checklist item, but a targeted exploit path informed by the actual software version running in production.
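The handoff from recon to CVE correlation can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not Revelion's implementation: the SharedContext class, the key naming, and the single advisory entry (including its ID and version bound) are all invented placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """Hypothetical stand-in for the shared context layer."""
    facts: dict = field(default_factory=dict)

    def publish(self, key: str, value: str) -> None:
        self.facts[key] = value


# Placeholder advisory data. The ID and the version bound are invented for
# illustration and do not describe any real advisory.
ADVISORIES = [
    {
        "id": "EXAMPLE-0001",
        "product": "express",
        "affected_below": (4, 18, 0),
        "class": "prototype pollution",
    },
]


def parse_version(text: str) -> tuple:
    """Turn '4.17.1' into (4, 17, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def correlate(context: SharedContext) -> list:
    """Return advisories whose affected range covers a published version."""
    hypotheses = []
    for advisory in ADVISORIES:
        version = context.facts.get(f"version:{advisory['product']}")
        if version and parse_version(version) < advisory["affected_below"]:
            hypotheses.append(advisory)
    return hypotheses


context = SharedContext()
# The recon agent publishes what it parsed from the response headers.
context.publish("version:express", "4.17.1")
context.publish("version:nginx", "1.18.0")

hypotheses = correlate(context)
print([a["class"] for a in hypotheses])
```

The point of the sketch is the data flow: the header value stops being a dead-end finding once another component can read it and turn it into a testable hypothesis.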

The injection agent confirms the prototype pollution is exploitable by crafting a payload that modifies Object.prototype through a JSON merge operation on an API endpoint that accepts nested objects. The polluted prototype introduces an isAdmin: true property that the application's authorization middleware checks without hasOwnProperty validation. The agent now has authenticated access to administrative endpoints.
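On the defensive side, the payload shape described here is easy to recognise before it reaches a merge routine. The following Python sketch walks a parsed JSON body and reports keys that a JavaScript deep merge could copy onto the prototype chain. The function name and the sample request body are assumptions made for illustration; the real fix in a Node.js application would live in the merge code and in the hasOwnProperty check the middleware skips.

```python
import json

# Keys that can reach the prototype chain when a JavaScript deep merge
# copies them without filtering.
DANGEROUS_KEYS = {"__proto__", "constructor", "prototype"}


def find_pollution_paths(value, path=""):
    """Return the path of every dangerous key found in parsed JSON."""
    hits = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            if key in DANGEROUS_KEYS:
                hits.append(child_path)
            hits.extend(find_pollution_paths(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_pollution_paths(child, f"{path}[{index}]"))
    return hits


# Hypothetical request body with a nested object, as in the chain above.
body = json.loads('{"profile": {"theme": "dark", "__proto__": {"isAdmin": true}}}')
paths = find_pollution_paths(body)
print(paths)
```

Python's json module treats "__proto__" as an ordinary dictionary key, which is what makes this kind of inspection safe to run outside the JavaScript runtime.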

From the admin panel, the privilege escalation agent discovers a server configuration endpoint that allows modifying environment variables. It writes a reverse shell payload into a scheduled task configuration, achieving code execution on the underlying host. Total chain: information disclosure, CVE correlation, prototype pollution, authorization bypass, privilege escalation, remote code execution.

A scanner would have produced four separate findings across different severity tiers. The information disclosure would sit in the “Low” bucket, probably filtered out of the executive summary entirely. The prototype pollution, if detected at all, would appear as a “Medium” with a note about theoretical impact. None of the findings would reference each other. The actual critical path, from a leaked version header to root-level code execution, would be invisible.

Chain 2: Forgotten Staging Server to Production Database

The recon agent runs subdomain enumeration using DNS brute-forcing, certificate transparency log analysis, and response fingerprinting. It discovers staging-admin.target.com, a subdomain that resolves to a live host running what appears to be an older build of the application. The staging environment was deployed for a feature branch six months ago and never decommissioned. It still has network connectivity to internal services, including the production database cluster, because the staging VPC peering was never torn down.

The authentication agent tests the staging admin panel and finds it accepts the default credentials admin:admin. The password was never changed because the staging environment was considered “internal only,” even though the DNS record is publicly resolvable and the host has no IP allowlisting. This finding requires context to understand. On its face it is “default credentials on an admin panel,” which a scanner can detect. But the scanner cannot determine that this admin panel has a trust relationship with production infrastructure.

Once authenticated, the file upload agent identifies that the admin panel includes a media upload feature with insufficient validation. The application checks the file extension against a blocklist (.php, .exe, .sh) rather than an allowlist, and it does not validate the file content against the declared MIME type. The agent uploads a PHP web shell with a .phtml extension, which the blocklist does not cover. Apache processes the file as PHP, and the agent has code execution on the staging server.
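The difference between the two validation strategies fits in a few lines of Python. This is a sketch, not the target application's code: the blocklist entries come from the description above, while the allowlist contents and the filenames are assumptions chosen for a media uploader.

```python
from pathlib import PurePosixPath

BLOCKLIST = {".php", ".exe", ".sh"}            # extensions named in the text
ALLOWLIST = {".png", ".jpg", ".jpeg", ".gif"}  # assumed set for a media uploader


def extension(filename: str) -> str:
    """Return the final extension, lowercased, including the dot."""
    return PurePosixPath(filename).suffix.lower()


def passes_blocklist(filename: str) -> bool:
    # Accepts anything that is not explicitly forbidden.
    return extension(filename) not in BLOCKLIST


def passes_allowlist(filename: str) -> bool:
    # Accepts only what is explicitly permitted.
    return extension(filename) in ALLOWLIST


upload = "shell.phtml"
print(passes_blocklist(upload), passes_allowlist(upload))
```

The blocklist accepts the .phtml file because nobody thought to list it; the allowlist rejects it without needing to know it exists. Extension checks alone are still incomplete, since the text also notes that content is never compared against the declared MIME type.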

From the staging server, the lateral movement agent enumerates the network. It discovers that the staging environment's database configuration file contains credentials for the production PostgreSQL cluster, because the staging deployment script copied the production .env file and nobody rotated the credentials afterward. The agent connects to the production database and confirms read/write access to all tables, including user records, payment information, and API keys.

Each finding in this chain is individually medium severity at most. Forgotten subdomain: informational. Default credentials: medium. File upload bypass: medium. Credential reuse: medium. But the chain produces a critical outcome: unauthenticated access from the public internet to the production database, through a path that no scanner would ever construct because it requires understanding relationships between assets, not just testing each asset independently.

Chain 3: Verbose API Errors to Full Compromise

The recon agent discovers an API endpoint at /api/v2/users/search that returns detailed error messages when given malformed input. Passing a single quote in the q parameter triggers a stack trace that reveals the database engine (MySQL 8.0), the ORM in use (Sequelize), the table schema (users table with columns id, email, password_hash, role, api_key), and the query structure. A scanner flags “verbose error messages” and “possible SQL injection.” Two separate findings. Neither references the other.

Revelion's injection agent uses the leaked schema information to construct a targeted SQL injection payload. Because it knows the exact table and column names, it skips the enumeration phase that manual exploitation typically requires. It uses a UNION-based injection to extract the api_key column for all users with role='admin'. The extracted API keys are valid for the application's REST API, which uses stateless token authentication.
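To make the mechanics concrete, here is a self-contained Python sketch that reproduces the pattern against an in-memory SQLite database. Several things are assumptions: SQLite stands in for MySQL, the vulnerable query text is a guess at what a search endpoint might build, and the rows and key values are made up. Only the table and column names come from the leaked schema described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER, email TEXT, password_hash TEXT, role TEXT, api_key TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "alice@example.com", "x", "user", "key-user-1"),
        (2, "root@example.com", "x", "admin", "key-admin-1"),
    ],
)


def search_unsafe(q: str) -> list:
    # Vulnerable: user input is concatenated into the SQL text.
    sql = f"SELECT id, email FROM users WHERE email LIKE '%{q}%'"
    return conn.execute(sql).fetchall()


def search_safe(q: str) -> list:
    # Parameterised: the driver treats q strictly as data.
    return conn.execute(
        "SELECT id, email FROM users WHERE email LIKE ?", (f"%{q}%",)
    ).fetchall()


# Knowing the column names up front means no enumeration step is needed.
payload = "zzz' UNION SELECT id, api_key FROM users WHERE role='admin' --"
leaked = search_unsafe(payload)
blocked = search_safe(payload)
print(leaked, blocked)
```

The same input returns an admin API key through the concatenated query and nothing through the parameterised one, which is why fixing the query construction breaks the chain at this link regardless of what the error messages reveal.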

The credential reuse agent takes the extracted admin API key and tests it against other services running on the same infrastructure. It discovers that the same API key authenticates against an internal microservice at /internal/deployment that was never intended to be externally accessible but is reachable due to a misconfigured ingress rule. This deployment service accepts arbitrary container image references, giving the agent the ability to deploy a malicious container into the production Kubernetes cluster.

The chain: verbose error messages (informational) led to targeted SQL injection (high), which yielded credential extraction (high), which enabled lateral movement through credential reuse (medium), which reached an internal service (critical). The finding that a scanner would have missed entirely is the credential reuse across services, because that requires actually extracting the credentials and testing them elsewhere. Scanners do not do this. They cannot, because it requires multi-step exploitation with contextual reasoning at each handoff.

The Agent Handoff Mechanism

What makes these chains possible is Revelion's shared context architecture. Every agent writes its findings to a real-time discovery graph that all other agents can read. When the recon agent discovers a new subdomain, the authentication agent picks it up within seconds and starts testing. When the injection agent confirms a SQL injection, the credential extraction agent immediately begins formulating queries. There is no queue, no batch processing, no waiting for one scan phase to complete before the next begins.
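A publish/subscribe structure is one simple way to picture this kind of handoff. The Python sketch below is an assumption-laden illustration, not Revelion's architecture: the Finding and DiscoveryGraph classes, the finding kinds, and the agent functions are hypothetical, and the agents' results are simulated rather than produced by real tests.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    kind: str    # e.g. "subdomain", "default_credentials"
    target: str
    source: str  # name of the agent that produced the finding


class DiscoveryGraph:
    """Hypothetical shared store that notifies agents as findings arrive."""

    def __init__(self):
        self.findings = []
        self._subscribers = defaultdict(list)

    def subscribe(self, kind, handler):
        self._subscribers[kind].append(handler)

    def publish(self, finding):
        self.findings.append(finding)
        # Interested agents are called immediately; there is no batch phase.
        for handler in self._subscribers[finding.kind]:
            handler(finding)


graph = DiscoveryGraph()


def auth_agent(finding):
    # Simulated result: pretend the default-credential check succeeded.
    graph.publish(Finding("default_credentials", finding.target, "auth"))


def upload_agent(finding):
    # Simulated result: queue an upload probe behind the new login.
    graph.publish(Finding("upload_probe_queued", finding.target, "upload"))


graph.subscribe("subdomain", auth_agent)
graph.subscribe("default_credentials", upload_agent)
graph.publish(Finding("subdomain", "staging-admin.target.com", "recon"))

print([f.kind for f in graph.findings])
```

A single recon finding triggers two further steps without any scheduler deciding that a "phase" has ended, which is the behaviour the paragraph above describes.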

Each discovery is tagged with a priority score that accounts for both its individual severity and its potential contribution to an attack chain. A verbose error message that leaks schema information scores higher than a verbose error message that only reveals a framework version, because the schema leak has more downstream exploitation potential. The priority scoring is dynamic: it updates as new findings are added to the graph. If an agent discovers default credentials on an admin panel, the priority of any related file upload or configuration endpoints immediately increases because the chain potential just went up.
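One way such a score could behave is sketched below in Python. The severity weights, the ENABLES relationships, and the bonus value are all assumptions invented for the example; the source does not describe Revelion's actual formula.

```python
# Assumed numeric weights per severity tier.
SEVERITY = {"informational": 1, "low": 2, "medium": 5, "high": 8, "critical": 10}

# Assumed chain relationships: finding kind -> kinds it makes more valuable.
ENABLES = {
    "default_credentials": {"file_upload", "config_endpoint"},
    "schema_leak": {"sql_injection"},
}

CHAIN_BONUS = 3  # assumed weight added per enabling finding already present


def priorities(findings):
    """Score each finding as severity plus a bonus per enabling finding."""
    kinds = {f["kind"] for f in findings}
    scores = {}
    for f in findings:
        enablers = [k for k in kinds if f["kind"] in ENABLES.get(k, set())]
        scores[f["kind"]] = SEVERITY[f["severity"]] + CHAIN_BONUS * len(enablers)
    return scores


findings = [{"kind": "file_upload", "severity": "medium"}]
before = priorities(findings)["file_upload"]

# A new finding arrives and the existing one is rescored.
findings.append({"kind": "default_credentials", "severity": "medium"})
after = priorities(findings)["file_upload"]

print(before, after)
```

The file upload finding itself has not changed between the two calls; only the evidence around it has, and the score moves accordingly.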

This is fundamentally different from how scanners operate. A scanner runs its checks in a predetermined order, often parallelised for speed, but each check is independent. Check #247 does not know what check #83 found. There is no shared state, no prioritisation based on cumulative evidence, and no ability to construct multi-step attack paths. The scanner produces a flat list of findings. Revelion produces an attack graph.
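The contrast between a flat list and a graph can be shown with a short Python sketch. The nodes below paraphrase the steps of Chain 2; the edge structure and the function name are assumptions made for illustration, and a plain breadth-first search stands in for whatever path construction Revelion actually uses.

```python
from collections import deque

# Each edge means: holding the position on the left, an attacker can reach
# the position on the right. Node names paraphrase Chain 2.
EDGES = {
    "internet": ["staging-admin.target.com"],
    "staging-admin.target.com": ["staging admin panel"],
    "staging admin panel": ["staging server shell"],
    "staging server shell": ["production database"],
}


def attack_path(graph, start, goal):
    """Breadth-first search; returns the shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = attack_path(EDGES, "internet", "production database")
print(" -> ".join(path))
```

A flat findings list holds the same four facts, but without the edges there is nothing to search, so the route from the internet to the database never appears in the output.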

The practical difference is that Revelion reports what an attacker can actually achieve, not what vulnerabilities exist in isolation. A report that says “we achieved unauthenticated access to your production database starting from a publicly resolvable staging subdomain” drives a fundamentally different remediation conversation than a report that lists “default credentials (medium),” “file upload bypass (medium),” and “credential reuse (medium)” as three separate items on page 47.

Why Scanners Miss These Chains

The gap is not a limitation of scanner quality. It is a limitation of scanner architecture. Scanners are designed to test individual parameters against known vulnerability signatures. They are very good at this. But vulnerability chaining requires three capabilities that scanners structurally lack.

First, shared state across test phases. A scanner's port scan module does not share findings with its web application module in a way that enables exploitation chains. Revelion's agents share a live context graph where every finding is immediately available to every other agent.

Second, adaptive exploitation. When a scanner's SQL injection test fails, it moves on. It does not pivot to testing whether the error message it received reveals schema information that could be used differently. Revelion's agents treat partial results as inputs, not dead ends.

Third, cross-asset correlation. A scanner tests each host and each endpoint as an independent target. It cannot determine that credentials found on Host A also work on Host B, because it never tries. Revelion's credential reuse agent systematically tests discovered credentials against all known services, mapping trust relationships across the entire attack surface.
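Cross-asset correlation reduces to a nested loop once credentials and services live in the same place. The Python sketch below is illustrative only: the function name, the hostnames other than /internal/deployment, the key values, and the simulated_auth stub (which replaces real network requests) are all assumptions.

```python
def map_credential_reuse(credentials, services, try_auth):
    """Try every discovered credential against every known service."""
    return [
        (credential, service)
        for credential in credentials
        for service in services
        if try_auth(service, credential)
    ]


# Simulated environment: which (service, credential) pairs are accepted.
ACCEPTED = {
    ("api.target.com", "key-admin-1"),
    ("/internal/deployment", "key-admin-1"),
}


def simulated_auth(service, credential):
    # Stand-in for a real authentication attempt against the service.
    return (service, credential) in ACCEPTED


reuse = map_credential_reuse(
    ["key-admin-1", "key-user-1"],
    ["api.target.com", "/internal/deployment", "billing.target.com"],
    simulated_auth,
)
print(reuse)
```

Each returned pair is a trust relationship between two assets, which is information that per-host testing cannot produce because it never carries a credential from one host to the next.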

To understand the full agent architecture behind these exploitation chains, read our guide to autonomous AI pentesting. For a direct comparison of what AI pentesting catches versus traditional scanning tools, see AI pentesting vs vulnerability scanning.

Run Revelion against your attack surface. Start free with 20,000 credits.
