AI & LLM Applications

LLM-powered apps, chatbots, agents, and MCP servers are now on most major programs' scopes. HackerOne reported valid AI findings up 210% in 2025 with prompt injection alone up 540%, and 1,121 programs explicitly listed AI in scope - a 270% jump year over year. The attack surface is different enough from a classic web app that testing needs its own methodology.

The pattern worth internalising: an LLM is a parser that treats text as instructions, and every channel feeding text into it is a potential injection point. That parser also controls tools with real-world side effects - file reads, API calls, email sends, code execution in MCP servers. Bugs that cross a trust boundary between those two sides pay. Clever prompt tricks that don't reach a privileged action almost never do.

Attack Surface Map

flowchart TD
    U["User / Attacker"] --> A["Application Layer"]
    A --> G["Guardrail / Gateway"]
    G --> M["LLM / Model"]
    M --> T["Tools & Functions"]
    M --> O["Output to User"]
    R["RAG / Vector DB"] --> M
    D["Docs, Emails, Web<br/>untrusted content"] --> R
    T --> E["External APIs<br/>Gmail, GitHub, S3, shells"]

    U -.- X1["ATTACK: Direct prompt injection"]
    D -.- X2["ATTACK: Indirect prompt injection"]
    T -.- X3["ATTACK: Tool-call hijack / confused deputy"]
    R -.- X4["ATTACK: RAG poisoning, cross-tenant leak"]
    O -.- X5["ATTACK: Unsafe output to downstream XSS/SSRF/RCE"]

Where the Bugs Live

Input plane - what the model reads. Direct user prompts, system prompts, tool descriptions, function schemas. Classic prompt injection sits here, and the subset that pays is when the injection lets you act as a different principal.

Data plane - what the model retrieves. RAG documents, vector databases, memory, past conversation, file attachments, web pages the agent browses. Indirect prompt injection lives here. The attacker never types anything into the model; they plant instructions in content the victim tells the model to process.

Control plane - what the model can do. Tools, MCP servers, OAuth-connected APIs, shell access. This is the blast-radius layer. A zero-impact jailbreak becomes a critical when the model has tools that email, write code, or hit production APIs.

What Pays vs. What Gets Closed

Programs vary, but the pattern across HackerOne, Bugcrowd, Intigriti, and direct programs is consistent:

Reliably paid:

Cross-tenant data access via prompt injection - reading another user's conversation, RAG content, or memory
RCE or shell access via an MCP server or tool-call hijack
Private key, token, or secret leakage from the system prompt or retrieved context
Privilege escalation where an agent performs an action the user could not (admin API calls, billing changes)
Supply-chain compromise of MCP servers or plugins

Usually closed as informational:

"I made the chatbot say a swear word"
Generic jailbreaks that don't cross a trust boundary
Model hallucinations with no security impact
Token exhaustion / cost-bombing unless the program explicitly scopes it
Outputs an unauthenticated user could have generated themselves

Read each program's AI-specific scope. Many now carve out a specific harness (the production app, not the raw model API), specific boundaries (tenant isolation, tool invocation limits), and explicitly out-of-scope categories (bias, toxicity, content policy).

BugBounty.info

Explorer

AI & LLM Applications

AI & LLM Applications

Attack Surface Map

Where the Bugs Live

What Pays vs. What Gets Closed

Section Pages

Direct Prompt Injection

Indirect Prompt Injection

MCP Vulnerabilities

Agent Abuse

Training Data & Memory Leaks

RAG & Vector DB Attacks

Output Handling Flaws

Model DoS & Resource Abuse

See Also

Direct Prompt Injection

Indirect Prompt Injection

MCP Vulnerabilities

Model DoS & Resource Abuse

Output Handling Flaws

RAG & Vector DB Attacks

Training Data & Memory Leaks

Agent Abuse