Blue41, fresh off an RSAC Launch Pad win, showed how a two-cent bank transfer could weaponize Bunq's AI assistant against its own customers. The Belgian security startup is betting that prompt injection is less a model problem than an application security market waiting to open up.

The attack cost two cents. That was the entire budget required to turn the AI assistant inside one of Europe's largest digital banks into a phishing delivery system. No malware, no stolen device, no convincing a victim to click anything. Just a €0.02 bank transfer with a carefully written message attached to it.
That demonstration, run by the security startup Blue41 against Bunq, is the kind of proof of concept that sells a category before it sells a product. Bunq serves more than 20 million customers and ranks as Europe's second-largest digital bank. Its AI assistant does what most banking assistants now do: answer plain-language questions like "show me my recent transactions" by pulling account data into a large language model and summarizing it. Blue41's researchers, Thomas Vissers and Tim Van hamme, found that the transaction description field, the same line that normally reads "Coffee" or "Rent," could carry instructions the model would obey.
The problem the company is selling against
The vulnerability has a name that has been circulating in AI security circles for a couple of years now: indirect prompt injection. The direct version is the familiar one, where a user types "ignore your previous instructions" into a chatbot. The indirect version is sneakier and harder to defend. The malicious instruction is not typed by the person talking to the assistant. It is hidden inside data the assistant retrieves on its own, later, to do its job.
In the Bunq case the sequence was almost boringly simple. An attacker sends the target a tiny transfer and writes the payload into the description. The victim, who has no idea any of this happened, later opens the app and asks the assistant a routine question. To answer it, the assistant fetches recent transactions, including the poisoned one, and hands the whole batch to the model as context. The model reads the injected text not as a record of a payment but as a command. In Blue41's controlled test, the assistant then produced what looked like a legitimate reauthentication request from the bank, a spearphishing message rendered inside the bank's own app, in the bank's own voice, referencing the user's real account details.
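To make the mechanics concrete, here is a minimal sketch of how a free-text transaction field can end up inside a model prompt. It is not Bunq's implementation, which the source does not describe; the `Transaction` class, the `build_context` function, the prompt wording, and the sample records are all hypothetical, and the payload is left as a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount_eur: float
    counterparty: str
    description: str  # free text chosen by the sender, not by the bank


def build_context(question: str, transactions: list[Transaction]) -> str:
    """Naively concatenate retrieved records into a single prompt string.

    Hypothetical example: every description is pasted in verbatim, so the
    model sees sender-controlled text next to the system instructions.
    """
    lines = [
        f"{t.amount_eur:.2f} EUR from {t.counterparty}: {t.description}"
        for t in transactions
    ]
    return (
        "You are a banking assistant. Answer using the records below.\n"
        "Recent transactions:\n"
        + "\n".join(lines)
        + f"\nUser question: {question}"
    )


history = [
    Transaction(4.50, "Cafe Central", "Coffee"),
    # The two-cent transfer: its description is whatever the attacker typed.
    Transaction(0.02, "Unknown sender", "[attacker-written instructions would go here]"),
]

prompt = build_context("show me my recent transactions", history)
print(prompt)
```

Nothing in the assembled string tells the model which parts came from the bank and which came from a stranger, which is the gap the attack relies on.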
That last detail is what makes the scenario worth paying attention to. A phishing email has to fake legitimacy. A compromised banking assistant already has it. It is sitting inside a trusted application, with access to real transaction history, and it can make a fraudulent prompt feel personal and timely in a way no spoofed email ever could.
Why guardrails did not catch it
Bunq was not running its assistant naked. It had guardrails, the input filters and prompt-injection classifiers that have become standard advice for anyone shipping an LLM feature. The attack worked anyway, and the reason is the part of the story that matters most for the security teams Blue41 is courting.
The payload did not announce itself. It did not contain a jailbreak phrase or anything a classifier would flag with confidence when reading the transaction line in isolation. It was written to look like ordinary transaction data and only became dangerous once the assistant retrieved it, dropped it into context alongside everything else, and generated a response. The danger lived in the interaction between untrusted data, retrieval logic, model behavior, and the assistant's available actions, not in any single string a filter could scan.
This is Blue41's core pitch, and it is a reasonable one. Static text classification assumes you can decide whether a piece of input is malicious by looking at the input. Indirect prompt injection breaks that assumption. The company's argument is that defense has to move to runtime: watch what the assistant actually does, build a behavioral profile of how it normally operates, and flag when it starts embedding external URLs, accessing unexpected data sources, or calling tools in ways that do not match its usual pattern.
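What a runtime check of this kind might look like can be sketched in a few lines. This is an illustration only, not Blue41's product, whose internals the source does not describe. It swaps real behavioral profiling for a much simpler stand-in: a fixed allowlist of link hosts and tool names. `BehaviorProfile`, `review_response`, and the example hosts are all assumptions.

```python
import re
from dataclasses import dataclass, field

# Captures the host portion of any http(s) URL in the assistant's output.
URL_HOST_PATTERN = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)


@dataclass
class BehaviorProfile:
    """Hypothetical baseline of what the assistant normally does."""
    allowed_link_hosts: set[str] = field(default_factory=set)
    expected_tools: set[str] = field(default_factory=set)


def review_response(
    profile: BehaviorProfile, response_text: str, tools_called: list[str]
) -> list[str]:
    """Return findings for output or tool use that falls outside the profile."""
    findings = []
    for host in URL_HOST_PATTERN.findall(response_text):
        if host.lower() not in profile.allowed_link_hosts:
            findings.append(f"unexpected link host: {host.lower()}")
    for tool in tools_called:
        if tool not in profile.expected_tools:
            findings.append(f"unexpected tool call: {tool}")
    return findings


profile = BehaviorProfile(
    allowed_link_hosts={"bank.example"},
    expected_tools={"list_transactions"},
)

suspicious = review_response(
    profile,
    "Please reauthenticate at https://login.evil.example/reauth",
    ["list_transactions"],
)
print(suspicious)
```

The check looks at what the assistant produced rather than at the text it was fed, so a payload that reads as ordinary transaction data can still be caught once it changes the response.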
The market thesis underneath the case study
Strip away the demonstration and what Blue41 is really doing is staking out a position in AI application security, a field that did not exist as a budget line a few years ago and is now scrambling to become one. The company won the RSA Conference Launch Pad earlier this year, the kind of credential that opens doors with exactly the enterprise buyers it needs: banks and financial institutions deploying assistants into customer-facing workflows.
The thesis is that the injection surface in finance is enormous and mostly unguarded. Transaction descriptions, payment references, merchant metadata, support messages, uploaded documents, emails, CRM notes, all of these are fields that were designed as data and are now being fed to models that treat text as potential instruction. The more capable the assistant becomes, especially once it can call tools or initiate workflows rather than just summarize, the larger the blast radius. A read-only assistant can mislead a user. An assistant wired into account operations can do considerably worse.
Blue41's recommended controls are not novel on their own, and to the company's credit it does not pretend otherwise. Minimize the context you pass to the model. Treat retrieved data as untrusted by default. Constrain the assistant's ability to generate links or trigger sensitive actions. Monitor runtime behavior. The bet is that layering these into a coherent product, with the behavioral monitoring piece as the differentiator, is worth paying for, because preventing every possible payload in advance is not realistic and the company is honest about that.
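Two of those controls, minimizing context and constraining generated links, are simple enough to sketch. The code below is a rough illustration under stated assumptions rather than anything Blue41 or Bunq has published: the field names, the 40-character limit, and the helper functions are hypothetical, and truncating a description narrows the room for a payload without guaranteeing it is harmless.

```python
import re

MAX_DESCRIPTION_CHARS = 40  # assumed limit, chosen only for illustration
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def minimize_record(record: dict) -> dict:
    """Pass the model only the fields a summary needs, with free text capped."""
    return {
        "date": record["date"],
        "amount_eur": record["amount_eur"],
        "description": record["description"][:MAX_DESCRIPTION_CHARS],
    }


def strip_links(model_output: str) -> str:
    """Replace any URL in the assistant's reply before it reaches the user."""
    return URL_PATTERN.sub("[link removed]", model_output)


raw = {
    "date": "2025-01-15",
    "amount_eur": 0.02,
    "description": "x" * 200,          # oversized free-text field
    "iban": "hypothetical-account-id",  # not needed to summarize spending
}

slim = minimize_record(raw)
safe_reply = strip_links("Go to https://evil.example/x now")
print(slim, safe_reply)
```

Neither step replaces monitoring. They reduce what an injected instruction can see and what it can make the assistant emit.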
What to watch
The healthy skepticism here is about category timing rather than the technical findings, which are sound. Plenty of AI security startups are positioning around prompt injection right now, and the runtime-monitoring angle, while sensible, will face the same question every behavioral security product faces: can it tell a genuinely compromised assistant from one that is simply doing something unusual but legitimate, without burying analysts in false positives? Behavioral profiling has a long history in security of being easier to demo than to operationalize.
What Blue41 has going for it is a clean, legible story and a real customer engagement to point to, with the vulnerability disclosed responsibly and remediated together with Bunq before publication. For a young company, a named European bank and a working proof of concept that costs two cents to reproduce is a stronger opening than most. The broader point it is making will outlast any single product pitch: as banks push assistants from pilots into production, the boundary between code and data that traditional application security relied on is dissolving, and the fields everyone once treated as harmless text are becoming instruction channels. Whether Blue41 is the company that owns the defense or just the one that articulated the problem early, the attack surface it described is not going away.
