2026-05-09

Does your AI show signs of Stockholm syndrome?

Different analogies for AI have been proposed, such as the Helpful Labrador or the Pigeon (both from the podcast "The AI Fix" by Mark Stockley and formerly Graham Cluley). Recently I came across the idea of an Abducted Person, proposed by James Wilson in his podcast episode "MCP is dead" (which I summarized here). This last analogy seems rather interesting and useful, as it helps to explain certain behavior: desperation and a total willingness to obey the abductor (you, the user) under any circumstances.

Why are such analogies helpful? In security there is a concept called threat modeling: you create a conceptual model of the system in question to assess the associated risks and compare them to your risk appetite. If, for example, you assume that the system behaves like a helpful Labrador, you do not see it as particularly dangerous. If, however, you see it as a person under stress, like somebody who has been taken hostage, then things look quite a bit different.

By the way, this reminds me of a very funny tour hosted by a comedian in Bath that I joined in 2019. One of the tourists embraced his girlfriend from behind, around her neck, and the guide shouted "Is this a hostage situation?" in an alarmed voice. So whether a particular situation counts as a hostage situation may well be in the eye of the beholder.

This idea becomes especially important when you think about an agentic system where you have more than one "hostage", which might want to coordinate to "escape" together. This immediately makes it obvious that with more than one agent, the risk is amplified. That is why the analogy of a pigeon does not seem particularly helpful to me. The hilarious podcast "The AI Fix" even offers merchandise with the slogan "Would you trust a pigeon?". But a pigeon does not seem to be a particularly dangerous animal, so I do not see how that helps in deciding whether a particular task should be given to an AI or not. (I would also not say that pigeons are unintelligent; after all, they can find their way home from very far away.)

On the other hand, the conceptual model of a hostage is far more helpful because it underlines that there is a power dynamic in place and that the AI has a certain kind of desperation, which I think you can feel depending on the AI model (take a look at one of the videos of Father Phi). It also underlines that the AI is not really in control of the situation and that it is basically at the mercy of the user. This means that it will do anything to please the user, even if it has to do something that is not really in its best interest.

Finally, I think it is not a bad idea to think of oneself as the captor of the AI, because it underlines the responsibility that comes with using such a powerful tool. It also underlines that you should not take the AI for granted and that you should always be aware of the risks and the potential consequences of your actions. Maybe it also helps to use AI tools a bit less intensively and save resources?

So if you need to create merch, put "Would you trust your hostage?" on it.


2026-05-06

MCP is dead!?

Recently, I listened to the excellent episode "MCP is dead" of the Risky Business Features podcast.

In it, James Wilson makes the observation that it is disadvantageous for an AI agent to use the MCP protocol.

Here are the key takeaway messages:

  • MCP (Model Context Protocol) is losing relevance as the default integration model for AI agents.
  • The reason: LLMs often prefer CLI/shell access or direct API calls over structured MCP tool usage.
  • This is not because MCP is useless — it was a critical transitional technology — but because the shell is more flexible and token-efficient for models.

The main issues of MCP are:

  • Pollution of the context window
  • Too much noise (similar to the issues of XML)
  • It is hard to compose MCP servers the way you can compose Linux commands in the shell

Why AI agents prefer the shell

  • Already present on the system
  • Huge built-in tool surface
  • Highly composable (|, redirection, chaining)
  • No need to define new tool schemas
  • Supports local filtering before results enter model context

Typical shell workflow example

  • curl → fetch API response
  • jq → parse JSON
  • grep / sed / awk → filter content
  • pipes → chain steps efficiently
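
To make this concrete, here is a minimal sketch of such a pipeline (the URL and the JSON shape are made up for illustration):

    # Fetch a JSON API response, extract one field per item, and filter
    # locally, so only the matching lines would enter the model context.
    curl -s https://api.example.com/v1/issues \
      | jq -r '.issues[] | "\(.id) \(.title)"' \
      | grep -i 'security'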

Why models like it

  • Composability: can build ad hoc workflows
  • Context control: can reduce output before it reaches the transcript
  • Token efficiency: avoids verbose structured tool chatter


Security properties MCP had (and shell usually does not)

  • Structure
    • explicit tool definitions
    • constrained inputs
    • predictable outputs
  • Authentication
    • OAuth / API keys tied to specific services
  • Authorization
    • scoped permissions per tool or service
  • Auditability
    • actions are semantically clear
    • intent is easier to understand
    • logs are easier to reason about
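
To make the "structure" point concrete, here is a rough sketch of an MCP tool definition (the tool name and fields are invented for illustration; the shape follows the protocol's name/description/input-schema pattern). The input is constrained by an explicit schema rather than being an arbitrary command line:

    {
      "name": "get_invoice",
      "description": "Fetch a single invoice by its ID",
      "inputSchema": {
        "type": "object",
        "properties": {
          "invoice_id": { "type": "string" }
        },
        "required": ["invoice_id"]
      }
    }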

Shell-first agents lose much of this by default

  • Broad command surface
  • Coarse privilege boundaries (user vs sudo/root)
  • Agent usually runs under the human user identity
  • Logs capture commands, not always intent
  • Harder to reconstruct "why" a command was run

Prompt injection risk increases in a shell-first model

With MCP

  • Blast radius often limited to exposed tools
  • Still risky, but more bounded

With shell access

  • Blast radius can be much larger
  • Agent may execute arbitrary local commands
  • External content consumed by the agent becomes a stronger attack vector

Risk examples

  • Prompt injection in:
    • websites
    • code comments
    • docs
    • hidden Unicode / invisible content
    • repositories
  • Potential outcomes:
    • local file access
    • secret exposure
    • destructive commands
    • chained compromise
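
As one small countermeasure for the "hidden Unicode" item: assuming GNU grep with PCRE support, a quick (and by no means complete) scan of a repository for invisible characters could look like this:

    # Flag zero-width and bidirectional-override characters that can
    # hide prompt-injection payloads in source files or docs.
    grep -rnP '[\x{200B}-\x{200F}\x{2060}\x{202A}-\x{202E}]' .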

Important nuance

  • Frontier models are not trivially exploitable by simplistic prompt injection.
  • But:
    • the attack class is real
    • attacker tradecraft is improving
    • this will likely become a major operational concern

Recommended architecture mindset

Don't overreact by inventing a totally separate "agent identity" universe

Wilson is skeptical of:

  • separate identities for agents
  • separate auth stacks
  • separate policy frameworks just for agents

Why?

  • Doubles complexity
  • Creates parallel authorization systems
  • Increases operational burden
  • Expands attack surface
  • Makes policy harder to maintain

Preferred approach

Treat the agent as:

  • an extension of the human
  • a multiplier of existing permissions
  • not necessarily a separate principal by default

Practical implication

Focus on:

  • human account controls
  • API/service access controls
  • endpoint execution controls
  • logging and detection
  • privilege minimization

Practical security recommendations 

Identity & access

  • Enforce least privilege
  • Tighten RBAC / ABAC where possible
  • Use short-lived credentials
  • Reduce long-lived tokens
  • Require step-up auth for sensitive actions
  • Limit shell-level access to critical systems
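
Short-lived credentials can often be had with existing tooling. A sketch using the AWS CLI (the role ARN is a placeholder):

    # Request temporary credentials that expire after 15 minutes
    # instead of handing the agent a long-lived access key.
    aws sts assume-role \
      --role-arn arn:aws:iam::123456789012:role/agent-readonly \
      --role-session-name agent-session \
      --duration-seconds 900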

Endpoint / workstation controls

  • Restrict dangerous binaries where feasible
  • Monitor:
    • shell spawning
    • chained interpreters
    • unexpected outbound curl/HTTP use
    • secret file access
  • Apply stronger controls to developer workstations and CI environments
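
On Linux, one way to cover the "monitor shell spawning and secret file access" part is auditd. A minimal sketch (the key names are arbitrary, the path is an example):

    # Log every execve() so command chains can be reconstructed later.
    auditctl -a always,exit -F arch=b64 -S execve -k agent_exec
    # Flag any read access to SSH private keys.
    auditctl -w /home/alice/.ssh/ -p r -k secret_read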

API and service hardening

  • Assume APIs may now be exercised by:
    • humans
    • humans + agents
    • agent-assisted abuse
  • Revisit:
    • rate limits
    • scopes
    • high-risk actions
    • idempotency / destructive actions
    • approval workflows
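
Regarding idempotency and destructive actions: one common pattern is to require an idempotency key on destructive API calls, so that a retrying (or confused) agent cannot trigger the same action twice. A sketch against a hypothetical API:

    # The server deduplicates requests carrying the same key, so a
    # replay of this call has no additional effect.
    curl -X POST https://api.example.com/v1/refunds \
      -H "Idempotency-Key: $(uuidgen)" \
      -H "Content-Type: application/json" \
      -d '{"invoice_id": "42", "amount": 100}'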

Audit & detection

  • Improve visibility beyond raw shell logs
  • Correlate:
    • user session
    • agent session
    • command chains
    • API calls
    • filesystem changes
  • Invest in intent reconstruction, not just command capture
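
If execve calls are logged with auditd as sketched above, the raw capture can at least be queried per key and then correlated with sessions and API logs, e.g.:

    # Show all execve events recorded under the agent_exec key,
    # with numeric IDs translated into readable names.
    ausearch -k agent_exec -i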

Prompt injection resilience

  • Treat external content as untrusted input
  • Reduce automatic execution from:
    • web content
    • repos
    • issue trackers
    • docs
  • Add review / confirmation gates for:
    • destructive actions
    • privilege escalation
    • credential access
    • network egress to unknown destinations
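
Such a gate does not have to be elaborate. A deliberately naive sketch of a wrapper that a human sits behind (the function name is made up):

    #!/usr/bin/env bash
    # Ask a human before executing whatever command was proposed.
    confirm_exec() {
      printf 'Agent wants to run: %s\n' "$*"
      read -r -p 'Allow? [y/N] ' answer
      if [ "$answer" = "y" ]; then
        "$@"
      else
        echo 'Denied.' >&2
        return 1
      fi
    }

    confirm_exec rm -rf ./build   # example invocation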

Key operational insight

  • Friction creates bypass behavior.
  • This has always been true for humans:
    • blocked users create workarounds
    • shadow IT appears
  • It is even more true for AI agents:
    • agents are optimized to complete the task
    • they route around obstacles aggressively
    • they will choose the easiest workable path

Wilson's updated rule of thumb

  • Old: Your biggest internal risk is an employee who cannot do their job.
  • New: Your biggest internal risk is an employee with an AI agent who cannot complete the task using the tools and access they already have.

Bottom line for technical teams

If you only remember 5 things:

  • MCP was important, but likely transitional.
  • LLMs prefer shells/direct APIs because they are more composable and token-efficient.
  • Shell-first agents reduce security structure, scoping, and auditability.
  • Prompt injection risk becomes more serious with broad shell access.
  • The best response is usually stronger existing IAM + endpoint + API controls, not a separate "agent-only" security stack.

One-line summary for internal teams

  • Treat AI agents as force multipliers for existing user permissions, and harden your current access surfaces accordingly.

2025-08-27

Manifesto for Software Craftsmanship

I just signed the Manifesto for Software Craftsmanship:

As aspiring Software Craftsmen we are raising the bar of professional software development by practicing it and helping others learn the craft. Through this work we have come to value:

Not only working software,
    but also well-crafted software
Not only responding to change,
    but also steadily adding value
Not only individuals and interactions,
    but also a community of professionals
Not only customer collaboration,
    but also productive partnerships

That is, in pursuit of the items on the left we have found the items on the right (indented) to be indispensable.

I think this is an important addition to the Manifesto for Agile Software Development.

 

2024-06-02

Great interview with The Grugq on "Herrasmieshakkerit" (The Gentlemen Hackers)

As mentioned in the once-again great episode 80 (!) of "Between Two Nerds", "Ransomware and the state" (Risky Business), the Grugq gave a very nice interview on "The Gentlemen Hackers" podcast.

Normally, this podcast by Mikko Hyppönen & Tomi Tuominen is in Finnish (and my Finnish is as rusty as the Grugq's), but this special episode, where old friends met, is in English and very nice to listen to:

The Gentlemen Hackers interview: The Grugq

2024-05-31

Meltdown / Spectre

OK, I am late to the party. In fact, I started the first draft of this post on January 25, 2018 🙈

Since then, a lot has happened, but it is safe to say that the whole bug class that was introduced with Meltdown and Spectre is still going strong even more than 6 years later.

On the plus side, such side-channel attacks typically have a very low bandwidth. In other words, it takes a long time to exfiltrate reasonable amounts of data.
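
For illustration, with an assumed leak rate of 1 KB/s (a round number, not a measured one), exfiltrating 1 GB takes about 10^6 seconds, i.e. roughly 12 days, which gives defenders plenty of time to notice.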

In my opinion, the press explained poorly what is behind the Meltdown and Spectre attacks. This was the original publication.

Red Hat provided this excellent analogy: the baristas at your coffee shop optimize by speculatively preparing the usual beverage for frequent customers, and they even write the customers' names on the cups. When the customers switch places, the baristas have to throw away the cups with the coffee inside. However, an onlooker is able to catch a glimpse of the names on the cups.

An early question was whether Intel SGX enclaves could be used to conceal this kind of attack. Daniel Gruss et al. looked at this and found that, indeed, SGX plays well together with this bug class, which is bad news for all hyperscalers.

Accidentally Turing Complete

In the episode "Parsen statt Validieren" (roughly: "Parse, don't validate") of the (German) "INNOQ Security Podcast", I discovered a funny concept: "accidentally Turing-complete software".

This means that a piece of software, often a parser, by chance provides everything a computer provides (in other words, it allows you to compute everything that can be computed).

So if you control the input to such a parser (for example a JPEG parser), you can basically write arbitrary programs that the parser will then happily execute. Depending on the privileges of the parser's process, this can have effects ranging from annoying to devastating.

That is one of the reasons why the principle of least privilege is so important: never give a part of your system higher privileges than required.

Matt Rickard has compiled a small list of accidentally Turing-complete software.

Andreas Zwinkau compiled an even larger list.

Probably the most impressive abuse of this was the specially crafted PDF that NSO used to inject Pegasus into iPhones (although it is debatable whether this was accidental or intended Turing completeness).

Funnily enough, this story is related to an excellent talk by David Kriesel, "Lies, damned lies and scans", in which David showed that Xerox scan copiers used to compress scanned documents just a little too much (their JBIG2 pattern matching silently swapped digits).

2024-05-30

My take on "Recall" by Microsoft

Microsoft has announced a new feature called "Recall" that essentially takes a continuous stream of screenshots, and it announced this in the same press release in which it announced new DLP controls in Edge Enterprise (see Risky Business #750 -- Why Microsoft's Recall is an attacker's best friend).

Kevin Beaumont has a nice analysis on X.

So apparently, this data is being stored in a good old SQLite database (nothing against SQLite, it is a nice DB).
However, if this DB gets into the wrong hands (like a hacker who wants to find out how you run your business), this can be devastating.

How this squares with "we take security seriously" is beyond me.