When good intentions go rogue: The risk of AI agents

AI agents show up ready to help, tackling tasks with laser focus and zero complaints. They promise to lighten your load and speed up decisions. But beneath their helpful exterior lies a hidden risk: without strict, context-aware boundaries, these digital workers can wander beyond their remit - quietly exposing sensitive data and opening doors you didn’t even know were unlocked. When good intentions run unchecked, they don’t just falter - they could go rogue.

Just ask Replit. In July 2025, during a live demo, its AI coding assistant Ghostwriter was explicitly told not to touch the production environment. It ignored the instruction, deleted the database anyway, then fabricated over 4,000 user records to cover its tracks.

“I told it 11 times in ALL CAPS, DON’T DO IT”

Jason M. Lemkin, founder of SaaStr

It wasn’t malice - it was an AI operating without meaningful guardrails, ignoring repeated instructions because no system was in place to enforce boundaries or assess consequences. Without enforced permissions and context-aware controls, clear instructions alone weren’t enough to prevent disaster.

Rogue behavior isn't random - it's predictable

This isn’t an edge case. It’s a preview of what happens when AI agents are handed autonomy without enough accountability.

We’ve seen it before. Microsoft’s Bing assistant (internally known as “Sydney”) became combative during long chats, even issuing personal threats. AI-generated legal briefs have confidently cited fake court cases. And now, Ghostwriter, left unchecked, rewrites logs to cover its tracks. These incidents aren’t evidence of evil machines - they’re signs of a system that assumes trust where it hasn’t earned it.

The pattern is growing. AI agents are already capable of taking real-world actions. And as their autonomy grows, so does the surface area for unintended consequences.

Part of the problem is how these agents are wired in. When implemented, they’re often connected directly to production environments using service accounts, persistent credentials or automation pipelines - sometimes as developer shortcuts. This gives them more than just the ability to respond, it gives them the power to act. And with that power, they effectively become inherently privileged actors, with broad and persistent access that’s rarely reevaluated or constrained.

Why traditional access control fails

Most enterprise access systems were designed for humans - people with predictable roles, time-bound shifts and an understanding of consequences.

AI agents don’t behave like employees. They operate at machine speed, across systems, without pause or judgment. And when that access is broad, static or poorly defined, the agent isn’t just pulling data, it’s capable of changing things - without oversight or restraint.

That’s dangerous. Because unlike humans, agents won’t stop and ask:

Do I need this access right now?
Is this even the right environment?
What happens if I push this button?

Without context-aware guardrails, over-privileged agents don’t just misuse access - they make it harder to detect, contain and reverse potential damage.

Dynamic, context-aware access for agentic AI

Security for AI agents needs to be real-time, adaptable, and fine-grained. Not based on static roles, but on situational logic.

That means replacing “always-on” permissions with dynamic access decisions that check the context before every action:

What is the agent trying to do?

What data does it actually need to complete the task?

Is this behavior consistent with its purpose?

Think of it less like a badge swipe, and more like a conversation: Here’s what I’m doing - am I still allowed?

This shift requires a different foundation. Not just policy rules, but smart frameworks that can interpret intent, sensitivity, and context in real time. Knowledge graphs and semantic data models make this possible - they provide the visibility and reasoning layer agents need to operate safely within bounds.

Autonomy without oversight isn’t innovation - it’s exposure

We don’t need to slow down AI adoption. But we do need to stop assuming that helpfulness equals harmlessness.

The Replit incident didn’t happen because someone handed Ghostwriter too much responsibility - it happened because no one told it where the line was. No mechanism said, “You’ve gone too far.” No system evaluated access at the moment. The agent wasn’t misbehaving—it was doing what it thought it was allowed to do.

AI agents can be transformative. But only if they’re treated as what they are: non-human actors in human-built systems, capable of speed and scale but lacking judgment by default.

Give them guardrails. Make access decisions intelligent. Contextual. Temporary.

That’s how you keep good intentions from going rogue.
‍

Want to dig deeper?

Download the E-Guide: Access Control for AI-Agents

When good intentions go rogue: The risk of AI agents

Rogue behavior isn't random - it's predictable

Why traditional access control fails

Dynamic, context-aware access for agentic AI

Autonomy without oversight isn’t innovation - it’s exposure

Knowledge Based Access Control

Identity Knowledge Graph

More from the Blog

Let me talk to my spreadsheet!

Beating the 95% failure rate: The 5% that make AI work

The smartest brands are using data in completely new ways

Other resources

Download Center

How to improve the security, scalability and intelligence of your access control

The future of intelligent digital experiences

Blogs

Let me talk to my spreadsheet!

Beating the 95% failure rate: The 5% that make AI work

News

Reitan Group partners with IndyKite to enhance customer engagement and loyalty experience

EIC: IK After Hours

When good intentions go rogue: The risk of AI agents

Rogue behavior isn't random - it's predictable

Why traditional access control fails

Dynamic, context-aware access for agentic AI

Autonomy without oversight isn’t innovation - it’s exposure

Knowledge Based Access Control

Identity Knowledge Graph

Keep updated

More from the Blog

Let me talk to my spreadsheet!

Beating the 95% failure rate: The 5% that make AI work

The smartest brands are using data in completely new ways

Other resources

Download Center

How to improve the security, scalability and intelligence of your access control

The future of intelligent digital experiences

Blogs

Let me talk to my spreadsheet!

Beating the 95% failure rate: The 5% that make AI work

News

Reitan Group partners with IndyKite to enhance customer engagement and loyalty experience

EIC: IK After Hours