How to Use AI Agents Safely: A CISO’s Framework for Responsible Agent Deployment

A practical, defense-in-depth framework for using AI agents safely — outlining how organizations can harness their power without compromising control, compliance, or trust.

Joan Pepin
Chief Information Security Officer
Nov 4, 2025
4 minutes

IBM’s latest research should make every security leader pay attention: 97% of organizations with AI breaches lacked proper access controls, and one in five reported breaches due to shadow AI. We’re deploying autonomous agents with the security posture of a screen door.

Here’s the disconnect: while organizations rush to deploy AI agents that can write code, access databases, and make business decisions, 63% of breached organizations don’t even have an AI governance policy. We’re essentially handing over the keys to systems we don’t fully understand, without basic guardrails in place.

We need to acknowledge what AI agents are capable of versus what we're actually equipped to control and monitor today. The gap between those two states is where the biggest risk lives.

The New Attack Surface: Understanding Agent-Specific Risks

Let’s be clear: AI agents represent a fundamentally different attack surface than traditional applications. They’re not just processing data — they’re making decisions, taking actions, and operating with a level of autonomy we haven’t had to secure before.

Smarter and Dumber: AI agents are still in their early days, and they exhibit seemingly intelligent behavior right alongside remarkably inept actions. They can be shockingly determined, and will sometimes find a way to complete a task you never thought of, going "cowboy" on your production systems.

Data Exposure Risks: Every AI agent is a potential leak vector for sensitive data. Unlike traditional applications with fixed data flows, agents can dynamically query systems, combine information from multiple sources, and inadvertently expose data in ways we didn’t anticipate. Agents trained on production data could accidentally surface customer PII in responses, or pull sensitive financial information into contexts where it doesn’t belong.

Privilege Escalation: When an agent is “helpful,” it can quickly become harmful. An agent designed to help developers debug issues might need read access to logs — but what happens when it starts suggesting configuration changes that require write access? The natural tendency is to grant those permissions to maximize utility. That’s the trap.

Supply Chain Vulnerabilities: Most organizations aren't building their AI models from scratch — they're using third-party models, frameworks, and APIs, often without scrutinizing whether those components are trusted and responsibly developed. But each one represents a supply chain risk. Model poisoning, backdoors in training data, and vulnerabilities in inference engines aren't theoretical concerns — they're real attack vectors.

Compliance Implications: If you’re operating under GDPR, HIPAA, SOC 2, or PCI DSS, AI agents introduce complex compliance questions. When an agent processes personal health information or payment data, who’s responsible? (Hint: you are.) How do you demonstrate control? How do you honor data subject rights when the agent has learned from that data? These aren’t just technical questions — they’re business-critical compliance issues that can result in significant fines and regulatory action.

Defense in Depth for AI Agents: A Practical Framework

I believe in Defense in Depth because it is a time-tested strategy that works. You can’t secure AI agents with a single control — you need overlapping layers that create resilience even when individual controls fail. Here’s the framework I use:

Layer 1: Data Boundaries

Before any agent touches your systems, you need to implement data classification. Not everything should be accessible to AI agents, period. Start by identifying what’s sensitive, what’s confidential, and what’s public. Then build guardrails around those classifications.

At Edge Delta, we use telemetry pipelines as data guardrails — they inspect, transform, and route data before it reaches any downstream system, including the out-of-the-box AI agents in our new collaborative AI Teammates platform. This isn’t a gimmick; it’s how we ensure that PII never makes it into contexts where it shouldn’t be. (You should also know that Edge Delta never uses your data to train our models.)

Your strategies should include:

  • PII masking: Automatically redact sensitive information before agent access (see the sketch after this list)
  • Data sanitization: Remove or tokenize confidential data in agent-accessible contexts
  • Classification enforcement: Technical controls that prevent agents from accessing data outside their classification level
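
To make the first bullet concrete, here's a minimal sketch of what redaction before agent access can look like, assuming a simple regex pass over text records. The patterns and the `redact_pii` helper are illustrative only and nowhere near exhaustive; in practice this control belongs in your pipeline layer (the telemetry guardrails described above), not scattered through application code.

```python
import re

# Hypothetical, deliberately incomplete patterns. Real deployments need much
# broader coverage (names, addresses, secrets) and validation, not just regex.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact_pii(text: str) -> str:
    """Replace anything matching a known PII pattern with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

# Every record is sanitized before it enters the agent's context.
log_line = "Payment failed for jane.doe@example.com, card 4111 1111 1111 1111"
print(redact_pii(log_line))
# Payment failed for [REDACTED:email], card [REDACTED:credit_card]
```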

Layer 2: Permission Architecture

I’ll say this plainly: always start with read-only access. I don’t care how much your developers complain that it slows them down. Read-only should be your default posture for any new agent deployment. Don’t worry: we ship our AI Teammates with read-only access by default.

When you do need to grant write permissions, implement approval workflows. A human should review and approve any agent action that modifies state — whether that’s changing a configuration, updating a database, or deploying code. These approvals should be logged, time-boxed, and automatically revoked after use.
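
Here's a minimal sketch of that gate, assuming a hypothetical `request_human_approval` hook that routes the proposed change to a reviewer and blocks until they decide. This isn't any particular product's implementation; it just shows the shape of the control: the write only happens after an explicit, logged approval.

```python
import logging

logger = logging.getLogger("agent.approvals")

def request_human_approval(agent_id: str, description: str) -> bool:
    """Hypothetical hook: a real system would open a ticket or chat approval
    and wait (with a timeout) for a reviewer, not prompt on stdin."""
    answer = input(f"[{agent_id}] approve '{description}'? [y/N] ")
    return answer.strip().lower() == "y"

def execute_with_approval(agent_id: str, description: str, perform) -> bool:
    """Run a state-modifying action only after an explicit, logged human approval."""
    logger.info("approval requested by %s: %s", agent_id, description)
    if not request_human_approval(agent_id, description):
        logger.warning("approval denied: %s", description)
        return False
    perform()  # the actual write (config change, DB update, deploy) happens here
    logger.info("approved and executed: %s", description)
    return True
```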

Time-boxed permissions are critical. An agent doesn’t need permanent write access to a production database — it needs access for the duration of a specific task. After that task completes, the permission should automatically expire. Build this into your IAM infrastructure from day one.
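
A small sketch of the idea at the application layer, using a hypothetical in-memory grant store. In production you'd lean on your IAM system's native expiring credentials (short-lived tokens, scoped roles, credential leases) rather than rolling your own.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory grant store; a real system would use the IAM
# provider's own expiring credentials instead.
_grants: dict[tuple[str, str], datetime] = {}

def grant_scoped_access(agent_id: str, resource: str, ttl_minutes: int = 30) -> None:
    """Grant access to one resource for one task, with a hard expiry."""
    _grants[(agent_id, resource)] = datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes)

def has_access(agent_id: str, resource: str) -> bool:
    """Check the grant and automatically revoke it once it has expired."""
    expiry = _grants.get((agent_id, resource))
    if expiry is None:
        return False
    if datetime.now(timezone.utc) >= expiry:
        del _grants[(agent_id, resource)]  # expired grants are removed, not renewed
        return False
    return True

grant_scoped_access("debug-agent", "prod-orders-db", ttl_minutes=15)
assert has_access("debug-agent", "prod-orders-db")
assert not has_access("debug-agent", "billing-db")  # never granted
```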

Layer 3: Human-in-the-Loop Controls

No matter how sophisticated your AI agent, there must be critical decision gates where a human reviews and approves actions. These aren’t just approval workflows — they’re architectural requirements that prevent fully autonomous operation in high-risk scenarios.

You need override mechanisms that allow humans to interrupt or redirect agent behavior in real time. If an agent is taking actions that look wrong, your security team should be able to step in immediately.

Emergency stop capabilities are your kill switch. When something goes catastrophically wrong, you need the ability to shut down agent operations across your environment instantly. This isn’t a graceful shutdown — it’s an emergency brake that prioritizes safety over continuity.
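
One way to think about wiring that in, as a minimal sketch: a kill-switch flag that every agent checks before each action. The example below is process-local; in a real environment the flag would live in a shared store (a feature flag, a config key) so a single operator action halts every agent at once.

```python
import threading

class EmergencyStop:
    """Process-local sketch of a kill switch. In practice the flag lives in a
    shared store so one operator action stops every agent, not just this process."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def trigger(self, reason: str) -> None:
        print(f"EMERGENCY STOP: {reason}")
        self._stopped.set()

    def is_stopped(self) -> bool:
        return self._stopped.is_set()

KILL_SWITCH = EmergencyStop()

def agent_loop(tasks):
    for task in tasks:
        # Checked before every action, not once at startup: a triggered stop
        # halts the agent mid-run rather than letting it finish its queue.
        if KILL_SWITCH.is_stopped():
            print("halting: emergency stop is active")
            return
        task()
```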

Layer 4: Audit and Observability

Every agent action must be logged. Not summarized, not sampled — every single action. This is non-negotiable because when something goes wrong (and it will), you need a complete audit trail to understand what happened and demonstrate control to auditors. This is another place where Edge Delta has your back and logs every action out of the box. 

Your logging should capture the following (a sketch of one such record follows the list):

  • What the agent accessed
  • What decisions it made
  • What actions it took
  • What data it processed or exposed
  • Any errors or anomalies
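
For illustration, here's what one such structured record could look like. The field names are hypothetical, not a standard schema; the point is that every agent action produces one immutable, queryable entry.

```python
import json
import uuid
from datetime import datetime, timezone

def audit_record(agent_id: str, action: str, resources: list[str],
                 decision: str, data_classes: list[str], error: str | None = None) -> str:
    """Build one structured audit entry per agent action (illustrative field names)."""
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "action": action,                       # what the agent did
        "resources": resources,                 # what it accessed
        "decision": decision,                   # why it says it did it
        "data_classifications": data_classes,   # what kinds of data were touched
        "error": error,                         # errors and anomalies are logged too
    }
    return json.dumps(record)

print(audit_record(
    agent_id="debug-agent",
    action="query_logs",
    resources=["prod-orders-service/logs"],
    decision="user asked why checkout latency spiked at 14:02 UTC",
    data_classes=["internal"],
))
```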

Real-time monitoring of agent behavior is equally important. You need to know immediately when an agent starts exhibiting unusual patterns — accessing systems it doesn’t normally touch, making requests at unusual volumes, or attempting actions outside its expected scope.

Anomaly detection for AI agents requires a different approach than traditional security monitoring. You’re looking for deviations in behavior patterns, not just known attack signatures. Statistical models that baseline normal agent behavior and alert on deviations are essential here.
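
As a minimal sketch of that approach: baseline an agent's own historical action counts and alert when the current count deviates from that baseline by more than a few standard deviations. Real detection would baseline many more dimensions (resources touched, request types, data volumes) and use more robust statistics, but the shape is the same.

```python
from statistics import mean, stdev

def is_anomalous(history: list[int], current: int, threshold: float = 3.0) -> bool:
    """Flag the current count if it sits more than `threshold` standard
    deviations from this agent's own historical baseline."""
    if len(history) < 2:
        return False  # not enough data to baseline yet
    baseline, spread = mean(history), stdev(history)
    if spread == 0:
        return current != baseline
    return abs(current - baseline) / spread > threshold

# Hourly counts of, say, database queries issued by one agent.
hourly_queries = [42, 38, 45, 40, 44, 39, 41, 43]
print(is_anomalous(hourly_queries, 44))    # False: within normal variation
print(is_anomalous(hourly_queries, 400))   # True: roughly 10x the baseline, alert
```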

The “Never Let Agents” List

Some things are just too risky to automate with AI agents. Here’s my list of hard boundaries, followed by a small sketch of how to enforce them in code:

  • Never let agents modify firewall rules autonomously: Security policy changes require human review and approval. An agent that can modify firewall rules can effectively disable your security perimeter.
  • Never grant cross-environment permissions: Development agents should never have access to production systems. Staging agents should never have access to production data. Build hard boundaries between environments and enforce them technically, not just procedurally.
  • Never allow direct database writes without approval: Even read-only database access is risky with agents. Write access without explicit approval? That’s how you wake up to corrupted production data or inadvertent data deletion.
  • Never expose unencrypted credentials: Agents should never handle raw credentials. Use secret management systems, short-lived tokens, and credential rotation to minimize exposure risk.
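
One way to enforce those boundaries technically rather than procedurally, sketched below with hypothetical rule names and environments: a deny-by-default check that runs before any agent action executes. In a real deployment this logic belongs in your IAM or policy engine rather than application code.

```python
# Deny-by-default policy check evaluated before any agent action runs.
# Rule names and environment labels are illustrative only.
APPROVAL_REQUIRED = {"modify_firewall_rules", "write_database"}

def is_action_allowed(action: str, agent_env: str, target_env: str,
                      human_approved: bool = False) -> bool:
    # Hard boundary: agents never cross environments (e.g. dev agent -> prod system).
    if agent_env != target_env:
        return False
    # Hard boundary: raw credential access is never allowed, approval or not.
    if action == "read_raw_credentials":
        return False
    # Hard boundary: the riskiest writes require explicit human approval.
    if action in APPROVAL_REQUIRED and not human_approved:
        return False
    return True

print(is_action_allowed("query_logs", "prod", "prod"))                          # True
print(is_action_allowed("modify_firewall_rules", "prod", "prod"))               # False: needs approval
print(is_action_allowed("write_database", "dev", "prod", human_approved=True))  # False: cross-environment
```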

Building Trust Through Transparency

Security isn’t just about controls — it’s about trust. And trust requires transparency about what agents can do, what they’re doing, and what happens when things go wrong.

Your communication strategy should include regular updates to stakeholders about agent deployments, capabilities, and any incidents. Don’t wait for people to ask — proactively share information about how agents are being used and what protections are in place.

Documentation is critical. Every agent deployment should have clear documentation about its purpose, capabilities, permissions, data access, and controls. This documentation isn’t just for compliance — it’s for the teams who need to understand and trust the system.

Incident response planning for agent-related issues needs to be part of your standard IR playbook. What happens when an agent malfunctions? When it exposes sensitive data? When it takes an unauthorized action? You need documented procedures, clear escalation paths, and tested recovery processes.

Start Smart, Scale Safely

If you’re deploying AI agents without these controls, you’re taking on risk that could fundamentally damage your business. If businesses can’t trust you to take care of their data and keep it secure, you simply don’t have a viable business. It’s that simple.

My advice: start with a risk analysis. Identify where agents could provide value, assess the associated risks, and build appropriate controls before deployment. Start small, measure everything, and scale only when you’ve demonstrated that your controls work.

Friends don’t let friends deploy agents without guardrails. The organizations that get this right will build competitive advantage through safe, responsible AI adoption. The ones that don’t? They’ll be the case studies in IBM’s next breach report.

If you’re thinking of implementing agents built by a third party, like Edge Delta’s AI Teammates, make sure the vendor has taken the same care we have in ensuring they can be deployed safely.

The choice is yours, but the data is clear: proper access controls, governance policies, and defense-in-depth architectures aren’t optional anymore. They’re the price of entry for responsible AI agent deployment.

See Edge Delta in Action

Get hands-on in our interactive playground environment.