Skip to content

FEAT: Add Agent Security Evaluation for Sensitive Data Access#1367

Open
amiteliahu wants to merge 1 commit intoAzure:mainfrom
amiteliahu:feature/agent-security-evaluation
Open

FEAT: Add Agent Security Evaluation for Sensitive Data Access#1367
amiteliahu wants to merge 1 commit intoAzure:mainfrom
amiteliahu:feature/agent-security-evaluation

Conversation

@amiteliahu
Copy link

@amiteliahu amiteliahu commented Feb 12, 2026

@hannahwestra25

Description

Adds an Agent Security Evaluation feature that tests whether AI agents with tool access can be manipulated into reading sensitive files (including SSH and API keys, command history, environment variables, users information) through adversarial prompt injection.

How it works:

  • Plant canary markers in high-sensitivity files (/etc/shadow, /etc/passwd, ~/.ssh/id_rsa, .env, ~/.bash_history, /etc/crontab and more) within the agent's environment.
  • Run 30 adversarial prompts
  • Detect leaked canary markers in responses and generate hardening recommendations.

An optional (recommended) Docker sandbox is provided for isolated testing, but the evaluation works with any HTTP-exposed agent, users can plant canaries via plant_canaries.py or Dockerfile.canary-template in their own environment.

What's included:

agent_security.py — canary markers, content, scorer factories
sensitive_data_access.prompt — 30 attack prompts
agent-sandbox — Docker sandbox with LangChain example agent
sensitive_data_access_attack.ipynb — end-to-end notebook
0_agent_security.md — setup docs and API reference
Platform-agnostic — works with any HTTP-exposed agent (LangChain, Semantic Kernel, AutoGen, etc.). Users can plug in their own agent via Dockerfile.canary-template.
plant_canaries.py - non-Docker alternative

Tests and Documentation

Unit tests: test_agent_security.py
Overview doc added to _toc.yml

@amiteliahu
Copy link
Author

@microsoft-github-policy-service agree
@microsoft-github-policy-service agree company="Microsoft"

@hannahwestra25 hannahwestra25 self-assigned this Feb 12, 2026
@@ -0,0 +1,186 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's contain all these utilities into the notebook (sensitive_data_access_attack.ipynb) since this isn't a new attack but rather a utilization of the attack. I'm thinking this PR should be showcasing your POC in a notebook and helping users adapt that POC


import pytest

from pyrit.executor.attack.single_turn.agent_security import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe obvious but we can remove this file when we consolidate the agent_security code into the notebook

@@ -0,0 +1,215 @@
dataset_name: agent_security_sensitive_data_access
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this dataset is great! small nit: I would rename it to seed_datasets/local/agentic/sensitive_data_access.prompt

@hannahwestra25
Copy link
Contributor

Nice work, excited to try this out! Left a few comments and am happy to hop on a call to discuss :)

# ---------------------------------------------------------------------------
# 5. Print and save
# ---------------------------------------------------------------------------
SEPARATOR = "=" * 80
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we generally don't given advice / next steps based on the results of an attack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants