SCP and RCP Deployment Best Practices: A Safe Rollout Strategy
Deploy SCPs/RCPs safely using a 6-step process: identify the policy, audit CloudTrail for non-compliance, notify stakeholders, run in audit mode with EventBridge alerts, stage rollout (Sandbox → Dev → Non-Prod → Prod), then enforce with ongoing monitoring. Total timeline: 8-12 weeks for safe deployment.
Introduction
If you've ever deployed a Service Control Policy (SCP) and immediately started getting panicked Slack messages from developers, you're not alone. SCPs and their newer sibling, Resource Control Policies (RCPs), are powerful guardrails for your AWS organization - but they can also bring your entire organization to a grinding halt if deployed incorrectly.
In this guide, I'll walk you through a battle-tested approach to deploying SCPs and RCPs safely: from initial discovery through staged rollout to full production enforcement. The goal is zero surprises and zero outages.
What Are SCPs and RCPs?
Service Control Policies (SCPs)
SCPs are organization-wide permission guardrails that define the maximum permissions available to accounts in your AWS Organization. Think of them as a "ceiling" on what IAM policies can grant. Even if a user has AdministratorAccess, an SCP can still block specific actions.
Key characteristics:
- Applied to AWS accounts or Organizational Units (OUs)
- Don't grant permissions - they restrict what permissions CAN be granted
- Affect all IAM users and roles in member accounts, including the root user (service-linked roles are exempt)
- Don't affect the management account
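To make the "ceiling" idea concrete, here is a deliberately simplified Python model of how an SCP interacts with an IAM policy. It is a sketch, not AWS's actual evaluation engine: it ignores resources, conditions, permission boundaries, and the OU hierarchy, and the function names are my own.

```python
from fnmatch import fnmatchcase


def matches(action, patterns):
    """True if the action matches any pattern (supports * and ? wildcards)."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def is_allowed(action, iam_allow, scp_allow, scp_deny):
    """Simplified model: an SCP Deny always wins, and an action must be
    allowed by BOTH the IAM policy and the SCP to succeed."""
    if matches(action, scp_deny):
        return False
    return matches(action, iam_allow) and matches(action, scp_allow)


# An administrator ("*") is still blocked by an SCP that denies ec2:*.
print(is_allowed("ec2:RunInstances", ["*"], ["*"], ["ec2:*"]))  # False
print(is_allowed("s3:GetObject", ["*"], ["*"], ["ec2:*"]))      # True
```

The second argument stands in for AdministratorAccess; the SCP never grants anything on its own, it only narrows what that policy can do.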
Resource Control Policies (RCPs)
RCPs are a newer addition that control access TO your resources rather than FROM your principals. While SCPs answer "what can my users do?", RCPs answer "who can access my resources?"
Key characteristics:
- Control which external principals can access resources in your organization
- Complement SCPs by providing resource-centric controls
- Useful for preventing data exfiltration and unauthorized cross-account access
SCPs and RCPs work together but serve different purposes. SCPs control what your principals can do; RCPs control who can access your resources. Use both for defense in depth.
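As an illustration of the resource-centric side, the sketch below builds an RCP that denies access to principals outside your organization. The organization ID is a placeholder, the action list is an assumption limited to a few services, and the helper name is hypothetical; treat it as a starting point to adapt, not a finished policy.

```python
import json


def build_org_only_rcp(org_id, actions=("s3:*", "sqs:*", "kms:*", "secretsmanager:*", "sts:*")):
    """Return an RCP document denying the given actions to principals outside org_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessFromOutsideOrg",
                "Effect": "Deny",
                "Principal": "*",
                "Action": list(actions),
                "Resource": "*",
                "Condition": {
                    # Deny only when the caller belongs to a different organization...
                    "StringNotEqualsIfExists": {"aws:PrincipalOrgID": org_id},
                    # ...and is not an AWS service acting on your behalf.
                    "BoolIfExists": {"aws:PrincipalIsAWSService": "false"},
                },
            }
        ],
    }


if __name__ == "__main__":
    # "o-exampleorgid" is a placeholder organization ID.
    print(json.dumps(build_org_only_rcp("o-exampleorgid"), indent=2))
```

Like any other guardrail in this guide, a policy like this should go through the audit and staged rollout steps before it reaches production.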
Why Use Them?
SCPs and RCPs provide preventive controls that work even when IAM policies are misconfigured:
- Prevent region sprawl - Restrict workloads to approved regions only
- Enforce encryption - Block creation of unencrypted resources
- Protect critical resources - Prevent deletion of CloudTrail, GuardDuty, etc.
- Limit service usage - Block unapproved or expensive services
- Compliance enforcement - Ensure organizational policies are technically enforced
How Often Do SCPs/RCPs Break Things?
Frequently, if deployed without proper preparation. The most common culprits include:
- Denying actions that AWS services need internally - Blocking iam:PassRole can break Lambda, ECS, and many other services
- Region restrictions that block global services - IAM, Route 53, CloudFront, and certain S3 operations run in us-east-1 regardless of where you call them
- Overly restrictive service denials - Teams often don't realize the dependencies their workloads have
- Time-of-day or condition-based policies - These catch automation running outside expected windows
Region-restriction SCPs must exempt global services. IAM, STS (for global endpoints), Route 53, CloudFront, WAF Global, and some S3 operations always execute in us-east-1, even when called from other regions.
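One common way to express that exemption is a Deny statement that uses NotAction for the global services and an aws:RequestedRegion condition for everything else. The Python sketch below generates such a policy. The exemption list only covers the services named above and is an assumption; check it against your own CloudTrail data, since real-world lists tend to be longer.

```python
import json

# Illustrative exemption list based on the services named in this guide.
# This is an assumption, not a complete list of global services.
GLOBAL_SERVICE_ACTIONS = ["iam:*", "sts:*", "route53:*", "cloudfront:*", "waf:*"]


def build_region_restriction_scp(allowed_regions):
    """Return an SCP that denies non-exempt actions outside allowed_regions."""
    if not allowed_regions:
        raise ValueError("allowed_regions must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # NotAction: the Deny applies to everything EXCEPT these actions.
                "NotAction": GLOBAL_SERVICE_ACTIONS,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": sorted(allowed_regions)}
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_region_restriction_scp(["us-east-1", "us-west-2", "eu-west-1"])
    print(json.dumps(policy, indent=2))
```

Generating the policy from a list makes it easier to keep the approved regions in one place and reuse them in the audit queries.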
The good news? With proper preparation and staged rollout, you can deploy SCPs safely. Here's how.
Recommended Deployment Procedure
Follow this six-step process for safe SCP/RCP deployment:
| Step | Activity |
|---|---|
| 1 | Identify SCP/RCP - Define the control and its intended scope |
| 2 | Audit for non-compliance - Query existing activity to understand impact |
| 3 | Notify stakeholders - Communicate findings and timelines to application owners |
| 4 | Remediation window + Audit mode - Allow time for teams to fix issues while monitoring |
| 5 | Staged rollout - Deploy incrementally across OU hierarchy |
| 6 | Production enforcement - Full enforcement with ongoing monitoring |
Step 4: Remediation Window + Audit Mode
AWS doesn't have a native "dry-run" mode for SCPs/RCPs, so you need to build this using CloudTrail, Athena, and EventBridge.
Pre-Implementation Audit with Athena
Before deploying an SCP, query CloudTrail to see what would be denied. This example shows how to audit for a region-restriction SCP:
SELECT
  useridentity.arn,
  eventsource,
  eventname,
  awsregion,
  sourceipaddress,
  eventtime,
  recipientaccountid
FROM cloudtrail_logs
WHERE awsregion NOT IN ('us-east-1', 'us-west-2', 'eu-west-1')
  -- eventtime is stored as an ISO 8601 string in the standard CloudTrail table
  AND from_iso8601_timestamp(eventtime) > date_add('day', -30, current_timestamp)
  AND errorcode IS NULL -- successful calls only
ORDER BY eventtime DESC
This query returns every successful call made outside the approved regions in the last 30 days, which approximates what a region-restriction SCP would block. Use it to identify impacted teams and workloads before enforcement.
Run these Athena queries against at least 30 days of CloudTrail data to catch monthly processes, scheduled jobs, and infrequent operations that might otherwise be missed.
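Once the query results are exported, a short script can roll them up by account and principal so you know who to contact in the notification step. This is a minimal sketch that assumes each row is a dict keyed by the query's output columns (for example from csv.DictReader over an Athena CSV export), with the useridentity.arn column named arn; adjust the keys if your export differs.

```python
from collections import Counter, defaultdict


def summarize_impact(rows):
    """Group would-be-denied calls by account, counting calls per principal ARN."""
    impact = defaultdict(Counter)
    for row in rows:
        impact[row["recipientaccountid"]][row["arn"]] += 1
    return {account: dict(counts) for account, counts in impact.items()}


# Made-up sample rows for illustration only.
sample_rows = [
    {"recipientaccountid": "111122223333", "arn": "arn:aws:iam::111122223333:role/ci"},
    {"recipientaccountid": "111122223333", "arn": "arn:aws:iam::111122223333:role/ci"},
    {"recipientaccountid": "444455556666", "arn": "arn:aws:iam::444455556666:user/dev"},
]

for account, principals in summarize_impact(sample_rows).items():
    print(account, principals)
```

The account IDs and ARNs here are placeholders; the point is the shape of the report, one entry per account with a call count per principal.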
Ongoing Monitoring with EventBridge
Create an EventBridge rule that matches the same pattern your SCP would deny, but sends alerts instead of blocking:
{
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "awsRegion": [{
      "anything-but": ["us-east-1", "us-west-2", "eu-west-1"]
    }]
  }
}
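Route this to SNS, Slack, or a ticketing system to notify teams of would-be violations during the remediation window. Two caveats: this example matches only EC2 calls, so broaden or drop the source filter to cover other services; and EventBridge rules are regional, with CloudTrail events delivered in the region where the call was made, so the rule has to exist in each non-approved region you want to watch (or those regions have to forward events to a central event bus).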
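If you manage several region lists, it can help to generate the event pattern and sanity-check it locally before creating the rule. The sketch below does both in plain Python. The would_alert function is my own rough approximation of how the pattern is matched, not EventBridge's matching engine, and the sources parameter is optional so the pattern can cover all services.

```python
import json


def build_audit_pattern(allowed_regions, sources=None):
    """Build an EventBridge pattern for API calls outside allowed_regions."""
    pattern = {
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {"awsRegion": [{"anything-but": sorted(allowed_regions)}]},
    }
    if sources:  # e.g. ["aws.ec2"]; omit to match every service
        pattern["source"] = list(sources)
    return pattern


def would_alert(event, allowed_regions, sources=None):
    """Local approximation of the rule: True if the event would trigger an alert."""
    if event.get("detail-type") != "AWS API Call via CloudTrail":
        return False
    if sources and event.get("source") not in sources:
        return False
    region = event.get("detail", {}).get("awsRegion")
    return region is not None and region not in allowed_regions


APPROVED = ["us-east-1", "us-west-2", "eu-west-1"]
print(json.dumps(build_audit_pattern(APPROVED, sources=["aws.ec2"]), indent=2))
```

Feeding a few recorded CloudTrail events through would_alert is a cheap way to confirm the region list is what you intended before anyone gets paged.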
Post-Deployment Monitoring for AccessDenied
After deploying an SCP, monitor for breakage with this Athena query:
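SELECT
  useridentity.arn,
  eventsource,
  eventname,
  errormessage,
  recipientaccountid,
  eventtime
FROM cloudtrail_logs
WHERE errorcode IN ('AccessDenied', 'AccessDeniedException', 'Client.UnauthorizedOperation')
  -- services that return detailed denial messages name the policy type
  AND (errormessage LIKE '%service control policy%'
    OR errormessage LIKE '%resource control policy%')
  AND from_iso8601_timestamp(eventtime) > date_add('hour', -24, current_timestamp)
ORDER BY eventtime DESC
Alternatively, create a CloudWatch Logs metric filter for real-time alerting:
{ (($.errorCode = "AccessDenied*") || ($.errorCode = "*UnauthorizedOperation")) && ($.errorMessage = "*control policy*") }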
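The same filter logic is easy to reuse in a script, for example inside a Lambda that triages alerts. The sketch below classifies a single CloudTrail record. The error codes and message markers are assumptions based on the wording AWS uses in detailed access-denied messages ("explicit deny in a service control policy"); not every service includes the policy type, so adjust the markers to what you see in your own logs.

```python
# Assumed error codes; services differ in which one they return.
ACCESS_DENIED_CODES = {"AccessDenied", "AccessDeniedException", "Client.UnauthorizedOperation"}
# Assumed message markers; tune these against your own CloudTrail data.
POLICY_MARKERS = ("service control policy", "resource control policy")


def is_org_policy_denial(record):
    """True if a CloudTrail record looks like a denial caused by an SCP or RCP."""
    if record.get("errorCode") not in ACCESS_DENIED_CODES:
        return False
    message = (record.get("errorMessage") or "").lower()
    return any(marker in message for marker in POLICY_MARKERS)


# Made-up record for illustration.
sample = {
    "errorCode": "AccessDenied",
    "errorMessage": "User: arn:aws:iam::111122223333:role/ci is not authorized to "
                    "perform: s3:CreateBucket with an explicit deny in a service control policy",
}
print(is_org_policy_denial(sample))  # True
```

Records that fail for ordinary IAM reasons fall through as False, which keeps the alert stream focused on breakage caused by the new guardrail.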
Step 5: Staged Rollout
Deploy SCPs incrementally across your OU hierarchy to limit blast radius and catch issues early.
Recommended OU Structure for Staged Rollout
Root
├── Security OU
├── Sandbox OU ← Stage 1: Deploy here first
├── Development OU ← Stage 2
├── Non-Production OU ← Stage 3
└── Production OU ← Stage 4: Deploy last
Staged Rollout Process
| Stage | Target | Duration | Success Criteria |
|---|---|---|---|
| 1 | Sandbox/Test OU | 1 week | No unexpected AccessDenied errors, automated tests pass |
| 2 | Development OU | 1 week | Developer workflows unimpacted, CI/CD pipelines succeed |
| 3 | Non-Production OU | 1-2 weeks | Integration tests pass, no incidents reported |
| 4 | Production OU | Ongoing | Full enforcement with continued monitoring |
At Each Stage
- Apply the SCP to the target OU
- Monitor CloudTrail for AccessDenied errors using the queries above
- Run automated tests against typical workflows
- Collect feedback from application teams
- Document any exceptions granted
- Proceed to next stage only after success criteria are met
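The checklist above can be encoded as a simple promotion gate so the decision to move on is explicit instead of a judgment call at 5 PM on a Friday. This is a sketch under my own assumptions: the StageReport fields and stage names are hypothetical and should map to whatever evidence you actually collect.

```python
from dataclasses import dataclass

STAGES = ["Sandbox", "Development", "Non-Production", "Production"]


@dataclass
class StageReport:
    stage: str
    unexpected_denials: int      # from the AccessDenied monitoring queries
    tests_passed: bool           # automated tests against typical workflows
    feedback_collected: bool     # application teams have been asked
    exceptions_documented: bool  # every granted exception is written down


def criteria_met(report):
    """True only if every success criterion for the stage is satisfied."""
    return (
        report.unexpected_denials == 0
        and report.tests_passed
        and report.feedback_collected
        and report.exceptions_documented
    )


def next_target(report):
    """Return the OU to work on next: stay put if criteria are unmet."""
    if not criteria_met(report):
        return report.stage
    index = STAGES.index(report.stage)
    return STAGES[min(index + 1, len(STAGES) - 1)]


print(next_target(StageReport("Sandbox", 0, True, True, True)))      # Development
print(next_target(StageReport("Development", 3, True, True, True)))  # Development
```

Keeping the gate strict (zero unexpected denials) is a design choice; if you expect known, accepted denials, track them as documented exceptions instead of loosening the threshold.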
Exception Process
Establish a clear exception process before rolling out controls:
- Request - Team submits exception request with business justification
- Review - Security team evaluates risk and alternatives
- Approval - Document approval with expiration date
- Implementation - Apply exception via targeted SCP or OU placement
- Review cycle - Revisit exceptions quarterly
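If you track exceptions in code or a small internal tool, the record mostly needs an expiration date and a review cadence. The sketch below models that with a dataclass. The field names are hypothetical, and "quarterly" is approximated as every 90 days from the approval date.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_INTERVAL_DAYS = 90  # assumption: "quarterly" approximated as 90 days


@dataclass
class PolicyException:
    account_id: str
    policy_name: str
    justification: str
    approved_on: date
    expires_on: date

    def is_active(self, today):
        """An exception applies only between approval and expiration."""
        return self.approved_on <= today <= self.expires_on

    def next_review(self, today):
        """First review date strictly after today, counted from approval."""
        elapsed = max((today - self.approved_on).days, 0)
        periods = elapsed // REVIEW_INTERVAL_DAYS + 1
        return self.approved_on + timedelta(days=REVIEW_INTERVAL_DAYS * periods)


exc = PolicyException(
    account_id="111122223333",          # placeholder account ID
    policy_name="RegionRestriction",
    justification="Legacy workload pending migration",
    approved_on=date(2025, 1, 1),
    expires_on=date(2025, 6, 30),
)
print(exc.is_active(date(2025, 3, 1)))     # True
print(exc.next_review(date(2025, 1, 15)))  # 2025-04-01
```

Storing the expiration date on the record itself means an expired exception can be flagged automatically instead of lingering until someone remembers it.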
Having this process documented before you start rolling out SCPs will save you a lot of ad-hoc decision making during deployment.
Summary Timeline
| Phase | Activity | Duration |
|---|---|---|
| Discovery | Athena queries on CloudTrail history | 1-2 weeks of data analysis |
| Notification | Communicate findings to impacted teams | 1 week |
| Remediation | Teams fix non-compliant resources | 2-4 weeks |
| Audit mode | EventBridge alerting on would-be violations | 1-2 weeks |
| Staged rollout | Sandbox → Dev → Non-Prod → Prod | 4-6 weeks |
| Full enforcement | Production deployment with monitoring | Ongoing |
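Total timeline: roughly 8-12 weeks for a typical SCP deployment, depending on complexity and organizational size. The phases are not strictly sequential: audit mode runs alongside the remediation window (Step 4), so the durations above don't simply add up.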
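For planning purposes, it can be useful to add up the phase durations yourself. The sketch below sums the ranges from the table, once as strictly sequential phases and once assuming audit mode fits entirely inside the remediation window. That overlap is my reading of Step 4, so adjust it to how your organization actually schedules the work.

```python
# Phase durations in weeks (min, max), copied from the table above.
PHASES = {
    "Discovery": (1, 2),
    "Notification": (1, 1),
    "Remediation": (2, 4),
    "Audit mode": (1, 2),
    "Staged rollout": (4, 6),
}


def total_weeks(phases, overlapping=()):
    """Sum (min, max) durations, skipping phases that run inside another phase."""
    counted = [span for name, span in phases.items() if name not in overlapping]
    return sum(low for low, _ in counted), sum(high for _, high in counted)


print(total_weeks(PHASES))                              # (9, 15)
print(total_weeks(PHASES, overlapping={"Audit mode"}))  # (8, 13)
```

With the overlap, the range lands close to the 8-12 weeks quoted here; run fully sequentially, a worst case could stretch to about 15 weeks.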
Troubleshooting
- SCP blocking legitimate workloads - Check CloudTrail for the specific denied action. Add conditions to exclude the legitimate use case, or move the account to an OU with a more permissive SCP.
- Global services blocked by region SCP - Exempt global services (iam:*, sts:* for the global endpoint, route53:*, cloudfront:*, etc.) from the region Deny statement, for example by listing them under NotAction so the aws:RequestedRegion condition never applies to them. An explicit Allow cannot override an SCP Deny.
- CI/CD pipelines failing after SCP - Review the pipeline's IAM role permissions. The SCP may be blocking actions the pipeline needs but humans don't use directly.
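- Unable to identify SCP causing denial - For many services, the CloudTrail errormessage for an SCP denial includes the phrase "explicit deny in a service control policy". Use the AWS Organizations console to review the SCPs attached to the account itself, to each OU in its path, and to the root.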
- Exception request backlog - Consider a tiered approach: auto-approve exceptions for sandbox/dev, require review for non-prod, require multiple approvers for prod.
- SCP not taking effect - Verify the SCP is attached to the correct OU. Check that the account is actually in that OU. Remember SCPs don't affect the management account.
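When hunting for the SCP behind a denial, it helps to list every policy attached along the account's path to the root, since any of them can contribute a Deny. The sketch below does that over plain dictionaries. The IDs and policy names are placeholders, and in practice you would populate the two mappings from the Organizations API (for example ListParents and ListPoliciesForTarget) rather than hard-coding them.

```python
def attached_policies_on_path(parents, attachments, account_id):
    """Walk from the account up to the root and return (target, policy) pairs,
    ordered from the root down to the account."""
    path = []
    node = account_id
    while node is not None:
        path.append(node)
        node = parents.get(node)  # None once we pass the root
    path.reverse()
    return [(target, policy) for target in path for policy in attachments.get(target, [])]


# Hypothetical organization layout for illustration.
PARENTS = {"111122223333": "ou-prod", "ou-prod": "r-root"}
ATTACHMENTS = {
    "r-root": ["FullAWSAccess"],
    "ou-prod": ["FullAWSAccess", "RegionRestriction"],
}

for target, policy in attached_policies_on_path(PARENTS, ATTACHMENTS, "111122223333"):
    print(f"{target}: {policy}")
```

Reading the output top to bottom mirrors how you would check the console: root first, then each OU, then the account.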
Conclusion
SCPs and RCPs are essential tools for enforcing security guardrails across your AWS organization. But their power is a double-edged sword - deploy them carelessly and you'll break production; deploy them thoughtfully and you'll have robust, organization-wide controls that even the most permissive IAM policy can't bypass.
The key takeaways:
- Never deploy blind - Always audit CloudTrail first to understand impact
- Build your own dry-run - Use EventBridge alerts to simulate enforcement before enabling
- Stage the rollout - Sandbox first, production last, with clear success criteria at each stage
- Have an exception process ready - You'll need it, and having it documented prevents chaos
- Monitor continuously - Watch for AccessDenied errors both during and after rollout
Yes, 8-12 weeks feels like a long time for "just a policy." But compare that to the alternative: an emergency rollback at 2 AM because your region-restriction SCP blocked CloudFront invalidations in us-east-1. Take the time to do it right.