Arun's Blog

SCP and RCP Deployment Best Practices: A Safe Rollout Strategy

Organizations · Security · SCPs · Governance

TL;DR

Deploy SCPs/RCPs safely using a 6-step process: identify the policy, audit CloudTrail for non-compliance, notify stakeholders, run in audit mode with EventBridge alerts, stage the rollout (Sandbox → Dev → Non-Prod → Prod), then enforce with ongoing monitoring. Total timeline: 8-12 weeks for safe deployment.

Introduction

If you've ever deployed a Service Control Policy (SCP) and immediately started getting panicked Slack messages from developers, you're not alone. SCPs and their newer sibling, Resource Control Policies (RCPs), are powerful guardrails for your AWS organization - but they can also bring your entire organization to a grinding halt if deployed incorrectly.

In this guide, I'll walk you through a battle-tested approach to deploying SCPs and RCPs safely: from initial discovery through staged rollout to full production enforcement. The goal is zero surprises and zero outages.

What Are SCPs and RCPs?

Service Control Policies (SCPs)

SCPs are organization-wide permission guardrails that define the maximum permissions available to accounts in your AWS Organization. Think of them as a "ceiling" on what IAM policies can grant. Even if a user has AdministratorAccess, an SCP can still block specific actions.

Key characteristics:

  • Applied to AWS accounts or Organizational Units (OUs)
  • Don't grant permissions - they restrict what permissions CAN be granted
  • Affect all principals in the account (users, roles, even root)
  • Don't affect the management account
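
To make the "ceiling" idea concrete, here is a deliberately simplified sketch of how an SCP interacts with an identity policy. It is a toy model, not AWS's real evaluation engine: it ignores resources, conditions, permission boundaries, and resource policies, and the function names are my own.

```python
from fnmatch import fnmatchcase


def matches(action: str, patterns: list[str]) -> bool:
    """True if the action matches any pattern ('*' wildcards, case-insensitive)."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def is_allowed(action: str, iam_allow: list[str],
               scp_allow: list[str], scp_deny: list[str]) -> bool:
    """Toy model: an action needs an IAM allow AND an SCP allow, and no SCP deny."""
    if matches(action, scp_deny):
        return False  # an explicit deny always wins
    # The SCP never grants anything by itself; it only caps what IAM grants.
    return matches(action, iam_allow) and matches(action, scp_allow)


# An administrator ("*") in an account whose SCP denies stopping CloudTrail.
admin = ["*"]
print(is_allowed("cloudtrail:StopLogging", admin, ["*"], ["cloudtrail:StopLogging"]))
print(is_allowed("s3:GetObject", admin, ["*"], ["cloudtrail:StopLogging"]))
```

The first call is denied despite the administrator policy, and the second is allowed because both the identity policy and the SCP permit it.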

Resource Control Policies (RCPs)

RCPs are a newer addition that control access TO your resources rather than FROM your principals. While SCPs answer "what can my users do?", RCPs answer "who can access my resources?"

Key characteristics:

  • Control who can access resources in your organization, including principals outside it
  • Complement SCPs by providing resource-centric controls
  • Useful for preventing data exfiltration and unauthorized cross-account access

Note

SCPs and RCPs work together but serve different purposes. SCPs control what your principals can do; RCPs control who can access your resources. Use both for defense in depth.
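
As a rough illustration of the resource-centric side, the sketch below builds an RCP that denies a few services to principals outside the organization. The organization ID `o-exampleorgid` and the action list are placeholders of mine, and the condition keys follow the identity-perimeter pattern AWS documents; check the current RCP documentation for supported services before using anything like this.

```python
import json


def build_org_only_rcp(org_id: str, actions: list[str]) -> dict:
    """Build an RCP that denies the given actions to principals outside org_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPrincipalsOutsideOrg",
                "Effect": "Deny",
                "Principal": "*",  # RCPs are written against every caller
                "Action": sorted(actions),
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's organization is not ours...
                    "StringNotEqualsIfExists": {"aws:PrincipalOrgID": org_id},
                    # ...unless the caller is an AWS service principal.
                    "BoolIfExists": {"aws:PrincipalIsAWSService": "false"},
                },
            }
        ],
    }


# Placeholder organization ID and an illustrative set of services.
rcp = build_org_only_rcp("o-exampleorgid", ["s3:*", "sqs:*", "kms:*"])
print(json.dumps(rcp, indent=2))
```

Like any other policy in this guide, a document like this should go through the audit and staged rollout described below before it is enforced.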

Why Use Them?

SCPs and RCPs provide preventive controls that work even when IAM policies are misconfigured:

  • Prevent region sprawl - Restrict workloads to approved regions only
  • Enforce encryption - Block creation of unencrypted resources
  • Protect critical resources - Prevent deletion of CloudTrail, GuardDuty, etc.
  • Limit service usage - Block unapproved or expensive services
  • Compliance enforcement - Ensure organizational policies are technically enforced

How Often Do SCPs/RCPs Break Things?

Frequently, if deployed without proper preparation. The most common culprits include:

  • Denying actions that AWS services need internally - Blocking iam:PassRole can break Lambda, ECS, and many other services
  • Region restrictions that block global services - IAM, Route 53, CloudFront, and certain S3 operations are served from us-east-1 regardless of where you call them
  • Overly restrictive service denials - Teams often don't realize the dependencies their workloads have
  • Time-of-day or condition-based policies - These catch automation running outside expected windows

Important

Region-restriction SCPs must exempt global services. IAM, STS (for global endpoints), Route 53, CloudFront, WAF Global, and some S3 operations always execute in us-east-1, even when called from other regions.
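
One common way to express that exemption is a Deny statement that uses NotAction, so the region condition never applies to global services. The sketch below generates such a policy; the exempt list is illustrative and certainly incomplete, and the Sid and function name are my own.

```python
import json

APPROVED_REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]

# Illustrative only: extend this with every global service your teams use.
GLOBAL_SERVICE_ACTIONS = ["iam:*", "sts:*", "route53:*", "cloudfront:*", "waf:*"]


def build_region_scp(approved_regions: list[str], exempt_actions: list[str]) -> dict:
    """Deny everything except exempt_actions outside the approved regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # NotAction: the deny applies to every action NOT listed here.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    }
                },
            }
        ],
    }


region_scp = build_region_scp(APPROVED_REGIONS, GLOBAL_SERVICE_ACTIONS)
print(json.dumps(region_scp, indent=2))
```

Generating the policy from two lists keeps the approved regions and the exemptions reviewable on their own, which helps when the audit step turns up another global service you forgot.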

The good news? With proper preparation and staged rollout, you can deploy SCPs safely. Here's how.

Recommended Deployment Procedure

Follow this six-step process for safe SCP/RCP deployment:

| Step | Activity |
|------|----------|
| 1 | Identify SCP/RCP - Define the control and its intended scope |
| 2 | Audit for non-compliance - Query existing activity to understand impact |
| 3 | Notify stakeholders - Communicate findings and timelines to application owners |
| 4 | Remediation window + Audit mode - Allow time for teams to fix issues while monitoring |
| 5 | Staged rollout - Deploy incrementally across OU hierarchy |
| 6 | Production enforcement - Full enforcement with ongoing monitoring |

Step 4: Remediation Window + Audit Mode

AWS doesn't have a native "dry-run" mode for SCPs/RCPs, so you need to build this using CloudTrail, Athena, and EventBridge.

Pre-Implementation Audit with Athena

Before deploying an SCP, query CloudTrail to see what would be denied. This example shows how to audit for a region-restriction SCP:

SELECT
    useridentity.arn,
    eventsource,
    eventname,
    awsregion,
    sourceipaddress,
    eventtime,
    recipientaccountid
FROM cloudtrail_logs
WHERE awsregion NOT IN ('us-east-1', 'us-west-2', 'eu-west-1')
  -- eventtime is stored as an ISO 8601 string, so parse it before comparing
  AND from_iso8601_timestamp(eventtime) > date_add('day', -30, now())
  AND errorcode IS NULL  -- successful calls only
ORDER BY eventtime DESC

This query returns the successful calls made outside the approved regions, which approximates what a region-restriction SCP would block (before any exemptions you add to the policy). Use it to identify impacted teams and workloads before enforcement.
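
If you audit several candidate region lists, it can help to generate the query rather than edit it by hand. The helper below is a sketch under a few assumptions: the table is called cloudtrail_logs as above, eventtime is an ISO 8601 string, and the function name and the per-principal aggregation are my own additions.

```python
import re

_REGION_RE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")
_TABLE_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_region_audit_query(approved_regions, lookback_days=30,
                             table="cloudtrail_logs"):
    """Return an Athena query counting successful calls outside approved regions."""
    if not approved_regions:
        raise ValueError("at least one approved region is required")
    # Validate inputs because they are interpolated into the SQL text.
    for region in approved_regions:
        if not _REGION_RE.match(region):
            raise ValueError(f"not a region name: {region!r}")
    if not _TABLE_RE.match(table):
        raise ValueError(f"not a table name: {table!r}")
    if not isinstance(lookback_days, int) or lookback_days <= 0:
        raise ValueError("lookback_days must be a positive integer")

    region_list = ", ".join(f"'{r}'" for r in approved_regions)
    return (
        "SELECT useridentity.arn, eventsource, eventname, awsregion,\n"
        "       count(*) AS calls\n"
        f"FROM {table}\n"
        f"WHERE awsregion NOT IN ({region_list})\n"
        "  AND from_iso8601_timestamp(eventtime) > "
        f"date_add('day', -{lookback_days}, now())\n"
        "  AND errorcode IS NULL\n"
        "GROUP BY useridentity.arn, eventsource, eventname, awsregion\n"
        "ORDER BY calls DESC"
    )


print(build_region_audit_query(["us-east-1", "us-west-2", "eu-west-1"]))
```

Grouping by principal and action gives one row per workload pattern instead of one row per call, which is usually easier to hand to application owners.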

Pro Tip

Run these Athena queries against at least 30 days of CloudTrail data to catch monthly processes, scheduled jobs, and infrequent operations that might otherwise be missed.

Ongoing Monitoring with EventBridge

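Create an EventBridge rule that matches the same pattern your SCP would deny, but sends alerts instead of blocking. EventBridge rules are regional and receive CloudTrail events for calls made in their own region, so the rule has to exist in each unapproved region you want to watch (or those regions have to forward events to a central bus). The pattern below covers EC2 only; add other sources as needed: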

{
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "awsRegion": [{
      "anything-but": ["us-east-1", "us-west-2", "eu-west-1"]
    }]
  }
}

Route this to SNS, Slack, or a ticketing system to notify teams of would-be violations during the remediation window.
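
Before wiring the rule to a notification target, it is worth checking that the pattern matches what you expect. The sketch below is a small local matcher that supports only the features used above (exact values, nested objects, and anything-but with strings); it is not EventBridge's real matching engine, and the sample events are made up.

```python
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "awsRegion": [{"anything-but": ["us-east-1", "us-west-2", "eu-west-1"]}]
    },
}


def _matches_value(rule, actual) -> bool:
    """Match one pattern element: a literal or an anything-but object."""
    if isinstance(rule, dict) and "anything-but" in rule:
        excluded = rule["anything-but"]
        if not isinstance(excluded, list):
            excluded = [excluded]
        return actual not in excluded
    return rule == actual


def matches_pattern(pattern: dict, event: dict) -> bool:
    """True if every field in the pattern is present in the event and matches."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the corresponding part of the event.
            if not isinstance(actual, dict) or not matches_pattern(expected, actual):
                return False
        elif not any(_matches_value(rule, actual) for rule in expected):
            return False
    return True


def sample_event(region: str, source: str = "aws.ec2") -> dict:
    """Build a minimal, made-up CloudTrail-via-EventBridge event."""
    return {
        "source": source,
        "detail-type": "AWS API Call via CloudTrail",
        "detail": {"awsRegion": region, "eventName": "RunInstances"},
    }


print(matches_pattern(PATTERN, sample_event("ap-south-1")))
print(matches_pattern(PATTERN, sample_event("us-east-1")))
```

The first event is in an unapproved region and matches; the second is in an approved region and does not.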

Post-Deployment Monitoring for AccessDenied

After deploying an SCP, monitor for breakage with this Athena query:

SELECT
    useridentity.arn,
    eventsource,
    eventname,
    errormessage,
    recipientaccountid,
    eventtime
FROM cloudtrail_logs
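WHERE errorcode IN ('AccessDenied', 'AccessDeniedException', 'Client.UnauthorizedOperation')
  -- denial messages that name the policy type say "service control policy"
  AND errormessage LIKE '%service control policy%'
  AND from_iso8601_timestamp(eventtime) > date_add('hour', -24, now())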
ORDER BY eventtime DESC

Alternatively, create a CloudWatch Logs metric filter for real-time alerting:

{ ($.errorCode = "AccessDenied*") && ($.errorMessage = "*service control policy*") }
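
If you post-process CloudTrail records yourself (for example in a Lambda function behind the alert), the same check can be written in a few lines. This sketch assumes the denial message contains the phrase "service control policy" or "resource control policy", which is what AWS's detailed access-denied messages use; not every service includes it, so treat the "other" bucket as worth a look too. The function name and sample records are mine.

```python
from typing import Optional

DENIAL_CODES = ("AccessDenied", "AccessDeniedException", "Client.UnauthorizedOperation")

# Phrase in the error message -> short label (assumed message wording).
POLICY_MARKERS = {
    "service control policy": "SCP",
    "resource control policy": "RCP",
}


def classify_denial(record: dict) -> Optional[str]:
    """Return 'SCP', 'RCP', or 'other' for denied calls, None for everything else."""
    code = record.get("errorCode") or ""
    if code not in DENIAL_CODES:
        return None
    message = (record.get("errorMessage") or "").lower()
    for marker, label in POLICY_MARKERS.items():
        if marker in message:
            return label
    return "other"


# Made-up records for illustration.
records = [
    {"errorCode": "AccessDenied",
     "errorMessage": "User is not authorized to perform: ec2:RunInstances "
                     "with an explicit deny in a service control policy"},
    {"errorCode": "AccessDenied", "errorMessage": "Access Denied"},
    {"eventName": "DescribeInstances"},
]
for record in records:
    print(classify_denial(record))
```

Tracking the ratio of SCP-labelled denials to "other" denials over time makes it easier to tell a policy problem from ordinary IAM misconfiguration.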

Step 5: Staged Rollout

Deploy SCPs incrementally across your OU hierarchy to limit blast radius and catch issues early.

Recommended OU Structure for Staged Rollout

Root
├── Security OU
├── Sandbox OU              ← Stage 1: Deploy here first
├── Development OU          ← Stage 2
├── Non-Production OU       ← Stage 3
└── Production OU           ← Stage 4: Deploy last

Staged Rollout Process

| Stage | Target | Duration | Success Criteria |
|-------|--------|----------|------------------|
| 1 | Sandbox/Test OU | 1 week | No unexpected AccessDenied errors, automated tests pass |
| 2 | Development OU | 1 week | Developer workflows unimpacted, CI/CD pipelines succeed |
| 3 | Non-Production OU | 1-2 weeks | Integration tests pass, no incidents reported |
| 4 | Production OU | Ongoing | Full enforcement with continued monitoring |

At Each Stage

  1. Apply the SCP to the target OU
  2. Monitor CloudTrail for AccessDenied errors using the queries above
  3. Run automated tests against typical workflows
  4. Collect feedback from application teams
  5. Document any exceptions granted
  6. Proceed to next stage only after success criteria are met
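
The gate in the last step can be made explicit so that promotion is a checklist rather than a judgment call. The sketch below is one way to encode it; the StageReport fields and the minimum observation periods (the lower bounds from the table above) are my assumptions about how you might record each stage.

```python
from dataclasses import dataclass

# Minimum observation period per stage, in days.
MIN_DAYS = {"Sandbox": 7, "Development": 7, "Non-Production": 7}


@dataclass
class StageReport:
    stage: str
    days_observed: int
    unexpected_denials: int
    tests_passed: bool
    open_incidents: int = 0


def promotion_blockers(report: StageReport) -> list[str]:
    """List the reasons this stage cannot be promoted; empty means go ahead."""
    blockers = []
    required = MIN_DAYS.get(report.stage)
    if required is None:
        blockers.append(f"unknown stage: {report.stage}")
    elif report.days_observed < required:
        blockers.append(f"observed {report.days_observed} of {required} days")
    if report.unexpected_denials > 0:
        blockers.append(f"{report.unexpected_denials} unexpected AccessDenied errors")
    if not report.tests_passed:
        blockers.append("automated tests failing")
    if report.open_incidents > 0:
        blockers.append(f"{report.open_incidents} open incidents")
    return blockers


print(promotion_blockers(StageReport("Sandbox", 7, 0, True)))
print(promotion_blockers(StageReport("Development", 3, 2, True)))
```

Returning the list of blockers instead of a bare yes/no gives you something to paste into the rollout ticket when a stage is held back.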

Exception Process

Establish a clear exception process before rolling out controls:

  1. Request - Team submits exception request with business justification
  2. Review - Security team evaluates risk and alternatives
  3. Approval - Document approval with expiration date
  4. Implementation - Apply exception via targeted SCP or OU placement
  5. Review cycle - Revisit exceptions quarterly

Having this process documented before you start rolling out SCPs will save you a lot of ad-hoc decision making during deployment.
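
Recording exceptions as structured data makes the expiration date and the quarterly review easy to enforce. The sketch below is one possible shape; the field names, the 90-day reading of "quarterly", and the sample values are all assumptions of mine.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)  # "quarterly", approximated as 90 days


@dataclass
class PolicyException:
    account_id: str
    policy_name: str
    justification: str
    approved_on: date
    expires_on: date
    last_reviewed: date

    def is_active(self, today: date) -> bool:
        """Active from the approval date up to, but not including, expiry."""
        return self.approved_on <= today < self.expires_on

    def review_due(self, today: date) -> bool:
        """Active exceptions need a review once the interval has elapsed."""
        return self.is_active(today) and today - self.last_reviewed >= REVIEW_INTERVAL


# Hypothetical exception for illustration.
exc = PolicyException(
    account_id="111111111111",
    policy_name="DenyUnapprovedRegions",
    justification="Latency-sensitive workload needs ap-south-1",
    approved_on=date(2025, 1, 1),
    expires_on=date(2025, 7, 1),
    last_reviewed=date(2025, 1, 1),
)
print(exc.is_active(date(2025, 3, 1)), exc.review_due(date(2025, 3, 1)))
print(exc.is_active(date(2025, 4, 15)), exc.review_due(date(2025, 4, 15)))
```

A scheduled job that lists every exception with review_due or an expiry in the next few weeks keeps the exception list from quietly becoming permanent.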

Summary Timeline

| Phase | Activity | Duration |
|-------|----------|----------|
| Discovery | Athena queries on CloudTrail history | 1-2 weeks of data analysis |
| Notification | Communicate findings to impacted teams | 1 week |
| Remediation | Teams fix non-compliant resources | 2-4 weeks |
| Audit mode | EventBridge alerting on would-be violations | 1-2 weeks |
| Staged rollout | Sandbox → Dev → Non-Prod → Prod | 4-6 weeks |
| Full enforcement | Production deployment with monitoring | Ongoing |

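Total timeline: roughly 8-12 weeks for a typical SCP deployment, depending on complexity and organizational size. The phases are not strictly sequential: audit mode runs during the remediation window (Step 4).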

Troubleshooting

  • SCP blocking legitimate workloads - Check CloudTrail for the specific denied action. Add conditions to exclude the legitimate use case, or move the account to an OU with a more permissive SCP.
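  • Global services blocked by region SCP - An explicit Deny in an SCP can't be overridden by an Allow, so exempt global services inside the Deny statement itself: list them under NotAction (iam:*, sts:*, route53:*, cloudfront:*, etc.) so the aws:RequestedRegion condition never applies to them.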
  • CI/CD pipelines failing after SCP - Review the pipeline's IAM role permissions. The SCP may be blocking actions the pipeline needs but humans don't use directly.
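  • Unable to identify SCP causing denial - For services that return detailed access-denied messages, the CloudTrail errormessage typically says the request failed "with an explicit deny in a service control policy". Use the AWS Organizations console to review the SCPs attached to the account and to every OU on its path to the root.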
  • Exception request backlog - Consider a tiered approach: auto-approve exceptions for sandbox/dev, require review for non-prod, require multiple approvers for prod.
  • SCP not taking effect - Verify the SCP is attached to the correct OU. Check that the account is actually in that OU. Remember SCPs don't affect the management account.
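
Several of these problems come down to knowing which policies sit on the path between an account and the root. The sketch below walks that path over plain dictionaries; the account ID, OU names, and policy names are hypothetical, and in a real organization you would populate the two mappings from the Organizations API instead.

```python
def policies_in_effect(target: str, parents: dict, attachments: dict) -> list:
    """Collect (node, policy) pairs from the target up to the root."""
    found = []
    seen = set()
    node = target
    while node is not None and node not in seen:
        seen.add(node)  # guards against a malformed, cyclic parent map
        found.extend((node, policy) for policy in attachments.get(node, []))
        node = parents.get(node)  # None once we step past the root
    return found


# Hypothetical hierarchy: account -> Production OU -> root.
parents = {"111111111111": "ou-prod", "ou-prod": "r-root"}
attachments = {
    "r-root": ["FullAWSAccess"],
    "ou-prod": ["DenyUnapprovedRegions"],
    "ou-dev": ["DenyExpensiveServices"],  # not on this account's path
}

for node, policy in policies_in_effect("111111111111", parents, attachments):
    print(f"{policy} (attached at {node})")
```

If the policy you expected is missing from the output, the account is in a different OU than you thought or the policy is attached somewhere off its path; if a policy you did not expect appears, you have found your candidate for the denial.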

Conclusion

SCPs and RCPs are essential tools for enforcing security guardrails across your AWS organization. But their power is a double-edged sword - deploy them carelessly and you'll break production; deploy them thoughtfully and you'll have robust, organization-wide controls that even the most permissive IAM policy can't bypass.

The key takeaways:

  • Never deploy blind - Always audit CloudTrail first to understand impact
  • Build your own dry-run - Use EventBridge alerts to simulate enforcement before enabling
  • Stage the rollout - Sandbox first, production last, with clear success criteria at each stage
  • Have an exception process ready - You'll need it, and having it documented prevents chaos
  • Monitor continuously - Watch for AccessDenied errors both during and after rollout

Yes, 8-12 weeks feels like a long time for "just a policy." But compare that to the alternative: an emergency rollback at 2 AM because your region-restriction SCP blocked CloudFront invalidations in us-east-1. Take the time to do it right.