advanced | security | reliability | incident-response | ~13 min | 4 rounds

Secrets Rotation Took Down Prod at 3 AM. Pin Credentials?

Automated rotation revoked a database password while pools and caches still held it, locking prod out at 3 AM. A teammate wants to pin credentials for a year. Defend fixing rotation instead.

the decision you defend

Your new automated secrets rotation changed the database password at 3 AM. Connection pools and config caches across a dozen services kept using the old credential, authentication failed platform-wide, and prod was down for 90 minutes. A teammate says rotation is clearly too risky: pin the credentials and rotate manually once a year. What do you do, and why?


the situation

At 03:04 the new automated secrets-rotation job rotated the production database password: it set a new password on the database user and updated the secret in the secrets manager. Within a minute, every service that talked to the database began failing authentication, because their connection pools had been built with the old password at startup and several services also cache config for 15 minutes. Prod was down for 90 minutes while on-call manually reset the password and restarted services.
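The failure mode above can be reproduced in a few lines. This is a minimal sketch, not the incident's actual code: `FakeDatabase`, `FakeSecretsManager`, `StartupPinnedClient`, and `RefreshingClient` are hypothetical names invented for illustration, and the "re-read the secret on auth failure" client is only one possible direction a defense might take, not a prescribed answer.

```python
class AuthError(Exception):
    """Raised when the simulated database rejects a password."""


class FakeDatabase:
    """Hypothetical stand-in for the database: accepts one current password."""

    def __init__(self, password):
        self.password = password

    def connect(self, password):
        if password != self.password:
            raise AuthError("password rejected")
        return "connection"


class FakeSecretsManager:
    """Hypothetical stand-in for the secrets manager: holds the latest value."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


def rotate(db, secrets, new_password):
    """Mimic the 03:04 job: change the DB password, then update the secret."""
    db.password = new_password
    secrets.value = new_password


class StartupPinnedClient:
    """Reads the secret once at startup, like the pools in the incident."""

    def __init__(self, db, secrets):
        self.db = db
        self.password = secrets.get()

    def query(self):
        return self.db.connect(self.password)


class RefreshingClient:
    """Assumed alternative: re-read the secret and retry once on auth failure."""

    def __init__(self, db, secrets):
        self.db = db
        self.secrets = secrets
        self.password = secrets.get()

    def query(self):
        try:
            return self.db.connect(self.password)
        except AuthError:
            self.password = self.secrets.get()
            return self.db.connect(self.password)


db = FakeDatabase("old-pw")
secrets = FakeSecretsManager("old-pw")
pinned = StartupPinnedClient(db, secrets)
refreshing = RefreshingClient(db, secrets)

rotate(db, secrets, "new-pw")

try:
    pinned.query()
    pinned_ok = True
except AuthError:
    pinned_ok = False

refreshing_ok = refreshing.query() == "connection"
print(pinned_ok, refreshing_ok)  # prints: False True
```

The pinned client fails as soon as the old password stops working, which matches the outage described. The refreshing client recovers only because the secret had already been updated by the time it retried; a real design would also need to account for the 15-minute config caches.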

context

Rotation was rolled out fleet-wide on its first production run, overnight, with no auth-failure alerting tied to it. In the postmortem a teammate argues: "This is the second-worst outage of the year and it was entirely self-inflicted. Rotation is too risky for the database credential. Pin it, put a calendar reminder to rotate manually once a year, and move on." Security had originally pushed for rotation after an audit finding. You own the recommendation.

How this challenge works

Take a position on the decision above and defend it. A senior-engineer AI will push back over up to 4 rounds. When you are done, you are scored against a verified rubric so you can see exactly what a complete answer covers - these are learning prompts, not gotchas.