CI/CD Best Practices
A CI/CD pipeline is only as good as its design. These practices apply regardless of whether you use GitHub Actions, Jenkins, GitLab CI, or Azure Pipelines.
Pipeline Design Principles
Fast Feedback First
Order stages from fastest to slowest:
Lint (30s) → Unit tests (2min) → Build (3min) → Integration tests (10min) → E2E (20min)
Fail fast — don’t run expensive E2E tests if unit tests already failed.
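As a rough sketch of the ordering above, the following Python runner executes stages cheapest-first and stops at the first failure. The `run_pipeline` helper and the stage names are hypothetical; real CI systems usually express the same idea through job dependencies rather than a script.

```python
from typing import Callable, List, Tuple

# Each stage is (name, check); check returns True on success.
# Names and checks are illustrative placeholders.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that were actually executed.
    """
    executed = []
    for name, check in stages:
        executed.append(name)
        if not check():
            break  # fail fast: skip the slower stages that follow
    return executed


if __name__ == "__main__":
    demo = [
        ("lint", lambda: True),
        ("unit", lambda: False),  # simulated failure
        ("build", lambda: True),
        ("e2e", lambda: True),
    ]
    print(run_pipeline(demo))
```

Because the unit stage fails in the demo, the build and E2E stages never start.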
Keep Pipelines Deterministic
- Pin dependency versions (package-lock.json, Pipfile.lock)
- Use fixed Docker image tags
- Avoid time-dependent tests without mocking
- Same commit → same result, every time
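One way to enforce the fixed-tag rule above is a small check that rejects floating image references. This is a minimal sketch: `is_pinned` is a hypothetical helper, and it treats any explicit tag other than `latest` as pinned, even though only a digest reference is truly immutable.

```python
def is_pinned(image: str) -> bool:
    """Return True if a Docker image reference has an explicit tag or digest."""
    if "@sha256:" in image:
        return True  # digest references are immutable
    # A tag, if present, follows the last ':' in the final path segment,
    # so a registry port such as localhost:5000/app is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: Docker resolves this to 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

A lint step could run this over every image named in the Dockerfiles and pipeline definitions and fail the build on the first unpinned reference.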
Pipeline as Code
Store pipeline definitions in Git alongside application code:
my-app/
├── src/
├── .github/workflows/ci.yml # or Jenkinsfile
└── docker-compose.test.yml
Review pipeline changes in Pull Requests like any other code.
Testing Gates
Define the minimum quality bar a change must meet before merge:
| Gate | Typical Threshold |
|---|---|
| Unit test pass rate | 100% |
| Code coverage | 80%+ (critical paths) |
| Lint errors | 0 |
| Security scan | No critical CVEs |
| Build success | Required |
# GitHub Actions — coverage gate
- run: pytest --cov=src --cov-fail-under=80
# Block merge via branch protection + required status checks
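The thresholds in the table above can also be evaluated together in one place. The sketch below assumes the pipeline has already collected the metrics into a dictionary; the key names and the `failed_gates` helper are hypothetical.

```python
from typing import Dict, List


def failed_gates(metrics: Dict[str, float]) -> List[str]:
    """Return the names of the quality gates the given metrics do not satisfy."""
    failures = []
    if metrics["unit_pass_rate"] < 100.0:
        failures.append("unit tests")
    if metrics["coverage"] < 80.0:
        failures.append("coverage")
    if metrics["lint_errors"] > 0:
        failures.append("lint")
    if metrics["critical_cves"] > 0:
        failures.append("security scan")
    if not metrics["build_ok"]:
        failures.append("build")
    return failures
```

An empty result means the change may merge; anything else should fail the required status check.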
Test Pyramid in CI
┌─────────┐
│ E2E │ Run on staging deploy
├─────────┤
│ Integr. │ Run in CI with test DB
├─────────┤
│ Unit │ Run on every push
└─────────┘
Branch and Environment Strategy
feature/* → PR → CI → merge to main → deploy staging
                                    → approve → deploy production
release/* → CI → deploy pre-prod → deploy production
tag v* → CI → deploy production (immutable release)
| Environment | Purpose | Deploy trigger |
|---|---|---|
| Development | Developer testing | Manual / feature branch |
| Staging | Pre-production mirror | Auto on merge to main |
| Production | Live users | Manual approval or tagged release |
Supply environment-specific configuration through secrets or environment variables; never hardcode production URLs in code.
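The branch flows and the table above can be read as a mapping from a Git ref to the first environment a change reaches automatically. A minimal sketch, assuming GitHub-style full ref names; `target_environment` is a hypothetical helper, not part of any CI product.

```python
import re
from typing import Optional


def target_environment(ref: str) -> Optional[str]:
    """Map a Git ref to the environment it deploys to automatically.

    Returns None when the ref should only run CI without deploying.
    """
    if ref == "refs/heads/main":
        return "staging"  # production follows after manual approval
    if ref.startswith("refs/heads/release/"):
        return "pre-prod"
    if re.fullmatch(r"refs/tags/v.+", ref):
        return "production"  # tagged, immutable release
    return None  # feature branches: CI only, deploys are manual
```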
Artifact Management
Build once, deploy many:
# Build job: produces the artifact
- run: npm run build
- uses: actions/upload-artifact@v4
  with:
    name: dist
    path: dist/
# Deploy job: downloads and ships the same artifact
- uses: actions/download-artifact@v4
  with:
    name: dist
    path: dist/
- run: ./deploy.sh
For Docker:
# Build and tag once
docker build -t myapp:${{ github.sha }} .
docker push myapp:${{ github.sha }}
# Deploy by tag — same image to staging and production
docker pull myapp:${{ github.sha }}
Never rebuild for production — the artifact tested in staging must be identical in production.
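A content digest is one way to check that the artifact promoted to production really is the one tested in staging. The sketch below assumes the pipeline recorded the SHA-256 digest at build time; `artifact_digest` and `can_promote` are hypothetical helper names.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies an artifact's content."""
    return hashlib.sha256(data).hexdigest()


def can_promote(tested_digest: str, candidate: bytes) -> bool:
    """Allow promotion only if the candidate is byte-identical to what was tested."""
    return artifact_digest(candidate) == tested_digest
```

Container registries apply the same idea: deploying by image digest rather than by tag guarantees that the bytes have not changed between environments.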
Deployment Strategies
Rolling Deployment
Replace instances gradually — default for most platforms.
[v1] [v1] [v1] → [v2] [v1] [v1] → [v2] [v2] [v1] → [v2] [v2] [v2]
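The progression above can be simulated in a few lines. This is only an illustration of the batching logic, with a hypothetical `rolling_update` generator; a real orchestrator would also wait for health checks before moving to the next batch.

```python
from typing import Iterator, List


def rolling_update(instances: List[str], new_version: str,
                   batch_size: int = 1) -> Iterator[List[str]]:
    """Yield the fleet state after each batch of instances is replaced."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    fleet = list(instances)  # copy so the caller's list is not mutated
    for start in range(0, len(fleet), batch_size):
        for i in range(start, min(start + batch_size, len(fleet))):
            fleet[i] = new_version
        yield list(fleet)


if __name__ == "__main__":
    for state in rolling_update(["v1", "v1", "v1"], "v2"):
        print(state)
```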
Blue-Green
Two identical environments — switch traffic instantly:
Before: Traffic → Blue (v1)     After: Traffic → Green (v2)
        Green (idle)                   Blue (idle, rollback ready)
Rollback: switch traffic back to Blue in seconds.
Canary
Route a small percentage of traffic to the new version:
95% → v1
5% → v2 (monitor error rate, latency)
Increase the canary percentage while metrics stay healthy; roll back if they do not.
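The promote-or-roll-back decision can be written as a comparison between canary and baseline metrics. A minimal sketch: the `canary_decision` helper and its default thresholds (one percentage point of extra errors, 20% extra p99 latency) are assumptions chosen for illustration, not recommended values.

```python
def canary_decision(baseline_error_rate: float, canary_error_rate: float,
                    baseline_p99_ms: float, canary_p99_ms: float,
                    max_error_delta: float = 0.01,
                    max_latency_ratio: float = 1.2) -> str:
    """Return 'promote' if the canary looks healthy, otherwise 'rollback'."""
    # Error rates are fractions in [0, 1]; compare the absolute increase.
    if canary_error_rate - baseline_error_rate > max_error_delta:
        return "rollback"
    # Latency is compared relative to the baseline p99.
    if canary_p99_ms > baseline_p99_ms * max_latency_ratio:
        return "rollback"
    return "promote"
```

On "promote", the rollout controller would raise the canary share (for example 5% → 25% → 100%) and evaluate again at each step.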
Feature Flags
Deploy code without exposing features:
if (featureFlags.isEnabled('new-checkout', user)) {
return newCheckoutFlow();
}
return legacyCheckout();
Decouple deployment from release — ship dark, enable gradually.
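Enabling a flag gradually usually means giving each user a stable bucket, so the same user sees the same behaviour on every request. The sketch below shows one common approach, hashing the flag name and user ID; `is_enabled` is a hypothetical function and does not mirror the API of any particular feature-flag service.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministically enable a flag for roughly rollout_percent of users."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < rollout_percent
```

Because the bucket depends only on the flag and the user, raising the percentage adds users without removing anyone who already had the feature.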
Rollback Strategy
Every deployment needs a rollback plan:
# Kubernetes
kubectl rollout undo deployment/my-api
# Docker / ECS
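# Replace <previous-revision> with the revision number of the last known-good task definition
aws ecs update-service --cluster my-cluster --service my-api --task-definition my-api:<previous-revision>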
# Git revert
git revert HEAD
git push origin main # triggers a new pipeline run that deploys the reverted code
Requirements:
- Previous Docker image tags retained in registry
- Database migrations are backward-compatible or reversible
- Rollback tested in staging quarterly
Security in CI/CD
- Least privilege — CI credentials scoped to minimum permissions
- Secret scanning — detect committed API keys (GitHub Secret Scanning, git-secrets)
- Dependency scanning — Dependabot, Snyk, npm audit
- SAST — static analysis for code vulnerabilities
- Signed commits and artifacts — verify integrity
- Separate production credentials — staging CI cannot access production
# OIDC: no long-lived AWS keys in secrets
# (the job also needs "permissions: id-token: write" to request the OIDC token)
- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
    aws-region: us-east-1
Monitoring Deployments
Post-deploy checks (automated smoke tests):
- name: Smoke test
  run: |
    for i in {1..10}; do
      STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://staging.myapp.com/health)
      if [ "$STATUS" = "200" ]; then exit 0; fi
      sleep 10
    done
    exit 1
Monitor after production deploy:
- Error rate (should not spike)
- Latency p99 (should not degrade)
- Business metrics (checkout completion, signups)
Trigger an automatic rollback if an SLO is breached within 15 minutes of the deploy.
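The 15-minute rule can be expressed as a check over metric samples taken after the deploy. A minimal sketch, assuming the monitoring system supplies `(minutes_since_deploy, error_rate)` pairs; the `should_rollback` helper and the sample format are hypothetical.

```python
from typing import Iterable, Tuple


def should_rollback(samples: Iterable[Tuple[float, float]],
                    error_rate_slo: float,
                    window_minutes: float = 15.0) -> bool:
    """Return True if any sample inside the post-deploy window breaches the SLO.

    Each sample is (minutes_since_deploy, error_rate).
    """
    return any(
        0 <= minutes <= window_minutes and error_rate > error_rate_slo
        for minutes, error_rate in samples
    )
```

Breaches outside the window are ignored here on the assumption that they are handled by normal alerting rather than attributed to the deploy.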
DORA Metrics
Track DevOps performance:
| Metric | Elite-performer benchmark | How to improve |
|---|---|---|
| Deployment frequency | Multiple/day | Smaller batches, automation |
| Lead time for changes | < 1 hour | CI speed, trunk-based dev |
| Change failure rate | 0–15% | Testing gates, canary deploys |
| Mean time to restore (MTTR) | < 1 hour | Rollback automation, observability |
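All four metrics can be derived from a log of deployments. The sketch below assumes each record carries the commit time, the deploy time, a failure flag, and a restore time; the `Deployment` record and `dora_summary` function are hypothetical, and teams define "failure" and "restored" in their own way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is fixed


def dora_summary(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at
                     for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has been restored in the period
        "mean_time_to_restore": (sum(restore_times, timedelta()) / len(restore_times)
                                 if restore_times else None),
    }
```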
Anti-Patterns to Avoid
| Anti-pattern | Fix |
|---|---|
| Manual deploy steps | Automate everything |
| Long-lived feature branches | Merge daily, use feature flags |
| No tests in CI | Block merge without passing CI |
| Shared staging environment conflicts | Ephemeral preview environments per PR |
| “Works in CI” but not production | Staging mirrors production |
| Deploying on Fridays | Deploy when team is available to monitor |
Complete Pipeline Checklist
- Lint and unit tests on every PR
- Integration tests with real dependencies (Docker Compose)
- Build artifact or Docker image once
- Security scanning (dependencies + container)
- Auto-deploy to staging on merge
- Smoke tests post-deploy
- Manual or automated gate for production
- Rollback procedure documented and tested
- Monitoring alerts configured
- Pipeline runs in < 15 minutes for CI feedback
Mastering CI/CD completes the DevOps loop: Git for version control, Docker for packaging, CI/CD for reliable delivery.