Linux Automation (Ansible/cron)
Why Automate?
Manual SSH sessions do not scale. Automation ensures repeatable, auditable, and idempotent changes across development, staging, and production — reducing drift and human error.
| Tool | Best for |
|---|---|
| Shell scripts | Single-host tasks, glue logic |
| cron / systemd timers | Scheduled recurring jobs |
| Ansible | Multi-host config management, provisioning |
| Terraform | Infrastructure provisioning (cloud resources) |
| CI/CD (GitHub Actions, GitLab CI) | Build, test, deploy pipelines |
cron — Classic Job Scheduler
# Edit user crontab
crontab -e
# System-wide
sudo ls /etc/cron.d/
sudo cat /etc/crontab
ls /etc/cron.{hourly,daily,weekly,monthly}/
# List current crontab
crontab -l
sudo crontab -l -u www-data
Crontab format (5 fields + command):
# min hour dom month dow command
0 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
*/5 * * * * /usr/local/bin/health-check.sh
0 0 * * 0 /usr/local/bin/weekly-report.sh
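To make the field order concrete, here is a minimal sketch that labels the five schedule fields of an entry. The function name `describe_cron` is hypothetical and used only for illustration; it splits on whitespace and does not validate ranges or special strings such as `@daily`.

```shell
# Hypothetical helper: label the five schedule fields of a crontab entry.
describe_cron() (
  set -f          # disable globbing so a bare * is not expanded to file names
  set -- $1       # split the entry on whitespace into positional parameters
  if [ "$#" -lt 6 ]; then
    echo "expected 5 schedule fields plus a command" >&2
    exit 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
  shift 5         # whatever remains is the command
  printf 'command=%s\n' "$*"
)
```

Calling `describe_cron '*/5 * * * * /usr/local/bin/health-check.sh'` prints the schedule fields on one line and the command on the next.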
Environment variables at top of crontab:
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=[email protected]
cron Best Practices
# Use absolute paths — cron has minimal PATH
0 3 * * * /usr/local/bin/backup.sh
# Lock file prevents overlapping runs
0 3 * * * flock -n /tmp/backup.lock /usr/local/bin/backup.sh
# Log stdout and stderr
0 3 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
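The locking practice above can also live inside a wrapper instead of the crontab line. The sketch below assumes `flock` from util-linux is installed; the function name `run_locked` and the exit status 75 are arbitrary choices for illustration, not conventions cron relies on.

```shell
# Hypothetical wrapper: run a command under a non-blocking flock(1) lock.
# Usage: run_locked LOCK_FILE COMMAND [ARGS...]
run_locked() (
  lock_file=$1
  shift
  command -v flock >/dev/null 2>&1 || { echo "flock not found" >&2; exit 127; }
  # Open (or create) the lock file on file descriptor 9 for this subshell.
  exec 9>"$lock_file"
  # -n: fail immediately instead of waiting if another run holds the lock.
  flock -n 9 || { echo "lock busy: $lock_file" >&2; exit 75; }
  "$@"
)
```

The lock is released when the subshell exits and descriptor 9 is closed, so no cleanup step is needed. A crontab entry could then call a script that wraps its work in `run_locked /tmp/backup.lock ...`.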
systemd Timers (Modern cron)
Prefer timers when you need journal logging, dependencies, and missed-run catch-up:
# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup
[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
User=backup
# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup daily at 03:00
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
RandomizedDelaySec=300
[Install]
WantedBy=timers.target
sudo systemctl daemon-reload
sudo systemctl enable --now backup.timer
systemctl list-timers --all
journalctl -u backup.service
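When migrating an existing cron entry to a timer, the schedule has to be rewritten in OnCalendar syntax. The sketch below handles only the simplest case (a fixed minute and hour, every day) and rejects everything else; `cron_to_oncalendar` is a hypothetical name, not a systemd tool.

```shell
# Hypothetical converter for daily cron schedules only:
# numeric minute and hour, with * in the other three fields.
cron_to_oncalendar() (
  set -f          # keep * from expanding to file names
  set -- $1       # split the schedule into its fields
  [ "$#" -eq 5 ] || { echo "expected exactly 5 schedule fields" >&2; exit 1; }
  case "$1" in
    [0-9]|[1-5][0-9]) ;;
    *) echo "minute must be a number from 0 to 59" >&2; exit 1 ;;
  esac
  case "$2" in
    [0-9]|1[0-9]|2[0-3]) ;;
    *) echo "hour must be a number from 0 to 23" >&2; exit 1 ;;
  esac
  if [ "$3" != '*' ] || [ "$4" != '*' ] || [ "$5" != '*' ]; then
    echo "only daily schedules are handled in this sketch" >&2
    exit 1
  fi
  # cron orders fields minute-then-hour; OnCalendar wants HH:MM:SS.
  printf '*-*-* %02d:%02d:00\n' "$2" "$1"
)
```

For `0 3 * * *` this prints `*-*-* 03:00:00`, the same expression used in backup.timer above. The result can be checked with `systemd-analyze calendar '*-*-* 03:00:00'`, which shows the next elapse time.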
Ansible Overview
Ansible is agentless: it connects over SSH and applies playbooks (YAML documents describing desired state) to the hosts in its inventory.
sudo apt install ansible # or pip install ansible
ansible --version
# Inventory: /etc/ansible/hosts or inventory.ini
# [web]
# web1.example.com
# web2.example.com
#
# [db]
# db1.example.com
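To illustrate how the INI groups map to host lists, here is a small sketch that prints the hosts under one group header. `list_group_hosts` is a hypothetical helper that understands only plain `[group]` sections; it ignores `:children` and `:vars` sections, ranges, and other inventory features, so `ansible web --list-hosts` remains the authoritative check.

```shell
# Hypothetical helper: print the hosts listed under one [group] header.
# Usage: list_group_hosts INVENTORY_FILE GROUP
list_group_hosts() {
  awk -v group="$2" '
    /^[ \t]*(#|;|$)/ { next }                           # skip comments and blank lines
    /^\[/ { in_group = ($0 == ("[" group "]")); next }  # entering a new section
    in_group { print $1 }                               # first field is the host name
  ' "$1"
}
```

With the inventory sketched above, `list_group_hosts inventory.ini web` would print web1.example.com and web2.example.com, one per line.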
Ad-hoc commands:
ansible web -m ping
ansible web -a "uptime"
ansible web -m apt -a "name=nginx state=present" --become
ansible web -m service -a "name=nginx state=restarted" --become
ansible all -m setup -a "filter=ansible_distribution*" # facts
Playbook Structure
# site.yml
---
- name: Configure web servers
  hosts: web
  become: true
  vars:
    nginx_worker_processes: auto
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: true
    - name: Deploy config
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Reload nginx
    - name: Enable and start nginx
      service:
        name: nginx
        state: started
        enabled: true
  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
Run playbook:
ansible-playbook -i inventory.ini site.yml
ansible-playbook site.yml --check # dry run
ansible-playbook site.yml --limit web1.example.com # single host (must match the inventory name)
ansible-playbook site.yml -v # verbose
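The dry-run step is easy to skip under time pressure, so one option is to wrap the checks in a small function. This is a sketch rather than an established workflow: `playbook_preflight` and the `ANSIBLE_PLAYBOOK_BIN` override are hypothetical names introduced here, while `--syntax-check`, `--check`, and `--diff` are standard `ansible-playbook` flags.

```shell
# Hypothetical wrapper: syntax-check a playbook, then dry-run it with diffs.
# ANSIBLE_PLAYBOOK_BIN is an assumed override hook (useful for testing).
playbook_preflight() (
  bin=${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}
  # Stop at the first failing stage; nothing is applied by this function.
  "$bin" "$@" --syntax-check || exit 1
  "$bin" "$@" --check --diff || exit 1
)
```

Running `playbook_preflight site.yml -i inventory.ini` leaves the real apply as a separate, deliberate command. Note that `--check` reports failure only when a task errors, not when it merely predicts changes.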
Roles and Organization
ansible/
├── inventory/
│   ├── production
│   └── staging
├── roles/
│   ├── nginx/
│   │   ├── tasks/main.yml
│   │   ├── templates/
│   │   ├── handlers/main.yml
│   │   └── defaults/main.yml
│   └── common/
├── site.yml
└── ansible.cfg
# site.yml using roles
- hosts: web
  become: true
  roles:
    - common
    - nginx
Roles enable reuse across playbooks and teams.
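The role layout shown above can be created by hand. The sketch below builds only the four directories from that tree and seeds empty YAML documents; `scaffold_role` is a hypothetical helper, and `ansible-galaxy init nginx` generates a fuller skeleton if you prefer the stock tool.

```shell
# Hypothetical scaffold for the role layout shown above.
# Usage: scaffold_role BASE_DIR ROLE_NAME
scaffold_role() (
  role_dir="$1/roles/$2"
  mkdir -p "$role_dir/tasks" "$role_dir/templates" \
           "$role_dir/handlers" "$role_dir/defaults" || exit 1
  for part in tasks handlers defaults; do
    # Seed main.yml with a YAML document marker, but never overwrite one.
    [ -e "$role_dir/$part/main.yml" ] || printf '%s\n' '---' > "$role_dir/$part/main.yml"
  done
)
```

Because existing files are left alone, running `scaffold_role ansible nginx` a second time changes nothing.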
Variables and Vault
# group_vars/web/vault.yml (encrypted)
ansible-vault create group_vars/web/vault.yml
ansible-vault edit group_vars/web/vault.yml
# Run with vault password
ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass
Never commit plaintext secrets to git — use Ansible Vault or external secret managers (HashiCorp Vault, AWS SSM).
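A vault password file is itself a plaintext secret, so it is worth checking its permissions before use. The sketch below is an assumed pre-flight check, not something Ansible is claimed to enforce; `vault_pass_file_is_private` is a hypothetical name.

```shell
# Hypothetical pre-flight check: fail if group or other users have any
# access to the vault password file.
vault_pass_file_is_private() (
  [ -f "$1" ] || { echo "not a regular file: $1" >&2; exit 1; }
  # In the ls -l mode string, characters 5-10 are the group and other bits.
  perms=$(ls -ld "$1" | cut -c5-10)
  [ "$perms" = "------" ] || { echo "too permissive: $1" >&2; exit 1; }
)
```

A typical use would be `vault_pass_file_is_private ~/.vault_pass && ansible-playbook site.yml --vault-password-file ~/.vault_pass`, after a one-time `chmod 600 ~/.vault_pass`.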
Idempotency
Ansible modules check current state before changing:
- name: Ensure user exists
  user:
    name: deploy
    groups: sudo
    append: true   # add to sudo without removing other supplementary groups
    shell: /bin/bash
    state: present
Running the playbook twice should produce no changes the second time (changed=0) — unlike raw shell scripts that may re-apply blindly.
Use the command and shell modules sparingly and prefer native modules. When a raw command is unavoidable, guard it with creates or removes so the task is skipped once its result already exists.
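The same check-before-change idea applies to plain shell scripts. As a minimal sketch, the hypothetical helper `ensure_line` below appends a line only when it is missing, which is roughly what Ansible's lineinfile module does with more options.

```shell
# Hypothetical helper: append LINE to FILE only if FILE does not already
# contain it as a whole line (fixed-string match, not a regex).
# Usage: ensure_line FILE LINE
ensure_line() (
  file=$1
  line=$2
  if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
    exit 0                      # already present: nothing to change
  fi
  printf '%s\n' "$line" >> "$file"
)
```

A blind `echo "$line" >> "$file"` would add a duplicate on every run; with the guard, the second and later runs leave the file untouched.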
Best Practices
| Practice | Reason |
|---|---|
| Version control playbooks | Audit trail, rollback, code review |
| Separate inventories per env | Prevent prod accidents |
| --check before prod runs | Catch unintended changes |
| Use roles and group_vars | DRY configuration |
| Test in staging first | Validate before fleet-wide apply |
| flock/cron locks on scripts | Prevent concurrent corruption |
Common Mistakes
| Mistake | Consequence |
|---|---|
| cron without logging | Silent failures for weeks |
| Ansible without --become when needed | Permission errors mid-playbook |
| Hardcoded IPs in playbooks | Breaks when infra changes |
| Non-idempotent shell tasks | Duplicate entries, restarts every run |
| Running ansible as root locally | Confusing permission mapping |
Troubleshooting
cron job not running:
grep CRON /var/log/syslog # Debian/Ubuntu; on RHEL-family systems check /var/log/cron
# Check crontab syntax, script permissions, PATH
sudo run-parts --test /etc/cron.daily
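Many "works in my shell, fails in cron" problems come from the reduced environment. The sketch below reruns a command with a stripped-down environment; the values are assumptions modeled on typical Debian/Ubuntu cron defaults, and `run_like_cron` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command line under a minimal, cron-like
# environment (assumed defaults: /bin/sh and PATH=/usr/bin:/bin).
# Usage: run_like_cron 'COMMAND LINE'
run_like_cron() {
  # env -i starts from an empty environment; only these variables are set.
  env -i HOME="${HOME:-/}" LOGNAME="${LOGNAME:-$(id -un)}" \
    SHELL=/bin/sh PATH=/usr/bin:/bin \
    /bin/sh -c "$1"
}
```

If `run_like_cron '/usr/local/bin/backup.sh'` fails with "command not found" while the script works interactively, the script most likely depends on a PATH entry or variable that cron does not provide.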
Ansible SSH failures:
ansible web -m ping -vvv
# Check keys, sudo, Python on target (/usr/bin/python3)
Timer didn’t fire:
systemctl status backup.timer
systemctl list-timers backup.timer
journalctl -u backup.timer
Production Scenario
A 100-node web fleet managed entirely via Ansible:
- Git repo: playbooks merged via PR with CI lint (ansible-lint)
- Staging: ansible-playbook -i staging site.yml on merge to main
- Production: manual approval triggers ansible-playbook -i production site.yml in rolling batches of 10 hosts (serial: 10 in the play; --limit selects hosts by pattern and does not set a batch size)
- cron: only on individual hosts for local log cleanup; everything else via systemd timers or Ansible
- Backup timer: systemd Persistent=true catches missed runs after maintenance windows
- Drift detection: nightly Ansible --check reports unexpected changes to Slack
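For the nightly drift check, one simple signal is the sum of the changed counters in the PLAY RECAP of a `--check` run. The sketch below only does the counting; `count_changed` is a hypothetical helper, and how the number reaches Slack is left out because the source does not describe that integration.

```shell
# Hypothetical helper: read ansible-playbook output on stdin and print the
# total of the changed=N counters that follow the PLAY RECAP header.
count_changed() {
  awk '
    /^PLAY RECAP/ { recap = 1; next }   # counters appear only after this line
    recap {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^changed=[0-9]+$/) {
          split($i, kv, "=")
          total += kv[2]
        }
    }
    END { print total + 0 }             # print 0 when nothing matched
  '
}
```

A nightly job could run `ansible-playbook -i production site.yml --check | count_changed` and alert when the result is greater than zero.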
New engineer onboards: clone repo, run against Vagrant/VM lab, never SSH to prod manually for config changes.
Automation transforms Linux administration from artisanal SSH into engineered, reviewable infrastructure — start with cron for schedules, add Ansible when one host becomes many.