Why Automate?

Manual SSH sessions do not scale. Automation ensures repeatable, auditable, and idempotent changes across development, staging, and production — reducing drift and human error.

Tool | Best for
Shell scripts | Single-host tasks, glue logic
cron / systemd timers | Scheduled recurring jobs
Ansible | Multi-host config management, provisioning
Terraform | Infrastructure provisioning (cloud resources)
CI/CD (GitHub Actions, GitLab CI) | Build, test, deploy pipelines

cron — Classic Job Scheduler

# Edit user crontab
crontab -e

# System-wide
sudo ls /etc/cron.d/
sudo cat /etc/crontab
ls /etc/cron.{hourly,daily,weekly,monthly}/

# List current crontab
crontab -l
sudo crontab -l -u www-data
  

Crontab format (5 fields + command):

# min hour dom month dow command
0   2   *   *   *   /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
*/5 *   *   *   *   /usr/local/bin/health-check.sh
0   0   *   *   0   /usr/local/bin/weekly-report.sh
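
To make the field order concrete, here is a minimal sketch that splits a user-crontab entry into its five schedule fields and the command. The function name split_cron_entry is an assumption made for illustration and is not part of cron. It performs no validation and does not handle special strings such as @daily.

```shell
#!/bin/sh
# Hypothetical helper: label the five schedule fields of a user-crontab entry.
# Everything after the fifth field is treated as the command.
# The function body is a subshell "( ... )" so its variables stay local.
split_cron_entry() (
  # read splits on whitespace; the last variable receives the rest of the line.
  read -r min hour dom month dow cmd <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$min" "$hour" "$dom" "$month" "$dow"
  printf 'command=%s\n' "$cmd"
)

split_cron_entry '*/5 *   *   *   *   /usr/local/bin/health-check.sh'
```

Entries in /etc/crontab and /etc/cron.d/ carry an extra user field between the schedule and the command, which this sketch does not account for.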
  

Environment variables at top of crontab:

SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
# MAILTO is the address that receives job output (placeholder address below)
MAILTO=admin@example.com
  

cron Best Practices

# Use absolute paths — cron has minimal PATH
0 3 * * * /usr/local/bin/backup.sh

# Lock file prevents overlapping runs
0 3 * * * flock -n /tmp/backup.lock /usr/local/bin/backup.sh

# Log stdout and stderr
0 3 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
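
Plain redirection records output but not when a job ran or how it exited. The sketch below shows one possible wrapper that adds UTC start and end markers and keeps the exit status. The name run_logged and the log line layout are assumptions, not a cron convention.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, append its stdout and stderr to a log
# file between UTC start/end markers, and pass the exit status through.
# The function body is a subshell "( ... )" so its variables stay local.
run_logged() (
  log_file=$1
  shift
  status=0
  printf '%s START %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >>"$log_file"
  "$@" >>"$log_file" 2>&1 || status=$?
  printf '%s END status=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" >>"$log_file"
  exit "$status"
)

# Demo against a temporary log file.
demo_log=$(mktemp)
run_logged "$demo_log" echo "backup finished"
cat "$demo_log"
rm -f "$demo_log"
```

Because the status is passed through, a failing job still looks failed to whatever invoked it, and the END line makes the failure easy to find in the log.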
  

systemd Timers (Modern cron)

Prefer timers when you need journal logging, dependencies, and missed-run catch-up:

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
User=backup
  
# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup daily at 03:00

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target
  
sudo systemctl daemon-reload
sudo systemctl enable --now backup.timer
systemctl list-timers --all
journalctl -u backup.service
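
Before enabling a timer it can help to confirm that the OnCalendar expression means what you expect. The sketch below wraps systemd-analyze calendar, which parses an expression and prints when it would next elapse. The function name check_calendar is an assumption, and the helper only reports an error on hosts without systemd.

```shell
#!/bin/sh
# Hypothetical helper: ask systemd to parse an OnCalendar expression.
# Returns 127 if systemd-analyze is not installed on this host.
check_calendar() {
  if ! command -v systemd-analyze >/dev/null 2>&1; then
    echo "systemd-analyze not found" >&2
    return 127
  fi
  systemd-analyze calendar "$1"
}

# Same expression as in backup.timer above.
check_calendar '*-*-* 03:00:00' || echo "could not check expression" >&2
```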
  

Ansible Overview

Ansible is agentless: it connects to hosts over SSH and applies playbooks, which are YAML documents describing the desired state.

sudo apt install ansible    # or pip install ansible
ansible --version

# Inventory: /etc/ansible/hosts or inventory.ini
# [web]
# web1.example.com
# web2.example.com
#
# [db]
# db1.example.com
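
As a reading aid for the INI layout, the sketch below writes the example inventory to a temporary file and lists the hosts in one group. The helper hosts_in_group is hypothetical and handles only simple group sections. For real inventories, ansible-inventory -i inventory.ini --list is the authoritative view.

```shell
#!/bin/sh
# Hypothetical helper: print the hosts listed under one [group] of a simple
# INI inventory. It skips comments and blank lines and keeps only the first
# word of each host line. It does not understand :children or :vars sections.
hosts_in_group() {
  awk -v group="[$2]" '
    /^[ \t]*(#|;|$)/ { next }                 # comment or blank line
    /^\[/ { in_group = ($1 == group); next }  # a new section starts
    in_group { print $1 }                     # first word is the host name
  ' "$1"
}

# Demo with the inventory shown above, written to a temporary file.
inventory=$(mktemp)
cat >"$inventory" <<'EOF'
[web]
web1.example.com
web2.example.com

[db]
db1.example.com
EOF
hosts_in_group "$inventory" web
rm -f "$inventory"
```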
  

Ad-hoc commands:

ansible web -m ping
ansible web -a "uptime"
ansible web -m apt -a "name=nginx state=present" --become
ansible web -m service -a "name=nginx state=restarted" --become
ansible all -m setup -a "filter=ansible_distribution*"   # facts
  

Playbook Structure

# site.yml
---
- name: Configure web servers
  hosts: web
  become: true
  vars:
    nginx_worker_processes: auto

  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: true

    - name: Deploy config
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Reload nginx

    - name: Enable and start nginx
      service:
        name: nginx
        state: started
        enabled: true

  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
  

Run playbook:

ansible-playbook -i inventory.ini site.yml
ansible-playbook site.yml --check                      # dry run
ansible-playbook site.yml --limit web1.example.com     # single host (must match the inventory name)
ansible-playbook site.yml -v                           # verbose
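
The dry run and the real run are often chained so that an apply never happens after a failed check. The following is a minimal sketch of that idea. The function check_then_apply and the ANSIBLE_PLAYBOOK override variable are assumptions for illustration. The override lets you rehearse the wrapper with echo on a machine without Ansible.

```shell
#!/bin/sh
# The command to run can be overridden, for example with "echo", to rehearse.
ANSIBLE_PLAYBOOK=${ANSIBLE_PLAYBOOK:-ansible-playbook}

# Hypothetical helper: dry run with a diff first, then apply only if the
# dry run exited successfully. The subshell body keeps variables local.
check_then_apply() (
  inventory=$1
  playbook=$2
  "$ANSIBLE_PLAYBOOK" -i "$inventory" "$playbook" --check --diff || exit 1
  "$ANSIBLE_PLAYBOOK" -i "$inventory" "$playbook"
)

# Rehearsal: prints the two command lines instead of running Ansible.
ANSIBLE_PLAYBOOK=echo check_then_apply inventory.ini site.yml
```

A successful --check only means the dry run hit no errors. It does not mean no changes are pending, so the diff output is still worth reading before production runs.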
  

Roles and Organization

ansible/
├── inventory/
│   ├── production
│   └── staging
├── roles/
│   ├── nginx/
│   │   ├── tasks/main.yml
│   │   ├── templates/
│   │   ├── handlers/main.yml
│   │   └── defaults/main.yml
│   └── common/
├── site.yml
└── ansible.cfg
  
# site.yml using roles
- hosts: web
  become: true
  roles:
    - common
    - nginx
  

Roles enable reuse across playbooks and teams.

Variables and Vault

# group_vars/web/vault.yml (encrypted)
ansible-vault create group_vars/web/vault.yml
ansible-vault edit group_vars/web/vault.yml

# Run with vault password
ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass
  

Never commit plaintext secrets to git — use Ansible Vault or external secret managers (HashiCorp Vault, AWS SSM).
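
One way to back this rule with a check is to verify that every file meant to be a vault actually starts with the Ansible Vault header before it is committed. The sketch below assumes a helper named is_vault_encrypted. It inspects only the first line, so it is a guard against accidents rather than a security control.

```shell
#!/bin/sh
# Hypothetical check: succeed only if FILE starts with the Ansible Vault
# header, which begins with the literal text $ANSIBLE_VAULT;
is_vault_encrypted() {
  case $(head -n 1 "$1" 2>/dev/null) in
    '$ANSIBLE_VAULT;'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Demo with a fake vault file (header plus dummy payload, not real ciphertext).
demo_vault=$(mktemp)
printf '%s\n' '$ANSIBLE_VAULT;1.1;AES256' '0000' >"$demo_vault"
if is_vault_encrypted "$demo_vault"; then
  echo "looks encrypted"
else
  echo "PLAINTEXT: refuse to commit" >&2
fi
rm -f "$demo_vault"
```

A check like this could run from a pre-commit hook or a CI job over group_vars/*/vault.yml.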

Idempotency

Ansible modules check current state before changing:

- name: Ensure user exists
  user:
    name: deploy
    groups: sudo
    append: true        # add to sudo without removing other group memberships
    shell: /bin/bash
    state: present
  

Running the playbook twice should produce no changes the second time (changed=0) — unlike raw shell scripts that may re-apply blindly.

Use the command and shell modules sparingly. Prefer native modules, and when a raw command is unavoidable, guard it with creates or removes so it runs only when needed.
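
The difference between blind re-application and a state check can be shown in plain shell. The sketch below assumes a helper named ensure_line and an illustrative setting line. It appends a line only when an identical line is missing, which is roughly what Ansible's lineinfile module does natively.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if an identical line is not
# already present, so repeated runs leave the file unchanged.
ensure_line() {
  # -q quiet, -x whole-line match, -F fixed string (no regex)
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >>"$1"
}

# Demo: the second call finds the line and changes nothing.
conf=$(mktemp)
ensure_line "$conf" 'example.setting=1'
ensure_line "$conf" 'example.setting=1'
cat "$conf"
rm -f "$conf"
```

A bare echo 'example.setting=1' >> file in the same place would add a duplicate on every run, which is the "duplicate entries" failure mode of non-idempotent shell tasks.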

Best Practices

Practice | Reason
Version control playbooks | Audit trail, rollback, code review
Separate inventories per env | Prevent prod accidents
--check before prod runs | Catch unintended changes
Use roles and group_vars | DRY configuration
Test in staging first | Validate before fleet-wide apply
flock/cron locks on scripts | Prevent concurrent corruption

Common Mistakes

Mistake | Consequence
cron without logging | Silent failures for weeks
Ansible without --become when needed | Permission errors mid-playbook
Hardcoded IPs in playbooks | Breaks when infra changes
Non-idempotent shell tasks | Duplicate entries, restarts every run
Running ansible as root locally | Confusing permission mapping

Troubleshooting

cron job not running:

grep CRON /var/log/syslog          # Debian/Ubuntu; RHEL-family systems log to /var/log/cron
# Check crontab syntax, script permissions (executable bit), PATH
sudo run-parts --test /etc/cron.daily
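
Many "works in my shell, fails in cron" problems come from the environment. The sketch below reruns a command line with a stripped-down environment that approximates what cron provides. The helper name run_like_cron and the exact variable set are assumptions, and a real cron daemon may set slightly different values.

```shell
#!/bin/sh
# Hypothetical helper: run a command line under a minimal, cron-like
# environment (env -i clears everything that is not listed explicitly).
run_like_cron() {
  env -i HOME="${HOME:-/}" LOGNAME="${LOGNAME:-$(id -un)}" \
    SHELL=/bin/sh PATH=/usr/bin:/bin \
    /bin/sh -c "$1"
}

# Shows the PATH the command actually sees.
run_like_cron 'echo "PATH=$PATH"'
```

If a script fails under run_like_cron but works interactively, look for commands that live outside /usr/bin and /bin or for variables set only in your login profile.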
  

Ansible SSH failures:

ansible web -m ping -vvv
# Check keys, sudo, Python on target (/usr/bin/python3)
  

Timer didn’t fire:

systemctl status backup.timer
systemctl list-timers backup.timer
journalctl -u backup.timer
  

Production Scenario

A 100-node web fleet managed entirely via Ansible:

  1. Git repo: playbooks merged via PR with CI lint (ansible-lint)
  2. Staging: ansible-playbook -i staging site.yml runs on merge to main
  3. Production: manual approval triggers ansible-playbook -i production site.yml, rolled out in batches of 10 hosts (serial: 10 in the play)
  4. cron: only on individual hosts for local log cleanup; everything else via systemd timers or Ansible
  5. Backup timer: systemd Persistent=true catches missed runs after maintenance windows
  6. Drift detection: a nightly ansible-playbook --check run reports unexpected changes to Slack
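
The drift-detection step can be sketched as a small filter over the PLAY RECAP that ansible-playbook prints at the end of a run. The helper count_changed is hypothetical, and the recap text in the demo is illustrative rather than captured from a real run. Sending the result to Slack is left out.

```shell
#!/bin/sh
# Hypothetical helper: sum the changed=N counters found on standard input,
# as they appear in ansible-playbook's PLAY RECAP lines.
count_changed() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^changed=[0-9]+$/) {
          split($i, kv, "=")
          total += kv[2]
        }
    }
    END { print total + 0 }
  '
}

# Illustrative recap text, not output from a real run.
sample_recap='web1.example.com : ok=5 changed=0 unreachable=0 failed=0
web2.example.com : ok=5 changed=2 unreachable=0 failed=0'

drift=$(printf '%s\n' "$sample_recap" | count_changed)
if [ "$drift" -gt 0 ]; then
  echo "drift detected: $drift task(s) would change"
fi
```

In a nightly job the input would come from something like ansible-playbook -i production site.yml --check piped into count_changed, with a nonzero total triggering the notification.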

A new engineer onboards by cloning the repo and running the playbooks against a Vagrant or VM lab. Nobody SSHes into production manually to make config changes.

Automation transforms Linux administration from artisanal SSH into engineered, reviewable infrastructure — start with cron for schedules, add Ansible when one host becomes many.