# Operating Automation: Ansible Playbooks & Roles
Automation for system operations: OS updates/upgrades, Docker cleanup, Mailcow maintenance, Checkmk onboarding, time services, hardening, and more.
Last Update: 2025-11-19
## Prerequisites
- Ansible (>= 2.14 recommended)
- Python on target systems, SSH access (key-based authentication preferred)
- Collections (install once):
```bash
ansible-galaxy collection install \
community.docker:3.11.0 \
community.proxmox:1.4.0 \
checkmk.general
```
Notes:
- `ansible.cfg` sets `roles_path = ./roles:/etc/ansible/roles` and disables host key checking.
- Sensitive variables are stored in `vault.yml` (protect with Ansible Vault).
## Inventories & Variables
- Examples: `inventories/icp-fra-pve1.yml`, `inventories/icp-frav-packer01.yml`
- Group variables: `inventories/group_vars/all.yml`
- Important OS update variables (defaults in `roles/os-updates/defaults/main.yml`):
  - `os_also_update_mirror` (bool, default: true)
  - `os_update_mirrors` (list of mirror entries)
  - `os_update_major_version` (bool)
  - `os_update_version_codename` (e.g., `bookworm`, `trixie`)
- Checkmk variables: `checkmk_server_url`, `checkmk_monitoring_site`, `checkmk_automation_user`, `checkmk_automation_pass`, `checkmk_agent_bakery_passphrase`, and others.
Vault example (excerpt; store in `vault.yml` and encrypt with Vault):
```yaml
checkmk_automation_user: "automation"
checkmk_automation_pass: "<secret>"
checkmk_agent_bakery_passphrase: "<secret>"
# Proxmox API (for major upgrade snapshots)
proxmox_api_host: "<pve.example>"
proxmox_api_user: "<user@pve>"
proxmox_api_token_id: "<token-id>"
proxmox_api_token_secret: "<token>"
```
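
Since these values are secrets, it can help to verify that `vault.yml` is really encrypted before committing it. The sketch below relies on the fact that Vault-encrypted files begin with a `$ANSIBLE_VAULT;` header line. The function name `is_vault_encrypted` is illustrative and not part of this repository.

```bash
# is_vault_encrypted FILE
# Exit status 0 if the first line of FILE is an Ansible Vault header,
# 1 otherwise (including when FILE is missing or unreadable).
is_vault_encrypted() {
  local first_line
  first_line=$(head -n 1 -- "$1" 2>/dev/null)
  case $first_line in
    '$ANSIBLE_VAULT;'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example guard for a pre-commit hook (hypothetical usage):
#   is_vault_encrypted vault.yml || { echo "vault.yml is not encrypted" >&2; exit 1; }
```

The check only inspects the header; it does not validate the ciphertext or the vault password.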
## Quick Start
1) Install collections (see above)
2) Run playbooks (examples):
```bash
# OS update (minor) for all inventory hosts
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-update.yml -K
# OS major upgrade to Debian "trixie" (with Proxmox snapshot and reboot); loads vault.yml
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-major-upgrade.yml \
-e os_update_version_codename=trixie -K --ask-vault-pass
# Change mirrors
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-change-mirror.yml -K
# Configure time service via chronyd
ansible-playbook -i inventories/icp-frav-packer01.yml playbooks/setup-chronyd.yml -K
# Checkmk monitoring (create host, sign/bake agent, register)
ansible-playbook -i inventories/icp-frav-packer01.yml playbooks/setup-checkmk-monitoring.yml --ask-vault-pass
# Deploy ClamAV server (group "clamav-servers")
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/deploy-clamav-server.yml -K
# Docker: cleanup images only
ansible-playbook -i inventories/icp-frav-packer01.yml playbooks/docker/cleanup-images.yml -K
# Docker: full cleanup (containers/images/networks/volumes/cache); locates the Mailcow Compose directory first
ansible-playbook -i inventories/icp-frav-packer01.yml playbooks/docker/cleanup-all.yml -K
# Mailcow: update/restart/cleanup in sequence
ansible-playbook -i inventories/icp-frav-packer01.yml playbooks/managed-mailcow/update-mailcow.yaml -K
```
## Playbook Reference
### OS & System
- `playbooks/os-update.yml`
  - Purpose: Standard OS update on Debian. Optionally updates mirrors (`os_also_update_mirror`).
  - Variables: `os_update_major_version` (bool), `os_update_version_codename` (relevant for templates only)
  - Role: `os-updates` (executes `update_mirrors.yaml` and `upgrade_packages.yaml`; reboots on kernel change via handler)
- `playbooks/os-major-upgrade.yml`
  - Purpose: Debian major upgrade to target codename (e.g., `trixie`) including Proxmox snapshot before and reboot after.
  - Loads `vault.yml` (Proxmox API & Checkmk secrets, etc.).
  - Roles/tasks: `proxmox-automation:get-vmid`, `proxmox-automation:create-snapshots`, `os-updates:update_major_version`.
  - Requirement: Collection `community.proxmox` and valid API tokens.
- `playbooks/os-change-mirror.yml`
  - Purpose: Change Debian APT mirrors according to `os_update_mirrors`.
  - Role: `os-updates:update_mirrors`.
- `playbooks/setup-chronyd.yml`
  - Purpose: Configure time service with Chrony (systemd-timesyncd is removed).
  - Role: `system:setup-timeserver` (handler: restart chronyd).
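
To illustrate what "upgrade to a target codename" means at the APT level, here is a minimal sketch that rewrites suite names in a sources file. It is not how the `os-updates` role works (the role appears to render its sources from templates); the function name `switch_codename` and the temp-file approach are assumptions made for this example.

```bash
# switch_codename FILE OLD NEW
# Replaces the suite OLD (and variants such as OLD-updates, OLD-security)
# with NEW in an APT sources file. Only whole, whitespace-delimited tokens
# are touched, so URLs stay unchanged.
switch_codename() {
  local file=$1 old=$2 new=$3 tmp status
  [ "$old" != "$new" ] || return 0          # nothing to do, and avoids an endless loop
  tmp=$(mktemp) || return 1
  # The :a ... ta loop repeats the substitution until no token is left,
  # which also handles adjacent suites such as "bookworm bookworm-updates".
  sed -E -e ':a' \
    -e "s/(^|[[:space:]])${old}(-[a-z-]+)?([[:space:]]|\$)/\1${new}\2\3/" \
    -e 'ta' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Hypothetical usage on a copy of the sources file:
#   switch_codename ./sources.list bookworm trixie
```

Running it against a copy first and diffing the result is a cheap way to review the change before touching `/etc/apt`.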
### Checkmk Onboarding
- `playbooks/setup-checkmk-monitoring.yml`
  - Purpose: Create host in Checkmk, sign/bake pending agent jobs, register agent, run discovery.
  - Loads `vault.yml` (automation user/pass, etc.).
  - Roles/tasks: `checkmk-monitoring:create-host`, `checkmk-monitoring:sign-bake-agents`, `checkmk.general.agent` (TLS/update/registration), `checkmk-monitoring:discover-host`.
  - Tags: `checkmk-deploy` (for registration & wait time).
  - Requirement: Collection `checkmk.general`.
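
For debugging outside Ansible, the "create host" step can be approximated with a direct call to the Checkmk REST API. The sketch below assumes the REST API v1.0 URL layout (`<server>/<site>/check_mk/api/1.0/...`) and the bearer-style `user secret` authorization header; the environment variable names (`CHECKMK_SERVER_URL`, `CHECKMK_SITE`, `CHECKMK_USER`, `CHECKMK_PASS`) and both function names are hypothetical and merely mirror the README variables. The playbook itself uses the roles listed above, not this script.

```bash
# checkmk_api_url SERVER_URL SITE ENDPOINT
# Builds a REST API v1.0 URL; tolerates a trailing slash on SERVER_URL
# and a leading slash on ENDPOINT.
checkmk_api_url() {
  printf '%s/%s/check_mk/api/1.0/%s\n' "${1%/}" "$2" "${3#/}"
}

# create_host HOSTNAME
# Sends a host creation request into the main folder ("/").
# Requires curl; reads the (hypothetical) CHECKMK_* environment variables.
create_host() {
  local url
  url=$(checkmk_api_url "$CHECKMK_SERVER_URL" "$CHECKMK_SITE" \
    domain-types/host_config/collections/all)
  curl --fail --silent --show-error \
    --header "Authorization: Bearer ${CHECKMK_USER} ${CHECKMK_PASS}" \
    --header 'Content-Type: application/json' \
    --data "{\"host_name\": \"$1\", \"folder\": \"/\"}" \
    "$url"
}
```

A created host still needs the pending changes activated before it shows up in monitoring, which is one reason the role-based flow is preferable for routine use.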
### Docker & Mailcow
- `playbooks/docker/cleanup-images.yml`
  - Purpose: Prune Docker images only; optionally capture Compose stack status (`docker_compose_path`).
  - Role: `docker:cleanup-images.yml` (collection `community.docker`).
- `playbooks/docker/cleanup-all.yml`
  - Purpose: Full Docker cleanup (containers/images/networks/volumes/builder cache) with the Mailcow stack running.
  - Roles/tasks: `managed-mailcow:find-mailcow-composedir`, `docker:get-containerstatus`, `docker:cleanup-all` (runs only if the container status is not "false").
- `playbooks/managed-mailcow/update-mailcow.yaml`
  - Purpose: Update Mailcow via `update.sh`; optionally restart Docker daemon and cleanup.
  - Variables: `github_mailcow_ver` (target tag), `disk_space_percent_max` (threshold), `debug`.
  - Roles/tasks: `roles/managed-mailcow:*`, `roles/docker:restart-daemon`, `roles/docker:cleanup-all`.
- `playbooks/managed-mailcow/start-stop-mailcow.yaml`
  - Purpose: Stop and restart Mailcow stack (Compose v2).
  - Roles/tasks: `managed-mailcow:find-mailcow-composedir`, `managed-mailcow:stop-mailcow`, `managed-mailcow:start-mailcow`.
- `playbooks/managed-mailcow/check-mailcow-health.yml`
  - Purpose: Check HTTP accessibility and ports (25/587/143/993); tolerates errors (`ignore_errors`).
- `playbooks/managed-mailcow/enable-sni-globally.yml`
  - Purpose: Set `ENABLE_SSL_SNI=y` in `mailcow.conf`; restart stack if changed.
- `playbooks/managed-mailcow/change-garbagecleaner.yaml`
  - Purpose: Set `MAILDIR_GC_TIME` to 7 days (10080 minutes) and restart stack if changed.
- `playbooks/managed-mailcow/migrate-clamd.yaml`
  - Purpose: Switch Rspamd to external/shared ClamAV, disable local ClamAV, restart Rspamd.
- `playbooks/managed-mailcow/use-docker-image-proxy.yaml`
  - Purpose: Configure Docker daemon proxy & CA, set systemd drop-in, restart Docker.
- `playbooks/managed-mailcow/use-syslog-server.yaml`
  - Purpose: Switch Docker logging to syslog and restart Mailcow if needed.
- `playbooks/managed-mailcow/remove-watchdog-mail.yaml`
  - Purpose: Remove `WATCHDOG_NOTIFY_EMAIL` from `mailcow.conf` and restart stack.
- `playbooks/managed-mailcow/find-roundcube-versions.yaml`
  - Purpose: Extract Roundcube version from `CHANGELOG.md` (under `data/web/rc|roundcube|roundcubemail`).
- `playbooks/managed-mailcow/add-haveged.yaml`
  - Purpose: Install `haveged` package.
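
Several of the Mailcow playbooks above follow the same pattern: set one key in `mailcow.conf` and restart the stack only if the file changed. The sketch below shows that pattern in plain shell, including the 7-day arithmetic for `MAILDIR_GC_TIME`. The function name `set_conf_key` and its exit-status convention are assumptions for illustration; the playbooks presumably rely on Ansible's own change detection rather than on this script.

```bash
# set_conf_key FILE KEY VALUE
# Ensures FILE contains the line KEY=VALUE.
# Exit status: 0 = file was changed, 1 = already up to date, 2 = error.
# Intended for simple values (no "|" or "&" characters).
set_conf_key() {
  local file=$1 key=$2 value=$3 tmp
  if grep -qxF -- "${key}=${value}" "$file"; then
    return 1                                  # desired line already present
  fi
  tmp=$(mktemp) || return 2
  if grep -q -- "^${key}=" "$file"; then
    # Key exists with another value: rewrite it in place.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
  else
    # Key is missing: keep the file and append the assignment.
    cat "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
  fi
  cat "$tmp" > "$file"
  rm -f "$tmp"
  return 0
}

# 7 days expressed in minutes, as expected by MAILDIR_GC_TIME.
days=7
gc_minutes=$((days * 24 * 60))

# Hypothetical usage: restart only when something changed.
#   if set_conf_key mailcow.conf MAILDIR_GC_TIME "$gc_minutes"; then
#     echo "changed, restart the stack"
#   fi
```

Returning a distinct status for "unchanged" keeps the restart decision with the caller, which matches the "restart stack if changed" behavior described in the list.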
### Hardening
- `playbooks/hardening/manage-ssh-keys.yaml`
  - Purpose: Add good keys, remove bad keys; write comment with timestamp.
  - Role: `manage-ssh-keys`
  - Variables (see `roles/manage-ssh-keys/defaults/main.yml`):
    - `ssh_user` (default: root)
    - `good_keys` (list of allowed keys)
    - `bad_keys` (list of keys to remove)
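
The add/remove logic described above can be pictured as two idempotent operations on an `authorized_keys` file. The following sketch is an assumption about the general idea, not the role's implementation; `remove_key`, `add_key`, and the `added-<timestamp>` comment format are made up for this example.

```bash
# remove_key FILE KEY_MATERIAL
# Deletes every line of FILE that contains KEY_MATERIAL
# (key type plus base64 blob, e.g. "ssh-ed25519 AAAA...").
remove_key() {
  local file=$1 key=$2 tmp
  [ -f "$file" ] || return 0
  tmp=$(mktemp) || return 1
  grep -vF -- "$key" "$file" > "$tmp" || true   # grep exits 1 if no line remains
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# add_key FILE KEY_MATERIAL
# Appends KEY_MATERIAL with a UTC timestamp comment unless it is already present.
add_key() {
  local file=$1 key=$2
  grep -qF -- "$key" "$file" 2>/dev/null && return 0
  printf '%s added-%s\n' "$key" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$file"
}

# Hypothetical usage:
#   remove_key ~/.ssh/authorized_keys 'ssh-rsa AAAAB3...old'
#   add_key    ~/.ssh/authorized_keys 'ssh-ed25519 AAAAC3...new'
```

Matching on the key material instead of the trailing comment avoids missing a bad key that was re-added under a different label.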
### ClamAV Server
- `playbooks/deploy-clamav-server.yml`
  - Hosts: `clamav-servers`
  - Role: `deploy-clamd` (compiles ClamAV, creates user/group, configures systemd services `clamd`/`freshclam`).
  - Variable: `clamd_version` (default: 1.4.2). IPv6 binding according to template (`TCPAddr {{ ansible_default_ipv6.address }}`).
## Roles & Collections (Overview)
- `roles/os-updates`: Mirror update, package upgrade, major upgrade including Exim blocking, reboot/apt cleanup handlers.
- `roles/docker`: Compose v2 status, prune (images/all), Docker daemon restart. Collection: `community.docker`.
- `roles/managed-mailcow`: Find Mailcow path, start/stop, update process, helper tasks.
- `roles/system`: Chrony setup, Docker/MOTD/SSH hardening, disk utility check, service handlers.
- `roles/checkmk-monitoring`: Create host, discovery, agent bakery/activation. Collection: `checkmk.general`.
- `roles/deploy-clamd`: ClamAV build/configuration/templates (systemd units, freshclam/clamd.conf).
- `roles/proxmox-automation`: Snapshots/VM info. Collection: `community.proxmox`.
## Common Commands
```bash
# Create/edit vault file
ansible-vault create vault.yml
ansible-vault edit vault.yml
# Syntax check
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-update.yml --syntax-check
# Target only one host group
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-update.yml -l icp-fra-pve1
# Dry run
ansible-playbook -i inventories/icp-fra-pve1.yml playbooks/os-update.yml --check
```
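
The syntax check and dry run are often chained before a real run. A minimal sketch of such a wrapper follows; the function name `preflight` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical conveniences, while `--syntax-check`, `--check`, and `--diff` are standard `ansible-playbook` options.

```bash
# preflight INVENTORY PLAYBOOK [extra ansible-playbook arguments...]
# Runs a syntax check and, only if that succeeds, a check-mode run with diffs.
# ANSIBLE_PLAYBOOK (hypothetical) lets callers substitute another binary.
preflight() {
  local runner=${ANSIBLE_PLAYBOOK:-ansible-playbook}
  local inventory=$1 playbook=$2
  shift 2
  "$runner" -i "$inventory" "$playbook" --syntax-check "$@" &&
    "$runner" -i "$inventory" "$playbook" --check --diff "$@"
}

# Hypothetical usage:
#   preflight inventories/icp-fra-pve1.yml playbooks/os-update.yml -l icp-fra-pve1
```

Check mode is only an approximation: tasks that cannot predict their result are skipped or report no change, so a clean dry run does not guarantee a clean real run.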
## Notes & Best Practices
- Never commit secrets in plaintext; provide them only via an encrypted `vault.yml`.
- Always create snapshots/backups before major upgrades (playbook handles Proxmox snapshots automatically if configured).
- `community.docker` requires a working Docker engine and Compose v2 on the target system.
- Maintain inventory/hosts with IPv6 where possible (repo is prepared for this).
---
Questions or feature requests? Please mention the playbook or use case. We're happy to extend documentation and examples.