name: server-environment-audit
description: Comprehensive server environment audit and safe optimization. Check all services, configs, cron jobs, disk, and memory, then fix issues without breaking running services.
category: devops
Server Environment Audit & Safe Optimization
When to use
- User asks to "check the server" or "review the environment"
- User wants optimization but emphasizes safety ("don't break anything")
- Periodic maintenance check before making changes
- Investigating performance issues or unexpected behavior
Core Principle
Never restart or modify critical services (OpenClaw, Hermes) without explicit confirmation. Fix only clearly broken/stale items.
Audit Steps
1. System Overview
uname -r && uptime && free -h && df -h /
ps aux --sort=-%mem | head -15
ss -tlnp | grep LISTEN
systemctl --failed
2. OpenClaw Health
curl -s http://127.0.0.1:18381/health
python3 -c "import json; c=json.load(open('/root/.openclaw/openclaw.json')); print('agents:', [a['id'] for a in c['agents']['list']]); print('plugins:', [k for k,v in c['plugins']['entries'].items() if v.get('enabled')])"
tail -5 /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log
3. Hermes Health
ps aux | grep hermes_cli | grep -v grep
curl -s http://127.0.0.1:8642/health 2>/dev/null
ls -la /root/.hermes/profiles/friend/
4. Cron Job Audit
crontab -l
# Check that each cron target path exists.
# For each script path in the crontab, verify: ls -la <script_path>
5. Disk & Temp Cleanup Candidates
find /tmp -maxdepth 1 -name '*backup*' -mtime +3 -exec du -sh {} \;
find /root/.openclaw/workspace -name '*.bak' -exec du -sh {} \;
docker system df
docker images --format '{{.ID}} {{.Repository}}:{{.Tag}} {{.Size}}'
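The docker images listing above can be grouped by image ID to spot tags that point at the same image. The following is a minimal sketch, assuming the exact three-column format string used above; the function name tags_by_image_id is illustrative and not part of any tool.

```python
from collections import defaultdict


def tags_by_image_id(listing):
    """Group 'ID REPO:TAG SIZE' lines by image ID; keep IDs with 2+ tags."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        image_id, ref = parts[0], parts[1]
        if "<none>" in ref:
            continue  # dangling image without a usable tag
        groups[image_id].append(ref)
    # Only IDs that carry more than one tag are dedup candidates.
    return {i: refs for i, refs in groups.items() if len(refs) > 1}
```

Which tag of each group counts as the redundant mirror tag is still a human decision. Removing one tag with docker rmi only untags the image while another tag still references it.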
Safe Fixes (what can be done without confirmation)
Broken systemd services in restart loop
# If service has high restart count and status 203/EXEC or similar:
systemctl stop <service> && systemctl disable <service>
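Whether a unit is really in a restart loop can be checked from its systemd properties before stopping it. This is a minimal sketch, assuming a systemd version that exposes the NRestarts property; the threshold of 10 restarts and the function names are illustrative assumptions, not values from this procedure.

```python
import subprocess


def parse_show_output(text):
    """Parse `systemctl show` KEY=VALUE lines into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def looks_like_restart_loop(props, min_restarts=10):
    """Heuristic: many automatic restarts plus a non-zero last exit status.

    min_restarts is an assumed threshold; tune it for the host.
    """
    try:
        restarts = int(props.get("NRestarts", "0"))
        status = int(props.get("ExecMainStatus", "0"))
    except ValueError:
        return False
    return restarts >= min_restarts and status != 0


def service_props(unit):
    """Read the relevant properties of one unit via systemctl show."""
    result = subprocess.run(
        ["systemctl", "show", unit,
         "--property=NRestarts,ExecMainStatus,ActiveState"],
        capture_output=True, text=True, check=False,
    )
    return parse_show_output(result.stdout)

# Example (runs systemctl, so it is left commented out):
# if looks_like_restart_loop(service_props("example.service")):
#     print("candidate for stop + disable")
```

A status of 203 in ExecMainStatus corresponds to the 203/EXEC case mentioned above.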
Stale cron jobs (target paths don't exist)
# Write a Python script to filter crontab:
# 1. Parse current crontab
# 2. Remove lines referencing non-existent directories
# 3. Write back with crontab command
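The three steps above can be sketched as follows. This is a hedged illustration, not a definitive implementation: the function names are hypothetical, and the staleness rule is a deliberately conservative heuristic (a job is dropped only when it names an absolute path that does not exist and whose parent directory does not exist either). Comment lines and environment assignments are always kept.

```python
import os
import re
import shlex
import subprocess

# Lines such as MAILTO=root or PATH = /usr/bin are settings, not jobs.
ENV_LINE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\s*=")


def referenced_paths(line):
    """Return absolute paths named in the command part of one cron line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or ENV_LINE.match(stripped):
        return []
    # "@reboot cmd" has one schedule field; a normal entry has five.
    fields = 1 if stripped.startswith("@") else 5
    parts = stripped.split(None, fields)
    if len(parts) <= fields:
        return []
    try:
        tokens = shlex.split(parts[fields])
    except ValueError:  # unbalanced quotes: fall back to plain splitting
        tokens = parts[fields].split()
    return [t for t in tokens if t.startswith("/")]


def filter_crontab(text):
    """Split crontab text into (kept_text, removed_lines)."""
    kept, removed = [], []
    for line in text.splitlines():
        missing = [
            p for p in referenced_paths(line)
            if not os.path.exists(p) and not os.path.isdir(os.path.dirname(p))
        ]
        (removed if missing else kept).append(line)
    return "\n".join(kept) + "\n", removed


def rewrite_crontab(dry_run=True):
    """Filter the current user's crontab; write it back unless dry_run."""
    current = subprocess.run(["crontab", "-l"], capture_output=True, text=True)
    if current.returncode != 0:
        return []  # no crontab for this user, or crontab unavailable
    new_text, removed = filter_crontab(current.stdout)
    if removed and not dry_run:
        subprocess.run(["crontab", "-"], input=new_text, text=True, check=True)
    return removed
```

Calling rewrite_crontab() with the default dry_run=True only reports which lines would be removed, which gives a chance to review them before anything is written. Comments that described a removed job are left in place and may need manual cleanup.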
Stale OpenClaw plugin entries
# Remove disabled plugins from entries, installs, and allow lists:
import json
with open('/root/.openclaw/openclaw.json') as f:
    cfg = json.load(f)
stale = [name for name, data in cfg['plugins']['entries'].items() if not data.get('enabled')]
for name in stale:
    cfg['plugins']['entries'].pop(name, None)
    cfg['plugins']['installs'].pop(name, None)
    if name in cfg['plugins']['allow']:
        cfg['plugins']['allow'].remove(name)
with open('/root/.openclaw/openclaw.json', 'w') as f:
    json.dump(cfg, f, indent=2, ensure_ascii=False)
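Because config changes trigger a hot-reload, it may be safer to avoid a moment where the file is half written. One possible approach, sketched here under assumptions, is to keep a backup copy and swap the new file in with an atomic rename. The helper name write_json_atomically and the .pre-audit backup suffix are hypothetical, and whether OpenClaw's reload watcher reacts to a rename the same way as to an in-place write has not been verified.

```python
import json
import os
import shutil
import tempfile


def write_json_atomically(path, data, backup_suffix=".pre-audit"):
    """Back up path, then replace it with data in a single rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
            f.flush()
            os.fsync(f.fileno())
        if os.path.exists(path):
            shutil.copy2(path, path + backup_suffix)  # keep the old version
            shutil.copymode(path, tmp_path)           # preserve permissions
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If serialization fails, the original config is never touched, and the backup gives a quick way to roll back if the services misbehave after the change.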
Docker image tag dedup
# Remove mirror tags that point to the same image ID:
docker rmi <mirror_tag>
Things to NOTE but NOT fix (require user confirmation)
- Missing Chrome/Playwright binary (affects browser tools)
- High memory usage (suggest upgrade if needed)
- Large wiki/git directories (suggest cleanup strategy)
- OpenClaw config hot-reload deferral (log shows "requires gateway restart")
Verification After Fixes
curl -s http://127.0.0.1:18381/health # OpenClaw
curl -s http://127.0.0.1:8642/health 2>/dev/null # Hermes
python3 -c "import json; json.load(open('/root/.openclaw/openclaw.json')); print('Config valid ✓')"
df -h / | tail -1
free -h | head -2
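The two health checks above can also be scripted with an explicit timeout so that verification never blocks on an unresponsive service. This is a minimal sketch that treats any HTTP 200 response as healthy, which is an assumption; the response body of the /health endpoints is not specified here, and check_health is an illustrative name.

```python
import http.client
import urllib.error
import urllib.request


def check_health(url, timeout_s=5):
    """Return True if url answers with HTTP 200 within timeout_s seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, non-2xx status, or a broken response.
        return False

# Example with the endpoints used above (left commented out):
# for name, url in [("OpenClaw", "http://127.0.0.1:18381/health"),
#                   ("Hermes", "http://127.0.0.1:8642/health")]:
#     print(name, "OK" if check_health(url) else "NOT RESPONDING")
```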
Pitfalls
- Never restart OpenClaw or Hermes without explicit user approval
- Config changes trigger a hot-reload, but the reload may be deferred until running tasks complete
- openclaw doctor can hang (time it out after 15s)
- Cron entries may have comments spanning multiple lines; use Python for safe filtering
- Docker image dedup: only remove tags, never the base image that a running container uses
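For the openclaw doctor pitfall, one way to enforce the 15s limit is to run the command through a wrapper that kills it when the timeout expires. The following is a minimal sketch; run_with_timeout is a hypothetical helper, and no openclaw flags or output format are assumed.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=15):
    """Run cmd and return (returncode, stdout).

    On timeout the child is killed and (None, partial_stdout) is returned.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        if isinstance(out, bytes):  # partial output can arrive as bytes
            out = out.decode(errors="replace")
        return None, out

# Example (left commented out; raises FileNotFoundError if the
# openclaw binary is not on PATH):
# code, output = run_with_timeout(["openclaw", "doctor"])
# if code is None:
#     print("openclaw doctor timed out; continuing the audit without it")
```

A return code of None signals a timeout, so the audit can record the hang as a finding and move on instead of blocking.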