Kodelyth ECC
Skill

enterprise-agent-ops

Operate long-lived agent workloads with observability, security boundaries, and lifecycle management.

Invoke via: use enterprise-agent-ops
Origin: ECC

Enterprise Agent Ops

Use this skill for cloud-hosted or continuously running agent systems that need operational controls beyond what a single CLI session provides.

Operational Domains

  • runtime lifecycle (start, pause, stop, restart)
  • observability (logs, metrics, traces)
  • safety controls (scopes, permissions, kill switches)
  • change management (rollout, rollback, audit)
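
As a rough illustration of the runtime lifecycle and kill-switch domains above, the sketch below models an agent as a small state machine. The names `State`, `TRANSITIONS`, and `AgentLifecycle` are hypothetical, and the `resume` action is an assumption added so that a paused agent can continue; none of them come from the skill itself.

```python
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    RUNNING = "running"
    PAUSED = "paused"


# Allowed transitions: action -> {state before: state after}.
# "resume" is an assumed extra action, not part of the original list.
TRANSITIONS = {
    "start": {State.STOPPED: State.RUNNING},
    "pause": {State.RUNNING: State.PAUSED},
    "resume": {State.PAUSED: State.RUNNING},
    "stop": {State.RUNNING: State.STOPPED, State.PAUSED: State.STOPPED},
    "restart": {State.RUNNING: State.RUNNING, State.PAUSED: State.RUNNING},
}

# Actions that put the agent (back) into the RUNNING state.
ACTIVATING = ("start", "resume", "restart")


class AgentLifecycle:
    """Hypothetical lifecycle controller with a kill switch and a history."""

    def __init__(self, kill_switch_engaged=False):
        self.state = State.STOPPED
        self.kill_switch_engaged = kill_switch_engaged
        self.history = []  # (action, state before, state after)

    def apply(self, action):
        # The kill switch blocks anything that would activate the agent,
        # but still allows pause and stop.
        if self.kill_switch_engaged and action in ACTIVATING:
            raise PermissionError(f"kill switch engaged, refusing {action!r}")
        allowed = TRANSITIONS.get(action, {})
        if self.state not in allowed:
            raise ValueError(f"cannot {action!r} from {self.state.value}")
        previous, self.state = self.state, allowed[self.state]
        self.history.append((action, previous.value, self.state.value))
        return self.state
```

Recording every transition in `history` gives change management something to audit. A production system would more likely persist these events than keep them in memory.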

Baseline Controls

  • immutable deployment artifacts
  • least-privilege credentials
  • environment-level secret injection
  • hard timeout and retry budgets
  • audit log for high-risk actions
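
One way to read "hard timeout and retry budgets" is as a wrapper that caps both the number of attempts and the total wall-clock time. The sketch below is a minimal version under that assumption. `run_with_budget` and `BudgetExceeded` are hypothetical names. The deadline is only checked between attempts, so this is a cooperative limit: interrupting a task that is already running would need process-level enforcement, which is not shown here.

```python
import time


class BudgetExceeded(Exception):
    """Raised when the attempt budget or the deadline is used up."""


def run_with_budget(task, max_attempts=3, deadline_seconds=30.0,
                    backoff_seconds=0.0, clock=time.monotonic,
                    sleep=time.sleep):
    """Call task() until it succeeds or a budget runs out.

    Returns (result, attempts_used). The clock and sleep parameters are
    injectable so the function can be exercised without real waiting.
    """
    start = clock()
    last_error = None
    for attempt in range(1, max_attempts + 1):
        # Deadline is checked before each attempt, not during it.
        if clock() - start >= deadline_seconds:
            raise BudgetExceeded(
                f"deadline of {deadline_seconds}s reached before attempt {attempt}"
            ) from last_error
        try:
            return task(), attempt
        except Exception as exc:  # broad on purpose: any failure costs one attempt
            last_error = exc
            if attempt < max_attempts:
                sleep(backoff_seconds * attempt)  # linear backoff
    raise BudgetExceeded(
        f"retry budget of {max_attempts} attempts exhausted"
    ) from last_error
```

The attempt count returned alongside the result can feed a retries-per-task metric directly.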

Metrics to Track

  • success rate
  • mean retries per task
  • time to recovery
  • cost per successful task
  • failure class distribution
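
The metrics above can be derived from per-task records. The sketch below assumes a hypothetical `TaskRecord` shape and makes one definitional choice that the skill does not specify: cost per successful task is total spend (failed tasks included) divided by the number of successes. Time to recovery is computed separately from assumed (detected, recovered) timestamp pairs.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRecord:
    """Hypothetical per-task record; field names are illustrative."""
    succeeded: bool
    retries: int
    cost_usd: float
    failure_class: Optional[str] = None


def summarize(records):
    """Aggregate success rate, retries, cost, and failure classes."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to summarize")
    successes = sum(1 for r in records if r.succeeded)
    total_cost = sum(r.cost_usd for r in records)
    failures = Counter(
        r.failure_class or "unclassified" for r in records if not r.succeeded
    )
    return {
        "success_rate": successes / total,
        "mean_retries_per_task": sum(r.retries for r in records) / total,
        # Assumption: failed tasks still count toward spend.
        "cost_per_successful_task": total_cost / successes if successes else None,
        "failure_class_distribution": dict(failures),
    }


def mean_time_to_recovery(incidents):
    """Average of (recovered_at - detected_at) over incidents, in seconds."""
    if not incidents:
        raise ValueError("no incidents recorded")
    return sum(end - start for start, end in incidents) / len(incidents)
```

For example, `summarize` over four records with two successes and a total spend of 1.00 reports a cost per successful task of 0.50.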

Incident Pattern

When the failure rate spikes, work through these steps in order:

  • freeze new rollout
  • capture representative traces
  • isolate failing route
  • patch with smallest safe change
  • run regression and security checks
  • resume gradually
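
The freeze, check, and gradual-resume steps above can be sketched as a small gate that starts frozen, unlocks only after both check suites pass, and then raises traffic in stages. `GradualResume`, its traffic steps, and the 5% failure threshold are all hypothetical values chosen for illustration.

```python
class GradualResume:
    """Hypothetical gate for resuming a patched route in stages."""

    def __init__(self, steps=(5, 25, 50, 100), max_failure_rate=0.05):
        self.steps = steps
        self.max_failure_rate = max_failure_rate
        self.frozen = True  # starts frozen, matching the first step
        self.index = -1     # -1 means the patched route has no traffic yet

    @property
    def traffic_percent(self):
        return 0 if self.index < 0 else self.steps[self.index]

    def unfreeze(self, regression_passed, security_passed):
        # Both check suites must pass before any traffic is restored.
        if not (regression_passed and security_passed):
            raise RuntimeError("checks must pass before resuming")
        self.frozen = False

    def advance(self, failures, total):
        """Move to the next traffic step, or refreeze if the window is unhealthy."""
        if self.frozen:
            raise RuntimeError("rollout is frozen")
        if total > 0 and failures / total > self.max_failure_rate:
            self.frozen = True
            self.index = -1
            return self.traffic_percent
        self.index = min(self.index + 1, len(self.steps) - 1)
        return self.traffic_percent
```

Dropping back to zero on a bad window is a deliberately conservative choice; a real rollout controller might instead step back one stage.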

Deployment Integrations

This skill pairs with:

  • PM2 workflows
  • systemd services
  • container orchestrators
  • CI/CD gates