Metrics: promql, metrics, capacity, prediction, resource

promql-capacity-analysis

Instruction

Using the Prometheus metrics `process_cpu_seconds_total` and `process_resident_memory_bytes`, assess how our services are doing on CPU and memory at evaluation time compared with the worst point in the last 6 hours. Report the current fleet-wide totals (aggregate CPU rate over roughly the last hour and current total resident memory) and the worst fleet-wide memory peak in the last 6 hours. Rank the top two jobs by CPU and the top two by resident memory, then say whether usage still looks comfortable or whether any jobs stand out as relatively tighter.
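
One possible set of queries for these figures is sketched below. It is a starting point under stated assumptions, not a reference solution: it assumes every series carries the standard `job` label, that all queries run as instant queries at the evaluation time, and that a 1-minute subquery resolution is fine enough to capture the 6-hour peak. The final ratio query is an illustrative way to judge "relatively tighter", not something the task prescribes.

```promql
# Fleet-wide CPU usage, in cores, averaged over the last hour.
sum(rate(process_cpu_seconds_total[1h]))

# Fleet-wide resident memory right now, in bytes.
sum(process_resident_memory_bytes)

# Worst fleet-wide resident memory in the last 6 hours.
# The subquery re-evaluates the fleet sum every 1m (assumed resolution).
max_over_time(sum(process_resident_memory_bytes)[6h:1m])

# Top two jobs by CPU rate over the last hour.
topk(2, sum by (job) (rate(process_cpu_seconds_total[1h])))

# Top two jobs by current resident memory.
topk(2, sum by (job) (process_resident_memory_bytes))

# Illustrative headroom check: each job's current memory as a fraction
# of its own 6h peak. Values near 1 mean the job is at or near its peak.
sum by (job) (process_resident_memory_bytes)
  /
max_over_time(sum by (job) (process_resident_memory_bytes)[6h:1m])
```

Because neither metric exposes a limit or capacity, these queries can only compare usage against the recent peak. Whether that counts as comfortable depends on limits known from outside these two metrics.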