
Helm Upgrade Failed: History Before Rollback

When `helm upgrade` fails, the release is telling you a story: read the history and revision manifests before you uninstall or roll back.


Your pipeline reports success — until someone checks the cluster:

```
helm status api -n prod
STATUS: failed
REVISION: 14
DESCRIPTION: Upgrade "api" failed: timed out waiting for condition
```

The reflex is immediate: roll back, uninstall, delete the Deployment. All three feel like progress.

They often make recovery harder.

**Helm stores each revision** (the ten most recent by default, configurable with `--history-max`). History is your timeline, not a footnote.

What failed status actually means

`failed` means revision 14 never reached a healthy deployed state. Pods from revision 13 might still be running. Hooks might be stuck. The chart might have rendered cleanly while readiness never passed.

Before any destructive command, answer:

  1. Which revision is actually serving traffic?
  2. What changed between the last good revision and this one?
  3. Is the release pending on a hook Job?

```shell
helm history api -n prod
helm get manifest api -n prod --revision 14
kubectl get pods -n prod -l app.kubernetes.io/instance=api
kubectl get jobs -n prod
```
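If you run these checks often, it can help to bundle them so the evidence lands in one place before anyone touches the release. The following is a minimal sketch, not an official Helm workflow: `triage_release` and its argument order are hypothetical, and the pod selector assumes the chart sets the standard `app.kubernetes.io/instance` label.

```shell
#!/bin/sh
# Read-only triage for a failed release: nothing here mutates the cluster.
# triage_release is an illustrative name, not a Helm or kubectl command.
triage_release() (
  release="$1"; namespace="$2"; revision="$3"

  echo "== history =="
  helm history "$release" -n "$namespace"

  echo "== manifest for revision $revision =="
  helm get manifest "$release" -n "$namespace" --revision "$revision"

  echo "== pods for this release =="
  # Assumption: the chart labels its pods with app.kubernetes.io/instance.
  kubectl get pods -n "$namespace" -l "app.kubernetes.io/instance=$release"

  echo "== jobs (look for stuck hook Jobs) =="
  kubectl get jobs -n "$namespace"
)
```

For the example release, `triage_release api prod 14 > triage-rev14.txt` would capture everything in a file you can attach to the incident.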

The decision order that works

| Symptom | First step | Why |
| --- | --- | --- |
| `failed` after upgrade | `helm history` | Pick the right rollback target |
| `pending-upgrade` | Check hook Jobs + `helm status` | Second upgrade races the first |
| Values changed, image unchanged | `helm get manifest` vs live Deployment | Values path / subchart alias issues |
| CrashLoop after bump | Diff rev N vs N-1 manifests | Rollback only after you know N-1 was good |
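The "diff rev N vs N-1" step can be sketched as a small helper that fetches both rendered manifests and compares them. This is one possible approach under stated assumptions: `diff_revisions` is a hypothetical name, and it relies only on `helm get manifest --revision` plus the standard `diff` and `mktemp` utilities.

```shell
#!/bin/sh
# Diff the rendered manifests of two revisions of the same release.
# diff_revisions is an illustrative helper, not part of Helm.
diff_revisions() (
  release="$1"; namespace="$2"; old="$3"; new="$4"
  tmpdir="$(mktemp -d)" || exit 1
  trap 'rm -rf "$tmpdir"' EXIT

  helm get manifest "$release" -n "$namespace" --revision "$old" > "$tmpdir/old.yaml" || exit 1
  helm get manifest "$release" -n "$namespace" --revision "$new" > "$tmpdir/new.yaml" || exit 1

  # diff exits 1 when the files differ, which is the expected case here,
  # so only an exit status above 1 is treated as an error.
  diff -u "$tmpdir/old.yaml" "$tmpdir/new.yaml"
  status=$?
  [ "$status" -le 1 ]
)
```

With the release from the example, `diff_revisions api prod 13 14` shows what revision 14 changed. The same shape should work for values by swapping in `helm get values`, if drift in values is the suspect.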

Rollback is recovery, not diagnosis. Running `helm rollback` without reading revision 14 restores service but leaves the cause unknown, so the next upgrade is another blind fix.


Traps that waste on-call time

`helm uninstall` first: deletes the release's resources and, unless you pass `--keep-history`, its revision history. The evidence is gone, and re-install ownership gets complicated.

Deleting the Deployment: Helm still thinks it owns the release; the next upgrade fights orphaned labels.

`helm upgrade --force` on pending releases: stacks operations while hooks still run.

Trusting `helm get notes`: notes are the chart's rendered NOTES.txt, documentation written by the chart author. Ingress hosts and Service names come from rendered manifests and values.
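To avoid the pending-release trap in a pipeline, one option is a guard that reads the release status and refuses to continue while an operation is still in flight. The sketch below assumes the plain-text `helm status` output contains a `STATUS:` line, as in the example at the top of this post; `release_status` and `ok_to_upgrade` are hypothetical names.

```shell
#!/bin/sh
# Print the STATUS field from `helm status` output (e.g. "failed", "pending-upgrade").
# Only the first STATUS: line is used, in case chart notes repeat the word.
release_status() (
  helm status "$1" -n "$2" | awk '$1 == "STATUS:" && !seen { print $2; seen = 1 }'
)

# Succeed only when the release is not in a pending-* state.
# ok_to_upgrade is an illustrative name, not a Helm command.
ok_to_upgrade() (
  status="$(release_status "$1" "$2")"
  case "$status" in
    "")        echo "could not read release status" >&2; exit 2 ;;
    pending-*) echo "release is $status: check hook Jobs before upgrading again" >&2; exit 1 ;;
    *)         echo "release is $status" ;;
  esac
)
```

A CI step could then run `ok_to_upgrade api prod && helm upgrade ...` so a second upgrade never starts while the first is still pending. A `failed` status passes the guard on purpose: that is the case where you read history and decide, rather than wait.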


Practice the pattern

The Helm Releases path in the Decision Trainer walks through failed upgrades, pending hooks, values drift, and rollback decisions — graded on your first step, not chart trivia.

If you manage releases in CI, pair this with the Platform Pack (Helm, Kustomize, Kyverno, Argo CD) for GitOps-adjacent incidents end to end.