Image upgrade with version-specific settings causes CrashLoopBackOff due to ConfigMap update before StatefulSet reconciliation #1926

@dashashutosh24

Description

Describe the bug

When upgrading ClickHouse from one version to another (e.g., 25.3 → 25.8) while simultaneously adding version-specific settings, the pods enter CrashLoopBackOff and never recover without manual intervention.

The root cause is the reconciliation order introduced in v0.24.3:

  1. ConfigMaps are updated first with new version-specific settings
  2. Operator attempts SYSTEM SHUTDOWN (software restart) on the existing pod
  3. Pod restarts with the OLD image but mounts the NEW ConfigMap
  4. ClickHouse crashes because the old version doesn't recognize the new settings
  5. Operator gets stuck in waitHostIsReady() waiting for the unhealthy host
  6. StatefulSet update (which would apply the new image) is never reached
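The failure chain above can be sketched as a small simulation (illustrative Go only; the types and function names here are hypothetical, not the operator's actual API):

```go
package main

import "fmt"

// Hypothetical, simplified model of the reconcile ordering described above.
type Host struct {
	Image          string // image the running pod was started with
	ConfigSettings string // settings mounted from the ConfigMap
}

// startOK models whether ClickHouse boots: an old image given a
// version-specific setting it does not recognize crashes on start.
func startOK(h Host) bool {
	return !(h.Image == "25.3" && h.ConfigSettings == "25.5+only")
}

func reconcileHost(h *Host, newImage, newSettings string) error {
	// 1. ConfigMap is updated first with the new settings.
	h.ConfigSettings = newSettings

	// 2-3. SYSTEM SHUTDOWN restarts the pod with the OLD image,
	//      which now mounts the NEW ConfigMap.
	if !startOK(*h) {
		// 4-5. The host never becomes ready, so the operator blocks
		//      in waitHostIsReady() and never reaches step 6.
		return fmt.Errorf("stuck in waitHostIsReady: host in CrashLoopBackOff")
	}

	// 6. StatefulSet update (new image) -- unreachable in the failure case.
	h.Image = newImage
	return nil
}

func main() {
	h := Host{Image: "25.3", ConfigSettings: "default"}
	err := reconcileHost(&h, "25.8", "25.5+only")
	fmt.Println(err)     // stuck in waitHostIsReady: ...
	fmt.Println(h.Image) // still 25.3 -- the image update is never applied
}
```

The key point is that the step that would fix the crash (the StatefulSet image update) sits behind a wait on the very host the earlier steps broke.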

This is a regression from v0.24.3 where the restart behavior was changed:

"Changed a way ClickHouse is restarted in order to pickup server configuration change. Instead of pod re-creation it tries SYSTEM SHUTDOWN first."

To Reproduce

  1. Deploy a ClickHouse cluster with operator v0.25.6 (or any version >= 0.24.3)
  2. Use ClickHouse version 25.3
  3. Update the CHI to:
    • Change image to ClickHouse 25.8
    • Add a setting that only exists in 25.5+ (e.g., write_marks_for_substreams_in_compact_parts)
  4. Apply the updated CHI
  5. Observe pods entering CrashLoopBackOff

Example CHI change:

spec:
  configuration:
    settings:
      merge_tree:
        # This MergeTree setting is only valid in ClickHouse 25.5+
        write_marks_for_substreams_in_compact_parts: 0
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:25.8.16.34  # Changed from 25.3

Expected behavior

When an image change is detected, the operator should:

  1. Update the StatefulSet with the new image first (or skip software restart for image changes)
  2. Wait for the new pod to be running with the new image
  3. Then apply ConfigMap changes

OR

The operator should detect that an image change requires a full pod replacement and skip the SYSTEM SHUTDOWN software restart attempt.
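A minimal sketch of the second option, assuming a hypothetical image-comparison check (this is not the operator's actual code):

```go
package main

import "fmt"

// Hypothetical sketch of the proposed fix: when the desired image differs
// from the running one, skip the SYSTEM SHUTDOWN soft restart and go
// straight to a full pod replacement via the StatefulSet update.
type restartKind int

const (
	softRestart restartKind = iota // SYSTEM SHUTDOWN; pod keeps the old image
	podReplace                     // StatefulSet update; pod gets the new image
)

func chooseRestart(runningImage, desiredImage string) restartKind {
	if runningImage != desiredImage {
		// An image change needs a new pod anyway; a soft restart would
		// boot the old image against the new ConfigMap and crash.
		return podReplace
	}
	return softRestart
}

func main() {
	fmt.Println(chooseRestart(
		"clickhouse/clickhouse-server:25.3",
		"clickhouse/clickhouse-server:25.8.16.34") == podReplace) // true
	fmt.Println(chooseRestart(
		"clickhouse/clickhouse-server:25.8.16.34",
		"clickhouse/clickhouse-server:25.8.16.34") == softRestart) // true
}
```

Either ordering fix works; the essential invariant is that a pod is never restarted against a ConfigMap written for a newer image than the one it will boot with.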

Actual behavior

  1. Operator detects the image change correctly (visible in logs)
  2. ConfigMaps are updated with new settings (including version-specific ones)
  3. Operator sends SYSTEM SHUTDOWN to ClickHouse
  4. Kubernetes restarts the container with the same old image (StatefulSet not yet updated)
  5. ClickHouse 25.3 fails to start due to unrecognized MergeTree setting:
    Code: 137. DB::Exception: Unknown setting 'write_marks_for_substreams_in_compact_parts'.
    
  6. Pod enters CrashLoopBackOff
  7. Operator stuck in waitHostIsReady() indefinitely
  8. StatefulSet reconciliation (with new image) is never executed

Workaround

Manual recovery:

kubectl rollout restart deployment clickhouse-operator -n <namespace>

This triggers a fresh reconciliation that properly updates the StatefulSet.

Prevention:
Perform upgrades in two phases:

  1. First: update the image only (no new settings)
  2. Second: add the version-specific settings once the pods are running the new image

Operator logs

Change detection working correctly:

I0217 09:59:32.115253  1 worker-reconciler-chi.go:182] logSWVersion():Host:0-0[0/0]:
  default/clickhouse:Host software version: 0-0 25.8.16[25.8.16.34/parsed from the tag: '25.8.16.34']

diff item [14]:'.Templates.PodTemplates[0].Spec.Containers[0].Image' = '"clickhouse/clickhouse-server:25.8.16.34"'

Stuck waiting for unhealthy host:

I0217 09:59:XX.XXXXXX  1 worker.go:XXX] waitHostIsReady():
  Waiting for host to be ready...
  [Repeated indefinitely - host never becomes ready due to CrashLoopBackOff]

Environment

Component             Version
Operator              0.25.6
Previous Operator     0.24.2 (worked correctly)
ClickHouse (before)   25.3
ClickHouse (after)    25.8
Kubernetes            1.34
Installation method   Helm

Additional context

  • This issue was introduced in v0.24.3 with the SYSTEM SHUTDOWN optimization
  • v0.24.2 and earlier versions did not have this issue because they always did full pod replacement
  • The issue only manifests when upgrading images AND adding version-specific settings simultaneously
