Skip to content

feat: rework BD naming to avoid requiring WWNs#266

Open
tserong wants to merge 9 commits intoharvester:masterfrom
tserong:wip-dont-require-wwn
Open

feat: rework BD naming to avoid requiring WWNs#266
tserong wants to merge 9 commits intoharvester:masterfrom
tserong:wip-dont-require-wwn

Conversation

@tserong
Copy link
Member

@tserong tserong commented Feb 20, 2026

Problem:
We can't really cope properly with disks that don't have WWNs, or whose WWNs change (the latter should never happen, but it did at least once in the past when a kernel update resulted in IDs for a certain make and model of NVMe changing).

Solution:

  • Identify disks by, in order:
    1. UUID if present (will be there for provisioned LHv1 volumes and LVM volumes)
    2. WWN if present
    3. Vendor+Model+SerialNumber+BusPath if none of the above exist.
  • BD CR names become non-deterministic (they're UUIDs for any new BDs after this change), so when scanning we don't check existing BDs by name, we check if there's one with the same UUID, then WWN, then Vendor+Model+Serial+BusPath.
  • This automatically prevents adding devices with duplicate FS/LVM UUIDs or WWNs
  • It would be nice if we could also interrogate provisioned LHv2 devices when scanned but that's rather more problematic (see details in HEP: Rework BlockDevice naming to avoid requiring WWNs [skip ci] harvester#10115)

Related Issue:
harvester/harvester#6261
harvester/harvester#8295

Test plan:
TBD

Note:
The first four commits in this PR aren't actually directly related to the naming rework. They're fixes for other issues that I found along the way while working on this.

@tserong tserong requested review from a team, Vicente-Cheng and Yu-Jack as code owners February 20, 2026 08:25
@tserong tserong marked this pull request as draft February 20, 2026 08:25
@tserong tserong force-pushed the wip-dont-require-wwn branch 7 times, most recently from aa16be7 to b800238 Compare February 27, 2026 09:14
@tserong tserong changed the title [DNM, WIP] feat: rework BD naming to avoid requiring WWNs feat: rework BD naming to avoid requiring WWNs Mar 2, 2026
@tserong tserong marked this pull request as ready for review March 2, 2026 04:58
@tserong tserong force-pushed the wip-dont-require-wwn branch from b800238 to c6a02ec Compare March 2, 2026 04:59
@tserong tserong force-pushed the wip-dont-require-wwn branch from c6a02ec to 7a6aa91 Compare March 2, 2026 05:06
@tserong tserong force-pushed the wip-dont-require-wwn branch 2 times, most recently from c40a53c to 67fce5c Compare March 6, 2026 09:04
@tserong
Copy link
Member Author

tserong commented Mar 6, 2026

I know CodeFactor thinks scanBlockDevicesOnNode() is a complex method, but IMO it's exactly as complex as it needs to be so ;-) I'm hoping we can ignore this check.

@mergify
Copy link

mergify bot commented Mar 9, 2026

This pull request is now in conflict. Could you fix it @tserong? 🙏

tserong added 5 commits March 10, 2026 11:43
…ices

harvester#253 updated BD CRs
to use stable /dev/mapper paths for multipath devices.  These paths need
to be resolved back to /dev/dm-0 when calling GetDiskByDevPath(),
otherwise fetching details like the WWN and Serial number from
/sys/block will fail, which means we lose the device WWN, SerialNumber
and Capacity information when multipath devices are provisioned.

Signed-off-by: Tim Serong <tserong@suse.com>
Harvester hasn't supported adding additional partitioned disks since
v1.0.2.

Signed-off-by: Tim Serong <tserong@suse.com>
With debug logging turned on, you will periodically see something like
the following:

time="2026-02-19T09:55:30Z" level=debug msg="Prepare to checking block device ac1b8d0086729ec4840eac4e2319cfaf"
time="2026-02-19T09:55:30Z" level=debug msg="Update block device ac1b8d0086729ec4840eac4e2319cfaf tags (Status) from [] to []"

Note that it's triggering an update even though the tags don't appear to
have changed (they're both rendered in the logs as empty slices).

The problem is that bd.Status.Tags is specified as `json:"tags,omitempty"`
but disk.Tags is specified as `json:"tags"`. This means that if both are
empty, bd.Status.Tags will be nil, while disk.Tags will be an empty slice.
In this case, reflect.DeepEqual() will return false, which will result
in an unnecessary update.  We can avoid this by skipping the DeepCopy
and DeepEqual entirely if both bd.Status.Tags and disk.Tags are already
empty.

Signed-off-by: Tim Serong <tserong@suse.com>
The original implementation ran `blkid -s UUID`, `blkid -s PTUUID`
and `blkid -s TYPE` for each device scanned.  We can collapse this
back to a single call if we invoke it once as `blkid -o export`
then parse the output and turn it into a map of strings.

Signed-off-by: Tim Serong <tserong@suse.com>
We can't really cope properly with disks that don't have WWNs, or whose
WWNs change (the latter should never happen, but it did at least once in
the past when a kernel update resulted in IDs for a certain make and
model of NVMe changing).

We can solve this as follows:

- Identify disks by, in order:
  1. UUID if present (will be there for provisioned LHv1 volumes and
     LVM volumes)
  2. WWN if present
  3. Vendor+Model+SerialNumber+BusPath if none of the above exist.
- BD CR names become non-deterministic (they're UUIDs for any new BDs
  after this change), so when scanning we don't check existing BDs by
  name, we check if there's one with the same UUID, then WWN, then
  Vendor+Model+Serial+BusPath.
- This automatically prevents adding devices with duplicate FS/LVM UUIDs
  or WWNs.
- It would be nice if we could also interrogate provisioned LHv2 devices
  when scanned but that's rather more problematic (see details in
  harvester/harvester#10115)

Related-to: harvester/harvester#6261
Related-to: harvester/harvester#8295
Signed-off-by: Tim Serong <tserong@suse.com>
tserong added 4 commits March 10, 2026 11:44
Signed-off-by: Tim Serong <tserong@suse.com>
There's a couple of changes we need to make here:

1. We're actually starting with two blockdevices now, not one,
   because the new NDM code picks up /dev/vdb in the vagrant host
2. The duplicated WWN device needs a new backing file, otherwise
   it'll pick up the duplicate disk UUID, not the duplicate WWN.

Signed-off-by: Tim Serong <tserong@suse.com>
There'll be more than the two expected block devices thanks for NDM now
picking up /dev/vdb.  It's more robust to check how many we started with
then expect to end up with two more than whatever that original number
was.

Signed-off-by: Tim Serong <tserong@suse.com>
Signed-off-by: Tim Serong <tserong@suse.com>
@tserong tserong force-pushed the wip-dont-require-wwn branch from cc5f7cd to 54a514e Compare March 10, 2026 00:46
@tserong
Copy link
Member Author

tserong commented Mar 10, 2026

Conflicts resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant