Skip to content

<feature>[storage]: support cleanAllVmMetadata param#3795

Open
MatheMatrix wants to merge 1 commit intozsv_5.0.0from
sync/tao.gan/clean-vm-metadata-ZSV-11867@@3
Open

<feature>[storage]: support cleanAllVmMetadata param#3795
MatheMatrix wants to merge 1 commit intozsv_5.0.0from
sync/tao.gan/clean-vm-metadata-ZSV-11867@@3

Conversation

@MatheMatrix
Copy link
Copy Markdown
Owner

Resolves: ZSV-11867

Change-Id: I6e6b616875706f67706d726d617367637863717a

sync from gitlab !9664

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 19, 2026

概览

本次变更为虚拟机元数据清理功能添加了批量清理能力。引入了新的布尔标志 cleanAllVmMetadata,允许一次性清理所有虚拟机的元数据。同时添加了清理数量计数字段 cleanedCount 和失败的主存储UUID列表,用于追踪清理操作的结果。

变更

内聚群 / 文件 摘要
头部消息与事件类
header/src/main/java/org/zstack/header/storage/primary/CleanupVmInstanceMetadataOnPrimaryStorageMsg.java, header/src/main/java/org/zstack/header/storage/primary/CleanupVmInstanceMetadataOnPrimaryStorageReply.java, header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataEvent.java
添加了 cleanAllVmMetadata 布尔字段用于控制清理范围,新增 cleanedCount 整数字段用于报告清理数量,以及 failedPrimaryStorageUuids 列表用于追踪失败的存储。
API请求消息
header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataMsg.java
vmUuids 参数从必需的非空改为可选,新增 cleanAllVmMetadata 布尔参数。
本地存储实现
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageBase.java, plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java
扩展了清理处理逻辑以支持批量清理模式,新增处理方法用于查询连接的主机并聚合清理结果,增强了KVM后端命令与响应结构。
NFS存储实现
plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java, plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackendCommands.java
更新了命令与响应类以支持新的 cleanAllVmMetadata 标志和 cleanedCount 字段。
SDK同步更新
sdk/src/main/java/org/zstack/sdk/CleanupVmInstanceMetadataAction.java, sdk/src/main/java/org/zstack/sdk/CleanupVmInstanceMetadataResult.java
同步API变更,调整 vmUuids 参数的必需性,新增 cleanAllVmMetadata 参数和 failedPrimaryStorageUuids 结果字段。

时序图

sequenceDiagram
    participant Client as 客户端
    participant API as API层
    participant LocalStorage as 本地存储
    participant Host as 连接的主机
    participant Backend as KVM后端

    Client->>API: 发送APICleanupVmInstanceMetadataMsg<br/>(cleanAllVmMetadata=true)
    API->>LocalStorage: CleanupVmInstanceMetadataOnPrimaryStorageMsg
    LocalStorage->>LocalStorage: 检测 cleanAllVmMetadata=true
    LocalStorage->>LocalStorage: 查询所有连接的主机
    LocalStorage->>Host: 遍历每个主机
    Host->>Backend: 调用KVM后端处理
    Backend->>Backend: 执行CleanupVmMetadataCmd<br/>(cleanAllVmMetadata=true)
    Backend-->>Host: 返回CleanupVmMetadataRsp<br/>(cleanedCount)
    Host-->>LocalStorage: 聚合结果
    LocalStorage->>LocalStorage: 累积cleanedCount
    LocalStorage-->>API: CleanupVmInstanceMetadataOnPrimaryStorageReply<br/>(cleanedCount)
    API-->>Client: APICleanupVmInstanceMetadataEvent<br/>(totalCleaned, failedPrimaryStorageUuids)
Loading

代码审查工作量估计

🎯 3 (中等) | ⏱️ ~20 分钟

🐰 一键清空所有尘埃,
批量处理不再徘徊,
主机逐一来响应,
计数累积更精彩!
存储清爽又快哉!✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题清晰地总结了PR的主要变更:添加对cleanAllVmMetadata参数的支持,与代码库改动直接相关且准确。
Description check ✅ Passed 描述与代码改动相关,提供了问题跟踪号(ZSV-11867)和GitLab同步信息,虽然简洁但足以说明PR的来源和目的。

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/tao.gan/clean-vm-metadata-ZSV-11867@@3

Comment @coderabbitai help to get the list of available commands and usage tips.

Resolves: ZSV-11867

Change-Id: I6e6b616875706f67706d726d617367637863717a
@MatheMatrix MatheMatrix force-pushed the sync/tao.gan/clean-vm-metadata-ZSV-11867@@3 branch from f662f09 to 1f54630 Compare April 19, 2026 11:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataMsg.java (1)

17-37: ⚠️ Potential issue | 🔴 Critical

APICleanupVmInstanceMetadataMsg 未被注册到处理器,将被视为未知消息。

在 VmInstanceManagerImpl.handleApiMessage() 方法中,apiCleanupVmInstanceMetadataMsg 没有对应的 instanceof 检查路由。该消息最终会调用 bus.dealWithUnknownMessage() 被拒绝,整个 API 不可用。因此,原计划的两个参数之间的互斥性校验(vmUuids 和 cleanAllVmMetadata)永远不会执行。需要在 handleApiMessage() 中添加 APICleanupVmInstanceMetadataMsg 的处理分支,并在 VmInstanceApiInterceptor 中实现参数校验,确保至少指定一个清理目标。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataMsg.java`
around lines 17 - 37, The APICleanupVmInstanceMetadataMsg is never handled
because VmInstanceManagerImpl.handleApiMessage() lacks an instanceof branch for
APICleanupVmInstanceMetadataMsg; add a branch that routes this message to the
existing VM metadata cleanup handler (or a new private handler method) so it is
not sent to bus.dealWithUnknownMessage(). Also implement parameter validation in
VmInstanceApiInterceptor (referencing APICleanupVmInstanceMetadataMsg, vmUuids,
cleanAllVmMetadata) to enforce that the request must specify at least one target
and that vmUuids and cleanAllVmMetadata are not both provided (i.e., either
cleanAllVmMetadata==true XOR vmUuids is non-empty), returning a proper API error
if the check fails.
🧹 Nitpick comments (1)
header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataEvent.java (1)

46-54: 建议同步更新 __example__,避免示例与返回结构脱节。

新增了 failedPrimaryStorageUuids 后,示例返回建议一并体现,便于 API 文档消费方理解完整字段集。

可选修改示例
 public static APICleanupVmInstanceMetadataEvent __example__() {
     APICleanupVmInstanceMetadataEvent evt = new APICleanupVmInstanceMetadataEvent();
     evt.totalCleaned = 5;
     evt.totalFailed = 0;
     evt.failedVmUuids = java.util.Collections.emptyList();
+    evt.failedPrimaryStorageUuids = java.util.Collections.emptyList();
     return evt;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataEvent.java`
around lines 46 - 54, The API response example needs to be updated to include
the new field added to APICleanupVmInstanceMetadataEvent:
failedPrimaryStorageUuids (with its getter/setter present on the class), so
update the "__example__" sample response to show this field (e.g., as an array
of UUID strings or an empty array) alongside the existing fields so
documentation consumers see the complete return structure; ensure the example
key name exactly matches failedPrimaryStorageUuids and the value type matches
List<String>.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageBase.java`:
- Around line 3564-3567: The cleanAll branch currently returns early and
bypasses the existing per-primary-storage serialization used by thdf.chainSubmit
and the metadata update path (so cleanAll can race with per-VM updates);
instead, wrap the clean-all work in the same per-PS chain: submit a chain task
via thdf.chainSubmit using the same serialization key you use for single-VM
cleanup/metadata updates, and inside that chain task call
handleCleanAllVmMetadataOnLocalStorage (performing any host-level fan-out from
within the chain). This ensures cleanAllVmMetadata follows the same serialized
boundary as the metadata update code and single-VM cleanup.
- Around line 3638-3650: The cleanAll logic in LocalStorageBase (method cleanAll
/ helper cleanAllVmMetadata) only queries hosts with HostStatus.Connected, which
allows cleanAll to report success while offline hosts remain uncleaned; change
the host enumeration to fetch all host UUIDs mounted to this primary storage
(remove the "and host.status = :hstatus" filter or query HostVO without status),
then detect hosts whose HostStatus != Connected and mark the operation as
partial failure or return an error (e.g., set reply to failed/partial with
cleanedCount for connected hosts and include the offline host UUIDs/count);
update the analogous block around the cleanAllVmMetadata call (the code at the
other occurrence) the same way so offline hosts are counted and reported rather
than ignored.

---

Outside diff comments:
In
`@header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataMsg.java`:
- Around line 17-37: The APICleanupVmInstanceMetadataMsg is never handled
because VmInstanceManagerImpl.handleApiMessage() lacks an instanceof branch for
APICleanupVmInstanceMetadataMsg; add a branch that routes this message to the
existing VM metadata cleanup handler (or a new private handler method) so it is
not sent to bus.dealWithUnknownMessage(). Also implement parameter validation in
VmInstanceApiInterceptor (referencing APICleanupVmInstanceMetadataMsg, vmUuids,
cleanAllVmMetadata) to enforce that the request must specify at least one target
and that vmUuids and cleanAllVmMetadata are not both provided (i.e., either
cleanAllVmMetadata==true XOR vmUuids is non-empty), returning a proper API error
if the check fails.

---

Nitpick comments:
In
`@header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataEvent.java`:
- Around line 46-54: The API response example needs to be updated to include the
new field added to APICleanupVmInstanceMetadataEvent: failedPrimaryStorageUuids
(with its getter/setter present on the class), so update the "__example__"
sample response to show this field (e.g., as an array of UUID strings or an
empty array) alongside the existing fields so documentation consumers see the
complete return structure; ensure the example key name exactly matches
failedPrimaryStorageUuids and the value type matches List<String>.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 69163841-9532-43bb-ac8f-d14b683bdd28

📥 Commits

Reviewing files that changed from the base of the PR and between 7609252 and 1f54630.

📒 Files selected for processing (10)
  • header/src/main/java/org/zstack/header/storage/primary/CleanupVmInstanceMetadataOnPrimaryStorageMsg.java
  • header/src/main/java/org/zstack/header/storage/primary/CleanupVmInstanceMetadataOnPrimaryStorageReply.java
  • header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataEvent.java
  • header/src/main/java/org/zstack/header/vm/APICleanupVmInstanceMetadataMsg.java
  • plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageBase.java
  • plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java
  • plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java
  • plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackendCommands.java
  • sdk/src/main/java/org/zstack/sdk/CleanupVmInstanceMetadataAction.java
  • sdk/src/main/java/org/zstack/sdk/CleanupVmInstanceMetadataResult.java

Comment on lines +3564 to +3567
if (msg.isCleanAllVmMetadata()) {
handleCleanAllVmMetadataOnLocalStorage(msg);
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

cleanAll 分支绕过了现有的元数据串行化队列。

这里直接 return 到新的全量清理逻辑,导致它不再经过下面单 VM cleanup 使用的 thdf.chainSubmit(...),也和 Line 3368 的 metadata update 路径失去了同一个串行化边界。这样全量清理可以和更新/单 VM 清理并发修改同一套 metadata,结果会依赖时序。建议把 cleanAllVmMetadata=true 也放进同一个 per-PS sync chain,再在 chain 内部做 host 级 fan-out。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageBase.java`
around lines 3564 - 3567, The cleanAll branch currently returns early and
bypasses the existing per-primary-storage serialization used by thdf.chainSubmit
and the metadata update path (so cleanAll can race with per-VM updates);
instead, wrap the clean-all work in the same per-PS chain: submit a chain task
via thdf.chainSubmit using the same serialization key you use for single-VM
cleanup/metadata updates, and inside that chain task call
handleCleanAllVmMetadataOnLocalStorage (performing any host-level fan-out from
within the chain). This ensures cleanAllVmMetadata follows the same serialized
boundary as the metadata update code and single-VM cleanup.

Comment on lines +3638 to +3650
List<String> connectedHostUuids = SQL.New(
"select h.hostUuid from LocalStorageHostRefVO h, HostVO host" +
" where h.primaryStorageUuid = :psUuid" +
" and h.hostUuid = host.uuid" +
" and host.status = :hstatus", String.class)
.param("psUuid", self.getUuid())
.param("hstatus", HostStatus.Connected)
.list();
if (connectedHostUuids.isEmpty()) {
logger.warn(String.format("[MetadataCleanup] cleanAll: no connected host found for local ps[uuid:%s]", self.getUuid()));
reply.setCleanedCount(0);
bus.reply(msg, reply);
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

cleanAll 现在会把“未清理离线宿主机”误报成成功。

这里仅查询 HostStatus.Connected 的宿主机;只要有一台在线宿主机成功,最终就会返回成功并带 cleanedCount。但挂载在该 local ps 上的离线宿主机上的 metadata 根本没有机会被清理,cleanAllVmMetadata 的语义就不成立了。建议像本类的扫描逻辑一样先枚举全部宿主机,把离线宿主机计入失败/部分失败,或者直接在存在离线宿主机时返回错误。

可参考的修正方向
-        List<String> connectedHostUuids = SQL.New(
+        List<String> allHostUuids = SQL.New(
+                        "select h.hostUuid from LocalStorageHostRefVO h, HostVO host" +
+                                " where h.primaryStorageUuid = :psUuid" +
+                                " and h.hostUuid = host.uuid", String.class)
+                .param("psUuid", self.getUuid())
+                .list();
+
+        List<String> connectedHostUuids = SQL.New(
                         "select h.hostUuid from LocalStorageHostRefVO h, HostVO host" +
                                 " where h.primaryStorageUuid = :psUuid" +
                                 " and h.hostUuid = host.uuid" +
                                 " and host.status = :hstatus", String.class)
                 .param("psUuid", self.getUuid())
                 .param("hstatus", HostStatus.Connected)
                 .list();
-        if (connectedHostUuids.isEmpty()) {
-            logger.warn(String.format("[MetadataCleanup] cleanAll: no connected host found for local ps[uuid:%s]", self.getUuid()));
-            reply.setCleanedCount(0);
+
+        List<String> disconnectedHostUuids = new ArrayList<>(allHostUuids);
+        disconnectedHostUuids.removeAll(connectedHostUuids);
+        if (!disconnectedHostUuids.isEmpty()) {
+            reply.setError(operr("cannot clean all vm metadata on local primary storage[uuid:%s], disconnected hosts=%s",
+                    self.getUuid(), disconnectedHostUuids));
             bus.reply(msg, reply);
             return;
         }

Also applies to: 3684-3690

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageBase.java`
around lines 3638 - 3650, The cleanAll logic in LocalStorageBase (method
cleanAll / helper cleanAllVmMetadata) only queries hosts with
HostStatus.Connected, which allows cleanAll to report success while offline
hosts remain uncleaned; change the host enumeration to fetch all host UUIDs
mounted to this primary storage (remove the "and host.status = :hstatus" filter
or query HostVO without status), then detect hosts whose HostStatus != Connected
and mark the operation as partial failure or return an error (e.g., set reply to
failed/partial with cleanedCount for connected hosts and include the offline
host UUIDs/count); update the analogous block around the cleanAllVmMetadata call
(the code at the other occurrence) the same way so offline hosts are counted and
reported rather than ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants