Skip to content

<fix>[ha]: defer skip-trace list cleanup on MN departure to prevent split-brain#3845

Open
MatheMatrix wants to merge 3 commits into5.4.8from
sync/yaohua.wu/5.4.8@@2
Open

<fix>[ha]: defer skip-trace list cleanup on MN departure to prevent split-brain#3845
MatheMatrix wants to merge 3 commits into5.4.8from
sync/yaohua.wu/5.4.8@@2

Conversation

@MatheMatrix
Copy link
Copy Markdown
Owner

When a management node departs, its VM skip-trace entries were
immediately removed. If VMs were still being started by kvmagent,
the next VM sync would falsely detect them as Stopped and trigger
HA, causing split-brain.

Fix: transfer departed MN skip-trace entries to an orphaned set with
10-minute TTL instead of immediate deletion. VMs in the orphaned set
remain skip-traced until the TTL expires or they are explicitly
continued, preventing false HA triggers during MN restart scenarios.

Resolves: ZSTAC-80821

Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34

Conflicts:
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java

Conflicts:
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java

sync from gitlab !9716

…plit-brain

When a management node departs, its VM skip-trace entries were
immediately removed. If VMs were still being started by kvmagent,
the next VM sync would falsely detect them as Stopped and trigger
HA, causing split-brain.

Fix: transfer departed MN skip-trace entries to an orphaned set with
10-minute TTL instead of immediate deletion. VMs in the orphaned set
remain skip-traced until the TTL expires or they are explicitly
continued, preventing false HA triggers during MN restart scenarios.

Resolves: ZSTAC-80821

Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34

Conflicts:
	plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java

Conflicts:
	plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java
…anup

Resolves: ZSTAC-80821

Change-Id: I59284c4e69f5d2ee357b1836b7c243200e30949a
Resolves: ZSTAC-80821

Change-Id: Ia9a9597feceb96b3e6e22259e2d0be7bde8ae499
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 23, 2026

概述

为 KVM 虚拟机管理引入了孤立虚拟机跳过超时配置参数,并实现了基于 TTL 的孤立虚拟机跟踪机制,在管理节点离开时保留跳过状态,待超时后自动清理并恢复追踪。

更改内容

涵盖的内容 / 文件 摘要
配置参数
plugin/kvm/src/main/java/org/zstack/kvm/KVMGlobalConfig.java
新增公共静态字段 ORPHANED_VM_SKIP_TIMEOUT,用于配置孤立虚拟机跳过超时时间(默认 600 秒),验证规则要求大于 0。
孤立虚拟机跟踪
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java
实现 TTL 机制追踪孤立虚拟机:nodeLeft 方法将离线节点的跳过集合转移至 orphanedSkipVms 并附加时间戳;syncVm 循环在每个周期清理过期条目,同时在构建 vmsToSkipSetHostSide 时融合孤立虚拟机 UUID;isVmDoNotNeedToTrace 扩展支持检查孤立状态直到 TTL 过期。新增详细日志记录转移、跳过、过期和清理操作。

序列图

sequenceDiagram
    participant MN as 管理节点
    participant Task as KvmVmSyncPingTask
    participant Host as KVM 主机
    participant State as 孤立虚拟机状态<br/>(orphanedSkipVms)

    MN->>Task: 节点离开事件 nodeLeft()
    Task->>State: 转移跳过的 VM UUID<br/>+ 时间戳
    State-->>Task: 保存孤立 VM 映射

    loop 每个 syncVm 周期
        Task->>State: 检查过期条目
        State->>State: 清理 TTL 已过期的项
        
        Task->>Task: 获取 vmsToSkip + orphanedSkipVms
        Task->>Host: 构建 vmsToSkipSetHostSide<br/>(包含孤立 VM)
        Host-->>Task: 同步响应
        
        Task->>Task: 检查 isVmDoNotNeedToTrace()
        Task-->>Task: 孤立 VM 返回 true<br/>(不追踪)
    end

    Note over State,Task: TTL 过期后
    State->>State: 移除过期孤立 VM
    Task->>Task: 恢复对 VM 的追踪<br/>(isVmDoNotNeedToTrace 返回 false)
Loading

代码审查工作量

🎯 3 (中等) | ⏱️ ~20 分钟

庆祝诗

🐰 孤立虚拟机需关怀,
TTL 计时不再拆,
节点离别莫惶恐,
时限到期自解脱,
追踪恢复新起航!✨


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
Title check ❌ Error PR标题超过了72字符的限制(共79字符),未完全遵守[scope]: 格式的长度要求。 将标题缩短至72字符以内,例如:[ha]: defer skip-trace cleanup on MN departure to prevent split-brain
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed PR描述详细说明了问题背景、解决方案和目的,与changeset密切相关,包含了JIRA ticket引用。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/yaohua.wu/5.4.8@@2

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java`:
- Around line 468-475: The orphanedSkipVms entries are never removed on an
explicit "continue trace" path, so add code in the VM_CONTINUE_TRACE_PATH
handler (the continuation-trace callback used by KvmVmSyncPingTask) to also
remove the VM from orphanedSkipVms in addition to clearing vmsToSkip;
specifically update the continuation handling logic that currently clears
vmsToSkip (and/or vmsToSkipSetHostSide) to call orphanedSkipVms.remove(vmUuid)
so syncVm()/vmsToSkipSetHostSide won't re-add or re-skip the VM after a manual
continuation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 45cfccc4-3e1e-45ef-80cd-5cf119c86705

📥 Commits

Reviewing files that changed from the base of the PR and between fb31815 and 1450efb.

📒 Files selected for processing (2)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMGlobalConfig.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java

Comment on lines +468 to +475
Set<String> skippedVms = vmsToSkip.remove(inv.getUuid());
if (skippedVms != null && !skippedVms.isEmpty()) {
long now = System.currentTimeMillis();
for (String vmUuid : skippedVms) {
orphanedSkipVms.put(vmUuid, now);
logger.info(String.format("moved VM[uuid:%s] from departed MN[uuid:%s] skip list to orphaned set" +
" (will expire in %d minutes)", vmUuid, inv.getUuid(), getOrphanTtlMs() / 60000));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

孤儿 skip 条目缺少“显式继续追踪”的回收路径。

这里把 VM 转入 orphanedSkipVms 之后,当前实现只会在 TTL 到期时删除它;VM_CONTINUE_TRACE_PATH 的处理逻辑仍然只清 vmsToSkip。这样即使启动流程已经完成,后续 syncVm() 仍会把该 VM 放进 vmsToSkipSetHostSide,最多额外屏蔽一个 vm.orphanedSkipTimeout 窗口的 HA/状态收敛,这和 PR 目标里的“TTL 到期或显式 continuation”不一致。建议在 continue-trace 回调里同步 orphanedSkipVms.remove(vmUuid)

💡 建议修改
 evtf.on(VmTracerCanonicalEvents.VM_CONTINUE_TRACE_PATH, new EventCallback() {
     `@Override`
     protected void run(Map tokens, Object data) {
         VmTracerCanonicalEvents.VmContinueTraceData data1
                 = (VmTracerCanonicalEvents.VmContinueTraceData) data;
+        String continuedVmUuid = data1.getVmUuid();
+
         if (data1.getApiId() != null && vmApis.containsKey(data1.getManagementNodeId()) && vmApis.get(data1.getManagementNodeId()).contains(data1.getApiId())) {
-            String vmUuid = vmApis.get(data1.getManagementNodeId()).remove(data1.getApiId());
-            logger.info("Continuing tracing VM: " + vmUuid);
-            vmsToSkip.get(data1.getManagementNodeId()).remove(vmUuid);
-            return;
+            continuedVmUuid = vmApis.get(data1.getManagementNodeId()).remove(data1.getApiId());
         }
 
-        if (data1.getVmUuid() != null) {
-            logger.info("Continuing tracing VM: " + data1.getVmUuid());
-            vmsToSkip.get(data1.getManagementNodeId()).remove(data1.getVmUuid());
+        if (continuedVmUuid != null) {
+            logger.info("Continuing tracing VM: " + continuedVmUuid);
+            orphanedSkipVms.remove(continuedVmUuid);
+            Set<String> skipped = vmsToSkip.get(data1.getManagementNodeId());
+            if (skipped != null) {
+                skipped.remove(continuedVmUuid);
+            }
         }
     }
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java` around lines
468 - 475, The orphanedSkipVms entries are never removed on an explicit
"continue trace" path, so add code in the VM_CONTINUE_TRACE_PATH handler (the
continuation-trace callback used by KvmVmSyncPingTask) to also remove the VM
from orphanedSkipVms in addition to clearing vmsToSkip; specifically update the
continuation handling logic that currently clears vmsToSkip (and/or
vmsToSkipSetHostSide) to call orphanedSkipVms.remove(vmUuid) so
syncVm()/vmsToSkipSetHostSide won't re-add or re-skip the VM after a manual
continuation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants