mctpd: use same IID for retries in query_peer_properties by FreddieJheng-quanta · Pull Request #152 · CodeConstruct/mctp

FreddieJheng-quanta · 2026-04-29T09:58:33Z

Per DSP0237 Table 9, retries are retransmissions of the same MCTP control message and must reuse the same instance ID (IID). Move mctp_next_iid() from the individual query functions to the caller so one IID is shared across all retries for a given request.

Tested:

root@bmc:~# journalctl -u mctpd
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get endpoint types for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x00
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get peer UUID for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x01
Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01

jk-ozlabs · 2026-04-30T01:29:54Z

Per DSP0237 Table 9

That table just covers the timeouts for the SMBus binding, and doesn't seem to cover any IID reuse behaviour of the control protocol. Can you elaborate on what's requiring this?

I would prefer separate IIDs for retries, as that allows us to track delays vs. lost messages.

FreddieJheng-quanta · 2026-04-30T04:06:19Z

Per DSP0237 Table 9

That table just covers the timeouts for the SMBus binding, and doesn't seem to cover any IID reuse behaviour of the control protocol. Can you elaborate on what's requiring this?

I would prefer separate IIDs for retries, as that allows us to track delays vs. lost messages.

Hi @jk-ozlabs, thanks for the review!
Wanted to discuss further the reasoning behind this implementation.

We encountered a PLDM device that was slow to respond during init sync, causing the requester to time out and switch to a new IID. The vendor's position was that the BMC should retry with the same IID within the MT4 window (5~6 sec) rather than treating it as a new request.

Before implementing this, I came across DSP0236 Table 10, which describes the Instance ID field as:
"used to identify new instances of an MCTP control Request or Datagram to differentiate new requests or datagrams that are sent to a given message terminus from retried messages that are sent to the same message terminus."

One reading is that the IID value itself signals the distinction: same IID = retry, different IID = new request, which would imply retries should preserve the IID. Though the spec doesn't explicitly say "retries must reuse the same IID" so it seems open to interpretation.

Would be great to hear your take on this!

jk-ozlabs · 2026-04-30T04:34:33Z

OK, it sounds like we would be justified in using the same IID within the retry loop - I would suggest updating the commit to reference DSP0236 Table 10 though.

This is probably the point at which we need to converge the multiple separate retry loops though, so we have consistent behaviour. Is that something you want to look at doing beforehand?

jk-ozlabs · 2026-04-30T05:00:44Z

(I might take a look at coalescing those retry loops now)

FreddieJheng-quanta · 2026-04-30T06:37:53Z

OK, it sounds like we would be justified in using the same IID within the retry loop - I would suggest updating the commit to reference DSP0236 Table 10 though.

This is probably the point at which we need to converge the multiple separate retry loops though, so we have consistent behaviour. Is that something you want to look at doing beforehand?

Thanks, Will update the commit to reference DSP0236 Table 10. Regarding coalescing the retry loops, note that query_get_peer_vdm_types doesn't take an iid parameter currently.

Per DSP0236 Table 10, the Instance ID field is used to differentiate new requests from retried messages — retries should reuse the same instance ID (IID). Move mctp_next_iid() from the individual query functions to the caller so one IID is shared across all retries for a given request. Tested: ``` root@bmc:~# journalctl -u mctpd Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get endpoint types for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get peer UUID for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x01 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01 ``` Signed-off-by: Freddie Jheng <Freddie.Jheng@quantatw.com>

jk-ozlabs · 2026-04-30T08:05:37Z

OK, I have refactored the various retry loops into one, and provided a facility to disable retries. In doing that, the IIDs are naturally repeated for command retries.

I have then taken the test case from your commit (preserving authorship for that), which passes.

This is in my experimental/refactor-retry branch: https://github.com/jk-ozlabs/mctp/tree/experimental/refactor-retry . Let me know what you think. If that's sensible, I'll propose for an actual PR.

I will likely have to modify some of the retry semantics, we probably don't want to retry on the downstream bridge pool probes either.

FreddieJheng-quanta · 2026-04-30T10:43:49Z

OK, I have refactored the various retry loops into one, and provided a facility to disable retries. In doing that, the IIDs are naturally repeated for command retries.

I have then taken the test case from your commit (preserving authorship for that), which passes.

This is in my experimental/refactor-retry branch: https://github.com/jk-ozlabs/mctp/tree/experimental/refactor-retry . Let me know what you think. If that's sensible, I'll propose for an actual PR.

I will likely have to modify some of the retry semantics, we probably don't want to retry on the downstream bridge pool probes either.

Thanks for the refactor, It looks much cleaner, though I haven't run the tests yet.

One minor correction: the docstring in test_query_peer_properties_same_iid_on_retry carried over my incorrect reference to DSP0237 Table 9, it should be DSP0236 Table 10.

Sorry that I’ll be away next week and won't be able to reply until I’m back on May 11th. Thanks for understanding!

FreddieJheng-quanta force-pushed the mctpd-fix-iid-retry branch from 8c36d19 to ba48d20 Compare April 29, 2026 10:10

FreddieJheng-quanta force-pushed the mctpd-fix-iid-retry branch from ba48d20 to 61b61e9 Compare April 30, 2026 06:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mctpd: use same IID for retries in query_peer_properties#152

mctpd: use same IID for retries in query_peer_properties#152
FreddieJheng-quanta wants to merge 1 commit intoCodeConstruct:mainfrom
FreddieJheng-quanta:mctpd-fix-iid-retry

FreddieJheng-quanta commented Apr 29, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026 •

edited

Loading

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

FreddieJheng-quanta commented Apr 29, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026

Uh oh!

jk-ozlabs commented Apr 30, 2026

Uh oh!

FreddieJheng-quanta commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FreddieJheng-quanta commented Apr 30, 2026 •

edited

Loading

FreddieJheng-quanta commented Apr 30, 2026 •

edited

Loading