mctpd: use same IID for retries in query_peer_properties#152
mctpd: use same IID for retries in query_peer_properties#152FreddieJheng-quanta wants to merge 1 commit intoCodeConstruct:mainfrom
Conversation
8c36d19 to
ba48d20
Compare
That table just covers the timeouts for the SMBus binding, and doesn't seem to cover any IID reuse behaviour of the control protocol. Can you elaborate on what's requiring this? I would prefer separate IIDs for retries, as that allows us to track delays vs. lost messages. |
Hi @jk-ozlabs, thanks for the review! We encountered a PLDM device that was slow to respond during init sync, causing the requester to time out and switch to a new IID. The vendor's position was that the BMC should retry with the same IID within the MT4 window (5~6 sec) rather than treating it as a new request. Before implementing this, I came across DSP0236 Table 10, which describes the Instance ID field as: One reading is that the IID value itself signals the distinction: same IID = retry, different IID = new request, which would imply retries should preserve the IID. Though the spec doesn't explicitly say "retries must reuse the same IID" so it seems open to interpretation. Would be great to hear your take on this! |
|
OK, it sounds like we would be justified in using the same IID within the retry loop - I would suggest updating the commit to reference DSP0236 Table 10 though. This is probably the point at which we need to converge the multiple separate retry loops though, so we have consistent behaviour. Is that something you want to look at doing beforehand? |
|
(I might take a look at coalescing those retry loops now) |
Thanks, Will update the commit to reference DSP0236 Table 10. Regarding coalescing the retry loops, note that |
Per DSP0236 Table 10, the Instance ID field is used to differentiate new requests from retried messages — retries should reuse the same instance ID (IID). Move mctp_next_iid() from the individual query functions to the caller so one IID is shared across all retries for a given request. Tested: ``` root@bmc:~# journalctl -u mctpd Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get endpoint types for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_msgtypes: Sending Get Message Type Support to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x00 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: Retrying to get peer UUID for peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0. Attempt 1 IID 0x01 Apr 28 22:29:09 bmc mctpd[1493]: mctpd: query_get_peer_uuid: Sending Get Endpoint UUID to peer eid 50 net 1 phys physaddr if 11 hw len 1 0x62 state 0 IID 0x01 ``` Signed-off-by: Freddie Jheng <Freddie.Jheng@quantatw.com>
ba48d20 to
61b61e9
Compare
|
OK, I have refactored the various retry loops into one, and provided a facility to disable retries. In doing that, the IIDs are naturally repeated for command retries. I have then taken the test case from your commit (preserving authorship for that), which passes. This is in my I will likely have to modify some of the retry semantics, we probably don't want to retry on the downstream bridge pool probes either. |
Thanks for the refactor, It looks much cleaner, though I haven't run the tests yet. One minor correction: the docstring in Sorry that I’ll be away next week and won't be able to reply until I’m back on May 11th. Thanks for understanding! |
Per DSP0237 Table 9, retries are retransmissions of the same MCTP control message and must reuse the same instance ID (IID). Move mctp_next_iid() from the individual query functions to the caller so one IID is shared across all retries for a given request.
Tested: