Describe the bug
Description
Broker throughput detection handles name clashes using the PostfixGenerator and SanitizedName; however, this mechanism is:
- not de-duplicating queue throughput correctly
- subject to race conditions and incorrect throughput values
- causing issues when writing the throughput report
Scenario:
- Case sensitive broker (e.g. RabbitMQ). Two queues differing only in casing, e.g. "Sales" and "sales"
First time ServiceControl throughput detection runs after the queues are created
- queue names are fetched and run through postfix. "sales" gets assigned a postfix of 1
- adding the throughput information is run asynchronously, without an await
- Both queue name instances query Raven for an existing match. Both return null, since neither has reached the await for saving
- "Sales" saves its endpoint data to Raven with Id of "Sales/Broker" and SanitizedName of "Sales".
- "sales" saves its endpoint data to Raven with Id of "sales/Broker" and SanitizedName of "sales1". Raven, being case insensitive, overwrites "Sales/Broker" with the details of "sales".
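The first-run sequence above can be reproduced in miniature. This is only a sketch, not ServiceControl or RavenDB code: `CaseInsensitiveStore`, `record_endpoint` and `first_run` are hypothetical names, and the stand-in store is assumed to keep the casing of the first Id written while replacing the document body, which is the outcome described above.

```python
import asyncio


class CaseInsensitiveStore:
    """Hypothetical stand-in for a document store with case-insensitive Ids."""

    def __init__(self):
        self._docs = {}

    async def load(self, doc_id):
        await asyncio.sleep(0)  # yield control, as a real I/O call would
        return self._docs.get(doc_id.lower())

    async def store(self, doc_id, fields):
        await asyncio.sleep(0)
        key = doc_id.lower()
        existing = self._docs.get(key)
        # Assumption: the first-written Id casing is kept, the body is replaced.
        kept_id = existing["Id"] if existing else doc_id
        self._docs[key] = {"Id": kept_id, **fields}

    def all(self):
        return list(self._docs.values())


async def record_endpoint(store, queue_name, sanitized_name):
    doc_id = f"{queue_name}/Broker"
    existing = await store.load(doc_id)
    if existing is None:
        # Both tasks reach this point before either save has completed.
        await store.store(doc_id, {"SanitizedName": sanitized_name})


async def first_run():
    store = CaseInsensitiveStore()
    # Result of the postfix step: the second clashing name gets a postfix of 1.
    queues = [("Sales", "Sales"), ("sales", "sales1")]
    # Both writes are started together, so neither waits for the other to save.
    tasks = [asyncio.create_task(record_endpoint(store, q, s)) for q, s in queues]
    await asyncio.gather(*tasks)
    return store.all()


docs = asyncio.run(first_run())
print(docs)
```

In this sketch both loads return nothing, both saves go ahead, and only one document survives: its Id comes from "Sales" while its SanitizedName comes from "sales".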
Subsequent throughput detection runs
- Depending on the order of fetching queue names from RabbitMQ, either "Sales" or "sales" gets through first
- the endpoint read is the existing "Sales/Broker" for both
- whichever one hits the save first will write their throughput data.
If the two queues were created at different times
- If "Sales" was created first, SC broker throughput detection ran, and "sales" was created some time later, the behaviour is the same as in "Subsequent throughput detection runs" above, since the lookup by the case-insensitive Id "sales/Broker" returns the existing "Sales/Broker" record. The difference is that the record in Raven would have SanitizedName set to "Sales" rather than "sales1", so the throughput report issue below wouldn't happen.
Generating throughput report
- The queue name written to the report is from the Id, i.e. "Sales"
- The queue name used to match to other throughput sources, i.e. Audit or Monitoring, is from SanitizedName, i.e. "sales1"
- The throughput report ends up with two entries for "Sales", one from the Broker and one with Audit/Monitoring
- Also note that if there were a genuine queue named "Sales1", its broker throughput would be recorded against the same throughput report record as "Sales", since their SanitizedName values match.
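The report mismatch above can be illustrated with a small sketch. It is not the actual report generator: the record shapes, `queue_name` and `build_report` are hypothetical, and grouping by SanitizedName is assumed to be case-insensitive, which is what the "Sales1" note implies.

```python
# Hypothetical records shaped after the scenario above.
broker_records = [
    {"Id": "Sales/Broker", "SanitizedName": "sales1"},   # overwritten record
    {"Id": "Sales1/Broker", "SanitizedName": "Sales1"},  # a genuine "Sales1" queue
]
audit_records = [
    {"Id": "Sales/Audit", "SanitizedName": "Sales"},
]


def queue_name(record):
    # The name written to the report is taken from the Id: "Sales/Broker" -> "Sales".
    return record["Id"].rsplit("/", 1)[0]


def build_report(records):
    # Records from different sources are matched on SanitizedName
    # (assumed case-insensitive for this sketch).
    groups = {}
    for record in records:
        groups.setdefault(record["SanitizedName"].lower(), []).append(record)
    return [
        {"name": queue_name(group[0]), "sources": sorted(r["Id"] for r in group)}
        for group in groups.values()
    ]


report = build_report(broker_records + audit_records)
for row in report:
    print(row)
```

Under these assumptions the report contains two rows named "Sales": one holding the broker data, which also absorbs the genuine "Sales1" queue, and one holding the audit data.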
Expected behavior
Correctly deduplicated reporting of throughput against both "Sales" and "sales", with the correct matching to their Audit and Monitoring throughput values.
Actual behavior
As above
Versions
6.7.2+ (since the throughput report generator has existed in ServiceControl)
Steps to reproduce
Have two queues with the same name but different casing in the same broker.
Relevant log output
Additional Information
Workarounds
Possible solutions
Additional information