Skip to content

Fix unit test follow-ups from post-merge review#2239

Open
ideaship wants to merge 10 commits intomainfrom
fix/unit-test-followups
Open

Fix unit test follow-ups from post-merge review#2239
ideaship wants to merge 10 commits intomainfrom
fix/unit-test-followups

Conversation

@ideaship
Copy link
Copy Markdown
Contributor

@ideaship ideaship commented Apr 29, 2026

Summary

Post-merge review of the unit test PRs (#2200#2234) against their
linked issues turned up 10 findings (3 correctness bugs, 3 missing/weak
tests, 2 dead code paths, 2 fixture gaps). This PR addresses all of
them, one commit per finding.

Correctness bugs

  • conductor/utils: deep_compare raised KeyError when a nested
    key present in a was absent from b — fixed with b.get(key, {}).
  • sonic/connections: get_connected_interfaces had a single outer
    try/except wrapping the entire interface loop; one bad interface
    aborted all subsequent processing — restructured to per-interface
    recovery with an inner try/except covering all per-interface work.
  • sonic/interface: _extract_port_number_from_alias returned None
    for Eth(Port5) because the fallback was end-anchored; added an
    explicit \(Port(\d+)\) step between the primary pattern and the
    trailing-number fallback (a digit-anywhere approach was rejected as it
    misparsed aliases like QSFP28-49).

Missing / weak tests

  • sonic/cache: lock acquisition was never verified per method;
    added three spy tests (get_device_interfaces, clear,
    get_cache_stats).
  • sonic/interface: six convert_netbox_interface_to_sonic fallback
    tests checked only the return value, not the logger.warning call;
    added loguru_logs assertions to all six.
  • sonic/interface: _convert_using_port_config breakout path where
    the base port is found but its alias yields no number was untested.
    The production code previously shared the "Could not find base port"
    warning with the genuinely-not-found case; split into a distinct
    "Could not extract port number from alias" message and added a test
    asserting it.
  • data/enums: collection-openstack-core dependency chain was
    checked only for reachability; added a link-by-link test for
    keystone → neutron → wait-for-nova → octavia.

Dead code / fixture issues

  • sonic/connections: len(group) > 1 guard in
    find_interconnected_devices was unreachable (devices only enter
    the graph when they have in-role peers, so groups always have ≥ 2);
    removed the guard and corrected the misleading test comment.
  • sonic/interface: interface_speed was computed in
    detect_port_channels but never passed to
    convert_netbox_interface_to_sonic; removed the dead block.
  • sonic/cache: autouse reset fixture called clear_interface_cache()
    but left _thread_local.interface_cache set; fixed to also delete
    the attribute so subsequent tests start clean.

Test plan

  • pipenv run pytest tests/unit/ passes
  • pipenv run flake8 osism/ tests/ clean
  • pipenv run mypy osism/ clean

AI-assisted: Claude Code

The existing test for collection-openstack-core only verified that
keystone, neutron, nova, glance, cinder, and placement were reachable
somewhere in the tree.  A silent rewiring — e.g. promoting octavia to
a direct child of keystone, or decoupling neutron from wait-for-nova —
would not have been detected.

Add a link-by-link chain test that walks each step of the
keystone → neutron → wait-for-nova → octavia path and asserts that
each hop resolves to a Role instance with the exact expected name.
This pins the structural wiring, not just reachability.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
deep_compare raised KeyError when a key in `a` held a dict value but
that key was absent from `b`.  The dict branch called `b[key]`
unconditionally (line 29), while the scalar branch above it had a
`key not in b` guard.

In production deep_compare diffs a desired-state config against the
current-state config.  When the current state is missing an entire
nested section the call would crash rather than record the addition.

Fix by replacing `b[key]` with `b.get(key, {})` so a missing key is
treated as an empty dict, causing the subtree from `a` to be recorded
in full as a required addition.

Add a regression test: `a = {"outer": {"inner": "val", "other": 1}}`,
`b = {}` must produce `updates == {"outer": {"inner": "val",
"other": 1}}` rather than raising KeyError.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
The existing concurrency smoke test checked that two threads do not
crash and return consistent results, but it did not verify that each
public method of InterfaceCache acquires the lock exactly once per
call.  A future refactor removing the lock from clear() or
get_cache_stats() would pass the full suite undetected.

Add three per-method spy tests that replace cache._lock with a
MagicMock wrapping a real threading.Lock and assert
__enter__.call_count == 1 after a single call to
get_device_interfaces, clear, and get_cache_stats respectively.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
clear_interface_cache() empties the internal _cache dict but leaves
_thread_local.interface_cache pointing to the now-empty InterfaceCache
instance.  After the fixture ran, get_interface_cache_stats() on the
main thread returned {} instead of None because the attribute was still
set on the thread-local.

This could mask a test that asserts get_interface_cache_stats() is None
on the main thread after a prior test has populated the cache: the
fixture would leave the attribute set and the assertion would see {} and
silently pass when it should not.

Fix by deleting the thread-local attribute after clearing, using
_thread_local.__dict__.pop so that subsequent tests start with no
thread-local cache, matching the state of a fresh thread.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tests/unit/tasks/conductor/sonic/test_interface_conversion.py" line_range="529-537" />
<code_context>
     assert _convert_using_port_config("Ethernet5", 5, True, {}) == "Eth1/2/2"


+def test_convert_using_port_config_breakout_base_port_found_but_alias_has_no_number():
+    # is_breakout=True; base port Ethernet4 IS found in port_config, but its
+    # alias "no-digits" yields no extractable number.  The code should log a
+    # warning and fall back to speed-based calculation.
+    # _convert_using_speed_calculation(5, None, True):
+    #   base_port=4, subport=2, physical_port=2 → "Eth1/2/2"
+    port_config = {"Ethernet4": {"alias": "no-digits"}}
+
+    assert _convert_using_port_config("Ethernet5", 5, True, port_config) == "Eth1/2/2"
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Also assert that the warning log is emitted in this breakout fallback case.

The test docstring says this path should log a warning, but the test only checks the return value. To fully verify the behavior and prevent regressions in logging, please capture logs (e.g., with the `loguru_logs` fixture) and assert that the expected warning is emitted, consistent with the other logging tests in this file.

Suggested implementation:

```python
def test_convert_netbox_interface_standard_form_resolves_via_alias(mocker):
    assert _convert_using_port_config("Ethernet5", 5, True, {}) == "Eth1/2/2"




```

```python
def test_convert_using_port_config_breakout_base_port_found_but_alias_has_no_number(
    loguru_logs,
):
    # is_breakout=True; base port Ethernet4 IS found in port_config, but its
    # alias "no-digits" yields no extractable number. The code should log a
    # warning and fall back to speed-based calculation.
    # _convert_using_speed_calculation(5, None, True):
    #   base_port=4, subport=2, physical_port=2 → "Eth1/2/2"
    port_config = {"Ethernet4": {"alias": "no-digits"}}

    assert _convert_using_port_config("Ethernet5", 5, True, port_config) == "Eth1/2/2"
    assert _has_log(
        loguru_logs,
        "WARNING",
        "alias has no numeric component; falling back to speed-based calculation",
    )

```

The log message substring in the `_has_log` assertion must match the actual warning emitted by `_convert_using_port_config` when the alias has no numeric component. If the implementation uses a different message, adjust the substring accordingly (keeping it specific enough to avoid false positives but resilient to minor wording changes). No other structural changes are required as long as the `loguru_logs` fixture is already defined and used similarly in other tests in this file.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +529 to +537
def test_convert_using_port_config_breakout_base_port_found_but_alias_has_no_number():
# is_breakout=True; base port Ethernet4 IS found in port_config, but its
# alias "no-digits" yields no extractable number. The code should log a
# warning and fall back to speed-based calculation.
# _convert_using_speed_calculation(5, None, True):
# base_port=4, subport=2, physical_port=2 → "Eth1/2/2"
port_config = {"Ethernet4": {"alias": "no-digits"}}

assert _convert_using_port_config("Ethernet5", 5, True, port_config) == "Eth1/2/2"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Also assert that the warning log is emitted in this breakout fallback case.

The test docstring says this path should log a warning, but the test only checks the return value. To fully verify the behavior and prevent regressions in logging, please capture logs (e.g., with the loguru_logs fixture) and assert that the expected warning is emitted, consistent with the other logging tests in this file.

Suggested implementation:

def test_convert_netbox_interface_standard_form_resolves_via_alias(mocker):
    assert _convert_using_port_config("Ethernet5", 5, True, {}) == "Eth1/2/2"


def test_convert_using_port_config_breakout_base_port_found_but_alias_has_no_number(
    loguru_logs,
):
    # is_breakout=True; base port Ethernet4 IS found in port_config, but its
    # alias "no-digits" yields no extractable number. The code should log a
    # warning and fall back to speed-based calculation.
    # _convert_using_speed_calculation(5, None, True):
    #   base_port=4, subport=2, physical_port=2 → "Eth1/2/2"
    port_config = {"Ethernet4": {"alias": "no-digits"}}

    assert _convert_using_port_config("Ethernet5", 5, True, port_config) == "Eth1/2/2"
    assert _has_log(
        loguru_logs,
        "WARNING",
        "alias has no numeric component; falling back to speed-based calculation",
    )

The log message substring in the _has_log assertion must match the actual warning emitted by _convert_using_port_config when the alias has no numeric component. If the implementation uses a different message, adjust the substring accordingly (keeping it specific enough to avoid false positives but resilient to minor wording changes). No other structural changes are required as long as the loguru_logs fixture is already defined and used similarly in other tests in this file.

@ideaship ideaship force-pushed the fix/unit-test-followups branch 2 times, most recently from 3ad1229 to 1211d69 Compare April 29, 2026 09:56
…interfaces

The outer try/except wrapped the entire interface loop.  A single
interface that raised from get_connected_device_via_interface would
abort processing of all subsequent interfaces for that device.

Issue #2205 required per-interface recovery (skip the offending
interface, log a warning, continue), but no inner try/except existed.
The existing test used only one interface, so it could not distinguish
"loop aborted" from "interface skipped" — both produced an empty set
and a single warning.

Restructure the exception handling:
- Outer try/except covers only get_cached_device_interfaces.  A failure
  there returns the empty sets immediately as before.
- Inner try/except wraps all per-interface processing — connection
  lookup, SONiC name conversion, set update, and port-channel mapping.
  Any failure skips that interface, logs a warning, and continues.

Add a regression test with two interfaces: the first raises, the second
connects normally.  Assert that the second interface still appears in
the returned set.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
find_interconnected_devices had an `if len(group) > 1:` guard around
the final append.  A device only enters role_graphs when it has at
least one in-role peer, so every BFS starting from a node in the
graph always collects >= 2 devices.  The guard could never be False
and the dead branch (silently dropping a group) was never taken.

A test comment incorrectly attributed the empty result for a
single-spine topology to this guard; the actual reason is that the
spine never enters role_graphs at all because it has no in-role peers.

Remove the guard and update the test comment to accurately describe
the mechanism.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
…ormat

The fallback regex used r"(\d+)$" which requires digits at the end of
the string.  "Eth(Port5)" ends with ")", not a digit, so the regex
never matched and the function returned None instead of 5.

The test for this case documented None as "actual behavior" with a
comment acknowledging the spec violation rather than surfacing the bug.

Replace the two-step strategy with three steps:
1. r"Eth(\d+)\(Port(\d+)\)" — existing primary, handles Eth54(Port54)
2. r"\(Port(\d+)\)" — new explicit handler for Eth(Port5) and similar
   aliases that lack a digit directly after "Eth"
3. r"(\d+)$" — trailing-number fallback for generic aliases such as
   "hundredGigE49" or "QSFP28-49"

A digit-anywhere fallback (r"(\d+)") was considered but rejected: it
would misparse aliases like "QSFP28-49" by returning 28 instead of 49.
The explicit \(Port(\d+)\) pattern is narrow enough to avoid that while
still covering the Eth(PortN) family.

Update the Eth(Port5) test to assert 5 and add a QSFP28-49 regression
test that pins the trailing-number fallback against the prefix-digit
hazard.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
_convert_using_port_config had two tested breakout paths: base port
not found at all (empty port_config), and base port found with a valid
alias.  The path where the base port IS found but its alias contains
no extractable number was not tested.

Give this case its own early return and warning in the production code
rather than sharing the "Could not find base port" message with the
genuinely-not-found case.  The new message is:
"Could not extract port number from alias '...' for breakout
interface ..."

Add a test: port_config with Ethernet4 having alias "no-digits",
ethernet_num=5 so _find_base_port_for_breakout walks back to Ethernet4,
_extract_port_number_from_alias returns None, the fallback produces
"Eth1/2/2", and the accurate warning is asserted via loguru_logs.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
Six convert_netbox_interface_to_sonic tests verified only the return
value of fallback paths but never checked that the expected
logger.warning call was actually made.  A future refactor silently
removing or changing those warnings would pass the full suite.

Add a _has_log helper (same pattern as test_config.py) and extend each
of the six tests with a WARNING-level assertion:
- string + device=None  → "Device object required"
- object + device=None  → "Device object required"
- no HWSKU              → "No HWSKU found"
- cache failure         → "Could not fetch device interfaces"
- empty port config     → "No port config found"
- port config raises    → "Could not load port config"

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
In detect_port_channels, interface_speed was computed for every LAG
member (including a potential get_speed_from_port_type call and its
logging side-effect) but the value was never used — it was not
forwarded to convert_netbox_interface_to_sonic, which does not accept
a speed parameter.

Remove the dead three-line computation block.  The conversion call is
unchanged; only the unused variable assignment is removed.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
@ideaship ideaship force-pushed the fix/unit-test-followups branch from 2819266 to 2a95960 Compare April 29, 2026 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant