Skill Error codes for pick and place manipulation by mustafab0 · Pull Request #2010 · dimensionalOS/dimos

mustafab0 · 2026-05-07T20:39:19Z

Problem

Manipulation skills return prose like "Error: Robot not found". The agent has to parse strings to know what failed — blocks structured retries. Phase 1 of the pick-and-place reliability plan.

Solution

Skills now return a SkillResult dataclass with success, message, error_code, duration_ms. Error codes are domain enums (CommonSkillError, ManipulationError). MCP auto-detects the new agent_encode() method, so the LLM sees structured JSON. Every return logs SKILL <name> result=<code> duration_ms=<n> server-side. All 18 manipulation skills migrated. Drive-by: enabled gripper on xarm7-planner-coordinator.

Breaking Changes

Skills return SkillResult instead of str. Replace "Error:" in result with result.is_success() / result.error_code.

How to Test

uv run pytest dimos/agents/test_skill_result.py dimos/manipulation/
dimos run xarm7-planner-coordinator, then python -i -m dimos.manipulation.planning.examples.manipulation_client`
_client.set_gripper(0.85) returns a SkillResult dataclass; logs show SKILL set_gripper result=OK ...

closes DIM-786

greptile-apps · 2026-05-07T20:45:55Z

Greptile Summary

This PR replaces free-form string returns from all 18 manipulation skills with a typed SkillResult dataclass carrying success, error_code (an Enum), message, duration_ms, and optional metadata. The MCP server's existing agent_encode hook auto-detects the new method on SkillResult and forwards structured JSON to the LLM, requiring no changes to the MCP layer.

dimos/agents/skill_result.py (new): defines SkillResult, CommonSkillError, and the skill_timing context manager that stamps wall-clock duration and emits a structured SKILL <name> result=<code> duration_ms=<n> log line on every return path.
dimos/manipulation/skill_errors.py (new): adds the ManipulationError enum with 10 domain-specific codes covering grasp failures, planning, gripper errors, and world-monitor availability.
manipulation_module.py / pick_and_place_module.py: all public skill methods and the private _lift_if_low / _preview_execute_wait helpers migrated to SkillResult; string-based error checks in callers replaced with result.is_success() / result.error_code.

Confidence Score: 4/5

Safe to merge; the migration is consistent across all 18 skills with correct error-code mappings and a well-tested wire contract

All skill methods are correctly migrated and the structured return type flows cleanly through the existing MCP agent_encode hook without requiring any server-side changes. The main gaps are in skill_timing (exception logging and duration mutation in composed skills) and the look/scan_objects semantic change — all quality concerns rather than correctness bugs.

dimos/agents/skill_result.py (skill_timing exception path and stamp mutation) and dimos/manipulation/pick_and_place_module.py (look/scan_objects empty-result semantics)

Important Files Changed

Filename	Overview
dimos/agents/skill_result.py	New file: defines `SkillResult` dataclass, `CommonSkillError` enum, and `skill_timing` context manager; `skill_timing` has a logging gap on unhandled exceptions and `stamp()` mutates result in-place which causes duration info to be overwritten in composed skills
dimos/manipulation/skill_errors.py	New file: clean `ManipulationError` enum with 10 domain-specific codes; `IK_FAILED` and `COLLISION_AT_START` are defined but unused in Phase 1 (reserved for future use)
dimos/manipulation/manipulation_module.py	Migrates 9 skill methods from `str` to `SkillResult`; `_lift_if_low` and `_preview_execute_wait` private helpers also converted; all error paths correctly mapped to structured codes
dimos/manipulation/pick_and_place_module.py	Migrates 9 pick-and-place skills; `look` and `scan_objects` now return `NO_OBJECTS_VISIBLE` failure instead of an informational success when empty, which is a notable semantic change for callers
dimos/agents/test_skill_result.py	Comprehensive tests for `SkillResult` factories, `agent_encode` wire contract, `skill_timing` duration measurement, and log format; good coverage of cross-domain enum usage
dimos/manipulation/blueprints.py	Drive-by: adds `add_gripper=True` to xarm7 planner coordinator config
dimos/manipulation/test_manipulation_unit.py	Updated `test_reset_not_during_execution` to assert `result.is_success()` and `result.error_code`; all other tests operate on non-skill methods and are unaffected by the type change
dimos/manipulation/test_pick_and_place_unit.py	Updated `test_place_back_no_pick_pose_errors` to check `result.error_code is ManipulationError.NO_PRIOR_POSE` and `result.message`; other tests target pure-logic helpers unaffected by return type change

Sequence Diagram

sequenceDiagram
    participant Agent as LLM Agent
    participant MCP as MCP Server
    participant PAP as PickAndPlaceModule
    participant MM as ManipulationModule

    Agent->>MCP: tool call: pick_and_place(object, x, y, z)
    MCP->>PAP: pick_and_place(...)
    activate PAP
    Note over PAP: skill_timing("pick_and_place")
    PAP->>PAP: pick(object_name)
    activate PAP
    Note over PAP: skill_timing("pick")
    PAP->>MM: _lift_if_low()
    MM-->>PAP: SkillResult.ok()
    PAP->>MM: plan_to_pose / _preview_execute_wait
    MM-->>PAP: SkillResult.ok/fail
    PAP-->>PAP: stamp(result) logs SKILL pick
    deactivate PAP
    PAP->>PAP: place(x, y, z)
    activate PAP
    Note over PAP: skill_timing("place")
    PAP->>MM: _lift_if_low / plan / execute
    MM-->>PAP: SkillResult
    PAP-->>PAP: stamp(result) logs SKILL place
    deactivate PAP
    PAP-->>PAP: stamp(place_result) logs SKILL pick_and_place
    deactivate PAP
    MCP->>MCP: hasattr(result, agent_encode) True
    MCP-->>Agent: "JSON {success, error_code, message, duration_ms}"

_{Reviews (1): Last reviewed commit: "updated manipulation blueprints and test..." | Re-trigger Greptile}

greptile-apps · 2026-05-07T20:45:59Z

+@contextmanager
+def skill_timing(name: str | None = None) -> Iterator[Callable[[SkillResult], SkillResult]]:
+    """Stamp a ``SkillResult`` with the elapsed wall time at exit, and log it.
+
+    If ``name`` is supplied, a structured ``logger.info`` line is emitted on
+    every ``stamp(...)`` call (i.e. every skill return point) with the skill
+    name, the outcome code, and the elapsed duration. Pass the skill's own
+    method name so the log is greppable.
+
+    Usage::
+
+        with skill_timing("set_gripper") as stamp:
+            ...
+            return stamp(SkillResult.ok("done"))
+    """
+    t0 = time.monotonic()
+
+    def stamp(result: SkillResult) -> SkillResult:
+        result.duration_ms = (time.monotonic() - t0) * 1000.0
+        if name is not None:
+            code = result.error_code.name if result.error_code is not None else "OK"
+            logger.info(f"SKILL {name} result={code} duration_ms={result.duration_ms:.1f}")
+        return result
+
+    yield stamp


skill_timing silently drops structured logging on exception exit

The PR description guarantees "Every return logs SKILL <name> result=<code> duration_ms=<n> server-side." However, stamp() is only called on explicit return paths inside the with block — any unhandled exception raised before stamp() (e.g., an AttributeError from a bad robot state, or a timeout in a downstream RPC call) silently exits the context manager without emitting the SKILL ... result=... log line. The MCP server's outer except Exception will still log a generic error, but the structured SKILL line won't appear. Adding a try/finally to emit a log on exception exit (e.g., result=EXCEPTION) would close this gap and make production debugging easier.

greptile-apps · 2026-05-07T20:46:01Z


        Args:
            robot_name: Robot context (only needed for multi-arm setups).
        """
-        obstacles = self.refresh_obstacles(0.0)
-
-        detections = self._detection_snapshot
-        if not detections:
-            return "No objects visible from current position"
-
-        lines = [f"Currently see {len(detections)} object(s):"]
-        for det in detections:
-            c = det.center
-            lines.append(
-                f"  - {det.name} [id={det.object_id[:8]}]: ({c.x:.3f}, {c.y:.3f}, {c.z:.3f})"
-            )
+        with skill_timing("look") as stamp:
+            obstacles = self.refresh_obstacles(0.0)
+
+            detections = self._detection_snapshot
+            if not detections:
+                return stamp(
+                    SkillResult.fail(
+                        ManipulationError.NO_OBJECTS_VISIBLE,
+                        "No objects visible from current position",
+                    )
+                )


look() now reports a failure when no objects are visible

Previously look() returned an informational success string ("No objects visible from current position") when nothing was detected — a perfectly normal outcome. With this change, a caller that loops on result.is_success() will interpret "no objects" as a transient error and retry indefinitely. The code for scan_objects has the same pattern. Whether "no objects visible" constitutes a failure vs. a valid empty-result depends on the agent's strategy; at a minimum the docstring should note that NO_OBJECTS_VISIBLE is the expected "empty" code so callers know to handle it specially.

greptile-apps · 2026-05-07T20:46:03Z

+    def stamp(result: SkillResult) -> SkillResult:
+        result.duration_ms = (time.monotonic() - t0) * 1000.0
+        if name is not None:
+            code = result.error_code.name if result.error_code is not None else "OK"
+            logger.info(f"SKILL {name} result={code} duration_ms={result.duration_ms:.1f}")
+        return result


stamp() mutates duration_ms on already-stamped inner SkillResult objects

In composed skills (pick_and_place, drop_on, scan_objects, place_back), the outer stamp() call overwrites duration_ms on a SkillResult that was already stamped by the inner skill's own skill_timing. For example, pick_and_place calls stamp(self.place(...)), so the returned SkillResult.duration_ms reflects the full pick_and_place elapsed time rather than the isolated place time. Logging is still consistent (each inner skill logs its own time), but a programmatic caller reading result.duration_ms on a failed inner result gets the outer skill's elapsed clock rather than the duration of the specific operation that failed. Cloning the result before stamping, or stamping only freshly-created results, would prevent this.

codecov · 2026-05-07T20:52:34Z

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

mustafab0 added 5 commits May 7, 2026 13:28

created structured return type for @Skill methods

9b0717c

manipulation skill erors with enums

de05845

added skill result tests

80dc449

refactored manipulation module to support new SkillResult return type

4c5906d

updated manipulation blueprints and test files

38817d9

mustafab0 marked this pull request as ready for review May 7, 2026 20:39

greptile-apps Bot reviewed May 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Skill Error codes for pick and place manipulation#2010

Skill Error codes for pick and place manipulation#2010
mustafab0 wants to merge 5 commits intolegacy-dev-dont-mergefrom
mustafa/feature/manipulation-skill-error-codes

mustafab0 commented May 7, 2026

Uh oh!

greptile-apps Bot commented May 7, 2026

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

codecov Bot commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mustafab0 commented May 7, 2026

Problem

Solution

Breaking Changes

How to Test

Uh oh!

greptile-apps Bot commented May 7, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 7, 2026

⚠️ JUnit XML file not found

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant