Fix/refresh respects session to fix job waiting #5790

Open
aviruthen wants to merge 2 commits into aws:master from aviruthen:fix/session-bug-approach-b

@aviruthen
Collaborator

Fixes #5765: wait=True (and other instance methods) ignored a user-provided sagemaker_session and fell back to default/ambient credentials. resources.py and shapes.py were regenerated because the fix required changes inside the autogenerated files.

Problem
All 77 resource classes in resources.py are code-generated from templates. Class methods (create(), get()) accept a session parameter and use it correctly, but instance methods (refresh(), wait(), update(), delete(), stop()) call Base.get_sagemaker_client() with no session, creating a new default client. When a user passes a custom session (e.g., assumed-role via STS), the instance methods fail with NoCredentialsError.

Solution (codegen template fix)
Store the session on the resource instance in get(), and use it in all instance methods:

GET_METHOD_TEMPLATE: {resource}._session = session after instance creation
REFRESH_METHOD_TEMPLATE: Base.get_sagemaker_client(session=getattr(self, '_session', None))
UPDATE_METHOD_TEMPLATE (both variants): same
DELETE_METHOD_TEMPLATE: same
STOP_METHOD_TEMPLATE: changed from SageMakerClient().sagemaker_client to use Base.get_sagemaker_client(session=...)

Backward compatible: when no session is passed, _session is None, and get_sagemaker_client(session=None) behaves identically to the old get_sagemaker_client().
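
The template change can be sketched in plain Python roughly as follows. This is a minimal, self-contained illustration and not the generated code: `FakeSession`, `FakeClient`, and `TrainingJobSketch` are hypothetical stand-ins that need neither boto3 nor AWS credentials. Only the `_session` attribute and the `getattr(self, '_session', None)` lookup are taken from the description above.

```python
from typing import Optional


class FakeSession:
    """Hypothetical stand-in for a user-provided session."""

    def __init__(self, name: str):
        self.name = name


class FakeClient:
    """Hypothetical stand-in for a client bound to one session."""

    def __init__(self, session_name: str):
        self.session_name = session_name

    def describe(self, name: str) -> dict:
        return {"name": name, "described_with": self.session_name}


class Base:
    @staticmethod
    def get_sagemaker_client(session: Optional[FakeSession] = None) -> FakeClient:
        # With no session, fall back to the default (ambient) credentials.
        return FakeClient(session.name if session is not None else "default")


class TrainingJobSketch(Base):
    def __init__(self, name: str):
        self.name = name
        self.described_with: Optional[str] = None

    @classmethod
    def get(cls, name: str, session: Optional[FakeSession] = None) -> "TrainingJobSketch":
        job = cls(name)
        # Store the session on the instance so later instance methods can reuse it.
        job._session = session
        return job

    def refresh(self) -> "TrainingJobSketch":
        # getattr keeps instances that never went through get() working.
        client = Base.get_sagemaker_client(session=getattr(self, "_session", None))
        self.described_with = client.describe(self.name)["described_with"]
        return self


custom = TrainingJobSketch.get("job-a", session=FakeSession("assumed-role")).refresh()
default = TrainingJobSketch.get("job-b").refresh()
legacy = TrainingJobSketch("job-c").refresh()  # built directly, has no _session attribute
```

In this sketch `custom` is refreshed with the assumed-role session, while `default` and `legacy` both fall back to the default client, which mirrors the backward-compatibility claim.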

Additional codegen fixes (pre-existing bugs)
resources_codegen.py: Fixed _get_instance_count_ref indentation bug for TrainingJob that produced unparseable Python
resources_codegen.py: Aliased Session import as Boto3Session to avoid collision with generated Session resource class
shapes_codegen.py: Fixed _filter_input_output_shapes to not filter out shapes transitively referenced by resource classes (e.g., CreateEndpointConfigInput); fixed output path to write into shapes/ subdirectory instead of shadowing the package
codegen.py: Lazy import of reformat_file_with_black to avoid circular import; scoped black to generated files only
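
The shape-filtering fix can be pictured as a reachability check. The sketch below is an assumption about what "transitively referenced" means here, not the actual `_filter_input_output_shapes` logic: the function names, the suffix rule, and the sample dependency map (including the idea that an `EndpointConfig` resource references `CreateEndpointConfigInput`) are all hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def transitively_referenced(roots: Iterable[str], members: Dict[str, List[str]]) -> Set[str]:
    """Return every shape reachable from `roots` by following member references."""
    seen: Set[str] = set()
    queue = deque(roots)
    while queue:
        shape = queue.popleft()
        if shape in seen:
            continue
        seen.add(shape)
        queue.extend(members.get(shape, []))
    return seen


def filter_shapes_sketch(
    all_shapes: List[str],
    resource_roots: Iterable[str],
    members: Dict[str, List[str]],
) -> List[str]:
    """Drop Input/Output shapes unless a resource class reaches them (assumed rule)."""
    keep = transitively_referenced(resource_roots, members)
    return [
        shape
        for shape in all_shapes
        if shape in keep or not shape.endswith(("Input", "Output"))
    ]


# Hypothetical dependency map: shape name -> shapes it references.
members = {
    "EndpointConfig": ["CreateEndpointConfigInput"],
    "CreateEndpointConfigInput": ["ProductionVariant"],
}
all_shapes = [
    "CreateEndpointConfigInput",
    "ProductionVariant",
    "DeleteEndpointConfigInput",
    "Tag",
]
kept = filter_shapes_sketch(all_shapes, ["EndpointConfig"], members)
```

Under these assumptions `CreateEndpointConfigInput` survives because a resource reaches it, while `DeleteEndpointConfigInput` is still filtered out.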

Testing
13 unit tests (test_session_propagation.py): verify _session stored on get(), used by refresh(), delete(), stop(), update(), backward compatibility with None
15 existing unit tests pass with no regressions
6 integration tests (test_session_wait_e2e.py): ProcessingJob wait, TrainingJob wait (via ModelTrainer and resource class), session survives refresh cycles, get-then-wait flow
Manual verification: TransformJob, Endpoint, and CompilationJob wait flows all confirmed working
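
A unit test for session propagation might look roughly like the following. It is a sketch under stated assumptions and not the contents of test_session_propagation.py: `ResourceSketch` and its `describe` call are hypothetical, and `Base.get_sagemaker_client` is replaced with a mock so no AWS access is needed.

```python
from unittest.mock import MagicMock, patch


class Base:
    @staticmethod
    def get_sagemaker_client(session=None):
        # Hypothetical placeholder; the checks below patch it out.
        raise RuntimeError("real client creation is not expected in this sketch")


class ResourceSketch(Base):
    def __init__(self, name):
        self.name = name

    @classmethod
    def get(cls, name, session=None):
        resource = cls(name)
        resource._session = session
        return resource

    def refresh(self):
        client = Base.get_sagemaker_client(session=getattr(self, "_session", None))
        client.describe(Name=self.name)  # hypothetical client method
        return self


def check_refresh_uses_stored_session():
    session = MagicMock(name="custom_session")
    with patch.object(Base, "get_sagemaker_client") as get_client:
        ResourceSketch.get("job", session=session).refresh()
    # The stored session must be forwarded to the client factory.
    get_client.assert_called_once_with(session=session)
    return True


def check_refresh_without_session_passes_none():
    with patch.object(Base, "get_sagemaker_client") as get_client:
        ResourceSketch.get("job").refresh()
    # No session given: the factory receives None, as before the change.
    get_client.assert_called_once_with(session=None)
    return True
```

Patching the client factory rather than the client itself keeps the assertion focused on which session is handed over, which is the behavior this PR changes.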

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@aviruthen changed the title from "Fix/session bug approach b" to "Fix/refresh respects session to fix job waiting" on Apr 24, 2026

Development

Successfully merging this pull request may close these issues.

[v3] FrameworkProcessor and ModelTrainer: 4 regressions (including dropping CodeArtifact support) from v2 migration
