Fix/refresh respects session to fix job waiting#5790
Open
aviruthen wants to merge 2 commits intoaws:masterfrom
Open
Fix/refresh respects session to fix job waiting#5790aviruthen wants to merge 2 commits intoaws:masterfrom
aviruthen wants to merge 2 commits intoaws:masterfrom
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #5765 — wait=True (and other instance methods) ignore user-provided sagemaker_session, falling back to default/ambient credentials. Regenerated resources.py and shapes.py as this required fixes within the autogenerated files.
Problem
All 77 resource classes in resources.py are code-generated from templates. Class methods (create(), get()) accept a session parameter and use it correctly, but instance methods (refresh(), wait(), update(), delete(), stop()) call Base.get_sagemaker_client() with no session, creating a new default client. When a user passes a custom session (e.g., assumed-role via STS), the instance methods fail with NoCredentialsError.
Solution (codegen template fix)
Store the session on the resource instance in get(), and use it in all instance methods:
GET_METHOD_TEMPLATE: {resource}._session = session after instance creation
REFRESH_METHOD_TEMPLATE: Base.get_sagemaker_client(session=getattr(self, '_session', None))
UPDATE_METHOD_TEMPLATE (both variants): same
DELETE_METHOD_TEMPLATE: same
STOP_METHOD_TEMPLATE: changed from SageMakerClient().sagemaker_client to use Base.get_sagemaker_client(session=...)
Backward compatible — when no session is passed, _session is None, and get_sagemaker_client(session=None) behaves identically to the old get_sagemaker_client().
Additional codegen fixes (pre-existing bugs)
resources_codegen.py: Fixed _get_instance_count_ref indentation bug for TrainingJob that produced unparseable Python
resources_codegen.py: Aliased Session import as Boto3Session to avoid collision with generated Session resource class
shapes_codegen.py: Fixed _filter_input_output_shapes to not filter out shapes transitively referenced by resource classes (e.g., CreateEndpointConfigInput); fixed output path to write into shapes/ subdirectory instead of shadowing the package
codegen.py: Lazy import of reformat_file_with_black to avoid circular import; scoped black to generated files only
Testing
13 unit tests (test_session_propagation.py): verify _session stored on get(), used by refresh(), delete(), stop(), update(), backward compatibility with None
15 existing unit tests pass with no regressions
6 integration tests (test_session_wait_e2e.py): ProcessingJob wait, TrainingJob wait (via ModelTrainer and resource class), session survives refresh cycles, get-then-wait flow
Manual verification: TransformJob, Endpoint, and CompilationJob wait flows all confirmed working
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.