Scalable

Scalable is a Python framework for orchestrating containerized, distributed workflows on HPC systems. It integrates container lifecycle management, scheduler-aware resource provisioning, and a Dask-based execution model so multi-stage scientific workflows can run consistently at scale.

Documentation

Full documentation is available at jgcri.github.io/scalable.

Installation

Install from PyPI:

pip install scalable

Install from source:

git clone https://github.com/JGCRI/scalable.git
pip install ./scalable

If your shell cannot find installed scripts (for example, scalable_bootstrap), add the directory where pip installed them to PATH.
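To locate the directory in question, a short sketch using only the Python standard library can help. It reports the scripts directory of the interpreter's default install scheme and checks whether a command is already reachable. The helper names `find_scripts_dir` and `is_on_path` are illustrative and not part of Scalable; for a `pip install --user` install the scripts directory may differ from the one reported here.

```python
import shutil
import sysconfig


def find_scripts_dir() -> str:
    """Return the scripts directory of this interpreter's default install scheme."""
    return sysconfig.get_path("scripts")


def is_on_path(command: str) -> bool:
    """Report whether the given command can be found on the current PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    print("Scripts directory:", find_scripts_dir())
    print("scalable_bootstrap on PATH:", is_on_path("scalable_bootstrap"))
```

If the second line prints False, adding the printed directory to PATH is the likely fix.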

System Requirements

Scheduler: Slurm
Local host tools: Docker
HPC host tools: Apptainer

Platform guidance:

Linux is recommended for bootstrapping.
On Windows, Git Bash is recommended.
On macOS, Terminal works as expected.

Quick Start

Scalable includes a bootstrap process that prepares a local/HPC work environment and required containers.

1. Choose a local working directory.
2. Run the bootstrap command.
3. Follow the interactive prompts.

cd <local_work_dir>
scalable_bootstrap

After setup completes, the workflow environment is launched on the HPC side. From the work directory, start an interactive Python session or execute a script:

# Either start an interactive session:
python3
# or run a script:
python3 <filename>.py

SSH Recommendation

Bootstrap performs multiple SSH operations. For best reliability and usability, configure key-based passwordless SSH authentication in advance.
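One way to confirm that key-based authentication is working before running bootstrap is sketched below. It relies on the OpenSSH client options `BatchMode=yes` (fail rather than prompt) and `ConnectTimeout`. The helper names and the example host are hypothetical, and this check is not part of Scalable itself.

```python
import subprocess


def build_ssh_check_command(host: str, timeout_s: int = 10) -> list[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt; fail if input is needed
        "-o", f"ConnectTimeout={timeout_s}",  # give up on unreachable hosts
        host,
        "true",                               # no-op remote command
    ]


def has_passwordless_ssh(host: str, timeout_s: int = 10) -> bool:
    """Return True if a non-interactive login to `host` succeeds."""
    try:
        result = subprocess.run(
            build_ssh_check_command(host, timeout_s),
            stdin=subprocess.DEVNULL,
            capture_output=True,
            timeout=timeout_s + 5,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection hung past the timeout.
        return False
    return result.returncode == 0
```

Calling `has_passwordless_ssh("user@hpc.example.org")` with your own login in place of the placeholder returns False if a password or host-key confirmation would have been required.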

Usage

At runtime, create a cluster, register container targets, scale workers, and submit functions.

1. Create a cluster

from scalable import SlurmCluster, ScalableClient

cluster = SlurmCluster(
    queue="slurm",
    walltime="02:00:00",
    account="GCIMS",
    interface="ib0",
    silence_logs=False,
)

2. Register container targets

cluster.add_container(
    tag="gcam",
    cpus=10,
    memory="20G",
    dirs={"/qfs/people/user/work/gcam-core": "/gcam-core", "/rcfs": "/rcfs"},
)
cluster.add_container(
    tag="stitches",
    cpus=6,
    memory="50G",
    dirs={"/qfs/people/user": "/user", "/rcfs": "/rcfs"},
)
cluster.add_container(
    tag="osiris",
    cpus=8,
    memory="20G",
    dirs={"/rcfs/projects/gcims/data": "/data", "/qfs/people/user/test": "/scratch"},
)
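When many targets are registered, the repeated calls can be driven from a list of specifications. The sketch below assumes only the `add_container` keyword arguments shown above (`tag`, `cpus`, `memory`, `dirs`); the helper `register_containers` and the duplicate-tag check are illustrative additions, not part of Scalable.

```python
# Container specifications mirroring two of the calls above.
CONTAINER_SPECS = [
    {
        "tag": "gcam",
        "cpus": 10,
        "memory": "20G",
        "dirs": {"/qfs/people/user/work/gcam-core": "/gcam-core", "/rcfs": "/rcfs"},
    },
    {
        "tag": "stitches",
        "cpus": 6,
        "memory": "50G",
        "dirs": {"/qfs/people/user": "/user", "/rcfs": "/rcfs"},
    },
]


def register_containers(cluster, specs):
    """Call cluster.add_container once per spec and return the registered tags."""
    tags = []
    for spec in specs:
        if spec["tag"] in tags:
            # Hypothetical safeguard: refuse to register the same tag twice.
            raise ValueError(f"duplicate container tag: {spec['tag']}")
        cluster.add_container(**spec)
        tags.append(spec["tag"])
    return tags
```

With a cluster created as in step 1, `register_containers(cluster, CONTAINER_SPECS)` would issue the same calls as the explicit version.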

3. Scale workers

cluster.add_workers(n=3, tag="gcam")
cluster.add_workers(n=2, tag="stitches")
cluster.add_workers(n=3, tag="osiris")

4. Submit functions

def func1(param):
    import gcam
    return gcam.__version__


def func2(param):
    import stitches
    return stitches.__version__


def func3(param):
    import osiris
    return osiris.__version__


client = ScalableClient(cluster)

fut1 = client.submit(func1, "gcam", tag="gcam")
fut2 = client.submit(func2, "stitches", tag="stitches")
fut3 = client.submit(func3, "osiris", tag="osiris")
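Submitted futures still have to be resolved to obtain their values. The sketch below illustrates the submit-then-collect pattern with the standard library's `concurrent.futures` as a stand-in, so it runs without a cluster. It assumes, without confirmation from this README, that the futures returned by `client.submit` expose a Dask-style `.result()` method; `collect_results` and `describe` are hypothetical helpers.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def collect_results(futures: dict[str, Future]) -> dict[str, object]:
    """Block on each future; map each name to its result or raised exception."""
    results = {}
    for name, future in futures.items():
        try:
            results[name] = future.result()
        except Exception as exc:
            # Record the failure and keep going so one error does not hide the rest.
            results[name] = exc
    return results


def describe(component: str) -> str:
    """Stand-in task; a real task would run inside its tagged container."""
    return f"{component} ok"


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            name: pool.submit(describe, name)
            for name in ("gcam", "stitches", "osiris")
        }
        print(collect_results(futures))
```

Under that assumption, passing `{"gcam": fut1, "stitches": fut2, "osiris": fut3}` to `collect_results` would gather the three results from the calls above.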

5. Scale down when complete

cluster.remove_workers(n=2, tag="gcam")
cluster.remove_workers(n=1, tag="stitches")
cluster.remove_workers(n=3, tag="osiris")

Function Caching

Scalable provides a cacheable decorator to avoid recomputing expensive function calls across retries or interrupted runs.

from scalable import cacheable


@cacheable(return_type=str, param=str)
def func1(param):
    import gcam
    return gcam.__version__


@cacheable(return_type=str, recompute=True, param=str)
def func2(param):
    import stitches
    return stitches.__version__


@cacheable
def func3(param):
    import osiris
    return osiris.__version__

For reliable behavior, explicitly specify argument and return types whenever possible.
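To make the idea concrete, here is a minimal sketch of how a disk-backed caching decorator can work in general. It is not Scalable's implementation: `disk_cached`, its `cache_dir` parameter, and the pickle-based key are assumptions made for illustration. The key is a hash of the function name and its pickled arguments, which hints at one plausible reason explicit types help: a cache key is only reliable when arguments serialize deterministically.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

DEFAULT_CACHE_DIR = Path(tempfile.gettempdir()) / "disk_cache_sketch"


def disk_cached(cache_dir=DEFAULT_CACHE_DIR, recompute=False):
    """Cache a function's return value on disk, keyed by its name and arguments."""
    cache_path = Path(cache_dir)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Hash the function name and pickled arguments into a file name.
            payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            entry = cache_path / (hashlib.sha256(payload).hexdigest() + ".pkl")
            if entry.exists() and not recompute:
                # Cache hit: return the stored value without calling func.
                return pickle.loads(entry.read_bytes())
            result = func(*args, **kwargs)
            cache_path.mkdir(parents=True, exist_ok=True)
            entry.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator
```

In this sketch a second call with the same arguments reads the stored value from disk, while `recompute=True` always runs the function and overwrites the stored entry.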

How to Contribute

Contributions are welcome.

1. Fork the repository.
2. Create a feature branch.
3. Implement changes and add or update tests.
4. Open a pull request with a clear summary and rationale.

For bug reports, feature requests, and support questions, open an issue:

https://github.com/JGCRI/scalable/issues

License

This project is licensed under the terms in LICENSE.md.
