The standardised orchestration platform for HPC. Define workflows as code, run across any cluster, and let AI ensure your workloads complete successfully. Focus on research, not infrastructure.
Expanse fixes this: one workflow definition, any cluster, and ML that predicts failures before they happen.
Expanse standardises how you define, share, and run computational workflows across any infrastructure — from your laptop to the world's largest supercomputers.
Define nodes as code functions with simple configuration files. No more custom job scripts — workflows are declarative, version-controlled, and portable across clusters.
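For a flavour of the idea, here is a minimal sketch of a node defined as a plain Python function with declarative resource metadata. The decorator and configuration keys below are illustrative stand-ins, not Expanse's actual API.

```python
# Illustrative sketch only: a stand-in decorator that attaches declarative
# configuration to a plain Python function. Not Expanse's real interface.

def node(name=None, config=None):
    """Attach node metadata (name, resource requests) to a function."""
    def wrap(fn):
        fn.node_name = name or fn.__name__
        fn.node_config = config or {}
        return fn
    return wrap

@node(name="preprocess", config={"cpus": 4, "memory": "8GB", "walltime": "00:30:00"})
def preprocess(raw_path: str) -> str:
    """Clean raw input data and return the path to the processed file."""
    cleaned_path = raw_path.replace(".raw", ".clean")
    # ... domain-specific preprocessing would happen here ...
    return cleaned_path

print(preprocess.node_name, preprocess.node_config)
```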
Data flows seamlessly from Python preprocessing on your laptop to Fortran simulations on a supercomputer and back to C analysis — automatically. Expanse uses a zero-copy transfer protocol for maximum efficiency.
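The snippet below illustrates the general zero-copy idea with a memory-mapped file shared between two stages. It is a conceptual sketch only, not Expanse's transfer protocol.

```python
# Conceptual illustration of zero-copy handoff between stages: the consumer maps
# the producer's output file instead of copying the bytes through memory.
import numpy as np

def produce(path: str, n: int = 1_000_000) -> None:
    """Preprocessing stage: write results straight into a memory-mapped array."""
    out = np.memmap(path, dtype=np.float64, mode="w+", shape=(n,))
    out[:] = np.random.default_rng(0).standard_normal(n)
    out.flush()

def consume(path: str, n: int = 1_000_000) -> float:
    """Analysis stage: map the same file read-only; no intermediate copy is made."""
    data = np.memmap(path, dtype=np.float64, mode="r", shape=(n,))
    return float(data.mean())

produce("stage_output.bin")
print(consume("stage_output.bin"))
```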
State-of-the-art ML models predict resource needs and failure risks before your workload runs. Get suggestions to prevent crashes and optimise allocations, so you can focus on research, not debugging.
Share and discover reusable computational nodes through a centralized registry. Eliminate version mismatches and scattered code. Make HPC accessible to everyone with reproducible, verified components.
Complete audit trails track who ran what, when, and with which changes. Built for regulated industries requiring compliance with FDA 21 CFR Part 11, MiFID II, and other standards. Administrators get full visibility into usage patterns and modifications.
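As a rough picture of what an audit trail captures, here is a hypothetical record structure. The field names are assumptions for illustration, not Expanse's audit schema.

```python
# Hypothetical audit record: who ran what, when, and with which changes.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class AuditRecord:
    user: str          # who ran it
    workflow: str      # what was run
    git_commit: str    # which version of the code
    started_at: str    # when (ISO 8601, UTC)
    parameters: dict   # with which inputs and changes

record = AuditRecord(
    user="jdoe",
    workflow="protein-screen",
    git_commit="a1b2c3d",
    started_at=datetime.now(timezone.utc).isoformat(),
    parameters={"temperature_K": 310, "steps": 500_000},
)
print(json.dumps(asdict(record), indent=2))
```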
Run different stages on local machines, SLURM clusters, or Ray — all within a single workflow. Expanse handles the complexity of heterogeneous compute environments.
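The sketch below shows the kind of dispatch Expanse abstracts away: one workflow whose stages go to the local machine or a SLURM queue (Ray would be a third target). The stage table and helper functions are illustrative, not Expanse's API.

```python
# Illustrative only: routing stages of one workflow to different executors.
import subprocess

def run_local(cmd: list[str]) -> None:
    """Run a stage directly on the current machine."""
    subprocess.run(cmd, check=True)

def run_slurm(script: str, cpus: int, mem: str, walltime: str) -> None:
    """Submit a stage to a SLURM cluster via sbatch."""
    subprocess.run(
        ["sbatch", f"--cpus-per-task={cpus}", f"--mem={mem}", f"--time={walltime}", script],
        check=True,
    )

STAGES = [
    {"name": "preprocess", "target": "local", "cmd": ["python", "preprocess.py"]},
    {"name": "simulate",   "target": "slurm", "script": "simulate.sbatch",
     "cpus": 64, "mem": "128G", "walltime": "12:00:00"},
    {"name": "analyse",    "target": "local", "cmd": ["python", "analyse.py"]},
]

for stage in STAGES:
    if stage["target"] == "local":
        run_local(stage["cmd"])
    elif stage["target"] == "slurm":
        run_slurm(stage["script"], stage["cpus"], stage["mem"], stage["walltime"])
```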
Expanse uses machine learning to analyse your workflows before execution, predicting resource requirements and potential failures. Get actionable suggestions to ensure your workloads run end-to-end without crashes.
Before execution, Expanse analyses your code, data, and cluster state to predict out-of-memory errors, walltime overruns, and other failure modes. Get warnings and recommendations to prevent crashes.
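Conceptually, this is a pre-flight check of estimates against the requested allocation. The toy heuristic below only hints at the idea; Expanse's predictions come from trained models, not fixed thresholds.

```python
# Simplified pre-flight check: compare estimates against the requested allocation
# and warn before submission. Thresholds and estimates are illustrative only.

def preflight_check(est_peak_mem_gb: float, req_mem_gb: float,
                    est_runtime_h: float, req_walltime_h: float) -> list[str]:
    warnings = []
    if est_peak_mem_gb > 0.9 * req_mem_gb:
        warnings.append(
            f"Likely out-of-memory: estimated peak {est_peak_mem_gb:.1f} GB "
            f"vs. {req_mem_gb:.1f} GB requested; consider requesting more memory."
        )
    if est_runtime_h > 0.8 * req_walltime_h:
        warnings.append(
            f"Walltime risk: estimated runtime {est_runtime_h:.1f} h "
            f"vs. {req_walltime_h:.1f} h limit; consider a longer limit or checkpointing."
        )
    return warnings

for w in preflight_check(est_peak_mem_gb=30.0, req_mem_gb=32.0,
                         est_runtime_h=10.5, req_walltime_h=12.0):
    print("WARNING:", w)
```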
ML models analyse workload patterns to suggest optimal CPU, memory, and walltime allocations. Reduce wasted resources and queue wait times with data-driven recommendations.
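For intuition, right-sizing can be as simple as padding a high percentile of historical peak usage. The helper below is an illustrative stand-in for the ML models, not how Expanse computes its suggestions.

```python
# Toy right-sizing heuristic: pad the 95th percentile of observed peak memory.

def suggest_allocation(peak_mem_gb_history: list[float], safety_factor: float = 1.2) -> float:
    """Suggest a memory request from past peak usage of similar workloads."""
    peaks = sorted(peak_mem_gb_history)
    p95 = peaks[int(0.95 * (len(peaks) - 1))]
    return round(p95 * safety_factor, 1)

history = [11.2, 12.8, 10.9, 13.5, 12.1, 14.0, 11.7, 12.4]
print(f"Suggested memory request: {suggest_allocation(history)} GB")
```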
Expanse predicts queue wait times and suggests alternative clusters or resource configurations. Make informed decisions about when and where to run your workloads for maximum efficiency.
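The decision being automated looks roughly like the toy example below, which picks the cluster with the earliest expected completion. The cluster names and estimates are made up for illustration.

```python
# Toy cluster selection: minimise estimated queue wait plus estimated runtime.

clusters = {
    "campus-slurm": {"est_queue_wait_h": 6.0,  "est_runtime_h": 4.0},
    "national-hpc": {"est_queue_wait_h": 18.0, "est_runtime_h": 1.5},
    "cloud-burst":  {"est_queue_wait_h": 0.2,  "est_runtime_h": 7.0},
}

best = min(clusters, key=lambda c: clusters[c]["est_queue_wait_h"] + clusters[c]["est_runtime_h"])
eta = clusters[best]["est_queue_wait_h"] + clusters[best]["est_runtime_h"]
print(f"Run on {best}: expected completion in {eta:.1f} h")
```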
Define nodes as reusable code functions with simple YAML configuration. Compose workflows by referencing nodes — each can target different clusters, and data flows between them automatically with zero-copy efficiency.
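A workflow composition in this spirit might look like the sketch below. The YAML keys and the dependency-ordered dispatch loop are assumptions for illustration, not Expanse's actual schema.

```python
# Illustrative only: composing a workflow from named nodes via a YAML description.
import yaml  # pip install pyyaml

WORKFLOW_YAML = """
workflow: protein-screen
nodes:
  preprocess:
    run: preprocess.clean_inputs      # Python function on the local machine
    cluster: local
  simulate:
    run: md.run_simulation            # Fortran simulation wrapped as a node
    cluster: slurm-hpc
    needs: [preprocess]
  analyse:
    run: analysis.summarise           # C analysis tool wrapped as a node
    cluster: local
    needs: [simulate]
"""

config = yaml.safe_load(WORKFLOW_YAML)

# Dispatch nodes in dependency order (a trivial topological pass for this example).
done = set()
while len(done) < len(config["nodes"]):
    for name, spec in config["nodes"].items():
        if name in done or not set(spec.get("needs", [])) <= done:
            continue
        print(f"dispatching {name} ({spec['run']}) to {spec['cluster']}")
        done.add(name)
```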
Join researchers using Expanse to make computational science more reproducible, shareable, and efficient.