hpc-runtime-doctor

HeshamFS/materials-simulation-skills
41 · Added Jun 5, 2026
hpc, mpi, gpu-computing, job-scheduler, cluster-computing, materials-simulation, slurm, cuda

Summary

Diagnose HPC runtime and scheduler problems for materials simulations, including MPI/OpenMP/GPU layout, modules, CUDA/Kokkos hints, scratch paths, walltime, job arrays, restart strategy, scheduler portability, and resource mismatch. Use when jobs fail, run slowly, get killed, or behave differently on a cluster than on a workstation.

SKILL.md