This paper presents a system deployed on parallel clusters to manage a collection of parallel simulations that make up a computational study. It explores how such a system can extend traditional parallel job scheduling and resource allocation techniques to incorporate knowledge specific to the study. Using a UINTAH-based helium gas simulation code (ARCHES) and the SimX system for multi-experiment computational studies, this paper demonstrates that, by using application-specific knowledge in resource allocation and scheduling decisions, one can reduce the run time of a computational study from over 20 hours to under 4.5 hours on a 32-processor cluster, and from almost 11 hours to just over 3.5 hours on a 64-processor cluster.
- High-throughput computing
- Parallel system