Virtualization for Scientific Workload

I. Zacharov, O. Panarin, E. Ryabinkin, K. Izotov, A. Teslyuk

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution (peer-reviewed)

    1 Citation (Scopus)

    Abstract

    Computationally intensive (HPC) applications and the data analytics of Big Data applications (BDA) together constitute a combined workflow for scientific discovery. Yet the development and execution environments for HPC and BDA have different origins, and even within a single discipline the requirements for provisioning libraries and tools within the Linux operating system differ widely. Traditionally, HPC library versioning is addressed by system administrators using software Modules and other multi-root build frameworks. This does not necessarily provide compatibility between the different systems on which the programs should run. Such compatibility is required for effective sharing of code between scientific institutions and for deployment on centralized supercomputing resources (Centers of Collective Usage). Flexibility in the software build model, and in subsequent deployment at the target computer center, can be provided by virtualization and/or containerization of the software. The virtualization framework also addresses the unification of the HPC and BDA workflows into a common scientific discovery model. Virtualization was therefore chosen as the software basis of the recent cluster deployment at the Skoltech computer center. Within this deployment model we have investigated the implications for program run time of deployment in a virtualized environment. Given the multitude of scientific goals, the users of the system can accept no compromise on machine performance or program scalability. We therefore measured these parameters for a representative range of scientific applications and for a variety of running environments, including the usage of CPUs, GPUs and high-speed networks.
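    The Modules-based library provisioning mentioned above is driven from the shell. A minimal sketch of a typical session follows; the package names and version numbers are illustrative assumptions, not taken from the paper:

    ```shell
    # List the library builds the site administrators have made available
    module avail

    # Select one specific compiler/MPI/library stack for this session
    # (names and versions are hypothetical examples)
    module load gcc/8.3.0
    module load openmpi/3.1.4
    module load fftw/3.3.8

    # Show what is currently loaded in the environment
    module list
    ```

    Because each `module load` only rewrites environment variables such as `PATH` and `LD_LIBRARY_PATH` on the local system, a program built this way is tied to that site's module tree, which is the portability limitation the abstract raises.
    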
    In this article we report joint work between the Skolkovo Institute of Science and Technology (Skoltech) and the National Research Center "Kurchatov Institute" comprising this evaluation. We have established the build and running environment using several container technologies. The setup retains more than 95% of the performance of bare-metal run times. We describe the conditions for using Nvidia GPGPUs within containers, and we have tested a model of load balancing by migration of containers to a different node. The presented work paves the way to a more flexible "deploy everywhere" model of execution that will speed up scientific discovery and improve the opportunity for science.
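    As an illustration of the kind of containerized GPU execution the abstract refers to, here is a hedged sketch using Singularity, a container runtime widely used on HPC clusters in this period. The image tag and commands are assumptions for illustration, not the paper's actual setup:

    ```shell
    # Fetch a CUDA runtime image from Docker Hub and convert it to a
    # Singularity image file (the tag is an illustrative example)
    singularity pull cuda.sif docker://nvidia/cuda:10.0-runtime

    # The --nv flag bind-mounts the host's Nvidia driver libraries and
    # /dev/nvidia* device files into the container -- the basic condition
    # for GPGPU access from inside a container
    singularity exec --nv cuda.sif nvidia-smi
    ```

    The key design point is that the CUDA user-space libraries live in the image while the kernel driver stays on the host, so the two must be version-compatible for the GPU to be usable inside the container.
    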

    Original language: English
    Title of host publication: 2018 International Scientific and Technical Conference Modern Computer Network Technologies, MoNeTeC 2018 - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Electronic): 9781538694565
    DOIs
    Publication status: Published - 10 Dec 2018
    Event: 2018 International Scientific and Technical Conference Modern Computer Network Technologies, MoNeTeC 2018 - Moscow, Russian Federation
    Duration: 25 Oct 2018 - 26 Oct 2018

    Publication series

    Name: 2018 International Scientific and Technical Conference Modern Computer Network Technologies, MoNeTeC 2018 - Proceedings

    Conference

    Conference: 2018 International Scientific and Technical Conference Modern Computer Network Technologies, MoNeTeC 2018
    Country/Territory: Russian Federation
    City: Moscow
    Period: 25/10/18 - 26/10/18

    Keywords

    • containers
    • high performance computing
    • supercomputing
    • virtualization
