Processing a workflow (or a job) created by a user, who may be a researcher from a scientific laboratory or an analyst from a commercial organization, is the main functionality that a data center or a high-performance computing center is generally expected to provide. A sufficiently small problem can be handled with a single-core processor and a rather small amount of memory, whereas a complicated problem may require thousands of nodes to solve and petabytes of storage for its output. In addition, users generally require specific applications on various platforms in order to solve their problems appropriately. In this respect, a data center has to operate heterogeneous resource management systems, so-called batch systems, which results in inefficient resource utilization because of the stochastic behavior of user activity. Implementing virtualization for resource management, e.g. Cloud Computing, is one of the promising solutions that have recently emerged; however, it increases the complexity of both the system itself and its administration, because it inherently places a virtualization stack, e.g. a hypervisor, between the operating system and the applications. In this paper, we propose a new conceptual design, to be implemented as a pre-scheduler, that is capable of inserting user-submitted jobs dedicated to a specific batch system into available resources managed by other kinds of batch systems. The proposed design features transparency between clients and batch systems, accuracy in monitoring and predicting the available resources, and scalability to additional batch systems. We suggest an example implementation of the conceptual design based on a scenario derived from our experience of operating a data center.
Keywords
Job processing; Resource utilization; Batch system; Data center