Difference between revisions of "Enabling multicore jobs and jobs requesting large amounts of memory"

From PDP/Grid Wiki

Revision as of 16:00, 6 August 2012

This article describes

Introduction

Certain applications benefit from access to more than one core (logical CPU) on the same physical computer. Grid jobs that use more than one core are referred to as multicore jobs.

Other applications require a specific amount of memory to run efficiently or successfully. Such jobs are called large-memory jobs in this article (because they often require a higher-than-default amount of memory on the machine).

The Cream Computing Elements (CreamCEs) offer support for multicore jobs and large-memory jobs, although they need some additional configuration to forward the job requirements to the batch system.

This article describes the support for multicore jobs and large-memory jobs at the Cream Computing Elements at Nikhef. Section "System Configuration" describes the setup of the system. Section "Submitting multicore or (large) memory jobs" presents the information relevant to users of the Computing Elements.

The information presented here is valid for the UMD-1 version of the CreamCE in combination with a batch system based on Torque 2.3. Other versions of the CreamCE (in particular the nearly unsupported gLite 3.2 version) may require a different configuration. Other versions of the Torque batch system may work, but this has not been verified. Other batch systems fall outside the scope of this article.


System Configuration

Two services are involved in the submission of grid jobs requiring multiple cores or specific amounts of memory: the CreamCE and the Torque batch system. The CreamCE is the entry point for a grid job at a site. It processes the resource requests and translates them into a format that is specific to the batch system implementation. The batch system can then allocate the requested resources.

Setup at the CreamCE

To support multicore jobs or memory requests, the CreamCE must recognize these requests and translate them into directives for the Torque batch system. In the implementation discussed here, the CreamCE needs 3 files:

  • A script to process specific resource requests Media:pbs_local_submit_attributes.sh. This file should be installed as /usr/bin/pbs_local_submit_attributes.sh on the CreamCE. It processes memory resource requests.
  • A configuration file to activate the submit filter (the third file in this list) Media:torque.cfg. This file should be stored as /var/spool/pbs/torque.cfg.
  • A Torque submit filter to write the input for the batch system Media:torque_submit_filter.sh. This file should be installed as /usr/local/sbin/torque_submit_filter.sh (actually: the location specified in file torque.cfg above as value for SUBMITFILTER). The submit filter is discussed in more detail below.
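
As an illustration of the first item, a script of this kind could turn a memory request into a Torque directive roughly as follows. This is a minimal sketch and not the attached pbs_local_submit_attributes.sh: it assumes that the CE exports the requested minimum memory in MB in an environment variable, and the variable name GlueHostMainMemoryRAMSize_Min and the function name emit_memory_directive are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch, not the script attached to this article.
# Assumption: the requested minimum memory (in MB) is available in the
# environment variable GlueHostMainMemoryRAMSize_Min.

emit_memory_directive() {
    # Emit nothing when the variable is unset, empty, or not a plain integer.
    case "${GlueHostMainMemoryRAMSize_Min:-}" in
        ''|*[!0-9]*) return 0 ;;
    esac
    # Print a Torque resource directive for the requested memory.
    echo "#PBS -l mem=${GlueHostMainMemoryRAMSize_Min}mb"
}

emit_memory_directive
```

When no memory request is present, the sketch prints nothing, so jobs without such a request are submitted unchanged.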

The submit filter

The submit filter processes the initial input for the batch system. It inspects the requested resources and may change them before writing the (possibly modified) input to standard output (which is forwarded to the batch server).
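
The behaviour described above can be sketched as a small shell filter. This is not the attached torque_submit_filter.sh but a hedged illustration: it reads a job script from standard input, caps any memory request above a limit, and writes the result to standard output. The limit MAX_MEM_MB and the function name filter_job are hypothetical, and only directives of the exact form "#PBS -l mem=<number>mb" are recognized.

```shell
#!/bin/sh
# Hypothetical sketch of a submit filter, for illustration only.
# torque.cfg would point at the installed filter with a line such as:
#   SUBMITFILTER /usr/local/sbin/torque_submit_filter.sh

MAX_MEM_MB=16384   # assumed site limit, not taken from this article

filter_job() {
    # Copy the job script line by line, rewriting oversized memory requests.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "#PBS -l mem="*mb)
                req=${line#"#PBS -l mem="}
                req=${req%mb}
                case "$req" in
                    ''|*[!0-9]*) ;;   # not a plain number: leave the line unchanged
                    *)
                        if [ "$req" -gt "$MAX_MEM_MB" ]; then
                            line="#PBS -l mem=${MAX_MEM_MB}mb"
                        fi
                        ;;
                esac
                ;;
        esac
        printf '%s\n' "$line"
    done
}

# Example run on an inline job script; a deployed filter would instead
# call filter_job directly so that it reads the job from standard input.
filter_job <<'EOF'
#!/bin/sh
#PBS -l mem=32768mb
echo hello
EOF
```

In this example the memory request is rewritten to 16384mb and the other lines pass through unchanged. A real filter could also reject a job by exiting with a non-zero status instead of modifying it.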


Setup at the batch system

Submitting multicore or (large) memory jobs