Using the Grid/Large data job

From BiGGrid Wiki
Revision as of 10:39, 28 June 2011 by Machiel.Jansen

The InputSandbox and OutputSandbox attributes in the JDL file are the basic way to move files between the User Interface (UI) and the Worker Node (WN). However, when large files (about 10 MB or larger) are involved, you should not use these sandboxes to move data around. Instead, you should use the Storage Elements and work with the lfc and lcg commands. These commands, and the storage system in general, are explained in the section on data management. Here we give an example of how to use large input and output files that are needed by your job.
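The size rule above can be checked with a few lines of shell before you write your JDL. This is only a sketch: the helper name transfer_method is hypothetical, and "about 10 MB" is assumed here to mean 10 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Threshold from the text: about 10 MB.
# Assumption: interpreted as 10 * 1024 * 1024 bytes.
SANDBOX_LIMIT_BYTES=$((10 * 1024 * 1024))

# transfer_method FILE  (hypothetical helper)
# Prints "sandbox" for small files and "storage-element" for large ones.
transfer_method() {
    if [ ! -f "$1" ]; then
        echo "error: '$1' is not a regular file" >&2
        return 1
    fi
    # wc -c prints the size in bytes; tr strips the padding some systems add
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -ge "$SANDBOX_LIMIT_BYTES" ]; then
        echo "storage-element"
    else
        echo "sandbox"
    fi
}
```

For example, `transfer_method mydata.tar` prints which of the two mechanisms fits the file mydata.tar (a placeholder name).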

Data Requirements

This case describes the DataRequirements attribute in your job description file. This attribute is a list of classads representing the data requirements for the job. Each classad has to contain three attributes:

  • InputData: the list of input files needed by the job.
  • DataCatalogType: the type of data catalog. The Grid middleware needs this in order to resolve logical names to physical names. Fill in "DLI" here.
  • DataCatalog: the address (URL) of the data catalog, if this is not the VO default one.

The presence of the DataRequirements attribute causes the job to run on a Computing Element (CE) which is close to the Storage Element (SE) where the requested file is stored. For further details on how to store a file on an SE, see the section on data management. Note that this attribute does not perform the actual copy of the file from the SE to the WN; as we will see, this has to be done by the user.

This is what you have to do. First, store a file on an SE and register it in the LFC catalog. We do this with the copy-and-register command (lcg-cr).

$ lcg-cr --vo lsgrid -d gb-se-ams.els.sara.nl -l lfn:/grid/lsgrid/mgjansen/test.txt \
 file:/home/mgjansen/local_test.txt 
guid:522350d4-a28a-48aa-939b-d85c9ab5443f

Note that the guid line is the return value of the command. It identifies the file uniquely in the Grid storage. You can save this ID for emergencies. The part which starts with lfn: is the logical file name (LFN) of our uploaded file.
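If you want to keep the GUID, one possible way to capture it is sketched below. The helper name extract_guid and the file name guids.txt are hypothetical, and the sketch assumes that lcg-cr prints the guid: line on standard output, as in the example above.

```shell
#!/bin/sh
# extract_guid (hypothetical helper): read command output on stdin
# and print only the first line that starts with "guid:".
extract_guid() {
    sed -n '/^guid:/{p;q;}'
}

# Possible use on the UI (not executed here):
#   lcg-cr --vo lsgrid -d gb-se-ams.els.sara.nl \
#       -l lfn:/grid/lsgrid/mgjansen/test.txt \
#       file:/home/mgjansen/local_test.txt | extract_guid >> guids.txt
```

Appending to a file rather than overwriting it keeps the GUIDs of earlier uploads as well.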

Second, create a JDL file that describes your job. It will contain the LFN of the file, as shown here.

$ cat inputdata.jdl
[
        Executable = "/bin/sh";
        Arguments = "scriptInputData.sh lfn:/grid/lsgrid/mgjansen/test.txt";

        StdOutput = "std.out";
        StdError = "std.err";

        InputSandbox = "scriptInputData.sh";
        OutputSandbox = {"std.out","std.err"};

        DataRequirements = {
                [
                  InputData = {"lfn:/grid/lsgrid/mgjansen/test.txt"};
                  DataCatalogType = "DLI";
                  DataCatalog = "http://lfc.grid.sara.nl:8085";
                ]
        };
        DataAccessProtocol = {"gsiftp"};

        RetryCount = 3;
]

This JDL names the script scriptInputData.sh as the first value of Arguments. The script is shipped with the job through the InputSandbox, submitted to the WMS, and run on a worker node. The script needs an input file and expects an LFN as its argument. We will use the file that we copied to an SE earlier. In the DataRequirements section, we give the LFN of this file as the value of InputData. Notice that DataCatalogType and DataCatalog are also specified. You can copy these values.
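Because the same LFN appears twice in this JDL (in Arguments and in InputData), it can be convenient to generate the file from a single value. The following is a minimal sketch, not part of the original recipe: the function name make_input_jdl is hypothetical, and the catalog values are simply copied from the listing above.

```shell
#!/bin/sh
# make_input_jdl LFN  (hypothetical helper)
# Prints a JDL like the one above, using the given LFN in both
# Arguments and InputData so the two cannot drift apart.
make_input_jdl() {
    lfn=$1
    cat <<EOF
[
        Executable = "/bin/sh";
        Arguments = "scriptInputData.sh $lfn";

        StdOutput = "std.out";
        StdError = "std.err";

        InputSandbox = "scriptInputData.sh";
        OutputSandbox = {"std.out","std.err"};

        DataRequirements = {
                [
                  InputData = {"$lfn"};
                  DataCatalogType = "DLI";
                  DataCatalog = "http://lfc.grid.sara.nl:8085";
                ]
        };
        DataAccessProtocol = {"gsiftp"};

        RetryCount = 3;
]
EOF
}
```

A call such as `make_input_jdl lfn:/grid/lsgrid/mgjansen/test.txt > inputdata.jdl` would reproduce the file shown above.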

Note that the JDL description in itself is not enough for the script to use the file. The file still needs to be copied to the worker node where the job lands. All that the JDL description achieves is that the job lands close to an SE which contains the needed data. The copying is done by the script itself: to copy the file associated with this LFN from the SE to the WN, the script uses an lcg-cp command. The script scriptInputData.sh is shown below.

The script fetches the file, lists it with the ls command, and prints the content of the file to stdout.

$ cat scriptInputData.sh 
#!/bin/sh

# Set the proper environment
export LFC_HOST=lfc.grid.sara.nl
export LCG_GFAL_INFOSYS=bdii.grid.sara.nl:2170
export LCG_CATALOG_TYPE=lfc

# Download the file from the SE to the WN where this job runs.
# Note that the LFN is passed as the first argument to this script.
# Stop if the copy fails, so that the job does not continue without its input.
lcg-cp --vo lsgrid "$1" "file:$(pwd)/local_file" || exit 1

echo "########################################"
ls -la local_file
echo "########################################"
# Print the content of the file just downloaded
cat local_file

Now the actual submission, status checking, output retrieval and inspection can take place. If you want to try this example, you have to create two files, inputdata.jdl and scriptInputData.sh, and fill them with the content displayed above. Of course, you have to register your own file and change the LFN accordingly, both in Arguments and in InputData within the DataRequirements attribute.
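The submission, status check and output retrieval steps could look like the sketch below. The page does not name the commands for these steps; the sketch assumes the gLite WMS command-line tools (glite-wms-job-submit, glite-wms-job-status, glite-wms-job-output) are installed on your UI and that you have a valid proxy. The wrapper function names, the file name jobid.txt and the directory ./output are hypothetical. By default the script only prints the commands (dry run).

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands.
# Set DRY_RUN=0 on a UI with a valid proxy to really run them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: gLite WMS client tools; -a asks for automatic proxy
# delegation, -o stores the job identifier in jobid.txt.
submit_job()   { run glite-wms-job-submit -a -o jobid.txt "$1"; }

# Read the job identifier back from jobid.txt and query its status.
check_status() { run glite-wms-job-status -i jobid.txt; }

# Retrieve the OutputSandbox (std.out, std.err) into ./output.
fetch_output() { run glite-wms-job-output --dir ./output -i jobid.txt; }
```

A typical session would call `submit_job inputdata.jdl`, then `check_status` repeatedly until the job is done, and finally `fetch_output`, after which std.out can be inspected.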


Moving output data from the job to the SE

What do you do when you have to move data from a running job on the Worker Node to a Storage Element? The answer is: the job has to do it by copying the data in a script. We give an example. Assume that the following script code is executed by a running job.

$ cat  registeringfile-script.sh   
#!/bin/sh
# Author : Emidio Giorgio
# Usage : register a file to the default SE, with a specified LFN 
#  - The file to copy and register is passed as first input argument to the script ($1)
#  - The logical file name it will have is the second input argument to the script ($2)
#  - the LFN will be like this /grid/lsgrid/YOUR_DIRECTORY/$2 

# Set the proper environment
export LFC_HOST=lfc.grid.sara.nl
export LCG_GFAL_INFOSYS=bdii.grid.sara.nl:2170
export LCG_CATALOG_TYPE=lfc

# Actually upload the file to the SE and register it in the LFC.
# The path of the file to be registered is built as {current directory}/{path given as first argument}.
# REPLACE CHANGEME with an (already existing) LFC directory of your choice.
lcg-cr --vo lsgrid -l "lfn:/grid/lsgrid/CHANGEME/$2" "file:$PWD/$1"

This script is in charge of copying the output of your job. The simplest thing is to run it from within the main job script, as shown below:

$ cat scriptWhichDoesSomething.sh
#!/bin/sh

# do whatever 
echo "This is a very dummy test" > fileout.txt

# run the script which registers the file fileout.txt just created above 
/bin/sh registeringfile-script.sh fileout.txt data_from_the_WN

# greetings 
echo "All done correctly (I hope). Bye bye"

This could be a starting point for your JDL:

$ cat  JobWritingToSE.jdl
[
        Executable = "/bin/sh";
        Arguments = "scriptWhichDoesSomething.sh";

        StdOutput = "std.out";
        StdError = "std.err";

        # Also ship the script which registers the file
        InputSandbox = {"scriptWhichDoesSomething.sh","registeringfile-script.sh"};
        OutputSandbox = {"std.out","std.err"};
]

Alternatively, you can just append the content of registeringfile-script.sh to your main script.
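Whichever layout you choose, it may help to check the two arguments before calling lcg-cr, so that a missing output file is reported clearly in std.err. This is a minimal sketch under that assumption; the function name check_upload_args is hypothetical and not part of the original scripts.

```shell
#!/bin/sh
# check_upload_args FILE LFN_NAME  (hypothetical helper)
# Returns 0 if both arguments are present and FILE is a regular file,
# 2 on wrong usage, 1 if the file to upload is missing.
check_upload_args() {
    if [ "$#" -ne 2 ]; then
        echo "usage: registeringfile-script.sh FILE LFN_NAME" >&2
        return 2
    fi
    if [ ! -f "$1" ]; then
        echo "error: '$1' does not exist or is not a regular file" >&2
        return 1
    fi
    return 0
}

# Possible use near the top of registeringfile-script.sh (not executed here):
#   check_upload_args "$@" || exit $?
```

With this check in place, the registering script stops with a non-zero exit code instead of handing a non-existent path to lcg-cr.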