Dans Data Verify

From BiGGrid Wiki
Revision as of 16:38, 15 November 2012 by Janjust@nikhef.nl (talk | contribs)

In this step of the DANS workflow the .tar.gz files are checked: the md5sum checksums of all files contained in each .tar.gz file must match the checksums of the same files as found on the DANS data server. The verification runs on the grid itself, so a set of jobs has to be submitted to the grid. The 'check-tar' script does this automatically.

$ ./check-tar
Found 84 tar.gz files in lfc.grid.sara.nl:/grid/dans/soundbites
Splitting into 11 jobs, start=1, end=84
Delegating proxy
Submitting DANS job 156: https://wms1.grid.sara.nl:9000/R5PWohEeStwR4u-3I-ojNA
Submitting DANS job 157  https://wms1.grid.sara.nl:9000/qcHwM00A7wmSWCUfcGhcnQ
Submitting DANS job 158  https://wms1.grid.sara.nl:9000/lxzCOTFcFsXlqXyC3IwYcQ
Submitting DANS job 159  https://wms1.grid.sara.nl:9000/k30pquu-p5mDnf2nW-UW5A
Submitting DANS job 160  https://wms1.grid.sara.nl:9000/ZmdEOuBMesm_3e-L9jBSVg
Submitting DANS job 161  https://wms1.grid.sara.nl:9000/9yNPzoZYU6ykx-8YcOb9PA
Submitting DANS job 162  https://wms1.grid.sara.nl:9000/V1DWV2qbmRe-CiRB5g06YA
Submitting DANS job 163  https://wms1.grid.sara.nl:9000/0GJ7YyfwHC1bTPzabG2LSw
Submitting DANS job 164  https://wms1.grid.sara.nl:9000/vKIqcsPzIt3ugtJTB8qa6Q
Submitting DANS job 165  https://wms1.grid.sara.nl:9000/QQmeZGZr3eOeagGvLfjh7g
Submitting DANS job 166  https://wms1.grid.sara.nl:9000/BSx6PWo3xIqsccVqDb3cPA
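
The check that each grid job performs on a tarball can be pictured roughly as follows. This is only a minimal Python sketch, not the code of the 'check-tar' script: the function names md5_of_members and find_mismatches are hypothetical, and the reference checksums from the DANS data server are assumed to be available as a dictionary mapping file names to md5 hex digests.

```python
import hashlib
import tarfile


def md5_of_members(tar_path):
    """Return {member name: md5 hex digest} for every regular file in a .tar.gz."""
    digests = {}
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links and other non-file entries
            md5 = hashlib.md5()
            fileobj = tar.extractfile(member)
            # Read in 1 MiB chunks so large files do not have to fit in memory.
            for chunk in iter(lambda: fileobj.read(1024 * 1024), b""):
                md5.update(chunk)
            digests[member.name] = md5.hexdigest()
    return digests


def find_mismatches(actual, expected):
    """Return the sorted names whose checksum differs or is missing on either side.

    'expected' stands in for the checksums found on the data server
    (hypothetical input format: {name: md5 hex digest}).
    """
    names = set(actual) | set(expected)
    return sorted(name for name in names if actual.get(name) != expected.get(name))
```

An empty list from find_mismatches would mean that every file in the tarball matches its reference checksum and that no file is missing on either side.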

The 'soundbites' archive consists of 84 .tar.gz files that need to be checked. Each grid job verifies up to 8 tarballs, so 11 jobs were submitted in total (84 / 8, rounded up). Once the jobs have been submitted to the grid, you can track their status using the 'job-status' script:

$ ./job-status
00156: https://wms1.grid.sara.nl:9000/R5PWohEeStwR4u-3I-ojNA         Status=Running
00157  https://wms1.grid.sara.nl:9000/qcHwM00A7wmSWCUfcGhcnQ         Status=Running
00158  https://wms1.grid.sara.nl:9000/lxzCOTFcFsXlqXyC3IwYcQ         Status=Running
00159  https://wms1.grid.sara.nl:9000/k30pquu-p5mDnf2nW-UW5A         Status=Running
00160  https://wms1.grid.sara.nl:9000/ZmdEOuBMesm_3e-L9jBSVg         Status=Running
00161  https://wms1.grid.sara.nl:9000/9yNPzoZYU6ykx-8YcOb9PA         Status=Running
00162  https://wms1.grid.sara.nl:9000/V1DWV2qbmRe-CiRB5g06YA         Status=Running
00163  https://wms1.grid.sara.nl:9000/0GJ7YyfwHC1bTPzabG2LSw         Status=Running
00164  https://wms1.grid.sara.nl:9000/vKIqcsPzIt3ugtJTB8qa6Q         Status=Running
00165  https://wms1.grid.sara.nl:9000/QQmeZGZr3eOeagGvLfjh7g         Status=Running
00166  https://wms1.grid.sara.nl:9000/BSx6PWo3xIqsccVqDb3cPA         Status=Running

The grid job IDs, starting with https://, look like URLs, and that is exactly what they are. The user who submitted a job can view its status in a web browser, provided that the user's grid certificate is installed in that browser.

See the section DANS Job Scripts for more details on both the 'check-tar' and the 'job-status' scripts.
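
As an illustration of the job split mentioned earlier (84 tarballs, 8 tarballs per job, 11 jobs), the arithmetic can be sketched in Python as below. The function name split_into_jobs is hypothetical, and how 'check-tar' actually distributes the tarballs over the jobs is an assumption here; the sketch simply fills each job with 8 tarballs and gives the remainder to the last one.

```python
def split_into_jobs(total, per_job):
    """Return 1-based inclusive (start, end) index ranges, one per job."""
    return [
        (start, min(start + per_job - 1, total))
        for start in range(1, total + 1, per_job)
    ]


ranges = split_into_jobs(84, 8)
print(len(ranges))   # 11
print(ranges[0])     # (1, 8)
print(ranges[-1])    # (81, 84)
```

Under this assumption the first ten jobs each cover 8 tarballs and the eleventh covers the remaining 4.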

Job output

When a grid job has finished, the 'job-status' script automatically retrieves its output:

00166  https://wms1.grid.sara.nl:9000/BSx6PWo3xIqsccVqDb3cPA         Status=Done (Exit code=0)
       Retrieving job output into $HOME/dans/gridjobs/00166/output

The status message 'Done (Exit code=0)' means that the job ran to completion and returned exit code 0, which indicates success.

Wait for all jobs to complete successfully before continuing to the next step.
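
If you save the text printed by 'job-status', the "all jobs completed successfully" condition can be checked mechanically. The following is a minimal Python sketch that assumes the line layout shown in the examples above (job number, job URL, then Status=...); the names parse_status and all_succeeded are hypothetical and not part of the DANS scripts.

```python
import re

# Job number (optionally followed by a colon), job URL, then the status text.
STATUS_LINE = re.compile(r"^(\d+):?\s+(https://\S+)\s+Status=(.+?)\s*$")


def parse_status(output):
    """Return {job number: status text} for every status line in the output."""
    jobs = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:  # other lines, such as "Retrieving job output ...", are ignored
            jobs[match.group(1)] = match.group(3)
    return jobs


def all_succeeded(jobs):
    """True only if there is at least one job and every job reports exit code 0."""
    return bool(jobs) and all(s == "Done (Exit code=0)" for s in jobs.values())
```

For example, passing the saved output to parse_status and the result to all_succeeded returns True only when every listed job shows 'Done (Exit code=0)'.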

Comparing the checksums