Dans Data Verify

From BiGGrid Wiki

Revision as of 16:51, 15 November 2012

In this step of the DANS workflow the .tar.gz files are checked, to verify that the MD5 checksums of all files contained in the .tar.gz files match the checksums of the files as found on the DANS data server. This verification is done on the grid itself, so a set of jobs has to be submitted to the grid. The 'check-tar' script does this automatically.

$ ./check-tar
Found 84 tar.gz files in lfc.grid.sara.nl:/grid/dans/soundbites
Splitting into 11 jobs, start=1, end=84
Delegating proxy
00167  Submitting DANS job 67: https://graskant.nikhef.nl:9000/PJFSZ_piRH8WBnVUXUyVAQ
00168  Submitting DANS job 68: https://grasveld.nikhef.nl:9000/rm19k1kWjtXGaGEjcl6AYA
00169  Submitting DANS job 69: https://wms2.grid.sara.nl:9000/dN_QM8ZxIS3JCqa63QmILw
00170  Submitting DANS job 70: https://wms2.grid.sara.nl:9000/n_qLv1K_CNC26JQdarqd-A
00171  Submitting DANS job 71: https://wms2.grid.sara.nl:9000/bS-C2WTuj1nFw7xpSt2hXw
00172  Submitting DANS job 72: https://wms2.grid.sara.nl:9000/Q0cBRixMFsrFgpLyZtK_4g
00173  Submitting DANS job 73: https://wms2.grid.sara.nl:9000/ooUaSiBLargL74brqKeS7w
00174  Submitting DANS job 74: https://wms2.grid.sara.nl:9000/ux8PPNk71_Utf53a8x-_Yg
00175  Submitting DANS job 75: https://wms2.grid.sara.nl:9000/zqBI1QXfOMBKEP7sWDD-aA
00176  Submitting DANS job 76: https://wms2.grid.sara.nl:9000/OtU_naI5EYBilqML81krsA
00177  Submitting DANS job 77: https://wms2.grid.sara.nl:9000/fl05LVqrQS9JzkXsn73R4Q
00178  Submitting DANS job 78: https://wms2.grid.sara.nl:9000/XVlE9d-ufjWwiZc2EyFsDA
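
The split reported by 'check-tar' is plain arithmetic: 84 tarballs in chunks of at most 8 give ten full jobs plus one job for the last 4. The sketch below reproduces that calculation. The function name 'split_ranges' and its two arguments are illustrative assumptions and are not taken from the actual 'check-tar' script.

```shell
#!/bin/sh
# Print one "start end" pair per job when TOTAL tarballs are split into
# chunks of at most PER_JOB. Hypothetical helper, not part of check-tar.
split_ranges() {
    total=$1
    per_job=$2
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + per_job - 1))
        # The last job may cover fewer tarballs than the others.
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "$start $end"
        start=$((end + 1))
    done
}

# 84 tarballs, at most 8 per job
split_ranges 84 8
```

For 84 tarballs and 8 per job this prints 11 ranges, the last one being "81 84".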

The 'soundbites' archive consists of 84 .tar.gz files which need to be checked. Each grid job verifies at most 8 tarballs, hence a total of 11 jobs is needed: ten jobs of 8 tarballs and one final job for the remaining 4. After the jobs have been submitted to the grid you can track their status using the 'job-status' script:

$ ./job-status
00167 https://graskant.nikhef.nl:9000/PJFSZ_piRH8WBnVUXUyVAQ         Status=Running
00168 https://grasveld.nikhef.nl:9000/rm19k1kWjtXGaGEjcl6AYA         Status=Running
00169 https://wms2.grid.sara.nl:9000/dN_QM8ZxIS3JCqa63QmILw          Status=Running
00170 https://wms2.grid.sara.nl:9000/n_qLv1K_CNC26JQdarqd-A          Status=Running
00171 https://wms2.grid.sara.nl:9000/bS-C2WTuj1nFw7xpSt2hXw          Status=Running
00172 https://wms2.grid.sara.nl:9000/Q0cBRixMFsrFgpLyZtK_4g          Status=Running
00173 https://wms2.grid.sara.nl:9000/ooUaSiBLargL74brqKeS7w          Status=Running
00174 https://wms2.grid.sara.nl:9000/ux8PPNk71_Utf53a8x-_Yg          Status=Running
00175 https://wms2.grid.sara.nl:9000/zqBI1QXfOMBKEP7sWDD-aA          Status=Running
00176 https://wms2.grid.sara.nl:9000/OtU_naI5EYBilqML81krsA          Status=Running
00177 https://wms2.grid.sara.nl:9000/fl05LVqrQS9JzkXsn73R4Q          Status=Running
00178 https://wms2.grid.sara.nl:9000/XVlE9d-ufjWwiZc2EyFsDA          Status=Running

The grid job IDs, starting with https://, look like URLs and that's exactly what they are. The user who submitted a job can view its status in a web browser, provided that the user's grid certificate is installed in that browser.

See the section DANS Job Scripts for more details on both the 'check-tar' and the 'job-status' scripts.

Job output

When a grid job has finished, the 'job-status' script automatically retrieves the output:

00172  https://wms2.grid.sara.nl:9000/Q0cBRixMFsrFgpLyZtK_4g         Status=Done (Exit code=0)
       Retrieving job output into $HOME/dans/gridjobs/00172/output

The status message 'Done (Exit code=0)' means that the job ran to completion and returned exit code 0, which indicates success.

Wait for all jobs to complete successfully before continuing to the next step.
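
Checking by eye that every job has finished gets tedious with many jobs. The sketch below reads 'job-status' style output on standard input and succeeds only when every line with a Status= field reports 'Done (Exit code=0)'. The helper name 'all_jobs_done' is an assumption made for illustration, and the line format is taken from the sample output shown on this page rather than from the script itself.

```shell
#!/bin/sh
# Read job-status style output on stdin. Print a summary and return 0 only
# if every line carrying a Status= field says "Done (Exit code=0)".
# Hypothetical helper; it relies on the output format shown on this page.
all_jobs_done() {
    awk '
        /Status=/ {
            total++
            if ($0 ~ /Status=Done \(Exit code=0\)/) done_ok++
        }
        END {
            printf "%d of %d jobs done\n", done_ok, total
            if (total > 0 && done_ok == total) exit 0
            exit 1
        }
    '
}

# Possible usage (job-status also retrieves the output of finished jobs):
#   ./job-status | all_jobs_done && ./compare-checksums
```

Lines without a Status= field, such as the 'Retrieving job output' message, are ignored.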

Comparing the checksums

After all jobs submitted by the 'check-tar' script have successfully completed you can compare the checksums of the files found on the grid against the checksums of the local files. The local checksums were calculated before the files were uploaded to the grid, as part of the Data Upload procedure.

$ ./compare-checksums 
Scanning for DANS 'soundbites' jobs:
00167: check-archive.sh "soundbites    1    8" Comparing md5sums: Equal
00169: check-archive.sh "soundbites    9   16" Comparing md5sums: Equal
00170: check-archive.sh "soundbites   17   24" Comparing md5sums: Equal
00171: check-archive.sh "soundbites   25   32" Comparing md5sums: Equal
00172: check-archive.sh "soundbites   33   40" Comparing md5sums: Equal
00173: check-archive.sh "soundbites   41   48" Comparing md5sums: Equal
00174: check-archive.sh "soundbites   49   56" Comparing md5sums: Equal
00175: check-archive.sh "soundbites   57   64" Comparing md5sums: Equal
00176: check-archive.sh "soundbites   65   72" Comparing md5sums: Equal
00177: check-archive.sh "soundbites   73   80" Comparing md5sums: Equal
00178: check-archive.sh "soundbites   81   84" Comparing md5sums: Equal

This output shows that the MD5 checksums for all files found in the 'soundbites' archive on the grid are equal to the checksums that were generated when this archive was uploaded for the first time.
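
The comparison itself boils down to checking that two lists of MD5 checksums contain the same entries. The following is a minimal sketch of that idea, assuming both lists are in the usual md5sum format (checksum, two spaces, file name). The function name 'compare_md5_lists' and the file names in the usage line are hypothetical, and this is not the actual implementation of 'compare-checksums'.

```shell
#!/bin/sh
# Compare two md5sum-style lists ("<checksum>  <file>" per line), ignoring
# line order. Prints "Equal" or "Different" and returns 0 or 1 accordingly.
# Hypothetical helper, not the real compare-checksums script.
compare_md5_lists() {
    local_list=$1
    grid_list=$2
    tmp_local=$(mktemp) || return 2
    tmp_grid=$(mktemp) || { rm -f "$tmp_local"; return 2; }
    # Sort both lists so that a different line order does not matter.
    sort "$local_list" > "$tmp_local"
    sort "$grid_list" > "$tmp_grid"
    if cmp -s "$tmp_local" "$tmp_grid"; then
        result=0
        echo "Equal"
    else
        result=1
        echo "Different"
    fi
    rm -f "$tmp_local" "$tmp_grid"
    return "$result"
}

# Possible usage, with hypothetical file names:
#   compare_md5_lists local.md5 grid.md5
```

Sorting before comparing means that a missing file, an extra file, or a changed checksum all show up as 'Different'.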