Checksumming support in SRM implementations

From PDP/Grid Wiki

Some applications have requested checksumming support for files stored on the storage elements in the Netherlands. The first VO to request this was ATLAS, but other VOs are also very interested. Unfortunately, checksumming support does not seem to be implemented uniformly across the different storage backend systems.

This Wiki page is the result of a short investigation into the checksumming differences between dCache, DPM, StoRM and CASTOR.

To store a file with a checksum in SRM I used the following commands:

lcg-cr -d $SRM -l lfn:/grid/pvier/janjust/myfile file://$PWD/myfile --checksum
lcg-cr -d $SRM -l lfn:/grid/pvier/janjust/myfile file://$PWD/myfile --checksum --checksum-type adler32
lcg-cr -d $SRM -l lfn:/grid/pvier/janjust/myfile file://$PWD/myfile --checksum --checksum-type md5

where $SRM points to the appropriate storage system.
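
To cross-check a checksum reported by a storage element, the adler32 value of the local file can be computed independently. The following is a minimal sketch in portable shell and was not part of the original investigation; the function name adler32_file is made up for illustration, and the byte-by-byte loop is only practical for small test files.

```shell
# Hypothetical helper: print the adler32 checksum of a file as 8 hex digits.
# Assumes a POSIX-style shell with od(1); far too slow for large files.
adler32_file() {
    _a=1
    _b=0
    # od prints every byte of the file as an unsigned decimal number
    for _byte in $(od -An -v -tu1 "$1"); do
        _a=$(( (_a + _byte) % 65521 ))
        _b=$(( (_b + _a) % 65521 ))
    done
    # adler32 is the high sum in the upper 16 bits, the low sum in the lower 16
    printf '%08x\n' $(( (_b << 16) | _a ))
}
```

Running adler32_file myfile on the local copy should print the same value that the storage system reports for an intact adler32-checksummed file.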

Note: so-called GridFTP checksums are checksums that can be calculated on the fly using a special GridFTP command.

dCache

  • srmping: v2.2 dCache production-1.9.3-3
  • seems to support only adler32 checksums (which is what WLCG seems to want); according to Ron, md5 is also supported, but this requires a server reconfiguration
  • checksums
    • are computed when the file is transferred to dCache or during a move between pool nodes.
    • are stored in the dCache namespace
    • can be retrieved using the srmls -l command
  • gridftp checksums are not supported

DPM

  • srmping: v2.2 DPM 1.7.0-5
  • supports adler32, md5 and crc32 checksums
  • supports only gridftp checksums; adler32 and crc32 are supported using a special DPM-DSI GridFTP plugin
  • checksums
    • can be computed using the lcg-get-checksum command
    • are stored in the DPM namespace but can never be retrieved from there ( !! )
    • are not displayed when using the srmls -l command
    • are recalculated every time

StoRM

  • srmping: v2.2 StoRM <FE:1.4.0-01.sl4><BE:1.4.0-00>
  • supports only md5 gridftp checksums
  • checksums
    • can be computed using the lcg-get-checksum command; however, the command fails for large files
    • are not displayed when using the srmls -l command
    • are recalculated every time

CASTOR

  • srmping: v2.2 CASTOR v2_7_15 2.1.7
  • supports only adler32 checksums
  • checksums
    • are computed in the background
    • cannot be specified with the --checksum option
    • can be retrieved using the srmls -l command
    • are stored in the CASTOR namespace
  • gridftp checksums are not supported

Command output

dCache

lcg-cr output:

$ lcg-cr -l /grid/pvier/janjust/my-dcache-file-checksum-adler32 -d $SRM file:/user/janjust/myfile --checksum 
guid:d156757d-ecca-472e-adb9-64ebefde23c4

srmls output:

$ srmls -l srm://srm.grid.sara.nl/pnfs/grid.sara.nl/data/pvier/generated/2009-08-22/filec00ed728-a4d9-43f2-86d8-e5d334c15648
 8508 /pnfs/grid.sara.nl/data/pvier/generated/2009-08-22/filec00ed728-a4d9-43f2-86d8-e5d334c15648
 space token(s) :15253459
 storage type:PERMANENT
 retention policy:CUSTODIAL
 access latency:NEARLINE
 locality:NEARLINE
 - Checksum value:  cd5d9820
 - Checksum type:  adler32
 UserPermission: uid=18010 PermissionsRW
 GroupPermission: gid=1276 PermissionsR
 WorldPermission: R
 created at:2009/08/22 00:02:18
 modified at:2009/08/22 00:02:18
 - Assigned lifetime (in seconds):  -1
 - Lifetime left (in seconds):  -1
 - Original SURL:  /pnfs/grid.sara.nl/data/pvier/generated/2009-08-22/filec00ed728-a4d9-43f2-86d8-e5d334c15648
 - Status:  null
 - Type:  FILE
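
The checksum lines in an srmls -l listing like the one above can be pulled out with a small filter. This is a sketch only: the function name srmls_checksum is hypothetical, and it assumes the "Checksum value:" and "Checksum type:" line format shown on this page. It also strips a leading 0x, since the CASTOR listing on this page prints the value as 0xcd5d9820 while dCache prints cd5d9820.

```shell
# Hypothetical helper: read `srmls -l` output on stdin and print
# "<type>:<value>", e.g. "adler32:cd5d9820". Prints nothing if the
# listing contains no checksum.
srmls_checksum() {
    awk -F': *' '
        /Checksum value:/ {
            v = $2
            gsub(/[ \t\r]+$/, "", v)   # drop trailing whitespace
            sub(/^0x/, "", v)          # CASTOR prefixes the value with 0x
            value = tolower(v)
        }
        /Checksum type:/ {
            t = $2
            gsub(/[ \t\r]+$/, "", t)
            type = t
        }
        END { if (value != "") print type ":" value }'
}
```

For example, srmls -l $SURL | srmls_checksum (with $SURL standing for the file's SURL) gives a single line that can be compared against a locally computed checksum.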

DPM

lcg-cr output:

$ lcg-cr -l /grid/pvier/janjust/my-dpm-file-checksum-adler32 -d $SRM file:/user/janjust/myfile --checksum 
guid:2352aac8-37b9-4cfd-ba91-5eaee71fd5f1

srmls output:

$ srmls -l $SRM
 8508 /dpm/science.uu.nl/home/pvier/generated/2009-08-21/filefd107414-085b-4ff0-abf2-f6fdac110bc4
space token(s) :none found
 storage type:PERMANENT
retentionpolicyinfo : null
 locality:ONLINE
  UserPermission: uid=/O=dutchgrid/O=users/O=nikhef/CN=Jan Just Keijser PermissionsRW
  GroupPermission: gid=pvier PermissionsRW
 WorldPermission: R
created at:2009/08/21 23:56:27
modified at:2009/08/21 23:56:27
 - Lifetime left (in seconds):  -1
 - Original SURL:  /dpm/science.uu.nl/home/pvier/generated/2009-08-21/filefd107414-085b-4ff0-abf2-f6fdac110bc4
- Status:  null
- Type:  FILE

StoRM

lcg-cr output:

$ lcg-cr -l /grid/pvier/janjust/my-storm-file-checksum-adler32 -d $SRM file:/user/janjust/myfile --checksum 
[LCG-UTIL][lcg_cr4][] Destination may be corrupted: 
Source checksum (cd5d9820) != Destination checksum (d41d8cd98f00b204e9800998ecf8427e )
[LCG-UTIL][lcg_cp4][] srm://srm.grid.rug.nl/pvier/generated/2009-08-21/filecc31824f-0a4a-4bd5-8a5e-7fc00cbe4aee has 
been DELETED,  please try again (temporary network problem? )
lcg_cr: Resource temporarily unavailable

The same happens with a different file:

$ lcg-cr -l /grid/pvier/janjust/my-storm-file-checksum-adler32 -d $SRM file:/user/janjust/wms_soap_msgs.txt --checksum 
[LCG-UTIL][lcg_cr4][] Destination may be corrupted:
Source checksum (8d9e8485) != Destination checksum (d41d8cd98f00b204e9800998ecf8427e)
[LCG-UTIL][lcg_cp4][] srm://srm.grid.rug.nl/pvier/generated/2009-08-22/fileddf9c6c0-f01a-46d5-b1b2-1f48fc498c87 has 
been DELETED, please try again (temporary network problem?)
lcg_cr: Resource temporarily unavailable

Interesting: the destination checksum is the same in both attempts, even though the second one uses a completely different file?!
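
One way to see where the repeated destination value comes from is to hash empty input locally. A minimal sketch, assuming md5sum from GNU coreutils is available (the function name is made up for illustration):

```shell
# Assumes md5sum from GNU coreutils is installed.
# Hypothetical helper: print the MD5 digest of zero bytes of input.
md5_of_empty_input() {
    printf '' | md5sum | cut -d' ' -f1
}

md5_of_empty_input   # prints d41d8cd98f00b204e9800998ecf8427e
```

The value reported as the destination checksum is therefore the MD5 digest of an empty input, which would explain why it does not depend on the file. Note also that the source checksum has 8 hex digits (32 bits, consistent with adler32) while the destination checksum has 32 hex digits (128 bits, consistent with md5), so the two values could never be equal. Why StoRM returned the digest of empty input was not investigated here.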

srmls output: not shown, as the file was not copied successfully.

CASTOR

lcg-cr output:

$ lcg-cr -d $SRM/myfile2 -l lfn:/grid/pvier/janjust/my-castor-file2 file://$PWD/myfile --checksum
[LCG-UTIL][lcg_get_checksum_surls][]
srm://srm-dteam.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/test/dteam/disk0tape0/janjust/myfile2:
Communication error on send 
guid:64344a07-f3f3-41e5-9079-c86bf61a874e
lcg_cr: Communication error on send 

Afterwards the file is stored in SRM but not in the LFC.

srmls output (after 10 minutes; the checksum calculation happens in the background):

$ srmls -l $SRM/myfile
 8508 /castor/ads.rl.ac.uk/test/dteam/disk0tape0/janjust/myfile
 space token(s) :none found
 type: null
 retentionpolicyinfo : null
  locality:ONLINE_AND_NEARLINE
  - Checksum value:  0xcd5d9820
  - Checksum type:  adler32 
   UserPermission: uid=dteam001 PermissionsRWX
   GroupPermission: gid=dteam PermissionsRWX
  WorldPermission: RW
 created at:2009/08/20 17:11:06
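
Because CASTOR computes the checksum in the background, a script that needs the value has to wait for it to appear. The following is a rough sketch of such a polling loop, not something used in the original investigation; the function name wait_for_checksum and its argument order are hypothetical, and it assumes the "Checksum value:" line format shown above.

```shell
# Hypothetical helper: run a listing command repeatedly until its output
# contains a "Checksum value:" line, then print that value.
# Usage: wait_for_checksum <tries> <delay-seconds> <command> [args...]
# Returns non-zero if no checksum shows up within <tries> attempts.
wait_for_checksum() {
    _tries=$1
    _delay=$2
    shift 2
    _i=0
    while [ "$_i" -lt "$_tries" ]; do
        # Take the text after "Checksum value:" from the first matching line
        _sum=$("$@" 2>/dev/null | awk -F': *' '/Checksum value:/ { print $2; exit }')
        if [ -n "$_sum" ]; then
            echo "$_sum"
            return 0
        fi
        _i=$((_i + 1))
        sleep "$_delay"
    done
    return 1
}
```

A call such as wait_for_checksum 20 60 srmls -l $SRM/myfile would retry once a minute for up to 20 attempts; the numbers are arbitrary and should be adapted to how long the background calculation takes at the site.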