Job matching on graszode vs graspol
From PDP/Grid Wiki
Issue
Running glite-wms-job-list-match against graszode and graspol gives different results. One WMS returns the list of matching CEs/queues, while the other returns an empty list. The JDL is as follows:
Type = "job";
JobType = "normal";
Executable = "/bin/sh";
Arguments = "mach2dat.sh ergo test.dat test.ped ergo.chr1_1.mlinfo ergo.chr1_1.dose.gz";
StdOutput = "lfn.out";
StdError = "lfn.err";
InputSandbox = {"mach2dat.sh", "test.dat", "test.ped"};
OutputSandbox = {"lfn.out", "lfn.err"};
DataCatalog = "http://lfc.grid.sara.nl:8085";
InputData = {
    "lfn:/grid/lsgrid/aabuseiris/grimp/bin/mach2dat",
    "lfn:/grid/lsgrid/aabuseiris/grimp/datasets/ergo/ergo.chr1_1.mlinfo",
    "lfn:/grid/lsgrid/aabuseiris/grimp/datasets/ergo/ergo.chr1_1.dose.gz"
};
DataAccessProtocol = {"rfio","gsiftp","gsidcap","https"};
These are the results:
$ glite-wms-job-list-match -d fbernabe -e https://graszode.nikhef.nl:7443/glite_wms_wmproxy_server ego.jdl
Connecting to the service https://graszode.nikhef.nl:7443/glite_wms_wmproxy_server
==========================================================================
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have been found:
*CEId*
- gb-ce-emc.erasmusmc.nl:2119/jobmanager-pbs-express
- gb-ce-emc.erasmusmc.nl:2119/jobmanager-pbs-medium
- gb-ce-rug.sara.usor.nl:8443/cream-pbs-express
- gb-ce-rug.sara.usor.nl:8443/cream-pbs-medium
- gb-ce-ams.els.sara.nl:2119/jobmanager-pbs-express
- gb-ce-ams.els.sara.nl:2119/jobmanager-pbs-medium
==========================================================================

$ glite-wms-job-list-match -d fbernabe -e https://graspol.nikhef.nl:7443/glite_wms_wmproxy_server ego.jdl
Connecting to the service https://graspol.nikhef.nl:7443/glite_wms_wmproxy_server
==================== glite-wms-job-list-match failure ====================
No Computing Element matching your job requirements has been found!
==========================================================================
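To make the comparison repeatable, the two runs above can be reduced to a count of matched CEs per WMS. The following is only a sketch: the helper names count_ces and match_count are hypothetical, and it assumes that every matching CE is printed on its own line starting with "- ", as in the output shown above.

```shell
#!/bin/sh
# Count the CE IDs in glite-wms-job-list-match output read from stdin.
# Assumption: each match is printed on its own line as "- <host>:<port>/<queue>".
count_ces() {
    # grep -c prints 0 and exits non-zero when nothing matches; mask that status.
    grep -c '^[[:space:]]*-[[:space:]]' || true
}

# Hypothetical helper: run the match against one WMS host and print "<host> <count>".
# $1: WMS host, $2: JDL file, $3: delegation ID (the value passed to -d above).
match_count() {
    host=$1; jdl=$2; deleg=$3
    endpoint="https://$host:7443/glite_wms_wmproxy_server"
    n=$(glite-wms-job-list-match -d "$deleg" -e "$endpoint" "$jdl" 2>&1 | count_ces)
    printf '%s %s\n' "$host" "$n"
}

# Usage (needs a valid proxy and an existing delegation):
#   match_count graszode.nikhef.nl ego.jdl fbernabe
#   match_count graspol.nikhef.nl  ego.jdl fbernabe
```

A count of 0 on one WMS and a non-zero count on the other, for the same JDL, reproduces the symptom described here.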
Cause
The cause has not been established. Savannah bug https://savannah.cern.ch/bugs/index.php?57421 may describe a related problem and give more information.
Workaround
Restarting glite-wms-wm on the affected WMS solves the issue.
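Because a restart discards the state that showed the problem, it may be worth keeping a copy of the ISM dump file first (see the Notes section). The sketch below is an illustration only: save_ism_dump is a hypothetical helper, and both the dump file path and the init script path in the usage comment are assumptions that depend on the local WMS installation.

```shell
#!/bin/sh
# Copy the ISM dump to a timestamped file so it can be inspected after the restart.
# $1: path to the ISM dump file (site-specific), $2: directory for the copy.
# Prints the path of the copy on success.
save_ism_dump() {
    src=$1; destdir=$2
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    mkdir -p "$destdir" || return 1
    dest="$destdir/$(basename "$src").$(date +%Y%m%dT%H%M%S)"
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage on the affected WMS, as root (both paths are assumptions, check your install):
#   save_ism_dump /var/glite/workload_manager/ismdump.fl /root/ism-dumps
#   /etc/init.d/glite-wms-wm restart
```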
Resolution
The WMS still has to be monitored until we are sure that the issue does not recur.
Notes
Two notes were posted on the bug report:
Marco Cecchi: Hi, next time it happens please check the ism dump file. Before restarting, just wait for another purchase cycle.
Stephen Burke: I'm not sure it's exactly the same, in this case it would be a partial failure because apparently the match worked without the data requirement, so either the problem is the dynamic lookup from the LFC or it's a loss of the close SE info only.
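To check whether the match works without the data requirement, as the second note suggests, one option is to match a copy of the JDL with the data-related attributes removed. This is a rough sketch, not a JDL parser: strip_data_requirements is a hypothetical helper, it assumes that InputData, DataCatalog and DataAccessProtocol are the attributes that make up the data requirement in this job, and it assumes that no quoted string in the JDL contains a ";".

```shell
#!/bin/sh
# Print the JDL file $1 with the data-related attributes removed.
# Assumptions: every attribute ends with ";" and no quoted string contains ";".
strip_data_requirements() {
    awk 'BEGIN { RS = ";"; ORS = ";" }
         # Attribute names are compared in lower case; skip the data-related ones.
         tolower($0) ~ /^[ \t\r\n]*(inputdata|datacatalog|dataaccessprotocol)[ \t\r\n]*=/ { next }
         # Print every other non-empty attribute, re-adding the ";" terminator.
         NF { print }
         END { printf "\n" }' "$1"
}

# Usage (ego-nodata.jdl is a hypothetical file name):
#   strip_data_requirements ego.jdl > ego-nodata.jdl
#   glite-wms-job-list-match -d fbernabe -e https://graspol.nikhef.nl:7443/glite_wms_wmproxy_server ego-nodata.jdl
```

If the stripped JDL matches on the affected WMS while the original does not, that points towards the LFC lookup or the close SE information rather than a complete loss of the CE information.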