SintMaarten network

From PDP/Grid Wiki
Revision as of 11:25, 24 November 2009 by Janjust@nikhef.nl (talk | contribs)

The SintMaarten cluster, which is based on HP blades, was commissioned in October 2009. Soon after commissioning, a serious performance issue was reported:

 [BG-NLT1-Support] #287: bad gridftp transfer rate - smrt wns

This page documents the analysis of that performance issue.

Problem report

The performance issue was seen when copying a file from the Nikhef storage system to a SintMaarten worker node. Transfer speeds were OK at first but dropped to very low levels after about 120 Mb of data, eventually causing timeouts in the lcg-cp command used. Copying the exact same file from the exact same storage element to a slightly older worker node did not show this problem:

 ===
 wn-smrt-006 (Bad!)
 ===
 # lcg-cp --vo atlas -v srm://.... file://.....
 [snip]
 # streams: 1
    62914560 bytes   1279.98 KB/sec avg    512.00 KB/sec inst

vs

 ===
 wn-val-066 (Good!)
 ===
 # lcg-cp --vo atlas -v srm://.... file://.....
 [snip]
 # streams: 1
    1672478720 bytes  68053.21 KB/sec avg  70142.84 KB/sec inst
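
To put a number on the difference between the two nodes, the progress lines above can be parsed and compared. The following is only a minimal sketch: the names `Progress` and `parse_progress` are hypothetical helpers introduced here, and the regular expression assumes the progress line always has exactly the layout shown in the two transcripts above.

```python
import re
from typing import NamedTuple


class Progress(NamedTuple):
    """One transfer progress line (hypothetical helper type)."""
    bytes_done: int
    avg_kb_s: float
    inst_kb_s: float


# Assumption: the line layout is "<bytes> bytes <avg> KB/sec avg <inst> KB/sec inst",
# as seen in the two transcripts on this page.
_PROGRESS = re.compile(
    r"(\d+)\s+bytes\s+([\d.]+)\s+KB/sec\s+avg\s+([\d.]+)\s+KB/sec\s+inst"
)


def parse_progress(line: str) -> Progress:
    """Extract byte count, average rate and instantaneous rate from one line."""
    match = _PROGRESS.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    return Progress(int(match.group(1)), float(match.group(2)), float(match.group(3)))


# The two progress lines quoted above.
bad = parse_progress("    62914560 bytes   1279.98 KB/sec avg    512.00 KB/sec inst")
good = parse_progress("    1672478720 bytes  68053.21 KB/sec avg  70142.84 KB/sec inst")

# Ratio of the average transfer rates (good node vs. bad node).
slowdown = good.avg_kb_s / bad.avg_kb_s

print(f"wn-smrt-006: {bad.bytes_done / 2**20:.0f} MiB at {bad.avg_kb_s:.2f} KB/sec avg")
print(f"wn-val-066:  {good.bytes_done / 2**20:.0f} MiB at {good.avg_kb_s:.2f} KB/sec avg")
print(f"average rate ratio: {slowdown:.1f}x")
```

For the two lines quoted here, the average rate on wn-val-066 comes out roughly 53 times higher than on wn-smrt-006, and the instantaneous rate on the bad node (512 KB/sec) is well below its own average, which is consistent with a transfer that started fast and then stalled.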


Analysis

At first it was thought that the lcg-cp command itself was causing the error.

Solution