Difference between revisions of "ADC Operation NL"
Revision as of 09:01, 9 October 2008
Introduction
This page summarizes the current setup and configuration of the NL cloud of ATLAS Distributed Computing (ADC).
It also logs issues encountered during day-to-day operations.
Sites
- combined Tier1 center:
Country | Institute | GOC site name |
---|---|---|
Netherlands | NIKHEF | NIKHEF-ELPROD |
Netherlands | SARA | SARA-MATRIX |
Country | Institute | GOC site name |
---|---|---|
Russia | ITEP | ITEP |
Russia | IHEP | RU-PROTVINO-IHEP |
Russia | JINR | JINR-LCG2 |
Russia | PNPI | RU-PNPI |
Russia | SINP | SINP |
Russia | RRC-KI | RRC-KI |
Ireland | CST | CSTCDIE |
Turkey | ULAKBIM | TR-10-ULAKBIM |
Data Management Services
- Storage Resource Manager v2.2: deployed at each site as a common interface to the storage elements
- gLite File Transfer Service (FTS): deployed at SARA serving the T1-T1 and T1-T2 data transfers to the sites of the NL cloud
- LCG File Catalog Service (LFC): deployed at SARA for cataloging grid files stored on the sites of the NL cloud
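The service layout listed above can be sketched as a small lookup table. This is only an illustrative model: the GOC site names are taken from this page, while the variable and function names (`SRM_EVERYWHERE`, `CENTRAL_SERVICES`, `services_at`) are hypothetical and do not correspond to any ADC tool.

```python
# Illustrative model of the NL cloud data management layout described above.
# Site names come from this page; all identifiers here are hypothetical.

# SRM v2.2 is deployed at every site.
SRM_EVERYWHERE = "SRM v2.2"

# Services hosted centrally, keyed by the GOC site name that hosts them.
# FTS and LFC are both deployed at SARA (GOC site name SARA-MATRIX).
CENTRAL_SERVICES = {
    "SARA-MATRIX": ["FTS", "LFC"],
}


def services_at(site):
    """Return the data management services deployed at the given site."""
    return [SRM_EVERYWHERE] + CENTRAL_SERVICES.get(site, [])


if __name__ == "__main__":
    for name in ("SARA-MATRIX", "NIKHEF-ELPROD", "ITEP"):
        print(name, services_at(name))
```

A lookup like this makes it easy to see that a problem with FTS or LFC at SARA affects transfers and cataloging for every site in the cloud, while an SRM problem is local to one site.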
Data locations among SARA and NIKHEF
SRMv2 space tokens
Daily operation logs
Date | Actions | Remarks |
---|---|---|
9 Oct. 2008 | requested replication of reprocessed cosmic and 1beam ESD to NIKHEF-ELPROD_DATADISK | needed by NIKHEF physics group |
10 Oct. 2008 | NL Tier2s back online for MC production | FTS performance issue at SARA fixed |
Useful links
- ADC eLog entries concerning NL cloud
- DDM operation wiki
- DDM browser
- SRM space usage monitor
- Data transfer dashboard (T0-T1, T1-T1)
- Data transfer dashboard (T1-T2)
- PanDA monitor
Troubleshooting logs
Sept. 2008 - huge transfer backlog from T2s to SARA
Problem fixed by running the FTS admin tool to slim down the FTS job history db table.
Sept. 2008 - SRM request timeout reading data from SARA
A cron job that fixes the dCache orphan file issue was overloading the PNFS server, so it was stopped. A broken network switch affecting 4 new dCache nodes was also found and has been replaced. A broken dCache node has been replaced as well.