Difference between revisions of "NDPF News"

From PDP/Grid Wiki

Revision as of 10:56, 3 March 2011

At A Glance

Dear users of the Nikhef Data Processing Facility and SSO,

On Tuesday March 8 (next week) intrusive network maintenance will be
performed on selected parts of the Nikhef network infrastructure, affecting
Grid and NikIdM (single sign-on) services. The routing equipment at the
core of the grid network will be replaced and all links have to be
reconnected to the new device. Due to physical limitations - cable lengths
and cabinet space - this cannot be done without service interruption.

The following services WILL NOT BE AVAILABLE on March 8, from 0900-1700 CET:
* single sign-on and federative services (SURFspot, MailFilter, certificates)
* changing your password or adding email aliases
* grid computing services at NIKHEF-ELPROD
* the qsub-tunnel ('nsub') from the Nikhef desktop network
* access to data stored at tbn18.nikhef.nl
* WMS, brokering, and other Grid services hosted on .nikhef.nl domains

Apart from the SSO and storage services, alternatives are available at
our partner BiG Grid sites, such as SARA, RUG-CIT and HTC-Philips. Also
all other sites in EGI and wLCG can be used.
Other services at Nikhef are not affected by this maintenance.

The new network router allows for consolidation of bandwidth and better
interconnects, the introduction of production-level IPv6 services in the
grid network, and prepares for the introduction of high-throughput
cloud services at Nikhef in the context of BiG Grid.

However extensive the planning and testing, there is always the possibility
of some horrible failure. At this scale, it is unrealistic to test every
possible interaction in the system. Were such a failure to occur, we can 
restore the old situation at the end of the day, and a new attempt will be
made on Thursday March 10 -- of course after due diagnosis of the failure.

We hope for your understanding!

	DavidG.

Current

A current overview of downtimes for Nikhef's grid services and those of other BiG Grid sites is available on the BiG Grid downtime overview page.


Past announcements

Change in queue names and properties

To make the computing resources at the various BiG Grid sites more uniform for users, all sites will define identical queues (same names and properties) on their computing systems.

At grid site NIKHEF-ELPROD, we have created new queues that will replace some of the existing queues. The following queues will be removed from the systems as of December 7th, 2009:

  • "test": replaced by queue "infra";
  • "qshort": replaced by queue "short" with a maximum wall time of 4 hours;
  • "qlong": replaced by queue "medium" with a maximum wall time of 36 hours.

Note that the replacement queues can already be used.

How does this affect users of the computing infrastructure?

  • Users who do not explicitly submit jobs to a specific queue do not have to take any action;
  • Users who put a statement in the .jdl file to select a specific queue may have to change the queue name in the .jdl file;
  • Users who directly submit jobs to a Computing Element and queue via the command glite-wms-job-submit using the option
 "--resource <CE>:2119/jobmanager-pbs-<QUEUE>"
    may have to change the name for <QUEUE>.
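
For users who keep resource strings in scripts, the renaming listed above could be applied with a small helper along the following lines. This is only a sketch: the function name migrate_resource and the pattern are illustrative assumptions, not part of any gLite tool, and the only facts taken from this announcement are the three queue renames and the "<CE>:2119/jobmanager-pbs-<QUEUE>" layout.

```python
import re

# Old-to-new queue names, taken from the list of removed queues above.
QUEUE_RENAMES = {"test": "infra", "qshort": "short", "qlong": "medium"}

# Matches "<CE>:2119/jobmanager-pbs-<QUEUE>" and captures the CE and the queue.
# The pattern itself is an assumption based on the option format shown above.
RESOURCE_PATTERN = re.compile(
    r"^(?P<ce>[^:/\s]+):2119/jobmanager-pbs-(?P<queue>\S+)$"
)


def migrate_resource(resource: str) -> str:
    """Return the resource string with a removed queue replaced by its successor.

    Hypothetical helper: queues that are not being removed are left unchanged.
    """
    match = RESOURCE_PATTERN.match(resource)
    if match is None:
        raise ValueError(f"unexpected resource format: {resource!r}")
    queue = match.group("queue")
    new_queue = QUEUE_RENAMES.get(queue, queue)
    return f"{match.group('ce')}:2119/jobmanager-pbs-{new_queue}"


if __name__ == "__main__":
    # gazon.nikhef.nl is one of the CEs named elsewhere on this page.
    print(migrate_resource("gazon.nikhef.nl:2119/jobmanager-pbs-qshort"))
```

The result can then be passed to the "--resource" option as before. Note that the new queues have their own wall time limits (4 hours for "short", 36 hours for "medium"), so a renamed job may still need a different queue.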

WN upgrade to CentOS-5 x86-64 and gLite 3.2

On Monday October 26th, 2009, all worker nodes will be upgraded to CentOS-5 x86-64 with gLite 3.2 middleware. The worker nodes will be put off-line the weekend before (Oct 24/25) to allow running jobs to complete.

Moving grid services to new data center

A new data center has been built at Nikhef. The existing grid infrastructure at Nikhef will be moved to this new data center between 10 August and 21 August. During the migration process, grid services will be unavailable.

This large-scale operation will take place in two phases:

1) Moving grid services and network infrastructure (10-14 August 2009)

During this phase, all grid services at site NIKHEF-ELPROD will be unavailable. For grid users this means that the following services cannot be used:

  • Computing services (CEs gazon.nikhef.nl and trekker.nikhef.nl);
  • Storage services (SE tbn18.nikhef.nl);
  • Job submission services (WMS graszode.nikhef.nl, graspol.nikhef.nl, dorsvlegel.nikhef.nl);
  • Requesting renewal of grid certificates via the Dutchgrid CA web site will not be possible on 10 and 11 August (requests can be submitted via mail but will not be processed);
  • The web sites www.dutchgrid.nl, www.vl-e.nl and poc.vl-e.nl will be unavailable.

2) Moving compute and storage clusters (15-21 August 2009)

In this phase, the computing and storage clusters will be unavailable. Grid users will not be able to:

  • Use the computing services (CEs gazon.nikhef.nl and trekker.nikhef.nl);
  • Access certain data files via SE tbn18.nikhef.nl.

We advise all users of the grid infrastructure to:

  • Request renewal of grid certificates before August 5th (only if the certificate will expire early or mid-August);
  • Use the grid computing services at SARA (CE ce.gina.sara.nl);
  • Submit grid jobs via the WMS at SARA (WMS wms.grid.sara.nl);
  • Plan their work so that no access is required to data files via storage element tbn18.nikhef.nl, or copy relevant data elsewhere beforehand.

Grid computing nodes unavailable 24-27 April

During the weekend of 25-26 April, work will be done on the electrical systems providing power to the grid computing nodes at Nikhef. Therefore, the grid computing nodes ("worker nodes") at Nikhef's grid facility will be unavailable from Friday April 24, 13:00 until Monday April 27, 10:00.

The facility will stop accepting new jobs from Thursday April 23, 8:00, to enable jobs to finish before the shutdown of the nodes.

Stopping resource broker services at NIKHEF-ELPROD

The resource broker services at site NIKHEF-ELPROD will be stopped on 02-February-2009. This affects the RBs bosheks.nikhef.nl, boszwijn.nikhef.nl and the alias rb03.nikhef.nl.

Current users of the resource brokers are advised to start submitting their jobs via the replacement WMS servers as soon as possible, and no later than January 15, 2009. Output sandboxes have to be retrieved before February 2, 2009; after this date they will be deleted.

The WMS hosts at NIKHEF-ELPROD are graszode.nikhef.nl and graspol.nikhef.nl (or use the alias wms03.nikhef.nl).


Removing installation of VL-e PoC release 2

The installation of VL-e PoC release 2 will be removed from the worker nodes at site NIKHEF-ELPROD as of 02-February-2009. Users of the PoC should migrate to release 3 of the PoC, which is already available.


Change in services

The Classic Storage Element tbn15.nikhef.nl (alias se03.nikhef.nl) will be removed from the information system on Monday 17-Nov-2008. Access via gridftp to the storage element will remain available until further notice.

After removal of the service from the information system, the lcg-* tools (e.g. lcg-cr, lcg-cp) can no longer be used to access the data on this storage element.

EGEE-broadcast: https://cic.gridops.org/index.php?section=rc&page=broadcastretrieval&step=2&typeb=C&idbroadcast=37321