Difference between revisions of "Monthly Review"

From PDP/Grid Wiki
Latest revision as of 15:05, 5 December 2011

Background

This page documents how to assemble the monthly review data for the Tier-1 meeting.

CPU numbers

There is a script that generates a CSV file with the accounting numbers. The script is here: svn+ssh://svn@ndpfsvn.nikhef.nl/repos/pdpsoft/nl.nikhef.ndpf.3maand/

Example invocations:

./3maand.py -p <db passwd here> 2011-08-01
./3maand.py -p <db passwd here> 2011-08-01 +6m
./3maand.py -p <db passwd here> 2011-08-01 2011-12-05

"2011-08-01" is the start date of the period over which the accounting will be summed. With no end date provided, the default is a period of three months (the first example). An end date of "+6m" means: sum the accounting over 6 months. The last example shows an explicit end date. You can use any dates you like, but the results are only guaranteed if the start date is the first of the month, as part of the program logic is based on that assumption.
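The three invocation forms can be illustrated with a small sketch. This is not the code of 3maand.py; the names resolve_period and add_months are made up for this example, and treating the computed end date as the first day after the period is an assumption.

```python
import calendar
from datetime import date, datetime


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def resolve_period(start_arg, end_arg=None):
    """Return (start, end) for the three argument forms described above.

    Hypothetical helper: the real script may handle its arguments differently.
    """
    start = datetime.strptime(start_arg, "%Y-%m-%d").date()
    if end_arg is None:
        # No end date given: default to a three-month period.
        end = add_months(start, 3)
    elif end_arg.startswith("+") and end_arg.endswith("m"):
        # Relative form such as "+6m": that many months after the start.
        end = add_months(start, int(end_arg[1:-1]))
    else:
        # Explicit end date.
        end = datetime.strptime(end_arg, "%Y-%m-%d").date()
    return start, end


if __name__ == "__main__":
    print(resolve_period("2011-08-01"))
    print(resolve_period("2011-08-01", "+6m"))
    print(resolve_period("2011-08-01", "2011-12-05"))
```

Starting on the first of the month keeps the month arithmetic unambiguous, which fits the warning above about start dates.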

The result of running that program will be a file, 'tmp.csv'. You should then open that file in Excel, as well as the template to be found here:

http://www.nikhef.nl/grid/ndpf/files/12-2011.xlsx

It should be relatively clear where to paste the data from the tmp.csv file into this template. Things to watch out for:

  • You can't just paste the entire tmp.csv in one go due to the extra REF line (line 21) in the template.
  • If the farm capacity changes during the period covered, you'll need to make some edits by hand. This is shown in the example file above. "Old" pledge values have been added by hand in the box at cell D40. These have been used to adjust:
    • the numbers in column D in the first three blocks
    • the data in row 21
    • the total capacity numbers in column F
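If you prefer to prepare the two paste blocks outside Excel, the split around the REF line can be sketched as follows. This assumes that tmp.csv has no counterpart for template line 21, so its first 20 rows go above the REF line and the rest below it; check this against the template before relying on it. The function name split_for_template is made up for this example.

```python
import csv
import io


def split_for_template(csv_text, ref_row=21):
    """Split CSV rows into the block above and the block below the REF line.

    Assumption: the template has one extra row at `ref_row` that does not
    exist in the CSV, so CSV rows 1..ref_row-1 are pasted first and the
    remaining rows are pasted starting one row below the REF line.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    above = rows[: ref_row - 1]
    below = rows[ref_row - 1:]
    return above, below


if __name__ == "__main__":
    with open("tmp.csv", newline="") as handle:
        above, below = split_for_template(handle.read())
    print(len(above), "rows go above the REF line")
    print(len(below), "rows go below the REF line")
```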

The "new" pledge numbers at cell A38 are printed by the script; you'll need to edit the script once a year to put in the new pledge numbers.

If you scroll down a bit further in the template, you will see the pie chart. The data for the chart can be generated by going to the following site:

https://www3.egee.cesga.es/gridsite/accounting/CESGA/custom_view.html

In the menu at left, select

  • EGI
  • Production
  • NGI_NL
  • NIKHEF-ELPROD

At the top, choose "Norm Sum Elapsed" (HS06-hours). Then choose the same period as above (warning: the dates on this site are inclusive, so choosing end month = 10 gets you data for all of October). Under the groupings, show data for SITE grouped as a function of VO, and for VO groups select "ALL". Finally, click "Click here for a csv dump of this table" to get a CSV file; copy that data and paste it into the template sheet. It makes sense to delete a few columns of that CSV, as well as to sort the columns, in order to get a pretty pie chart. Now you've got the relevant computing plots.
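Because the portal dates are inclusive, the end month to select is the last month of the period, not the month after it. A small sketch of that arithmetic, assuming the period covers whole calendar months; the function name portal_month_range is made up for this example:

```python
def portal_month_range(start_year, start_month, n_months):
    """Return ((year, month), (year, month)) as inclusive start and end months.

    Assumes a period of `n_months` whole calendar months starting in
    `start_month`; the end month is the last month still inside the period.
    """
    last_index = start_year * 12 + (start_month - 1) + n_months - 1
    end_year, end_month0 = divmod(last_index, 12)
    return (start_year, start_month), (end_year, end_month0 + 1)


if __name__ == "__main__":
    # Three months starting in August 2011: select August through October.
    print(portal_month_range(2011, 8, 3))
    # Six months starting in August 2011: the end month falls in the next year.
    print(portal_month_range(2011, 8, 6))
```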

Disk numbers

The input for the disk part can be found in the directory /var/adm on tbn18. There you can find CSV files that contain a dump of the disk usage for all pools, once per day. Pick the 15th, or a neighboring day in case that dump has problems. These numbers should be copied into the Excel spreadsheet. The pivot table ranges need to be updated, as well as the graph data selections on the 2nd sheet.
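The rule "pick the 15th, or a neighboring day" can be sketched as below. The function name pick_dump_day is made up for this example, and the list of usable dump dates has to be supplied by hand, since the naming of the CSV files in /var/adm is not described here.

```python
from datetime import date


def pick_dump_day(usable_days, year, month, target_day=15):
    """Return the usable dump date closest to the target day of the month.

    `usable_days` is a list of datetime.date objects for dumps that are
    known to be good. On a tie, the earlier day wins.
    """
    target = date(year, month, target_day)
    candidates = [d for d in usable_days if (d.year, d.month) == (year, month)]
    if not candidates:
        raise ValueError("no usable dump in %04d-%02d" % (year, month))
    return min(candidates, key=lambda d: (abs((d - target).days), d))


if __name__ == "__main__":
    # Example: the dump of the 15th is broken, so it is left out of the list.
    usable = [date(2011, 11, 13), date(2011, 11, 14), date(2011, 11, 17)]
    print(pick_dump_day(usable, 2011, 11))
```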

GGUS Tickets

You can generate the nice picture of tickets using this tool:

https://ggus.eu/stat/ttt.php

If you click on each ticket bar, a new page will open that shows the relevant GGUS ticket, and you can annotate the picture accordingly.