RunningSPEC

From PDP/Grid Wiki
Revision as of 10:12, 14 April 2009 by Davidg@nikhef.nl

Note: some steps can only be completed by DavidG, since he owns the licenses needed for the Intel Compiler Suite and for SmartHeap. Only one (commercial) Intel license is available for this work, and only a single SmartHeap license (still on order). In principle, though, anyone at Nikhef is entitled to run the SPEC benchmark with their favourite plain old dirty gcc compiler and with conventional heap management. Access to the SPEC suites is limited to the NDPF systemAdministrator group (with the NDPF login), both to make sure that the code does not leak out and to make sure that anyone running the benchmark does so in a compliant way.

Prepare the machine

  • Install a machine with CentOS5 x86-64, e.g. via quattor or manually. Make sure it's a physical machine, and that it will be dedicated to the benchmark running for at least as long as you need it. Make sure numactl and taskset (from util-linux) are available on the machine.
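Before starting, it can help to confirm that the required tools are really present. The following is a minimal sketch and not part of the original procedure; the helper name check_tools is made up for illustration.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools this page asks for; the hint is printed only when one is absent.
check_tools numactl taskset || echo "install the missing tools before benchmarking"

# Print the machine architecture; an x86-64 install reports x86_64.
uname -m
```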

Using the Intel compiler

If you want to get proper performance from the benchmark, and for vendor acceptance tests, you must use the Intel compiler:

  • As root, install the Intel compiler suite (version 11.0) from https://www.nikhef.nl/grid/ndpf/files/local/intel/. Install both the i386 and the x86_64 varieties.
  • As root, run the script intel-Compiler-postinstall.sh to set the installation destination into the generated scripts.
  • As root, install the license file to /opt/intel/licenses/icc-davidg.lic and chown it to davidg:users, mode 0600.
  • Add to your profile:
PATH=$PATH:/opt/intel/Compiler/11.0/081/bin/intel64/
export PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/Compiler/11.0/081/lib/intel64
export LD_LIBRARY_PATH
LANG=C
export LANG
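If the profile is sourced more than once, the lines above append the same directories again, and an empty LD_LIBRARY_PATH ends up with a leading colon. One possible guard is sketched below; the helper name append_dir is hypothetical, and the directories are the ones given above.

```shell
# Hypothetical helper: print LIST with DIR appended, unless DIR is already in it.
# LIST is a colon-separated list such as $PATH.
append_dir() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*) echo "$list" ;;        # already present: unchanged
        "::")       echo "$dir" ;;         # empty list: avoid a leading colon
        *)          echo "$list:$dir" ;;
    esac
}

PATH=$(append_dir "$PATH" /opt/intel/Compiler/11.0/081/bin/intel64/)
export PATH
LD_LIBRARY_PATH=$(append_dir "${LD_LIBRARY_PATH:-}" /opt/intel/Compiler/11.0/081/lib/intel64)
export LD_LIBRARY_PATH
```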

Using SmartHeap

Install the SmartHeap library:

  • obtain the appropriate distribution and license from MicroQuill. The current 64-bit version is called sh8linux64b.tar.gz, and the 32-bit version sh81linux.tar.gz.
  • unpack the tar-balls, e.g. to the location "/opt/SmartHeap8-x86_64/", so that the "lib/" directory is right below that one. Make the directory readable only for the user owning the license.
  • Add SmartHeap to your library path in your shell setup (your bashrc is the best place). For the 64-bit version:
LD_LIBRARY_PATH=/opt/SmartHeap8-x86_64/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
  • download a config file and a flags file to enable the library. The flags file is essential; without it you'll get an 'invalid run' complaint from SPEC2006.
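The "readable only for the user owning the license" step could be done roughly as follows. This is a sketch under assumptions: the helper name restrict_to_owner is invented here, and the directory is assumed to be owned already by the license holder.

```shell
# Hypothetical helper: remove all group/other permissions below DIR so that
# only the owning user can read it.
restrict_to_owner() {
    chmod -R go-rwx "$1"
}

# Location used on this page; skipped when SmartHeap is not unpacked there.
sh_dir=/opt/SmartHeap8-x86_64
if [ -d "$sh_dir/lib" ]; then
    restrict_to_owner "$sh_dir"
    ls -ld "$sh_dir" "$sh_dir/lib"
fi
```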

Installing and running SPEC2006

  • From the installation directory retrieve the SPEC2006 sources cpu2006.tar.bz2 (you'll need your NDPF username to retrieve the file, and be a member of the systemAdministrators group: for SPEC2006 and SPEC2000 Nikhef has a site license).
  • Unpack the spec2006 sources in a dedicated directory (to prevent pollution):
mkdir -p benchmarking && ( cd benchmarking ; bzcat ../cpu2006.tar.bz2 | tar xf - )
  • change to the benchmark directory
cd benchmarking
  • 'install' the suite by running in the benchmarking/ directory:
./install.sh

and agreeing that the source and installation directory are the same.

  • copy the proper configuration file from the private repository, for example:
  • Rename the config file, and make sure the hardware specification notes are correct. Also update test date and OS environment specifications, please.
  • set the SMP affinity in the config file to the number of available cores. This directly affects the rate measurement!
  • Set the environment correctly by SOURCING the setstack.sh script:
export KMP_STACK_SIZE=64M
ulimit -s unlimited
  • source the SPEC environment by
. shrc
  • run the benchmark, making sure to redirect logging to a file for analysis:
# Either without SmartHeap:
runspec --rate 8 -c cpu2006-icc11-noSH-rate-nikhef-ppfB0.cfg --machine=stoakleydp8cores -T base -o all int > run.log 2>&1 &
disown %1
# or, as a separate run, with SmartHeap (this overwrites run.log from an earlier run):
runspec --rate 8 -c cpu2006-icc11-SH8-x86_64-rate-nikhef-ppfB1.cfg --machine=stoakleydp8cores -T base -o all int > run.log 2>&1 &
disown %1
  • wait for the test to complete, but of course don't load the machine with queries. A full, valid, run takes approx. 6-12 hours on a modern 8-core system.
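The examples above use --rate 8 on an 8-core machine. The sketch below assumes the number of copies should equal the number of online cores; it prints the command line rather than running it. The helper names are hypothetical, and getconf is used because nproc may be missing on older systems.

```shell
# Count online cores; fall back to /proc/cpuinfo if getconf gives no number.
count_cores() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
    case "$n" in
        ''|*[!0-9]*) n=$(grep -c '^processor' /proc/cpuinfo) ;;
    esac
    echo "$n"
}

# Hypothetical helper: print (not run) a runspec command line whose --rate
# value matches the core count. Config file and machine name come from the caller.
print_runspec_cmd() {
    config=$1
    machine=$2
    copies=$(count_cores)
    echo "runspec --rate $copies -c $config --machine=$machine -T base -o all int"
}

print_runspec_cmd cpu2006-icc11-noSH-rate-nikhef-ppfB0.cfg stoakleydp8cores
```

Review the printed command before running it, and check that the value agrees with the SMP affinity set in the config file.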

Verifying the output

The result of your SPEC run is left in numbered files in the results/ directory of the benchmarking working directory. The files are called CINT2006.001.ref.*. For a successful run, please copy all of these files to our persistent storage area on the web server at login.nikhef.nl:/www/grid/ndpf/files/nikhef-only/spec2006-nikhef/results/CONFIGFILENAME/.

The past results can be found at https://www.nikhef.nl/grid/ndpf/files/nikhef-only/spec2006-nikhef/results/.

If ever the CINT2006.XXX.ref.txt file starts with this:

##############################################################################
#   INVALID RUN -- INVALID RUN -- INVALID RUN -- INVALID RUN -- INVALID RUN  #
#                                                                            #
# Your run was marked invalid because ...

you know you've messed up and you MUST NOT use the results. Please review the log file and error messages and try again.
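A quick way to spot the banner without opening each report is sketched below. The helper name is_invalid_run is made up, and looking only at the first five lines is an assumption based on the banner appearing at the start of the file.

```shell
# Hypothetical helper: succeed (exit 0) if the given report carries the
# INVALID RUN banner in its first five lines, fail (exit 1) otherwise.
is_invalid_run() {
    head -n 5 "$1" | grep -q 'INVALID RUN'
}

# Check every text report in results/. If nothing matches, the pattern stays
# literal, the -e test fails, and the entry is skipped.
for report in results/CINT2006.*.ref.txt; do
    [ -e "$report" ] || continue
    if is_invalid_run "$report"; then
        echo "$report: INVALID, do not use"
    else
        echo "$report: no invalid-run banner found"
    fi
done
```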

If your result is very far from the ones published on http://www.spec.org/cpu2006/results/, then also review your setup. Was the Intel compiler used? Was SmartHeap correctly installed? Etc. Of course, if you used gcc then your performance will be down by 30-50% anyway.

Typical numbers for the different compiler scenarios are, on an 8-core decent and modern system of 2008:

Condition                       Performance metric
Vendor provided public value    107
GCC v3.4                        63.2
ICC11.0, no SH                  93.7
ICC11.0, with SH 8.0 x86-64     104

So we get within 3% of the vendor-specified value (104 vs. 107), using a commodity Linux OS and taking no special care over 32 vs. 64 bits for the various elements of the test. This was all with 64 bits; 32 bits may be better (or worse, of course...).
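The ratios behind that comparison can be recomputed from the table. This is a small sketch; the helper name percent_of is hypothetical and the numbers are the ones listed above.

```shell
# Hypothetical helper: print VALUE as a percentage of REFERENCE, one decimal.
percent_of() {
    awk -v v="$1" -v r="$2" 'BEGIN { printf "%.1f\n", 100 * v / r }'
}

vendor=107
for entry in "GCC v3.4:63.2" "ICC11.0, no SH:93.7" "ICC11.0, with SH 8.0:104"; do
    label=${entry%%:*}    # text before the colon
    value=${entry##*:}    # number after the colon
    echo "$label: $(percent_of "$value" "$vendor")% of vendor value"
done
# prints:
#   GCC v3.4: 59.1% of vendor value
#   ICC11.0, no SH: 87.6% of vendor value
#   ICC11.0, with SH 8.0: 97.2% of vendor value
```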

Other benchmarks

The LCG community prefers to use a lousy compiler and doesn't want to spend time optimizing the code and making it portable, so the normal benchmarks are not representative of the relative performance of these applications. They have therefore decided to run a representative set of benchmarks (all_cpp), but with compiler settings and flags that reflect their way of working. These tests are described at https://twiki.cern.ch/twiki/bin/view/FIOgroup/TsiBenchHEPSPECWlcg. The same machine as used above yields 69.11 on this 'hep-spec' benchmark.