BigDataTestBedHvATI


For the HvA "Showing Real Big Data" projects an experimental environment has been configured. Within this environment the project team can configure (virtual) machines to host the various services of an ELK data analytics cluster (Elasticsearch, Logstash, Kibana). The environment also contains a gateway/access system that permits access to the (semi-isolated) network to which the ELK cluster systems are connected. On the systems that will run the ELK cluster, and on any additional machines in the network, the project team has full system-level access (root).

Before you begin

Basic configuration

Team systems network
  • The team systems run CentOS 6 (a rebuild of Red Hat Enterprise Linux 6), update 7.
  • Auto-updating is enabled. Team systems should nevertheless be kept updated and patched for known security vulnerabilities (also in installed software, like the ELK stack), even though the systems are placed in an isolated network, since in the future this may become a public-facing service.
  • The team systems are on a dedicated private network (vlan). This network may use addresses from the RFC1918 range 172.23.1.0/24. If IPv6 is desired, the range 2001:0610:0120:e108::/64 may be used, but no IPv6 default gateway is configured (if you need one, ask).
  • Access to the team systems:
    • there is a set of VMs (for now: 4), called "vm{1,2,3,4}.stud1.ipmi.nikhef.nl" (sorry for the naming ...)
    • from the 'inside' (desktops, login.nikhef.nl) you can get to them directly on most TCP ports.
    • from the 'outside' there is no direct access - but there is a gateway host that can proxy traffic: schaapscheerder.nikhef.nl (see the ssh configuration sketch after this list). This host has an external public interface (IPv4 194.171.96.53, IPv6 2001:610:120:e100::35) and an internal one on eth1 (172.23.1.1). This system is also the default gateway for the team systems, as it will do SNAT for them when talking to the outside world. SNATted traffic is bandwidth limited.
    • local traffic (e.g. to the OS distribution mirror) is routed directly, without NAT, via the specific gateway 172.23.1.254 (parkwachter.nikhef.nl)
  • Login to schaapscheerder and the team systems is with your standard NikIdM/SSO username and password
  • Team members have root access to the team systems. If you provide your ssh public key in the NikIdM identity service (at https://sso.nikhef.nl/chsh/), the root authorized_keys files will be updated automatically within the hour. Otherwise, log in as a mortal user and use the root password to set up sudo(8).
  • You do not have root access on the gateway box. If you need port forwarding or config changes, contact grid.sysadmin@nikhef.nl or drop by in room H150 or H156.
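
For access from outside, a minimal sketch of a client-side ssh configuration that hops through the gateway (replace your_name with your own account; ProxyCommand with -W needs a reasonably recent OpenSSH client):

 # ~/.ssh/config on your own machine (a sketch -- replace your_name)
 Host vm*.stud1.ipmi.nikhef.nl
     User your_name
     # tunnel through the gateway host; needs OpenSSH >= 5.4 for -W
     ProxyCommand ssh your_name@schaapscheerder.nikhef.nl -W %h:%p

With that in place, "ssh vm1.stud1.ipmi.nikhef.nl" should also work from outside.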

And please follow the Nikhef AUP and be nice to the systems. Also: do NOT take any data home, and do not transfer log data off-site. The logs contain (a little bit of) personal information. Feel free to log in to the team systems (via login.nikhef.nl) from anywhere ...

Log data to start with

The log data is coming from a DPM "Disk Pool Manager" storage cluster, part of the LCGDM data management suite. Some hints about what DPM does were already given in the introductory presentation, but it makes sense to read a bit about what it does in general.

The head node system is "tbn18.nikhef.nl". All other systems (oliebol-*, strijker-*) are slave disk servers. On schaapscheerder:/data/local/logstore/reference-data/ there is one week's worth of reference data from the storage cluster - in the format as it is found on the nodes: partially compressed, sometimes with some data missing on a particular node. It's real data, with real defects ;-)
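
To get a feel for the data before feeding it to Logstash, you can browse it in place on schaapscheerder. A quick sketch (the file name under reference-data/ used below is only an example, look at what is actually there):

 # on schaapscheerder: see which nodes and log files are present
 ls -lR /data/local/logstore/reference-data/ | less
 # zless/zcat read the gzip-compressed files directly (example path)
 zless /data/local/logstore/reference-data/tbn18.nikhef.nl/dpm/log.1.gz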

More documentation:

Team systems

The team systems "vmX.stud1.ipmi.nikhef.nl" are basic CentOS6.7 installs. Only a bit of firewalling and authentication config has been added. As part of the work you are invited to install (many ;-) additional services: the ELK stack, apache httpd, additional elasticsearch boxes to build a cluster, etc.

 ssh your_name@vm1.stud1.ipmi.nikhef.nl

should work. And if you have uploaded your ssh public key to SSO, also ssh root@vm1.stud1.ipmi.nikhef.nl ought to work.
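
As an illustration only, installing the first ELK component could start roughly like this (package and repository details are assumptions here -- take the exact yum repository definition and GPG key from the official Elasticsearch installation documentation for the version you choose):

 # on a team VM, as root (a sketch, not a recipe)
 yum install -y java-1.8.0-openjdk        # Elasticsearch needs a JVM
 # add the elastic.co repository file under /etc/yum.repos.d/ as described
 # in the official installation docs, then:
 yum install -y elasticsearch
 service elasticsearch start              # CentOS 6 uses SysV init scripts
 chkconfig elasticsearch on               # and chkconfig for boot-time start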

For now, these systems are entirely independent: there is no shared home directory, nor (for now) any shared file system with the schaapscheerder gateway host. You will have to devise your own method to import data into ELK. If needed, you can of course set up NFS inside the cluster. It is also possible, if needed, to export the file system(s) from schaapscheerder to the team systems -- but if you are able to do without NFS, that is better (it means the production system will also not need NFS ;-)
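
One simple way (a sketch, assuming rsync is installed on both ends; the target directory /srv/logdata on the VM is just an example you would create yourself) is to push the reference data from schaapscheerder straight to a VM over ssh:

 # run on schaapscheerder; /srv/logdata on vm1 is only an example target
 rsync -av /data/local/logstore/reference-data/ root@vm1.stud1.ipmi.nikhef.nl:/srv/logdata/

If rsync is not available, scp -r works just as well.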

The systems start with 1GB RAM and a 16GB disk. If you need additional disk or memory in the team systems that's fine. Send a mail to <grid.sysadmin@nikhef.nl> and we can add that to the VMs. For more RAM the machine will need a reboot. An additional disk can be added live (you'll see /dev/xvdX appearing).
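
Once an extra disk has been attached, a sketch of how to put it to use (the device name and the mount point are examples, check what actually appears on your VM):

 # on the team VM, as root
 cat /proc/partitions                # the new disk shows up as e.g. xvdb
 mkfs.ext4 /dev/xvdb                 # one filesystem on the whole disk
 mkdir -p /srv/data
 mount /dev/xvdb /srv/data
 # add a matching line to /etc/fstab so the mount survives a reboot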

Reporting and data preservation

Since we hope to use this system in the future (and it's of key interest for the phase-II students starting in 2016), we kindly urge you to keep a log (;-) of what you do, to preserve any code you write, and to keep a safe copy of any specific configuration you generate.

  • important changes and results: please send them to the list <pdp-proj-hva-bigdata@nikhef.nl> by mail (it's archived)
  • any code you write: deposit it in a repo. GitHub is fine for us, but if you prefer we also host a local subversion service
  • configuration: maybe keep a copy on schaapscheerder for deployment to the team systems? You can do (from schaapscheerder) an ssh root login to the VMs. If you enjoy sysadminning, that should be enough to get you started with Ansible or so ;-) (a sketch follows below)
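
For example, a minimal Ansible sketch from schaapscheerder (assuming Ansible is available there, e.g. via "pip install --user ansible", and that you keep a small inventory file ~/hosts of your own):

 # ~/hosts -- a hypothetical inventory grouping the four team VMs
 [elk]
 vm1.stud1.ipmi.nikhef.nl
 vm2.stud1.ipmi.nikhef.nl
 vm3.stud1.ipmi.nikhef.nl
 vm4.stud1.ipmi.nikhef.nl

With the ssh root login already working, ad-hoc commands then reach all team systems at once:

 ansible elk -i ~/hosts -u root -m ping
 # push a config file to every node (the paths here are just examples)
 ansible elk -i ~/hosts -u root -m copy -a 'src=elasticsearch.yml dest=/etc/elasticsearch/elasticsearch.yml'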

On schaapscheerder, the file system /data/local/logstore/ has 240 GByte available for the results of your work (and is writable by you all).

References and links