For the HvA "Showing Real Big Data" projects an experimental environment has been configured. Within this environment the project team can configure (virtual) machines to host the various services of an ELK data analytics cluster (Elasticsearch, Logstash, Kibana). The environment also contains a gateway/access system that permits access to the (semi-isolated) network to which the ELK cluster systems are connected. On the systems that will run the ELK cluster, and on any additional machines in the network, the project team has full system-level access (root).
== Before you begin ==
* Change your password at the [https://sso.nikhef.nl/passwd/ Nikhef NikIdM/SSO page].
* Configure your favourite shell and upload your SSH public keys on the [https://sso.nikhef.nl/chsh/ CHSH page].
* Read the [https://wiki.nikhef.nl/ct/Main_Page Nikhef ICT services user documentation].
* Discussion and questions can be mailed to <pdp-proj-hva-bigdata@nikhef.nl>; send questions about the network and the team systems to <grid.support@nikhef.nl>.
* Enjoy!
  
 
== Basic configuration ==

* The team systems run CentOS 6 (a rebuild of Red Hat Enterprise Linux 6), update 7.
* Auto-updating is enabled. Team systems should be kept updated and patched for known security vulnerabilities (also in installed software, like the ELK stack), even though the systems are placed in an isolated network, since in the future this may become a public-facing service.
* The team systems are on a dedicated private network (VLAN). This network may use addresses from the RFC 1918 range 172.23.1.0/24. If IPv6 is desired, the range 2001:0610:0120:e108::/64 may be used.
* Access to the team systems:
** There is a set of VMs (for now: 4), called "vm{1,2,3,4}.stud1.ipmi.nikhef.nl" (sorry for the naming ...).
** From the 'inside' (desktops, login.nikhef.nl) you can get to them directly on most TCP ports.
** From the 'outside' there is no direct access, but there is a gateway host that can proxy traffic: schaapscheerder.nikhef.nl. This host has an external public interface (IPv4 194.171.96.53, IPv6 2001:610:120:e100::35) and an internal one on eth1 (172.23.1.1). This system is also the default gateway for the team systems, as it does SNAT for them when talking to the outside world. SNATted traffic is bandwidth-limited. (See the SSH configuration sketch after this list.)
** Local traffic (e.g. to the OS distribution mirror) is direct and not NATted; it goes via the specific gateway 172.23.1.254 (parkwachter.nikhef.nl).
* Login to schaapscheerder and the team systems is with your standard NikIdM/SSO username and password.
* Team members have root access to the team systems. If you provide your SSH public key in the NikIdM identity service (at https://sso.nikhef.nl/chsh/), the root authorized_keys files will be updated automatically within the hour. Otherwise, log in as a mortal user and use the root password to set up sudo(8).
* You do not have root access on the gateway box. If you need port forwarding or configuration changes, contact grid.sysadmin@nikhef.nl or drop by room H150 or H156.
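
For access from the outside, a small SSH client configuration on your own machine saves typing. This is a minimal sketch, assuming a reasonably recent OpenSSH client; the user name "jdoe" is a placeholder for your own NikIdM/SSO login and is not taken from this page:

<pre>
# ~/.ssh/config on your own machine (outside Nikhef) -- sketch, adjust the user name
Host schaapscheerder
    HostName schaapscheerder.nikhef.nl
    User jdoe                          # placeholder: your NikIdM/SSO username

Host vm1 vm2 vm3 vm4
    HostName %h.stud1.ipmi.nikhef.nl   # %h expands to vm1, vm2, ...
    User jdoe                          # placeholder: your NikIdM/SSO username
    ProxyJump schaapscheerder          # OpenSSH >= 7.3
    # older clients: ProxyCommand ssh -W %h:%p schaapscheerder
</pre>

With this in place, "ssh vm1" hops through the gateway transparently; from the inside (desktops, login.nikhef.nl) you can of course reach the team systems directly.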

And please follow the Nikhef AUP and be nice to the systems. Also: do NOT take any data home, and do not transfer log data off-site. The logs contain a (little) bit of personal information. Feel free to log in to the team systems (via login.nikhef.nl) from anywhere ...

== Log data to start with ==

The log data comes from a DPM ("Disk Pool Manager") storage cluster, part of the LCGDM data management suite. Some hints about what DPM does were already given in the introductory presentation, but it makes sense to read a bit about what it does in general.

The head node system is "tbn18.nikhef.nl"; all other systems (oliebol-*, strijker-*) are slave disk servers. On schaapscheerder:/data/local/logstore/reference-data/ there is one week's worth of reference data from the storage cluster, in the format in which it is found on the nodes: partially compressed, sometimes with some data missing on a particular node. It's real data, with real defects ;-)
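
Since the team systems do not share a file system with the gateway (see below), the reference data has to be copied over explicitly. A minimal sketch, run on one of the team VMs; the target directory /data/reference-data is just an example, and rsync may first need to be installed with yum on either end:

<pre>
# on a team VM (as root, or pick a directory you can write to);
# /data/reference-data is an example target
mkdir -p /data/reference-data
rsync -av schaapscheerder.nikhef.nl:/data/local/logstore/reference-data/ /data/reference-data/
find /data/reference-data -name '*.gz' | head    # part of the data is gzip-compressed
</pre>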

More documentation:


== Team systems ==

The team systems "vmX.stud1.ipmi.nikhef.nl" are basic CentOS 6.7 installs; only a bit of firewalling and authentication configuration has been added. As part of the work you are invited to install (many ;-) additional services: the ELK stack, Apache httpd, additional Elasticsearch nodes to build a cluster, etc.
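
As an illustration of what such an installation looks like on CentOS 6, here is a rough sketch for Elasticsearch. The download URL and version are deliberately not given (check the Elastic documentation for the current RPM); the JVM package and SysV service handling below are what you would expect on a stock CentOS 6 system:

<pre>
# sketch only: fetch the Elasticsearch RPM from elastic.co first
yum install -y java-1.8.0-openjdk          # Elasticsearch and Logstash need a JVM
rpm -ivh elasticsearch-<version>.rpm       # <version> is a placeholder
service elasticsearch start                # CentOS 6 uses SysV init scripts
chkconfig elasticsearch on                 # start at boot
curl http://localhost:9200/                # should return a small JSON banner
</pre>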

For now, these systems are entirely independent: there is no shared home directory, nor (for now) any shared file system with the schaapscheerder gateway host. To import data into ELK you will have to devise a method yourself. If needed, you can of course set up NFS inside the cluster. It is also possible, if needed, to export the file system(s) from schaapscheerder to the team systems, but if you are able to do without NFS that is better (it means the production system will also not need NFS ;-)
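
If you do decide you need a shared area inside the cluster, a plain NFS export between the team systems is enough. A minimal sketch, assuming vm1 serves a hypothetical /data directory to the other VMs (you may also need to open the NFS ports in the local iptables rules):

<pre>
# on vm1 (the server); /data is a hypothetical directory
yum install -y nfs-utils
echo '/data 172.23.1.0/24(ro,no_root_squash)' >> /etc/exports
service rpcbind start && service nfs start
chkconfig nfs on
exportfs -ra

# on another team VM (the client)
yum install -y nfs-utils
mount -t nfs vm1.stud1.ipmi.nikhef.nl:/data /mnt
</pre>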

The systems start with 1 GB RAM and a 16 GB disk. If you need additional disk or memory in the team systems, that is fine: send a mail to <grid.sysadmin@nikhef.nl> and we can add it to the VMs. For more RAM the machine will need a reboot; an additional disk can be added live (you will see a /dev/xvdX device appear).
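
A newly added disk still has to be formatted and mounted by hand. A minimal sketch, assuming the new device shows up as /dev/xvdb and is mounted on /data (both names are examples):

<pre>
# device name and mount point are examples; check dmesg to see which /dev/xvdX appeared
mkfs.ext4 /dev/xvdb
mkdir -p /data
mount /dev/xvdb /data
echo '/dev/xvdb /data ext4 defaults 0 2' >> /etc/fstab   # mount again after a reboot
</pre>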