Stoomboot Scheduling

Note: this page is a work in progress.

Scheduling on stoomboot depends on many factors.

The number of cores accessible from a particular queue is not fixed. First, the cluster is currently partitioned into two pieces: one that can access the glusterfs disks and one that cannot:

class   running     unused       down    offline    multicore   total
generic     248     136 (25%)       16        8      168 (31%)     536 
gluster     184       0 ( 0%)       16        0        0 ( 0%)     192
    tot     432     136 (18%)       32        8      168 (23%)     728
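
The table above is just a snapshot. To get the current picture yourself you can query the node states with the standard Torque pbsnodes command; a minimal sketch, assuming the usual pbsnodes output in which each node record contains a "state = ..." line:

# count the nodes per state (free, job-exclusive, down, offline, ...)
pbsnodes -a | grep "state =" | sort | uniq -c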

A specific job can be submitted to either the generic partition (536 job slots) or the gluster partition (192 slots). Furthermore, there is a multicore pool within the generic partition. This pool currently holds 168 job slots, which leaves 368 job slots available to single-core jobs in the generic pool. The size of the multicore pool is adjusted from time to time depending on usage, so this number is not fixed either.
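
For illustration, selecting a queue and asking for several cores looks roughly like this (queue names as in the qmgr output further down; -l nodes=1:ppn=N is the generic Torque way of requesting cores on a single node, and the assumption that such requests end up in the multicore pool is mine, not something stated on this page):

qsub -q generic myjob.sh                     # single-core job in the generic queue
qsub -q generic -l nodes=1:ppn=4 myjob.sh    # request 4 cores on one node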

Of those 368 slots (or whatever the number happens to be, given the current size of the multicore pool), there are per-queue limits to keep a single user from occupying all of the available slots:

allier:~> qmgr -c 'p s' | grep user_run
set queue generic max_user_run = 342
set queue long max_user_run = 280
set queue stbcq max_user_run = 156
set queue gravwav max_user_run = 2
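
To see how close you are to such a limit you can count your own running jobs, for instance with the standard Torque qselect command (a quick sketch, nothing official):

# number of jobs of the current user in the running (R) state
qselect -u $USER -s R | wc -l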

So in the generic queue, for example, a single user will never be able to run more than 342 jobs at the same time. Finally, the scheduler tries to divide the job slots fairly between the active users. Depending on who else is trying to use the system, a single user (let's call him sbuser) may get anywhere from 0 to 342 job slots at any given time. The 0 case can happen if sbuser has been using the cluster for several hours and occupying a large share of the job slots, and then other people start to submit. The scheduler sees that sbuser has consumed a lot of computing time over the past hours while the two recently-started users have consumed none in the same period; it therefore assigns the "new" users a higher scheduling priority. For a period of up to a couple of hours it will preferentially start their jobs instead of sbuser's, in an attempt to reach an equal share of time between the three users.
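
The fair-share bookkeeping itself is internal to the scheduler and not directly visible through the client tools, but the standard Torque status commands give a quick impression of how the slots are divided at any moment:

qstat -Q     # per-queue summary of queued and running jobs
qstat -a     # all jobs, one line each, including the owning user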