5. Calcul Québec
Colosse
Sun Constellation System deployed in 2009
960 Diskless Compute Nodes
7680 Nehalem Cores
QDR InfiniBand only, full bisection bandwidth
1PB of Lustre Storage
6. Calcul Québec
Colosse : Current architecture
Everything is tied together with custom scripts
Accounting is extracted from Grid Engine and moved to a SQL database
Has been working well for 2 years...
Provisioning Scheduler + resource manager
8. Calcul Québec
Grid Engine Vs Moab
SGE
Used on only 1 large Compute Canada system
Unknown vendor commitment
Fractured community
Limited support available for large HPC deployment
9. Calcul Québec
Grid Engine Vs Moab
Moab
Already well known to our users
Single vendor
Commercial support
Strong community
Known to scale
11. Calcul Québec
Transition plan
Implement the existing scheduling policy
Train users and get them on the new scheduler
Train staff to work with Moab
Adapt/port our management scripts
Give control of the cluster to Moab/Torque
12. Calcul Québec
Scheduling policy
Priority based on share tree
Dedicated nodes only
Max 200 jobs per project
4 queues (maximum wallclock time, maximum cores) - nodes available
test (15m, 16 cores) - 2 nodes
short (24h, 256 cores) - all nodes
med (48h, 128 cores) - all nodes
long (7 days, ? cores) - 120 nodes
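The queue limits above can be written down as data and checked against a job request. This is a minimal sketch, not the site's actual configuration: the `Queue` class and `eligible_queues` function are hypothetical names, and treating the unstated core limit of the long queue ("?") as "no limit" is an assumption made only so the example runs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Queue:
    name: str
    max_walltime_s: int        # maximum wallclock time, in seconds
    max_cores: Optional[int]   # None: limit not stated on the slide ("?")
    nodes: Optional[int]       # None: queue may use all nodes


# Values copied from the slide; only the encoding is invented here.
QUEUES = [
    Queue("test", 15 * 60, 16, 2),
    Queue("short", 24 * 3600, 256, None),
    Queue("med", 48 * 3600, 128, None),
    Queue("long", 7 * 24 * 3600, None, 120),
]


def eligible_queues(walltime_s: int, cores: int) -> List[str]:
    """Return the names of queues whose limits admit the request."""
    names = []
    for q in QUEUES:
        fits_time = walltime_s <= q.max_walltime_s
        # Assumption: an unknown core limit is treated as "no limit".
        fits_cores = q.max_cores is None or cores <= q.max_cores
        if fits_time and fits_cores:
            names.append(q.name)
    return names
```

For example, a 30-hour job on 128 cores fits only the med and long queues, while any request longer than 7 days fits none, which matches the "no exception on maximum wallclock times" rule.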
13. Calcul Québec
Scheduling policy - exceptions
Analysts use override tickets on users’ jobs
Users can qualify for more cores
No exception on maximum wallclock times
BLCR is used to allow checkpoint/restart of serial jobs
14. Calcul Québec
Share Tree
80% of system reserved for special allocations
20% for groups without an allocation
Share tree (100%)
    Project1 (20%)
    Project2 (15%)
    Project9 (5%)
    ...
    Project10 (0.1%)
    Project11 (0.1%)
    Project13 (0.1%)
    ...
15. Calcul Québec
Share Tree
80% of system reserved for special allocations
20% for groups without an allocation
Share tree (100%)
    Special allocations (80%)
        Project1 (20%)
        Project2 (15%)
        Project9 (5%)
        ...
    Default allocations (20%)
        Project10 (0.1%)
        Project11 (0.1%)
        Project13 (0.1%)
        ...
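The arithmetic behind the share tree can be sketched as follows. This assumes the project percentages are fractions of the whole system (so the special-allocation projects together add up to the 80% branch), and it uses the 7680-core figure from the hardware slide. The names `share_within_branch` and `average_cores` are hypothetical. A share tree drives job priority rather than a hard partition, so the core counts are long-run averages at best.

```python
TOTAL_CORES = 7680  # from the Colosse hardware slide

# Branch shares, in percent of the whole system.
BRANCH_PCT = {"special": 80.0, "default": 20.0}

# Only the projects named on the slide; the elided ones ("...") are left out.
# Assumption: each percentage is a share of the whole system.
PROJECT_PCT = {
    "Project1": ("special", 20.0),
    "Project2": ("special", 15.0),
    "Project9": ("special", 5.0),
    "Project10": ("default", 0.1),
    "Project11": ("default", 0.1),
    "Project13": ("default", 0.1),
}


def share_within_branch(project: str) -> float:
    """Fraction of its own branch that a project holds."""
    branch, pct = PROJECT_PCT[project]
    return pct / BRANCH_PCT[branch]


def average_cores(project: str) -> float:
    """Long-run average core count implied by the project's share."""
    _, pct = PROJECT_PCT[project]
    return TOTAL_CORES * pct / 100.0
```

Under these assumptions Project1 holds a quarter of the special-allocation branch, while a default project at 0.1% is entitled to fewer than 8 cores on average.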
16. Calcul Québec
Users’ transition
Much easier than expected for users
Submit files are very similar
Job submission is easy (qsub becomes msub), but there are more commands to learn for monitoring jobs
Many questions about the difference between the Torque and Moab commands
SGE
#!/bin/bash
#$ ...
#$ ...
mpirun ...
Moab
#!/bin/bash
#PBS ...
#PBS ...
mpirun ...
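The visible difference between the two submit files is the directive prefix: `#$` for Grid Engine and `#PBS` for Torque/Moab. As a small sketch of how that could be used, for instance to check which scheduler a user's script targets, the hypothetical helper below collects the directive lines for each prefix. It does not translate options between the two schedulers, which use different option names.

```python
from typing import Dict, List


def directives(script: str) -> Dict[str, List[str]]:
    """Collect scheduler directives from a submit file, grouped by prefix."""
    found: Dict[str, List[str]] = {"SGE": [], "PBS": []}
    for line in script.splitlines():
        stripped = line.strip()
        if stripped.startswith("#PBS"):
            found["PBS"].append(stripped[len("#PBS"):].strip())
        elif stripped.startswith("#$"):
            found["SGE"].append(stripped[len("#$"):].strip())
    return found


# Illustrative submit files; the job name and options are made up.
sge_script = "#!/bin/bash\n#$ -cwd\n#$ -N demo\nmpirun ./a.out\n"
pbs_script = "#!/bin/bash\n#PBS -N demo\n#PBS -l walltime=00:15:00\nmpirun ./a.out\n"

sge_found = directives(sge_script)
pbs_found = directives(pbs_script)
```

A script that yields entries under both keys is likely a half-converted submit file and worth a closer look.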
17. Calcul Québec
Staff’s transition
Harder than expected
The workflow for handling users’ issues will need to be reviewed
how to find the original submit file
where each process is running and how much memory it uses
...
More internal documentation will be required
18. Calcul Québec
Management scripts
Old habits die hard....
Accounting/reporting
Accounting data is read from Moab event files
Queue status
Maintenance related scripts
monitoring, account creation, node maintenance
prolog/epilog
19. Calcul Québec
Deployment - progress report
Progressive deployment
Grid Engine and Moab will run side by side for a couple of months
20. Calcul Québec
Deployment - progress report
We built a different oneSIS image for Torque compute nodes
Also rebuilt the MPI implementation with Torque support
Rebooting a node in the Torque image switches it over to Moab
21. Calcul Québec
Deployment - progress report
10% of nodes controlled by Moab right now
Open to all users to test their workflow