1
Resource Allocation Scheme
  • For now, plan only as far as PRAGMA9
  • Assume a total of 4 applications
    • Biogrid, QMMD, Savannah, NewApplication
  • If your compute nodes have private IP addresses only, exclude Biogrid (total 3)
  • Divide the nodes equally among the applications
  • Use ACLs (user access lists) to dedicate nodes to each application
  • Tell the Biogrid application driver the names of its dedicated nodes (for direct job submission to those nodes)
  • Until the NewApplication is ready to run, attach the ACLs of all the fault-tolerant applications to the nodes allocated to the NewApplication
  • See an illustrated example of the initial setup on the next slide
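
The allocation rules above can be sketched in a few lines of Python. This is only an illustration, not a site tool: the function name `allocate`, the node names `node-00` to `node-11`, and the cluster size of 12 are hypothetical, and treating QMMD and Savannah as the fault-tolerant applications is an assumption. Nodes left over when the count does not divide evenly are simply left unassigned here.

```python
def allocate(nodes, private_ip_only=False, new_app_ready=False):
    """Return {node: [ACL names]} following the equal-split scheme."""
    apps = ["Biogrid", "QMMD", "Savannah", "NewApplication"]
    # Assumption: these are the fault-tolerant applications.
    fault_tolerant = ["QMMD", "Savannah"]

    # Sites whose compute nodes have private IPs only drop Biogrid.
    if private_ip_only:
        apps.remove("Biogrid")

    share = len(nodes) // len(apps)  # equal share per application
    acls = {}
    for i, app in enumerate(apps):
        for node in nodes[i * share:(i + 1) * share]:
            if app == "NewApplication" and not new_app_ready:
                # Lend the idle nodes to the fault-tolerant applications.
                acls[node] = list(fault_tolerant)
            else:
                acls[node] = [app]
    return acls


# Hypothetical 12-node cluster.
nodes = [f"node-{i:02d}" for i in range(12)]
plan = allocate(nodes)
```

With 12 nodes and 4 applications each application receives 3 nodes, and the 3 nodes reserved for the NewApplication carry the fault-tolerant ACLs until it is ready.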
2
 
3
When a New Application is Ready
  • Remove QMMD and Savannah ACLs from rocks-62.q, rocks-63.q, rocks-64.q
  • Attach the new application's ACL to the 3 nodes (queues)
  • Kill any running jobs on these nodes (queues)
  • See the changed setup on the next slide
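
The ACL switch above could be scripted roughly as follows. This sketch assumes the scheduler is Sun Grid Engine, which the slides do not state, and that ACLs are attached through each queue's `user_lists` attribute. The ACL names `qmmd`, `savannah`, and `newapp` and the helper `switch_commands` are hypothetical. The function only builds the command strings so they can be reviewed before anyone runs them.

```python
def switch_commands(queues, old_acls, new_acl):
    """Build qconf commands that swap the ACLs on each queue."""
    cmds = []
    for queue in queues:
        # Remove the fault-tolerant applications' ACLs from the queue.
        for acl in old_acls:
            cmds.append(f"qconf -dattr queue user_lists {acl} {queue}")
        # Attach the new application's ACL to the same queue.
        cmds.append(f"qconf -aattr queue user_lists {new_acl} {queue}")
    return cmds


queues = ["rocks-62.q", "rocks-63.q", "rocks-64.q"]
commands = switch_commands(queues, ["qmmd", "savannah"], "newapp")
for cmd in commands:
    print(cmd)
```

Jobs still running on these queues would then be removed by the administrator, for example with `qdel`; that step is not automated here.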
4
 
5
Questions
  • I think this should work for all our current applications. If you see any problems, please raise them for discussion.
  • I think this can be implemented on our clusters (rocks-52.sdsc.edu and rocks-47.sdsc.edu). Can your site implement this? If not, is there an alternative setup that can accomplish the same effect?
  • All comments are welcome. Thanks! :)