面向GPU异构集群的自学习负载均衡调度算法Self-learning load-balancing scheduling algorithm based on GPU heterogeneous cluster
刘惠,王继刚,葛铮铮,顾群,陈倩,杜军朝
摘要(Abstract):
由于GPU的高性能计算能力,越来越多地被用于集群系统中,但同时也给集群带来节点级的异构问题,使原来适用于同构集群的调度算法在异构集群中性能大大降低。为使异构节点间的负载均衡,降低总的作业执行时间,提出了一个面向GPU异构集群的自学习负载均衡调度算法。首先对Torque调度器进行扩展,使其支持GPU作业调度,然后将提出的自学习调度算法在Rocks操作系统及Torque调度器软件中实现。真实物理集群上的实验结果表明,扩展后的Torque调度器很好地支持GPU任务的调度,自学习调度算法较原来的Torque调度算法能达到更好的负载均衡。
关键词(KeyWords): 异构集群;GPU;自学习调度算法;负载均衡
基金项目(Foundation): 国家自然科学基金项目(编号:61100075;61272456);; 高等院校基本科研业务费项目(编号:K5051323005,BDY041409)
作者(Author): 刘惠,王继刚,葛铮铮,顾群,陈倩,杜军朝
参考文献(References):
- [1]Wu Kehe,Chen Long,Ye Shichao,et al.A load balancing algorithm based on the variation trend of entropy in homogeneous cluster[J].International Journal of Grid and Distributed Computing,2014,7(2):11-20.
- [2]Patil S,Gopal A.Cluster performance evaluation using load balancing algorithm[C].International Conference on Information Communication and Embedded Systems.Piscataway:IEEE,2013:104-108
- [3]刘伟,尹行,段玉光,等.同构DVS集群中基于自适应阈值的并行任务节能调度算法[J].计算机学报,2013,36(2):393-407.LIU Wei,YIN-Hang,DUAN Yu-Guang,et al.Adaptive threshold-based energy-efficient scheduling algorithm for parallel tasks on homogeneous DVS-enabled clusters[J].Chinese Journal of Computers,2013,36(2):393-407.
- [4]Liu Wei,Li Hongfeng,Shi Feiyan.Energy-efficient task clustering scheduling on homogeneous clusters[C].Proceedings 201011th International Conference on Parallel and Distributed Computing,Applications and Technologies(PDCAT 2010).Los Alamitos:IEEE Computer Society,2010:382-385.
- [5]刘莉,姜明华.异构集群下的任务调度算法研究[J].计算机应用研究,2014,31(1):80-84.LIU Li,JIANG Ming-hua.Research of task scheduling algorithm on heterogeneous cluster[J].Application Research of Computers,2014,31(1):80-84.
- [6]Terzopoulos G,Karatza H.Power-aware load balancing in heterogeneous clusters[C].Proceedings of the 2013 International Symposium on Performance Evaluation of Computer and Telecommunication Systems.Toronto:IEEE,2013:148-154.
- [7]Ye Bin,Dong Xiaoshe,Zheng Pengfei,et al.A delay scheduling algorithm based on history time in heterogeneous environments[C].China Grid Annual Conference.Piscataway:IEEE,2013:86-91.
- [8]Gregg C,Boyer M,Hazelwood K,et al.Dynamic heterogeneous scheduling decisions using historical runtime data[C/OL].[2014-05-03].http://www.cs.virginia.edu/~skadron/Papers/gregg_a4mmc11.pdf.
- [9]史琰,刘增基,盛敏.一种保证负载均衡的网络资源分配算法[J].西安电子科技大学学报,2005,32(6):885-889.SHI Yan,LIU Zeng-ji,SHENG Min.A novel network resource allocation algorithm with load balance guarantees[J].Journal of Xidian University,2005,32(6):885-889.
- [10]Zhang Keliang,Wu Baifeng.Task scheduling for GPU Heterogeneous cluster[C].Cluster Computing Workshops(Cluster Workshops),2012 IEEE International Conference.Beijing:IEEE,2012:161-169.
- [11]Payli RU,Erciyes K,Dagdeviren O.Cluster-based loadbalancing algorithms for grids[J].International Journal of Computer Networks&Communications(IJCNC),2011,3(5):253-269.
- [12]Youn Cand,Chung L.An efficient load balancing algorithm for clustersystem[C].IFIP International Conference on Network and Parallel Computing.Berlin:Springer Berlin Heidelberg,2005:176-179.
- [13]Terzopoulos G,Karatza H.Power-aware Load Balancing in Heterogeneous Clusters[C].Performance Evaluation of Computer and Telecommunication Systems(SPECTS).Toronto:IEEE,2013:148-154.
- [14]Kindratenko V V,Enos J J,Guochun Shi,et al.GPU clusters for high-performance computing[C].Workshop on Parallel Programming on Accelerator Clusters(PPAC).New Orleans:IEEE,2009:1-8.
- [15]Showerman M,Enos J,Steffen C,et al.A power-efficient GPU cluster architecture for scientificcomputing[J].Computing in Science Engineering,2011,13(2):83-87.