Netinfo Security ›› 2019, Vol. 19 ›› Issue (4): 11-19. doi: 10.3969/j.issn.1671-1122.2019.04.002

• Technical Research •

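A Method for Improving the Performance of Spark on Container Cluster Based on Machine Learning

Chunqi TIAN1,2, Jing LI1,2, Wei WANG1,2,3, Liqing ZHANG1,2

  1. Department of Computer Science and Engineering, Tongji University, Shanghai 200092, China
  2. The Key Laboratory of Embedded System and Service Computing of Ministry of Education, Tongji University, Shanghai 200092, China
  3. Hubei Engineering Research Center for Education Information, Wuhan, Hubei 430062, China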
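  • Received: 2018-11-19 Online: 2019-04-10 Published: 2020-05-11
  • About the authors:

    Chunqi TIAN (born 1975), male, from Shaanxi, associate professor, Ph.D.; main research interests: cloud computing and wireless broadband networks. Jing LI (born 1993), female, from Sichuan, master's student; main research interests: cloud computing and big data. Wei WANG (born 1979), male, from Hubei, associate professor, Ph.D.; main research interests: cloud computing, big data, and large-scale online learning systems. Liqing ZHANG (born 1994), male, from Jiangsu, master's student; main research interests: cloud computing and big data.

  • Funding:
    National Natural Science Foundation of China [61672384, 61772372]; Fundamental Research Funds for the Central Universities [0800219373]; Key Project of the Open Fund of the Hubei Engineering Research Center for Education Information [201701]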

Abstract:

Spark is now widely used, and a well-chosen parameter configuration can substantially improve the execution efficiency of Spark jobs. Parameter tuning for Spark on virtual machine clusters has already been studied in depth. In recent years, containers have emerged as a cloud computing infrastructure and are increasingly deployed in service clusters, so parameter tuning for Spark on container clusters also merits study. This paper studies the parameter configuration of Spark on a Docker container cluster and proposes a new parameter tuning method, ContainerOpt, which uses machine learning to learn and predict job performance under different parameter combinations and introduces an automatic node scaling mechanism so that jobs with large inputs can achieve better performance. To balance job execution time against resource occupation, the paper also proposes a performance representation model determined jointly by time and resources, replacing the traditional model based on execution time alone. Experimental results show that, compared with the default configuration, the proposed tuning method improves execution efficiency by 50%.

Key words: cloud computing, Spark, Docker, machine learning, parameter tuning
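
The abstract describes a performance model determined jointly by time and resources but does not give its formula. The following is only a minimal sketch of how such a trade-off could be expressed, not the paper's model: the weighted form, the weight `alpha`, the `SparkConfig` fields, and the `toy_predict_time` function (a stand-in for a trained predictor) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class SparkConfig:
    """Hypothetical candidate configuration; the fields are illustrative."""
    executor_cores: int
    num_executors: int

    @property
    def total_cores(self) -> int:
        return self.executor_cores * self.num_executors


def cost(time_s: float, cfg: SparkConfig,
         base_time_s: float, base_cfg: SparkConfig,
         alpha: float) -> float:
    """Assumed cost (lower is better): time and cores, each relative to a baseline.

    alpha = 1.0 recovers a time-only model; smaller alpha penalizes resource use.
    """
    rel_time = time_s / base_time_s
    rel_cores = cfg.total_cores / base_cfg.total_cores
    return alpha * rel_time + (1.0 - alpha) * rel_cores


def pick_best(candidates: Sequence[SparkConfig],
              predict_time: Callable[[SparkConfig], float],
              base_cfg: SparkConfig, alpha: float) -> SparkConfig:
    """Return the candidate with the lowest predicted cost."""
    base_time = predict_time(base_cfg)
    return min(candidates,
               key=lambda c: cost(predict_time(c), c, base_time, base_cfg, alpha))


def toy_predict_time(cfg: SparkConfig) -> float:
    """Stand-in for a trained regressor: fixed overhead plus perfectly parallel work."""
    return 10.0 + 1200.0 / cfg.total_cores


small = SparkConfig(executor_cores=2, num_executors=2)    # 4 cores in total
medium = SparkConfig(executor_cores=4, num_executors=4)   # 16 cores in total
large = SparkConfig(executor_cores=4, num_executors=8)    # 32 cores in total
candidates = [small, medium, large]

fastest = pick_best(candidates, toy_predict_time, base_cfg=small, alpha=1.0)
balanced = pick_best(candidates, toy_predict_time, base_cfg=small, alpha=0.9)
print(fastest.total_cores, balanced.total_cores)
```

Under these toy numbers, the time-only setting (`alpha=1.0`) selects the 32-core candidate, while `alpha=0.9` selects the 16-core candidate, because the extra cores no longer buy enough speedup to justify their cost. In practice, `toy_predict_time` would be replaced by a model trained on measured job runs.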

CLC number: