Netinfo Security, 2022, Vol. 22, Issue (10): 1-7. doi: 10.3969/j.issn.1671-1122.2022.10.001

• Selected Papers •

A Multi-View and Multi-Task Learning Detection Method for Android Malware

TONG Xin1, JIN Bo1,2, WANG Jingya1, YANG Ying2

  1. School of Information Network Security, People’s Public Security University of China, Beijing 100038, China
  2. The Third Research Institute of the Ministry of Public Security, Shanghai 200031, China
  • Received: 2022-09-07 Online: 2022-10-10 Published: 2022-11-15
  • Contact: JIN Bo E-mail: jinbo@gass.cn
  • About the authors: TONG Xin (born 1995), male, from Henan, Ph.D. candidate; main research interests: cyberspace security and natural language processing | JIN Bo (born 1972), male, from Shanghai, researcher, Ph.D.; main research interest: cyberspace security | WANG Jingya (born 1966), female, from Beijing, professor, master's degree; main research interests: deep learning and artificial intelligence security | YANG Ying (born 1981), female, from Henan, associate researcher, Ph.D.; main research interests: cyberspace security and data security
  • Supported by:
    National Key R&D Program of China (2021YFB3101405); Key Program of the National Social Science Fund of China (20AZD114); National Key R&D Program of China (2022YFC3300800)

Abstract:

In recent years, malware targeting the Android platform has increased dramatically, posing great challenges to the anti-malware field. Although machine-learning-based detection methods offer a new direction for overcoming the shortcomings of traditional detection techniques, they are usually built on a single model or a combination of similar models. This makes it difficult to extract semantic information at different levels from multiple views, which ultimately limits detection performance. To address this problem, this paper proposes an Android malware detection model based on multi-view and multi-task learning. First, system call information is fed into a gradient boosting decision tree model to mine frequency-view features. The same system call information is then converted into a grayscale image and fed into learners based on a vision graph neural network and a convolutional neural network to learn co-occurrence and association features. Finally, a multi-task learning method based on hierarchical labels is introduced to train the model, achieving multi-view feature extraction and analysis for Android malware. Experimental results on a fine-grained public dataset from UNB show that the method generally outperforms traditional single-view detection methods, with good accuracy and reliability.

Key words: Android malware, multi-view learning, multi-task learning, graph neural network
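
As a rough illustration of two steps named in the abstract, the Python sketch below packs a vector of system call counts into a grayscale image and combines the losses of two label levels into one multi-task objective. The abstract does not specify the image layout, the normalization, the structure of the label hierarchy, or the loss weighting. The row-major packing, max-scaling to 0-255, the two-level (coarse category, fine family) labels, and the names `calls_to_grayscale`, `multitask_loss`, and `fine_weight` are all assumptions made for illustration, not the authors' implementation. The gradient boosting, vision graph neural network, and convolutional learners are omitted.

```python
import numpy as np


def calls_to_grayscale(counts, side):
    """Pack a system call count vector into a side x side uint8 image.

    Assumed layout: row-major order, zero padding, scaling by the max count.
    """
    counts = np.asarray(counts, dtype=np.float64)
    if counts.size > side * side:
        raise ValueError("count vector does not fit into the image")
    padded = np.zeros(side * side, dtype=np.float64)
    padded[:counts.size] = counts
    peak = padded.max()
    if peak > 0:
        # Scale so that the most frequent call maps to pixel value 255.
        padded = padded / peak * 255.0
    return np.round(padded).astype(np.uint8).reshape(side, side)


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with a stable log-softmax."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-log_probs[label])


def multitask_loss(coarse_logits, fine_logits, coarse_label, fine_label,
                   fine_weight=0.5):
    """Weighted sum of the coarse-level and fine-level losses.

    fine_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    coarse = softmax_cross_entropy(coarse_logits, coarse_label)
    fine = softmax_cross_entropy(fine_logits, fine_label)
    return coarse + fine_weight * fine


# Toy usage with made-up counts for six system calls.
image = calls_to_grayscale([0, 3, 12, 1, 0, 7], side=3)
loss = multitask_loss([2.0, 0.5], [0.1, 1.2, 0.3], coarse_label=0, fine_label=1)
```

In this sketch the image would feed the image-based learners, while the raw count vector would go to the gradient boosting model. Sharing one objective across both label levels is one common way to realize multi-task training with hierarchical labels.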

CLC Number: