Netinfo Security (信息网络安全), 2021, Vol. 21, Issue (9): 52-58. doi: 10.3969/j.issn.1671-1122.2021.09.008

• Selected Papers •

Image-based Phishing Email Detection Method and Implementation

YI Xiaoyang1,2, ZHANG Jian1,2

  1. College of Cyber Science, Nankai University, Tianjin 300350, China
  2. Tianjin Key Laboratory of Network and Data Security Technology, Tianjin 300350, China
  • Received: 2021-04-27 Online: 2021-09-10 Published: 2021-09-22
  • Contact: ZHANG Jian E-mail: zhang.jian@nankai.edu.cn
  • About the authors: YI Xiaoyang (born 1999), female, from Sichuan, master's student; main research interest: network security. ZHANG Jian (born 1968), male, from Tianjin, professor-level senior engineer, Ph.D.; main research interests: cloud security, network security, and system security.
  • Supported by:
    National Key Research and Development Program of China (2021YFF0307202); Tianjin Major Science and Technology Project on New-Generation Artificial Intelligence (19ZXZNGX00090); Tianjin Key Research and Development Program (20YFZCGX00680)

Abstract:

Phishing email attacks are an APT attack technique that exploits both weak security awareness and software vulnerabilities. They cause serious damage, and the number of incidents is steadily rising. Class imbalance between phishing and legitimate email samples has long been a difficult problem in cyber security, and analyzing features extracted from the email body risks infringing users' personal privacy. This paper proposes an image-based phishing email detection method. It uses the Simhash algorithm to convert email samples into images and then applies the LBP (local binary pattern) method to extract image features, which avoids direct analysis of the original email content and thereby protects user privacy. The paper also uses a DCGAN model to augment the phishing email dataset, which addresses the class imbalance problem and improves the accuracy of the Inception V3 model in classifying email images. Experiments show that the method detects phishing emails effectively, with a precision of up to 92.8%.
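The abstract does not say how the Simhash fingerprint is laid out as an image or which LBP variant is used, so the following is only a rough sketch of the first two stages under stated assumptions. The function names, the 64-bit fingerprint, the SHA-256 token hash, the 8×8 pixel layout, and the basic 3×3 LBP operator are all choices made for this illustration and are not taken from the paper. The DCGAN augmentation and Inception V3 classification stages are omitted.

```python
import hashlib


def simhash(tokens, bits=64):
    """Classic Simhash: every token hash votes +1/-1 per bit; keep the sign."""
    votes = [0] * bits
    for tok in tokens:
        digest = hashlib.sha256(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, v in enumerate(votes) if v > 0)


def fingerprint_to_image(fp, side=8):
    """Assumed layout: bit (r * side + c) becomes a 0/255 pixel at row r, col c."""
    return [
        [255 if (fp >> (r * side + c)) & 1 else 0 for c in range(side)]
        for r in range(side)
    ]


def lbp_histogram(img):
    """Basic 3x3 LBP on interior pixels, returned as a 256-bin histogram."""
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            code = 0
            for k, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets bit k.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << k
            hist[code] += 1
    return hist


# Hypothetical email text; only the hashed fingerprint reaches the later stages.
tokens = "please verify your account password now".split()
image = fingerprint_to_image(simhash(tokens))
features = lbp_histogram(image)
```

In this sketch the classifier would only ever see `image` or `features`, never the token list, which is the privacy argument the abstract makes. A practical system would likely need a larger image than 8×8 to carry enough signal for a convolutional network.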

Key words: phishing email, image, generative adversarial networks, convolutional neural network

CLC Number: