Netinfo Security, 2017, Vol. 17, Issue (9): 111-114. doi: 10.3969/j.issn.1671-1122.2017.09.026


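Phishing Detection System Based on Classification Confidence and Website Features

Xu CHEN1, Yukun LI1, Huaping YUAN2, Wenyin LIU2

  1. School of Automation, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
  2. School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
  • Received: 2017-08-01 Online: 2017-09-20 Published: 2020-05-12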
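  • About the authors:

    Xu CHEN (born 1992), male, from Anhui, is a master's student whose research interests include network identity security, counterfeit website detection, and machine learning. Yukun LI (born 1993), male, from Guangdong, is a master's student whose research interests include network security and machine learning. Huaping YUAN (born 1993), male, from Jiangxi, is a master's student whose research interests include network security and data mining. Wenyin LIU (born 1966), male, from Jilin, is a professor with a doctorate whose research interests include network identity security, counterfeit website detection, machine vision, graphics recognition, and text mining.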

  • Funding:
    Guangdong Province Introduced Innovation Team Program [2014ZT05G157]


Abstract:

This paper develops an anti-phishing system to counter the growing number and severity of phishing attacks. Features are constructed from two sources, the URL and the web page content, and are used with the AdaBoost algorithm to train two phishing detection models. The system selects the appropriate model according to the status of the URL and interacts with the user as a browser plug-in. The classification confidence of the detection models is used to further improve system performance: a URL-based detection result is considered reliable when its classification confidence is above 0.95. Experiments on a real-world dataset show a missed alarm rate of 3.59%, a false alarm rate of 2.93%, and an accuracy of 96.75%, indicating that the system can effectively defend against phishing attacks.

Key words: phishing detection, machine learning, statistical analysis, classification confidence
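
The confidence rule described in the abstract can be illustrated with a minimal sketch. It assumes that each model returns a label together with a confidence score, and that the content-based model is consulted only when the URL-based result is not confident enough and page content is available. The abstract does not spell out how the two models are combined, so this fallback logic, the function `detect`, and the `Model` type are illustrative assumptions rather than the authors' implementation. Only the 0.95 threshold comes from the paper.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interface: a model maps its input to (is_phishing, confidence).
Model = Callable[[str], Tuple[bool, float]]

# Threshold stated in the abstract for trusting the URL-based result.
URL_CONFIDENCE_THRESHOLD = 0.95


def detect(url: str,
           page_html: Optional[str],
           url_model: Model,
           content_model: Model,
           threshold: float = URL_CONFIDENCE_THRESHOLD) -> Tuple[bool, str]:
    """Return (is_phishing, name of the model that made the decision)."""
    url_label, url_confidence = url_model(url)
    # Assumption: accept the URL-based result when its confidence is above
    # the threshold, or when no page content could be retrieved.
    if url_confidence > threshold or page_html is None:
        return url_label, "url"
    # Assumption: otherwise defer to the content-based model.
    content_label, _ = content_model(page_html)
    return content_label, "content"


if __name__ == "__main__":
    # Stub models standing in for the two trained AdaBoost classifiers.
    unsure_url_model = lambda u: (True, 0.60)
    stub_content_model = lambda html: (False, 0.80)
    print(detect("http://example.com/login", "<html></html>",
                 unsure_url_model, stub_content_model))
```

With the stub models above, the URL-based confidence of 0.60 is below the threshold, so the decision is taken by the content-based model.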

CLC number: