Netinfo Security ›› 2017, Vol. 17 ›› Issue (10): 55-62. doi: 10.3969/j.issn.1671-1122.2017.10.009


A Cross Site Script Vulnerability Detection Technology Based on Sequential Minimum Optimization Algorithm

Nana HUANG1,2, Liang WAN1,2, Xuankun DENG1,2, Huifan YI1,2

  1. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  2. Institute of Computer Science, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2017-08-14 Online: 2017-10-10 Published: 2020-05-12
  • About the authors:
    Nana HUANG (born 1986), female, from Jiangsu, master's student; research interests: Web application security vulnerabilities and information security. Liang WAN (born 1974), male, from Guizhou, professor, Ph.D.; research interests: formal methods and information security. Xuankun DENG (born 1991), male, from Gansu, master's student; research interest: neural networks. Huifan YI (born 1993), male, from Guizhou, master's student; research interest: formal methods.
  • Supported by:
    Science Foundation of Guizhou Province (Qian Ke He J Zi [2011] No. 2328; Qian Ke He LH Zi [2014] No. 7634)

Abstract:

A cross-site scripting (XSS) attack occurs when an attacker uses a Web application to deliver malicious code to other end users. To address Web applications that use user-supplied data without validating or encoding it, this paper proposes SMO-RFE, a recursive feature elimination algorithm built on regular expression matching and sequential minimal optimization (SMO). First, the data are preprocessed, and a regular expression matching algorithm selects a representative feature data set for the training set. Second, the SMO-RFE feature selection algorithm selects the optimal features. Third, the attack keywords are ranked and combined as features. Finally, the occurrence frequency of the feature keywords and the weight proportion of each feature value are summarized. The more frequently attack keywords occur, the more likely it is that a vulnerability exists. Experiments show that, after the data set has been filtered by SMO-RFE, the SVM classifies the feature vectors with higher accuracy, which indicates that the algorithm can effectively detect XSS vulnerabilities.
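
The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The keyword patterns, the sample payloads, and the function names `extract_features`, `mean_diff_weights`, and `rfe_ranking` are all hypothetical. An SMO-trained SVM is replaced here by a much simpler stand-in scorer (the per-feature difference of class means), so that only the regular expression matching step and the recursive elimination loop are shown.

```python
import re
from typing import Callable, List

# Hypothetical attack-keyword patterns (illustrative only, not the paper's list).
PATTERNS = {
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "event_handler": re.compile(r"\bon(?:error|load|click|mouseover)\s*=", re.IGNORECASE),
    "js_uri": re.compile(r"javascript\s*:", re.IGNORECASE),
    "alert_call": re.compile(r"\balert\s*\(", re.IGNORECASE),
}


def extract_features(payload: str) -> List[float]:
    """Regex matching step: one keyword-occurrence count per pattern."""
    return [float(len(p.findall(payload))) for p in PATTERNS.values()]


def mean_diff_weights(X: List[List[float]], y: List[int]) -> List[float]:
    """Stand-in for an SMO-trained linear SVM (an assumption of this sketch):
    weight k is mean(feature k | attack) - mean(feature k | benign)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label != 1]
    return [
        sum(r[k] for r in pos) / len(pos) - sum(r[k] for r in neg) / len(neg)
        for k in range(len(X[0]))
    ]


def rfe_ranking(
    X: List[List[float]],
    y: List[int],
    names: List[str],
    train: Callable[[List[List[float]], List[int]], List[float]] = mean_diff_weights,
) -> List[str]:
    """Recursive feature elimination: retrain on the surviving features,
    drop the one with the smallest squared weight, and repeat.
    Returns names ordered from first eliminated to last survivor."""
    active = list(range(len(names)))
    eliminated = []
    while active:
        sub = [[row[k] for k in active] for row in X]
        w = train(sub, y)
        worst = min(range(len(active)), key=lambda i: w[i] ** 2)
        eliminated.append(names[active.pop(worst)])
    return eliminated


# Toy labelled payloads (hypothetical): 1 = attack, -1 = benign.
SAMPLES = [
    ('<script>alert(1)</script>', 1),
    ('<SCRIPT SRC=http://x.example/a.js></SCRIPT>', 1),
    ('<img src=x onerror=alert(1)>', 1),
    ('<a href="javascript:alert(1)">x</a>', 1),
    ('hello world', -1),
    ('<b>bold</b>', -1),
    ('bookmarklets start with javascript: in the URL bar', -1),
    ('search?q=alert+fatigue', -1),
]
X = [extract_features(text) for text, _ in SAMPLES]
y = [label for _, label in SAMPLES]

print(rfe_ranking(X, y, list(PATTERNS)))
# ['js_uri', 'event_handler', 'script_tag', 'alert_call']
```

On this toy data `js_uri` is eliminated first because it occurs as often in benign text as in attacks, so its weight is zero. With a real SVM the weights of the surviving features change after every elimination, which is why the loop retrains at each step; any trainer that returns one linear weight per feature can be passed as `train`.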

Key words: cross-site scripting attack, feature value, Web security vulnerabilities, SMO-RFE algorithm, information security

CLC number: