赵晓丹, 徐燕
ZHAO Xiao-dan, XU Yan
Abstract: This paper presents a comparative study of receiver-side spam-processing techniques, organized around three main steps: preprocessing, feature selection, and classification. The feature-selection methods compared include document frequency (DF), information gain (IG), and odds ratio (ODD). The paper describes in detail knowledge gain, a feature-selection method based on rough set theory, and verifies experimentally that it performs strongly on accuracy and other metrics. Mainstream classifiers include k-nearest neighbors (KNN), Bayes, and SVM; the experiments also show that linear classifiers perform well on spam classification.
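To make the feature-selection scores named in the abstract concrete, the following is a minimal sketch of document frequency, information gain, and a log odds ratio for a single term. It is not the paper's implementation: the function names, the toy corpus, the binary term-presence model, and the add-one smoothing in the odds ratio are all assumptions made for illustration. Knowledge gain is not sketched because the abstract does not define it.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def term_scores(docs, labels, term):
    """Score one term on a labeled corpus (hypothetical helper).

    docs   : list of token sets, one per message
    labels : list of "spam" / "ham", aligned with docs
    Returns (document frequency, information gain, log odds ratio).
    """
    n = len(docs)
    n_spam = sum(1 for y in labels if y == "spam")
    n_ham = n - n_spam
    spam_with = sum(1 for d, y in zip(docs, labels) if term in d and y == "spam")
    ham_with = sum(1 for d, y in zip(docs, labels) if term in d and y == "ham")

    # Document frequency: number of messages containing the term.
    df = spam_with + ham_with

    # Information gain: H(class) - H(class | term present or absent).
    h_class = entropy([n_spam / n, n_ham / n])
    h_cond = 0.0
    splits = ((spam_with, ham_with), (n_spam - spam_with, n_ham - ham_with))
    for s, h in splits:
        total = s + h
        if total:
            h_cond += total / n * entropy([s / total, h / total])
    ig = h_class - h_cond

    # Log odds ratio, with add-one smoothing (an assumption) to avoid
    # division by zero when a term never occurs in one class.
    p_spam = (spam_with + 1) / (n_spam + 2)
    p_ham = (ham_with + 1) / (n_ham + 2)
    odds = math.log((p_spam * (1 - p_ham)) / ((1 - p_spam) * p_ham))
    return df, ig, odds

# Toy corpus, invented for illustration only.
docs = [
    {"free", "money", "now"},
    {"free", "offer"},
    {"meeting", "tomorrow"},
    {"project", "meeting", "free"},
]
labels = ["spam", "spam", "ham", "ham"]

for term in ("free", "meeting"):
    df, ig, odds = term_scores(docs, labels, term)
    print(f"{term}: DF={df}, IG={ig:.3f}, log-odds={odds:.3f}")
```

In this toy corpus "meeting" separates the two classes perfectly and so receives the higher information gain, while the sign of the log odds ratio indicates which class a term favors (positive for spam, negative for ham).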
Cite this article: 垃圾邮件分类技术对比研究 (Comparative Study of Spam Classification Techniques) [J].
Link to this article: http://netinfo-security.org/CN/
http://netinfo-security.org/CN/Y2014/V14/I2/75