Netinfo Security (信息网络安全), 2015, Vol. 15, Issue 7: 84-89. doi: 10.3969/j.issn.1671-1122.2015.07.013


Research on the Popular Microblogging Classification Based on User Clustering

ZHANG Shi-hao, GU Yi-jun, ZHANG Jun-hao

  1. School of Cybersecurity, People's Public Security University of China, Beijing 102623, China
  • Received: 2015-05-10 Online: 2015-07-01 Published: 2015-07-28
  • About the authors:

    ZHANG Shi-hao (born 1992), male, from Shanxi, master's student, main research interests: network security and data mining; GU Yi-jun (born 1968), male, from Jiangsu, associate professor, Ph.D., main research interests: network security and data mining; ZHANG Jun-hao (born 1991), male, from Henan, master's student, main research interests: network security and data mining.

  • Supported by:
    Key Research Program of the Ministry of Public Security [2011ZDYJGADX016]

Abstract:

Building on existing research on microblog classification, this paper proposes a method that classifies popular microblogs by clustering the users who reposted them, so that the classification results are of greater practical value in public security work. Two clustering algorithms are considered: the well-established K-means algorithm and X-means, an improved variant of it. X-means uses the Bayesian information criterion (BIC) as its measure of similarity between classes, and the user no longer has to specify the number of clusters, only a range for it. Through this mechanism, X-means improves the accuracy and rigor of the clustering. A comparative analysis of the experimental results shows that the clusters produced by X-means fit the data better than those produced by K-means, so X-means is used for user clustering in the microblog classification study. The paper also describes how users cluster under different kinds of microblogs and proposes response strategies for network security authorities, tailored to each kind of microblog.

Key words: user clustering, popular microblogging, classification
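
The idea of giving only a range for the number of clusters and letting BIC pick within it can be illustrated with a small sketch. This is not the paper's implementation and not full X-means: instead of X-means' recursive splitting of centroids, it simply runs plain K-means for every k in the range and keeps the k with the highest BIC. The BIC formula (a spherical Gaussian model with one shared variance), the farthest-first seeding, the function names and the two-dimensional toy "user feature" points are all assumptions made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with deterministic farthest-first seeding (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


def bic(points, centers, labels):
    """BIC (higher is better) under an assumed spherical Gaussian model."""
    n, d, k = len(points), len(points[0]), len(centers)
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    if n <= k or sse <= 0.0:
        return float("-inf")  # degenerate fit; not scored in this sketch
    var = sse / (d * (n - k))  # shared per-dimension variance estimate
    sizes = [labels.count(j) for j in range(k)]
    log_lik = sum(m * math.log(m / n) for m in sizes if m > 0)
    log_lik -= 0.5 * n * d * math.log(2 * math.pi * var)
    log_lik -= 0.5 * d * (n - k)
    n_params = (k - 1) + k * d + 1  # mixing weights, centers, variance
    return log_lik - 0.5 * n_params * math.log(n)


def choose_k(points, k_min, k_max):
    """Score every k in [k_min, k_max] and return the best k with all scores."""
    scores = {}
    for k in range(k_min, k_max + 1):
        centers, labels = kmeans(points, k)
        scores[k] = bic(points, centers, labels)
    return max(scores, key=scores.get), scores


# Hypothetical toy data: three tight groups of "users" in a 2-D feature space.
offsets = [(dx, dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]
group_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
users = [(cx + dx, cy + dy) for cx, cy in group_centers for dx, dy in offsets]

best_k, scores = choose_k(users, 1, 5)
print(best_k, scores)
```

The caller supplies only the range (here 1 to 5), which mirrors the usage described in the abstract. A real X-means run would reach its answer by locally splitting clusters and testing each split with BIC rather than by this exhaustive scan.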

CLC number: