Netinfo Security (信息网络安全) ›› 2016, Vol. 16 ›› Issue (4): 61-68. doi: 10.3969/j.issn.1671-1122.2016.04.010

Research on the Method of Message Forwarding Path Extraction in the Analysis of Microblog Public Opinion

Hongfu ZHOU (周红福), Lu JIA (贾璐), Tingting ZHANG (张婷婷), Jian LI (李剑)

  1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2015-11-20 Online: 2016-04-20 Published: 2020-05-13
  • About the authors:

    Hongfu ZHOU (b. 1990), male, from Hebei, master's student, main research interest: intelligent network security; Lu JIA (b. 1990), female, from Henan, master's student, main research interest: intelligent network security; Tingting ZHANG (b. 1991), female, from Hebei, master's student, main research interest: intelligent network security; Jian LI (b. 1976), male, from Shaanxi, associate professor, Ph.D., main research interest: intelligent network security.

  • Funding:
    National Natural Science Foundation of China [61472048]; Beijing Natural Science Foundation [4152038]

Abstract:

Taking Sina Weibo as the object of study, this paper investigates how to extract the forwarding paths of microblog messages for microblog public opinion analysis, and identifies the users who play a key role in the forwarding process. The system collects data mainly with a web crawler framework that uses multiple accounts and distributed multithreading; it can bypass Sina Weibo's anti-crawler mechanism and is stable and efficient. Extracting a forwarding path involves several steps: crawling the web pages that list a microblog's reposts, extracting the forwarding information, preprocessing that information, and constructing the forwarding path tree. Once the forwarding path information has been extracted and organized into a tree structure, the forwarding graph of a microblog can be displayed in a web page. Finally, by improving the PageRank algorithm, the paper designs an algorithm for computing users' propagation influence, which can quickly evaluate each user's forwarding influence over the whole forwarding and propagation process of a microblog.

Key words: microblog public opinion, forwarding path, PageRank algorithm, forwarding influence
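
The abstract describes two technical steps: organizing reposts into a forwarding path tree, and scoring users with a PageRank-style algorithm. The following Python sketch illustrates both under stated assumptions and is not the authors' implementation. It assumes that each repost text carries its forwarding chain in the form `comment//@B:comment//@A:comment` (nearest forwarder first), and it uses plain PageRank with a damping factor of 0.85 rather than the improved variant proposed in the paper. The function names (`forwarding_path`, `build_tree`, `pagerank_influence`) and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Assumed repost format: "comment//@B:comment//@A:comment", nearest forwarder first.
MENTION = re.compile(r"//@([^:：\s]+)[:：]")


def forwarding_path(root_author, reposter, text):
    """Return the path root_author -> ... -> reposter encoded in one repost."""
    chain = MENTION.findall(text)  # nearest forwarder first
    return [root_author] + chain[::-1] + [reposter]


def build_tree(root_author, reposts):
    """Build the forwarding tree as a mapping parent -> list of children.

    `reposts` is an iterable of (reposter, text) pairs.
    """
    parent = {}  # child -> parent; the first path seen for a user wins
    for reposter, text in reposts:
        path = forwarding_path(root_author, reposter, text)
        for up, down in zip(path, path[1:]):
            if down != root_author:
                parent.setdefault(down, up)
    children = defaultdict(list)
    for child, up in parent.items():
        children[up].append(child)
    return dict(children)


def pagerank_influence(children, damping=0.85, iterations=50):
    """Plain PageRank on the tree, with rank flowing from reposter to source."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    parent = {c: p for p, cs in children.items() for c in cs}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # The original poster has no outgoing edge; spread its rank uniformly.
        dangling = sum(rank[u] for u in nodes if u not in parent)
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for child, up in parent.items():
            new[up] += damping * rank[child]
        rank = new
    return rank


if __name__ == "__main__":
    sample = [("B", "nice//@A:look"), ("A", "look"),
              ("C", "wow//@B:nice//@A:look"), ("D", "hm")]
    tree = build_tree("R", sample)
    for user, score in sorted(pagerank_influence(tree).items(),
                              key=lambda item: -item[1]):
        print(user, round(score, 3))
```

In this sketch every repost passes its score to the user it was forwarded from, so users whose reposts trigger further forwarding rank higher than users at the leaves of the tree, and the original poster ranks highest.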

CLC number: