Netinfo Security (信息网络安全), 2016, Vol. 16, Issue (9): 40-44. doi: 10.3969/j.issn.1671-1122.2016.09.008


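Research on Preventing Real-World Hazards Based on Web Text Mining Algorithms (基于Web文本挖掘算法预防现实危害的研究)

WU Wei (吴威)

  1. Cyber Security Corps, Inner Mongolia Public Security Department, Hohhot, Inner Mongolia 010050
  • Received: 2016-07-25 Online: 2016-09-20 Published: 2020-05-13
  • About the author: WU Wei (born 1983), male, from Liaoning, holds a master's degree; his main research interest is the development of cyber security technology.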

Research on Real Hazard Prevention Using Web Text Mining Algorithms

Wei WU

  1. Cyber Police of Inner Mongolia Public Security Department, Hohhot, Inner Mongolia 010050, China
  • Received: 2016-07-25 Online: 2016-09-20 Published: 2020-05-13

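Abstract (translated from the Chinese):

With the rapid spread of the Internet, people have become accustomed to using it to communicate. Because information is exchanged quickly online while social feedback and social norms are largely absent, communication has become freer and more extreme, and the emotions expressed are more genuine. As a result, people's attention centers on the information itself and social rules are overlooked. Negative remarks posted online are often an expression of negative emotion, and once that emotion accumulates to a certain degree it is likely to turn into a real-world hazard. This article mainly introduces how to use Web text mining techniques and an EM algorithm based on the naive Bayes classifier to perform sentiment analysis on Web text data, dividing sentiment into positive, neutral, and negative, and how to categorize, analyze, and issue early warnings on negative information so as to prevent real-world hazards.

Keywords: Web text mining, sentiment analysis, real hazard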

Abstract:

With the rapid spread of the Internet, people have become accustomed to communicating online. Because online communication is fast and lacks social feedback and social norms, people express themselves more freely and more extremely, and the emotions they express are more genuine. As a result, people tend to focus on the information itself and ignore social rules. Negative remarks posted by Internet users are often an expression of negative emotion, and when that emotion accumulates to a certain degree it may well develop into a real-world hazard. This article introduces sentiment analysis of Web text data using Web text mining techniques and an EM algorithm based on the naive Bayes classifier, which divides sentiment into positive, neutral, and negative. Negative information is then categorized, analyzed, and flagged for early warning in order to prevent real-world hazards.

Key words: Web text mining, sentiment analysis, real hazard
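
The abstract names an EM algorithm built on a naive Bayes classifier but does not give its details. The sketch below is one common way to combine the two, a semi-supervised multinomial naive Bayes in which EM assigns soft labels to unlabeled documents. It is an illustration under stated assumptions and not the article's implementation: the function names (`m_step`, `e_step`, `train_em`, `classify`), the Laplace smoothing constant `alpha`, the fixed iteration count, and the toy pre-tokenized English documents are all hypothetical.

```python
import math
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def m_step(docs, resp, vocab, alpha=1.0):
    """Re-estimate class priors and word likelihoods from (soft) label weights."""
    prior, word_prob = {}, {}
    for c in LABELS:
        class_weight = sum(r[c] for r in resp)
        prior[c] = (class_weight + alpha) / (len(docs) + alpha * len(LABELS))
        counts = Counter()
        for doc, r in zip(docs, resp):
            for w in doc:
                counts[w] += r[c]  # fractional counts for soft labels
        denom = sum(counts.values()) + alpha * len(vocab)
        word_prob[c] = {w: (counts[w] + alpha) / denom for w in vocab}
    return prior, word_prob


def e_step(doc, prior, word_prob):
    """Posterior P(class | doc) under naive Bayes, computed in log space."""
    log_post = {}
    for c in LABELS:
        lp = math.log(prior[c])
        for w in doc:
            if w in word_prob[c]:  # words outside the vocabulary are skipped
                lp += math.log(word_prob[c][w])
        log_post[c] = lp
    top = max(log_post.values())
    z = sum(math.exp(v - top) for v in log_post.values())
    return {c: math.exp(v - top) / z for c, v in log_post.items()}


def train_em(labeled, unlabeled, iterations=10):
    """Initialize from labeled data, then alternate E- and M-steps."""
    docs_l = [d for d, _ in labeled]
    vocab = {w for d in docs_l + unlabeled for w in d}
    # Labeled documents keep hard (0/1) weights throughout (an assumption).
    hard = [{c: 1.0 if c == y else 0.0 for c in LABELS} for _, y in labeled]
    prior, word_prob = m_step(docs_l, hard, vocab)
    for _ in range(iterations):
        soft = [e_step(d, prior, word_prob) for d in unlabeled]
        prior, word_prob = m_step(docs_l + unlabeled, hard + soft, vocab)
    return prior, word_prob


def classify(doc, prior, word_prob):
    """Return the most probable sentiment label for a tokenized document."""
    post = e_step(doc, prior, word_prob)
    return max(post, key=post.get)


# Toy, pre-tokenized data (hypothetical; real Web text needs segmentation first).
labeled = [
    (["great", "happy", "love"], "positive"),
    (["meeting", "schedule", "today"], "neutral"),
    (["hate", "angry", "revenge"], "negative"),
]
unlabeled = [
    ["angry", "furious", "revenge"],
    ["furious", "hate"],
    ["love", "wonderful", "happy"],
    ["wonderful", "great"],
    ["schedule", "report", "today"],
    ["report", "meeting"],
]

prior, word_prob = train_em(labeled, unlabeled)
print(classify(["furious"], prior, word_prob))
```

The word "furious" never appears in a labeled document; the model can only associate it with a class through its co-occurrence with labeled words in the unlabeled texts, which is the reason for running EM over unlabeled data. In an early-warning setting, documents classified as negative would be passed on for categorization and review.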

CLC number: