Netinfo Security ›› 2014, Vol. 15 ›› Issue (11): 36-40.doi: 10.3969/j.issn.1671-1122.2014.11.006

• Original Article •

The Method of Classifying Network Public Opinion Text Based on Random Forest Algorithm

Jian WU1,2, Jing SHA3

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310058, China
  2. Zhejiang Province Public Security Department, Hangzhou, Zhejiang 310009, China
  3. The Third Research Institute of the Ministry of Public Security, Shanghai 200031, China
  • Received: 2014-09-18 Online: 2014-11-01 Published: 2020-05-18

Abstract:

With the rapid growth of public opinion information on the Internet, classifying public opinion text has become an important task. This paper first establishes a representation model for text documents and selects a feature selection function. It then analyzes the characteristics of the random forest (RF) algorithm as a classification learning algorithm and proposes determining document categories by constructing a series of decision trees. In the experiments, a large corpus of network media text was collected and divided into training and test sets. Comparative tests against common algorithms (including kNN, SMO, and SVM) produced quantitative performance data showing that the proposed RF-based algorithm achieves a better overall classification rate and more stable classification.
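The general idea of classifying documents by a vote over decision trees can be illustrated with a minimal sketch. This is not the authors' implementation: the binary term-presence representation, the Gini impurity criterion, the whitespace tokenizer, the toy corpus, and every function name below are assumptions made for illustration. The feature selection step is omitted because the abstract does not name the function used.

```python
import random
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real corpora need a proper segmenter)."""
    return set(text.lower().split())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(docs, labels, vocab, rng, n_feats):
    """Grow one unpruned tree; each internal node tests the presence of one term."""
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf
    best = None
    # Consider only a random subset of terms at each node.
    for term in rng.sample(vocab, min(n_feats, len(vocab))):
        left = [i for i, d in enumerate(docs) if term in d]
        right = [i for i, d in enumerate(docs) if term not in d]
        if not left or not right:
            continue  # this term does not split the node
        score = (len(left) * gini([labels[i] for i in left]) +
                 len(right) * gini([labels[i] for i in right])) / len(docs)
        if best is None or score < best[0]:
            best = (score, term, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    _, term, left, right = best
    return (term,
            build_tree([docs[i] for i in left], [labels[i] for i in left],
                       vocab, rng, n_feats),
            build_tree([docs[i] for i in right], [labels[i] for i in right],
                       vocab, rng, n_feats))

def predict_tree(tree, doc):
    """Walk from the root to a leaf; a leaf is a class label."""
    while isinstance(tree, tuple):
        term, present, absent = tree
        tree = present if term in doc else absent
    return tree

def train_forest(texts, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the documents."""
    rng = random.Random(seed)
    docs = [tokenize(t) for t in texts]
    vocab = sorted(set().union(*docs))
    n_feats = max(1, int(len(vocab) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(docs)) for _ in docs]  # sample with replacement
        forest.append(build_tree([docs[i] for i in idx],
                                 [labels[i] for i in idx],
                                 vocab, rng, n_feats))
    return forest

def classify(forest, text):
    """Return the class that receives the most tree votes."""
    doc = tokenize(text)
    votes = Counter(predict_tree(t, doc) for t in forest)
    return votes.most_common(1)[0][0]

# Toy corpus, invented for illustration only.
texts = ["team wins match final goal", "coach praises team after match",
         "stock market falls sharply", "bank raises interest rate market"]
labels = ["sports", "sports", "finance", "finance"]
forest = train_forest(texts, labels)
print(classify(forest, "team match tonight"))
```

Training each tree on a bootstrap sample and restricting each split to a random subset of terms is what decorrelates the trees, which is the usual explanation for the stability that random forests show relative to a single decision tree.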

Key words: network public opinion text, random forest algorithm, document detection tree, document classification

CLC Number: