Netinfo Security ›› 2017, Vol. 17 ›› Issue (2): 12-21. doi: 10.3969/j.issn.1671-1122.2017.02.003

• Original Article •

Research on User Customized Topic Web Crawler for Specialized Information Acquisition Technology

Limin XUE1, Qi WU1,2, Jun LI1

  1. Department of Information, Naval Command College, Nanjing, Jiangsu 211800, China
  2. Navy Unit 92853, Xingcheng, Liaoning 125106, China
  • Received: 2016-11-28 Online: 2017-02-20 Published: 2020-05-12

Abstract:

In the era of big data, the Internet has become an important battlefield for intelligence collection in every field. Given the explosive growth of online information resources, quickly and efficiently screening out the required information is a pressing practical problem. An information screening mechanism placed between mass data and intelligence personnel, tailored to the needs of specific tasks, can greatly improve efficiency. To improve the accuracy of the collected information, this paper studies user customized topic Web crawler technology for information acquisition. To address the difficulty of information screening in the big data era, the user's interest preferences are integrated into the crawling process of the topic Web crawler, which makes information screening more effective. Experimental results show that the method improves precision.

Key words: big data, topic Web crawler, PageRank algorithm, behavior analysis, user customized
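
Illustrative sketch (not taken from the paper): the abstract does not state how the user's interest preference enters the crawl. One common way to combine the listed keywords is a personalized PageRank, in which the random-jump (teleport) distribution is weighted by per-page interest scores instead of being uniform. The function name `personalized_pagerank`, the `interest` scores, the damping value 0.85, and the toy link graph below are all assumptions made for illustration only.

```python
def personalized_pagerank(links, interest, damping=0.85, iterations=50):
    """Power-iteration PageRank with an interest-weighted teleport vector.

    links:    dict mapping a page to the list of pages it links to
    interest: dict mapping a page to a non-negative interest score
              (hypothetical stand-in for a user preference model)
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    total = sum(interest.get(p, 0.0) for p in pages)
    # Teleport distribution: proportional to interest, uniform if none is given.
    if total > 0:
        teleport = {p: interest.get(p, 0.0) / total for p in pages}
    else:
        teleport = {p: 1.0 / len(pages) for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # Split this page's rank evenly among its outbound links.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Dangling page: hand its rank back via the teleport distribution.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


# Toy graph (assumed): page "b" is the only page the user is interested in.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
biased = personalized_pagerank(links, {"b": 1.0})
uniform = personalized_pagerank(links, {})

# A crawler frontier could then visit pages in descending score order.
frontier = sorted(biased, key=biased.get, reverse=True)
```

In this sketch, pages the user cares about, and pages reachable from them, receive higher scores than under a uniform teleport vector, so a crawler that orders its frontier by score would reach them earlier.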
