Netinfo Security (信息网络安全) ›› 2024, Vol. 24 ›› Issue (7): 1088-1097. doi: 10.3969/j.issn.1671-1122.2024.07.010

• Theoretical Research •

A Multi-Page, Multi-Label Fingerprinting Method for Tor Websites

CAI Manchun, XI Rongkang, ZHU Yi, ZHAO Zhongbin

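  1. College of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China
  • Received: 2024-02-26 Online: 2024-07-10 Published: 2024-08-02
  • Corresponding author: CAI Manchun caimanchun@ppsuc.edu.cn
  • About the authors: CAI Manchun (born 1972), male, from Hebei, professor, Ph.D.; main research interests: cryptography and communication security | XI Rongkang (born 1997), male, from Henan, master's student; main research interest: anonymous communication | ZHU Yi (born 2000), male, from Shanghai, master's student; main research interests: network security and deep learning | ZHAO Zhongbin (born 1998), male, from Sichuan, master's student; main research interests: network security and machine learning.
  • Supported by:
    National Key R&D Program of China (2018YFC0823205); 2022 Fundamental Research Funds of People's Public Security University of China (2022JKF02009)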

A Fingerprint Identification Method of Multi-Page and Multi-Tag Targeting Tor Website

CAI Manchun, XI Rongkang, ZHU Yi, ZHAO Zhongbin

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  • Received: 2024-02-26 Online: 2024-07-10 Published: 2024-08-02

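Abstract:

The Tor anonymous communication system is often used by criminals for darknet crime, and Tor webpage fingerprinting provides a technical means for darknet supervision. To address the poor practicality of single-label Tor webpage fingerprinting in network supervision, this paper proposes a multi-page, multi-label Tor fingerprinting method. First, the standard particle swarm optimization algorithm and the K-nearest-neighbor algorithm are tuned and combined into an adaptive particle swarm optimized K-nearest-neighbor model, APSO-KNN, which segments consecutive multi-label webpages. Then a model combining a self-attention mechanism with a one-dimensional convolutional neural network identifies the content of each webpage segment. Finally, the APSO-KNN memory scoring mechanism selects the suboptimal split point of each webpage that failed identification, and the webpage is re-segmented. Experimental results show that APSO-KNN, which finds split points by particle search rather than exhaustive traversal, achieves 96.30% segmentation accuracy and is significantly more efficient than the traditional KNN algorithm. The deep learning model SA-1DCNN is far more robust to webpage segmentation error than machine learning models and reaches an identification accuracy of 96.1%.

Key words: the onion router, webpage fingerprint, particle swarm optimization algorithm, weighted K-nearest-neighbor algorithm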

Abstract:

The Tor anonymous communication system is often used by criminals to engage in darknet criminal activities, and Tor webpage fingerprint identification technology provides a technical means for darknet supervision. To address the poor practicality of single-label Tor website recognition in network supervision, this paper proposes a multi-page and multi-tag Tor fingerprint identification method. Firstly, standard particle swarm optimization (PSO) and K-nearest neighbor (KNN) are optimized and combined into a KNN model based on adaptive PSO (APSO-KNN), which is used to segment successive multi-tag websites. Then, a 1D CNN combined with a self-attention mechanism (SA-1DCNN) is used to classify the content of the website fragments. Finally, the APSO-KNN memory scoring mechanism is used to select the suboptimal segmentation point of each website that failed to be identified, so that it can be re-segmented. Experimental results show that APSO-KNN, which uses a particle search mechanism instead of exhaustive traversal to find the split point, achieves 96.3% segmentation accuracy, and its efficiency is significantly improved compared with the traditional KNN algorithm. The deep learning model SA-1DCNN is far more resistant to website segmentation error than machine learning models and achieves 96.1% identification accuracy.

Key words: the onion router, website fingerprint, particle swarm optimization, weighted K-nearest neighbor
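
The split-point search summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' APSO-KNN implementation: the abstract does not give the KNN scoring function or the adaptive parameter scheme, so `score` is a hypothetical callable supplied by the caller, and the linearly decaying inertia weight, the acceleration coefficients of 2.0, and the names `pso_split_search` and `ranked_candidates` are all assumptions made for illustration. The sketch shows only the two ideas the abstract names: particles sample candidate split positions instead of scoring every position, and a memory of scored candidates supplies a next-best split point when identification of a segment fails.

```python
import random


def pso_split_search(score, lo, hi, n_particles=8, n_iters=30, seed=0):
    """Search integer split positions in [lo, hi] for the highest score.

    `score` is a hypothetical stand-in for the KNN-based split score.
    Returns (best_position, memory); memory maps each evaluated
    position to its score.
    """
    rng = random.Random(seed)
    memory = {}

    def evaluate(pos):
        # Score each distinct position once and remember the result.
        if pos not in memory:
            memory[pos] = score(pos)
        return memory[pos]

    positions = [rng.uniform(lo, hi) for _ in range(n_particles)]
    velocities = [0.0] * n_particles
    personal_best = list(positions)
    personal_score = [evaluate(round(p)) for p in positions]
    g = max(range(n_particles), key=lambda i: personal_score[i])
    global_best, global_score = personal_best[g], personal_score[g]

    for it in range(n_iters):
        # Assumed schedule: inertia weight decays linearly from 0.9 to 0.4.
        w = 0.9 - 0.5 * it / max(1, n_iters - 1)
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            velocities[i] = (w * velocities[i]
                             + 2.0 * r1 * (personal_best[i] - positions[i])
                             + 2.0 * r2 * (global_best - positions[i]))
            # Keep the particle inside the search interval.
            positions[i] = min(hi, max(lo, positions[i] + velocities[i]))
            s = evaluate(round(positions[i]))
            if s > personal_score[i]:
                personal_best[i], personal_score[i] = positions[i], s
                if s > global_score:
                    global_best, global_score = positions[i], s
    return round(global_best), memory


def ranked_candidates(memory):
    """Evaluated positions ordered from best to worst score."""
    return sorted(memory, key=memory.get, reverse=True)


def toy_score(pos):
    # Hypothetical score that peaks at position 137.
    return -abs(pos - 137)


best, memory = pso_split_search(toy_score, 0, 1000)
# Next-best split points to try if the first segmentation fails.
fallbacks = ranked_candidates(memory)[1:4]
```

With the default settings the search calls `score` at most 248 times (8 particles, one initial evaluation plus 30 iterations each), whereas exhaustive traversal of the same interval would call it 1001 times. Whether the search lands exactly on the true optimum depends on the score landscape and the random seed.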

CLC Number: