Netinfo Security (信息网络安全), 2023, Vol. 23, Issue (10): 77-82. doi: 10.3969/j.issn.1671-1122.2023.10.011

• Selected Papers •

Detection and Identification Model of Gambling Websites Based on Multi-Modal Data

ZHAO Xinhe (1,2), XIE Yongheng (3,4; corresponding author), WAN Yueliang (3,4), WANG Jinmiao (3,4)

  1. School of Information Networking Security, People’s Public Security University of China, Beijing 100038, China
  2. Zhoucun Branch, Zibo City Public Security Bureau, Zibo 255300, China
  3. The Third Research Institute of Ministry of Public Security, Shanghai 200031, China
  4. Run Technologies Co., Ltd. Beijing, Beijing 100192, China
  • Received: 2023-06-26 Online: 2023-10-10 Published: 2023-10-11
  • Corresponding author: XIE Yongheng, E-mail: yongheng@bjrun.com
  • Funding:
    National Key Research and Development Program of China (2021YFB3101401)

Abstract:

This paper proposes a model for detecting and identifying gambling websites based on multi-modal data. First, a BERT-based feature extraction model is built for text features and a VGG19-based feature extraction model is built for image features. Second, the two sets of features are fused and the loss function is changed to improve classification performance. Finally, the model is validated on self-built datasets with positive-to-negative sample ratios of 1:5, 1:10, and 1:20. The experimental results indicate that the more pronounced the imbalance between positive and negative samples, the greater the model's advantage, and that it detects and identifies gambling websites efficiently.

Key words: multi-modal, gambling website, feature extraction
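
The abstract names two ingredients, feature fusion and a changed loss function, without giving details. The sketch below illustrates one common way to combine them and is not the authors' implementation. It assumes fusion by simple concatenation and a class-weighted binary cross-entropy, both chosen here as stand-ins; the function names, the logistic head, and the `pos_weight` heuristic (negatives divided by positives, so 5, 10, or 20 for the three datasets) are all hypothetical. Short toy vectors take the place of the BERT and VGG19 outputs.

```python
import math


def fuse_features(text_vec, image_vec):
    """Fuse two modalities by concatenation (assumed fusion strategy)."""
    return list(text_vec) + list(image_vec)


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(fused, weights, bias):
    """Hypothetical logistic head over the fused feature vector."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    return sigmoid(sum(w * x for w, x in zip(weights, fused)) + bias)


def weighted_bce(p, label, pos_weight):
    """Binary cross-entropy with the positive (gambling) class up-weighted.

    This is an assumed stand-in for the paper's modified loss, which the
    abstract does not specify.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    if label == 1:
        return -pos_weight * math.log(p)
    return -math.log(1.0 - p)


# Toy stand-ins for a text embedding and an image embedding.
text_vec = [0.2, -0.1]
image_vec = [0.5, 0.0, 0.3]
fused = fuse_features(text_vec, image_vec)

# Untrained head: all-zero weights give a probability of exactly 0.5.
p = predict_proba(fused, [0.0] * len(fused), 0.0)

# For a 1:10 dataset, a missed positive costs ten times a missed negative.
loss_pos = weighted_bce(p, 1, pos_weight=10.0)
loss_neg = weighted_bce(p, 0, pos_weight=10.0)
```

Up-weighting the rare class is one standard response to the imbalance the abstract highlights: as the ratio moves from 1:5 to 1:20, an unweighted loss is increasingly dominated by negatives, so a reweighted loss has more room to help.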

CLC number: