Netinfo Security ›› 2020, Vol. 20 ›› Issue (12): 47-53. doi: 10.3969/j.issn.1671-1122.2020.12.007

• Technical Research •

基于语义的网络交易论坛虚拟身份同一性识别

张璇1,2, 袁得嵛1, 金波3

  1. 中国人民公安大学信息网络安全学院,北京 100038
  2. 山东警察学院侦查系,济南 250014
  3. 公安部第三研究所,上海 201204
  • 收稿日期:2020-09-19 出版日期:2020-12-10 发布日期:2021-01-12
  • 通讯作者: 金波 E-mail:jinbo@stars.org.cn
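  • About the authors: ZHANG Xuan (born 1980), female, from Shandong, doctoral candidate; research interests: network security, electronic data forensics, and cybercrime investigation | YUAN Deyu (born 1986), male, from Hebei, lecturer, Ph.D.; research interests: network security and cybercrime investigation | JIN Bo (born 1972), male, from Zhejiang, research fellow, Ph.D.; research interests: network security and big data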
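  • Supported by:
    National Natural Science Foundation of China (61771072); Cultivation Project of the Liaoning Collaborative Innovation Center for Network Security Law Enforcement (WXZX-201912016); Science and Technology Program of Shandong Police College (YKJYB201706)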

Virtual Identity Identification Based on Semantic for Network Trading Platform

ZHANG Xuan1,2, YUAN Deyu1, JIN Bo3

  1. School of Information Network Security, People's Public Security University of China, Beijing 100038, China
  2. Department of Investigation, Shandong Police College, Jinan 250014, China
  3. The Third Research Institute of the Ministry of Public Security, Shanghai 201204, China
  • Received:2020-09-19 Online:2020-12-10 Published:2021-01-12
  • Contact: JIN Bo E-mail:jinbo@stars.org.cn

摘要:

近年来,IT技术催生电子商务繁荣发展,网络交易深度融入到了人们的生产生活中。网络交易论坛作为重要的交易载体,其多样化和差异化也促使交易双方在不同平台注册账号,以多个虚拟身份进行商品买卖。由于不同交易论坛之间信息不共享,虚拟身份缺乏有效关联,无法进行数据汇聚,难以通过传统数据关联比对的方法识别用户,迫切需要新的技术方法对网络交易平台参与者虚拟身份进行深入分析,形成准确的身份映射。文章利用多个网络交易论坛数据,训练生成基于Doc2Vec语义相似度分析的虚拟身份同一性识别无监督模型,对出售商品的描述文本进行相似性计算,挖掘隐藏卖家同一虚拟身份,进而为用户画像、风控等技术场景提供支持。

关键词: Doc2Vec, 虚拟身份识别, 语义相似性

Abstract:

In recent years, advances in information technology have driven the rapid growth of e-commerce, and online trading has become deeply integrated into everyday production and life. Online trading forums are an important venue for these transactions, and their diversity and differentiation lead buyers and sellers to register accounts on different platforms and to trade goods under multiple virtual identities. Because trading forums do not share information with one another, virtual identities lack effective links, data cannot be aggregated, and users are difficult to identify through traditional data association and comparison. New technical methods are therefore urgently needed to analyze the virtual identities of participants in online trading platforms in depth and to form accurate identity mappings. Using data from multiple online trading forums, this paper trains an unsupervised model, based on Doc2Vec semantic similarity analysis, that recognizes when different virtual identities belong to the same user. The model computes the similarity between the description texts of goods offered for sale in order to uncover hidden sellers who share the same identity, and thereby supports technical applications such as user profiling, recommendation, and risk control.

Key words: Doc2Vec, virtual identity profiling, semantic similarity
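
The matching step described in the abstract (comparing listing descriptions across forums and linking accounts whose texts are semantically close) can be illustrated with a minimal sketch. It is not the authors' implementation: to stay dependency-free, a simple term-frequency vector stands in for the Doc2Vec embedding, and the function names, the toy listings, and the 0.6 threshold are all assumptions made for illustration. In a real pipeline, the `embed` function would presumably be replaced by vectors inferred from a trained Doc2Vec model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: a term-frequency vector (NOT Doc2Vec)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_sellers(forum_a, forum_b, threshold=0.6):
    """Pair each seller in forum_a with its most similar seller in forum_b.

    forum_a, forum_b: dict mapping account name -> listing description text.
    Returns (account_a, account_b, similarity) tuples at or above threshold.
    The threshold value is an illustrative assumption, not from the paper.
    """
    vecs_b = {name: embed(text) for name, text in forum_b.items()}
    matches = []
    for name_a, text_a in forum_a.items():
        vec_a = embed(text_a)
        best_name, best_sim = None, 0.0
        for name_b, vec_b in vecs_b.items():
            sim = cosine(vec_a, vec_b)
            if sim > best_sim:
                best_name, best_sim = name_b, sim
        if best_name is not None and best_sim >= threshold:
            matches.append((name_a, best_name, round(best_sim, 3)))
    return matches


# Hypothetical toy data: account names and listing texts are invented.
forum_a = {
    "seller_x": "brand new wireless headphones sealed box free shipping",
    "seller_y": "handmade leather wallets ships worldwide",
}
forum_b = {
    "vendor_1": "wireless headphones brand new in sealed box free shipping today",
    "vendor_2": "vintage camera lenses tested and cleaned",
}

matches = match_sellers(forum_a, forum_b)
print(matches)
```

Because no labels are needed to compute these similarities, the sketch mirrors the unsupervised character of the approach; the choice of threshold would in practice have to be tuned on the data at hand.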

中图分类号: