Netinfo Security (信息网络安全) ›› 2024, Vol. 24 ›› Issue (3): 449-461. doi: 10.3969/j.issn.1671-1122.2024.03.010

• Technical Research •

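Dynamic Task Allocation Strategy for Crowd Sensing Based on Deep Reinforcement Learning and Privacy Protection (基于深度强化学习和隐私保护的群智感知动态任务分配策略)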

傅彦铭1,2,3, 陆盛林1, 陈嘉元1, 覃华1

  1. 广西大学计算机与电子信息学院,南宁 530004
  2. 广西高校并行分布与智能计算重点实验室,南宁 530004
  3. 广西智能数字服务工程技术研究中心,南宁 530004
  • Received: 2024-01-29  Publication date: 2024-03-10  Release date: 2024-04-03
  • Corresponding author: CHEN Jiayuan (陈嘉元)  E-mail: ycq_cjy@163.com
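  • About the authors:
    FU Yanming (傅彦铭), born 1976, male, from Guangxi, associate professor, Ph.D., CCF member. Main research interests: intelligent computing and network security.
    LU Shenglin (陆盛林), born 1999, male, from Guangdong, master's student. Main research interests: crowd sensing and privacy protection.
    CHEN Jiayuan (陈嘉元), born 1997, male, from Shanxi, master's student. Main research interests: crowd sensing and privacy protection.
    QIN Hua (覃华), born 1972, male, from Guangxi, professor, Ph.D. Main research interests: quantum computing theory, approximate dynamic programming optimization methods, and data mining.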
  • Funding:
    National Natural Science Foundation of China (61962005)

Dynamic Task Allocation for Crowd Sensing Based on Deep Reinforcement Learning and Privacy Protection

FU Yanming1,2,3, LU Shenglin1, CHEN Jiayuan1, QIN Hua1

  1. School of Computer, Electronic and Information, Guangxi University, Nanning 530004, China
  2. Key Laboratory of Parallel, Distributed and Intelligent Computing (Guangxi), Nanning 530004, China
  3. Guangxi Intelligent Digital Services Research Center of Engineering Technology, Nanning 530004, China
  • Received: 2024-01-29  Online: 2024-03-10  Published: 2024-04-03
  • Contact: CHEN Jiayuan  E-mail: ycq_cjy@163.com

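Abstract (Chinese version):

In mobile crowd sensing (MCS), the outcome of dynamic task allocation is critical to improving system efficiency and ensuring data quality. However, most existing studies simplify dynamic task allocation to a bipartite matching model, which does not fully account for the influence of task attributes and worker attributes on the matching result and also neglects the protection of workers' location privacy. To address these shortcomings, this paper proposes a dynamic task allocation strategy for crowd sensing based on deep reinforcement learning and privacy protection. The strategy first uses differential privacy to add noise to worker locations, protecting worker privacy; it then uses deep reinforcement learning to adaptively adjust the batch allocation of tasks; finally, it uses a greedy algorithm based on a threshold on workers' task execution capability to compute the total platform utility under the optimal strategy. Experimental results on real-world datasets show that the strategy maintains superior performance under different parameter settings while effectively protecting workers' location privacy.

Key words (Chinese version): crowd sensing, deep reinforcement learning, privacy protection, double deep Q-network, capability threshold greedy algorithm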

Abstract:

In mobile crowd sensing (MCS), the outcome of dynamic task allocation is crucial for enhancing system efficiency and ensuring data quality. Most existing studies simplify dynamic task allocation into a bipartite matching model, which fails to sufficiently consider the impact of task and worker attributes on the matching results and overlooks the protection of worker location privacy. To address these shortcomings, this paper presents a dynamic task allocation strategy for MCS based on deep reinforcement learning and privacy protection. The strategy first employs differential privacy techniques to add noise to worker locations, protecting their privacy. It then uses deep reinforcement learning to adaptively adjust the batch allocation of tasks. Finally, it employs a greedy algorithm based on worker task capability thresholds to compute the total utility of the platform under the optimal strategy. Experimental results on real-world datasets demonstrate that the strategy maintains superior performance under various parameter settings while effectively safeguarding worker location privacy.

Key words: crowd sensing, deep reinforcement learning, privacy protection, double deep Q-network, capability threshold greedy algorithm
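
Illustrative sketch (not the authors' implementation): the abstract names three stages but gives no formulas, so the Python below only illustrates the first and third stages under loudly stated assumptions. It assumes the textbook Laplace mechanism applied independently to each coordinate, a hypothetical utility of task value times worker capability minus a travel cost, and a one-to-one greedy matching restricted to workers whose capability meets a threshold. The function names, the utility form, and all parameters are assumptions made for illustration; the paper's actual noise mechanism, utility, and threshold rule may differ, and the deep reinforcement learning stage that chooses the batching is omitted entirely.

```python
import math
import random


def perturb_location(x, y, epsilon, sensitivity=1.0, rng=random):
    """Add independent Laplace(0, sensitivity / epsilon) noise to each coordinate.

    Assumption: textbook Laplace mechanism; `sensitivity` is a hypothetical
    parameter and is not taken from the paper.
    """
    scale = sensitivity / epsilon

    def noise():
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace(0, scale) distribution.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    return x + noise(), y + noise()


def threshold_greedy(workers, tasks, threshold, cost_per_unit=1.0):
    """Greedy one-to-one matching over workers that meet a capability threshold.

    workers: list of (x, y, capability); tasks: list of (x, y, value).
    Hypothetical utility: value * capability - cost_per_unit * distance.
    Returns (assignment, total_utility); assignment lists
    (worker_index, task_index) pairs in the order they were chosen.
    """
    candidates = []
    for wi, (wx, wy, cap) in enumerate(workers):
        if cap < threshold:  # below the capability threshold: never assigned
            continue
        for ti, (tx, ty, value) in enumerate(tasks):
            utility = value * cap - cost_per_unit * math.hypot(wx - tx, wy - ty)
            if utility > 0:  # only keep pairs that help the platform
                candidates.append((utility, wi, ti))

    # Highest-utility pairs first; each worker and task is used at most once.
    candidates.sort(key=lambda c: -c[0])
    used_workers, used_tasks = set(), set()
    assignment, total = [], 0.0
    for utility, wi, ti in candidates:
        if wi in used_workers or ti in used_tasks:
            continue
        used_workers.add(wi)
        used_tasks.add(ti)
        assignment.append((wi, ti))
        total += utility
    return assignment, total
```

In a pipeline of the kind the abstract describes, the platform would presumably call threshold_greedy on the perturbed worker coordinates returned by perturb_location rather than on true locations, so a smaller epsilon (stronger privacy) makes the distance term noisier and can lower the utility actually achieved. Note that greedy matching of this kind is a heuristic and does not in general guarantee a maximum-utility matching.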

CLC number: