信息网络安全 (Netinfo Security) ›› 2015, Vol. 15 ›› Issue (4): 74-77. doi: 10.3969/j.issn.1671-1122.2015.04.013


Q-Learning-based Routing Protocol for the Balance of WSN Lifetime

Bin-ting SU1,2, He FANG1,2, Li XU1,2

  1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou, Fujian 350007, China
  2. Fujian Provincial Key Laboratory of Network Security and Cryptology, Fuzhou, Fujian 350007, China
  • Received: 2015-02-10 Online: 2015-04-10 Published: 2018-07-16
  • About the authors:

    Bin-ting SU (born 1990), male, from Fujian, master's student; main research interest: network and information security. He FANG (born 1991), female, from Fujian, Ph.D.; main research interest: network and information security. Li XU (born 1970), male, from Fujian, Ph.D., professor; main research interests: wireless networks and mobile communications, network and information security, Internet of Things and cloud computing, intelligent information processing, and modeling and simulation of complex systems and networks.

  • Funding:
    National Natural Science Foundation of China [U1405255]; Natural Science Foundation of Fujian Province [2013J01222]; Fuzhou Science and Technology Project [2013-G-84]

Abstract:

Wireless sensor networks (WSNs) have attracted wide attention from academia and industry because they are easy to deploy and cheap to install. However, sensor nodes are severely constrained in energy, computing power, storage, and bandwidth, so the complex routing protocols of traditional networks cannot be applied to them directly, and simple, efficient routing protocols have become a research focus for WSNs. To extend the working time of the sensors, this paper proposes Q-WRP, a routing protocol based on reinforcement learning that balances the lifetime of the nodes in a WSN. The protocol jointly considers each node's residual energy, its hop count to the sink node, and its transmission delay, computes a forwarding quality (the Q-value) for each forwarding node, and finally selects the optimal forwarding path according to the Q-values of the forwarding nodes. NS2 simulation results show that Q-WRP delays the time at which the first node in the network dies and effectively balances the lifetimes of the network nodes.

Key words: wireless sensor network, routing protocol, reinforcement learning
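The abstract describes the idea behind Q-WRP only at a high level, so the following Python sketch is an illustration of Q-value-based next-hop selection and not the authors' actual formulation. The reward function, its weights, the normalisation bounds, and the parameters `alpha` and `gamma` are all assumptions made for this example; only the three inputs (residual energy, hop count to the sink, delay) and the rule "forward via the node with the best Q-value" come from the abstract. The update rule is the standard textbook Q-learning update.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """Candidate forwarding node as seen by the sender (hypothetical structure)."""
    node_id: str
    residual_energy: float  # fraction of initial energy, in [0, 1]
    hops_to_sink: int       # hop count from this neighbor to the sink
    delay: float            # measured delay to this neighbor, in seconds


def reward(n: Neighbor, max_hops: int, max_delay: float,
           w_energy: float = 0.5, w_hops: float = 0.3,
           w_delay: float = 0.2) -> float:
    """Assumed reward: grows with residual energy, shrinks with hops and delay.

    The weights are illustrative only; the abstract does not state them.
    """
    return (w_energy * n.residual_energy
            + w_hops * (1.0 - n.hops_to_sink / max_hops)
            + w_delay * (1.0 - n.delay / max_delay))


def update_q(q_old: float, r: float, best_next_q: float,
             alpha: float = 0.5, gamma: float = 0.8) -> float:
    """Standard Q-learning update with learning rate alpha and discount gamma."""
    return (1.0 - alpha) * q_old + alpha * (r + gamma * best_next_q)


def choose_next_hop(q_table: dict) -> str:
    """Greedy selection: forward via the neighbor with the largest Q-value."""
    return max(q_table, key=q_table.get)


# Usage: two neighbors, one energy-rich but farther away, one nearly depleted.
neighbors = [
    Neighbor("A", residual_energy=0.9, hops_to_sink=3, delay=0.02),
    Neighbor("B", residual_energy=0.2, hops_to_sink=2, delay=0.01),
]
q_table = {n.node_id: 0.0 for n in neighbors}
for n in neighbors:
    r = reward(n, max_hops=4, max_delay=0.04)
    # best_next_q=0.0 assumes no downstream Q-values have been learned yet.
    q_table[n.node_id] = update_q(q_table[n.node_id], r, best_next_q=0.0)

next_hop = choose_next_hop(q_table)
print(next_hop, q_table)
```

With these assumed weights the energy-rich neighbor is preferred even though it is one hop farther from the sink, which is the kind of trade-off a lifetime-balancing protocol aims for: traffic is steered away from nodes that are close to exhaustion.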

CLC Number: