
Research on Information Retrieval Based on DOM

CHEN Tao, XUE Li-min, SONG Qing-shuai

  • About author: Department of Information, Naval Command College, Nanjing, Jiangsu 211800

Abstract: The vector space model (VSM) is an important model in information retrieval. The traditional VSM takes into account the frequency of a feature term in the target document and its document frequency, but ignores where the term appears in the text, which is significant information. To address this, after representing each document as a Document Object Model (DOM), this paper attaches an additional coefficient to a feature term's weight according to the position in which the term appears, reflecting how well terms in different positions express the main idea of the document. The aim is to improve the ranking of returned documents and so ease the user's retrieval work. A simulation experiment confirms the advantage of this method over the traditional VSM in retrieval effectiveness.
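The position-dependent weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the coefficient values in POSITION_COEFF, the choice of tags, the smoothed IDF formula, and all function names are assumptions made for illustration, since the abstract gives no concrete values. The sketch walks the tag structure of an HTML document with Python's standard html.parser module (a stand-in for a full DOM tree), multiplies each term occurrence by the coefficient of its enclosing element, and ranks documents by cosine similarity.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical position coefficients; the abstract does not state actual values.
POSITION_COEFF = {"title": 3.0, "h1": 2.0, "h2": 1.5}
DEFAULT_COEFF = 1.0


class PositionTermCounter(HTMLParser):
    """Accumulates position-weighted term counts while walking the tags."""

    def __init__(self):
        super().__init__()
        self.stack = []               # tags that are currently open
        self.weighted_tf = Counter()  # term -> sum of position coefficients

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Use the largest coefficient among the enclosing elements.
        coeff = max((POSITION_COEFF.get(t, DEFAULT_COEFF) for t in self.stack),
                    default=DEFAULT_COEFF)
        for term in data.lower().split():
            self.weighted_tf[term] += coeff


def weighted_tf(html):
    """Return position-weighted term frequencies for one HTML document."""
    parser = PositionTermCounter()
    parser.feed(html)
    parser.close()
    return parser.weighted_tf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, docs):
    """Return document indices sorted by descending similarity to the query."""
    tfs = [weighted_tf(d) for d in docs]
    n = len(tfs)
    df = Counter(term for tf in tfs for term in tf)
    # Smoothed IDF (an assumption; any standard IDF variant would do).
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1.0 for t in df}
    vectors = [{t: w * idf[t] for t, w in tf.items()} for tf in tfs]
    q_tf = Counter(query.lower().split())
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


docs = [
    "<html><head><title>retrieval models</title></head>"
    "<body><p>vector space ranking</p></body></html>",
    "<html><head><title>vector space</title></head>"
    "<body><p>retrieval models ranking</p></body></html>",
]
print(rank("retrieval", docs))
```

Both sample documents contain the same five terms once each, so a plain TF-IDF model would score them identically for the query "retrieval". With the assumed coefficients, the document that carries the query term in its title ranks first, which is the kind of ordering effect the abstract attributes to position weighting.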