Library and Information Service ›› 2018, Vol. 62 ›› Issue (21): 112-117. DOI: 10.13266/j.issn.0252-3116.2018.21.014

• Knowledge Organization •

Research on Personalized Recommendation of University Library Literature Resources Based on Improved Content-based Filtering Algorithm

Geng Lixiao1, Jin Gaojie1, Li Yahan1, Sun Weizhong1,2, Ma Shihao1

  1. School of Economics and Management, Hebei University of Technology, Tianjin 300401;
    2. Hebei University of Technology Library, Tianjin 300401
  • Received: 2018-04-25 Revised: 2018-06-24 Online: 2018-11-05 Published: 2018-11-05
  • About the authors: Geng Lixiao (ORCID: 0000-0002-1041-5061), associate professor, PhD, master's supervisor, E-mail: lixgeng@qq.com; Jin Gaojie (ORCID: 0000-0002-0630-5986), master's student; Li Yahan (ORCID: 0000-0003-2816-8596), assistant researcher, master's degree; Sun Weizhong (ORCID: 0000-0002-6073-7114), library director, professor, PhD, master's supervisor; Ma Shihao (ORCID: 0000-0003-3250-795X), master's student.
  • Funding:
    This paper is one of the research outcomes of the Hebei Province Social Science Fund project "Research on the Information Service System of University Libraries Oriented to Users' Scientific Research Needs" (Project No. HB17TQ009).

Abstract: [Purpose/significance] In content-based filtering recommendation, representing text with the vector space model (VSM) easily leads to the curse of dimensionality. To address this problem, this paper proposes improving the original model by combining the cosine value r with the matching value Sim. [Method/process] Word vectors of the feature words with larger weights are selected from the literature resources and from the user's interests, respectively. The cosine value r is then calculated by formula, and the matching value Sim is further calculated by combining r with the corresponding feature-word weights; Sim serves as the basis for recommending literature to the target user. Experiments on the improved model, the vector space model, and the LDA topic model are conducted using relevant data from the Hebei University of Technology Library, and the results of the three models are analyzed using precision, recall, F1, and running time as evaluation metrics. [Result/conclusion] The experimental results show that the proposed improved model has higher application value and running efficiency than the vector space model and the LDA topic model used in the experiments.

Key words: content-based recommendation, matching value Sim, recommendation model, empirical analysis
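
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formula for Sim, so the weight-averaged cosine used here is an assumption for illustration only, not the authors' formula. The function names, the `embeddings` dictionary of word vectors, and the cutoff `k` are likewise hypothetical. The precision, recall, and F1 definitions are the standard ones.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (the value r)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_terms(weights, k):
    """Keep the k feature words with the largest weights."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def matching_score(doc_weights, user_weights, embeddings, k=3):
    """Hypothetical Sim: cosine r averaged over word pairs, weighted by
    the product of the two feature-word weights (an assumed formula)."""
    doc_top = top_terms(doc_weights, k)
    user_top = top_terms(user_weights, k)
    num = 0.0
    den = 0.0
    for d_word, d_w in doc_top.items():
        for u_word, u_w in user_top.items():
            # Skip words that have no vector in the (hypothetical) embedding table.
            if d_word not in embeddings or u_word not in embeddings:
                continue
            r = cosine(embeddings[d_word], embeddings[u_word])
            num += d_w * u_w * r
            den += d_w * u_w
    return num / den if den else 0.0


def precision_recall_f1(recommended, relevant):
    """Standard precision, recall, and F1 for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Because only the top-k weighted feature words of each side are compared, the cost per document depends on k rather than on the full vocabulary size, which is the kind of saving the abstract attributes to avoiding high-dimensional VSM representations.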

CLC number: