Library and Information Service (图书情报工作) ›› 2023, Vol. 67 ›› Issue (3): 72-84. DOI: 10.13266/j.issn.0252-3116.2023.03.007

• Information Research •

Subject Term Recommendation Based on the Fusion of Explicit & Implicit Information and One-class Collaborative Filtering

Li Shuqing, Huang Jinwang, Ma Dandan, Zhang Zhiwang

  1. School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023
  • Received: 2022-08-17 Revised: 2022-11-26 Online: 2023-02-24 Published: 2023-02-24
  • About the authors: Li Shuqing, professor, PhD, master's supervisor, E-mail: leeshuqing@163.com; Huang Jinwang, master's student; Ma Dandan, lecturer, PhD; Zhang Zhiwang, professor, PhD.
  • Funding:
    This paper is one of the research outcomes of the National Social Science Fund of China project "Research on the Knowledge Exchange Efficiency of Academic Virtual Communities" (Grant No. 17BTQ028).

Abstract: [Purpose/Significance] This paper proposes a subject term recommendation method for literature, based on a one-class collaborative filtering algorithm that fuses explicit and implicit information, in order to improve the accuracy of subject term recommendation for scholars and documents. [Method/Process] A matrix factorization model based on document richness and subject term popularity is constructed to estimate the relevance probability between a document and each subject term that does not appear in it. According to this probability, those terms are divided into implicitly related and implicitly unrelated subject terms of the document. Two weight prediction methods are then proposed for the two groups, respectively: an autoencoder (AutoRec) filling model that incorporates a preference coefficient, and a zero filling model. [Result/Conclusion] Experiments on SD4AI, a scientific and technical literature dataset for the field of artificial intelligence, show that, compared with various typical collaborative filtering methods, the proposed method improves both the prediction of subject term weights and the identification of high-weight subject terms. MAE and FCP improve by up to 16.07% and 16.83%, and the recommendation results on P@N and NDCG@N reach up to 22.37% and 27.06%, respectively.

Key words: subject term recommendation, subject term expansion, one-class collaborative filtering, term relevance, term weight
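The partition-then-fill idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relevance probabilities, the 0.5 threshold, and the `predicted` matrix are hypothetical stand-ins for the outputs of the paper's matrix factorization and AutoRec models, whose details the abstract does not give. The function names are likewise illustrative.

```python
import numpy as np

def partition_unobserved_terms(weights, relevance_prob, threshold=0.5):
    """Split terms absent from each document into two boolean masks.

    weights: (n_docs, n_terms) array, > 0 where a term appears in a document.
    relevance_prob: same shape; hypothetical document-term relevance probabilities.
    threshold: assumed cut-off; the abstract does not state the actual rule.
    """
    weights = np.asarray(weights, dtype=float)
    relevance_prob = np.asarray(relevance_prob, dtype=float)
    unobserved = weights == 0
    related = unobserved & (relevance_prob >= threshold)   # implicitly related
    unrelated = unobserved & (relevance_prob < threshold)  # implicitly unrelated
    return related, unrelated

def fill_matrix(weights, related, unrelated, predicted):
    """Keep observed weights, use predictions for related terms, zeros for unrelated."""
    filled = np.asarray(weights, dtype=float).copy()
    filled[related] = np.asarray(predicted, dtype=float)[related]
    filled[unrelated] = 0.0
    return filled

# Toy data: 2 documents x 3 subject terms (all values are made up).
weights = np.array([[0.8, 0.0, 0.0],
                    [0.0, 0.5, 0.0]])
relevance_prob = np.array([[0.9, 0.7, 0.1],
                           [0.6, 0.9, 0.2]])
predicted = np.full(weights.shape, 0.3)  # stand-in for model-predicted weights

related, unrelated = partition_unobserved_terms(weights, relevance_prob)
filled = fill_matrix(weights, related, unrelated, predicted)
print(filled)
```

In this toy run, each document ends up with one implicitly related term (filled with the stand-in prediction) and one implicitly unrelated term (filled with zero), while observed weights are left unchanged.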

CLC Number: