[Purpose/significance] Drawing on the massive literature resources of university libraries, such as holdings catalog databases and various paper databases, this paper designs a recommendation scheme and carries out an empirical study based on Spark, aiming to improve the quality of library literature recommendation and the computing performance of the system. [Method/process] The paper first analyzes the requirements for literature recommendation in university libraries in the big data context. To address three existing problems (incomplete literature search results, disorientation during literature browsing, and inefficient literature analysis), it then proposes a library literature recommendation scheme centered on the "hybrid link" of literature, together with an implementing algorithm, and designs an empirical case using the in-memory computing technology of Spark. Finally, it discusses the empirical results and compares them with those of similar algorithms. [Result/conclusion] The Spark-based "hybrid link" scheme effectively meets user needs, improves the performance and efficiency of literature recommendation, and helps put big data applications into practice in libraries.