The Research and Improvement of Catalog Search Scoring Algorithm of Similarity Based on Lucene

  • Wang Zexian
  • Guangzhou University Library, Guangzhou 510006

Received date: 2014-01-17

Revised date: 2014-02-02

Online published: 2014-02-20

Abstract

Building a library catalog search system on Lucene is an effective approach. After analyzing Lucene's default similarity scoring algorithm, the author points out that it does not take a book's popularity into account when ranking catalog search results. The author proposes and implements an improved algorithm. Experimental results show that the improved algorithm ranks popular books higher in the result list and improves readers' catalog search experience.
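
The abstract does not give the improved formula, so the following is only a minimal sketch of the general idea of folding popularity into a relevance ranking. It is not the author's algorithm and does not use the Lucene API. The `Hit` type, the `loans` field (a borrow count standing in for popularity), the `weight` parameter, and the log-damped multiplier are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    title: str
    relevance: float  # text-similarity score, e.g. from a TF-IDF ranker
    loans: int        # hypothetical popularity signal: number of times borrowed


def popularity_score(hit: Hit, weight: float = 0.5) -> float:
    """Scale relevance by a log-damped popularity factor (assumed formula)."""
    return hit.relevance * (1.0 + weight * math.log1p(hit.loans))


def rerank(hits: List[Hit], weight: float = 0.5) -> List[Hit]:
    """Return hits sorted by the combined score, highest first."""
    return sorted(hits, key=lambda h: popularity_score(h, weight), reverse=True)


hits = [
    Hit("Rarely borrowed title", relevance=2.0, loans=0),
    Hit("Frequently borrowed title", relevance=1.8, loans=120),
]
ranked = rerank(hits)
```

With `weight=0.0` the ranking falls back to pure text relevance. A positive weight lets a frequently borrowed book overtake a slightly better textual match, and the logarithm keeps very large borrow counts from dominating the score.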

Cite this article

Wang Zexian. The Research and Improvement of Catalog Search Scoring Algorithm of Similarity Based on Lucene[J]. Library and Information Service, 2014, 58(04): 94-98. DOI: 10.13266/j.issn.0252-3116.2014.04.015
