Knowledge Organization

Research on Influencing Factors of Tag Quality in Social Tagging System: Based on Random Forest

  • Zhang Yunzhong,
  • Qin Yiyuan
  • Department of Library, Information and Archives, Shanghai University, Shanghai 200444
Zhang Yunzhong (ORCID: 0000-0002-7323-2561), associate professor, PhD, master's supervisor, E-mail: zhang-yun-zhong@126.com; Qin Yiyuan (ORCID: 0000-0003-4101-8584), master's student.

Received date: 2019-04-28

  Revised date: 2019-07-22

  Online published: 2019-12-20

Funding

This article is one of the research outcomes of the National Social Science Fund of China project "Research on Semantic Discovery and Semantic Mapping of Social Tagging Systems Based on Formal Concept Analysis" (Project No. 16CTQ023).

Cite this article

Zhang Yunzhong, Qin Yiyuan. Research on Influencing Factors of Tag Quality in Social Tagging System: Based on Random Forest[J]. Library and Information Service (图书情报工作), 2019, 63(24): 119-126. DOI: 10.13266/j.issn.0252-3116.2019.24.013

Abstract

[Purpose/significance] In a social tagging system (STS), tag quality shapes users' experience of classifying, searching, browsing, and retrieving online resources. Identifying the key factors that influence tag quality helps optimize the core resource-organization function of an STS. [Method/process] Taking STS tags as the research object, the study reconstructs a model of tag quality influencing factors along six dimensions: tagging subject, tagging object, tagging environment, tagging motivation, tagging method, and tagging product. Data were collected by questionnaire, and a decision tree model of the influencing factors was built with Random Forest, a supervised learning algorithm. [Result/conclusion] The tagging subject is the primary key dimension affecting tag quality: the subject's knowledge structure and cognitive level, tagging frequency, and perceived usefulness have a prominent influence. The tagging method is the secondary key dimension, with tag recommendation and standard tag prompts as important factors.
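As a rough illustration of how a random forest can rank questionnaire items by their influence on an outcome, the following minimal sketch grows bootstrapped decision trees with random feature subsets and accumulates each feature's weighted Gini gain. It is not the authors' implementation: the importance measure (mean decrease in Gini impurity), the item names, the 1 to 5 response scale, and the synthetic data are all assumptions made for the example, and the output says nothing about the study's actual results.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return (gain, feature, threshold) of the best Gini split, or None."""
    parent, n, best = gini(labels), len(rows), None
    for f in features:
        # Try every observed value except the largest as a "<=" threshold.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > 1e-12 and (best is None or gain > best[0]):
                best = (gain, f, t)
    return best

def grow(rows, labels, n_try, depth, importance, rng, weight):
    """Grow one tree recursively, adding weighted Gini gains to `importance`."""
    if depth == 0 or len(set(labels)) == 1:
        return
    # Each node considers only a random subset of the features.
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features)
    if split is None:
        return
    gain, f, t = split
    importance[f] += weight * gain
    for side in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[f] <= t) == side]
        grow([rows[i] for i in idx], [labels[i] for i in idx],
             n_try, depth - 1, importance, rng, weight * len(idx) / len(rows))

def forest_importance(rows, labels, n_trees=50, depth=3, seed=0):
    """Normalized mean-decrease-in-Gini importance over a small forest."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    n_try = max(1, int(n_features ** 0.5))
    importance = [0.0] * n_features
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        grow([rows[i] for i in idx], [labels[i] for i in idx],
             n_try, depth, importance, rng, 1.0)
    total = sum(importance) or 1.0
    return [v / total for v in importance]

# Synthetic data (assumption): four hypothetical questionnaire items scored 1-5;
# the "high tag quality" label is constructed to depend only on the first item.
data_rng = random.Random(42)
items = ["knowledge_level", "tagging_frequency", "tag_recommendation", "interface_design"]
rows = [[data_rng.randint(1, 5) for _ in items] for _ in range(200)]
labels = [1 if r[0] >= 4 else 0 for r in rows]

scores = forest_importance(rows, labels)
for name, score in sorted(zip(items, scores), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

Because the synthetic label is built from the first item alone, that item should receive the largest share of importance; with real questionnaire data, the ranking would instead reflect whatever structure the responses contain.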
