Library and Information Service (图书情报工作) ›› 2018, Vol. 62 ›› Issue (3): 55-63. DOI: 10.13266/j.issn.0252-3116.2018.03.007

• Information Research •

Research on Methods for Identifying Multi-expertise Experts: A Case Study of the Big Data Field

Liu Xiaoyu (刘晓豫), Zhu Donghua (朱东华), Wang Xuefeng (汪雪锋), Huang Ying (黄颖)

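  1. School of Management and Economics, Beijing Institute of Technology, Beijing 100081
  • Received: 2017-08-26; Revised: 2017-11-20; Publication date: 2018-02-05; Release date: 2018-02-05
  • About the authors: Liu Xiaoyu (ORCID: 0000-0003-2509-8457), PhD candidate, E-mail: xiaoyu.liu2019@foxmail.com; Zhu Donghua, professor and doctoral supervisor; Wang Xuefeng, professor and doctoral supervisor; Huang Ying, PhD candidate.
  • Funding:
    This paper is one of the research outputs of the National Natural Science Foundation of China General Program project "Research on Methods for Locating and Evaluating Technical Experts in an Open Data Environment" (project No. 71673024).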

Multi-expertise Researcher Identification: A Case Study of the Big Data

Liu Xiaoyu, Zhu Donghua, Wang Xuefeng, Huang Ying   

  1. School of Management and Economics, Beijing Institute of Technology, Beijing 100081
  • Received: 2017-08-26 Revised: 2017-11-20 Online: 2018-02-05 Published: 2018-02-05

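Abstract: [Purpose/significance] When governments, large and medium-sized enterprises, and research institutions face technical problems, finding suitable experts is an urgent need. For complex, comprehensive problems that require knowledge from multiple disciplines, finding multi-expertise experts is especially important; the aim of this study is to find a suitable method for identifying them. [Method/process] Using the academic papers published by experts, representative features of each expert's research expertise are extracted, and a TFIDF-weighted overlapping K-means clustering algorithm partitions the experts into overlapping clusters, revealing each expert's multiple areas of research expertise and thereby identifying multi-expertise experts. [Result/conclusion] The results show that the TFIDF-weighted overlapping K-means clustering algorithm performs well in terms of precision, recall, and F-value, and can identify multi-expertise experts.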

Keywords: expert identification, overlapping K-means, multi-expertise expert, big data, TFIDF
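
The pipeline outlined in the abstract above (expertise features, TFIDF weighting, overlapping clustering) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a common TFIDF variant (relative term frequency times ln(N/df)), runs plain K-means with fixed starting centroids, and then applies a greedy multi-assignment step in the spirit of overlapping K-means rather than the full algorithm. The researcher labels, keyword lists, and function names (`tfidf_vectors`, `kmeans`, `overlapping_assign`) are all hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each researcher's keyword list to a TFIDF vector (assumed variant)."""
    vocab = sorted({t for toks in docs.values() for t in toks})
    n = len(docs)
    df = Counter(t for toks in docs.values() for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in vocab}  # 0 for terms used by everyone
    vectors = {}
    for name, toks in docs.items():
        counts = Counter(toks)
        vectors[name] = [counts[t] / len(toks) * idf[t] for t in vocab]
    return vocab, vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vecs):
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def kmeans(points, init, max_iter=100):
    """Plain Lloyd iterations from explicit starting centroids (deterministic)."""
    centroids = [list(c) for c in init]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return centroids

def overlapping_assign(point, centroids):
    """Greedy multi-assignment: keep adding the next-nearest centroid while the
    mean of the chosen centroids moves closer to the point."""
    order = sorted(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))
    chosen = [order[0]]
    best = sq_dist(point, centroids[order[0]])
    for j in order[1:]:
        d = sq_dist(point, mean([centroids[i] for i in chosen + [j]]))
        if d >= best:
            break
        chosen.append(j)
        best = d
    return sorted(chosen)

# Hypothetical keyword lists extracted from each researcher's papers.
docs = {
    "A": ["data", "hadoop", "hadoop", "mapreduce"],
    "B": ["data", "hadoop", "mapreduce", "mapreduce"],
    "C": ["data", "privacy", "privacy", "encryption"],
    "D": ["data", "privacy", "encryption", "encryption"],
    "E": ["data", "hadoop", "mapreduce", "privacy", "encryption"],
}

vocab, vecs = tfidf_vectors(docs)
names = list(docs)
centroids = kmeans([vecs[n] for n in names], init=[vecs["A"], vecs["C"]])
membership = {n: overlapping_assign(vecs[n], centroids) for n in names}
multi_expertise = [n for n, clusters in membership.items() if len(clusters) > 1]

print(membership)
print(multi_expertise)
```

In this toy data, the keyword "data" appears for every researcher and so receives zero weight, while researcher E, whose keywords span both topics, is the only one assigned to both clusters.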

Abstract: [Purpose/significance] As knowledge needs shift rapidly, choosing appropriate researchers for a given problem is an important issue for governments, companies, and research institutions. For complex real-world problems, it is essential to find multi-expertise researchers. This research aims to find a suitable way to identify them. [Method/process] This paper uses a Term Frequency-Inverse Document Frequency (TFIDF) weighted overlapping K-means clustering method. Based on the researchers' co-authorship network built from publication data, the method clusters researchers into overlapping clusters and identifies the multi-expertise researchers. [Result/conclusion] Results show that the TFIDF weighted overlapping K-means method outperforms previous work in terms of precision, recall, and F-value, so it can help identify multi-expertise researchers.

Key words: researcher identification, overlapping K-means, multi-expertise researcher, big data, Term Frequency-Inverse Document Frequency (TFIDF)
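
For reference, the three evaluation measures named in the abstract are conventionally defined as below. The abstract does not state how they are computed for overlapping clusters or which F-measure is used, so the balanced form is an assumption here, and the symbols TP, FP, and FN (true positives, false positives, false negatives) are introduced only for illustration.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F = \frac{2PR}{P + R}
```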

CLC number: