Library and Information Service, 2022, Vol. 66, Issue 12: 108-116. DOI: 10.13266/j.issn.0252-3116.2022.12.010

• Knowledge Organization •

An Article-Topic Semantic Matching Analysis Method Based on Co-Word and Weighted Word2Vec

Ding Jingda, Chen Yifan, Liu Chao, Cai Wei

  1. School of Cultural Heritage and Information Management, Shanghai University, Shanghai 200444
  • Received: 2021-11-10 Revised: 2022-03-26 Online: 2022-06-20 Published: 2022-06-25
  • About the authors: Ding Jingda, professor, PhD, doctoral supervisor, E-mail: djdhyn@126.com; Chen Yifan, master's student; Liu Chao, doctoral student; Cai Wei, master's student.
  • Funding:
    This work is one of the research outcomes of the National Social Science Fund of China project "A Detection Method for Emerging Topics in the Social Sciences Based on Multi-Source Data Fusion and an Empirical Study" (Grant No. 21BTQ010).

Abstract: [Purpose/Significance] Co-word analysis is an important method for topic identification, but it has limitations. Combining it with weighted Word2Vec vectors helps determine which topic each individual article belongs to and supports a better analysis of how topics develop and evolve. [Method/Process] After topics are clustered by co-word analysis, weighted Word2Vec vectors are used to compute article vectors and cluster topic vectors, and articles are then semantically matched to topics by cosine similarity. [Result/Conclusion] An empirical analysis of the knowledge sharing field, covering both Chinese and international literature, shows that the method matches relevant articles to their corresponding topics well and supports a dynamic analysis of topic characteristics and evolution at the article level.

Key words: Word2Vec, co-word analysis, semantic matching, knowledge sharing, topic evolution
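
The matching step named in the abstract (weighted word vectors for articles and topics, compared by cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the weighting scheme, so the term weights below are hypothetical, the 3-dimensional `EMBEDDINGS` table is a toy stand-in for trained Word2Vec vectors, and the names `weighted_vector`, `cosine`, and `match_article` are invented for illustration.

```python
import math

# Toy 3-dimensional embeddings standing in for trained Word2Vec vectors (assumption).
EMBEDDINGS = {
    "knowledge":  [0.9, 0.1, 0.0],
    "sharing":    [0.8, 0.2, 0.1],
    "community":  [0.7, 0.3, 0.0],
    "trust":      [0.1, 0.9, 0.2],
    "motivation": [0.0, 0.8, 0.3],
}

def weighted_vector(term_weights):
    """Weighted average of the embeddings of the terms that have an embedding."""
    known = {t: w for t, w in term_weights.items() if t in EMBEDDINGS}
    total = sum(known.values())
    if total <= 0:
        raise ValueError("no usable term weights")
    dim = len(next(iter(EMBEDDINGS.values())))
    vec = [0.0] * dim
    for term, weight in known.items():
        for i, x in enumerate(EMBEDDINGS[term]):
            vec[i] += weight * x / total
    return vec

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def match_article(article_terms, topics):
    """Return the best-matching topic name and the similarity to every topic."""
    article_vec = weighted_vector(article_terms)
    scores = {name: cosine(article_vec, weighted_vector(terms))
              for name, terms in topics.items()}
    return max(scores, key=scores.get), scores

# Hypothetical topic clusters (as co-word clustering might produce) with term weights.
TOPICS = {
    "communities": {"knowledge": 3, "sharing": 3, "community": 2},
    "incentives":  {"trust": 2, "motivation": 1},
}
# Hypothetical article keywords with weights.
article = {"trust": 0.6, "motivation": 0.3, "sharing": 0.1}

best, scores = match_article(article, TOPICS)
print(best, scores)
```

On this toy data the article is assigned to the "incentives" topic, since its vector is dominated by the "trust" and "motivation" embeddings. In practice the embeddings would come from a Word2Vec model trained on the corpus, and the weights from whatever term-importance measure the analysis adopts.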

CLC number: