Library and Information Service, 2022, Vol. 66, Issue (13): 102-117. DOI: 10.13266/j.issn.0252-3116.2022.13.010

• Information Research •

Identification of Peer Review Comments Types and Research on their Distribution at Different Citation Frequencies

Qin Chenglei, Han Ruxue, Zhou Haomin, Zhong Jiangtao, Zhang Chengzhi

  1. Department of Information Management, Nanjing University of Science and Technology, Nanjing 210094
  • Received: 2022-03-07 Revised: 2022-05-11 Online: 2022-07-05 Published: 2022-07-06
  • Corresponding author: Zhang Chengzhi, professor, PhD, doctoral supervisor, E-mail: zhangcz@njust.edu.cn
  • About the authors: Qin Chenglei, PhD candidate; Han Ruxue, undergraduate student; Zhou Haomin, undergraduate student; Zhong Jiangtao, undergraduate student.

Abstract: [Purpose/Significance] Identifying the types of peer review comments on academic papers, and analyzing how those types are distributed in peer review reports for papers with different citation frequencies, helps deepen the understanding of the peer review mechanism and offers new ideas for evaluating the academic quality of papers and quantifying reviewers' contributions. [Method/Process] First, peer review comments were divided into six types: positive evaluation, negative evaluation, requirement/suggestion (split into major and minor aspects), problem/question, and statement. After manual annotation produced training and test corpora, traditional machine learning models and deep learning models were compared on the automatic identification of comment types. Second, the academic papers covered by the peer review reports were clustered by topic, and their citation frequencies were then standardized. Finally, the Spearman correlation coefficient, cumulative distributions, the K-S test, and negative binomial regression were used to analyze the distribution of comment types in the peer review reports of papers with different citation frequencies. [Result/Conclusion] The SciBERT model achieved the best identification performance. In the Spearman correlation analysis, the proportion of positive evaluations in a review report has a significant weak positive correlation with citation frequency, and the proportion of negative evaluations has a significant weak negative correlation with it. The cumulative distributions show that, in most cases and at the same cumulative probability, the proportion of positive evaluations is larger and the proportion of negative evaluations is smaller in the highly cited partition than in the less cited partition; the K-S test detects this difference. In the negative binomial regression, the proportions of positive and negative evaluations have significant positive and negative effects on citation frequency, respectively. The results indicate that the distribution of positive and negative evaluations in peer review reports is correlated with the citation frequency of the corresponding papers, and that citation frequency can reflect the academic quality of papers to some extent.

Keywords: peer review, peer review comments, peer review comment types, citation frequency, correlation analysis
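
The statistical steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say how citation frequencies were standardized, so dividing by the topic-cluster mean is an assumption, as are the function names, the made-up data, and the split at a normalized score of 1.0. The K-S p-value and the negative binomial regression are left out to keep the sketch dependency-free.

```python
from collections import defaultdict
from statistics import mean


def normalize_by_cluster(citations, clusters):
    """Divide each paper's citation count by the mean count of its topic cluster."""
    by_cluster = defaultdict(list)
    for count, cluster in zip(citations, clusters):
        by_cluster[cluster].append(count)
    cluster_mean = {c: mean(v) for c, v in by_cluster.items()}
    # Guard against clusters whose papers all have zero citations.
    return [
        count / cluster_mean[c] if cluster_mean[c] > 0 else 0.0
        for count, c in zip(citations, clusters)
    ]


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    def ecdf(sample, t):
        return sum(v <= t for v in sample) / len(sample)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, t) - ecdf(sample_b, t)) for t in points)


# Illustrative, made-up data: six papers in two topic clusters.
citations = [10, 30, 20, 2, 6, 4]
clusters = ["A", "A", "A", "B", "B", "B"]
positive_share = [0.20, 0.50, 0.30, 0.10, 0.45, 0.35]

normalized = normalize_by_cluster(citations, clusters)
rho = spearman_rho(positive_share, normalized)

# Split papers at a normalized citation score of 1.0 (the cluster mean).
high = [p for p, n in zip(positive_share, normalized) if n >= 1.0]
low = [p for p, n in zip(positive_share, normalized) if n < 1.0]
d_stat = ks_statistic(high, low)
```

Normalizing within clusters keeps papers from heavily cited topics from dominating the comparison. In practice, library routines such as `scipy.stats.spearmanr` and `scipy.stats.ks_2samp` would also supply the significance levels that the abstract reports.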
