[Purpose/significance] To address the problem that information recognition in a target domain is difficult to improve because labeled samples are scarce, this study transfers the results of unsupervised learning on large-scale Internet data to the target domain. [Method/process] A RoBERTa model pre-trained on Chinese Wikipedia and other corpora is used for transfer learning; after the learned representations are mapped to the target domain, a DPCNN aggregates and condenses them, and the model is then fine-tuned with a portion of labeled data to achieve accurate recognition of domain information. [Result/conclusion] Compared with the same model without transfer learning and with the classic TextCNN model on 10 domains, the proposed model outperforms both comparison models by a clear margin: averaged over the domains, the absolute gains are 4.15% and 3.43% in precision, 4.55% and 3.44% in recall, and 4.52% and 3.44% in F1 score, showing that transfer learning from Internet-scale data can significantly improve information recognition in the target domain.
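To make the pipeline described in the abstract concrete, the following PyTorch sketch shows one plausible assembly of the components it names: a pre-trained Chinese RoBERTa encoder supplies the transferred representations, a DPCNN-style stack of convolution and pooling blocks aggregates and condenses them, and a linear layer produces domain labels for fine-tuning on the partially labeled data. This is a minimal illustration rather than the authors' released code; the checkpoint name hfl/chinese-roberta-wwm-ext, the channel width, the number of blocks, and the class count are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer  # Chinese RoBERTa-wwm reuses the BERT architecture

class RobertaDPCNN(nn.Module):
    """Pre-trained RoBERTa encoder with a DPCNN head for domain text classification (sketch)."""
    def __init__(self, pretrained="hfl/chinese-roberta-wwm-ext",  # assumed checkpoint name
                 num_classes=10, channels=250, num_blocks=3):     # illustrative hyperparameters
        super().__init__()
        self.encoder = BertModel.from_pretrained(pretrained)      # transferred general-domain knowledge
        hidden = self.encoder.config.hidden_size
        self.region = nn.Conv1d(hidden, channels, kernel_size=3, padding=1)  # region embedding
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.ReLU(), nn.Conv1d(channels, channels, 3, padding=1),
                          nn.ReLU(), nn.Conv1d(channels, channels, 3, padding=1))
            for _ in range(num_blocks)])
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, input_ids, attention_mask):
        x = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        x = self.region(x.transpose(1, 2))                 # (batch, channels, seq_len)
        for block in self.blocks:                          # "pyramid": halve the length, add a residual
            x = F.max_pool1d(x, kernel_size=3, stride=2, padding=1)
            x = x + block(x)
        x = x.max(dim=-1).values                           # condense each text into one vector
        return self.fc(x)                                  # logits for the domain classes

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
batch = tokenizer(["待识别的领域文本示例"], return_tensors="pt", padding=True, truncation=True)
logits = RobertaDPCNN()(batch["input_ids"], batch["attention_mask"])

In a fine-tuning run, these logits would be trained with a standard cross-entropy loss on the target-domain labels, updating the encoder and the DPCNN head jointly so that the transferred representations adapt to the target domain.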