Research of Automatic Extraction of Entities of Data Science Recruitment and Analysis Based on Deep Learning

  • Wang Dongbo,
  • Hu Haotian,
  • Zhou Xin,
  • Zhu Danhao
  • 1. College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095;
    2. Department of Information Management, Nanjing University, Nanjing 210093;
    3. Department of Computer Science and Technology, Nanjing University, Nanjing 210093

Received date: 2017-12-02

Revised date: 2018-04-08

Published online: 2018-07-05

Abstract

[Purpose/significance] Data science is emerging as an interdisciplinary field that draws on many disciplines. Extracting entity knowledge from data science recruitment announcements helps not only to understand the development of data science from a market perspective, but also to improve the content of data science teaching. [Method/process] Recruitment announcements were collected from a recruitment website, and a data science corpus was constructed using the data collection, annotation and organization methods of information science; the corresponding entities were then extracted from it. [Result/conclusion] On an annotated corpus of 11,000 data science recruitment announcements, this paper compared the entity extraction performance of the Bi-LSTM-CRF, CRF and Bi-LSTM models, selected the final model for automatic extraction of data science recruitment entities, designed a platform for automatic extraction of these entities, and built a data science recruitment entity network.

Cite this article

Wang Dongbo, Hu Haotian, Zhou Xin, Zhu Danhao. Research of Automatic Extraction of Entities of Data Science Recruitment and Analysis Based on Deep Learning[J]. Library and Information Service, 2018, 62(13): 64-73. DOI: 10.13266/j.issn.0252-3116.2018.13.009
