A Joint Extraction Method of Financial Events Based on Multi-Layer Convolutional Neural Networks

Li Xuhui; Cheng Wei; Tang Xiaoya; Yu Tao; Chen Zhuang; Qian Tieyun

doi:10.13266/j.issn.0252-3116.2021.24.010

Library and Information Service (图书情报工作), 2021, Vol. 65, Issue 24: 89-99

Knowledge Organization

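Li Xuhui, associate professor and master's supervisor, E-mail: lixuhui@whu.edu.cn; Cheng Wei, master's student; Tang Xiaoya, master's student; Yu Tao, master's student; Chen Zhuang, doctoral student; Qian Tieyun, professor and doctoral supervisor.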

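Funding

This paper is one of the research outcomes of the key support project "Value Analysis, Discovery, and Collaborative Creation Mechanisms of Financial Big Data Based on Knowledge Association" (Grant No. 91646206) under the National Natural Science Foundation of China Major Research Plan "Big Data-Driven Management and Decision-Making Research", and of the Shenzhen Securities Information Joint Research Program project "Key Event Identification and Element Extraction across the Enterprise Life Cycle" (Grant No. CHINFO201802).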

1. School of Information Management, Wuhan University, Wuhan 430072;
2. Big Data Institute, Wuhan University, Wuhan 430072;
3. School of Computer Science, Wuhan University, Wuhan 430072

Received date: 2021-06-15

Revised date: 2021-09-27

Online published: 2021-12-29

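Cite this article

Li Xuhui, Cheng Wei, Tang Xiaoya, Yu Tao, Chen Zhuang, Qian Tieyun. A Joint Extraction Method of Financial Events Based on Multi-Layer Convolutional Neural Networks[J]. Library and Information Service, 2021, 65(24): 89-99. DOI: 10.13266/j.issn.0252-3116.2021.24.010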

Abstract

[Purpose/significance] This paper aims to further improve event extraction in the financial domain by strengthening the correlation between its two subtasks. [Method/process] Working on Chinese financial texts, the paper proposes a joint extraction method for financial events that combines a pre-trained model with a multi-layer convolutional neural network. The pre-trained model BERT first captures the comprehensive semantic information of the sentence sequence. Its output is then fed into MultiCNN, the multi-layer convolutional architecture designed in this paper, which extracts local-window and high-dimensional spatial semantic information layer by layer and performs the two tasks of event recognition and element extraction at the same time. A contrastive loss is then introduced to further strengthen the association between the two tasks. [Result/conclusion] The method reaches an F1 of 82.20% on a Chinese financial event dataset, improving to some degree on every baseline extraction model.

Key words: Chinese event extraction; convolutional neural network; pre-trained model; joint learning
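
The pipeline described in the abstract (encoder output, stacked convolutions, two task heads on shared features) can be illustrated with a minimal pure-Python sketch. This is not the authors' MultiCNN: the toy random vectors stand in for BERT outputs, the layer sizes, window width, and the helper names `conv1d`, `init_kernels`, `linear`, and `joint_forward` are all hypothetical, the weights are untrained, and the contrastive loss is omitted because the abstract does not give its form.

```python
import random


def conv1d(seq, kernels, bias):
    """Same-padded 1D convolution with ReLU over a list of token vectors.

    seq:     list of T vectors, each of length d_in
    kernels: list of d_out filters; each filter is k rows of d_in weights (k odd)
    bias:    list of d_out floats
    """
    k = len(kernels[0])
    pad = k // 2
    zero = [0.0] * len(seq[0])
    padded = [zero] * pad + seq + [zero] * pad
    out = []
    for t in range(len(seq)):
        window = padded[t:t + k]  # k neighbouring token vectors
        feats = []
        for filt, b in zip(kernels, bias):
            s = b + sum(w * x
                        for row, vec in zip(filt, window)
                        for w, x in zip(row, vec))
            feats.append(max(0.0, s))  # ReLU
        out.append(feats)
    return out


def init_kernels(rng, d_out, k, d_in):
    """Random (untrained) filters, for illustration only."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(d_in)]
             for _ in range(k)] for _ in range(d_out)]


def linear(vec, weights, bias):
    """Plain affine map: one output score per weight row."""
    return [b + sum(w * x for w, x in zip(row, vec))
            for row, b in zip(weights, bias)]


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


def joint_forward(token_vecs, layers, type_head, tag_head):
    """Run stacked convolutions, then both task heads on the shared features."""
    h = token_vecs
    for kernels, bias in layers:  # each extra layer widens the context window
        h = conv1d(h, kernels, bias)
    pooled = [max(col) for col in zip(*h)]  # max-pool over time
    event_type = argmax(linear(pooled, *type_head))  # sentence-level label
    tags = [argmax(linear(vec, *tag_head)) for vec in h]  # token-level labels
    return event_type, tags


rng = random.Random(0)
T, d, hidden = 6, 4, 5      # assumed toy sizes, not taken from the paper
n_types, n_tags = 3, 4      # assumed label counts

# Toy vectors standing in for the encoder (BERT) output of a 6-token sentence.
tokens = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
layers = [
    (init_kernels(rng, hidden, 3, d), [0.0] * hidden),
    (init_kernels(rng, hidden, 3, hidden), [0.0] * hidden),
]
type_head = ([[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
              for _ in range(n_types)], [0.0] * n_types)
tag_head = ([[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
             for _ in range(n_tags)], [0.0] * n_tags)

event_type, tags = joint_forward(tokens, layers, type_head, tag_head)
print("event type id:", event_type)
print("per-token tag ids:", tags)
```

Because both heads read the same convolutional features, a joint training objective would update the shared layers from both tasks at once; how the paper weights the two task losses and the contrastive term is not stated in the abstract.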
