[1] SIEGEL N, LOURIE N, POWER R, et al. Extracting scientific figures with distantly supervised neural networks[C]//Proceedings of the 18th ACM/IEEE on joint conference on digital libraries. Texas:ACM, 2018:223-232.
[2] YU H, LEE M. Accessing bioscience images from abstract sentences[J]. Bioinformatics, 2006, 22(14):547-556.
[3] STELMASZEWSKA H, BLANDFORD A. From physical to digital:a case study of computer scientists' behaviour in physical libraries[J]. International journal on digital libraries, 2004, 4(2):82-92.
[4] LEE P, WEST J D, HOWE B, et al. Viziometrics:analyzing visual information in the scientific literature[J].IEEE transactions on big data, 2018, 4(1):117-129.
[5] PYREDDY P, CROFT W B. TINTIN:A system for retrieval in text tables[C]//Proceedings of the second ACM international conference on digital libraries. Philadelphia:ACM,1997:193-200.
[6] LIU F, JENSSEN T, NYGAARD V, et al. FigSearch:a figure legend indexing and classification system[J]. Bioinformatics, 2004, 20(16):2880-2882.
[7] TENOPIR C, SANDUSKY R, CASADO M. The value of CSA deep indexing for researchers (executive summary)[J]. School of information sciences publications and other works, 2006(1):1-4.
[8] LIU Y, BAI K, MITRA P, et al. TableSeer:automatic table metadata extraction and searching in digital libraries[C]//Proceedings of the 7th ACM/IEEE-CS joint conference on digital libraries. New York:ACM, 2007:91-100.
[9] XU S H, MCCUSKER J, KRAUTHAMMER M. Yale image finder (YIF)[J]. Bioinformatics, 2008, 24(17):1968-1970.
[10] YU H, LIU F, RAMESH B P. Automatic figure ranking and user interfacing for intelligent figure search[J]. PLoS ONE, 2010, 5(10):e12983.
[11] NCBI. PMC[EB/OL]. [2020-08-31]. https://www.ncbi.nlm.nih.gov/pmc/.
[12] CNKI. CNKI image search[EB/OL]. [2020-08-31]. http://image.cnki.net/Default.aspx.
[13] SIEGEL N, HORVITZ Z, LEVIN R, et al. FigureSeer:parsing result-figures in research papers[C]//European conference on computer vision. Amsterdam:Springer International Publishing, 2016:664-680.
[14] National Library of Medicine. Open-i[EB/OL]. [2020-08-31]. https://openi.nlm.nih.gov/.
[15] FAYYAD U M, PIATETSKY-SHAPIRO G, SMYTH P. From data mining to knowledge discovery in databases[J]. AI magazine, 1996, 17(3):37-54.
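[16] TANG H J. Research and implementation of a table data extraction method for PDF files[D]. Beijing:Beijing University of Posts and Telecommunications, 2015.
[17] LIU Y. Research on table information extraction based on Web structure[D]. Hefei:Hefei University of Technology, 2012.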
[18] CHAO H, FAN J. Layout and content extraction for PDF documents[C]//Document analysis systems 2004. Florence:Springer, 2004:213-224.
[19] CHOUDHURY S R, GILES C L. An architecture for information extraction from figures in digital libraries[C]//Proceedings of the 24th international conference on World Wide Web. Florence:ACM, 2015:667-672.
[20] CHHATKULI A, FONCUBIERTA-RODRÍGUEZ A, MARKONIS D, et al. Separating compound figures in journal articles to allow for subfigure classification[C]//Medical imaging 2013:advanced PACS-based imaging informatics and therapeutic applications. Florida:SPIE, 2013:86740J.
[21] LI P, JIANG X, KAMBHAMETTU C, et al. Compound image segmentation of published biomedical figures[J]. Bioinformatics, 2018, 34(7):1192-1199.
[22] Apache Software Foundation. Apache PDFBox[EB/OL]. [2021-05-02]. https://pdfbox.apache.org.
[23] SHINYAMA Y. PDFMiner[EB/OL]. [2021-05-02]. https://github.com/euske/pdfminer.
[24] Glyph & Cog. Xpdf[EB/OL]. [2021-05-02]. http://www.xpdfreader.com.
[25] HØGSBERG K. Poppler[EB/OL]. [2021-05-02]. http://poppler.freedesktop.org/.
[26] LOPEZ L D, YU J, ARIGHI C N, et al. An automatic system for extracting figures and captions in biomedical PDF documents[C]//2011 IEEE international conference on bioinformatics and biomedicine. Atlanta:IEEE, 2011:578-581.
[27] PRACZYK P A, NOGUERAS-ISO J, MELE S. Automatic extraction of figures from scientific publications in high-energy physics[J]. Information technology and libraries, 2013, 32(4):25-52.
[28] CLARK C, DIVVALA S. PDFFigures 2.0:mining figures from research papers[C]//Proceedings of the 16th ACM/IEEE-CS on joint conference on digital libraries. Newark:ACM, 2016:143-152.
[29] LI P Y, JIANG X Y, SHATKAY H, et al. Figure and caption extraction from biomedical documents[J]. Bioinformatics, 2019, 35(21):4381-4388.
[30] YILDIZ B, KAISER K, MIKSCH S. Pdf2table:a method to extract table information from pdf files[C]//Proceedings of the 2nd Indian international conference on artificial intelligence. Pune:DBLP, 2008:1-13.
[31] LI H T, LIU J, MING D L, et al. A table image recognition method based on the statistical grid distribution of feature points[J]. Journal of Huazhong University of Science and Technology (natural science edition), 2002, 30(9):60-63.
[32] ZHANG B. Research on table recognition technology based on PDF text streams[D]. Beijing:Beijing University of Technology, 2010.
[33] ARISTARÁN M, TIGAS M, MERRILL J B. Tabula[EB/OL]. [2021-08-31]. https://tabula.technology/.
[34] RASTAN R, PAIK H Y, SHEPHERD J. TEXUS:a unified framework for extracting and understanding tables in PDF documents[J]. Information processing & management, 2019, 55(3):895-918.
[35] PEREZ-ARRIAGA M O, ESTRADA T, ABAD-MOTA S. TAO:system for table detection and extraction from PDF documents[C]//Proceedings of the 29th international Florida artificial intelligence research society conference. Florida:AAAI, 2016:591-596.
[36] SAS J, ZOLNIEREK A. Three-stage method of text region extraction from diagram raster images[J]. Advances in intelligent systems and computing, 2013, 226:527-538.
[37] BÖSCHEN F, SCHERP A. A comparison of approaches for automated text extraction from scholarly figures[C]//International conference on multimedia modeling. Reykjavik:Springer, 2017:15-27.
[38] CHIANG Y Y, KNOBLOCK C A. Recognizing text in raster maps[J]. Geoinformatica, 2015, 19(1):1-27.
[39] XU S H, KRAUTHAMMER M. A new pivoting and iterative text detection algorithm for biomedical images[M]. Elsevier Science, 2010.
[40] DE S, STANLEY R J, CHENG B, et al. Automated text detection and recognition in annotated biomedical publication images[J]. International journal of healthcare information systems and informatics, 2014, 9(2):34-63.
[41] HE F, WANG D, INNOKENTEVA Y, et al. Extracting molecular entities and their interactions from pathway figures based on deep learning[C]//2019 IEEE international conference on bioinformatics and biomedicine (BIBM). San Diego:IEEE, 2020:1191-1193.
[42] NAGY G. Learning the characteristics of critical cells from web tables[C]//International conference on pattern recognition. Tsukuba:IEEE, 2012:1554-1557.
[43] SETH S C, NAGY G. Segmenting tables via indexing of value cells by table headers[C]//International conference on document analysis and recognition. Washington, DC:IEEE, 2013:887-891.
[44] YU H, AGARWAL S, JOHNSTON M. Are figure legends sufficient? Evaluating the contribution of associated text to biomedical figure comprehension[J]. Journal of biomedical discovery & collaboration, 2009, 4(1):1-10.
[45] CHOUDHURY S R, MITRA P, KIRK A, et al. Figure metadata extraction from digital documents[C]//International conference on document analysis and recognition. Washington, DC:IEEE, 2013:135-139.
[46] LOPEZ L D, YU J, ARIGHI C N, et al. An automatic system for extracting figures and captions in biomedical PDF documents[C]//2011 IEEE international conference on bioinformatics and biomedicine. Atlanta:IEEE, 2011:578-581.
[47] RAMESH B P, SETHI R J, YU H, et al. Figure-associated text summarization and evaluation[J]. PLoS ONE, 2015, 10(2):e0115671.
[48] YU H. Towards answering biological questions with experimental evidence:automatically identifying text that summarize image content in full-text articles[C]//AMIA annual symposium proceedings. Washington, DC:AMIA, 2006:834-838.
[49] BHATIA S, MITRA P. Summarizing figures, tables and algorithms in scientific publications to augment search results[J]. ACM transactions on information systems, 2010, 30(1):1-24.
[50] MANNING C D, RAGHAVAN P, SCHÜTZE H. Introduction to information retrieval[M]. Beijing:Posts & Telecom Press, 2010.
[51] TURTLE H R, CROFT W B. Inference networks for document retrieval[C]//13th international conference on research and development in information retrieval. Brussels:ACM,1990:1-24.
[52] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[J]. Computer science, 2013, arXiv:1301.3781.
[53] ZHENG S, CHENG M M, WARRELL J, et al. Dense semantic image segmentation with objects and attributes[C]//2014 IEEE conference on computer vision and pattern recognition (CVPR). Columbus:IEEE, 2014:3214-3221.
[54] VEZHNEVETS A, FERRARI V, BUHMANN J M. Weakly supervised structured output learning for semantic segmentation[C]//2012 IEEE conference on computer vision and pattern recognition. Providence:IEEE, 2012:845-852.
[55] ZHANG H, FRITTS J E, GOLDMAN S A. Image segmentation evaluation:a survey of unsupervised methods[J]. Computer vision & image understanding, 2008, 110(2):260-280.
[56] PEDERSEN K S, LOOG M, DORST P. Salient point and scale detection by minimum likelihood[C]//Proceedings of machine learning research. Bletchley Park:PMLR, 2007:59-72.
[57] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision, 2004, 60(2):91-110.
[58] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]//IEEE computer society conference on computer vision & pattern recognition. San Diego:IEEE, 2005:886-893.
[59] NG R T, SEDIGHIAN A. Evaluating multidimensional indexing structures for images transformed by principal component analysis[C]//Proceedings volume 2670, storage and retrieval for still image and video databases IV. San Jose:SPIE, 1996:50-61.
[60] PHAM T T, MAILLOT N E, LIM J H, et al. Latent semantic fusion model for image retrieval and annotation[C]//Proceedings of the sixteenth ACM conference on information and knowledge management. Lisbon:ACM, 2007:439-444.
[61] INDYK P, MOTWANI R. Approximate nearest neighbors:towards removing the curse of dimensionality[C]//Proceedings of the 30th ACM symposium on theory of computing (STOC'98). Dallas:ACM, 1998:604-613.
[62] YANG Z B. Research on visual-semantic embedding based on deep learning and word embedding[D]. Chongqing:Southwest University, 2019.
[63] WANG H, ZHANG Y, JI Z, et al. Consensus-aware visual-semantic embedding for image-text matching[C]//2020 European conference on computer vision. Glasgow:Springer, 2020:18-34.
[64] WEN K, GU X, CHENG Q. Learning dual semantic relations with graph attention for image-text matching[J]. IEEE transactions on circuits and systems for video technology, 2020(99):1-1.
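[65] CHEN T, SHAN R R, LI H. Research on semantic annotation of image resources in digital humanities[J]. Journal of library and information science in agriculture, 2020, 32(9):6-14.
[66] BHAGAT P K, CHOUDHARY P. Image annotation:then and now[J]. Image and vision computing, 2018(80):1-23.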
[67] ADNAN M M, RAHIM M, REHMAN A, et al. Automatic image annotation based on deep learning models:a systematic review and future challenges[J]. IEEE access, 2021(9):50253-50264.
[68] MIAO R, TOTH R, ZHOU Y, et al. Quick annotator:an open-source digital pathology based rapid image annotation tool[J]. The journal of pathology, 2021, 7(6):542-547.
[69] DONG Q, LUO G, HAYNOR D, et al. DicomAnnotator:a configurable open-source software program for efficient DICOM image annotation[J]. Journal of digital imaging, 2020, 33(6):1514-1526.
[70] SUN T, DING P, HUANG Y W, et al. A review of the application of text mining technology in agricultural knowledge services[J]. Journal of library and information science in agriculture, 2021, 33(1):4-16.
[71] POCO J, HEER J. Reverse-engineering visualizations:recovering visual encodings from chart images[J]. Computer graphics forum, 2017, 36(3):353-363.
[72] KIM S, LIU Y. Functional-based table category identification in digital library[C]//2011 international conference on document analysis and recognition. Beijing:IEEE, 2011:1364-1368.
[73] SAVVA M, KONG N, CHHAJTA A, et al. ReVision:automated classification, analysis and redesign of chart images[C]//User interface software and technology. New York:ACM, 2011:393-402.
[74] NKWENTSHA X, HOUNKANRIN A, NICOLLS F. Automatic classification of medical X-ray images with convolutional neural networks[C]//2020 international SAUPEC/RobMech/PRASA conference. Cape Town:Springer, 2020:1-4.
[75] HUANG W, ZONG S, TAN C L, et al. Chart image classification using multiple-instance learning[C]//Workshop on applications of computer vision. Texas:ACM, 2007:27-27.
[76] PELKA O, FRIEDRICH C M. FHDO biomedical computer science group at medical classification task of ImageCLEF 2015[C]//Working notes of CLEF 2015 conference. Toulouse:CEUR-WS, 2015.
[77] LI P, SORENSEN S, KOLAGUNDA A, et al. UDEL CIS working notes in ImageCLEF 2016[C]//Working notes of CLEF 2016 conference. Portugal:CEUR-WS, 2016:334-346.
[78] CHHATKULI A, FONCUBIERTA-RODRÍGUEZ A, MARKONIS D, et al. Separating compound figures in journal articles to allow for subfigure classification[C]//Proceedings of SPIE medical imaging, advanced PACS-based imaging informatics and therapeutic applications. Orlando:SPIE, 2013:86740J.
[79] YUAN X, ANG D. A novel figure panel classification and extraction method for document image understanding[J]. International journal of data mining and bioinformatics, 2014, 9(1):22-36.
[80] LI P, JIANG X, KAMBHAMETTU C, et al. Segmenting compound biomedical figures into their constituent panels[C]//International conference of the cross-language evaluation forum for European languages. Dublin:Springer, 2017:199-210.
[81] TASCHWER M, MARQUES O. Compound figure separation combining edge and band separator detection[C]//International conference on multimedia modeling. Miami:Springer, 2016:162-173.
[82] SANTOSH K C, AAFAQUE A, ANTANI S, et al. Line segment-based stitched multipanel figure separation for effective biomedical CBIR[J]. International journal of pattern recognition and artificial intelligence, 2017, 31(6):1757003.
[83] YU Y H. Research on key technologies of image pattern recognition for medical literature[D]. Dalian:Dalian University of Technology, 2018.
[84] CRESTAN E, PANTEL P. Web-scale table census and classification[C]//Proceedings of the fourth ACM international conference on web search and data mining. Hong Kong:ACM, 2011:545-554.
[85] MURPHY R F, VELLISTE M, YAO J, et al. Searching online journals for fluorescence microscope images depicting protein subcellular location patterns[C]//IEEE international symposium on bioinformatics & bioengineering. Bethesda:IEEE, 2001:119-128.
[86] GERTZ M, SATTLER K U, GORIN F, et al. Annotating scientific images:a concept-based approach[C]//Proceedings 14th international conference on scientific and statistical database management. Los Alamitos:IEEE, 2002:59-68.
[87] EMAGE. Data annotation methods[EB/OL]. [2020-11-02]. http://www.emouseatlas.org/emage/about/data_annotation_methods.html#auto_eurexpress.
[88] TOO E C, YUJIAN L, NJUKI S, et al. A comparative study of fine-tuning deep learning models for plant disease identification[J]. Computers and electronics in agriculture, 2018, 161(1):272-279.
[89] BARBEDO J A. Plant disease identification from individual lesions and spots using deep learning[J]. Biosystems engineering, 2019, 180(1):96-107.
[90] KUHN T, NAGY M, LUONG T B, et al. Mining images in biomedical publications:detection and analysis of gel diagrams[J]. Journal of biomedical semantics, 2014, 5(1):1-9.
[91] ZHANG Z. Towards efficient and effective semantic table interpretation[C]//International semantic web conference. New York:Springer-Verlag, 2014:487-502.
[92] CAO H, BOWERS S, SCHILDHAUER M P. Approaches for semantically annotating and discovering scientific observational data[C]//Database and expert systems applications. Berlin:Springer, 2011:526-541.
[93] MARTIN M, NUFFELEN B, ABRUZZINI S, et al. The digital agenda scoreboard:a statistical anatomy of Europe's way into the information age[EB/OL]. [2021-05-02]. http://www.semantic-web-journal.net/sites/default/files/swj283.pdf.
[94] KEMBHAVI A, SALVATO M, KOLVE E, et al. A diagram is worth a dozen images[C]//Computer vision - ECCV 2016. Amsterdam:Springer, 2016:235-251.
[95] LEE P, YANG T S, WEST J, et al. Phyloparser:a hybrid algorithm for extracting phylogenies from dendrograms[C]//14th IAPR international conference on document analysis and recognition (ICDAR). Kyoto:IEEE, 2017:1087-1094.
[96] HE Y. Research and application of bar chart information extraction from PubMed Central literature[D]. Wuhan:Wuhan University of Technology, 2018.
[97] AGARWAL S, YU H. FigSum:automatically generating structured text summaries for figures in biomedical literature[C]//American medical informatics association annual symposium. San Francisco:PMC, 2009:6-10.
[98] SAINI N, SAHA S, POTNURU V, et al. Figure summarization:a multiobjective optimization-based approach[J]. Intelligent systems, 2019, 34(6):43-52.
[99] SAINI N, SAHA S, BHATTACHARYYA P, et al. Textual entailment-based figure summarization for biomedical articles[J]. ACM transactions on multimedia computing communications and applications, 2020, 16(1s):1-24.
[100] CHEN J, ZHUGE H. Extractive summarization of documents with images based on multi-modal RNN[J]. Future generation computer systems, 2019, 99(1):186-196.
[101] WU C F. Research on visual question answering based on relation modeling[D]. Beijing:Beijing University of Posts and Telecommunications, 2020.
[102] KAFLE K, PRICE B, COHEN S, et al. DVQA:understanding data visualizations via question answering[C]//2018 IEEE/CVF conference on computer vision and pattern recognition. Salt Lake City:IEEE, 2018:5648-5656.
[103] KAHOU S E, MICHALSKI V, ATKINSON A, et al. FigureQA:an annotated figure dataset for visual reasoning[J]. Computer science, 2018, arXiv:1710.07300.
[104] CHAUDHRY R, SHEKHAR S, GUPTA U, et al. LEAF-QA:locate, encode & attend for figure question answering[C]//2020 IEEE winter conference on applications of computer vision (WACV). Snowmass Village:IEEE, 2020:3512-3521.