[Purpose/significance] This paper proposes a method that uses natural language processing to automatically construct a research-front topic map from the text of science and technology plans. [Method/process] First, information extraction and topic identification techniques from natural language processing are used to mine topics from science and technology planning texts. Then, a mining tool developed in Java is used to construct the map of scientific research-front topics and display it visually. [Result/conclusion] An empirical study of the carbon nanotube research field shows that the method can map the research fronts of that field comprehensively, quickly, and accurately.