Research on the Method of Judging Reference Document in Patent Invalidation Using GBDT

  • Guo Shiqi
  • Yun Qiang
  • Chen Liang
  • Zhou Jie
  • 1. Institute of Medical Information/Medical Library CAMS&PUMC, Beijing 100020;
    2. Institute of Scientific and Technical Information of China, Beijing 100038

Received date: 2020-03-10

  Revised date: 2020-10-12

  Online published: 2021-01-20


Abstract

[Purpose/significance] Reference documents are central to deciding whether a patent is granted or declared invalid. Traditional information retrieval methods have shortcomings for this task, and machine learning has rarely been applied to reference document retrieval. To address this, this paper introduces reference document information and constructs a model for determining patent relevance. [Method/process] Target patents and their reference documents from patent invalidation decisions were used as the data set, from which text similarity, co-occurring vocabulary, and co-word count features were extracted. A gradient boosting decision tree (GBDT) model was then used to recast reference document retrieval as a classification problem: deciding whether a candidate document is relevant to the target patent. [Result/conclusion] The results show that different data fields contribute differently to classification performance, with the description text reaching an F1 of 59%, and that classification with multiple integrated features is significantly better than classification with text similarity alone. Finally, the paper analyzes the misclassified cases and points out directions for future research.
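
The pairwise classification setup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the three features (cosine similarity, shared-term count, Jaccard overlap), the function names `pair_features` and `train_relevance_model`, and the hyperparameters are all hypothetical stand-ins for the paper's text similarity and co-word features. The classifier is scikit-learn's `GradientBoostingClassifier`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization. Chinese patent text would
    # need a proper word segmenter instead.
    return text.lower().split()


def pair_features(target_text, candidate_text):
    """Return [cosine similarity, shared-term count, Jaccard overlap]."""
    a = Counter(tokenize(target_text))
    b = Counter(tokenize(candidate_text))
    shared = set(a) & set(b)

    # Cosine similarity over raw term-frequency vectors.
    dot = sum(a[t] * b[t] for t in shared)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    cosine = dot / norm if norm else 0.0

    # Share of distinct terms that appear in both documents.
    union = set(a) | set(b)
    jaccard = len(shared) / len(union) if union else 0.0

    return [cosine, float(len(shared)), jaccard]


def train_relevance_model(pairs, labels):
    """Fit a GBDT classifier on (target, candidate) text pairs.

    labels holds 1 for a relevant reference document and 0 otherwise.
    """
    # Imported here so the feature code above runs without scikit-learn.
    from sklearn.ensemble import GradientBoostingClassifier

    X = [pair_features(target, candidate) for target, candidate in pairs]
    # Hyperparameters are illustrative, not taken from the paper.
    model = GradientBoostingClassifier(n_estimators=50, random_state=0)
    model.fit(X, labels)
    return model
```

Given a fitted model, a new candidate could be scored with `model.predict([pair_features(target_text, candidate_text)])`. Computing the same features separately per field (for example claims, abstract, and description) would be one way to compare how much each field contributes, as the abstract reports.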

Cite this article

Guo Shiqi, Yun Qiang, Chen Liang, Zhou Jie. Research on the Method of Judging Reference Document in Patent Invalidation Using GBDT[J]. Library and Information Service, 2021, 65(2): 117-125. DOI: 10.13266/j.issn.0252-3116.2021.02.012
