Identification and Clustering of Chinese Same Name Authority Records on Work Relations Extending

  • Wang Ruiyun,
  • Jia Junzhi
  • School of Economics and Management, Shanxi University, Taiyuan 030006

Received date: 2017-01-03

Revised date: 2017-02-18

Online published: 2017-03-05

Abstract

[Purpose/significance] The result sets returned by CNASS searches contain too many records, and the search service neither clusters nor links them. This paper addresses that problem. [Method/process] We transformed personal name authority records into an entity-attribute-relation RDF representation following the FRBR-LRM framework, and extended the work relations of each record using the linked LC and VIAF records. We then designed identification and clustering algorithms for Chinese same-name authority records that make full use of the extended work relations to improve the efficiency of identification and clustering. [Result/conclusion] In an experiment on 300 result sets obtained by searching Chinese names on CNASS, we statistically analyzed the number of clusters and the number of records in the largest cluster. The results indicate that the identification and clustering method is effective.
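The abstract does not spell out the algorithms, so the following is only a rough illustration of the general idea of clustering same-name records by their work relations, not the authors' method. It assumes, as a simplification, that two records refer to the same person when they share at least one work title. The function name, record identifiers, and work titles are all hypothetical.

```python
from collections import defaultdict


def cluster_by_shared_works(records):
    """Group record ids that share at least one work title.

    `records` maps a record id to a set of work titles (hypothetical input
    shape). Records are merged with a union-find structure; the result is a
    list of clusters, largest first.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Remember the first record seen for each work; later records listing
    # the same work are merged into that record's cluster.
    first_seen = {}
    for rid, works in records.items():
        for work in works:
            if work in first_seen:
                union(first_seen[work], rid)
            else:
                first_seen[work] = rid

    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)

    # Sort members within each cluster, then clusters by size (descending).
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical result set for one name search.
sample = {
    "r1": {"Work A", "Work B"},
    "r2": {"Work B"},
    "r3": {"Work C"},
    "r4": {"Work C", "Work D"},
    "r5": set(),  # no work relations: stays in its own cluster
}
clusters = cluster_by_shared_works(sample)
cluster_count = len(clusters)
max_cluster_size = len(clusters[0])
```

The two values computed at the end, the number of clusters and the size of the largest cluster, correspond to the quantities the abstract reports analyzing. Extending a record's work set (for example with works drawn from a linked record) gives sparse records more chances to merge, which is the intuition behind the extension step.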

Cite this article

Wang Ruiyun, Jia Junzhi. Identification and Clustering of Chinese Same Name Authority Records on Work Relations Extending[J]. Library and Information Service, 2017, 61(5): 125-131. DOI: 10.13266/j.issn.0252-3116.2017.05.017

