Knowledge Organization

Rule-based Chinese Person Names Identification in Ancient Chinese Literature of Annals-Biography (Jizhuan) Style

  • Huangfu Jing (皇甫晶),
  • Wang Lingyun (王凌云)
  • 1. Shaanxi University of Science and Technology Library, Xi'an 710021;
    2. Glodon Software Company Limited, Beijing 100193
Huangfu Jing is an assistant librarian at the Shaanxi University of Science and Technology Library and holds a master's degree; E-mail: huangfu774@163.com. Wang Lingyun is a development engineer at Glodon Software Company Limited and holds a master's degree.

Received date: 2012-11-05

  Revised date: 2012-11-28

  Online published: 2013-02-05

Cite this article

Huangfu Jing, Wang Lingyun. Rule-based Chinese Person Names Identification in Ancient Chinese Literature of Annals-Biography (Jizhuan) Style (基于规则的纪传体古代汉语文献姓名识别)[J]. Library and Information Service (图书情报工作), 2013, 57(03): 120-124. DOI: 10.7536/j.issn.0252-3116.2013.03.022

Abstract

This paper designs a model system that automatically identifies person names in ancient Chinese literature, and reports experiments on name identification in texts of the annals-biography (Jizhuan) style. The system is tested on the 15 volumes of the Book of Shu (蜀书) in the Records of the Three Kingdoms (三国志) by Chen Shou of the Jin dynasty, where it reaches a recall of 75.4% and a precision of 91.9%. The experiment shows that a rule-based method is feasible for identifying person names in ancient Chinese literature of the annals-biography (Jizhuan) style.
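To make the recall and precision figures in the abstract concrete, the following is a minimal sketch of how the two ratios are commonly computed for a name recognizer. It is not the authors' evaluation code: the function `precision_recall` and the toy name sets are hypothetical, and the sketch assumes scoring over distinct name strings, whereas the paper may have counted individual occurrences.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of name strings."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of recognized strings that are real names.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real names that the recognizer found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the paper's experiment.
gold_names = {"刘备", "关羽", "张飞", "诸葛亮", "赵云"}
# "汉中" is a place name, so it counts as a false positive.
predicted_names = {"刘备", "关羽", "诸葛亮", "汉中"}

precision, recall = precision_recall(predicted_names, gold_names)
print(f"precision={precision:.3f} recall={recall:.3f}")
# prints: precision=0.750 recall=0.600
```

In this toy case three of the four recognized strings are real names (precision 0.75) and three of the five real names are found (recall 0.6). The same pattern of precision above recall appears in the reported 91.9% and 75.4%.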

