An Automatic Marking Approach for Subjective Questions Based on Short Text Similarity Computing

  • Zhang Junsheng
  • Shi Chongde
  • Xu Hongjiao
  • Gao Yingfan
  • He Yanqing
  • Institute of Scientific and Technical Information of China, Beijing 100038

Received date: 2014-07-24

  Revised date: 2014-09-16

  Online published: 2014-10-05

Abstract

The key to automatically marking text-based subjective questions is accurately computing the short text similarity between a student's answer and the standard answer. This paper proposes a short text similarity method that combines manual standards for similarity judgment with word sets, word order, and synonyms, and designs and implements a corresponding automatic marking system for subjective questions. It builds a base of manual grading standards for test items, and the experimental data set is drawn from 387 real test questions and answers in the bank training domain. In the experiments, the automatic marks are identical to the manual marks in 58% of cases, and the acceptable marking accuracy is about 80%.
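
As an illustration of how word sets, word order, and synonyms might be combined into one score, the following is a minimal Python sketch. It is not the authors' method: the abstract does not give their formulas, so the coverage measure, the longest-common-subsequence order measure, the 0.3 order weight, the toy synonym table, and all function names are assumptions chosen for illustration. Answers are assumed to be already segmented into word lists.

```python
from typing import Dict, List, Set

# Hypothetical toy synonym table (an assumption for this sketch); a real
# system would load a thesaurus instead.
SYNONYMS: Dict[str, Set[str]] = {
    "loan": {"credit"},
    "credit": {"loan"},
}


def same_word(a: str, b: str) -> bool:
    """Two words match if they are identical or listed as synonyms."""
    return a == b or b in SYNONYMS.get(a, set())


def word_set_similarity(student: List[str], standard: List[str]) -> float:
    """Share of distinct standard-answer words covered by the student answer."""
    std = set(standard)
    if not std:
        return 0.0
    stu = set(student)
    matched = sum(1 for w in std if any(same_word(w, s) for s in stu))
    return matched / len(std)


def word_order_similarity(student: List[str], standard: List[str]) -> float:
    """Longest common subsequence (synonym-aware) over standard-answer length."""
    if not standard:
        return 0.0
    prev = [0] * (len(standard) + 1)
    for s in student:
        cur = [0]
        for j, w in enumerate(standard, 1):
            if same_word(w, s):
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / len(standard)


def similarity(student: List[str], standard: List[str],
               order_weight: float = 0.3) -> float:
    """Weighted mix of word-set and word-order similarity (weight is assumed)."""
    return ((1 - order_weight) * word_set_similarity(student, standard)
            + order_weight * word_order_similarity(student, standard))


def mark(student: List[str], standard: List[str], full_marks: int) -> int:
    """Scale the similarity to the item's full marks and round to an integer."""
    return round(similarity(student, standard) * full_marks)


if __name__ == "__main__":
    standard = ["bank", "approves", "loan"]
    print(mark(["bank", "approves", "credit"], standard, 10))
    print(mark(["loan", "approves", "bank"], standard, 10))
```

In this sketch a synonym substitution costs nothing, while reordering the words lowers only the order component, so a scrambled answer that uses the right words still receives most of the marks. How those factors should actually be weighted against the manual grading standards is a design decision the abstract leaves open.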

Cite this article

Zhang Junsheng, Shi Chongde, Xu Hongjiao, Gao Yingfan, He Yanqing. An Automatic Marking Approach for Subjective Questions Based on Short Text Similarity Computing[J]. Library and Information Service, 2014, 58(19): 31-38. DOI: 10.13266/j.issn.0252-3116.2014.19.005

