Research on Evolution of Knowledge Category Structure in Wikipedia

  • Xu Shengguo,
  • Liu Xu
  • 1. School of Management Science and Engineering, Dalian University of Technology, Dalian 116024;
    2. Dalian University of Technology Library, Dalian 116085

Received date: 2014-03-05

Revised date: 2014-03-20

Online published: 2014-04-05


The structure formed by Wikipedia category pages is a new kind of knowledge category structure: it is spontaneous, self-organized, and collaboratively edited, and it has therefore attracted considerable research attention. Based on the characteristics of Wikipedia category pages, we define this structure as the category structure of domain knowledge, in which each category page corresponds to a node and the relationships between pages are expressed as edges. Using a diversity entropy model based on self-avoiding random walks, we compute a diversity entropy value for each node and identify the central nodes and the frontier nodes in the category structure of domain knowledge, which in turn supports the study of how nodes evolve. The results show that, in most cases, the higher a node's diversity entropy, the more likely it is to be a central node and the earlier it was created. Moreover, nodes that belong to the same community structure show similar trends in their diversity entropy values. These findings can help reveal the development status and hot areas of domain knowledge and improve the knowledge categories in Wikipedia, which may in turn promote knowledge innovation.
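
As an illustration of the diversity entropy idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used definition in which the entropy of the endpoint distribution of h-step self-avoiding walks from a node is normalized by log(N - 1), where N is the number of nodes. The function names, the toy category graph, and the choice to discard walks that get stuck before h steps (renormalizing the rest) are all assumptions made for this example.

```python
import math
from collections import defaultdict


def endpoint_distribution(graph, start, h):
    """Exact endpoint distribution of h-step self-avoiding walks from start.

    graph maps each node to a list of neighbours. At every step the walker
    picks uniformly among neighbours it has not visited yet.
    Assumption: walks that get stuck before h steps are discarded and the
    remaining probabilities are renormalized.
    """
    probs = defaultdict(float)

    def walk(node, visited, steps_left, p):
        if steps_left == 0:
            probs[node] += p
            return
        options = [n for n in graph[node] if n not in visited]
        if not options:
            return  # stuck walk: its probability mass is dropped
        share = p / len(options)
        for n in options:
            walk(n, visited | {n}, steps_left - 1, share)

    walk(start, frozenset([start]), h, 1.0)
    total = sum(probs.values())
    if total == 0:
        return {}
    return {n: p / total for n, p in probs.items()}


def diversity_entropy(graph, start, h):
    """Normalized Shannon entropy of the h-step endpoint distribution."""
    probs = endpoint_distribution(graph, start, h)
    n_nodes = len(graph)
    if n_nodes <= 2 or not probs:
        return 0.0
    raw = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return raw / math.log(n_nodes - 1)


# Hypothetical toy category graph: one hub category linked to three others.
toy_graph = {
    "Science": ["Physics", "Chemistry", "Biology"],
    "Physics": ["Science"],
    "Chemistry": ["Science"],
    "Biology": ["Science"],
}

hub_entropy = diversity_entropy(toy_graph, "Science", 1)
leaf_entropy = diversity_entropy(toy_graph, "Physics", 1)
```

In this toy graph the hub reaches three categories with equal probability in one step, so its normalized entropy is at the maximum, while a leaf can only reach the hub and scores zero. That matches the abstract's reading that higher diversity entropy points to more central nodes.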

Cite this article

Xu Shengguo, Liu Xu. Research on Evolution of Knowledge Category Structure in Wikipedia[J]. Library and Information Service, 2014, 58(07): 119-124. DOI: 10.13266/j.issn.0252-3116.2014.07.020



