图书情报工作 (Library and Information Service), 2018, Vol. 62, Issue (24): 124-133. DOI: 10.13266/j.issn.0252-3116.2018.24.016

• Knowledge Organization •

多维度属性加权分析的微博用户聚类研究

张海涛1,2, 唐诗曼1, 魏明珠1, 李泽中1   

  1. 吉林大学管理学院 长春 130022;
  2. 吉林大学信息资源研究中心 长春 130022
  • 收稿日期:2018-05-16 修回日期:2018-07-23 出版日期:2018-12-20 发布日期:2018-12-20
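  • About the authors: Zhang Haitao (ORCID: 0000-0002-9421-8187), professor, doctoral supervisor, E-mail: zhtinfo@126.com; Tang Shiman (ORCID: 0000-0002-4355-7963), master's student; Wei Mingzhu (ORCID: 0000-0001-8430-7461), master's student; Li Zezhong (ORCID: 0000-0002-1970-5815), doctoral student.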

Research on the Clustering of Microblog Users Based on Multi-dimensional Attribute Weighting Analysis

Zhang Haitao1,2, Tang Shiman1, Wei Mingzhu1, Li Zezhong1   

  1. The Management College of Jilin University, Changchun 130022;
  2. The Information Resource Research Center of Jilin University, Changchun 130022
  • Received:2018-05-16 Revised:2018-07-23 Online:2018-12-20 Published:2018-12-20

摘要: [目的/意义]准确把握社交网络用户兴趣倾向,对用户进行分类并形成高聚合的用户群,对研究社交网络信息生态以及信息推荐有重大意义。[方法/过程]通过构造基于多维度的用户属性描述层次模型,根据模型数据需求从新浪微博抓取用户样本数据,对相关用户背景信息、用户博文信息以及用户行为信息的多维度属性下二阶变量进行量化,构造用户向量表达式,比较单一维度与多维度下的用户分类效果,进一步给属性赋予不同的权重值进行加权分析,在取得最优聚类效果后进行方差分析,对模型进行改进。[结果/结论]基于多维度属性加权后的用户聚类效果明显高于单一维度及多维度非加权条件下的用户聚类,且用户博文内容维度对于提高用户聚类效果的有效性最大。

关键词: 微博, 多维度, 用户聚类, 加权分析

Abstract: [Purpose/significance] Accurately grasping the interest tendencies of social network users, classifying those users, and forming highly cohesive user groups is of great significance for research on social network information ecology and for information recommendation. [Method/process] This paper constructs a hierarchical model that describes user attributes along multiple dimensions and, following the model's data requirements, crawls user sample data from Sina Weibo. The second-order variables under three attribute dimensions (user background information, user post information, and user behavior information) are quantified to construct a vector representation of each user. User classification results under single-dimension and multi-dimension settings are compared, and the attributes are then assigned different weights for weighted analysis. Once the optimal clustering result is obtained, analysis of variance is performed on it to improve the model. [Result/conclusion] User clustering based on weighted multi-dimensional attributes performs clearly better than clustering based on a single dimension or on unweighted multi-dimensional attributes, and the post content dimension contributes the most to improving user clustering.

Key words: microblogs, multi-dimensional, user clustering, weighted analysis
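
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the weight values, or the concrete variables, so everything below is an assumption made for illustration: plain k-means stands in for the clustering step, the dimension names and weights in `WEIGHTS` are hypothetical, and the four users are synthetic toy data, not the paper's sample.

```python
import math

# Hypothetical dimension weights; the abstract does not report actual values.
WEIGHTS = {"background": 0.2, "content": 0.5, "behavior": 0.3}

def weighted_vector(user, weights):
    """Concatenate per-dimension features, scaling each by sqrt(weight) so that
    squared Euclidean distance equals the weighted sum of per-dimension
    squared distances."""
    vec = []
    for dim in sorted(weights):
        scale = math.sqrt(weights[dim])
        vec.extend(scale * x for x in user[dim])
    return vec

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization
    (an assumed stand-in for the paper's clustering step)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic users with features already normalized to [0, 1].
users = [
    {"background": [0.1, 0.9], "content": [0.9, 0.1], "behavior": [0.5]},
    {"background": [0.8, 0.2], "content": [0.8, 0.2], "behavior": [0.4]},
    {"background": [0.2, 0.8], "content": [0.1, 0.9], "behavior": [0.6]},
    {"background": [0.9, 0.1], "content": [0.2, 0.8], "behavior": [0.5]},
]
labels = kmeans([weighted_vector(u, WEIGHTS) for u in users], k=2)
print(labels)  # [0, 0, 1, 1]
```

With these toy weights the content dimension dominates, so users 0 and 1 are grouped together; putting all the weight on the background dimension instead would group users 0 and 2. This only shows how attribute weights can change the grouping and says nothing about the weights or results reported in the paper.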

CLC number: