Library and Information Service (图书情报工作) ›› 2014, Vol. 58 ›› Issue (10): 145-148. DOI: 10.13266/j.issn.0252-3116.2014.10.026

• Knowledge Organization •

Research on Construction Methods for a Multi-Label User Classification System

Liu Zhongbao (刘忠宝)1, Zhao Wenjuan (赵文娟)2, Jia Junzhi (贾君枝)3

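  1. School of Computer and Control Engineering, North University of China;
  2. Business College of Shanxi University;
  3. School of Management, Shanxi University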
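  • Received: 2014-04-15 Revised: 2014-05-02 Online: 2014-05-20 Published: 2014-05-20
  • About the authors: Liu Zhongbao, lecturer at the School of Computer and Control Engineering, North University of China, E-mail: liu_zhongbao@hotmail.com; Zhao Wenjuan, lecturer at the Business College of Shanxi University; Jia Junzhi, professor at the School of Management, Shanxi University
  • Funding:

    This paper is one of the research outcomes of the project "Research on the Application of Pattern Recognition Technology in Personalized Information Services" (project No. XS2011005), funded by the Business College of Shanxi University.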

Research on the Construction Approach of Multi-Label User Classification Based on the Visited Pages

Liu Zhongbao1, Zhao Wenjuan2, Jia Junzhi3   

  1. School of Computer and Control Engineering, North University of China, Taiyuan 030051;
  2. School of Information, Business College of Shanxi University, Taiyuan 030031;
  3. School of Management, Shanxi University, Taiyuan 030006
  • Received: 2014-04-15 Revised: 2014-05-02 Online: 2014-05-20 Published: 2014-05-20

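Abstract:

To address the problem of a single instance belonging to multiple classes, this paper proposes a multi-label support vector machine and, on that basis, builds a multi-label user classification system based on visited pages. The system first preprocesses Web pages and extracts text features using manifold discriminant analysis, then classifies the text with the multi-label support vector machine, and finally evaluates the classification results. Experiments on real-world datasets demonstrate the effectiveness of the system.

Keywords: visited pages, user classification, multi-label, support vector machine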

Abstract:

To address the problem of one instance belonging to several classes, a Multi-Label Support Vector Machine (ML-SVM) is proposed, and a multi-label user classification system based on visited pages is constructed on top of it. First, the Web pages are preprocessed and page features are extracted by Manifold-based Discriminant Analysis (MDA). Then ML-SVM is used to classify the pages, and the classification results are evaluated. Experiments on real-world datasets verify the effectiveness of the constructed system.

Key words: visited pages, user classification, multi-label, Support Vector Machine
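
The pipeline summarized in the abstract (preprocess pages, extract features, classify with a multi-label SVM, evaluate) can be illustrated with a small sketch. This is not the authors' implementation, and several pieces are assumptions: plain bag-of-words counts stand in for MDA feature extraction, "binary relevance" (one independent linear SVM per label, trained with a Pegasos-style subgradient method) stands in for ML-SVM, and Hamming loss is assumed as the evaluation measure because this page does not name one. The toy pages, the labels, and all function names are invented for illustration.

```python
import random


def build_vocab(docs):
    """Assign a column index to every distinct token."""
    vocab = {}
    for tokens in docs:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(tokens, vocab):
    """Bag-of-words counts plus a trailing constant bias feature
    (a stand-in for MDA features, which are not reproduced here)."""
    vec = [0.0] * (len(vocab) + 1)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    vec[-1] = 1.0
    return vec


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic subgradient descent on the
    L2-regularised hinge loss; labels in y are +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            margin = y[i] * dot(w, X[i])              # computed before the update
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regulariser)
            if margin < 1.0:                          # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def train_binary_relevance(X, label_sets, labels):
    """One independent binary SVM per label (assumed stand-in for ML-SVM)."""
    return {
        lab: train_linear_svm(X, [1 if lab in s else -1 for s in label_sets])
        for lab in labels
    }


def predict(models, x):
    """Return every label whose SVM scores x on the positive side."""
    return {lab for lab, w in models.items() if dot(w, x) > 0}


def hamming_loss(true_sets, pred_sets, labels):
    """Fraction of (page, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(labels))


# Invented toy data: (tokens of a preprocessed page, set of labels).
pages = [
    (["football", "match", "score"], {"sports"}),
    (["stock", "market", "bank"], {"finance"}),
    (["football", "club", "stock", "market"], {"sports", "finance"}),
    (["match", "goal", "league"], {"sports"}),
    (["bank", "loan", "interest"], {"finance"}),
    (["league", "sponsor", "bank", "goal"], {"sports", "finance"}),
]
labels = ["sports", "finance"]

vocab = build_vocab(tokens for tokens, _ in pages)
X = [vectorize(tokens, vocab) for tokens, _ in pages]
truth = [labs for _, labs in pages]

models = train_binary_relevance(X, truth, labels)
predicted = [predict(models, x) for x in X]
print(predicted)
print("training Hamming loss:", hamming_loss(truth, predicted, labels))
```

Binary relevance is used here only because it is the simplest way to let one page carry several labels at once. It ignores correlations between labels, which a dedicated multi-label formulation may exploit.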

CLC number: