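Preface

Collaborative filtering recommender systems come in several variants, including user-based and item-based approaches. Today we read a paper on item-based collaborative filtering.

The paper is "Item-Based Collaborative Filtering Recommendation Algorithms".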

Abstract

Recommender systems apply knowledge discovery techniques to the problem of making personalized recommendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item-based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users.

推荐系统将知识发现技术应用于实时交互中,为信息、产品或服务提供个性化推荐。这些系统,特别是基于k近邻协同过滤的系统,在Web上取得了广泛的成功。近年来,网站可用信息量和访问量的急剧增长对推荐系统提出了严峻的挑战。这些挑战是:产生高质量的推荐,每秒为数百万用户和物品执行多次推荐,以及在数据稀疏的情况下实现高覆盖率。在传统的协同过滤系统中,工作量会随着参与者数量的增加而增加。我们需要新的推荐系统技术,即使面对非常大规模的问题,也能快速产生高质量的推荐。为了解决这些问题,我们探索了基于物品的协同过滤技术。基于物品的推荐技术首先通过分析用户-物品矩阵来识别不同物品之间的关系,然后利用这些关系间接地为用户计算推荐。

In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time providing better quality than the best available user-based algorithms.

本文分析了不同的基于项目的推荐生成算法。我们研究了计算物品间相似度的不同技术(例如物品间的相关度与物品向量间的余弦相似度),以及从中获得推荐的不同技术(例如加权和与回归模型)。最后,我们对结果进行实验评估,并与基本的k近邻方法进行比较。实验表明,基于物品的算法在性能上明显优于基于用户的算法,同时在质量上也优于现有的最好的基于用户的算法。

Sarwar B, Karypis G, Konstan J, et al. Item-based collaborative filtering recommendation algorithms[C]//Proceedings of the 10th international conference on World Wide Web. 2001: 285-295.

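Key points of the abstract

The abstract first describes the shortcomings of traditional k-nearest-neighbor collaborative filtering: the rapid growth of the Web, in both available information and number of visitors, puts heavy pressure on recommender systems. The paper then studies techniques for computing item-item similarity and different techniques for deriving recommendations from those similarities. Finally, it evaluates the results experimentally and compares them with the basic k-nearest-neighbor approach. The experiments suggest that item-based algorithms outperform user-based algorithms in both performance and recommendation quality.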
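The two similarity measures and the weighted-sum prediction named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy `ratings` data, the function names, and the choice to compute both measures only over users who rated both items are all assumptions made for this example.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. All names and values are made up.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 3, "B": 1, "C": 2},
    "u3": {"A": 4, "B": 3, "D": 5},
    "u4": {"B": 2, "C": 5, "D": 4},
}

def co_rated(ratings, i, j):
    """Return (rating_i, rating_j) pairs from users who rated both items."""
    return [(r[i], r[j]) for r in ratings.values() if i in r and j in r]

def cosine_sim(ratings, i, j):
    """Cosine of the angle between the two item vectors (co-rated users only)."""
    pairs = co_rated(ratings, i, j)
    dot = sum(a * b for a, b in pairs)
    norm_i = sqrt(sum(a * a for a, _ in pairs))
    norm_j = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0

def correlation_sim(ratings, i, j):
    """Pearson correlation between the two items over co-rated users."""
    pairs = co_rated(ratings, i, j)
    if len(pairs) < 2:
        return 0.0
    mean_i = sum(a for a, _ in pairs) / len(pairs)
    mean_j = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_i) * (b - mean_j) for a, b in pairs)
    var_i = sum((a - mean_i) ** 2 for a, _ in pairs)
    var_j = sum((b - mean_j) ** 2 for _, b in pairs)
    return cov / sqrt(var_i * var_j) if var_i and var_j else 0.0

def predict_weighted_sum(ratings, user, target, sim):
    """Predict the user's rating of `target` as a similarity-weighted
    average of the ratings the user gave to other items."""
    num = den = 0.0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        s = sim(ratings, target, item)
        num += s * rating
        den += abs(s)
    return num / den if den else None

print(cosine_sim(ratings, "A", "B"))
print(correlation_sim(ratings, "A", "B"))
print(predict_weighted_sum(ratings, "u3", "C", cosine_sim))
```

Passing the similarity function as a parameter makes it easy to swap cosine for correlation and compare the resulting predictions on the same data.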

Introduction

The amount of information in the world is increasing far more quickly than our ability to process it. All of us have known the feeling of being overwhelmed by the number of new books, journal articles, and conference proceedings coming out each year. Technology has dramatically reduced the barriers to publishing and distributing information. Now it is time to create the technologies that can help us sift through all the available information to find that which is most valuable to us.

世界上信息量的增长速度远远超过了我们处理信息的能力。我们都有过被每年涌现的新书、期刊文章和会议记录所淹没的感觉。科技极大地减少了出版和传播信息的障碍。现在是时候创造一种技术,帮助我们筛选所有可用的信息,找到对我们最有价值的信息。

One of the most promising such technologies is collaborative filtering [19,27,14,16]. Collaborative filtering works by building a database of preferences for items by users. A new user, Neo, is matched against the database to discover neighbors, which are other users who have historically had similar taste to Neo. Items that the neighbors like are then recommended to Neo, as he will probably also like them. Collaborative filtering has been very successful in both research and practice, and in both information filtering applications and E-commerce applications. However, there remain important research questions in overcoming two fundamental challenges for collaborative filtering recommender systems.

其中最有前途的技术之一是协同过滤。协同过滤的工作原理是建立用户对项目的偏好数据库。将新用户Neo与数据库进行匹配,以发现邻居,这些邻居是历史上与Neo有着相似品味的其他用户。邻居喜欢的物品会被推荐给Neo,因为他可能也会喜欢这些物品。协同过滤在信息过滤应用和电子商务应用中都取得了很大的成功。然而,在克服协同过滤推荐系统的两个基本挑战方面,仍然存在重要的研究问题。
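The neighbor-matching process described above can be sketched as a small user-based example. It is only an illustration under stated assumptions: the `prefs` data, the function names, the cosine measure over co-rated items, and the "liked means rating of at least 4" threshold are all hypothetical choices, not taken from the paper.

```python
from math import sqrt

# Hypothetical preference database: user -> {item: rating on a 1-5 scale}.
prefs = {
    "neo":     {"matrix": 5, "inception": 4, "titanic": 1},
    "trinity": {"matrix": 5, "inception": 5, "alien": 4, "titanic": 1},
    "cypher":  {"matrix": 1, "titanic": 5, "notebook": 4},
    "tank":    {"matrix": 4, "inception": 4, "blade_runner": 5},
}

def user_similarity(a, b):
    """Cosine similarity over items both users rated (ratings assumed > 0)."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def nearest_neighbors(prefs, user, k=2):
    """Scan every other user and keep the k most similar ones."""
    scored = [(user_similarity(prefs[user], row), other)
              for other, row in prefs.items() if other != user]
    return sorted(scored, reverse=True)[:k]

def recommend_for(prefs, user, k=2, liked=4):
    """Recommend unseen items that the neighbors rated at least `liked`."""
    seen = prefs[user]
    scores = {}
    for sim, other in nearest_neighbors(prefs, user, k):
        for item, rating in prefs[other].items():
            if item not in seen and rating >= liked:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

print(nearest_neighbors(prefs, "neo"))
print(recommend_for(prefs, "neo"))
```

Note that `nearest_neighbors` compares the target user against every other user at request time, so its cost grows with the number of users.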

The first challenge is to improve the scalability of the collaborative filtering algorithms. These algorithms are able to search tens of thousands of potential neighbors in real-time, but the demands of modern systems are to search tens of millions of potential neighbors. Further, existing algorithms have performance problems with individual users for whom the site has large amounts of information. For instance, if a site is using browsing patterns as indications of content preference, it may have thousands of data points for its most frequent visitors. These "long user rows" slow down the number of neighbors that can be searched per second, further reducing scalability.

第一个挑战是提高协同过滤算法的可扩展性。这些算法能够实时搜索数以万计的潜在邻居,但现代系统的需求是搜索数以千万计的潜在邻居。此外,对于网站掌握了其大量信息的个别用户,现有算法存在性能问题。例如,如果一个网站使用浏览模式作为内容偏好的指示,那么对于最频繁的访问者,它可能拥有数千个数据点。这些“长用户行”减少了每秒可以搜索的邻居数量,进一步降低了可扩展性。

The second challenge is to improve the quality of the recommendations for the users. Users need recommendations they can trust to help them find items they will like. Users will "vote with their feet" by refusing to use recommender systems that are not consistently accurate for them.

第二个挑战是提高用户推荐的质量。用户需要他们信任的推荐来帮助他们找到他们喜欢的东西。用户将“用脚投票”,拒绝使用对他们来说不始终准确的推荐系统。

In some ways these two challenges are in conflict, since the less time an algorithm spends searching for neighbors, the more scalable it will be, and the worse its quality. For this reason, it is important to treat the two challenges simultaneously so the solutions discovered are both useful and practical.

在某些方面,这两个挑战是相互冲突的,因为算法搜索邻居的时间越少,它的可扩展性就越强,质量就越差。因此,同时处理这两个挑战非常重要,这样所发现的解决方案才既有用又实用。

In this paper, we address these issues of recommender systems by applying a different approach: the item-based algorithm. The bottleneck in conventional collaborative filtering algorithms is the search for neighbors among a large user population of potential neighbors [12]. Item-based algorithms avoid this bottleneck by exploring the relationships between items first, rather than the relationships between users. Recommendations for users are computed by finding items that are similar to other items the user has liked. Because the relationships between items are relatively static, item-based algorithms may be able to provide the same quality as the user-based algorithms with less online computation.

在本文中,我们通过应用一种不同的方法(基于项目的算法)来解决推荐系统的这些问题。传统协同过滤算法的瓶颈是在大量潜在邻居用户群中搜索邻居 [12]。基于项目的算法通过首先探索项目之间的关系,而不是用户之间的关系,来避免这个瓶颈。对用户的推荐是通过查找与用户喜欢的其他物品相似的物品来计算的。因为项目之间的关系是相对静态的,基于项目的算法可能能够以较少的在线计算提供与基于用户的算法相同的质量。
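One way to picture "less online computation" is to split the work into an offline step that precomputes item-item similarities and an online step that only looks them up. The sketch below assumes this split for illustration; `build_item_model`, `recommend`, the top-`k` truncation, and the toy data are hypothetical and are not described in the passage itself.

```python
from collections import defaultdict
from math import sqrt

# Toy ratings: user -> {item: rating}. All names and values are made up.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 3, "B": 1, "C": 2},
    "u3": {"A": 4, "B": 3, "D": 5},
    "u4": {"B": 2, "C": 5, "D": 4},
}

def build_item_model(ratings, k=2):
    """Offline step: for each item, keep its k most similar items."""
    # Invert the data to item -> {user: rating}.
    by_item = defaultdict(dict)
    for user, row in ratings.items():
        for item, rating in row.items():
            by_item[item][user] = rating

    model = {}
    for i, col_i in by_item.items():
        sims = []
        for j, col_j in by_item.items():
            if i == j:
                continue
            common = col_i.keys() & col_j.keys()
            if not common:
                continue
            dot = sum(col_i[u] * col_j[u] for u in common)
            norm_i = sqrt(sum(col_i[u] ** 2 for u in common))
            norm_j = sqrt(sum(col_j[u] ** 2 for u in common))
            if norm_i and norm_j:
                sims.append((dot / (norm_i * norm_j), j))
        model[i] = sorted(sims, reverse=True)[:k]
    return model

def recommend(model, user_row, n=3):
    """Online step: look up precomputed neighbors of the items the user
    rated; no search over other users is needed."""
    scores = defaultdict(float)
    for item, rating in user_row.items():
        for sim, other in model.get(item, []):
            if other not in user_row:
                scores[other] += sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:n]

model = build_item_model(ratings, k=2)   # rebuilt occasionally, offline
print(recommend(model, {"A": 5, "B": 3}))  # cheap per-request lookup
```

In this sketch the online cost depends on how many items the user has rated and on `k`, not on the total number of users, which is the property the passage attributes to item relationships being relatively static.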

Closing

That is all for today's reading. Today's focus was on the relevant concepts and background; I will add more detail next time.


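2024-01-28 18:05:28, Sunday

I have been busy for the past few days and forgot to upload the supplementary content; I have time today to add it.

Supplement: for the supplementary content, see the follow-up post "Supplement: Item-Based Collaborative Filtering Recommendation Algorithms".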
