simple-libfm-example-part1
Original: https://thierrysilbermann.wordpress.com/2015/02/11/simple-libfm-example-part1/
I often get emails from people who ask how to run libFM and have trouble understanding the whole pipeline. If you are versed in machine learning, a good first step is to read Steffen Rendle’s paper ‘Factorization Machines’, and this one too: ‘Factorization Machines with libFM’.
I’ll try to explain how to train different kinds of models with the 4 learning algorithms that libFM provides, and how to use libFM’s features (like grouping and relations).
But first, here is a toy example of what each file should look like. (It was originally posted in the libFM Google group.)
A simple example with 2 users and 3 items: the training set contains 2 users and 3 items, and we want to test on the same 2 users but with 4 items (the same 3 from training plus one new item).
Each user has a categorical feature age [“18-25”, “26-40”, “40-60”] and each item has a numerical feature price.
I one-hot encoded the users:
0 is User1
1 is User2
Same thing for items,
2 is Item1,
3 is Item2,
4 is Item3,
5 is Item4
The categorical feature age needs to be one-hot encoded too:
6 is the category “18-25”,
7 is the category “26-40”,
8 is the category “40-60”
And finally the numerical feature price for the item
9 will represent the price feature
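The index assignment above can be sketched in a few lines of Python. This is only an illustration: the helper name build_index and the (field, value) keys are my own choices, not part of libFM, which only ever sees the final integer indices.

```python
# Assign one contiguous block of feature indices per field, in the order
# users, items, age categories, price (the order used in this post).
users = ["User1", "User2"]
items = ["Item1", "Item2", "Item3", "Item4"]
ages = ["18-25", "26-40", "40-60"]


def build_index(*fields):
    """Map every (field, value) pair to a unique integer index, starting at 0."""
    index = {}
    for name, values in fields:
        for value in values:
            index[(name, value)] = len(index)
    return index


feature_index = build_index(
    ("user", users),
    ("item", items),
    ("age", ages),
    ("price", ["price"]),  # a numerical feature needs only a single index
)

print(feature_index[("user", "User2")])   # 1
print(feature_index[("item", "Item4")])   # 5
print(feature_index[("age", "40-60")])    # 8
print(feature_index[("price", "price")])  # 9
```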
One sample can be:
5 0:1 3:1 6:1 9:20
#User1, who is 23 years old, gives a rating of 5 to Item2, which costs 20 euros
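As a rough sketch, a sample like this one can be produced from a target and a dictionary of non-zero features. The function name to_libfm_line is hypothetical and only serves the illustration.

```python
def to_libfm_line(target, features):
    """Format one sample as 'target index:value index:value ...'.

    `features` maps a feature index to its value. Only non-zero features
    are listed, and this helper writes them in increasing index order.
    """
    parts = [f"{target:g}"]
    for idx in sorted(features):
        parts.append(f"{idx}:{features[idx]:g}")
    return " ".join(parts)


# User1 (index 0), Item2 (index 3), age "18-25" (index 6), price 20 (index 9)
line = to_libfm_line(5, {0: 1, 3: 1, 6: 1, 9: 20})
print(line)  # 5 0:1 3:1 6:1 9:20
```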
We can then construct more examples and create a training set and a test set.
train.libfm
5 0:1 2:1 6:1 9:12.5
5 0:1 3:1 6:1 9:20
4 0:1 4:1 6:1 9:78
1 1:1 2:1 8:1 9:12.5
1 1:1 3:1 8:1 9:20
num_features = 10 #Computed as the highest feature index (here 9, the item price) + 1, because indices start at 0
test.libfm
0 1:1 4:1 8:1 9:78
0 0:1 5:1 6:1
num_features = 10 #Computed as the highest feature index (here 9, the item price) + 1, because indices start at 0
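To double-check num_features, one option is to scan the samples and take the highest index plus one. The following is a minimal sketch with the two toy files inlined as lists of strings; the function name is my own.

```python
def num_features(lines):
    """Return (highest feature index + 1) over all samples."""
    highest = -1
    for line in lines:
        tokens = line.split()
        for token in tokens[1:]:  # tokens[0] is the target value
            idx = int(token.split(":")[0])
            highest = max(highest, idx)
    return highest + 1


train = [
    "5 0:1 2:1 6:1 9:12.5",
    "5 0:1 3:1 6:1 9:20",
    "4 0:1 4:1 6:1 9:78",
    "1 1:1 2:1 8:1 9:12.5",
    "1 1:1 3:1 8:1 9:20",
]
test = [
    "0 1:1 4:1 8:1 9:78",
    "0 0:1 5:1 6:1",
]

print(num_features(train))  # 10
print(num_features(test))   # 10
```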
For the test set, I have two samples for which I want predictions. The target value 0 has no real effect at test time: it is only useful if you know the true value, in which case libFM will report the RMSE on the test set, but it will not use it to train the model.
Just to be sure, here is the meaning of those two samples in test:
0 1:1 4:1 8:1 9:78
#Here User2, who is 41 years old, rates Item3, which costs 78 euros; we put a rating of 0 because we don’t know the real rating yet
0 0:1 5:1 6:1
#We want to know which rating User1, who is 23 years old, will give to the not-yet-seen Item4, whose price we don’t know
This format is the same as the one used by libSVM.
From here you have two files: train.libfm and test.libfm (the extension doesn’t matter)
You can then run libFM like this for regression (predicting ratings):
./libfm -task r -method mcmc -train train.libfm -test test.libfm -iter 10 -dim '1,1,2' -out output.libfm
So the model was trained using [MCMC (-method mcmc)] for [10 iterations (-iter 10)], with [a bias term, a linear model and pairwise factorization with 2 latent factors (-dim '1,1,2')].
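To make the three parts of -dim concrete, here is a small sketch of the second-order factorization machine prediction from Rendle’s paper: a bias, a linear term and a pairwise term with k = 2 latent factors. It is not libFM’s code, and the parameter values in the demo are made up purely to have something to evaluate.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction for a sparse sample.

    x  : dict mapping feature index -> value (non-zero entries only)
    w0 : global bias                       (first  '1' in -dim)
    w  : list of linear weights            (second '1' in -dim)
    V  : list of latent vectors, one per feature, each of length k
         (k = 2 for the '2' in -dim)
    """
    linear = sum(w[i] * value for i, value in x.items())
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        # Rendle's identity: sum over pairs i<j of v_if * v_jf * x_i * x_j
        # equals 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * value for i, value in x.items())
        s_sq = sum((V[i][f] * value) ** 2 for i, value in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Made-up parameters for the 10 features of the toy example (not learned).
w0 = 3.0
w = [0.1] * 10
V = [[0.1, 0.2] for _ in range(10)]

# User1, Item2, age "18-25", price 20
x = {0: 1.0, 3: 1.0, 6: 1.0, 9: 20.0}
print(fm_predict(x, w0, w, V))
```

The pairwise term is computed per latent factor rather than per pair of features, so the cost grows linearly with the number of non-zero features in the sample.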
You will then get some output on the command line, and the predictions will be written to the file ‘output.libfm’.
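To look at the predictions next to the test samples, something like the following could be used. It assumes that the output file holds one prediction per line, in the same order as the test file; the helper names are hypothetical.

```python
import os


def read_predictions(path):
    """Read one floating-point prediction per non-empty line."""
    with open(path) as fh:
        return [float(line) for line in fh if line.strip()]


def pair_with_samples(test_path, output_path):
    """Return (test sample line, prediction) pairs, in file order."""
    with open(test_path) as fh:
        samples = [line.strip() for line in fh if line.strip()]
    predictions = read_predictions(output_path)
    if len(samples) != len(predictions):
        raise ValueError("test file and output file have different lengths")
    return list(zip(samples, predictions))


# Only runs if both files from the toy example are in the current directory.
if os.path.exists("test.libfm") and os.path.exists("output.libfm"):
    for sample, prediction in pair_with_samples("test.libfm", "output.libfm"):
        print(f"{prediction:.3f}  <-  {sample}")
```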
Discussions
This is of course a toy example, but it shows you what you can use in libFM to train your model.
I wouldn’t recommend using the price feature like this; it is probably better to apply a transformation such as a log first, to avoid having a feature with large values. But I hope you get the point.
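As one possible version of that transformation, the price could be replaced by log(1 + price) before writing the files. This is just a sketch: the choice of log1p, the 4-decimal rounding and the helper names are assumptions of mine, and the same transformation would have to be applied to both the training and the test file.

```python
import math


def log_price(price):
    """Compress large prices; log1p keeps a price of 0 at 0."""
    return math.log1p(price)


def transform_line(line, price_index=9):
    """Rewrite one libFM line, replacing the price value by its log1p."""
    tokens = line.split()
    out = [tokens[0]]  # keep the target as-is
    for token in tokens[1:]:
        idx, value = token.split(":")
        if int(idx) == price_index:
            value = f"{log_price(float(value)):.4f}"
        out.append(f"{idx}:{value}")
    return " ".join(out)


for sample in ["5 0:1 2:1 6:1 9:12.5", "5 0:1 3:1 6:1 9:20", "4 0:1 4:1 6:1 9:78"]:
    print(transform_line(sample))
```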