Original article: https://thierrysilbermann.wordpress.com/2015/02/11/simple-libfm-example-part1/

I often get emails from people asking how to run libFM and struggling to understand the whole pipeline. If you are versed in machine learning, a good first step is to read Steffen Rendle’s paper ‘Factorization Machines’, and this one too: ‘Factorization Machines with libFM’.

I’ll try to explain how to train different kinds of models with the 4 learning algorithms that libFM provides, and how to use libFM’s features (like grouping and relations).

But first, here is a toy example of what each file should look like. (It was originally posted in the libFM Google group.)

A simple example with 2 users and 3 items. The training set has 2 users and 3 items; at test time we keep the same 2 users but have 4 items (the same 3 from training plus one new one).
Each user has a categorical feature age [“18-25”, “26-40”, “40-60”] and each item has a numerical feature price.

I one-hot encoded the users:

0 is User1
1 is User2

Same thing for items,

2 is Item1,
3 is Item2,
4 is Item3,
5 is Item4

The categorical feature age needs to be one-hot encoded too:

6 is the category “18-25”,
7 is the category “26-40”,
8 is the category “40-60”

And finally the numerical feature price for the item

9 will represent the price feature

One sample can be:

5 0:1 3:1 6:1 9:20
# User1, who is 23 years old, gives a rating of 5 to Item2, which costs 20 euros
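To make the encoding concrete, here is a minimal Python sketch that builds such a line from the index mapping above. The dictionary names and the `encode_sample` helper are hypothetical, introduced only for illustration; they are not part of libFM.

```python
# Index layout copied from the mapping described above (names are illustrative).
USER_INDEX = {"User1": 0, "User2": 1}
ITEM_INDEX = {"Item1": 2, "Item2": 3, "Item3": 4, "Item4": 5}
AGE_INDEX = {"18-25": 6, "26-40": 7, "40-60": 8}
PRICE_INDEX = 9


def format_value(value):
    """Render a number compactly, e.g. 20.0 -> '20' and 12.5 -> '12.5'."""
    return f"{value:g}"


def encode_sample(rating, user, item, age_group, price=None):
    """Return one libFM line: the target followed by index:value pairs."""
    features = [
        (USER_INDEX[user], 1),
        (ITEM_INDEX[item], 1),
        (AGE_INDEX[age_group], 1),
    ]
    if price is not None:  # an unknown price is simply left out of the line
        features.append((PRICE_INDEX, price))
    pairs = " ".join(f"{i}:{format_value(v)}" for i, v in sorted(features))
    return f"{format_value(rating)} {pairs}"


print(encode_sample(5, "User1", "Item2", "18-25", 20))
```

Calling `encode_sample(0, "User1", "Item4", "18-25")` with no price gives a line that only carries the three one-hot features.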

We can then construct examples and create a training set and a test set.

train.libfm

5 0:1 2:1 6:1 9:12.5
5 0:1 3:1 6:1 9:20
4 0:1 4:1 6:1 9:78
1 1:1 2:1 8:1 9:12.5
1 1:1 3:1 8:1 9:20

num_features = 10  # The highest feature index (here 9, the item price) + 1, because indices are expected to start at 0
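The rule above (highest feature index + 1) can be checked with a few lines of Python. This is only a sketch: the `num_features` helper is a hypothetical name, and it assumes every line follows the plain `target index:value ...` layout used here.

```python
def num_features(lines):
    """Return the highest feature index found in libFM-style lines, plus one."""
    highest = -1
    for line in lines:
        tokens = line.split()
        # tokens[0] is the target; the remaining tokens are index:value pairs
        for pair in tokens[1:]:
            index = int(pair.split(":", 1)[0])
            highest = max(highest, index)
    return highest + 1


train_lines = [
    "5 0:1 2:1 6:1 9:12.5",
    "5 0:1 3:1 6:1 9:20",
    "4 0:1 4:1 6:1 9:78",
    "1 1:1 2:1 8:1 9:12.5",
    "1 1:1 3:1 8:1 9:20",
]
print(num_features(train_lines))
```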

test.libfm

0 1:1 4:1 8:1 9:78
0 0:1 5:1 6:1

num_features = 10  # The highest feature index (here 9, the item price) + 1, because indices are expected to start at 0

For the test set, I have two samples I want predictions for. The target value 0 has no real effect at test time. (It is only useful if you have the true value: libFM will then output the RMSE on it, but will not use it to train the model.)
Just to be sure, here is the meaning of those two samples in test:

0 1:1 4:1 8:1 9:78
# Here User2, who is 41 years old, is rating Item3, which costs 78 euros; we put a rating of 0 because we don’t know the real rating yet

0 0:1 5:1 6:1
# We want to know which rating User1, who is 23 years old, will give to the not-yet-seen Item4, whose price we don’t know

This format is the same as the one libSVM uses.

From here you have two files: train.libfm and test.libfm (the extension doesn’t matter)
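If you prefer to generate the two files from a script, a minimal sketch could look like this. The `write_libfm` helper is a hypothetical name; the lines themselves are the ones listed above.

```python
from pathlib import Path

TRAIN_LINES = [
    "5 0:1 2:1 6:1 9:12.5",
    "5 0:1 3:1 6:1 9:20",
    "4 0:1 4:1 6:1 9:78",
    "1 1:1 2:1 8:1 9:12.5",
    "1 1:1 3:1 8:1 9:20",
]
TEST_LINES = [
    "0 1:1 4:1 8:1 9:78",
    "0 0:1 5:1 6:1",
]


def write_libfm(path, lines):
    """Write one sample per line and end the file with a newline."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    write_libfm("train.libfm", TRAIN_LINES)
    write_libfm("test.libfm", TEST_LINES)
```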

You can then run libFM like this for regression (predicting ratings):

./libfm -task r -method mcmc -train train.libfm -test test.libfm -iter 10 -dim '1,1,2' -out output.libfm

So the model was trained using MCMC (-method mcmc) for 10 iterations (-iter 10), with a bias term, a linear model and factorization with 2 latent factors (-dim '1,1,2').
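To see what the three parts of the model contribute, here is a small Python sketch of the factorization machine prediction from Rendle’s paper: a bias, a linear term, and pairwise interactions computed from latent factors. It illustrates the equation only, not libFM’s actual code, and all parameter values in the example are made up.

```python
def fm_predict(x, w0, w, V):
    """Factorization machine score for one sparse sample.

    x  : dict mapping feature index -> value (only non-zero features)
    w0 : global bias
    w  : list of linear weights, one per feature
    V  : list of latent factor vectors, one per feature (2 factors each here)
    """
    linear = sum(w[i] * value for i, value in x.items())
    pairwise = 0.0
    for f in range(len(V[0])):
        # 0.5 * ((sum_i v_if * x_i)^2 - sum_i (v_if * x_i)^2) sums v_if*v_jf*x_i*x_j over pairs i<j
        total = sum(V[i][f] * value for i, value in x.items())
        squares = sum((V[i][f] * value) ** 2 for i, value in x.items())
        pairwise += 0.5 * (total * total - squares)
    return w0 + linear + pairwise


# Made-up parameters for 10 features, only to exercise the function.
w = [0.0] * 10
V = [[0.0, 0.0] for _ in range(10)]
w[0], w[3] = 0.5, 0.25
V[0], V[3] = [1.0, 2.0], [3.0, 4.0]
print(fm_predict({0: 1, 3: 1}, w0=1.0, w=w, V=V))
```

With only features 0 and 3 active, the pairwise part reduces to the dot product of their two factor vectors.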

You will then get some output on the command line, and the predictions will be written to the file ‘output.libfm’.
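The same run can be driven from Python. The sketch below only reuses the flags from the command above. The helper names are hypothetical, and it assumes the output file holds one prediction per line, in the same order as the test file.

```python
import subprocess
from pathlib import Path


def build_command(train, test, out, iterations=10, dim="1,1,2", binary="./libfm"):
    """Argument list mirroring the regression command shown above."""
    return [
        binary,
        "-task", "r",
        "-method", "mcmc",
        "-train", str(train),
        "-test", str(test),
        "-iter", str(iterations),
        "-dim", dim,
        "-out", str(out),
    ]


def parse_predictions(text):
    """Assumption: one numeric prediction per non-empty line."""
    return [float(line) for line in text.splitlines() if line.strip()]


def run_libfm(train, test, out):
    """Run libFM, then return the predictions it wrote to `out`."""
    subprocess.run(build_command(train, test, out), check=True)
    return parse_predictions(Path(out).read_text())
```

A call such as `run_libfm("train.libfm", "test.libfm", "output.libfm")` would then return a list whose first element is the prediction for the first test sample. Passing the arguments as a list means the shell never sees them, so no quotes are needed around `1,1,2`.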

Discussions
This is of course a toy example, but it shows what you can use in libFM to train your model.
I wouldn’t recommend using the price feature as-is; a transformation such as a log avoids having a feature with large values. But I hope you get the point.
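As an illustration of such a transformation, this sketch rewrites the price value of one libFM line as log(1 + price). The choice of `log1p`, the 4-decimal rounding and the `log_scale_price` helper are assumptions made for the example, not a recommendation from libFM.

```python
import math

PRICE_INDEX = 9  # index used for the price feature in this example


def log_scale_price(line, price_index=PRICE_INDEX):
    """Replace the price value in one libFM line with log(1 + price)."""
    target, *pairs = line.split()
    scaled = []
    for pair in pairs:
        index, value = pair.split(":", 1)
        if int(index) == price_index:
            value = f"{math.log1p(float(value)):.4f}"
        scaled.append(f"{index}:{value}")
    return " ".join([target, *scaled])


print(log_scale_price("4 0:1 4:1 6:1 9:78"))
```

Lines without a price feature, like the second test sample, pass through unchanged. The same transformation has to be applied to both the training and the test file.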
