Original post: https://thierrysilbermann.wordpress.com/2015/02/11/simple-libfm-example-part1/

I often get emails from people asking me how to run libFM and having trouble understanding the whole pipeline. If you are versed in machine learning, a good first step is to read Steffen Rendle's paper 'Factorization Machines', and this one too: 'Factorization Machines with libFM'.

I'll try to explain how to train different kinds of models with the 4 learning algorithms that libFM provides, and how to use libFM's features (like grouping and relations).

But first, here is a toy example of what each file should look like. (It was originally posted in the libFM Google group.)

A simple example with 2 users and 3 items. The training set contains 2 users and 3 items. At test time we have the same 2 users but 4 items (the same 3 from training plus one new one).
Each user has a categorical feature age [“18-25”, “26-40”, “40-60”] and each item has a numerical feature price.

I one-hot encoded the users:

0 is User1
1 is User2

Same thing for items,

2 is Item1,
3 is Item2,
4 is Item3,
5 is Item4

The categorical feature age needs to be one-hot encoded too:

6 is the category “18-25”,
7 is the category “26-40”,
8 is the category “40-60”

And finally the numerical feature price for the item

9 will represent the price feature

One sample can be:

5 0:1 3:1 6:1 9:20
#User1 who is 23yo is giving a rating of 5 on Item2 which costs 20 euros
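
The index scheme above can be sketched in a few lines of Python. This is only an illustration: the names USERS, ITEMS, AGES, PRICE_INDEX and encode_sample are hypothetical helpers, not part of libFM, which only ever sees the final text line.

```python
# Hypothetical lookup tables mirroring the indices listed in the text.
USERS = {"User1": 0, "User2": 1}
ITEMS = {"Item1": 2, "Item2": 3, "Item3": 4, "Item4": 5}
AGES = {"18-25": 6, "26-40": 7, "40-60": 8}
PRICE_INDEX = 9


def encode_sample(rating, user, item, age, price=None):
    """Return one line in libFM/libSVM text format: 'target index:value ...'."""
    # One-hot features get the value 1 at the index of the active category.
    features = [(USERS[user], 1), (ITEMS[item], 1), (AGES[age], 1)]
    if price is not None:  # an unknown price is simply left out of the line
        features.append((PRICE_INDEX, price))
    features.sort()
    return " ".join([str(rating)] + [f"{index}:{value:g}" for index, value in features])


print(encode_sample(5, "User1", "Item2", "18-25", 20))  # prints: 5 0:1 3:1 6:1 9:20
```

Leaving the price out, as in encode_sample(0, "User1", "Item4", "18-25"), gives a line with only the three one-hot features.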

We can then construct more samples like this one and build a training set and a test set.

train.libfm

5 0:1 2:1 6:1 9:12.5
5 0:1 3:1 6:1 9:20
4 0:1 4:1 6:1 9:78
1 1:1 2:1 8:1 9:12.5
1 1:1 3:1 8:1 9:20

num_features = 10  #Computed as the highest feature index (here 9, the item price) + 1, because indices start at 0

test.libfm

0 1:1 4:1 8:1 9:78
0 0:1 5:1 6:1

num_features = 10  #Computed as the highest feature index (here 9, the item price) + 1, because indices start at 0
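
The "highest index + 1" rule can be checked with a small script. This is a sketch that assumes every line follows the 'target index:value ...' layout shown here. The function name num_features is only an illustrative helper, not a libFM API.

```python
def num_features(lines):
    """Return the highest feature index found in libFM-format lines, plus one."""
    highest = -1
    for line in lines:
        tokens = line.split()
        # tokens[0] is the target; every other token looks like 'index:value'.
        for token in tokens[1:]:
            index = int(token.split(":", 1)[0])
            highest = max(highest, index)
    return highest + 1


train = [
    "5 0:1 2:1 6:1 9:12.5",
    "5 0:1 3:1 6:1 9:20",
    "4 0:1 4:1 6:1 9:78",
    "1 1:1 2:1 8:1 9:12.5",
    "1 1:1 3:1 8:1 9:20",
]
test = [
    "0 1:1 4:1 8:1 9:78",
    "0 0:1 5:1 6:1",
]

print(num_features(train), num_features(test))  # prints: 10 10
```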

For the test set, I have two samples I want predictions for. The 0 target has no real effect at test time. (It is only useful if you know the true value: libFM will then report the RMSE on it, but will not use it to train the model.)
Just to be sure, here is the meaning of those two samples in test:

0 1:1 4:1 8:1 9:78
#Here User2 who is 41yo is rating Item3 which costs 78 euros, and we gave a rating of 0 because we don't yet know the real rating

0 0:1 5:1 6:1
#We want to know which rating User1 who is 23yo will give to the not-yet-seen Item4, and we don't know the price

This is the same format that libSVM uses.

From here you have two files: train.libfm and test.libfm (the extension doesn’t matter)

You can then run libFM like this for regression (predicting ratings):

./libfm -task r -method mcmc -train train.libfm -test test.libfm -iter 10 -dim '1,1,2' -out output.libfm

So the model was trained using [MCMC (-method mcmc)] for [10 (-iter 10)] iterations, with [a bias term, a linear model, and a factorization with 2 latent factors (-dim '1,1,2')].
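
If you drive libFM from a script, the same command can be assembled as an argument list, which avoids shell quoting problems around the -dim value. This is a minimal sketch: libfm_command is a hypothetical helper, it only uses the flags shown above, and it assumes the binary sits at ./libfm.

```python
import shlex


def libfm_command(train, test, out, iterations=10, dim=(1, 1, 2), method="mcmc"):
    """Build the argument list for the regression run described in the text."""
    use_bias, use_linear, num_factors = dim  # the three comma-separated -dim values
    return [
        "./libfm", "-task", "r", "-method", method,
        "-train", train, "-test", test,
        "-iter", str(iterations),
        "-dim", f"{use_bias},{use_linear},{num_factors}",
        "-out", out,
    ]


cmd = libfm_command("train.libfm", "test.libfm", "output.libfm")
print(shlex.join(cmd))  # shell-safe rendering of the command (Python 3.8+)
```

The list can be handed to subprocess.run(cmd, check=True) once the binary and both data files are in place.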

libFM will then print some output on the command line, and the predictions will be written to the file 'output.libfm'.
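
Here is a small sketch for reading the predictions back and, when you do know the true ratings, computing the RMSE yourself. It assumes the output file holds one prediction per line in the same order as the test file. That layout is an assumption of this sketch, so check your own output.libfm. The names read_predictions and rmse are illustrative helpers, not part of libFM.

```python
import math
import os


def read_predictions(path):
    """Read one float per non-empty line (assumed layout of the -out file)."""
    with open(path) as handle:
        return [float(line) for line in handle if line.strip()]


def rmse(predictions, targets):
    """Root mean squared error between predictions and known true ratings."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Only runs once libFM has actually produced the file.
if os.path.exists("output.libfm"):
    for row, prediction in enumerate(read_predictions("output.libfm")):
        print(f"test row {row}: predicted rating {prediction:.3f}")
```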

Discussions
This is of course a toy example, but it shows what you can use in libFM to train your model.
I wouldn't recommend using the price feature like this. Instead, apply a transformation such as a log so that no feature takes very large values. But I hope you get the point.
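
As an illustration of the log idea, here is a minimal sketch applied to the toy prices. The choice of log1p (log of 1 + price) is an assumption of this sketch, made so that a price of 0 stays at 0. Any monotone compression would serve the same purpose, and log_price is a hypothetical helper name.

```python
import math


def log_price(price):
    """Compress large prices; log1p maps a price of 0 to 0."""
    return math.log1p(price)


# Rewrite the price feature (index 9) of the three toy items.
for price in (12.5, 20, 78):
    print(f"9:{price:g} -> 9:{log_price(price):.4f}")
```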
