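R | The mlr package helps you choose the machine learning model best suited to your data (classification, regression) + a Python and R machine learning cross-reference handbook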

1. The mlr package in R

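After running install.packages("mlr") and loading the package, you can see which machine learning algorithms are available in R and which package each one comes from.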

library(mlr)
a <- listLearners()
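As a minimal sketch of how the result can be narrowed down, the snippet below filters a listLearners()-style data frame by task type and by capability flags. The helper name filter_learners is hypothetical (it is not part of mlr), and the mlr calls are guarded so that nothing runs if the package is not installed.

```r
# Hypothetical helper (not part of mlr): keep learners of one task type
# whose capability flags (logical columns such as prob, missings) are all TRUE.
filter_learners <- function(learners, type, needs = character(0)) {
  keep <- learners$type == type
  for (flag in needs) {
    keep <- keep & learners[[flag]]
  }
  learners[keep, c("class", "package"), drop = FALSE]
}

if (requireNamespace("mlr", quietly = TRUE)) {
  library(mlr)
  a <- listLearners(warn.missing.packages = FALSE)

  # Classification learners that can output probabilities and accept missing values
  print(head(filter_learners(a, "classif", c("prob", "missings"))))

  # listLearners() can apply the same kind of filter itself
  b <- listLearners("classif", properties = c("prob", "missings"),
                    warn.missing.packages = FALSE)
  print(head(b[, c("class", "package")]))
}
```

Filtering by flags such as missings or factors is a quick way to rule out learners that cannot handle the data at hand before any model is fitted.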

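I learned about this package from the CDA online course 《R语言与机器学习实战》 (R Language and Machine Learning in Practice), taught by 余文华. It looks very good and deserves a closer look later. The table below lists the 57 machine learning algorithms returned by listLearners(), the package each one comes from, and some of their data requirements.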

Besides the six columns shown here, listLearners() also returns TRUE/FALSE flag columns: installed, numerics, factors, ordered, missings, weights, prob, oneclass, twoclass, multiclass, class.weights, lcens, rcens, icens.

| class | name | short.name | package | note | type |
|---|---|---|---|---|---|
| classif.avNNet | Neural Network | avNNet | nnet | `size` has been set to `3` by default. Doing bagging training of `nnet` if set `bag = TRUE`. | classif |
| classif.binomial | Binomial Regression | binomial | stats | Delegates to `glm` with freely choosable binomial link function via learner parameter `link`. | classif |
| classif.C50 | C50 | C50 | C50 | | classif |
| classif.cforest | Random forest based on conditional inference trees | cforest | party | See `?ctree_control` for possible breakage for nominal features with missingness. | classif |
| classif.ctree | Conditional Inference Trees | ctree | party | See `?ctree_control` for possible breakage for nominal features with missingness. | classif |
| classif.cvglmnet | GLM with Lasso or Elasticnet Regularization (Cross Validated Lambda) | cvglmnet | glmnet | The family parameter is set to `binomial` for two-class problems and to `multinomial` otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. | classif |
| classif.gausspr | Gaussian Processes | gausspr | kernlab | Kernel parameters have to be passed directly and not by using the `kpar` list in `gausspr`. Note that `fit` has been set to `FALSE` by default for speed. | classif |
| classif.gbm | Gradient Boosting Machine | gbm | gbm | `keep.data` is set to FALSE to reduce memory requirements. Note on param 'distribution': gbm will select 'bernoulli' by default for 2 classes, and 'multinomial' for multiclass problems. The latter is the only setting that works for > 2 classes. | classif |
| classif.glmnet | GLM with Lasso or Elasticnet Regularization | glmnet | glmnet | The family parameter is set to `binomial` for two-class problems and to `multinomial` otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. Parameter `s` (value of the regularization parameter used for predictions) is set to `0.1` by default, but needs to be tuned by the user. | classif |
| classif.h2o.deeplearning | h2o.deeplearning | h2o.dl | h2o | | classif |
| classif.h2o.gbm | h2o.gbm | h2o.gbm | h2o | 'distribution' is set automatically to 'gaussian'. | classif |
| classif.h2o.glm | h2o.glm | h2o.glm | h2o | 'family' is always set to 'binomial' to get a binary classifier. | classif |
| classif.h2o.randomForest | h2o.randomForest | h2o.rf | h2o | | classif |
| classif.knn | k-Nearest Neighbor | knn | class | | classif |
| classif.ksvm | Support Vector Machines | ksvm | kernlab | Kernel parameters have to be passed directly and not by using the `kpar` list in `ksvm`. Note that `fit` has been set to `FALSE` by default for speed. | classif |
| classif.lda | Linear Discriminant Analysis | lda | MASS | Learner parameter `predict.method` maps to `method` in `predict.lda`. | classif |
| classif.logreg | Logistic Regression | logreg | stats | Delegates to `glm` with `family = binomial(link = "logit")`. | classif |
| classif.lssvm | Least Squares Support Vector Machine | lssvm | kernlab | `fitted` has been set to `FALSE` by default for speed. | classif |
| classif.lvq1 | Learning Vector Quantization | lvq1 | class | | classif |
| classif.mlp | Multi-Layer Perceptron | mlp | RSNNS | | classif |
| classif.multinom | Multinomial Regression | multinom | nnet | | classif |
| classif.naiveBayes | Naive Bayes | nbayes | e1071 | | classif |
| classif.nnet | Neural Network | nnet | nnet | `size` has been set to `3` by default. | classif |
| classif.plsdaCaret | Partial Least Squares (PLS) Discriminant Analysis | plsdacaret | caret | | classif |
| classif.probit | Probit Regression | probit | stats | Delegates to `glm` with `family = binomial(link = "probit")`. | classif |
| classif.qda | Quadratic Discriminant Analysis | qda | MASS | Learner parameter `predict.method` maps to `method` in `predict.qda`. | classif |
| classif.randomForest | Random Forest | rf | randomForest | Note that the rf can freeze the R process if trained on a task with 1 feature which is constant. This can happen in feature forward selection, also due to resampling, and you need to remove such features with removeConstantFeatures. | classif |
| classif.rpart | Decision Tree | rpart | rpart | `xval` has been set to `0` by default for speed. | classif |
| classif.svm | Support Vector Machines (libsvm) | svm | e1071 | | classif |
| classif.xgboost | eXtreme Gradient Boosting | xgboost | xgboost | All settings are passed directly, rather than through `xgboost`'s `params` argument. `nrounds` has been set to `1` by default. `num_class` is set internally, so do not set this manually. | classif |
| cluster.dbscan | DBScan Clustering | dbscan | fpc | A cluster index of NA indicates noise points. Specify `method = "dist"` if the data should be interpreted as dissimilarity matrix or object. Otherwise Euclidean distances will be used. | cluster |
| cluster.kkmeans | Kernel K-Means | kkmeans | kernlab | `centers` has been set to `2L` by default. The nearest center in kernel distance determines cluster assignment of new data points. Kernel parameters have to be passed directly and not by using the `kpar` list in `kkmeans`. | cluster |
| regr.avNNet | Neural Network | avNNet | nnet | `size` has been set to `3` by default. | regr |
| regr.cforest | Random Forest Based on Conditional Inference Trees | cforest | party | See `?ctree_control` for possible breakage for nominal features with missingness. | regr |
| regr.ctree | Conditional Inference Trees | ctree | party | See `?ctree_control` for possible breakage for nominal features with missingness. | regr |
| regr.gausspr | Gaussian Processes | gausspr | kernlab | Kernel parameters have to be passed directly and not by using the `kpar` list in `gausspr`. Note that `fit` has been set to `FALSE` by default for speed. | regr |
| regr.gbm | Gradient Boosting Machine | gbm | gbm | `keep.data` is set to FALSE to reduce memory requirements, `distribution` has been set to `"gaussian"` by default. | regr |
| regr.glm | Generalized Linear Regression | glm | stats | 'family' must be a character and every family has its own link, i.e. family = 'gaussian', link.gaussian = 'identity', which is also the default. | regr |
| regr.glmnet | GLM with Lasso or Elasticnet Regularization | glmnet | glmnet | Factors automatically get converted to dummy columns, ordered factors to integer. Parameter `s` (value of the regularization parameter used for predictions) is set to `0.1` by default, but needs to be tuned by the user. | regr |
| regr.h2o.deeplearning | h2o.deeplearning | h2o.dl | h2o | | regr |
| regr.h2o.gbm | h2o.gbm | h2o.gbm | h2o | 'distribution' is set automatically to 'gaussian'. | regr |
| regr.h2o.glm | h2o.glm | h2o.glm | h2o | 'family' is always set to 'gaussian'. | regr |
| regr.h2o.randomForest | h2o.randomForest | h2o.rf | h2o | | regr |
| regr.ksvm | Support Vector Machines | ksvm | kernlab | Kernel parameters have to be passed directly and not by using the `kpar` list in `ksvm`. Note that `fit` has been set to `FALSE` by default for speed. | regr |
| regr.lm | Simple Linear Regression | lm | stats | | regr |
| regr.mob | Model-based Recursive Partitioning Yielding a Tree with Fitted Models Associated with each Terminal Node | mob | party | | regr |
| regr.nnet | Neural Network | nnet | nnet | `size` has been set to `3` by default. | regr |
| regr.randomForest | Random Forest | rf | randomForest | See `?regr.randomForest` for information about se estimation. Note that the rf can freeze the R process if trained on a task with 1 feature which is constant. This can happen in feature forward selection, also due to resampling, and you need to remove such features with removeConstantFeatures. | regr |
| regr.rpart | Decision Tree | rpart | rpart | `xval` has been set to `0` by default for speed. | regr |
| regr.rvm | Relevance Vector Machine | rvm | kernlab | Kernel parameters have to be passed directly and not by using the `kpar` list in `rvm`. Note that `fit` has been set to `FALSE` by default for speed. | regr |
| regr.svm | Support Vector Machines (libsvm) | svm | e1071 | | regr |
| regr.xgboost | eXtreme Gradient Boosting | xgboost | xgboost | All settings are passed directly, rather than through `xgboost`'s `params` argument. `nrounds` has been set to `1` by default. | regr |
| surv.cforest | Random Forest based on Conditional Inference Trees | crf | party,survival | See `?ctree_control` for possible breakage for nominal features with missingness. | surv |
| surv.coxph | Cox Proportional Hazard Model | coxph | survival | | surv |
| surv.cvglmnet | GLM with Regularization (Cross Validated Lambda) | cvglmnet | glmnet | Factors automatically get converted to dummy columns, ordered factors to integer. | surv |
| surv.glmnet | GLM with Regularization | glmnet | glmnet | | surv |
| surv.rpart | Survival Tree | rpart | rpart | `xval` has been set to `0` by default for speed. | surv |
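Once a few candidate learners have been shortlisted from the table, one way to choose among them is to compare their cross-validated error on the same task. The following is a rough sketch under assumptions that are not in the original post: it uses mlr's built-in iris.task, two arbitrarily chosen learners (classif.rpart and classif.lda), 5-fold cross-validation, and mean misclassification error (mmce) as the measure. It only runs if mlr, rpart and MASS are installed.

```r
# Check that the packages needed by this sketch are available.
have_pkgs <- all(vapply(c("mlr", "rpart", "MASS"),
                        requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(mlr)
  set.seed(1)  # make the cross-validation split reproducible

  # Candidate learners, picked only for illustration
  learners <- list(
    makeLearner("classif.rpart"),
    makeLearner("classif.lda")
  )

  # 5-fold cross-validation on the example task shipped with mlr
  rdesc <- makeResampleDesc("CV", iters = 5)
  bmr <- benchmark(learners, iris.task, rdesc,
                   measures = mmce, show.info = FALSE)

  # One row per learner, sorted by mean test misclassification error
  perf <- getBMRAggrPerformances(bmr, as.df = TRUE)
  print(perf[order(perf$mmce.test.mean), ])
}
```

Several notes in the table say that a default was chosen for speed or "needs to be tuned by the user" (for example `s` in glmnet or `nrounds` in xgboost), so a comparison like this is only a starting point until those parameters are set sensibly.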

2. Cross-referencing machine learning in Python and R
