Notes

The default values for the parameters controlling the size of the trees (e.g. max_depth, min_samples_leaf, etc.) lead to fully grown and unpruned trees, which can be very large on some data sets. To reduce memory consumption, control the complexity and size of the trees by setting those parameter values.

The features are always randomly permuted at each split. Therefore, the best split found may vary, even with the same training data, max_features=n_features and bootstrap=False, if the improvement of the criterion is identical for several splits enumerated during the search for the best split. To obtain deterministic behaviour during fitting, random_state has to be fixed.

References

[R157] Breiman, “Random Forests”, Machine Learning, 45(1), 5-32, 2001.
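
The two notes above (bounding tree size and fixing random_state) can be illustrated with a minimal sketch. The synthetic data set from make_classification and the particular values max_depth=3 and min_samples_leaf=5 are assumptions chosen only for illustration, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used here only for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Default size parameters: every tree is grown until its leaves are pure.
full = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Capping max_depth and min_samples_leaf bounds the size of each tree.
small = RandomForestClassifier(
    n_estimators=10, max_depth=3, min_samples_leaf=5, random_state=0
).fit(X, y)

# Total number of nodes across all trees, a rough proxy for memory use.
full_nodes = sum(tree.tree_.node_count for tree in full.estimators_)
small_nodes = sum(tree.tree_.node_count for tree in small.estimators_)
print(full_nodes, small_nodes)

# Refitting with the same random_state reproduces the same forest.
again = RandomForestClassifier(
    n_estimators=10, max_depth=3, min_samples_leaf=5, random_state=0
).fit(X, y)
same = bool((small.predict_proba(X) == again.predict_proba(X)).all())
print(same)
```

Comparing node counts is only one way to see the effect; the depth of each tree (tree.get_depth()) would show the same thing.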

Methods

apply(X) Apply trees in the forest to X, return leaf indices.
decision_path(X) Return the decision path in the forest.
fit(X, y[, sample_weight]) Build a forest of trees from the training set (X, y).
get_params([deep]) Get parameters for this estimator.
predict(X) Predict class for X.
predict_log_proba(X) Predict class log-probabilities for X.
predict_proba(X) Predict class probabilities for X.
score(X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels.
set_params(**params) Set the parameters of this estimator.
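
As a rough sketch of the two less familiar methods in the list above, apply and decision_path, the snippet below fits a small forest and inspects the shapes they return. The iris data set and n_estimators=5 are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)

# apply: for each sample, the index of the leaf it reaches in each tree.
leaves = forest.apply(X)
print(leaves.shape)  # (n_samples, n_estimators)

# decision_path: a sparse indicator matrix over the nodes of all trees,
# plus offsets marking where each tree's columns start and end.
indicator, n_nodes_ptr = forest.decision_path(X)
print(indicator.shape, n_nodes_ptr)

# get_params / set_params read and update constructor parameters.
forest.set_params(n_estimators=20)
print(forest.get_params()["n_estimators"])
```
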
predict(X)

Predict class for X.

The predicted class of an input sample is a vote by the trees in the forest, weighted by their probability estimates. That is, the predicted class is the one with highest mean probability estimate across the trees.

Parameters:

X : array-like or sparse matrix of shape = [n_samples, n_features]

The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

Returns:

y : array of shape = [n_samples] or [n_samples, n_outputs]

The predicted classes.
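
The voting rule described above (the class with the highest mean probability estimate wins) can be checked with a short sketch for the single-output case. The iris data set is an assumption used only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Mean class probabilities across the trees, one row per sample.
proba = forest.predict_proba(X)

# Pick the column with the highest mean probability and map it to a label.
manual = forest.classes_[np.argmax(proba, axis=1)]

predicted = forest.predict(X)
print(bool((manual == predicted).all()))
```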

predict_log_proba(X)

Predict class log-probabilities for X.

The predicted class log-probabilities of an input sample are computed as the log of the mean predicted class probabilities of the trees in the forest.

Parameters:

X : array-like or sparse matrix of shape = [n_samples, n_features]

The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

Returns:

p : array of shape = [n_samples, n_classes], or a list of n_outputs such arrays if n_outputs > 1.

The class log-probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
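
A minimal sketch of the relationship between predict_log_proba and predict_proba, assuming the iris data set. Note that a class receiving a mean probability of 0 has a log-probability of -inf, so NumPy's divide-by-zero warning is silenced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# log(0) yields -inf for classes with zero mean probability.
with np.errstate(divide="ignore"):
    log_proba = forest.predict_log_proba(X)
    manual = np.log(forest.predict_proba(X))

print(log_proba.shape)  # (n_samples, n_classes)
print(bool(np.array_equal(log_proba, manual)))
```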

predict_proba(X)

Predict class probabilities for X.

The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest. The class probability of a single tree is the fraction of samples of the same class in a leaf.

Parameters:

X : array-like or sparse matrix of shape = [n_samples, n_features]

The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

Returns:

p : array of shape = [n_samples, n_classes], or a list of n_outputs such arrays if n_outputs > 1.

The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
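
The averaging described above can be reproduced by hand from the fitted trees in estimators_. This is a sketch for the single-output case, assuming the iris data set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Per-tree class probabilities: shape (n_estimators, n_samples, n_classes).
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])

# The forest's estimate is the mean over the trees.
manual = per_tree.mean(axis=0)
proba = forest.predict_proba(X)

print(bool(np.allclose(manual, proba)))
print(bool(np.allclose(proba.sum(axis=1), 1.0)))  # each row sums to 1
```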

score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) with respect to y.
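
As a small sketch of what score computes in the single-label case, the snippet below compares it with a manual mean of correct predictions, with and without sample weights. The iris data, the 70/30 split and the weighting scheme are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(X_train, y_train)

# Unweighted: the fraction of test samples predicted correctly.
mean_accuracy = forest.score(X_test, y_test)
correct = forest.predict(X_test) == y_test
manual = correct.mean()

# Weighted: each sample contributes in proportion to its weight.
# Here class 0 counts double (an arbitrary illustrative choice).
weights = np.where(y_test == 0, 2.0, 1.0)
weighted = forest.score(X_test, y_test, sample_weight=weights)
manual_weighted = np.average(correct, weights=weights)

print(mean_accuracy, weighted)
```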

From Sklearn:

http://sklearn.apachecn.org/cn/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier
