随机森林n_estimators 学习曲线
随机森林
单颗树与随机森林的的分对比
# 导入包
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
# 实例化红酒数据集
wine = load_wine()
# 划分测试集和训练集
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.3)
# 实例化决策树和随机森林,random_state=0
clf = DecisionTreeClassifier(random_state=0)
rfc = RandomForestClassifier(random_state=0)
# 训练模型
clf.fit(x_train, y_train)
rfc.fit(x_train, y_train)
#sk-container-id-1 { color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1) }
#sk-container-id-1 pre { padding: 0 }
#sk-container-id-1 div.sk-toggleable { background-color: rgba(255, 255, 255, 1) }
#sk-container-id-1 label.sk-toggleable__label { cursor: pointer; display: block; width: 100%; margin-bottom: 0; padding: 0.3em; box-sizing: border-box; text-align: center }
#sk-container-id-1 label.sk-toggleable__label-arrow:before { content: "▸"; float: left; margin-right: 0.25em; color: rgba(105, 105, 105, 1) }
#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-1 div.sk-toggleable__content { max-height: 0; max-width: 0; overflow: hidden; text-align: left; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-1 div.sk-toggleable__content pre { margin: 0.2em; color: rgba(0, 0, 0, 1); border-radius: 0.25em; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content { max-height: 200px; max-width: 100%; overflow: auto }
#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before { content: "▾" }
#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-1 input.sk-hidden--visually { border: 0; clip: rect(1px, 1px, 1px, 1px); height: 1px; margin: -1px; overflow: hidden; padding: 0; position: absolute; width: 1px }
#sk-container-id-1 div.sk-estimator { font-family: monospace; background-color: rgba(240, 248, 255, 1); border: 1px dotted rgba(0, 0, 0, 1); border-radius: 0.25em; box-sizing: border-box; margin-bottom: 0.5em }
#sk-container-id-1 div.sk-estimator:hover { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-1 div.sk-parallel-item::after { content: ""; width: 100%; border-bottom: 1px solid rgba(128, 128, 128, 1); flex-grow: 1 }
#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-1 div.sk-serial::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: 0 }
#sk-container-id-1 div.sk-serial { display: flex; flex-direction: column; align-items: center; background-color: rgba(255, 255, 255, 1); padding-right: 0.2em; padding-left: 0.2em; position: relative }
#sk-container-id-1 div.sk-item { position: relative; z-index: 1 }
#sk-container-id-1 div.sk-parallel { display: flex; align-items: stretch; justify-content: center; background-color: rgba(255, 255, 255, 1); position: relative }
#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: -1 }
#sk-container-id-1 div.sk-parallel-item { display: flex; flex-direction: column; z-index: 1; position: relative; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-1 div.sk-parallel-item:first-child::after { align-self: flex-end; width: 50% }
#sk-container-id-1 div.sk-parallel-item:last-child::after { align-self: flex-start; width: 50% }
#sk-container-id-1 div.sk-parallel-item:only-child::after { width: 0 }
#sk-container-id-1 div.sk-dashed-wrapped { border: 1px dashed rgba(128, 128, 128, 1); margin: 0 0.4em 0.5em; box-sizing: border-box; padding-bottom: 0.4em; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-1 div.sk-label label { font-family: monospace; font-weight: bold; display: inline-block; line-height: 1.2em }
#sk-container-id-1 div.sk-label-container { text-align: center }
#sk-container-id-1 div.sk-container { display: inline-block !important; position: relative }
#sk-container-id-1 div.sk-text-repr-fallback { display: none }
RandomForestClassifier(random_state=0)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
RandomForestClassifier(random_state=0)
# 返回测试集的分
clf_score = clf.score(x_test, y_test)
rfc_score = rfc.score(x_test, y_test)
print("sinle tree: {0}\nrandom tree: {1}".format(clf_score, rfc_score))
sinle tree: 0.9074074074074074
random tree: 0.9629629629629629
单颗树与随机森林在交叉验证下的对比图
# 导入交叉验证和画图工具
%matplotlib inline
from sklearn.model_selection import cross_val_score
import matplotlib.pyplot as plt
# 实例化决策树和随机森林
clf = DecisionTreeClassifier()
rfc = RandomForestClassifier(n_estimators=25) #创建25棵树组成的随机森林
# 实例化交叉验证 10次
clf_corss = cross_val_score(clf, wine.data, wine.target, cv=10)
rfc_corss = cross_val_score(rfc, wine.data, wine.target, cv=10)
# 查看决策树和随机森林的最好结果
print("single tree mean socre: {}\nrandom tree mean socre {}".format(clf_corss.mean(), rfc_corss.mean()))
single tree mean socre: 0.8705882352941178
random tree mean socre 0.9722222222222221
# 画出决策树和随机森林对比图
plt.plot(range(1, 11), clf_corss, label="single tree")
plt.plot(range(1, 11), rfc_corss, label="random tree")
plt.xticks(range(1, 11))
plt.legend()
<matplotlib.legend.Legend at 0x7ff6f4815d50>

clf_corss = cross_val_score(clf, wine.data, wine.target, cv=10)
clf_corss
array([0.88888889, 0.88888889, 0.72222222, 0.88888889, 0.83333333,
0.83333333, 1. , 0.94444444, 0.94117647, 0.76470588])
rfc_corss = cross_val_score(rfc, wine.data, wine.target, cv=10)
rfc_corss
array([1. , 1. , 0.94444444, 0.94444444, 0.88888889,
1. , 1. , 1. , 1. , 1. ])
十次交叉验证下决策树和随机森林的对比
# 创建分数列表
clf_list = []
rfc_list = []
for i in range(10):
clf = DecisionTreeClassifier()
rfc = RandomForestClassifier(n_estimators=25)
clf_corss_mean = cross_val_score(clf, wine.data, wine.target, cv=10).mean()
rfc_corss_mean = cross_val_score(rfc, wine.data, wine.target, cv=10).mean()
clf_list.append(clf_corss_mean)
rfc_list.append(rfc_corss_mean)
# 画出决策树和随机森林对比图
plt.plot(range(1, 11), clf_list, label="single tree")
plt.plot(range(1, 11), rfc_list, label="random tree")
plt.xticks(range(1, 11))
plt.legend()
<matplotlib.legend.Legend at 0x7ff6f490f670>

n_estimators 学习曲线
# 1-200颗树的学习曲线
superpa = []
for i in range(200):
rfc = RandomForestClassifier(n_estimators=i+1, n_jobs=-1)
rfc_cross = cross_val_score(rfc, wine.data, wine.target, cv=10).mean()
superpa.append(rfc_cross)
print(max(superpa), superpa.index(max(superpa)))
plt.figure(figsize=(20,8))
plt.plot(range(1,201), superpa, label="rfc_cross_mean")
plt.legend()
0.9888888888888889 20
<matplotlib.legend.Legend at 0x7ff6f540f100>

随机森林n_estimators 学习曲线的更多相关文章
- #调整随机森林的参数(调整n_estimators随机森林中树的数量默认10个树,精度递增显著,但并不是越多越好),加上verbose=True,显示进程使用信息
#调整随机森林的参数(调整n_estimators随机森林中树的数量默认10个树,精度递增显著) from sklearn import datasets X, y = datasets.make_c ...
- Python机器学习笔记——随机森林算法
随机森林算法的理论知识 随机森林是一种有监督学习算法,是以决策树为基学习器的集成学习算法.随机森林非常简单,易于实现,计算开销也很小,但是它在分类和回归上表现出非常惊人的性能,因此,随机森林被誉为“代 ...
- 机器学习实战基础(三十五):随机森林 (二)之 RandomForestClassifier 之重要参数
RandomForestClassifier class sklearn.ensemble.RandomForestClassifier (n_estimators=’10’, criterion=’g ...
- python的随机森林模型调参
一.一般的模型调参原则 1.调参前提:模型调参其实是没有定论,需要根据不同的数据集和不同的模型去调.但是有一些调参的思想是有规律可循的,首先我们可以知道,模型不准确只有两种情况:一是过拟合,而是欠拟合 ...
- scikit-learn随机森林调参小结
在Bagging与随机森林算法原理小结中,我们对随机森林(Random Forest, 以下简称RF)的原理做了总结.本文就从实践的角度对RF做一个总结.重点讲述scikit-learn中RF的调参注 ...
- [Machine Learning & Algorithm] 随机森林(Random Forest)
1 什么是随机森林? 作为新兴起的.高度灵活的一种机器学习算法,随机森林(Random Forest,简称RF)拥有广泛的应用前景,从市场营销到医疗保健保险,既可以用来做市场营销模拟的建模,统计客户来 ...
- kaggle数据挖掘竞赛初步--Titanic<随机森林&特征重要性>
完整代码: https://github.com/cindycindyhi/kaggle-Titanic 特征工程系列: Titanic系列之原始数据分析和数据处理 Titanic系列之数据变换 Ti ...
- 机器学习——随机森林,RandomForestClassifier参数含义详解
1.随机森林模型 clf = RandomForestClassifier(n_estimators=200, criterion='entropy', max_depth=4) rf_clf = c ...
- sklearn_随机森林random forest原理_乳腺癌分类器建模(推荐AAA)
sklearn实战-乳腺癌细胞数据挖掘(博主亲自录制视频) https://study.163.com/course/introduction.htm?courseId=1005269003& ...
- Python中随机森林的实现与解释
使用像Scikit-Learn这样的库,现在很容易在Python中实现数百种机器学习算法.这很容易,我们通常不需要任何关于模型如何工作的潜在知识来使用它.虽然不需要了解所有细节,但了解机器学习模型是如 ...
随机推荐
- 【pytest】pytest.mark.dependency用例依赖标签,并解决依赖失效的问题
pytest第三方插件,用来解决用例之间的依赖关系.如果依赖的用例执行失败后 后续的用例会被跳过执行,相当于智能执行了pytest.mark.skip, 首先要安装插件:pip install pyt ...
- element NavMenu侧栏导航菜单(可折叠)
展示效果如图: 代码: <el-scrollbar class="scrollbar-wrapper"> <!-- el-scrollbar超长可滚动 --> ...
- Vue+Element+Table表格动态跨列文章
https://my.oschina.net/u/4772459/blog/4699602 如图所示: 1 <template class="SysRole"> 2 & ...
- UnsupportedOperationException异常
看看下面的例子,这样输出什么呢? public class test { public static void main(String[] args) { String arr = "ab, ...
- python脚本监控定时任务
1.linux服务器中输入命令 crontab -l 查看当前系统的所有定时任务 2. 输入命令 crontab -e ,然后按"i"进行编辑(可新增.修改定时任务).具体定时任务 ...
- CAJ转换为PDF
方法就是下载一个CAJViewer和一个PDF虚拟打印机 CAJViewer下载: http://cajviewer.cnki.net/ 我下载了7.2版本 PDF虚拟打印机可以是Adobe acro ...
- software engineering homework 2
博客信息 沈阳航空航天大学计算机学院2020软件工程作业 作业要求 https://edu.cnblogs.com/campus/sau/Computer1701-1705/homework/1058 ...
- Tomcat和Maven安装与配置
链接:https://pan.baidu.com/s/1aezz2pfCn0DCCPw8udQFXA 提取码:wd4f 一.网站发布1.1.为什么要用tomcat网页开发好了,该如何发布呢?我们需要一 ...
- 12.21linux学习第二十天
今天老刘讲最后一章,第20章使用LNMP架构部署动态网站环境.定制安装17个安装包,编译安装要花很长时间,需要漫长的等待. LNMP动态网站部署架构是一套由Linux + Nginx + MySQL ...
- java流程控制;
一.基础阶段: 1.用户交互Scanner Scanner对象: 之前我们学的基本语法中我们并没有实现程序和人的交互,但是Java给我们提供了这样一个工具类,我们可以获取用户的输入. java.uti ...