Comparing randomized search and grid search for hyperparameter estimation

Compare randomized search and grid search for optimizing hyperparameters of a random forest. All parameters that influence the learning are searched simultaneously (except for the number of estimators, which poses a time / quality tradeoff).

The randomized search and the grid search explore exactly the same space of parameters. The result in parameter settings is quite similar, while the run time for randomized search is drastically lower.

The performance is slightly worse for the randomized search, though this is most likely a noise effect and would not carry over to a held-out test set.

Note that in practice, one would not search over this many different parameters simultaneously using grid search, but pick only the ones deemed most important.

Python source code: randomized_search.py

print(__doc__)

import numpy as np

from time import time
from operator import itemgetter
from scipy.stats import randint as sp_randint from sklearn.grid_search import GridSearchCV, RandomizedSearchCV
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier # get some data
iris = load_digits()
X, y = iris.data, iris.target # build a classifier
clf = RandomForestClassifier(n_estimators=20) # Utility function to report best scores
def report(grid_scores, n_top=3):
top_scores = sorted(grid_scores, key=itemgetter(1), reverse=True)[:n_top]
for i, score in enumerate(top_scores):
print("Model with rank: {0}".format(i + 1))
print("Mean validation score: {0:.3f} (std: {1:.3f})".format(
score.mean_validation_score,
np.std(score.cv_validation_scores)))
print("Parameters: {0}".format(score.parameters))
print("") # specify parameters and distributions to sample from
param_dist = {"max_depth": [3, None],
"max_features": sp_randint(1, 11),
"min_samples_split": sp_randint(1, 11),
"min_samples_leaf": sp_randint(1, 11),
"bootstrap": [True, False],
"criterion": ["gini", "entropy"]} # run randomized search
n_iter_search = 20
random_search = RandomizedSearchCV(clf, param_distributions=param_dist,
n_iter=n_iter_search) start = time()
random_search.fit(X, y)
print("RandomizedSearchCV took %.2f seconds for %d candidates"
" parameter settings." % ((time() - start), n_iter_search))
report(random_search.grid_scores_) # use a full grid over all parameters
param_grid = {"max_depth": [3, None],
"max_features": [1, 3, 10],
"min_samples_split": [1, 3, 10],
"min_samples_leaf": [1, 3, 10],
"bootstrap": [True, False],
"criterion": ["gini", "entropy"]} # run grid search
grid_search = GridSearchCV(clf, param_grid=param_grid)
start = time()
grid_search.fit(X, y) print("GridSearchCV took %.2f seconds for %d candidate parameter settings."
% (time() - start, len(grid_search.grid_scores_)))
report(grid_search.grid_scores_)

Comparing randomized search and grid search for hyperparameter estimation的更多相关文章

  1. 3.2. Grid Search: Searching for estimator parameters

    3.2. Grid Search: Searching for estimator parameters Parameters that are not directly learnt within ...

  2. scikit-learn:3.2. Grid Search: Searching for estimator parameters

    參考:http://scikit-learn.org/stable/modules/grid_search.html GridSearchCV通过(蛮力)搜索參数空间(參数的全部可能组合).寻找最好的 ...

  3. Grid search in the tidyverse

    @drsimonj here to share a tidyverse method of grid search for optimizing a model's hyperparameters. ...

  4. How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras

    Hyperparameter optimization is a big part of deep learning. The reason is that neural networks are n ...

  5. Grid Search学习

    转自:https://www.cnblogs.com/ysugyl/p/8711205.html Grid Search:一种调参手段:穷举搜索:在所有候选的参数选择中,通过循环遍历,尝试每一种可能性 ...

  6. grid search 超参数寻优

    http://scikit-learn.org/stable/modules/grid_search.html 1. 超参数寻优方法 gridsearchCV 和  RandomizedSearchC ...

  7. [转载]Grid Search

    [转载]Grid Search 初学机器学习,之前的模型都是手动调参的,效果一般.同学和我说他用了一个叫grid search的方法.可以实现自动调参,顿时感觉非常高级.吃饭的时候想调参的话最差不过也 ...

  8. 【起航计划 032】2015 起航计划 Android APIDemo的魔鬼步伐 31 App->Search->Invoke Search 搜索功能 Search Dialog SearchView SearchRecentSuggestions

    Search (搜索)是Android平台的一个核心功能之一,用户可以在手机搜索在线的或是本地的信息.Android平台为所有需要提供搜索或是查询功能的应用提 供了一个统一的Search Framew ...

  9. grid search

    sklearn.metrics.make_scorer(score_func, greater_is_better=True, needs_proba=False, needs_threshold=F ...

随机推荐

  1. eclipse GWT开发环境的离线布置方法

    安装方法http://blog.csdn.net/u011029071/article/details/10143841 用eclipse自动更新安装失败N次,还是得手动来 以Google Plugi ...

  2. 移植QT到ZedBoard(制作运行库镜像) 交叉编译 分类: ubuntu shell ZedBoard OpenCV 2014-11-08 18:49 219人阅读 评论(0) 收藏

    制作运行库 由于ubuntu的Qt运行库在/usr/local/Trolltech/Qt-4.7.3/下,由makefile可以看到引用运行库是 INCPATH = -I/usr//mkspecs/d ...

  3. docker 服务注册

    docker 服务注册 etcd docker run -d --name etcd -p 4001:4001 -p 7001:7001 elcolio/etcd

  4. Qt自定义控件(插件)并添加到QtDesigher

    之前使用Qt的时候都是手写代码的(因为批量按钮可以使用数组实现),但当界面越来越复杂时,这种开发效率就太低了; 后来就开始使用QtDesigner,但要使QtDesigner支持我自己写的控件,需要提 ...

  5. 动态代理与AOP

    1. 代理的分类: 静态代理:每个代理类只能为一个接口服务 动态代理:可以通过一个代理类完成全部的代理功能(由JVM生成实现一系列接口的代理类,即:生成实现接口的类的代理) 2. 动态代理: 在Jav ...

  6. oracle周数计算方法

    从星期天开始起算 Function Fn_GetWeekbyDate(P_Date Varchar2) Return Varchar2 Is Begin Return To_char(To_Date( ...

  7. 关于NetworkInfo对象的isConnected()与isAvailable()

      public class MainActivity extends Activity{    /** Called when the activity is first created. */   ...

  8. Html5新增的语义化标签(部分)

    2014年10月29日,万维网联盟宣布,经过接近8年的艰苦努力,html5的标准规范终于制定完成.这是互联网的一次重大变革,这也许是一个时代的来临! 总结一些h5新增的语义化标签,记录下来方便自己学习 ...

  9. WPF TextElement内容模型简介(转)

    本内容模型概述描述了 TextElement 支持的内容. Paragraph 类是 TextElement 的类型. 内容模型描述哪些对象/元素可以包含在其他对象/元素中. 本概述汇总了派生自 Te ...

  10. head标签

    1.head标签中有个<meta>,,个人理解知识,可以设置页面字符集,文本格式,还可以加一些注释,例如如下所示