What are the advantages of logistic regression over decision trees?FAQ

The answer to "Should I ever use learning algorithm (a) over learning algorithm (b)" will pretty much always be yes. Different learning algorithms make different assumptions about the data and have different rates of convergence. The one which works best, i.e. minimizes some cost function of interest (cross validation for example) will be the one that makes assumptions that are consistent with the data and has sufficiently converged to its error rate.

Put in the context of decision trees vs. logistic regression, what are the assumptions made?

Decision trees assume that our decision boundaries are parallel to the axes, for example if we have two features (x1, x2) then it can only create rules such as x1>=4.5, x2>=6.5 etc. which we can visualize as lines parallel to the axis. We see this in practice in the diagram below.

So decision trees chop up the feature space into rectangles (or in higher dimensions, hyper-rectangles). There can be many partitions made and so decision trees naturally scale up to creating more complex (say, higher VC) functions - which can be a problem with over-fitting.

What assumptions does logistic regression make? Despite the probabilistic framework of logistic regression, all that logistic regression assumes is that there is one smooth linear decision boundary. It finds that linear decision boundary by making assumptions that the P(Y|X) of some form, like the inverse logit function applied to a weighted sum of our features. Then it finds the weights by a maximum likelihood approach.

However people get too caught up on that... The decision boundary it creates is a linear* decision boundary that can be of any direction. So if you have data where the decision boundary is not parallel to the axes,

then logistic regression picks it out pretty well, whereas a decision tree will have problems.

So in conclusion,

  • Both algorithms are really fast. There isn't much to distinguish them in terms of run-time.
  • Logistic regression will work better if there's a single decision boundary, not necessarily parallel to the axis.
  • Decision trees can be applied to situations where there's not just one underlying decision boundary, but many, and will work best if the class labels roughly lie in hyper-rectangular regions.
  • Logistic regression is intrinsically simple, it has low variance and so is less prone to over-fitting. Decision trees can be scaled up to be very complex, are are more liable to over-fit. Pruning is applied to avoid this.

Maybe you'll be left thinking, "I wish decision trees didn't have to create rules that are parallel to the axis." This motivates support vector machines.

Footnotes:
* linear in your covariates. If you include non-linear transformations or interactions then it will be non-linear in the space of those original covariates.

What are the advantages of logistic regression over decision trees?FAQ的更多相关文章

  1. Logistic Regression vs Decision Trees vs SVM: Part II

    This is the 2nd part of the series. Read the first part here: Logistic Regression Vs Decision Trees ...

  2. Logistic Regression Vs Decision Trees Vs SVM: Part I

    Classification is one of the major problems that we solve while working on standard business problem ...

  3. Stanford机器学习笔记-2.Logistic Regression

    Content: 2 Logistic Regression. 2.1 Classification. 2.2 Hypothesis representation. 2.2.1 Interpretin ...

  4. [Scikit-learn] 1.1 Generalized Linear Models - Logistic regression & Softmax

    二分类:Logistic regression 多分类:Softmax分类函数 对于损失函数,我们求其最小值, 对于似然函数,我们求其最大值. Logistic是loss function,即: 在逻 ...

  5. Logistic Regression and Gradient Descent

    Logistic Regression and Gradient Descent Logistic regression is an excellent tool to know for classi ...

  6. Logistic Regression 用于预测马是否生病

    1.利用Logistic regression 进行分类的主要思想 根据现有数据对分类边界线建立回归公式,即寻找最佳拟合参数集,然后进行分类. 2.利用梯度下降找出最佳拟合参数 3.代码实现 # -* ...

  7. 逻辑回归 Logistic Regression

    逻辑回归(Logistic Regression)是广义线性回归的一种.逻辑回归是用来做分类任务的常用算法.分类任务的目标是找一个函数,把观测值匹配到相关的类和标签上.比如一个人有没有病,又因为噪声的 ...

  8. logistic regression与SVM

    Logistic模型和SVM都是用于二分类,现在大概说一下两者的区别 ① 寻找最优超平面的方法不同 形象点说,Logistic模型找的那个超平面,是尽量让所有点都远离它,而SVM寻找的那个超平面,是只 ...

  9. Logistic Regression - Formula Deduction

    Sigmoid Function \[ \sigma(z)=\frac{1}{1+e^{(-z)}} \] feature: axial symmetry: \[ \sigma(z)+ \sigma( ...

随机推荐

  1. Linux C编程--格式化I/O

    printf(格式控制,输入表列) 例:printf("%d%d",a,b) (1)d格式符:输出一个有符号的十进制整数 (2)c格式符:输出一个字符 (3)s格式符:输出一个字符 ...

  2. 第四十四篇、iOS开发中git添加.gitignore文件

    .gitignore文件可以直接使用https://github.com/github/gitignore 1.在项目中设置忽略文件(1)将从github上荡下来的对应的.gitignore文件(Sw ...

  3. 使用gulp脚本配合TypeScript开发

    目标:编写TypeScript时,保存即生成js文件.   使用npm安装以下组件 gulp gulp-rename through-gulp gulp-typescript   编写gulpfile ...

  4. JavaScript DOM编程艺术第一章:JavaScript简史

    本系列的博客是由本人在阅读<JavaScript DOM编程艺术>一书过程中做的总结.前面的偏理论部分都是书中原话,觉得有必要记录下来,方便自己翻阅,也希望能为读到本博客的人提供一些帮助, ...

  5. css3学习笔记之2D转换

    translate() 方法 translate()方法,根据左(X轴)和顶部(Y轴)位置给定的参数,从当前元素位置移动. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...

  6. java web 简单的分页显示

    题外话:该分页显示是用 “表示层-控制层-DAO层-数据库”的设计思想实现的,有什么需要改进的地方大家提出来,共同学习进步. 思路:首先得在 DAO 对象中提供分页查询的方法,在控制层调用该方法查到指 ...

  7. lucene4入门(1)

    欢迎转载http://www.cnblogs.com/shizhongtao/p/3440325.html lucene你可以理解为一种数据库,他是全文搜索的一种引擎. 1.首先去官网download ...

  8. 04_例子讲解:rlViewDemo.exe

    参考资料:http://www.roboticslibrary.org/tutorials/first-steps-windows 使用rlViewDemo对应的快捷方式启动程序,可以看到如下界面: ...

  9. Windows内存原理与内存管理

    WIndows为每个进程分配了4GB的虚拟地址空间,让每个进程都认为自己拥有4GB的内存空间,4GB怎么来的? 32位 CPU可以取地址的空间为2的32次方,就是4GB(正如16位CPU有20根寻址线 ...

  10. Java中的队列:java.util.Queue接口

    队列是一种特殊的线性表,它只允许在表的前端(front)进行删除操作,而在表的后端(rear)进行插入操作. Queue接口与List.Set同一级别,都是继承了Collection接口.Linked ...