The 10 Statistical Techniques Data Scientists Need to Master
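As far as I know, far too many software engineers try to move into data science by blindly applying machine learning frameworks such as TensorFlow or Apache Spark to their data, without fully understanding the statistical theory behind those frameworks. Hence statistical learning: the theoretical framework for machine learning, which grew out of the fields of statistics and functional analysis.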
Three recommended books:
- An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani)
- Doing Bayesian Data Analysis (Kruschke)
- Time Series Analysis and Its Applications (Shumway, Stoffer)
I did a lot of exercises on the following topics:
Bayesian Analysis, Markov Chain Monte Carlo, Hierarchical Modeling, Supervised and Unsupervised Learning
Recommended course:
Recently, I completed the Statistical Learning online course on Stanford Lagunita, which covers all the material in the Intro to Statistical Learning book I read in my Independent Study. Having now been exposed to the content twice, I want to share the 10 statistical techniques from the book that I believe any data scientist should learn to be more effective in handling big datasets.