Post Date: September 3, 2014
By: Stephanie Miller

Marty Rose, Data Scientist in the Acxiom Product and Engineering group and an active member of the DMA Analytics Council, shared the following list of data science books with the Council this week, and we thought the rest of the DMA family would also benefit.

“I didn’t compile this list and am grateful to Chris, the original author, but I personally have spent many hundreds of dollars on hard copies of these books, only to find out you can now get them for free online!” he said. Marty especially recommends the first two books for getting started.

Regardless of your analytics and data background, skills or goals, there’s something for you in this list. Here they are, in no particular order.

  1. An Introduction to Statistical Learning with Applications in R by James, Witten, Hastie & Tibshirani – This book is fantastic and has helped me quite a bit. It provides an overview of several methods, along with the R code for how to complete them. 426 Pages.
  2. The Elements of Statistical Learning by Hastie, Tibshirani & Friedman – This is an in-depth overview of methods, complete with theory, derivations & code. I’d definitely consider this a graduate level text. I’d also consider it one of the best books available on the topic of data mining. 745 Pages.
  3. A Programmer’s Guide to Data Mining by Ron Zacharski – This one is an online book, each chapter downloadable as a PDF. It’s also still in progress, with chapters being added a few times each year.
  4. Probabilistic Programming & Bayesian Methods for Hackers by Cam Davidson-Pilon – This book is absolutely fantastic. The author explains Bayesian statistics, provides several diverse examples of how to apply it, and includes Python code. Each chapter is an IPython notebook that can be downloaded.
  5. Think Bayes, Bayesian Statistics Made Simple by Allen B. Downey – Another great, easy-to-digest introduction to Bayesian statistics. The author’s premise is that Bayesian statistics is easier to learn & apply within the context of reusable code samples. It includes a number of examples complete with Python code. 195 Pages.
  6. Data Mining and Analysis, Fundamental Concepts and Algorithms by Zaki & Meira – This title is new to me. It’s a textbook that looks to be a complete introduction with derivations & plenty of sample problems. 599 Pages.
  7. An Introduction to Data Science by Jeffrey Stanton – Overview of the skills required to succeed in data science, with a focus on the tools available within R. It has sections on interacting with the Twitter API from within R, text mining, plotting, regression as well as more complicated data mining techniques. 195 Pages.
  8. Machine Learning by Chebira, Mellouk & others – This is an introduction to more advanced machine learning methods. It includes chapters on neural networks, discriminant analysis, natural language processing, regression trees & more, complete with derivations. Each chapter is downloadable as a PDF. 422 Pages.
  9. Machine Learning – The Complete Guide – This one is new to me. It’s a collection of Wikipedia articles organized into chapters & downloadable in a number of formats. I didn’t realize they did this, but it’s a great idea. Because it’s a collection of individual articles, it covers quite a bit more material than a single author could write. This is an incredible resource.
  10. Bayesian Reasoning and Machine Learning by David Barber – This is an undergraduate textbook. It includes an overview, derivations, sample problems and MATLAB code. 648 Pages.
  11. A Course in Machine Learning by Hal Daumé III – Another complete introduction to machine learning topics. Each chapter is individually downloadable. 189 Pages.
  12. Information Theory, Inference and Learning Algorithms by David J.C. MacKay – Nice overview of machine learning topics, including an introduction and derivations. One nice feature of this book is that it has a chart that shows how various topics are related to one another. 628 Pages.
  13. Modeling with Data by Ben Klemens – Surprisingly, all of the code in this book is in C; Klemens includes a section to defend this choice. The book includes plenty of code samples. 454 Pages.
  14. Mining of Massive Datasets by Rajaraman & Ullman – This book covers concepts and includes several domain-specific examples. It includes plenty of derivations and little code. 493 Pages.
