Spring. Rejuvenation. Rebirth. Everything’s blooming. And, of course, people want free ebooks. With that in mind, here's a list of 10 free machine learning and data science titles to get your spring reading started right.


By Matthew Mayo, KDnuggets.

What better way to enjoy this spring weather than with some free machine learning and data science ebooks? Right? Right?

Here is a quick collection of such books to start your fair weather study off on the right foot. The list begins with a base of statistics, moves on to machine learning foundations, progresses to a few bigger picture titles, has a quick look at an advanced topic or 2, and ends off with something that brings it all together. A mix of classic and contemporary titles, hopefully you find something new (to you) and of interest here.

1. Think Stats: Probability and Statistics for Programmers
By Allen B. Downey

Think Stats is an introduction to Probability and Statistics for Python programmers.

Think Stats emphasizes simple techniques you can use to explore real data sets and answer interesting questions. The book presents a case study using data from the National Institutes of Health. Readers are encouraged to work on a project with real datasets.

2. Probabilistic Programming & Bayesian Methods for Hackers
By Cam Davidson-Pilon

An intro to Bayesian methods and probabilistic programming from a computation/understanding-first, mathematics-second point of view.

The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. The typical text on Bayesian inference involves two to three chapters on probability theory, then enters what Bayesian inference is. Unfortunately, due to mathematical intractability of most Bayesian models, the reader is only shown simple, artificial examples. This can leave the user with a so-what feeling about Bayesian inference. In fact, this was the author's own prior opinion.

3. Understanding Machine Learning: From Theory to Algorithms
By Shai Shalev-Shwartz and Shai Ben-David

Machine learning is one of the fastest growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics, the book covers a wide array of central topics unaddressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds.

4. The Elements of Statistical Learning
By Trevor Hastie, Robert Tibshirani and Jerome Friedman

This book descibes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting--the first comprehensive treatment of this topic in any book.

5. An Introduction to Statistical Learning with Applications in R
By Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani

This book provides an introduction to statistical learning methods. It is aimed for upper level undergraduate students, masters students and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real life settings, and should be a valuable resource for a practicing data scientist.

6. Foundations of Data Science
By Avrim Blum, John Hopcroft, and Ravindran Kannan

While traditional areas of computer science remain highly important, increasingly researchers of the future will be involved with using computers to understand and extract usable information from massive data arising in applications, not just how to make computers useful on specific well-defined problems. With this in mind we have written this book to cover the theory likely to be useful in the next 40 years, just as an understanding of automata theory, algorithms, and related topics gave students an advantage in the last 40 years.

7. A Programmer's Guide to Data Mining: The Ancient Art of the Numerati
By Ron Zacharski

This guide follows a learn-by-doing approach. Instead of passively reading the book, I encourage you to work through the exercises and experiment with the Python code I provide. I hope you will be actively involved in trying out and programming data mining techniques. The textbook is laid out as a series of small steps that build on each other until, by the time you complete the book, you have laid the foundation for understanding data mining techniques.

8. Mining of Massive Datasets
By Jure Leskovec, Anand Rajaraman and Jeff Ullman

The book is based on Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining).

The book, like the course, is designed at the undergraduate computer science level with no formal prerequisites. To support deeper explorations, most of the chapters are supplemented with further reading references.

9. Deep Learning
By Ian Goodfellow, Yoshua Bengio and Aaron Courville

The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.

10. Machine Learning Yearning
By Andrew Ng

AI, Machine Learning and Deep Learning are transforming numerous industries. But building a machine learning system requires that you make practical decisions:

  • Should you collect more training data?
  • Should you use end-to-end deep learning?
  • How do you deal with your training set not matching your test set?
  • and many more.

Historically, the only way to learn how to make these "strategy" decisions has been a multi-year apprenticeship in a graduate program or company. I am writing a book to help you quickly gain this skill, so that you can become better at building AI systems.

Related:

10-free-must-read-books-machine-learning-data-science的更多相关文章

  1. 用10张图来看机器学习Machine learning in 10 pictures

    I find myself coming back to the same few pictures when explaining basic machine learning concepts. ...

  2. Machine Learning & Data Mining 资料整合

    机器学习常见算法分类汇总 | 码农网 数据挖掘十大经典算法 | CSDN博客 (内含十个算法具体介绍) 支持向量机通俗导论(理解 SVM 的三层境界)| CSDN博客 (强烈推荐关注博主) 教你如何迅 ...

  3. machine learning data sets

    http://archive.ics.uci.edu/ml/datasets.html 例如 3 分类 鸢尾花 数据集: http://archive.ics.uci.edu/ml/datasets/ ...

  4. How do I learn mathematics for machine learning?

    https://www.quora.com/How-do-I-learn-mathematics-for-machine-learning   How do I learn mathematics f ...

  5. How do I learn machine learning?

    https://www.quora.com/How-do-I-learn-machine-learning-1?redirected_qid=6578644   How Can I Learn X? ...

  6. 11 Facts about Data Science that you must know

    11 Facts about Data Science that you must know Statistics, Machine Learning, Data Science, or Analyt ...

  7. R8:Learning paths for Data Science[continuous updating…]

    Comprehensive learning path – Data Science in Python Journey from a Python noob to a Kaggler on Pyth ...

  8. 学习笔记之Machine Learning Crash Course | Google Developers

    Machine Learning Crash Course  |  Google Developers https://developers.google.com/machine-learning/c ...

  9. 【转】The most comprehensive Data Science learning plan for 2017

    I joined Analytics Vidhya as an intern last summer. I had no clue what was in store for me. I had be ...

  10. Roles on a Machine Learning Project (机器学习项目中的角色)

    原文 :https://medium.com/machine-learning-in-practice/roles-on-a-machine-learning-project-216903a6dc12 ...

随机推荐

  1. Java循环中标签的作用(转)

    转自:http://lihengzkj.iteye.com/blog/1090034 以前不知道在循环中可以使用标签.最近遇到后,举得还是有其独特的用处的.我这么说的意思是说标签在循环中可以改变循环执 ...

  2. 在发送信息时应用PendingIntent.FLAG_UPDATE_CURRENT

    1. 连续发送两条信息时,出现bug.以下是bug现象描述. 发送第一条信息,sentReceiver弹出toast告知发送成功,同时在listview中的发送状态立即同步更新为发送成功. 继续发送第 ...

  3. JavaScript原生对象及扩展

    来源于 https://segmentfault.com/a/1190000002634958 内置对象与原生对象 内置(Build-in)对象与原生(Naitve)对象的区别在于:前者总是在引擎初始 ...

  4. Java中用HttpsURLConnection访问Https链接

    在web应用交互过程中,有很多场景需要保证通信数据的安全:在前面也有好多篇文章介绍了在Web Service调用过程中用WS-Security来保证接口交互过程的安全性,值得注意的是,该种方式基于的传 ...

  5. [转载]在VirtualBox中收缩虚拟磁盘映像文件

    原文地址:在VirtualBox中收缩虚拟磁盘映像文件作者:bobby 由于经常要测试一些软件,我在VirtualBox虚拟机中安装了一套Windows.使用过虚拟机的朋友都知道,为了节省硬盘空间,一 ...

  6. Java实现List数组的几种替代方案

    在Java中,禁止定义List<Integer>a[],这种List数组结构. 但是还是可以使用其它一些方式来实现列表数组. 一.使用Node把List包裹起来 public class ...

  7. linux下网络配置小节[from 老男孩的linux运维笔记]

    对于linux高手看似简单的网络配置问题,也许要说出所以然来也并不轻松,因此仍然有太多的初学者徘徊在门外就不奇怪了, 这里,老男孩老师花了一些时间总结了这个文档小结,也还不够完善,欢迎大家补充,交流. ...

  8. Linux如何实现开机启动程序详解(转)

    Linux开机启动程序详解我们假设大家已经熟悉其它操作系统的引导过程,了解硬件的自检引导步骤,就只从Linux操作系统的引导加载程序(对个人电脑而言通常是LILO)开始,介绍Linux开机引导的步骤. ...

  9. linux达人养成计划学习笔记(五)—— 关机和重启命令

    一.shutdown 1.格式: shutdown [选项] 时间(now) 选项: -c: 取消前一个关机命令 -h: 关机 -r: 重启 2.程序放入后台执行: shutdown -r 时间 &a ...

  10. 微信支付HTTPS服务器证书验证指引

    1. 背景介绍 2. 常见问题 3. 验证证书 4. 安装证书 背景介绍 微信支付使用HTTPS来保证通信安全, 在HTTPS服务器上部署了由权威机构签发的证书, 用于证明微信支付平台的真实身份. 商 ...