Before you can plot anything, you need to specify which backend Matplotlib should use. The simplest option is to use Jupyter’s magic command %matplotlib inline. This tells Jupyter to set up Matplotlib so it uses Jupyter’s own backend.

Scatter Plot

housing.plot(kind="scatter", x="longitude", y="latitude")

You can set the parameter alpha to study the density of points:

housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.1)

The plot can convey more information by setting different colors, sizes, shapes, etc. Here we will use a predefined color map (option cmap) called jet. As an example, we plot the house prices in different locations and let the radius of each circle represents the district’s population (option s), and the color represents the price (option c).

 %matplotlib inline
import matplotlib.pyplot as plt
housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.4,
s=housing["population"]/100, label="population", figsize=(10,7),
c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
sharex=False)
plt.legend()
save_fig("housing_prices_scatterplot")

Note that the argument sharex=False fixes a display bug (the x-axis values and legend were not displayed). This is a temporary fix (see: https://github.com/pandas-dev/pandas/issues/10611).

Scatter Matrix

 from pandas.plotting import scatter_matrix

 attributes = ["median_house_value", "median_income", "total_rooms",
"housing_median_age"]
scatter_matrix(housing[attributes], figsize=(12, 8))
save_fig("scatter_matrix_plot")

Histogram

Histogram is a useful method to study the distribution of numeric attributes.

 %matplotlib inline
import matplotlib.pyplot as plt
housing.hist(bins=50, figsize=(20,15))
save_fig("attribute_histogram_plots")
plt.show()

For single attribute, you can use the following statement:

housing["median_income"].hist()

Correlation Plot

We can calculate the correlation coefficients between each pair of attributes using corr() method and look at the value by sort_values():

corr_matrix = housing.corr()
corr_matrix["median_house_value"].sort_values(ascending=False)

Also, we can use scatter_matrix function, which plots every numerical attribute against every other numerical attribute. The diagonal displays the histogram of each attribute.

from pandas.tools.plotting import scatter_matrix
attributes = ["median_house_value", "median_income", "total_rooms", "housing_median_age"]
scatter_matrix(housing[attributes], figsize=(12, 8))

[Machine Learning with Python] Data Visualization by Matplotlib Library的更多相关文章

  1. [Machine Learning with Python] Data Preparation through Transformation Pipeline

    In the former article "Data Preparation by Pandas and Scikit-Learn", we discussed about a ...

  2. [Machine Learning with Python] Data Preparation by Pandas and Scikit-Learn

    In this article, we dicuss some main steps in data preparation. Drop Labels Firstly, we drop labels ...

  3. Getting started with machine learning in Python

    Getting started with machine learning in Python Machine learning is a field that uses algorithms to ...

  4. Python (1) - 7 Steps to Mastering Machine Learning With Python

    Step 1: Basic Python Skills install Anacondaincluding numpy, scikit-learn, and matplotlib Step 2: Fo ...

  5. 《Learning scikit-learn Machine Learning in Python》chapter1

    前言 由于实验原因,准备入坑 python 机器学习,而 python 机器学习常用的包就是 scikit-learn ,准备先了解一下这个工具.在这里搜了有 scikit-learn 关键字的书,找 ...

  6. Coursera, Big Data 4, Machine Learning With Big Data (week 1/2)

    Week 1 Machine Learning with Big Data KNime - GUI based Spark MLlib - inside Spark CRISP-DM Week 2, ...

  7. 【Machine Learning】Python开发工具:Anaconda+Sublime

    Python开发工具:Anaconda+Sublime 作者:白宁超 2016年12月23日21:24:51 摘要:随着机器学习和深度学习的热潮,各种图书层出不穷.然而多数是基础理论知识介绍,缺乏实现 ...

  8. In machine learning, is more data always better than better algorithms?

    In machine learning, is more data always better than better algorithms? No. There are times when mor ...

  9. Machine Learning的Python环境设置

    Machine Learning目前经常使用的语言有Python.R和MATLAB.如果采用Python,需要安装大量的数学相关和Machine Learning的包.一般安装Anaconda,可以把 ...

随机推荐

  1. Python语言的简介

    ___________________________________________________________我是一条分割线__________________________________ ...

  2. python 数据结构与算法之排序(冒泡,选择,插入)

    目录 数据结构与算法之排序(冒泡,选择,插入) 为什么学习数据结构与算法: 数据结构与算法: 算法: 数据结构 冒泡排序法 选择排序法 插入排序法 数据结构与算法之排序(冒泡,选择,插入) 为什么学习 ...

  3. Nginx是用来干什么的?

    一.静态HTTP服务器 首先,Nginx是一个HTTP服务器,可以将服务器上的静态文件(如HTML.图片)通过HTTP协议展现给客户端. 配置: server { listen80; # 端口号 lo ...

  4. graph-Dijkstra's shortest-path alogorithm

    直接贴代码吧,简明易懂. 后面自己写了测试,输入数据为: a b c d e 0 1 4 0 2 2 1 2 3 1 3 2 1 4 3 2 1 1 2 3 4 2 4 5 4 3 1 也就是课本上1 ...

  5. zoj 4054

    #define ll long long ; int t; ll ans,tmp; char s[N]; int main() { scanf("%d",&t); whil ...

  6. 使用sprunge粘贴文字

    在irc里面请教的时候,需要输出很多文本,irc禁止输入多行文字. 使用sprunge可以返回一个网址,省去复制粘贴的麻烦. 1> 简单使用: command | curl -F "s ...

  7. debian使用sudo

    debian默认没有sudo ,在命令前无法使用sudo #切换到根用户$ su 输入根用户密码 # apt-get install sudo # nano /etc/sudoers 文件的 User ...

  8. #1 add life to static pages && connect to MySQL

    由于实验室 Project 中需要用到PHP, 之前也没接触过 PHP, 因此把 编程入门 <Head Fist PHP & MySQL >找来花了四五天快速过了一遍. 现在想把书 ...

  9. Setting title-center on "< h1> " element on Android does not work, fix

    app.scss: h1.title-center{ text-align: center!important; }

  10. C++ POST方式访问网页

    bool PostContent(CString strUrl, const CString &strPara, CString &strContent, CString &s ...