import numpy as np
import matplotlib.pyplot as plt

def is_outlier(points, threshold=3.5):
    if len(points.shape) == 1:
        points = points[:, None]
    # Find the median number of points
    median = np.median(points, axis=0)
    diff = np.sum((points - median)**2, ax…
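The excerpt above is cut off, so its remaining steps are not known. As a rough sketch of one common median-based check, the modified z-score, which conventionally uses the same 3.5 cutoff, something like the following could work. The name `mad_outliers`, the constant 0.6745, and the sample data are assumptions for illustration and are not taken from the original function.

```python
import numpy as np

def mad_outliers(points, threshold=3.5):
    """Flag outliers with a modified z-score (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]  # treat a 1-D input as a single column
    median = np.median(points, axis=0)
    # Euclidean distance of each observation from the per-column median
    dist = np.sqrt(np.sum((points - median) ** 2, axis=-1))
    # Median of those distances (the median absolute deviation in the 1-D case)
    mad = np.median(dist)
    if mad == 0:
        # No spread at all: nothing can be flagged meaningfully
        return np.zeros(len(points), dtype=bool)
    modified_z = 0.6745 * dist / mad
    return modified_z > threshold

data = np.array([1.0, 1.1, 0.9, 1.2, 1.0, 25.0])  # made-up sample values
flags = mad_outliers(data)
print(flags)
```

Using the median rather than the mean keeps a single extreme value from inflating the spread estimate that it is then compared against.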
Before you can plot anything, you need to specify which backend Matplotlib should use. The simplest option is to use Jupyter’s magic command %matplotlib inline. This tells Jupyter to set up Matplotlib so it uses Jupyter’s own backend. Scatter Plot ho…
7 Tools for Data Visualization in R, Python, and Julia Last week, some examples of creating visualizations with htmlwidgets and R were presented. Fortunately, there are many more options available for creating nice visualizations. Tools and libraries…
Introduction to Data Visualization with Python | DataCamp https://www.datacamp.com/courses/introduction-to-data-visualization-with-python This course extends Intermediate Python for Data Science to provide a stronger foundation in data visualization…
Bokeh Data Visualization | DataCamp https://www.datacamp.com/courses/interactive-data-visualization-with-bokeh Bokeh is an interactive data visualization library for Python (and other languages!) that targets modern web browsers for presentation. It…
Data visualization - Wikipedia https://en.wikipedia.org/wiki/Data_visualization Data visualization or data visualisation is viewed by many disciplines as a modern equivalent of visual communication. It involves the creation and study of the visual re…
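I am quite interested in data visualization, so I have recently been following the Data Visualization course from the University of Illinois at Urbana-Champaign on Coursera and taking some notes. 1. Definition: Data visualization is a high bandwidth connection between data on a computer system and a human brain, facilitated by visual communication. 2. Characteristics: Gaining insight into data: examining the data closely to support further decisions and to supply hypotheses for follow-up exploration and research. 3…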
A Quick Overview of the ggplot2 Package in R While it will be important to focus on theory, I want to explain the ggplot2 package because I will be using it throughout the rest of this series. Knowing how it works will keep the focus on the results r…
Introduction to Data Visualization – Theory, R & ggplot2 The topic of data visualization is very popular in the data science community. The market size for visualization products is valued at $4 Billion and is projected to reach $7 Billion by the end…
Coursera course <Python Data Structures>, University of Michigan, Charles Severance, Week 6: Tuple. 10 Tuples 10.1 Tuples Are Like Lists A tuple is another kind of sequence that behaves much like a list. Its elements are also indexed starting from 0. >>> x = ('Glenn', 'Sally', 'Joseph') >>> print(x[2]) Joseph >>> y = (1, 9, 2)…
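A short sketch of the tuple behaviour described above, reusing the values from the excerpt. The immutability check goes slightly beyond what the visible excerpt states; it is standard Python behaviour, and the variable names `third`, `largest`, and `mutated` are illustrative only.

```python
x = ('Glenn', 'Sally', 'Joseph')
third = x[2]  # indexing starts at 0, so index 2 is the third element

y = (1, 9, 2)
largest = max(y)  # built-in functions that accept lists also accept tuples

try:
    x[2] = 'Ann'  # unlike a list, a tuple cannot be modified in place
    mutated = True
except TypeError:
    mutated = False

print(third, largest, mutated)
```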
Coursera course <Python Data Structures>, University of Michigan, Charles Severance, Week 4: List. 8.2 Manipulating Lists 8.2.1 Concatenating Lists Using + The + operator joins two existing lists into a new list. For example: >>> a = [1, 2, 3] >>> b = [4, 5, 6] >>> c = a + b >>…
Python: Data Encapsulation
The following table shows the different behaviour:
Name | Notation | Behaviour
name | Public | Can be accessed from inside and outside
_name | Protected | Like a public member, but they shouldn't be directly accessed from outside.
__n…
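The naming conventions in the table can be tried out with a small class. This is only a sketch: the class `Account` and its attribute values are hypothetical, and because the third table row is cut off in the excerpt, the double-underscore part below reflects standard Python name mangling rather than the original article's wording.

```python
class Account:
    """Hypothetical class used only to illustrate the naming styles."""

    def __init__(self):
        self.owner = "Ann"       # public: accessible from inside and outside
        self._note = "internal"  # protected: accessible, but by convention not from outside
        self.__pin = "1234"      # double underscore: stored as _Account__pin

acct = Account()
public_value = acct.owner
protected_value = acct._note  # works, although the convention discourages it

try:
    acct.__pin  # the plain name does not exist on the instance
    direct_access = True
except AttributeError:
    direct_access = False

mangled_value = acct._Account__pin  # the mangled name is still reachable
print(public_value, protected_value, direct_access, mangled_value)
```

The single underscore is purely a convention, while the double underscore changes the stored attribute name, which is why the direct lookup fails but the mangled one succeeds.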
User-defined functions from:https://campus.datacamp.com/courses/python-data-science-toolbox-part-1/writing-your-own-functions?ex=1 Strings in Python To assign the string company = 'DataCamp' You've also learned to use the operations + and * with stri…
Original article: http://www.jianshu.com/p/94516a58314d Dataset transformations, Combining estimators, Feature extraction, Preprocessing data. 1 Dataset transformations scikit-learn provides a library of transformers, which may clean (see Preproce…
Learn how humans work to create a more effective computer interface. Three kinds of reasoning: Deductive Reasoning: Basically drawing a conclusion based on the data. Inductive Reasoning: If something is true for x, then it's true for x+1; if it's…
In the previous article "Data Preparation by Pandas and Scikit-Learn", we discussed a series of steps in data preparation. Scikit-Learn provides the Pipeline class to help with such sequences of transformations. The Pipeline constructor take…
In this article, we discuss some of the main steps in data preparation. Drop Labels First, we drop the labels from the training set, using the drop() method from the Pandas library. housing = strat_train_set.drop("median_house_value", axis=1) # drop labels for traini…
2-D Graphics Vector graphics: graphics used for drawing shapes with vertices, strokes, and fills. Raster graphics: a rectilinear array of pixels, each of which is assigned a color. Vector graphics are generally used to draw points or lines; raster grap…
import numpy as np
import matplotlib.pyplot as plt
import seaborn; seaborn.set()

rand = np.random.RandomState(42)
x = rand.rand(10, 2)  # 10x2 array
plt.scatter(x[:, 0], x[:, 1], s=100)  # first column is the x coordinate, second column the y coordinate; s=100 sets the marker size
plt.show()

any list of dictionaries can be made i…