[Machine Learning with Python] Familiar with Your Data
Here I list some useful functions in Python to get familiar with your data. As an example, we load a dataset named housing
which is a DataFrame object. Usually, the first thing to do is get top five rows the dataset by head()
function:
housing = load_housing_data()
housing.head()
The info()
method is useful to get a quick description of the data, in particular the total number of rows, and each attribute’s type and number of non-null values.
housing.info()
The describe()
function will return statistics including count, mean, median, std, min, max and quantiles of each feature.
housing.describe()
For categorical varibles, we usually hope to see the labels and the count for each label. value_counts()
function works here:
housing["ocean_proximity"].value_counts()
That’s it. I’ll update more functions if I meet in further study.
[Machine Learning with Python] Familiar with Your Data的更多相关文章
- Python (1) - 7 Steps to Mastering Machine Learning With Python
Step 1: Basic Python Skills install Anacondaincluding numpy, scikit-learn, and matplotlib Step 2: Fo ...
- Getting started with machine learning in Python
Getting started with machine learning in Python Machine learning is a field that uses algorithms to ...
- 《Learning scikit-learn Machine Learning in Python》chapter1
前言 由于实验原因,准备入坑 python 机器学习,而 python 机器学习常用的包就是 scikit-learn ,准备先了解一下这个工具.在这里搜了有 scikit-learn 关键字的书,找 ...
- 【Machine Learning】Python开发工具:Anaconda+Sublime
Python开发工具:Anaconda+Sublime 作者:白宁超 2016年12月23日21:24:51 摘要:随着机器学习和深度学习的热潮,各种图书层出不穷.然而多数是基础理论知识介绍,缺乏实现 ...
- Machine Learning的Python环境设置
Machine Learning目前经常使用的语言有Python.R和MATLAB.如果采用Python,需要安装大量的数学相关和Machine Learning的包.一般安装Anaconda,可以把 ...
- [Machine Learning with Python] My First Data Preprocessing Pipeline with Titanic Dataset
The Dataset was acquired from https://www.kaggle.com/c/titanic For data preprocessing, I firstly def ...
- [Machine Learning with Python] Data Preparation through Transformation Pipeline
In the former article "Data Preparation by Pandas and Scikit-Learn", we discussed about a ...
- [Machine Learning with Python] Data Preparation by Pandas and Scikit-Learn
In this article, we dicuss some main steps in data preparation. Drop Labels Firstly, we drop labels ...
- [Machine Learning with Python] How to get your data?
Using Pandas Library The simplest way is to read data from .csv files and store it as a data frame o ...
随机推荐
- 代理工具--fiddle
正则匹配 1)前缀为“EXACT:”表示完全匹配:只有match=rules时,才匹配 2)无前缀表示基本搜索,表示搜索到字符串就匹配:只要match中包含了rules的字符串,即可 3)前缀为“NO ...
- 将php数组转js数组,js如何接收PHP数组,json的用法
首先下载下面这个文件(这是一段是别人写出来专门解析json的代码),然后引入这个文件! http://pan.baidu.com/s/1dD8qVr7 现在当我们需要用ajax与后台进行交互时,怎样将 ...
- Thinkphp5 获取执行的sql语句
获取最后执行的sql语句 $str_order_action = db('order_action')->getLastSql(); //获取最后执行的sql语句 获取执行的sql语句 $ord ...
- 深入理解FIFO(包含有FIFO深度的解释)——转载
深入理解FIFO(包含有FIFO深度的解释) FIFO: 一.先入先出队列(First Input First Output,FIFO)这是一种传统的按序执行方法,先进入的指令先完成并引退,跟着才执行 ...
- HTTP认证之摘要认证——Digest(二)
导航 HTTP认证之基本认证--Basic(一) HTTP认证之基本认证--Basic(二) HTTP认证之摘要认证--Digest(一) HTTP认证之摘要认证--Digest(二) 在HTTP认证 ...
- 反转单词顺序 VS 左旋转字符串
题目一:输入一个英文句子,翻转句子中单词的顺序,但单词内字符的顺序不变.为简单起见,标垫符号和普通字母一样处理.例如输入字符串“I am a student.”,则输出“student. a am I ...
- python学习-- Django传递数据给JS
var List = {{ List|safe }};//safe 必须存在
- jQuery 遍历函数 ,javascript中的each遍历
jQuery 遍历函数 jQuery 遍历函数包括了用于筛选.查找和串联元素的方法. 函数 描述 .add() 将元素添加到匹配元素的集合中. .andSelf() 把堆栈中之前的元素集添加到当前集合 ...
- sql语句执行时算术运算导致溢出。
执行sql语句时报错: 用户代码未处理 System.OverflowException HResult=-2146233066 Message=算术运算导致溢出. 文章:https://bbs.cs ...
- hibernate缓存机制【转】
一.why(为什么要用Hibernate缓存?) Hibernate是一个持久层框架,经常访问物理数据库. 为了降低应用程序对物理数据源访问的频次,从而提高应用程序的运行性能. 缓存内的数据是对物理数 ...