Reading xlsx files into a pandas DataFrame
Refer to:
https://medium.com/@kasiarachuta/reading-and-writingexcel-files-in-python-pandas-8f0da449cc48
import pandas as pd
dframe = pd.read_excel("file_name.xlsx")
dframe = pd.read_excel("file_name.xlsx", sheet_name="Sheet_name")
dframe = pd.read_excel("file_name.xlsx", sheet_name=number)
The original article follows:
//////////////////////////////////////////////////////////////////////////////
Reading and writing Excel files in Python pandas
In data science, you will most often work with CSV files. However, knowing how to import and export Excel files is also very useful.
This post uses a Kaggle dataset on the 2016 US elections (https://www.kaggle.com/benhamner/d/benhamner/2016-us-election/primary-results-sample-data/output). The dataset was converted from a CSV file to an Excel file, and two sheets were added with the votes for Hillary Clinton (HilaryClinton) and Donald Trump (DonaldTrump). The first sheet (All) contains the original dataset.
Reading Excel files
dframe = pd.read_excel("file_name.xlsx")
Reading Excel files is very similar to reading CSV files. By default, the first sheet of the Excel file is read.

I’ve read an Excel file and viewed the first 5 rows
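As a minimal sketch of the default behaviour described above, the snippet below writes a tiny workbook to a temporary folder and reads it back without naming a sheet. The rows are made-up placeholders rather than the Kaggle data, and it assumes an Excel engine such as openpyxl is installed alongside pandas.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows (not the Kaggle election data).
sample = pd.DataFrame({
    "county": ["A", "B", "C"],
    "candidate": ["X", "Y", "X"],
    "votes": [10, 20, 30],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)  # writes one sheet
    dframe = pd.read_excel(path)        # no sheet_name given: the first sheet is read

print(dframe.head())  # head() shows up to the first 5 rows
```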
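dframe = pd.read_excel("file_name.xlsx", sheet_name="Sheet_name")
Passing the sheet_name argument lets you read a specific sheet of the Excel file. It is very handy if you know the sheet’s name. (Older pandas releases spelled this argument sheetname; that spelling was deprecated in pandas 0.21 and has since been removed.)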

I picked the sheet named “DonaldTrump”
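The following sketch shows selecting a sheet by name. It builds a small two-sheet workbook first so that it can run on its own; the sheet names follow the article, while the vote counts are made-up placeholders and openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows (not the Kaggle election data).
all_rows = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "candidate": ["Hillary Clinton", "Donald Trump", "Hillary Clinton", "Donald Trump"],
    "votes": [1, 2, 3, 4],
})
trump_rows = all_rows[all_rows["candidate"] == "Donald Trump"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    # ExcelWriter lets several DataFrames go into one workbook.
    with pd.ExcelWriter(path) as writer:
        all_rows.to_excel(writer, sheet_name="All", index=False)
        trump_rows.to_excel(writer, sheet_name="DonaldTrump", index=False)

    # Read only the sheet called "DonaldTrump".
    dframe = pd.read_excel(path, sheet_name="DonaldTrump")

print(dframe)
```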
dframe = pd.read_excel("file_name.xlsx", sheet_name=number)
If you aren’t sure what your sheets are called, you can pick them by position. Note that sheet positions start from 0 (like indices in pandas), not from 1.

I read the second sheet of the Excel file
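When the sheet names are unknown, one option is to list them with pd.ExcelFile before reading by position. The sketch below assumes openpyxl is installed and uses made-up placeholder rows; only the three sheet names come from the article.

```python
import os
import tempfile

import pandas as pd

# One small made-up frame per sheet (not the Kaggle election data).
sheets = {
    "All": pd.DataFrame({"candidate": ["Hillary Clinton", "Donald Trump"], "votes": [1, 2]}),
    "HilaryClinton": pd.DataFrame({"candidate": ["Hillary Clinton"], "votes": [1]}),
    "DonaldTrump": pd.DataFrame({"candidate": ["Donald Trump"], "votes": [2]}),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    with pd.ExcelWriter(path) as writer:
        for name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=name, index=False)

    with pd.ExcelFile(path) as xls:
        sheet_names = xls.sheet_names           # sheet names in workbook order
    second = pd.read_excel(path, sheet_name=1)  # position 1 is the second sheet

print(sheet_names)
print(second)
```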
dframe = pd.read_excel("file_name.xlsx", header=None)
Sometimes the top row does not contain the column names. In this case, pass header=None.

The first row is not the header — instead, we get the column names as numbers
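To illustrate, the sketch below writes a sheet that has no header row and reads it twice, once with the default settings and once with header=None. The rows are made-up placeholders and openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows (not the Kaggle election data).
sample = pd.DataFrame({
    "county": ["A", "B", "C"],
    "candidate": ["X", "Y", "X"],
    "votes": [10, 20, 30],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    # header=False writes the data rows only, with no column names.
    sample.to_excel(path, index=False, header=False)

    default_read = pd.read_excel(path)            # first data row is consumed as the header
    dframe = pd.read_excel(path, header=None)     # every row is kept as data

print(len(default_read), len(dframe))
print(list(dframe.columns))  # columns are numbered from 0
```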
dframe = pd.read_excel("file_name.xlsx", header=n)
Setting header to a number picks that row (counting from 0) as the column names.

I pick the second row (i.e. row index 1 of the original dataset) as my column names.
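A small sketch of header=n, assuming a sheet whose real column names sit on the second row beneath a title line. The title text and the rows are made up for illustration, and openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Raw cell grid: a made-up title line, then the column names, then the data.
raw = pd.DataFrame([
    ["Made-up title line", None, None],
    ["county", "candidate", "votes"],
    ["A", "X", 10],
    ["B", "Y", 20],
])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    raw.to_excel(path, index=False, header=False)  # write the grid exactly as laid out

    # Row index 1 (the second row) holds the column names; the row above it is dropped.
    dframe = pd.read_excel(path, header=1)

print(dframe)
```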
dframe = pd.read_excel("file_name.xlsx", index_col=number)
You can use a different column for the row labels by passing its position as the index_col argument.

I now use the county as the index column.
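For example, with a made-up sheet whose first column is the county, index_col=0 turns that column into the row labels. This is only a sketch with placeholder rows; it assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows (not the Kaggle election data).
sample = pd.DataFrame({
    "county": ["A", "B", "C"],
    "candidate": ["X", "Y", "X"],
    "votes": [10, 20, 30],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)

    # Column 0 ("county") becomes the index instead of a regular column.
    dframe = pd.read_excel(path, index_col=0)

print(dframe)
print(dframe.loc["B", "votes"])  # rows can now be looked up by county label
```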
dframe = pd.read_excel("file_name.xlsx", skiprows=n)
Sometimes you don’t want to include all of the rows. To skip the first n rows, pass skiprows=n.

Skipping the first two rows (including the header)
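One detail worth checking when the skipped rows include the header: the first row that remains is then treated as the header. The sketch below, using made-up placeholder rows and assuming openpyxl is installed, compares skiprows=2 with a list of row positions, which skips data rows while keeping the original column names.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows (not the Kaggle election data).
sample = pd.DataFrame({
    "county": ["A", "B", "C", "D"],
    "candidate": ["X", "Y", "X", "Y"],
    "votes": [10, 20, 30, 40],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)

    # Skips the header row and row "A"; row "B" is then used as the header.
    first_two_skipped = pd.read_excel(path, skiprows=2)

    # Skips sheet rows 1 and 2 (the first two data rows) and keeps the header in row 0.
    header_kept = pd.read_excel(path, skiprows=[1, 2])

print(first_two_skipped)
print(header_kept)
```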
Writing an Excel file
dframe.to_excel("file_name.xlsx")

I wrote an Excel file called results.xlsx from my results DataFrame

My exported Excel file
dframe.to_excel("file_name.xlsx", index=False)
If you don’t want to include the index (for example, here it is a number, so it may be meaningless for future use or analysis), pass another argument, setting index to False.

I don’t want index names in my Excel file

Excel file output with no index names
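The effect of index=False can be checked by writing the same DataFrame both ways and reading each file back. This is a sketch with made-up rows and hypothetical file names, assuming openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder rows standing in for a results DataFrame.
results = pd.DataFrame({"county": ["A", "B"], "votes": [10, 20]})

with tempfile.TemporaryDirectory() as tmp:
    with_index_path = os.path.join(tmp, "with_index.xlsx")
    no_index_path = os.path.join(tmp, "no_index.xlsx")

    results.to_excel(with_index_path)             # default: the index is written as the first column
    results.to_excel(no_index_path, index=False)  # the index is left out

    with_index = pd.read_excel(with_index_path)
    no_index = pd.read_excel(no_index_path)

print(list(with_index.columns))  # the unnamed index comes back as an extra column
print(list(no_index.columns))
```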
All of the code can be found on my GitHub: https://github.com/kasiarachuta/Blog/blob/master/Reading%20and%20writing%20Excel%20files.ipynb