Reference:

https://medium.com/@kasiarachuta/reading-and-writingexcel-files-in-python-pandas-8f0da449cc48

dframe = pd.read_excel("file_name.xlsx")

dframe = pd.read_excel("file_name.xlsx", sheet_name="Sheet_name")

dframe = pd.read_excel("file_name.xlsx", sheet_name=number)

The original article follows:

//////////////////////////////////////////////////////////////////////////////

Reading and writing Excel files in Python pandas

In data science, you will most likely work with CSV files most of the time. However, knowing how to import and export Excel files is also very useful.

This post uses a Kaggle dataset on the 2016 US elections (https://www.kaggle.com/benhamner/d/benhamner/2016-us-election/primary-results-sample-data/output). The dataset was converted from a CSV file to an Excel file, and two sheets were added with the votes for Hillary Clinton (HilaryClinton) and Donald Trump (DonaldTrump). The first sheet (All) contains the original dataset.

Reading Excel files

dframe = pd.read_excel("file_name.xlsx")

Reading Excel files is very similar to reading CSV files. By default, the first sheet of the Excel file is read.

 

I’ve read an Excel file and viewed the first 5 rows
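
A minimal sketch of the default read, assuming pandas and the openpyxl engine are installed. The sample rows, column names, and temporary file path are made-up stand-ins for the election workbook, not the original data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the election workbook.
sample = pd.DataFrame({
    "state": ["Alabama", "Alabama", "Alaska"],
    "candidate": ["Donald Trump", "Hillary Clinton", "Donald Trump"],
    "votes": [100, 80, 20],  # made-up numbers
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)  # write a one-sheet workbook to read back

    # With no sheet_name given, the first sheet is read.
    dframe = pd.read_excel(path)

print(dframe.head())  # head() shows at most the first 5 rows
```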

dframe = pd.read_excel("file_name.xlsx", sheet_name="Sheet_name")

Passing the sheet_name argument lets you read a specific sheet of the Excel file. It is very handy if you know the sheet’s name. (Older pandas releases spelled this argument sheetname; current releases only accept sheet_name.)

 

I picked the sheet named “DonaldTrump”

dframe = pd.read_excel("file_name.xlsx", sheet_name=number)

If you aren’t sure what your sheets are called, you can select them by position. Note that sheet positions start from 0 (like positional indices in pandas), not from 1.

 

I read the second sheet of the Excel file
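
The two ways of selecting a sheet can be compared side by side. This is a sketch under assumptions: the workbook is built on the fly with made-up rows, and only the sheet names All and DonaldTrump come from the post. It also shows pd.ExcelFile, which lists the sheet names when you do not know them.

```python
import os
import tempfile

import pandas as pd

# Hypothetical rows; only the sheet names mirror the post.
all_rows = pd.DataFrame({
    "state": ["Alabama", "Alabama", "Alaska"],
    "candidate": ["Donald Trump", "Hillary Clinton", "Donald Trump"],
    "votes": [100, 80, 20],  # made-up numbers
})
trump_rows = all_rows[all_rows["candidate"] == "Donald Trump"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    # ExcelWriter lets several DataFrames go into one workbook.
    with pd.ExcelWriter(path) as writer:
        all_rows.to_excel(writer, sheet_name="All", index=False)
        trump_rows.to_excel(writer, sheet_name="DonaldTrump", index=False)

    by_name = pd.read_excel(path, sheet_name="DonaldTrump")
    by_position = pd.read_excel(path, sheet_name=1)  # second sheet, counting from 0

    # List the sheet names without loading any data into a DataFrame.
    with pd.ExcelFile(path) as workbook:
        sheet_names = workbook.sheet_names

print(sheet_names)
print(by_name.equals(by_position))
```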

dframe = pd.read_excel("file_name.xlsx", header=None)

Sometimes, the top row does not contain the column names. In this case, pass header=None so that the first row is read as data.

 

The first row is not treated as the header; instead, the columns are labeled with integers

dframe = pd.read_excel("file_name.xlsx", header=n)

Setting header to an integer n uses row n (counting from 0) as the column names. Rows above it are discarded.

 

I pick the second row (i.e. row index 1 of the original dataset) as my column names.
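
To see both header options on the same file, the sketch below writes a sheet whose first row is a title line and whose second row holds the real column names. The sheet contents are hypothetical and chosen only to make the difference visible.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sheet layout: a title row, then the real header, then data.
raw = pd.DataFrame([
    ["2016 primary results", None, None],
    ["state", "candidate", "votes"],
    ["Alabama", "Donald Trump", 100],
    ["Alaska", "Donald Trump", 20],
])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    raw.to_excel(path, header=False, index=False)  # write the cells exactly as laid out

    # header=None: every row is data, columns are labeled 0, 1, 2.
    no_header = pd.read_excel(path, header=None)

    # header=1: the second row supplies the column names; the title row is dropped.
    with_header = pd.read_excel(path, header=1)

print(no_header)
print(with_header)
```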

dframe = pd.read_excel("file_name.xlsx", index_col=number)

Passing a column position (counting from 0) as index_col uses that column for the row labels instead of the default integer index.

 

I now use the county as the index column.
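
A short sketch of index_col, assuming a hypothetical sheet where the county sits in the second column (position 1). The county names and vote counts are invented for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data with the county in column position 1.
sample = pd.DataFrame({
    "state": ["Alabama", "Alabama", "Alaska"],
    "county": ["County A", "County B", "County C"],
    "votes": [100, 80, 20],  # made-up numbers
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)

    # Column 1 ("county") becomes the row labels and is removed from the columns.
    dframe = pd.read_excel(path, index_col=1)

print(dframe)
```

Once the county is the index, rows can be looked up by label, for example dframe.loc["County B"].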

dframe = pd.read_excel("file_name.xlsx", skiprows=n)

Sometimes, you don’t want to include all of the rows. To skip the first n rows of the sheet, pass skiprows=n. The count includes the header row, so the first row that remains is used for the column names.

 

Skipping the first two rows (including the header)
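
Because an integer skiprows also consumes the header row, it can help to compare it with the list form, which skips specific row numbers and leaves the header in place. The sketch below uses made-up data; the list variant is an addition for contrast and is not shown in the original post.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data: one header row plus four data rows.
sample = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona", "Arkansas"],
    "votes": [100, 20, 60, 40],  # made-up numbers
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name.xlsx")
    sample.to_excel(path, index=False)

    # skiprows=2 drops the header and the first data row;
    # the "Alaska" row is then taken as the column names.
    skip_first_two = pd.read_excel(path, skiprows=2)

    # A list skips those exact sheet rows (0 is the header), so the header survives.
    keep_header = pd.read_excel(path, skiprows=[1, 2])

print(skip_first_two)
print(keep_header)
```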

Writing an Excel file

dframe.to_excel("file_name.xlsx")

 

I wrote an Excel file called results.xlsx from my results DataFrame

 

My exported Excel file

dframe.to_excel("file_name.xlsx", index=False)

If you don’t want to write the index (the row labels) to the file, for example because it is just a default integer range that is meaningless for later analysis, pass index=False.

 

I don’t want index names in my Excel file

 

Excel file output with no index names
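
One way to check the effect of index=False is to write the same DataFrame twice and read both files back. This is a sketch with hypothetical file names and made-up values, assuming the openpyxl engine is installed for writing .xlsx files.

```python
import os
import tempfile

import pandas as pd

# Hypothetical results table with the default integer index.
results = pd.DataFrame({
    "candidate": ["Donald Trump", "Hillary Clinton"],
    "votes": [120, 80],  # made-up numbers
})

with tempfile.TemporaryDirectory() as tmp:
    with_index_path = os.path.join(tmp, "results_with_index.xlsx")
    no_index_path = os.path.join(tmp, "results.xlsx")

    results.to_excel(with_index_path)            # index written as the first column
    results.to_excel(no_index_path, index=False)  # only the data columns are written

    with_index = pd.read_excel(with_index_path)
    without_index = pd.read_excel(no_index_path)

# The saved index comes back as an extra, unnamed column.
print(list(with_index.columns))
print(list(without_index.columns))
```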

All of the code can be found on my GitHub: https://github.com/kasiarachuta/Blog/blob/master/Reading%20and%20writing%20Excel%20files.ipynb
