read_excel()

The loading function is read_excel(). Its full signature is:

read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, index_col=None, names=None, parse_cols=None, parse_dates=False, date_parser=None, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, dtype=None, true_values=None, false_values=None, engine=None, squeeze=False, **kwds)

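Commonly used parameters:

  • io : string or path object. Path to the Excel file.

  • sheetname : string, int, mixed list of strings/ints, or None, default 0. The sheet(s) to read. Pass a list such as sheetname=[0,1] to read several sheets, or sheetname=None to read all of them. Note: an int or string returns a DataFrame, while None or a list returns a dict of DataFrames. (pandas 0.21 and later spell this parameter sheet_name.)

  • header : int, list of ints, default 0. The row to use for the column labels. The default 0 takes the first row, and the data is everything below the header row. If the sheet has no header row, set header=None.

  • skiprows : list-like. Rows to skip at the beginning (0-indexed).

  • skip_footer : int, default 0. Number of rows to skip at the end.

  • index_col : int, list of ints, default None. The column to use as the index. A column label given as a string (e.g. u"NUM-3") can also be used.

  • names : array-like, default None. The column names to use.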

Data source:

sheet1:
ID NUM-1 NUM-2 NUM-3
36901 142 168 661
36902 78 521 602
36903 144 600 521
36904 95 457 468
36905 69 596 695

sheet2:
ID NUM-1 NUM-2 NUM-3
36906 190 527 691
36907 101 403 470

(1) Basic call

import pandas as pd

basestation = "F://pythonBook_PyPDAM/data/test.xls"
data = pd.read_excel(basestation)  # reads the first sheet by default
print(data)

Output: a DataFrame

      ID  NUM-1  NUM-2  NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695

(2) sheetname parameter: pass sheetname=[0,1] to read several sheets, or sheetname=None to read every sheet. Note: an int or string returns a DataFrame, while None or a list returns a dict of DataFrames.

data_1 = pd.read_excel(basestation, sheetname=[0,1])
print(data_1)
print(type(data_1))

Output: a dict of DataFrames

OrderedDict([(0,       ID  NUM-1  NUM-2  NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695),
(1, ID NUM-1 NUM-2 NUM-3
0 36906 190 527 691
1 36907 101 403 470)])
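
Because the return type depends on what is passed as sheetname, code that consumes the result often has to handle both shapes. The following is a minimal sketch of one way to normalize them. The helper name as_sheet_dict is hypothetical (it is not part of pandas), and small in-memory DataFrames stand in for what read_excel would return for the sample workbook.

```python
import pandas as pd


def as_sheet_dict(result, default_key=0):
    """Return a {sheet key: DataFrame} mapping for either return shape.

    as_sheet_dict is a hypothetical helper for illustration, not a pandas API.
    """
    if isinstance(result, pd.DataFrame):
        # int/str sheet selector: a single DataFrame was returned
        return {default_key: result}
    # list/None sheet selector: a dict of DataFrames was returned
    return dict(result)


# Stand-ins for what read_excel would return for the sample workbook
sheet1 = pd.DataFrame({"ID": [36901, 36902], "NUM-1": [142, 78]})
sheet2 = pd.DataFrame({"ID": [36906, 36907], "NUM-1": [190, 101]})

single = as_sheet_dict(sheet1)                    # as if sheetname=0
multiple = as_sheet_dict({0: sheet1, 1: sheet2})  # as if sheetname=[0, 1]

# Stack every sheet into one frame, keeping the sheet key in the index
combined = pd.concat(multiple, names=["sheet", "row"])
print(combined)
```

Normalizing to a dict first means the rest of the code can loop over sheets without checking which form of sheetname was used.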

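(3) header parameter: the row to use for the column labels. The default 0 takes the first row, and the data is everything below that row. If the sheet has no header row, set header=None. Note that the sample sheet does have a header row, so with header=None that row shows up as the first data row.

data = pd.read_excel(basestation, header=None)
print(data)

Output:
0 1 2 3
0 ID NUM-1 NUM-2 NUM-3
1 36901 142 168 661
2 36902 78 521 602
3 36903 144 600 521
4 36904 95 457 468
5 36905 69 596 695

data = pd.read_excel(basestation, header=[3])
print(data)

Output: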
36903 144 600 521
0 36904 95 457 468
1 36905 69 596 695

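(4) skiprows parameter: rows to skip (0-indexed). Here skiprows=[1] drops row 1, the first data row; row 0 still supplies the column names.

data = pd.read_excel(basestation, skiprows=[1])
print(data)

Output: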
ID NUM-1 NUM-2 NUM-3
0 36902 78 521 602
1 36903 144 600 521
2 36904 95 457 468
3 36905 69 596 695

(5) skip_footer parameter: the number of rows to skip, counted from the end of the sheet.

data = pd.read_excel(basestation, skip_footer=3)
print(data)

Output:
ID NUM-1 NUM-2 NUM-3
0 36901 142 168 661
1 36902 78 521 602

(6) index_col parameter: the column to use as the index. A column label given as a string (e.g. u"NUM-3") can also be used.

data = pd.read_excel(basestation, index_col="NUM-3")
print(data)

Output:
ID NUM-1 NUM-2
NUM-3
661 36901 142 168
602 36902 78 521
521 36903 144 600
468 36904 95 457
695 36905 69 596

(7) names parameter: the column names to use.

data = pd.read_excel(basestation, names=["a","b","c","e"])
print(data)

Output:
a b c e
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
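
The options in (3) to (7) are row and column parsing options that pandas.read_csv also accepts. As a rough sketch, the snippet below replays them on an in-memory CSV copy of sheet1, so it runs without an Excel file or an Excel engine. Using read_csv here is a substitution made for illustration, and it differs in two spellings: the footer option is skipfooter (and needs engine="python"), and header=0 has to be passed explicitly when names replaces an existing header row. The load helper is a name introduced for this example.

```python
import io

import pandas as pd

# In-memory CSV copy of sheet1 (a stand-in for the Excel file)
CSV_TEXT = "\n".join([
    "ID,NUM-1,NUM-2,NUM-3",
    "36901,142,168,661",
    "36902,78,521,602",
    "36903,144,600,521",
    "36904,95,457,468",
    "36905,69,596,695",
])


def load(**options):
    """Parse the sample data with the given parser options."""
    return pd.read_csv(io.StringIO(CSV_TEXT), **options)


no_header = load(header=None)       # header row is kept as data row 0
row3_header = load(header=3)        # row 3 supplies the column labels
skipped = load(skiprows=[1])        # drop row 1, the first data row
trimmed = load(skipfooter=3, engine="python")  # drop the last 3 rows
indexed = load(index_col="NUM-3")   # use NUM-3 as the row index
renamed = load(header=0, names=["a", "b", "c", "e"])  # replace the labels

for label, frame in [("header=None", no_header), ("skiprows=[1]", skipped),
                     ("index_col", indexed), ("names", renamed)]:
    print(label)
    print(frame, end="\n\n")
```

The printed frames have the same shape as the read_excel outputs shown in (3) to (7), which makes it easy to experiment with the options before pointing them at a real workbook.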

The full parameter reference:

>>> help(pandas.read_excel)
Help on function read_excel in module pandas.io.excel:

read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, index_col=None, names=None, parse_cols=None, parse_dates=False, date_parser=None, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, dtype=None, true_values=None, false_values=None, engine=None, squeeze=False, **kwds)
    Read an Excel table into a pandas DataFrame

    Parameters
    ----------
    io : string, path object (pathlib.Path or py._path.local.LocalPath),
        file-like object, pandas ExcelFile, or xlrd workbook.
        The string could be a URL. Valid URL schemes include http, ftp, s3,
        and file. For file URLs, a host is expected. For instance, a local
        file could be file://localhost/path/to/workbook.xlsx
    sheetname : string, int, mixed list of strings/ints, or None, default 0

        Strings are used for sheet names, Integers are used in zero-indexed
        sheet positions.

        Lists of strings/integers are used to request multiple sheets.

        Specify None to get all sheets.

        str|int -> DataFrame is returned.
        list|None -> Dict of DataFrames is returned, with keys representing
        sheets.

        Available Cases

        * Defaults to 0 -> 1st sheet as a DataFrame
        * 1 -> 2nd sheet as a DataFrame
        * "Sheet1" -> 1st sheet as a DataFrame
        * [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
        * None -> All sheets as a dictionary of DataFrames

    header : int, list of ints, default 0
        Row (0-indexed) to use for the column labels of the parsed
        DataFrame. If a list of integers is passed those row positions will
        be combined into a ``MultiIndex``
    skiprows : list-like
        Rows to skip at the beginning (0-indexed)
    skip_footer : int, default 0
        Rows at the end to skip (0-indexed)
    index_col : int, list of ints, default None
        Column (0-indexed) to use as the row labels of the DataFrame.
        Pass None if there is no such column. If a list is passed,
        those columns will be combined into a ``MultiIndex``. If a
        subset of data is selected with ``parse_cols``, index_col
        is based on the subset.
    names : array-like, default None
        List of column names to use. If file contains no header row,
        then you should explicitly pass header=None
    converters : dict, default None
        Dict of functions for converting values in certain columns. Keys can
        either be integers or column labels, values are functions that take one
        input argument, the Excel cell content, and return the transformed
        content.
    dtype : Type name or dict of column -> type, default None
        Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}
        Use `object` to preserve data as stored in Excel and not interpret dtype.
        If converters are specified, they will be applied INSTEAD
        of dtype conversion.

        .. versionadded:: 0.20.0

    true_values : list, default None
        Values to consider as True

        .. versionadded:: 0.19.0

    false_values : list, default None
        Values to consider as False

        .. versionadded:: 0.19.0

    parse_cols : int or list, default None
        * If None then parse all columns,
        * If int then indicates last column to be parsed
        * If list of ints then indicates list of column numbers to be parsed
        * If string then indicates comma separated list of Excel column letters and
          column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of
          both sides.
    squeeze : boolean, default False
        If the parsed data only contains one column then return a Series
    na_values : scalar, str, list-like, or dict, default None
        Additional strings to recognize as NA/NaN. If dict passed, specific
        per-column NA values. By default the following values are interpreted
        as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan',
        '1.#IND', '1.#QNAN', 'N/A', 'NA', 'NULL', 'NaN', 'nan'.
    thousands : str, default None
        Thousands separator for parsing string columns to numeric. Note that
        this parameter is only necessary for columns stored as TEXT in Excel,
        any numeric columns will automatically be parsed, regardless of display
        format.
    keep_default_na : bool, default True
        If na_values are specified and keep_default_na is False the default NaN
        values are overridden, otherwise they're appended to.
    verbose : boolean, default False
        Indicate number of NA values placed in non-numeric columns
    engine: string, default None
        If io is not a buffer or path, this must be set to identify io.
        Acceptable values are None or xlrd
    convert_float : boolean, default True
        convert integral floats to int (i.e., 1.0 --> 1). If False, all numeric
        data will be read in as floats: Excel stores all numbers as floats
        internally
    has_index_names : boolean, default None
        DEPRECATED: for version 0.17+ index names will be automatically
        inferred based on index_col. To read Excel output from 0.16.2 and
        prior that had saved index names, use True.

    Returns

to_excel()

The function for saving is pd.DataFrame.to_excel(). Note that it is a DataFrame method: it writes a DataFrame to an Excel sheet. Its full signature is:

to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)

Commonly used parameters:

  • excel_writer : string or ExcelWriter object. File path or existing ExcelWriter, i.e. the output target.

  • sheet_name : string, default 'Sheet1'. Name of the sheet that will contain the DataFrame.

  • na_rep : string, default ''. Missing data representation, i.e. the string written in place of missing values.

  • float_format : string, default None. Format string for floating point numbers.

  • columns : sequence, optional. The columns to write.

  • header : boolean or list of string, default True. Write out column names. If a list of strings is given, it is assumed to be aliases for the column names.

  • index : boolean, default True. Write row names (index).

  • index_label : string or sequence, default None. Column label for index column(s) if desired. If None is given, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.

  • startrow : upper left cell row to dump data frame.

  • startcol : upper left cell column to dump data frame.

  • engine : string, default None. Write engine to use. You can also set this via the options io.excel.xlsx.writer, io.excel.xls.writer, and io.excel.xlsm.writer.

  • merge_cells : boolean, default True. Write MultiIndex and Hierarchical Rows as merged cells.

  • encoding : string, default None. Encoding of the resulting Excel file. Only necessary for xlwt; other writers support unicode natively.

  • inf_rep : string, default 'inf'. Representation for infinity (there is no native representation for infinity in Excel).

  • freeze_panes : tuple of integer (length 2), default None. Specifies the one-based bottommost row and rightmost column that is to be frozen.

Data source (row 5 has no NUM-3 value):

    ID  NUM-1   NUM-2   NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
5 36906 165 453

Load the data:

basestation = "F://python/data/test.xls"
basestation_end = "F://python/data/test_end.xls"
data = pd.read_excel(basestation)

(1) excel_writer parameter: the output path.

data.to_excel(basestation_end)

Output:
ID NUM-1 NUM-2 NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
5 36906 165 453

(2) sheet_name: the name of the sheet in the Excel file where the data is stored.

data.to_excel(basestation_end, sheet_name="sheet2")
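
Calling to_excel with a file path writes a new file each time, so a second call with a different sheet_name replaces the first file instead of adding a sheet. To put several sheets into one workbook, the DataFrames can share a pd.ExcelWriter. The sketch below assumes a recent pandas (where read_excel takes sheet_name rather than sheetname) and the optional openpyxl engine. It writes to an in-memory buffer in place of a path such as test_end.xlsx, and write_and_read_back is a name introduced for this example.

```python
import io

import pandas as pd


def write_and_read_back(frames):
    """Write {sheet name: DataFrame} to one workbook and read it all back."""
    buffer = io.BytesIO()  # in-memory stand-in for a path like test_end.xlsx
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        for name, frame in frames.items():
            frame.to_excel(writer, sheet_name=name, index=False)
    buffer.seek(0)
    # sheet_name=None returns every sheet as {sheet name: DataFrame}
    return pd.read_excel(buffer, sheet_name=None, engine="openpyxl")


frames = {
    "sheet1": pd.DataFrame({"ID": [36901, 36902], "NUM-1": [142, 78]}),
    "sheet2": pd.DataFrame({"ID": [36906, 36907], "NUM-1": [190, 101]}),
}

try:
    sheets = write_and_read_back(frames)
except ImportError:
    sheets = None  # openpyxl is an optional dependency and may be missing
    print("openpyxl is not installed; skipping the round trip")
else:
    print(list(sheets))
    print(sheets["sheet2"])
```

With a real file, the buffer would simply be replaced by the output path passed to pd.ExcelWriter.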

(3) na_rep: the string written in place of missing values.

data.to_excel(basestation_end, na_rep="NULL")

Output:
ID NUM-1 NUM-2 NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
5 36906 165 453 NULL

(4) columns parameter: sequence, optional. Selects which columns to write.

data.to_excel(basestation_end, columns=["ID"])

Output:
ID
0 36901
1 36902
2 36903
3 36904
4 36905
5 36906

(5) header parameter: boolean or list of string, default True. A list can be passed to rename the columns in the output. With header=False the header row is not written.

data.to_excel(basestation_end, header=["a","b","c","d"])

Output:
a b c d
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
5 36906 165 453

data.to_excel(basestation_end, header=False, columns=["ID"])

With header=False the header row is not written.

Output:
0 36901
1 36902
2 36903
3 36904
4 36905
5 36906

(6) index : boolean, default True. Write row names (index).

The default True writes the index; with index=False the row index (names) is not written.

index_label : string or sequence, default None.

Sets the column label of the index column.

data.to_excel(basestation_end, index=False)

Output:
ID NUM-1 NUM-2 NUM-3
36901 142 168 661
36902 78 521 602
36903 144 600 521
36904 95 457 468
36905 69 596 695
36906 165 453

data.to_excel(basestation_end, index_label=["f"])

Output:
f ID NUM-1 NUM-2 NUM-3
0 36901 142 168 661
1 36902 78 521 602
2 36903 144 600 521
3 36904 95 457 468
4 36905 69 596 695
5 36906 165 453
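
The formatting options shown in (3) to (6) (na_rep, columns, header, index, index_label) are also accepted by DataFrame.to_csv. As a rough sketch, the snippet below renders a two-row excerpt of the sample data to CSV text so the effect of each option is visible without an Excel writer installed. Using to_csv is a substitution made for illustration; the to_excel calls in this article take the same arguments.

```python
import pandas as pd

# Two-row excerpt of the sample data; 36906 has no NUM-3 value
data = pd.DataFrame({
    "ID": [36905, 36906],
    "NUM-1": [69, 165],
    "NUM-3": [695, None],
})

with_null = data.to_csv(na_rep="NULL")               # fill missing values
only_id = data.to_csv(columns=["ID"], header=False)  # one column, no header
no_index = data.to_csv(index=False)                  # omit the row index
labeled = data.to_csv(index_label="f")               # name the index column

for text in (with_null, only_id, no_index, labeled):
    print(text)
```

Because of the missing value, the NUM-3 column is stored as floating point, so 695 is written as 695.0 in this CSV rendering.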

Source: https://blog.csdn.net/tongxinzhazha/article/details/78796952
