To be continued

Examples:

example 1:

    # Code based on Python 3.x
    # _*_ coding: utf-8 _*_
    # __Author: "LEMON"

    import pandas as pd

    # Seven consecutive days; the start date matches the output below
    d = pd.date_range('2017-01-01', periods=7)
    aList = list(range(1, 8))
    df = pd.DataFrame(aList, index=d, columns=[' '])
    df.index.name = 'value'

    print('----------df.index---------')
    print(df.index)

    print('---------df.columns---------')
    print(df.columns)

    print('----------df.values---------')
    print(df.values)

    print('----------df.describe--------')
    # Without parentheses this prints the bound method itself;
    # call df.describe() to get the summary statistics
    print(df.describe)

    print('----------information details--------')
    print(df.head(2))  # first n rows
    print(df.tail(3))  # last n rows
    print(df[3:5])     # df[a:b]: positions a to b-1 (0-based), i.e. the (a+1)-th through b-th rows

Output:

    ----------df.index---------
    DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
                   '2017-01-05', '2017-01-06', '2017-01-07'],
                  dtype='datetime64[ns]', name='value', freq='D')
    ---------df.columns---------
    Index([' '], dtype='object')
    ----------df.values---------
    [[1]
     [2]
     [3]
     [4]
     [5]
     [6]
     [7]]
    ----------df.describe--------
    <bound method NDFrame.describe of
    value
    2017-01-01  1
    2017-01-02  2
    2017-01-03  3
    2017-01-04  4
    2017-01-05  5
    2017-01-06  6
    2017-01-07  7>
    ----------information details--------
    value
    2017-01-01  1
    2017-01-02  2
    value
    2017-01-05  5
    2017-01-06  6
    2017-01-07  7
    value
    2017-01-04  4
    2017-01-05  5
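
`print(df.describe)` in example 1 prints the bound method rather than any statistics, because the method is never called. A minimal sketch of the difference, together with the half-open behaviour of `df[a:b]`, is shown here. The column name `n` is an illustrative choice and is not taken from the original code.

```python
import pandas as pd

# Same shape of data as example 1: seven consecutive days, values 1..7.
# The column name 'n' is an illustrative choice, not from the original.
idx = pd.date_range('2017-01-01', periods=7)
df = pd.DataFrame({'n': list(range(1, 8))}, index=idx)

# Calling describe() (with parentheses) returns the summary statistics.
stats = df.describe()
print(stats)

# df[a:b] slices rows by position and excludes b, like a Python list slice.
sliced = df[3:5]
print(sliced)
```

With this data, `stats` contains the count, mean, standard deviation, minimum, quartiles and maximum of the single column, and `sliced` holds the 4th and 5th rows.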

example 2:

    # Code based on Python 3.x
    # _*_ coding: utf-8 _*_
    # __Author: "LEMON"

    from pandas import Series, DataFrame
    import pandas as pd

    data = {'state': ['Ohino', 'Ohino', 'Ohino', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}

    # 'name' is not a key of data, so that column is filled with NaN
    df = DataFrame(data, index=list(range(1, 6)),
                   columns=['year', 'state', 'pop', 'name'])
    print(df)

    print('\n', '---------------')
    print(list(df.loc[3]))  # row with index label 3 (.ix was removed in pandas 1.0)

    print('\n', '---------------')
    print(list(df['year']))

    aList = ['1', '2', '3', '4']  # values as shown in the output below
    bList = ['aa', 'bb', 'cb', 'dd']
    cList = ['lemon', 'apple', 'orange', 'banana']

    d = {'num': aList, 'char': bList, 'fruit': cList}

    # Older pandas sorted dict keys alphabetically; columns= pins that same order
    df1 = DataFrame(d, index=['a', 'b', 'c', 'd'],
                    columns=['char', 'fruit', 'num'])
    # df2 = DataFrame(bList)
    print('\n', '---------------')
    print(df1)
    # print(df1.num)

    print('\n', '---------------')
    print(df1.loc['b'])  # row with index label 'b'

    print('\n', '---------------')
    print(df1.iloc[:2, 1:3])  # select part of the data by positional slices

Output:

       year   state  pop name
    1  2000   Ohino  1.5  NaN
    2  2001   Ohino  1.7  NaN
    3  2002   Ohino  3.6  NaN
    4  2001  Nevada  2.4  NaN
    5  2002  Nevada  2.9  NaN

     ---------------
    [2002, 'Ohino', 3.6000000000000001, nan]

     ---------------
    [2000, 2001, 2002, 2001, 2002]

     ---------------
      char   fruit num
    a   aa   lemon   1
    b   bb   apple   2
    c   cb  orange   3
    d   dd  banana   4

     ---------------
    char        bb
    fruit    apple
    num          2
    Name: b, dtype: object

     ---------------
       fruit num
    a  lemon   1
    b  apple   2
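
Older pandas code often selects rows with `.ix`, which was deprecated in pandas 0.20 and removed in 1.0. A minimal sketch of the label-based (`.loc`) and position-based (`.iloc`) equivalents follows, reusing the second frame from example 2. The `num` values are taken from the printed output, and the explicit `columns=` order is an assumption made to reproduce the column layout shown there.

```python
import pandas as pd

d = {'num': ['1', '2', '3', '4'],
     'char': ['aa', 'bb', 'cb', 'dd'],
     'fruit': ['lemon', 'apple', 'orange', 'banana']}

# Pin the column order explicitly so positional slices are predictable.
df1 = pd.DataFrame(d, index=['a', 'b', 'c', 'd'],
                   columns=['char', 'fruit', 'num'])

row_b = df1.loc['b']                          # one row, selected by label
block = df1.iloc[:2, 1:3]                     # positions: end is excluded
same_block = df1.loc['a':'b', 'fruit':'num']  # labels: both ends are included

print(row_b)
print(block)
```

The two slices return the same rows and columns, which illustrates the main difference between the accessors: positional slices are half-open, while label slices include their end label.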

example 3 (data selection with DataFrame.loc and DataFrame.iloc):

    # Code based on Python 3.x
    # _*_ coding: utf-8 _*_
    # __Author: "LEMON"

    # Note: matplotlib.finance was deprecated in matplotlib 2.0 and removed in
    # later releases, so this import only works on old matplotlib versions
    from matplotlib.finance import quotes_historical_yahoo_ochl
    from datetime import date
    import pandas as pd

    today = date.today()

    # today.month + 11 exceeds 12 unless the script is run in January
    start = (today.year-4, today.month+11, today.day-1)
    end = (today.year-4, today.month+11, today.day+3)
    quotes = quotes_historical_yahoo_ochl('AMX', start, end)
    # each item in quotes is a tuple

    fields = ['date', 'open', 'close', 'high', 'low', 'volume']

    quotes1 = []
    for t in quotes:
        t1 = list(t)
        quotes1.append(t1)
    # each item in quotes1 is a list

    for i in range(0, len(quotes1)):
        quotes1[i][0] = date.fromordinal(int(quotes1[i][0]))
    # the ordinal in column 0 is now a date object

    df = pd.DataFrame(quotes1, index=range(1, len(quotes1)+1), columns=fields)
    # df = pd.DataFrame(quotes1, index=['a','b','c','d','e'], columns=fields)
    # df = df.drop(['date'], axis=1)

    print(df)

    print(df['close'].mean())  # mean of a single column
    # print(dict(df.mean())['close'])  # mean of a single column

    print(df.sort_values(['open'], ascending=True))  # sort; ascending (True) is the default
    print(df[df.open>=21].date)

    # the index consists of integers
    print(df.loc[2:5, 'date':'close'])
    # label 5 does not exist here: old pandas returned a NaN row, pandas 1.0+ raises KeyError
    print(df.loc[[2,5],['open','close']])
    # loc selects rows and columns by label, either as a contiguous range or as individual rows or columns
    print(df.iloc[1:6,0:4])  # iloc selects by position, using slices

    # the index consists of labels
    # print(df.loc['a':'d', 'date':'close'])
    # print(df.loc[['b','e'],['open','close']])
    # loc selects rows and columns by label, either as a contiguous range or as individual rows or columns

    # select data by a condition
    print(df[(df.index>=4) & (df.open>=21)])

    # means of the DataFrame (numeric_only skips the date column)
    print(df.mean(numeric_only=True))          # mean of each column by default
    print(df.mean(axis=1, numeric_only=True))  # axis=1 computes the mean of each row

    '''
    # fetch quotes for several stocks
    d1 = (today.year-1, today.month+11, today.day)

    aList = ['BABA', 'KO', 'AMX']  # list of company stock codes

    for i in aList:
        q1 = quotes_historical_yahoo_ochl(i, d1, today)
        df1 = pd.DataFrame(q1)
        print(df1)
    '''

Output:

             date       open      close       high        low     volume
    1  2013-12-03  20.999551  21.156955  21.184731  20.795851  5152600.0
    2  2013-12-04  20.971773  20.934738  21.064364  20.703261  5174400.0
    3  2013-12-05  20.518079  20.545857  21.231027  20.379193  7225600.0
    4  2013-12-06  21.166215  20.601411  21.295841  20.536598  9989500.0
    20.80974025
             date       open      close       high        low     volume
    3  2013-12-05  20.518079  20.545857  21.231027  20.379193  7225600.0
    2  2013-12-04  20.971773  20.934738  21.064364  20.703261  5174400.0
    1  2013-12-03  20.999551  21.156955  21.184731  20.795851  5152600.0
    4  2013-12-06  21.166215  20.601411  21.295841  20.536598  9989500.0
    4    2013-12-06
    Name: date, dtype: object
             date       open      close
    2  2013-12-04  20.971773  20.934738
    3  2013-12-05  20.518079  20.545857
    4  2013-12-06  21.166215  20.601411
            open      close
    2  20.971773  20.934738
    5        NaN        NaN
             date       open      close       high
    2  2013-12-04  20.971773  20.934738  21.064364
    3  2013-12-05  20.518079  20.545857  21.231027
    4  2013-12-06  21.166215  20.601411  21.295841
             date       open      close       high        low     volume
    4  2013-12-06  21.166215  20.601411  21.295841  20.536598  9989500.0
    open      2.091390e+01
    close     2.080974e+01
    high      2.119399e+01
    low       2.060373e+01
    volume    6.885525e+06
    dtype: float64
    1    1.030537e+06
    2    1.034897e+06
    3    1.445137e+06
    4    1.997917e+06
    dtype: float64
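
Since `matplotlib.finance` is no longer part of matplotlib, example 3 cannot be run as written on a current installation. The selection idioms themselves can still be tried out with a self-contained sketch that hard-codes the four AMX rows printed above (only the `date`, `open`, `close` and `volume` columns are kept, and the variable names are illustrative choices).

```python
from datetime import date
import pandas as pd

# The four rows printed in the output above, reduced to four columns.
rows = [
    (date(2013, 12, 3), 20.999551, 21.156955, 5152600.0),
    (date(2013, 12, 4), 20.971773, 20.934738, 5174400.0),
    (date(2013, 12, 5), 20.518079, 20.545857, 7225600.0),
    (date(2013, 12, 6), 21.166215, 20.601411, 9989500.0),
]
df = pd.DataFrame(rows, index=range(1, 5),
                  columns=['date', 'open', 'close', 'volume'])

# loc works on labels and includes both ends of a slice: rows 2, 3, 4.
by_label = df.loc[2:4, 'date':'close']

# iloc works on positions and excludes the end: positions 1, 2, 3.
by_position = df.iloc[1:4, 0:3]

# Boolean masks can be combined with & (each condition in parentheses).
high_open = df[(df.index >= 4) & (df['open'] >= 21)]

# numeric_only=True skips the date column when averaging.
col_means = df.mean(numeric_only=True)

print(by_label)
print(high_open)
print(col_means)
```

Here `by_label` and `by_position` select the same block of data, which makes the inclusive versus exclusive end of the two slice styles easy to compare.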

example 4: compute the monthly average closing price of Microsoft (MSFT) stock for 2015.

    # Code based on Python 3.x
    # _*_ coding: utf-8 _*_
    # __Author: "LEMON"

    # Compute the monthly average closing price of Microsoft (MSFT) stock for 2015.

    # Method 1 (update)

    # Note: matplotlib.finance was deprecated in matplotlib 2.0 and removed in later releases
    from matplotlib.finance import quotes_historical_yahoo_ochl
    from datetime import date
    import pandas as pd
    from datetime import datetime

    today = date.today()
    fields = ['date', 'open', 'close', 'high', 'low', 'volume']

    start = (today.year - 3, today.month, today.day)
    end = today
    quotes = quotes_historical_yahoo_ochl('MSFT', start, end)
    # each item in quotes is a tuple

    df = pd.DataFrame(quotes, index=range(1, len(quotes) + 1), columns=fields)

    # the date column holds day ordinals; convert each one to a 'YYYY/MM' string
    ordinals = df.date.tolist()
    months = []
    for x in ordinals:
        x = date.fromordinal(int(x))
        y = date.strftime(x, '%Y/%m')
        months.append(y)
    # print(months)

    df1 = df.set_index([months]).drop('date', axis=1)
    # use the month strings as the index and drop the "date" column

    df2 = df1['2015/01':'2015/12']  # select the 2015 data
    print(df2.groupby(df2.index).close.mean())
    # group the rows by index (one group per month) and compute the mean of "close"

    # -----------------------------------------------------
    # #Method 1 (old)
    #
    # from matplotlib.finance import quotes_historical_yahoo_ochl
    # from datetime import date
    # import pandas as pd
    # from datetime import datetime
    #
    #
    # today = date.today()
    # fields = ['date', 'open', 'close', 'high', 'low', 'volume']
    #
    # start2 = (today.year - 3, today.month, today.day)
    # end2 = today
    # quotes2 = quotes_historical_yahoo_ochl('MSFT', start2, end2)
    # # each item in quotes2 is a tuple
    #
    # quotes3 = []
    # for t in quotes2:
    #     t1 = list(t)
    #     quotes3.append(t1)
    # # each item in quotes3 is a list
    #
    # for i in range(0, len(quotes3)):
    #     quotes3[i][0] = date.fromordinal(int(quotes3[i][0]))
    # # the ordinal in column 0 is now a date object
    #
    # df2 = pd.DataFrame(quotes3, index=range(1, len(quotes3) + 1), columns=fields)
    #
    # df2['date'] = pd.to_datetime(df2['date'], format='%Y-%m-%d')  # convert to pandas datetimes
    # # print(df2)
    #
    # start2015 = datetime(2015,1,1)
    # end2015 = datetime(2015,12,31)
    # # start2015 = datetime.strptime('2015-1-1', '%Y-%m-%d')
    # # # parse the string '2015-1-1' into a datetime
    # # end2015 = datetime.strptime('2015-12-31', '%Y-%m-%d')
    # # # parse the string '2015-12-31' into a datetime
    #
    # df1 = df2[(start2015 <= df2.date) & (df2.date <= end2015)]
    # # select the 2015 records with a date condition
    #
    # permonth1 = df1.date.dt.to_period('M')  # one period per month
    # g_month1 = df1.groupby(permonth1)
    # g_closequotes = g_month1['close']
    #
    # s_month = g_closequotes.mean()  # s_month is a Series
    # s_month.index.name = 'date_index'
    #
    # print(s_month)
    # -----------------------------------------------------

    # =================================================================
    # Method 2

    # from matplotlib.finance import quotes_historical_yahoo_ochl
    # from datetime import date
    #
    # import pandas as pd
    # today = date.today()
    # start = (today.year-3, today.month, today.day)
    # quotesMS = quotes_historical_yahoo_ochl('MSFT', start, today)
    # attributes=['date','open','close','high','low','volume']
    # quotesdfMS = pd.DataFrame(quotesMS, columns= attributes)
    #
    #
    #
    # dates = []
    # for i in range(0, len(quotesMS)):
    #     x = date.fromordinal(int(quotesMS[i][0]))
    #     y = date.strftime(x, '%y/%m/%d')
    #     dates.append(y)
    # quotesdfMS.index = dates
    # quotesdfMS = quotesdfMS.drop(['date'], axis = 1)
    # months = []
    # quotesdfMS15 = quotesdfMS['15/01/01':'15/12/31']
    #
    # print(quotesdfMS15)
    #
    # for i in range(0, len(quotesdfMS15)):
    #     months.append(int(quotesdfMS15.index[i][3:5]))  # month part of 'yy/mm/dd', e.g. '02'
    # quotesdfMS15['month'] = months
    # print(quotesdfMS15.groupby('month').mean().close)
    # =================================================================

Output:

    2015/01    43.124433
    2015/02    40.956772
    2015/03    40.203918
    2015/04    41.477685
    2015/05    45.472291
    2015/06    44.145879
    2015/07    43.807541
    2015/08    43.838895
    2015/09    42.114155
    2015/10    47.082882
    2015/11    52.252878
    2015/12    53.916431
    Name: close, dtype: float64
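
The monthly grouping can also be done on real datetimes instead of `'%Y/%m'` strings, in the spirit of the commented-out older version that uses `to_period('M')`. The following is a minimal sketch under that assumption. The prices are made-up placeholder numbers for illustration only, not MSFT quotes, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up closing prices for illustration only (not real quotes).
df = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-05', '2015-01-20',
                            '2015-02-03', '2015-02-17',
                            '2016-01-04']),
    'close': [10.0, 12.0, 20.0, 22.0, 99.0],
})

# Keep only the rows that fall in 2015.
in_2015 = df[df['date'].dt.year == 2015]

# Group by calendar month and average the closing price.
monthly_mean = in_2015.groupby(in_2015['date'].dt.to_period('M'))['close'].mean()
print(monthly_mean)
```

Working with a datetime column avoids relying on string ordering for the year selection, and the resulting index holds monthly `Period` values rather than plain strings.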
