[Python] Statistical analysis of time series
Global Statistics:
Commonly used measures include:
1. Mean
2. Median
3. Standard deviation: the larger the value, the more the data varies.
4. Sum
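As a quick illustration, here is a minimal sketch of these four measures computed with pandas. The price values are made up for the example and are not taken from the data used later in this post.

```python
import pandas as pd

# Hypothetical closing prices for five trading days (made-up numbers).
prices = pd.Series([10.0, 12.0, 11.0, 15.0, 12.0], name="price")

mean_price = prices.mean()      # arithmetic average
median_price = prices.median()  # middle value once the prices are sorted
std_price = prices.std()        # sample standard deviation (pandas default, ddof=1)
total_price = prices.sum()      # sum of all values

print(mean_price, median_price, std_price, total_price)
```

The same methods work on a whole DataFrame, where they return one value per column.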
Rolling Statistics:

Rolling statistics use a time window that moves forward one day at a time; the statistic (for example, the mean) is calculated over the values inside each window.
To find which days are good for buying and which are good for selling, we can use Bollinger Bands.
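A toy series makes the moving window easier to see. This is only a sketch: the five values and the window size of 3 are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical prices; a 3-day window is assumed for illustration.
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Each output value is the mean of the current value and the two before it.
# The first two entries are NaN because the window is not yet full.
rolling_mean = prices.rolling(window=3).mean()

print(rolling_mean)
```

The window [1, 2, 3] gives 2.0, then [2, 3, 4] gives 3.0, and so on as the window slides forward.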
Bollinger bands:

import os
import pandas as pd
import matplotlib.pyplot as plt


def test_run():
    start_date = '2017-01-01'
    end_date = '2017-12-15'
    dates = pd.date_range(start_date, end_date)

    # Create an empty data frame indexed by date
    df = pd.DataFrame(index=dates)

    symbols = ['SPY', 'AAPL', 'IBM', 'GOOG', 'GLD']
    for symbol in symbols:
        # getAdjCloseForSymbol is a helper that is not shown in this post;
        # presumably it returns the adjusted close of one symbol.
        temp = getAdjCloseForSymbol(symbol)
        # Inner join keeps only the dates present in both frames
        df = df.join(temp, how='inner')

    return df


if __name__ == '__main__':
    df = test_run()
    # df = df.loc['2017-12-01':'2017-12-15', ['IBM', 'GOOG']]
    # df = normalize_data(df)

    # Plot SPY together with its 20-day rolling mean
    ax = df['SPY'].plot(title="SPY rolling mean", label='SPY')
    rm = df['SPY'].rolling(20).mean()
    rm.plot(label='Rolling mean', ax=ax)
    ax.set_xlabel('Date')
    ax.set_ylabel('Price')
    ax.legend(loc="upper left")
    plt.show()
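Now we can calculate the Bollinger Bands: the upper band is the rolling mean plus 2 times the rolling standard deviation, and the lower band is the rolling mean minus 2 times the rolling standard deviation.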
"""Bollinger Bands."""

import os
import pandas as pd
import matplotlib.pyplot as plt


def symbol_to_path(symbol, base_dir="data"):
    """Return CSV file path given ticker symbol."""
    return os.path.join(base_dir, "{}.csv".format(str(symbol)))


def get_data(symbols, dates):
    """Read stock data (adjusted close) for given symbols from CSV files."""
    df = pd.DataFrame(index=dates)
    if 'SPY' not in symbols:  # add SPY for reference, if absent
        symbols.insert(0, 'SPY')

    for symbol in symbols:
        df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
                              parse_dates=True, usecols=['Date', 'Adj Close'],
                              na_values=['nan'])
        df_temp = df_temp.rename(columns={'Adj Close': symbol})
        df = df.join(df_temp)
        if symbol == 'SPY':  # drop dates SPY did not trade
            df = df.dropna(subset=["SPY"])

    return df


def plot_data(df, title="Stock prices"):
    """Plot stock prices with a custom title and meaningful axis labels."""
    ax = df.plot(title=title, fontsize=12)
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    plt.show()


def get_rolling_mean(values, window):
    """Return rolling mean of given values, using specified window size."""
    return values.rolling(window=window).mean()


def get_rolling_std(values, window):
    """Return rolling standard deviation of given values, using specified window size."""
    return values.rolling(window=window).std()


def get_bollinger_bands(rm, rstd):
    """Return upper and lower Bollinger Bands."""
    # The bands sit 2 rolling standard deviations above and below the rolling mean
    upper_band = rm + rstd * 2
    lower_band = rm - rstd * 2
    return upper_band, lower_band


def test_run():
    # Read data
    dates = pd.date_range('2012-01-01', '2012-12-31')
    symbols = ['SPY']
    df = get_data(symbols, dates)

    # Compute Bollinger Bands
    # 1. Compute rolling mean
    rm_SPY = get_rolling_mean(df['SPY'], window=20)
    # 2. Compute rolling standard deviation
    rstd_SPY = get_rolling_std(df['SPY'], window=20)
    # 3. Compute upper and lower bands
    upper_band, lower_band = get_bollinger_bands(rm_SPY, rstd_SPY)

    # Plot raw SPY values, rolling mean and Bollinger Bands
    ax = df['SPY'].plot(title="Bollinger Bands", label='SPY')
    rm_SPY.plot(label='Rolling mean', ax=ax)
    upper_band.plot(label='upper band', ax=ax)
    lower_band.plot(label='lower band', ax=ax)

    # Add axis labels and legend
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    ax.legend(loc='upper left')
    plt.show()


if __name__ == "__main__":
    test_run()
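The post says Bollinger Bands can hint at good days to buy or sell, but it does not spell out a rule. As a rough sketch, one common reading is that a close above the upper band is a possible sell signal and a close below the lower band is a possible buy signal. That rule, the made-up prices, and the 6-day window are all illustrative assumptions and are not part of the original code.

```python
import pandas as pd

# Hypothetical prices with one spike (13) and one dip (7); not real market data.
prices = pd.Series([10.0, 10.0, 10.0, 10.0, 10.0, 13.0,
                    10.0, 10.0, 10.0, 10.0, 10.0, 7.0])

window = 6  # assumed window size, shorter than the 20 days used above
rm = prices.rolling(window=window).mean()
rstd = prices.rolling(window=window).std()

upper_band = rm + 2 * rstd
lower_band = rm - 2 * rstd

# Assumed rule: a price outside the bands is flagged as a possible signal.
# Comparisons against NaN (windows that are not yet full) evaluate to False.
possible_sell = prices > upper_band
possible_buy = prices < lower_band

print("possible sell days:", possible_sell[possible_sell].index.tolist())
print("possible buy days:", possible_buy[possible_buy].index.tolist())
```

With these numbers only the spike at position 5 rises above the upper band and only the dip at position 11 falls below the lower band. Real strategies usually add further conditions, so treat the flags as a starting point rather than a trading rule.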

Daily return:
Subtract the previous day's closing price from the most recent day's closing price. For example, subtracting $35.50 from $36.75 gives $1.25. Divide this difference by the previous day's closing price to get the daily return: $1.25 / $35.50 ≈ 0.0352. Multiply the result by 100 to express it as a percentage, about 3.52%.
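The arithmetic can be checked with a few lines of plain Python. The two prices are the ones from the example in the text; the variable names are chosen here for illustration.

```python
previous_close = 35.50
latest_close = 36.75

change = latest_close - previous_close   # price difference in dollars
daily_return = change / previous_close   # return as a fraction
daily_return_pct = daily_return * 100    # return as a percentage

print(f"change: ${change:.2f}")
print(f"daily return: {daily_return_pct:.2f}%")
```

The same value can be written in one step as `latest_close / previous_close - 1`, which is the form that carries over to whole columns of prices.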

"""Compute daily returns."""

import os
import pandas as pd
import matplotlib.pyplot as plt


def symbol_to_path(symbol, base_dir="data"):
    """Return CSV file path given ticker symbol."""
    return os.path.join(base_dir, "{}.csv".format(str(symbol)))


def get_data(symbols, dates):
    """Read stock data (adjusted close) for given symbols from CSV files."""
    df = pd.DataFrame(index=dates)
    if 'SPY' not in symbols:  # add SPY for reference, if absent
        symbols.insert(0, 'SPY')

    for symbol in symbols:
        df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
                              parse_dates=True, usecols=['Date', 'Adj Close'],
                              na_values=['nan'])
        df_temp = df_temp.rename(columns={'Adj Close': symbol})
        df = df.join(df_temp)
        if symbol == 'SPY':  # drop dates SPY did not trade
            df = df.dropna(subset=["SPY"])

    return df


def plot_data(df, title="Stock prices", xlabel="Date", ylabel="Price"):
    """Plot stock prices with a custom title and meaningful axis labels."""
    ax = df.plot(title=title, fontsize=12)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    plt.show()


def compute_daily_returns(df):
    """Compute and return the daily return values."""
    # shift(1) aligns each row with the previous day's price
    daily_returns = df / df.shift(1) - 1
    # The first row has no previous day; use 0 so the row count stays the same
    daily_returns.iloc[0] = 0
    return daily_returns


def test_run():
    # Read data
    dates = pd.date_range('2012-07-01', '2012-07-31')  # one month only
    symbols = ['SPY', 'XOM']
    df = get_data(symbols, dates)
    plot_data(df)

    # Compute daily returns
    daily_returns = compute_daily_returns(df)
    plot_data(daily_returns, title="Daily returns", ylabel="Daily returns")


if __name__ == "__main__":
    test_run()


Cumulative return:
The cumulative return measures the gain or loss of an investment relative to the principal amount invested over a specified amount of time. To calculate it, subtract the original price of the investment from the current price and divide that difference by the original price.
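A minimal sketch of this calculation with pandas follows. The prices are made up, and dividing the whole series by its first value is one convenient way to get the cumulative return for every day at once.

```python
import pandas as pd

# Hypothetical prices; the first value is treated as the original price.
prices = pd.Series([100.0, 102.0, 101.0, 110.0])

# (current price - original price) / original price, for every day
cumulative_returns = prices / prices.iloc[0] - 1

# Cumulative return over the whole period
total_return = cumulative_returns.iloc[-1]

print(cumulative_returns)
print(f"total cumulative return: {total_return:.2%}")
```

Here the price goes from 100 to 110, so the cumulative return over the period is 10%.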
