[Python] Statistical analysis of time series
Global Statistics:
Common seen methods as such
1. Mean
2. Median
3. Standard deviation: the larger the number means it various a lot.
4. Sum.
Rolling Statistics:

It use a time window, moving forward each day to calculate the mean value of those window periods.
To find which day is good to buy which day is good for sell, we can use Bollinger bands.
Bollinger bands:

import os
import pandas as pd
import matplotlib.pyplot as plt def test_run():
start_date='2017-01-01'
end_data='2017-12-15'
dates=pd.date_range(start_date, end_data) # Create an empty data frame
df=pd.DataFrame(index=dates) symbols=['SPY', 'AAPL', 'IBM', 'GOOG', 'GLD']
for symbol in symbols:
temp=getAdjCloseForSymbol(symbol)
df=df.join(temp, how='inner') return df if __name__ == '__main__':
df=test_run()
# data=data.ix['2017-12-01':'2017-12-15', ['IBM', 'GOOG']]
# df=normalize_data(df)
ax = df['SPY'].plot(title="SPY rolling mean", label='SPY')
rm = df['SPY'].rolling(20).mean()
rm.plot(label='Rolling mean', ax=ax)
ax.set_xlabel('Date')
ax.set_ylabel('Price')
ax.legend(loc="upper left")
plt.show()
Now we can calculate Bollinger bands, it is 2 times std value.
"""Bollinger Bands.""" import os
import pandas as pd
import matplotlib.pyplot as plt def symbol_to_path(symbol, base_dir="data"):
"""Return CSV file path given ticker symbol."""
return os.path.join(base_dir, "{}.csv".format(str(symbol))) def get_data(symbols, dates):
"""Read stock data (adjusted close) for given symbols from CSV files."""
df = pd.DataFrame(index=dates)
if 'SPY' not in symbols: # add SPY for reference, if absent
symbols.insert(0, 'SPY') for symbol in symbols:
df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
parse_dates=True, usecols=['Date', 'Adj Close'], na_values=['nan'])
df_temp = df_temp.rename(columns={'Adj Close': symbol})
df = df.join(df_temp)
if symbol == 'SPY': # drop dates SPY did not trade
df = df.dropna(subset=["SPY"]) return df def plot_data(df, title="Stock prices"):
"""Plot stock prices with a custom title and meaningful axis labels."""
ax = df.plot(title=title, fontsize=12)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show() def get_rolling_mean(values, window):
"""Return rolling mean of given values, using specified window size."""
return values.rolling(window=window).mean() def get_rolling_std(values, window):
"""Return rolling standard deviation of given values, using specified window size."""
# TODO: Compute and return rolling standard deviation
return values.rolling(window=window).std() def get_bollinger_bands(rm, rstd):
"""Return upper and lower Bollinger Bands."""
# TODO: Compute upper_band and lower_band
upper_band = rstd * 2 + rm
lower_band = rm - rstd * 2
return upper_band, lower_band def test_run():
# Read data
dates = pd.date_range('2012-01-01', '2012-12-31')
symbols = ['SPY']
df = get_data(symbols, dates) # Compute Bollinger Bands
# 1. Compute rolling mean
rm_SPY = get_rolling_mean(df['SPY'], window=20) # 2. Compute rolling standard deviation
rstd_SPY = get_rolling_std(df['SPY'], window=20) # 3. Compute upper and lower bands
upper_band, lower_band = get_bollinger_bands(rm_SPY, rstd_SPY) # Plot raw SPY values, rolling mean and Bollinger Bands
ax = df['SPY'].plot(title="Bollinger Bands", label='SPY')
rm_SPY.plot(label='Rolling mean', ax=ax)
upper_band.plot(label='upper band', ax=ax)
lower_band.plot(label='lower band', ax=ax) # Add axis labels and legend
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend(loc='upper left')
plt.show() if __name__ == "__main__":
test_run()

Daily return:
Subtract the previous day's closing price from the most recent day's closing price. In this example, subtract $35.50 from $36.75 to get $1.25. Divide your Step 4 result by the previous day's closing price to calculate the daily return. Multiply this result by 100 to convert it to a percentage.

"""Compute daily returns.""" import os
import pandas as pd
import matplotlib.pyplot as plt def symbol_to_path(symbol, base_dir="data"):
"""Return CSV file path given ticker symbol."""
return os.path.join(base_dir, "{}.csv".format(str(symbol))) def get_data(symbols, dates):
"""Read stock data (adjusted close) for given symbols from CSV files."""
df = pd.DataFrame(index=dates)
if 'SPY' not in symbols: # add SPY for reference, if absent
symbols.insert(0, 'SPY') for symbol in symbols:
df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
parse_dates=True, usecols=['Date', 'Adj Close'], na_values=['nan'])
df_temp = df_temp.rename(columns={'Adj Close': symbol})
df = df.join(df_temp)
if symbol == 'SPY': # drop dates SPY did not trade
df = df.dropna(subset=["SPY"]) return df def plot_data(df, title="Stock prices", xlabel="Date", ylabel="Price"):
"""Plot stock prices with a custom title and meaningful axis labels."""
ax = df.plot(title=title, fontsize=12)
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
plt.show() def compute_daily_returns(df):
"""Compute and return the daily return values."""
# TODO: Your code here
# Note: Returned DataFrame must have the same number of rows
return df / df.shift(-1) -1 def test_run():
# Read data
dates = pd.date_range('2012-07-01', '2012-07-31') # one month only
symbols = ['SPY','XOM']
df = get_data(symbols, dates)
plot_data(df) # Compute daily returns
daily_returns = compute_daily_returns(df)
plot_data(daily_returns, title="Daily returns", ylabel="Daily returns") if __name__ == "__main__":
test_run()


Cumulative return:
an investment relative to the principal amount invested over a specified amount of time. ... To calculate cumulative return, subtract the original price of the investment from the current price and divide that difference by the original price.

[Python] Statistical analysis of time series的更多相关文章
- How-to: Do Statistical Analysis with Impala and R
sklearn实战-乳腺癌细胞数据挖掘(博客主亲自录制视频教程) https://study.163.com/course/introduction.htm?courseId=1005269003&a ...
- python data analysis | python数据预处理(基于scikit-learn模块)
原文:http://www.jianshu.com/p/94516a58314d Dataset transformations| 数据转换 Combining estimators|组合学习器 Fe ...
- python学习笔记—DataFrame和Series的排序
更多大数据分析.建模等内容请关注公众号<bigdatamodeling> ################################### 排序 ################## ...
- Should You Build Your Own Backtester?
By Michael Halls-Moore on August 2nd, 2016 This post relates to a talk I gave in April at QuantCon 2 ...
- Python数据分析工具:Pandas之Series
Python数据分析工具:Pandas之Series Pandas概述Pandas是Python的一个数据分析包,该工具为解决数据分析任务而创建.Pandas纳入大量库和标准数据模型,提供高效的操作数 ...
- 用 Python 通过马尔可夫随机场(MRF)与 Ising Model 进行二值图降噪
前言 这个降噪的模型来自 Christopher M. Bishop 的 Pattern Recognition And Machine Learning (就是神书 PRML……),问题是如何对一个 ...
- 大数据分析与机器学习领域Python兵器谱
http://www.thebigdata.cn/JieJueFangAn/13317.html 曾经因为NLTK的缘故开始学习Python,之后渐渐成为我工作中的第一辅助脚本语言,虽然开发语言是C/ ...
- Machine and Deep Learning with Python
Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...
- Python 网页爬虫 & 文本处理 & 科学计算 & 机器学习 & 数据挖掘兵器谱(转)
原文:http://www.52nlp.cn/python-网页爬虫-文本处理-科学计算-机器学习-数据挖掘 曾经因为NLTK的缘故开始学习Python,之后渐渐成为我工作中的第一辅助脚本语言,虽然开 ...
随机推荐
- vue非空校验
效果图 实现代码 //页面html <div> <ul class="listinfo"> <li> <span class=" ...
- codeforces 914 D Bash and a Tough Math Puzzle
#include<iostream> #include<cstring> #include<cstdio> #include<algorithm> #i ...
- 升级glibc的感慨,
1. 直接升级 glibc是gnu发布的libc库,即c运行库.glibc是linux系统中最底层的api,几乎其它任何运行库都会依赖于glibc.glibc除了封装linux操作系统所提供的系统服务 ...
- 常见VPS buy地址
***,也是最适合新手使用的: https://bwh1.net/ (支持支付宝) vultr,以下是我的分享链接: https://www.vultr.com/(支持支付宝) SugarHosts: ...
- 【codeforces 816B】Karen and Coffee
[题目链接]:http://codeforces.com/contest/816/problem/B [题意] 给你很多个区间[l,r]; 1<=l<=r<=2e5 一个数字如果被k ...
- Qt之pro配置详解
简述 使用Qt的时候,我们经常会对pro进行一系列繁琐的配置,为方便大家理解.查找,现将常用的配置进行整理. 简述 配置 注释 CONFIG DEFINES DEPENDPATH DESTDIR FO ...
- hdu 5269 ZYB loves Xor I && BestCoder Round #44
题意: ZYB喜欢研究Xor,如今他得到了一个长度为n的数组A. 于是他想知道:对于全部数对(i,j)(i∈[1,n],j∈[1,n]).lowbit(AixorAj)之和为多少.因为答案可能过大,你 ...
- 查看CPU是几核
命令1 (查看有几个CPU):cat /proc/cpuinfo |grep "physical id"|sort |uniq|wc -l 命令2 (每个CPU几核):cat /p ...
- Codeforces 559B Equivalent Strings 等价串
题意:给定两个等长串a,b.推断是否等价.等价的含义为:若长度为奇数,则必须是同样串.若长度是偶数,则将两串都均分成长度为原串一半的两个子串al,ar和bl,br,当中al和bl等价且ar和br等价, ...
- oracle 11g RAC手动卸载grid,no deinstall
1.通过root用户进入到grid的ORACLE_HOME [root@db01]# source /home/grid/.bash_profile [root@db01]# cd $ORACLE_H ...