scikit-learn Study Notes -- Generalized Linear Models (3)

Bayesian regression

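The linear models introduced so far were all built from a least-squares (mean squared error) point of view, from plain ordinary least squares to regularized variants such as lasso and ridge. Bayesian regression instead starts from a Bayesian probabilistic model, although in the end it also reduces to minimizing an energy function.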

All of the earlier linear models assume the following relationship:

$$y = wx$$

That relation is stated purely in terms of values. We can instead assume:

$$y = wx + \epsilon$$

Here $\epsilon$ represents an error, or noise. If the estimate is exact then $\epsilon = 0$; otherwise $\epsilon$ is a random quantity.

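Given a set of training samples, every observation $y$ has its own $\epsilon$, and we assume these $\epsilon$ are independent and identically distributed. Each observation can then be described by a probability distribution, and the likelihood of the whole training set is the product of the per-sample terms.

Finally, applying maximum likelihood estimation turns this likelihood into an energy-minimization problem. This is how the coefficients are obtained from the maximum-likelihood point of view.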
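As a sketch of the step above, assume the noise is Gaussian, $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. The note itself only states that the noise is i.i.d., so the Gaussian form and the symbols $\sigma^2$ and $n$ are assumptions made here for illustration. Under that assumption the likelihood and its negative logarithm can be written as:

```latex
% Per-sample likelihood under the assumed Gaussian noise
p(y_i \mid x_i, w) = \mathcal{N}(y_i \mid w x_i, \sigma^2)

% Likelihood of the whole training set (i.i.d. samples)
p(y \mid X, w) = \prod_{i=1}^{n} \mathcal{N}(y_i \mid w x_i, \sigma^2)

% Negative log-likelihood: the "energy" to be minimized
-\log p(y \mid X, w)
  = \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - w x_i)^2
  + \frac{n}{2} \log(2\pi\sigma^2)
```

Only the first term depends on $w$, so maximizing the likelihood is the same as minimizing the sum of squared errors, which recovers ordinary least squares.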

Next we take the maximum a posteriori (MAP) point of view. Up to a normalizing constant, the posterior is

$$p(w, \alpha \mid y) \propto p(y \mid w)\, p(w \mid \alpha)\, p(\alpha)$$

where the prior on the weights is a zero-mean Gaussian with precision $\alpha$:

$$p(w \mid \alpha) = \mathcal{N}(w \mid 0, \alpha^{-1} I)$$

and $p(\alpha)$ itself follows a gamma distribution.
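To make the link to an energy function concrete, here is a minimal sketch that holds the precisions fixed instead of inferring them. It assumes Gaussian noise with precision `noise_precision` (a hypothetical name; the note does not give this symbol), so the negative log-posterior is `noise_precision / 2 * ||y - Xw||^2 + weight_precision / 2 * ||w||^2` plus a constant. Its minimizer should match ridge regression with penalty `weight_precision / noise_precision`:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_samples, n_features = 50, 5
X = rng.standard_normal((n_samples, n_features))
w_true = rng.standard_normal(n_features)

weight_precision = 2.0   # alpha in the note: precision of the Gaussian prior on w
noise_precision = 25.0   # assumed precision of the Gaussian noise (hypothetical name)
y = X @ w_true + rng.standard_normal(n_samples) / np.sqrt(noise_precision)

# Closed-form MAP estimate for fixed precisions:
# w_map = (X^T X + (alpha / beta) I)^{-1} X^T y
ridge_penalty = weight_precision / noise_precision
w_map = np.linalg.solve(X.T @ X + ridge_penalty * np.eye(n_features), X.T @ y)

# Ridge minimizes ||y - Xw||^2 + alpha * ||w||^2, so the same penalty applies.
ridge = Ridge(alpha=ridge_penalty, fit_intercept=False)
ridge.fit(X, y)

print(np.allclose(w_map, ridge.coef_))
```

The difference in Bayesian ridge regression is that the precisions are not fixed by hand: they receive gamma priors and are estimated from the data together with the weights.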

The scikit-learn documentation also gives an example:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

from sklearn.linear_model import BayesianRidge, LinearRegression

# #############################################################################
# Generating simulated data with Gaussian weights
np.random.seed(0)
n_samples, n_features = 100, 100
X = np.random.randn(n_samples, n_features)  # Create Gaussian data
# Create weights with a precision lambda_ of 4.
lambda_ = 4.
w = np.zeros(n_features)
# Only keep 10 weights of interest
relevant_features = np.random.randint(0, n_features, 10)
for i in relevant_features:
    w[i] = stats.norm.rvs(loc=0, scale=1. / np.sqrt(lambda_))
# Create noise with a precision alpha of 50.
alpha_ = 50.
noise = stats.norm.rvs(loc=0, scale=1. / np.sqrt(alpha_), size=n_samples)
# Create the target
y = np.dot(X, w) + noise

# #############################################################################
# Fit the Bayesian Ridge Regression and an OLS for comparison
clf = BayesianRidge(compute_score=True)
clf.fit(X, y)

ols = LinearRegression()
ols.fit(X, y)

# #############################################################################
# Plot true weights, estimated weights, histogram of the weights, and
# predictions with standard deviations
lw = 2
plt.figure(figsize=(6, 5))
plt.title("Weights of the model")
plt.plot(clf.coef_, color='lightgreen', linewidth=lw,
         label="Bayesian Ridge estimate")
plt.plot(w, color='gold', linewidth=lw, label="Ground truth")
plt.plot(ols.coef_, color='navy', linestyle='--', label="OLS estimate")
plt.xlabel("Features")
plt.ylabel("Values of the weights")
plt.legend(loc="best", prop=dict(size=12))

plt.figure(figsize=(6, 5))
plt.title("Histogram of the weights")
plt.hist(clf.coef_, bins=n_features, color='gold', log=True,
         edgecolor='black')
plt.scatter(clf.coef_[relevant_features], 5 * np.ones(len(relevant_features)),
            color='navy', label="Relevant features")
plt.ylabel("Features")
plt.xlabel("Values of the weights")
plt.legend(loc="upper left")

plt.figure(figsize=(6, 5))
plt.title("Marginal log-likelihood")
plt.plot(clf.scores_, color='navy', linewidth=lw)
plt.ylabel("Score")
plt.xlabel("Iterations")


# Plotting some predictions for polynomial regression
def f(x, noise_amount):
    y = np.sqrt(x) * np.sin(x)
    noise = np.random.normal(0, 1, len(x))
    return y + noise_amount * noise


degree = 10
X = np.linspace(0, 10, 100)
y = f(X, noise_amount=0.1)
clf_poly = BayesianRidge()
clf_poly.fit(np.vander(X, degree), y)

X_plot = np.linspace(0, 11, 25)
y_plot = f(X_plot, noise_amount=0)
y_mean, y_std = clf_poly.predict(np.vander(X_plot, degree), return_std=True)
plt.figure(figsize=(6, 5))
plt.errorbar(X_plot, y_mean, y_std, color='navy',
             label="Polynomial Bayesian Ridge Regression", linewidth=lw)
plt.plot(X_plot, y_plot, color='gold', linewidth=lw,
         label="Ground Truth")
plt.ylabel("Output y")
plt.xlabel("Feature X")
plt.legend(loc="lower left")
plt.show()
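One point that is easy to trip over: the formulas in this note use $\alpha$ for the precision of the weight prior, whereas scikit-learn's `BayesianRidge` stores the estimated weight precision in `lambda_` and the estimated noise precision in `alpha_`, matching the variable names in the example above. The following plot-free sketch fits a smaller problem and reads those attributes back; the sample sizes and the `true_lambda` / `true_alpha` names are illustrative choices, not part of the original example:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
X = rng.standard_normal((n_samples, n_features))

true_lambda = 4.0   # precision of the weights (illustrative value)
true_alpha = 50.0   # precision of the noise (illustrative value)
w = rng.standard_normal(n_features) / np.sqrt(true_lambda)
y = X @ w + rng.standard_normal(n_samples) / np.sqrt(true_alpha)

model = BayesianRidge()
model.fit(X, y)

# Precisions inferred from the data rather than set by hand.
print("estimated noise precision (alpha_):", model.alpha_)
print("estimated weight precision (lambda_):", model.lambda_)

# return_std=True also returns the predictive standard deviation per sample.
y_mean, y_std = model.predict(X[:5], return_std=True)
print(y_mean.shape, y_std.shape)
```

With only ten weights the estimate of the weight precision can be quite rough, so it is better read as an order-of-magnitude check than as an exact recovery of `true_lambda`.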
