The component and implementation of a basic gradient descent in python
in my impression, the gradient descent is for finding the independent variable that can get the minimum/maximum value of an objective function. So we need an obj. function: \(\mathcal{L}\)
- an obj. function: \(\mathcal{L}\)
- The gradient of \(\mathcal{L}: 2x+2\)
- \(\Delta x\) , The value of idependent variable needs to be updated: \(x \leftarrow x+\Delta x\)
1. the \(\mathcal{L}\) is a context function: \(f(x)=x^2+2x+1\)
how to find the \(x_0\) that makes the \(f(x)\) has the minimum value, via gradient descent?
Start with an arbitrary \(x\), calculate the value of \(f(x)\) :
import random
def func(x):
return x*x + 2*x +1
def gred(x): # the gradient of f(x)
return 2*x + 2
x = random.uniform(-10.0,10.0) #randomly pick a float in interval of (-10, 10)
# x = 10
print('x starts at:', x)
y0 = func(x) #first cal
delta = 0.5 #the value of delta_x, each iteration
x = x + delta
# === interation ===
for i in range(100):
print('i=',i)
y1 = func(x)
delta = -0.08*gred(x)
print(' delta=',delta)
if y1 > y0:
print(' y1>y0')
# if gred(x) is positive, the x should decrease.
# if gred(x) is negative, the x should increase.
else:
print(' y1<=y0')
# if gred(x) is positive, the x should increase.
# if gred(x) is negative, the x should decrease.
x = x+delta
y0 = y1
print(' x=', x, 'f(x)=', y1)
Let's disscuss how to determin the some_value in the psudo code above.
if \(y_1-y_0\) has a large positive difference, i.e. \(y1 >> y0\), the x should shift backward heavily. so the some_value can be a ratio of \((y_1-y_0)\times(-gradient)\) , Let's say, some_value: \(\lambda = r \times\) gred(x) , here, \(r=0.08\) is the step-size.
The basic gradient descent has many shortcomings which can be found by search the 'shortcoming of gd'.
Another problem of GD algorithm is , What if the \(\mathcal{L}\) does not have explicit expression of its gradient?
Stochastic Gradient Descent(SGD) is another GD algorithm.
The component and implementation of a basic gradient descent in python的更多相关文章
- (转)Introduction to Gradient Descent Algorithm (along with variants) in Machine Learning
Introduction Optimization is always the ultimate goal whether you are dealing with a real life probl ...
- Logistic Regression and Gradient Descent
Logistic Regression and Gradient Descent Logistic regression is an excellent tool to know for classi ...
- (转) An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms Table of contents: Gradient descent variants ...
- 机器学习-随机梯度下降(Stochastic gradient descent)
sklearn实战-乳腺癌细胞数据挖掘(博主亲自录制视频) https://study.163.com/course/introduction.htm?courseId=1005269003& ...
- An overview of gradient descent optimization algorithms
原文地址:An overview of gradient descent optimization algorithms An overview of gradient descent optimiz ...
- 机器学习数学基础- gradient descent算法(上)
为什么要了解点数学基础 学习大数据分布式计算时多少会涉及到机器学习的算法,所以理解一些机器学习基础,有助于理解大数据分布式计算系统(比如spark)的设计.机器学习中一个常见的就是gradient d ...
- flink 批量梯度下降算法线性回归参数求解(Linear Regression with BGD(batch gradient descent) )
1.线性回归 假设线性函数如下: 假设我们有10个样本x1,y1),(x2,y2).....(x10,y10),求解目标就是根据多个样本求解theta0和theta1的最优值. 什么样的θ最好的呢?最 ...
- 梯度下降(Gradient Descent)小结
在求解机器学习算法的模型参数,即无约束优化问题时,梯度下降(Gradient Descent)是最常采用的方法之一,另一种常用的方法是最小二乘法.这里就对梯度下降法做一个完整的总结. 1. 梯度 在微 ...
- 机器学习基础——梯度下降法(Gradient Descent)
机器学习基础--梯度下降法(Gradient Descent) 看了coursea的机器学习课,知道了梯度下降法.一开始只是对其做了下简单的了解.随着内容的深入,发现梯度下降法在很多算法中都用的到,除 ...
随机推荐
- final、finally、以及finalize的区别
final ---修饰类.变量和方法,修饰的类不能被继承 .修饰的方法不能被重写 .修饰的成员变量不可更改 另外,修饰成员变量必须立即赋值,修饰局部变量使用之前被赋值就可以. finally通常和tr ...
- Python练习七
1.写函数,检查传入字典的每一个value的长度,如果大于2,那么仅保留前两个长度的内容,并将新内容返回给调用者. def func(dic): for k in dic: if len(dic[k] ...
- Linux下搭建测试环境
一. 安装虚拟机 1.选择linux 型号 3.0x 64的版本 2.磁盘分区 /目录, home目录 ,boot,var ,设置root密码 3.安装(过程略) 二. 配置虚拟机网卡 路径:cd / ...
- 转载 JAVA gc垃圾回收机制
thanks:https://m.oschina.net/u/123553 一.GC概要 JVM堆相关知识 为什么先说JVM堆? JVM的堆是Java对象的活动空间,程序中的类的对象从中分 ...
- sourceforge.net安装网站程序数据库相关
sourceforge.net安装网站程序数据库相关 我们应该知道sourceforge.net是可以安装网站(当做一个虚拟空间使用的) 但是在安装cms程序的时候那时的数据库地址再填写“localh ...
- table动态增加删除
基于网上代码修改实现动态添加表数据行 <!DOCTYPE html> <html lang="cn"> <html> <head> ...
- 矩形覆盖(JAVA)
矩形覆盖 题目描述 我们可以用2*1的小矩形横着或者竖着去覆盖更大的矩形.请问用n个2*1的小矩形无重叠地覆盖一个2*n的大矩形,总共有多少种方法? 思路:最初看到这题,只能通过画图归纳来寻找规律. ...
- streamreader 和 streamwriter 以及 string 与 memorystream 使用示例
经常用到,但老记不住,备忘一下 using (var ms = new MemoryStream()) { var sw = new StreamWriter(ms); sw.WriteLine(&q ...
- [JAVA]JAVA实现多线程的三种方式
1.继承Thread类,通过start()方法调用 public class MultiThreadByExtends extends Thread { @Override public void r ...
- MSMQ .NET下的应用
Message Message是MSMQ的数据存储单元,我们的用户数据一般也被填充在Message的body当中,因此很重要,让我们来看一看其在.net中的体现,如图: 在图上我们可以看见,Messa ...