The component and implementation of a basic gradient descent in python

in my impression, the gradient descent is for finding the independent variable that can get the minimum/maximum value of an objective function. So we need an obj. function: \(\mathcal{L}\)

an obj. function: \(\mathcal{L}\)
The gradient of \(\mathcal{L}: 2x+2\)
\(\Delta x\) , The value of idependent variable needs to be updated: \(x \leftarrow x+\Delta x\)

1. the \(\mathcal{L}\) is a context function: \(f(x)=x^2+2x+1\)

how to find the \(x_0\) that makes the \(f(x)\) has the minimum value, via gradient descent?

Start with an arbitrary \(x\), calculate the value of \(f(x)\) :

import random

def func(x):

  return  x*x + 2*x +1

def gred(x):   # the gradient of f(x)

  return 2*x + 2

x = random.uniform(-10.0,10.0)  #randomly pick a float in interval of (-10, 10)

# x = 10

print('x starts at:', x)

y0 = func(x) #first cal

delta = 0.5  #the value of delta_x, each iteration

x = x + delta

# === interation ===

for i in range(100):

  print('i=',i)

  y1 = func(x)

  delta = -0.08*gred(x)

  print('  delta=',delta)

  if y1 > y0:

    print('  y1>y0')

    # if gred(x) is positive, the x should decrease.

    # if gred(x) is negative, the x should increase.

  else:

    print('  y1<=y0')

    # if gred(x) is positive, the x should increase.

    # if gred(x) is negative, the x should decrease.

  x = x+delta

  y0 = y1

  print('  x=', x, 'f(x)=', y1)

Let's disscuss how to determin the some_value in the psudo code above.

if \(y_1-y_0\) has a large positive difference, i.e. \(y1 >> y0\), the x should shift backward heavily. so the some_value can be a ratio of \((y_1-y_0)\times(-gradient)\) , Let's say, some_value: \(\lambda = r \times\) gred(x) , here, \(r=0.08\) is the step-size.

The basic gradient descent has many shortcomings which can be found by search the 'shortcoming of gd'.

Another problem of GD algorithm is , What if the \(\mathcal{L}\) does not have explicit expression of its gradient?

Stochastic Gradient Descent(SGD) is another GD algorithm.

The component and implementation of a basic gradient descent in python的更多相关文章

（转）Introduction to Gradient Descent Algorithm (along with variants) in Machine Learning
Introduction Optimization is always the ultimate goal whether you are dealing with a real life probl ...
Logistic Regression and Gradient Descent
Logistic Regression and Gradient Descent Logistic regression is an excellent tool to know for classi ...
(转) An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms Table of contents: Gradient descent variants ...
机器学习-随机梯度下降（Stochastic gradient descent）
sklearn实战-乳腺癌细胞数据挖掘(博主亲自录制视频) https://study.163.com/course/introduction.htm?courseId=1005269003& ...
An overview of gradient descent optimization algorithms
原文地址:An overview of gradient descent optimization algorithms An overview of gradient descent optimiz ...
机器学习数学基础- gradient descent算法（上）
为什么要了解点数学基础学习大数据分布式计算时多少会涉及到机器学习的算法,所以理解一些机器学习基础,有助于理解大数据分布式计算系统(比如spark)的设计.机器学习中一个常见的就是gradient d ...
flink 批量梯度下降算法线性回归参数求解（Linear Regression with BGD(batch gradient descent) ）
1.线性回归假设线性函数如下: 假设我们有10个样本x1,y1),(x2,y2).....(x10,y10),求解目标就是根据多个样本求解theta0和theta1的最优值. 什么样的θ最好的呢?最 ...
梯度下降（Gradient Descent）小结
在求解机器学习算法的模型参数,即无约束优化问题时,梯度下降(Gradient Descent)是最常采用的方法之一,另一种常用的方法是最小二乘法.这里就对梯度下降法做一个完整的总结. 1. 梯度在微 ...
机器学习基础——梯度下降法（Gradient Descent）
机器学习基础--梯度下降法(Gradient Descent) 看了coursea的机器学习课,知道了梯度下降法.一开始只是对其做了下简单的了解.随着内容的深入,发现梯度下降法在很多算法中都用的到,除 ...

随机推荐

final、finally、以及finalize的区别
final ---修饰类.变量和方法,修饰的类不能被继承 .修饰的方法不能被重写 .修饰的成员变量不可更改另外,修饰成员变量必须立即赋值,修饰局部变量使用之前被赋值就可以. finally通常和tr ...
Python练习七
1.写函数,检查传入字典的每一个value的长度,如果大于2,那么仅保留前两个长度的内容,并将新内容返回给调用者. def func(dic): for k in dic: if len(dic[k] ...
Linux下搭建测试环境
一. 安装虚拟机 1.选择linux 型号 3.0x 64的版本 2.磁盘分区 /目录, home目录 ,boot,var ,设置root密码 3.安装(过程略) 二. 配置虚拟机网卡路径:cd / ...
转载 JAVA gc垃圾回收机制
thanks:https://m.oschina.net/u/123553 一.GC概要 JVM堆相关知识为什么先说JVM堆? JVM的堆是Java对象的活动空间,程序中的类的对象从中分 ...
sourceforge.net安装网站程序数据库相关
sourceforge.net安装网站程序数据库相关我们应该知道sourceforge.net是可以安装网站(当做一个虚拟空间使用的) 但是在安装cms程序的时候那时的数据库地址再填写“localh ...
table动态增加删除
基于网上代码修改实现动态添加表数据行 <!DOCTYPE html> <html lang="cn"> <html> <head> ...
矩形覆盖（JAVA）
矩形覆盖题目描述我们可以用2*1的小矩形横着或者竖着去覆盖更大的矩形.请问用n个2*1的小矩形无重叠地覆盖一个2*n的大矩形,总共有多少种方法? 思路:最初看到这题,只能通过画图归纳来寻找规律. ...
streamreader 和 streamwriter 以及 string 与 memorystream 使用示例
经常用到,但老记不住,备忘一下 using (var ms = new MemoryStream()) { var sw = new StreamWriter(ms); sw.WriteLine(&q ...
[JAVA]JAVA实现多线程的三种方式
1.继承Thread类,通过start()方法调用 public class MultiThreadByExtends extends Thread { @Override public void r ...
MSMQ .NET下的应用
Message Message是MSMQ的数据存储单元,我们的用户数据一般也被填充在Message的body当中,因此很重要,让我们来看一看其在.net中的体现,如图: 在图上我们可以看见,Messa ...

The component and implementation of a basic gradient descent in python

1. the \(\mathcal{L}\) is a context function: \(f(x)=x^2+2x+1\)

The component and implementation of a basic gradient descent in python的更多相关文章

随机推荐

热门专题