AlphaGo:用机器学习技术古老的围棋游戏掌握AlphaGo: Mastering the ancient game of Go with Machine Learning
AlphaGo: Mastering the ancient game of Go with Machine Learning
But one game has thwarted A.I. research thus far: the ancient game of Go. Invented in China over 2500 years ago, Go is played by more than 40 million people worldwide. The rules are simple: players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory. Confucius wrote about the game, and its aesthetic beauty elevated it to one of the four essential arts required of any true Chinese scholar. The game is played primarily through intuition and feel, and because of its subtlety and intellectual depth it has captured the human imagination for centuries.
But as simple as the rules are, Go is a game of profound complexity. The search space in Go is vast -- more than a googol times larger than chess (a number greater than there are atoms in the universe!). As a result, traditional “brute force” AI methods -- which construct a search tree over all possible sequences of moves -- don’t have a chance in Go. To date, computers have played Go only as well as amateurs. Experts predicted it would be at least another 10 years until a computer could beat one of the world’s elite group of Go professionals.
We saw this as an irresistible challenge! We started building a system, AlphaGo, described in a paper in Nature this week, that would overcome these barriers. The key to AlphaGo is reducing the enormous search space to something more manageable. To do this, it combines a state-of-the-art tree search with two deep neural networks, each of which contains many layers with millions of neuron-like connections. One neural network, the “policy network”, predicts the next move, and is used to narrow the search to consider only the moves most likely to lead to a win. The other neural network, the “value network”, is then used to reduce the depth of the search tree -- estimating the winner in each position in place of searching all the way to the end of the game.
AlphaGo’s search algorithm is much more human-like than previous approaches. For example, when Deep Blue played chess, it searched by brute force over thousands of times more positions than AlphaGo. Instead, AlphaGo looks ahead by playing out the remainder of the game in its imagination, many times over - a technique known as Monte-Carlo tree search. But unlike previous Monte-Carlo programs, AlphaGo uses deep neural networks to guide its search. During each simulated game, the policy network suggests intelligent moves to play, while the value network astutely evaluates the position that is reached. Finally, AlphaGo chooses the move that is most successful in simulation.
We first trained the policy network on 30 million moves from games played by human experts, until it could predict the human move 57% of the time (the previous record before AlphaGo was 44%). But our goal is to beat the best human players, not just mimic them. To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and gradually improving them using a trial-and-error process known as reinforcement learning. This approach led to much better policy networks, so strong in fact that the raw neural network (immediately, without any tree search at all) can defeat state-of-the-art Go programs that build enormous search trees.
These policy networks were in turn used to train the value networks, again by reinforcement learning from games of self-play. These value networks can evaluate any Go position and estimate the eventual winner - a problem so hard it was believed to be impossible.
Of course, all of this requires a huge amount of compute power, so we made extensive use ofGoogle Cloud Platform, which enables researchers working on AI and Machine Learning to access elastic compute, storage and networking capacity on demand. In addition, new open source libraries for numerical computation using data flow graphs, such as TensorFlow, allow researchers to efficiently deploy the computation needed for deep learning algorithms across multiple CPUs or GPUs.
So how strong is AlphaGo? To answer this question, we played a tournament between AlphaGo and the best of the rest - the top Go programs at the forefront of A.I. research. Using a single machine, AlphaGo won all but one of its 500 games against these programs. In fact, AlphaGo even beat those programs after giving them 4 free moves headstart at the beginning of each game. A high-performance version of AlphaGo, distributed across many machines, was even stronger.
![]() |
This figure from the Nature article shows the Elo rating and approximate rank of AlphaGo (both single machine and distributed versions), the European champion Fan Hui (a professional 2-dan), and the strongest other Go programs, evaluated over thousands of games. Pale pink bars show the performance of other programs when given a four move headstart. |
It seemed that AlphaGo was ready for a greater challenge. So we invited the reigning 3-time European Go champion Fan Hui — an elite professional player who has devoted his life to Go since the age of 12 — to our London office for a challenge match. The match was played behind closed doors between October 5-9 last year. AlphaGo won by 5 games to 0 -- the first time a computer program has ever beaten a professional Go player.
AlphaGo’s next challenge will be to play the top Go player in the world over the last decade,Lee Sedol. The match will take place this March in Seoul, South Korea. Lee Sedol is excited to take on the challenge saying, "I am privileged to be the one to play, but I am confident that I can win." It should prove to be a fascinating contest!
We are thrilled to have mastered Go and thus achieved one of the grand challenges of AI. However, the most significant aspect of all this for us is that AlphaGo isn’t just an ‘expert’ system built with hand-crafted rules, but instead uses general machine learning techniques to allow it to improve itself, just by watching and playing games. While games are the perfect platform for developing and testing AI algorithms quickly and efficiently, ultimately we want to apply these techniques to important real-world problems. Because the methods we have used are general purpose, our hope is that one day they could be extended to help us address some of society’s toughest and most pressing problems, from climate modelling to complex disease analysis.
AlphaGo:用机器学习技术古老的围棋游戏掌握AlphaGo: Mastering the ancient game of Go with Machine Learning的更多相关文章
- 数据挖掘:实用机器学习技术P295页:
数据挖掘:实用机器学习技术P295页: 在weka软件中的实验者界面中,新建好实验项目后,添加相应的实验数据,然后添加对应需要的分类算法 ,需要使用多个算法时候重复操作添加add algorithm. ...
- java围棋游戏源代码
//李雨泽源代码,不可随意修改.//时间:2017年9月22号.//地点:北京周末约科技有限公司.//package com.bao; /*围棋*/ /*import java.awt.*; impo ...
- 谷歌发布"自动机器学习"技术 AI可自我创造
谷歌发布"自动机器学习"技术 AI可自我创造 据Inverse报道,今年5月份,谷歌宣布其人工智能(AI)研究取得重大进展,似乎帮助科幻小说中最耸人听闻的末日预言成为现实.谷歌推出 ...
- 使用Java的GUI技术实现 “ 贪吃蛇 ” 游戏
详细教程: 使用Java的GUI技术实现 " 贪吃蛇 " 游戏_IT打工酱的博客-CSDN博客
- 【机器学习Machine Learning】资料大全
昨天总结了深度学习的资料,今天把机器学习的资料也总结一下(友情提示:有些网站需要"科学上网"^_^) 推荐几本好书: 1.Pattern Recognition and Machi ...
- 机器学习(Machine Learning)&深度学习(Deep Learning)资料【转】
转自:机器学习(Machine Learning)&深度学习(Deep Learning)资料 <Brief History of Machine Learning> 介绍:这是一 ...
- 数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics)之间有什么关系?
本来我以为不需要解释这个问题的,到底数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)有什么区别,但是前几天因为有个学弟问我,我想了想发现我竟然也回答 ...
- 机器学习笔记1 - Hello World In Machine Learning
前言 Alpha Go在16年以4:1的战绩打败了李世石,17年又以3:0的战绩战胜了中国围棋天才柯洁,这真是科技界振奋人心的进步.伴随着媒体的大量宣传,此事变成了妇孺皆知的大事件.大家又开始激烈的讨 ...
- 学习笔记之机器学习(Machine Learning)
机器学习 - 维基百科,自由的百科全书 https://zh.wikipedia.org/wiki/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0 机器学习是人工智能的一个分 ...
随机推荐
- Python 读写xlsx
# pip install openpyxl # openpyxl只能用于处理xlsx,不能用于处理xlsfrom openpyxl import load_workbook # 打开文件ExcelF ...
- WordPress SMTP发送邮件插件:WP SMTP
对于一个网站而言,发送邮件的功能是必不可少的,现在的主机一般都支持发送邮件,但是不同的主机由于函数限制或者某些其他原因,可能造成没办法正常发送邮件.这时候,我们可能就要借助第三方SMTP发送邮件. 对 ...
- jmeter-----如何安装插件管理?
1.下载插件管理jar文件,http://www.jmeter-plugins.org/wiki/PluginsManager/ 2. 拷贝这jar文件到 \lib\ext文件夹里 3. 重新打开JM ...
- NIO-4pipe
import java.io.IOException; import java.nio.ByteBuffer; import java.nio.channels.Pipe; import org.ju ...
- Java 大小写转换
Java 大小写转换 public class CaseConversion { /** * @param character: a character * @return: a character ...
- nhibernate 比较运算符
比较运算符 HQL运算符 QBC运算符 含义 = Restrictions.eq() 等于 <> Restrictions.not(Exprission.eq()) 不等于 > Re ...
- Coding.net简单使用指南
注意:大家创建项目时一定要选择创建公开仓库!不然别人看不到! Coding.net是一个代码托管平台,简单来说这东西就是一个你在线存放代码的地方. 至于为什么要把代码存到这东西上呢?很多好处,比如防丢 ...
- bWAPP练习--injection篇SQL Injection (GET/Search)
SQL注入: SQL注入,就是通过把SQL命令插入到Web表单提交或输入域名或页面请求的查询字符串,最终达到欺骗服务器执行恶意的SQL命令.具体来说,它是利用现有应用程序,将(恶意)的SQL命令注入到 ...
- cf 633B A trivial problem
Mr. Santa asks all the great programmers of the world to solve a trivial problem. He gives them an i ...
- [BZOJ3583]杰杰的女性朋友(矩阵快速幂)
杰杰的女性朋友 时间限制:10s 空间限制:256MB 题目描述 杰杰是魔法界的一名传奇人物.他对魔法具有深刻的洞察力,惊人的领悟力,以及令人叹为观止的创造力.自从他从事魔法竞赛以来,短短几 ...