[Repost] You Could Become an AI Master Before You Know It. Here’s How.
Reposted from: https://www.technologyreview.com/s/608921/ai-algorithms-are-starting-to-teach-ai-algorithms/#
Automating machine learning will make the technology more accessible to non–AI experts.
- by Will Knight
- October 17, 2017
At first blush, Scot Barton might not seem like an AI pioneer. He isn’t building self-driving cars or teaching computers to thrash humans at computer games. But within his role at Farmers Insurance, he is blazing a trail for the technology.
Barton leads a team that analyzes data to answer questions about customer behavior and the design of different policies. His group is now using all sorts of cutting-edge machine-learning techniques, from deep neural networks to decision trees. But Barton did not hire an army of AI wizards to make this possible. His team uses a platform called DataRobot, which automates a lot of difficult work involved in applying such techniques.
The insurance company’s work with DataRobot hints at how artificial intelligence might have to evolve in the next few years if it is to realize its enormous potential. Beyond spectacular demonstrations like DeepMind’s game-playing software AlphaGo, AI does have the power to revolutionize entire industries and make all sorts of businesses more efficient and productive. This, in turn, could help rejuvenate the economy by increasing overall productivity. But in order for this to happen, the technology will need to become a whole lot easier to use.
The problem is that many of the steps involved in using existing AI techniques currently require significant expertise. And it isn’t as simple as building a more user-friendly interface on top of things, because engineers often have to apply judgment and know-how when crafting and tweaking their code.
But AI researchers and companies are now trying to address this by essentially turning the technology on itself, using machine learning to automate the trickier aspects of developing AI algorithms. Some experts are even building the equivalent of AI-powered operating systems designed to make applications of the technology as accessible as Microsoft Excel is today.
DataRobot is a step in that direction. You feed in raw data, and the platform automatically cleans and reformats it. Then it runs dozens of different algorithms at once against it, ranking their performance. Barton first tried using the platform by inputting a bunch of insurance data to see if it could predict a specific dollar value. Compared with a standard, hand-built statistical approach, the model selected had a 20 percent lower error rate. “Out of the box, with the push of one button; that’s pretty impressive,” he says.
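The "run many algorithms and rank them" step described above can be illustrated with a minimal sketch. This is not DataRobot's implementation, which the article does not describe in detail. The synthetic data, the three candidate models, and names such as `leaderboard` and `CANDIDATES` are all assumptions made for illustration.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic (x, y) pairs: y = 3x + 5 plus noise. Invented for illustration."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3.0 * x + 5.0 + rng.gauss(0, 1.0)) for x in xs]


def fit_mean(train):
    """Baseline: always predict the mean of the training targets."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def fit_linear(train):
    """Ordinary least squares on a single feature."""
    mx = statistics.fmean(x for x, _ in train)
    my = statistics.fmean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(train, k=5):
    """k-nearest neighbours: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict


def mean_absolute_error(model, rows):
    return statistics.fmean(abs(model(x) - y) for x, y in rows)


def leaderboard(data, candidates, holdout=0.25):
    """Fit every candidate on the training split, then rank by held-out error."""
    split = int(len(data) * (1 - holdout))
    train, test = data[:split], data[split:]
    scores = [(name, mean_absolute_error(fit(train), test))
              for name, fit in candidates.items()]
    return sorted(scores, key=lambda s: s[1])  # lowest error first


# Hypothetical candidate set; a real platform would try far more model types.
CANDIDATES = {
    "mean baseline": fit_mean,
    "linear regression": fit_linear,
    "5-nearest neighbours": fit_knn,
}

board = leaderboard(make_data(), CANDIDATES)
for name, err in board:
    print(f"{name:22s} MAE = {err:.3f}")
```

The point of the sketch is the shape of the workflow: every candidate sees the same training split and is scored on the same held-out rows, so the ranking compares models rather than data. A production system would also automate the cleaning and reformatting steps, which are omitted here.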
AI Skills Gap
The reality of applying AI was laid bare in a report published by the consulting company McKinsey in June of this year. This report concludes that artificial intelligence, especially machine learning, may overhaul big industries, including manufacturing, finance, and health care, potentially adding up to $126 billion to the U.S. economy by 2025. But the report has one big caveat: a critical talent shortage.
There is certainly a big push to train as many people as possible to use AI (see “Andrew Ng’s Next Trick: Training a Million AI Experts”). But that will take time, and not everyone can become an AI master. The best way to maximize the impact of any technology is to make it as accessible as possible. Only then will AI begin to creep into ordinary offices and workplaces. DataRobot is already being used in some of those settings.
Late one afternoon, DataRobot’s office in Boston’s financial district is deserted apart from a handful of engineers milling around a large display. The company’s solution certainly seems impressive when Jonathan Dahlberg, one of the consultants, gives me a demo. He loads up a public data set of loan applications and payments, and then he has the system develop a bunch of models to see if there are any patterns in why people default.
In a few seconds, dozens of competing algorithms appear on the screen; at the top is a relatively unsexy but widely used gradient-boosting technique called XGBoost. This quickly shows that applicants’ income is especially important, but so is the reason they give for wanting a loan. It turns out that people who mention “starting a business” in their application are an especially bad bet.
DataRobot might not match the expertise or skill of a really good data scientist, Dahlberg says, but it can offer a broader perspective. A person might rely too heavily on a certain technique, and DataRobot could automatically reveal a fundamentally better approach. It is also still possible for a user to manually modify the underlying algorithm using the programming languages Python or R. Without a close examination, it’s hard to know how well the system automates some of the trickier aspects of data science, like data cleaning and feature engineering, but it seems to take care of a surprising amount.
The company’s CEO, Jeremy Achin, was inspired to start a company after watching The Social Network, as he admits a little sheepishly when we meet for coffee near MIT. But he got the idea for DataRobot while taking part in data-science competitions on the crowdsourcing platform Kaggle, which was acquired by Google earlier this year. Kaggle offers prizes for the algorithm that performs best at making a specific prediction from a large data set. This task typically involves developing a machine-learning algorithm that feeds on the data. As one of the best early Kaggle contestants, Achin realized he was already automating a lot of the steps involved in each competition. “I thought that if we collected enough data sets, enough problems, and ran enough experiments, we could do machine learning on machine learning. That was the original idea,” he says.
The idea clearly resonated with investors. DataRobot, started in 2012, has raised more than $100 million, including $54 million this March, around the same time that Kaggle was acquired. The company says it has more than 100 customers already. Achin says the concept is a lot less popular with many data scientists, who either feel that their skills cannot be automated or worry that they will be. But he believes that most businesses will have no other option if they want to make use of AI. “I don’t care how many people change their title to ‘data scientist’ on LinkedIn,” he says. “You’re not going to move the needle.”
Self-Learning Systems
The shortage of data scientists is inspiring many others to work on automating machine learning. A growing number of research papers are popping up on using its techniques to automate more and more aspects of AI.
One of the world’s biggest players in AI, Google, is also turning its attention to the idea. Google has invested enormous sums in developing powerful AI algorithms and deploying them across its services. But the company is also keen to add more AI to its cloud services. And going beyond simple tools for image or text classification will mean automating more of the work involved in training machine-learning models.
“The goal is to make this technology more accessible,” says John Giannandrea, a Scottish computer engineer who leads Google’s AI efforts. “So anybody could say ‘Build me a predictive model’ and it goes off and does it.”
Earlier this year, the company announced some significant progress toward this goal, demonstrating an experimental way to automate the process of tuning deep-learning neural networks (see “AI Software Learns to Make AI Software”). These are perhaps the most powerful machine-learning algorithms around, and they have significantly improved the state of the art in image and voice recognition. But they are also notoriously difficult to engineer. Giannandrea says this work is now producing some very promising results, in some cases matching the performance of systems developed by hand. And he expects Google to release more results in coming months.
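To give a feel for what "automating the tuning" means, here is a minimal sketch of random search, one of the simplest automated tuning strategies. It is not the method Google demonstrated, which the article does not detail, and it tunes a small nearest-neighbour model rather than a neural network so that it runs without extra libraries. The data, the search space, and names such as `random_search` are assumptions made for illustration.

```python
import math
import random


def make_data(n, rng):
    """Noisy sine curve; invented data for illustration only."""
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        rows.append((x, math.sin(x) + rng.gauss(0, 0.3)))
    return rows


def knn_predict(train, x, k, weighted):
    """Average the k nearest targets, optionally weighted by inverse distance."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    if not weighted:
        return sum(y for _, y in nearest) / len(nearest)
    weights = [1.0 / (abs(px - x) + 1e-6) for px, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def validation_error(train, valid, config):
    """Mean squared error of one configuration on the validation rows."""
    errs = [(knn_predict(train, x, **config) - y) ** 2 for x, y in valid]
    return sum(errs) / len(errs)


def random_search(train, valid, trials=20, seed=1):
    """Sample configurations at random and keep the best one seen so far."""
    rng = random.Random(seed)
    best_config, best_err = None, float("inf")
    for _ in range(trials):
        # Hypothetical search space: neighbourhood size and weighting scheme.
        config = {"k": rng.randint(1, 30), "weighted": rng.choice([True, False])}
        err = validation_error(train, valid, config)
        if err < best_err:
            best_config, best_err = config, err
    return best_config, best_err


data_rng = random.Random(0)
train, valid = make_data(150, data_rng), make_data(50, data_rng)
best_config, best_err = random_search(train, valid)
print("best configuration:", best_config, "validation MSE:", round(best_err, 4))
```

The human decision that remains is the choice of search space and validation data. The loop itself needs no judgment, which is why this part of the work is a natural target for automation; more sophisticated approaches replace the random sampling with a learned strategy for proposing the next configuration.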
Others have even grander designs. Eric Xing, a professor at Carnegie Mellon University, for instance, is developing what amounts to an operating system built from different machine-learning components. This OS uses virtualization and machine learning to abstract away much of the complexity in designing and training AI. It even features a graphical user interface that can be used to train a machine-learning model on a particular data set.
Xing was educated in China and studied at UC Berkeley alongside Andrew Ng, now a well-known figure in the world of AI. He is very polite, and surprisingly casual about wanting to reinvent the way people use computers. Xing envisions his AI OS becoming as easy to use as something like Microsoft’s spreadsheet package, Excel. “This is a core issue across the whole of AI,” he says. “The barrier to entry is just too high.”
Xing has created a company, Petuum, to develop the OS, and it has already created a series of tools aimed at bringing machine learning to medicine. “Doctors want an interface and medical records, images—each requires a different machine-learning approach,” he says. Petuum is also gearing up to release its platform.
Petuum’s OS, and other tools for automating AI, will face some unique challenges. There are already concerns about machine-learning algorithms inadvertently absorbing biases from training data, and some models are simply too opaque to examine carefully (see “The Dark Secret at the Heart of AI”). If AI becomes much easier to use, it’s possible these issues could become more widespread and more entrenched.
“To do machine learning really well, you need a PhD and about five years of experience,” says Rich Caruana, a senior researcher at Microsoft who has been doing data science for about 20 years. “There are many pitfalls. Does your algorithm expire after six months, and is it interpretable?”
Caruana believes it should be possible to automate some of the steps a data scientist needs to take in order to guard against such problems—something similar to a pilot’s pre-flight checklist. But he cautions against trusting too much in systems that promise to automate everything. “I know,” he says, “because I’ve stubbed my toe along the way.”