You can now run a GPT-3-level AI model on your laptop, phone, and Raspberry Pi
https://arstechnica.com/information-technology/2023/03/you-can-now-run-a-gpt-3-level-ai-model-on-your-laptop-phone-and-raspberry-pi/

Things are moving at lightning speed in AI Land. On Friday, a software developer named Georgi Gerganov created a tool called "llama.cpp" that can run Meta's new GPT-3-class AI large language model, LLaMA, locally on a Mac laptop. Soon thereafter, people worked out how to run LLaMA on Windows as well. Then someone showed it running on a Pixel 6 phone, and next came a Raspberry Pi (albeit running very slowly).

If this keeps up, we may be looking at a pocket-sized ChatGPT competitor before we know it.

 

But let's back up a minute, because we're not quite there yet. (At least not today—as in literally today, March 13, 2023.) But what will arrive next week, no one knows.

Since ChatGPT launched, some people have been frustrated by the AI model's built-in limits that prevent it from discussing topics that OpenAI has deemed sensitive. Thus began the dream—in some quarters—of an open source large language model (LLM) that anyone could run locally without censorship and without paying API fees to OpenAI.

Open source solutions do exist (such as GPT-J), but they require a lot of GPU RAM and storage space. Other open source alternatives could not boast GPT-3-level performance on readily available consumer-level hardware.

Enter LLaMA, an LLM available in parameter sizes ranging from 7B to 65B (that's "B" as in "billion parameters," which are floating-point numbers stored in matrices that represent what the model "knows"). LLaMA made a heady claim: that its smaller models could match OpenAI's GPT-3, the foundational model that powers ChatGPT, in output quality and speed. There was just one problem: Meta released the LLaMA code as open source but held back the "weights" (the trained "knowledge" stored in a neural network), offering them only to qualified researchers.
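To put those parameter counts in perspective, here's a back-of-the-envelope calculation (ours, not Meta's) of how much memory the raw weights alone occupy at a few storage precisions:

```python
# Rough weights-only memory footprint for the LLaMA model sizes
# (activations, caches, and runtime overhead are ignored).
GIB = 1024 ** 3

for params in (7e9, 13e9, 33e9, 65e9):
    for bits in (32, 16, 4):
        size_gib = params * bits / 8 / GIB
        print(f"{params / 1e9:.0f}B params @ {bits:>2}-bit: {size_gib:6.1f} GiB")
```

At 16-bit precision, the 7B model's weights alone come to roughly 13 GiB, which is why shrinking them further matters so much on consumer hardware.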

Flying at the speed of LLaMA

Meta's restrictions on LLaMA didn't last long, because on March 2, someone leaked the LLaMA weights on BitTorrent. Since then, there has been an explosion of development surrounding LLaMA. Independent AI researcher Simon Willison has compared this situation to the release of Stable Diffusion, an open source image synthesis model that launched last August. Here's what he wrote in a post on his blog:

It feels to me like that Stable Diffusion moment back in August kick-started the entire new wave of interest in generative AI—which was then pushed into over-drive by the release of ChatGPT at the end of November.

That Stable Diffusion moment is happening again right now, for large language models—the technology behind ChatGPT itself. This morning I ran a GPT-3 class language model on my own personal laptop for the first time!

AI stuff was weird already. It’s about to get a whole lot weirder.

Typically, running GPT-3 requires several datacenter-class A100 GPUs (also, the weights for GPT-3 are not public), but LLaMA made waves because it could run on a single beefy consumer GPU. And now, with optimizations that reduce the model size using a technique called quantization, LLaMA can run on an M1 Mac or a lesser Nvidia consumer GPU (although "llama.cpp" only runs on CPU at the moment—which is impressive and surprising in its own way).
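To illustrate the core idea behind quantization, here's a generic sketch of our own (llama.cpp's actual ggml format is more involved: it uses a separate scale per small block of weights and packs two 4-bit values into each byte):

```python
import numpy as np

# Symmetric 4-bit quantization: map each weight onto an integer in [-7, 7]
# plus one shared float scale, then reconstruct approximate floats from it.
def quantize_4bit(weights: np.ndarray):
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)  # stand-in for a weight matrix
q, s = quantize_4bit(w)
print("max reconstruction error:", np.abs(w - dequantize(q, s)).max())
```

The model gets roughly 4x smaller than 16-bit weights at the cost of some precision, a trade-off we return to below.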

Things are moving so quickly that it's sometimes difficult to keep up with the latest developments. (Regarding AI's rate of progress, a fellow AI reporter told Ars, "It's like those videos of dogs where you upend a crate of tennis balls on them. [They] don't know where to chase first and get lost in the confusion.")

For example, here's a list of notable LLaMA-related events based on a timeline Willison laid out in a Hacker News comment:

  • February 24, 2023: Meta AI announces LLaMA.
  • March 2, 2023: Someone leaks the LLaMA models via BitTorrent.
  • March 10, 2023: Georgi Gerganov creates llama.cpp, which can run on an M1 Mac.
  • March 11, 2023: Artem Andreenko runs LLaMA 7B (slowly) on a Raspberry Pi 4 with 4GB of RAM, at 10 seconds per token.
  • March 12, 2023: LLaMA 7B runs on NPX, a Node.js execution tool.
  • March 13, 2023: Someone gets llama.cpp running on a Pixel 6 phone, also very slowly.
  • March 13, 2023: Stanford releases Alpaca 7B, an instruction-tuned version of LLaMA 7B that "behaves similarly to OpenAI's text-davinci-003" but runs on much less powerful hardware.

After obtaining the LLaMA weights ourselves, we followed Willison's instructions and got the 7B parameter version running on an M1 MacBook Air, and it runs at a reasonable speed. You call it as a script on the command line with a prompt, and LLaMA does its best to complete it in a reasonable way.
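For the curious, here's a minimal sketch of that kind of invocation, wrapped in Python; the binary name, flags, and model filename are what the early llama.cpp builds used, and they may well have changed since:

```python
import subprocess

# Roughly equivalent to running, from the llama.cpp directory:
#   ./main -m ./models/7B/ggml-model-q4_0.bin -n 128 -p "..."
result = subprocess.run(
    ["./main",
     "-m", "./models/7B/ggml-model-q4_0.bin",  # 4-bit quantized weights
     "-n", "128",                              # how many tokens to generate
     "-p", "The first man on the moon was"],   # the prompt to complete
    capture_output=True, text=True,
)
print(result.stdout)
```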

[Image: A screenshot of LLaMA 7B in action on a MacBook Air running llama.cpp. Credit: Benj Edwards / Ars Technica]

There's still the question of how much the quantization affects the quality of the output. In our tests, LLaMA 7B trimmed down to 4-bit quantization was very impressive for running on a MacBook Air—but still not on par with what you might expect from ChatGPT. It's entirely possible that better prompting techniques might generate better results.

Also, optimizations and fine-tunings come quickly when everyone has their hands on the code and the weights, even though LLaMA is still saddled with some fairly restrictive terms of use. Today's release of Alpaca by Stanford shows that fine-tuning (additional training with a specific goal in mind) can improve performance, and it's still early days after LLaMA's release.
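For a sense of what that fine-tuning looks like in practice, here's a sketch modeled on the instruction/input/output JSON format Stanford published alongside Alpaca; the prompt template mirrors the one in their repository, though the specific record below is our own illustration:

```python
# One training record in Alpaca's published format.
example = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",  # optional extra context; empty for this record
    "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Sleep well.",
}

# At training time, each record is rendered into a prompt/response pair,
# and the base LLaMA model is further trained to produce the response.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{example['instruction']}\n\n### Response:\n"
)
target = example["output"]
print(prompt + target)
```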

As of this writing, running LLaMA on a Mac remains a fairly technical exercise. You have to install Python and Xcode and be familiar with working on the command line. Willison has good step-by-step instructions for anyone who would like to attempt it. But that may soon change as developers continue to code away.

As for the implications of having this tech out in the wild—no one knows yet. While some worry about AI's impact as a tool for spam and misinformation, Willison says, "It’s not going to be un-invented, so I think our priority should be figuring out the most constructive possible ways to use it."

Right now, our only guarantee is that things will change rapidly.
