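2023-03-19: Encoding PCM to MP3 with Go and the FFmpeg libraries.

Answer, 2023-03-19:

This article shows how to encode a PCM audio file to MP3 with Go and the FFmpeg libraries. It uses the moonfdd/ffmpeg-go binding, and the experiment was run on 64-bit Windows 10.

The code is adapted from two references: "FFmpeg, 15. Sample programs (9): audio encoder (PCM to MP3)" and "19: PCM to MP3 encoding".

The snippets in section 2 are excerpts; the complete code in section 3 is the version that was run successfully.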

1. Preparation

Install the moonfdd/ffmpeg-go library by running: go get -u github.com/moonfdd/ffmpeg-go

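2. Implementation steps

2.1. Setting the FFmpeg library paths

First, tell the binding where the FFmpeg shared libraries are; in this example they sit in the "./lib" directory. The SetXXXXPath() functions provided by moonfdd/ffmpeg-go set the path of each library individually: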

os.Setenv("Path", os.Getenv("Path")+";./lib")

ffcommon.SetAvutilPath("./lib/avutil-56.dll")

ffcommon.SetAvcodecPath("./lib/avcodec-58.dll")

ffcommon.SetAvdevicePath("./lib/avdevice-58.dll")

ffcommon.SetAvfilterPath("./lib/avfilter-56.dll")

ffcommon.SetAvformatPath("./lib/avformat-58.dll")

ffcommon.SetAvpostprocPath("./lib/postproc-55.dll")

ffcommon.SetAvswresamplePath("./lib/swresample-3.dll")

ffcommon.SetAvswscalePath("./lib/swscale-5.dll")

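2.2. Preparing the input PCM file

The input PCM file in this example has 16-bit samples, two channels (stereo) and a 44100 Hz sample rate. It is named "s16le.pcm" and is stored in the "./out" directory. If the file does not exist, the program calls the ffmpeg command-line tool to extract the audio track from a video file and convert it to this format: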

inFileName := "./out/s16le.pcm"

_, err = os.Stat(inFileName)

if err != nil {

    if os.IsNotExist(err) {

        fmt.Println("create pcm file")

        exec.Command("./lib/ffmpeg", "-i", "./resources/big_buck_bunny.mp4", "-f", "s16le", "-ar", "44100", "-ac", "2", "-acodec", "pcm_s16le", "-vn", inFileName, "-y").CombinedOutput()

    }

}

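2.3. Opening the output MP3 file

The output MP3 file in this example is named "out19.mp3" and is stored in the "./out" directory, so outFileName is "./out/out19.mp3". First call libavformat.AvformatAllocOutputContext2() to allocate the AVFormatContext structure, then call libavformat.AvioOpen() to open the output file: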

var pFormatCtx *libavformat.AVFormatContext

libavformat.AvformatAllocOutputContext2(&pFormatCtx, nil, "", outFileName)

if libavformat.AvioOpen(&pFormatCtx.Pb, outFileName, libavformat.AVIO_FLAG_READ_WRITE) < 0 {

    fmt.Printf("Cannot open output file.\n")

    return

}

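2.4. Adding the output stream

Add the audio stream to the output file. Call AvformatNewStream() on the format context to create a new stream, then fill in the stream's Codec context with the parameters of the audio to be encoded (planar float samples, 44100 Hz, stereo, 128 kbit/s):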

stream := pFormatCtx.AvformatNewStream(nil)

if stream == nil {

    fmt.Printf("Cannot create a new stream to output file.\n")

    return

}

pCodecCtx = stream.Codec

pCodecCtx.CodecType = libavutil.AVMEDIA_TYPE_AUDIO

pCodecCtx.CodecId = pFormatCtx.Oformat.AudioCodec

pCodecCtx.SampleFmt = libavutil.AV_SAMPLE_FMT_FLTP

pCodecCtx.SampleRate = 44100

pCodecCtx.ChannelLayout = libavutil.AV_CH_LAYOUT_STEREO

pCodecCtx.BitRate = 128000

pCodecCtx.Channels = libavutil.AvGetChannelLayoutNbChannels(pCodecCtx.ChannelLayout)
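These parameters fix how much PCM the program has to read for every encoded frame. The arithmetic can be sketched as below, under the assumption that the encoder reports a frame size of 1152 samples, which is the MPEG-1 Layer III frame length; the real program reads the value from pCodecCtx.FrameSize after the encoder is opened. The function names are illustrative only.

```go
package main

import "fmt"

// inputBytesPerFrame is the number of interleaved PCM bytes to read per encoder frame.
func inputBytesPerFrame(samples, channels, bytesPerSample int) int {
	return samples * channels * bytesPerSample
}

// frameDurationMs is the playing time of one frame in milliseconds.
func frameDurationMs(samples, sampleRate int) float64 {
	return float64(samples) / float64(sampleRate) * 1000
}

// encodedBytesPerFrame is the average size of one encoded frame at a constant bit rate.
func encodedBytesPerFrame(bitRate, samples, sampleRate int) float64 {
	return float64(bitRate) / 8 * float64(samples) / float64(sampleRate)
}

func main() {
	const (
		samplesPerFrame = 1152 // assumption: MPEG-1 Layer III frame length
		sampleRate      = 44100
		channels        = 2
		bitRate         = 128000
	)
	fmt.Println("PCM bytes per frame:", inputBytesPerFrame(samplesPerFrame, channels, 2))
	fmt.Printf("frame duration: %.2f ms\n", frameDurationMs(samplesPerFrame, sampleRate))
	fmt.Printf("encoded bytes per frame: %.1f\n", encodedBytesPerFrame(bitRate, samplesPerFrame, sampleRate))
}
```

With 16-bit stereo input this gives 4608 bytes of PCM per frame, which is the read length the encoding loop has to use.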

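2.5. Finding and opening the encoder

Look up the encoder for the chosen codec ID with libavcodec.AvcodecFindEncoder(), which returns the matching AVCodec object. Then call AvcodecOpen2() on the codec context, passing the codec, to open the encoder and initialize the context:

pCodec := libavcodec.AvcodecFindEncoder(pCodecCtx.CodecId)

if pCodec == nil {

    fmt.Printf("Cannot find encoder.\n")

    return

}

// AvcodecOpen2 is a method of the codec context, as in the complete code

if pCodecCtx.AvcodecOpen2(pCodec, nil) < 0 {

    fmt.Printf("Cannot open encoder.\n")

    return

}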

2.6. Writing the file header

Call AvformatWriteHeader() on the format context to write the header of the output file:

if pFormatCtx.AvformatWriteHeader(nil) < 0 {

    fmt.Printf("Error occurred while writing header.\n")

    return

}

2.7. Encoding the audio data

Loop over the input PCM file and read exactly one encoder frame at a time: pFrame.NbSamples samples, times 2 bytes per S16 sample, times the channel count. Reading any other length makes the encoded audio play too fast or too slow. Each chunk is copied into the frame buffer, converted from interleaved S16 to planar FLTP with swr_ctx.SwrConvert(), and then copied into the two data planes of the frame. The frame is sent to the encoder with AvcodecSendFrame(); when AvcodecReceivePacket() returns a packet, it is written to the output file with AvWriteFrame(). The setup of pFrame, frameBuf, swr_ctx, convert_data and pkt is shown in the complete code in section 3:

for i := 0; ; i++ {

    // byte length of one input frame

    length := pFrame.NbSamples * libavutil.AvGetBytesPerSample(libavutil.AV_SAMPLE_FMT_S16) * pFrame.Channels

    buf := make([]byte, length)

    n, err := inFile.Read(buf)

    if err != nil || n <= 0 {

        break

    }

    // copy the PCM bytes into the frame buffer

    for j := 0; j < n; j++ {

        *(*byte)(unsafe.Pointer(frameBuf + uintptr(j))) = buf[j]

    }

    // convert interleaved S16 to planar FLTP

    swr_ctx.SwrConvert(convert_data, pCodecCtx.FrameSize,

        (**byte)(unsafe.Pointer(&pFrame.Data)), pFrame.NbSamples)

    // byte length of one output plane

    length = pCodecCtx.FrameSize * libavutil.AvGetBytesPerSample(pCodecCtx.SampleFmt)

    // copy both converted planes back into the frame

    c := *(*[2]uintptr)(unsafe.Pointer(convert_data))

    fd0 := uintptr(unsafe.Pointer(pFrame.Data[0]))

    fd1 := uintptr(unsafe.Pointer(pFrame.Data[1]))

    for j := uintptr(0); j < uintptr(length); j++ {

        *(*byte)(unsafe.Pointer(fd0 + j)) = *(*byte)(unsafe.Pointer(c[0] + j))

        *(*byte)(unsafe.Pointer(fd1 + j)) = *(*byte)(unsafe.Pointer(c[1] + j))

    }

    pFrame.Pts = int64(i * 100)

    // encode the frame

    if pCodecCtx.AvcodecSendFrame(pFrame) < 0 {

        fmt.Printf("can't send frame for encoding\n")

        break

    }

    // write the encoded packet, if the encoder produced one

    if pCodecCtx.AvcodecReceivePacket(&pkt) >= 0 {

        pkt.StreamIndex = uint32(stream.Index)

        pFormatCtx.AvWriteFrame(&pkt)

    }

    pkt.AvPacketUnref()

}
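The conversion step in this loop is the least obvious part: the PCM file stores samples as interleaved 16-bit integers (L R L R ...), while the encoder is configured for AV_SAMPLE_FMT_FLTP, that is, one float buffer per channel. The pure-Go sketch below illustrates that layout change. It is an illustration only, not a substitute for libswresample, and s16leToPlanarFloat is a hypothetical name.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// s16leToPlanarFloat splits interleaved little-endian 16-bit PCM into one
// float32 slice per channel, scaling each sample to the range [-1, 1).
func s16leToPlanarFloat(pcm []byte, channels int) [][]float32 {
	frames := len(pcm) / (2 * channels) // complete sample frames only
	planes := make([][]float32, channels)
	for c := range planes {
		planes[c] = make([]float32, frames)
	}
	for i := 0; i < frames; i++ {
		for c := 0; c < channels; c++ {
			off := (i*channels + c) * 2
			s := int16(binary.LittleEndian.Uint16(pcm[off : off+2]))
			planes[c][i] = float32(s) / 32768
		}
	}
	return planes
}

func main() {
	// Two stereo sample frames: (16384, -16384) and (32767, -32768).
	pcm := []byte{0x00, 0x40, 0x00, 0xC0, 0xFF, 0x7F, 0x00, 0x80}
	planes := s16leToPlanarFloat(pcm, 2)
	fmt.Println("left: ", planes[0])
	fmt.Println("right:", planes[1])
}
```

In the real program SwrConvert() performs this work in native code and writes the result into convert_data, whose two planes are then copied into pFrame.Data[0] and pFrame.Data[1].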

2.8. Writing the file trailer

Finally, call AvWriteTrailer() on the format context to write the trailer of the output file, which completes the encoding process:

if pFormatCtx.AvWriteTrailer() < 0 {

    fmt.Printf("Error occurred while writing trailer.\n")

    return

}

3. Complete code

package main

import (

	"fmt"

	"os"

	"os/exec"

	"unsafe"

	"github.com/moonfdd/ffmpeg-go/ffcommon"

	"github.com/moonfdd/ffmpeg-go/libavcodec"

	"github.com/moonfdd/ffmpeg-go/libavformat"

	"github.com/moonfdd/ffmpeg-go/libavutil"

	"github.com/moonfdd/ffmpeg-go/libswresample"

)

func main() {

	// https://blog.csdn.net/guoyunfei123/article/details/105643255

	// The duration of the output matches the input, with no drift

	os.Setenv("Path", os.Getenv("Path")+";./lib")

	ffcommon.SetAvutilPath("./lib/avutil-56.dll")

	ffcommon.SetAvcodecPath("./lib/avcodec-58.dll")

	ffcommon.SetAvdevicePath("./lib/avdevice-58.dll")

	ffcommon.SetAvfilterPath("./lib/avfilter-56.dll")

	ffcommon.SetAvformatPath("./lib/avformat-58.dll")

	ffcommon.SetAvpostprocPath("./lib/postproc-55.dll")

	ffcommon.SetAvswresamplePath("./lib/swresample-3.dll")

	ffcommon.SetAvswscalePath("./lib/swscale-5.dll")

	genDir := "./out"

	_, err := os.Stat(genDir)

	if err != nil {

		if os.IsNotExist(err) {

			os.Mkdir(genDir, 0777) //  Everyone can read write and execute

		}

	}

	//./lib/ffmpeg -i .\resources\big_buck_bunny.mp4 -f s16le -ar 44100 -ac 2 -acodec pcm_s16le -vn ./out/s16le.pcm

	// ./lib/ffmpeg -y -f s16le -ac 2 -ar 44100 -acodec pcm_s16le -vn -i ./out/s16le.pcm ./out/s16le.mp3

	inFileName := "./out/s16le.pcm"

	// inFileName := "./out/test16.pcm"

	outFileName := "./out/out19.mp3"

	// Generate the PCM file if it does not exist yet

	_, err = os.Stat(inFileName)

	if err != nil {

		if os.IsNotExist(err) {

			fmt.Println("create pcm file")

			exec.Command("./lib/ffmpeg", "-i", "./resources/big_buck_bunny.mp4", "-f", "s16le", "-ar", "44100", "-ac", "2", "-acodec", "pcm_s16le", "-vn", inFileName, "-y").CombinedOutput()

		}

	}

	var pFormatCtx *libavformat.AVFormatContext

	var pCodecCtx *libavcodec.AVCodecContext

	var pCodec *libavcodec.AVCodec

	var pkt libavcodec.AVPacket

	var pFrame *libavutil.AVFrame

	//libavdevice.AvdeviceRegisterAll()

	for {

		libavformat.AvformatAllocOutputContext2(&pFormatCtx, nil, "", outFileName)

		if libavformat.AvioOpen(&pFormatCtx.Pb, outFileName, libavformat.AVIO_FLAG_READ_WRITE) < 0 {

			fmt.Printf("Cannot open output file.\n")

			return

		}

		stream := pFormatCtx.AvformatNewStream(nil)

		if stream == nil {

			fmt.Printf("Cannot create a new stream to output file.\n")

			return

		}

		// Set the encoder parameters

		pCodecCtx = stream.Codec

		pCodecCtx.CodecType = libavutil.AVMEDIA_TYPE_AUDIO

		pCodecCtx.CodecId = pFormatCtx.Oformat.AudioCodec

		pCodecCtx.SampleFmt = libavutil.AV_SAMPLE_FMT_FLTP

		pCodecCtx.SampleRate = 44100

		pCodecCtx.ChannelLayout = libavutil.AV_CH_LAYOUT_STEREO

		pCodecCtx.BitRate = 128000

		pCodecCtx.Channels = libavutil.AvGetChannelLayoutNbChannels(pCodecCtx.ChannelLayout)

		// Find the encoder

		pCodec = libavcodec.AvcodecFindEncoder(pCodecCtx.CodecId)

		if pCodec == nil {

			fmt.Printf("Cannot find audio encoder.\n")

			return

		}

		// Open the encoder

		if pCodecCtx.AvcodecOpen2(pCodec, nil) < 0 {

			fmt.Printf("Cannot open encoder.\n")

			return

		}

		//fmtCtx.AvDumpFormat(0, outFileName, 1)

		pFrame = libavutil.AvFrameAlloc()

		if pFrame == nil {

			fmt.Printf("can't alloc frame\n")

			return

		}

		//===========

		pFrame.NbSamples = pCodecCtx.FrameSize

		pFrame.Format = int32(pCodecCtx.SampleFmt)

		pFrame.Channels = 2

		// Convert the PCM samples (S16 interleaved to FLTP planar)

		var swr_ctx *libswresample.SwrContext = libswresample.SwrAlloc()

		swr_ctx.SwrAllocSetOpts(libavutil.AvGetDefaultChannelLayout(pCodecCtx.Channels),

			pCodecCtx.SampleFmt,

			pCodecCtx.SampleRate,

			libavutil.AvGetDefaultChannelLayout(pFrame.Channels),

			libavutil.AV_SAMPLE_FMT_S16, // sample format of the source PCM file

			44100,

			0, uintptr(0))

		swr_ctx.SwrInit()

		/* Allocate the conversion buffers */

		// uint8_t **convert_data = (uint8_t**)calloc(codecCtx->channels,sizeof(*convert_data));

		convert_data := (**byte)(unsafe.Pointer(libavutil.AvCalloc(uint64(pCodecCtx.Channels), 8)))

		libavutil.AvSamplesAlloc(convert_data, nil, pCodecCtx.Channels, pCodecCtx.FrameSize,

			pCodecCtx.SampleFmt, 0)

		size := libavutil.AvSamplesGetBufferSize(nil, pCodecCtx.Channels,

			pCodecCtx.FrameSize, pCodecCtx.SampleFmt, 1)

		frameBuf := libavutil.AvMalloc(uint64(size))

		libavcodec.AvcodecFillAudioFrame(pFrame, pCodecCtx.Channels, pCodecCtx.SampleFmt,

			(*byte)(unsafe.Pointer(frameBuf)), size, 1)

		// Write the file header

		pFormatCtx.AvformatWriteHeader(nil)

		inFile, err := os.Open(inFileName)

		if err != nil {

			fmt.Printf("Cannot open input file.\n")

			return

		}

		pkt.AvInitPacket()

		pkt.Data = nil

		pkt.Size = 0

		for i := 0; ; i++ {

			// Byte length of one input frame

			length := pFrame.NbSamples * libavutil.AvGetBytesPerSample(libavutil.AV_SAMPLE_FMT_S16) * pFrame.Channels

			// Read PCM: the read length must be exactly one frame, otherwise the encoded audio may play too fast or too slow

			buf := make([]byte, length)

			n, err := inFile.Read(buf)

			if err != nil {

				fmt.Println("read end")

				break

			}

			if n <= 0 {

				break

			}

			for j := 0; j < n; j++ {

				*(*byte)(unsafe.Pointer(frameBuf + uintptr(j))) = buf[j]

			}

			swr_ctx.SwrConvert(convert_data, pCodecCtx.FrameSize,

				(**byte)(unsafe.Pointer(&pFrame.Data)),

				pFrame.NbSamples)

			// Byte length of one output plane

			length = pCodecCtx.FrameSize * libavutil.AvGetBytesPerSample(pCodecCtx.SampleFmt)

			// Copy both channels (the output MP3 is stereo)

			// memcpy(frame->data[0],convert_data[0],length);

			// memcpy(frame->data[1],convert_data[1],length);

			c := *(*[2]uintptr)(unsafe.Pointer(convert_data))

			fd0 := uintptr(unsafe.Pointer(pFrame.Data[0]))

			cd0 := uintptr(unsafe.Pointer(c[0]))

			fd1 := uintptr(unsafe.Pointer(pFrame.Data[1]))

			cd1 := uintptr(unsafe.Pointer(c[1]))

			for j := int32(0); j < length; j++ {

				*(*byte)(unsafe.Pointer(fd0)) = *(*byte)(unsafe.Pointer(cd0))

				*(*byte)(unsafe.Pointer(fd1)) = *(*byte)(unsafe.Pointer(cd1))

				fd0++

				cd0++

				fd1++

				cd1++

			}

			pFrame.Pts = int64(i * 100)

			if pCodecCtx.AvcodecSendFrame(pFrame) < 0 {

				fmt.Printf("can't send frame for encoding\n")

				break

			}

			if pCodecCtx.AvcodecReceivePacket(&pkt) >= 0 {

				pkt.StreamIndex = uint32(stream.Index)

				fmt.Printf("write %4d frame, size = %d, length = %d\n", i, size, length)

				pFormatCtx.AvWriteFrame(&pkt)

			}

			pkt.AvPacketUnref()

		}

		// flush encoder

		if flush_encoder(pFormatCtx, 0) < 0 {

			fmt.Printf("flushing encoder failed\n")

			return

		}

		// write trailer

		pFormatCtx.AvWriteTrailer()

		inFile.Close()

		stream.Codec.AvcodecClose()

		libavutil.AvFree(uintptr(unsafe.Pointer(pFrame)))

		libavutil.AvFree(frameBuf)

		pFormatCtx.Pb.AvioClose()

		pFormatCtx.AvformatFreeContext()

		break

	}

	// codecCtx.AvcodecClose()

	// libavutil.AvFree(uintptr(unsafe.Pointer(frame)))

	// fmtCtx.Pb.AvioClose()

	// fmtCtx.AvformatFreeContext()

	fmt.Println("-----------------------------------------")

	// ./lib/ffplay -ar 44100 -ac 2 -f s16le -i ./out/test.pcm

	//_, err = exec.Command("./lib/ffplay.exe", "-ar", "44100", "-ac", "2", "-f", "s16le", "-i", "./out/test16.pcm").Output()

	_, err = exec.Command("./lib/ffplay.exe", outFileName).Output()

	if err != nil {

		fmt.Println("play err = ", err)

	}

}

func flush_encoder(fmt_ctx *libavformat.AVFormatContext, stream_index int) int32 {

	ret := int32(0)

	var got_frame int32

	var enc_pkt libavcodec.AVPacket

	if fmt_ctx.GetStream(uint32(stream_index)).Codec.Codec.Capabilities&libavcodec.AV_CODEC_CAP_DELAY == 0 {

		return 0

	}

	for {

		enc_pkt.Data = nil

		enc_pkt.Size = 0

		enc_pkt.AvInitPacket()

		ret = fmt_ctx.GetStream(uint32(stream_index)).Codec.AvcodecEncodeAudio2(&enc_pkt,

			nil, &got_frame)

		//av_frame_free(NULL)

		if ret < 0 {

			break

		}

		if got_frame == 0 {

			ret = 0

			break

		}

		fmt.Printf("Flush Encoder: Succeed to encode 1 frame!\tsize:%5d\n", enc_pkt.Size)

		/* mux encoded frame */

		ret = fmt_ctx.AvWriteFrame(&enc_pkt)

		if ret < 0 {

			break

		}

	}

	return ret

}

4. Running the program

Run the command:

go run ./examples/a19.audio_encode_pcm2mp3/main.go
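The program ends by playing the result with ffplay. A quick check of the output file can also be scripted; the sketch below only inspects the first bytes and assumes the file either begins with an ID3v2 tag or starts directly with an MPEG audio frame. looksLikeMP3 is a hypothetical helper, and the check is a heuristic, not a validation of the whole file.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// looksLikeMP3 reports whether head starts with an ID3v2 tag ("ID3")
// or with an MPEG audio frame sync (eleven set bits).
func looksLikeMP3(head []byte) bool {
	if len(head) >= 3 && string(head[:3]) == "ID3" {
		return true
	}
	return len(head) >= 2 && head[0] == 0xFF && head[1]&0xE0 == 0xE0
}

func main() {
	f, err := os.Open("./out/out19.mp3")
	if err != nil {
		fmt.Println("cannot open output file:", err)
		return
	}
	defer f.Close()

	head := make([]byte, 4)
	n, err := io.ReadFull(f, head)
	if err != nil && err != io.ErrUnexpectedEOF {
		fmt.Println("cannot read output file:", err)
		return
	}
	fmt.Println("looks like MP3:", looksLikeMP3(head[:n]))
}
```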
