FFmpeg/OpenCV + C++: Live Stream Pulling, Pushing, and Video Frame Processing
An end-to-end walkthrough of pulling, decoding, processing, and re-pushing a live stream using FFmpeg and OpenCV with C++. It covers muxing/demuxing principles, codec basics, hardware acceleration concepts, and the roles of FFmpeg's eight core libraries. The core practice: pull from an RTMP source, decode into AVFrame, convert to an OpenCV Mat for custom processing (e.g. AI-based recognition), then encode back to H.264 and push to a target address. Detailed format-conversion code, an engineering workflow, and reusable utility-class wrappers are provided, addressing A/V sync, timestamp handling, and memory management.
独立开发者
Basics
Muxing and Demuxing
Raw image sequence (frame₀, frame₁, frame₂, ..., frameₙ)
  ↓ [H.264 encoder]
H.264 video bitstream (containing I-, P-, and B-frames)
  ↓ [MP4 muxer]
MP4 file (movie.mp4)
 ├── Video stream (H.264 encoded)
 ├── Audio stream (optional, e.g. AAC encoded)
 └── Metadata (timestamps, resolution, etc.)
Encoding and Decoding
MP4 file (movie.mp4)
  ↓ [MP4 demuxer]
Separated streams
 ├── H.264 video stream (containing I-, P-, and B-frames)
 └── Audio stream (if present)
  ↓ [H.264 decoder]
Reconstructed image sequence (recon₀, recon₁, recon₂, ..., reconₙ)
Hardware-Accelerated Encoding/Decoding
Tool Overview
FFmpeg's "Big Eight" core development libraries.
- AVUtil
The core utility library, supporting portable multimedia programming. It contains safe, portable string functions, random number generators, additional math functions, cryptography, and other multimedia-related utility functions.
- AVFormat
The protocol and container library. It wraps the protocol layer (Protocol: a file's data source, I/O method, etc.) and the media container layer (Muxer/Demuxer: how media files are packed and unpacked). AVFormat supports many input and output protocols (FILE, HTTP, UDP, RTMP, etc.) and many container formats (MP4, RMVB, etc.).
- AVCodec
The codec library. It wraps the codec layer and provides a generic encoding/decoding framework with many encoders and decoders for audio, video, and subtitle streams. Third-party codecs (such as x264 and x265) can also be plugged in.
- AVFilter
The filter library. It provides a generic framework for filtering audio/video, built around the concepts of filters, sources, and sinks.
- Postproc
The post-processing library, covering common post-processing operations on audio/video such as deinterlacing, denoising, and sharpening, which can improve clarity and reduce noise and artifacts. It was often used together with AVFilter and has been removed in recent FFmpeg versions.
- AVDevice
The device library. It provides a generic framework for grabbing from and rendering to many common multimedia I/O devices, e.g. when you need to operate on /dev/videoX.
- SWresample
The audio processing library, used for audio resampling, channel rematrixing, and sample format conversion.
- SWscale
The image conversion library, used for image scaling, color space conversion, and pixel format conversion, e.g. converting FFmpeg's YUV format to OpenCV's BGR format.
Project Practice
Project goal: pull a video stream from an RTMP input (e.g. rtmp://.../live/456), decode it with FFmpeg, optionally process the frames with OpenCV (currently commented out), then re-encode to an H.264 stream and push it to another RTMP address (e.g. rtmp://.../live/dj/1ZNBJ7C00C009X).
Functional Breakdown
- Input source: use FFmpeg's libavformat / libavcodec to open and read an RTMP stream (a local file or another protocol also works). Stream information is probed automatically to find the video stream (AVMEDIA_TYPE_VIDEO). The matching decoder (e.g. H.264) decodes it into raw frames (AVFrame).
- Image processing (optional): convert the decoded AVFrame to OpenCV's cv::Mat (BGR format). The comments note that this is where you could add text/logo overlays, object detection (e.g. YOLO), image enhancement, and so on. No processing is currently enabled.
- Re-encoding: convert the processed cv::Mat (BGR) to a YUV420P AVFrame (via the CVMatToAVFrame function), then encode each frame with the H.264 encoder (AV_CODEC_ID_H264). Encoding parameters: bit rate 50 * 1024 * 8 = 409,600 bps (roughly 400 kbps), GOP = 30 (one keyframe every 30 frames), with resolution and frame rate inherited from the input stream.
- Output push: use the FLV container (the standard wrapping over RTMP) and push the encoded H.264 data to the target RTMP address via avformat_write_header and av_interleaved_write_frame. Timestamps (PTS/DTS) are rescaled with av_rescale_q to keep playback in sync.
2.1 Format Conversion
2.1.1 AVFrame to Mat
Mat is OpenCV's image type, with a BGR color space corresponding to FFmpeg's AV_PIX_FMT_BGR24. An AVFrame is typically YUV420P, which we take as the example. FFmpeg's format-conversion functions handle this directly. The conversion code:
cv::Mat AVFrameToCVMat(AVFrame *yuv420Frame) {
int srcW = yuv420Frame->width;
int srcH = yuv420Frame->height;
SwsContext *swsCtx = sws_getContext(srcW, srcH, (AVPixelFormat)yuv420Frame->format, srcW, srcH, (AVPixelFormat)AV_PIX_FMT_BGR24, SWS_BICUBIC, NULL, NULL, NULL);
cv::Mat mat;
mat.create(cv::Size(srcW, srcH), CV_8UC3);
AVFrame *bgr24Frame = av_frame_alloc();
av_image_fill_arrays(bgr24Frame->data, bgr24Frame->linesize, (uint8_t *)mat.data, (AVPixelFormat)AV_PIX_FMT_BGR24, srcW, srcH, 1);
sws_scale(swsCtx, (const uint8_t* const*)yuv420Frame->data, yuv420Frame->linesize, 0, srcH, bgr24Frame->data, bgr24Frame->linesize);
av_frame_free(&bgr24Frame);
sws_freeContext(swsCtx);
return mat;
}
2.1.2 Mat to AVFrame
Following the previous step, the first attempt here was also to use FFmpeg's conversion functions, but it kept failing and the root cause was never pinned down. Once the layout is understood, though, the data can be filled in by hand: convert the Mat to YUV420 (I420) format, then copy the Y, U, and V planes into the corresponding planes of the AVFrame.
AVFrame *CVMatToAVFrame(cv::Mat &inMat) {
AVPixelFormat dstFormat = AV_PIX_FMT_YUV420P;
int width = inMat.cols;
int height = inMat.rows;
AVFrame *frame = av_frame_alloc();
frame->width = width;
frame->height = height;
frame->format = dstFormat;
int ret = av_frame_get_buffer(frame, 32);
if (ret < 0) {
std::cout << "Could not allocate the video frame data" << std::endl;
av_frame_free(&frame);
return nullptr;
}
ret = av_frame_make_writable(frame);
if (ret < 0) {
std::cout << "Av frame make writable failed." << std::endl;
av_frame_free(&frame);
return nullptr;
}
cv::cvtColor(inMat, inMat, cv::COLOR_BGR2YUV_I420);
int frame_size = width * height;
unsigned char *data = inMat.data;
memcpy(frame->data[0], data, frame_size);
memcpy(frame->data[1], data + frame_size, frame_size/4);
memcpy(frame->data[2], data + frame_size * 5/4, frame_size/4);
return frame;
}
2.2 Workflow
RTSP input stream
  ↓ av_read_frame()  ← pull the stream
  ↓ avcodec_send_packet() / avcodec_receive_frame()  ← GPU hardware decode (h264_cuvid)
GPU frame (NV12 on CUDA)
2.3 Implementation
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavutil/error.h>
#include <libavutil/mem.h>
#include <libavdevice/avdevice.h>
#include <libavutil/time.h>
}
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/core/opengl.hpp>
#include <opencv2/cudacodec.hpp>
#include <opencv2/freetype.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <queue>
#include <string>
#include <vector>
AVFrame *CVMatToAVFrame(cv::Mat &inMat) {
AVPixelFormat dstFormat = AV_PIX_FMT_YUV420P;
int width = inMat.cols;
int height = inMat.rows;
AVFrame *frame = av_frame_alloc();
frame->width = width;
frame->height = height;
frame->format = dstFormat;
int ret = av_frame_get_buffer(frame, 64);
if (ret < 0) {
return nullptr;
}
ret = av_frame_make_writable(frame);
if (ret < 0) {
return nullptr;
}
cv::cvtColor(inMat, inMat, cv::COLOR_BGR2YUV_I420);
int frame_size = width * height;
unsigned char *data = inMat.data;
memcpy(frame->data[0], data, frame_size);
memcpy(frame->data[1], data + frame_size, frame_size/4);
memcpy(frame->data[2], data + frame_size * 5/4, frame_size/4);
return frame;
}
void rtmpPush2(std::string inputUrl, std::string outputUrl){
int videoindex = -1;
avformat_network_init();
const char *inUrl = inputUrl.c_str();
const char *outUrl = outputUrl.c_str();
SwsContext *vsc = NULL;
AVFrame *yuv = NULL;
AVCodecContext *outputVc = NULL;
AVFormatContext *output = NULL;
AVFormatContext *input_ctx = NULL;
AVFormatContext * output_ctx = NULL;
int ret = avformat_open_input(&input_ctx, inUrl, 0, NULL);
if (ret < 0) {
std::cout << "avformat_open_input failed!" << std::endl;
return;
}
std::cout << "avformat_open_input success!" << std::endl;
ret = avformat_find_stream_info(input_ctx, 0);
if (ret != 0) {
return;
}
av_dump_format(input_ctx, 0, inUrl, 0);
ret = avformat_alloc_output_context2(&output_ctx, NULL, "flv", outUrl);
if (ret < 0) {
std::cout << "avformat_alloc_output_context2 failed!" << std::endl;
return;
}
std::cout << "avformat_alloc_output_context2 success!" << std::endl;
std::cout << "nb_streams: " << input_ctx->nb_streams << std::endl;
unsigned int i;
for (i = 0; i < input_ctx->nb_streams; i++) {
AVStream *in_stream = input_ctx->streams[i];
AVStream *out_stream = avformat_new_stream(output_ctx, in_stream->codec->codec);
if (!out_stream) {
std::cout << "Failed to add output stream" << std::endl;
ret = AVERROR_UNKNOWN;
}
ret = avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar);
if (ret < 0) {
printf("Failed to copy codec parameters\n");
}
out_stream->codecpar->codec_tag = 0;
out_stream->codec->codec_tag = 0;
if (output_ctx->oformat->flags & AVFMT_GLOBALHEADER) {
out_stream->codec->flags = out_stream->codec->flags | AV_CODEC_FLAG_GLOBAL_HEADER;
}
}
for (i = 0; i < input_ctx->nb_streams; i++) {
if (input_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
videoindex = i;
break;
}
}
AVCodecContext *pCodecCtx = input_ctx->streams[videoindex]->codec;
AVCodec *pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
if (pCodec == NULL) {
std::cout << "Decoder not found" << std::endl;
return;
}
if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
std::cout << "Failed to open decoder" << std::endl;
return;
}
AVPacket pkt;
long long start_time = av_gettime();
long long frame_index = 0;
int count = 0;
int indexCount = 1;
int numberAbs = 1;
bool flag;
int VideoCount = 0;
std::queue<std::string> VideoList;
AVFrame *pFrame = av_frame_alloc();
int got_picture;
int frame_count = 0;
AVFrame* pFrameYUV = av_frame_alloc();
uint8_t *out_buffer;
out_buffer = new uint8_t[avpicture_get_size(AV_PIX_FMT_BGR24, pCodecCtx->width, pCodecCtx->height)];
avpicture_fill((AVPicture *)pFrameYUV, out_buffer, AV_PIX_FMT_BGR24, pCodecCtx->width, pCodecCtx->height);
try{
vsc = sws_getCachedContext(vsc, pCodecCtx->width, pCodecCtx->height, AV_PIX_FMT_RGB24,
pCodecCtx->width, pCodecCtx->height, AV_PIX_FMT_YUV420P,
SWS_BICUBIC,
0, 0, 0);
if (!vsc) {
throw std::logic_error("sws_getCachedContext failed!");
}
yuv = av_frame_alloc();
yuv->format = AV_PIX_FMT_YUV420P;
yuv->width = pCodecCtx->width;
yuv->height = pCodecCtx->height;
yuv->pts = 0;
int ret = av_frame_get_buffer(yuv, 32);
if (ret != 0) {
char buf[1024] = {0};
av_strerror(ret, buf, sizeof(buf) - 1);
throw std::logic_error(buf);
}
AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_H264);
if (!codec) {
throw std::logic_error("Can't find H.264 encoder!");
}
outputVc = avcodec_alloc_context3(codec);
if (!outputVc) {
throw std::logic_error("avcodec_alloc_context3 failed!");
}
outputVc->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
outputVc->codec_id = codec->id;
outputVc->thread_count = 8;
outputVc->bit_rate = 50 * 1024 * 8;
outputVc->width = pCodecCtx->width;
outputVc->height = pCodecCtx->height;
outputVc->time_base = {1, pCodecCtx->time_base.den};
outputVc->framerate = {pCodecCtx->time_base.den, 1};
outputVc->gop_size = 30;
outputVc->max_b_frames = 1;
outputVc->qmax = 51;
outputVc->qmin = 10;
outputVc->pix_fmt = AV_PIX_FMT_YUV420P;
ret = avcodec_open2(outputVc, 0, 0);
if (ret != 0) {
char buf[1024] = {0};
av_strerror(ret, buf, sizeof(buf) - 1);
throw std::logic_error(buf);
}
std::cout << "avcodec_open2 success!" << std::endl;
ret = avformat_alloc_output_context2(&output, 0, "flv", outUrl);
if (ret != 0) {
char buf[1024] = {0};
av_strerror(ret, buf, sizeof(buf) - 1);
throw std::logic_error(buf);
}
AVStream *vs = avformat_new_stream(output, codec);
if (!vs) {
throw std::logic_error("avformat_new_stream failed");
}
vs->codecpar->codec_tag = 0;
avcodec_parameters_from_context(vs->codecpar, outputVc);
av_dump_format(output, 0, outUrl, 1);
ret = avio_open(&output->pb, outUrl, AVIO_FLAG_WRITE);
if (ret != 0) {
throw std::logic_error("avio_open failed!");
}
ret = avformat_write_header(output, NULL);
if (ret != 0) {
std::cout << "ret:" << ret << std::endl;
char buf[1024] = {0};
av_strerror(ret, buf, sizeof(buf) - 1);
throw std::logic_error(buf);
}
AVPacket pack;
memset(&pack, 0, sizeof(pack));
int vpts = 0;
bool imgFrameFlag = false;
int timeCount = 0;
while (1) {
ret = av_read_frame(input_ctx, &pkt);
if (ret < 0) {
continue;
}
if (pkt.stream_index == videoindex) {
auto start = std::chrono::system_clock::now();
ret = avcodec_decode_video2(pCodecCtx, pFrame, &got_picture, &pkt);
auto end = std::chrono::system_clock::now();
if (ret < 0) {
std::cout << ret << std::endl;
std::cout << "Decode error" << std::endl;
av_frame_free(&yuv);
av_free_packet(&pack);
continue;
}
if (got_picture) {
SwsContext *img_convert_ctx;
img_convert_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height, AV_PIX_FMT_BGR24, SWS_BICUBIC, NULL, NULL, NULL);
sws_scale(img_convert_ctx, (const uint8_t *const *) pFrame->data, pFrame->linesize, 0, pCodecCtx->height, pFrameYUV->data, pFrameYUV->linesize);
frame_count++;
cv::Mat img;
img = cv::Mat(pCodecCtx->height, pCodecCtx->width, CV_8UC3);
img.data = pFrameYUV->data[0];
yuv = CVMatToAVFrame(img);
yuv->pts = vpts;
vpts++;
ret = avcodec_send_frame(outputVc, yuv);
av_frame_free(&yuv); // yuv was allocated by CVMatToAVFrame for this frame; free it once submitted
if (ret != 0) {
av_free_packet(&pack);
continue;
}
ret = avcodec_receive_packet(outputVc, &pack);
if (ret != 0 || pack.size <= 0) {
av_free_packet(&pack);
continue;
}
int firstFrame = 0;
if (pack.dts < 0 || pack.pts < 0 || pack.dts > pack.pts || firstFrame) {
firstFrame = 0;
pack.dts = pack.pts = pack.duration = 0;
}
pack.pts = av_rescale_q(pack.pts, outputVc->time_base, vs->time_base);
pack.dts = av_rescale_q(pack.dts, outputVc->time_base, vs->time_base);
pack.duration = av_rescale_q(pack.duration, outputVc->time_base, vs->time_base);
ret = av_interleaved_write_frame(output, &pack);
if (ret < 0) {
printf("Failed to write packet\n");
av_free_packet(&pack);
continue;
}
}
av_frame_free(&yuv);
av_free_packet(&pack);
}
}
}
} catch (std::exception &e) {
if (vsc) {
sws_freeContext(vsc);
vsc = NULL;
}
if (outputVc) {
avio_closep(&output->pb);
avcodec_free_context(&outputVc);
}
std::cerr << e.what() << std::endl;
}
}
int main() {
av_log_set_level(AV_LOG_TRACE);
rtmpPush2("rtmp://39.170.104.236:28081/live/456","rtmp://39.170.104.237:28081/live/dj/1ZNBJ7C00C009X");
}
2.4 Common Utility Class Wrappers
#ifndef FFMPEGHEADER_H
#define FFMPEGHEADER_H
extern "C" {
#include "./libavcodec/avcodec.h"
#include "./libavformat/avformat.h"
#include "./libavformat/avio.h"
#include "./libavutil/opt.h"
#include "./libavutil/time.h"
#include "./libavutil/imgutils.h"
#include "./libswscale/swscale.h"
#include "./libswresample/swresample.h"
#include "./libavutil/avutil.h"
#include "./libavutil/ffversion.h"
#include "./libavutil/frame.h"
#include "./libavutil/pixdesc.h"
#include "./libavutil/imgutils.h"
#include "./libavfilter/avfilter.h"
#include "./libavfilter/buffersink.h"
#include "./libavfilter/buffersrc.h"
#include "./libavdevice/avdevice.h"
}
#include <iostream>
char *getAVError(int errNum);
int64_t getRealTimeByPTS(int64_t pts, AVRational timebase);
void calcAndDelay(int64_t startTime, int64_t pts, AVRational timebase);
int32_t hexArrayToDec(char *array, int len);
class VideoSwser {
public:
VideoSwser();
~VideoSwser();
bool initSwsCtx(int srcWidth, int srcHeight, AVPixelFormat srcFmt, int dstWidth, int dstHeight, AVPixelFormat dstFmt);
void release();
AVFrame *getSwsFrame(AVFrame *srcFrame);
private:
bool hasInit;
bool needSws;
int dstWidth;
int dstHeight;
AVPixelFormat dstFmt;
SwsContext *videoSwsCtx;
};
class VideoEncoder {
public:
VideoEncoder();
~VideoEncoder();
bool initEncoder(int width, int height, AVPixelFormat fmt, int fps);
void release();
AVPacket *getEncodePacket(AVFrame *srcFrame);
AVPacket *flushEncoder();
AVCodecContext *getCodecContent();
private:
bool hasInit;
AVCodecContext *videoEnCodecCtx;
};
class AudioSwrer {
public:
AudioSwrer();
~AudioSwrer();
bool initSwrCtx(int inChannels, int inSampleRate, AVSampleFormat inFmt, int outChannels, int outSampleRate, AVSampleFormat outFmt);
void release();
AVFrame *getSwrFrame(AVFrame *srcFrame);
AVFrame *getSwrFrame(uint8_t *srcData);
private:
bool hasInit;
bool needSwr;
int outChannels;
int outSampleRate;
AVSampleFormat outFmt;
SwrContext *audioSwrCtx;
};
class AudioEncoder {
public:
AudioEncoder();
~AudioEncoder();
bool initEncoder(int channels, int sampleRate, AVSampleFormat sampleFmt);
void release();
AVPacket *getEncodePacket(AVFrame *srcFrame);
AVPacket *flushEncoder();
AVCodecContext *getCodecContent();
private:
bool hasInit;
AVCodecContext *audioEnCodecCtx;
};
class AVTimeStamp {
public:
enum PTSMode { PTS_RECTIFY = 0,
PTS_REALTIME
};
public:
AVTimeStamp();
~AVTimeStamp();
void initAudioTimeStampParm(int sampleRate, PTSMode mode = PTS_RECTIFY);
void initVideoTimeStampParm(int fps, PTSMode mode = PTS_RECTIFY);
void startTimeStamp();
int64_t getAudioPts();
int64_t getVideoPts();
private:
PTSMode aMode;
PTSMode vMode;
int64_t startTime;
int64_t audioTimeStamp;
int64_t videoTimeStamp;
double audioDuration;
double videoDuration;
};
#endif
#include "ffmpegheader.h"
char *getAVError(int errNum) {
static char msg[AV_ERROR_MAX_STRING_SIZE] = {0};
av_strerror(errNum, msg, AV_ERROR_MAX_STRING_SIZE);
return msg;
}
int64_t getRealTimeByPTS(int64_t pts, AVRational timebase) {
AVRational timebase_q = {1, AV_TIME_BASE};
int64_t ptsTime = av_rescale_q(pts, timebase, timebase_q);
return ptsTime;
}
void calcAndDelay(int64_t startTime, int64_t pts, AVRational timebase) {
int64_t ptsTime = getRealTimeByPTS(pts, timebase);
int64_t nowTime = av_gettime() - startTime;
int64_t offset = ptsTime - nowTime;
if(offset > 1000 && offset < 2*1000*1000)
av_usleep(offset);
}
int32_t hexArrayToDec(char *array, int len) {
if(array == nullptr || len > 4)
return -1;
int32_t result = 0;
for(int i=0; i<len; i++)
result = result * 256 + (unsigned char)array[i];
return result;
}
VideoSwser::VideoSwser() {
videoSwsCtx = nullptr;
hasInit = false;
needSws = false;
}
VideoSwser::~VideoSwser() {
release();
}
bool VideoSwser::initSwsCtx(int srcWidth, int srcHeight, AVPixelFormat srcFmt, int dstWidth, int dstHeight, AVPixelFormat dstFmt) {
release();
if(srcWidth == dstWidth && srcHeight == dstHeight && srcFmt == dstFmt) {
needSws = false;
} else {
videoSwsCtx = sws_getContext(srcWidth, srcHeight, srcFmt, dstWidth, dstHeight, dstFmt, SWS_BILINEAR, NULL, NULL, NULL);
if (videoSwsCtx == NULL) {
std::cout << "sws_getContext error" << std::endl;
return false;
}
this->dstFmt = dstFmt;
this->dstWidth = dstWidth;
this->dstHeight = dstHeight;
needSws = true;
}
hasInit = true;
return true;
}
void VideoSwser::release() {
if(videoSwsCtx) {
sws_freeContext(videoSwsCtx);
videoSwsCtx = nullptr;
}
hasInit = false;
needSws = false;
}
AVFrame *VideoSwser::getSwsFrame(AVFrame *srcFrame) {
if(!hasInit) {
std::cout << "Swser not initialized" << std::endl;
return nullptr;
}
if(!srcFrame)
return nullptr;
if(!needSws)
return srcFrame;
AVFrame *frame = av_frame_alloc();
frame->format = dstFmt;
frame->width = dstWidth;
frame->height = dstHeight;
int ret = av_frame_get_buffer(frame, 0);
if (ret != 0) {
std::cout << "av_frame_get_buffer swsFrame error" << std::endl;
return nullptr;
}
ret = av_frame_make_writable(frame);
if (ret != 0) {
std::cout << "av_frame_make_writable swsFrame error" << std::endl;
return nullptr;
}
sws_scale(videoSwsCtx, (const uint8_t *const *)srcFrame->data, srcFrame->linesize, 0, dstHeight, frame->data, frame->linesize);
return frame;
}
VideoEncoder::VideoEncoder() {
videoEnCodecCtx = nullptr;
hasInit = false;
}
VideoEncoder::~VideoEncoder() {
release();
}
bool VideoEncoder::initEncoder(int width, int height, AVPixelFormat fmt, int fps) {
release();
AVCodec *videoEnCoder = avcodec_find_encoder(AV_CODEC_ID_H264);
if(!videoEnCoder) {
std::cout << "avcodec_find_encoder AV_CODEC_ID_H264 error" << std::endl;
return false;
}
videoEnCodecCtx = avcodec_alloc_context3(videoEnCoder);
if(!videoEnCodecCtx) {
std::cout << "avcodec_alloc_context3 AV_CODEC_ID_H264 error" << std::endl;
return false;
}
videoEnCodecCtx->bit_rate = 2*1024*1024;
videoEnCodecCtx->width = width;
videoEnCodecCtx->height = height;
videoEnCodecCtx->framerate = {fps, 1};
videoEnCodecCtx->time_base = {1, AV_TIME_BASE};
videoEnCodecCtx->gop_size = fps;
videoEnCodecCtx->max_b_frames = 0;
videoEnCodecCtx->pix_fmt = fmt;
videoEnCodecCtx->thread_count = 2;
videoEnCodecCtx->thread_type = FF_THREAD_FRAME;
videoEnCodecCtx->qmin = 10;
videoEnCodecCtx->qmax = 30;
videoEnCodecCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
int ret = av_opt_set(videoEnCodecCtx->priv_data, "preset", "ultrafast", 0);
if(ret != 0)
std::cout << "av_opt_set preset error" << std::endl;
ret = av_opt_set(videoEnCodecCtx->priv_data, "tune", "zerolatency", 0);
if(ret != 0)
std::cout << "av_opt_set tune error" << std::endl;
ret = av_opt_set(videoEnCodecCtx->priv_data, "profile", "main", 0);
if(ret != 0)
std::cout << "av_opt_set profile error" << std::endl;
ret = avcodec_open2(videoEnCodecCtx, videoEnCoder, NULL);
if(ret != 0) {
std::cout << "avcodec_open2 video error" << std::endl;
return false;
}
hasInit = true;
return true;
}
void VideoEncoder::release() {
if(videoEnCodecCtx) {
avcodec_free_context(&videoEnCodecCtx);
videoEnCodecCtx = nullptr;
}
hasInit = false;
}
AVPacket *VideoEncoder::getEncodePacket(AVFrame *srcFrame) {
if(!hasInit) {
std::cout << "VideoEncoder no init" << std::endl;
return nullptr;
}
if(!srcFrame)
return nullptr;
if(srcFrame->width != videoEnCodecCtx->width || srcFrame->height != videoEnCodecCtx->height || srcFrame->format != videoEnCodecCtx->pix_fmt) {
std::cout << "srcFrame does not match the video encoder's configured format" << std::endl;
return nullptr;
}
srcFrame->pts = av_rescale_q(srcFrame->pts, AVRational{1, AV_TIME_BASE}, videoEnCodecCtx->time_base);
int ret = avcodec_send_frame(videoEnCodecCtx, srcFrame);
if (ret != 0)
return nullptr;
AVPacket *packet = av_packet_alloc();
ret = avcodec_receive_packet(videoEnCodecCtx, packet);
if (ret != 0) {
av_packet_free(&packet);
return nullptr;
}
return packet;
}
AVPacket *VideoEncoder::flushEncoder() {
if(!hasInit) {
std::cout << "VideoEncoder no init" << std::endl;
return nullptr;
}
int ret = avcodec_send_frame(videoEnCodecCtx, NULL);
if (ret != 0)
return nullptr;
AVPacket *packet = av_packet_alloc();
ret = avcodec_receive_packet(videoEnCodecCtx, packet);
if (ret != 0) {
av_packet_free(&packet);
return nullptr;
}
return packet;
}
AVCodecContext *VideoEncoder::getCodecContent() {
return videoEnCodecCtx;
}
AudioSwrer::AudioSwrer() {
audioSwrCtx = nullptr;
hasInit = false;
needSwr = false;
}
AudioSwrer::~AudioSwrer() {
release();
}
bool AudioSwrer::initSwrCtx(int inChannels, int inSampleRate, AVSampleFormat inFmt, int outChannels, int outSampleRate, AVSampleFormat outFmt) {
release();
if(inChannels == outChannels && inSampleRate == outSampleRate && inFmt == outFmt) {
needSwr = false;
} else {
audioSwrCtx = swr_alloc_set_opts(NULL, av_get_default_channel_layout(outChannels), outFmt, outSampleRate, av_get_default_channel_layout(inChannels), inFmt, inSampleRate, 0, NULL);
if (!audioSwrCtx) {
std::cout << "swr_alloc_set_opts failed!" << std::endl;
return false;
}
int ret = swr_init(audioSwrCtx);
if (ret != 0) {
std::cout << "swr_init error" << std::endl;
swr_free(&audioSwrCtx);
return false;
}
this->outFmt = outFmt;
this->outChannels = outChannels;
this->outSampleRate = outSampleRate;
needSwr = true;
}
hasInit = true;
return true;
}
void AudioSwrer::release() {
if(audioSwrCtx) {
swr_free(&audioSwrCtx);
audioSwrCtx = nullptr;
}
hasInit = false;
needSwr = false;
}
AVFrame *AudioSwrer::getSwrFrame(AVFrame *srcFrame) {
if(!hasInit) {
std::cout << "Swrer not initialized" << std::endl;
return nullptr;
}
if(!srcFrame)
return nullptr;
if(!needSwr)
return srcFrame;
AVFrame *frame = av_frame_alloc();
frame->format = outFmt;
frame->channels = outChannels;
frame->channel_layout = av_get_default_channel_layout(outChannels);
frame->nb_samples = 1024;
int ret = av_frame_get_buffer(frame, 0);
if (ret != 0) {
std::cout << "av_frame_get_buffer audio error" << std::endl;
return nullptr;
}
ret = av_frame_make_writable(frame);
if (ret != 0) {
std::cout << "av_frame_make_writable swrFrame error" << std::endl;
return nullptr;
}
const uint8_t **inData = (const uint8_t **)srcFrame->data;
swr_convert(audioSwrCtx, frame->data, frame->nb_samples, inData, frame->nb_samples);
return frame;
}
AVFrame *AudioSwrer::getSwrFrame(uint8_t *srcData) {
if(!hasInit) {
std::cout << "Swrer not initialized" << std::endl;
return nullptr;
}
if(!srcData)
return nullptr;
if(!needSwr)
return nullptr;
AVFrame *frame = av_frame_alloc();
frame->format = outFmt;
frame->channels = outChannels;
frame->sample_rate = outSampleRate;
frame->channel_layout = av_get_default_channel_layout(outChannels);
frame->nb_samples = 1024;
int ret = av_frame_get_buffer(frame, 0);
if (ret != 0) {
std::cout << "av_frame_get_buffer audio error" << std::endl;
return nullptr;
}
ret = av_frame_make_writable(frame);
if (ret != 0) {
std::cout << "av_frame_make_writable swrFrame error" << std::endl;
return nullptr;
}
const uint8_t *indata[AV_NUM_DATA_POINTERS] = {0};
indata[0] = srcData;
swr_convert(audioSwrCtx, frame->data, frame->nb_samples, indata, frame->nb_samples);
return frame;
}
AudioEncoder::AudioEncoder() {
audioEnCodecCtx = nullptr;
hasInit = false;
}
AudioEncoder::~AudioEncoder() {
release();
}
bool AudioEncoder::initEncoder(int channels, int sampleRate, AVSampleFormat sampleFmt) {
release();
AVCodec *audioEnCoder = avcodec_find_encoder(AV_CODEC_ID_AAC);
if (!audioEnCoder) {
std::cout << "avcodec_find_encoder AV_CODEC_ID_AAC failed!" << std::endl;
return false;
}
audioEnCodecCtx = avcodec_alloc_context3(audioEnCoder);
if (!audioEnCodecCtx) {
std::cout << "avcodec_alloc_context3 AV_CODEC_ID_AAC failed!" << std::endl;
return false;
}
audioEnCodecCtx->bit_rate = 64*1024;
audioEnCodecCtx->time_base = AVRational{1, sampleRate};
audioEnCodecCtx->sample_rate = sampleRate;
audioEnCodecCtx->sample_fmt = sampleFmt;
audioEnCodecCtx->channels = channels;
audioEnCodecCtx->channel_layout = av_get_default_channel_layout(channels);
audioEnCodecCtx->frame_size = 1024;
int ret = avcodec_open2(audioEnCodecCtx, audioEnCoder, NULL);
if (ret != 0) {
std::cout << "avcodec_open2 audio error" << getAVError(ret) << std::endl;
return false;
}
hasInit = true;
return true;
}
void AudioEncoder::release() {
if(audioEnCodecCtx) {
avcodec_free_context(&audioEnCodecCtx);
audioEnCodecCtx = nullptr;
}
hasInit = false;
}
AVPacket *AudioEncoder::getEncodePacket(AVFrame *srcFrame) {
if(!hasInit) {
std::cout << "AudioEncoder no init" << std::endl;
return nullptr;
}
if(!srcFrame)
return nullptr;
if(srcFrame->channels != audioEnCodecCtx->channels || srcFrame->sample_rate != audioEnCodecCtx->sample_rate || srcFrame->format != audioEnCodecCtx->sample_fmt) {
std::cout << "srcFrame does not match the audio encoder's configured format" << std::endl;
return nullptr;
}
srcFrame->pts = av_rescale_q(srcFrame->pts, AVRational{1, AV_TIME_BASE}, audioEnCodecCtx->time_base);
int ret = avcodec_send_frame(audioEnCodecCtx, srcFrame);
if (ret != 0)
return nullptr;
AVPacket *packet = av_packet_alloc();
ret = avcodec_receive_packet(audioEnCodecCtx, packet);
if (ret != 0) {
av_packet_free(&packet);
return nullptr;
}
return packet;
}
AVPacket *AudioEncoder::flushEncoder() {
if(!hasInit) {
std::cout << "AudioEncoder no init" << std::endl;
return nullptr;
}
int ret = avcodec_send_frame(audioEnCodecCtx, NULL);
if (ret != 0)
return nullptr;
AVPacket *packet = av_packet_alloc();
ret = avcodec_receive_packet(audioEnCodecCtx, packet);
if (ret != 0) {
av_packet_free(&packet);
return nullptr;
}
return packet;
}
AVCodecContext *AudioEncoder::getCodecContent() {
return audioEnCodecCtx;
}
AVTimeStamp::AVTimeStamp() {
aMode = PTS_RECTIFY;
vMode = PTS_RECTIFY;
startTime = 0;
audioTimeStamp = 0;
videoTimeStamp = 0;
videoDuration = 1000000.0 / 25;
audioDuration = 1000000.0 * 1024 / 44100; // 1024 samples per AAC frame
}
AVTimeStamp::~AVTimeStamp() {
}
void AVTimeStamp::initAudioTimeStampParm(int sampleRate, AVTimeStamp::PTSMode mode) {
aMode = mode;
audioDuration = 1000000.0 * 1024 / sampleRate; // avoid integer division: 1024 samples per AAC frame
}
void AVTimeStamp::initVideoTimeStampParm(int fps, AVTimeStamp::PTSMode mode) {
vMode = mode;
videoDuration = 1000000.0 / fps;
}
void AVTimeStamp::startTimeStamp() {
audioTimeStamp = 0;
videoTimeStamp = 0;
startTime = av_gettime();
}
int64_t AVTimeStamp::getAudioPts() {
if(aMode == PTS_RECTIFY) {
int64_t elapsed = av_gettime() - startTime;
double offset = elapsed - (audioTimeStamp + audioDuration);
if(offset < 0) offset = -offset; // qAbs is Qt-only; take the absolute drift manually
if(offset < (audioDuration * 0.5))
audioTimeStamp += audioDuration;
else
audioTimeStamp = elapsed;
} else {
audioTimeStamp = av_gettime() - startTime;
}
return audioTimeStamp;
}
int64_t AVTimeStamp::getVideoPts() {
if(vMode == PTS_RECTIFY) {
int64_t elapsed = av_gettime() - startTime;
double offset = elapsed - (videoTimeStamp + videoDuration);
if(offset < 0) offset = -offset; // qAbs is Qt-only; take the absolute drift manually
if(offset < (videoDuration * 0.5))
videoTimeStamp += videoDuration;
else
videoTimeStamp = elapsed;
} else {
videoTimeStamp = av_gettime() - startTime;
}
return videoTimeStamp;
}