
Audio Processing Basics and Audio Resampling

Published: 2024/4/11


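Resampling is needed when the original audio parameters do not meet our requirements. For example, when decoding audio with FFmpeg, different sources come with different sample formats, sample rates and so on, so the decoded data differs in these parameters as well. (Many decoders in recent FFmpeg versions, such as the native AAC and MP3 decoders, output AV_SAMPLE_FMT_FLTP, so the sample format is often the same, but this depends on the decoder.) If we then want to do further processing on the decoded audio, these inconsistencies cause a lot of extra work. Resampling the audio right away to the parameters we specify makes everything much easier.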

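Another example is playing audio through SDL. SDL 2.0 expects interleaved (packed) samples and cannot take planar data, while recent FFmpeg versions (since around 2016) often decode audio to the planar float format AV_SAMPLE_FMT_FLTP. In that case the audio has to be resampled into a packed format, typically AV_SAMPLE_FMT_S16, before SDL 2.0 can play it.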

3. Adjustable parameters

Resampling lets us change:

sample rate

sample format

channel layout (the number of channels can be derived from this parameter)

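3. Parameter details

1. Sample rate

The number of samples the capture device takes per second, for example 44100 Hz or 48000 Hz.

2. Sample format and quantization precision (bit depth)

Each sample format has its own quantization precision (bit depth). The more bits per sample, the more precisely each value is represented and the more faithfully the sound is reproduced. FFmpeg defines the following sample formats, each with a fixed number of bytes per sample (libavutil/samplefmt.h):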

    enum AVSampleFormat {
        AV_SAMPLE_FMT_NONE = -1,
        AV_SAMPLE_FMT_U8,   ///< unsigned 8 bits
        AV_SAMPLE_FMT_S16,  ///< signed 16 bits
        AV_SAMPLE_FMT_S32,  ///< signed 32 bits
        AV_SAMPLE_FMT_FLT,  ///< float
        AV_SAMPLE_FMT_DBL,  ///< double
        AV_SAMPLE_FMT_U8P,  ///< unsigned 8 bits, planar
        AV_SAMPLE_FMT_S16P, ///< signed 16 bits, planar
        AV_SAMPLE_FMT_S32P, ///< signed 32 bits, planar
        AV_SAMPLE_FMT_FLTP, ///< float, planar
        AV_SAMPLE_FMT_DBLP, ///< double, planar
        AV_SAMPLE_FMT_S64,  ///< signed 64 bits
        AV_SAMPLE_FMT_S64P, ///< signed 64 bits, planar
        AV_SAMPLE_FMT_NB    ///< Number of sample formats. DO NOT USE if linking dynamically
    };
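To make the difference between sample formats concrete, here is a minimal sketch in plain C of converting one float sample (nominal range [-1.0, 1.0], as in AV_SAMPLE_FMT_FLT) to a signed 16-bit sample (as in AV_SAMPLE_FMT_S16). The name float_to_s16 is hypothetical, and the scaling and rounding shown are just one common choice; libswresample's own conversion may differ in rounding and dithering.

```c
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): convert one float sample
 * to signed 16-bit. Values outside [-1.0, 1.0] are clipped. */
static int16_t float_to_s16(float x)
{
    /* Scale to the 16-bit range. */
    float scaled = x * 32768.0f;

    /* Clip before casting so the result always fits in int16_t. */
    if (scaled >= 32767.0f)
        return INT16_MAX;
    if (scaled <= -32768.0f)
        return INT16_MIN;

    /* Round to nearest, away from zero on ties. */
    return (int16_t)(int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}
```

Calling float_to_s16(0.5f) gives 16384, and anything at or above full scale is clipped to 32767.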

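3. Planar and packed

Taking stereo as an example: in formats with the P suffix (planar), the left and right channels are stored separately. The left channel is stored in data[0] and the right channel in data[1]. For audio only linesize[0] is set, and it gives the size in bytes of each channel's plane (all planes have the same size).

In formats without the P suffix (packed), the samples are stored interleaved in data[0] as LRLRLR..., and linesize[0] is the total size in bytes of that buffer.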
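The two layouts can be illustrated with a small plain-C sketch that interleaves planar float channels into a packed buffer. The function name planar_to_packed and its parameters are hypothetical; in a real program the plane pointers would come from the data array of a decoded frame, and swr_convert() does this conversion for you.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interleave planar float channels into one packed buffer.
 * planes[c] holds nb_samples samples of channel c;
 * packed must have room for nb_samples * nb_channels floats. */
static void planar_to_packed(const float *const *planes, int nb_channels,
                             int nb_samples, float *packed)
{
    for (int i = 0; i < nb_samples; i++)
        for (int c = 0; c < nb_channels; c++)
            packed[(size_t)i * nb_channels + c] = planes[c][i];
}
```

For stereo input with left = {L0, L1, L2} and right = {R0, R1, R2}, the packed buffer ends up as L0 R0 L1 R1 L2 R2.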

4. Channel layout (channel_layout)

Channel layouts are defined in libavutil/channel_layout.h. The most commonly used ones are AV_CH_LAYOUT_STEREO (two channels) and AV_CH_LAYOUT_SURROUND (three channels), defined as follows:

    #define AV_CH_LAYOUT_STEREO   (AV_CH_FRONT_LEFT|AV_CH_FRONT_RIGHT)
    #define AV_CH_LAYOUT_SURROUND (AV_CH_LAYOUT_STEREO|AV_CH_FRONT_CENTER)
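Because a layout in this bitmask API is just an OR of per-channel bits, the channel count is the number of set bits. The sketch below shows that idea in plain C. The MY_ prefixed macros and count_channels are hypothetical local names so that the snippet compiles without FFmpeg; the bit values mirror those in libavutil/channel_layout.h. With FFmpeg itself, the example later in this article uses av_get_channel_layout_nb_channels() for the same purpose.

```c
#include <stdint.h>

/* Local copies of the bitmask values, prefixed with MY_ so that this
 * sketch does not depend on (or clash with) the FFmpeg headers. */
#define MY_CH_FRONT_LEFT      0x00000001ULL
#define MY_CH_FRONT_RIGHT     0x00000002ULL
#define MY_CH_FRONT_CENTER    0x00000004ULL
#define MY_CH_LAYOUT_STEREO   (MY_CH_FRONT_LEFT | MY_CH_FRONT_RIGHT)
#define MY_CH_LAYOUT_SURROUND (MY_CH_LAYOUT_STEREO | MY_CH_FRONT_CENTER)

/* Hypothetical helper: count the channels in a layout bitmask. */
static int count_channels(uint64_t layout)
{
    int n = 0;
    while (layout) {
        layout &= layout - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}
```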

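5. Computing the data size of an audio frame

Data size of one audio frame (bytes) = number of channels * nb_samples (samples per channel) * bytes per sample

If the frame holds PCM data in FLTP format with 1024 samples and two channels, it contains 2 * 1024 * 4 = 8192 bytes of audio data.

For AV_SAMPLE_FMT_DBL: 2 * 1024 * 8 = 16384 bytes.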
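The formula is easy to check in code. This is a minimal plain-C sketch; frame_bytes is a hypothetical helper, and the bytes-per-sample value is passed in by the caller (4 for float formats, 8 for double, 2 for S16). With FFmpeg, av_get_bytes_per_sample() and av_samples_get_buffer_size() provide the same information and also take alignment into account.

```c
#include <stddef.h>

/* Hypothetical helper: payload size in bytes of one audio frame,
 * ignoring any alignment padding. */
static size_t frame_bytes(int nb_channels, int nb_samples, int bytes_per_sample)
{
    return (size_t)nb_channels * (size_t)nb_samples * (size_t)bytes_per_sample;
}
```

For the two cases above, frame_bytes(2, 1024, 4) is 8192 and frame_bytes(2, 1024, 8) is 16384.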

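6. Computing audio playback duration

At a sample rate of 44100 Hz there are 44100 samples per second, and a frame normally holds 1024 samples. From frame duration / 1024 = 1000 ms / 44100 we get frame duration = 1024 * 1000 / 44100 = 23.2 ms (more precisely 23.21995464852608 ms).

Duration of one frame (ms) = nb_samples * 1000 / sample rate

(1) 1024 * 1000 / 44100 = 23.21995464852608 ms, roughly 23.2 ms. Rounding to 23.2 ms loses about 0.01995 ms per frame. Accumulated over 100,000 frames the error is about 1995 ms, which causes audio/video synchronization problems when a video stream is played alongside. Computing pts in steps of 23.2 (0, 23.2, 46.4, ...) therefore accumulates error.
(2) 1024 * 1000 / 48000 = 21.33333333333333 ms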
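One way to avoid the accumulated error is to derive each timestamp from the total number of samples played so far instead of adding a rounded per-frame duration. The sketch below is plain C with hypothetical helper names (frame_pts_ms, pts_drift_ms); it also measures how far the "add 23.2 ms per frame" approach drifts.

```c
#include <stdint.h>

/* Hypothetical helper: start time in milliseconds of frame number
 * frame_index, computed from the total sample count so that rounding
 * never accumulates from frame to frame. */
static double frame_pts_ms(int64_t frame_index, int nb_samples, int sample_rate)
{
    return (double)(frame_index * nb_samples) * 1000.0 / sample_rate;
}

/* Hypothetical helper: drift in milliseconds after nb_frames frames when
 * every frame is assumed to last rounded_ms instead of its exact duration. */
static double pts_drift_ms(int64_t nb_frames, int nb_samples, int sample_rate,
                           double rounded_ms)
{
    return frame_pts_ms(nb_frames, nb_samples, sample_rate)
           - (double)nb_frames * rounded_ms;
}
```

With pts_drift_ms(100000, 1024, 44100, 23.2) the result is a little over 1995 ms, matching the estimate above. In practice FFmpeg code usually keeps pts as an integer sample count in a 1/sample_rate time base, which sidesteps floating point entirely.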

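4. FFmpeg resampling API

Allocate an audio resampling context:

    struct SwrContext *swr_alloc(void);

Once the relevant parameters have been set, initialize the SwrContext with:

    int swr_init(struct SwrContext *s);

Allocate a SwrContext if needed and set/reset common parameters. (In FFmpeg 5.1 and later this function is deprecated in favor of swr_alloc_set_opts2(), which takes AVChannelLayout arguments.)

    struct SwrContext *swr_alloc_set_opts(
        struct SwrContext *s,               // resampling context, or NULL to allocate a new one
        int64_t out_ch_layout,              // output channel layout, e.g. 5.1
        enum AVSampleFormat out_sample_fmt, // output sample format (float, S16, ...); S16 is the usual choice, most sound cards support it
        int out_sample_rate,                // output sample rate
        int64_t in_ch_layout,               // input channel layout
        enum AVSampleFormat in_sample_fmt,  // input sample format
        int in_sample_rate,                 // input sample rate
        int log_offset,                     // logging related, just pass 0
        void *log_ctx                       // logging related, just pass NULL
    );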

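Convert the input audio according to the configured parameters and write the output:

    int swr_convert(
        struct SwrContext *s, // resampling context
        uint8_t **out,        // output buffers (array of plane pointers)
        int out_count,        // space available for output, in samples per channel (not bytes)
        const uint8_t **in,   // input buffers, e.g. the data array of a decoded AVFrame
        int in_count          // number of input samples available per channel
    );

The return value is the number of samples output per channel (<= out_count), or a negative error code on failure.

in and in_count can be set to NULL and 0 to flush the last few samples at the end.

Free the SwrContext and set the pointer to NULL:

    void swr_free(struct SwrContext **s);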

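Audio resampling, sample format conversion and mixing library

Interaction with lswr (libswresample) is done through SwrContext, which is allocated with swr_alloc() or swr_alloc_set_opts(). It is opaque, so all parameters must be set with the AVOptions API.

The first thing you need to do in order to use lswr is to allocate a SwrContext. This can be done with swr_alloc() or swr_alloc_set_opts(). If you use the former, you must set options through the AVOptions API. The latter function provides the same feature, but it allows you to set some common options in the same statement.

For example, the following code sets up conversion from planar float sample format to interleaved signed 16-bit integer, downsampling from 48 kHz to 44.1 kHz, and downmixing from 5.1 channels to stereo (using the default mixing matrix). This version uses swr_alloc():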

    SwrContext *swr = swr_alloc();
    av_opt_set_channel_layout(swr, "in_channel_layout",  AV_CH_LAYOUT_5POINT1, 0);
    av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO,  0);
    av_opt_set_int(swr, "in_sample_rate",     48000,                0);
    av_opt_set_int(swr, "out_sample_rate",    44100,                0);
    av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_FLTP, 0);
    av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);

The same job can be done with swr_alloc_set_opts():

    SwrContext *swr = swr_alloc_set_opts(NULL,  // we're allocating a new context
                          AV_CH_LAYOUT_STEREO,  // out_ch_layout
                          AV_SAMPLE_FMT_S16,    // out_sample_fmt
                          44100,                // out_sample_rate
                          AV_CH_LAYOUT_5POINT1, // in_ch_layout
                          AV_SAMPLE_FMT_FLTP,   // in_sample_fmt
                          48000,                // in_sample_rate
                          0,                    // log_offset
                          NULL);                // log_ctx

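Once all values have been set, the context must be initialized with swr_init(). If you need to change the conversion parameters later, you can either change them through AVOptions, as in the first example above, or call swr_alloc_set_opts() again with the already allocated context as the first argument. In both cases you must call swr_init() again.

The conversion itself is done by repeatedly calling swr_convert(). Note that samples may get buffered inside swr if you provide insufficient output space, or if sample rate conversion is being done, which requires "future" samples. Samples that do not require future input can be retrieved at any time by calling swr_convert() with in_count set to 0. At the end of the conversion, the resampling buffer can be flushed by calling swr_convert() with in set to NULL and in_count set to 0.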
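Since samples can be buffered inside the resampler, the output buffer for each call should be sized from the new input plus whatever is still pending. The plain-C sketch below shows that arithmetic; max_out_samples is a hypothetical helper, and the delay argument is assumed to be the value reported by swr_get_delay() at the input rate. In FFmpeg code the same rounding-up computation is typically written with av_rescale_rnd(..., AV_ROUND_UP), which also guards against overflow.

```c
#include <stdint.h>

/* Hypothetical helper: upper bound on the output samples per channel when
 * in_samples new samples arrive and delay samples (counted at the input
 * rate) are still buffered in the resampler. Assumes non-negative inputs
 * small enough that the multiplication does not overflow 64 bits. */
static int64_t max_out_samples(int64_t delay, int64_t in_samples,
                               int out_rate, int in_rate)
{
    int64_t total = (delay + in_samples) * out_rate;
    return (total + in_rate - 1) / in_rate; /* ceiling division */
}
```

For 1024 input samples going from 48000 Hz to 44100 Hz with nothing buffered, this gives 941, since 1024 * 44100 / 48000 = 940.8 rounds up.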

5. Audio resampling project examples

1. Simple example (resample)

    /*
     * Copyright (c) 2012 Stefano Sabatini
     *
     * Permission is hereby granted, free of charge, to any person obtaining a copy
     * of this software and associated documentation files (the "Software"), to deal
     * in the Software without restriction, including without limitation the rights
     * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
     * copies of the Software, and to permit persons to whom the Software is
     * furnished to do so, subject to the following conditions:
     *
     * The above copyright notice and this permission notice shall be included in
     * all copies or substantial portions of the Software.
     *
     * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
     * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
     * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
     * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
     * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
     * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
     * THE SOFTWARE.
     */

    /**
     * @example resampling_audio.c
     * libswresample API use example.
     */

    #include <inttypes.h>
    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    #include <libavutil/opt.h>
    #include <libavutil/channel_layout.h>
    #include <libavutil/samplefmt.h>
    #include <libswresample/swresample.h>

    /* Map a packed sample format to the raw format name understood by ffplay -f. */
    static int get_format_from_sample_fmt(const char **fmt,
                                          enum AVSampleFormat sample_fmt)
    {
        int i;
        struct sample_fmt_entry {
            enum AVSampleFormat sample_fmt;
            const char *fmt_be, *fmt_le;
        } sample_fmt_entries[] = {
            {AV_SAMPLE_FMT_U8,  "u8",    "u8"},
            {AV_SAMPLE_FMT_S16, "s16be", "s16le"},
            {AV_SAMPLE_FMT_S32, "s32be", "s32le"},
            {AV_SAMPLE_FMT_FLT, "f32be", "f32le"},
            {AV_SAMPLE_FMT_DBL, "f64be", "f64le"},
        };
        *fmt = NULL;

        for (i = 0; i < FF_ARRAY_ELEMS(sample_fmt_entries); i++) {
            struct sample_fmt_entry *entry = &sample_fmt_entries[i];
            if (sample_fmt == entry->sample_fmt) {
                *fmt = AV_NE(entry->fmt_be, entry->fmt_le);
                return 0;
            }
        }

        fprintf(stderr,
                "Sample format %s not supported as output format\n",
                av_get_sample_fmt_name(sample_fmt));
        return AVERROR(EINVAL);
    }

    /**
     * Fill dst buffer with nb_samples, generated starting from t.
     * The samples are written in interleaved (packed) layout.
     */
    static void fill_samples(double *dst, int nb_samples, int nb_channels,
                             int sample_rate, double *t)
    {
        int i, j;
        double tincr = 1.0 / sample_rate, *dstp = dst;
        const double c = 2 * M_PI * 440.0;

        /* generate sin tone with 440Hz frequency and duplicated channels */
        for (i = 0; i < nb_samples; i++) {
            *dstp = sin(c * *t);
            for (j = 1; j < nb_channels; j++)
                dstp[j] = dstp[0];
            dstp += nb_channels;
            *t += tincr;
        }
    }

    int main(int argc, char **argv)
    {
        /* input parameters */
        int64_t src_ch_layout = AV_CH_LAYOUT_STEREO;
        int src_rate = 48000;
        enum AVSampleFormat src_sample_fmt = AV_SAMPLE_FMT_DBL;
        int src_nb_channels = 0;
        uint8_t **src_data = NULL; /* array of plane pointers */
        int src_linesize;
        int src_nb_samples = 1024;

        /* output parameters */
        int64_t dst_ch_layout = AV_CH_LAYOUT_STEREO;
        int dst_rate = 44100;
        enum AVSampleFormat dst_sample_fmt = AV_SAMPLE_FMT_S16;
        int dst_nb_channels = 0;
        uint8_t **dst_data = NULL; /* array of plane pointers */
        int dst_linesize;
        int dst_nb_samples;
        int max_dst_nb_samples;

        /* output file: the resampled PCM is saved locally so it can be played back to verify */
        const char *dst_filename = NULL;
        FILE *dst_file;

        int dst_bufsize;
        const char *fmt;

        /* resampler instance */
        struct SwrContext *swr_ctx = NULL;

        double t;
        int ret;

        /* Take the output file from the command line; fall back to
         * believe.pcm in the current directory when none is given. */
        dst_filename = argc > 1 ? argv[1] : "believe.pcm";

        dst_file = fopen(dst_filename, "wb");
        if (!dst_file) {
            fprintf(stderr, "Could not open destination file %s\n", dst_filename);
            exit(1);
        }

        /* create resampler context */
        swr_ctx = swr_alloc();
        if (!swr_ctx) {
            fprintf(stderr, "Could not allocate resampler context\n");
            ret = AVERROR(ENOMEM);
            goto end;
        }

        /* set options: input parameters */
        av_opt_set_int(swr_ctx, "in_channel_layout", src_ch_layout, 0);
        av_opt_set_int(swr_ctx, "in_sample_rate", src_rate, 0);
        av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", src_sample_fmt, 0);

        /* set options: output parameters */
        av_opt_set_int(swr_ctx, "out_channel_layout", dst_ch_layout, 0);
        av_opt_set_int(swr_ctx, "out_sample_rate", dst_rate, 0);
        av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", dst_sample_fmt, 0);

        /* initialize the resampling context */
        if ((ret = swr_init(swr_ctx)) < 0) {
            fprintf(stderr, "Failed to initialize the resampling context\n");
            goto end;
        }

        /* allocate source and destination samples buffers */

        /* number of channels of the input */
        src_nb_channels = av_get_channel_layout_nb_channels(src_ch_layout);
        /* allocate memory for the input */
        ret = av_samples_alloc_array_and_samples(&src_data, &src_linesize, src_nb_channels,
                                                 src_nb_samples, src_sample_fmt, 0);
        if (ret < 0) {
            fprintf(stderr, "Could not allocate source samples\n");
            goto end;
        }

        /* compute the number of converted samples: buffering is avoided
         * ensuring that the output buffer will contain at least all the
         * converted input samples */
        max_dst_nb_samples = dst_nb_samples =
            av_rescale_rnd(src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);

        /* buffer is going to be directly written to a rawaudio file, no alignment */
        dst_nb_channels = av_get_channel_layout_nb_channels(dst_ch_layout);
        /* allocate memory for the output */
        ret = av_samples_alloc_array_and_samples(&dst_data, &dst_linesize, dst_nb_channels,
                                                 dst_nb_samples, dst_sample_fmt, 0);
        if (ret < 0) {
            fprintf(stderr, "Could not allocate destination samples\n");
            goto end;
        }

        t = 0;
        do {
            /* generate synthetic audio as the input source */
            fill_samples((double *) src_data[0], src_nb_samples, src_nb_channels, src_rate, &t);

            /* compute destination number of samples, including samples
             * still buffered inside the resampler */
            int64_t delay = swr_get_delay(swr_ctx, src_rate);
            dst_nb_samples = av_rescale_rnd(delay + src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);
            if (dst_nb_samples > max_dst_nb_samples) {
                /* the output buffer is too small: reallocate it */
                av_freep(&dst_data[0]);
                ret = av_samples_alloc(dst_data, &dst_linesize, dst_nb_channels,
                                       dst_nb_samples, dst_sample_fmt, 1);
                if (ret < 0)
                    break;
                max_dst_nb_samples = dst_nb_samples;
            }

            // int fifo_size = swr_get_out_samples(swr_ctx, src_nb_samples);
            // printf("fifo_size:%d\n", fifo_size);
            // if (fifo_size < 1024)
            //     continue;

            /* convert to destination format */
            ret = swr_convert(swr_ctx, dst_data, dst_nb_samples,
                              (const uint8_t **) src_data, src_nb_samples);
            if (ret < 0) {
                fprintf(stderr, "Error while converting\n");
                goto end;
            }
            dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels,
                                                     ret, dst_sample_fmt, 1);
            if (dst_bufsize < 0) {
                fprintf(stderr, "Could not get sample buffer size\n");
                goto end;
            }
            printf("t:%f in:%d out:%d\n", t, src_nb_samples, ret);
            fwrite(dst_data[0], 1, dst_bufsize, dst_file);
        } while (t < 10);

        /* flush the samples still buffered inside the resampler */
        ret = swr_convert(swr_ctx, dst_data, dst_nb_samples, NULL, 0);
        if (ret < 0) {
            fprintf(stderr, "Error while converting\n");
            goto end;
        }
        dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels,
                                                 ret, dst_sample_fmt, 1);
        if (dst_bufsize < 0) {
            fprintf(stderr, "Could not get sample buffer size\n");
            goto end;
        }
        printf("flush in:%d out:%d\n", 0, ret);
        fwrite(dst_data[0], 1, dst_bufsize, dst_file);

        if ((ret = get_format_from_sample_fmt(&fmt, dst_sample_fmt)) < 0)
            goto end;
        fprintf(stderr, "Resampling succeeded. Play the output file with the command:\n"
                "ffplay -f %s -channel_layout %"PRId64" -channels %d -ar %d %s\n",
                fmt, dst_ch_layout, dst_nb_channels, dst_rate, dst_filename);

    end:
        fclose(dst_file);

        if (src_data)
            av_freep(&src_data[0]);
        av_freep(&src_data);

        if (dst_data)
            av_freep(&dst_data[0]);
        av_freep(&dst_data);

        swr_free(&swr_ctx);
        return ret < 0;
    }

2. Complex example

It is fairly long, so it is not included here.
