我正在捕獲音頻並將音頻流式傳輸到RTMP服務器。我在MacOS下工作(在Xcode中),所以爲了捕獲音頻採樣緩衝區,我使用了AVFoundation-framework。但對於編碼和流媒體,我需要使用ffmpeg-API和libfaac編碼器。所以輸出格式必須是AAC(用於在iOS設備上支持流播放)。當輸入的pcm採樣數不相等時,如何使用ffmpeg-API將重採樣的PCM音頻編碼爲AAC 1024
我遇到了這樣的問題:音頻捕獲設備(在我的情況下,羅技相機)給了我512採樣LPCM採樣緩衝區,我可以選擇輸入採樣率從16000,24000,36000或48000赫茲。當我將這512個樣本提供給AAC編碼器(配置爲適當的採樣率)時,我聽到一個慢速和抖動音頻(看起來像每幀之後的沉默片)。
我想通了(可能我錯了),該libfaac編碼器只接受1024採樣音頻幀。當我將輸入採樣率設置爲24000並在編碼之前將輸入採樣緩衝器重新採樣爲48000時,我獲得了1024個重採樣採樣。在將這些1024採樣率編碼到AAC後,我聽到輸出聲音正確。但是,當輸出採樣率必須爲48000Hz時,我的網絡攝像頭可以爲任何輸入採樣率在緩衝區中生成512個樣本。所以我需要在任何情況下進行重採樣,並且在重採樣之後,我不會在緩衝區中準確獲得1024個採樣。
有沒有辦法在ffmpeg-API功能中解決這個問題?
我將不勝感激任何幫助。
PS: 我想我可以累積重新採樣的緩衝區,直到採樣數變爲1024,然後對其進行編碼,但這是流,所以會產生時間戳和其他輸入設備的麻煩,而且這種解決方案不是適當。
目前的問題就出來了在[提問]描述的問題:How to fill audio AVFrame (ffmpeg) with the data obtained from CMSampleBufferRef (AVFoundation)?
這裏是音頻編解碼器CONFIGS代碼(有也是視頻流,但視頻做工精細):
/*global variables*/
static AVFrame *aframe;
static AVFrame *frame;
AVOutputFormat *fmt;
AVFormatContext *oc;
AVStream *audio_st, *video_st;
Init()
{
AVCodec *audio_codec, *video_codec;
int ret;
avcodec_register_all();
av_register_all();
avformat_network_init();
avformat_alloc_output_context2(&oc, NULL, "flv", filename);
fmt = oc->oformat;
oc->oformat->video_codec = AV_CODEC_ID_H264;
oc->oformat->audio_codec = AV_CODEC_ID_AAC;
video_st = NULL;
audio_st = NULL;
if (fmt->video_codec != AV_CODEC_ID_NONE)
{ //… /*init video codec*/}
if (fmt->audio_codec != AV_CODEC_ID_NONE) {
audio_codec= avcodec_find_encoder(fmt->audio_codec);
if (!(audio_codec)) {
fprintf(stderr, "Could not find encoder for '%s'\n",
avcodec_get_name(fmt->audio_codec));
exit(1);
}
audio_st= avformat_new_stream(oc, audio_codec);
if (!audio_st) {
fprintf(stderr, "Could not allocate stream\n");
exit(1);
}
audio_st->id = oc->nb_streams-1;
//AAC:
audio_st->codec->sample_fmt = AV_SAMPLE_FMT_S16;
audio_st->codec->bit_rate = 32000;
audio_st->codec->sample_rate = 48000;
audio_st->codec->profile=FF_PROFILE_AAC_LOW;
audio_st->time_base = (AVRational){1, audio_st->codec->sample_rate };
audio_st->codec->channels = 1;
audio_st->codec->channel_layout = AV_CH_LAYOUT_MONO;
if (oc->oformat->flags & AVFMT_GLOBALHEADER)
audio_st->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
}
if (video_st)
{
// …
/*prepare video*/
}
if (audio_st)
{
aframe = avcodec_alloc_frame();
if (!aframe) {
fprintf(stderr, "Could not allocate audio frame\n");
exit(1);
}
AVCodecContext *c;
int ret;
c = audio_st->codec;
ret = avcodec_open2(c, audio_codec, 0);
if (ret < 0) {
fprintf(stderr, "Could not open audio codec: %s\n", av_err2str(ret));
exit(1);
}
//…
}
和重採樣和編碼音頻:
if (mType == kCMMediaType_Audio)
{
CMSampleTimingInfo timing_info;
CMSampleBufferGetSampleTimingInfo(sampleBuffer, 0, &timing_info);
double pts=0;
double dts=0;
AVCodecContext *c;
AVPacket pkt = { 0 }; // data and size must be 0;
int got_packet, ret;
av_init_packet(&pkt);
c = audio_st->codec;
CMItemCount numSamples = CMSampleBufferGetNumSamples(sampleBuffer);
NSUInteger channelIndex = 0;
CMBlockBufferRef audioBlockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
size_t audioBlockBufferOffset = (channelIndex * numSamples * sizeof(SInt16));
size_t lengthAtOffset = 0;
size_t totalLength = 0;
SInt16 *samples = NULL;
CMBlockBufferGetDataPointer(audioBlockBuffer, audioBlockBufferOffset, &lengthAtOffset, &totalLength, (char **)(&samples));
const AudioStreamBasicDescription *audioDescription = CMAudioFormatDescriptionGetStreamBasicDescription(CMSampleBufferGetFormatDescription(sampleBuffer));
SwrContext *swr = swr_alloc();
int in_smprt = (int)audioDescription->mSampleRate;
av_opt_set_int(swr, "in_channel_layout", AV_CH_LAYOUT_MONO, 0);
av_opt_set_int(swr, "out_channel_layout", audio_st->codec->channel_layout, 0);
av_opt_set_int(swr, "in_channel_count", audioDescription->mChannelsPerFrame, 0);
av_opt_set_int(swr, "out_channel_count", audio_st->codec->channels, 0);
av_opt_set_int(swr, "out_channel_layout", audio_st->codec->channel_layout, 0);
av_opt_set_int(swr, "in_sample_rate", audioDescription->mSampleRate,0);
av_opt_set_int(swr, "out_sample_rate", audio_st->codec->sample_rate,0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", audio_st->codec->sample_fmt, 0);
swr_init(swr);
uint8_t **input = NULL;
int src_linesize;
int in_samples = (int)numSamples;
ret = av_samples_alloc_array_and_samples(&input, &src_linesize, audioDescription->mChannelsPerFrame,
in_samples, AV_SAMPLE_FMT_S16P, 0);
*input=(uint8_t*)samples;
uint8_t *output=NULL;
int out_samples = av_rescale_rnd(swr_get_delay(swr, in_smprt) +in_samples, (int)audio_st->codec->sample_rate, in_smprt, AV_ROUND_UP);
av_samples_alloc(&output, NULL, audio_st->codec->channels, out_samples, audio_st->codec->sample_fmt, 0);
in_samples = (int)numSamples;
out_samples = swr_convert(swr, &output, out_samples, (const uint8_t **)input, in_samples);
aframe->nb_samples =(int) out_samples;
ret = avcodec_fill_audio_frame(aframe, audio_st->codec->channels, audio_st->codec->sample_fmt,
(uint8_t *)output,
(int) out_samples *
av_get_bytes_per_sample(audio_st->codec->sample_fmt) *
audio_st->codec->channels, 1);
aframe->channel_layout = audio_st->codec->channel_layout;
aframe->channels=audio_st->codec->channels;
aframe->sample_rate= audio_st->codec->sample_rate;
if (timing_info.presentationTimeStamp.timescale!=0)
pts=(double) timing_info.presentationTimeStamp.value/timing_info.presentationTimeStamp.timescale;
aframe->pts=pts*audio_st->time_base.den;
aframe->pts = av_rescale_q(aframe->pts, audio_st->time_base, audio_st->codec->time_base);
ret = avcodec_encode_audio2(c, &pkt, aframe, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding audio frame: %s\n", av_err2str(ret));
exit(1);
}
swr_free(&swr);
if (got_packet)
{
pkt.stream_index = audio_st->index;
pkt.pts = av_rescale_q(pkt.pts, audio_st->codec->time_base, audio_st->time_base);
pkt.dts = av_rescale_q(pkt.dts, audio_st->codec->time_base, audio_st->time_base);
// Write the compressed frame to the media file.
ret = av_interleaved_write_frame(oc, &pkt);
if (ret != 0) {
fprintf(stderr, "Error while writing audio frame: %s\n",
av_err2str(ret));
exit(1);
}
}
iPhone的支持比AAC音頻更多,只是好奇,爲什麼你只支持AAC,羅技相機不支持以下任何一種,g711(ulaw),apcm,mpeg2音頻等。大多數相機的我們知道支持至少g711,技術上還有一些額外的api amr也是可以的。 –
AAC編解碼器是我的任務中需要的編解碼器之一,並與其他編解碼器我沒有這樣的問題。 – Aleksei2414904
嗨Aleksei2414904我編碼PCM樣本到aac android並面臨同樣的問題,請幫助我,如果你找到任何解決方案。 –